Towards Automated Language Classification: A Clustering Approach
Armin Buch, David Erschler, Gerhard Jäger, and Andrei Lupas



Date: 05.05.2018


Thus, the similarity is 1 if the words are identical and 0 if they are totally different.



Now consider the similarity value of a specific potential cognate pair. (These are now two words with the same meaning!) By itself, this value is not very telling. What we want to estimate is how likely it is for a random pair of words from the two languages to have the same or a higher similarity value. We estimate this probability as the number of word pairs whose similarity is greater than or equal to that of the candidate pair, divided by the overall number of pairs.
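The estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: the `similarity` function here is a stand-in (one minus the Levenshtein distance divided by the length of the longer word), chosen only because it is 1 for identical words and 0 for totally different ones. The function names and the toy word lists are hypothetical, and "all pairs" is assumed to mean every combination of a word from the first list with a word from the second.

```python
from itertools import product


def levenshtein(a, b):
    """Plain edit distance (insertions, deletions, substitutions cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Stand-in similarity (an assumption): 1 for identical words,
    0 for words that share nothing position by position."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def chance_probability(word_a, word_b, words_a, words_b):
    """Fraction of all cross-language word pairs whose similarity is
    greater than or equal to that of the candidate pair."""
    s = similarity(word_a, word_b)
    pairs = list(product(words_a, words_b))
    hits = sum(1 for x, y in pairs if similarity(x, y) >= s)
    return hits / len(pairs)


# Toy word lists (hypothetical data, for illustration only).
words_a = ["hand", "foot", "water"]
words_b = ["hand", "fuss", "wasser"]
p = chance_probability("hand", "hand", words_a, words_b)
```

A small value of `p` means that few random pairs are at least as similar as the candidate pair, so the observed similarity is unlikely to be accidental. With realistic word lists the number of pairs grows as the product of the list sizes, so the exhaustive loop may need to be replaced by sampling.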
