Towards Automated Language Classification: A Clustering Approach
Armin Buch, David Erschler, Gerhard Jäger, and Andrei Lupas



Figure 1. Clustering of the DKB database.

For 133 languages that contain sufficiently many feature values in WALS, we computed a pairwise similarity matrix. The similarity of two languages is defined as the sum of weights of all WALS features where both languages have defined but different values. The weight w(f) of a feature f is defined as the mutual information between the value of this feature and the language family affiliation (as listed in the WALS database) of the languages in question.
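
The computation described above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data layout (a dictionary of feature values per language), the function names, the toy languages and the use of base-2 logarithms are all hypothetical. The score follows the definition as stated in the text, that is, it sums w(f) over features for which both languages have a defined value and those values differ.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Mutual information (in bits) between the two components of (value, family) pairs."""
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    values = Counter(v for v, _ in pairs)
    families = Counter(f for _, f in pairs)
    # Sum p(v, f) * log2(p(v, f) / (p(v) * p(f))) over observed combinations.
    return sum(
        (c / n) * log2(c * n / (values[v] * families[f]))
        for (v, f), c in joint.items()
    )


def feature_weights(features, family):
    """Compute w(f) for every feature.

    features: {language: {feature: value}} (missing features are undefined)
    family:   {language: family name}
    """
    names = {f for vals in features.values() for f in vals}
    weights = {}
    for f in names:
        # Only languages with a defined value for f contribute to w(f).
        pairs = [(vals[f], family[lang]) for lang, vals in features.items() if f in vals]
        weights[f] = mutual_information(pairs)
    return weights


def pairwise_score(a, b, weights):
    """Sum of weights over features defined in both languages with different values."""
    shared = a.keys() & b.keys()
    return sum(weights[f] for f in shared if a[f] != b[f])


# Hypothetical toy data, not taken from WALS.
features = {
    "A": {"order": "SOV", "tone": "yes"},
    "B": {"order": "SOV", "tone": "no"},
    "C": {"order": "SVO", "tone": "yes"},
    "D": {"order": "SVO"},
}
family = {"A": "F1", "B": "F1", "C": "F2", "D": "F2"}

weights = feature_weights(features, family)
languages = sorted(features)
matrix = [
    [pairwise_score(features[x], features[y], weights) for y in languages]
    for x in languages
]
```

In the toy data, the feature "order" separates the two families perfectly and therefore receives a higher weight than "tone", so a disagreement on "order" contributes more to the pairwise score than a disagreement on "tone".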





Figure 2. CLANS clustering of WALS.




