Figure 3. Geography of the language sample.
In this way, features that carry much information about the genetic affiliation of languages receive a high weight (and vice versa). This decision was motivated by the hope of extracting a deep genetic signal from the WALS data.
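The weighting idea can be illustrated with a small sketch. The exact weighting formula is not given in this passage, so the code below assumes one plausible choice: the mutual information between a feature's values and the languages' family labels. The function names (`mutual_information`, `feature_weights`) and the toy data are hypothetical and serve only as an illustration.

```python
from collections import Counter
from math import log2


def mutual_information(values, families):
    """Mutual information (in bits) between feature values and family labels.

    ASSUMPTION: mutual information stands in for "information about genetic
    affiliation"; the source passage does not state the actual formula.
    """
    # Ignore languages for which the feature value is missing (None).
    pairs = [(v, f) for v, f in zip(values, families) if v is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)
    value_counts = Counter(v for v, _ in pairs)
    family_counts = Counter(f for _, f in pairs)
    mi = 0.0
    for (v, f), count in joint.items():
        # p(v, f) * log2( p(v, f) / (p(v) * p(f)) ), written with raw counts
        mi += (count / n) * log2(count * n / (value_counts[v] * family_counts[f]))
    return mi


def feature_weights(table, families):
    """Map each feature name to its (hypothetical) information-based weight."""
    return {name: mutual_information(vals, families) for name, vals in table.items()}


# Toy data, invented for illustration only (not taken from WALS).
families = ["A", "A", "B", "B"]
table = {
    "word_order": ["SVO", "SVO", "SOV", "SOV"],  # fully predicts the family
    "has_tone": ["yes", "no", "yes", "no"],      # independent of the family
}
weights = feature_weights(table, families)
```

In this toy case the feature that fully predicts the family receives the higher weight, and the feature that is independent of the family receives a weight of zero.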
The resulting cluster map (see Fig. 2) shows a circular structure. There are two large clusters of languages on opposite sides of the circle (shown in gray and black), and a third, smaller cluster (shown in white) in between. The other languages are arranged along the circle between these three regions without forming distinct groups.
The map in Fig. 3 shows the geographic distribution of the respective languages (colors on the map match those in Fig. 2).2
A manual inspection of this outcome reveals that the cluster map captures a strong typological signal and a somewhat weaker areal signal, but no usable information about genetic affiliations. The cluster shown in gray contains languages with head-initial basic word order (SVO or VSO), small phoneme inventories, and a lack of case marking. The black cluster, on the other hand, is characterized by head-final word order, nominative-accusative alignment for both pronouns and full NPs, a large number of cases (mostly more than six), and predominantly dependent marking. Figure 2 shows that these groupings are neither genetically nor areally motivated.
This agrees well with the findings of Greenhill et al. (2011) and Donohue et al. (2011): the distribution of morphosyntactic features does not reflect genetic relationships between languages sufficiently well.