Quantitative comparative linguistics

eLinguistics.net's Mission & Statement:
  -> Making language relatedness easily perceivable with a simple quantification.
  -> Setting up a 100% automated language classification.

This blog presents a completely computerized model for comparative linguistics. The quantification of language relationships is based on basic vocabulary and generates an automated language classification into families and subfamilies.

You can compare languages in the calculator and get values for the relatedness (genetic proximity) between languages. An evolutionary tree summarizes all results of the distances between 220 languages.

This approach to comparative linguistics takes you on a short digital trip through the history of languages. You will see how 18 carefully chosen words can deliver values that are sufficient to calculate a distance between two or more languages and to represent it on a tree. The distances are expressed as values from 0 (the smallest possible distance, i.e. the same language) to 100 (the largest possible distance). Play with these values in the calculator! You will recognize proximities that you can sense yourself if you know some of the languages used in this study.

A few examples illustrate the idea behind this comparative linguistics project. The system's assessment of the distance, on the scale from 0 to 100, between the following languages is:
  • English to German: 31
  • Dutch to German: 19
  • Danish to Norwegian: 4
  • Russian to German: 61
  • Russian to Polish: 1
  • Arabic to Hebrew: 20
  • Arabic to Maltese: 20
  • Finnish to Hungarian: 55
  • Finnish to Estonian: 11

This gives you a first idea of what this site is about. From the few examples above, you can conclude that the degree of proximity between Russian and German (both Indo-European languages, distance 61) is roughly the same as that between Finnish and Hungarian (both Finno-Ugric, distance 55).

Once we have such values, we can generate a matrix like this one, summarizing the genetic distances between some languages (the values from the few examples above have a green background in the matrix):

Language matrix
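
To make the idea of a distance matrix concrete, here is a minimal Python sketch that arranges the example values listed above into a symmetric matrix. The names `PAIRS` and `build_matrix` are hypothetical and chosen only for illustration; pairs not quoted in the examples are left as `None` rather than guessed. This is not the site's own code.

```python
# Pairwise distances quoted in the examples above
# (0 = same language, 100 = largest possible distance).
PAIRS = {
    ("English", "German"): 31,
    ("Dutch", "German"): 19,
    ("Danish", "Norwegian"): 4,
    ("Russian", "German"): 61,
    ("Russian", "Polish"): 1,
    ("Arabic", "Hebrew"): 20,
    ("Arabic", "Maltese"): 20,
    ("Finnish", "Hungarian"): 55,
    ("Finnish", "Estonian"): 11,
}


def build_matrix(pairs):
    """Return (languages, matrix) where matrix[i][j] is the distance or None."""
    languages = sorted({lang for pair in pairs for lang in pair})
    index = {lang: i for i, lang in enumerate(languages)}
    n = len(languages)
    matrix = [[None] * n for _ in range(n)]
    for i in range(n):
        matrix[i][i] = 0  # a language is at distance 0 from itself
    for (a, b), distance in pairs.items():
        # Distances are symmetric, so fill both halves of the matrix.
        matrix[index[a]][index[b]] = distance
        matrix[index[b]][index[a]] = distance
    return languages, matrix


if __name__ == "__main__":
    languages, matrix = build_matrix(PAIRS)
    for lang, row in zip(languages, matrix):
        cells = " ".join("  ." if v is None else f"{v:3d}" for v in row)
        print(f"{lang:10s} {cells}")
```

A complete matrix, with a value for every pair of languages, is what a tree-building step would need; the `None` cells here only mark pairs that the examples do not cover.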
...and finally, from this distance matrix, we generate a rooted evolutionary tree, using the same method, and in fact the same software, as in genetics and biology (details under Resources):
Language evolutionary tree
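
As an illustration of how a rooted tree can be derived from a distance matrix, here is a small Python sketch of UPGMA (average-linkage clustering), one standard distance-based method from phylogenetics. This is an assumption made for illustration: the text does not say which algorithm or software the site uses. The labels `A` to `D` and their distances are placeholder values, not real language data, and `upgma` is a hypothetical helper name.

```python
from itertools import combinations


def upgma(labels, dist):
    """Build a rooted binary tree (nested tuples) by average-linkage clustering."""
    # Active clusters: id -> (subtree, number of leaves in it).
    clusters = {i: (label, 1) for i, label in enumerate(labels)}
    # Distances between active clusters, keyed by the unordered pair of ids.
    d = {
        frozenset((i, j)): dist[i][j]
        for i, j in combinations(range(len(labels)), 2)
    }
    next_id = len(labels)
    while len(clusters) > 1:
        # Merge the two closest clusters.
        pair = min(d, key=d.get)
        i, j = sorted(pair)
        tree_i, n_i = clusters.pop(i)
        tree_j, n_j = clusters.pop(j)
        del d[pair]
        # Distance from the merged cluster to every other cluster is the
        # size-weighted average of the two old distances.
        for k in clusters:
            d_ik = d.pop(frozenset((i, k)))
            d_jk = d.pop(frozenset((j, k)))
            d[frozenset((next_id, k))] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        clusters[next_id] = ((tree_i, tree_j), n_i + n_j)
        next_id += 1
    tree, _size = next(iter(clusters.values()))
    return tree


if __name__ == "__main__":
    # Placeholder labels and distances (hypothetical, for illustration only).
    labels = ["A", "B", "C", "D"]
    dist = [
        [0, 4, 30, 60],
        [4, 0, 32, 62],
        [30, 32, 0, 58],
        [60, 62, 58, 0],
    ]
    print(upgma(labels, dist))
```

With these placeholder values, `A` and `B` are joined first, then `C` is attached to that pair, and `D` joins last at the root, so the closest pairs end up deepest in the tree.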

Blog author: