Bayesian phylolinguistics infers the internal structure and the time-depth of the Turkic language family

Abstract Despite more than 200 years of research, the internal structure of the Turkic language family remains subject to debate. Classifications of Turkic so far are based on both classical historical–comparative linguistic and distance-based quantitative approaches. Although these studies yield an internal structure of the Turkic family, they cannot give us an understanding of the statistical robustness of the proposed branches, nor are they capable of reliably inferring absolute divergence dates, without assuming constant rates of change. Here we use computational Bayesian phylogenetic methods to build a phylogeny of the Turkic languages, express the reliability of the proposed branches in terms of probability, and estimate the time-depth of the family within credibility intervals. To this end, we collect a new dataset of 254 basic vocabulary items for thirty-two Turkic language varieties based on the recently introduced Leipzig–Jakarta list. Our application of Bayesian phylogenetic inference on lexical data of the Turkic languages is unprecedented. The resulting phylogenetic tree supports a binary structure for Turkic and replicates most of the conventional sub-branches in the Common Turkic branch. We calculate the robustness of the inferences for subgroups and individual languages whose position in the tree seems to be debatable. We infer the time-depth of the Turkic family at around 2100 years before present, thus providing a reliable quantitative basis for previous estimates based on classical historical linguistics and lexicostatistics.


A Bayesian approach to the classification of Tungusic languages

Diachronica ◽

10.1075/dia.20010.osk ◽

2021 ◽

Author(s):

Sofia Oskolskaya ◽

Ezequiel Koile ◽

Martine Robbeets

Keyword(s):

Russian Far East ◽

Far East ◽

Posterior Density ◽

Phylogenetic Methods ◽

12Th Century ◽

Quantitative Basis ◽

Divergence Dates ◽

Basic Vocabulary ◽

Historical Comparative ◽

Highest Posterior Density

Abstract The Tungusic language family comprises languages spoken in Siberia, the Russian Far East, Northeast China and Xinjiang. There is a general consensus that these languages are genealogically related and descend from a common ancestral language. Nevertheless, there is considerable disagreement with regard to the internal structure of the Tungusic family and the time depth of its separation into daughter languages. Here we use computational Bayesian phylogenetic methods to generate a phylogeny of Tungusic languages and estimate the time-depth of the family. Our analysis is based on the recently introduced Leipzig-Jakarta-Jena list, a dataset of 254 basic vocabulary items collected for 21 Tungusic doculects. Our results are consistent with two basic classifications previously proposed in the literature, notably a Manchu-Tungusic classification, in which the break-up of Jurchenic constitutes the first split in the tree, as well as a North-South classification, which includes a Jurchenic-Nanaic and an Orochic-Ewenic branch. In addition, we obtain a time-depth for the age of Proto-Tungusic between the 8th century BC and the 12th century AD (95% highest posterior density interval). Previous classifications of Tungusic were based on both classical historical comparative linguistic and lexicostatistic approaches, but the application of Bayesian phylogenetic methods to the Tungusic languages has not so far been attempted. In contrast to previous approaches, our Bayesian analysis adds an understanding of the statistical robustness of the proposed branches and infers absolute divergence dates, allowing variation of rates of change across branches and cognate sets. In this way, our research provides a reliable quantitative basis for previous estimates based on classical historical linguistic and lexicostatistic approaches.


The classification of the Transeurasian languages

The Oxford Guide to the Transeurasian Languages ◽

10.1093/oso/9780198804628.003.0004 ◽

2020 ◽

pp. 31-39

Author(s):

Martine Robbeets

Keyword(s):

Internal Structure ◽

Comparative Method ◽

Family Tree ◽

Comparative Linguistics ◽

Phylogenetic Methods ◽

The Family ◽

Quantitative Basis ◽

Competing Hypotheses ◽

Historical Comparative

Even if the hypothesis of Transeurasian affiliation is gradually gaining acceptance, supporters do not agree on the internal structure of the family. Over the last century, a range of different classifications has been proposed. While these proposals show some remarkable overlap, the position of the Tungusic branch in the family tree remains a recurrent issue. Here the best supportable tree for the Transeurasian family is inferred, notably a binary topology with a Japano-Koreanic and an Altaic branch, in which Tungusic is the first to split off from the Altaic branch. To this end, the power of classical historical-comparative linguistics is combined with computational Bayesian phylogenetic methods. In this way, a quantitative basis is introduced to test various competing hypotheses with regard to the internal structure of the Transeurasian family and to solve uncertainties associated with the application of the classical historical-comparative method.


Phonetic Laws Related to Vowels in Dialects

10.21203/rs.3.rs-1212552/v1 ◽

2021 ◽

Author(s):

Ibrokhim Omonovich Darveshov

Keyword(s):

Research Work ◽

Initial Point ◽

Point Of View ◽

Turkic Languages ◽

Key Policy ◽

Historical Comparative ◽

Areal Linguistics ◽

Historical Reflection ◽

Linguistic Methods ◽

Comprehensive Study

Abstract The reforms now being carried out for the development of our society have created full opportunities and conditions for Uzbek linguistics to fulfill the tasks set before it, and purposeful research is being conducted on the comprehensive study of our language. At the same time, the study of the features of Uzbek dialects, relying on the theoretical bases of areal linguistic research, is defined as one of the priority directions in the historical-comparative and ethnolinguistic aspects. So far this field consists of incomplete, purely descriptive and illustrative accounts, and its gaps need to be studied and filled on the basis of new views. The article gives an idea of the peculiarities of the Namangan Kipchak and Karluk dialects, the historical genesis of their vowel systems, and the areal distribution and use of these vowels. The phonetic-phonological character of a dialect is a comparative-historical reflection of the processes, features and laws at work in it. In turn, the article describes opinions on vowel-related phenomena in the Turkic languages, namely synharmonism in the Kipchak dialects and umlaut in the Karluk dialects. Key policy insights. The study of dialects and their specific features through areal-typological and areal-linguistic methods, which goes back to Mahmud Kashgariy's work on the Turkic languages and forms the initial point of any linguistic theory and conception, is still one of the important issues today. The emergence of areal linguistics has opened up a wide way to evaluate new issues and concepts in the field of dialectology and to solve them in new ways. Because the holistic study of the language system relies on the theoretical basis of dialect and slang areas, the fact that historical-comparative and ethnologic research is defined as one of the priority areas imposes new responsibilities on Uzbek linguistics and Uzbek linguists.


A Bayesian approach to the classification of the Turkic languages

The Oxford Guide to the Transeurasian Languages ◽

10.1093/oso/9780198804628.003.0010 ◽

2020 ◽

pp. 114-124

Author(s):

Alexander Savelyev

Keyword(s):

Bayesian Approach ◽

Controversial Issue ◽

Language Family ◽

Independent Verification ◽

Linguistic History ◽

The Family ◽

Turkic Languages ◽

The Common ◽

Widespread View

Despite more than 150 years of research, the internal structure of the Turkic language family remains a controversial issue. In this study, the Bayesian phylogenetic approach is employed in order to provide an independent verification of the contemporary views on Turkic linguistic history. The data underlying the study are Turkic basic vocabularies, which are resistant to replacement and likely to reflect the genealogical relationships among the Turkic languages. The method tested in the chapter is based on the strict clock model of evolution, which assumes that relevant changes occur at the same rate at every branch of the family. This study supports the widespread view that the binary split between Bulgharic and Common Turkic was the earliest split in the Turkic family. The model further replicates most of the conventional subgroups within the Common Turkic branch. Based on a Bayesian analysis, the time depth of Proto-Turkic is estimated to be around 2,119 years BP, which is in accordance with the traditional estimates of 2,000–2,500 years BP.
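The strict clock assumption described in this abstract can be illustrated with a minimal Python sketch. It is not the chapter's actual model or software: the three-tip tree, the node ages, and the rate are invented toy values, and `branch_length` and `root_to_tip` are hypothetical helper names. The sketch only shows the consequence of a single shared rate: every present-day tip accumulates the same expected amount of change from the root.

```python
# Toy dated tree (all values are illustrative assumptions, not results).
# Each node maps to (parent, age in years before present).
nodes = {
    "root": (None, 2000.0),
    "inner": ("root", 1200.0),
    "A": ("root", 0.0),
    "B": ("inner", 0.0),
    "C": ("inner", 0.0),
}

RATE = 1e-4  # assumed changes per character per year, shared by all branches


def branch_length(node, rate=RATE):
    """Expected change on the branch above `node`: rate * elapsed time."""
    parent, age = nodes[node]
    if parent is None:
        return 0.0  # the root has no branch above it
    parent_age = nodes[parent][1]
    return rate * (parent_age - age)


def root_to_tip(tip, rate=RATE):
    """Sum expected change along the path from `tip` up to the root."""
    total = 0.0
    node = tip
    while node is not None:
        total += branch_length(node, rate)
        node = nodes[node][0]
    return total


distances = {tip: root_to_tip(tip) for tip in ("A", "B", "C")}
print(distances)
```

Under a relaxed clock each branch would carry its own rate, so the root-to-tip totals would generally differ even when all tips are contemporaneous.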


A brief response to Fellner and Hill’s “Word families, allofams, and the comparative method”

Cahiers de linguistique - Asie orientale ◽

10.1163/19606028-04802003 ◽

2019 ◽

Vol 48 (2) ◽

pp. 125-141 ◽

Cited By ~ 1

Author(s):

Zev HANDEL


Atlantic

The Oxford Handbook of African Languages ◽

10.1093/oxfordhb/9780199609895.013.44 ◽

2020 ◽

pp. 160-173

Author(s):

Friederike Lüpke

Keyword(s):

Complex Systems ◽

Language Contact ◽

Comparative Method ◽

Atlantic Coast ◽

Genetic Group ◽

Language Family ◽

The Status ◽

Noun Class ◽

Lexical Data

Atlantic is one of the controversial branches of the Niger-Congo language family. Both its validity as a genetic group and its internal classification are far from being settled. The longstanding debate on the status and structure of Atlantic cannot be closed before the descriptive situation of these languages allows for sufficient and reliable lexical data; before attempts at applying the comparative method have been made; and before the extensive role of language contact for shaping the languages in question is taken into account. Although no typological feature or feature combinations characterizes the group as a whole, several features are considered typical for Atlantic languages, including noun class systems, consonant mutation, and complex systems of verbal derivation, which have been used to justify suggested genealogical groupings. Atlantic languages, with the exception of Fula, are attested in an area from Liberia to Senegal, stretching from the Atlantic coast to the hinterland.


Language contacts of Azerbaijani and Kazakh turkic languages (on the basic of Azerbaijanian dialectologist, academician Mammadaga Shiraliyev’s creative works)

Turkic Studies Journal ◽

10.32523/tsj.02-2019/2-4 ◽

2019 ◽

Vol 1 (2) ◽

pp. 34-40

Author(s):

M. Huseynova

Keyword(s):

Grammatical Structure ◽

Geographical Differences ◽

Native Languages ◽

Language Varieties ◽

The Past ◽

Relevant Today ◽

Modern Age ◽

Turkic Languages ◽

Phonetic Features ◽

The One

Turkic literary languages and dialects have very long common roots, and despite various political and geographical differences, these native languages have preserved their ancient roots, vocabulary, grammatical structure, and phonetic features. Even today, the speakers of these languages can easily understand each other in a common language. Dialectological studies, when discussing the phonetic features of Turkish dialects and language varieties, traditionally refer to both vowel and consonant shifts and phonetic laws. The works, monographs and articles written by scholars of the Turkic peoples are also mentioned when we talk about the idea of forming a common Turkic language in our modern age. M. Shiraliyev correctly points out that on the one hand our ancient written monuments and the works of our writers of the past, and on the other hand materials of the Turkic languages, help to identify phonetic phenomena and to refine their forms. In this regard, the works of the great Azerbaijani dialectologist and Turkish scientist, academician Mammadaga Shirali oglu Shiraliyev, are of great importance. The researcher compiled a number of words that are typical of the Turkic languages and dialects and gave them a comparative phonetic explanation. He researched specific phonetic phenomena and laws in the dialects and varieties of the Azerbaijani language and, where relevant, also drew on the Kazakh language and its dialects. M. Shiraliyev did not overlook the specific features of the numerous words used in the dialects and varieties of the Azerbaijani language; he compared them with the literary language and dialects of Kazakh Turkish and revealed the similarities and differences with the skill of a true dialectologist. Here, too, there were positive results. It is also clear that not only Azerbaijani dialectologists and linguists but also Turkology scholars worldwide have benefited and will benefit from the work of academician M. Shiraliyev.
M. Shiraliyev’s research on the phonetic, morphological and lexical features of the dialects and varieties of the Azerbaijani and Kazakh languages is still relevant today.


Sogdian Archetypes in Chatkal Oronymia as an Ancient Substrate of Toponyms of Central Asia

Bulletin of Science and Practice ◽

10.33619/2414-2948/74/45 ◽

2022 ◽

Vol 8 (1) ◽

pp. 304-307

Author(s):

D. Kenzhebaev ◽

D. Abdullaev

Keyword(s):

Central Asia ◽

Language Family ◽

European Language ◽

Important Condition ◽

Mountain Ranges ◽

Turkic Languages ◽

History Of ◽

Linguistic Structures

The relevance of studying the oronymy of the Chatkal area of Kyrgyzstan lies in the fact that many mountain names are well preserved in both sound and meaning. This is an important condition for studying the history of any language, including the Turkic languages. Moreover, despite their great antiquity, the sound shells of mountain names preserve long-vanished elements of languages that were in contact in the same linguistic area in the distant past. The mountain names of the Chatkal ranges of Kyrgyzstan preserve individual morphemes and sounds of the ancient Turkic languages, and East Iranian topolexemes of the Indo-European language family are also found among them. At the same time, the structure of oronyms shows to some extent the evolution of the language as a whole and of each of its levels in particular. The history of the Kyrgyz language and its interaction with various systemic linguistic structures are reflected in the stratigraphy of oronymy. This makes it possible to explore the history of the Turkic languages in greater diachronic depth.


Revisiting Phonotactic Generalizations in Australian Languages

Proceedings of the Annual Meetings on Phonology ◽

10.3765/amp.v1i1.17 ◽

2014 ◽

Vol 1 (1) ◽

Cited By ~ 3

Author(s):

Emily Gasser ◽

Claire Bowern

Keyword(s):

Optimality Theory ◽

Phylogenetic Diversity ◽

Large Scale ◽

Vowel Harmony ◽

Language Family ◽

Large Scale Survey ◽

Lexical Items ◽

Lexical Data ◽

Australian Languages

Australian languages are famous for their uniform phonological systems. Cross-linguistic surveys of (or including) Australian languages have reinforced this view of Australian inventories and phonotactics. Such uniformity is surprising and unusual given the phylogenetic diversity in the country (28 phylic families). Moreover, although Australianists have assumed that uniformity in phonemic inventory is coupled with unity in phonotactics, this has not been tested. Here we statistically test the generalizations current in the literature on Australian languages by deriving inventory information from lexical data (rather than grammatical descriptions). We utilize a comparative database of lexical items from predominantly Pama-Nyungan languages in order to test published generalizations about phoneme inventories, phonotactics, and other phenomena (such as root internal vowel harmony patterns). By using lexical materials to derive inventories and segment frequencies, we are able to assemble a nuanced picture of the diversity of systems present among the languages. Inventory studies confirm, to some degree, the impression of uniformity. However, phoneme frequencies vary substantially across the sample even among languages with similar inventory types. This work is of particular importance to phonological typologies of Australian languages, but it has implications for wider phonological theory as well. The survey used here is the largest comparative database of a single language family. Rarely do we have the opportunity to conduct a large-scale typological investigation of related languages in this way. We also make a contribution to the role of typology in Optimality Theory. A large-scale survey of markedness patterns (in related languages) allows us to study occurring and non-occurring grammars. Finally, we can investigate the predictions of competing theories.


A Bayesian Phylogenetic Classification of Tupí-Guaraní

LIAMES Línguas Indígenas Americanas ◽

10.20396/liames.v15i2.8642301 ◽

2015 ◽

Vol 15 (2) ◽

pp. 193 ◽

Cited By ~ 4

Author(s):

Lev Michael ◽

Natalia Chousou-Polydouri ◽

Keith Bartolomei ◽

Erin Donnelly ◽

Sérgio Meira ◽

...

Keyword(s):

Phylogenetic Analysis ◽

Internal Structure ◽

Character Table ◽

Loss Model ◽

Phylogenetic Classification ◽

Bayesian Phylogenetic Analysis ◽

Binary Coding ◽

Semantic Shift ◽

Lexical Data

This paper presents an internal classification of Tupí-Guaraní based on lexical data from 30 Tupí-Guaraní languages and 2 non-Tupí-Guaraní Tupian languages, Awetí and Mawé. A Bayesian phylogenetic analysis using a generalized binary cognate gain and loss model was carried out on a character table based on the binary coding of cognate sets, which were formed with attention to semantic shift. The classification shows greater internal structure than previous ones, but is congruent with them in several ways.
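The binary coding of cognate sets mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the language names, meanings, and cognate class labels are invented toy data, and `binary_code` is a hypothetical helper. Each attested (meaning, cognate class) pair becomes one presence/absence character, and a meaning with no recorded form is coded as missing (`?`).

```python
def binary_code(cognates):
    """Convert {language: {meaning: cognate_class or None}} into binary characters.

    Returns (characters, matrix), where `characters` is a sorted list of
    (meaning, cognate_class) pairs and `matrix` maps each language to a
    string over "1" (has the class), "0" (has another class), "?" (no data).
    """
    # One character per cognate class attested for a meaning.
    characters = sorted(
        {
            (meaning, cls)
            for forms in cognates.values()
            for meaning, cls in forms.items()
            if cls is not None
        }
    )
    matrix = {}
    for lang, forms in cognates.items():
        row = []
        for meaning, cls in characters:
            attested = forms.get(meaning)
            if attested is None:
                row.append("?")  # no form recorded for this meaning
            else:
                row.append("1" if attested == cls else "0")
        matrix[lang] = "".join(row)
    return characters, matrix


# Invented toy data: cognate class labels per meaning for three languages.
toy = {
    "LangA": {"hand": "A", "water": "A"},
    "LangB": {"hand": "A", "water": "B"},
    "LangC": {"hand": "B", "water": None},
}

characters, matrix = binary_code(toy)
for lang, row in matrix.items():
    print(lang, row)
```

A matrix of this shape is the kind of character table that gain-and-loss models take as input, with each column treated as a trait that can be gained or lost along the tree.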
