A model for the evolution of languages

Rafael Izbicki, Joseph B. Kadane

Abstract

Historical Linguistics studies language change over time. If a group of languages derives from changes to a common ancestor language (a proto-language), they are said to be related. When written records of an ancestor language are lacking, a central question in Historical Linguistics is whether two languages are related. The gold standard for finding these relationships is the Comparative Method. Despite its success in finding language relationships, the Comparative Method has at least two limitations. First, it requires the manual comparison of various features from a group of languages. Second, it does not provide a numerical measure of how strongly the data under consideration support a hypothesis. Given these limitations, the field of Computational Historical Linguistics has been proposed as a complement to the Comparative Method. This field has recently expanded through the adaptation of methods from biological phylogenetics. Nevertheless, there is debate over whether the evolutionary models used in phylogenetics rest on valid linguistic assumptions. In this tutorial, I will present a model for inferring the evolution of the phonology of languages. A key innovation of this model is that it captures the regularity of sound changes. To compute the probability of linguistic hypotheses regarding language relationships, a new algorithm was developed. The main problem this algorithm overcomes is the efficient exploration of the possible regular sound changes, that is, mutations in languages that simultaneously affect several words. The algorithm is based on a new variant of Nested Sequential Monte Carlo that is used to explore the large space of language relationships and regular sound changes.

Date

Jun 1, 2017 1:00 PM — 3:00 PM

Event

MaxEnt 2017

Phylology Historical Linguistics Computational Linguistics