Abstract
Languages, like genes, provide vital clues about human history [1,2]. The origin of the Indo-European language family is “the most intensively studied, yet still most recalcitrant, problem of historical linguistics” [3]. Numerous genetic studies of Indo-European origins have also produced inconclusive results [4,5,6]. Here we analyse linguistic data using computational methods derived from evolutionary biology. We test two theories of Indo-European origin: the ‘Kurgan expansion’ and the ‘Anatolian farming’ hypotheses. The Kurgan theory centres on possible archaeological evidence for an expansion into Europe and the Near East by Kurgan horsemen beginning in the sixth millennium BP [7,8]. In contrast, the Anatolian theory claims that Indo-European languages expanded with the spread of agriculture from Anatolia around 8,000–9,500 years BP [9]. In striking agreement with the Anatolian hypothesis, our analysis of a matrix of 87 languages with 2,449 lexical items produced an estimated age range for the initial Indo-European divergence of between 7,800 and 9,800 years BP. These results were robust to changes in coding procedures, calibration points, rooting of the trees and priors in the Bayesian analysis.
References
1. Pagel, M. in Time Depth in Historical Linguistics (eds Renfrew, C., McMahon, A. & Trask, L.) 189–207 (The McDonald Institute for Archaeological Research, Cambridge, UK, 2000).
2. Gray, R. D. & Jordan, F. M. Language trees support the express-train sequence of Austronesian expansion. Nature 405, 1052–1055 (2000).
3. Diamond, J. & Bellwood, P. Farmers and their languages: the first expansions. Science 300, 597–603 (2003).
4. Richards, M. et al. Tracing European founder lineages in the Near Eastern mtDNA pool. Am. J. Hum. Genet. 67, 1251–1276 (2000).
5. Semino, O. et al. The genetic legacy of Paleolithic Homo sapiens in extant Europeans: a Y chromosome perspective. Science 290, 1155–1159 (2000).
6. Chikhi, L., Nichols, R. A., Barbujani, G. & Beaumont, M. A. Y genetic data support the Neolithic Demic Diffusion Model. Proc. Natl Acad. Sci. USA 99, 11008–11013 (2002).
7. Gimbutas, M. The beginning of the Bronze Age in Europe and the Indo-Europeans 3500–2500 B.C. J. Indo-Eur. Stud. 1, 163–214 (1973).
8. Mallory, J. P. In Search of the Indo-Europeans: Languages, Archaeology and Myth (Thames & Hudson, London, 1989).
9. Renfrew, C. in Time Depth in Historical Linguistics (eds Renfrew, C., McMahon, A. & Trask, L.) 413–439 (The McDonald Institute for Archaeological Research, Cambridge, UK, 2000).
10. Swadesh, M. Lexico-statistic dating of prehistoric ethnic contacts. Proc. Am. Phil. Soc. 96, 453–463 (1952).
11. Bergsland, K. & Vogt, H. On the validity of glottochronology. Curr. Anthropol. 3, 115–153 (1962).
12. Blust, R. in Time Depth in Historical Linguistics (eds Renfrew, C., McMahon, A. & Trask, L.) 311–332 (The McDonald Institute for Archaeological Research, Cambridge, UK, 2000).
13. Steel, M. A., Hendy, M. D. & Penny, D. Loss of information in genetic distances. Nature 333, 494–495 (1988).
14. Swofford, D. L., Olsen, G. J., Waddell, P. J. & Hillis, D. M. in Molecular Systematics (eds Hillis, D., Moritz, C. & Mable, B. K.) 407–514 (Sinauer Associates, Inc, Sunderland, Massachusetts, 1996).
15. Dixon, R. M. W. The Rise and Fall of Languages (Cambridge Univ. Press, Cambridge, UK, 1997).
16. Metropolis, N., Rosenbluth, A. W., Rosenbluth, M. N., Teller, A. H. & Teller, E. Equation of state calculations by fast computing machines. J. Chem. Phys. 21, 1087–1091 (1953).
17. Huelsenbeck, J. P., Ronquist, F., Nielsen, R. & Bollback, J. P. Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294, 2310–2314 (2001).
18. Huson, D. H. SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics 14, 68–73 (1998).
19. Sanderson, M. R8s, Analysis of Rates of Evolution, Version 1.50 (Univ. California, Davis, 2002).
20. Dyen, I., Kruskal, J. B. & Black, P. FILE IE-DATA1. Available at 〈http://www.ntu.edu.au/education/langs/ielex/IE-DATA1〉 (1997).
21. Sanderson, M. J. & Donoghue, M. J. Patterns of variation in levels of homoplasy. Evolution 43, 1781–1795 (1989).
22. Gamkrelidze, T. V. & Ivanov, V. V. Trends in Linguistics 80: Indo-European and the Indo-Europeans (Mouton de Gruyter, Berlin, 1995).
23. Rexova, K., Frynta, D. & Zrzavy, J. Cladistic analysis of languages: Indo-European classification based on lexicostatistical data. Cladistics 19, 120–127 (2003).
24. Ringe, D., Warnow, T. & Taylor, A. Indo-European and computational cladistics. Trans. Philol. Soc. 100, 59–129 (2002).
25. Gkiasta, M., Russell, T., Shennan, S. & Steele, J. Neolithic transition in Europe: the radiocarbon record revisited. Antiquity 77, 45–62 (2003).
26. Cavalli-Sforza, L. L., Menozzi, P. & Piazza, A. The History and Geography of Human Genes (Princeton Univ. Press, Princeton, 1994).
27. Holden, C. J. Bantu language trees reflect the spread of farming across sub-Saharan Africa: a maximum-parsimony analysis. Proc. R. Soc. Lond. B 269, 793–799 (2002).
28. Barbrook, A. C., Howe, C. J., Blake, N. & Robinson, P. The phylogeny of The Canterbury Tales. Nature 394, 839 (1998).
29. McMahon, A. & McMahon, R. Finding families: Quantitative methods in language classification. Trans. Philol. Soc. 101, 7–55 (2003).
30. Huelsenbeck, J. P. & Ronquist, F. MRBAYES: Bayesian inference of phylogeny. Bioinformatics 17, 754–755 (2001).
Acknowledgements
We thank S. Allan, L. Campbell, L. Chikhi, M. Corballis, N. Gavey, S. Greenhill, J. Hamm, J. Huelsenbeck, G. Nichols, A. Rodrigo, F. Ronquist, M. Sanderson and S. Shennan for useful advice and/or comments on the manuscript.
Ethics declarations
Competing interests
The authors declare that they have no competing financial interests.
Cite this article
Gray, R., Atkinson, Q. Language-tree divergence times support the Anatolian theory of Indo-European origin. Nature 426, 435–439 (2003). https://doi.org/10.1038/nature02029