Dataset Information

Predictions of native American population structure using linguistic covariates in a hidden regression framework.

ABSTRACT:

BACKGROUND: The mainland of the Americas is home to a remarkable diversity of languages, and the relationships between genes and languages have attracted considerable attention in the past. Here we investigate to which extent geography and languages can predict the genetic structure of Native American populations.

METHODOLOGY/PRINCIPAL FINDINGS: Our approach is based on a Bayesian latent cluster regression model in which cluster membership is explained by geographic and linguistic covariates. After correcting for geographic effects, we find that the inclusion of linguistic information improves the prediction of individual membership to genetic clusters. We further compare the predictive power of Greenberg's and The Ethnologue classifications of Amerindian languages. We report that The Ethnologue classification provides a better genetic proxy than Greenberg's classification at the stock and at the group levels. Although high predictive values can be achieved from The Ethnologue classification, we nevertheless emphasize that Choco, Chibchan and Tupi linguistic families do not exhibit a univocal correspondence with genetic clusters.

CONCLUSIONS/SIGNIFICANCE: The Bayesian latent class regression model described here is efficient at predicting population genetic structure using geographic and linguistic information in Native American populations.
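The abstract's central idea, cluster membership explained by covariates, can be illustrated with a small sketch. This is not the authors' implementation: it assumes a multinomial-logistic (softmax) link between covariates and prior membership probabilities, which is one common choice in latent class regression, and every name, coefficient and likelihood value below is hypothetical.

```python
import math


def softmax(scores):
    """Turn arbitrary real scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def prior_membership(covariates, coefficients):
    """Prior probability of each cluster given an individual's covariates.

    covariates   -- list of floats (e.g. latitude, longitude, language indicator)
    coefficients -- one (intercept, weights) pair per cluster (hypothetical values)
    """
    scores = [
        intercept + sum(w * x for w, x in zip(weights, covariates))
        for intercept, weights in coefficients
    ]
    return softmax(scores)


def posterior_membership(prior, likelihoods):
    """Combine the covariate-based prior with per-cluster genotype likelihoods."""
    weighted = [p * lik for p, lik in zip(prior, likelihoods)]
    total = sum(weighted)
    return [w / total for w in weighted]


# Illustrative numbers only: two covariates, three clusters.
covariates = [0.4, 1.0]
coefficients = [(0.0, [0.0, 0.0]), (0.1, [1.2, -0.3]), (-0.2, [-0.5, 0.8])]
prior = prior_membership(covariates, coefficients)
posterior = posterior_membership(prior, [0.02, 0.05, 0.01])
print(prior, posterior)
```

In this sketch, adding a linguistic covariate simply extends each weight vector by one entry; whether that improves prediction is the kind of question the study evaluates.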

SUBMITTER: Jay F

PROVIDER: S-EPMC3031544 | biostudies-other | 2011

REPOSITORIES: biostudies-other

Publications

Predictions of native American population structure using linguistic covariates in a hidden regression framework.

Flora Jay, Olivier François, Michael G. B. Blum

PLoS ONE, 2011-01-31, issue 1

PMID: 21305006

Similar Datasets

Native American Population Genetics
| PRJEB2060 | ENA

Gene flow across linguistic boundaries in Native North American populations.
| S-EPMC547813 | biostudies-literature

Hierarchical structure guides rapid linguistic predictions during naturalistic listening.
| S-EPMC6334990 | biostudies-literature

Reconstructing Native American population history.
| S-EPMC3615710 | biostudies-literature

Communicative predictions can overrule linguistic priors.
| S-EPMC5730607 | biostudies-literature

Understanding the Hidden Complexity of Latin American Population Isolates.
| S-EPMC6218714 | biostudies-literature

Complementing machine learning-based structure predictions with native mass spectrometry.
| S-EPMC9123603 | biostudies-literature

Native American admixture in the Quebec founder population.
| S-EPMC3680396 | biostudies-literature

Bayesian measurement-error-driven hidden Markov regression model for calibrating the effect of covariates on multistate outcomes: Application to androgenetic alopecia.
| S-EPMC6120552 | biostudies-literature

Generalized Regression Estimators with High-Dimensional Covariates.
| S-EPMC7313320 | biostudies-literature