admixture proportion
Recently Published Documents



2022 ◽  
Author(s):  
Colin M Brand ◽  
Frances J White ◽  
Alan R Rogers ◽  
Timothy H Webster

Introgression appears increasingly ubiquitous in the evolutionary history of various taxa, including humans. However, accurately estimating introgression is difficult, particularly when 1) there are many parameters, 2) multiple models fit the data well, and 3) parameters are not simultaneously estimated. Here, we use the software Legofit to investigate the evolutionary history of bonobos (Pan paniscus) and chimpanzees (P. troglodytes) using whole genome sequences. This approach 1) ignores within-population variation, reducing the number of parameters requiring estimation, 2) allows for model selection, and 3) simultaneously estimates all parameters. We tabulated site patterns from the autosomes of 71 bonobos and chimpanzees representing all five extant Pan lineages. We then compared previously proposed demographic models and estimated parameters using a deterministic approach. We further considered sex bias in Pan evolutionary history by analyzing the site patterns from the X chromosome. Introgression from bonobos into the ancestor of eastern and central chimpanzees and from western into eastern chimpanzees best explained the autosomal site patterns. This second event was substantial with an estimated 0.21 admixture proportion. Estimates of effective population size and most divergence dates are consistent with previous findings; however, we observe a deeper divergence within chimpanzees at 987 ka. Finally, we identify male-biased reproduction in Pan evolutionary history and suggest that western to eastern chimpanzee introgression was driven by western males mating with eastern females.


2022 ◽  
Author(s):  
Siddharth Avadhanam ◽  
Amy L Williams

Population genetic analyses of local ancestry tracts routinely assume that the ancestral admixture process is identical for both parents of an individual, an assumption that may be invalid when considering recent admixture. Here we present Parental Admixture Proportion Inference (PAPI), a Bayesian tool for inferring the admixture proportions and admixture times for each parent of a single admixed individual. PAPI analyzes unphased local ancestry tracts and has two component models: a binomial model that exploits the informativeness of homozygous ancestry regions to infer parental admixture proportions, and a hidden Markov model (HMM) that infers admixture times from tract lengths. Crucially, the HMM employs an approximation to the pedigree crossover dynamics that accounts for unobserved within-ancestry recombination, enabling inference of parental admixture times. We compared the accuracy of PAPI's admixture proportion estimates with those of ANCESTOR in simulated admixed individuals and found that PAPI outperforms ANCESTOR by an average of 46% in a representative set of simulation scenarios, with PAPI's estimates deviating from the ground truth by 0.047 on average. Moreover, PAPI's admixture time estimates were strongly correlated with the ground truth in these simulations (R = 0.76), but had an average downward bias of 1.01 generations that is partly attributable to inaccuracies in local ancestry inference. As an illustration of its utility, we ran PAPI on real African Americans from the PAGE study (N = 5,786) and found strong evidence of assortative mating by ancestry proportion: couples' ancestry proportions are closer to each other than expected by chance (P < 10⁻⁶), and are highly correlated (R = 0.87). We anticipate that PAPI will be useful in studying the population dynamics of admixture and will also be of interest to individuals seeking to learn about their personal genealogies.
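
To make the role of homozygous ancestry regions concrete, here is a minimal method-of-moments sketch. It is not PAPI's Bayesian model. It assumes two source ancestries (A and B), independent loci, and that each parent transmits ancestry A with probability equal to that parent's admixture proportion. The function name and its inputs are hypothetical.

```python
import math

def parental_proportions(frac_hom_a, frac_het):
    """Recover the two parental ancestry-A proportions (p1, p2).

    Assumed toy model (not PAPI itself):
      P(homozygous A) = p1 * p2
      P(heterozygous) = p1 * (1 - p2) + p2 * (1 - p1)
    so p1 + p2 = frac_het + 2 * frac_hom_a and p1 * p2 = frac_hom_a.
    The pair is then the two roots of t^2 - (p1 + p2) * t + p1 * p2 = 0.
    """
    total = frac_het + 2.0 * frac_hom_a      # p1 + p2
    product = frac_hom_a                     # p1 * p2
    # Clamp tiny negative values caused by floating-point rounding.
    disc = max(total * total - 4.0 * product, 0.0)
    root = math.sqrt(disc)
    # Unphased data cannot say which parent is which, so return (larger, smaller).
    return (total + root) / 2.0, (total - root) / 2.0

# Example: one parent 90% ancestry A, the other 50%.
p1, p2 = 0.9, 0.5
hom_a = p1 * p2
het = p1 * (1 - p2) + p2 * (1 - p1)
print(parental_proportions(hom_a, het))
```

The overall ancestry proportion alone gives only the sum of the two parental proportions. The homozygous fraction supplies the product, which is what separates the two parents in this toy setting.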


2021 ◽  
Author(s):  
Junxia Yuan ◽  
Michael Vincent Westbury ◽  
Shungang Chen ◽  
Jiaming Hu ◽  
Fengli Zhang ◽  
...  

The extinct Camelus knoblochi is known as the largest camel in the genus Camelus, but its relationship to modern Camelus species remains unclear. In this study, we report the first mitochondrial and nuclear analyses of seven Late Pleistocene C. knoblochi samples from Northeastern China. We found that they are inseparable from the wild Bactrian camel on the matrilineal side, but belong to a distinct cluster on the biparental nuclear side. Further admixture proportion analyses suggested hybrid ancestry from the ancestors of both the modern wild and domesticated Bactrian camels, with ~65% contribution from the former and ~35% from the latter. By calculating the coalescence time for the three Camelus species above, we estimated that the hybridization event occurred between approximately 0.8 and 0.33 Ma. We also used Bayesian skyline analysis to reconstruct the maternal demographic trajectories of different Camelus species to better compare their evolutionary histories. Our results provide molecular insights into C. knoblochi and fill in a vital piece in understanding the genus Camelus.


2021 ◽  
Vol 12 ◽  
Author(s):  
Jing Chen ◽  
Guanglin He ◽  
Zheng Ren ◽  
Qiyan Wang ◽  
Yubo Liu ◽  
...  

As a major part of the modern Trans-Eurasian or Altaic language family, most of the Mongolic and Tungusic languages were mainly spoken in northern China, Mongolia, and southern Siberia, but some were also found in southern China. Previous genetic surveys focused only on dissecting the genetic structure of northern Altaic-speaking populations; however, the ancestral origin and genomic diversification of Mongolic and Tungusic-speaking populations from southwestern East Asia remain poorly understood because of the paucity of high-density sampling and genome-wide data. Here, we generated genome-wide data at nearly 700,000 single-nucleotide polymorphisms (SNPs) in 26 Mongolians and 55 Manchus collected from Guizhou province in southwestern China. We applied principal component analysis (PCA), ADMIXTURE, f statistics, qpWave/qpAdm analysis, qpGraph, TreeMix, Fst, and ALDER to infer the fine-scale population genetic structure and admixture history. We found significant genetic differentiation between northern and southern Mongolic and Tungusic speakers, as one specific genetic cline of Manchu and Mongolian was identified in Guizhou province. Further results from ADMIXTURE and f statistics showed that the studied Guizhou Mongolians and Manchus had a strong genetic affinity with southern East Asians, especially inland southern East Asians. The qpAdm-based estimates of ancestry admixture proportion demonstrated that Guizhou Mongolians and Manchus could be modeled as admixtures of one northern ancestry related to northern Tungusic/Mongolic speakers or Yellow River farmers and one southern ancestry associated with Austronesian, Tai-Kadai, and Austroasiatic speakers. The qpGraph-based phylogeny and neighbor-joining tree further confirmed that Guizhou Manchus and Mongolians derived approximately half of their ancestry from their northern ancestors and the other half from southern Indigenous East Asians.
The estimated admixture time ranged from 600 to 1,000 years ago, which further confirmed that the admixture events were mediated by the Mongol Empire expansion during the formation of the Yuan dynasty.


2021 ◽  
Vol 12 ◽  
Author(s):  
Yan Liu ◽  
Mengge Wang ◽  
Pengyu Chen ◽  
Zheng Wang ◽  
Jing Liu ◽  
...  

The Tibetan Plateau (TP) is considered to be one of the last terrestrial environments conquered by anatomically modern humans. Understanding the genetic background of highland Tibetans plays a pivotal role in archeology, anthropology, genetics, and forensic investigations. Here, we genotyped 22 forensic genetic markers in 1,089 Tibetans residing in Nagqu Prefecture and collected 1,233,013 single nucleotide polymorphisms (SNPs) from public databases for highland East Asians (Sherpa and Tibetan) from the Simons Genome Diversity Project, ancient Tibetans from Nepal, and Neolithic farmers from the northeastern Qinghai-Tibetan Plateau. We subsequently merged our two datasets with other worldwide reference populations or eastern ancient Eurasians to gain new insights into the genetic diversity, population movements, and admixtures of high-altitude East Asians via comprehensive population genetic statistical tools [principal component analysis (PCA), multidimensional scaling plot (MDS), STRUCTURE/ADMIXTURE, f3, f4, qpWave/qpAdm, and qpGraph]. We also explored their forensic characteristics and extended the Chinese National Database based on STR data. We identified 231 alleles with the corresponding allele frequencies spanning from 0.0005 to 0.5624 in the forensic low-density dataset, in which the combined powers of discrimination and the probability of exclusion were 1–1.22E-24 and 0.999999998, respectively. Additionally, comprehensive population comparisons in our low-density data among 57 worldwide populations via Nei's genetic distance, PCA, MDS, NJ tree, and STRUCTURE analysis indicated that the highland Tibeto-Burman speakers maintained a close genetic relationship with ethnically close populations.
Findings from the 1240K high-density dataset not only confirmed the close genetic connection between modern Highlanders, Nepal ancients (Samdzong, Mebrak, and Chokhopani), and the upper Yellow River Qijia people, suggesting that the northeastern edge of the TP served as a geographical corridor for ancient population migrations and interactions between highland and lowland regions, but also provided evidence that late Neolithic farmers permanently colonized the TP by adopting cold-tolerant barley agriculture. This adoption was mediated by the acculturation of ideas via millet farmers and not by the movement of barley agriculturalists, as no obvious western Eurasian admixture signals were identified in our analyzed modern and ancient populations. In addition, results from the qpAdm-based admixture proportion estimation and qpGraph-based phylogenetic relationship reconstruction consistently demonstrated that all ancient and modern highland East Asians harbored and shared the deeply diverged Onge/Hoabinhian-related eastern Eurasian lineage, suggesting that a common Paleolithic genetic legacy existed in high-altitude East Asians as the first layer of their gene pool.


2020 ◽  
Author(s):  
Yusuke Watanabe ◽  
Jun Ohashi

Modern Japanese are considered to derive from a mixture of two major ancestral populations: the indigenous Jomon people and immigrants from continental East Asia. Since most existing methods for detecting genetic components from ancestral populations require their genomes, ancestral genomic components in Japanese could not be detected so far due to the lack of precisely sequenced ancient Jomon genomes. To overcome this difficulty, we developed a reference-free detection method using a novel summary statistic, the ancestry-marker index (AMI). We applied the AMI to modern Japanese samples from the 1000 Genomes Project and identified 208,648 ancestry-marker SNPs that were likely derived from the Jomon people (Jomon-derived SNPs). Comparing the Jomon allele score detected in this study with modern Japanese and two ancient Jomon individuals showed that the Jomon-derived SNPs were detected with high accuracy by the AMI in real data, and that the Jomon-derived SNPs were detected by several tens of times from a single Jomon individual by the AMI. The analysis of Jomon-derived SNPs in 10,842 modern Japanese individuals recruited from all 47 prefectures of Japan showed that the genetic differences among the prefectures were mainly caused by differences in the admixture proportion of the Jomon people, owing to differences in the population size of immigrants from the Final Jomon to the Yayoi period. We also confirmed the presence of the Jomon alleles around phenotype-associated SNPs characteristic of East Asians to clarify whether these phenotypes of modern Japanese were derived from the Jomon people.


2020 ◽  
Author(s):  
Gonzalo Oteo–García ◽  
José–Angel Oteo

A detailed derivation of the f-statistics formalism is made from a geometrical framework. It is shown that the f-statistics appear when a genetic distance matrix is constrained to describe a four population phylogenetic tree. The choice of genetic metric is crucial and plays an outstanding role as regards the tree-like-ness criterion. The case of lack of treeness is interpreted in the formalism as presence of population admixture. In this respect, four formulas are given to estimate the admixture proportions. One of them is the so-called f4-ratio estimate and we show that a second one is related to a known result developed in terms of the fixation index FST. An illustrative numerical simulation of admixture proportion estimates is included. Relationships of the formalism with coalescence times and pairwise sequence differences are also provided.
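
As a rough illustration of the f4-ratio estimate mentioned above, the sketch below uses the commonly cited form alpha = f4(A, O; X, C) / f4(A, O; B, C), where X is admixed between B-related and C-related sources, A is a sister population of B, and O is an outgroup. It does not reproduce the paper's geometric derivation or its own simulation. The toy frequency simulator, its drift parameters, and all function names are assumptions made for this example.

```python
import random

def f4(p_a, p_b, p_c, p_d):
    """f4(A, B; C, D): mean over loci of (a - b) * (c - d)."""
    products = [(a - b) * (c - d) for a, b, c, d in zip(p_a, p_b, p_c, p_d)]
    return sum(products) / len(products)

def f4_ratio(p_a, p_o, p_x, p_b, p_c):
    """Estimate the fraction of X's ancestry that is B-related."""
    return f4(p_a, p_o, p_x, p_c) / f4(p_a, p_o, p_b, p_c)

def simulate_frequencies(alpha, n_loci=2000, seed=1):
    """Toy allele frequencies (hypothetical model, no sampling noise).

    O and C drift independently from a root frequency; A and B share a
    drifted ancestor. X is an exact alpha : (1 - alpha) mix of B and C.
    """
    rng = random.Random(seed)

    def drift(p, sd):
        # Gaussian perturbation clipped to the valid frequency range.
        return min(1.0, max(0.0, p + rng.gauss(0.0, sd)))

    pops = {name: [] for name in "AOBCX"}
    for _ in range(n_loci):
        root = rng.uniform(0.1, 0.9)
        o = drift(root, 0.05)
        c = drift(root, 0.05)
        ab = drift(root, 0.10)   # common ancestor of A and B
        a = drift(ab, 0.03)
        b = drift(ab, 0.03)
        x = alpha * b + (1.0 - alpha) * c
        for name, value in zip("AOBCX", (a, o, b, c, x)):
            pops[name].append(value)
    return pops

pops = simulate_frequencies(alpha=0.3)
estimate = f4_ratio(pops["A"], pops["O"], pops["X"], pops["B"], pops["C"])
print(round(estimate, 6))
```

Because X is constructed here as an exact linear mixture with no drift after admixture and no sampling noise, the ratio recovers the simulated alpha up to floating-point error. With real genotype data both statistics are noisy, and standard errors are usually obtained by block jackknife.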


2019 ◽  
Author(s):  
Li-Ju Wang ◽  
Catherine W. Zhang ◽  
Sophia C. Su ◽  
Hung-I H. Chen ◽  
Yu-Chiao Chiu ◽  
...  

Background: Europeans and American Indians are the major genetic ancestries of Hispanics in the U.S. These ancestral groups have markedly different incidence rates and outcomes in many types of cancer. Genetic admixture may therefore bias genetic association studies of cancer susceptibility variants, specifically in Hispanics. The incidence rate and genetic mutational pattern of liver cancer have been shown to differ substantially between Hispanic, Asian, and non-Hispanic white populations. Currently, ancestry informative marker (AIM) panels with up to a few hundred ancestry-informative single nucleotide polymorphisms (SNPs) are widely used to infer ancestry admixture. Notably, currently available AIMs are predominantly located in intronic and intergenic regions, whereas the whole exome sequencing (WES) protocols commonly used in translational research and clinical practice do not cover these markers. The challenge is therefore to accurately determine a patient's admixture proportion without subjecting the patient to additional DNA testing.
Methods: Here we designed a bioinformatics pipeline to obtain an AIM panel. The panel infers 3-way genetic admixture from three distinct continental populations (African (AFR), European (EUR), and East Asian (EAS)), constrained to evolutionarily conserved exome regions. Briefly, we extracted ∼1 million exonic SNPs from all individuals of the three populations in the 1000 Genomes Project. The SNPs were then trimmed by their linkage disequilibrium (LD), restricted to biallelic variants only, and assembled into an AIM panel of the markers with the top ancestral informativeness based on the In-statistic. The selected AIM panel was applied to a training dataset and a clinical dataset. Finally, the ancestral proportions of each individual were estimated by STRUCTURE.
Results: In this study, the optimally selected AIM panel with 250 markers, or the UT-AIM250 panel, performed with better accuracy as one of the published AIM panels when we tested with 3 ancestral populations (Accuracy: 0.995 ± 0.012 for AFR, 0.997 ± 0.007 for EUR, and 0.994 ± 0.012 for EAS). We demonstrated the utility of the UT-AIM250 panel on the admixed American (AMR) population of the 1000 Genomes Project and obtained results (AFR: 0.085 ± 0.098; EUR: 0.665 ± 0.182; and EAS: 0.250 ± 0.205) similar to those of previously published AIM panels (Phillips-AIM34: AFR: 0.096 ± 0.127, EUR: 0.575 ± 0.29, and EAS: 0.330 ± 0.315; Wei-AIM278: AFR: 0.070 ± 0.096, EUR: 0.537 ± 0.267, and EAS: 0.393 ± 0.300) with no significant difference (Pearson correlation, P < 10⁻⁵⁰, n = 347 samples). Subsequently, we applied the UT-AIM250 panel to clinical datasets of self-reported Hispanic patients in South Texas with hepatocellular carcinoma (26 patients). Our estimated admixture proportions from adjacent non-cancer liver tissue data of Hispanics in South Texas were AFR: 0.065 ± 0.043; EUR: 0.594 ± 0.150; and EAS: 0.341 ± 0.160, with smaller variation due to the unique Texan/Mexican American population in South Texas. We also obtained similar admixture proportions from the corresponding tumor tissue. In addition, we estimated admixture proportions for the entire TCGA-LIHC sample set (376 patients) using the UT-AIM250 panel. We demonstrated that our AIM panel estimates consistent admixture proportions from DNA derived from tumor and normal tissues, identifies 2 possibly incorrect reported race/ethnicity entries, and/or provides race/ethnicity determination if necessary.
Conclusions: Taken together, we demonstrated the feasibility of using evolutionarily conserved exome regions to distinguish genetic ancestry descendants based on 3 continental-ancestry proportions, providing a robust and reliable control for sample collection or patient stratification for genetic analysis. An R implementation of UT-AIM250 is available at https://github.com/chenlabgccri/UT-AIM250.
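
The marker-ranking step described in this abstract can be sketched as follows. This is a minimal Python illustration of the In-statistic (informativeness for assignment, as defined by Rosenberg et al. 2003) for biallelic SNPs, assuming equal weight for each source population. It is not the UT-AIM250 pipeline, which the authors provide in R; the function names, SNP identifiers, and allele frequencies below are hypothetical, and LD pruning is omitted.

```python
import math

def informativeness(freqs):
    """In-statistic for one biallelic SNP.

    freqs: frequency of one allele in each of K source populations.
    I_n = sum over alleles j of
          [ -p_j * log(p_j) + (1 / K) * sum_i p_ij * log(p_ij) ],
    where p_j is the mean frequency of allele j across populations.
    """
    k = len(freqs)

    def xlogx(p):
        # Convention: 0 * log(0) = 0.
        return p * math.log(p) if p > 0 else 0.0

    total = 0.0
    for allele in (freqs, [1.0 - p for p in freqs]):
        mean_p = sum(allele) / k
        total += -xlogx(mean_p) + sum(xlogx(p) for p in allele) / k
    return total

def top_markers(snp_freqs, n):
    """Return the n SNP ids with the highest In-statistic."""
    ranked = sorted(snp_freqs, key=lambda s: informativeness(snp_freqs[s]), reverse=True)
    return ranked[:n]

# Hypothetical allele frequencies in (AFR, EUR, EAS).
snps = {
    "snp_1": [0.95, 0.10, 0.05],
    "snp_2": [0.50, 0.45, 0.55],
    "snp_3": [0.20, 0.80, 0.15],
}
print(top_markers(snps, 2))
```

The statistic is zero when all populations share the same allele frequency and reaches log K when an allele perfectly separates K populations, so sorting by it favors SNPs with large frequency differences between source populations.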

