Detecting Heterogeneity in Population Structure Across the Genome in Admixed Populations

2015 ◽  
Author(s):  
Caitlin McHugh ◽  
Timothy A Thornton ◽  
Lisa Brown

The genetic structure of human populations is often characterized by aggregating measures of ancestry across the autosomal chromosomes. While it may be reasonable to assume that population structure patterns are similar genome-wide in relatively homogeneous populations, this assumption may not be appropriate for admixed populations, such as Hispanics and African Americans, with recent ancestry from two or more continents. Recent studies have suggested that systematic ancestry differences can arise at genomic locations in admixed populations as a result of selection and non-random mating. Here, we propose a method, which we refer to as the chromosomal ancestry differences (CAnD) test, for detecting heterogeneity in population structure across the genome. CAnD uses local ancestry inferred from SNP genotype data to identify chromosomes harboring genomic regions with ancestry contributions that are significantly different from what is expected. In simulation studies with real genotype data from Phase III of the HapMap Project, we demonstrate the validity and power of CAnD. We apply CAnD to the HapMap Mexican American (MXL) and African American (ASW) population samples; in this analysis the software RFMix is used to infer local ancestry at genomic regions assuming admixing from Europeans, West Africans, and Native Americans. The CAnD test provides strong evidence of heterogeneity in population structure across the genome in the MXL sample ($p = 4 \times 10^{-5}$), which is largely driven by elevated Native American ancestry and a deficit of European ancestry on the X chromosomes. Among the ASW, all chromosomes are largely African-derived and no heterogeneity in population structure is detected in this sample.
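The idea of testing whether one chromosome's ancestry departs from the rest of the genome can be illustrated with a small sketch. It compares each individual's ancestry proportion on a chosen chromosome with that individual's mean proportion across the remaining chromosomes and forms a paired t statistic. This is a simplified illustration under stated assumptions, not the authors' CAnD implementation: the function name, the toy proportions, and the use of a plain paired t statistic are all assumptions made here.

```python
import math
from statistics import mean, stdev


def paired_t_for_chromosome(props, chrom):
    """Paired t statistic for one chromosome versus the rest (illustrative only).

    props: list of dicts, one per individual, mapping a chromosome label
           to that individual's ancestry proportion on the chromosome.
    chrom: label of the chromosome being tested.
    """
    diffs = []
    for person in props:
        # Mean ancestry proportion over all chromosomes except the tested one.
        others = [v for k, v in person.items() if k != chrom]
        diffs.append(person[chrom] - mean(others))
    n = len(diffs)
    # Standard paired t statistic: mean difference over its standard error.
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Hypothetical Native American ancestry proportions for four individuals.
toy = [
    {"1": 0.40, "2": 0.42, "X": 0.55},
    {"1": 0.50, "2": 0.48, "X": 0.60},
    {"1": 0.30, "2": 0.34, "X": 0.50},
    {"1": 0.45, "2": 0.43, "X": 0.52},
]
t_x = paired_t_for_chromosome(toy, "X")
print(f"paired t for X: {t_x:.2f}")
```

A large positive statistic would indicate elevated ancestry on the tested chromosome relative to the others. A real analysis would also need a p-value from the t distribution with n − 1 degrees of freedom and a correction for testing several chromosomes.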

2011 ◽  
Vol 93 (2) ◽  
pp. 105-114 ◽  
Author(s):  
LEEYOUNG PARK

Summary: In order to estimate the effective population size (Ne) of the current human population, two new approaches, which were derived from previous methods, were used in this study. One is based on the deviation from linkage equilibrium (LE) between completely unlinked loci on different chromosomes, and the other is based on the deviation from Hardy–Weinberg Equilibrium (HWE). When random mating in a population is assumed, genetic drift in the population naturally induces linkage disequilibrium (LD) between chromosomes and deviation from HWE. The latter provides information on the Ne of the current population, and the former provides the same when the Ne is constant. If Ne fluctuates, recent Ne changes are reflected in the estimates based on LE, and the comparison between the two estimates can provide information regarding recent changes of Ne. Using HapMap Phase III data, the estimates varied from 622 to 10,437, depending on the population and the estimator. The Ne appeared to fluctuate, as the two methods provided different estimates. These Ne estimates were found to agree approximately with the overall increment observed in recent human populations.


2017 ◽  
Vol 14 (128) ◽  
pp. 20170057 ◽  
Author(s):  
Luciana W. Zuccherato ◽  
Silvana Schneider ◽  
Eduardo Tarazona-Santos ◽  
Robert J. Hardwick ◽  
Douglas E. Berg ◽  
...  

While multiallelic copy number variation (mCNV) loci are a major component of genomic variation, quantifying the individual copy number of a locus and defining genotypes is challenging. Few methods exist to study how mCNV genetic diversity is apportioned within and between populations (i.e. to define the population genetic structure of mCNV). These inferences are critical in populations with a small effective size, such as Amerindians, that may not fit the Hardy–Weinberg model due to inbreeding, assortative mating, population subdivision, natural selection or a combination of these evolutionary factors. We propose a likelihood-based method that simultaneously infers mCNV allele frequencies and the population structure parameter f, which quantifies the departure of homozygosity from the Hardy–Weinberg expectation. This method is implemented in the freely available software CNVice, which also infers individual genotypes using information from both the population and from trios, if available. We studied the population genetics of five immune-related mCNV loci associated with complex diseases (beta-defensins, CCL3L1/CCL4L1, FCGR3A, FCGR3B and FCGR2C) in 12 traditional Native American populations and found that the population structure parameters inferred for these mCNVs are comparable to but lower than those for single nucleotide polymorphisms studied in the same populations.
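To make the role of the parameter f concrete, the sketch below computes genotype probabilities for a multiallelic locus under the standard textbook model in which homozygotes have probability p_i² + f·p_i(1 − p_i) and heterozygotes have probability 2·p_i·p_j·(1 − f). This is only an illustration of how f shifts homozygosity away from the Hardy–Weinberg expectation; it is not the CNVice likelihood method, and the function name and allele frequencies are hypothetical.

```python
from itertools import combinations_with_replacement


def genotype_probs(allele_freqs, f):
    """Genotype probabilities for a multiallelic locus with structure parameter f.

    allele_freqs: dict mapping an allele label (here, a per-haplotype copy
                  number) to its population frequency; values must sum to 1.
    f: departure from Hardy-Weinberg; f = 0 recovers the HW proportions.
    """
    probs = {}
    for a, b in combinations_with_replacement(sorted(allele_freqs), 2):
        pa, pb = allele_freqs[a], allele_freqs[b]
        if a == b:
            # Homozygotes are inflated by f relative to p^2.
            probs[(a, b)] = pa * pa + f * pa * (1 - pa)
        else:
            # Heterozygotes are deflated by the factor (1 - f).
            probs[(a, b)] = 2 * pa * pb * (1 - f)
    return probs


# Hypothetical frequencies of haplotypes carrying 1, 2 or 3 copies.
freqs = {1: 0.2, 2: 0.5, 3: 0.3}
hw = genotype_probs(freqs, 0.0)
structured = genotype_probs(freqs, 0.1)
homozygosity = sum(p for (a, b), p in structured.items() if a == b)
print(f"homozygosity with f = 0.1: {homozygosity:.3f}")
```

A likelihood-based approach would sum such genotype probabilities over all genotypes compatible with each individual's observed total copy number, which is why allele frequencies and f have to be inferred jointly.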


2021 ◽  
Vol 39 (15_suppl) ◽  
pp. 6549-6549
Author(s):  
Irbaz Bin Riaz ◽  
Mahnoor Islam ◽  
Ahsan Masood Khan ◽  
Syed Arsalan Ahmed Naqvi ◽  
Rabbia Siddiqi ◽  
...  

6549 Background: Representation and outcomes of women, older adults, and racial minorities in ICI trials have not been previously described. Methods: MEDLINE and Embase were searched to identify ICI RCTs. Data on trial characteristics, the proportion of trials reporting race, age and sex, and the proportion of patients by race, age and sex enrolled in ICI trials were collected. Descriptive statistics were reported for trials reporting minority representation and proportion of included patients by race, age and sex. Disparities in representation were calculated using enrollment incidence disparity (EID) and enrollment incidence ratios (EIR) by comparing trial enrollment against U.S. population-based estimates acquired from the SEER 18 incidence database. The relationship of EID to key trial characteristics was assessed using standard parametric and non-parametric statistical tests. Trends in EIR were analyzed using the Joinpoint Regression Analysis software. Results: 108 ICI trials from 2009 to 2020 with 48,360 patients were included in this analysis. All RCTs reported sex (101/101). 78 trials reported race (72%), of which only 41 trials (38%) reported data on all 5 U.S. racial categories (Black, White, Asian, Pacific Islander and Native American). Participation of Black patients was reported in 66 trials (61%), White participants in 78 trials (72%), Asians in 69 trials (64%), Native Americans and Pacific Islanders in 41 trials (38%), and Hispanics in 24 trials (22%). Age categories were inconsistently defined, and 80 trials (74%) reported the proportion of patients by age category. Subgroup analyses of clinical outcomes by race, age and sex were reported in 17 (22%), 62 (79%) and 57 (73%) trials respectively.
Women (trial proportion [TP]: 32%; EIR: 0.77), patients aged ≥ 65 years (TP: 42%; EIR: 0.74), Black participants (TP: 1.8%; EIR: 0.17) and Hispanic participants (TP: 5.9%; EIR: 0.67) were largely underrepresented, and Asians were overrepresented (TP: 15.9%; EIR: 2.64). Black patients were underrepresented across all cancer types. Similarly, women, older adults (> 65 years of age) and Hispanic patients were consistently underrepresented across cancer types with few exceptions. Representation of older adults increased significantly from 2010-2020 (APC: 2.72), while representation of Black patients decreased significantly from 2009-2020 (APC: -23.37). Black patients were found to be significantly underrepresented in phase III trials (p = 0.0005), trials with OS as the primary endpoint (p = 0.004), and PD1 inhibitor trials (p = 0.002). Hispanics were significantly underrepresented in PD1 inhibitor trials (p = 0.003). Conclusions: There is both suboptimal reporting about participation and underrepresentation of women, racial minorities (particularly Black patients) and older adults in ICI trials as compared to their cancer incidence.
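The two disparity measures can be sketched in a few lines, assuming the commonly used definitions: EIR as the ratio of a group's share of trial enrollment to its share of disease incidence, and EID as the difference between the two shares. The abstract does not spell out its formulas, so these definitions are an assumption, as are the function names. The trial share for women (32%) comes from the abstract; the incidence share of 41.6% is a hypothetical value chosen to be roughly consistent with the reported EIR, not a SEER figure.

```python
def enrollment_incidence_ratio(trial_share, incidence_share):
    """EIR: a group's share of trial enrollment divided by its share of incidence.

    Values below 1 suggest underrepresentation, values above 1 overrepresentation.
    """
    if incidence_share <= 0:
        raise ValueError("incidence_share must be positive")
    return trial_share / incidence_share


def enrollment_incidence_disparity(trial_share, incidence_share):
    """EID: enrollment share minus incidence share (negative = underrepresented)."""
    return trial_share - incidence_share


trial_share_women = 0.32       # reported trial proportion for women
incidence_share_women = 0.416  # hypothetical incidence share, for illustration

eir = enrollment_incidence_ratio(trial_share_women, incidence_share_women)
eid = enrollment_incidence_disparity(trial_share_women, incidence_share_women)
print(f"EIR = {eir:.2f}, EID = {eid:+.3f}")
```

The ratio is scale-free, which makes it convenient for comparing groups of very different sizes, while the difference conveys the absolute gap in percentage points.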


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Nicole R. Gay ◽  
Michael Gloudemans ◽  
Margaret L. Antonio ◽  
Nathan S. Abell ◽  
...  

Abstract: Background: Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the v8 release also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx improves portability of this research across populations and further characterizes the impact of population structure on GWAS colocalization. Results: Here, we identify a subset of 117 individuals in GTEx (v8) with a high degree of population admixture and estimate genome-wide local ancestry. We perform genome-wide cis-eQTL mapping using admixed samples in seven tissues, adjusted by either global or local ancestry. Consistent with previous work, we observe improved power with local ancestry adjustment. At loci where the two adjustments produce different lead variants, we observe 31 loci (0.02%) where a significant colocalization is called only with one eQTL ancestry adjustment method. Notably, both adjustments produce similar numbers of significant colocalizations within each of two different colocalization methods, COLOC and FINEMAP. Finally, we identify a small subset of eQTL-associated variants highly correlated with local ancestry, providing a resource to enhance functional follow-up. Conclusions: We provide a local ancestry map for admixed individuals in the GTEx v8 release and describe the impact of ancestry and admixture on gene expression, eQTLs, and GWAS colocalization. While the majority of the results are concordant between local and global ancestry-based adjustments, we identify distinct advantages and disadvantages to each approach.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Andrew J. Pakstis ◽  
William C. Speed ◽  
Usha Soundararajan ◽  
Haseena Rajeevan ◽  
Judith R. Kidd ◽  
...  

Abstract: The benefits of ancestry informative SNP (AISNP) panels can best accrue and be properly evaluated only as sufficient reference population data become readily accessible. Ideally the set of reference populations should approximate the genetic diversity of human populations worldwide. The Kidd and Seldin AISNP sets are two panels that have separately accumulated thus far the largest and most diverse collections of data on human reference populations from the major continental regions. A recent tally in the ALFRED allele frequency database finds 164 reference populations available for all the 55 Kidd AISNPs and 132 reference populations for all the 128 Seldin AISNPs. Although much more of the genetic diversity in human populations around the world still needs to be documented, 81 populations have genotype data available for all 170 AISNPs in the union of the Kidd and Seldin panels. In this report we examine admixture and principal component analyses on these 81 worldwide populations and some regional subsets of these reference populations to determine how well the combined panel illuminates population relationships. Analyses of this dataset that focused on Native American populations revealed very strong cluster patterns associated with many of the individual populations studied.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Rodrigo Secolin ◽  
Alex Mas-Sandoval ◽  
Lara R. Arauna ◽  
Fábio R. Torres ◽  
Tânia K. de Araujo ◽  
...  

Abstract: Admixed American populations have different global proportions of European, Sub-Saharan African, and Native-American ancestry. However, individuals who display the same global ancestry could exhibit remarkable differences in the distribution of local ancestry blocks. We studied for the first time the distribution of local ancestry across the genome of 264 Brazilian admixed individuals, ascertained within the scope of the Brazilian Initiative on Precision Medicine. We found a decreased proportion of European ancestry together with an excess of Native-American ancestry on chromosome 8p23.1 and showed that this is due to haplotypes created by chromosomal inversion events. Furthermore, Brazilian non-inverted haplotypes were more similar to Native-American haplotypes than to European haplotypes, in contrast to what was found in other American admixed populations. We also identified signals of recent positive selection on chromosome 8p23.1, and one gene within this locus, PPP1R3B, is related to glycogenesis and has been associated with an increased risk of type 2 diabetes and obesity. These findings point to a selection event after admixture, which is still not entirely understood in recent admixture events.


2019 ◽  
Author(s):  
Nicole R. Gay ◽  
Michael Gloudemans ◽  
Margaret L. Antonio ◽  
Brunilda Balliu ◽  
YoSon Park ◽  
...  

Abstract: Background: Population structure among study subjects may confound genetic association studies, and lack of proper correction can lead to spurious findings. The Genotype-Tissue Expression (GTEx) project largely contains individuals of European ancestry, but the final release (v8) also includes up to 15% of individuals of non-European ancestry. Assessing ancestry-based adjustments in GTEx provides an opportunity to improve portability of this research across populations and to further measure the impact of population structure on GWAS colocalization. Results: Here, we identify a subset of 117 individuals in GTEx (v8) with a high degree of population admixture and estimate genome-wide local ancestry. We perform genome-wide cis-eQTL mapping using admixed samples in six tissues, adjusted by either global or local ancestry. Consistent with previous work, we observe improved power with local ancestry adjustment. At loci where the two adjustments produce different lead variants, we observe only 0.8% of tests with GWAS colocalization posterior probabilities that change by 10% or more. Notably, both adjustments produce similar numbers of significant colocalizations. Finally, we identify a small subset of GTEx v8 eQTL-associated variants highly correlated with local ancestry (R² > 0.7), providing a resource to enhance functional follow-up. Conclusions: We provide a local ancestry map for admixed individuals in the final GTEx release and describe the impact of ancestry and admixture on gene expression, eQTLs, and GWAS colocalization. While the majority of results are concordant between local and global ancestry-based adjustments, we identify distinct advantages and disadvantages to each approach.


2018 ◽  
Vol 115 (17) ◽  
pp. E4006-E4012 ◽  
Author(s):  
Constanza de la Fuente ◽  
María C. Ávila-Arcos ◽  
Jacqueline Galimany ◽  
Meredith L. Carpenter ◽  
Julian R. Homburger ◽  
...  

Patagonia was the last region of the Americas reached by humans who entered the continent from Siberia ∼15,000–20,000 y ago. Despite recent genomic approaches to reconstruct the continental evolutionary history, regional characterization of ancient and modern genomes remains understudied. Exploring the genomic diversity within Patagonia is not just a valuable strategy to gain a better understanding of the history and diversification of human populations in the southernmost tip of the Americas, but it would also improve the representation of Native American diversity in global databases of human variation. Here, we present genome data from four modern populations from Central Southern Chile and Patagonia (n = 61) and four ancient maritime individuals from Patagonia (∼1,000 y old). Both the modern and ancient individuals studied in this work have a greater genetic affinity with other modern Native Americans than to any non-American population, showing within South America a clear structure between major geographical regions. Native Patagonian Kawéskar and Yámana showed the highest genetic affinity with the ancient individuals, indicating genetic continuity in the region during the past 1,000 y before present, together with an important agreement between the ethnic affiliation and historical distribution of both groups. Lastly, the ancient maritime individuals were genetically equidistant to a ∼200-y-old terrestrial hunter-gatherer from Tierra del Fuego, which supports a model with an initial separation of a common ancestral group to both maritime populations from a terrestrial population, with a later diversification of the maritime groups.


2016 ◽  
Author(s):  
Noah Zaitlen ◽  
Scott Huntsman ◽  
Donglei Hu ◽  
Melissa Spear ◽  
Celeste Eng ◽  
...  

Abstract: Statistical models in medical and population genetics typically assume that individuals assort randomly in a population. While this simplifies model complexity, it contradicts an increasing body of evidence of non-random mating in human populations. Specifically, it has been shown that assortative mating is significantly affected by genomic ancestry. In this work we examine the effects of ancestry-assortative mating on the linkage disequilibrium between local ancestry tracks of individuals in an admixed population. To accomplish this, we develop an extension to the Wright-Fisher model that allows for ancestry-based assortative mating. We show that ancestry-assortment perturbs the distribution of local ancestry linkage disequilibrium (LAD) and the variance of ancestry in a population as a function of the number of generations since admixture. This assortment effect can induce errors in demographic inference of admixed populations when methods assume random mating. We derive closed-form formulae for LAD under an assortative-mating model with and without migration. We observe that LAD depends on the correlation of global ancestry of couples in each generation, the migration rate of each of the ancestral populations, the initial proportions of ancestral populations, and the number of generations since admixture. We also present the first evidence of ancestry-assortment in African Americans and examine LAD in simulated and real admixed population data of African Americans. We find that demographic inference under the assumption of random mating significantly underestimates the number of generations since admixture, and that accounting for assortative mating using the patterns of LAD results in estimates that more closely agree with the historical narrative.


2011 ◽  
Author(s):  
Elizabeth Focella ◽  
Jessica Whitehead ◽  
Jeff Stone ◽  
Stephanie Fryberg ◽  
Rebecca Covarrubias
