Population history of the Sardinian people inferred from whole-genome sequencing

10.1101/092148 ◽

2016 ◽

Cited By ~ 5

Author(s):

Charleston W K Chiang ◽

Joseph H Marcus ◽

Carlo Sidore ◽

Hussein Al-Asadi ◽

Magdalena Zoledziewska ◽

...

Keyword(s):

Bronze Age ◽

Disease Risk ◽

Association Studies ◽

Demographic History ◽

Population History ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Risk Alleles ◽

Mediterranean Island ◽

History Of

Abstract:

The population of the Mediterranean island of Sardinia has made important contributions to genome-wide association studies of traits and diseases. The history of the Sardinian population has also been the focus of much research, and in recent ancient DNA (aDNA) studies, Sardinia has provided unique insight into the peopling of Europe and the spread of agriculture. In this study, we analyze whole-genome sequences of 3,514 Sardinians to address hypotheses regarding the founding of Sardinia and its relation to the peopling of Europe, including examining fine-scale substructure, population size history, and signals of admixture. We find the population of the mountainous Gennargentu region shows elevated genetic isolation with higher levels of ancestry associated with mainland Neolithic farmers and depleted ancestry associated with more recent Bronze Age Steppe migrations on the mainland. Notably, the Gennargentu region also has elevated levels of pre-Neolithic hunter-gatherer ancestry and increased affinity to Basque populations. Further, allele sharing with pre-Neolithic and Neolithic mainland populations is larger on the X chromosome compared to the autosomes, providing evidence for a sex-biased demographic history in Sardinia. These results give new insight into the demography of ancestral Sardinians and help further the understanding of sharing of disease risk alleles between Sardinia and mainland populations.

Human demographic history impacts genetic risk prediction across diverse populations

10.1101/070797 ◽

2016 ◽

Cited By ~ 7

Author(s):

Alicia R. Martin ◽

Christopher R. Gignoux ◽

Raymond K. Walters ◽

Genevieve L. Wojcik ◽

Benjamin M. Neale ◽

...

Keyword(s):

Risk Prediction ◽

Large Scale ◽

Disease Risk ◽

Association Studies ◽

Demographic History ◽

Population History ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Summary Statistics ◽

Medical Genomics

Abstract:

The vast majority of genome-wide association studies are performed in Europeans, and their transferability to other populations is dependent on many factors (e.g. linkage disequilibrium, allele frequencies, genetic architecture). As medical genomics studies become increasingly large and diverse, gaining insights into population history and consequently the transferability of disease risk measurement is critical. Here, we disentangle recent population history in the widely-used 1000 Genomes Project reference panel, with an emphasis on populations underrepresented in medical studies. To examine the transferability of single-ancestry GWAS, we used published summary statistics to calculate polygenic risk scores for six well-studied traits and diseases. We identified directional inconsistencies in all scores; for example, height is predicted to decrease with genetic distance from Europeans, despite robust anthropological evidence that West Africans are as tall as Europeans on average. To gain deeper quantitative insights into GWAS transferability, we developed a complex trait coalescent-based simulation framework considering effects of polygenicity, causal allele frequency divergence, and heritability. As expected, correlations between true and inferred risk were typically highest in the population from which summary statistics were derived. We demonstrated that scores inferred from European GWAS were biased by genetic drift in other populations even when choosing the same causal variants, and that biases in any direction were possible and unpredictable. This work cautions that summarizing findings from large-scale GWAS may have limited portability to other populations using standard approaches, and highlights the need for generalized risk prediction methods and the inclusion of more diverse individuals in medical genomics.
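
For readers unfamiliar with the term, a polygenic risk score is conventionally a weighted sum of an individual's risk-allele dosages, with weights taken from GWAS effect sizes. The sketch below illustrates only that general definition; the function name, effect sizes and genotypes are made-up assumptions, and it does not reproduce the variant selection or simulation framework used in the study above.

```python
def polygenic_risk_score(dosages, effect_sizes):
    """Weighted sum of risk-allele dosages (0, 1 or 2) by per-variant effect size."""
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Hypothetical per-variant effect sizes, standing in for GWAS summary statistics.
betas = [0.5, -0.25, 0.125]

# Hypothetical risk-allele counts for two individuals at the same three variants.
person_a = [2, 0, 1]
person_b = [0, 2, 1]

score_a = polygenic_risk_score(person_a, betas)
score_b = polygenic_risk_score(person_b, betas)
print(score_a, score_b)
```

Because the weights are estimated in one population, the same formula applied to another population can yield shifted scores when allele frequencies or linkage patterns differ, which is the transferability concern the abstract raises.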

How genetic disease risks can be misestimated across global populations

10.1101/195768 ◽

2017 ◽

Author(s):

Michelle S Kim ◽

Kane P Patel ◽

Andrew K Teng ◽

Ali J Berens ◽

Joseph Lachance

Keyword(s):

Genetic Disease ◽

Risk Allele ◽

Association Studies ◽

Allele Frequencies ◽

Risk Scores ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Risk Alleles ◽

Disease Associations ◽

Disease Risks

Abstract:

Background: Accurate assessment of health disparities requires unbiased knowledge of genetic risks in different populations. Unfortunately, most genome-wide association studies use genotyping arrays and European samples. Here, we integrate whole genome sequence data from global populations, results from thousands of GWAS, and extensive computer simulations to identify how genetic disease risks can be misestimated.

Results: In contrast to null expectations, we find that risk allele frequencies at known disease loci are significantly different for African populations compared to other continents. Strikingly, ancestral risk alleles are found at 9.51% higher frequency in Africa and derived risk alleles are found at 5.40% lower frequency in Africa. By simulating GWAS with different study populations, we find that non-African cohorts yield disease associations that have biased allele frequencies and that African cohorts yield disease associations that are relatively free of bias. We also find empirical evidence that genotyping arrays and SNP ascertainment bias contribute to continental differences in risk allele frequencies. Because of these causes, polygenic risk scores can be grossly misestimated for individuals of African descent. Importantly, continental differences in risk allele frequencies are only moderately reduced if GWAS use whole genome sequences and hundreds of thousands of cases and controls. Finally, comparisons between uncorrected and corrected genetic risk scores reveal the benefits of considering whether risk alleles are ancestral or derived.

Conclusions: Our results imply that caution must be taken when extrapolating GWAS results from one population to predict disease risks in another population.

ECP06-01 - Genome-wide association studies of psychiatric phenotypes: What they have told us and what to do next

European Psychiatry ◽

10.1016/s0924-9338(11)73507-6 ◽

2011 ◽

Vol 26 (S2) ◽

pp. 1803-1803

Author(s):

T.G. Schulze

Keyword(s):

Psychiatric Disorders ◽

Disease Risk ◽

Association Studies ◽

Future Research ◽

Allelic Heterogeneity ◽

List Type ◽

Genome Wide Association Studies ◽

Monogenic Disorders ◽

Complex Genetics ◽

Risk Alleles

Genome-wide association studies of psychiatric disorders have highlighted several novel susceptibility genes and taught us several important lessons.

1) Psychiatric disorders are polygenic disorders. The contribution of each locus to risk of disease is modest and disease risk increases substantially with the total burden of risk alleles carried.

2) The best findings from GWAS do not necessarily fall within those genes that have previously been widely studied.

3) Pursuing a “top-hits-only” strategy may prevent us from understanding the genetic complexity of psychiatric disorders and polygenic disorders in general. A detailed consideration of the wider distribution of association signals across studies may prove to be a valuable strategy in complex genetics.

4) Allelic heterogeneity may be an important factor in psychiatric disorders. Allelic heterogeneity means that a phenotype can be caused by different alleles within a gene; this phenomenon has been extensively observed in monogenic disorders such as cystic fibrosis as well as in BRCA1/2-associated breast cancer.

5) Finally, as with other complex phenotypes, GWAS in psychiatric disorders demonstrate that the variants identified so far only account for a small fraction of genetic variability.

Future research will need to embark on several complementary approaches in order to account for the as yet “unexplained” part of the variance. These will include, among others, sequencing projects, pharmacogenetic studies, detailed genotype-phenotype dissection approaches, and the study of prospectively assessed phenotypes.

Novel Alzheimer’s disease risk variants identified based on whole-genome sequencing of APOE ε4 carriers

Translational Psychiatry ◽

10.1038/s41398-021-01412-9 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Jong-Ho Park ◽

Inho Park ◽

Emilia Moonkyung Youm ◽

Sejoon Lee ◽

June-Hee Park ◽

...

Keyword(s):

Alzheimer's Disease ◽

Whole Genome Sequencing ◽

Genome Sequencing ◽

Disease Risk ◽

Association Studies ◽

European Ancestry ◽

Genome Wide Association Studies ◽

Whole Genome ◽

ε4 Allele

Abstract:

Alzheimer’s disease (AD) is a progressive neurodegenerative disease associated with a complex genetic etiology. Besides the apolipoprotein E ε4 (APOE ε4) allele, a few dozen other genetic loci associated with AD have been identified through genome-wide association studies (GWAS) conducted mainly in individuals of European ancestry. Recently, several GWAS performed in other ethnic groups have shown the importance of replicating studies that identify previously established risk loci and searching for novel risk loci. APOE-stratified GWAS have yielded novel AD risk loci that might be masked by, or be dependent on, APOE alleles. We performed whole-genome sequencing (WGS) on DNA from blood samples of 331 AD patients and 169 elderly controls of Korean ethnicity who were APOE ε4 carriers. Based on WGS data, we designed a customized AD chip (cAD chip) for further analysis on an independent set of 543 AD patients and 894 elderly controls of the same ethnicity, regardless of their APOE ε4 allele status. Combined analysis of WGS and cAD chip data revealed that SNPs rs1890078 (P = 6.64E−07) and rs12594991 (P = 2.03E−07) in SORCS1 and CHD2 genes, respectively, are novel genetic variants among APOE ε4 carriers in the Korean population. In addition, nine possible novel variants that were rare in individuals of European ancestry but common in East Asia were identified. This study demonstrates that APOE-stratified analysis is important for understanding the genetic background of AD in different populations.

Integrative genetic and epigenetic analysis uncovers regulatory mechanisms of autoimmune disease

10.1101/054361 ◽

2016 ◽

Cited By ~ 3

Author(s):

Parisa Shooshtari ◽

Hailieng Huang ◽

Chris Cotsapas

Keyword(s):

Disease Risk ◽

Inflammatory Diseases ◽

Association Studies ◽

Single Gene ◽

Genome Wide Association Studies ◽

Regulatory Regions ◽

Risk Alleles ◽

Molecular Events ◽

Public Data ◽

Autoimmune And Inflammatory Diseases

Genome-wide association studies in autoimmune and inflammatory diseases (AID) have uncovered hundreds of loci mediating risk [1,2]. These associations are preferentially located in non-coding DNA regions [3,4], in particular in tissue-specific DNase I hypersensitivity sites (DHS) [5,6]. Whilst these analyses clearly demonstrate the overall enrichment of disease risk alleles in gene regulatory regions, they are not designed to identify individual regulatory regions mediating risk or the genes under their control, and thus to uncover the specific molecular events driving disease risk. To do so we have departed from standard practice by identifying regulatory regions which replicate across samples, and connecting them to the genes they control through robust re-analysis of public data. We find substantial evidence of regulatory potential in 132/301 (44%) risk loci across nine autoimmune and inflammatory diseases, and are able to prioritize a single gene in 104/132 (79%) of these. Thus, we are able to generate testable mechanistic hypotheses of the molecular changes that drive disease risk.

Unique roles of rare variants in the genetics of complex diseases in humans

Journal of Human Genetics ◽

10.1038/s10038-020-00845-2 ◽

2020 ◽

Vol 66 (1) ◽

pp. 11-23

Author(s):

Yukihide Momozawa ◽

Keijiro Mizukami

Keyword(s):

Rare Variants ◽

Disease Risk ◽

Association Studies ◽

Complex Diseases ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Sequencing Analysis ◽

Common Variants ◽

Distinctive Features ◽

Genome Wide

Abstract:

Genome-wide association studies have identified >10,000 genetic variants associated with various phenotypes and diseases. Although the majority are common variants, rare variants with >0.1% of minor allele frequency have been investigated by imputation and using disease-specific custom SNP arrays. Rare variants, revealed mainly by sequencing analysis, have played unique roles in the genetics of complex diseases in humans due to their distinctive features, in contrast to common variants. These unique roles are hypothesis-free evidence for gene causality, a precise target of functional analysis for understanding disease mechanisms, a new favorable target for drug development, and a genetic marker with high disease risk for personalized medicine. As whole-genome sequencing continues to identify more rare variants, the roles associated with rare variants will also increase. However, a better estimation of the functional impact of rare variants across the whole genome is needed to enhance their contribution to improvements in human health.

Medaka population genome structure and demographic history described via genotyping-by-sequencing

10.1101/233411 ◽

2017 ◽

Author(s):

Takafumi Katsumura ◽

Shoji Oda ◽

Mitani Hiroshi ◽

Hiroki Oota

Keyword(s):

Population Structure ◽

Disease Risk ◽

Association Studies ◽

Demographic History ◽

Genotyping By Sequencing ◽

Genetic Population Structure ◽

Genome Wide Association Studies ◽

Genetic Population ◽

Genome Wide ◽

Genomic Study

Abstract:

Medaka is a model organism in medicine, genetics, developmental biology and population genetics. Lab stocks composed of more than 100 local wild populations are available for research in these fields. Thus, medaka represents a potentially excellent bioresource for screening disease-risk- and adaptation-related genes in genome-wide association studies. Although the genetic population structure should be known before performing such an analysis, a comprehensive study on the genome-wide diversity of wild medaka populations has not been performed. Here, we performed genotyping-by-sequencing (GBS) for 81 and 12 medakas captured from a bioresource and the wild, respectively. Based on the GBS data, we evaluated the genetic population structure and estimated the demographic parameters using an approximate Bayesian computation (ABC) framework. The autosomal data confirmed that there were substantial differences between local populations and supported our previously proposed hypothesis on medaka dispersal based on mitochondrial genome (mtDNA) data. A new finding was that a local group that was thought to be a hybrid between the northern and the southern Japanese groups was actually a sister group of the northern Japanese group. Thus, this paper presents the first population-genomic study of medaka and reveals its population structure and history based on autosomal diversity.

Long-range linkage disequilibrium in French beef cattle breeds

Genetics Selection Evolution ◽

10.1186/s12711-021-00657-8 ◽

2021 ◽

Vol 53 (1) ◽

Author(s):

Abdelmajid El Hou ◽

Dominique Rocha ◽

Eric Venot ◽

Véronique Blanquet ◽

Romain Philippe

Keyword(s):

Linkage Disequilibrium ◽

Beef Cattle ◽

Long Range ◽

Association Studies ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Animal Populations ◽

History Of ◽

First Time ◽

French Beef

Abstract:

Background: Linkage disequilibrium (LD) is a key parameter to study the history of populations and to identify and fine map quantitative trait loci (QTL) and it has been studied for many years in animal populations. The advent of new genotyping technologies has allowed whole-genome LD studies in most cattle populations. However, to date, long-range LD (LRLD) between distant variants on the genome has not been investigated in detail in cattle. Here, we present the first comprehensive study of LRLD in French beef cattle by analysing data on 672 Charolais (CHA), 462 Limousine (LIM) and 326 Blonde d’Aquitaine (BLA) individuals that were genotyped on the Illumina BovineHD Beadchip. Furthermore, whole-genome LD and haplotype block structure were analysed in these three breeds.

Results: We computed linkage disequilibrium (r²) values for 5.9, 5.6 and 6.0 billion pairs of SNPs on the 29 autosomes of CHA, LIM and BLA, respectively. Mean r² values drop to less than 0.1 for distances between SNPs greater than 120 kb. However, for the first time, we detected the existence of LRLD in the three main French beef breeds. In total, 598, 266, and 795 LRLD events (r² ≥ 0.6) were detected in CHA, LIM and BLA, respectively. Each breed had predominantly population-specific LRLD interactions, although shared LRLD events occurred in a number of regions (55 LRLD events were shared between two breeds and nine between the three breeds). Examples of possible functional gene interactions and QTL co-location were observed with some of these LRLD events, which suggests epistatic selection.

Conclusions: We identified long-range linkage disequilibrium for the first time in French beef cattle populations. Epistatic selection may be the main source of the observed LRLD events, but other forces may also be involved. LRLD information should be accounted for in genome-wide association studies.
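
The r² statistic reported above has a standard textbook definition for two biallelic SNPs: the squared covariance of the alleles divided by the product of their variances. The sketch below computes it from phased haplotypes. This is an illustrative assumption only, since the abstract does not say how r² was computed from the chip genotypes, and the function name and example data are hypothetical.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, alleles coded 0/1 at SNP A and SNP B.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r2 is undefined when either SNP is monomorphic")
    return d * d / denom


# Made-up data: alleles always co-occur in the first set, are independent in the second.
perfect_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(perfect_ld), ld_r2(no_ld))
```

Under this definition, an r² threshold such as the 0.6 used above for LRLD events marks SNP pairs whose alleles are strongly correlated despite being far apart on the genome.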

Integration of Alzheimer’s disease genetics and myeloid genomics identifies disease risk regulatory elements and genes

Nature Communications ◽

10.1038/s41467-021-21823-y ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Gloriia Novikova ◽

Manav Kapoor ◽

Julia TCW ◽

Edsel M. Abud ◽

Anastasia G. Efthymiou ◽

...

Keyword(s):

Gene Expression ◽

Alzheimer's Disease ◽

Disease Risk ◽

Association Studies ◽

Regulatory Elements ◽

Genome Wide Association Studies ◽

Risk Alleles ◽

Functional Variants ◽

Analytical Approaches

Abstract:

Genome-wide association studies (GWAS) have identified more than 40 loci associated with Alzheimer’s disease (AD), but the causal variants, regulatory elements, genes and pathways remain largely unknown, impeding a mechanistic understanding of AD pathogenesis. Previously, we showed that AD risk alleles are enriched in myeloid-specific epigenomic annotations. Here, we show that they are specifically enriched in active enhancers of monocytes, macrophages and microglia. We integrated AD GWAS with myeloid epigenomic and transcriptomic datasets using analytical approaches to link myeloid enhancer activity to target gene expression regulation and AD risk modification. We identify AD risk enhancers and nominate candidate causal genes among their likely targets (including AP4E1, AP4M1, APBB3, BIN1, MS4A4A, MS4A6A, PILRA, RABEP1, SPI1, TP53INP1, and ZYX) in twenty loci. Fine-mapping of these enhancers nominates candidate functional variants that likely modify AD risk by regulating gene expression in myeloid cells. In the MS4A locus we identified a single candidate functional variant and validated it in human induced pluripotent stem cell (hiPSC)-derived microglia and brain. Taken together, this study integrates AD GWAS with multiple myeloid genomic datasets to investigate the mechanisms of AD risk alleles and nominates candidate functional variants, regulatory elements and genes that likely modulate disease susceptibility.

Family-based gene-environment interaction using sequence kernel association test (FGE-SKAT) for complex quantitative traits

Scientific Reports ◽

10.1038/s41598-021-86871-2 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Chao-Yu Guo ◽

Reng-Hong Wang ◽

Hsin-Chou Yang

Keyword(s):

Complex Traits ◽

Association Studies ◽

Association Test ◽

Whole Genome Sequence ◽

Environment Interaction ◽

Genome Wide Association Studies ◽

Whole Genome ◽

Sequence Kernel Association Test ◽

Gene Environment ◽

Family Based

Abstract:

After the genome-wide association studies (GWAS) era, whole-genome sequencing is highly engaged in identifying the association of complex traits with rare variations. A score-based variance-component test has been proposed to identify common and rare genetic variants associated with complex traits while quickly adjusting for covariates. Such a kernel score statistic allows for familial dependencies and adjusts for random confounding effects. However, the etiology of complex traits may involve the effects of genetic and environmental factors and the complex interactions between genes and the environment. Therefore, in this research, a novel method is proposed to detect gene and gene-environment interactions in a complex family-based association study with various correlated structures. We also developed an R function for the Fast Gene-Environment Sequence Kernel Association Test (FGE-SKAT), which is freely available as supplementary material for easy GWAS implementation to unveil such family-based joint effects. Simulation studies confirmed the validity of the new strategy and its superior statistical power. The FGE-SKAT was applied to the whole genome sequence data provided by Genetic Analysis Workshop 18 (GAW18) and discovered concordant and discordant regions compared to methods that do not consider gene-by-environment interactions.