Rare variant enriched identity-by-descent enables the detection of distant relatedness and older divergence between populations

2020 ◽  
Author(s):  
Amol C. Shetty ◽  
Jeffrey O’Connell ◽  
Braxton D. Mitchell ◽  
Timothy D. O’Connor ◽  
◽  
...  

Motivation: The global human population has grown explosively, from a few million to roughly 7 billion people, in the last 10,000 years. Accompanying this growth has been the accumulation of rare variants that can inform our understanding of human evolutionary history. Common variants have primarily been used to infer the structure of the human population and relatedness between two individuals. However, with the increasing abundance of rare variants observed in large-scale projects such as Trans-Omics for Precision Medicine (TOPMed), using rare variants to decipher cryptic relatedness and fine-scale population structure can benefit the study of population demographics and association studies. Identity-by-descent (IBD) is an important framework for identifying these relationships. IBD segments are broken down by recombination over time, such that longer shared haplotypes give strong evidence of recent relatedness while shorter shared haplotypes indicate more distant relationships. Current methods to identify IBD accurately detect only long segments (> 2 cM) found in related individuals. Algorithm: We describe a metric that leverages rare variants shared between individuals to improve the detection of short IBD segments. We computed IBD segments using existing methods implemented in Refined IBD and enriched the signal using our metric, which facilitates the detection of short IBD segments (< 2 cM) by explicitly incorporating rare variants. Results: To test our new metric, we simulated datasets involving populations with varying divergence time scales. We show that rare-variant IBD identifies shorter segments with greater confidence and enables the detection of older divergence between populations. As an example, we applied our metric to the Old-Order Amish cohort, with known genealogies dating 14 generations back, to validate its ability to detect genetic relatedness between distant relatives. This analysis shows that our method increases the accuracy of identifying shorter segments, which in turn capture distant relationships.
Conclusions: We describe a method to enrich the detection of short IBD segments using rare-variant sharing within IBD segments. Leveraging rare-variant sharing improves the information content of short IBD segments relative to using common variants alone. We validated the method in both simulated and empirical datasets. This method can benefit association analyses, IBD mapping analyses, and demographic inferences.
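
The link between segment length and relatedness described in the abstract above can be illustrated with a standard textbook approximation. This is not the paper's rare-variant metric, which the abstract does not specify. The sketch assumes that two individuals sharing a common ancestor g generations back are separated by about 2g meioses, and that the genetic length of a shared segment is roughly exponentially distributed with mean 100/(2g) cM. The function names are hypothetical.

```python
import math


def expected_ibd_length_cm(generations: int) -> float:
    """Mean IBD segment length (cM) for a common ancestor `generations` back.

    Assumes 2 * generations meioses separate the pair and an exponential
    length distribution (a textbook approximation, not the paper's metric).
    """
    return 100.0 / (2 * generations)


def prob_longer_than(threshold_cm: float, generations: int) -> float:
    """Probability that a segment exceeds `threshold_cm` under the same model."""
    rate_per_cm = (2 * generations) / 100.0
    return math.exp(-rate_per_cm * threshold_cm)


# Print generations, mean segment length, and P(length > 2 cM).
for g in (2, 7, 14, 50):
    print(g, round(expected_ibd_length_cm(g), 2), round(prob_longer_than(2.0, g), 3))
```

Under these assumptions, an ancestor 14 generations back yields segments averaging about 3.6 cM, and somewhat over 40% of them fall below the 2 cM threshold cited in the abstract, which is consistent with the stated motivation for improving short-segment detection.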

2017 ◽  
Vol 37 (suppl_1) ◽  
Author(s):  
Jacqueline S Dron ◽  
Jian Wang ◽  
Cécile Low-Kam ◽  
Sumeet A Khetarpal ◽  
John F Robinson ◽  
...  

Rationale: Although HDL-C levels are known to have a complex genetic basis, most studies have focused solely on identifying rare variants with large phenotypic effects to explain extreme HDL-C phenotypes. Objective: Here we concurrently evaluate the contribution of both rare and common genetic variants, as well as large-scale copy number variations (CNVs), towards extreme HDL-C concentrations. Methods: In clinically ascertained patients with low (N = 136) and high (N = 119) HDL-C profiles, we applied our targeted next-generation sequencing panel (LipidSeq™) to sequence genes involved in HDL metabolism, which were subsequently screened for rare variants and CNVs. We also developed a novel polygenic trait score (PTS) to assess patients’ genetic accumulations of common variants that have been shown by genome-wide association studies to associate primarily with HDL-C levels. Two additional cohorts of patients with extremely low and high HDL-C (total N = 1,746 and N = 1,139, respectively) were used for PTS validation. Results: In the discovery cohort, 32.4% of low HDL-C patients carried rare variants or CNVs in primary (ABCA1, APOA1, LCAT) and secondary (LPL, LMF1, GPD1, APOE) HDL-C–altering genes. Additionally, 13.4% of high HDL-C patients carried rare variants or CNVs in primary (SCARB1, CETP, LIPC, LIPG) and secondary (APOC3, ANGPTL4) HDL-C–altering genes. For polygenic effects, patients with abnormal HDL-C profiles but without rare variants or CNVs were ~2-fold more likely to have an extreme PTS compared to normolipidemic individuals, indicating an increased frequency of common HDL-C–associated variants in these patients. Similar results in the two validation cohorts demonstrate that this novel PTS successfully quantifies common variant accumulation, further characterizing the polygenic basis for extreme HDL-C phenotypes.
Conclusions: Patients with extreme HDL-C levels have various combinations of rare variants, common variants, or CNVs driving their phenotypes. Fully characterizing the genetic basis of HDL-C levels must extend to encompass multiple types of genetic determinants, not just rare variants, to further our understanding of this complex, controversial quantitative trait.
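
The abstract above does not give the formula for its polygenic trait score. A common construction for scores of this kind is a weighted sum of allele dosages, and the sketch below assumes that form. The variant identifiers, weights, reference scores, and function names are all made up for illustration and are not taken from the study.

```python
from typing import Dict, Sequence


def polygenic_trait_score(dosages: Dict[str, int], weights: Dict[str, float]) -> float:
    """Weighted sum of allele dosages (0, 1, or 2 copies per variant).

    Assumption: a variant missing from `dosages` contributes 0.
    """
    score = 0.0
    for variant_id, weight in weights.items():
        dosage = dosages.get(variant_id, 0)
        if dosage not in (0, 1, 2):
            raise ValueError(f"dosage for {variant_id} must be 0, 1, or 2")
        score += weight * dosage
    return score


def percentile_rank(score: float, reference_scores: Sequence[float]) -> float:
    """Percentage of reference scores strictly below `score`."""
    below = sum(1 for s in reference_scores if s < score)
    return 100.0 * below / len(reference_scores)


# Hypothetical per-variant weights and one patient's genotypes.
weights = {"variant_1": 0.5, "variant_2": 0.2, "variant_3": 0.1}
patient = {"variant_1": 2, "variant_2": 1, "variant_3": 0}
reference = [0.0, 0.5, 1.0, 1.5]  # made-up scores from a reference group

patient_score = polygenic_trait_score(patient, weights)
print(patient_score, percentile_rank(patient_score, reference))
```

An "extreme" score could then be defined by a percentile cutoff against a normolipidemic reference group. The cutoff used in the study is not stated in the abstract.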


2019 ◽  
Author(s):  
Elizabeth T. Cirulli ◽  
Simon White ◽  
Robert W. Read ◽  
Gai Elhanan ◽  
William J Metcalf ◽  
...  

Defining the effects that rare variants can have on human phenotypes is essential to advancing our understanding of human health and disease. Large-scale human genetic analyses have thus far focused on common variants, but the development of large cohorts of deeply phenotyped individuals with exome sequence data has now made comprehensive analyses of rare variants possible. We analyzed the effects of rare (MAF < 0.1%) variants on 3,166 phenotypes in 40,468 exome-sequenced individuals from the UK Biobank and performed replication as well as meta-analyses with 1,067 phenotypes in 13,470 members of the Healthy Nevada Project (HNP) cohort who underwent Exome+ sequencing at Helix. Our analyses of non-benign coding and loss of function (LoF) variants identified 78 gene-based associations that passed our statistical significance threshold (p < 5×10⁻⁹). These are associations in which carrying any rare coding or LoF variant in the gene is associated with an enrichment for a specific phenotype, as opposed to GWAS-based associations of strictly single variants. Importantly, our results do not suffer from the test statistic inflation that is often seen with rare variant analyses of biobank-scale data because of our rare variant-tailored methodology, which includes a step that optimizes the carrier frequency threshold for each phenotype based on prevalence. Of the 47 discovery associations whose phenotypes were represented in the replication cohort, 98% showed effects in the expected direction, and 45% attained formal replication significance (p < 0.001). Six additional significant associations were identified in our meta-analysis of both cohorts.
Among the results, we confirm known associations of PCSK9 and APOB variation with LDL levels; we extend knowledge of variation in the TYRP1 gene, previously associated with blonde hair color only in Solomon Islanders, to blonde hair color in individuals of European ancestry; we show that PAPPA, a gene in which common variants had previously been associated with height via GWAS, contains rare variants that decrease height; and we make the novel discovery that STAB1 variation is associated with blood flow in the brain. Our results are available for download and interactive browsing in an app (https://ukb.research.helix.com). This comprehensive analysis of the effects of rare variants on human phenotypes marks one of the first steps in the next big phase of human genetics, where large, deeply phenotyped cohorts with next generation sequence data will elucidate the effects of rare variants.


2019 ◽  
Vol 101 ◽  
Author(s):  
Lifeng Liu ◽  
Pengfei Wang ◽  
Jingbo Meng ◽  
Lili Chen ◽  
Wensheng Zhu ◽  
...  

Abstract In recent years, there has been an increasing interest in detecting disease-related rare variants in sequencing studies. Numerous studies have shown that common variants can only explain a small proportion of the phenotypic variance for complex diseases. More and more evidence suggests that some of this missing heritability can be explained by rare variants. Considering the importance of rare variants, researchers have proposed a considerable number of methods for identifying the rare variants associated with complex diseases. Extensive research has been carried out on testing the association between rare variants and dichotomous, continuous or ordinal traits. So far, however, there has been little discussion about the case in which both genotypes and phenotypes are ordinal variables. This paper introduces a method based on the γ-statistic, called OV-RV, for examining disease-related rare variants when both genotypes and phenotypes are ordinal. At present, little is known about the asymptotic distribution of the γ-statistic when conducting association analyses for rare variants. One advantage of OV-RV is that it provides a robust estimation of the distribution of the γ-statistic by employing the permutation approach proposed by Fisher. We also perform extensive simulations to investigate the numerical performance of OV-RV under various model settings. The simulation results reveal that OV-RV is valid and efficient; namely, it controls the type I error approximately at the pre-specified significance level and achieves greater power at the same significance level. We also apply OV-RV for rare variant association studies of diastolic blood pressure.
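
The permutation idea mentioned in the abstract above can be sketched as follows. This assumes that "the γ-statistic" refers to Goodman and Kruskal's γ for two ordinal variables, and it tests a single ordinal genotype score against an ordinal phenotype. It is an illustration of a generic permutation test, not the OV-RV procedure itself, whose details the abstract does not give. The data and function names are hypothetical.

```python
import random
from typing import Sequence


def gamma_statistic(x: Sequence[int], y: Sequence[int]) -> float:
    """Goodman-Kruskal gamma: (concordant - discordant) / (concordant + discordant)."""
    concordant = discordant = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            product = (x[i] - x[j]) * (y[i] - y[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
            # Pairs tied on either variable are ignored.
    total = concordant + discordant
    return (concordant - discordant) / total if total else 0.0


def permutation_p_value(x: Sequence[int], y: Sequence[int],
                        n_permutations: int = 2000, seed: int = 0) -> float:
    """Two-sided p-value obtained by shuffling phenotypes to break any association."""
    rng = random.Random(seed)
    observed = abs(gamma_statistic(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(gamma_statistic(x, shuffled)) >= observed:
            hits += 1
    # The +1 terms count the observed arrangement and keep the p-value above zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up data: ordinal genotype scores and ordinal phenotype categories.
genotype = [0, 0, 0, 0, 1, 1, 1, 2, 2, 2]
phenotype = [0, 0, 1, 0, 1, 1, 2, 1, 2, 2]
print(gamma_statistic(genotype, phenotype), permutation_p_value(genotype, phenotype))
```

Because the null distribution is built from the data by shuffling, this approach does not rely on an asymptotic distribution for γ, which is the advantage the abstract attributes to the permutation approach.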


2021 ◽  
pp. 1-10
Author(s):  
Zoe Guan ◽  
Ronglai Shen ◽  
Colin B. Begg

Background: Many cancer types show considerable heritability, and extensive research has been done to identify germline susceptibility variants. Linkage studies have discovered many rare high-risk variants, and genome-wide association studies (GWAS) have discovered many common low-risk variants. However, it is believed that a considerable proportion of the heritability of cancer remains unexplained by known susceptibility variants. The “rare variant hypothesis” proposes that much of the missing heritability lies in rare variants that cannot reliably be detected by linkage analysis or GWAS. Until recently, high sequencing costs have precluded extensive surveys of rare variants, but technological advances have now made it possible to analyze rare variants on a much greater scale. Objectives: In this study, we investigated associations between rare variants and 14 cancer types. Methods: We ran association tests using whole-exome sequencing data from The Cancer Genome Atlas (TCGA) and validated the findings using data from the Pan-Cancer Analysis of Whole Genomes Consortium (PCAWG). Results: We identified four significant associations in TCGA, only one of which was replicated in PCAWG (BRCA1 and ovarian cancer). Conclusions: Our results provide little evidence in favor of the rare variant hypothesis. Much larger sample sizes may be needed to detect undiscovered rare cancer variants.


2020 ◽  
Author(s):  
Patrick Sin-Chan ◽  
Nehal Gosalia ◽  
Chuan Gao ◽  
Cristopher V. Van Hout ◽  
Bin Ye ◽  
...  

Summary: Aging is characterized by degeneration in cellular and organismal functions leading to increased disease susceptibility and death. Although our understanding of aging biology in model systems has increased dramatically, large-scale sequencing studies to understand human aging are only now beginning. We applied exome sequencing and association analyses (ExWAS) to identify age-related variants in 58,470 participants of the DiscovEHR cohort. Linear mixed model regression analyses of age at last encounter revealed variants in genes known to be linked with clonal hematopoiesis of indeterminate potential, which are associated with myelodysplastic syndromes, as top signals in our analysis, suggestive of age-related somatic mutation accumulation in hematopoietic cells despite patients lacking clinical diagnoses. In addition to APOE, we identified rare DISP2 rs183775254 (p = 7.40×10⁻¹⁰) and ZYG11A rs74227999 (p = 2.50×10⁻⁸) variants that were negatively associated with age in both sexes combined and in females, respectively, and that were replicated with directional consistency in two independent cohorts. Epigenetic mapping showed these variants are located within cell-type-specific enhancers, suggestive of important transcriptional regulatory functions. To discover variants associated with extreme age, we performed exome sequencing on persons of Ashkenazi Jewish descent ascertained for extensive lifespans. Case-control analyses compared 525 Ashkenazi Jewish cases (males ≥ 92 years, females ≥ 95 years) with 482 controls. Our results showed that variants in APOE (rs429358, rs6857) and TMTC2 (rs7976168) passed the Bonferroni-adjusted p-value threshold, as did several nominally associated population-specific variants. Collectively, our Age-ExWAS, the largest performed to date, confirmed known variants and identified previously unreported candidate variants associated with human age.


2011 ◽  
Vol 26 (S2) ◽  
pp. 1346-1346
Author(s):  
D. Benmessaoud ◽  
A.-M. Lepagnol-Bestel ◽  
M. Delepine ◽  
J. Hager ◽  
J.-M. Moalic ◽  
...  

Genome-wide association studies (GWAS) of schizophrenia (SZ) patients have identified common variants in ten genes including SMARCA2 (Koga et al., HMG, 2009). We found that the SZ-GWAS genes are part of an interacting network centered on SMARCA2 (Loe-Mie et al., HMG, 2010). Furthermore, SMARCA2 was found disrupted in SZ (Walsh et al., Science, 2008). SMARCA2 encodes the ATPase (BRM) of the SWI/SNF chromatin remodeling complex that is at the interface of genome and environmental adaptation. Taking advantage of an Algerian trio cohort of one hundred SZ patients (Benmessaoud et al., BMC Psychiatry, 2008), we replicated the association of SNP rs2296212, localized in exon 33, already shown to be associated in the Koga study and resulting in a D1546E amino acid change in the SMARCA2 protein. We studied SMARCA2 codons and found that exon 33 displays a signature of positive evolution in the primate lineage. Our working hypothesis is that the coding regions displaying positive selection are targets of novel rare variants. To address this question, we sequenced two exons displaying positive evolution and one exon without evidence of positive evolution. We found (i) that rare variants are significantly in excess in SZ patients compared to their parents (p = 0.038, Fisher test) and (ii) a higher proportion of rare variants in the primate-accelerated exons compared with the non-evolutionary exon in SZ patients (p = 0.032, Fisher test). SMARCA2 exon sequencing and whole exome sequencing from patients harboring the SNP rs2296212 common variant are in progress. Altogether, these results are expected to give new insights into the genetic architecture of SZ.
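
The comparisons reported above are described only as a "Fisher test", which is taken here to mean Fisher's exact test on a 2×2 table of carriers versus non-carriers. The abstract does not report the underlying counts, so the table in this sketch uses made-up numbers and will not reproduce the published p-values. The function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2

    def table_prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = table_prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = 0.0
    for x in range(lowest, highest + 1):
        p = table_prob(x)
        if p <= observed * (1 + 1e-9):  # small tolerance for float rounding
            total += p
    return min(1.0, total)


# Made-up counts: rows are patients and parents,
# columns are rare-variant carriers and non-carriers.
print(fisher_exact_two_sided(9, 91, 5, 195))
```

An exact test is a natural choice in this setting because rare-variant carrier counts are small, and large-sample approximations such as the chi-squared test can be unreliable when expected cell counts are low.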


2018 ◽  
Vol 55 (12) ◽  
pp. 831-836 ◽  
Author(s):  
Xiao Chang ◽  
Renata Pellegrino ◽  
James Garifallou ◽  
Michael March ◽  
James Snyder ◽  
...  

Background: Genome-wide association studies (GWASs) have identified multiple susceptibility loci for migraine in European adults. However, no large-scale genetic studies have been performed in children or African Americans with migraine. Methods: We conducted a GWAS of 380 African-American children and 2129 ancestry-matched controls to identify variants associated with migraine. We then attempted to replicate our primary analysis in an independent cohort of 233 African-American patients and 4038 non-migraine control subjects. Results: The results of this study indicate that common variants at 5q33.1 are associated with migraine risk in African-American children (rs72793414, p = 1.94×10⁻⁹). The association was validated in an independent study (p = 3.87×10⁻³) for an overall meta-analysis p value of 3.81×10⁻¹⁰. Expression quantitative trait loci (eQTL) analysis of the Genotype-Tissue Expression data also shows that the genotypes of rs72793414 were strongly correlated with the mRNA expression levels of NMUR2 at 5q33.1. NMUR2 encodes a G protein-coupled receptor of neuromedin-U (NMU). NMU, a highly conserved neuropeptide, participates in diverse physiological processes of the central nervous system. Conclusions: This study provides new insights into the genetic basis of childhood migraine and allows for precision therapeutic development strategies targeting migraine patients of African-American ancestry.


2020 ◽  
Vol 3 (2) ◽  
pp. 25-30
Author(s):  
Renata Zunec

Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), is reported to vary across different populations in the prevalence of infection, in the death rate of patients, in the severity of symptoms and in the drug response of patients. Among host genetic factors that can influence all these attributes, the human leukocyte antigen (HLA) genetic system stands out as one of the leading candidates. Case-control studies, large-scale population-based studies, as well as experimental bioinformatics studies are of utmost importance to confirm the HLA susceptibility spectrum of COVID-19. This review presents the results of the first case-control and epidemiological studies performed in several populations early after the pandemic outbreak. The results point to several susceptible and protective HLA allele and haplotype associations with COVID-19, some of which might be of interest for future studies in Croatia because those alleles and haplotypes are common in the population. However, further investigations from around the world, as numerous as possible, are needed to confirm or refute these preliminary results.


2021 ◽  
Vol 6 (1) ◽  
Author(s):  
Allison A. Dilliott ◽  
Abdalla Abdelhady ◽  
Kelly M. Sunderland ◽  
Sali M. K. Farhan ◽  
Agessandro Abrahao ◽  
...  

Abstract Genetic factors contribute to neurodegenerative diseases, with high heritability estimates across diagnoses; however, a large portion of the genetic influence remains poorly understood. Many previous studies have attempted to fill the gaps by performing linkage analyses and association studies in individual disease cohorts, but have failed to consider the clinical and pathological overlap observed across neurodegenerative diseases and the potential for genetic overlap between the phenotypes. Here, we leveraged rare variant association analyses (RVAAs) to elucidate the genetic overlap among multiple neurodegenerative diagnoses, including Alzheimer’s disease, amyotrophic lateral sclerosis, frontotemporal dementia (FTD), mild cognitive impairment, and Parkinson’s disease (PD), as well as cerebrovascular disease, using the data generated with a custom-designed neurodegenerative disease gene panel in the Ontario Neurodegenerative Disease Research Initiative (ONDRI). As expected, only ~3% of ONDRI participants harboured a monogenic variant likely driving their disease presentation. Yet, when genes were binned based on previous disease associations, we observed an enrichment of putative loss of function variants in PD genes across all ONDRI cohorts. Further, individual gene-based RVAA identified significant enrichment of rare, nonsynonymous variants in PARK2 in the FTD cohort, and in NOTCH3 in the PD cohort. The results indicate that there may be greater heterogeneity in the genetic factors contributing to neurodegeneration than previously appreciated. Although the mechanisms by which these genes contribute to disease presentation must be further explored, we hypothesize they may be a result of rare variants of moderate phenotypic effect contributing to overlapping pathology and clinical features observed across neurodegenerative diagnoses.


2021 ◽  
pp. annrheumdis-2020-218359
Author(s):  
Xinyi Meng ◽  
Xiaoyuan Hou ◽  
Ping Wang ◽  
Joseph T Glessner ◽  
Hui-Qi Qu ◽  
...  

Objective: Juvenile idiopathic arthritis (JIA) is the most common type of arthritis among children, but few studies have investigated the contribution of rare variants to JIA. In this study, we aimed to identify rare coding variants associated with JIA across the genome-wide landscape. Methods: We established a rare variant calling and filtering pipeline and performed rare coding variant and gene-based association analyses on three RNA-seq datasets composed of 228 JIA patients in the Gene Expression Omnibus against different sets of controls, and further conducted replication in our whole-exome sequencing (WES) data of 56 JIA patients. Then we conducted differential gene expression analysis and assessed the impact of recurrent functional coding variants on gene expression and signalling pathways. Results: Using the RNA-seq data, we identified variants in two genes reported in the literature as JIA causal variants, as well as an additional 63 recurrent rare coding variants seen only in JIA patients. Among the 44 recurrent rare variants found in polyarticular patients, 10 were replicated by our WES of patients with the same JIA subtype. Several genes with recurrent functional rare coding variants also have common variants associated with autoimmune diseases. We observed immune pathways enriched for the genes with rare coding variants and differentially expressed genes. Conclusion: This study elucidated a novel landscape of recurrent rare coding variants in JIA patients and uncovered significant associations with JIA at the gene pathway level. The convergence of common variants and rare variants for autoimmune diseases is also highlighted in this study.

