Association Analysis and Meta-Analysis of Multi-Allelic Variants for Large-Scale Sequence Data

Genes ◽  
2020 ◽  
Vol 11 (5) ◽  
pp. 586
Author(s):  
Yu Jiang ◽  
Sai Chen ◽  
Xingyan Wang ◽  
Mengzhen Liu ◽  
William G. Iacono ◽  
...  

There is great interest in understanding the impact of rare variants in human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ~10% of the variant sites are observed to be multi-allelic. Many of the multi-allelic variants have been shown to be functional and disease-relevant. Proper analysis of multi-allelic variants is critical to the success of a sequencing study, but existing methods do not properly handle multi-allelic variants and can produce highly misleading association results. We discuss practical issues and methods to encode multi-allelic sites, conduct single-variant and gene-level association analyses, and perform meta-analysis for multi-allelic variants. We evaluated these methods through extensive simulations and the study of a large meta-analysis of ~18,000 samples on the cigarettes-per-day phenotype. We showed that our joint modeling approach provided an unbiased estimate of genetic effects, greatly improved the power of single-variant association tests among methods that can properly estimate allele effects, and enhanced gene-level tests over existing approaches. Software packages implementing these methods are available online.
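
As a rough illustration of what encoding a multi-allelic site can involve, the Python sketch below converts VCF-style genotype strings into one dosage column per alternate allele and contrasts that with collapsing all alternate alleles into a single count. The function names (`encode_multiallelic`, `collapse_alternates`) and the toy genotypes are hypothetical. This is not the encoding implemented in the authors' software, only one common representation that lets each alternate allele's effect be modeled jointly.

```python
from typing import List, Optional


def encode_multiallelic(genotypes: List[str], n_alt: int) -> List[Optional[List[int]]]:
    """Convert VCF-style genotypes (e.g. "0/1", "1|2") into per-allele dosages.

    Returns one row per sample; each row holds n_alt counts, one for each
    alternate allele. Samples with a missing call ("./.") map to None.
    """
    rows: List[Optional[List[int]]] = []
    for gt in genotypes:
        alleles = gt.replace("|", "/").split("/")
        if "." in alleles:
            rows.append(None)  # missing genotype
            continue
        dosage = [0] * n_alt
        for allele in alleles:
            idx = int(allele)
            if idx > n_alt:
                raise ValueError(f"allele index {idx} exceeds n_alt={n_alt}")
            if idx > 0:  # index 0 is the reference allele
                dosage[idx - 1] += 1
        rows.append(dosage)
    return rows


def collapse_alternates(genotypes: List[str], n_alt: int) -> List[Optional[int]]:
    """Naive alternative: lump every alternate allele into a single count."""
    return [
        None if row is None else sum(row)
        for row in encode_multiallelic(genotypes, n_alt)
    ]


if __name__ == "__main__":
    # Hypothetical site with two alternate alleles.
    toy = ["0/0", "0/1", "1|2", "2/2", "./."]
    print(encode_multiallelic(toy, n_alt=2))
    print(collapse_alternates(toy, n_alt=2))
```

The collapsed encoding cannot distinguish a `1/2` carrier from a `2/2` carrier, which is one way to see why separate per-allele columns are needed before allele-specific effects can be estimated.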

2017 ◽  
Author(s):  
Xiaowei Zhan ◽  
Sai Chen ◽  
Yu Jiang ◽  
Mengzhen Liu ◽  
William G. Iacono ◽  
...  

Abstract. Motivation: There is great interest in understanding the impact of rare variants in human diseases using large sequence datasets. In deep sequence datasets of >10,000 samples, ∼10% of the variant sites are observed to be multi-allelic. Many of the multi-allelic variants have been shown to be functional and disease-relevant. Proper analysis of multi-allelic variants is critical to the success of a sequencing study, but existing methods do not properly handle multi-allelic variants and can produce highly misleading association results. Results: We propose novel methods to encode multi-allelic sites, conduct single-variant and gene-level association analyses, and perform meta-analysis for multi-allelic variants. We evaluated these methods through extensive simulations and the study of a large meta-analysis of ∼18,000 samples on the cigarettes-per-day phenotype. We showed that our joint modeling approach provided an unbiased estimate of genetic effects, greatly improved the power of single-variant association tests, and enhanced gene-level tests over existing approaches. Availability: Software packages implementing these methods are available at https://github.com/zhanxw/rvtests and http://genome.sph.umich.edu/wiki/RareMETAL.


2021 ◽  
Vol 80 (Suppl 1) ◽  
pp. 406.2-407
Author(s):  
K. Pavelcova ◽  
J. Bohata ◽  
B. Stiburkova

Background: The level of uric acid is largely determined by the function of urate transporters, which are located in the kidney and intestine. The ABCG2 protein is the major excretor of uric acid, and its dysfunction may lead to the development of hyperuricemia and gout. Objectives: The aim of our study was to detect the occurrence and frequency of allelic variants in the ABCG2 gene that can lead to impaired function of the ABCG2 protein and to the development of hyperuricemia and gout. Methods: We examined allelic variants of ABCG2 using PCR amplification and Sanger sequencing of all coding regions and exon-intron boundaries in 359 patients with primary hyperuricemia and gout. Results: We found a rare in-frame deletion, p.K360del, and 15 missense variants, two of which were common (p.V12M, p.Q141K) and 13 of which were very rare (p.M71V, p.G74D, p.M131I, p.R147W, p.T153M, p.I242T, p.R236X, p.F373C, p.T421A, p.T434M, p.S476P, p.S572R, p.D620N). The p.R236X variant leads to a premature stop codon. The p.V12M variant probably has a protective effect against gout (minor allele frequency, MAF, in our cohort = 0.025; MAF in the European population = 0.061), while the p.Q141K variant increases the risk of gout (MAF in our cohort = 0.213; MAF in the European population = 0.094) [1]. Among the rare variants, p.R147W, p.T153M, p.F373C, p.T434M, p.S476P and p.S572R reduce the function of the ABCG2 protein according to functional analyses [2]. Based on in silico prediction, reduced function is expected for variants p.M71V, p.G74D, p.M131I, p.R147W, p.I242T, p.F373C, p.T434M, p.S476P and p.S572R. Conclusion: Our data suggest that the common variant p.Q141K and most of the rare variants in the ABCG2 gene affect the function of the ABCG2 urate transporter and are a genetic risk factor for hyperuricemia and gout. References: [1] Stiburkova B, et al. Functional non-synonymous variants of ABCG2 and gout risk. Rheumatology (Oxford). 2017 Nov 1;56(11):1982-1992. [2] Toyoda Y, et al. Functional characterization of clinically-relevant rare variants in ABCG2 identified in a gout and hyperuricemia cohort. Cells. 2019 Apr 18;8(4). Acknowledgements: This study was supported by the project for conceptual development of research organization 00023728 (Institute of Rheumatology) and RVO VFN64165. Disclosure of Interests: None declared.


2021 ◽  
pp. 1-14
Author(s):  
Mi Su ◽  
Yongyan Song

Background: Genetic factors have been suggested to influence the development of post-traumatic stress disorder (PTSD). The possible association between the catechol-O-methyltransferase (COMT) Val158Met polymorphism and PTSD has been evaluated in several studies, but the results remain controversial. We therefore conducted this meta-analysis to clarify the association. Methods: The PubMed, EMBASE, Cochrane Library, and Web of Science databases were searched for eligible studies. The pooled odds ratio (OR) with 95% confidence interval (CI) was calculated to estimate the association between the COMT Val158Met polymorphism and PTSD. Results: Five articles comprising 6 studies with 893 cases and 968 controls were included in the meta-analysis. The pooled analyses did not demonstrate a significant association between the COMT Val158Met polymorphism and PTSD in any of the selected genetic models: allele model (OR = 1.13, 95% CI: 0.97–1.31), dominant model (OR = 1.17, 95% CI: 0.93–1.46), recessive model (OR = 1.44, 95% CI: 0.78–2.66), and additive model (OR = 1.54, 95% CI: 0.85–2.80). Subgroup analyses suggested that the Hardy-Weinberg equilibrium status of genotype distributions could influence the relationship between the COMT Val158Met polymorphism and PTSD. Conclusions: The present meta-analysis suggests that the COMT Val158Met polymorphism may not be associated with PTSD risk. Further large-scale and population-representative studies are warranted to evaluate the impact of the COMT Val158Met polymorphism on the risk of PTSD.
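
To make the genetic models named in the Results concrete, the Python sketch below computes an odds ratio under each model from genotype counts in a single case-control study. It rests on several assumptions that the abstract does not state: the genotype counts are invented, `a` denotes whichever allele is treated as the putative risk allele, and the "additive" contrast is taken to be the homozygote comparison (aa vs. AA), which is one convention among several. The function names are hypothetical, and the pooling across studies that a meta-analysis performs is not shown.

```python
from typing import Dict, Tuple

# Genotype counts as (AA, Aa, aa); "a" is the putative risk allele (assumption).
Counts = Tuple[int, int, int]


def odds_ratio(case_exp: int, case_unexp: int, ctrl_exp: int, ctrl_unexp: int) -> float:
    """Odds ratio from a 2x2 table; refuses empty cells rather than correcting them."""
    if min(case_exp, case_unexp, ctrl_exp, ctrl_unexp) <= 0:
        raise ValueError("all four cells must be positive")
    return (case_exp * ctrl_unexp) / (case_unexp * ctrl_exp)


def genetic_model_odds_ratios(cases: Counts, controls: Counts) -> Dict[str, float]:
    """Odds ratios under four commonly used genetic-model contrasts."""
    AA1, Aa1, aa1 = cases
    AA0, Aa0, aa0 = controls
    return {
        # Allele model: count a alleles against A alleles.
        "allele": odds_ratio(2 * aa1 + Aa1, 2 * AA1 + Aa1, 2 * aa0 + Aa0, 2 * AA0 + Aa0),
        # Dominant model: carriers of at least one a allele vs. AA.
        "dominant": odds_ratio(Aa1 + aa1, AA1, Aa0 + aa0, AA0),
        # Recessive model: aa vs. everyone else.
        "recessive": odds_ratio(aa1, AA1 + Aa1, aa0, AA0 + Aa0),
        # "Additive" here means the homozygote contrast aa vs. AA (assumption).
        "additive": odds_ratio(aa1, AA1, aa0, AA0),
    }


if __name__ == "__main__":
    # Invented counts for illustration only.
    for model, value in genetic_model_odds_ratios((30, 50, 20), (40, 45, 15)).items():
        print(f"{model}: OR = {value:.3f}")
```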


Biostatistics ◽  
2019 ◽  
Author(s):  
Jingchunzi Shi ◽  
Michael Boehnke ◽  
Seunggeun Lee

Summary: Trans-ethnic meta-analysis is a powerful tool for detecting novel loci in genetic association studies. However, in the presence of heterogeneity among different populations, existing gene-/region-based rare-variant meta-analysis methods may be unsatisfactory because they do not consider genetic similarity or dissimilarity among different populations. In response, we propose a score test under the modified random effects model for gene-/region-based rare-variant associations. We adapt the kernel regression framework to construct the model and incorporate genetic similarities across populations into modeling the heterogeneity structure of the genetic effect coefficients. We use a resampling-based copula method to approximate the asymptotic distribution of the test statistic, enabling efficient estimation of p-values. Simulation studies show that our proposed method controls type I error rates and increases power over existing approaches in the presence of heterogeneity. We illustrate our method by analyzing T2D-GENES consortium exome sequence data to explore rare-variant associations with several traits.


2015 ◽  
Vol 2 (1) ◽  
pp. 18-25 ◽  
Author(s):  
Andreas Raith

Abstract: In Germany, all-day care and all-day schooling are currently increasing on a large scale. The extended time children spend in educational institutions could limit their access to nature experience. On the other hand, it could equally create opportunities for informal nature experience if school playgrounds have a specific nature-oriented design. This article is written from the perspective of a primary school teacher and presents the findings of a meta-analysis of the impact that nature experience has on the development of children. Furthermore, the first results of a research study on green playgrounds in primary schools are discussed. The results so far seem to indicate that green school playgrounds have the potential to provide nature experience, particularly for primary students.


2016 ◽  
Author(s):  
Antonio F Pardiñas ◽  
Peter Holmans ◽  
Andrew J Pocklington ◽  
Valentina Escott-Price ◽  
Stephan Ripke ◽  
...  

Schizophrenia is a debilitating psychiatric condition often associated with poor quality of life and decreased life expectancy. Lack of progress in improving treatment outcomes has been attributed to limited knowledge of the underlying biology, although large-scale genomic studies have begun to provide such insight. We report the largest single cohort genome-wide association study of schizophrenia (11,260 cases and 24,542 controls) and through meta-analysis with existing data we identify 50 novel GWAS loci. Using gene-wide association statistics we implicate an additional set of 22 novel associations that map onto a single gene. We show for the first time that the common variant association signal is highly enriched among genes that are intolerant to loss of function mutations and that variants in these genes persist in the population despite the low fecundity associated with the disorder through the process of background selection. Associations point to novel areas of biology (e.g. metabotropic GABA-B signalling and acetyl cholinesterase), reinforce those implicated in earlier GWAS studies (e.g. calcium channel function), converge with earlier rare variants studies (e.g. NRXN1, GABAergic signalling), identify novel overlaps with autism (e.g. RBFOX1, FOXP1, FOXG1), and support early controversial candidate gene hypotheses (e.g. ERBB4 implicating neuregulin signalling). We also demonstrate the involvement of six independent central nervous system functional gene sets in schizophrenia pathophysiology. These findings provide novel insights into the biology and genetic architecture of schizophrenia, highlight the importance of mutation intolerant genes and suggest a mechanism by which common risk variants are maintained in the population.


2021 ◽  
Author(s):  
Tony Zeng ◽  
Yang I Li

Recent progress in deep learning approaches has greatly improved the prediction of RNA splicing from DNA sequence. Here, we present Pangolin, a deep learning model to predict splice site strength in multiple tissues that has been trained on RNA splicing and sequence data from four species. Pangolin outperforms state-of-the-art methods for predicting RNA splicing on a variety of prediction tasks. We use Pangolin to study the impact of genetic variants on RNA splicing, including lineage-specific variants and rare variants of uncertain significance. Pangolin predicts loss-of-function mutations with high accuracy and recall, particularly for mutations that are not missense or nonsense (AUPRC = 0.93), demonstrating remarkable potential for identifying pathogenic variants.


2021 ◽  
pp. annrheumdis-2020-218359
Author(s):  
Xinyi Meng ◽  
Xiaoyuan Hou ◽  
Ping Wang ◽  
Joseph T Glessner ◽  
Hui-Qi Qu ◽  
...  

Objective: Juvenile idiopathic arthritis (JIA) is the most common type of arthritis among children, but few studies have investigated the contribution of rare variants to JIA. In this study, we aimed to identify rare coding variants associated with JIA across the genome. Methods: We established a rare variant calling and filtering pipeline and performed rare coding variant and gene-based association analyses on three RNA-seq datasets composed of 228 JIA patients in the Gene Expression Omnibus against different sets of controls, and further conducted replication in our whole-exome sequencing (WES) data of 56 JIA patients. We then conducted differential gene expression analysis and assessed the impact of recurrent functional coding variants on gene expression and signalling pathways. Results: From the RNA-seq data, we identified variants in two genes reported in the literature as JIA causal variants, as well as an additional 63 recurrent rare coding variants seen only in JIA patients. Among the 44 recurrent rare variants found in polyarticular patients, 10 were replicated by our WES of patients with the same JIA subtype. Several genes with recurrent functional rare coding variants also have common variants associated with autoimmune diseases. We observed immune pathways enriched for the genes with rare coding variants and differentially expressed genes. Conclusion: This study elucidated a novel landscape of recurrent rare coding variants in JIA patients and uncovered significant associations with JIA at the gene pathway level. The convergence of common variants and rare variants for autoimmune diseases is also highlighted in this study.


Circulation ◽  
2013 ◽  
Vol 127 (suppl_12) ◽  
Author(s):  
Belinda K Cornes ◽  
Jennifer Brody ◽  
Alanna C Morrison ◽  
David Siscovick ◽  
James B Meigs ◽  
...  

Introduction: Common variants in the gene encoding insulin receptor substrate 1 (IRS1) and nearby on 2q36.3 have been associated with levels of fasting insulin (FI). We hypothesized that a greater burden of rare variants in these regions is associated with higher FI. Methods: CHARGE-S sequenced (average coverage >60x) the IRS1 and 2q36.3 regions (totaling 185 kb) in 3,539 individuals on the SOLiD platform. FI information among non-diabetics was available in 3 studies: Framingham Heart Study (N = 811), Cardiovascular Health Study (N = 967) and Atherosclerosis Risk in Communities Study (N = 1761). We analyzed rare variants (MAF < 1%) using a weighted sum test, similar to Madsen-Browning (powerful for detecting an association if the effects of causal rare variants are in the same direction), and the SKAT test (the preferred method if variant effects are in opposite directions). Meta-analyses of the weighted-sum results used the inverse-variance method, while SKAT results used a similar approach. For multi-variant tests, the threshold for significance was considered to be α = 0.05. Coding annotation predictions were obtained from the dbNSFP database, which includes functional predictions from SIFT, MutationTaster, Polyphen-2, Phylo-P and LRT. Non-coding annotation information (protein binding regions, transcription factor binding sites, DNase hypersensitivity sites, conservation scores) was obtained from the ENCODE and ORegAnno databases. From these annotations, we grouped different types of variants together (possible loss of function; possibly regulatory) in order to determine the specific variants contributing most to the effect. Results: Sequencing found 4,534 variants in the two regions, 86.7% of which were rare and novel, not seen in 1000 Genomes or dbSNP. Approximately 20% of variants had annotation information available; of these, 34 variants were possibly damaging. We found suggestive association with FI (p = 0.03) for all rare variants in the meta-analysis of weighted-sum tests at 2q36.3 but not at IRS1. At IRS1 (but not at 2q36.3), SKAT meta-analysis tests showed evidence that all rare variants are associated with FI (p = 0.03). SKAT tests restricted to N = 365 possibly damaging variants at IRS1 suggested an association with FI in coding (p = 0.06) and in non-coding (p = 0.02) variants. Conclusion: Large-scale deep sequencing in the IRS1 and 2q36.3 regions found very large numbers of new, rare variants. Multi-variant tests suggest that rare variation in these regions influences FI levels, with individuals with more and rarer variants having higher FI. Further investigation is warranted to address why weighted sum and SKAT tests provide different levels of evidence for association in the two regions. Also, conditional analyses will test whether new rare variants at IRS1 or 2q36 explain observed GWAS associations.
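
The inverse-variance method mentioned in the Methods can be sketched in a few lines of Python. The abstract gives no per-study estimates, so the effect sizes and standard errors below are invented, and the function name `inverse_variance_meta` is hypothetical. The sketch assumes a fixed-effect model with a normal approximation for the p-value, which may differ from what the study actually used.

```python
import math
from typing import List, Tuple


def inverse_variance_meta(estimates: List[Tuple[float, float]]) -> Tuple[float, float, float, float]:
    """Fixed-effect inverse-variance meta-analysis.

    estimates: one (beta, standard_error) pair per study.
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for beta, se in estimates:
        if se <= 0:
            raise ValueError("standard errors must be positive")
        weight = 1.0 / (se * se)  # precision of this study's estimate
        weighted_sum += weight * beta
        total_weight += weight
    pooled_beta = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal reference distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p_value


if __name__ == "__main__":
    # Invented burden-test effect estimates from three hypothetical studies.
    studies = [(0.12, 0.08), (0.05, 0.07), (0.10, 0.05)]
    beta, se, z, p = inverse_variance_meta(studies)
    print(f"pooled beta = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, p = {p:.4f}")
```

Studies with smaller standard errors receive more weight, so the pooled estimate is pulled toward the most precise cohort.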


2016 ◽  
Author(s):  
Alan Medlar ◽  
Laura Laakso ◽  
Andreia Miraldo ◽  
Ari Löytynoja

Abstract: High-throughput RNA-seq data has become ubiquitous in the study of non-model organisms, but its use in comparative analysis remains a challenge. Without a reference genome for mapping, sequence data has to be de novo assembled, producing large numbers of short, highly redundant contigs. Preparing these assemblies for comparative analyses requires removing redundant isoforms, assigning orthologs and converting fragmented transcripts into gene alignments. In this article we present Glutton, a novel tool to process transcriptome assemblies for downstream evolutionary analyses. Glutton takes as input a set of fragmented, possibly erroneous transcriptome assemblies. Utilising phylogeny-aware alignment and reference data from a closely related species, it reconstructs one transcript per gene, finds orthologous sequences and produces accurate multiple alignments of coding sequences. We present a comprehensive analysis of Glutton's performance across a wide range of divergence times between study and reference species. We demonstrate the impact that the choice of assembler has on both the number of alignments and the correctness of ortholog assignment, and show substantial improvements over heuristic methods without sacrificing correctness. Finally, using inference of Darwinian selection as an example of downstream analysis, we show that Glutton-processed RNA-seq data give results comparable to those obtained from full-length gene sequences, even with distantly related reference species. Glutton is available from http://wasabiapp.org/software/glutton/ and is licensed under the GPLv3.

