PyVar: An Extensible Framework for Variant Annotator Comparison

2016 ◽  
Author(s):  
Julie Wertz ◽  
Qianli Liao ◽  
Thomas B Bair ◽  
Michael S Chimenti

Abstract Modern genomics projects are generating millions of variant calls that must be annotated for predicted functional consequences at the level of gene expression and protein function. Many of these variants are of interest owing to their potential clinical significance. Unfortunately, state-of-the-art methods do not always agree on downstream effects for any given variant. Here we present a readily extensible Python framework (PyVar) for comparing the output of variant annotator methods in order to aid the research community in quickly assessing differences between methods and benchmarking new methods as they are developed. We also apply our framework to assess the annotation performance of ANNOVAR, VEP, and SnpEff when annotating 81 million variants from the ‘1000 Genomes Project’ against both RefSeq and Ensembl human transcript sets.
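The kind of comparison described in this abstract can be illustrated with a minimal sketch. It is not PyVar's actual interface: the function name `pairwise_concordance`, the variant identifiers, and the consequence labels are all hypothetical, and the labels are assumed to have been normalized to a shared vocabulary beforehand (real annotators use different terms for the same consequence).

```python
from itertools import combinations


def pairwise_concordance(annotations):
    """Return the agreement rate for every pair of annotators.

    annotations: dict mapping annotator name -> dict of
                 variant id -> normalized consequence label.
    Only variants annotated by both tools in a pair are compared.
    A pair with no shared variants maps to None.
    """
    result = {}
    for a, b in combinations(sorted(annotations), 2):
        shared = annotations[a].keys() & annotations[b].keys()
        if not shared:
            result[(a, b)] = None
            continue
        agree = sum(annotations[a][v] == annotations[b][v] for v in shared)
        result[(a, b)] = agree / len(shared)
    return result


# Hypothetical toy calls; not taken from the 1000 Genomes Project data.
calls = {
    "ANNOVAR": {"v1": "missense", "v2": "synonymous", "v3": "stop_gained"},
    "VEP": {"v1": "missense", "v2": "synonymous", "v3": "missense"},
    "SnpEff": {"v1": "missense", "v2": "intron", "v3": "stop_gained"},
}

scores = pairwise_concordance(calls)
for pair, rate in scores.items():
    print(pair, round(rate, 3))
```

Keying the result by sorted name pairs keeps the output stable regardless of the order in which annotators are supplied, which may help when a new method is added to an existing benchmark.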

2020 ◽  
Author(s):  
Peter Pfaffelhuber ◽  
Elisabeth Sester-Huss ◽  
Franz Baumdicker ◽  
Jana Naue ◽  
Sabine Lutz-Bonengel ◽  
...  

Abstract The inference of biogeographic ancestry (BGA) has become a focus of forensic genetics. Mis-inference of BGA can have profound unwanted consequences for investigations and society. We show that recent admixture can lead to misclassification and erroneous inference of ancestry proportions, using state-of-the-art analysis tools with (i) simulations, (ii) 1000 Genomes Project data, and (iii) two individuals analyzed using the ForenSeq DNA Signature Prep Kit. Subsequently, we extend existing tools for estimation of individual ancestry (IA) by allowing for different IA in both parents, leading to estimates of parental individual ancestry (PIA), and a statistical test for recent admixture. Estimation of PIA outperforms IA in most scenarios of recent admixture. Furthermore, additional information about parental ancestry can be acquired with PIA that may guide casework.


2019 ◽  
Author(s):  
Madeline H. Kowalski ◽  
Huijun Qian ◽  
Ziyi Hou ◽  
Jonathan D. Rosen ◽  
Amanda L. Tapia ◽  
...  

Abstract Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are still limited. In addition to the limited inclusion of these populations in genetic studies, these populations have more complex linkage disequilibrium structure that may reduce the number of variants associated with a phenotype. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with commercial genome-wide genotyping array data. We demonstrate that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhances gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11-34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels, respectively. Impressively, even for extremely rare variants with sample minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%.
Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~20,000 self-identified African descent individuals and ~23,000 self-identified Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC (p = 8.1×10⁻¹²) in African populations, rs11549407 with lower HGB (p = 1.59×10⁻¹²) and HCT (p = 1.13×10⁻⁹) in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either 1000 Genomes Project Phase 3 or Haplotype Reference Consortium reference panels had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel associations between rare variants and complex traits not previously detected in similar-sized genome-wide studies of under-represented African and Hispanic/Latino populations.
Author summary Admixed African and Hispanic/Latino populations remain understudied in genome-wide association and fine-mapping studies of complex diseases. These populations have more complex linkage disequilibrium (LD) structure that can impair mapping of variants associated with complex diseases and their risk factors. Genotype imputation represents an approach to improve genome coverage, especially for rare or ancestry-specific variation; however, these understudied populations also have smaller relevant imputation reference panels that need to be expanded to represent their more complex LD patterns. In this study, we leveraged >100,000 phased sequences generated from the multi-ethnic NHLBI TOPMed project to impute in admixed cohorts encompassing ~20,000 individuals of African ancestry (AAs) and ~23,000 Hispanics/Latinos.
We demonstrated substantially higher imputation quality for low frequency and rare variants in comparison to the state-of-the-art reference panels (1000 Genomes Project and Haplotype Reference Consortium). Association analyses of ~35 million (AAs) and ~27 million (Hispanics/Latinos) variants passing stringent post-imputation filtering with quantitative hematological traits led to the discovery of associations with two rare variants in the HBB gene; one of these variants was replicated in an independent sample, and the other is known to cause anemia in the homozygous state. By comparison, the same HBB variants would not have been genome-wide significant using other state-of-the-art reference panels due to lower imputation quality. Our findings demonstrate the power of the TOPMed whole genome sequencing data for imputation and subsequent association analysis in admixed African and Hispanic/Latino populations.


2021 ◽  
Vol 11 (3) ◽  
pp. 231
Author(s):  
Faven Butler ◽  
Ali Alghubayshi ◽  
Youssef Roman

Gout is an inflammatory condition caused by elevated serum urate (SU), a condition known as hyperuricemia (HU). Genetic variations, including single nucleotide polymorphisms (SNPs), can alter the function of urate transporters, leading to differential HU and gout prevalence across different populations. In the United States (U.S.), gout prevalence differentially affects certain racial groups. The objective of this proposed analysis is to compare the frequency of urate-related genetic risk alleles between Europeans (EUR) and the following major racial groups: Africans in Southwest U.S. (ASW), Han-Chinese (CHS), Japanese (JPT), and Mexican (MXL) from the 1000 Genomes Project. The Ensembl genome browser of the 1000 Genomes Project was used to conduct cross-population allele frequency comparisons of 11 SNPs across 11 genes that are physiologically involved in, and significantly associated with, SU levels and gout risk. Gene/SNP pairs included: ABCG2 (rs2231142), SLC2A9 (rs734553), SLC17A1 (rs1183201), SLC16A9 (rs1171614), GCKR (rs1260326), SLC22A11 (rs2078267), SLC22A12 (rs505802), INHBC (rs3741414), RREB1 (rs675209), PDZK1 (rs12129861), and NRXN2 (rs478607). Allele frequencies were compared to EUR using the Chi-Square or Fisher’s Exact test, as appropriate. Bonferroni correction for multiple comparisons was used, with p < 0.0045 for statistical significance. A risk allele was defined as the allele associated with baseline or higher HU and gout risks. The cumulative HU or gout risk allele index of the 11 SNPs was estimated for each population. The prevalence of HU and gout in U.S. and non-U.S. populations was evaluated using published epidemiological data and literature review. Compared with EUR, the SNP frequencies of 7/11 in ASW, 9/11 in MXL, 9/11 in JPT, and 11/11 in CHS were significantly different. HU or gout risk allele indices were 5, 6, 9, and 11 in ASW, MXL, CHS, and JPT, respectively. Out of the 11 SNPs, the percentage of risk alleles in CHS and JPT was 100%.
Compared with non-U.S. populations, the prevalence of HU and gout appears to be higher in Western countries. Compared with EUR, CHS and JPT populations had the highest HU or gout risk allele frequencies, followed by MXL and ASW. These results suggest that individuals of Asian descent are at higher HU and gout risk, which may partly explain the nearly three-fold higher gout prevalence among Asians versus Caucasians in ambulatory care settings. Furthermore, gout remains a disease of developed countries, with a marked global rise.
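The significance threshold quoted in this abstract is consistent with dividing a family-wise alpha of 0.05 by the 11 SNPs tested (0.05 / 11 ≈ 0.0045); the alpha of 0.05 is an assumption, since the abstract does not state it. The sketch below shows that arithmetic together with an uncorrected Pearson chi-square test on a 2×2 allele-count table. The allele counts are hypothetical and are not taken from the 1000 Genomes Project.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumed family-wise alpha of 0.05 across the 11 SNPs in the abstract.
threshold = bonferroni_threshold(0.05, 11)

# Hypothetical counts: (risk, other) alleles in a test population and in EUR.
stat, p_value = chi_square_2x2(100, 100, 300, 700)
print(f"threshold={threshold:.4f} chi2={stat:.1f} significant={p_value < threshold}")
```

For small expected cell counts, Fisher's exact test would be the more appropriate choice, as the abstract notes; it is omitted here to keep the sketch dependency-free.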


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Benedikt Frieg ◽  
Boris Görg ◽  
Holger Gohlke ◽  
Dieter Häussinger

Abstract Glutamine synthetase (GS) in the liver is expressed in a small perivenous, highly specialized hepatocyte population and is essential for the maintenance of low, non-toxic ammonia levels in the organism. However, GS activity can be impaired by tyrosine nitration of the enzyme in response to oxidative/nitrosative stress in a pH-sensitive way. The underlying molecular mechanism as investigated by combined molecular simulations and in vitro experiments indicates that tyrosine nitration can lead to a fully reversible and pH-sensitive regulation of protein function. This approach was also used to understand the functional consequences of several recently described point mutations of human GS with clinical relevance and to suggest an approach to restore impaired GS activity.


Author(s):  
Jose A. Gallud ◽  
Monica Carreño ◽  
Ricardo Tesoriero ◽  
Andrés Sandoval ◽  
María D. Lozano ◽  
...  

Abstract Technology-based education of children with special needs has become the focus of many research works in recent years. The wide range of different disabilities that are encompassed by the term “special needs”, together with the educational requirements of the children affected, represents an enormous multidisciplinary challenge for the research community. In this article, we present a systematic literature review of technology-enhanced and game-based learning systems and methods applied to children with special needs. The article analyzes the state of the art of research in this field by selecting a group of primary studies and answering a set of research questions. Although there are some previous systematic reviews, it is still not clear which tools, games, or academic subjects (with technology-enhanced, game-based learning) are best among those that have obtained good results with children with special needs. The 18 articles selected (carefully filtered out of 614 contributions) have been used to reveal the most frequent disabilities, the different technologies used in the prototypes, the number of learning subjects, and the kind of learning games used. The article also summarizes research opportunities identified in the primary studies.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Jianfeng Xu ◽  
Jiejun Shi ◽  
Xiaodong Cui ◽  
Ya Cui ◽  
Jingyi Jessica Li ◽  
...  

Abstract Promoter DNA methylation is a well-established mechanism of transcription repression, though its global correlation with gene expression is weak. This weak correlation can be attributed to the failure of current methylation quantification methods to consider the heterogeneity among sequenced bulk cells. Here, we introduce Cell Heterogeneity–Adjusted cLonal Methylation (CHALM) as a methylation quantification method. CHALM improves understanding of the functional consequences of DNA methylation, including its correlations with gene expression and H3K4me3. When applied to different methylation datasets, the CHALM method enables detection of differentially methylated genes that exhibit distinct biological functions supporting underlying mechanisms.
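Why cell heterogeneity matters for quantification can be seen in a small sketch. The abstract does not give CHALM's formula, so the read-level measure below (the fraction of reads carrying at least one methylated CpG) is an assumption used only for illustration, and both function names and the toy reads are hypothetical. Each read is represented as a list of 0/1 CpG states.

```python
def mean_methylation(reads):
    """Conventional site-averaged level: methylated CpGs / all CpGs observed."""
    total_sites = sum(len(read) for read in reads)
    if total_sites == 0:
        raise ValueError("no CpG sites observed")
    return sum(sum(read) for read in reads) / total_sites


def read_level_methylation(reads):
    """Assumed heterogeneity-aware level: fraction of reads with >= 1 methylated CpG."""
    if not reads:
        raise ValueError("no reads observed")
    return sum(any(read) for read in reads) / len(reads)


# Hypothetical promoter A: one fully methylated read, three unmethylated reads.
clonal = [[1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
# Hypothetical promoter B: every read carries a single methylated CpG.
scattered = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

for name, reads in (("clonal", clonal), ("scattered", scattered)):
    print(name, mean_methylation(reads), read_level_methylation(reads))
```

Both toy promoters have the same site-averaged level, yet they differ sharply at the read level, which is the sort of distinction a bulk average cannot capture.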

