PyVar: An Extensible Framework for Variant Annotator Comparison

Abstract: Modern genomics projects are generating millions of variant calls that must be annotated for predicted functional consequences at the level of gene expression and protein function. Many of these variants are of interest owing to their potential clinical significance. Unfortunately, state-of-the-art methods do not always agree on the downstream effects of a given variant. Here we present a readily extensible Python framework (PyVar) for comparing the output of variant annotation methods, to help the research community quickly assess differences between methods and benchmark new methods as they are developed. We also apply our framework to assess the annotation performance of ANNOVAR, VEP, and SnpEff when annotating 81 million variants from the ‘1000 Genomes Project’ against both the RefSeq and Ensembl human transcript sets.
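
The kind of comparison described in this abstract can be illustrated with a small per-variant agreement calculation. This is only a sketch under stated assumptions: the function name `pairwise_concordance`, the input layout (one dict per annotator, mapping a variant key to a single consequence term), and the toy records are hypothetical. They are not PyVar's interface and not real annotator output, and the sketch assumes consequence terms have already been harmonized to a shared vocabulary.

```python
from itertools import combinations


def pairwise_concordance(annotations):
    """Return the fraction of shared variants on which each annotator pair agrees.

    annotations: {annotator_name: {variant_key: consequence_term}}
    A pair with no variants in common maps to None.
    """
    results = {}
    for a, b in combinations(sorted(annotations), 2):
        # Only variants annotated by both tools can be compared.
        shared = annotations[a].keys() & annotations[b].keys()
        if not shared:
            results[(a, b)] = None
            continue
        agree = sum(1 for v in shared if annotations[a][v] == annotations[b][v])
        results[(a, b)] = agree / len(shared)
    return results


# Hypothetical toy records, keyed as "chrom:pos:ref>alt" (an assumed convention).
toy = {
    "ANNOVAR": {"1:1000:A>G": "missense", "1:2000:C>T": "synonymous",
                "2:500:G>A": "stop_gained"},
    "VEP": {"1:1000:A>G": "missense", "1:2000:C>T": "missense",
            "2:500:G>A": "stop_gained"},
    "SnpEff": {"1:1000:A>G": "missense", "1:2000:C>T": "synonymous"},
}

concordance = pairwise_concordance(toy)
for pair, rate in concordance.items():
    print(pair, rate)
```

Restricting each comparison to the variants both tools annotated keeps missing annotations from being counted as disagreements; whether that is the right choice depends on the benchmarking question.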

Inference of recent admixture using genotype data

10.1101/2020.09.16.300640 ◽

2020 ◽

Author(s):

Peter Pfaffelhuber ◽

Elisabeth Sester-Huss ◽

Franz Baumdicker ◽

Jana Naue ◽

Sabine Lutz-Bonengel ◽

...

Keyword(s):

State Of The Art ◽

Forensic Genetics ◽

Statistical Test ◽

Genotype Data ◽

1000 Genomes Project ◽

Additional Information ◽

1000 Genomes ◽

Project Data ◽

Ancestry Proportions ◽

Individual Ancestry

Abstract: The inference of biogeographic ancestry (BGA) has become a focus of forensic genetics. Mis-inference of BGA can have profound unwanted consequences for investigations and society. We show that recent admixture can lead to misclassification and erroneous inference of ancestry proportions, using state-of-the-art analysis tools with (i) simulations, (ii) 1000 Genomes Project data, and (iii) two individuals analyzed using the ForenSeq DNA Signature Prep Kit. Subsequently, we extend existing tools for estimation of individual ancestry (IA) by allowing for different IA in both parents, leading to estimates of parental individual ancestry (PIA) and a statistical test for recent admixture. Estimation of PIA outperforms IA in most scenarios of recent admixture. Furthermore, PIA can provide additional information about parental ancestry that may guide casework.
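
The difference between IA and PIA can be made concrete with genotype probabilities at a single biallelic marker. The sketch below assumes the usual admixture-model form, in which an allele is drawn with frequency equal to the ancestry-weighted average of the population allele frequencies; the abstract does not spell out the likelihood, so this form, the function names, and the toy numbers are all assumptions for illustration.

```python
def mixed_freq(q, p):
    """Ancestry-weighted allele frequency: sum over populations of q_k * p_k."""
    return sum(qk * pk for qk, pk in zip(q, p))


def genotype_probs_ia(q, p):
    """P(genotype = 0, 1, 2 copies) when both alleles share one ancestry vector q."""
    b = mixed_freq(q, p)
    return [(1 - b) ** 2, 2 * b * (1 - b), b ** 2]


def genotype_probs_pia(q_mother, q_father, p):
    """P(genotype = 0, 1, 2 copies) when each parent has its own ancestry vector."""
    bm = mixed_freq(q_mother, p)
    bf = mixed_freq(q_father, p)
    return [(1 - bm) * (1 - bf), bm * (1 - bf) + (1 - bm) * bf, bm * bf]


# Hypothetical marker: allele fixed in population 1, absent in population 2.
p = [1.0, 0.0]

# A child of one parent from each population.
ia = genotype_probs_ia([0.5, 0.5], p)
pia = genotype_probs_pia([1.0, 0.0], [0.0, 1.0], p)
print(ia)   # [0.25, 0.5, 0.25]
print(pia)  # [0.0, 1.0, 0.0]
```

In this toy case the single-vector model still gives probability 0.5 to the homozygous genotypes, while the two-parent model predicts a heterozygote with certainty. That excess heterozygosity at ancestry-informative markers is the kind of signal a test for recent admixture can use.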

Use of >100,000 NHLBI Trans-Omics for Precision Medicine (TOPMed) Consortium whole genome sequences improves imputation quality and detection of rare variant associations in admixed African and Hispanic/Latino populations

10.1101/683201 ◽

2019 ◽

Cited By ~ 1

Author(s):

Madeline H. Kowalski ◽

Huijun Qian ◽

Ziyi Hou ◽

Jonathan D. Rosen ◽

Amanda L. Tapia ◽

...

Keyword(s):

Complex Traits ◽

Rare Variants ◽

State Of The Art ◽

African Ancestry ◽

Genotype Imputation ◽

Reference Panel ◽

Whole Genome ◽

1000 Genomes Project ◽

1000 Genomes ◽

Genome Wide

Abstract: Most genome-wide association and fine-mapping studies to date have been conducted in individuals of European descent, and genetic studies of populations of Hispanic/Latino and African ancestry are still limited. In addition to the limited inclusion of these populations in genetic studies, these populations have more complex linkage disequilibrium structure that may reduce the number of variants associated with a phenotype. In order to better define the genetic architecture of these understudied populations, we leveraged >100,000 phased sequences available from deep-coverage whole genome sequencing through the multi-ethnic NHLBI Trans-Omics for Precision Medicine (TOPMed) program to impute genotypes into admixed African and Hispanic/Latino samples with commercial genome-wide genotyping array data. We demonstrate that using TOPMed sequencing data as the imputation reference panel improves genotype imputation quality in these populations, which subsequently enhances gene-mapping power for complex traits. For rare variants with minor allele frequency (MAF) < 0.5%, we observed a 2.3- to 6.1-fold increase in the number of well-imputed variants, with 11-34% improvement in average imputation quality, compared to the state-of-the-art 1000 Genomes Project Phase 3 and Haplotype Reference Consortium reference panels, respectively. Impressively, even for extremely rare variants with sample minor allele count <10 (including singletons) in the imputation target samples, average information content rescued was >86%. Subsequent association analyses of TOPMed reference panel-imputed genotype data with hematological traits (hemoglobin (HGB), hematocrit (HCT), and white blood cell count (WBC)) in ~20,000 self-identified African descent individuals and ~23,000 self-identified Hispanic/Latino individuals identified associations with two rare variants in the HBB gene (rs33930165 with higher WBC (p=8.1×10⁻¹²) in African populations, rs11549407 with lower HGB (p=1.59×10⁻¹²) and HCT (p=1.13×10⁻⁹) in Hispanics/Latinos). By comparison, neither variant would have been genome-wide significant if either the 1000 Genomes Project Phase 3 or the Haplotype Reference Consortium reference panel had been used for imputation. Our findings highlight the utility of the TOPMed imputation reference panel for identification of novel associations between rare variants and complex traits not previously detected in similar-sized genome-wide studies of under-represented African and Hispanic/Latino populations.

Author summary: Admixed African and Hispanic/Latino populations remain understudied in genome-wide association and fine-mapping studies of complex diseases. These populations have more complex linkage disequilibrium (LD) structure that can impair mapping of variants associated with complex diseases and their risk factors. Genotype imputation represents an approach to improve genome coverage, especially for rare or ancestry-specific variation; however, these understudied populations also have smaller relevant imputation reference panels that need to be expanded to represent their more complex LD patterns. In this study, we leveraged >100,000 phased sequences generated from the multi-ethnic NHLBI TOPMed project to impute in admixed cohorts encompassing ~20,000 individuals of African ancestry (AAs) and ~23,000 Hispanics/Latinos. We demonstrated substantially higher imputation quality for low-frequency and rare variants in comparison to the state-of-the-art reference panels (1000 Genomes Project and Haplotype Reference Consortium). Association analyses of ~35 million (AAs) and ~27 million (Hispanics/Latinos) variants passing stringent post-imputation filtering with quantitative hematological traits led to the discovery of associations with two rare variants in the HBB gene; one of these variants was replicated in an independent sample, and the other is known to cause anemia in the homozygous state. By comparison, the same HBB variants would not have been genome-wide significant using other state-of-the-art reference panels due to lower imputation quality. Our findings demonstrate the power of the TOPMed whole genome sequencing data for imputation and subsequent association analysis in admixed African and Hispanic/Latino populations.
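
The two rarity cut-offs quoted in the abstract (MAF < 0.5% and minor allele count < 10) can be illustrated with a short calculation. This is a minimal sketch, assuming hard-called genotypes coded as alternate-allele counts of 0, 1, or 2 with `None` for missing calls; the function name `minor_allele_stats` and the toy genotype vector are hypothetical and are not taken from the study.

```python
def minor_allele_stats(genotypes):
    """Return (minor allele count, minor allele frequency) for one variant.

    genotypes: alternate-allele counts (0, 1, or 2) per diploid individual,
    with None marking a missing call. Returns None if nothing was called.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt = sum(called)
    total = 2 * len(called)      # two alleles per called individual
    mac = min(alt, total - alt)  # the minor allele is whichever is rarer
    return mac, mac / total


# Hypothetical variant: 4 heterozygotes among 999 called samples, 1 missing.
toy = [0] * 995 + [1] * 4 + [None]
mac, maf = minor_allele_stats(toy)

is_rare = maf < 0.005            # MAF < 0.5%, the threshold quoted above
is_extremely_rare = mac < 10     # minor allele count < 10, as quoted above
print(mac, round(maf, 6), is_rare, is_extremely_rare)
```

Imputed data usually carry fractional dosages rather than hard calls; the same arithmetic applies to dosages, but that extension is not shown here.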

1000 Genomes Project reveals human variation

Nature ◽

10.1038/news.2010.567 ◽

2010 ◽

Cited By ~ 3

Author(s):

Alla Katsnelson

Keyword(s):

Human Variation ◽

1000 Genomes Project ◽

1000 Genomes

Clinical significance of Notch and VEGF gene expression in serious ovarian carcinoma

10.26226/morressier.5770e29cd462b80290b4c15c ◽

2016 ◽

Author(s):

SANG GEUN JUNG

Keyword(s):

Gene Expression ◽

Ovarian Carcinoma ◽

Clinical Significance ◽

Vegf Gene

State-of-the-art of Cluster Analysis of Gene Expression Data

ACTA AUTOMATICA SINICA ◽

10.3724/sp.j.1004.2008.00113 ◽

2009 ◽

Vol 34 (2) ◽

pp. 113-120 ◽

Cited By ~ 3

Author(s):

Feng YUE

Keyword(s):

Gene Expression ◽

Cluster Analysis ◽

Gene Expression Data ◽

State Of The Art ◽

Expression Data

Clinical Significance of Umami Taste and Umami-Related Gene Expression Analysis for the Objective Assessment of Umami Taste Loss

Current Pharmaceutical Design ◽

10.2174/1381612822666160216150511 ◽

2016 ◽

Vol 22 (15) ◽

pp. 2238-2244 ◽

Cited By ~ 6

Author(s):

Noriaki Shoji ◽

Shizuko Satoh-Kuriwada ◽

Takashi Sasano

Keyword(s):

Gene Expression ◽

Clinical Significance ◽

Expression Analysis ◽

Gene Expression Analysis ◽

Objective Assessment ◽

Related Gene ◽

Umami Taste

The Epidemiology and Genetics of Hyperuricemia and Gout across Major Racial Groups: A Literature Review and Population Genetics Secondary Database Analysis

Journal of Personalized Medicine ◽

10.3390/jpm11030231 ◽

2021 ◽

Vol 11 (3) ◽

pp. 231

Author(s):

Faven Butler ◽

Ali Alghubayshi ◽

Youssef Roman

Keyword(s):

Literature Review ◽

Risk Allele ◽

Statistical Significance ◽

Elevated Serum ◽

The United States ◽

Allele Frequencies ◽

Racial Groups ◽

1000 Genomes Project ◽

1000 Genomes ◽

Risk Alleles

Abstract: Gout is an inflammatory condition caused by elevated serum urate (SU), a condition known as hyperuricemia (HU). Genetic variations, including single nucleotide polymorphisms (SNPs), can alter the function of urate transporters, leading to differential HU and gout prevalence across different populations. In the United States (U.S.), gout prevalence differentially affects certain racial groups. The objective of this proposed analysis is to compare the frequency of urate-related genetic risk alleles between Europeans (EUR) and the following major racial groups: Africans in Southwest U.S. (ASW), Han-Chinese (CHS), Japanese (JPT), and Mexican (MXL) from the 1000 Genomes Project. The Ensembl genome browser of the 1000 Genomes Project was used to conduct cross-population allele frequency comparisons of 11 SNPs across 11 genes that are physiologically involved in, and significantly associated with, SU levels and gout risk. Gene/SNP pairs included: ABCG2 (rs2231142), SLC2A9 (rs734553), SLC17A1 (rs1183201), SLC16A9 (rs1171614), GCKR (rs1260326), SLC22A11 (rs2078267), SLC22A12 (rs505802), INHBC (rs3741414), RREB1 (rs675209), PDZK1 (rs12129861), and NRXN2 (rs478607). Allele frequencies were compared to EUR using the Chi-Square or Fisher’s Exact test, as appropriate. Bonferroni correction for multiple comparisons was used, with p < 0.0045 for statistical significance. A risk allele was defined as the allele associated with baseline or higher HU and gout risk. The cumulative HU or gout risk allele index of the 11 SNPs was estimated for each population. The prevalence of HU and gout in U.S. and non-U.S. populations was evaluated using published epidemiological data and literature review. Compared with EUR, the SNP frequencies of 7/11 in ASW, 9/11 in MXL, 9/11 in JPT, and 11/11 in CHS were significantly different. HU or gout risk allele indices were 5, 6, 9, and 11 in ASW, MXL, CHS, and JPT, respectively. Out of the 11 SNPs, the percentage of risk alleles in CHS and JPT was 100%. Compared to non-U.S. populations, the prevalence of HU and gout appears to be higher in Western countries. Compared with EUR, CHS and JPT populations had the highest HU or gout risk allele frequencies, followed by MXL and ASW. These results suggest that individuals of Asian descent are at higher HU and gout risk, which may partly explain the nearly three-fold higher gout prevalence among Asians versus Caucasians in ambulatory care settings. Furthermore, gout remains a disease of developed countries with a marked global rise.
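
The significance threshold quoted in this abstract is consistent with dividing a family-wise level of 0.05 by the 11 SNPs tested (0.05 / 11 ≈ 0.0045). The sketch below reproduces that arithmetic and applies it to one Pearson chi-square test on a 2×2 table of allele counts. The starting level of 0.05 is inferred rather than stated in the abstract, the allele counts are hypothetical rather than from the study, and the function `chi_square_2x2` is an illustrative helper that omits both the continuity correction and the Fisher's exact fallback used for small counts.

```python
import math

ALPHA = 0.05    # assumed family-wise error rate (not stated in the abstract)
N_TESTS = 11    # one test per SNP
threshold = ALPHA / N_TESTS  # Bonferroni-corrected per-test threshold


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom and no continuity
    correction. For 1 df, the upper-tail probability is erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    stat = n * (a * d - b * c) ** 2 / denom
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical allele counts: (risk, non-risk) in a reference and a comparison group.
stat, p = chi_square_2x2(300, 700, 120, 80)
significant = p < threshold
print(round(threshold, 4), round(stat, 2), significant)
```

Bonferroni is conservative when tests are correlated, but with 11 SNPs in 11 different genes the tests are plausibly close to independent.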

Glutamine synthetase as a central element in hepatic glutamine and ammonia metabolism: novel aspects

Biological Chemistry ◽

10.1515/hsz-2021-0166 ◽

2021 ◽

Vol 0 (0) ◽

Author(s):

Benedikt Frieg ◽

Boris Görg ◽

Holger Gohlke ◽

Dieter Häussinger

Keyword(s):

Glutamine Synthetase ◽

Protein Function ◽

Nitrosative Stress ◽

Point Mutations ◽

Central Element ◽

Ph Sensitive ◽

Tyrosine Nitration ◽

Ammonia Metabolism ◽

Functional Consequences

Abstract: Glutamine synthetase (GS) in the liver is expressed in a small perivenous, highly specialized hepatocyte population and is essential for the maintenance of low, non-toxic ammonia levels in the organism. However, GS activity can be impaired by tyrosine nitration of the enzyme in response to oxidative/nitrosative stress in a pH-sensitive way. The underlying molecular mechanism as investigated by combined molecular simulations and in vitro experiments indicates that tyrosine nitration can lead to a fully reversible and pH-sensitive regulation of protein function. This approach was also used to understand the functional consequences of several recently described point mutations of human GS with clinical relevance and to suggest an approach to restore impaired GS activity.

Technology-enhanced and game based learning for children with special needs: a systematic mapping study

Universal Access in the Information Society ◽

10.1007/s10209-021-00824-0 ◽

2021 ◽

Author(s):

Jose A. Gallud ◽

Monica Carreño ◽

Ricardo Tesoriero ◽

Andrés Sandoval ◽

María D. Lozano ◽

...

Keyword(s):

Special Needs ◽

State Of The Art ◽

Research Community ◽

Systematic Mapping Study ◽

Children With Special Needs ◽

Game Based Learning ◽

Mapping Study ◽

Systematic Mapping ◽

Wide Range ◽

Research Questions

Abstract: Technology-based education of children with special needs has become the focus of many research works in recent years. The wide range of different disabilities that are encompassed by the term “special needs”, together with the educational requirements of the children affected, represents an enormous multidisciplinary challenge for the research community. In this article, we present a systematic literature review of technology-enhanced and game-based learning systems and methods applied to children with special needs. The article analyzes the state of the art of research in this field by selecting a group of primary studies and answering a set of research questions. Although there are some previous systematic reviews, it is still not clear which tools, games, or academic subjects (with technology-enhanced, game-based learning) are best among those that have obtained good results with children with special needs. The 18 articles selected (carefully filtered out of 614 contributions) have been used to reveal the most frequent disabilities, the different technologies used in the prototypes, the number of learning subjects, and the kind of learning games used. The article also summarizes research opportunities identified in the primary studies.

Cellular Heterogeneity–Adjusted cLonal Methylation (CHALM) improves prediction of gene expression

Nature Communications ◽

10.1038/s41467-020-20492-7 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Jianfeng Xu ◽

Jiejun Shi ◽

Xiaodong Cui ◽

Ya Cui ◽

Jingyi Jessica Li ◽

...

Keyword(s):

Gene Expression ◽

Dna Methylation ◽

Cellular Heterogeneity ◽

Biological Functions ◽

Global Correlation ◽

Differentially Methylated Genes ◽

Promoter Dna Methylation ◽

Functional Consequences ◽

Promoter Dna ◽

Underlying Mechanisms

Abstract: Promoter DNA methylation is a well-established mechanism of transcription repression, though its global correlation with gene expression is weak. This weak correlation can be attributed to the failure of current methylation quantification methods to consider the heterogeneity among sequenced bulk cells. Here, we introduce Cell Heterogeneity–Adjusted cLonal Methylation (CHALM) as a methylation quantification method. CHALM improves understanding of the functional consequences of DNA methylation, including its correlations with gene expression and H3K4me3. When applied to different methylation datasets, the CHALM method enables detection of differentially methylated genes that exhibit distinct biological functions supporting underlying mechanisms.
