ConsHMM Atlas: conservation state annotations for major genomes and human genetic variation

2020 ◽  
Vol 2 (4) ◽  
Author(s):  
Adriana Arneson ◽  
Brooke Felsheim ◽  
Jennifer Chien ◽  
Jason Ernst

Abstract ConsHMM is a method recently introduced to annotate genomes into conservation states, which are defined based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multi-species DNA sequence alignment. Previously, ConsHMM was only applied to a single genome for one multi-species sequence alignment. Here, we apply ConsHMM to produce 22 additional genome annotations covering human and seven other organisms for a variety of multi-species alignments. Additionally, we extend ConsHMM to generate allele-specific annotations, which we use to produce conservation state annotations for every possible single-nucleotide mutation in the human genome. Finally, we provide a web interface to interactively visualize parameters and annotation enrichments for ConsHMM models. These annotations and visualizations comprise the ConsHMM Atlas, which we expect will be a valuable resource for analyzing a variety of major genomes and genetic variation.
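As a rough illustration of what "every possible single-nucleotide mutation" covers, the sketch below enumerates the three alternative alleles at each position of a reference sequence. The function name `enumerate_snvs` and the toy sequence are hypothetical and are not part of ConsHMM; the actual allele-specific annotation procedure is not described in this abstract.

```python
from typing import Iterator, Tuple

BASES = "ACGT"


def enumerate_snvs(reference: str, start: int = 1) -> Iterator[Tuple[int, str, str]]:
    """Yield (position, ref_base, alt_base) for every possible substitution.

    Positions are 1-based by default. Ambiguous bases such as N are skipped.
    """
    for offset, ref_base in enumerate(reference.upper()):
        if ref_base not in BASES:
            continue  # no defined substitutions for N or other ambiguity codes
        for alt_base in BASES:
            if alt_base != ref_base:
                yield (start + offset, ref_base, alt_base)


# Toy reference: three unambiguous bases give 3 x 3 = 9 possible SNVs.
snvs = list(enumerate_snvs("ACGN"))
print(len(snvs))
```

Each unambiguous reference base contributes exactly three candidate variants, so the number of annotations needed grows as three times the number of annotated bases.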


2018 ◽  
Author(s):  
Adriana Sperlea ◽  
Jason Ernst

Abstract Comparative genomics sequence data is an important source of information for interpreting genomes. Genome-wide annotations based on this data have largely focused on univariate scores or binary calls of evolutionary constraint. Here we present a complementary whole genome annotation approach, ConsHMM, which applies a multivariate hidden Markov model to learn de novo different ‘conservation states’ based on the combinatorial and spatial patterns of which species align to and match a reference genome in a multiple species DNA sequence alignment. We applied ConsHMM to a 100-way vertebrate sequence alignment to annotate the human genome at single nucleotide resolution into 100 different conservation states. These states have distinct enrichments for other genomic information including gene annotations, chromatin states, and repeat families, which were used to characterize their biological significance. Conservation states have greater or complementary predictive information than standard constraint based measures for a variety of genome annotations. Bases in constrained elements have distinct heritability enrichments depending on the conservation state assignment, demonstrating their relevance to analyzing phenotypic associated variation. The conservation states also highlight differences in the conservation patterns of bases prioritized by a number of scores used for variant prioritization. The ConsHMM method and conservation state annotations provide a valuable resource for interpreting genomes and genetic variation.
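To make the input to such a model more concrete, the sketch below encodes one alignment column as a per-species observation of whether each species aligns to and matches the reference base. The three-way encoding (match, mismatch, no alignment), the function name `encode_column`, and the species in the example are assumptions made for illustration; the abstract only states that states are learned from patterns of which species align to and match the reference.

```python
from typing import Dict

# Assumed observation codes for one species at one reference position.
MATCH, MISMATCH, NO_ALIGN = 0, 1, 2


def encode_column(ref_base: str, species_bases: Dict[str, str]) -> Dict[str, int]:
    """Encode one alignment column relative to the reference base.

    species_bases maps a species name to its aligned base, with "-", "."
    or an empty string meaning that the species does not align here.
    """
    encoded = {}
    for species, base in species_bases.items():
        if base in ("-", ".", ""):
            encoded[species] = NO_ALIGN
        elif base.upper() == ref_base.upper():
            encoded[species] = MATCH
        else:
            encoded[species] = MISMATCH
    return encoded


# Hypothetical column: reference base A with three other species.
column = encode_column("A", {"chimp": "A", "mouse": "G", "zebrafish": "-"})
print(column)
```

A sequence of such vectors, one per reference position, is the kind of multivariate observation a hidden Markov model could segment into states.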


Author(s):  
Jhons Fatriyadi SUWANDI ◽  
Widya ASMARA ◽  
Hari KUSNANTO ◽  
Din SYAFRUDDIN ◽  
Supargiyono SUPARGIYONO

Background: Malaria is an infectious disease caused by Plasmodium sp. that is still prevalent in some parts of Indonesia. The District of Pesawaran is one of the malaria-endemic areas in the Province of Lampung. The purpose of this study was to evaluate the efficacy of ACT treatment in the District of Pesawaran, Province of Lampung, Indonesia, from Dec 2012 to Jul 2013, and to study the genetic variation of Plasmodium falciparum. Methods: This was an observational analytic study of falciparum malaria patients treated with ACT and primaquine (DHP-PQ and AAQ-PQ) at Hanura Primary Health Centre (Puskesmas). DNA was isolated with the QIAamp DNA Mini Kit. The PfMDR1, MSP1, and MSP2 genes were amplified with appropriate forward and reverse primers, using procedures that were optimized first. The PCR product of the PfMDR1 gene was prepared for sequencing. Data analysis was done with MEGA 6 software. Results: DHP-PQ remained effective among falciparum malaria patients in the District of Pesawaran, Province of Lampung, Indonesia. A single-nucleotide mutation, N86Y, was found in the PfMDR1 gene. The dominant alleles found were MAD20 and 3D7, and the multiplicity of infection (MOI) was low. Conclusion: DHP-PQ therapy for falciparum malaria in Pesawaran District, Lampung, Indonesia, is still effective. The genetic variation found was the N86Y SNP in the PfMDR1 gene, with MAD20 and 3D7 as the dominant alleles.


2019 ◽  
Author(s):  
Michael D. Kessler ◽  
Douglas P. Loesch ◽  
James A. Perry ◽  
Nancy L. Heard-Costa ◽  
Brian E. Cade ◽  
...  

Abstract De novo mutations (DNMs), or mutations that appear in an individual despite not being seen in their parents, are an important source of genetic variation whose impact is relevant to studies of human evolution, genetics, and disease. Utilizing high-coverage whole genome sequencing data as part of the Trans-Omics for Precision Medicine (TOPMed) program, we directly estimate and analyze DNM counts, rates, and spectra from 1,465 trios across an array of diverse human populations. Using the resulting call set of 86,865 single nucleotide DNMs, we find a significant positive correlation between local recombination rate and local DNM rate, which together can explain up to 35.5% of the genome-wide variation in population level rare genetic variation from 41K unrelated TOPMed samples. While genome-wide heterozygosity does correlate weakly with DNM count, we do not find significant differences in DNM rate between individuals of European, African, and Latino ancestry, nor across ancestrally distinct segments within admixed individuals. However, interestingly, we do find significantly fewer DNMs in Amish individuals compared with other Europeans, even after accounting for parental age and sequencing center. Specifically, we find significant reductions in the number of T→C mutations in the Amish, which seems to underpin their overall reduction in DNMs. Finally, we calculate near-zero estimates of narrow sense heritability (h²), which suggest that variation in DNM rate is significantly shaped by non-additive genetic effects and/or the environment, and that a less mutagenic environment may be responsible for the reduced DNM rate in the Amish.
Significance Here we provide one of the largest and most diverse human de novo mutation (DNM) call sets to date, and use it to quantify the genome-wide relationship between local mutation rate and population-level rare genetic variation. While we demonstrate that the human single nucleotide mutation rate is similar across numerous human ancestries and populations, we also discover a reduced mutation rate in the Amish founder population, which shows that mutation rates can shift rapidly. Finally, we find that variation in mutation rates is not heritable, which suggests that the environment may influence mutation rates more significantly than previously realized.
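The definition of a DNM given above (a variant seen in a child but in neither parent) can be sketched as a simple genotype check, followed by the average number of DNMs per trio implied by the reported totals. The function name and the genotype encoding are hypothetical, and real call sets depend on read-depth and quality filters that this sketch omits.

```python
from typing import Tuple

Genotype = Tuple[int, int]  # allele indices, 0 = reference allele


def is_candidate_dnm(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """Return True if the child is heterozygous and both parents are homozygous reference."""
    parents_hom_ref = mother == (0, 0) and father == (0, 0)
    child_het = sorted(child) == [0, 1]
    return parents_hom_ref and child_het


# Totals reported in the abstract.
TOTAL_DNMS = 86_865
TRIOS = 1_465

mean_dnms_per_trio = TOTAL_DNMS / TRIOS
print(round(mean_dnms_per_trio, 1))  # about 59.3 single nucleotide DNMs per trio
```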


Pathogens ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 363
Author(s):  
Sulochana K. Wasala ◽  
Dana K. Howe ◽  
Louise-Marie Dandurand ◽  
Inga A. Zasada ◽  
Dee R. Denver

Globodera pallida is among the most significant plant-parasitic nematodes worldwide, causing major damage to potato production. Since it was discovered in Idaho in 2006, eradication efforts have aimed to contain and eradicate G. pallida through phytosanitary action and soil fumigation. In this study, we investigated genome-wide patterns of G. pallida genetic variation across Idaho fields to evaluate whether the infestation resulted from a single or multiple introduction(s) and to investigate potential evolutionary responses since the time of infestation. A total of 53 G. pallida samples (~1,042,000 individuals) were collected and analyzed, representing five different fields in Idaho, a greenhouse population, and a field in Scotland that was used for external comparison. According to genome-wide allele frequency and fixation index (Fst) analyses, most of the genetic variation was shared among the G. pallida populations in Idaho fields pre-fumigation, indicating that the infestation likely resulted from a single introduction. Temporal patterns of genome-wide polymorphisms involving (1) pre-fumigation field samples collected in 2007 and 2014 and (2) pre- and post-fumigation samples revealed nucleotide variants (SNPs, single-nucleotide polymorphisms) with significantly differentiated allele frequencies indicating genetic differentiation. This study provides insights into the genetic origins and adaptive potential of G. pallida invading new environments.
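For readers unfamiliar with the fixation index, the sketch below computes Fst at a single biallelic locus from the allele frequencies of two populations, as the proportional reduction in expected heterozygosity within populations relative to the pooled total. The equal weighting of the two populations and the function name `fst_two_pops` are assumptions; the estimator actually used in the study is not stated in the abstract.

```python
def fst_two_pops(p1: float, p2: float) -> float:
    """Fst = (H_total - H_within) / H_total for one biallelic locus.

    p1 and p2 are the frequencies of the same allele in each population.
    Both populations are weighted equally (an assumption of this sketch).
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_total == 0:
        return 0.0  # locus is monomorphic overall, so there is no differentiation to measure
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population heterozygosity
    return (h_total - h_within) / h_total


print(fst_two_pops(0.5, 0.5))  # identical frequencies: no differentiation
print(fst_two_pops(0.6, 0.4))  # modest frequency difference
print(fst_two_pops(1.0, 0.0))  # fixed for different alleles: complete differentiation
```

Values near zero across most loci are what one would expect when populations share most of their variation, which is the pattern the abstract reports for the Idaho fields before fumigation.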


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Melek Pehlivan ◽  
Tülay K. Ayna ◽  
Maşallah Baran ◽  
Mustafa Soyöz ◽  
Aslı Ö. Koçyiğit ◽  
...  

Abstract Objectives There are several hypotheses on the effects of the rs1738074 T/C single nucleotide polymorphism in the TAGAP gene; however, there has been no study on Turkish pediatric patients. We aimed to investigate the association of celiac disease (CD) and type 1 diabetes mellitus (T1DM) comorbidity with the polymorphism in the TAGAP gene in Turkish pediatric patients. Methods In total, 127 pediatric CD patients and 100 healthy children were included. We determined the polymorphism by the allele-specific polymerase chain reaction method. We used IBM SPSS Statistics version 25.0 and Arlequin 3.5.2 for the statistical analyses. The authors have no conflict of interest. Results Among patients with CD only, the C allele accounted for 72% (n=154) of alleles and the T allele for 28% (n=60). Among patients with both CD and T1DM, the T and C alleles accounted for 42.5% (n=17) and 57.5% (n=23) of alleles, respectively. In the control group, the C allele accounted for 67% (n=134) of alleles and the T allele for 33% (n=66). Conclusions There was no significant difference in the genotype and allele frequencies between the patient and control groups (p>0.05). There was no significant association between the disease risk and the polymorphism in our study group.
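As a worked check of the allele-frequency comparison, the sketch below runs a Pearson chi-square test on the 2×2 table of allele counts reported for the CD-only patients and the controls. This is an illustrative calculation only: the abstract does not say which test SPSS or Arlequin was used to run, and no continuity correction is applied here.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic for the table [[a, b], [c, d]] (no continuity correction)."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Allele counts from the abstract: (C, T) for CD-only patients and for controls.
cd_only = (154, 60)
controls = (134, 66)

chi2 = chi_square_2x2(cd_only[0], cd_only[1], controls[0], controls[1])
CRITICAL_VALUE_DF1_ALPHA_05 = 3.841  # chi-square critical value, 1 degree of freedom, alpha = 0.05

print(round(chi2, 2))
print(chi2 < CRITICAL_VALUE_DF1_ALPHA_05)  # True means no significant difference at the 0.05 level
```

The statistic falls below the critical value, which agrees with the reported p>0.05 for allele frequencies.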


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
M. Joseph Tomlinson ◽  
Shawn W. Polson ◽  
Jing Qiu ◽  
Juniper A. Lake ◽  
William Lee ◽  
...  

Abstract Differential abundance of allelic transcripts in a diploid organism, commonly referred to as allele specific expression (ASE), is a biologically significant phenomenon and can be examined using single nucleotide polymorphisms (SNPs) from RNA-seq. Quantifying ASE aids in our ability to identify and understand cis-regulatory mechanisms that influence gene expression, and thereby assists in identifying causal mutations. This study examines ASE in breast muscle, abdominal fat, and liver of commercial broiler chickens using variants called from a large subset of the samples (n = 68). ASE analysis was performed using custom software called VCF ASE Detection Tool (VADT), which detects ASE of biallelic SNPs using a binomial test. On average, ~174,000 SNPs in each tissue passed our filtering criteria and were considered informative, of which ~24,000 (~14%) showed ASE. Of all ASE SNPs, only 3.7% exhibited ASE in all three tissues, with ~83% showing ASE specific to a single tissue. When ASE genes (genes containing ASE SNPs) were compared between tissues, the overlap among all three tissues increased to 20.1%. Our results indicate that ASE genes show tissue-specific enrichment patterns, but all three tissues showed enrichment for pathways involved in translation.
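The idea of testing a biallelic SNP for ASE with a binomial test can be sketched as follows: under balanced expression, reads supporting the reference and alternative alleles are expected in roughly equal proportion, so a two-sided exact binomial test against p = 0.5 flags sites with skewed counts. This is a minimal sketch and not VADT's implementation; the function name and read counts are hypothetical, and filtering and multiple-testing correction are omitted.

```python
from math import comb


def binomial_two_sided_p(ref_count: int, alt_count: int) -> float:
    """Exact two-sided binomial p-value for ref vs. alt read counts under p = 0.5.

    Sums the probabilities of all outcomes that are no more likely than the
    observed one. Suitable for modest read depths only.
    """
    n = ref_count + alt_count
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    pmf = [comb(n, k) * 0.5 ** n for k in range(n + 1)]
    observed = pmf[ref_count]
    # Small relative tolerance guards against floating-point ties.
    return min(1.0, sum(p for p in pmf if p <= observed * (1 + 1e-9)))


# Hypothetical read counts at two SNPs.
print(binomial_two_sided_p(5, 5))   # balanced counts: p-value of 1.0
print(binomial_two_sided_p(10, 0))  # fully skewed counts: small p-value
```

With roughly 174,000 informative SNPs per tissue, p-values from a test like this would normally be adjusted for multiple testing before sites are called as ASE.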

