Integrative genomic analysis unifying epigenetic inheritance in adaptation and canalization

2019 ◽  
Author(s):  
Abhay Sharma

Abstract Epigenetic inheritance, especially its biomedical and evolutionary significance, is an immensely interesting but highly controversial subject. Notably, a recent analysis of existing multi-omics data has supported the mechanistic plausibility of epigenetic inheritance and its implications in disease and evolution. The evolutionary support stemmed from the specific finding that genes associated with cold-induced inheritance and with latitudinal adaptation in mice are exceptionally common. Here, a similar gene set overlap analysis is presented that integrates cold-induced inheritance with evolutionary adaptation and genetic canalization in cold environment in Drosophila. Genes showing differential expression in inheritance specifically overrepresent gene sets associated with differential and allele-specific expression, though not with genome-wide genetic differentiation, in adaptation. On the other hand, the differentiated outliers uniquely overrepresent genes dysregulated by radicicol, a decanalization inducer. Both gene sets in turn exclusively show enrichment of genes that accumulate, in intended experimental lines, de novo mutations, a potential source of canalization. Successively, the three gene sets distinctively overrepresent genes exhibiting, between mutation accumulation lines, invariable expression, a potential signal for canalization. Sequentially, the four gene sets solely display enrichment of genes grouped in gene ontology under transcription factor activity, a signature of regulatory canalization. Cumulatively, the analysis suggests that epigenetic inheritance possibly contributes to evolutionary adaptation in the form of cis-regulatory variations, with trans variations arising in the course of genetic canalization.
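The abstract does not say which statistic underlies its gene set overlap analysis. One common choice for asking whether two gene sets share more genes than chance would predict is the upper tail of the hypergeometric distribution, sketched below as an assumption rather than as the author's method. The function name `overlap_pvalue` and all counts are hypothetical and chosen only for illustration.

```python
from math import comb


def overlap_pvalue(universe_size, set_a_size, set_b_size, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    X is the number of genes shared by two sets of sizes set_a_size and
    set_b_size drawn from a background of universe_size genes.
    """
    max_overlap = min(set_a_size, set_b_size)
    total = comb(universe_size, set_b_size)
    tail = sum(
        comb(set_a_size, k) * comb(universe_size - set_a_size, set_b_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, not taken from the study.
p = overlap_pvalue(universe_size=10, set_a_size=4, set_b_size=3, overlap=2)
print(f"P(overlap >= 2) = {p:.4f}")
```

In a real analysis the background would be the set of genes assayed in both experiments, and p-values from many set comparisons would need multiple-testing correction.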

BMC Biology ◽  
2019 ◽  
Vol 17 (1) ◽  
Author(s):  
Joel-E. Kuon ◽  
Weihong Qi ◽  
Pascal Schläpfer ◽  
Matthias Hirsch-Hoffmann ◽  
Philipp Rogalla von Bieberstein ◽  
...  

Abstract Background Cassava is an important food crop in tropical and sub-tropical regions worldwide. In Africa, cassava production is widely affected by cassava mosaic disease (CMD), which is caused by the African cassava mosaic geminivirus that is transmitted by whiteflies. Cassava breeders often use a single locus, CMD2, for introducing CMD resistance into susceptible cultivars. The CMD2 locus has been genetically mapped to a 10-Mbp region, but its organization and genes as well as their functions are unknown. Results We report haplotype-resolved de novo assemblies and annotations of the genomes for the African cassava cultivar TME3 (tropical Manihot esculenta), which is the origin of CMD2, and the CMD-susceptible cultivar 60444. The assemblies provide phased haplotype information for over 80% of the genomes. Haplotype comparison identified novel features previously hidden in collapsed and fragmented cassava genomes, including thousands of allelic variants, inter-haplotype diversity in coding regions, and patterns of diversification through allele-specific expression. Reconstruction of the CMD2 locus revealed a highly complex region with nearly identical gene sets but limited microsynteny between the two cultivars. Conclusions The genome maps of the CMD2 locus in both 60444 and TME3, together with the newly annotated genes, will help the identification of the causal genetic basis of CMD2 resistance to geminiviruses. Our de novo cassava genome assemblies will also facilitate genetic mapping approaches to narrow the large CMD2 region to a few candidate genes for better informed strategies to develop robust geminivirus resistance in susceptible cassava cultivars.


2015 ◽  
Author(s):  
Kim A. Steige ◽  
Benjamin Laenen ◽  
Johan Reimegård ◽  
Douglas Scofield ◽  
Tanja Slotte

Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation are 1) under weaker purifying selection and 2) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene-body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby TEs and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide novel evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage-sensitivity of body-methylated genes. Given the extensive conservation of gene-body methylation in flowering plants, this suggests that gene-body methylation could be an important predictor of cis-regulatory variation in a wide range of plant species.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Jing Xie ◽  
Tieming Ji ◽  
Marco A. R. Ferreira ◽  
Yahan Li ◽  
Bhaumik N. Patel ◽  
...  

Abstract Background High-throughput sequencing experiments, which can determine allele origins, have been used to assess genome-wide allele-specific expression. Despite the amount of data generated from high-throughput experiments, statistical methods are often too simplistic to understand the complexity of gene expression. Specifically, existing methods do not test allele-specific expression (ASE) of a gene as a whole and variation in ASE within a gene across exons separately and simultaneously. Results We propose a generalized linear mixed model to close these gaps, incorporating variations due to genes, single nucleotide polymorphisms (SNPs), and biological replicates. To improve reliability of statistical inferences, we assign priors on each effect in the model so that information is shared across genes in the entire genome. We utilize Bayesian model selection to test the hypothesis of ASE for each gene and variations across SNPs within a gene. We apply our method to four tissue types in a bovine study to de novo detect ASE genes in the bovine genome, and uncover intriguing predictions of regulatory ASEs across gene exons and across tissue types. We compared our method to competing approaches through simulation studies that mimicked the real datasets. The R package, BLMRM, that implements our proposed algorithm, is publicly available for download at https://github.com/JingXieMIZZOU/BLMRM. Conclusions We show that the proposed method exhibits improved control of the false discovery rate and improved power over existing methods when SNP variation and biological variation are present. In addition, our method maintains low computational requirements that allow for whole-genome analysis.


2018 ◽  
Author(s):  
Arkarachai Fungtammasan ◽  
Brett Hannigan

Abstract Long-read sequencing technology has allowed researchers to create de novo assemblies with impressive continuity [1,2]. This advancement has dramatically increased the number of reference genomes available and hints at the possibility of a future where personal genomes are assembled rather than resequenced. In 2016 Pacific Biosciences released the FALCON-Unzip framework, which can provide long, phased haplotype contigs from de novo assemblies. This phased genome algorithm enhances the accuracy of assemblies of highly heterozygous organisms and allows researchers to explore questions that require haplotype information such as allele-specific expression and regulation. However, validation of this technique has been limited to small genomes or inbred individuals [3]. As a roadmap to personal genome assembly and phasing, we assess the phasing accuracy of FALCON-Unzip in humans using publicly available data for the Ashkenazi trio from the Genome in a Bottle Consortium [4]. To assess the accuracy of the Unzip algorithm, we assembled the genome of the son using FALCON and FALCON-Unzip, genotyped publicly available short read data for the mother and the father, and observed the inheritance pattern of the parental SNPs along the phased genome of the son. We found that 72.8% of haplotype contigs share SNPs with only one parent, suggesting that these contigs are correctly phased. Most mis-phased SNPs are random but present in high frequency toward the end of haplotype contigs. Approximately 20.7% of mis-phased haplotype contigs contain clusters of mis-phased SNPs, suggesting that haplotypes were mis-joined by FALCON-Unzip. Mis-joined boundaries in those contigs are located in areas of low SNP density. This research demonstrates that the FALCON-Unzip algorithm can be used to create long and accurate haplotypes for humans and identifies problematic regions that could benefit from future improvement.
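The trio-based check described above can be sketched in a few lines: label each informative SNP on a haplotype contig by the parent it is shared with, then ask whether all labels on a contig agree. This is a simplified illustration under stated assumptions, not the authors' pipeline. The labels `"M"` (maternal) and `"P"` (paternal), the function names, and the toy contigs are all hypothetical.

```python
from collections import Counter


def classify_contig(snp_origins):
    """Classify one haplotype contig from its per-SNP parental labels.

    snp_origins is a sequence of "M" or "P" labels in contig order.
    """
    counts = Counter(snp_origins)
    if not counts:
        return "uninformative"  # no parent-specific SNPs on this contig
    if len(counts) == 1:
        return "single-parent"  # consistent with correct phasing
    return "mixed"  # at least one SNP points to the other parent


def single_parent_fraction(contigs):
    """Fraction of informative contigs whose SNPs match only one parent."""
    labels = [classify_contig(origins) for origins in contigs.values()]
    informative = [label for label in labels if label != "uninformative"]
    if not informative:
        return 0.0
    return sum(label == "single-parent" for label in informative) / len(informative)


# Toy data, for illustration only.
toy_contigs = {"c1": "MMMM", "c2": "PPMP", "c3": "", "c4": "PP"}
print(single_parent_fraction(toy_contigs))
```

Distinguishing isolated mis-phased SNPs from clustered ones, as the abstract does, would additionally require SNP positions along each contig.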


Development ◽  
1998 ◽  
Vol 125 (5) ◽  
pp. 889-897 ◽  
Author(s):  
C. Mertineit ◽  
J.A. Yoder ◽  
T. Taketo ◽  
D.W. Laird ◽  
J.M. Trasler ◽  
...  

The spermatozoon and oocyte genomes bear sex-specific methylation patterns that are established during gametogenesis and are required for the allele-specific expression of imprinted genes in somatic tissues. The mRNA for Dnmt1, the predominant maintenance and de novo DNA (cytosine-5)-methyl transferase in mammals, is present at high levels in postmitotic murine germ cells but undergoes alternative splicing of sex-specific 5′ exons, which controls the production and localization of enzyme during specific stages of gametogenesis. An oocyte-specific 5′ exon is associated with the production of very large amounts of active Dnmt1 protein, which is truncated at the N terminus and sequestered in the cytoplasm during the later stages of oocyte growth, while a spermatocyte-specific 5′ exon interferes with translation and prevents production of Dnmt1 during the prolonged crossing-over stage of male meiosis. During the course of postnatal oogenesis, Dnmt1 is present at high levels in nuclei only in growing dictyate oocytes, a stage during which gynogenetic developmental potential is lost and biparental developmental potential is gained.


2021 ◽  
Author(s):  
Xingtan Zhang ◽  
Shuai Chen ◽  
Longqing Shi ◽  
Daping Gong ◽  
Shengcheng Zhang ◽  
...  

Abstract Tea is an important global beverage crop and is largely clonally propagated. Despite previous studies on the species, its genetic and evolutionary history deserves further research. Here, we present a haplotype-resolved assembly of an Oolong tea cultivar, Tieguanyin. Analysis of allele-specific expression suggests a potential mechanism in response to mutation load during long-term clonal propagation. Population genomic analysis using 190 Camellia accessions uncovered independent evolutionary histories and parallel domestication in two widely cultivated varieties, var. sinensis and var. assamica. It also revealed extensive intra- and interspecific introgressions contributing to genetic diversity in modern cultivars. Strong signatures of selection were associated with biosynthetic and metabolic pathways that contribute to flavor characteristics as well as genes likely involved in the Green Revolution in the tea industry. Our results offer genetic and molecular insights into the evolutionary history of Camellia sinensis and provide genomic resources to further facilitate gene editing to enhance desirable traits in tea crops.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 1235-1235
Author(s):  
Roger Mulet-Lazaro ◽  
Stanley van Herk ◽  
Claudia Erpelinck-Verschueren ◽  
Mathijs A. Sanders ◽  
Eric Bindels ◽  
...  

Introduction Transcriptional deregulation is a central event in the development of acute myeloid leukemia (AML), with most mutations occurring in genes related to transcription, chromatin regulation and DNA methylation. Furthermore, alterations involving cis-regulatory elements have been shown to play a critical role in aberrant gene expression in AML. Genetic variation in cis-regulatory regions usually involves a single allele, which results in differential expression of the two alleles. This phenomenon, termed allele-specific expression (ASE), is therefore an accurate marker for cis-regulatory variation (Pastinen, 2010). We propose that a systematic study of genes with aberrant ASE in AML may uncover aberrantly expressed genes caused by abnormalities in cis-regulatory elements. Therefore we aim to 1) chart the landscape of ASE in AML, 2) establish a link between relevant ASE events and AML subtypes, and 3) investigate the mechanisms driving ASE. Methods We performed whole exome sequencing (WES) and RNA-seq on leukemic blasts from 168 de novo AML patients, representing all major subtypes of the disease. Combining both datasets, we assessed ASE in every gene with informative (non-homozygous) single nucleotide variants (SNVs). Results Patients had a median of 37 genes with ASE, several of which were recurrently detected across multiple patients. To shorten the gene list we selected for this study genes known to be involved either in cancer or in myeloid development. The gene most commonly found to show ASE (53/140 cases with SNVs) was GATA2, which encodes a transcription factor crucial for proliferation and maintenance of hematopoietic stem cells with a known involvement in AML. Interestingly, integration with molecularly defined classification of AML revealed that all cases (n=17) with biallelic CEBPA mutations exhibited GATA2 ASE (p-value = 6.00 × 10⁻⁷, Fisher's test).
Biallelic CEBPA mutations (CEBPA DM) identify an AML subtype with favorable clinical outcome and frequently co-occur with GATA2 mutations (Greif PA, 2012), pointing to a functional connection between these two genes. Indeed, 44% of the cases in our cohort exhibited a GATA2 mutation, and 27% carried a second, subclonal mutation in the same gene. Importantly, in cases where a GATA2 mutation was found, the mutant allele was always preferentially expressed. These findings were validated in the TCGA dataset, where all four CEBPA DM patients with informative SNVs in GATA2 exhibited GATA2 ASE. Although GATA2 ASE was present in other AML subtypes, none of these subtypes showed a significant association with this finding. Patients with a t(8;21) rearrangement (n=5), which represses CEBPA expression, did not exhibit GATA2 ASE, and we only observed GATA2 ASE in 4 out of 8 CEBPA-silenced leukemias (Wouters BJ, 2007). Altogether, this demonstrates the uniqueness of the 1-to-1 relationship between CEBPA DM and GATA2 ASE, and excludes a causative role for inactive CEBPA protein in mediating mono-allelic expression of GATA2. The average expression of GATA2 in CEBPA DM patients was comparable to other AMLs, even in cases with monoallelic GATA2 expression. This suggests that a) ASE was achieved by repression of one allele rather than dramatically increased expression of the other, b) there was a compensation of the non-repressed allele. DNA methylation analysis of the GATA2 promoter did not reveal methylation-mediated gene silencing of the repressed allele. The long-distance +77 kb GATA2 enhancer appears to be involved in ASE, as RNA read-through levels at the enhancer were significantly different in CEBPA DM AMLs (p-value < 10⁻⁴, Wald test) in an allele-specific manner. The involvement of the enhancer was further confirmed by differences in H3K27ac levels between the two alleles.
Conclusions An unbiased screen of 168 de novo AML cases revealed that all patients (n=17) with CEBPA biallelic mutations display GATA2 ASE. GATA2 mutations were found in 8 of the 17 cases, always in the allele that is preferentially expressed. Since GATA2 ASE is present in all CEBPA DM and GATA2 mutations only in a fraction, we hypothesize that GATA2 ASE is acquired first and mutations are only selected if they occur in the expressed allele. Moreover, given that other subgroups with CEBPA abnormalities do not show a similar pattern, we propose that ASE of GATA2 is not a consequence of CEBPA mutations, but rather a requirement for the development of AML in these patients. Disclosures No relevant conflicts of interest to declare.


2017 ◽  
Vol 114 (5) ◽  
pp. 1087-1092 ◽  
Author(s):  
Kim A. Steige ◽  
Benjamin Laenen ◽  
Johan Reimegård ◽  
Douglas G. Scofield ◽  
Tanja Slotte

Understanding the causes of cis-regulatory variation is a long-standing aim in evolutionary biology. Although cis-regulatory variation has long been considered important for adaptation, we still have a limited understanding of the selective importance and genomic determinants of standing cis-regulatory variation. To address these questions, we studied the prevalence, genomic determinants, and selective forces shaping cis-regulatory variation in the outcrossing plant Capsella grandiflora. We first identified a set of 1,010 genes with common cis-regulatory variation using analyses of allele-specific expression (ASE). Population genomic analyses of whole-genome sequences from 32 individuals showed that genes with common cis-regulatory variation (i) are under weaker purifying selection and (ii) undergo less frequent positive selection than other genes. We further identified genomic determinants of cis-regulatory variation. Gene body methylation (gbM) was a major factor constraining cis-regulatory variation, whereas presence of nearby transposable elements (TEs) and tissue specificity of expression increased the odds of ASE. Our results suggest that most common cis-regulatory variation in C. grandiflora is under weak purifying selection, and that gene-specific functional constraints are more important for the maintenance of cis-regulatory variation than genome-scale variation in the intensity of selection. Our results agree with previous findings that suggest TE silencing affects nearby gene expression, and provide evidence for a link between gbM and cis-regulatory constraint, possibly reflecting greater dosage sensitivity of body-methylated genes. Given the extensive conservation of gbM in flowering plants, this suggests that gbM could be an important predictor of cis-regulatory variation in a wide range of plant species.


2018 ◽  
Author(s):  
Hoang T. Nguyen ◽  
Amanda Dobbyn ◽  
Alexander W. Charney ◽  
Julien Bryois ◽  
April Kim ◽  
...  

Abstract Trio family and case-control studies of next-generation sequencing data have proven integral to understanding the contribution of rare inherited and de novo single-nucleotide variants to the genetic architecture of complex disease. Ideally, such studies should identify individual risk genes of moderate to large effect size to generate novel treatment hypotheses for further follow-up. However, due to insufficient power, gene set enrichment analyses have come to be relied upon for detecting differences between cases and controls, implicating sets of hundreds of genes rather than specific targets for further investigation. Here, we present a Bayesian statistical framework, termed gTADA, that integrates gene-set membership information with gene-level de novo and rare inherited case-control counts, to prioritize risk genes with excess rare variant burden within enriched gene sets. Applying gTADA to available whole-exome sequencing datasets for several neuropsychiatric conditions, we replicated previously reported gene set enrichments and identified novel risk genes. For epilepsy (EPI), gTADA prioritized 40 risk genes (posterior probabilities > 0.95), 6 of which replicate in an independent whole-genome sequencing study. In addition, 30 of the 40 genes are novel. We found that epilepsy genes had high protein-protein interaction (PPI) network connectivity and showed specific expression during human brain development. Some of the top prioritized EPI genes were connected to a PPI subnetwork of immune genes and showed specific expression in prenatal microglia. We also identified multiple enriched drug-target gene sets for EPI, which included immunostimulants as well as known antiepileptics. Immune biology was supported specifically by case-control variants from familial epilepsies rather than de novo mutations in generalized encephalitic epilepsy.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
M. Joseph Tomlinson ◽  
Shawn W. Polson ◽  
Jing Qiu ◽  
Juniper A. Lake ◽  
William Lee ◽  
...  

Abstract Differential abundance of allelic transcripts in a diploid organism, commonly referred to as allele-specific expression (ASE), is a biologically significant phenomenon and can be examined using single nucleotide polymorphisms (SNPs) from RNA-seq. Quantifying ASE aids in our ability to identify and understand cis-regulatory mechanisms that influence gene expression, and thereby assists in identifying causal mutations. This study examines ASE in breast muscle, abdominal fat, and liver of commercial broiler chickens using variants called from a large subset of the samples (n = 68). ASE analysis was performed using a custom software tool called VCF ASE Detection Tool (VADT), which detects ASE of biallelic SNPs using a binomial test. On average ~174,000 SNPs in each tissue passed our filtering criteria and were considered informative, of which ~24,000 (~14%) showed ASE. Of all ASE SNPs, only 3.7% exhibited ASE in all three tissues, with ~83% showing ASE specific to a single tissue. When ASE genes (genes containing ASE SNPs) were compared between tissues, the overlap among all three tissues increased to 20.1%. Our results indicate that ASE genes show tissue-specific enrichment patterns, but all three tissues showed enrichment for pathways involved in translation.
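The abstract states only that VADT applies a binomial test to biallelic SNPs. Its filtering thresholds, null proportion, and multiple-testing handling are not given here. As a minimal sketch, assuming a null hypothesis of equal expression of the two alleles (a reference-read proportion of 0.5), an exact two-sided binomial test on read counts could look like the following. The function name `ase_binomial_pvalue` and the read counts are hypothetical.

```python
from math import comb


def ase_binomial_pvalue(ref_reads, alt_reads):
    """Exact two-sided binomial p-value for allelic imbalance at one SNP.

    Under the assumed null of balanced expression (p = 0.5), every outcome k
    has probability comb(n, k) / 2**n, so outcomes at least as unlikely as the
    observed one are those with comb(n, k) <= comb(n, ref_reads).
    """
    n = ref_reads + alt_reads
    if n == 0:
        raise ValueError("SNP has no covering reads")
    observed = comb(n, ref_reads)
    extreme = sum(comb(n, k) for k in range(n + 1) if comb(n, k) <= observed)
    return extreme / 2**n


# Hypothetical read counts at a single heterozygous SNP.
print(ase_binomial_pvalue(9, 1))
```

Comparing integer binomial coefficients rather than floating-point probabilities avoids rounding ties. In practice, reference-mapping bias can shift the null proportion away from 0.5, and p-values across many SNPs would need correction for multiple testing.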

