Epigenetic features improve TALE target prediction

BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Annett Erkes ◽  
Stefanie Mücke ◽  
Maik Reschke ◽  
Jens Boch ◽  
Jan Grau

Abstract Background The yield of many crop plants can be substantially reduced by plant-pathogenic Xanthomonas bacteria. The infection strategy of many Xanthomonas strains is based on transcription activator-like effectors (TALEs), which are secreted into the host cells and act as transcriptional activators of plant genes that are beneficial for the bacteria. The modular DNA binding domain of TALEs contains tandem repeats, each comprising two hyper-variable amino acids. These repeat-variable diresidues (RVDs) bind to their target box and determine the specificity of a TALE. All available tools for the prediction of TALE targets within the host plant suffer from many false positives. In this paper we propose a strategy to improve prediction accuracy by considering the epigenetic state of the host plant genome in the region of the target box. Results To this end, we extend our previously published tool PrediTALE by considering two epigenetic features: (i) chromatin accessibility of potentially bound regions and (ii) DNA methylation of cytosines within target boxes. Here, we determine the epigenetic features from publicly available DNase-seq, ATAC-seq, and WGBS data in rice. We benchmark the utility of both epigenetic features separately and in combination, deriving ground truth from RNA-seq data of infection studies in rice. We find an improvement for each individual epigenetic feature, and especially for the combination of both. Having established that considering epigenetic features improves TALE target prediction, we use these data for promoterome-wide and genome-wide scans with our new tool EpiTALE, which yield several novel putative virulence targets. Conclusions Our results suggest that it would be worthwhile to collect condition-specific chromatin accessibility data and methylation information when studying putative virulence targets of Xanthomonas TALEs.

2021 ◽  
Author(s):  
Annett Erkes ◽  
Stefanie Mücke ◽  
Maik Reschke ◽  
Jens Boch ◽  
Jan Grau

The yield of many crop plants can be substantially reduced by plant-pathogenic Xanthomonas bacteria. The infection strategy of many Xanthomonas strains is based on transcription activator-like effectors (TALEs), which are secreted into the host cells and act as transcriptional activators of plant genes that are beneficial for the bacteria. The modular DNA binding domain of TALEs contains tandem repeats, each comprising two hyper-variable amino acids. These repeat-variable diresidues (RVDs) bind to a continuous DNA stretch (a target box) and determine the specificity of a TALE. All available tools for the prediction of TALE targets within the host plant suffer from many false positives. In this paper we propose a strategy to improve prediction accuracy by considering the epigenetic state of the host plant genome in the region of the target box. To this end, we extend our previously published tool PrediTALE by two epigenetic features: (i) we allow for filtering target boxes according to chromatin accessibility and (ii) we allow for considering the methylation state of cytosines within the target box during prediction, since DNA methylation may affect the binding specificity of RVDs. Here, we determine the epigenetic features from publicly available DNase-seq, ATAC-seq, and WGBS-seq data in rice. We benchmark the utility of both epigenetic features separately and in combination, deriving ground truth from RNA-seq infection studies in rice. We find an improvement for each individual epigenetic feature, and especially for the combination of both. Having established that considering epigenetic features improves TALE target prediction, we use these data for promoterome-wide and genome-wide scans with our new tool EpiTALE, which yield several novel putative virulence targets. Our results suggest that it would be worthwhile to collect condition-specific chromatin accessibility data and methylation information when studying putative virulence targets of Xanthomonas TALEs.
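
The two epigenetic features described above can be illustrated with a minimal Python sketch. It is not the EpiTALE implementation: the function names (`merge_intervals`, `filter_boxes`), the half-open coordinate convention, and the input layout (accessible peaks as intervals on one chromosome, methylated cytosines as a set of positions) are all assumptions made for illustration. The sketch keeps only candidate target boxes that overlap an accessible region and reports how many methylated cytosines fall inside each kept box.

```python
from bisect import bisect_right
from typing import Iterable, List, Set, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on a single chromosome (assumed convention)


def merge_intervals(peaks: Iterable[Interval]) -> List[Interval]:
    """Sort accessible peaks and merge overlapping or touching ones."""
    merged: List[Interval] = []
    for start, end in sorted(peaks):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def filter_boxes(
    boxes: Iterable[Interval],
    peaks: Iterable[Interval],
    methylated_positions: Set[int],
) -> List[Tuple[int, int, int]]:
    """Keep boxes overlapping an accessible peak.

    Returns (start, end, number of methylated cytosines inside the box)
    for every kept box, in input order.
    """
    merged = merge_intervals(peaks)
    starts = [s for s, _ in merged]
    kept = []
    for start, end in boxes:
        # Index of the last merged peak that starts at or before the box's last base.
        i = bisect_right(starts, end - 1) - 1
        if i >= 0 and merged[i][1] > start:
            n_meth = sum(1 for p in methylated_positions if start <= p < end)
            kept.append((start, end, n_meth))
    return kept


if __name__ == "__main__":
    # Hypothetical toy coordinates, not real rice data.
    peaks = [(100, 200), (150, 300), (500, 600)]
    boxes = [(90, 110), (300, 320), (590, 610), (10, 30)]
    print(filter_boxes(boxes, peaks, {95, 105, 595, 700}))
```

Merging the peaks first makes them non-overlapping, so a single binary search per box is enough to decide overlap. How the methylation count should influence a prediction score is left open here, because the abstract does not specify it.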


2019 ◽  
Author(s):  
Tihana Vondrak ◽  
Laura Ávila Robledillo ◽  
Petr Novák ◽  
Andrea Koblížková ◽  
Pavel Neumann ◽  
...  

Abstract Background Amplification of monomer sequences into long contiguous arrays is the main feature distinguishing satellite DNA from other tandem repeats, yet it is also the main obstacle in its investigation because these arrays are in principle difficult to assemble. Here we explore an alternative, assembly-free approach that utilizes ultra-long Oxford Nanopore reads to infer the length distribution of satellite repeat arrays, their association with other repeats and the prevailing sequence periodicities. Results We have developed a computational workflow for similarity-based detection and downstream analysis of satellite repeats in individual nanopore reads that led to genome-wide characterization of their properties. Using the satellite DNA-rich legume plant Lathyrus sativus as a model, we demonstrated this approach by analyzing eleven major satellite repeats using a set of nanopore reads ranging from 30 to over 200 kb in length and representing 0.73x genome coverage. We found surprising differences between the analyzed repeats because only two of them were predominantly organized in long arrays typical for satellite DNA. The remaining nine satellites were found to be derived from short tandem arrays located within LTR-retrotransposons that occasionally expanded in length. While the corresponding LTR-retrotransposons were dispersed across the genome, this array expansion occurred mainly in the primary constrictions of the L. sativus chromosomes, which suggests that these genome regions are favorable for satellite DNA accumulation. Conclusions The presented approach proved to be efficient in revealing differences in long-range organization of satellite repeats that can be used to investigate their origin and evolution in the genome.


2019 ◽  
Author(s):  
Annett Erkes ◽  
Stefanie Mücke ◽  
Maik Reschke ◽  
Jens Boch ◽  
Jan Grau

Abstract Plant-pathogenic Xanthomonas bacteria secrete transcription activator-like effectors (TALEs) into host cells, where they act as transcriptional activators on plant target genes to support bacterial virulence. TALEs have a unique modular DNA-binding domain composed of tandem repeats. Two amino acids within each tandem repeat, termed repeat-variable diresidues (RVDs), bind to contiguous nucleotides on the DNA sequence and determine target specificity. In this paper, we propose a novel approach for TALE target prediction to identify potential virulence targets. Our approach accounts for recent findings concerning TALE targeting, including frame-shift binding by repeats of aberrant lengths, and the flexible strand orientation of target boxes relative to the transcription start of the downstream target gene. The computational model can account for dependencies between adjacent RVD positions. Model parameters are learned from the wealth of quantitative data that have been generated over the last years. We benchmark the novel approach, termed PrediTALE, using RNA-seq data after Xanthomonas infection in rice, and find an overall improvement of prediction performance compared with previous approaches. Using PrediTALE, we are able to predict several novel putative virulence targets. However, we also observe that no target genes are predicted by any prediction tool for several TALEs, which we term orphan TALEs for this reason. We postulate that one explanation for orphan TALEs is incomplete gene annotations and, hence, propose to replace promoterome-wide scans by genome-wide scans for target boxes. We demonstrate that known targets from promoterome-wide scans may be recovered by genome-wide scans, whereas the latter, combined with RNA-seq data, are able to detect putative targets independent of existing gene annotations.
Author summary Diseases caused by plant-pathogenic Xanthomonas bacteria are a serious threat for many important crop plants including rice. Efficiently protecting plants from these pathogens requires a deeper understanding of infection strategies. For many Xanthomonas strains, such infection strategies depend on a special class of effector proteins, termed transcription activator-like effectors (TALEs). TALEs may specifically activate genes of the host plant and, by this means, re-program the plant cell for the benefit of the pathogen. Target sequences and, consequently, target genes of a specific TALE may be predicted computationally from its amino acids. Here, we propose a novel approach for TALE target prediction that makes use of several insights into TALE biology but also of broad experimental data gained over the last years. We demonstrate that this approach yields a higher prediction accuracy than previous approaches. We further postulate that a strategy change from a restricted search only considering promoters of annotated genes to a broad genome-wide search is feasible and yields novel targets including previously neglected protein-coding genes but also non-coding RNAs of possibly regulatory function.
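
The idea that a TALE's RVD sequence determines its target box can be illustrated with a toy position-wise scan in Python. This is a deliberately simplified sketch and not the PrediTALE model: it treats RVD positions as independent, whereas the abstract states that PrediTALE can model dependencies between adjacent RVD positions and handles aberrant repeats and strand orientation. The probabilities in `RVD_PREFS` are hand-picked placeholders that only reflect the commonly reported main preferences (HD for C, NG for T, NI for A, NN for G or A); they are not learned parameters, and the names `score_box` and `scan` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

# Illustrative, hand-picked nucleotide preferences per RVD (assumption, NOT learned values).
RVD_PREFS: Dict[str, Dict[str, float]] = {
    "HD": {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    "NG": {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    "NI": {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    "NN": {"A": 0.35, "C": 0.05, "G": 0.55, "T": 0.05},
}


def score_box(rvds: Sequence[str], box: str) -> float:
    """Sum of log-probabilities of each base under the RVD at the same position."""
    if len(rvds) != len(box):
        raise ValueError("box length must equal the number of RVDs")
    return sum(math.log(RVD_PREFS[rvd][base]) for rvd, base in zip(rvds, box))


def scan(rvds: Sequence[str], sequence: str, top: int = 3) -> List[Tuple[float, int]]:
    """Score every window of the forward strand; return the best (score, start) pairs.

    Assumes an upper-case sequence containing only A, C, G and T.
    """
    n = len(rvds)
    hits = [
        (score_box(rvds, sequence[i:i + n]), i)
        for i in range(len(sequence) - n + 1)
    ]
    # Highest score first; ties broken by leftmost position.
    return sorted(hits, key=lambda h: (-h[0], h[1]))[:top]


if __name__ == "__main__":
    # Hypothetical four-repeat TALE and toy sequence.
    for score, start in scan(["HD", "NG", "NI", "NN"], "GGCTAGTT"):
        print(start, round(score, 3))
```

Running the same scan over whole chromosomes instead of promoter sequences corresponds to the genome-wide strategy discussed in the abstract; a realistic scan would also cover the reverse strand.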


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Sarah E. Pierce ◽  
Jeffrey M. Granja ◽  
William J. Greenleaf

Abstract Chromatin accessibility profiling can identify putative regulatory regions genome wide; however, pooled single-cell methods for assessing the effects of regulatory perturbations on accessibility are limited. Here, we report a modified droplet-based single-cell ATAC-seq protocol for perturbing and evaluating dynamic single-cell epigenetic states. This method (Spear-ATAC) enables simultaneous read-out of chromatin accessibility profiles and integrated sgRNA spacer sequences from thousands of individual cells at once. Spear-ATAC profiling of 104,592 cells representing 414 sgRNA knock-down populations reveals the temporal dynamics of epigenetic responses to regulatory perturbations in cancer cells and the associations between transcription factor binding profiles.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
An Zheng ◽  
Michael Lamkin ◽  
Yutong Qiu ◽  
Kevin Ren ◽  
Alon Goren ◽  
...  

Abstract Background A major challenge in evaluating quantitative ChIP-seq analyses, such as peak calling and differential binding, is a lack of reliable ground truth data. Accurate simulation of ChIP-seq data can mitigate this challenge, but existing frameworks are either too cumbersome to apply genome-wide or unable to model a number of important experimental conditions in ChIP-seq. Results We present ChIPs, a toolkit for rapidly simulating ChIP-seq data using statistical models of key experimental steps. We demonstrate how ChIPs can be used for a range of applications, including benchmarking analysis tools and evaluating the impact of various experimental parameters. ChIPs is implemented as a standalone command-line program written in C++ and is available from https://github.com/gymreklab/chips. Conclusions ChIPs is an efficient ChIP-seq simulation framework that generates realistic datasets over a flexible range of experimental conditions. It can serve as an important component in various ChIP-seq analyses where ground truth data are needed.


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Zhixuan Du ◽  
Qitao Su ◽  
Zheng Wu ◽  
Zhou Huang ◽  
Jianzhong Bao ◽  
...  

Abstract Multidrug and toxic compound extrusion (MATE) proteins are involved in many physiological functions of plant growth and development. Although an increasing number of MATE proteins have been identified, the understanding of MATE proteins is still very limited in rice. In this study, 46 MATE proteins were identified from the rice (Oryza sativa) genome by homology searches and domain prediction. The rice MATE family was divided into four subfamilies based on the phylogenetic tree. Tandem repeats and fragment replication contribute to the expansion of the rice MATE gene family. Gene structure and cis-regulatory elements reveal the potential functions of MATE genes. Analysis of gene expression showed that most MATE genes were constitutively expressed, and the expression patterns of genes in different tissues were analyzed using RNA-seq. Furthermore, qRT-PCR-based analysis showed differential expression patterns in response to salt and drought stress. The results of this study provide comprehensive information on the MATE gene family in rice and will aid in understanding the functional divergence of MATE genes.


2021 ◽  
Author(s):  
Linda Zhou ◽  
Chunmin Ge ◽  
Thomas Malachowski ◽  
Ji Hun Kim ◽  
Keerthivasan Raanin Chandradoss ◽  
...  

Abstract Short tandem repeat (STR) instability is causally linked to pathologic transcriptional silencing in a subset of repeat expansion disorders. In fragile X syndrome (FXS), instability of a single CGG STR tract is thought to repress FMR1 via local DNA methylation. Here, we report the acquisition of more than ten Megabase-sized H3K9me3 domains in FXS, including a 5-8 Megabase block around FMR1. Distal H3K9me3 domains encompass synaptic genes with STR instability, and spatially co-localize in trans concurrently with FMR1 CGG expansion and the dissolution of TADs. CRISPR engineering of mutation-length FMR1 CGG to normal-length preserves heterochromatin, whereas cut-out to pre-mutation-length attenuates a subset of H3K9me3 domains. Overexpression of a pre-mutation-length CGG de-represses both FMR1 and distal heterochromatinized genes, indicating that long-range H3K9me3-mediated silencing is exquisitely sensitive to STR length. Together, our data uncover a genome-wide surveillance mechanism by which STR tracts spatially communicate over vast distances to heterochromatinize the pathologically unstable genome in FXS. One-Sentence Summary Heterochromatinization of distal synaptic genes with repeat instability in fragile X is reversible by overexpression of a pre-mutation length CGG tract.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Zahra Iqbal ◽  
Mohammed Shariq Iqbal ◽  
Lalida Sangpong ◽  
Gholamreza Khaksar ◽  
Supaart Sirikantaramas ◽  
...  

Abstract Background Fruit ripening is an intricate developmental process driven by a highly coordinated action of complex hormonal networks. Ethylene is considered the main phytohormone that regulates the ripening of climacteric fruits. Concomitantly, several ethylene-responsive transcription factors (TFs) are pivotal components of the regulatory network underlying fruit ripening. Calmodulin-binding transcription activator (CAMTA) is one such ethylene-induced TF implicated in various stress and plant developmental processes. Results Our comprehensive analysis of the CAMTA gene family in Durio zibethinus (durian, Dz) identified 10 CAMTAs with conserved domains. Phylogenetic analysis of DzCAMTAs positioned DzCAMTA3 with its tomato ortholog, which has already been validated for its role in the fruit ripening process through ethylene-mediated signaling. Furthermore, the transcriptome-wide analysis revealed DzCAMTA3 and DzCAMTA8 as the most highly expressed durian CAMTA genes. These two DzCAMTAs possessed a distinct ripening-associated expression pattern during post-harvest ripening in Monthong, a durian cultivar native to Thailand. The expression profiling of DzCAMTA3 and DzCAMTA8 under natural ripening conditions and ethylene-induced/delayed ripening conditions substantiated their roles as ethylene-induced transcriptional activators of ripening. Similarly, auxin-suppressed expression of DzCAMTA3 and DzCAMTA8 confirmed their responsiveness to exogenous auxin treatment in a time-dependent manner. Accordingly, we propose that DzCAMTA3 and DzCAMTA8 synergistically crosstalk with ethylene during durian fruit ripening. In contrast, DzCAMTA3 and DzCAMTA8 could act antagonistically with auxin to affect the post-harvest ripening process in durian. Furthermore, genes interacting with DzCAMTA3 and DzCAMTA8 contain significant CAMTA recognition motifs and regulate several pivotal fruit-ripening-associated pathways.
Conclusion Taken together, the present study contributes to an in-depth understanding of the structure and probable function of CAMTA genes in the post-harvest ripening of durian.


Blood ◽  
2020 ◽  
Vol 136 (Supplement 1) ◽  
pp. 32-33
Author(s):  
Rafael Renatino-Canevarolo ◽  
Mark B. Meads ◽  
Maria Silva ◽  
Praneeth Reddy Sudalagunta ◽  
Christopher Cubitt ◽  
...  

Multiple myeloma (MM) is an incurable cancer of bone marrow-resident plasma cells, which evolves from a premalignant state, MGUS, to a form of active disease characterized by an initial response to therapy, followed by cycles of therapeutic successes and failures, culminating in a fatal multi-drug resistant cancer. The molecular mechanisms leading to disease progression and refractory disease in MM remain poorly understood. To address this question, we have generated a new database, consisting of 1,123 MM biopsies from patients treated at the H. Lee Moffitt Cancer Center. These samples ranged from MGUS to late relapsed/refractory (LR) disease, and were comprehensively characterized genetically (844 RNAseq, 870 WES, 7 scRNAseq), epigenetically (10 single-cell chromatin accessibility, scATAC-seq) and phenotypically (537 samples assessed for ex vivo drug resistance). Mutational analysis identified putative driver genes (e.g. NRAS, KRAS) among the most frequently mutated genes, as well as a steady increase in mutational load across progression from MGUS to LR samples. However, with the exception of KRAS, these genes did not reach statistical significance according to Fisher's exact test between different disease stages, suggesting that no single mutation is necessary or sufficient to drive MM progression or refractory disease, but rather a common "driver" biology is critical. Pathway analysis of differentially expressed genes identified cell adhesion, inflammatory cytokines and hematopoietic cell identity as under-expressed in active MM vs. MGUS, while cell cycle, metabolism, DNA repair, protein/RNA synthesis and degradation were over-expressed in LR. Using an unsupervised systems biology approach, we reconstructed a gene expression map to identify transcriptomic reprogramming events associated with disease progression and evolution of drug resistance. At an epigenetic regulatory level, these genes were enriched for histone modifications (e.g. H3K27me3 and H3K27ac).
Furthermore, scATAC-seq confirmed genome-wide alterations in chromatin accessibility across MM progression, involving shifts in chromatin accessibility of the binding motifs of epigenetic regulator complexes, known to mediate formation of 3D structures (CTCF/YY1) of super enhancers (SE) and cell identity reprogramming (POU5F1/SOX2). Additionally, we have identified SE-regulated genes under- (EBF1, RB1, SPI1, KLF6) and over-expressed (PRDM1, IRF4) in MM progression, as well as over-expressed in LR (RFX5, YY1, NBN, CTCF, BCOR). We have found a correlation between cytogenetic abnormalities and mutations with differential gene expression observed in MM progression, suggesting groups of genetic events with equivalent transcriptomic effect: e.g. NRAS, KRAS, DIS3 and del13q are associated with transcriptomic changes observed during MGUS/SMOL=>active MM transition (Figure 1). Taken together, our preliminary data suggest that multiple independent combinations of genetic and epigenetic events (e.g. mutations, cytogenetics, SE dysregulation) alter the balance of master epigenetic regulatory circuitry, leading to genome-wide transcriptional reprogramming, facilitating disease progression and emergence of drug resistance. Figure 1: Topology of transcriptional regulation in MM depicts 16,738 genes whose expression is increased (red) or decreased (green) in presence of genetic abnormality. Differential expression associated with (A) hotspot mutations and (B) cytogenetic abnormalities confirms equivalence of expected pairs (e.g. NRAS and KRAS, BRAF and RAF1), but also proposes novel transcriptomic dysregulation effect of clinically relevant cytogenetic abnormalities, with yet uncharacterized molecular role in MM. Disclosures Kulkarni: M2GEN: Current Employment. Zhang: M2GEN: Current Employment. Hampton: M2GEN: Current Employment. Shain: GlaxoSmithKline: Speakers Bureau; Amgen: Speakers Bureau; Karyopharm: Research Funding, Speakers Bureau; AbbVie: Research Funding; Takeda: Honoraria, Speakers Bureau; Sanofi/Genzyme: Honoraria, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Janssen: Honoraria, Speakers Bureau; Celgene: Honoraria, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau; Adaptive: Consultancy, Honoraria; BMS: Honoraria, Membership on an entity's Board of Directors or advisory committees, Speakers Bureau. Siqueira Silva: AbbVie: Research Funding; Karyopharm: Research Funding; NIH/NCI: Research Funding.

