motif enrichment
Recently Published Documents


TOTAL DOCUMENTS

21
(FIVE YEARS 10)

H-INDEX

6
(FIVE YEARS 1)

2021 ◽  
Author(s):  
Klara Kuret ◽  
Aram Gustav Amalietti ◽  
Jernej Ule

Abstract Background Crosslinking and immunoprecipitation (CLIP) is a method used to identify in vivo RNA–protein binding sites on a transcriptome-wide scale. With the increasing amounts of available data for RNA-binding proteins (RBPs), it is important to understand to what degree the enriched motifs specify the RNA binding profiles of RBPs in cells. Results We develop positionally-enriched k-mer analysis (PEKA), a computational tool for efficient analysis of enriched motifs from individual CLIP datasets, which minimises the impact of technical and regional genomic biases by internal data normalisation. We cross-validate PEKA with mCross, and show that background correction by size-matched input doesn’t generally improve the specificity of detected motifs. We identify motif classes with common enrichment patterns across eCLIP datasets and across RNA regions, while also observing variations in the specificity and the extent of motif enrichment across eCLIP datasets, between variant CLIP protocols, and between CLIP and in vitro binding data. Thereby we gain insights into the contributions of technical and regional genomic biases to the enriched motifs, and find how motif enrichment features relate to the domain composition and low-complexity regions (LCRs) of the studied proteins. Conclusions Our study provides insights into the overall contributions of regional binding preferences, protein domains and LCRs to the specificity of protein-RNA interactions, and shows the value of cross-motif and cross-RBP comparison for data interpretation. Our results are presented for exploratory analysis via an online platform in an RBP-centric and motif-centric manner (https://imaps.goodwright.com/apps/peka/). PEKA is available from https://github.com/ulelab/peka.
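The idea of comparing k-mer counts near binding sites against a background drawn from the same dataset can be illustrated with a minimal sketch. This is not PEKA's algorithm, whose details the abstract does not give; the function names, the window size, the pseudocount and the toy transcript are all assumptions made for illustration.

```python
from collections import Counter


def kmer_counts_around_sites(sequence, sites, k=3, flank=3):
    """Count k-mers whose start lies within +/- flank of each site (0-based)."""
    counts = Counter()
    for site in sites:
        start = max(0, site - flank)
        end = min(len(sequence) - k, site + flank)
        for i in range(start, end + 1):
            counts[sequence[i:i + k]] += 1
    return counts


def relative_enrichment(foreground, background, pseudocount=1.0):
    """Ratio of pseudocounted k-mer frequencies, foreground over background."""
    fg_total = sum(foreground.values()) + pseudocount
    bg_total = sum(background.values()) + pseudocount
    kmers = set(foreground) | set(background)
    return {
        kmer: ((foreground[kmer] + pseudocount) / fg_total)
        / ((background[kmer] + pseudocount) / bg_total)
        for kmer in kmers
    }


# Hypothetical transcript with a single UGU at positions 8-10.
transcript = "C" * 8 + "UGU" + "C" * 19
fg = kmer_counts_around_sites(transcript, sites=[9])    # assumed high-confidence site
bg = kmer_counts_around_sites(transcript, sites=[22])   # assumed background site
enrichment = relative_enrichment(fg, bg)
```

In this toy input, `UGU` scores above 1 and `CCC` below 1. Because both count tables come from the same transcript, composition biases shared by foreground and background tend to cancel in the ratio.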


Author(s):  
Charles E. Grant ◽  
Timothy L. Bailey

Abstract XSTREME is a web-based tool for performing comprehensive motif discovery and analysis in DNA, RNA or protein sequences, as well as in sequences in user-defined alphabets. It is designed for both very large and very small datasets. XSTREME is similar to the MEME-ChIP tool, but expands upon its capabilities in several ways. Like MEME-ChIP, XSTREME performs two types of de novo motif discovery, and also performs motif enrichment analysis of the input sequences using databases of known motifs. Unlike MEME-ChIP, which ranks motifs based on their enrichment in the centers of the input sequences, XSTREME uses enrichment anywhere in the sequences for this purpose. Consequently, XSTREME is more appropriate for motif-based analysis of sequences regardless of how the motifs are distributed within the sequences. XSTREME uses the MEME and STREME algorithms for motif discovery, and the recently developed SEA algorithm for motif enrichment analysis. The interactive HTML output produced by XSTREME includes highly accurate motif significance estimates, plots of the positional distribution of each motif, and histograms of the number of motif matches in each sequence. XSTREME is easy to use via its web server at https://meme-suite.org, and is fully integrated with the widely-used MEME Suite of sequence analysis tools, which can be freely downloaded at the same web site for non-commercial use.


2021 ◽  
Author(s):  
Timothy L. Bailey ◽  
Charles E. Grant

Motif enrichment algorithms can identify known sequence motifs that are present to a statistically significant degree in DNA, RNA and protein sequences. Databases of such known motifs exist for DNA- and RNA-binding proteins, as well as for many functional protein motifs. The SEA ("Simple Enrichment Analysis") algorithm presented here uses a simple, consistent approach for detecting the enrichment of motifs in DNA, RNA or protein sequences, as well as in sequences using user-defined alphabets. SEA can identify known motifs that are enriched in a single set of input sequences, and can also perform differential motif enrichment analysis when presented with an additional set of control sequences. Using in vivo DNA (ChIP-seq) data as input to SEA, and validating motifs with reference motifs derived from in vitro data, we show that SEA is faster than three widely-used motif enrichment algorithms (AME, CentriMo and Pscan), while delivering comparable accuracy. We also show that, in contrast to other motif enrichment algorithms, SEA reports accurate estimates of statistical significance. SEA is easy to use via its web server at https://meme-suite.org, and is fully integrated with the widely-used MEME Suite of sequence analysis tools, which can be freely downloaded at the same web site for non-commercial use.
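Differential enrichment against a control set can be sketched as a one-sided Fisher's exact test on the number of sequences containing a motif. This is a generic illustration and should not be read as SEA's implementation: real tools score position weight matrices rather than exact strings, and the function names and toy sequences below are assumptions.

```python
from math import comb


def count_hits(sequences, motif):
    """Number of sequences containing at least one exact match to the motif."""
    return sum(1 for seq in sequences if motif in seq)


def fisher_exact_greater(pos_hits, pos_total, neg_hits, neg_total):
    """One-sided p-value: probability of at least pos_hits motif-containing
    sequences in the primary set under the hypergeometric null."""
    hits = pos_hits + neg_hits
    total = pos_total + neg_total
    upper = min(hits, pos_total)
    tail = sum(
        comb(hits, k) * comb(total - hits, pos_total - k)
        for k in range(pos_hits, upper + 1)
    )
    return tail / comb(total, pos_total)


# Hypothetical primary and control sequences; CACGTG is used as an example motif.
primary = ["TTCACGTGAA", "CACGTGTTTT", "GGCACGTGCC", "AAAATTTTGG"]
control = ["TTCACCTGAA", "CACGTGTTTT", "GGGGTTTTCC", "AAAATTTTGG"]
motif = "CACGTG"

p_value = fisher_exact_greater(
    count_hits(primary, motif), len(primary),
    count_hits(control, motif), len(control),
)
```

With three of four primary sequences and one of four control sequences containing the motif, the tail probability is 17/70, roughly 0.24. Eight sequences are far too few to call this enrichment significant.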


2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Jonathan D. Rubin ◽  
Jacob T. Stanley ◽  
Rutendo F. Sigauke ◽  
Cecilia B. Levandowski ◽  
Zachary L. Maas ◽  
...  

Abstract Detecting changes in the activity of a transcription factor (TF) in response to a perturbation provides insights into the underlying cellular process. Transcription Factor Enrichment Analysis (TFEA) is a robust and reliable computational method that detects positional motif enrichment associated with changes in transcription observed in response to a perturbation. TFEA detects positional motif enrichment within a list of ranked regions of interest (ROIs), typically sites of RNA polymerase initiation inferred from regulatory data such as nascent transcription. Therefore, we also introduce muMerge, a statistically principled method of generating a consensus list of ROIs from multiple replicates and conditions. TFEA is broadly applicable to data that informs on transcriptional regulation including nascent transcription (e.g., PRO-Seq), CAGE, histone ChIP-Seq, and accessibility data (e.g., ATAC-Seq). TFEA not only identifies the key regulators responding to a perturbation, but also temporally unravels regulatory networks with time series data. Consequently, TFEA serves as a hypothesis-generating tool that provides an easy, rigorous, and cost-effective means to broadly assess TF activity yielding new biological insights.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Morten Muhlig Nielsen ◽  
Jakob Skou Pedersen

Abstract High throughput single-cell RNA sequencing (scRNAseq) can provide mRNA expression profiles for thousands of cells. However, miRNAs cannot currently be studied at the same scale. By exploiting that miRNAs bind well-defined sequence motifs and typically down-regulate target genes, we show that motif enrichment analysis can be used to derive miRNA activity estimates from scRNAseq data. Motif enrichment analyses have traditionally been used to derive binding motifs for regulatory factors, such as miRNAs or transcription factors, that have an effect on gene expression. Here we reverse its use. By starting from the miRNA seed site, we derive a measure of activity for miRNAs in single cells. We first establish the approach on a comprehensive set of bulk TCGA cancer samples (n = 9679), with paired mRNA and miRNA expression profiles, where many miRNAs show a strong correlation with measured expression. By downsampling we show that the method can be used to estimate miRNA activity in sparse data comparable to scRNAseq experiments. We then analyze a human and a mouse scRNAseq data set, and show that for several miRNA candidates, including liver specific miR-122 and muscle specific miR-1 and miR-133a, we obtain activity measures supported by the literature. The methods are implemented and made available in the miReact software. Our results demonstrate that miRNA activities can be estimated at the single cell level. This allows insights into the dynamics of miRNA activity across a range of fields where scRNAseq is applied.
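Starting from the miRNA seed site can be illustrated by deriving the mRNA target motif as the reverse complement of the seed, then counting its occurrences in a 3' UTR. This is a minimal sketch and not miReact's procedure: the seed window (positions 2 to 8), the miR-122-5p sequence literal and the toy UTR are assumptions used for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_target_site(mirna, start=2, end=8):
    """mRNA motif complementary to miRNA positions start..end (1-based, inclusive)."""
    seed = mirna[start - 1:end]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def count_sites(utr, site):
    """Count occurrences of the site in a UTR, allowing overlaps."""
    width = len(site)
    return sum(1 for i in range(len(utr) - width + 1) if utr[i:i + width] == site)


# Sequence assumed here for miR-122-5p; the UTR is a made-up example.
mir122 = "UGGAGUGUGACAAUGGUGUUUG"
site = seed_target_site(mir122)
n_sites = count_sites("AAACACUCCAGGACACUCCUU", site)
```

Per-gene site counts of this kind could then be related to expression ranks to score activity in each cell, which is the direction the abstract describes.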


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Nathaniel P. Delos Santos ◽  
Lorane Texari ◽  
Christopher Benner

Abstract Background Motif enrichment analysis (MEA) identifies over-represented transcription factor (TF) binding motifs in the DNA sequence of regulatory regions, enabling researchers to infer which transcription factors can regulate transcriptional response to a stimulus, or identify sequence features found near a target protein in a ChIP-seq experiment. Score-based MEA determines motifs enriched in regions exhibiting extreme differences in regulatory activity, but existing methods do not control for biases in GC content or dinucleotide composition. This lack of control for sequence bias, such as that often found in CpG islands, can obscure the enrichment of biologically relevant motifs. Results We developed Motif Enrichment In Ranked Lists of Peaks (MEIRLOP), a novel MEA method that determines enrichment of TF binding motifs in a list of scored regulatory regions, while controlling for sequence bias. In this study, we compare MEIRLOP against other MEA methods in identifying binding motifs found enriched in differentially active regulatory regions after interferon-beta stimulus, finding that using logistic regression and covariates improves the ability to call enrichment of ISGF3 binding motifs from differential acetylation ChIP-seq data compared to other methods. Our method achieves similar or better performance compared to other methods when quantifying the enrichment of TF binding motifs from ENCODE TF ChIP-seq datasets. We also demonstrate how MEIRLOP is broadly applicable to the analysis of numerous types of NGS assays and experimental designs. Conclusions Our results demonstrate the importance of controlling for sequence bias when accurately identifying enriched DNA sequence motifs using score-based MEA. MEIRLOP is available for download from https://github.com/npdeloss/meirlop under the MIT license.
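The sequence-bias covariates named above, GC content and dinucleotide composition, are simple to compute per region. The sketch below shows only that step, under the assumption that such values would enter a regression alongside the region score. It does not reproduce MEIRLOP's covariate construction or model, and the function names are hypothetical.

```python
from collections import Counter


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def dinucleotide_frequencies(seq):
    """Relative frequency of each overlapping dinucleotide (length >= 2 assumed)."""
    seq = seq.upper()
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(pairs.values())
    return {pair: n / total for pair, n in pairs.items()}


# Hypothetical region sequence.
region = "ACGCGT"
gc = gc_fraction(region)
dinucs = dinucleotide_frequencies(region)
```

A CpG-island-like region would show both a high `gc` value and a high `CG` dinucleotide frequency. Including such covariates in the model is what separates a motif effect from a composition effect.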


2020 ◽  
Author(s):  
Morten Muhlig Nielsen ◽  
Jakob Skou Pedersen

Abstract High throughput single-cell RNA sequencing (scRNAseq) can provide mRNA expression profiles for thousands of cells. However, miRNAs cannot currently be studied at the same scale. By exploiting that miRNAs bind well-defined sequence motifs and typically down-regulate target genes, we show that motif enrichment analysis can be used to derive miRNA activity estimates from scRNAseq data. Motif enrichment analyses have traditionally been used to derive binding motifs for regulatory factors, such as miRNAs or transcription factors, that have an effect on gene expression. Here we reverse its use. By starting from the miRNA seed site, we derive a measure of activity for miRNAs in single cells. We first establish the approach on a comprehensive set of bulk TCGA cancer samples (n=9,679), with paired mRNA and miRNA expression profiles, where many miRNAs show a strong correlation with measured expression. By downsampling we show that the method can be used to estimate miRNA activity in sparse data comparable to scRNAseq experiments. We then analyze a human and a mouse scRNAseq data set, and show that for several miRNA candidates, including liver specific miR-122 and muscle specific miR-1 and miR-133a, we obtain activity measures supported by the literature. The methods are implemented and made available in the miReact software. Our results demonstrate that miRNA activities can be estimated at the single cell level. This allows insights into the dynamics of miRNA activity across a range of fields where scRNAseq is applied.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 4329-4329
Author(s):  
Valentina S Caputo ◽  
Nikolaos Trasanidis ◽  
Xiaolin Xiao ◽  
Mark E Robinson ◽  
Alexia Katsarou ◽  
...  

BACKGROUND: Bone disease, a common source of morbidity in multiple myeloma (MM), is caused by RANKL-induced aberrant activation of osteoclasts (OC). RANKL-induced OC lineage commitment requires repression of an Irf8-dependent macrophage inflammatory transcriptional programme commensurate with activation of an OC lineage-specific programme. Functional data have shown the requirement for the histone acetylation readers Brd2-4 (BET proteins) and for cMyc in OC lineage development. However, how Brd2-4 and Myc co-operate genome-wide to regulate transcriptome changes that underpin the very early stages of RANKL-induced OC lineage commitment has not been defined. METHODS: The OC progenitor-like murine RAW264.7 cell line was used for osteoclastogenesis. OC were assayed by TRAP staining. We performed RNA-seq for transcriptome analysis and ChIP-seq against Brd2-4, cMyc, and the H3K27ac mark for epigenomic profiling. The pan-BET inhibitor iBET151 was used alone or in combination with RANKL. ChIP-seq/RNA-seq data were processed using standard bioinformatics pipelines; downstream analyses (pathway and motif enrichment, factor differential binding) were performed by various tools including EnrichR, R packages ChIPpeakAnno/DiffBind, Rose. RESULTS: Transcriptomic profiling of OC progenitors at 0, 4, 14 and 24h post-RANKL treatment identified 12 distinct clusters of expression trends. The 4h activated cluster includes OC master transcription factors (TFs; cMyc, Nfatc1, Fosl), and is enriched in OC-defining pathways. Notably, by 14h the majority of the genes required for mature OC formation and activation are already highly expressed (e.g. Ctsk, Mmp9). The downregulated clusters include monocyte-defining TFs (e.g. Irf8, Mafb and Bcl6). These RANKL-dependent transcriptome changes are completely abrogated by iBET151, highlighting the critical role of Brd2-4 in osteoclastogenesis.
Differential chromatin binding analysis upon RANKL induction revealed an overall enhanced Brd2-4 binding at already existing or de novo gained sites. This was more pronounced for Brd2&4 and much less for Brd3, with differentially binding sites (DBS) comprising 50% and 20% respectively of all binding sites in RANKL-treated cells. For Brd2&3, DBS were primarily distributed at promoters and for Brd4 at intergenic, candidate enhancer regions. Notably, nearly all gained DBS were sensitive to and abrogated by iBET151. Combinatorial profiling of Brd2 and Brd4 showed that almost half of Brd2 DBS peaks overlap with Brd4 (47%; 897/1896), while only 24% (766/3234) of Brd4 DBS peaks are co-occupied by Brd2. Transcriptome and Brd2&4 DBS integration, in combination with motif enrichment analysis, identified genes that are predicted to be regulated by Brd2 and/or Brd4. EnrichR analysis suggests that enhanced binding of Brd2&4, singly or in combination, is required for activation of the critical OC lineage-specific and repression of the macrophage-defining transcriptional programs, highlighting the non-redundant roles of Brd2&4 in OC development. Cell lineage commitment often requires 'commissioning' of cell-specific super-enhancers (SE). Combined analysis of genome-wide Brd4/H3K27ac profiles identified 678 RANKL-induced SE and their respective target genes. Further, 110 of these SE showed enhanced Brd4 binding in 2 peaks: 20/110 were linked to significantly up- and 90/110 to down-regulated genes. The repressed genes were significantly enriched for previously described Irf8, MafB and RunX1 targets, suggesting a critical role of SE in the repression of the monocyte/macrophage inflammatory programme during OC lineage commitment. Strikingly, among top hits, we detected a SE linked to the regulation of cMyc. To further investigate its role in OC development, we obtained the cistrome of cMyc after RANKL induction.
We identified 560 binding sites which were highly enriched in cMyc, Max, Fli1, Fosl2 and Irf8 motifs. Cistrome-transcriptome integration suggested direct activation of 141 and repression of 52 genes by cMyc in response to RANKL; these are enriched in ribosome biogenesis pathways and Irf8-dependent targets respectively. CONCLUSIONS: Myc and Brd4 mark SE that repress an Irf8-dependent transcriptional programme, a requirement for OC lineage commitment. The non-redundant roles of Brd2&4 suggest that selective targeting of either could inhibit aberrant OC activation associated with MM. Disclosures Caputo: GSK: Research Funding. Auner: Amgen: Other: Consultancy and Research Funding; Takeda: Consultancy; Karyopharm: Consultancy. Karadimitris: GSK: Research Funding.


2019 ◽  
Author(s):  
Yiyang Zhao ◽  
Jianbo Xie ◽  
Weijie Xu ◽  
Sisi Chen ◽  
Yousry A. El-Kassaby ◽  
...  

Abstract Background Photosynthesis is a complicated process that is modulated by an intricate regulatory network at the transcriptional level. However, its underlying molecular mechanism under heat stress remains to be understood. Analysis of the adaptive response and regulatory networks of trees to heat stress will expand our understanding of thermostability in perennial plants. In this study, we used a multi-gene network to investigate the regulatory pathway under heat stress, constructed by a multifaceted approach combining time-course RNA-seq, regulatory motif enrichment, and expression-trait association analysis. Results By analyzing changes in the transcriptome under heat stress, we identified 77 key photosynthetic genes, of which 97.4% (75 genes) were down-regulated, consistent with the decreased measured photosynthesis values. According to regulatory motif enrichment analysis, these 77 differentially expressed genes (DEGs) shared vital light-responsive elements involved in photosynthesis. When integrating all the differentially expressed genes, 5 co-expressed gene modules (1,548 genes) were found to be significantly correlated with 4 photosynthesis-related traits. On this basis, a three-layered gene regulatory network (GRN) was established using a backward elimination random forest (BWERF) algorithm, which included 77 photosynthetic genes (in the bottom layer), 40 TFs/miRNAs (in the second layer), and 20 TFs/miRNAs (in the top layer). Importantly, 6 miRNAs and 4 TFs were found to be key regulators in this regulatory pathway, emphasizing the significant roles of TFs/miRNAs in affecting photosynthetic traits.
The results imply a functional role for these key genes in mediating photosynthesis under heat stress, demonstrating the potential of combining time-course transcriptome-based regulatory pathway construction, cis-element enrichment analysis, and expression-trait association approaches to dissect complex genetic networks. Conclusions The heat-responsive pathway regulating photosynthesis is a multi-layered complex network that is co-controlled by TFs and miRNAs. Our work not only implies a functional role for these key genes in mediating the photosynthetic response to abiotic stress in poplar, but also demonstrates that time-course transcriptome-based regulatory network construction will further facilitate the examination of genetic networks and key nodes in plants.

