Galaxy CLIP-Explorer: a web server for CLIP-Seq data analysis

GigaScience ◽  
2020 ◽  
Vol 9 (11) ◽  
Author(s):  
Florian Heyl ◽  
Daniel Maticzka ◽  
Michael Uhl ◽  
Rolf Backofen

Abstract Background: Post-transcriptional regulation via RNA-binding proteins plays a fundamental role in every organism, but the underlying regulatory mechanisms remain poorly understood. They can be elucidated by cross-linking immunoprecipitation in combination with high-throughput sequencing (CLIP-Seq). CLIP-Seq answers questions about the functional role of an RNA-binding protein and its targets by determining binding sites at the nucleotide level, together with the associated sequence and structural binding patterns. In recent years the amount of CLIP-Seq data has skyrocketed, creating a need for automated data analysis that can deal with different experimental set-ups. However, noncanonical data, new protocols, and a huge variety of tools, especially for peak calling, have made it difficult to define a standard. Findings: CLIP-Explorer is a flexible and reproducible data analysis pipeline for iCLIP data that, for the first time, also supports eCLIP, FLASH, and uvCLAP data. Individual steps like peak calling can be changed to adapt to different experimental settings. We validate CLIP-Explorer on eCLIP data, finding similar or nearly identical motifs for various proteins in comparison with other databases. In addition, we detect new sequence motifs for PTBP1 and U2AF2. Finally, we optimize the peak calling with 3 different peak callers on RBFOX2 data, discuss the difficulty of the peak-calling step, and give advice for different experimental set-ups. Conclusion: CLIP-Explorer meets the demand for a flexible CLIP-Seq data analysis pipeline that is applicable to up-to-date CLIP protocols. The article further shows the limitations of current peak-calling algorithms and the importance of robust peak detection.

2021 ◽  
Vol 12 ◽  
Author(s):  
Huiyuan Wang ◽  
Sheng Liu ◽  
Xiufang Dai ◽  
Yongkang Yang ◽  
Yunjun Luo ◽  
...  

Populus trichocarpa (P. trichocarpa) is a model tree for the investigation of wood formation. In recent years, researchers have generated a large amount of high-throughput sequencing data in P. trichocarpa. However, no comprehensive database that provides multi-omics associations for the investigation of secondary growth in response to diverse stresses has been reported. Therefore, we developed a public repository that presents comprehensive measurements of gene expression and post-transcriptional regulation by integrating 144 RNA-Seq, 33 ChIP-seq, and six single-molecule real-time (SMRT) isoform sequencing (Iso-seq) libraries prepared from tissues subjected to different stresses. All samples from the different studies were analyzed with unified parameters to obtain gene expression, co-expression networks, and differentially expressed genes (DEGs), which allows comparison of results across studies and treatments. In addition to gene expression, we also identified and deposited pre-processed data on alternative splicing (AS), alternative polyadenylation (APA), and alternative transcription initiation (ATI). The post-transcriptional regulation, differential expression, and co-expression network datasets were integrated into a new P. trichocarpa Stem Differentiating Xylem (PSDX) database, which further highlights gene families of RNA-binding proteins and stress-related genes. The PSDX also provides tools for data query and visualization, a genome browser, and a BLAST option for sequence-based query. Much of the data is also available for bulk download. The availability of PSDX supports research on secondary growth in response to stresses in P. trichocarpa and will provide new insights that can be useful for improving stress tolerance in woody plants.


2016 ◽  
Author(s):  
David Heller ◽  
Martin Vingron ◽  
Ralf Krestel ◽  
Uwe Ohler ◽  
Annalisa Marsico

Abstract RNA-binding proteins (RBPs) play important roles in RNA post-transcriptional regulation and recognize target RNAs via sequence-structure motifs. To what extent RNA structure influences protein binding in the presence or absence of a sequence motif is still poorly understood. RNA motif finders that produce informative motifs and simultaneously capture the relationship between primary sequence and different RNA secondary structures are lacking. We developed ssHMM, an RNA motif finder that combines a hidden Markov model (HMM) with Gibbs sampling to learn the joint sequence and structure binding preferences of RBPs from high-throughput data, such as CLIP-Seq sequences, and visualizes them as a graph. Evaluations on synthetic data showed that ssHMM reliably recovers fuzzy sequence motifs in 80 to 100% of the cases. It produces motifs with higher information content than existing tools and is faster than other methods on large datasets. Examples of new sequence-structure motifs identified by ssHMM for uncharacterized RBPs are also discussed. ssHMM is freely available on GitHub at https://github.molgen.mpg.de/heller/ssHMM.
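The "information content" of a motif mentioned above can be illustrated with the standard per-column measure for a sequence motif. The sketch below assumes a uniform background over the four RNA bases and a plain position probability matrix; the function names and the toy matrix are hypothetical, and this is not necessarily the exact measure or code used by ssHMM, which also models structure.

```python
import math

def column_information(probs):
    """Information content (bits) of one motif column over A, C, G, U.

    Assumes a uniform background, so the maximum is log2(4) = 2 bits.
    """
    entropy = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2.0 - entropy

def motif_information(ppm):
    """Total information content of a motif: the sum over its columns."""
    return sum(column_information(column) for column in ppm)

# Hypothetical 3-column motif; each row holds P(A), P(C), P(G), P(U).
toy_motif = [
    [1.0, 0.0, 0.0, 0.0],      # fully conserved position
    [0.5, 0.5, 0.0, 0.0],      # two equally likely bases
    [0.25, 0.25, 0.25, 0.25],  # uninformative position
]
total_bits = motif_information(toy_motif)
```

A sharper ("less fuzzy") motif concentrates probability on fewer bases per column and therefore scores closer to 2 bits per position.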


Author(s):  
Jinkai Wang

Abstract Post-transcriptional processing of RNAs plays important roles in a variety of physiological and pathological processes. These processes can be precisely controlled by a series of RNA-binding proteins and cotranscriptionally regulated by transcription factors as well as histone modifications. With the rapid development of high-throughput sequencing techniques, multiomics data have been broadly used to study the mechanisms underlying important biological processes. However, how to use these high-throughput sequencing data to elucidate the fundamental regulatory roles of post-transcriptional processes remains a great challenge. This review summarizes the regulatory mechanisms of post-transcriptional processes and the general principles and approaches to dissect these mechanisms by integrating multiomics data as well as public resources.


2021 ◽  
Author(s):  
Eun Seon Kim ◽  
Chang Geon Chung ◽  
Jeong Hyang Park ◽  
Byung Su Ko ◽  
Sung Soon Park ◽  
...  

Abstract RNA-binding proteins (RBPs) play essential roles in diverse cellular processes through post-transcriptional regulation of RNAs. The subcellular localization of RBPs is thus under tight control, the breakdown of which is associated with aberrant cytoplasmic accumulation of nuclear RBPs such as TDP-43 and FUS, well-known pathological markers for amyotrophic lateral sclerosis and frontotemporal dementia (ALS/FTD). Here, we report in a Drosophila model for ALS/FTD that nuclear accumulation of a cytoplasmic RBP, Staufen, may be a new pathological feature. We found that in Drosophila C4da neurons expressing PR36, one of the arginine-rich dipeptide repeat proteins (DPRs), Staufen accumulated in the nucleus in an Importin- and RNA-dependent manner. Notably, expressing Staufen with an exogenous NLS (but not with a mutated endogenous NLS) potentiated PR-induced dendritic defects, suggesting that nuclear-accumulated Staufen can enhance PR toxicity. PR36 expression increased Fibrillarin staining in the nucleolus, which was enhanced by heterozygous mutation of stau (stau+/−), the gene that encodes Staufen. Furthermore, knockdown of fib, which encodes Fibrillarin, exacerbated retinal degeneration mediated by PR toxicity, suggesting that the increased amount of Fibrillarin in stau+/− is protective. stau+/− also reduced the amount of PR-induced nuclear-accumulated Staufen, mitigated retinal degeneration, and rescued the viability of flies expressing PR36. Taken together, our data show that nuclear accumulation of Staufen in neurons may be an important pathological feature contributing to the pathogenesis of ALS/FTD.


2019 ◽  
Vol 24 (3) ◽  
pp. 213-223 ◽  
Author(s):  
Raimo Franke ◽  
Bettina Hinkelmann ◽  
Verena Fetz ◽  
Theresia Stradal ◽  
Florenz Sasse ◽  
...  

Mode of action (MoA) identification of bioactive compounds is often a challenging and time-consuming task. We used a label-free kinetic profiling method based on an impedance readout to monitor the time-dependent cellular response profiles for the interaction of bioactive natural products and other small molecules with mammalian cells. Such approaches have rarely been used so far because of the lack of data mining tools that properly capture the characteristics of the impedance curves. We developed a data analysis pipeline for the xCELLigence Real-Time Cell Analysis detection platform to process the data, assess and score their reproducibility, and provide rank-based MoA predictions for a reference set of 60 bioactive compounds. The method can reveal additional, previously unknown targets, as exemplified by the identification of tubulin-destabilizing activities of the RNA synthesis inhibitor actinomycin D and the effects on DNA replication of vioprolide A. The data analysis pipeline is based on the statistical programming language R and is available to the scientific community through a GitHub repository.


eLife ◽  
2019 ◽  
Vol 8 ◽  
Author(s):  
Ling-Yu Liu ◽  
Xi Long ◽  
Ching-Po Yang ◽  
Rosa L Miyares ◽  
Ken Sugino ◽  
...  

Temporal patterning is a seminal method of expanding neuronal diversity. Here we unravel a mechanism decoding neural stem cell temporal gene expression and transforming it into discrete neuronal fates. This mechanism is characterized by hierarchical gene expression. First, Drosophila neuroblasts express opposing temporal gradients of RNA-binding proteins, Imp and Syp. These proteins promote or inhibit chinmo translation, yielding a descending neuronal gradient. Together, first and second-layer temporal factors define a temporal expression window of BTB-zinc finger nuclear protein, Mamo. The precise temporal induction of Mamo is achieved via both transcriptional and post-transcriptional regulation. Finally, Mamo is essential for the temporally defined, terminal identity of α’/β’ mushroom body neurons and identity maintenance. We describe a straightforward paradigm of temporal fate specification where diverse neuronal fates are defined via integrating multiple layers of gene regulation. The neurodevelopmental roles of orthologous/related mammalian genes suggest a fundamental conservation of this mechanism in brain development.


2004 ◽  
Vol 24 (14) ◽  
pp. 6241-6252 ◽  
Author(s):  
Kristina L. Carroll ◽  
Dennis A. Pradhan ◽  
Josh A. Granek ◽  
Neil D. Clarke ◽  
Jeffry L. Corden

ABSTRACT RNA polymerase II (Pol II) termination is triggered by sequences present in the nascent transcript. Termination of pre-mRNA transcription is coupled to recognition of cis-acting sequences that direct cleavage and polyadenylation of the pre-mRNA. Termination of nonpolyadenylated [non-poly(A)] Pol II transcripts in Saccharomyces cerevisiae requires the RNA-binding proteins Nrd1 and Nab3. We have used a mutational strategy to characterize non-poly(A) termination elements downstream of the SNR13 and SNR47 snoRNA genes. This approach detected two common RNA sequence motifs, GUA[AG] and UCUU. The first motif corresponds to the known Nrd1-binding site, which we have verified here by gel mobility shift assays. We also show that Nab3 protein binds specifically to RNA containing the UCUU motif. Taken together, our data suggest that Nrd1 and Nab3 binding sites play a significant role in defining non-poly(A) terminators. As is the case with poly(A) terminators, there is no strong consensus for non-poly(A) terminators, and the arrangement of Nrd1p and Nab3p binding sites varies considerably. In addition, the organization of these sequences is not strongly conserved among even closely related yeasts. This indicates a large degree of genetic variability. Despite this variability, we were able to use a computational model to show that the binding sites for Nrd1 and Nab3 can identify genes for which transcription termination is mediated by these proteins.
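The two sequence motifs named above, GUA[AG] and UCUU, can be located in an RNA sequence with a simple pattern scan. The sketch below is only an illustration under stated assumptions: the function name, the labels, and the example sequence are hypothetical, and it is not the computational model used in the study, which is not described in the abstract.

```python
import re

# Motifs as quoted in the abstract. Zero-width lookaheads are used so that
# overlapping occurrences (e.g. UCUUCUU) are all reported.
MOTIFS = {
    "Nrd1 (GUA[AG])": re.compile(r"(?=(GUA[AG]))"),
    "Nab3 (UCUU)": re.compile(r"(?=(UCUU))"),
}

def find_motif_sites(rna):
    """Return {motif label: [(0-based start, matched sequence), ...]}."""
    # Normalize case and accept DNA-style input by converting T to U.
    rna = rna.upper().replace("T", "U")
    return {
        label: [(m.start(), m.group(1)) for m in pattern.finditer(rna)]
        for label, pattern in MOTIFS.items()
    }

# Hypothetical sequence, not taken from SNR13 or SNR47.
example = "AAGUAAUCUUGGUAGUCUUCUU"
sites = find_motif_sites(example)
for label, hits in sites.items():
    print(label, hits)
```

Because the abstract notes that the arrangement of binding sites varies considerably, a scan like this only lists candidate sites; it does not by itself predict termination.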


2016 ◽  
Vol 7 ◽  
Author(s):  
Li Guo ◽  
Kelly S. Allen ◽  
Greg Deiulio ◽  
Yong Zhang ◽  
Angela M. Madeiras ◽  
...  

ChemInform ◽  
2003 ◽  
Vol 34 (21) ◽  
Author(s):  
Muenevver Koekueer ◽  
Fionn Murtagh ◽  
Norman D. McMillan ◽  
Sven Riedel ◽  
Brian O'Rourke ◽  
...  
