A High-Throughput Sequencing-Based Methodology to Identify All Uncapped and Cleaved RNA Molecules in Eukaryotic Genomes

Author(s):  
Matthew W. Endres ◽  
Rebecca T. Cook ◽  
Brian D. Gregory
eLife ◽  
2015 ◽  
Vol 4 ◽  
Author(s):  
Nathan J Baird ◽  
Sebastian A Leidel

A combination of 3D modeling and high-throughput sequencing may offer a faster way to determine the three-dimensional structures of RNA molecules.


2015 ◽  
Author(s):  
Davide Verzotto ◽  
Axel M Hillmer ◽  
Audrey S M Teo ◽  
Niranjan Nagarajan

Resolution of complex repeat structures and rearrangements in the assembly and analysis of large eukaryotic genomes is often aided by a combination of high-throughput sequencing and mapping technologies (e.g., optical restriction mapping). In particular, mapping technologies can generate sparse maps of large DNA fragments (150 kbp to 2 Mbp) and thus provide a unique source of information for disambiguating complex rearrangements in cancer genomes. Despite their utility, combining high-throughput sequencing and mapping technologies has been challenging due to the lack of efficient and freely available software for robustly aligning maps to sequences. Here we introduce two new map-to-sequence alignment algorithms that efficiently and accurately align high-throughput mapping datasets to large eukaryotic genomes while accounting for high error rates. To do so, these methods (OPTIMA for glocal alignment and OPTIMA-Overlap for overlap alignment) exploit the ability to create efficient data structures that index continuous-valued mapping data while accounting for errors. We also introduce an approach for evaluating the significance of alignments that avoids expensive permutation-based tests while being agnostic to technology-dependent error rates. Our benchmarking results suggest that OPTIMA and OPTIMA-Overlap outperform state-of-the-art approaches in sensitivity (1.6 to 2X improvement) while simultaneously being more efficient (170 to 200%) and precise in their alignments (99% precision). These advantages are independent of the quality of the data, suggesting that our indexing approach and statistical evaluation are robust and provide improved sensitivity while guaranteeing high precision.
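The idea of indexing continuous-valued fragment sizes so that lookups tolerate sizing error can be illustrated with a toy example. The sketch below is not the OPTIMA data structure or algorithm, whose details the abstract does not give. It assumes a plain sorted list queried by binary search, a single relative sizing tolerance (`rel_tol`), and no missing or extra cut sites. All function names and values are hypothetical.

```python
import bisect

def build_index(ref_fragments):
    """Sort (size, position) pairs so size-range queries can use binary search."""
    return sorted((size, pos) for pos, size in enumerate(ref_fragments))

def candidates(index, size, rel_tol):
    """Positions of reference fragments within rel_tol of `size` (hypothetical error model)."""
    lo = bisect.bisect_left(index, (size * (1 - rel_tol), -1))
    hi = bisect.bisect_right(index, (size * (1 + rel_tol), float("inf")))
    return sorted(pos for _, pos in index[lo:hi])

def align(map_fragments, ref_fragments, rel_tol=0.1):
    """Seed on the first map fragment, then verify the remaining fragments in order."""
    index = build_index(ref_fragments)
    hits = []
    for start in candidates(index, map_fragments[0], rel_tol):
        window = ref_fragments[start:start + len(map_fragments)]
        if len(window) == len(map_fragments) and all(
            abs(m - r) <= rel_tol * m for m, r in zip(map_fragments, window)
        ):
            hits.append(start)
    return hits

# Made-up fragment sizes in bp: an in silico digest of a reference and one noisy map.
reference = [12000, 35000, 8000, 51000, 20000, 34000, 8200, 60000]
optical_map = [33800, 8100, 52000]
print(align(optical_map, reference))  # [1]
```

Two reference positions pass the seed lookup here (fragments 1 and 5), but only the window starting at fragment 1 matches all three sizes within tolerance. A realistic aligner would also have to handle missed and spurious cut sites and score the significance of each hit, which this sketch omits.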


2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Kunqi Chen ◽  
Zhen Wei ◽  
Hui Liu ◽  
João Pedro de Magalhães ◽  
Rong Rong ◽  
...  

With well over 100 types of RNA modifications, associated with various molecular functions, identified to date on diverse types of RNA molecules, the epitranscriptome has emerged as an important layer of gene expression regulation. It is of crucial importance and increasing interest to understand, from a global perspective, how the epitranscriptome is regulated to facilitate different biological functions. This understanding may be advanced by finding biologically meaningful epitranscriptome modules that respond to upstream epitranscriptome regulators and lead to downstream biological functions. However, owing to the intrinsic properties of RNA molecules, RNA modifications, and the relevant sequencing techniques, the epitranscriptome profiled by high-throughput sequencing often suffers from various artifacts, which jeopardizes the effectiveness of epitranscriptome module identification with conventional approaches. To address this problem, we developed a convenient measurement weighting strategy that can largely tolerate the artifacts of high-throughput sequencing data. We demonstrated on real data that the proposed strategy improves epitranscriptome module discovery in terms of both module accuracy and biological significance. Although the new approach is integrated with the Euclidean distance in a hierarchical clustering scenario, it has great potential to be extended to other distance measures and algorithms to address various tasks in epitranscriptome analysis. Additionally, we show for the first time, with rigorous statistical analysis, that the epitranscriptome modules are biologically meaningful, with different GO functions enriched, which establishes the functional basis of epitranscriptome modules, fulfills a key prerequisite for functional characterization, and deciphers the epitranscriptome and its regulation.
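As a rough illustration of weighting measurements inside a Euclidean distance before hierarchical clustering, consider the sketch below. The abstract does not specify how the weights are derived or combined, so the scheme here (one confidence weight per measurement, multiplied pairwise and normalized) is an assumption, as are the function names, the average-linkage choice, and the toy data.

```python
import math

def weighted_euclidean(x, y, wx, wy):
    """Euclidean distance with each coordinate scaled by the product of its two weights."""
    num = den = 0.0
    for a, b, u, v in zip(x, y, wx, wy):
        w = u * v  # assumed combination rule, not taken from the paper
        num += w * (a - b) ** 2
        den += w
    return math.sqrt(num / den) if den > 0 else float("inf")

def cluster(profiles, weights, k):
    """Naive average-linkage agglomerative clustering down to k clusters."""
    clusters = [[i] for i in range(len(profiles))]

    def linkage(a, b):
        dists = [
            weighted_euclidean(profiles[i], profiles[j], weights[i], weights[j])
            for i in a for j in b
        ]
        return sum(dists) / len(dists)

    while len(clusters) > k:
        # Find and merge the closest pair of clusters.
        _, p, q = min(
            (linkage(clusters[p], clusters[q]), p, q)
            for p in range(len(clusters))
            for q in range(p + 1, len(clusters))
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return [sorted(c) for c in clusters]

# Toy modification levels for four sites across three conditions (made-up values).
profiles = [
    [0.9, 0.8, 0.9],
    [0.9, 0.8, 0.1],  # last value treated as a low-confidence artifact
    [0.1, 0.2, 0.1],
    [0.2, 0.1, 0.2],
]
weights = [
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 0.01],  # down-weight the suspect measurement
    [1.0, 1.0, 1.0],
    [1.0, 1.0, 1.0],
]
print(cluster(profiles, weights, 2))  # [[0, 1], [2, 3]]
```

With the suspect measurement down-weighted, sites 0 and 1 are nearly identical under the weighted distance, whereas an unweighted distance would place them much further apart.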

