Identifying proximal RNA interactions from cDNA-encoded crosslinks with ShapeJumper

2021 ◽  
Vol 17 (12) ◽  
pp. e1009632
Author(s):  
Thomas W. Christy ◽  
Catherine A. Giannetti ◽  
Alain Laederach ◽  
Kevin M. Weeks

SHAPE-JuMP is a concise strategy for identifying close-in-space interactions in RNA molecules. Nucleotides in close three-dimensional proximity are crosslinked with a bi-reactive reagent that covalently links the 2’-hydroxyl groups of the ribose moieties. The identities of crosslinked nucleotides are determined using an engineered reverse transcriptase that jumps across crosslinked sites, resulting in a deletion in the cDNA that is detected using massively parallel sequencing. Here we introduce ShapeJumper, a bioinformatics pipeline to process SHAPE-JuMP sequencing data and to accurately identify through-space interactions, as observed in complex JuMP datasets. ShapeJumper identifies proximal interactions with near-nucleotide resolution using an alignment strategy that is optimized to tolerate the unique non-templated reverse-transcription profile of the engineered crosslink-traversing reverse-transcriptase. JuMP-inspired strategies are now poised to replace adapter-ligation for detecting RNA-RNA interactions in most crosslinking experiments.
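The idea that a crosslink is read out as a deletion in the cDNA can be illustrated with a minimal sketch. It is not ShapeJumper's alignment strategy, which the abstract describes only in general terms. It assumes reads have already been aligned by some aligner and that each alignment is available as a 0-based reference start plus a SAM-style CIGAR string. The function name `long_deletions` and the `min_len` threshold are hypothetical, chosen for illustration.

```python
import re

# One CIGAR operation: a run length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def long_deletions(ref_start, cigar, min_len=10):
    """Return half-open (start, end) reference intervals of long deletions.

    ref_start is the 0-based reference position of the first aligned base.
    min_len is an illustrative cutoff (an assumption, not a published value)
    separating candidate crosslink-spanning deletions from short indels.
    """
    pos = ref_start
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op in "DN":                # deletion or skip: consumes reference only
            if length >= min_len:
                found.append((pos, pos + length))
            pos += length
        elif op in "M=X":             # match or mismatch: consumes reference and read
            pos += length
        # I, S, H and P do not advance the reference position
    return found


if __name__ == "__main__":
    # 20 aligned bases, a 35-nt deletion, then more aligned sequence
    print(long_deletions(100, "20M35D15M2I10M"))
```

The two ends of each reported interval are the candidate pair of nucleotides brought together by the crosslink. A real pipeline would also have to handle the non-templated additions at deletion boundaries that the abstract mentions.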


2012 ◽  
Vol 2012 ◽  
pp. 1-8 ◽  
Author(s):  
Aniruddha Chatterjee ◽  
Euan J. Rodger ◽  
Peter A. Stockwell ◽  
Robert J. Weeks ◽  
Ian M. Morison

Reduced representation bisulfite sequencing (RRBS), which couples bisulfite conversion and next-generation sequencing, is an innovative method that specifically enriches genomic regions with a high density of potential methylation sites and enables investigation of DNA methylation at single-nucleotide resolution. Recent advances in the Illumina DNA sample preparation protocol and sequencing technology have vastly improved sequencing throughput capacity. Although the new Illumina technology is now widely used, the unique challenges associated with multiplexed RRBS libraries on this platform have not been previously described. We have made modifications to the RRBS library preparation protocol to sequence multiplexed libraries on a single flow cell lane of the Illumina HiSeq 2000. Furthermore, our analysis incorporates a bioinformatics pipeline specifically designed to process bisulfite-converted sequencing reads and evaluate the output and quality of the sequencing data generated from the multiplexed libraries. We obtained an average of 42 million paired-end reads per sample for each flow-cell lane, with a high unique mapping efficiency to the reference human genome. Here we provide a roadmap of modifications, strategies, and troubleshooting approaches we implemented to optimize sequencing of multiplexed libraries on an RRBS background.


2017 ◽  
Author(s):  
Joseph D. Yesselman ◽  
Daniel Eiler ◽  
Erik D. Carlson ◽  
Alexandra N. Ooms ◽  
Wipapat Kladwang ◽  
...  

Abstract The emerging field of RNA nanotechnology seeks to create nanoscale 3D machines by repurposing natural RNA modules, but successes have been limited to symmetric assemblies of single repeating motifs. We present RNAMake, a suite that automates design of RNA molecules with complex 3D folds. We first challenged RNAMake with the paradigmatic problem of aligning a tetraloop and sequence-distal receptor, previously only solved via symmetry. Single-nucleotide-resolution chemical mapping, native gel electrophoresis, and solution x-ray scattering confirmed that 11 of the 16 ‘miniTTR’ designs successfully achieved clothespin-like folds. A 2.55 Å diffraction-resolution crystal structure of one design verified formation of the target asymmetric nanostructure, with large sections achieving near-atomic accuracy (< 2.0 Å). Finally, RNAMake designed asymmetric segments to tether the 16S and 23S rRNAs together into a synthetic single-stranded ribosome that remains uncleaved by ribonucleases and supports life in Escherichia coli, a challenge previously requiring several rounds of trial-and-error.


Author(s):  
Joël Simoneau ◽  
Simon Dumontier ◽  
Ryan Gosselin ◽  
Michelle S Scott

Abstract Ribonucleic acid sequencing (RNA-seq) identifies and quantifies RNA molecules from a biological sample. Transformation from raw sequencing data to meaningful gene or isoform counts requires an in silico bioinformatics pipeline. Such pipelines are modular in nature, built using selected software and biological references. Software is usually chosen and parameterized according to the sequencing protocol and biological question. However, while biological and technical noise is alleviated through replicates, biases due to the pipeline and choice of biological references are often overlooked. Here, we show that the current standard practice prevents reproducibility in RNA-seq studies by failing to specify required methodological information. Peer-reviewed articles are intended to apply currently accepted scientific and methodological standards. Inasmuch as the bias-less and optimal RNA-seq pipeline is not perfectly defined, methodological information holds a meaningful role in defining the results. This work illustrates the need for a standardized and explicit display of methodological information in RNA-seq experiments.


2019 ◽  
Vol 16 (8) ◽  
pp. 868-881
Author(s):  
Yueping Wang ◽  
Jie Chang ◽  
Jiangyuan Wang ◽  
Peng Zhong ◽  
Yufang Zhang ◽  
...  

Background: S-dihydro-alkyloxy-benzyl-oxopyrimidines (S-DABOs) as non-nucleoside reverse transcriptase inhibitors have received considerable attention during the last decade due to their high potency against HIV-1. Methods: In this study, the three-dimensional quantitative structure-activity relationship (3D-QSAR) of a series of 38 S-DABO analogues developed in our lab was studied using Comparative Molecular Field Analysis (CoMFA) and Comparative Molecular Similarity Indices Analysis (CoMSIA). The Docking/MMFF94s computational protocol based on the co-crystallized complex (PDB ID: 1RT2) was used to determine the most probable binding mode and to obtain reliable conformations for molecular alignment. Statistically significant CoMFA (q² = 0.766 and r² = 0.949) and CoMSIA (q² = 0.827 and r² = 0.974) models were generated using the training set of 30 compounds on the basis of hybrid docking-based and ligand-based alignment. Results: The predictive ability of the CoMFA and CoMSIA models was further validated using a test set of eight compounds, with predictive r² (r²pred) values of 0.843 and 0.723, respectively. Conclusion: The information obtained from the 3D contour maps can be used in designing new S-DABO derivatives with improved HIV-1 inhibitory activity.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Michela Quadrini

Abstract RNA molecules play crucial roles in various biological processes. Their three-dimensional configurations determine their functions and, in turn, influence their interactions with other molecules. RNAs and their interaction structures, the so-called RNA–RNA interactions, can be abstracted in terms of secondary structures, i.e., a list of the nucleotide bases paired by hydrogen bonding within the nucleotide sequence. Each secondary structure, in turn, can be abstracted into cores and shadows. Both are determined by collapsing nucleotides and arcs properly. We formalize all of these abstractions as arc diagrams, whose arcs determine loops. A secondary structure, represented by an arc diagram, is pseudoknot-free if its arc diagram does not present any crossing among arcs; otherwise, it is said to be pseudoknotted. In this study, we address the problem of identifying a given structural pattern in secondary structures, or in the associated cores or shadows, of both RNAs and RNA–RNA interactions characterized by arbitrary pseudoknots. These abstractions are mapped into a matrix, whose elements represent the relations among loops. Therefore, we address the problem by taking advantage of matrices and submatrices. The algorithms, implemented in Python, work in polynomial time. We test our approach on a set of 16S ribosomal RNAs with inhibitors of Thermus thermophilus, and we quantify the structural effect of the inhibitors.
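The pseudoknot-free condition stated in the abstract (no two arcs of the arc diagram cross) can be checked directly. The sketch below is a plain pairwise test, not the matrix-based algorithm of the paper. It assumes a secondary structure is given as a list of base pairs `(i, j)` over sequence positions, and the function name `is_pseudoknot_free` is hypothetical.

```python
from itertools import combinations


def is_pseudoknot_free(arcs):
    """Return True if no two arcs cross.

    Each arc is a pair of sequence positions (i, j). Two arcs (i, j) and
    (k, l), with i < j and k < l, cross when exactly one endpoint of one arc
    lies strictly inside the other, i.e. i < k < j < l or k < i < l < j.
    """
    normalized = [tuple(sorted(arc)) for arc in arcs]
    for (i, j), (k, l) in combinations(normalized, 2):
        if i < k < j < l or k < i < l < j:
            return False
    return True


if __name__ == "__main__":
    nested = [(1, 10), (2, 9), (12, 20)]   # nested and side-by-side arcs only
    knotted = [(1, 10), (5, 15)]           # second arc starts inside the first and ends outside
    print(is_pseudoknot_free(nested), is_pseudoknot_free(knotted))
```

This pairwise check is quadratic in the number of arcs, which is adequate for illustration. It says nothing about pattern matching on cores or shadows, which is the actual subject of the study.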


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Gundula Povysil ◽  
Monika Heinzl ◽  
Renato Salazar ◽  
Nicholas Stoler ◽  
Anton Nekrutenko ◽  
...  

Abstract Duplex sequencing is currently the most reliable method to identify ultra-low frequency DNA variants by grouping sequence reads derived from the same DNA molecule into families with information on the forward and reverse strand. However, only a small proportion of reads are assembled into duplex consensus sequences (DCS), and reads with potentially valuable information are discarded at different steps of the bioinformatics pipeline, especially reads without a family. We developed a bioinformatics toolset that analyses the tag and family composition in order to understand data loss and implement modifications that maximize the data output for variant calling. Specifically, our tools show that tags contain polymerase chain reaction and sequencing errors that contribute to data loss and lower DCS yields. Our tools also identified chimeras, which likely reflect barcode collisions. Finally, we also developed a tool that re-examines variant calls from raw reads and provides different summary data that categorize the confidence level of a variant call by a tier-based system. With this tool, we can include reads without a family and check the reliability of the call, which substantially increases the sequencing depth for variant calling, a particularly important advantage for low-input samples or low-coverage regions.
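The grouping step described above (reads sharing a tag form a family, and a duplex consensus needs families from both strands) can be sketched as follows. This is an illustration under simplifying assumptions, not the authors' toolset: each read is reduced to a `(tag, strand)` pair, the strand labels `"ab"` and `"ba"` are an assumed encoding, and the names `family_sizes` and `duplex_tags` are hypothetical.

```python
from collections import Counter


def family_sizes(reads):
    """Count reads per family, where a family is one (tag, strand) pair."""
    return Counter(reads)


def duplex_tags(reads, min_size=1):
    """Return the sorted tags that have at least min_size reads on both strands.

    Only these tags could yield a duplex consensus sequence; reads under any
    other tag are the ones that would be lost at this step.
    """
    sizes = family_sizes(reads)
    tags = {tag for tag, _strand in sizes}
    return sorted(
        tag for tag in tags
        if sizes[(tag, "ab")] >= min_size and sizes[(tag, "ba")] >= min_size
    )


if __name__ == "__main__":
    reads = [("AAC", "ab"), ("AAC", "ab"), ("AAC", "ba"), ("GGT", "ab")]
    print(family_sizes(reads))
    print(duplex_tags(reads))
```

Raising `min_size` shows how quickly usable families disappear. A tag with a PCR or sequencing error, as discussed in the abstract, would appear here as a separate small family that never pairs with its partner strand.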


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Anupam Bhattacharya ◽  
Simang Champramary ◽  
Tanya Tripathi ◽  
Debajit Thakur ◽  
Ilya Ioshikhes ◽  
...  

Abstract Background Our understanding of genome regulation is ever-evolving with the continuous discovery of new modes of gene regulation, and transcriptomic studies of mammalian genomes have revealed the presence of a considerable population of non-coding RNA molecules among the transcripts expressed. One such non-coding RNA molecule is long non-coding RNA (lncRNA). However, the function of lncRNAs in gene regulation is not well understood; moreover, finding conserved lncRNAs across species is a challenging task. Therefore, we propose a novel approach to identify conserved lncRNAs and functionally annotate these molecules. Results In this study, we exploited existing myogenic transcriptome data and identified conserved lncRNAs in mice and humans. We identified the lncRNAs differentially expressed between the early and later stages of muscle development. Differential expression of these lncRNAs was confirmed experimentally in cultured mouse muscle C2C12 cells. We utilized the three-dimensional architecture of the genome and identified topologically associated domains for these lncRNAs. Additionally, we correlated the expression of genes in domains for functional annotation of these trans-lncRNAs in myogenesis. Using this approach, we identified conserved lncRNAs in myogenesis and functionally annotated them. Conclusions With this novel approach, we identified the conserved lncRNAs in myogenesis in humans and mice and functionally annotated them. The method identified a large number of lncRNAs involved in myogenesis. Further studies are required to investigate the reason for the conservation of the lncRNAs in human and mouse even though their sequences are dissimilar. Our approach can be used to identify novel lncRNAs conserved in different species and functionally annotate them.


Membranes ◽  
2021 ◽  
Vol 11 (7) ◽  
pp. 536
Author(s):  
Shaojian He ◽  
Zhongrui Lu ◽  
Wenxu Dai ◽  
Kangning Yang ◽  
Yang Xue ◽  
...  

Phosphotungstic acid (HPW)-filled composite proton exchange membranes possess high proton conductivity under low relative humidity (RH). However, the leaching of HPW limits their wide application. Herein, we propose a novel approach for anchoring water-soluble HPW by polydopamine (PDA) coated graphene oxide and halloysite nanotubes (DGO and DHNTs) in order to construct hybrid three-dimensional proton transport networks in a sulfonated poly(ether ether ketone) (SPEEK) membrane. The introduction of PDA on the surfaces of the hybrid fillers could provide hydroxyl groups and secondary amine groups to anchor HPW, resulting in the uniform dispersion of HPW in the SPEEK matrix. The SPEEK/DGO/DHNTs/HPW (90/5/5/60) composite membrane exhibited higher water uptake and much better conductivity than the SPEEK membrane at low relative humidity. The best conductivity reached was 0.062 S cm−1 for the composite membrane, which is quite stable during the water immersion test.


2012 ◽  
Vol 58 (10) ◽  
pp. 1467-1475 ◽  
Author(s):  
Kwan-Wood G Lam ◽  
Peiyong Jiang ◽  
Gary J W Liao ◽  
K C Allen Chan ◽  
Tak Y Leung ◽  
...  

Abstract BACKGROUND A genomewide genetic and mutational profile of a fetus was recently determined via deep sequencing of maternal plasma DNA. This technology could have important applications for noninvasive prenatal diagnosis (NIPD) of many monogenic diseases. Relative haplotype dosage (RHDO) analysis, a core step of this procedure, would allow one to elucidate the maternally inherited half of the fetal genome. For clinical applications, the cost and complexity of data analysis might be reduced via targeted application of this approach to selected genomic regions containing disease-causing genes. There is thus a need to explore the feasibility of performing RHDO analysis in a targeted manner. METHODS We performed target enrichment by using solution-phase hybridization followed by massively parallel sequencing of the β-globin gene region in 2 families undergoing prenatal diagnosis for β-thalassemia. We used digital PCR strategies to physically deduce parental haplotypes. Finally, we performed RHDO analysis with target-enriched sequencing data and parental haplotypes to reveal the β-thalassemic status for the fetuses. RESULTS A mean sequencing depth of 206-fold was achieved in the β-globin gene region by targeted sequencing of maternal plasma DNA. RHDO analysis was successful for the sequencing data obtained from the target-enriched samples, including a region in one of the families in which the parents had similar haplotype structures. Data analysis revealed that both fetuses were heterozygous carriers of β-thalassemia. CONCLUSIONS Targeted sequencing of maternal plasma DNA for NIPD of monogenic diseases is feasible.

