RNA sequences
Recently Published Documents


2022 ◽  
Vol 23 (2) ◽  
pp. 925
Author(s):  
Sara Illodo ◽  
Cibrán Pérez-González ◽  
Ramiro Barcia ◽  
Flor Rodríguez-Prieto ◽  
Wajih Al-Soufi ◽  
...  

Guanine quadruplexes (G4s) are highly polymorphic four-stranded structures formed within guanine-rich DNA and RNA sequences that play a crucial role in biological processes. The recent discovery of the first G4 structures within mitochondrial DNA has led to a small revolution in the field. In particular, the G-rich conserved sequence block II (CSB II) can form different types of G4s that are thought to play a crucial role in replication. In this study, we decipher the most relevant G4 structures that can be formed within CSB II: RNA G4 at the RNA transcript, DNA G4 within the non-transcribed strand and DNA:RNA hybrid between the RNA transcript and the non-transcribed strand. We show that the more abundant, but unexplored, G6AG7 (37%) and G6AG8 (35%) sequences in CSB II yield more stable G4s than the less abundant G5AG7 sequence. Moreover, the existence of a guanine located 1 bp upstream promotes G4 formation. In all cases, parallel G4s are formed, but their topology changes from a less ordered to a highly ordered G4 when adding small amounts of potassium or sodium cations. Circular dichroism was used to discriminate different conformations and topologies of nucleic acids and was complemented with gel electrophoresis and fluorescence spectroscopy studies.


2022 ◽  
Author(s):  
Bowen Song ◽  
Daiyun Huang ◽  
Yuxin Zhang ◽  
Zhen Wei ◽  
Jionglong Su ◽  
...  

As the most pervasive epigenetic marker present on mRNA and lncRNA, N6-methyladenosine (m6A) RNA methylation has been shown to participate in essential biological processes. Recent studies revealed the distinct patterns of m6A methylome across human tissues, and a major challenge remains in elucidating the tissue-specific presence and circuitry of m6A methylation. We present here a comprehensive online platform m6A-TSHub for unveiling the context-specific m6A methylation and genetic mutations that potentially regulate m6A epigenetic mark. m6A-TSHub consists of four core components, including (1) m6A-TSDB: a comprehensive database of 184,554 functionally annotated m6A sites derived from 23 human tissues and 499,369 m6A sites from 25 tumor conditions, respectively; (2) m6A-TSFinder: a web server for high-accuracy prediction of m6A methylation sites within a specific tissue from RNA sequences, which was constructed using multi-instance deep neural networks with gated attention; (3) m6A-TSVar: a web server for assessing the impact of genetic variants on tissue-specific m6A RNA modification; and (4) m6A-CAVar: a database of 587,983 TCGA cancer mutations (derived from 27 cancer types) that were predicted to affect m6A modifications in the primary tissue of cancers. The database should make a useful resource for studying the m6A methylome and genetic factor of epitranscriptome disturbance in a specific tissue (or cancer type). m6A-TSHub is accessible at: www.xjtlu.edu.cn/biologicalsciences/m6ats.


2022 ◽  
Vol 12 (1) ◽  
Author(s):  
Jake M. Peterson ◽  
Collin A. O’Leary ◽  
Walter N. Moss

Abstract Influenza virus is a persistent threat to human health; indeed, the deadliest modern pandemic was in 1918 when an H1N1 virus killed an estimated 50 million people globally. The intent of this work is to better understand influenza from an RNA-centric perspective to provide local, structural motifs with likely significance to the influenza infectious cycle for therapeutic targeting. To accomplish this, we analyzed over four hundred thousand RNA sequences spanning three major clades: influenza A, B and C. We scanned influenza segments for local secondary structure, identified/modeled motifs of likely functionality, and coupled the results to an analysis of evolutionary conservation. We discovered 185 significant regions of predicted ordered stability, yet evidence of sequence covariation was limited to 7 motifs, where 3 (found in influenza C) had higher than expected amounts of sequence covariation.


2022 ◽  
Author(s):  
MaKenzie R. Scarpitti ◽  
Julia E. Warrick ◽  
Michael G. Kearse

Loss of functional fragile X mental retardation protein (FMRP) causes fragile X syndrome, the leading form of inherited intellectual disability and the most common monogenic cause of autism spectrum disorders. FMRP is an RNA-binding protein that controls neuronal mRNA localization and translation. Notably, FMRP is thought to inhibit translation elongation after being recruited to target transcripts via binding RNA G-quadruplexes (G4s) within the coding sequence. Here we directly tested this model and report that FMRP inhibits translation elongation independent of mRNA G4s. Furthermore, we found that the RGG box motif together with its natural C-terminal domain forms a non-canonical RNA-binding domain (ncRBD) that binds reporter mRNA and all four polymeric RNA sequences. The ncRBD is essential for FMRP to inhibit translation. Transcripts that are bound by FMRP through the ncRBD co-sediment with heavy polysomes, which is consistent with stalling elongating ribosomes and a subsequent accumulation of slowed polysomes. Together, this work shifts our understanding of how FMRP inhibits translation elongation and supports a model where repression is driven by local FMRP and mRNA concentrations rather than target mRNA sequence.


2022 ◽  
Vol 23 (1) ◽  
Author(s):  
Jörg Winkler ◽  
Gianvito Urgese ◽  
Elisa Ficarra ◽  
Knut Reinert

Abstract Background: The function of non-coding RNA sequences is largely determined by their spatial conformation, namely the secondary structure of the molecule, formed by Watson–Crick interactions between nucleotides. Hence, modern RNA alignment algorithms routinely take structural information into account. In order to discover yet unknown RNA families and infer their possible functions, the structural alignment of RNAs is an essential task. This task demands a lot of computational resources, especially for aligning many long sequences, and it therefore requires efficient algorithms that utilize modern hardware when available. A subset of the secondary structures contains overlapping interactions (called pseudoknots), which add complexity to the problem and are often ignored in available software. Results: We present the SeqAn-based software LaRA 2 that is significantly faster than comparable software for accurate pairwise and multiple alignments of structured RNA sequences. In contrast to other programs, our approach can handle arbitrary pseudoknots. As an improved re-implementation of the LaRA tool for structural alignments, LaRA 2 uses multi-threading and vectorization for parallel execution and a new heuristic for computing a lower bound of the solution. Our algorithmic improvements yield a program that is up to 130 times faster than the previous version. Conclusions: With LaRA 2 we provide a tool to analyse large sets of RNA secondary structures in relatively short time, based on structural alignment. The produced alignments can be used to derive structural motifs for the search in genomic databases.


2022 ◽  
Author(s):  
Perfecto Salvador Ramos ◽  
Oliver Escaño Manangkil

Abstract High concentrations of cadmium and lead are hazardous to the environment. The study isolated and identified potential fungi, bacteria and hyperaccumulating plants as bioremediators in a contaminated rice ecosystem. Fungi were identified morphologically and by internal transcribed spacer (ITS) region sequencing. Bacteria were identified using 16S ribosomal RNA sequences. Plants were analyzed for cadmium and lead accumulation in root and shoot tissues using an atomic absorption spectrophotometer (AAS). Fungal species including Penicillium janthinellum, Trichoderma hamatum, Trichoderma harzianum, and Curvularia lunata, along with bacterial species such as Bacillus cereus, Bacillus thuringiensis, Pseudomonas gessardii, Lysinibacillus xylanilyticus, Lysinibacillus sphaericus, and two species of unidentified bacteria, were identified. Plants predominant in the area include Cyperus difformis, Scirpus juncoides, Fimbristylis miliacea, Centella asiatica, Sphagneticola trilobata, and Monochoria vaginalis. Cadmium was detected in the shoots of S. trilobata (3.2 mg kg−1) and roots of C. asiatica (3.6 mg kg−1). Lead was found in the shoots of C. asiatica (2.8 mg kg−1) and roots of both S. juncoides (15.00 mg kg−1) and F. miliacea (15.00 mg kg−1). Phytoremediation potential of S. juncoides, F. miliacea, C. asiatica and S. trilobata was observed. Heavy-metal-resistant microbes can be harnessed as a very useful biological tool for in-situ bioremediation.


2022 ◽  
Author(s):  
Doaa Hassan Salem ◽  
Aditya Ariyur ◽  
Swapna Vidhur Daulatabad ◽  
Quoseena Mir ◽  
Sarath Chandra Janga

Nm (2′-O-methylation) is one of the most abundant modifications of mRNAs and non-coding RNAs, occurring when a methyl group (–CH3) is added to the 2′ hydroxyl (–OH) of the ribose moiety. This modification can appear on any nucleotide regardless of the type of nitrogenous base, because each ribose sugar has a hydroxyl group and so 2′-O-methyl ribose can occur on any base. Nm modification contributes to many biological processes, such as the normal functioning of tRNA, the protection of mRNA against degradation by DXO, and the biogenesis and specificity of rRNA. Recently, the single-molecule long-read RNA sequencing techniques offered by Oxford Nanopore Technologies have enabled the direct detection of RNA modifications on the molecule that is being sequenced, but to our knowledge there has been only one research attempt that applied this technology to predict the stoichiometry of Nm-modified sites in RNA sequences of yeast cells. In this paper, we extend this research direction by proposing a bio-computational framework, Nm-Nano, for predicting Nm sites in Nanopore direct RNA sequencing reads of human cell lines. The Nm-Nano framework integrates two supervised machine learning models for predicting Nm sites in Nanopore sequencing data, namely XGBoost and Random Forest (RF). Each model is trained with a set of features that are extracted from the raw signal generated by the Oxford Nanopore MinION device, as well as the corresponding basecalled k-mer resulting from inferring the RNA sequence reads from the generated Nanopore signals. The results on two benchmark data sets generated from RNA Nanopore sequencing data of HeLa and HEK293 cell lines show that Nm-Nano performs well.
In independent validation testing, Nm-Nano identified Nm sites with an accuracy of 93% and 88% using the XGBoost and RF models, respectively, when each model was trained on the HeLa benchmark dataset and tested on the HEK293 benchmark dataset. Thus, Nm-Nano outperforms the Nm site predictors existing in the literature (not relying on Nanopore technology), which are limited to predicting Nm sites on short reads of RNA sequences and unable to predict Nm sites on long RNA sequence reads. Deploying Nm-Nano to predict Nm sites in the HeLa cell line revealed a total of 196 genes with the highest abundance of Nm modification among all genes modified by Nm in this cell line. Similarly, deploying Nm-Nano to predict Nm sites in the HEK293 cell line revealed a total of 196 genes with the highest abundance of Nm modification among all genes modified by Nm in this cell line. Accordingly, a significant enrichment of a wide range of functional processes was revealed: in the HeLa cell line, the high-confidence (adjusted p-value < 0.05) enriched ontologies were more representative of the role of Nm modification in immune response and cellular homeostasis, while in the HEK293 cell line they included "MHC class 1 protein complex", "mitotic spindle assembly", "response to glucocorticoid", and "nucleocytoplasmic transport". The source code of Nm-Nano can be freely accessed at https://github.com/Janga-Lab/Nm-Nano.


2021 ◽  
Vol 19 (4) ◽  
pp. e49
Author(s):  
Anas Oujja ◽  
Mohamed Riduan Abid ◽  
Jaouad Boumhidi ◽  
Safae Bourhnane ◽  
Asmaa Mourhir ◽  
...  

Genomic data constitutes one of the fastest-growing datasets in the world. By 2025, it is expected to become the fourth largest source of Big Data, thus mandating an adequate high-performance computing (HPC) platform for processing. With the latest unprecedented and unpredictable mutations in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the research community is in crucial need of ICT tools to process SARS-CoV-2 RNA data, e.g., by clustering it, and thus to assist in tracking virus mutations and predicting future ones. In this paper, we present an HPC-based SARS-CoV-2 RNA clustering tool. We adopt a data science approach, from data collection, through analysis, to visualization. In the analysis step, we present how our clustering approach leverages HPC and the longest common subsequence (LCS) algorithm. The approach uses the Hadoop MapReduce programming paradigm and adapts the LCS algorithm in order to efficiently compute the length of the LCS for each pair of SARS-CoV-2 RNA sequences. The sequences are extracted from the U.S. National Center for Biotechnology Information (NCBI) Virus repository. The computed LCS lengths are used to measure the dissimilarities between RNA sequences in order to work out existing clusters. In addition, we present a comparative study of the LCS algorithm performance based on variable workloads and different numbers of Hadoop worker nodes.
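
As an illustration of the pairwise step described in this abstract, the following minimal Python sketch computes the LCS length of two sequences with the standard dynamic-programming recurrence and turns it into a dissimilarity score. It is not the authors' Hadoop MapReduce implementation: the function names `lcs_length` and `lcs_dissimilarity` are hypothetical, and the normalisation by the longer sequence length is an assumption, since the abstract does not say how LCS lengths are converted into dissimilarities.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b.

    Standard dynamic programming, keeping only two rows so memory
    is proportional to the shorter sequence.
    """
    if len(a) < len(b):
        a, b = b, a  # iterate rows over the longer sequence
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_dissimilarity(a: str, b: str) -> float:
    """Hypothetical dissimilarity: 1 - LCS length / longer sequence length.

    This normalisation is an assumption made for illustration only.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # two empty sequences are treated as identical
    return 1.0 - lcs_length(a, b) / longest


if __name__ == "__main__":
    # Toy sequences, not real SARS-CoV-2 data.
    print(lcs_length("ACGUACGU", "ACGACGU"))
    print(lcs_dissimilarity("ACGUACGU", "ACGACGU"))
```

A matrix of such pairwise scores could then be passed to any standard clustering routine; which clustering method the paper uses is not stated in the abstract.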


2021 ◽  
Author(s):  
Guennadi Kouzaev

In this message, the complete RNA sequences (GISAID) of Omicron (BA.1 and BA.2) SARS-CoV-2 viruses are studied using genomic ATG-walks. These walks are compared visually and numerically with a reference RNA (Wuhan, China, 2020), and the deviation levels are estimated. Statistical characteristics of these distributions are compared, including the fractal dimension values of coding-word length distributions. Most of the 17 RNA ATG-walks studied here show relatively small deviations of their characteristics and resistance to forming a new virus family.

