Structural relation matching: an algorithm to identify structural patterns into RNAs and their interactions

2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Michela Quadrini

Abstract RNA molecules play crucial roles in various biological processes. Their three-dimensional configurations determine their functions and, in turn, influence their interactions with other molecules. RNAs and their interaction structures, the so-called RNA–RNA interactions, can be abstracted in terms of secondary structures, i.e., lists of the nucleotide bases paired by hydrogen bonding within the nucleotide sequence. Each secondary structure, in turn, can be abstracted into cores and shadows, both of which are obtained by suitably collapsing nucleotides and arcs. We formalize all of these abstractions as arc diagrams, whose arcs determine loops. A secondary structure is pseudoknot-free if its arc diagram contains no crossing arcs; otherwise, it is said to be pseudoknotted. In this study, we address the problem of identifying a given structural pattern in secondary structures, or in the associated cores or shadows, of both RNAs and RNA–RNA interactions characterized by arbitrary pseudoknots. These abstractions are mapped into a matrix whose elements represent the relations among loops. We therefore tackle the problem in terms of matrices and submatrices. The algorithms, implemented in Python, work in polynomial time. We test our approach on a set of 16S ribosomal RNAs with inhibitors of Thermus thermophilus, and we quantify the structural effect of the inhibitors.
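
The pseudoknot-free condition described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `arcs_cross` and `is_pseudoknot_free` are hypothetical, and it assumes a secondary structure is given as a list of base pairs `(i, j)` of sequence positions, with each position appearing in at most one pair.

```python
from itertools import combinations


def arcs_cross(a, b):
    """Return True if arcs a and b cross in the arc diagram.

    Each arc is a pair of sequence positions. After ordering, the
    arcs (i, j) and (k, l) with i <= k cross exactly when i < k < j < l.
    """
    (i, j), (k, l) = sorted([tuple(sorted(a)), tuple(sorted(b))])
    return i < k < j < l


def is_pseudoknot_free(arcs):
    """Return True if no two arcs cross (hypothetical helper).

    This checks every pair of arcs, so it runs in time quadratic
    in the number of arcs.
    """
    return not any(arcs_cross(a, b) for a, b in combinations(arcs, 2))
```

Under these assumptions, the nested structure `[(1, 5), (2, 4)]` is pseudoknot-free, whereas `[(1, 3), (2, 4)]` contains a crossing and is therefore pseudoknotted.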

Biomolecules ◽  
2021 ◽  
Vol 11 (12) ◽  
pp. 1773
Author(s):  
Bahareh Behkamal ◽  
Mahmoud Naghibzadeh ◽  
Mohammad Reza Saberi ◽  
Zeinab Amiri Tehranizadeh ◽  
Andrea Pagnani ◽  
...  

Cryo-electron microscopy (cryo-EM) is a structural technique that has played a significant role in protein structure determination in recent years. Compared to the traditional methods of X-ray crystallography and NMR spectroscopy, cryo-EM is capable of producing images of much larger protein complexes. However, cryo-EM reconstructions are limited to medium-resolution (~4–10 Å) for some cases. At this resolution range, a cryo-EM density map can hardly be used to directly determine the structure of proteins at atomic level resolutions, or even at their amino acid residue backbones. At such a resolution, only the position and orientation of secondary structure elements (SSEs) such as α-helices and β-sheets are observable. Consequently, finding the mapping of the secondary structures of the modeled structure (SSEs-A) to the cryo-EM map (SSEs-C) is one of the primary concerns in cryo-EM modeling. To address this issue, this study proposes a novel automatic computational method to identify SSEs correspondence in three-dimensional (3D) space. Initially, through a modeling of the target sequence with the aid of extracting highly reliable features from a generated 3D model and map, the SSEs matching problem is formulated as a 3D vector matching problem. Afterward, the 3D vector matching problem is transformed into a 3D graph matching problem. Finally, a similarity-based voting algorithm combined with the principle of least conflict (PLC) concept is developed to obtain the SSEs correspondence. To evaluate the accuracy of the method, a testing set of 25 experimental and simulated maps with a maximum of 65 SSEs is selected. Comparative studies are also conducted to demonstrate the superiority of the proposed method over some state-of-the-art techniques. The results demonstrate that the method is efficient, robust, and works well in the presence of errors in the predicted secondary structures of the cryo-EM images.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Guangyao Zhou ◽  
Jackson Loper ◽  
Stuart Geman

Abstract Background: A folding RNA molecule encounters multiple opportunities to form non-native yet energetically favorable pairings of nucleotide sequences. Given this forbidding free-energy landscape, mechanisms have evolved that contribute to a directed and efficient folding process, including catalytic proteins and error-detecting chaperones. Among structural RNA molecules we make a distinction between “bound” molecules, which are active as part of ribonucleoprotein (RNP) complexes, and “unbound,” with physiological functions performed without necessarily being bound in RNP complexes. We hypothesized that unbound molecules, lacking the partnering structure of a protein, would be more vulnerable than bound molecules to kinetic traps that compete with native stem structures. We defined an “ambiguity index”, a normalized function of the primary and secondary structure of an individual molecule that measures the number of kinetic traps available to nucleotide sequences that are paired in the native structure, presuming that unbound molecules would have lower indexes. The ambiguity index depends on the purported secondary structure, and was computed under both the comparative (“gold standard”) and an equilibrium-based prediction which approximates the minimum free energy (MFE) structure. Arguing that kinetically accessible metastable structures might be more biologically relevant than thermodynamic equilibrium structures, we also hypothesized that MFE-derived ambiguities would be less effective in separating bound and unbound molecules. Results: We have introduced an intuitive and easily computed function of primary and secondary structures that measures the availability of complementary sequences that could disrupt the formation of native stems on a given molecule: an ambiguity index. Using comparative secondary structures, the ambiguity index is systematically smaller among unbound than bound molecules, as expected. Furthermore, the effect is lost when the presumably more accurate comparative structure is replaced instead by the MFE structure. Conclusions: A statistical analysis of the relationship between the primary and secondary structures of non-coding RNA molecules suggests that stem-disrupting kinetic traps are substantially less prevalent in molecules not participating in RNP complexes. In that this distinction is apparent under the comparative but not the MFE secondary structure, the results highlight a possible deficiency in structure predictions when based upon assumptions of thermodynamic equilibrium.


2020 ◽  
Vol 8 (1) ◽  
pp. 78-83
Author(s):  
P. Agalya ◽  
V. Velusamy

α-Helices, β-sheets, β-turns, and random coils are the local three-dimensional segments that constitute a protein's secondary structure. Molecular vibrations of proteins are sensitive to the structural organization of peptide chains; hence, Fourier transform infrared (FTIR) spectroscopy is one of the recognized techniques for identifying protein secondary structures. However, the lower-frequency region of FTIR, especially the amide VI bands (in the region 590–490 cm⁻¹), is little studied for proteins. Further, to our knowledge, the effect of sugar-free Natura on ovalbumin stability has not yet been studied. The present study examines the conformational changes in the secondary structure of ovalbumin (OVA) protein under the influence of pH variations (2, 5, 7, 9, and 12) and of the inclusion of the cosolvent sugar-free Natura (SFN). Starting from the primary absorption spectra of the amide VI bands, second-derivative analysis is used to quantify the secondary structural elements of the protein, and the conformational changes are analyzed on that basis. The results show that conformational changes occur between the two major secondary structures of OVA, α-helix and β-sheet, owing to the variation of pH and the inclusion of the cosolvent. The results also confirm the denaturation of OVA in the presence of SFN, irrespective of pH.


Author(s):  
Radka Novotná ◽  
Zdeněk Trávníček

The asymmetric unit of the title compound, C6H5N3O, consists of discrete molecules of 9-deazahypoxanthine [systematic name: 3H-pyrrolo[3,2-d]pyrimidin-4(5H)-one]. The structure displays N—H...O hydrogen bonding, connecting the molecules into centrosymmetric dimers. These dimers are then connected by N—H...N hydrogen bonds into a ladder-like chain along the c axis. The secondary structure is stabilized by weak noncovalent contacts of the C—H...O and C—H...C types, as well as by π–π stacking interactions, which organize the structure into a zigzag architecture.


2020 ◽  
Vol 49 (D1) ◽  
pp. D183-D191
Author(s):  
Pan Li ◽  
Xiaolin Zhou ◽  
Kui Xu ◽  
Qiangfeng Cliff Zhang

Abstract RNA molecules fold into complex structures that are important across many biological processes. Recent technological developments have enabled transcriptome-wide probing of RNA secondary structure using nucleases and chemical modifiers. These approaches have been widely applied to capture RNA secondary structure in many studies, but gathering and presenting such data from very different technologies in a comprehensive and accessible way has been challenging. Existing RNA structure probing databases usually focus on low-throughput or very specific datasets. Here, we present a comprehensive RNA structure probing database called RASP (RNA Atlas of Structure Probing), built by collecting 161 deduplicated transcriptome-wide RNA secondary structure probing datasets from 38 papers. RASP covers 18 species across animals, plants, bacteria, fungi, and viruses, and categorizes 18 experimental methods, including DMS-seq, SHAPE-Seq, SHAPE-MaP, and icSHAPE. In particular, RASP curates up-to-date datasets from several RNA secondary structure probing studies of the RNA genome of SARS-CoV-2, the RNA virus that caused the ongoing COVID-19 pandemic. RASP also provides a user-friendly interface to query, browse, and visualize RNA structure profiles, offering a shortcut to accessing RNA secondary structures grounded in experimental data. The database is freely available at http://rasp.zhanglab.net.


1999 ◽  
Vol 6 (15) ◽  
Author(s):  
Rune B. Lyngsø ◽  
Michael Zuker ◽  
Christian N. S. Pedersen

Though not as abundant in known biological processes as proteins, RNA molecules serve as more than mere intermediaries between DNA and proteins, e.g. as catalytic molecules. Furthermore, RNA secondary structure prediction based on free energy rules for stacking and loop formation remains one of the few major breakthroughs in the field of structure prediction. We present a new method to evaluate all possible internal loops of size at most k in an RNA sequence, s, in time O(k|s|^2); this is an improvement from the previously used method that uses time O(k^2|s|^2). For unlimited loop size this improves the overall complexity of evaluating RNA secondary structures from O(|s|^4) to O(|s|^3) and the method applies equally well to finding the optimal structure and calculating the equilibrium partition function. We use our method to examine the soundness of setting k = 30, a commonly used heuristic.
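
To see where the O(k^2|s|^2) bound mentioned in this abstract comes from, the following Python sketch simply counts the candidate loops that a direct enumeration would visit. It is an illustration only, not the authors' method and not the improved O(k|s|^2) algorithm: the function name `count_bounded_loops` is hypothetical, positions are 0-based, sequence content and pairing rules are ignored, and a "loop" here is any closing pair (i, j) with an interior pair (i2, j2) whose total number of unpaired positions is at most k (so stacked pairs and bulges are counted as well).

```python
def count_bounded_loops(n, k):
    """Count tuples (i, j, i2, j2) with i < i2 < j2 < j and size <= k.

    The size is the number of unpaired positions between the two pairs:
    (i2 - i - 1) on the left plus (j - j2 - 1) on the right.
    For each of the O(n^2) closing pairs (i, j) there are O(k^2)
    choices of (left, right), which gives O(k^2 n^2) work overall.
    """
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            for left in range(k + 1):
                i2 = i + 1 + left
                for right in range(k - left + 1):
                    j2 = j - 1 - right
                    # Keep only interior pairs that lie inside (i, j).
                    if i2 < j2:
                        count += 1
    return count
```

Evaluating an energy term at each visited tuple, as a direct method would, costs time proportional to this count, which is the quadratic dependence on k that the abstract reports improving to a linear one.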


Molecules ◽  
2021 ◽  
Vol 26 (22) ◽  
pp. 7049
Author(s):  
Maytha Alshammari ◽  
Jing He

Although atomic structures have been determined directly from cryo-EM density maps with high resolutions, current structure determination methods for medium resolution (5 to 10 Å) cryo-EM maps are limited by the availability of structure templates. Secondary structure traces are lines detected from a cryo-EM density map for α-helices and β-strands of a protein. A topology of secondary structures defines the mapping between a set of sequence segments and a set of traces of secondary structures in three-dimensional space. In order to enhance accuracy in ranking secondary structure topologies, we explored a method that combines three sources of information: a set of sequence segments in 1D, a set of amino acid contact pairs in 2D, and a set of traces in 3D at the secondary structure level. A test of fourteen cases shows that the accuracy of predicted secondary structures is critical for deriving topologies. The use of significant long-range contact pairs is most effective at enriching the rank of the maximum-match topology for proteins with a large number of secondary structures, if the secondary structure prediction is fairly accurate. It was observed that the enrichment depends on the quality of initial topology candidates in this approach. We provide detailed analysis in various cases to show the potential and challenge when combining three sources of information.


Author(s):  
Thomas K. F. Wong ◽  
S. M. Yiu

Non-coding RNAs (ncRNAs) are critical for many biological processes. However, identifying these molecules is challenging because they lack strong detectable signals such as open reading frames. Most computational approaches rely on the observation that the secondary structures of ncRNA molecules are conserved within the same family. Aligning a known ncRNA to a target candidate to determine the sequence and structural similarity helps in identifying de novo ncRNA molecules that are in the same family as the known ncRNA. However, the problem becomes more difficult if the secondary structure contains pseudoknots. Until recently, many of the existing approaches could not handle structures with pseudoknots. This chapter reviews the state-of-the-art algorithms for different types of structures that contain pseudoknots, including standard pseudoknot, simple non-standard pseudoknot, recursive standard pseudoknot, and recursive simple non-standard pseudoknot. Although none of the algorithms is designed for general pseudoknots, these algorithms already cover all known ncRNAs in both the Rfam and PseudoBase databases. The evaluation of the algorithms also shows that the approach is useful in identifying ncRNA molecules in other species that are in the same family as a known ncRNA.

