R-chie: a web server and R package for visualizing cis and trans RNA–RNA, RNA–DNA and DNA–DNA interactions

2020 ◽  
Vol 48 (18) ◽  
pp. e105-e105 ◽  
Author(s):  
Volodymyr Tsybulskyi ◽  
Mohamed Mounir ◽  
Irmtraud M Meyer

Abstract Interactions between biological entities are key to understanding their potential functional roles. Three fields of research have recently made particular progress: the investigation of trans RNA–RNA and RNA–DNA transcriptome interactions and of trans DNA–DNA genome interactions. We now have both experimental and computational methods for examining these interactions in vivo and on a transcriptome- and genome-wide scale, respectively. Often, key insights can be gained by visually inspecting figures that manage to combine different sources of evidence and quantitative information. We here present R-chie, a web server and R package for visualizing cis and trans RNA–RNA, RNA–DNA and DNA–DNA interactions. For this, we have completely revised and significantly extended an earlier version of R-chie (1) which was initially introduced for visualizing RNA secondary structure features. The new R-chie offers a range of unique features for visualizing cis and trans RNA–RNA, RNA–DNA and DNA–DNA interactions. Particularly noteworthy features include the ability to incorporate evolutionary information, e.g. multiple-sequence alignments, to compare two alternative sets of information and to incorporate detailed, quantitative information. R-chie is readily available via a web server as well as a corresponding R package called R4RNA which can be used to run the software locally.

2020 ◽  
Author(s):  
Adrián López Martín ◽  
Mohamed Mounir ◽  
Irmtraud M Meyer

Abstract RNA structure formation in vivo happens co-transcriptionally while the transcript is being made. The corresponding co-transcriptional folding pathway typically involves transient RNA structure features that are not part of the final, functional RNA structure. These transient features can play important functional roles of their own and also influence the formation of the final RNA structure in vivo. We here present CoBold, a computational method for identifying different functional classes of transient RNA structure features that can either aid or hinder the formation of a known reference RNA structure. Our method takes as input either a single RNA or a corresponding multiple-sequence alignment as well as a known reference RNA secondary structure and identifies different classes of transient RNA structure features that could aid or prevent the formation of the given RNA structure. We make CoBold available via a web-server which includes dedicated data visualisation.


2007 ◽  
Vol 35 (Web Server) ◽  
pp. W645-W648 ◽  
Author(s):  
S. Moretti ◽  
F. Armougom ◽  
I. M. Wallace ◽  
D. G. Higgins ◽  
C. V. Jongeneel ◽  
...  

2020 ◽  
Author(s):  
Aashish Jain ◽  
Genki Terashi ◽  
Yuki Kagaya ◽  
Sai Raghavendra Maddhuri Venkata Subramaniya ◽  
Charles Christoffer ◽  
...  

ABSTRACT Protein 3D structure prediction has advanced significantly in recent years due to improving contact prediction accuracy. This improvement has been largely due to deep learning approaches that predict inter-residue contacts and, more recently, distances using multiple sequence alignments (MSAs). In this work we present AttentiveDist, a novel approach that uses different MSAs generated with different E-values in a single model to increase the co-evolutionary information provided to the model. To determine the importance of each MSA’s feature at the inter-residue level, we added an attention layer to the deep neural network. The model is trained in a multi-task fashion to also predict backbone and orientation angles further improving the inter-residue distance prediction. We show that AttentiveDist outperforms the top methods for contact prediction in the CASP13 structure prediction competition. To aid in structure modeling we also developed two new deep learning-based sidechain center distance and peptide-bond nitrogen-oxygen distance prediction models. Together these led to a 12% increase in TM-score from the best server method in CASP13 for structure prediction.


2016 ◽  
Vol 44 (W1) ◽  
pp. W339-W343 ◽  
Author(s):  
Evan W. Floden ◽  
Paolo D. Tommaso ◽  
Maria Chatzou ◽  
Cedrik Magis ◽  
Cedric Notredame ◽  
...  

2021 ◽  
Author(s):  
Konstantin Weissenow ◽  
Michael Heinzinger ◽  
Burkhard Rost

All state-of-the-art (SOTA) protein structure predictions rely on evolutionary information captured in multiple sequence alignments (MSAs), primarily on evolutionary couplings (co-evolution). Such information is not available for all proteins and is computationally expensive to generate. Prediction models based on Artificial Intelligence (AI) using only single sequences as input are easier and cheaper but perform so poorly that speed becomes irrelevant. Here, we described the first competitive AI solution exclusively inputting embeddings extracted from pre-trained protein Language Models (pLMs), namely from the transformer pLM ProtT5, from single sequences into a relatively shallow (few free parameters) convolutional neural network (CNN) trained on inter-residue distances, i.e. protein structure in 2D. The major advance originated from processing the attention heads learned by ProtT5. Although these models required at no point any MSA, they matched the performance of methods relying on co-evolution. Although not reaching the very top, our lean approach came close at substantially lower costs thereby speeding up development and each future prediction. By generating protein-specific rather than family-averaged predictions, these new solutions could distinguish between structural features differentiating members of the same family of proteins with similar structure predicted alike by all other top methods.


2017 ◽  
Author(s):  
Diego Javier Zea ◽  
Alexander Miguel Monzon ◽  
Gustavo Parisi ◽  
Cristina Marino-Buslje

Abstract Conservation and covariation measures, as other evolutionary analysis, require a high number of distant homologous sequences, therefore a lot of structural divergence can be expected in such divergent alignments. However, most works linking evolutionary and structural information use a single structure ignoring the structural variability inside a protein family. That common practice seems unrealistic to the light of this work. In this work we studied how structural divergence affects conservation and covariation estimations. We uncover that, within a protein family, ~51% of multiple sequence alignment columns change their exposed/buried status between structures. Also, ~53% of residue pairs that are in contact in one structure are not in contact in another structure from the same family. We found out that residue conservation is not directly related to the relative solvent accessible surface area of a single protein structure. Using information from all the available structures rather than from a single representative structure gives more confidence in the structural interpretation of the evolutionary signals. That is particularly important for diverse multiple sequence alignments, where structures can drastically differ. High covariation scores tend to indicate residue contacts that are conserved in the family, therefore, are not suitable to find protein/conformer specific contacts. Our results suggest that structural divergence should be considered for a better understanding of protein function, to transfer annotation by homology and to model protein evolution.


1999 ◽  
Vol 181 (22) ◽  
pp. 7070-7079 ◽  
Author(s):  
William A. Fonzi

ABSTRACT PHR1 and PHR2 encode putative glycosylphosphatidylinositol-anchored cell surface proteins of the opportunistic fungal pathogen Candida albicans. These proteins are functionally related, and their expression is modulated in relation to the pH of the ambient environment in vitro and in vivo. Deletion of either gene results in a pH-conditional defect in cell morphology and virulence. Multiple sequence alignments demonstrated a distant relationship between the Phr proteins and β-galactosidases. Based on this alignment, site-directed mutagenesis of the putative active-site residues of Phr1p and Phr2p was conducted and two conserved glutamate residues were shown to be essential for activity. By taking advantage of the pH-conditional expression of the genes, a temporal analysis of cell wall changes was performed following a shift of the mutants from permissive to nonpermissive pH. The mutations did not grossly affect the amount of polysaccharides in the wall but did alter their distribution. The most immediate alteration to occur was a fivefold increase in the rate of cross-linking between β-1,6-glycosylated mannoproteins and chitin. This increase was followed shortly thereafter by a decline in β-1,3-glucan-associated β-1,6-glucans and, within several generations, a fivefold increase in the chitin content of the walls. The increased accumulation of chitin-linked glucans was not due to a block in subsequent processing as determined by pulse-chase analysis. Rather, the results suggest that the glucans are diverted to chitin linkage due to the inability of the mutants to establish cross-links between β-1,6- and β-1,3-glucans. Based on these and previously published results, it is suggested that the Phr proteins process β-1,3-glucans and make available acceptor sites for the attachment of β-1,6-glucans.


2013 ◽  
Vol 30 (4) ◽  
pp. 472-479 ◽  
Author(s):  
Bin Liu ◽  
Deyuan Zhang ◽  
Ruifeng Xu ◽  
Jinghao Xu ◽  
Xiaolong Wang ◽  
...  

Abstract Motivation: Owing to its importance in both basic research (such as molecular evolution and protein attribute prediction) and practical application (such as timely modeling the 3D structures of proteins targeted for drug development), protein remote homology detection has attracted a great deal of interest. It is intriguing to note that the profile-based approach is promising and holds high potential in this regard. To further improve protein remote homology detection, a key step is how to find an optimal means to extract the evolutionary information into the profiles. Results: Here, we propose a novel approach, the so-called profile-based protein representation, to extract the evolutionary information via the frequency profiles. The latter can be calculated from the multiple sequence alignments generated by PSI-BLAST. Three top performing sequence-based kernels (SVM-Ngram, SVM-pairwise and SVM-LA) were combined with the profile-based protein representation. Various tests were conducted on a SCOP benchmark dataset that contains 54 families and 23 superfamilies. The results showed that the new approach is promising, and can obviously improve the performance of the three kernels. Furthermore, our approach can also provide useful insights for studying the features of proteins in various families. It has not escaped our notice that the current approach can be easily combined with the existing sequence-based methods so as to improve their performance as well. Availability and implementation: For users’ convenience, the source code of generating the profile-based proteins and the multiple kernel learning was also provided at http://bioinformatics.hitsz.edu.cn/main/~binliu/remote/ Contact: [email protected] or [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Jung-Eun Shin ◽  
Adam J. Riesselman ◽  
Aaron W. Kollasch ◽  
Conor McMahon ◽  
Elana Simon ◽  
...  

Abstract The ability to design functional sequences and predict effects of variation is central to protein engineering and biotherapeutics. State-of-art computational methods rely on models that leverage evolutionary information but are inadequate for important applications where multiple sequence alignments are not robust. Such applications include the prediction of variant effects of indels, disordered proteins, and the design of proteins such as antibodies due to the highly variable complementarity determining regions. We introduce a deep generative model adapted from natural language processing for prediction and design of diverse functional sequences without the need for alignments. The model performs state-of-art prediction of missense and indel effects and we successfully design and test a diverse 10^5-nanobody library that shows better expression than a 1000-fold larger synthetic library. Our results demonstrate the power of the alignment-free autoregressive model in generalizing to regions of sequence space traditionally considered beyond the reach of prediction and design.

