linc2function: A deep learning model to identify and assign function to long noncoding RNA (lncRNA)

2021 ◽  
Author(s):  
Yashpal Ramakrishnaiah ◽  
Levin Kuhlmann ◽  
Sonika Tyagi

Motivation: LncRNAs are far more versatile and are involved in many more regulatory roles inside the cell than previously believed. Existing databases lack consistency in lncRNA annotations, and the functionality of over 95% of the known lncRNAs is yet to be established. LncRNA transcript identification involves discriminating them from their coding counterparts, which can be done with traditional experimental approaches or via in silico methods. The latter approach employs various computational algorithms, including machine learning classifiers, to predict the lncRNA-forming potential of a given transcript. Such approaches provide an economical and faster alternative to the experimental methods. Current in silico methods mainly use primary-sequence-based features to build predictive models, limiting their accuracy and robustness. Moreover, many of these tools make use of reference-genome-based features, making them unsuitable for non-model species. Hence, there is a need to comprehensively evaluate the efficacy of different predictive features to build computational models. Additionally, effective models will have to provide maximum prediction performance using the least number of features in a species-agnostic manner. It is popularly known in the protein world that “structure is function”. This also applies to lncRNAs, as their functional mechanisms are similar to those of proteins. Generally, lncRNAs function by structurally binding to their target proteins or nucleic acids, forming complexes. The secondary structures of the lncRNAs are modular, providing interaction sites for their interactome made of DNA, RNA, and proteins. Through these interactions, they epigenetically regulate cellular biology, thereby forming a layer of genomic programming on top of the coding genes. We demonstrate that, in addition to using transcript sequence, we can provide comprehensive functional annotation by collating their interactome and secondary structure information. Results: Here, we evaluated an exhaustive list of sequence-based, secondary-structure, interactome, and physicochemical features for their ability to predict the lncRNA potential of a transcript. Based on our analysis, we built different machine learning models using the optimum feature set. We found our model to be on par with or exceeding the performance of the state-of-the-art methods, with AUC values of over 0.9 for a diverse collection of species tested. Finally, we built a pipeline called linc2function that provides the information necessary to functionally annotate a lncRNA conveniently in a single window. Availability: The source code is available for use under the MIT license in standalone mode and as a webserver (https://bioinformaticslab.erc.monash.edu/linc2function).
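As a rough illustration of the sequence-based features and the AUC metric mentioned in this abstract, the sketch below computes GC content and k-mer frequencies for a transcript and evaluates a set of classifier scores with a pairwise-ranking AUC. It is not taken from linc2function: the function names `gc_content`, `kmer_frequencies`, and `auc` are hypothetical, and the feature choice is only an assumption about what "sequence-based" might include.

```python
from itertools import product


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def kmer_frequencies(seq, k=2):
    """Relative frequency of every RNA k-mer, counted over overlapping windows."""
    seq = seq.upper().replace("T", "U")  # treat DNA input as RNA
    counts = {"".join(p): 0 for p in product("ACGU", repeat=k)}
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return {kmer: (c / total if total else 0.0) for kmer, c in counts.items()}


def auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

In practice the feature dictionary would be flattened into a vector and passed to whichever classifier is being evaluated; the quadratic AUC loop is only meant for small illustrative inputs.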

2020 ◽  
Vol 27 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Camila Rizzotto ◽  
Walter Filgueira de Azevedo Junior

Background: Analysis of atomic coordinates of protein-ligand complexes can provide three-dimensional data to generate computational models to evaluate binding affinity and thermodynamic state functions. Application of machine learning techniques can create models to assess protein-ligand potential energy and binding affinity. These methods show superior predictive performance when compared with classical scoring functions available in docking programs. Objective: Our purpose here is to review the development and application of the program SAnDReS. We describe the creation of machine learning models to assess the binding affinity of protein-ligand complexes. Method: SAnDReS implements machine learning methods available in the scikit-learn library. This program is available for download at https://github.com/azevedolab/sandres. SAnDReS uses crystallographic structures, binding, and thermodynamic data to create targeted scoring functions. Results: Recent applications of the program SAnDReS to drug targets such as coagulation factor Xa, cyclin-dependent kinases, and HIV-1 protease were able to create targeted scoring functions to predict inhibition of these proteins. These targeted models outperform classical scoring functions. Conclusion: Here, we reviewed the development of machine learning scoring functions to predict binding affinity through the application of the program SAnDReS. Our studies show the superior predictive performance of the SAnDReS-developed models when compared with classical scoring functions available in programs such as AutoDock4, Molegro Virtual Docker, and AutoDock Vina.


2014 ◽  
Vol 14 (16) ◽  
pp. 1913-1922 ◽  
Author(s):  
Dimitar Dobchev ◽  
Girinath Pillai ◽  
Mati Karelson

2019 ◽  
Vol 19 (5) ◽  
pp. 319-336 ◽  
Author(s):  
Alexander V. Dmitriev ◽  
Alexey A. Lagunin ◽  
Dmitry A. Karasev ◽  
Anastasia V. Rudik ◽  
Pavel V. Pogodin ◽  
...  

Drug-drug interaction (DDI) is the alteration of the pharmacological activity of one or more drugs when another drug or drugs are co-administered, as happens in so-called polypharmacy. There are three types of DDIs: pharmacokinetic (PK), pharmacodynamic, and pharmaceutical. PK is the most frequent type of DDI, which often appears as a result of the inhibition or induction of drug-metabolising enzymes (DME). In this review, we summarise in silico methods that may be applied for the prediction of the inhibition or induction of DMEs and describe appropriate computational methods for DDI prediction, showing the current situation and perspectives of these approaches in medicinal and pharmaceutical chemistry. We review sources of information on DDI, which can be used in pharmaceutical investigations and medicinal practice and/or for the creation of computational models. The problem of the inaccuracy and redundancy of these data is discussed. We provide information on the state-of-the-art physiologically-based pharmacokinetic modelling (PBPK) approaches and DME-based in silico methods. In the section on ligand-based methods, we describe pharmacophore models, molecular field analysis, quantitative structure-activity relationships (QSAR), and similarity analysis applied to the prediction of DDI related to the inhibition or induction of DME. In conclusion, we discuss the problems of DDI severity assessment, mention factors that influence severity, and highlight the issues, perspectives and practical use of in silico methods.
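The similarity analysis named among the ligand-based methods can be illustrated with the Tanimoto coefficient, a widely used measure of overlap between molecular fingerprints. The sketch below is an assumption about what such an analysis might look like, not the review's own procedure: fingerprints are modelled as plain sets of "on" bit indices, and the names `tanimoto` and `most_similar` as well as the example library are hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)


def most_similar(query, library):
    """Return (name, score) of the library entry closest to the query fingerprint."""
    name, fingerprint = max(library.items(), key=lambda item: tanimoto(query, item[1]))
    return name, tanimoto(query, fingerprint)


# Hypothetical fingerprints for known enzyme inhibitors (illustrative values only).
known_inhibitors = {
    "compound_a": {1, 2, 3, 8},
    "compound_b": {2, 3, 4},
    "compound_c": {10, 11},
}
best_name, best_score = most_similar({2, 3, 4, 5}, known_inhibitors)
```

A query that is highly similar to a known inhibitor of a drug-metabolising enzyme would then be flagged as a candidate for the same interaction, subject to whatever threshold the analyst chooses.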


2021 ◽  
Vol 15 ◽  
Author(s):  
Lichao Zhang ◽  
Zihong Huang ◽  
Liang Kong

Background: RNA-binding proteins establish posttranscriptional gene regulation by coordinating the maturation, editing, transport, stability, and translation of cellular RNAs. Immunoprecipitation experiments can identify interactions between RNA and proteins, but they are limited by the experimental environment and material. Therefore, it is essential to construct computational models to identify the functional sites. Objective: Although some computational methods have been proposed to predict RNA binding sites, the accuracy could be further improved. Moreover, it is necessary to construct a dataset with more samples to design a reliable model. Here we present a computational model based on multi-information sources to identify RNA binding sites. Method: We construct an accurate computational model named CSBPI_Site, based on extreme gradient boosting. The specifically designed 15-dimensional feature vector captures four types of information (chemical shift, chemical bond, chemical properties and position information). Results: A satisfactory accuracy of 0.86 and an AUC of 0.89 were obtained by leave-one-out cross validation. Meanwhile, the accuracies differed only slightly (ranging from 0.83 to 0.85) among three classifier algorithms, which showed that the novel features are stable and suit multiple classifiers. These results showed that the proposed method is effective and robust for noncoding RNA binding site identification. Conclusion: Our method based on multi-information sources is effective in representing the binding site information among ncRNAs. The satisfactory prediction results for the Diels-Alder ribozyme based on CSBPI_Site indicate that our model is valuable for identifying the functional site.
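The leave-one-out cross validation mentioned in the results can be sketched as follows: each sample is held out once, a model is trained on the rest, and accuracy is the fraction of held-out samples predicted correctly. To keep the example self-contained, a simple nearest-centroid classifier stands in for the extreme gradient boosting model, so this is explicitly not CSBPI_Site; all function names and the toy data are hypothetical.

```python
def fit_centroids(X, y):
    """Compute the mean feature vector of each class (stand-in for a real classifier)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict(centroids, row):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(row, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def leave_one_out_accuracy(X, y):
    """Hold out each sample once, train on the remainder, and score the held-out sample."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        model = fit_centroids(train_X, train_y)
        if predict(model, X[i]) == y[i]:
            correct += 1
    return correct / len(X)


# Toy two-feature data: label 1 marks a binding site, label 0 a non-binding site.
toy_X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
toy_y = [0, 0, 0, 1, 1, 1]
toy_accuracy = leave_one_out_accuracy(toy_X, toy_y)
```

Leave-one-out uses every sample for testing exactly once, which suits small datasets but requires one training run per sample.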


Author(s):  
William B. Rouse

This book discusses the use of models and interactive visualizations to explore designs of systems and policies in determining whether such designs would be effective. Executives and senior managers are very interested in what “data analytics” can do for them and, quite recently, what the prospects are for artificial intelligence and machine learning. They want to understand and then invest wisely. They are reasonably skeptical, having experienced overselling and under-delivery. They ask about reasonable and realistic expectations. Their concern is with the futurity of decisions they are currently entertaining. They cannot fully address this concern empirically. Thus, they need some way to make predictions. The problem is that one rarely can predict exactly what will happen, only what might happen. To overcome this limitation, executives can be provided predictions of possible futures and the conditions under which each scenario is likely to emerge. Models can help them to understand these possible futures. Most executives find such candor refreshing, perhaps even liberating. Their job becomes one of imagining and designing a portfolio of possible futures, assisted by interactive computational models. Understanding and managing uncertainty is central to their job. Indeed, doing this better than competitors is a hallmark of success. This book is intended to help them understand what fundamentally needs to be done, why it needs to be done, and how to do it. The hope is that readers will discuss this book and develop a “shared mental model” of computational modeling in the process, which will greatly enhance their chances of success.


Molecules ◽  
2021 ◽  
Vol 26 (9) ◽  
pp. 2505
Author(s):  
Raheem Remtulla ◽  
Sanjoy Kumar Das ◽  
Leonard A. Levin

Phosphine-borane complexes are novel chemical entities with preclinical efficacy in neuronal and ophthalmic disease models. In vitro and in vivo studies showed that the metabolites of these compounds are capable of cleaving disulfide bonds implicated in the downstream effects of axonal injury. A difficulty in using standard in silico methods for studying these drugs is that most computational tools are not designed for borane-containing compounds. Using in silico and machine learning methodologies, the absorption-distribution properties of these unique compounds were assessed. Features examined with in silico methods included cellular permeability, octanol-water partition coefficient, blood-brain barrier permeability, oral absorption and serum protein binding. The resultant neural networks demonstrated an appropriate level of accuracy and were comparable to existing in silico methodologies. Specifically, they were able to reliably predict pharmacokinetic features of known boron-containing compounds. These methods predicted that phosphine-borane compounds and their metabolites meet the necessary pharmacokinetic features for orally active drug candidates. This study showed that the combination of standard in silico predictive and machine learning models with neural networks is effective in predicting pharmacokinetic features of novel boron-containing compounds as neuroprotective drugs.


Author(s):  
Mythili K. ◽  
Manish Narwaria

Quality assessment of audiovisual (AV) signals is important from the perspective of system design, optimization, and management of a modern multimedia communication system. However, automatic prediction of AV quality via the use of computational models remains challenging. In this context, machine learning (ML) appears to be an attractive alternative to the traditional approaches. This is especially so when such assessment needs to be made in a no-reference fashion (i.e., when the original signal is unavailable). While development of ML-based quality predictors is desirable, we argue that proper assessment and validation of such predictors is also crucial before they can be deployed in practice. To this end, we raise some fundamental questions about the current approach of ML-based model development for AV quality assessment and signal processing for multimedia communication in general. We also identify specific limitations associated with the current validation strategy which have implications for the analysis and comparison of ML-based quality predictors. These include a lack of consideration of: (a) data uncertainty, (b) domain knowledge, (c) explicit learning ability of the trained model, and (d) interpretability of the resultant model. Therefore, the primary goal of this article is to shed some light on these factors. Our analysis and proposed recommendations are of particular importance in light of the significant interest in ML methods for multimedia signal processing (specifically in cases where human-labeled data is used), and the lack of discussion of these issues in existing literature.


2021 ◽  
Vol 18 ◽  
pp. 100155
Author(s):  
Zhiyuan Wang ◽  
Piaopiao Zhao ◽  
Xiaoxiao Zhang ◽  
Xuan Xu ◽  
Weihua Li ◽  
...  
