Recent Advances in Predicting ncRNA-Protein Interactions Based on Machine Learning

2021 ◽  
Vol 01 ◽  
Author(s):  
Jingjing Wang ◽  
Yanpeng Zhao ◽  
Xiaoqian Huang ◽  
Yi Shi ◽  
Jianjun Tan

Non-coding RNAs (ncRNAs) play significant roles in various physiological and pathological processes by interacting with proteins. The existing experimental methods for identifying ncRNA-protein interactions are costly and time-consuming. Therefore, an increasing number of machine learning models, including shallow machine learning and deep learning models, have been developed to efficiently predict ncRNA-protein interactions (ncRPIs), and they have achieved notable success in identifying ncRPIs. In this review, we provide an overview of the recent advances in machine learning methods for predicting ncRPIs, focusing mainly on ncRNA-protein interaction databases, classical datasets, ncRNA/protein sequence encoding methods, conventional machine learning-based models, deep learning-based models, and the two integration-based models. Furthermore, we compare the reported accuracy of these approaches and discuss the potential and limitations of deep learning applications in ncRPIs. We found that integrated deep learning achieves the best predictive performance, and that deep learning-based methods do not always perform better than shallow machine learning-based methods. We discuss the potential of deep learning and propose a research approach based on existing research. We believe that models based on integrated deep learning could achieve higher prediction accuracy if substantial experimental data become available in the near future.
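
The abstract lists ncRNA/protein sequence encoding methods without naming any. As one illustration, the sketch below shows k-mer frequency encoding, a common way to turn a sequence into a fixed-length feature vector. That the review covers this particular encoding is an assumption; the function name `kmer_frequencies`, the default `k=3`, and the RNA alphabet `ACGU` are hypothetical choices made for the example.

```python
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Encode a sequence as normalized k-mer frequencies.

    The output vector has len(alphabet) ** k entries, ordered as
    itertools.product(alphabet, repeat=k) enumerates the k-mers.
    This is an illustrative encoding, not one taken from the review.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    # Slide a window of width k over the sequence.
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        # Skip windows that contain characters outside the alphabet.
        if window in counts:
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


if __name__ == "__main__":
    vector = kmer_frequencies("ACGUACGU", k=2)
    print(len(vector), sum(vector))
```

A vector like this can be fed to either a shallow classifier or a neural network, which is why fixed-length encodings are convenient when comparing the two model families.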

2021 ◽  
Author(s):  
Tuomo Hartonen ◽  
Teemu Kivioja ◽  
Jussi Taipale

Deep learning models have in recent years gained success in various tasks related to understanding information coded in the DNA sequence. Rapidly developing genome-wide measurement technologies provide large quantities of data ideally suited for modeling using deep learning or other powerful machine learning approaches. Although they offer state-of-the-art predictive performance, the predictions made by deep learning models can be difficult to understand. In virtually all biological research, the understanding of how a predictive model works is as important as the raw predictive performance. Thus, interpretation of deep learning models is an emerging hot topic, especially in the context of biological research. Here we describe plotMI, a mutual information based model interpretation strategy that can intuitively visualize positional preferences and pairwise interactions learned by any machine learning model trained on sequence data with a defined alphabet as input. PlotMI is freely available at https://github.com/hartonen/plotMI.
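
To make the idea of mutual information between sequence positions concrete, here is a minimal sketch that estimates it for two positions across a set of equal-length sequences. It is not the plotMI implementation and does not use its API; the function name `positional_mutual_information` and the toy sequences are assumptions made purely for illustration.

```python
from collections import Counter
from math import log2


def positional_mutual_information(sequences, i, j):
    """Estimate mutual information (in bits) between positions i and j.

    Probabilities are plain empirical frequencies over the given
    sequences, with no pseudocounts. Illustrative only; this is not
    how plotMI itself is implemented.
    """
    n = len(sequences)
    pair_counts = Counter((s[i], s[j]) for s in sequences)
    i_counts = Counter(s[i] for s in sequences)
    j_counts = Counter(s[j] for s in sequences)
    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = i_counts[a] / n
        p_b = j_counts[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    coupled = ["AA", "CC", "GG", "TT"]      # position 1 copies position 0
    independent = ["AA", "AC", "CA", "CC"]  # positions vary independently
    print(positional_mutual_information(coupled, 0, 1))
    print(positional_mutual_information(independent, 0, 1))
```

High values indicate that the two positions are informative about each other in the chosen sequences, which is the kind of pairwise dependency the abstract says plotMI visualizes.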


Author(s):  
S. T. Yekeen ◽  
A.-L. Balogun

Abstract. This study developed a novel deep learning oil spill instance segmentation model using the Mask Region-based Convolutional Neural Network (Mask R-CNN), a state-of-the-art computer vision model. After pre-processing, a total of 2882 images containing oil spill, look-alike, ship, and land areas were obtained. These images were subsequently divided into 88% for training and 12% for testing, equating to 2530 and 352 images respectively. The model was trained using transfer learning on a ResNet 101 backbone pre-trained on COCO data, in combination with a Feature Pyramid Network (FPN) architecture for feature extraction, for 30 epochs with a learning rate of 0.001. The model’s performance was evaluated using precision, recall, and F1-measure, with values of 0.964, 0.969 and 0.968 respectively, which is higher than that of other existing models. As a specialized task, the study concluded that the developed deep learning instance segmentation model (Mask R-CNN) performs better than conventional machine learning models and semantic segmentation deep learning models in detection and segmentation of marine oil spill.
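
For readers unfamiliar with the metrics named above, the sketch below computes precision, recall, and F1 from true positive, false positive, and false negative counts, and checks the arithmetic of the reported 2530/352 split. The helper names and the counts passed to `precision_recall_f1` are hypothetical; the abstract does not report the underlying confusion counts or how the metrics were averaged.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from raw detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


def split_percentages(n_train, n_test):
    """Return the train and test shares of the dataset in percent."""
    total = n_train + n_test
    return 100 * n_train / total, 100 * n_test / total


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    print(precision_recall_f1(tp=8, fp=2, fn=2))
    # The split sizes below are the ones stated in the abstract.
    print(split_percentages(2530, 352))
```

F1 is the harmonic mean of precision and recall, so it is high only when both are high.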


2020 ◽  
Vol 3 (3) ◽  
pp. 202-213
Author(s):  
Lu Chen ◽  
Chunchao Xia ◽  
Huaiqiang Sun

ABSTRACT Deep learning (DL) is a recently proposed subset of machine learning methods that has gained extensive attention in the academic world, breaking benchmark records in areas such as visual recognition and natural language processing. Unlike conventional machine learning algorithms, DL is able to learn useful representations and features directly from raw data through hierarchical nonlinear transformations. Because of its ability to detect abstract and complex patterns, DL has been used in neuroimaging studies of psychiatric disorders, which are characterized by subtle and diffuse alterations. Here, we provide a brief review of recent advances and associated challenges in neuroimaging studies of DL applied to psychiatric disorders. The results of these studies indicate that DL could be a powerful tool in assisting the diagnosis of psychiatric diseases. We conclude our review by clarifying the main promises and challenges of DL application in psychiatric disorders, and possible directions for future research.


2021 ◽  
Vol 11 (16) ◽  
pp. 7561
Author(s):  
Umair Iqbal ◽  
Johan Barthelemy ◽  
Wanqing Li ◽  
Pascal Perez

Blockage of culverts by transported debris materials is reported as the salient contributor to urban flash floods. Conventional hydraulic modeling approaches have had no success in addressing the problem, primarily because of the unavailability of peak flood hydraulic data and the highly non-linear behavior of debris at the culvert. This article explores a new dimension to investigate the issue by proposing the use of intelligent video analytics (IVA) algorithms for extracting blockage related information. The presented research aims to automate the process of manual visual blockage classification of culverts from a maintenance perspective by remotely applying deep learning models. The potential of using existing convolutional neural network (CNN) algorithms (i.e., DarkNet53, DenseNet121, InceptionResNetV2, InceptionV3, MobileNet, ResNet50, VGG16, EfficientNetB3, NASNet) is investigated over a dataset from three different sources (i.e., images of culvert openings and blockage (ICOB), visual hydrology-lab dataset (VHD), synthetic images of culverts (SIC)) to predict the blockage in a given image. Models were evaluated based on their performance on the test dataset (i.e., accuracy, loss, precision, recall, F1 score, Jaccard Index, receiver operating characteristic (ROC) curve), floating point operations per second (FLOPs) and response times to process a single test instance. Furthermore, the performance of deep learning models was benchmarked against conventional machine learning algorithms (i.e., SVM, RF, xgboost). In addition, the idea of classifying deep visual features extracted by CNN models (i.e., ResNet50, MobileNet) using conventional machine learning approaches was also implemented in this article. 
From the results, NASNet was reported as the most efficient in classifying the blockage images, with a 5-fold accuracy of 85%; however, MobileNet was recommended for the hardware implementation because of its improved response time, with a 5-fold accuracy comparable to NASNet (i.e., 78%). Comparable performance to standard CNN models was achieved for the case where deep visual features were classified using conventional machine learning approaches. False negative (FN) instances, false positive (FP) instances and CNN layer activations suggested that background noise and oversimplified labelling criteria were two contributing factors in the degraded performance of existing CNN algorithms. A framework for partial automation of the visual blockage classification process was proposed, given that none of the existing models was able to achieve high enough accuracy to completely automate the manual process. In addition, a detection-classification pipeline with higher blockage classification accuracy (i.e., 94%) has been proposed as a potential future direction for practical implementation.
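
The "5-fold accuracy" figures above refer to k-fold cross-validation, in which the data are partitioned into k folds and each fold serves once as the test set. The sketch below shows one simple way to build such folds and average per-fold accuracies. The abstract does not say whether the authors shuffled or stratified their folds, so the contiguous, unshuffled partition here is an assumption, as are the function names and the example accuracies.

```python
def k_fold_indices(n_samples, k=5):
    """Partition range(n_samples) into k contiguous (train, test) index folds.

    Folds differ in size by at most one sample. No shuffling or
    stratification is applied; that choice is an assumption.
    """
    fold_sizes = [n_samples // k + (1 if f < n_samples % k else 0) for f in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [idx for idx in range(n_samples) if idx < start or idx >= stop]
        folds.append((train, test))
        start = stop
    return folds


def mean_accuracy(fold_accuracies):
    """Average the accuracies obtained on each held-out fold."""
    return sum(fold_accuracies) / len(fold_accuracies)


if __name__ == "__main__":
    for train, test in k_fold_indices(12, k=5):
        print(len(train), test)
    # Hypothetical per-fold accuracies, not values from the article.
    print(mean_accuracy([0.5, 0.75, 1.0, 0.75, 0.5]))
```

In practice a model would be trained on each `train` index set and scored on the matching `test` set before the per-fold scores are averaged.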


Complexity ◽  
2019 ◽  
Vol 2019 ◽  
pp. 1-15
Author(s):  
Zeynep Hilal Kilimci ◽  
Aykut Güven ◽  
Mitat Uysal ◽  
Selim Akyokus

Nowadays, smart devices, as a part of daily life, collect data about their users with the help of sensors placed on them. Sensor data are usually physical data, but mobile applications collect more than physical data, such as device usage habits and personal interests. Collected data are usually classified as personal, but they contain valuable information about their users when analyzed and interpreted. One of the main purposes of personal data analysis is to make predictions about users. Collected data can be divided into two major categories: physical and behavioral data. Behavioral data are also referred to as neurophysical data. Physical and neurophysical parameters are collected as a part of this study. Physical data contain measurements of the users such as heartbeats, sleep quality, energy, and movement/mobility parameters. Neurophysical data contain keystroke patterns such as typing speed and typing errors. Users’ emotional/mood statuses are also investigated by asking daily questions. Six questions are asked to the users daily in order to determine their mood. These questions are emotion-attached questions, and depending on the answers, users’ emotional states are graded. Our aim is to show that there is a connection between users’ physical/neurophysical parameters and mood/emotional conditions. To test our hypothesis, we collect and measure physical and neurophysical parameters of 15 users for 1 year. The novelty of this work is the combined use of physical and neurophysical parameters. Another novelty is that the emotion classification task is performed by both conventional machine learning algorithms and deep learning models. For this purpose, Feedforward Neural Network (FFNN), Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), and Long Short-Term Memory (LSTM) neural network are employed as deep learning methodologies. 
Multinomial Naïve Bayes (MNB), Support Vector Regression (SVR), Decision Tree (DT), Random Forest (RF), and Decision Integration Strategy (DIS) are evaluated as conventional machine learning algorithms. To the best of our knowledge, this is the very first attempt to analyze the neurophysical conditions of the users by evaluating deep learning models for mood analysis and enriching physical characteristics with neurophysical parameters. Experiment results demonstrate that the utilization of deep learning methodologies and the combination of both physical and neurophysical parameters enhance the classification success of the system in interpreting the mood of the users. A wide range of comparative and extensive experiments shows that the proposed model exhibits noteworthy results compared to the state-of-the-art studies.


2019 ◽  
Vol 8 (2S11) ◽  
pp. 3700-3705

The extraordinary research in the field of unsupervised machine learning has led the non-technical media to expect Robot Lords to overthrow humans in the near future. Whatever the media exaggeration, the results of recent advances in the research of Deep Learning applications are so impressive that it has become very difficult to differentiate between man-made content and computer-made content. This paper tries to establish a ground for new researchers with different real-time applications of Deep Learning. This paper is not a complete study of all applications of Deep Learning; rather, it focuses on some of the highly researched themes and popular applications in domains such as Image Processing, Sound/Speech Processing, and Video Processing.


2020 ◽  
Author(s):  
Zakhriya Alhassan ◽  
Matthew Watson ◽  
David Budgen ◽  
Riyad Alshammari ◽  
Ali Alessan ◽  
...  

BACKGROUND Predicting the risk of glycated hemoglobin (HbA1c) elevation can help identify patients with the potential for developing serious chronic health problems such as diabetes and cardiovascular diseases. Early preventive interventions based upon advanced predictive models using electronic health records (EHR) data for such patients can ultimately help provide better health outcomes. OBJECTIVE Our study investigates the performance of predictive models to forecast HbA1c elevation levels by employing machine learning approaches using data from current and previous visits in the EHR systems for patients who had not been previously diagnosed with any type of diabetes. METHODS This study employed one statistical model and three commonly used conventional machine learning models, as well as a deep learning model, to predict patients’ current levels of HbA1c. For the deep learning model, we also integrated current visit data with historical (longitudinal) data from previous visits. Explainable machine learning methods were used to interrogate the models and understand the reasons behind the models' decisions. All models were trained and tested using a large and naturally balanced dataset from Saudi Arabia with 18,844 unique patient records. RESULTS The machine learning models achieved the best results for predicting current HbA1c elevation risk. The deep learning model outperformed the statistical and conventional machine learning models with respect to all reported measures when employing time-series data. The best performing model was the multi-layer perceptron (MLP), which achieved an accuracy of 74.52% when used with historical data. CONCLUSIONS This study shows that machine learning models can provide promising results for the task of predicting current HbA1c levels. For deep learning in particular, utilizing the patient's longitudinal time-series data improved the performance and affected the relative importance of the predictors used. 
The models showed robust results that were consistent with comparable studies.

