Biological features between miRNAs and their targets are unveiled from deep learning models

Author(s):  
Tongjun Gu ◽  
Mingyi Xie ◽  
W. Brad Barbazuk ◽  
Ji-Hyun Lee

Abstract MicroRNAs (miRNAs) are ~22 nucleotide ubiquitous gene regulators. They modulate a broad range of essential cellular processes linked to human health and diseases. Consequently, identifying miRNA targets and understanding how they function are critical for treating miRNA-associated diseases. In our earlier work, we developed a hybrid deep learning-based approach (miTAR) that predicts miRNA targets at significantly higher accuracy than existing methods. It integrates two major types of deep learning algorithms: convolutional neural networks (CNNs) and recurrent neural networks (RNNs). However, the features in miRNA:target interactions learned by miTAR have not been investigated. In the current study, we demonstrated that miTAR captures known features, including the involvement of the seed region and the free energy, as well as multiple novel features, in miRNA:target interactions. Interestingly, the CNN and RNN layers of the model behave differently at capturing the free energy feature: the feature captured by the CNN layer units, but not the RNN layer units, overlaps within and across feature maps. Although deep learning models are commonly thought of as "black boxes", our discoveries support that the biological features in miRNA:target interactions can be unveiled from deep learning models, which will be beneficial to the understanding of the mechanisms in miRNA:target interactions.
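A minimal sketch of the hybrid CNN+RNN idea described in the abstract (not miTAR itself; all layer sizes, kernel widths, and the example miRNA sequence are illustrative): a 1D convolution scans a one-hot-encoded RNA sequence for local motifs such as seed-region matches, and a simple recurrent pass then integrates the motif activations along the sequence into a single target/non-target score.

```python
import numpy as np

BASES = "ACGU"

def one_hot(seq):
    """Encode an RNA sequence as a (len, 4) one-hot matrix."""
    m = np.zeros((len(seq), 4))
    for i, b in enumerate(seq):
        m[i, BASES.index(b)] = 1.0
    return m

def conv1d(x, kernels):
    """Valid-mode 1D convolution with ReLU.
    x is (L, 4); kernels is (K, w, 4); returns (L - w + 1, K) feature maps."""
    K, w, _ = kernels.shape
    L = x.shape[0] - w + 1
    out = np.zeros((L, K))
    for k in range(K):
        for i in range(L):
            out[i, k] = np.sum(x[i:i + w] * kernels[k])
    return np.maximum(out, 0.0)  # ReLU

def simple_rnn(x, Wx, Wh):
    """Plain tanh RNN over the conv feature maps; returns the final state."""
    h = np.zeros(Wh.shape[0])
    for t in range(x.shape[0]):
        h = np.tanh(x[t] @ Wx + h @ Wh)
    return h

rng = np.random.default_rng(0)
seq = one_hot("UAGCUUAUCAGACUGAUGUUGA")      # a 22-nt example miRNA
kernels = rng.normal(size=(8, 6, 4)) * 0.1   # 8 motif detectors of width 6
Wx = rng.normal(size=(8, 16)) * 0.1
Wh = rng.normal(size=(16, 16)) * 0.1

features = conv1d(seq, kernels)              # (17, 8) motif activations
state = simple_rnn(features, Wx, Wh)         # (16,) summary vector
score = 1.0 / (1.0 + np.exp(-state.sum()))   # toy target/non-target score
```

In the real model the weights are learned, but even this untrained sketch shows the division of labor the study probes: local features live in the conv feature maps, sequential context in the recurrent state.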

2021 ◽  
Vol 11 (1)


2020 ◽  
Author(s):  
Dean Sumner ◽  
Jiazhen He ◽  
Amol Thakkar ◽  
Ola Engkvist ◽  
Esben Jannik Bjerrum

<p>SMILES randomization, a form of data augmentation, has previously been shown to increase the performance of deep learning models compared to non-augmented baselines. Here, we propose a novel data augmentation method we call "Levenshtein augmentation", which considers local SMILES sub-sequence similarity between reactants and their respective products when creating training pairs. The performance of Levenshtein augmentation was tested using two state-of-the-art models: a transformer and a sequence-to-sequence recurrent neural network with attention. Levenshtein augmentation demonstrated increased performance over both non-augmented data and data augmented by conventional SMILES randomization when used for training of baseline models. Furthermore, Levenshtein augmentation seemingly results in what we define as <i>attentional gain</i> – an enhancement in the pattern recognition capabilities of the underlying network to molecular motifs.</p>
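The similarity signal underlying the augmentation above is the classic Levenshtein (edit) distance between two SMILES strings, sketched here with the standard dynamic-programming recurrence. Only the distance itself is standard; the example reactant/product pair is a hypothetical illustration, not from the paper.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b via dynamic programming."""
    prev = list(range(len(b) + 1))       # distances from a[:0] to b prefixes
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution/match
        prev = cur
    return prev[-1]

reactant = "CC(=O)Oc1ccccc1C(=O)O"   # aspirin SMILES
product  = "CC(=O)Oc1ccccc1"         # hypothetical truncated product
d = levenshtein(reactant, product)   # 6 trailing characters differ
```

Pairing reactants and products that are close under this distance keeps training pairs locally similar, which is the intuition behind the reported attentional gain.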


2021 ◽  
Vol 11 (5) ◽  
pp. 2284
Author(s):  
Asma Maqsood ◽  
Muhammad Shahid Farid ◽  
Muhammad Hassan Khan ◽  
Marcin Grzegorzek

Malaria is a disease caused by a microscopic parasite transmitted to humans through the bites of infected female mosquitoes. Malaria is a fatal disease that is endemic in many regions of the world. Quick diagnosis of this disease would be very valuable for patients, as traditional methods require tedious work for its detection. Recently, some automated methods have been proposed that exploit hand-crafted feature extraction techniques; however, their accuracies are not reliable. Deep learning approaches, in contrast, offer superior performance. Convolutional Neural Networks (CNNs) scale well to image classification tasks, extracting features through the hidden layers of the model without any hand-crafting. The detection of malaria-infected red blood cells from segmented microscopic blood images using convolutional neural networks can assist in quick diagnosis, and this will be useful for regions with fewer healthcare experts. The contributions of this paper are two-fold. First, we evaluate the performance of different existing deep learning models for efficient malaria detection. Second, we propose a customized CNN model that outperforms all observed deep learning models. It exploits bilateral filtering and image augmentation techniques to highlight features of red blood cells before training the model. Due to the image augmentation techniques, the customized CNN model generalizes well and avoids over-fitting. All experimental evaluations are performed on the benchmark NIH Malaria Dataset, and the results reveal that the proposed algorithm is 96.82% accurate in detecting malaria from microscopic blood smears.
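An illustrative sketch of the image-augmentation idea mentioned above: simple flips and rotations multiply each blood-cell image into several training variants, which is one common way augmentation reduces over-fitting. This is generic augmentation, not the paper's exact pipeline, and the 4×4 array stands in for a real cell image.

```python
import numpy as np

def augment(img):
    """Return a list of flipped/rotated copies of a 2D (or HxWxC) image."""
    return [img,
            np.fliplr(img),     # horizontal flip
            np.flipud(img),     # vertical flip
            np.rot90(img, 1),   # 90-degree rotation
            np.rot90(img, 2),   # 180-degree rotation
            np.rot90(img, 3)]   # 270-degree rotation

cell = np.arange(16).reshape(4, 4).astype(float)  # stand-in 4x4 "image"
batch = augment(cell)                             # 6 variants per image
```

Each original image yields six geometric variants, so the effective training set grows sixfold without collecting new data.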


2022 ◽  
pp. 1-27
Author(s):  
Clifford Bohm ◽  
Douglas Kirkpatrick ◽  
Arend Hintze

Abstract Deep learning (primarily using backpropagation) and neuroevolution are the preeminent methods of optimizing artificial neural networks. However, they often create black boxes that are as hard to understand as the natural brains they seek to mimic. Previous work has identified an information-theoretic tool, referred to as R, which allows us to quantify and identify mental representations in artificial cognitive systems. The use of such measures has allowed us to make previous black boxes more transparent. Here we extend R to not only identify where complex computational systems store memory about their environment but also to differentiate between different time points in the past. We show how this extended measure can identify the location of memory related to past experiences in neural networks optimized by deep learning as well as a genetic algorithm.
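A sketch of the information-theoretic flavor of the R measure described above: the shared (mutual) information between a discretized hidden-unit state and an environment/memory variable, estimated from joint observation counts. This is plain mutual information under an assumed discretization, not the paper's full R decomposition.

```python
import numpy as np
from collections import Counter

def mutual_information(xs, ys):
    """I(X;Y) in bits from paired discrete observations."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))           # joint counts
    px, py = Counter(xs), Counter(ys)    # marginal counts
    mi = 0.0
    for (x, y), c in pxy.items():
        pj = c / n
        mi += pj * np.log2(pj / ((px[x] / n) * (py[y] / n)))
    return mi

# A unit that perfectly copies a binary environment bit carries 1 bit about it.
env  = [0, 1, 0, 1, 0, 1, 0, 1]
unit = [0, 1, 0, 1, 0, 1, 0, 1]
mi = mutual_information(unit, env)       # 1.0 bit
```

Locating memory then amounts to asking which units (or groups of units) share information with *past* environment states rather than the current one, which is the temporal extension the abstract describes.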


2021 ◽  
Author(s):  
Ramy Abdallah ◽  
Clare E. Bond ◽  
Robert W.H. Butler

<p>Machine learning is being presented as a new solution for a wide range of geoscience problems. It has primarily been used for 3D seismic data processing, seismic facies analysis and well log data correlation. The rapid development in technology, with open-source artificial intelligence libraries and the accessibility of affordable computer graphics processing units (GPUs), makes the application of machine learning in geosciences increasingly tractable. However, the application of artificial intelligence in structural interpretation workflows of subsurface datasets is still ambiguous. This study aims to use machine learning techniques to classify images of folds and fold-thrust structures. Here we show that convolutional neural networks (CNNs), as supervised deep learning techniques, provide excellent algorithms to discriminate between geological image datasets. Four different image datasets have been used to train and test the machine learning models: a seismic character dataset with five classes (faults, folds, salt, flat layers and basement), fold types with three classes (buckle, chevron and conjugate), fault types with three classes (normal, reverse and thrust), and fold-thrust geometries with three classes (fault bend fold, fault propagation fold and detachment fold). These image datasets are used to investigate three machine learning models: a feedforward linear neural network and two convolutional neural network models (a sequential model of 2D convolution layers, and residual-block models – ResNet with 9, 34, and 50 layers). Validation and testing datasets form a critical part of assessing a model's accuracy. Of the machine learning models tested, the ResNet model records the highest accuracy.
Our CNN image classification model analysis provides a framework for applying machine learning to increase structural interpretation efficiency, and shows that CNN classification models can be applied effectively to geoscience problems. The study provides a starting point for applying unsupervised machine learning approaches to sub-surface structural interpretation workflows.</p>


2021 ◽  
Vol 6 (5) ◽  
pp. 10-15
Author(s):  
Ela Bhattacharya ◽  
D. Bhattacharya

COVID-19 has emerged as the latest worrisome pandemic, reported to have had its outbreak in Wuhan, China. The infection spreads through human contact and, as a result, has caused massive infections across 200 countries around the world. Artificial intelligence has likewise contributed to managing the COVID-19 pandemic in various aspects within a short span of time. The Deep Neural Networks explored in this paper have contributed to the detection of COVID-19 from imaging sources. The datasets, pre-processing, segmentation, feature extraction, classification and test results, which can be useful for discovering future directions in the domain of automatic diagnosis of the disease utilizing artificial intelligence-based frameworks, have been investigated in this paper.


2020 ◽  
Vol 14 ◽  
Author(s):  
Yaqing Zhang ◽  
Jinling Chen ◽  
Jen Hong Tan ◽  
Yuxuan Chen ◽  
Yunyi Chen ◽  
...  

Emotion is the human brain reacting to objective things. In real life, human emotions are complex and changeable, so research into emotion recognition is of great significance in real-life applications. Recently, many deep learning and machine learning methods have been widely applied in emotion recognition based on EEG signals. However, the traditional machine learning method has a major disadvantage in that the feature extraction process is usually cumbersome and relies heavily on human experts. End-to-end deep learning methods then emerged as an effective way to address this disadvantage with the help of raw signal features and time-frequency spectrums. Here, we investigated the application of several deep learning models to the research field of EEG-based emotion recognition, including deep neural networks (DNN), convolutional neural networks (CNN), long short-term memory (LSTM), and a hybrid model of CNN and LSTM (CNN-LSTM). The experiments were carried out on the well-known DEAP dataset. Experimental results show that the CNN and CNN-LSTM models achieved high classification performance in EEG-based emotion recognition, with accuracies on raw data reaching 90.12% and 94.17%, respectively. The performance of the DNN model was not as accurate as the other models, but its training speed was fast. The LSTM model was not as stable as the CNN and CNN-LSTM models. Moreover, with the same number of parameters, the training speed of the LSTM was much slower and it was difficult to achieve convergence. Additional parameter comparison experiments with the other models, including epoch, learning rate, and dropout probability, were also conducted in the paper. Comparison results show that the DNN model converged to the optimum with fewer epochs and a higher learning rate. In contrast, the CNN model needed more epochs to learn. As for dropout probability, reducing the parameters by ~50% each time was appropriate.
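The "cumbersome" handcrafted feature extraction the passage contrasts with end-to-end learning can be sketched as computing average spectral power of an EEG channel in the classic frequency bands via an FFT periodogram. The band edges are the conventional ones and the signal here is synthetic; 128 Hz is DEAP's published downsampled rate.

```python
import numpy as np

FS = 128  # sampling rate in Hz (DEAP's downsampled rate)

def band_powers(signal, fs=FS):
    """Mean spectral power in the delta/theta/alpha/beta/gamma bands."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    psd = np.abs(np.fft.rfft(signal)) ** 2 / len(signal)  # periodogram
    bands = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
             "beta": (13, 30), "gamma": (30, 45)}
    return {name: psd[(freqs >= lo) & (freqs < hi)].mean()
            for name, (lo, hi) in bands.items()}

t = np.arange(FS * 4) / FS            # 4 s of signal
eeg = np.sin(2 * np.pi * 10 * t)      # pure 10 Hz (alpha-band) tone
powers = band_powers(eeg)             # alpha dominates, as expected
```

End-to-end models replace this manual pipeline by learning equivalent (or better) representations directly from the raw signal or its time-frequency spectrum.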


2021 ◽  
Author(s):  
Amit Kumar Srivast ◽  
Nima Safaei ◽  
Saeed Khaki ◽  
Gina Lopez ◽  
Wenzhi Zeng ◽  
...  

Abstract Crop yield forecasting depends on many interactive factors including crop genotype, weather, soil, and management practices. This study analyzes the performance of machine learning and deep learning methods for winter wheat yield prediction using extensive datasets of weather, soil, and crop phenology. We propose a convolutional neural network (CNN) which uses the 1-dimensional convolution operation to capture the time dependencies of environmental variables. The proposed CNN, evaluated along with other machine learning models for winter wheat yield prediction in Germany, outperformed all other models tested. To address the seasonality, weekly features were used that explicitly take soil moisture and meteorological events into account. Our results indicated that nonlinear models such as deep learning models and XGBoost are more effective in finding the functional relationship between the crop yield and input data compared to linear models, and that deep neural networks had higher prediction accuracy than XGBoost. One of the main limitations of machine learning models is their black-box property. Therefore, we moved beyond prediction and performed feature selection, as it provides key results towards explaining yield prediction (variable importance by time). As such, our study indicates which variables have the most significant effect on winter wheat yield.
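The 1-dimensional convolution idea above can be sketched as a kernel sliding along the weekly time axis of a single environmental variable, so each output value summarizes a short window of consecutive weeks. The kernel values and toy weekly series below are illustrative, not the paper's learned weights or architecture.

```python
import numpy as np

def conv1d_valid(series, kernel):
    """Valid-mode 1D convolution (cross-correlation) along the time axis."""
    w = len(kernel)
    return np.array([series[i:i + w] @ kernel
                     for i in range(len(series) - w + 1)])

weeks = np.array([0., 1., 2., 3., 4., 3., 2., 1.])  # toy weekly soil-moisture signal
trend = np.array([-1., 0., 1.])                      # rising-trend detector
response = conv1d_valid(weeks, trend)                # + while rising, - while falling
```

Stacking many such learned kernels (and layers) lets the CNN capture multi-week temporal patterns, which is why it suits seasonally structured inputs like weekly weather features.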

