Deep Neural Network Based Predictions of Protein Interactions Using Primary Sequences

Molecules ◽  
2018 ◽  
Vol 23 (8) ◽  
pp. 1923 ◽  
Author(s):  
Hang Li ◽  
Xiu-Jun Gong ◽  
Hua Yu ◽  
Chang Zhou

Machine learning based predictions of protein–protein interactions (PPIs) could provide valuable insights into protein functions, disease occurrence, and therapy design on a large scale. The intensive feature engineering in most of these methods makes the prediction task tedious. The emerging deep learning technology, which enables automatic feature engineering, is gaining great success in various fields. However, the over-fitting and generalization of its models are not yet well investigated in most scenarios. Here, we present a deep neural network framework (DNN-PPI) for predicting PPIs using features learned automatically only from protein primary sequences. Within the framework, the sequences of two interacting proteins are sequentially fed into the encoding, embedding, convolutional neural network (CNN), and long short-term memory (LSTM) neural network layers. Then, a concatenated vector of the two outputs from the previous layer is wired as the input of the fully connected neural network. Finally, the Adam optimizer is applied to learn the network weights in a back-propagation fashion. The different types of features, including semantic associations between amino acids, position-related sequence segments (motifs), and their long- and short-term dependencies, are captured in the embedding, CNN, and LSTM layers, respectively. When the model was trained on Pan’s human PPI dataset, it achieved a prediction accuracy of 98.78% with a Matthews correlation coefficient (MCC) of 97.57%. The prediction accuracies for six external datasets ranged from 92.80% to 97.89%, superior to those achieved with previous methods. When applied to Escherichia coli, Drosophila, and Caenorhabditis elegans datasets, DNN-PPI obtained prediction accuracies of 95.949%, 98.389%, and 98.669%, respectively. The performances in cross-species testing among the four species above were consistent with their evolutionary distances. However, when testing Mus musculus using the models from those species, they all obtained prediction accuracies of over 92.43%, which is difficult to achieve and worthy of note for further study. These results suggest that DNN-PPI has remarkable generalization and is a promising tool for identifying protein interactions.
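
The encoding step at the front of the pipeline described above can be illustrated with a minimal sketch. It assumes a simple scheme in which each residue is mapped to an integer index and each sequence is truncated or right-padded to a fixed length before reaching an embedding layer. The alphabet order, the padding index, the handling of non-standard residues, and the names `encode` and `encode_pair` are illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch: integer-encode the two protein sequences of a candidate pair.
# Alphabet order, PAD index, and max_len are illustrative assumptions,
# not values reported for DNN-PPI.

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues
PAD = 0                                 # index reserved for padding
UNKNOWN = len(AMINO_ACIDS) + 1          # any non-standard residue (e.g. X, U)
INDEX = {aa: i + 1 for i, aa in enumerate(AMINO_ACIDS)}


def encode(sequence, max_len):
    """Map residues to integers, then truncate or right-pad to max_len."""
    ids = [INDEX.get(aa, UNKNOWN) for aa in sequence.upper()[:max_len]]
    return ids + [PAD] * (max_len - len(ids))


def encode_pair(seq_a, seq_b, max_len):
    """Encode both partners separately; each list would feed its own branch."""
    return encode(seq_a, max_len), encode(seq_b, max_len)


if __name__ == "__main__":
    a, b = encode_pair("MKTAYIAK", "GSHMX", max_len=10)
    print(a)
    print(b)
```

Under these assumptions, the two integer lists would pass through separate embedding, CNN, and LSTM branches, and the branch outputs would be concatenated before the fully connected layers.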

Author(s):  
Ralph Sherwin A. Corpuz ◽  

Analyzing natural language-based Customer Satisfaction (CS) is a tedious process. This is particularly true if one is to manually categorize large datasets. Fortunately, the advent of supervised machine learning techniques has paved the way toward the design of efficient categorization systems used for CS. This paper presents the feasibility of designing a text categorization model using two popular and robust algorithms, the Support Vector Machine (SVM) and the Long Short-Term Memory (LSTM) Neural Network, in order to automatically categorize complaints, suggestions, feedback, and commendations. The study found that, in terms of training accuracy, SVM has a best rating of 98.63% while LSTM has a best rating of 99.32%. Such results mean that the SVM and LSTM algorithms are on par with each other in terms of training accuracy, but SVM is significantly faster than LSTM, by approximately 35.47 s. The training performance results of both algorithms are attributed to the limitations of the dataset size, the high dimensionality of both the English and Tagalog languages, and the applicability of the feature engineering techniques used. Interestingly, based on the results of actual implementation, both algorithms are found to be 100% effective in accurately predicting the correct CS categories. Hence, the extent of preference between the two algorithms boils down to the available dataset and the skill in optimizing these algorithms through feature engineering techniques and in implementing them toward actual text categorization applications.


2019 ◽  
Vol 9 (14) ◽  
pp. 2861 ◽  
Author(s):  
Alessandro Crivellari ◽  
Euro Beinat

The interest in human mobility analysis has increased with the rapid growth of positioning technology and motion tracking, leading to a variety of studies based on trajectory recordings. Mapping the routes that people commonly take has proven very useful for location-based service applications, where individual mobility behaviors can potentially disclose meaningful information about each customer and be fruitfully used for personalized recommendation systems. This paper tackles a novel trajectory labeling problem related to the context of user profiling in “smart” tourism, inferring the nationality of individual users on the basis of their motion trajectories. In particular, we use large-scale motion traces of short-term foreign visitors as a way of detecting the nationality of individuals. This task is not trivial, relying on the hypothesis that foreign tourists of different nationalities may not only visit different locations, but also move in a different way between the same locations. The problem is defined as a multinomial classification with a few tens of classes (nationalities) and sparse location-based trajectory data. We hereby propose a machine learning-based methodology, consisting of a long short-term memory (LSTM) neural network trained on vector representations of locations, in order to capture the underlying semantics of user mobility patterns. Experiments conducted on a large real-world dataset demonstrate that our method achieves considerably higher performance than baseline and traditional approaches.


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yangzi Zhao

The stock market is affected by the economic market, policy, and other factors, and the laws governing its internal changes are extremely complex. With the rapid development of the stock market and the expansion of the scale of investors, the stock market has produced a large amount of transaction data, which makes it more difficult to obtain valuable information. Because deep neural networks are good at dealing with prediction problems involving large amounts of data and complex nonlinear mapping relationships, this paper proposes an attention-guided deep neural network stock prediction algorithm. This paper combines a daily stock sentiment index derived from social media text with stock technical indicators as the data source and applies them to a long short-term memory (LSTM) neural network model to predict the stock market. The stock sentiment index is extracted by constructing a social text sentiment classification model using a bidirectional long short-term memory (Bi-LSTM) neural network based on an attention mechanism and the GloVe word vector representation algorithm. In addition, a dimensionality reduction model based on decision tree (DT) and principal component analysis (PCA) is constructed to reduce the dimensionality of the stock technical indicators and extract the main data information. Furthermore, this paper proposes a model based on NASNet for pattern recognition. The recognition results can be used to automatically identify short-term K-line patterns, predict reliable trading signals, and help investors customize short-term high-efficiency investment strategies. The experimental results show that the prediction accuracy of the proposed algorithm can reach 98.6%, which has high application value.


2019 ◽  
Author(s):  
Kangkang Zhang ◽  
Tong Liu ◽  
Shengjing Song ◽  
Xin Zhao ◽  
Shijun Sun ◽  
...  

Abstract

Acquiring clear and usable audio recordings is critical for acoustic analysis of animal vocalizations. Bioacoustics studies commonly face the problem of overlapping signals, but the issue is often ignored, as there is currently no satisfactory solution. This study presents a bi-directional long short-term memory (BLSTM) network to separate overlapping bat calls and reconstruct waveform audio sounds. The separation quality was evaluated using seven temporal-spectrum parameters. The applicability of this method for bat calls was assessed using six different species. In addition, clustering analysis was conducted with separated echolocation calls from each population. Results showed that all syllables in the overlapping calls were separated with high robustness across species. A comparison between the seven temporal-spectrum parameters showed no significant difference and negligible deviation between the extracted and original calls, indicating high separation quality. Clustering analysis of the separated echolocation calls also produced an accuracy of 93.8%, suggesting the reconstructed waveform sounds could be reliably used. These results suggest the proposed technique is a convenient and automated approach for separating overlapping calls using a BLSTM network. This powerful deep neural network approach has the potential to solve complex problems in bioacoustics.

Author summary

In recent years, the development of recording techniques and devices in animal acoustic experiments and population monitoring has led to a sharp increase in the volume of sound data. However, the collected sounds often overlap because multiple individuals are present, which restricts full use of the experimental data. In addition, more convenient and automatic methods are needed to cope with the large datasets in animal acoustics. The echolocation calls and communication calls of bats are variable and often overlap with each other in recordings from both the field and the laboratory, which provides an excellent template for research on animal sound separation. Here, we present the first successful solution to the problem of overlapping calls in bats based on a deep neural network. We built a network to separate the overlapping calls of six bat species. All the syllables in overlapping calls were separated, and we found no significant difference between the separated syllables and non-overlapping syllables. We also demonstrated an instance of applying our method to species classification. Our study provides a useful and efficient model for sound data processing in acoustic research, and the proposed method has the potential to be generalized to other animal species.


Author(s):  
Thang

In this research, we propose a method for predicting human-robot interactive intention. The proposed algorithm makes use of the OpenPose library and a long short-term memory (LSTM) deep neural network. The neural network observes the human posture as a time series, then predicts the human interactive intention. We train the deep neural network using a dataset that we generated ourselves. The experimental results show that our proposed method is able to predict the human-robot interactive intention, achieving 92% accuracy on the testing set.
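
Observing posture "as a time series" usually means slicing per-frame pose features into fixed-length windows before they reach the recurrent network. The sketch below shows one way this could look. The window length, the stride, the flat per-frame feature layout, and the function name `make_windows` are hypothetical choices for illustration and are not taken from the paper.

```python
# Hypothetical sketch: turn a stream of per-frame pose features into
# fixed-length windows that a recurrent network could consume.
# Window length, stride, and feature layout are illustrative assumptions.

def make_windows(frames, window, stride=1):
    """Return overlapping windows of consecutive frames.

    frames: list of per-frame feature lists (e.g. flattened keypoint coordinates)
    window: number of consecutive frames per sample
    stride: step, in frames, between the starts of successive windows
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(frames) - window
    return [frames[s:s + window] for s in range(0, last_start + 1, stride)]


if __name__ == "__main__":
    # Five toy frames with two features each, standing in for pose keypoints.
    frames = [[float(i), i + 0.5] for i in range(5)]
    for sample in make_windows(frames, window=3):
        print(sample)
```

Each window would then be paired with an intention label and passed to the LSTM as one training sample; sequences shorter than the window yield no samples in this sketch.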


2021 ◽  
Vol 4 (4) ◽  
pp. 85
Author(s):  
Hashem Saleh Sharaf Al-deen ◽  
Zhiwen Zeng ◽  
Raeed Al-sabri ◽  
Arash Hekmat

Due to the increasing growth of social media content on websites such as Twitter and Facebook, analyzing textual sentiment has become a challenging task. Therefore, many studies have focused on textual sentiment analysis. Recently, deep learning models, such as convolutional neural networks and long short-term memory, have achieved promising performance in sentiment analysis. These models have proven their ability to cope with sequences of arbitrary length. However, when they are used in the feature extraction layer, the feature space is high-dimensional, the text data are sparse, and they assign equal importance to various features. To address these issues, we propose a hybrid model that combines a deep neural network with a multi-head attention mechanism (DNN–MHAT). In the DNN–MHAT model, we first design an improved deep neural network to capture the text's actual context and extract position-invariant local features by combining recurrent bidirectional long short-term memory units (Bi-LSTM) with a convolutional neural network (CNN). Second, we present a multi-head attention mechanism to capture the words in the text that are significantly related to long space and encoding dependencies, which adds a different focus to the information output from the hidden layers of the Bi-LSTM. Finally, global average pooling is applied to transform the vector into a high-level sentiment representation to avoid model overfitting, and a sigmoid classifier is applied to carry out the sentiment polarity classification of texts. The DNN–MHAT model is tested on four review datasets and two Twitter datasets. The results of the experiments illustrate the effectiveness of the DNN–MHAT model, which achieved excellent performance compared to the state-of-the-art baseline methods on short tweets and long reviews.
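
The last three stages described above (attention over hidden states, global average pooling, sigmoid classification) can be sketched in a few lines. This is a deliberately simplified illustration, not the DNN–MHAT implementation: it uses a single attention head with identity query/key/value projections instead of learned multi-head projections, and the classifier weights passed to `predict` are hypothetical placeholders.

```python
import math

# Simplified sketch: single-head scaled dot-product self-attention with
# identity projections, followed by global average pooling and a sigmoid.
# A real multi-head layer would use several heads with learned projections.


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(states):
    """Each output vector is an attention-weighted average of all input states."""
    d = len(states[0])
    out = []
    for q in states:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in states]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, states)) for j in range(d)])
    return out


def global_average_pool(states):
    """Average over the time axis, giving one vector per sequence."""
    n = len(states)
    return [sum(col) / n for col in zip(*states)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(states, weights, bias):
    """Probability of the positive class for one sequence of hidden states."""
    pooled = global_average_pool(self_attention(states))
    return sigmoid(sum(w * p for w, p in zip(weights, pooled)) + bias)


if __name__ == "__main__":
    hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy stand-in for Bi-LSTM outputs
    print(predict(hidden, weights=[0.8, -0.4], bias=0.1))
```

Pooling over the time axis leaves one fixed-size vector per text regardless of its length, which is why a single sigmoid unit can follow it directly.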

