An interpretable protein localization prediction framework

2021
Author(s):
Yuexu Jiang

Protein localization is related to many human diseases. Therefore, the prediction of protein localization is an essential task that has been extensively studied. Additionally, studying the localization mechanism can provide more biological insights and testable hypotheses. In this thesis, we propose MULocDeep, a general deep learning-based localization prediction framework. We designed a matrix layer in its architecture to reflect the hierarchical relationships of localization in cells. This enables MULocDeep to predict multiple localizations of a protein at both subcellular and suborganellar levels. We collected a dataset with 44 suborganellar localization annotations in 10 major subcellular compartments, the most comprehensive suborganelle localization dataset to date. Our collaborators also experimentally generated an independent dataset of mitochondrial proteins in Arabidopsis thaliana cell cultures, Solanum tuberosum tubers, and Vicia faba roots and made this dataset publicly available. Evaluations using the above datasets show that overall, MULocDeep outperforms other major methods at both subcellular and suborganellar levels. We also applied long short-term memory (LSTM) and multi-head self-attention in MULocDeep to pursue single amino acid level resolution when assessing each amino acid's contribution to localization. This provides insights into the mechanism of protein sorting and localization motifs. Many of the candidate sites or motifs match existing localization knowledge. A web server can be accessed at https://www.mu-loc.org/.
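To make the idea of hierarchical, multi-label output more concrete, the following is a minimal sketch of how a matrix of per-location scores could be read at two levels. It is not MULocDeep's actual implementation: the function name `predict_hierarchical`, the use of a row-wise maximum to obtain the subcellular score, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Dict, List


def predict_hierarchical(
    scores: Dict[str, Dict[str, float]], threshold: float = 0.5
) -> Dict[str, List[str]]:
    """Read a score matrix at the subcellular and suborganellar levels.

    `scores` maps each compartment (one matrix row) to the scores of its
    suborganellar locations (the entries of that row). All names and the
    pooling rule are hypothetical, chosen only to illustrate the hierarchy.
    """
    result: Dict[str, List[str]] = {}
    for compartment, sub_scores in scores.items():
        # Assumption: a compartment is as likely as its best sub-location.
        compartment_score = max(sub_scores.values())
        if compartment_score >= threshold:
            # Multi-label: keep every sub-location that passes the threshold.
            result[compartment] = sorted(
                name for name, p in sub_scores.items() if p >= threshold
            )
    return result


example_scores = {
    "mitochondrion": {"matrix": 0.9, "inner membrane": 0.6, "outer membrane": 0.1},
    "nucleus": {"nucleolus": 0.2, "nucleoplasm": 0.3},
}
predicted = predict_hierarchical(example_scores)
```

With the made-up scores above, only the mitochondrion row passes the threshold, and two of its sub-locations are reported together, which is the sense in which a single protein can receive multiple localizations.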

Information
2020
Vol 11 (6)
pp. 312
Author(s):
Asma Baccouche
Sadaf Ahmed
Daniel Sierra-Sosa
Adel Elmaghraby

Identifying internet spam has been a challenging problem for decades. Several solutions have succeeded in detecting spam comments in social media or fraudulent emails. However, an adequate strategy for filtering messages is difficult to achieve, as these messages resemble real communications. From the Natural Language Processing (NLP) perspective, Deep Learning models are a good alternative for classifying text once it has been preprocessed. In particular, Long Short-Term Memory (LSTM) networks are among the models that perform well on binary and multi-label text classification problems. In this paper, an approach merging two different data sources, one intended for Spam in social media posts and the other for Fraud classification in emails, is presented. We designed a multi-label LSTM model and trained it on the joint datasets, including text with common bigrams extracted from each independent dataset. The experimental results show that our proposed model is capable of identifying malicious text regardless of the source. The LSTM model trained with the merged dataset outperforms the models trained independently on each dataset.
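The step of selecting text with common bigrams from two datasets can be sketched as follows. This is one possible reading of that step, not the authors' code: the whitespace tokenizer, the helper names `bigrams`, `common_bigrams`, and `filter_by_bigrams`, and the toy sentences are all assumptions.

```python
from typing import Iterable, List, Set, Tuple

Bigram = Tuple[str, str]


def bigrams(text: str) -> List[Bigram]:
    """Return adjacent word pairs, using a naive lowercase whitespace split."""
    tokens = text.lower().split()
    return list(zip(tokens, tokens[1:]))


def common_bigrams(corpus_a: Iterable[str], corpus_b: Iterable[str]) -> Set[Bigram]:
    """Bigrams that occur at least once in both corpora."""
    set_a = {bg for doc in corpus_a for bg in bigrams(doc)}
    set_b = {bg for doc in corpus_b for bg in bigrams(doc)}
    return set_a & set_b


def filter_by_bigrams(corpus: Iterable[str], shared: Set[Bigram]) -> List[str]:
    """Keep only the documents that contain at least one shared bigram."""
    return [doc for doc in corpus if any(bg in shared for bg in bigrams(doc))]


# Toy data standing in for the two sources (hypothetical examples).
spam_posts = ["click here to win money", "hello friend"]
fraud_emails = ["please click here now", "send bank details"]

shared = common_bigrams(spam_posts, fraud_emails)
merged = filter_by_bigrams(spam_posts, shared) + filter_by_bigrams(fraud_emails, shared)
```

The merged list would then be tokenized and fed to the classifier; a real pipeline would likely also apply a frequency cutoff so that rare shared bigrams do not dominate the selection.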


2020
Author(s):
Yuexu Jiang
Duolin Wang
Yifu Yao
Holger Eubel
Patrick Künzler
...

Abstract: Prediction of protein localization plays an important role in understanding protein function and mechanism. In this paper, we propose a general deep learning-based localization prediction framework, MULocDeep, which can predict multiple localizations of a protein at both subcellular and suborganellar levels. We collected a dataset with 45 suborganellar localization annotations in 10 major subcellular compartments, the most comprehensive suborganelle localization dataset to date. We also experimentally generated an independent dataset of mitochondrial proteins in Arabidopsis thaliana cell cultures, Solanum tuberosum tubers, and Vicia faba roots and made this dataset publicly available. Evaluations using the above datasets show that overall MULocDeep outperforms other major methods at both subcellular and suborganellar levels. Furthermore, MULocDeep assesses each amino acid’s contribution to localization, which provides insights into the mechanism of protein sorting and localization motifs. A web server can be accessed at http://mu-loc.org/.


2020
Vol 12 (2) ◽  
pp. 84-99
Author(s):  
Li-Pang Chen

In this paper, we investigate the analysis and prediction of time-dependent data. We focus on four different stocks selected from the Yahoo Finance historical database. To build models and predict future stock prices, we consider three different machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN), and Support Vector Regression (SVR). By treating the close price, open price, daily low, daily high, adjusted close price, and trading volume as predictors in the machine learning methods, it can be shown that the prediction accuracy is improved.
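One common way to turn daily records with several predictors into training samples for models such as LSTM, CNN, or SVR is a sliding window over past days. The sketch below illustrates that general technique only; the paper's actual preprocessing is not described here, and the function name `make_windows`, the `lookback` parameter, and the toy numbers are assumptions.

```python
from typing import List, Sequence, Tuple


def make_windows(
    rows: Sequence[Sequence[float]], target_col: int, lookback: int
) -> List[Tuple[List[float], float]]:
    """Build (features, target) pairs from time-ordered daily records.

    Each row holds one day's predictors (for example close, open, low, high,
    adjusted close, volume). The features are the flattened values of the
    previous `lookback` days; the target is column `target_col` of the
    following day.
    """
    samples: List[Tuple[List[float], float]] = []
    for end in range(lookback, len(rows)):
        window = [value for row in rows[end - lookback:end] for value in row]
        samples.append((window, rows[end][target_col]))
    return samples


# Toy records with two predictors per day (hypothetical values).
daily = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
samples = make_windows(daily, target_col=0, lookback=2)
```

A flattened window like this suits SVR directly, while an LSTM or CNN would typically keep the window as a (days, predictors) array instead of flattening it.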


2020
Author(s):
Abdolreza Nazemi
Johannes Jakubik
Andreas Geyer-Schulz
Frank J. Fabozzi
