Positional SHAP for Interpretation of Deep Learning Models Trained from Biological Sequences


10.1101/2021.03.04.433939 ◽

2021 ◽

Author(s):

Quinn Dickinson ◽

Jesse G. Meyer

Keyword(s):

Deep Learning ◽

Rhesus Macaque ◽

Short Term Memory ◽

Peptide Binding ◽

Disease Diagnosis ◽

Biological Sequences ◽

Mhc I ◽

Binding Motifs ◽

Model Interpretation ◽

Biological Phenomena

Abstract: Machine learning with artificial neural networks, also known as “deep learning”, accurately predicts biological phenomena such as disease diagnosis and protein structure. Despite this accuracy, model interpretation remains a challenge, especially for recurrent neural network architectures because of their sequential input data. Here we train multi-output long short-term memory (LSTM) regression models to predict peptide binding affinity to five rhesus macaque major histocompatibility complex (MHC) I alleles. We adapt SHapley Additive exPlanations (SHAP) to generate positional model interpretations of which amino acids are important for peptide binding. These positional SHAP values reproduced known rhesus macaque MHC class I (Mamu-A1*001) peptide binding motifs and provided insights into inter-positional dependencies of peptide-MHC interactions. Positional SHAP should find widespread utility for interpreting a variety of models trained from biological sequences.
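The idea of attributing a prediction to individual sequence positions can be illustrated with a minimal sketch. It computes exact Shapley values per position for a toy scoring function. The function `toy_binding_score`, its anchor residues, and the single reference peptide used as background are all hypothetical and chosen only for illustration. This is not the authors' implementation and does not use the SHAP library.

```python
from itertools import combinations
from math import factorial


def toy_binding_score(peptide):
    """Hypothetical stand-in for a trained model (not a real predictor)."""
    score = 0.0
    if peptide[1] == "S":       # illustrative anchor at position 2
        score += 2.0
    if peptide[-1] == "L":      # illustrative C-terminal anchor
        score += 1.0
    if peptide[1] == "S" and peptide[-1] == "L":
        score += 0.5            # illustrative inter-positional interaction
    return score


def positional_shapley(peptide, background, model):
    """Exact Shapley value of each position, relative to a reference peptide.

    Positions outside the coalition are filled with the residue from
    `background` (an assumption; a background dataset could be used instead).
    Cost grows as 2**len(peptide), so this only suits short sequences.
    """
    n = len(peptide)

    def value(subset):
        mixed = "".join(
            peptide[i] if i in subset else background[i] for i in range(n)
        )
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            # Shapley weight for coalitions of size k that exclude position i
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


if __name__ == "__main__":
    values = positional_shapley("ASAAL", "AAAAA", toy_binding_score)
    for position, v in enumerate(values, start=1):
        print(position, round(v, 3))
```

In this toy case the per-position values sum to the difference between the score of the peptide and the score of the reference, and the interaction bonus is split evenly between the two positions that jointly cause it.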


Connecting MHC-I-binding motifs with HLA alleles via deep learning

10.1101/2021.04.18.440359 ◽

2021 ◽

Author(s):

Ko-Han Lee ◽

Yu-Chuan Chang ◽

Ting-Fu Chen ◽

Hsueh-Fen Juan ◽

Huai-Kuang Tsai ◽

...

Keyword(s):

Deep Learning ◽

Mhc I ◽

Binding Motifs ◽

Hla Alleles ◽

Antigen Discovery ◽

Mhc Molecules ◽

Binding Preference ◽

Allele Specificity ◽

Binding Peptides ◽

Selection Of

The selection of peptides presented by MHC molecules is crucial for antigen discovery. Previously, several predictors have shown impressive performance on binding affinity. However, the decisive MHC residues and their relation to the selection of binding peptides are still unrevealed. Here, we connected HLA alleles with binding motifs via our deep learning-based framework, MHCfovea. MHCfovea expanded the knowledge of MHC-I-binding motifs from 150 to 13,008 alleles. After clustering N-terminal and C-terminal sub-motifs on both observed and unobserved alleles, MHCfovea calculated the hyper-motifs and the corresponding allele signatures on the important positions to disclose the relation between binding motifs and MHC-I sequences. MHCfovea delivered 32 pairs of hyper-motifs and allele signatures (HLA-A: 13, HLA-B: 12, and HLA-C: 7). The paired hyper-motifs and allele signatures disclosed the critical polymorphic residues that determine the binding preference, which are believed to be valuable for antigen discovery and vaccine design when allele specificity is concerned.


Connecting MHC-I-binding motifs with HLA alleles via deep learning

Communications Biology ◽

10.1038/s42003-021-02716-8 ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Ko-Han Lee ◽

Yu-Chuan Chang ◽

Ting-Fu Chen ◽

Hsueh-Fen Juan ◽

Huai-Kuang Tsai ◽

...

Keyword(s):

Deep Learning ◽

Mhc I ◽

Binding Motifs ◽

Hla Alleles ◽

Antigen Discovery ◽

Mhc Molecules ◽

Binding Preference ◽

Allele Specificity ◽

Binding Peptides ◽

Selection Of

Abstract: The selection of peptides presented by MHC molecules is crucial for antigen discovery. Previously, several predictors have shown impressive performance on binding affinity. However, the decisive MHC residues and their relation to the selection of binding peptides are still unrevealed. Here, we connected HLA alleles with binding motifs via our deep learning-based framework, MHCfovea. MHCfovea expanded the knowledge of MHC-I-binding motifs from 150 to 13,008 alleles. After clustering N-terminal and C-terminal sub-motifs on both observed and unobserved alleles, MHCfovea calculated the hyper-motifs and the corresponding allele signatures on the important positions to disclose the relation between binding motifs and MHC-I sequences. MHCfovea delivered 32 pairs of hyper-motifs and allele signatures (HLA-A: 13, HLA-B: 12, and HLA-C: 7). The paired hyper-motifs and allele signatures disclosed the critical polymorphic residues that determine the binding preference, which are believed to be valuable for antigen discovery and vaccine design when allele specificity is concerned.


Deep learning pan‐specific model for interpretable MHC‐I peptide binding prediction with improved attention mechanism

Proteins Structure Function and Bioinformatics ◽

10.1002/prot.26065 ◽

2021 ◽

Author(s):

Jing Jin ◽

Zhonghao Liu ◽

Alireza Nasiri ◽

Yuxin Cui ◽

Stephen Louis ◽

...

Keyword(s):

Deep Learning ◽

Peptide Binding ◽

Specific Model ◽

Attention Mechanism ◽

Binding Prediction ◽

Mhc I ◽

Peptide Binding Prediction


Author response for "Detailed and atypical HLA‐E peptide binding motifs revealed by a novel peptide exchange binding assay"

10.1002/eji.202048719/v2/response1 ◽

2020 ◽

Author(s):

Lucy C. Walters ◽

Andrew J. McMichael ◽

Geraldine M. Gillespie

Keyword(s):

Binding Assay ◽

Peptide Binding ◽

Author Response ◽

Binding Motifs ◽

Peptide Exchange


Decision letter for "Detailed and atypical HLA‐E peptide binding motifs revealed by a novel peptide exchange binding assay"

10.1002/eji.202048719/v1/decision1 ◽

2020 ◽

Keyword(s):

Binding Assay ◽

Peptide Binding ◽

Binding Motifs ◽

Peptide Exchange


A Deep Learning based Arabic Script Recognition System: Benchmark on KHAT

The International Arab Journal of Information Technology ◽

10.34028/iajit/17/3/3 ◽

2020 ◽

Vol 17 (3) ◽

pp. 299-305 ◽

Cited By ~ 1

Author(s):

Riaz Ahmad ◽

Saeeda Naz ◽

Muhammad Afzal ◽

Sheikh Rashid ◽

Marcus Liwicki ◽

...

Keyword(s):

Deep Learning ◽

Character Recognition ◽

Data Augmentation ◽

Short Term Memory ◽

Recognition System ◽

Learning Approach ◽

Arabic Text ◽

Data Set ◽

Processing Step ◽

Handwritten Arabic

This paper presents a deep learning benchmark on a complex dataset known as KFUPM Handwritten Arabic TexT (KHATT). The KHATT dataset consists of complex patterns of handwritten Arabic text lines. The paper makes three main contributions: (1) pre-processing, (2) a deep learning-based approach, and (3) data augmentation. The pre-processing step prunes extra white space and de-skews the skewed text lines. We deploy a deep learning approach based on Multi-Dimensional Long Short-Term Memory (MDLSTM) networks and Connectionist Temporal Classification (CTC). The MDLSTM has the advantage of scanning the Arabic text lines in all directions (horizontal and vertical) to cover dots, diacritics, strokes and fine inflammation. Data augmentation combined with the deep learning approach improves the Character Recognition (CR) rate to 80.02%, compared with a baseline of 75.08%.


Barcoded Rational AAV Vector Evolution Enables Systematic In Vivo Mapping of Peptide Binding Motifs

SSRN Electronic Journal ◽

10.2139/ssrn.3245528 ◽

2018 ◽

Cited By ~ 1

Author(s):

Marcus Davidsson ◽

Gang Wang ◽

Patrick Aldrin-Kirk ◽

Tiago Cardoso ◽

Sara Nolbrant ◽

...

Keyword(s):

Peptide Binding ◽

Aav Vector ◽

Binding Motifs


Deep Learning in Disease Diagnosis: Models and Datasets

Current Bioinformatics ◽

10.2174/1574893615999201002124021 ◽

2020 ◽

Vol 15 ◽

Author(s):

Deeksha Saxena ◽

Mohammed Haris Siddiqui ◽

Rajnish Kumar

Keyword(s):

Biological Sciences ◽

Machine Learning ◽

Deep Learning ◽

Disease Diagnosis ◽

Learning Models ◽

Data Types ◽

Related Data ◽

Abstract Level ◽

Experimental Validations ◽

Selection Of

Background: Deep learning (DL) is an artificial neural network-driven framework with multiple levels of representation, in which non-linear modules are combined so that the representation is raised from lower levels to progressively more abstract ones. Though DL is used widely in almost every field, it has brought a particular breakthrough in the biological sciences, where it is used in disease diagnosis and clinical trials. DL can be combined with machine learning, but at times the two are used individually as well. DL seems to be a better platform than machine learning, as it does not require an intermediate feature extraction step and works well with larger datasets. DL is one of the most discussed fields among scientists and researchers these days for diagnosing and solving various biological problems. However, deep learning models need some improvement and experimental validation to be more productive. Objective: To review the available DL models and datasets that are used in disease diagnosis. Methods: Available DL models and their applications in disease diagnosis were reviewed, discussed, and tabulated. Types of datasets and some of the popular disease-related data sources for DL were highlighted. Results: We analyzed the frequently used DL methods and data types and discussed some of the recent deep learning models used for solving different biological problems. Conclusion: The review presents useful insights about DL methods, data types, and the selection of DL models for disease diagnosis.


Human Activity Recognition using Fourier Transform Inspired Deep Learning Combination Model

International Journal of Sensors Wireless Communications and Control ◽

10.2174/2210327908666180727123657 ◽

2019 ◽

Vol 9 (1) ◽

pp. 16-31

Author(s):

Kyungkoo Jun

Keyword(s):

Fourier Transform ◽

Deep Learning ◽

Short Term Memory ◽

Window Size ◽

Sensor Data ◽

Data Sets ◽

Data Set ◽

Proposed Model ◽

Testing Data ◽

Labeling Scheme

Background & Objective: This paper proposes a Fourier transform-inspired method to classify human activities from time-series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, motivated by the Fourier transform. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependency in the signal and produces encoded sequences. The sequences, once arranged into a 2D array, can represent the fingerprints of the signals. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is therefore a combination of LSTM and CNN. We evaluate the model on two data sets. For the first data set, which is more standardized than the other, our model outperforms or at least equals previous work. For the second data set, we devise schemes to generate training and testing data by varying the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that accuracy exceeds 95% in some cases. We also analyze the effect of the parameters on performance.
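How the window size, sliding size, and labeling scheme shape the training data can be shown with a minimal sketch. The function name `sliding_windows` and the majority-vote labeling rule are assumptions made for illustration only. The abstract does not state which labeling schemes the author compared.

```python
from collections import Counter


def sliding_windows(signal, labels, window_size, slide_size):
    """Cut a 1D signal into fixed-size windows, one label per window.

    The label of a window is the most frequent per-sample label inside it
    (a hypothetical labeling scheme, assumed here for illustration).
    A trailing segment shorter than `window_size` is dropped.
    """
    windows = []
    for start in range(0, len(signal) - window_size + 1, slide_size):
        end = start + window_size
        majority = Counter(labels[start:end]).most_common(1)[0][0]
        windows.append((signal[start:end], majority))
    return windows


if __name__ == "__main__":
    signal = list(range(10))
    labels = ["walk"] * 6 + ["run"] * 4
    for segment, label in sliding_windows(signal, labels, 4, 3):
        print(segment, label)
```

A smaller sliding size yields more, heavily overlapping windows, while a larger window size makes it more likely that one window spans two activities, which is where the choice of labeling scheme starts to matter.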


Improving Sentiment Analysis using Hybrid Deep Learning Model

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666190328200012 ◽

2020 ◽

Vol 13 (4) ◽

pp. 627-640 ◽

Cited By ~ 1

Author(s):

Avinash Chandra Pandey ◽

Dharmveer Singh Rajpoot

Keyword(s):

Neural Network ◽

Deep Learning ◽

Sentiment Analysis ◽

Classification Accuracy ◽

Short Term Memory ◽

Computational Cost ◽

Extraction Process ◽

Learning Model ◽

Sentiment Classification ◽

Deep Learning Model

Background: Sentiment analysis is the contextual mining of text to determine users' viewpoints on topics commonly discussed on social networking websites. Twitter is one such site, where people express their opinions on any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinions of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. Manual feature extraction is a complicated task since it requires predefined sentiment lexicons. Deep learning methods, on the other hand, automatically extract relevant features from data; hence, they provide better performance and richer representational capacity than the traditional methods. Objective: The main aim of this paper is to enhance sentiment classification accuracy and to reduce the computational cost. Method: To achieve this objective, a hybrid deep learning model based on a convolutional neural network and a bi-directional long short-term memory neural network is introduced. Results: The proposed sentiment classification method achieves the highest accuracy on most of the datasets. Further, the efficacy of the proposed method has been validated by statistical analysis. Conclusion: Sentiment classification accuracy can be improved by creating veracious hybrid models. Moreover, performance can also be enhanced by tuning the hyperparameters of deep learning models.
