Attention mechanism-based deep learning pan-specific model for interpretable MHC-I peptide binding prediction

2019 ◽  
Author(s):  
Jing Jin ◽  
Zhonghao Liu ◽  
Alireza Nasiri ◽  
Yuxin Cui ◽  
Stephen Louis ◽  
...  

Abstract Accurate prediction of peptide binding affinity to major histocompatibility complex (MHC) proteins could enable the design of better therapeutic vaccines. Previous work has shown that pan-specific prediction algorithms can achieve better prediction performance than other approaches. However, most of the top algorithms are neural-network-based black-box models. Here, we propose DeepAttentionPan, an improved pan-specific model based on convolutional neural networks and attention mechanisms for more flexible, stable and interpretable MHC-I binding prediction. With the attention mechanism, our ensemble model consisting of 20 trained networks achieves high and more stable prediction performance. Extensive tests on IEDB’s weekly benchmark dataset show that our method achieves state-of-the-art prediction performance on 21 test allele datasets. Analysis of the peptide positional attention weights learned by our model demonstrates its capability to capture critical binding positions of the peptides, which leads to a mechanistic understanding of MHC-peptide binding that aligns closely with experimentally verified results. Furthermore, we show that with transfer learning, our pan model can be fine-tuned for alleles with few samples to achieve additional performance improvement. DeepAttentionPan is freely available as open source software at https://github.com/jjin49/DeepAttentionPan.
Author summary Human leukocyte antigen (HLA) proteins are classes of proteins responsible for immune system regulation in humans. Peptides are short chains of amino acids. The HLA class I group presents peptides from inside the cell on the cell surface for scrutiny by T cell receptors. For instance, if the cell is infected by a virus, the HLA system will bind peptides derived from viral proteins and bring them to the cell surface so that the cell can be destroyed by the immune system. Because the HLA genes exhibit extensive polymorphism, there are many HLA alleles that bind different peptides. This diversity presents challenges in predicting binders for different HLA alleles, which is important in vaccine design and in the characterization of immune responses. Before computational algorithms were used to predict the binding relationships of HLA-peptide pairs, scientists needed to conduct costly biological experiments for preliminary screening of candidate peptides and to use mutant experiments to identify the key peptide positions that contribute to binding. While previous computational methods have been proposed to predict binding affinity, the identification of binding anchors has not been well addressed. Here we developed deep neural network models with the attention mechanism to learn the binding relationships automatically in an end-to-end way. Our models are able to identify the important binding positions of the peptide sequence by learning the positional importance distribution, which previously had been studied mainly through costly experimental methods. Our model thus not only improves the performance of binding affinity prediction but also allows us to gain biological insight into the binding motifs of different alleles by interpreting the learned deep neural network models.
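The two ideas named in this abstract, positional attention weights and ensemble averaging, can be illustrated with a minimal sketch. This is not the DeepAttentionPan architecture: the function names, the 9-mer string, the per-position scores and the feature vectors below are all invented for illustration, and a real model would learn the scores from data.

```python
import math


def softmax(scores):
    """Convert real-valued scores into positive weights that sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(position_features, position_scores):
    """Return the attention-weighted sum of per-position feature vectors
    together with the attention weights themselves."""
    weights = softmax(position_scores)
    dim = len(position_features[0])
    pooled = [
        sum(w * feat[d] for w, feat in zip(weights, position_features))
        for d in range(dim)
    ]
    return pooled, weights


def ensemble_mean(predictions):
    """Average the outputs of several independently trained networks."""
    return sum(predictions) / len(predictions)


# Hypothetical inputs: an arbitrary 9-mer and made-up per-position scores.
peptide = "SLYNTVATL"
scores = [0.1, 2.0, 0.3, 0.0, 0.2, 0.1, 0.4, 0.3, 1.8]
features = [[float(i), 1.0] for i in range(len(peptide))]

pooled, weights = attention_pool(features, scores)

# The positions with the largest weights are the ones such a model would
# flag as most important for binding (0-based indices).
top_positions = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)[:2]
```

Because the weights are a probability distribution over peptide positions, they can be read directly as a positional importance profile, which is the kind of interpretation the abstract describes.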




2020 ◽  
Vol 13 (S11) ◽  
Author(s):  
Khandakar Tanvir Ahmed ◽  
Sunho Park ◽  
Qibing Jiang ◽  
Yunku Yeu ◽  
TaeHyun Hwang ◽  
...  

Abstract Background Drug sensitivity prediction and drug-responsive biomarker selection on high-throughput genomic data are critical steps in drug discovery. Many computational methods have been developed to serve this purpose, including several deep neural network models. However, the modular relations among genomic features have been largely ignored in these methods. To overcome this limitation, the role of the gene co-expression network in drug sensitivity prediction is investigated in this study. Methods In this paper, we first introduce a network-based method to identify representative features for drug response prediction by using the gene co-expression network. Then, two graph-based neural network models are proposed, both of which integrate gene network information directly into the neural network for outcome prediction. Next, we present a large-scale comparative study among the proposed network-based methods, canonical prediction algorithms (i.e., Elastic Net, Random Forest, Partial Least Squares Regression, and Support Vector Regression), and deep neural network models for drug sensitivity prediction. All the source code and processed datasets in this study are available at https://github.com/compbiolabucf/drug-sensitivity-prediction. Results In the comparison of different feature selection methods and prediction methods on a non-small cell lung cancer (NSCLC) cell line RNA-seq gene expression dataset with 50 different drug treatments, we found that (1) the network-based feature selection method improves the prediction performance compared to Pearson correlation coefficients; (2) Random Forest outperforms all the other canonical prediction algorithms and deep neural network models; (3) the proposed graph-based neural network models show better prediction performance compared to the deep neural network model; (4) the prediction performance is drug dependent and may relate to the drug’s mechanism of action.
Conclusions The network-based feature selection method and prediction models improve the performance of drug response prediction. The relations between the genomic features are more robust and stable than the correlation between each individual genomic feature and the drug response in high-dimensional, low-sample-size genomic datasets.
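The contrast drawn in this abstract, between ranking genes by their individual Pearson correlation with drug response and using relations among genes, can be sketched as follows. The abstract does not say how its co-expression network is built, so the thresholding rule, the function names, the gene labels and the toy numbers here are assumptions made purely for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def rank_by_response_correlation(expr, response):
    """Baseline: order genes by |r| between each gene and the drug response."""
    return sorted(expr, key=lambda g: abs(pearson(expr[g], response)), reverse=True)


def coexpression_edges(expr, threshold=0.8):
    """Assumed construction: connect two genes when |r| between their
    expression profiles reaches the threshold."""
    genes = sorted(expr)
    edges = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if abs(pearson(expr[g], expr[h])) >= threshold:
                edges.append((g, h))
    return edges


# Toy expression profiles over five hypothetical cell lines.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.1, 6.0, 8.2, 10.0],  # nearly proportional to geneA
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}
response = [0.9, 0.7, 0.6, 0.3, 0.1]  # made-up drug sensitivity values

ranking = rank_by_response_correlation(expr, response)
edges = coexpression_edges(expr)
```

The per-gene ranking depends on the response values of a handful of samples, while the edge list depends only on gene-gene relations, which is the property the Conclusions describe as more stable when samples are scarce.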



Author(s):  
Jingxian Li ◽  
Lixin Han ◽  
Xiaoshuang Li ◽  
Jun Zhu ◽  
Baohua Yuan ◽  
...  


ChemMedChem ◽  
2021 ◽  
Author(s):  
Christoph Grebner ◽  
Hans Matter ◽  
Daniel Kofink ◽  
Jan Wenzel ◽  
Friedemann Schmidt ◽  
...  


mBio ◽  
2017 ◽  
Vol 8 (6) ◽  
Author(s):  
Yushen Du ◽  
Tian-Hao Zhang ◽  
Lei Dai ◽  
Xiaojuan Zheng ◽  
Aleksandr M. Gorin ◽  
...  

ABSTRACT Certain “protective” major histocompatibility complex class I (MHC-I) alleles, such as B*57 and B*27, are associated with long-term control of HIV-1 in vivo mediated by the CD8+ cytotoxic-T-lymphocyte (CTL) response. However, the mechanism of such superior protection is not fully understood. Here we combined high-throughput fitness profiling of mutations in HIV-1 Gag, in silico prediction of MHC-peptide binding affinity, and analysis of intraperson virus evolution to systematically compare differences with respect to CTL escape mutations between epitopes targeted by protective MHC-I alleles and those targeted by nonprotective MHC-I alleles. We observed that the effects of mutations on both viral replication and MHC-I binding affinity are among the determinants of CTL escape. Mutations in Gag epitopes presented by protective MHC-I alleles are associated with significantly higher fitness cost and lower reductions in binding affinity with respect to MHC-I. A linear regression model accounting for the effect of mutations on both viral replicative capacity and MHC-I binding can explain the protective efficacy of MHC-I alleles. Finally, we found a consistent pattern in the evolution of Gag epitopes in long-term nonprogressors versus progressors. Overall, our results suggest that certain protective MHC-I alleles allow superior control of HIV-1 by targeting epitopes where mutations typically incur high fitness costs and small reductions in MHC-I binding affinity. IMPORTANCE Understanding the mechanism of viral control achieved in long-term nonprogressors with protective HLA alleles provides insights for developing a functional cure of HIV infection. Through the characterization of CTL escape mutations in infected persons, previous researchers hypothesized that protective alleles target epitopes where escape mutations significantly reduce viral replicative capacity. However, these studies were usually limited to a few mutations observed in vivo. 
Here we utilized our recently developed high-throughput fitness profiling method to quantitatively measure the fitness of mutations across the entirety of HIV-1 Gag. The data enabled us to integrate the results with in silico prediction of MHC-peptide binding affinity and analysis of intraperson virus evolution to systematically determine the differences in CTL escape mutations between epitopes targeted by protective HLA alleles and those targeted by nonprotective HLA alleles. We observed that the effects of Gag epitope mutations on HIV replicative fitness and MHC-I binding affinity are among the major determinants of CTL escape.
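The abstract mentions a linear regression model that accounts for two effects of a mutation: its impact on viral replicative capacity and its impact on MHC-I binding. A minimal sketch of a two-predictor least-squares fit is given below. The abstract does not give the model's exact variables or response, so the names `fitness_cost`, `binding_reduction` and `outcome`, and all of the numbers, are hypothetical and synthetic.

```python
def fit_two_predictor_ols(x1, x2, y):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2, solved in closed
    form from centered sums of squares and cross-products."""
    n = len(y)
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # zero when the predictors are collinear
    if det == 0:
        raise ValueError("predictors are collinear; coefficients are not unique")
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    b0 = my - b1 * m1 - b2 * m2
    return b0, b1, b2


# Synthetic data generated from outcome = 1 + 2*fitness_cost - 3*binding_reduction,
# so the fit should recover those three coefficients.
fitness_cost = [0.0, 1.0, 2.0, 3.0]
binding_reduction = [1.0, 0.0, 1.0, 0.0]
outcome = [-2.0, 3.0, 2.0, 7.0]

b0, b1, b2 = fit_two_predictor_ols(fitness_cost, binding_reduction, outcome)
```

In this toy setup a positive coefficient on the first predictor and a negative one on the second would correspond to the pattern the abstract reports, high fitness cost and small reductions in binding affinity, but the signs here follow only from how the synthetic data were generated.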





2021 ◽  
Author(s):  
Jesus Cano ◽  
Lorenzo Facila ◽  
Philip Langley ◽  
Roberto Zangroniz ◽  
Raul Alcaraz ◽  
...  


2020 ◽  
Vol 1662 ◽  
pp. 012010
Author(s):  
F Colecchia ◽  
J K Ruffle ◽  
G C Pombo ◽  
R Gray ◽  
H Hyare ◽  
...  

