A Review on the Methods of Peptide-MHC Binding Prediction

2021 ◽  
Vol 15 (8) ◽  
pp. 878-888
Author(s):  
Yang Liu ◽  
Xia-hui Ouyang ◽  
Zhi-Xiong Xiao ◽  
Le Zhang ◽  
Yang Cao

Background: T lymphocyte achieves an immune response by recognizing antigen peptides (also known as T cell epitopes) through major histocompatibility complex (MHC) molecules. The immunogenicity of T cell epitopes depends on their source and stability in combination with MHC molecules. The binding of the peptide to MHC is the most selective step, so predicting the binding affinity of the peptide to MHC is the principal step in predicting T cell epitopes. The identification of epitopes is of great significance in the research of vaccine design and T cell immune response. Objective: The traditional method for identifying epitopes is to synthesize and test the binding activity of peptide by experimental methods, which is not only time-consuming, but also expensive. In silico methods for predicting peptide-MHC binding emerge to pre-select candidate peptides for experimental testing, which greatly saves time and costs. By summarizing and analyzing these methods, we hope to have a better insight and provide guidance for future directions. Methods: Up to now, a number of methods have been developed to predict the binding ability of peptides to MHC based on various principles. Some of them employ matrix models or machine learning models based on the sequence characteristic embedded in peptides or MHC to predict the binding ability of peptides to MHC. Some others utilize the three-dimensional structural information of peptides or MHC, for example, by extracting three-dimensional structural information to construct a feature matrix or machine learning model, or directly using protein structure prediction, molecular docking to predict the binding mode of peptides and MHC. Results: Although the methods in predicting peptide-MHC binding based on the feature matrix or machine learning model can achieve high-throughput prediction, the accuracy of which depends heavily on the sequence characteristic of confirmed binding peptides. In addition, it cannot provide insights into the mechanism of antigen specificity. Therefore, such methods have certain limitations in practical applications. Methods in predicting peptide-MHC binding based on structural prediction or molecular docking are computationally intensive compared to the methods based on feature matrix or machine learning model and the challenge is how to predict a reliable structural model. Conclusion: This paper reviews the principles, advantages and disadvantages of the methods of peptide-MHC binding prediction and discussed the future directions to achieve more accurate predictions.

PeerJ ◽  
2020 ◽  
Vol 8 ◽  
pp. e10381
Author(s):  
Rohit Nandakumar ◽  
Valentin Dinu

Throughout the history of drug discovery, an enzymatic-based approach for identifying new drug molecules has been primarily utilized. Recently, protein–protein interfaces that can be disrupted to identify small molecules that could be viable targets for certain diseases, such as cancer and the human immunodeficiency virus, have been identified. Existing studies computationally identify hotspots on these interfaces, with most models attaining accuracies of ~70%. Many studies do not effectively integrate information relating to amino acid chains and other structural information relating to the complex. Herein, (1) a machine learning model has been created and (2) its ability to integrate multiple features, such as those associated with amino-acid chains, has been evaluated to enhance the ability to predict protein–protein interface hotspots. Virtual drug screening analysis of a set of hotspots determined on the EphB2-ephrinB2 complex has also been performed. The predictive capabilities of this model offer an AUROC of 0.842, sensitivity/recall of 0.833, and specificity of 0.850. Virtual screening of a set of hotspots identified by the machine learning model developed in this study has identified potential medications to treat diseases caused by the overexpression of the EphB2-ephrinB2 complex, including prostate, gastric, colorectal and melanoma cancers which are linked to EphB2 mutations. The efficacy of this model has been demonstrated through its successful ability to predict drug-disease associations previously identified in literature, including cimetidine, idarubicin, pralatrexate for these conditions. In addition, nadolol, a beta blocker, has also been identified in this study to bind to the EphB2-ephrinB2 complex, and the possibility of this drug treating multiple cancers is still relatively unexplored.


Mathematics ◽  
2021 ◽  
Vol 9 (5) ◽  
pp. 476
Author(s):  
Mingming Zhang ◽  
Shurong Hao ◽  
Anping Hou

In order to obtain the aerodynamic loads of the vibrating blades efficiently, the eXterme Gradient Boosting (XGBoost) algorithm in machine learning was adopted to establish a three-dimensional unsteady aerodynamic force reduction model. First, the database for the unsteady aerodynamic response during the blade vibration was acquired through the numerical simulation of flow field. Then the obtained data set was trained by the XGBoost algorithm to set up the intelligent model of unsteady aerodynamic force for the three-dimensional blade. Afterwards, the aerodynamic load could be gained at any spatial location during blade vibration. To evaluate and verify the reliability of the intelligent model for the blade aerodynamic load, the prediction results of the machine learning model were compared with the results of Computation Fluid Dynamics (CFD). The determination coefficient R2 and the Root Mean Square Error (RMSE) were introduced as the model evaluation indicators. The results show that the prediction results based on the machine learning model are in good agreement with the CFD results, and the calculation efficiency is significantly improved. The results also indicate that the aerodynamic intelligent model based on the machine learning method is worthy of further study in evaluating the blade vibration stability.


Author(s):  
J. Kim ◽  
J. Y. Jun ◽  
M. Hong ◽  
H. Shim ◽  
J. Ahn

<p><strong>Abstract.</strong> In the past few decades, a number of scholars studied painting classification based on image processing or computer vision technologies. Further, as the machine learning technology rapidly developed, painting classification using machine learning has been carried out. However, due to the lack of information about brushstrokes in the photograph, typical models cannot use more precise information of the painters painting style. We hypothesized that the visualized depth information of brushstroke is effective to improve the accuracy of the machine learning model for painting classification. This study proposes a new data utilization approach in machine learning with Reflectance Transformation Imaging (RTI) images, which maximizes the visualization of a three-dimensional shape of brushstrokes. Certain artist’s unique brushstrokes can be revealed in RTI images, which are difficult to obtain with regular photographs. If these new types of images are applied as data to train in with the machine learning model, classification would be conducted including not only the shape of the color but also the depth information. We used the Convolution Neural Network (CNN), a model optimized for image classification, using the VGG-16, ResNet-50, and DenseNet-121 architectures. We conducted a two-stage experiment using the works of two Korean artists. In the first experiment, we obtained a key part of the painting from RTI data and photographic data. In the second experiment on the second artists work, a larger quantity of data are acquired, and the whole part of the artwork was captured. The result showed that RTI-trained model brought higher accuracy than Non-RTI trained model. In this paper, we propose a method which uses machine learning and RTI technology to analyze and classify paintings more precisely to verify our hypothesis.</p>


2015 ◽  
Vol 9s3 ◽  
pp. BBI.S29466 ◽  
Author(s):  
Heng Luo ◽  
Hao Ye ◽  
Hui Wen Ng ◽  
Lemming Shi ◽  
Weida Tong ◽  
...  

As major histocompatibility complexes in humans, the human leukocyte antigens (HLAs) have important functions to present antigen peptides onto T-cell receptors for immunological recognition and responses. Interpreting and predicting HLA-peptide binding are important to study T-cell epitopes, immune reactions, and the mechanisms of adverse drug reactions. We review different types of machine learning methods and tools that have been used for HLA-peptide binding prediction. We also summarize the descriptors based on which the HLA-peptide binding prediction models have been constructed and discuss the limitation and challenges of the current methods. Lastly, we give a future perspective on the HLA-peptide binding prediction method based on network analysis.


2021 ◽  
Author(s):  
Damien Farrell

AbstractA key step in the cellular adaptive immune response is the presentation of antigen to T cells. During this process short peptides processed from self or foreign proteins may be presented on the surface bound to MHC molecules for binding to T cell receptors. Those that bind and activate an immune response are called epitopes. Computational prediction of T cell epitopes has many applications in vaccine design and immuno-diagnostics. This is the basis of immunoinformatics which allows in silico screening of peptides before experiments are performed. The most effective approach is to estimate the binding affinity of a given peptide fragment to MHC class I or II molecules. With the availability of whole genomes for many microbial species it is now feasible to computationally screen whole proteomes for candidate peptides. epitopepredict is a programmatic framework and command line tool designed to aid this process. It provides access to multiple binding prediction algorithms under a single interface and scales for whole genomes using multiple target MHC alleles. A web interface is provided to assist visualization and filtering of the results. The software is freely available under an open source license from https://github.com/dmnfarrell/epitopepredict


2018 ◽  
Author(s):  
Steen Lysgaard ◽  
Paul C. Jennings ◽  
Jens Strabo Hummelshøj ◽  
Thomas Bligaard ◽  
Tejs Vegge

A machine learning model is used as a surrogate fitness evaluator in a genetic algorithm (GA) optimization of the atomic distribution of Pt-Au nanoparticles. The machine learning accelerated genetic algorithm (MLaGA) yields a 50-fold reduction of required energy calculations compared to a traditional GA.


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This proposed work is used to develop an improved and robust machine learning model for predicting Myocardial Infarction (MI) could have substantial clinical impact. Objectives: This paper explains how to build machine learning based computer-aided analysis system for an early and accurate prediction of Myocardial Infarction (MI) which utilizes framingham heart study dataset for validation and evaluation. This proposed computer-aided analysis model will support medical professionals to predict myocardial infarction proficiently. Methods: The proposed model utilize the mean imputation to remove the missing values from the data set, then applied principal component analysis to extract the optimal features from the data set to enhance the performance of the classifiers. After PCA, the reduced features are partitioned into training dataset and testing dataset where 70% of the training dataset are given as an input to the four well-liked classifiers as support vector machine, k-nearest neighbor, logistic regression and decision tree to train the classifiers and 30% of test dataset is used to evaluate an output of machine learning model using performance metrics as confusion matrix, classifier accuracy, precision, sensitivity, F1-score, AUC-ROC curve. Results: Output of the classifiers are evaluated using performance measures and we observed that logistic regression provides high accuracy than K-NN, SVM, decision tree classifiers and PCA performs sound as a good feature extraction method to enhance the performance of proposed model. From these analyses, we conclude that logistic regression having good mean accuracy level and standard deviation accuracy compared with the other three algorithms. AUC-ROC curve of the proposed classifiers is analyzed from the output figure.4, figure.5 that logistic regression exhibits good AUC-ROC score, i.e. around 70% compared to k-NN and decision tree algorithm. Conclusion: From the result analysis, we infer that this proposed machine learning model will act as an optimal decision making system to predict the acute myocardial infarction at an early stage than an existing machine learning based prediction models and it is capable to predict the presence of an acute myocardial Infarction with human using the heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent the heart disease.


Sign in / Sign up

Export Citation Format

Share Document