Machine Learning-Based Ensemble Model for Zika Virus T-Cell Epitope Prediction

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Syed Nisar Hussain Bukhari ◽  
Amit Jain ◽  
Ehtishamul Haq ◽  
Moaiad Ahmad Khder ◽  
Rahul Neware ◽  
...  

Zika virus (ZIKV), the causative agent of Zika fever in humans, is an RNA virus that belongs to the genus Flavivirus. Currently, there is no approved vaccine for clinical use to combat ZIKV infection and contain the epidemic. Epitope-based peptide vaccines have a large untapped potential for boosting vaccination safety, cross-reactivity, and immunogenicity. Though many attempts have been made to develop vaccines for ZIKV, none has proved successful. Epitope-based peptide vaccines can act as powerful alternatives to conventional vaccines because of their low production cost and their lower reactogenicity and allergenicity. Because epitope-based vaccines are considered safe, selecting the antigenic T-cell epitopes is the essential step in designing an effective and viable epitope-based peptide vaccine against this deadly virus. An in silico machine-learning-based approach to ZIKV T-cell epitope prediction would save considerable experimental time and effort and speed up vaccine development compared with in vivo approaches. We trained a machine-learning-based computational model to predict novel ZIKV T-cell epitopes from the physicochemical properties of amino acids. The proposed ensemble model uses a voting mechanism that blends the predictions for each class (epitope or nonepitope) from each base classifier. The votes cast for each class by the individual classifiers are summed, and the class with the majority vote is predicted. An odd number of classifiers is used to avoid ties in the voting. A data set of experimentally determined ZIKV peptide sequences was collected from the Immune Epitope Database and Analysis Resource (IEDB) repository. The data set consists of 3,519 sequences, of which 1,762 are epitopes and 1,757 are nonepitopes. Sequence lengths range from 6-mer to 30-mer. For each sequence, we extracted 13 physicochemical features. The proposed ensemble model achieved sensitivity, specificity, Gini coefficient, AUC, precision, F-score, and accuracy of 0.976, 0.959, 0.993, 0.994, 0.989, 0.985, and 97.13%, respectively. To check the consistency of the model, we carried out five-fold cross-validation, which gave an average accuracy of 96.072%. Finally, a comparative analysis against existing methods on a separate validation data set suggested that the proposed ensemble model performs better. The proposed ensemble model can help predict novel ZIKV vaccine candidates to save lives globally and prevent future epidemic-scale outbreaks.
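The voting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three base classifiers are hypothetical threshold rules standing in for trained models, and the feature vectors and labels are made up (the paper uses 13 physicochemical features per peptide, which are not reproduced here). The sketch shows only hard majority voting over an odd number of classifiers and how sensitivity and specificity follow from the resulting predictions.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A base classifier maps a feature vector to a label: 1 = epitope, 0 = nonepitope.
Classifier = Callable[[Sequence[float]], int]


def majority_vote(classifiers: Sequence[Classifier], features: Sequence[float]) -> int:
    """Hard voting: return the class predicted by most base classifiers."""
    if len(classifiers) % 2 == 0:
        raise ValueError("use an odd number of classifiers so a binary vote cannot tie")
    votes = Counter(clf(features) for clf in classifiers)
    return votes.most_common(1)[0][0]


def sensitivity_specificity(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical stand-ins for trained models: threshold rules on two toy features.
base_classifiers: List[Classifier] = [
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[1] > 0.5),
    lambda x: int(x[0] + x[1] > 1.0),
]

# Made-up feature vectors and labels, for illustration only.
samples = [(0.9, 0.8), (0.9, 0.2), (0.1, 0.3), (0.4, 0.7)]
labels = [1, 1, 0, 0]

predictions = [majority_vote(base_classifiers, s) for s in samples]
sensitivity, specificity = sensitivity_specificity(labels, predictions)
print(predictions, sensitivity, specificity)
```

With two classes and an odd number of voters, one class always receives a strict majority, which is the reason the abstract gives for using an odd number of classifiers.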

2019 ◽  
Author(s):  
Sinu Paul ◽  
Nathan P. Croft ◽  
Anthony W. Purcell ◽  
David C. Tscharke ◽  
Alessandro Sette ◽  
...  

Abstract
T cell epitope candidates are commonly identified using computational prediction tools in order to enable applications such as vaccine design, cancer neoantigen identification, development of diagnostics, and removal of unwanted immune responses against protein therapeutics. Most T cell epitope prediction tools are based on machine learning algorithms trained on MHC binding or naturally processed MHC ligand elution data. The ability of currently available tools to predict T cell epitopes has not been comprehensively evaluated. In this study, we used a recently published dataset that systematically defined T cell epitopes recognized in vaccinia virus (VACV) infected mice, considering both peptides predicted to bind MHC and peptides experimentally eluted from infected cells, making this the most comprehensive dataset of T cell epitopes mapped in a complex pathogen. We evaluated how well all currently publicly available computational T cell epitope prediction tools identify these major epitopes among all peptides encoded in the VACV proteome. We found that all methods improved epitope identification above random, with the best performance achieved by neural network-based predictions trained on both MHC binding and MHC ligand elution data (NetMHCPan-4.0 and MHCFlurry). Impressively, these methods were able to capture more than half of the major epitopes in the top 0.04% (N = 277) of peptides in the VACV proteome (N = 767,788). These performance metrics provide guidance for immunologists as to which prediction methods to use. In addition, this benchmark was implemented in an open and easy-to-reproduce format, providing developers with a framework for future comparisons against new tools.

Author summary
Computational prediction tools are used to screen peptides to identify potential T cell epitope candidates. These tools, developed using machine learning methods, save time and resources in many immunological studies, including vaccine discovery and cancer neoantigen identification. In addition to the existing methods, several epitope prediction tools are currently being developed, but they lack a comprehensive and uniform evaluation to see which method performs best. In this study, we comprehensively evaluated publicly accessible MHC I restricted T cell epitope prediction tools using a recently published dataset of vaccinia virus epitopes. We found that methods based on artificial neural network architecture and trained on both MHC binding and ligand elution data showed very high performance (NetMHCPan-4.0 and MHCFlurry). This benchmark analysis will help immunologists choose the right prediction method for their work and will also serve as a framework for tool developers to evaluate new prediction methods.
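The "top 0.04% (N = 277)" figure in the abstract corresponds to a top-k recall measure: rank every peptide by predicted score and ask what fraction of the known epitopes falls within the first k. The Python sketch below illustrates that idea under stated assumptions. The function name `recall_at_top_k`, the peptide names, and the scores are hypothetical, and it assumes a higher score means a stronger prediction (tools that report percentile ranks, where lower is better, would need the sort order reversed). It is not the benchmark's actual code.

```python
from typing import Dict, Set


def recall_at_top_k(scores: Dict[str, float], epitopes: Set[str], k: int) -> float:
    """Fraction of known epitopes found among the k highest-scoring peptides."""
    ranked = sorted(scores, key=scores.get, reverse=True)  # best score first
    top_k = set(ranked[:k])
    return len(top_k & epitopes) / len(epitopes)


# Made-up peptides and scores, for illustration only.
toy_scores = {"PEP_A": 0.9, "PEP_B": 0.1, "PEP_C": 0.8, "PEP_D": 0.3}
toy_epitopes = {"PEP_A", "PEP_D"}
toy_recall = recall_at_top_k(toy_scores, toy_epitopes, k=2)

# Arithmetic check of the cutoff quoted in the abstract:
# 277 peptides out of 767,788 in the VACV proteome.
top_fraction = 277 / 767_788
print(toy_recall, f"{top_fraction:.4%}")
```

The computed fraction is about 0.036%, which rounds to the 0.04% quoted in the abstract.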


2010 ◽  
Vol 6 (Suppl 2) ◽  
pp. S4 ◽  
Author(s):  
Darren R Flower ◽  
Kanchan Phadwal ◽  
Isabel K Macdonald ◽  
Peter V Coveney ◽  
Matthew N Davies ◽  
...  

2002 ◽  
Vol 9 (3) ◽  
pp. 527-539 ◽  
Author(s):  
Myong-Hee Sung ◽  
Yingdong Zhao ◽  
Roland Martin ◽  
Richard Simon

2009 ◽  
Vol 5 (3) ◽  
pp. e1000327 ◽  
Author(s):  
Aidan MacNamara ◽  
Ulrich Kadolsky ◽  
Charles R. M. Bangham ◽  
Becca Asquith

2010 ◽  
Vol 6 (Suppl 2) ◽  
pp. S3 ◽  
Author(s):  
Claus Lundegaard ◽  
Ilka Hoof ◽  
Ole Lund ◽  
Morten Nielsen

2016 ◽  
Vol 432 ◽  
pp. 72-81 ◽  
Author(s):  
Ramgopal R. Mettu ◽  
Tysheena Charles ◽  
Samuel J. Landry

2021 ◽  
pp. 100122
Author(s):  
Brian Reardon ◽  
Zeynep Koşaloğlu-Yalçın ◽  
Sinu Paul ◽  
Bjoern Peters ◽  
Alessandro Sette

SciVee ◽  
2009 ◽  
Author(s):  
Aidan MacNamara ◽  
Ulrich Kadolsky ◽  
Charles Bangham ◽  
Becca Asquith
