A Novel Ensemble Machine Learning Model for Prediction of Zika Virus T-Cell Epitopes

Background: T lymphocytes mount an immune response by recognizing antigenic peptides (also known as T cell epitopes) presented by major histocompatibility complex (MHC) molecules. The immunogenicity of T cell epitopes depends on their source and on the stability of their complexes with MHC molecules. Peptide binding to MHC is the most selective step, so predicting peptide-MHC binding affinity is the principal step in predicting T cell epitopes. Identifying epitopes is of great significance for research on vaccine design and T cell immune responses. Objective: The traditional way to identify epitopes is to synthesize peptides and test their binding activity experimentally, which is both time-consuming and expensive. In silico methods for predicting peptide-MHC binding have emerged to pre-select candidate peptides for experimental testing, which greatly reduces time and cost. By summarizing and analyzing these methods, we hope to gain better insight and provide guidance for future directions. Methods: To date, a number of methods based on various principles have been developed to predict the binding of peptides to MHC. Some employ matrix models or machine learning models built on the sequence characteristics of peptides or MHC. Others use the three-dimensional structural information of peptides or MHC, for example by extracting structural features to construct a feature matrix or machine learning model, or by directly applying protein structure prediction or molecular docking to predict the binding mode of peptides and MHC. Results: Although methods based on a feature matrix or machine learning model can achieve high-throughput prediction, their accuracy depends heavily on the sequence characteristics of confirmed binding peptides. In addition, they cannot provide insight into the mechanism of antigen specificity. Such methods therefore have certain limitations in practical applications. Methods based on structure prediction or molecular docking are computationally intensive compared with those based on a feature matrix or machine learning model, and their main challenge is predicting a reliable structural model. Conclusion: This paper reviews the principles, advantages and disadvantages of peptide-MHC binding prediction methods and discusses future directions for achieving more accurate predictions.
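
The sequence-based matrix models mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not specify any particular model, so everything here is an assumption: the function names `build_pssm` and `score_peptide`, the uniform background frequency, the pseudocount, and the toy peptides (invented for illustration, not real binding data). The sketch builds a log-odds position-specific scoring matrix from known binders and scores a candidate peptide against it.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(binders, pseudocount=1.0):
    """Build a log-odds scoring matrix from equal-length binder peptides.

    Assumption: background frequencies are uniform (1/20 per residue).
    """
    length = len(binders[0])
    if any(len(p) != length for p in binders):
        raise ValueError("all peptides must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(binders) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        # Count how often each residue occurs at this position.
        counts = Counter(p[pos] for p in binders)
        column = {
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        }
        pssm.append(column)
    return pssm


def score_peptide(pssm, peptide):
    """Sum the per-position log-odds scores for one peptide."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length does not match the matrix")
    return sum(column[aa] for column, aa in zip(pssm, peptide))


# Hypothetical toy binders, invented for illustration only.
toy_binders = ["ALAKAAAAV", "ALDKAAAAL", "GLAKAAAAV", "ALAEAAAAV"]
toy_pssm = build_pssm(toy_binders)
similar_score = score_peptide(toy_pssm, "ALAKAAAAV")
dissimilar_score = score_peptide(toy_pssm, "WWWWWWWWW")
```

Because the matrix is estimated only from confirmed binders, a peptide that resembles them scores higher than one that does not, which also shows why the accuracy of such models depends so heavily on the training peptides.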


Towards an Ensemble Machine Learning Model of Random Subspace Based Functional Tree Classifier for Snow Avalanche Susceptibility Mapping

IEEE Access ◽ 10.1109/access.2020.3014816 ◽ 2020 ◽ Vol 8 ◽ pp. 145968-145983 ◽ Cited By ~ 3

Author(s): Amirhosein Mosavi ◽ Ataollah Shirzadi ◽ Bahram Choubin ◽ Fereshteh Taromideh ◽ Farzaneh Sajedi Hosseini ◽ ...

Keyword(s): Machine Learning ◽ Learning Model ◽ Susceptibility Mapping ◽ Snow Avalanche ◽ Random Subspace ◽ Ensemble Machine Learning ◽ Machine Learning Model ◽ Tree Classifier


Ensemble Machine Learning Model for Mortality Prediction Inside Intensive Care Unit

Studies in Computational Intelligence - Medical Informatics and Bioimaging Using Artificial Intelligence ◽ 10.1007/978-3-030-91103-4_14 ◽ 2021 ◽ pp. 245-258

Author(s): Nora El-Rashidy ◽ Shaker El-Sappagh ◽ Samir Abdelrazik ◽ Hazem El-Bakry

Keyword(s): Machine Learning ◽ Intensive Care Unit ◽ Intensive Care ◽ Learning Model ◽ Mortality Prediction ◽ Ensemble Machine Learning ◽ Machine Learning Model


An Ensemble Machine Learning Model for the Early Detection of Sepsis from Clinical Data

2019 Computing in Cardiology Conference (CinC) ◽ 10.22489/cinc.2019.317 ◽ 2019

Author(s): Mengsha Fu ◽ Jiabin Yuan ◽ Menglin Lu ◽ Pengfei Hong ◽ Mei Zeng

Keyword(s): Machine Learning ◽ Early Detection ◽ Clinical Data ◽ Learning Model ◽ Ensemble Machine Learning ◽ Machine Learning Model


Ensemble Machine Learning Model for Higher Learning Scholarship Award Decisions

International Journal of Advanced Computer Science and Applications ◽ 10.14569/ijacsa.2020.0110540 ◽ 2020 ◽ Vol 11 (5)

Author(s): Wirawati Dewi Ahmad ◽ Azuraliza Abu

Keyword(s): Machine Learning ◽ Learning Model ◽ Higher Learning ◽ Ensemble Machine Learning ◽ Machine Learning Model


An improved catalogue of putative synaptic genes defined by their temporal transcription profiles through an ensemble machine learning approach

10.21203/rs.2.9628/v1 ◽ 2019

Author(s): Flavio Pazos ◽ Pablo Soto ◽ Martín Palazzo ◽ Gustavo Guerberoff ◽ Patricio Yankilevich ◽ ...

Keyword(s): Machine Learning ◽ Empirical Data ◽ Predictive Power ◽ Learning Model ◽ Synaptic Function ◽ Training Set ◽ Training Scheme ◽ Transcription Profiles ◽ Ensemble Machine Learning ◽ Machine Learning Model

Abstract Background. Assembly and function of neuronal synapses require the coordinated expression of a yet undetermined set of genes. Previously, we trained an ensemble machine learning model to assign a probability of having synaptic function to every protein-coding gene in Drosophila melanogaster. This approach resulted in the publication of a catalogue of 893 genes that was postulated to be highly enriched in genes with still undocumented synaptic functions. Since then, the scientific community has experimentally identified 79 new synaptic genes. Here we used these new empirical data to evaluate the predictive power of the catalogue. We then implemented a series of improvements to the training scheme and the ensemble rules of our model and added the new synaptic genes to the training set, to obtain a new, enhanced catalogue of putative synaptic genes. Results. The retrospective analysis demonstrated that our original catalogue was indeed highly enriched in genes with unknown synaptic function. The changes to the training scheme and the ensemble rules resulted in a catalogue with better predictive power. Finally, by training this improved model with an updated training set that includes all the new synaptic genes, we obtained a new, enhanced catalogue of putative synaptic genes, which we present here together with a regularly updated version that will be available online at http://synapticgenes.bnd.edu.uy. Conclusions. We show that training a machine learning model solely with the whole-body temporal transcription profiles of known synaptic genes resulted in a catalogue with a significant enrichment in undiscovered synaptic genes. Using new empirical data, we validated our original approach, improved our model and obtained a better catalogue. The utility of this approach is that it reduces the number of genes to be tested through hypothesis-driven experimentation.
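
The workflow this abstract describes (combine several models into one per-gene probability, cut a catalogue, then check it against later experimental findings) can be sketched as follows. The abstract does not state the authors' ensemble rules, so plain averaging (soft voting) is swapped in here as an assumption, and the function names, threshold, gene identifiers and scores are all hypothetical.

```python
from statistics import mean


def ensemble_probability(model_scores):
    """Average per-model probabilities for genes scored by every model.

    Assumption: simple soft voting; the source does not give its rules.
    """
    if not model_scores:
        raise ValueError("at least one model is required")
    shared_genes = set.intersection(*(set(s) for s in model_scores))
    return {g: mean(s[g] for s in model_scores) for g in shared_genes}


def build_catalogue(probabilities, threshold=0.5):
    """Return genes at or above the threshold, highest probability first."""
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    return [gene for gene, p in ranked if p >= threshold]


def recall_of_new_genes(catalogue, newly_validated):
    """Fraction of later-validated genes that the catalogue already listed."""
    if not newly_validated:
        raise ValueError("newly_validated must not be empty")
    listed = set(catalogue)
    return sum(1 for g in newly_validated if g in listed) / len(newly_validated)


# Hypothetical scores from three models, invented for illustration only.
model_a = {"gene1": 0.9, "gene2": 0.2, "gene3": 0.6}
model_b = {"gene1": 0.8, "gene2": 0.4, "gene3": 0.5}
model_c = {"gene1": 0.7, "gene2": 0.3, "gene3": 0.1}

probabilities = ensemble_probability([model_a, model_b, model_c])
catalogue = build_catalogue(probabilities, threshold=0.5)
recall = recall_of_new_genes(catalogue, ["gene1", "gene2"])
```

A retrospective check of this kind only needs the old catalogue and the list of genes validated afterwards, which is why new empirical data can be used to evaluate a catalogue that was published earlier.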
