scholarly journals Immunogenic potential of neopeptides depends on parent protein subcellular location

2021 ◽  
Author(s):  
Andrea Castro ◽  
Sahar Kaabinejadian ◽  
William Hildebrand ◽  
Maurizio Zanetti ◽  
Hannah Carter

Antigen presentation via the major histocompatibility complex (MHC) is essential for anti-tumor immunity, however the rules that determine what tumor-derived peptides will be immunogenic are still incompletely understood. Here we investigate whether protein subcellular location driven constraints on accessibility of peptides to the MHC associate with potential for peptide immunogenicity. Analyzing over 380,000 of peptides from studies of MHC presentation and peptide immunogenicity, we find clear spatial biases in both eluted and immunogenic peptides. We find that including parent protein location improves prediction of peptide immunogenicity in multiple datasets. In human immunotherapy cohorts, location was associated with response to a neoantigen vaccine, and immune checkpoint blockade responders generally had a higher burden of neopeptides from accessible locations. We conclude that protein subcellular location adds important information for optimizing immunotherapies.

2019 ◽  
Vol 14 (5) ◽  
pp. 406-421 ◽  
Author(s):  
Ting-He Zhang ◽  
Shao-Wu Zhang

Background: Revealing the subcellular location of a newly discovered protein can bring insight into their function and guide research at the cellular level. The experimental methods currently used to identify the protein subcellular locations are both time-consuming and expensive. Thus, it is highly desired to develop computational methods for efficiently and effectively identifying the protein subcellular locations. Especially, the rapidly increasing number of protein sequences entering the genome databases has called for the development of automated analysis methods. Methods: In this review, we will describe the recent advances in predicting the protein subcellular locations with machine learning from the following aspects: i) Protein subcellular location benchmark dataset construction, ii) Protein feature representation and feature descriptors, iii) Common machine learning algorithms, iv) Cross-validation test methods and assessment metrics, v) Web servers. Result & Conclusion: Concomitant with a large number of protein sequences generated by highthroughput technologies, four future directions for predicting protein subcellular locations with machine learning should be paid attention. One direction is the selection of novel and effective features (e.g., statistics, physical-chemical, evolutional) from the sequences and structures of proteins. Another is the feature fusion strategy. The third is the design of a powerful predictor and the fourth one is the protein multiple location sites prediction.


2020 ◽  
Vol 15 (6) ◽  
pp. 517-527
Author(s):  
Yunyun Liang ◽  
Shengli Zhang

Background: Apoptosis proteins have a key role in the development and the homeostasis of the organism, and are very important to understand the mechanism of cell proliferation and death. The function of apoptosis protein is closely related to its subcellular location. Objective: Prediction of apoptosis protein subcellular localization is a meaningful task. Methods: In this study, we predict the apoptosis protein subcellular location by using the PSSMbased second-order moving average descriptor, nonnegative matrix factorization based on Kullback-Leibler divergence and over-sampling algorithms. This model is named by SOMAPKLNMF- OS and constructed on the ZD98, ZW225 and CL317 benchmark datasets. Then, the support vector machine is adopted as the classifier, and the bias-free jackknife test method is used to evaluate the accuracy. Results: Our prediction system achieves the favorable and promising performance of the overall accuracy on the three datasets and also outperforms the other listed models. Conclusion: The results show that our model offers a high throughput tool for the identification of apoptosis protein subcellular localization.


Genes ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 451
Author(s):  
Pablo Mier ◽  
Miguel A. Andrade-Navarro

Low complexity regions (LCRs) in proteins are characterized by amino acid frequencies that differ from the average. These regions evolve faster and tend to be less conserved between homologs than globular domains. They are not common in bacteria, as compared to their prevalence in eukaryotes. Studying their conservation could help provide hypotheses about their function. To obtain the appropriate evolutionary focus for this rapidly evolving feature, here we study the conservation of LCRs in bacterial strains and compare their high variability to the closeness of the strains. For this, we selected 20 taxonomically diverse bacterial species and obtained the completely sequenced proteomes of two strains per species. We calculated all orthologous pairs for each of the 20 strain pairs. Per orthologous pair, we computed the conservation of two types of LCRs: compositionally biased regions (CBRs) and homorepeats (polyX). Our results show that, in bacteria, Q-rich CBRs are the most conserved, while A-rich CBRs and polyA are the most variable. LCRs have generally higher conservation when comparing pathogenic strains. However, this result depends on protein subcellular location: LCRs accumulate in extracellular and outer membrane proteins, with conservation increased in the extracellular proteins of pathogens, and decreased for polyX in the outer membrane proteins of pathogens. We conclude that these dependencies support the functional importance of LCRs in host–pathogen interactions.


Amino Acids ◽  
2004 ◽  
Vol 28 (1) ◽  
pp. 57-61 ◽  
Author(s):  
X. Xiao ◽  
S. Shao ◽  
Y. Ding ◽  
Z. Huang ◽  
Y. Huang ◽  
...  

Sign in / Sign up

Export Citation Format

Share Document