Location Proteomics: Systematic Determination of Protein Subcellular Location

Author(s):  
Justin Newberg ◽  
Juchang Hua ◽  
Robert F. Murphy
2019 ◽  
Vol 36 (6) ◽  
pp. 1908-1914 ◽  
Author(s):  
Ying-Ying Xu ◽  
Hong-Bin Shen ◽  
Robert F Murphy

Abstract Motivation: Systematic and comprehensive analysis of protein subcellular location as a critical part of proteomics (‘location proteomics’) has been studied for many years, but annotating protein subcellular locations and understanding variation of the location patterns across various cell types and states is still challenging. Results: In this work, we used immunohistochemistry images from the Human Protein Atlas as the source of subcellular location information, and built classification models for the complex protein spatial distribution in normal and cancerous tissues. The models can automatically estimate the fractions of protein in different subcellular locations, and can help to quantify the changes of protein distribution from normal to cancer tissues. In addition, we examined the extent to which different annotated protein pathways and complexes showed similarity in the locations of their member proteins, and then predicted new potential proteins for these networks. Availability and implementation: The dataset and code are available at: www.csbio.sjtu.edu.cn/bioinf/complexsubcellularpatterns. Supplementary information: Supplementary data are available at Bioinformatics online.


1986 ◽  
Vol 14 (4) ◽  
pp. 201-218 ◽  
Author(s):  
A. G. Veith

Abstract This four-part series of papers addresses the problem of systematic determination of the influence of several tire factors on tire treadwear. Both the main effect of each factor and some of their interactive effects are included. The program was also structured to evaluate the influence of some external-to-tire conditions on the relationship of tire factors to treadwear. Part I describes the experimental design used to evaluate the effects on treadwear of generic tire type, aspect ratio, tread pattern (groove or void level), type of pattern (straight rib or block), and tread compound. Construction procedures and precautions used to obtain a valid and functional test method are included. Two guiding principles to be used in the data analyses of Parts II and III are discussed. These are the fractional groove and void concept, to characterize tread pattern geometry, and a demonstration of the equivalence of wear rate for identical compounds on whole tread or multi-section tread tires.


2019 ◽  
Vol 14 (5) ◽  
pp. 406-421 ◽  
Author(s):  
Ting-He Zhang ◽  
Shao-Wu Zhang

Background: Revealing the subcellular location of a newly discovered protein can bring insight into its function and guide research at the cellular level. The experimental methods currently used to identify protein subcellular locations are both time-consuming and expensive. Thus, it is highly desirable to develop computational methods for efficiently and effectively identifying protein subcellular locations. In particular, the rapidly increasing number of protein sequences entering the genome databases has called for the development of automated analysis methods. Methods: In this review, we describe the recent advances in predicting protein subcellular locations with machine learning from the following aspects: i) Protein subcellular location benchmark dataset construction, ii) Protein feature representation and feature descriptors, iii) Common machine learning algorithms, iv) Cross-validation test methods and assessment metrics, v) Web servers. Result & Conclusion: With the large number of protein sequences generated by high-throughput technologies, four future directions for predicting protein subcellular locations with machine learning deserve attention. One direction is the selection of novel and effective features (e.g., statistical, physicochemical, evolutionary) from the sequences and structures of proteins. Another is the feature fusion strategy. The third is the design of a powerful predictor, and the fourth is the prediction of proteins with multiple location sites.
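As an illustration of the "protein feature representation" step named in this abstract, the following is a minimal sketch of one widely used sequence descriptor, amino acid composition. The abstract does not say which descriptors the review covers, so the choice of descriptor and the function name `amino_acid_composition` are assumptions made for the example.

```python
from collections import Counter

# The 20 standard amino acids, in a fixed order so every vector is comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Hypothetical helper for illustration; non-standard letters are ignored.
    """
    seq = sequence.upper()
    counts = Counter(residue for residue in seq if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        # No standard residues at all: return an all-zero vector.
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]
```

A vector of this kind can be passed to any standard classifier; richer descriptors (for example evolutionary profiles) would replace or extend it.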


2020 ◽  
Vol 15 (6) ◽  
pp. 517-527
Author(s):  
Yunyun Liang ◽  
Shengli Zhang

Background: Apoptosis proteins play a key role in the development and homeostasis of the organism, and are very important for understanding the mechanism of cell proliferation and death. The function of an apoptosis protein is closely related to its subcellular location. Objective: Prediction of apoptosis protein subcellular localization is a meaningful task. Methods: In this study, we predict apoptosis protein subcellular location using a PSSM-based second-order moving average descriptor, nonnegative matrix factorization based on Kullback-Leibler divergence, and over-sampling algorithms. The model is named SOMAPKLNMF-OS and is constructed on the ZD98, ZW225 and CL317 benchmark datasets. The support vector machine is adopted as the classifier, and the bias-free jackknife test is used to evaluate accuracy. Results: Our prediction system achieves favorable overall accuracy on the three datasets and outperforms the other listed models. Conclusion: The results show that our model offers a high-throughput tool for the identification of apoptosis protein subcellular localization.
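The jackknife test mentioned in this abstract is a leave-one-out evaluation: each protein is held out in turn, the model is trained on the rest, and the held-out protein is predicted. The sketch below shows that loop only. To stay dependency-free it swaps in a simple nearest-centroid classifier in place of the support vector machine used by the authors, and all function names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Stand-in classifier (NOT the paper's SVM): pick the closest class mean."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        if y not in sums:
            sums[y] = [0.0] * len(x)
            counts[y] = 0
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    best_label, best_dist = None, float("inf")
    for label in sorted(sums):
        centroid = [s / counts[label] for s in sums[label]]
        # Squared Euclidean distance from the query to this class centroid.
        dist = sum((c - q) ** 2 for c, q in zip(centroid, query))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def jackknife_accuracy(features, labels, predict=nearest_centroid_predict):
    """Leave-one-out accuracy: hold out each sample once, train on the rest."""
    if not features:
        raise ValueError("at least one sample is required")
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)
```

Any classifier with the same `predict(train_x, train_y, query)` shape can be passed in, so the evaluation loop is independent of the model being tested.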


Genes ◽  
2021 ◽  
Vol 12 (3) ◽  
pp. 451
Author(s):  
Pablo Mier ◽  
Miguel A. Andrade-Navarro

Low complexity regions (LCRs) in proteins are characterized by amino acid frequencies that differ from the average. These regions evolve faster and tend to be less conserved between homologs than globular domains. They are not common in bacteria, as compared to their prevalence in eukaryotes. Studying their conservation could help provide hypotheses about their function. To obtain the appropriate evolutionary focus for this rapidly evolving feature, here we study the conservation of LCRs in bacterial strains and compare their high variability to the closeness of the strains. For this, we selected 20 taxonomically diverse bacterial species and obtained the completely sequenced proteomes of two strains per species. We calculated all orthologous pairs for each of the 20 strain pairs. Per orthologous pair, we computed the conservation of two types of LCRs: compositionally biased regions (CBRs) and homorepeats (polyX). Our results show that, in bacteria, Q-rich CBRs are the most conserved, while A-rich CBRs and polyA are the most variable. LCRs have generally higher conservation when comparing pathogenic strains. However, this result depends on protein subcellular location: LCRs accumulate in extracellular and outer membrane proteins, with conservation increased in the extracellular proteins of pathogens, and decreased for polyX in the outer membrane proteins of pathogens. We conclude that these dependencies support the functional importance of LCRs in host–pathogen interactions.
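To make the homorepeat (polyX) notion concrete, here is a minimal sketch that reports runs of a single repeated residue. The abstract does not state the detection criteria used in the study, so the perfect-run definition, the `min_length` threshold of 4, and the function name `find_homorepeats` are all assumptions made for the example.

```python
import re


def find_homorepeats(sequence, min_length=4):
    """Return (residue, start, length) for each run of identical residues.

    Assumption: a homorepeat is an uninterrupted run of at least
    `min_length` copies of one residue; the study may use a looser rule.
    """
    if min_length < 1:
        raise ValueError("min_length must be at least 1")
    # (.) captures one residue; \1{n,} requires n or more further copies.
    pattern = re.compile(r"(.)\1{%d,}" % (min_length - 1))
    return [
        (match.group(1), match.start(), match.end() - match.start())
        for match in pattern.finditer(sequence)
    ]
```

Start positions are 0-based. Comparing the runs found in two orthologous sequences would be one simple way to ask whether a given polyX is conserved between strains.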


Amino Acids ◽  
2004 ◽  
Vol 28 (1) ◽  
pp. 57-61 ◽  
Author(s):  
X. Xiao ◽  
S. Shao ◽  
Y. Ding ◽  
Z. Huang ◽  
Y. Huang ◽  
...  
