A New Subcellular Localization Predictor for Human Proteins Considering the Correlation of Annotation Features and Protein Multi-localization

Author(s):  
Hang Zhou ◽  
Yang Yang ◽  
Hong-Bin Shen
2020 ◽  
Vol 1 (2) ◽  
pp. 01-05
Author(s):  
Kuo Chou

In 2019, a very powerful web server, or AI (artificial intelligence) tool, was developed for predicting the subcellular localization of human proteins purely according to their information for multi-label systems, in which the same protein may appear in or travel between two or more locations, so that its identification requires a multi-label mark.


2020 ◽  
pp. 1-4
Author(s):  
Kuo-Chen Chou ◽  

In 2019, a very powerful web server, or AI (artificial intelligence) tool, was developed for predicting the subcellular localization of human proteins purely according to their information for multi-label systems [1], in which the same protein may appear in or travel between two or more locations, so that its identification requires a multi-label mark [2].
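
The "multi-label mark" described above can be pictured as a 0/1 indicator vector with one slot per location. The following is a minimal sketch of that idea only, not the cited web server's actual encoding; the location vocabulary and the helper names `encode_labels` and `decode_labels` are hypothetical and chosen purely for illustration.

```python
# Hypothetical location vocabulary, chosen only for illustration.
LOCATIONS = ["cytoplasm", "nucleus", "mitochondrion", "plasma membrane"]


def encode_labels(locations):
    """Turn a collection of location names into a 0/1 indicator vector."""
    unknown = set(locations) - set(LOCATIONS)
    if unknown:
        raise ValueError(f"unknown locations: {sorted(unknown)}")
    return [1 if loc in locations else 0 for loc in LOCATIONS]


def decode_labels(vector):
    """Recover the location names from an indicator vector."""
    return [loc for loc, flag in zip(LOCATIONS, vector) if flag]


# A protein annotated at two sites gets two 1s in its label vector.
vec = encode_labels({"cytoplasm", "nucleus"})
names = decode_labels(vec)
```

A single-label predictor would have to pick one of the two sites here; the indicator vector keeps both.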


2020 ◽  
Vol 21 (7) ◽  
pp. 546-557
Author(s):  
Rahul Semwal ◽  
Pritish Kumar Varadwaj

Aims: To develop a tool that can annotate the subcellular localization of human proteins.
Background: With the progress of high-throughput human proteomics projects, an enormous amount of protein sequence data has been generated in the recent past. These raw sequence data require precise mapping and annotation of their biological roles and functional attributes. The functional characteristics of protein molecules depend strongly on their subcellular localization (compartment). Therefore, a fully automated and reliable protein subcellular localization prediction system would be very useful for current proteomic research.
Objective: To develop a machine learning-based predictive model that can annotate the subcellular localization of human proteins with high accuracy and precision.
Methods: In this study, we used the PSI-CD-HIT homology criterion and sequence-based features of protein sequences to develop a powerful subcellular localization predictive model. The dataset used to train the HumDLoc model was extracted from a reliable data source, the UniProt Knowledgebase, which helps the model generalize to unseen data.
Results: The proposed model, HumDLoc, was compared with two of the most widely used techniques, CELLO and DeepLoc, and with other machine learning-based tools. HumDLoc showed promising predictive performance across several evaluation metrics: accuracy (≥97.00%), precision (≥0.86), recall (≥0.89), MCC score (≥0.86), area under the ROC curve (0.98), and area under the precision-recall curve (0.93).
Conclusion: HumDLoc outperformed several alternative tools in correctly predicting the subcellular localization of human proteins. HumDLoc is hosted as a web-based tool at https://bioserver.iiita.ac.in/HumDLoc/.
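
For readers unfamiliar with the metrics listed in this abstract, the sketch below shows how accuracy, precision, recall, and MCC follow from a binary confusion matrix. It is an illustration under stated assumptions: the function name `binary_metrics` and the counts are invented, and the abstract does not say how HumDLoc's per-class scores were computed or averaged.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Compute standard metrics from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "mcc": mcc}


# Made-up counts for one location treated as "positive" against all others.
m = binary_metrics(tp=90, fp=10, fn=10, tn=890)
```

With these invented counts, accuracy is 0.98 while precision, recall, and MCC sit near 0.9. This is one reason a report may quote accuracy alongside the other three when the classes are imbalanced.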


2006 ◽  
Vol 04 (01) ◽  
pp. 1-18 ◽  
Author(s):  
JOHN HAWKINS ◽  
MIKAEL BODÉN

This paper presents a composite multi-layer classifier system for predicting the subcellular localization of proteins based on their amino acid sequence. The work is an extension of our previous predictor, PProwler v1.1, which is itself built upon the series of predictors SignalP and TargetP. In this study we outline experiments conducted to improve the classifier design. The major improvement came from using support vector machines as a "smart gate" sorting the outputs of several different targeting peptide detection networks. Our final model (PProwler v1.2) gives MCC values of 0.873 for non-plant and 0.849 for plant proteins. The model improves upon the accuracy of our previous subcellular localization predictor (PProwler v1.1) by 2% for plant data (which represents a 7.5% improvement upon TargetP).


1999 ◽  
Vol 354 (1386) ◽  
pp. 1061-1067 ◽  
Author(s):  
J. C. Dorsman ◽  
M. A. Smoor ◽  
M. L. C. Maat Schieman ◽  
M. Bout ◽  
S. Siesling ◽  
...  

Huntington's disease (HD) is a neurodegenerative disorder with a midlife onset. The disease is caused by expansion of a CAG (glutamine) repeat within the coding region of the HD gene. The molecular mechanism by which the mutated protein causes this disease is still unclear. To study the protein we have generated a set of rabbit polyclonal antibodies raised against different segments of the N-terminal, central and C-terminal parts of the protein. The polyclonal antibodies were affinity purified and characterized in ELISA and Western blotting experiments. All antibodies can react with the mouse and human proteins. The specificity of these antibodies is underscored by their recognition of huntingtin with different repeat sizes in extracts prepared from patient-derived lymphoblasts. The antibodies were used in immunofluorescence experiments to study the subcellular localization of huntingtin in mouse neuroblastoma N1E-115 cells. The results indicate that most huntingtin is present in the cytoplasm, whereas a minor fraction is present in the nucleus. On differentiation of the N1E-115 cells in vitro, the subcellular distribution of huntingtin does not change significantly. These results suggest that full-length huntingtin with a normal repeat length can be detected in the nucleus of cycling and non-cycling cultured mammalian cells of neuronal origin. However, in HD autopsy brain the huntingtin-containing neuronal intranuclear inclusions can be detected only with antibodies raised against the N-terminus of huntingtin. Thus several forms of huntingtin display the propensity for nuclear localization, possibly with different functional consequences.

