protein subcellular localization
Recently Published Documents


TOTAL DOCUMENTS

315
(FIVE YEARS 66)

H-INDEX

44
(FIVE YEARS 6)

2022 ◽  
Author(s):  
Aayush Grover ◽  
Laurent Gatto

Protein subcellular localization prediction plays a crucial role in improving our understanding of different diseases and consequently assists in building drug targeting and drug development pipelines. Proteins are known to co-exist at multiple subcellular locations, which makes the task of prediction extremely challenging. A protein interaction network is a graph that captures interactions between different proteins. It is safe to assume that if two proteins are interacting, they must share some subcellular locations. In this regard, we propose ProtFinder - the first deep learning-based model that exclusively relies on protein interaction networks to predict the multiple subcellular locations of proteins. We also integrate biological priors like the cellular component of Gene Ontology to make ProtFinder a more biology-aware intelligent system. ProtFinder is trained and tested using the STRING and BioPlex databases, whereas the annotations of proteins are obtained from the Human Protein Atlas. Our model gives an AUC-ROC score of 90.00% and an MCC score of 83.42% on a held-out set of proteins. We also apply ProtFinder to annotate proteins that currently do not have confident location annotations. We observe that ProtFinder is able to confirm some of these unreliable location annotations, while in some cases complementing the existing databases with novel location annotations.
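
The assumption that interacting proteins share locations can be illustrated with a simple neighbor-voting baseline. The sketch below is not ProtFinder and not the authors' method; the function name `neighbor_vote`, the 0.5 threshold, and the toy proteins and annotations are all hypothetical, chosen only to show the idea on a tiny interaction graph.

```python
from collections import defaultdict


def neighbor_vote(edges, known_locations, protein, threshold=0.5):
    """Return {location: vote share} for locations held by at least
    `threshold` of the protein's annotated interaction partners."""
    # Build an undirected adjacency map from the edge list.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # Only partners with known annotations can vote.
    voters = [n for n in neighbors[protein] if n in known_locations]
    if not voters:
        return {}

    # Count how many voters carry each location.
    counts = defaultdict(int)
    for n in voters:
        for loc in known_locations[n]:
            counts[loc] += 1

    # Multi-label output: keep every location that reaches the threshold.
    shares = {loc: c / len(voters) for loc, c in counts.items()}
    return {loc: s for loc, s in shares.items() if s >= threshold}


# Toy data: protein names and annotations are made up for illustration.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3")]
known = {
    "P2": {"nucleus", "cytosol"},
    "P3": {"nucleus"},
    "P4": {"nucleus", "mitochondria"},
}
prediction = neighbor_vote(edges, known, "P1")
print(prediction)
```

Here all three partners of the unannotated protein carry the nucleus label, so it is the only location that passes the threshold. A learned model would replace this fixed voting rule with trained weights over the network.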


2021 ◽  
Vol 2021 ◽  
pp. 1-14
Author(s):  
Danyu Jin ◽  
Ping Zhu

The prediction of protein subcellular localization is not only important for the study of protein structure and function but can also facilitate the design and development of new drugs. In recent years, feature extraction methods based on protein evolution information have attracted much attention and made good progress. Based on the protein position-specific score matrix (PSSM) obtained by PSI-BLAST, the PSSM-GSD method is proposed according to the data distribution characteristics. In order to reflect the protein sequence information as much as possible, the AAO, PSSM-AAO, and PSSM-GSD methods are fused together. Then, a conditional entropy-based classifier chain algorithm and a support vector machine are used to locate multilabel proteins. Finally, we test on the Gpos-mPLoc and Gneg-mPLoc datasets and, considering the severe imbalance of the data, use the SMOTE algorithm to expand the minority samples; the experiment shows that the AAO + PSSM ∗ method in the paper achieved 83.1% and 86.8% overall accuracy, respectively. After experimental comparison of different methods, AAO + PSSM ∗ performs well and can effectively predict protein subcellular location.
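
The SMOTE step mentioned above creates synthetic minority samples by interpolating between existing ones. The following is a minimal sketch of that general idea, assuming plain Euclidean distance on small feature vectors; the function name `smote`, its parameters, and the toy points are illustrative and do not reproduce the paper's pipeline or any library implementation.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points, each lying on the segment between a
    minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by distance to `base`.
        others = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sq_dist(base, p),
        )
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class: four corners of the unit square (made-up features).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote(minority, n_new=6)
print(new_points)
```

Because every synthetic point is a convex combination of two real minority samples, the new points stay inside the region the minority class already occupies.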


2021 ◽  
Author(s):  
Ruhollah Jamali ◽  
Soheil Jahangiri-Tazehkand ◽  
Changiz Eslahchi

Identifying a protein’s subcellular location is of great interest for understanding its function and behavior within the cell. In the last decade, many computational approaches have been proposed as a surrogate for the expensive and labor-intensive wet-lab methods used for protein subcellular localization. Yet, there is still much room for improving the prediction accuracy of these methods. In this article, we set out to develop a customized computational method rather than using the common machine learning predictors that appear in the majority of computational research on this topic. The neighbourhood regularized logistic matrix factorization technique was used to create PSL-Recommender (Protein subcellular location recommender), a GO-based predictor. We present statistical inference as the driving force behind PSL-Recommender. It was then benchmarked against twelve well-known methods using five different datasets, demonstrating outstanding performance. Finally, we discuss potential research avenues for developing a comprehensive tool for protein subcellular location prediction. The datasets and codes are available at: https://github.com/RJamali/PSL-Recommender


Life ◽  
2021 ◽  
Vol 11 (4) ◽  
pp. 293
Author(s):  
Warin Wattanapornprom ◽  
Chinae Thammarongtham ◽  
Apiradee Hongsthong ◽  
Supatcha Lertampaiporn

The accurate prediction of protein localization is a critical step in any functional genome annotation process. This paper proposes an improved strategy for protein subcellular localization prediction in plants based on multiple classifiers, to improve prediction results in terms of both accuracy and reliability. The prediction of plant protein subcellular localization is challenging because the underlying problem is not only a multiclass, but also a multilabel problem. Generally, plant proteins can be found in 10–14 locations/compartments. The number of proteins in some compartments (nucleus, cytoplasm, and mitochondria) is generally much greater than that in other compartments (vacuole, peroxisome, Golgi, and cell wall). As a result, the problem of imbalanced data usually arises. We therefore propose an ensemble machine learning method based on average voting among heterogeneous classifiers. We first extracted various types of features suitable for each type of protein localization to form a total of 479 feature spaces. Then, feature selection methods were used to reduce the dimensions of the features into smaller informative feature subsets. This reduced feature subset was then used to train three different individual models. To combine the three classifier models, we used an average voting approach that returns the final probability prediction. The method could predict subcellular localizations in both single- and multilabel locations, based on the voting probability. Experimental results indicated that the proposed ensemble method could achieve correct classification with an overall accuracy of 84.58% for 11 compartments, on the basis of the testing dataset.
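
Average voting among classifiers can be sketched as follows. This is only an illustration of the general technique: the function name `average_vote`, the 0.5 threshold, the fallback to the single best compartment, and the probabilities are assumptions, not details taken from the paper.

```python
def average_vote(prob_sets, threshold=0.5):
    """Average per-compartment probabilities across classifiers and
    return (averaged probabilities, predicted label set)."""
    compartments = prob_sets[0].keys()
    averaged = {
        c: sum(p[c] for p in prob_sets) / len(prob_sets) for c in compartments
    }
    # Multilabel case: keep every compartment at or above the threshold.
    labels = {c for c, p in averaged.items() if p >= threshold}
    # Single-label fallback (an assumption): take the top compartment.
    if not labels:
        labels = {max(averaged, key=averaged.get)}
    return averaged, labels


# Made-up outputs of three classifiers for one protein.
model_outputs = [
    {"nucleus": 0.9, "cytoplasm": 0.6, "vacuole": 0.1},
    {"nucleus": 0.7, "cytoplasm": 0.5, "vacuole": 0.2},
    {"nucleus": 0.8, "cytoplasm": 0.7, "vacuole": 0.0},
]
averaged, labels = average_vote(model_outputs)
print(averaged, labels)
```

With these toy numbers the averaged probabilities for nucleus and cytoplasm both pass the threshold, so the protein receives two labels, which is how a voting probability can serve single- and multilabel cases alike.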


2021 ◽  
Author(s):  
Hirofumi Kobayashi ◽  
Keith C Cheveralls ◽  
Manuel Leonetti ◽  
Loic Alain Royer

Elucidating the diversity and complexity of protein localization is essential to fully understand cellular architecture. Here, we present cytoself, a deep learning-based approach for fully self-supervised protein localization profiling and clustering. cytoself leverages a self-supervised training scheme that does not require pre-existing knowledge, categories, or annotations. Applying cytoself to images of 1311 endogenously labeled proteins from the recently released OpenCell database creates a highly resolved protein localization atlas. We show that the representations derived from cytoself encapsulate highly specific features that can be used to derive functional insights for proteins on the sole basis of their localization. Finally, to better understand the inner workings of our model, we dissect the emergent features from which our clustering is derived, interpret these features in the context of the fluorescence images, and analyze the performance contributions of the different components of our approach.


2021 ◽  
Author(s):  
Yinze Han ◽  
Hailong Gao ◽  
Bing Han ◽  
Jinzhi Xu ◽  
Jian Luo ◽  
...  

Background: Microsporidia, a group of obligate intracellular parasites that can infect humans and nearly all animals, have lost the pathways for de novo amino acid, lipid and nucleotide synthesis and instead evolved strategies to manipulate host metabolism and immunity. The endoplasmic reticulum (ER) is a vital organelle for producing and processing proteins and lipids and is often hijacked by intracellular pathogens. However, little is known about how microsporidia modulate host ER pathways. Herein, we identified a secreted protein of Encephalitozoon hellem, EhHNTP1, and characterized its subcellular localization and functions in host cells.
Methods: A polyclonal antibody against EhHNTP1 was produced to verify the protein subcellular localization in E. hellem-infected cells using indirect immunofluorescence assay (IFA) and Western blotting. HEK293 cells were transfected with wild-type or mutant EhHNTP1 fused with HA-EGFP, and the impacts on pathogen proliferation, protein subcellular localization and sequence functions were assessed. RNA sequencing of EhHNTP1-transfected cells was conducted to identify differentially expressed genes (DEGs) and pathway responses by bioinformatics analysis mainly with R packages. The DEGs in the transfected cells were experimentally confirmed with RT-qPCR and Western blotting. The regulatory effects of candidate DEGs were analyzed via RNA interference and cell transfection, and the effects were determined with RT-qPCR and Western blotting.
Results: EhHNTP1 is secreted into the host nucleus, and its translocation depends on a nuclear localization signal sequence (NLS) at the C-terminus from amino acids 239 to 250. Transfection and overexpression of EhHNTP1 in HEK293 cells significantly promoted pathogen proliferation. RNA-seq of the transfected cells showed that genes involved in ER-associated degradation (ERAD), a quality control mechanism that allows for the targeted degradation of proteins in the ER, were prominently upregulated. Upregulation of the ERAD genes PDIA4, HERP, HSPA5 and Derlin3 determined by RNA-seq data was verified using RT-qPCR and Western blotting. Protein ubiquitination in the transfected cells was then assayed and found to be markedly increased, confirming the activation of ERAD. PDIA4 knockdown with RNAi significantly suppressed the expression of HERP, indicating that PDIA4 is a vital ERAD component exploited by EhHNTP1. Moreover, EhHNTP1ΔHRD, a deletion mutant lacking the histidine-rich domain (HRD) in the C-terminus, predominantly suppressed the upregulation of ERAD genes, indicating that the HRD is essential for EhHNTP1 functions.
Conclusion: This study is the first report on a microsporidian secretory protein that targets the host nucleus to upregulate the ERAD pathway and subsequently promote protein ubiquitination. Our work provides new insights into microsporidia-host interactions.

