uniprot database
Recently Published Documents


2021 ◽  
Author(s):  
Kyle S. Hoffman ◽  
Baozhen Shan ◽  
Jonathan R. Krieger

Abstract Identifying antigens displayed specifically on tumour cell surfaces by human leukocyte antigen (HLA) proteins is important for the development of immunotherapies and cancer vaccines. The difficulty in capturing an HLA ligandome stems from the fact that many HLA ligands are derived from splicing events or contain mutations, hindering their identification in a standard database search. To address this challenge, we developed an immunopeptidomics workflow with PEAKS XPro that uses de novo sequencing to uncover such peptides and identifies mutations for neoantigen discovery. We demonstrate the utility of this workflow by re-analyzing HLA-I ligandome datasets and reveal a vast diversity in peptide sequences among clones derived from a colorectal cancer tumour. Over 8000 peptides predicted to bind HLA-I molecules were identified by de novo sequencing only (not found in the UniProt database) and make up over 50% of identified peptides from each sample. Lastly, tumour-specific mutations and consensus sequence motif characteristics are defined. This workflow is widely applicable to any immunopeptidomic mass spectrometry dataset and does not require custom database generation for neoantigen discovery.
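
The "identified by de novo sequencing only" criterion above can be illustrated with a minimal sketch. It is not the PEAKS XPro implementation; it simply assumes that a peptide counts as database-absent when it is not a substring of any reference protein sequence. The function names and toy sequences are hypothetical, and treating leucine and isoleucine as equivalent (they have the same mass) is an added assumption.

```python
from typing import Iterable, List


def normalize(seq: str) -> str:
    """Upper-case a sequence and map I to L, since the two residues are isobaric."""
    return seq.upper().replace("I", "L")


def de_novo_only(peptides: Iterable[str], proteins: Iterable[str]) -> List[str]:
    """Return the peptides that occur in none of the reference proteins."""
    refs = [normalize(s) for s in proteins]
    return [p for p in peptides if not any(normalize(p) in r for r in refs)]


# Toy data (hypothetical): one reference protein and three candidate peptides.
toy_proteins = ["MKTAYIAKQRQISFVKSHFSRQ"]
toy_peptides = ["AYIAKQRQI", "AYLAKQRQL", "SLYNTVATL"]

print(de_novo_only(toy_peptides, toy_proteins))  # ['SLYNTVATL']
```

A real reference set such as the reviewed UniProt entries would call for an indexed search rather than this linear scan.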


2021 ◽  
Vol 18 ◽  
Author(s):  
Carlos Polanco ◽  
Vladimir N. Uversky ◽  
Gilberto Vargas-Alarcón ◽  
Thomas Buhse ◽  
Alberto Huberman ◽  
...  

Background: Among the wide variety of known viruses, there is particular interest in those transmitted to humans and whose ability to disseminate represents a significant public health issue. Objective: The present study’s objective is to bioinformatically characterize the proteins of the two main divisions of viruses, RNA viruses and DNA viruses. Methods: In this work, a set of in-house computational programs was used to calculate the polarity/charge profiles and intrinsic disorder predisposition profiles of the proteins of several groups of viruses representing both types, extracted from the UniProt database. The efficiency of these computational programs was statistically verified. Results: It was found that the polarity/charge profile of the proteins is, in most cases, an efficient discriminant that allows the re-creation of the taxonomy known for both viral groups. Additionally, the entire set of "reviewed" proteins in the UniProt database was analyzed to find proteins with polarity/charge profiles similar to those obtained for each viral group. This search revealed a substantial number of proteins with such polarity/charge profiles. Conclusion: The polarity/charge profile represents a physicochemical metric that is easy to calculate and that can be used to effectively identify viral groups from their protein sequences.


Author(s):  
Yanan Wang ◽  
Fuyi Li ◽  
Manasa Bharathwaj ◽  
Natalia C Rosas ◽  
André Leier ◽  
...  

Abstract Beta-lactamases (BLs) are enzymes localized in the periplasmic space of bacterial pathogens, where they confer resistance to beta-lactam antibiotics. Experimental identification of BLs is costly yet crucial to understanding beta-lactam resistance mechanisms. To address this issue, we present DeepBL, a deep learning-based approach that incorporates sequence-derived features to enable high-throughput prediction of BLs. Specifically, DeepBL is implemented based on the Small VGGNet architecture and the TensorFlow deep learning library. Furthermore, the performance of DeepBL models is investigated in relation to the sequence redundancy level and negative sample selection in the benchmark dataset. The models are trained on datasets of varying sequence redundancy thresholds, and the model performance is evaluated by extensive benchmarking tests. Using the optimized DeepBL model, we perform proteome-wide screening for all reviewed bacterial protein sequences available from the UniProt database. These results are freely accessible at the DeepBL webserver at http://deepbl.erc.monash.edu.au/.


2020 ◽  
Vol 17 ◽  
Author(s):  
Carlos Polanco ◽  
Alberto Huberman ◽  
Vladimir N. Uversky ◽  
Leire Andrés ◽  
Thomas Buhse ◽  
...  

Background: Selective Cationic Amphipathic Antibacterial Peptides (SCAAPs) occupy a prominent place in the production of new drugs on account of their high toxicity towards bacteria and low toxicity towards mammalian cells, low hemolytic activity, and contribution to the protection of the human immune system. Introduction: Their number in nature is very low and experimental tests are very protracted and costly. Therefore, it would be useful to have bioinformatics tools that would identify them in the existing databases and also propose new synthetic SCAAPs. Method: In order to reduce the costs of identification and/or chemical synthesis, to know the physicochemical characteristics of SCAAPs at the residue level, and to obtain a “bioinformatics fingerprint” suitable for their selection, we have modified the Polarity Index Method® (PIM®) to include the α-helical configuration of each sequence to determine their individual “PIM® profile”. We have also used a set of computer programs to determine their “Intrinsic Disorder Predisposition”. This information was then compared with other protein groups such as bacteria, fungi, viruses and cell penetrating peptides (CPP) from the UniProt database and a set of intrinsically disordered proteins. Once the “fingerprint” of SCAAPs was obtained, it was used for searching among the 559228 “reviewed” proteins from the UniProt database and a set of synthetic SCAAPs characterized by the predefined “PIM® profile” selected. Results: Our results showed that the metric named “PIM® profile” can identify, with a high level of accuracy, a group of bacterial SCAAPs. This bioinformatics study was supported at the residue level, using the in-house bioinformatics system Polarity Index Method and the commonly used algorithm for the prediction of intrinsic disorder predisposition, PONDR® FIT. Conclusions: The Polarity Index Method seems highly efficient in identifying SCAAP candidates.


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Xiangwen Ji ◽  
Chunmei Cui ◽  
Qinghua Cui

Abstract Background A small open reading frame (smORF) is an open reading frame with a length of less than 100 codons. Microproteins, translated from smORFs, have been found to participate in a variety of biological processes such as muscle formation and contraction, cell proliferation, and immune activation. Although previous studies have collected and annotated a large number of smORFs, the functions of the vast majority of smORFs are still unknown. It is thus increasingly important to develop computational methods to annotate the functions of these smORFs. Results In this study, we collected 617,462 unique smORFs from three studies. The expression of smORF RNAs was estimated by reannotated microarray probes. Using a speed-optimized correlation algorithm, the functions of smORFs were predicted by their correlated genes with known functional annotations. When applied to 5 known microproteins from the literature, our method successfully predicted their functions. Further validation from the UniProt database showed that at least one function of 202 out of 270 microproteins was predicted. Conclusions We developed a method, smORFunction, to provide function predictions of smORFs/microproteins in at most 265 models generated from 173 datasets, including 48 tissues/cells, 82 diseases (and normal). The tool is available at https://www.cuilab.cn/smorfunction.
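
As an illustration of the smORF definition used in this abstract (an open reading frame shorter than 100 codons), the following is a minimal sketch rather than the smORFunction pipeline. It assumes forward-strand scanning only, ATG as the sole start codon, and a codon count that excludes the stop codon; the function name and the toy sequence are hypothetical.

```python
from typing import List, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(dna: str, max_codons: int = 100) -> List[Tuple[int, int, int]]:
    """Return (start, end, n_codons) for forward-strand ORFs with n_codons < max_codons.

    start is the 0-based index of the ATG, end is the index just past the stop
    codon, and n_codons counts the coding codons (start included, stop excluded).
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons < max_codons:
                    hits.append((start, i + 3, n_codons))
                start = None  # resume looking for the next start codon
    return hits


# Toy sequence (hypothetical): ATG AAA TTT GGG TAA embedded at offset 2.
print(find_smorfs("CCATGAAATTTGGGTAACC"))  # [(2, 17, 4)]
```

Whether the stop codon counts toward the 100-codon limit varies between studies, so the threshold here should be read as a convention of this sketch.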


2020 ◽  
Vol 17 ◽  
Author(s):  
Carlos Polanco ◽  
Vladimir N. Uversky ◽  
Alberto Huberman ◽  
Leire Andrés ◽  
Thomas Buhse ◽  
...  

Background: The female Aedes aegypti mosquito is a vector of several arthropod-borne viruses, such as Mayaro, Dengue, Chikungunya, Yellow Fever, and Zika. These viruses cause the death of at least 600000 people a year and temporarily disable several million more around the world. To date, there are no effective prophylactic measures that would prevent the contact and bite of this arthropod and, therefore, its consequential contagion. Objective: The objective of the present study was to search for the regularities of the proteins expressed by these five viruses, at the residue level, and obtain a "bioinformatic fingerprint" to select them. Methods: We used two bioinformatic systems: our in-house bioinformatic system named Polarity Index Method® (PIM®), supported at the residue level, and the commonly used algorithm for the prediction of intrinsic disorder predisposition, PONDR® FIT. We applied both programs to the 29 proteins expressed by the five groups of arboviruses studied, and we calculated for each of them their Polarity Index Method® profile and their intrinsic disorder predisposition. This information was then compared with analogous information for other protein groups, such as proteins from bacteria, fungi, viruses, and cell penetrating peptides from the UniProt database, and a set of intrinsically disordered proteins. Once the "fingerprint" of each group of arboviruses was obtained, these "fingerprints" were searched among the 559228 "reviewed" proteins from the UniProt database. Results: In total, 1736 proteins were identified from the 559228 “reviewed” proteins from the UniProt database, with a "PIM® profile" similar to that of the 29 mutated proteins expressed by the five groups of arboviruses. Conclusion: We propose that the “PIM® profile” characterization of proteins might be useful for the identification of proteins expressed by arthropod-borne viruses transmitted by the Aedes aegypti mosquito.


Catalysts ◽  
2020 ◽  
Vol 10 (2) ◽  
pp. 222 ◽  
Author(s):  
Ophelia Gevaert ◽  
Stevie Van Overtveldt ◽  
Matthieu Da Costa ◽  
Koen Beerens ◽  
Tom Desmet

C5-epimerases are promising tools for the production of rare l-hexoses from their more common d-counterparts. On that account, UDP-glucuronate 5-epimerase (UGA5E) attracts attention as this enzyme could prove to be useful for the synthesis of UDP-l-iduronate. Interestingly, l-iduronate is known as a precursor for the production of heparin, an effective anticoagulant. To date, the UGA5E specificity has only been detected in rabbit skin extract, and the respective enzyme has not been characterized in detail or even identified at the molecular level. Accordingly, the current work aimed to shed more light on the properties of UGA5E. Therefore, the pool of putative UGA5Es present in the UniProt database was scrutinized and their sequences were clustered in a phylogenetic tree. However, the examination of two of these enzymes revealed that they actually epimerize UDP-glucuronate at the 4- rather than 5-position. Furthermore, in silico analysis indicated that this should be the case for all sequences that are currently annotated as UGA5E and, hence, that such activity has not yet been discovered in nature. The detected l-iduronate synthesis in rabbit skin extract can probably be assigned to the enzyme chondroitin-glucuronate C5-epimerase, which catalyzes the conversion of d-glucuronate to l-iduronate on a polysaccharide level.

