Predicting Interactions Between Pathogen and Human Proteins Based on the Relation Between Sequence Length and Amino Acid Composition

2021 ◽  
Vol 16 ◽  
Author(s):  
Saud Alguwaizani ◽  
Shulei Ren ◽  
De-Shuang Huang ◽  
Kyungsook Han

Aim: Both bacterial infection and viral infection involve a large number of protein-protein interactions (PPIs) between a pathogen and its target host. Background: So far, many computational methods have focused on predicting PPIs within the same species rather than PPIs across different species. Methods: From an extensive analysis of PPIs between Yersinia pestis bacteria and humans, we recently discovered a linear relation between amino acid composition and sequence length in many proteins involved in PPIs. We have built a support vector machine (SVM) model, which predicts PPIs between humans and bacteria using two feature types derived from this relation: the amino acid composition group (AACG) and the difference in amino acid composition between host and pathogen proteins. Result: The SVM model achieved high performance in predicting bacteria-human PPIs. The model showed an accuracy of 96%, sensitivity of 94%, and specificity of 98% in predicting PPIs between humans and Yersinia pestis, for which there is a strong relation between amino acid composition and sequence length. The SVM model was also tested in predicting PPIs between humans and viruses, including Ebola, HCV, and SARS-CoV-2, and showed good performance. Conclusion: The feature types identified in our study are simple yet powerful in predicting pathogen-human PPIs. Although preliminary, our method will be useful for finding unknown target host proteins or pathogen proteins and for designing in vitro or in vivo experiments.
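The second feature type named in this abstract, the difference in amino acid composition between host and pathogen proteins, can be illustrated with a minimal sketch. The abstract does not give the exact feature definition, so the element-wise difference of residue fractions used here is an assumption, and the function names and toy sequences are hypothetical. The AACG feature is not sketched because its definition is not given here.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_composition(sequence):
    """Return the fraction of each standard residue in a protein sequence."""
    sequence = sequence.upper()
    length = len(sequence)
    if length == 0:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    return [counts[aa] / length for aa in AMINO_ACIDS]


def composition_difference(host_seq, pathogen_seq):
    """Element-wise difference of host and pathogen compositions (assumed form)."""
    host = amino_acid_composition(host_seq)
    pathogen = amino_acid_composition(pathogen_seq)
    return [h - p for h, p in zip(host, pathogen)]


# Hypothetical toy sequences, not real proteins.
host = "MKTAYIAKQR"
pathogen = "MSAAKKLLQA"
features = composition_difference(host, pathogen)
```

A 20-dimensional vector of this kind could be passed to any SVM implementation as one row of a feature matrix for a candidate host-pathogen protein pair.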

2004 ◽  
Vol 24 (16) ◽  
pp. 7206-7213 ◽  
Author(s):  
Eric D. Ross ◽  
Ulrich Baxa ◽  
Reed B. Wickner

ABSTRACT The [URE3] prion of Saccharomyces cerevisiae is a self-propagating amyloid form of Ure2p. The amino-terminal prion domain of Ure2p is necessary and sufficient for prion formation and has a high glutamine (Q) and asparagine (N) content. Such Q/N-rich domains are found in two other yeast prion proteins, Sup35p and Rnq1p, although none of the many other yeast Q/N-rich domain proteins have yet been found to be prions. To examine the role of amino acid sequence composition in prion formation, we used Ure2p as a model system and generated five Ure2p variants in which the order of the amino acids in the prion domain was randomly shuffled while keeping the amino acid composition and C-terminal domain unchanged. Surprisingly, all five formed prions in vivo, with a range of frequencies and stabilities, and the prion domains of all five readily formed amyloid fibers in vitro. Although it is unclear whether other amyloid-forming proteins would be equally resistant to scrambling, this result demonstrates that [URE3] formation is driven primarily by amino acid composition, largely independent of primary sequence.
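The scrambling design described above relies on one invariant: the order of residues changes while the composition stays fixed. The sketch below illustrates that invariant only. It is not the authors' procedure, and the function name and the toy Q/N-rich sequence are hypothetical.

```python
import random
from collections import Counter


def shuffle_preserving_composition(domain, seed=None):
    """Return a random permutation of the residues in `domain`."""
    rng = random.Random(seed)  # local generator so the result is reproducible
    residues = list(domain)
    rng.shuffle(residues)
    return "".join(residues)


# Hypothetical Q/N-rich toy sequence, not the real Ure2p prion domain.
domain = "MNNQQNNSGQNNQ"
scrambled = shuffle_preserving_composition(domain, seed=1)

# Residue counts are identical before and after shuffling.
same_composition = Counter(scrambled) == Counter(domain)
```

Because a permutation only reorders residues, any composition-based measure gives the same value for the original and the scrambled domain.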


Genetics ◽  
2009 ◽  
Vol 183 (3) ◽  
pp. 929-940 ◽  
Author(s):  
Carley D. Ross ◽  
Blake R. McCarty ◽  
Michael Hamilton ◽  
Asa Ben-Hur ◽  
Eric D. Ross

The [URE3] and [PSI+] prions are the infectious amyloid forms of the Saccharomyces cerevisiae proteins Ure2p and Sup35p, respectively. Randomizing the order of the amino acids in the Ure2 and Sup35 prion domains while retaining amino acid composition does not block prion formation, indicating that amino acid composition, not primary sequence, is the predominant feature driving [URE3] and [PSI+] formation. Here we show that Ure2p promiscuously interacts with various compositionally similar proteins to influence [URE3] levels. Overexpression of scrambled Ure2p prion domains efficiently increases de novo formation of wild-type [URE3] in vivo. In vitro, amyloid aggregates of the scrambled prion domains efficiently seed wild-type Ure2p amyloid formation, suggesting that the wild-type and scrambled prion domains can directly interact to seed prion formation. To test whether interactions between Ure2p and naturally occurring yeast proteins could similarly affect [URE3] formation, we identified yeast proteins with domains that are compositionally similar to the Ure2p prion domain. Remarkably, all but one of these domains were also able to efficiently increase [URE3] formation. These results suggest that a wide variety of proteins could potentially affect [URE3] formation.


2015 ◽  
Vol 16 ◽  
pp. 94-103 ◽  
Author(s):  
Jae-Young Je ◽  
Soo Yeon Park ◽  
Joung-Youl Hwang ◽  
Chang-Bum Ahn

2020 ◽  
Vol 2020 ◽  
pp. 1-11 ◽  
Author(s):  
Lifu Zhang ◽  
Benzhi Dong ◽  
Zhixia Teng ◽  
Ying Zhang ◽  
Liran Juan

Enzymes are proteins that can efficiently catalyze specific biochemical reactions, and they are widely present in the human body. Developing an efficient method to identify human enzymes is vital to select enzymes from the vast number of human proteins and to investigate their functions. Nevertheless, only a limited amount of research has been conducted on the classification of human enzymes and nonenzymes. In this work, we developed a support vector machine- (SVM-) based predictor to classify human enzymes using the amino acid composition (AAC), the composition of k-spaced amino acid pairs (CKSAAP), and selected informative amino acid pairs through the use of a feature selection technique. A training dataset including 1117 human enzymes and 2099 nonenzymes and a test dataset including 684 human enzymes and 1270 nonenzymes were constructed to train and test the proposed model. The results of jackknife cross-validation showed that the overall accuracy was 76.46% for the training set and 76.21% for the test set, which are higher than the 72.6% achieved in previous research. Furthermore, various feature extraction methods and mainstream classifiers were compared in this task, and informative feature parameters of k-spaced amino acid pairs were selected and compared. The results suggest that our classifier can be used in human enzyme identification effectively and efficiently and can help to understand their functions and develop new drugs.
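The composition of k-spaced amino acid pairs (CKSAAP) mentioned above can be sketched as follows. This uses the commonly cited definition, in which the count of each ordered residue pair separated by k positions is divided by the number of such pairs in the sequence; whether the authors used exactly this normalization is an assumption, and the function name is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# All 400 ordered residue pairs, in a fixed order.
PAIRS = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def cksaap(sequence, k):
    """Frequencies of residue pairs with exactly k residues between them."""
    sequence = sequence.upper()
    n_windows = len(sequence) - k - 1
    if n_windows <= 0:
        raise ValueError("sequence is too short for this value of k")
    counts = dict.fromkeys(PAIRS, 0)
    for i in range(n_windows):
        pair = sequence[i] + sequence[i + k + 1]
        if pair in counts:  # pairs containing non-standard residues are skipped
            counts[pair] += 1
    return [counts[p] / n_windows for p in PAIRS]


# Toy sequence: with k=1 the pairs are AA, CC, AA, CC.
features = cksaap("ACACAC", k=1)
```

Concatenating the vectors for several values of k gives 400 features per k, which is why a feature selection step is typically applied afterwards.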


2016 ◽  
Vol 2016 ◽  
pp. 1-5 ◽  
Author(s):  
Yun Wu ◽  
Yufei Zheng ◽  
Hua Tang

Conotoxins are a kind of neurotoxin that can specifically interact with potassium, sodium, and calcium channels. They have become potential drug candidates for treating conditions such as chronic pain, epilepsy, and cardiovascular diseases. Thus, correctly identifying the types of ion channel-targeted conotoxins will provide important clues for understanding their function and finding potential drugs. Based on this consideration, we developed a new computational method to rapidly and accurately predict the types of ion channel-targeted conotoxins. Three new kinds of residue properties were proposed for use in pseudo amino acid composition to represent conotoxin samples. A support vector machine was used as the classifier. A feature selection technique based on F-score was used to optimize the features. Jackknife cross-validation showed an overall accuracy of 94.6%, which is higher than other published results, demonstrating that the proposed method is superior to published methods. Hence the current method may play a complementary role to other existing methods for recognizing the types of ion channel-targeted conotoxins.
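F-score based feature selection can be illustrated with a minimal sketch. The two-class F-score used here (squared deviations of the class means from the overall mean, divided by the sum of the within-class sample variances) is a widely used form and is an assumption; the abstract does not state the exact formula, and the conotoxin task has more than two classes. The function names and toy data are hypothetical.

```python
def f_score(pos_values, neg_values):
    """Two-class F-score of one feature: between-class over within-class spread."""
    n_pos, n_neg = len(pos_values), len(neg_values)
    if n_pos < 2 or n_neg < 2:
        raise ValueError("each class needs at least two samples")
    mean_pos = sum(pos_values) / n_pos
    mean_neg = sum(neg_values) / n_neg
    mean_all = (sum(pos_values) + sum(neg_values)) / (n_pos + n_neg)
    numerator = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    var_pos = sum((x - mean_pos) ** 2 for x in pos_values) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg_values) / (n_neg - 1)
    denominator = var_pos + var_neg
    if denominator == 0:
        return float("inf") if numerator > 0 else 0.0
    return numerator / denominator


def rank_features(pos_rows, neg_rows):
    """Return feature indices sorted from most to least discriminative."""
    n_features = len(pos_rows[0])
    scores = [
        f_score([row[j] for row in pos_rows], [row[j] for row in neg_rows])
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
pos_rows = [[1.0, 0.5], [1.2, 0.4], [0.8, 0.6]]
neg_rows = [[0.0, 0.5], [0.2, 0.6], [-0.2, 0.4]]
ranking = rank_features(pos_rows, neg_rows)
```

A typical use is to add features to the classifier in ranked order and keep the subset that gives the best cross-validated accuracy.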


2008 ◽  
Vol 107 (1) ◽  
pp. 11-18 ◽  
Author(s):  
Xian-Sheng Wang ◽  
Chuan-He Tang ◽  
Xiao-Quan Yang ◽  
Wen-Rui Gao
