pseudo amino acid composition
Recently Published Documents


Total documents: 241 (five years: 15)

H-index: 75 (five years: 2)

2021 ◽  
Vol 22 (23) ◽  
pp. 13124
Author(s):  
Phasit Charoenkwan ◽  
Chanin Nantasenamat ◽  
Md Mehedi Hasan ◽  
Mohammad Ali Moni ◽  
Balachandran Manavalan ◽  
...  

Umami ingredients have been identified as important factors in food seasoning and production. Traditional experimental methods for characterizing peptides exhibiting umami sensory properties (umami peptides) are time-consuming, laborious, and costly. As a result, it is preferable to develop computational tools for the large-scale identification of available sequences in order to identify novel peptides with umami sensory properties. Although a computational tool has been developed for this purpose, its predictive performance is still insufficient. In this study, we use a feature representation learning approach to create a novel machine-learning meta-predictor called UMPred-FRL for improved umami peptide identification. We combined six well-known machine learning algorithms (extremely randomized trees, k-nearest neighbor, logistic regression, partial least squares, random forest, and support vector machine) with seven different feature encodings (amino acid composition, amphiphilic pseudo-amino acid composition, dipeptide composition, composition-transition-distribution, and pseudo-amino acid composition) to develop the final meta-predictor. Extensive experimental results demonstrated that UMPred-FRL was effective and achieved more accurate performance on the benchmark dataset compared to its baseline models, and consistently outperformed the existing method on the independent test dataset. Finally, to aid in the high-throughput identification of umami peptides, the UMPred-FRL web server was established and made freely available online. It is expected that UMPred-FRL will be a powerful tool for the cost-effective large-scale screening of candidate peptides with potential umami sensory properties.
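
Two of the feature encodings named in this abstract, amino acid composition and dipeptide composition, are simple enough to sketch directly. The following is a minimal illustration, not the UMPred-FRL implementation: the function names and the choice to ignore non-standard residues in the dipeptide counts are assumptions made here.

```python
from itertools import product

# The 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq):
    """Return the 20 residue frequencies of seq, in AMINO_ACIDS order."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return the 400 overlapping-dipeptide frequencies of seq."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must have at least two residues")
    # Keys are created in a fixed order: AA, AC, AD, ..., YY.
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    n_pairs = len(seq) - 1
    # Slide a window of width 2 so that overlapping pairs are counted.
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:  # assumption: skip pairs with non-standard residues
            counts[pair] += 1
    return [c / n_pairs for c in counts.values()]
```

A peptide would then be described by concatenating such vectors, for example `amino_acid_composition(p) + dipeptide_composition(p)`, before being passed to a classifier.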


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Zhixia Teng ◽  
Zitong Zhang ◽  
Zhen Tian ◽  
Yanjuan Li ◽  
Guohua Wang

Background: Amyloids are insoluble fibrillar aggregates that are highly associated with complex human diseases, such as Alzheimer’s disease, Parkinson’s disease, and type II diabetes. Recently, many studies have reported that specific regions of amino acid sequences may be responsible for the amyloidosis of proteins. Identifying these amyloidogenic regions has therefore become very important for elucidating the mechanism of amyloids. Accordingly, several computational methods have been put forward to discover amyloidogenic regions. The majority of these methods predict amyloidogenic regions based on the physicochemical properties of amino acids. However, the position, order, and correlation of amino acids may also influence the amyloidosis of proteins and should also be considered when detecting amyloidogenic regions. Results: To address this problem, we propose a novel machine-learning approach for predicting amyloidogenic regions, called ReRF-Pred. First, the pseudo amino acid composition (PseAAC) is used to characterize the physicochemical properties and correlation of amino acids. Second, tripeptide composition (TPC) is employed to represent the order and position of amino acids. To improve the distinguishability of TPC, all possible tripeptides are analyzed by the binomial distribution method, and only those whose distributions differ significantly between positive and negative samples are retained. Finally, all samples are characterized by the PseAAC and TPC of their amino acid sequences, and a random forest-based predictor of amyloidogenic regions is trained on these samples. Validation experiments show that the feature set consisting of PseAAC and TPC is the most distinguishable one for detecting amyloidosis, and that random forest is superior to the other classifiers considered on almost all metrics. To validate the effectiveness of our model, ReRF-Pred is compared with a series of gold-standard methods on two datasets: Pep-251 and Reg33. The results suggest that our method has the best overall performance and makes significant improvements in discovering amyloidogenic regions. Conclusions: The advantages of our method are mainly attributed to the fact that PseAAC and TPC successfully describe the differences between amyloids and other proteins. The ReRF-Pred server can be accessed at http://106.12.83.135:8080/ReRF-Pred/.
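
The tripeptide filtering step can be illustrated with a small sketch. The abstract does not give the exact formulation, so the version below is an assumption: for each tripeptide it computes the binomial tail probability of seeing at least the observed count in one class, given that class's share of all tripeptide occurrences, and keeps tripeptides whose smaller tail probability falls below a hypothetical threshold `alpha`. The function names are illustrative only.

```python
from collections import Counter
from math import comb


def tripeptide_counts(seqs):
    """Count overlapping tripeptides across a list of sequences."""
    counts = Counter()
    for s in seqs:
        s = s.upper()
        for i in range(len(s) - 2):
            counts[s[i:i + 3]] += 1
    return counts


def binomial_tail(n_obs, n_total, q):
    """P(X >= n_obs) for X ~ Binomial(n_total, q)."""
    return sum(
        comb(n_total, m) * q**m * (1 - q) ** (n_total - m)
        for m in range(n_obs, n_total + 1)
    )


def select_tripeptides(pos_seqs, neg_seqs, alpha=0.05):
    """Keep tripeptides that are over-represented in either class.

    Assumption: alpha is an illustrative cut-off, not a value from the paper.
    Returns a dict mapping each retained tripeptide to its tail probability.
    """
    pos, neg = tripeptide_counts(pos_seqs), tripeptide_counts(neg_seqs)
    total_pos, total_neg = sum(pos.values()), sum(neg.values())
    if total_pos == 0 or total_neg == 0:
        raise ValueError("both classes need at least one tripeptide")
    # Prior probability that a random tripeptide occurrence falls in each class.
    q_pos = total_pos / (total_pos + total_neg)
    q_neg = 1 - q_pos
    selected = {}
    for tri in set(pos) | set(neg):
        n_i = pos[tri] + neg[tri]
        p = min(
            binomial_tail(pos[tri], n_i, q_pos),
            binomial_tail(neg[tri], n_i, q_neg),
        )
        if p < alpha:
            selected[tri] = p
    return selected
```

Only the retained tripeptides would then contribute dimensions to the TPC feature vector, which keeps it far smaller than the full 8000-dimensional tripeptide space.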


2021 ◽  
Vol 18 ◽  
Author(s):  
Wajdi Alghamdi ◽  
Yaser Daanial Khan ◽  
Ebraheem Alzahrani ◽  
Malik Zaka Ullah

Background: Chaperones are a group of proteins that have functional similarities and support protein folding. They can prevent non-specific aggregation by binding to non-native proteins, and they are mainly linked with folding and assembly, which are important processes in molecular biology. Not only are chaperones important stress proteins for maintaining the survival of other proteins and cells, but their therapeutic applications are also increasing dramatically. Objectives: Herein, we report a novel predictor, the first of its kind, for the identification of chaperone proteins. Methods: The predictor is developed using Chou’s pseudo amino acid composition (PseAAC), statistical moments and various position-based features. Results: The predictor is validated through 10-fold cross-validation and jackknife testing, which gave accuracies of 94.04% and 96.62%, respectively. Conclusion: Thus, the proposed predictor can help predict chaperone proteins efficiently and accurately and provide baseline data for the discovery of new drugs and biomarkers.
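
The two validation schemes mentioned here differ only in how the samples are partitioned. The sketch below shows one way to generate the index splits; the function names, the shuffling seed and the round-robin fold assignment are assumptions for illustration and are not taken from the paper.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def jackknife_splits(n_samples):
    """Yield leave-one-out splits: each sample is the test set exactly once."""
    for i in range(n_samples):
        yield [j for j in range(n_samples) if j != i], [i]
```

Jackknife testing is the special case of k-fold cross-validation in which k equals the number of samples, so it trains one model per sample and is correspondingly more expensive.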


2021 ◽  
Vol 15 (8) ◽  
pp. 937-948
Author(s):  
Sheraz Naseer ◽  
Waqar Hussain ◽  
Yaser Daanial Khan ◽  
Nouman Rasool

Background: Among all the major post-translational modifications, amidation seems to be a small change, where a peptide ends with an amide group (-NH2) rather than a carboxyl group (-COOH). Thus, to study their physicochemical properties, identification of the amidation mechanism is very important. However, in vitro, ex vivo and in vivo identification can be laborious, time-consuming and costly. There is a dire need for an efficient and accurate computational model to help researchers and biologists identify these sites easily. Objectives: Herein, we propose a novel predictor for the identification of arginine amide (R-Amide) sites in proteins, by integrating Chou’s Pseudo Amino Acid Composition (PseAAC) with deep features. Methods: We use well-known DNNs both for learning a feature representation of peptide sequences and for performing classification. Results: Among the different DNNs, the CNN achieved the highest accuracy and outperformed all previously reported predictors on all other computed measures. Conclusions: Based on these results, it is concluded that the proposed model can help identify arginine amidation in a very efficient and accurate manner, which can help scientists understand the mechanism of this modification in proteins.


2020 ◽  
Vol 23 (8) ◽  
pp. 797-804
Author(s):  
Waqar Hussain ◽  
Nouman Rasool ◽  
Yaser D. Khan

Background: ZIKV is a well-known global threat that hit almost all American countries and posed a serious threat to the entire globe in 2016. The first outbreak of ZIKV was reported in 2007 in the Pacific area, followed by another severe outbreak in 2013/2014, after which ZIKV spread to all other Pacific islands. A broad spectrum of ZIKV-associated neurological malformations in neonates and adults has driven this deadly virus into the limelight. Though tremendous efforts have been focused on understanding the molecular basis of ZIKV, the viral proteins of ZIKV have still not been studied extensively. Objectives: Herein, we report a novel predictor, the first of its kind, for the identification of ZIKV proteins. Methods: We have employed Chou’s pseudo amino acid composition (PseAAC), statistical moments and various position-based features. Results: The predictor is validated through 10-fold cross-validation and jackknife testing. In 10-fold cross-validation, 94.09% accuracy, 93.48% specificity, 94.20% sensitivity and 0.80 MCC were achieved, while in jackknife testing, 96.62% accuracy, 94.57% specificity, 97.00% sensitivity and 0.88 MCC were achieved. Conclusion: Thus, ZIKVPred-PseAAC can help in predicting ZIKV proteins efficiently and accurately and can provide baseline data for the discovery of new drugs and biomarkers against ZIKV.
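
The four measures reported in the results (accuracy, specificity, sensitivity and MCC) all derive from the confusion matrix of a binary classifier. A minimal sketch of the standard definitions follows; the function name and the convention of returning 0.0 when a denominator vanishes are assumptions made here.

```python
from math import sqrt


def binary_metrics(tp, tn, fp, fn):
    """Compute standard binary-classification measures from confusion counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives.
    """
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("confusion matrix must not be empty")
    mcc_denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: fraction of actual positives that are recovered.
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        # Specificity: fraction of actual negatives that are recovered.
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        # Matthews correlation coefficient, ranging from -1 to 1.
        "mcc": (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0,
    }
```

For instance, `binary_metrics(40, 45, 5, 10)` gives an accuracy of 0.85, a sensitivity of 0.8 and a specificity of 0.9. MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy.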


2020 ◽  
Vol 21 (16) ◽  
pp. 5694
Author(s):  
Cheng Wang ◽  
Wenyan Wang ◽  
Kun Lu ◽  
Jun Zhang ◽  
Peng Chen ◽  
...  

Drug-target interaction (DTI) prediction plays an important role in drug development. Experimental methods for identifying DTIs are time-consuming, expensive and challenging. To address these problems, machine learning-based methods have been introduced, but they are limited by the difficulty of effective feature extraction and negative sampling. In this work, features based on electrotopological state (E-state) fingerprints for drugs and amphiphilic pseudo amino acid composition (APAAC) for target proteins are tested. E-state fingerprints are extracted based on both molecular electronic and topological features with the same metric. APAAC is an extension of amino acid composition (AAC), which is calculated based on hydrophilic and hydrophobic characters to construct sequence order information. Using the combination of these feature pairs, the prediction model is established by support vector machines. In order to enhance the effectiveness of the features, a distance-based negative sampling method is proposed to obtain reliable negative samples. The area under the receiver operating characteristic curve (AUC) is above 98.5% for all three datasets in this work. Comparison with state-of-the-art methods demonstrates the effectiveness and efficiency of the proposed method, which will be helpful for further drug development.
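
The way APAAC extends AAC with sequence-order information can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the function accepts any list of per-residue property scales, the defaults `lam=2` and `weight=0.5` are illustrative, and the Kyte-Doolittle hydropathy scale is used only as a stand-in because the abstract does not say which hydrophobicity and hydrophilicity scales were used.

```python
from statistics import mean, pstdev

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used here only as an example scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mu, sigma = mean(values), pstdev(values)
    return {a: (scale[a] - mu) / sigma for a in AMINO_ACIDS}


def apaac(seq, scales, lam=2, weight=0.5):
    """Return a vector of 20 + lam * len(scales) APAAC-style features.

    The first 20 entries are weighted residue frequencies; the rest are
    sequence-order correlation factors at lags 1..lam for each scale.
    """
    seq = seq.upper()
    if any(a not in AMINO_ACIDS for a in seq):
        raise ValueError("sequence contains non-standard residues")
    n = len(seq)
    if lam >= n:
        raise ValueError("lam must be smaller than the sequence length")
    std_scales = [standardize(s) for s in scales]
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]
    taus = []
    for k in range(1, lam + 1):
        for s in std_scales:
            # Average product of property values for residues k positions apart.
            # Note: these correlation factors may be negative.
            taus.append(sum(s[seq[i]] * s[seq[i + k]] for i in range(n - k)) / (n - k))
    denom = sum(freqs) + weight * sum(taus)
    return [f / denom for f in freqs] + [weight * t / denom for t in taus]
```

Calling `apaac(seq, [hydrophobicity, hydrophilicity], lam=30)` with two scales would give an 80-dimensional vector, whereas plain AAC corresponds to the first 20 entries with `weight=0`.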

