Prediction of Antimicrobial Peptides Based on Sequence Alignment and Support Vector Machine-Pairwise Algorithm Utilizing LZ-Complexity

2015 ◽  
Vol 2015 ◽  
pp. 1-13 ◽  
Author(s):  
Xin Yi Ng ◽  
Bakhtiar Affendi Rosdi ◽  
Shahriza Shahrudin

This study proposes a new method for predicting antimicrobial peptides (AMPs), which are important to the immune system. Researchers have recently become interested in designing alternative drugs based on AMPs because a large number of bacterial strains have become resistant to available antibiotics. However, the AMP design process faces obstacles: experiments to extract AMPs from protein sequences are costly and require a long set-up time. A computational tool for AMP prediction is therefore needed. In this study, an integrated algorithm is introduced that predicts AMPs by combining sequence alignment with a support vector machine (SVM)-LZ complexity pairwise algorithm. When all sequences in the training set are used, the sensitivity of the proposed algorithm is 95.28% in the jackknife test and 87.59% in the independent test; when only sequences with less than 70% similarity are used, the sensitivity is 88.74% in the jackknife test and 78.70% in the independent test. The proposed algorithm may allow researchers to predict AMPs from unknown protein peptide sequences with higher sensitivity.
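The abstract does not say which LZ-complexity variant or distance measure the authors use. As a minimal sketch, assuming the classic Lempel-Ziv (1976) definition, the complexity of a sequence can be taken as the number of phrases in its exhaustive parsing. The function name `lz_complexity` is illustrative and not taken from the paper.

```python
def lz_complexity(sequence: str) -> int:
    """Count the phrases in the Lempel-Ziv (1976) exhaustive parsing.

    Assumption: this is the textbook LZ76 definition; the paper's exact
    variant is not given in the abstract.
    """
    n = len(sequence)
    start = 0    # index where the current phrase begins
    phrases = 0
    while start < n:
        length = 1
        # Extend the phrase while it can still be copied from the text
        # seen so far (everything up to, but excluding, its last symbol).
        while (start + length <= n
               and sequence[start:start + length] in sequence[:start + length - 1]):
            length += 1
        phrases += 1
        start += length
    return phrases


# The standard worked example parses as 0|001|10|100|1000|101.
print(lz_complexity("0001101001000101"))  # 6
```

A repetitive sequence yields few phrases and a varied one yields many, which is what lets a complexity-based score stand in for a similarity measure between two peptide sequences.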

Author(s):  
Ren-Xiang Yan ◽  
Jing Liu ◽  
Yi-Min Tao

Profile-profile alignment may be the most sensitive and useful computational resource for identifying remote homologies and recognizing protein folds. However, profile-profile alignment is usually much more complex and slower than sequence-sequence or profile-sequence alignment. The profile, or PSSM (position-specific scoring matrix), represents the mutational variability at each sequence position of a protein as a vector of amino acid substitution frequencies, and is therefore a much richer encoding of a protein sequence. The consensus sequence, which can be considered a simplified profile, was used in early work to improve sequence alignment accuracy. More recently, several studies have used consensus sequence information to improve PSI-BLAST’s fold recognition performance. There are several ways to compute a consensus sequence. Based on these considerations, this chapter proposes a method that combines the information from different types of consensus sequences with the help of support vector machine learning. Benchmark results suggest that the method can further improve PSI-BLAST’s fold recognition performance.


2020 ◽  
Vol 11 (2) ◽  
pp. 66-81
Author(s):  
Badia Klouche ◽  
Sidi Mohamed Benslimane ◽  
Sakina Rim Bennabi

Sentiment analysis is an emerging research area in sentiment polarity classification and text mining, particularly given the considerable number of opinions available on social media. Like other operators, the Algerian telephone operator Ooredoo is pursuing a new strategy to win customers by exploiting their opinions through sentiment analysis. The purpose of this work is to set up a system called “Ooredoo Rayek”, whose objective is to collect, transliterate, translate and classify the textual data expressed by the Ooredoo operator's customers. The authors developed a set of rules for transliterating Algerian Arabizi into Algerian dialect. Furthermore, they used Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers to assign polarity tags to Facebook comments from the official Ooredoo pages, which are written in a multilingual and multi-dialect context. Experimental results show that the system performs well, with an accuracy of 83%.


2012 ◽  
Vol 518-523 ◽  
pp. 4039-4042
Author(s):  
Zhen Min Zhou

To improve the precision of medium- and long-term rainfall forecasts, a rainfall estimation model was set up based on wavelet analysis and a support vector machine (WA-SVM). The model decomposes the original rainfall series into different layers through wavelet analysis, forecasts each layer with an SVM, and finally obtains the forecast for the original time series by recomposing the layer forecasts. The model was used to estimate the monthly rainfall sequence in the watershed. Compared with a method that uses only a support vector machine (SVM), the estimation accuracy improved markedly.


2020 ◽  
Vol 15 (6) ◽  
pp. 563-573
Author(s):  
Chengyan Wu ◽  
Qianzhong Li ◽  
Ru Xing ◽  
Guo-Liang Fan

Background: Identifying non-coding RNA (ncRNA) at the organelle genome level is a challenging task. In our previous work, an ncRNA dataset with less than 80% sequence identity was built, and a method combining an increment of diversity with a support vector machine was proposed. Objective: Based on the ncRNA_361 dataset, a novel decision-making method, an improved KNN (iKNN) classifier, was proposed. Methods: In this paper, based on the iKNN algorithm, the physicochemical features of nucleotides, the degeneracy of genetic codons, and the topological secondary structure were selected to represent the effective ncRNA characters. The incremental feature selection method was then used to optimize the feature set. Results: The results of iKNN indicated that the mean-value decision-making method is distinctly superior to the traditional majority-vote decision-making method and to the Increment of Diversity Combining Support Vector Machine (ID-SVM). The iKNN algorithm achieved an overall accuracy of 97.368% in the jackknife test when k=3. Conclusion: The triplets of the structure-sequence mode under reading frames not only contain the entire sequence information but also reflect whether each base is paired, and the secondary structural topological parameters further describe the ncRNA secondary structure at the spatial level. The ncRNA dataset and the iKNN classifier are freely available at http://202.207.14.87:8032/fuwu/iKNN/index.asp.
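The abstract contrasts a mean-value decision rule with majority voting but does not define either precisely. The sketch below shows one plausible reading, which is an assumption and not the authors' iKNN: majority vote picks the most common label among the k nearest neighbours overall, while the mean-value rule picks the class whose k nearest members have the smallest mean distance to the query. The function names and the Euclidean distance are illustrative choices.

```python
import math
from collections import Counter


def majority_vote_knn(train, query, k=3):
    """Classic KNN: most common label among the k nearest training points."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def mean_value_knn(train, query, k=3):
    """Assumed mean-value rule: class with the smallest mean distance
    over its own k nearest members (hypothetical reading of the abstract)."""
    distances = {}
    for features, label in train:
        distances.setdefault(label, []).append(math.dist(features, query))
    means = {
        label: sum(sorted(d)[:k]) / min(k, len(d))
        for label, d in distances.items()
    }
    return min(means, key=means.get)


# Toy data: two well-separated classes in a 2-D feature space.
train = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
         ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
print(majority_vote_knn(train, (0.5, 0.5)), mean_value_knn(train, (5, 5.5)))
```

Under this reading, the mean-value rule uses how close each class is as a whole and not only a count of labels, which can matter when class sizes are uneven.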


2019 ◽  
Vol 20 (5) ◽  
pp. 362-370 ◽  
Author(s):  
Meiqi Wu ◽  
Pengchao Lu ◽  
Yingxi Yang ◽  
Liwen Liu ◽  
Hui Wang ◽  
...  

Background: Lysine lipoylation, a rare and highly conserved post-translational modification of proteins, is considered one of the most important processes in biology. Identifying lysine lipoylated sites is the key to a comprehensive understanding of the regulatory mechanism of lysine lipoylation. Because experimental methods are expensive and laborious, computational methods for predicting lipoylation sites are urgently needed. Methodology: In this work, a predictor named LipoSVM is developed to accurately predict lipoylation sites. To overcome the problem of unbalanced samples, the synthetic minority over-sampling technique (SMOTE) is used to balance negative and positive samples. Furthermore, different ratios of positive to negative samples are chosen as training sets. Results: After comparing five encoding schemes and five classification algorithms, LipoSVM is constructed using a training set with a positive-to-negative sample ratio of 1:1, combined with a position-specific scoring matrix and a support vector machine. The best performance is an accuracy of 99.98% and an AUC of 0.9996 in 10-fold cross-validation. The AUC on the independent test set reaches 0.9997, which demonstrates the robustness of LipoSVM. The analysis of lysine lipoylation and non-lipoylation fragments shows statistically significant differences. Conclusion: A good predictor for lysine lipoylation is built based on a position-specific scoring matrix and a support vector machine. The online webserver LipoSVM can be freely downloaded from https://github.com/stars20180811/LipoSVM.
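The core idea of SMOTE mentioned in the methodology can be sketched in a few lines: each synthetic sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is a simplified illustration in pure Python, not the LipoSVM implementation; the function name, the default `k`, and the seeding are assumptions.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority samples by neighbour interpolation.

    Assumes `minority` holds at least two numeric feature tuples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base_index = rng.randrange(len(minority))
        base = minority[base_index]
        # k nearest minority neighbours of the chosen base sample.
        others = [p for j, p in enumerate(minority) if j != base_index]
        neighbours = sorted(others, key=lambda p: math.dist(p, base))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class: the four corners of the unit square.
positives = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_samples = smote(positives, n_synthetic=6, k=2)
print(len(new_samples))  # 6
```

Because every synthetic point is an interpolation between two existing minority samples, the new samples stay inside the region the minority class already occupies, which is how the method balances the classes without simply duplicating samples.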


2013 ◽  
pp. 1667-1675
Author(s):  
Ren-Xiang Yan ◽  
Jing Liu ◽  
Yi-Min Tao

Profile-profile alignment may be the most sensitive and useful computational resource for identifying remote homologies and recognizing protein folds. However, profile-profile alignment is usually much more complex and slower than sequence-sequence or profile-sequence alignment. The profile, or PSSM (position-specific scoring matrix), represents the mutational variability at each sequence position of a protein as a vector of amino acid substitution frequencies, and is therefore a much richer encoding of a protein sequence. The consensus sequence, which can be considered a simplified profile, was used in early work to improve sequence alignment accuracy. More recently, several studies have used consensus sequence information to improve PSI-BLAST’s fold recognition performance. There are several ways to compute a consensus sequence. Based on these considerations, this chapter proposes a method that combines the information from different types of consensus sequences with the help of support vector machine learning. Benchmark results suggest that the method can further improve PSI-BLAST’s fold recognition performance.


Molecules ◽  
2020 ◽  
Vol 25 (19) ◽  
pp. 4353
Author(s):  
Yanfen Lyu ◽  
Xinqi Gong

The study of interface residue pairs is important for understanding the interactions between monomers inside a trimer protein–protein complex. We developed a two-layer support vector machine (SVM) ensemble classifier that considers the physicochemical and geometric properties of amino acids and the influence of surrounding amino acids. Different descriptors and different combinations may give different prediction results, so we propose feature combination engineering based on correlation coefficients and F-values. The accuracy of our method is 65.38% on an independent test set, indicating biological significance. Our predictions are consistent with the experimental results. These results show the effectiveness and reliability of our method for predicting interface residue pairs of protein trimers.


2014 ◽  
Vol 633-634 ◽  
pp. 734-737
Author(s):  
Shao Juan Su ◽  
Yong Hu ◽  
Cheng Fang Wang

Accurate prediction and effective control of springback are of great significance for the cold bending of plates. Based on an analysis of the principle and structure of the support vector machine method, a MATLAB-based support vector machine prediction process is studied, and prediction models for radius and resilience are set up. The predicted results were compared with BP neural network predictions to verify their accuracy. The work provides an effective method for the accurate prediction of springback.

