Prediction of Antimicrobial Peptides Based on Sequence Alignment and Support Vector Machine-Pairwise Algorithm Utilizing LZ-Complexity



Ooredoo Rayek

International Journal of Technology Diffusion ◽

10.4018/ijtd.2020040105 ◽

2020 ◽

Vol 11 (2) ◽

pp. 66-81

Author(s):

Badia Klouche ◽

Sidi Mohamed Benslimane ◽

Sakina Rim Bennabi

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Text Mining ◽

Sentiment Analysis ◽

Experimental Results ◽

Support Vector ◽

Textual Data ◽

New Strategy ◽

Set Up

Sentiment analysis is an emerging research area within text mining that classifies the polarity of opinions, particularly given the considerable number of opinions available on social media. The Algerian telephone operator Ooredoo, like other operators, seeks in its new strategy to win new customers by exploiting their opinions through sentiment analysis. The purpose of this work is to set up a system called “Ooredoo Rayek”, whose objective is to collect, transliterate, translate and classify the textual data expressed by Ooredoo's customers. The article develops a set of rules for transliterating Algerian Arabizi into Algerian dialect. Furthermore, the authors use Naïve Bayes (NB) and Support Vector Machine (SVM) classifiers to assign polarity tags to Facebook comments from the official Ooredoo pages, written in a multilingual and multi-dialect context. Experimental results show that the system performs well, with an accuracy of 83%.


Precipitation Forecast Based on WA-SVM Theory

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.518-523.4039 ◽

2012 ◽

Vol 518-523 ◽

pp. 4039-4042

Author(s):

Zhen Min Zhou

Keyword(s):

Time Series ◽

Support Vector Machine ◽

Wavelet Analysis ◽

Precipitation Forecast ◽

Support Vector ◽

Estimation Model ◽

Original Time Series ◽

Set Up ◽

Original Time

To improve the precision of medium- and long-term rainfall forecasts, a rainfall estimation model was set up based on wavelet analysis and support vector machine (WA-SVM). The model decomposes the original rainfall series into different layers through wavelet analysis, forecasts each layer with an SVM, and finally obtains the forecast for the original time series by recomposing the layer forecasts. The model was used to estimate the monthly rainfall sequence in the watershed. Compared with a method that uses only a support vector machine (SVM), the estimation accuracy improved noticeably.
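
The decompose, forecast-per-layer, recompose pipeline described in this abstract can be sketched as follows. This is a minimal illustration, not the author's model: it assumes a one-level, unnormalised Haar transform as the wavelet, and it uses a simple moving-average forecaster (the hypothetical `mean_of_last`) as a stand-in for the SVM. All function names and the sample rainfall values are illustrative assumptions.

```python
from typing import Callable, List, Tuple


def haar_decompose(series: List[float]) -> Tuple[List[float], List[float]]:
    """One-level unnormalised Haar transform of an even-length series.

    Returns (approximation, detail): pairwise means and pairwise half-differences.
    """
    if len(series) % 2 != 0:
        raise ValueError("series length must be even")
    approx = [(series[i] + series[i + 1]) / 2 for i in range(0, len(series), 2)]
    detail = [(series[i] - series[i + 1]) / 2 for i in range(0, len(series), 2)]
    return approx, detail


def haar_reconstruct(approx: List[float], detail: List[float]) -> List[float]:
    """Invert haar_decompose: each (a, d) pair yields the values a + d and a - d."""
    out: List[float] = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def mean_of_last(layer: List[float], window: int = 3) -> float:
    """Placeholder forecaster (assumption): mean of the last `window` values.

    The paper forecasts each layer with an SVM; any one-step-ahead
    regressor could be substituted here.
    """
    tail = layer[-window:]
    return sum(tail) / len(tail)


def forecast_next_pair(
    series: List[float],
    forecaster: Callable[[List[float]], float] = mean_of_last,
) -> List[float]:
    """Forecast each layer one step ahead, then recompose into two new values."""
    approx, detail = haar_decompose(series)
    next_approx = forecaster(approx)
    next_detail = forecaster(detail)
    return haar_reconstruct([next_approx], [next_detail])


# Hypothetical monthly rainfall values, for illustration only.
rainfall = [12.0, 8.0, 30.0, 26.0, 55.0, 45.0, 20.0, 16.0]
approx, detail = haar_decompose(rainfall)
assert haar_reconstruct(approx, detail) == rainfall  # decomposition is lossless
print(forecast_next_pair(rainfall))
```

Passing the forecaster as a parameter keeps the wavelet step independent of the learning step, so a trained regressor can replace the placeholder without touching the decomposition code.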


Prediction of subcellular localization of proteins using pairwise sequence alignment and support vector machine

Pattern Recognition Letters ◽

10.1016/j.patrec.2005.11.014 ◽

2006 ◽

Vol 27 (9) ◽

pp. 996-1001 ◽

Cited By ~ 19

Author(s):

Jong Kyoung Kim ◽

G.P.S. Raghava ◽

Sung-Yang Bang ◽

Seungjin Choi

Keyword(s):

Support Vector Machine ◽

Subcellular Localization ◽

Sequence Alignment ◽

Support Vector ◽

Pairwise Sequence Alignment


Comparative Analysis of H1N1 Avian Influenza Virus by Multiple Sequence Alignment and Support Vector Machine

Journal of Gene Therapy ◽

10.13188/2381-3326.1000003 ◽

2013 ◽

Vol 1 (1) ◽

Keyword(s):

Support Vector Machine ◽

Influenza Virus ◽

Comparative Analysis ◽

Avian Influenza ◽

Avian Influenza Virus ◽

Sequence Alignment ◽

Multiple Sequence Alignment ◽

Support Vector ◽

Multiple Sequence


Using the Chou’s Pseudo Component to Predict the ncRNA Locations Based on the Improved K-Nearest Neighbor (iKNN) Classifier

Current Bioinformatics ◽

10.2174/1574893614666191003142406 ◽

2020 ◽

Vol 15 (6) ◽

pp. 563-573

Author(s):

Chengyan Wu ◽

Qianzhong Li ◽

Ru Xing ◽

Guo-Liang Fan

Keyword(s):

Decision Making ◽

Support Vector Machine ◽

Secondary Structure ◽

Feature Selection Method ◽

Support Vector ◽

Sequence Information ◽

Diversity Combining ◽

Machine Method ◽

Increment Of Diversity ◽

Jackknife Test

Background: Identifying non-coding RNA at the organelle genome level is a challenging task. In our previous work, an ncRNA dataset with less than 80% sequence identity was built, and a method combining the increment of diversity with a support vector machine was proposed. Objective: Based on the ncRNA_361 dataset, a novel decision-making method, an improved KNN (iKNN) classifier, was proposed. Methods: In this paper, based on the iKNN algorithm, the physicochemical features of nucleotides, the degeneracy of genetic codons, and topological secondary structure were selected to represent the effective ncRNA characters. Then, the incremental feature selection method was utilized to optimize the feature set. Results: The results of iKNN indicated that the mean-value decision-making method is distinctly superior to the traditional majority-vote decision-making method and to the Increment of Diversity Combining Support Vector Machine (ID-SVM). The iKNN algorithm achieved an overall accuracy of 97.368% in the jackknife test when k=3. Conclusion: It should be noted that the triplets of the structure-sequence mode under reading frames not only contain the entire sequence information but also reflect whether each base is paired, and the secondary structural topological parameters further describe the ncRNA secondary structure on the spatial level. The ncRNA dataset and the iKNN classifier are freely available at http://202.207.14.87:8032/fuwu/iKNN/index.asp.
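
The contrast between a majority-vote decision and a mean-value decision can be illustrated with a small sketch. The abstract does not define the mean-value rule, so the code below assumes one plausible reading: for each class, take the k training samples of that class nearest to the query and assign the class with the smallest mean distance. This is an assumption for illustration, not the authors' iKNN, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import dist
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, class label)


def knn_majority_vote(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Classic KNN: the most common label among the k nearest samples wins."""
    nearest = sorted(train, key=lambda s: dist(s[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    # Ties are broken alphabetically so the result is deterministic.
    return min(votes, key=lambda label: (-votes[label], label))


def knn_mean_distance(train: List[Sample], query: Sequence[float], k: int = 3) -> str:
    """Assumed mean-value rule: smallest mean distance to a class's k nearest members."""
    distances: Dict[str, List[float]] = {}
    for features, label in train:
        distances.setdefault(label, []).append(dist(features, query))
    mean_distance = {}
    for label, values in distances.items():
        closest = sorted(values)[:k]
        mean_distance[label] = sum(closest) / len(closest)
    return min(mean_distance, key=lambda label: (mean_distance[label], label))


# Toy data: class A has two close samples and one far outlier.
train = [
    ((1.0, 0.0), "A"), ((1.1, 0.0), "A"), ((5.0, 0.0), "A"),
    ((1.2, 0.0), "B"), ((1.3, 0.0), "B"), ((1.4, 0.0), "B"),
]
query = (0.0, 0.0)
print(knn_majority_vote(train, query), knn_mean_distance(train, query))
```

On this toy data the two rules disagree: the three nearest samples are mostly class A, while class B is closer on average once the outlier in A is taken into account.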


LipoSVM: Prediction of Lysine lipoylation in Proteins based on the Support Vector Machine

Current Genomics ◽

10.2174/1389202919666191014092843 ◽

2019 ◽

Vol 20 (5) ◽

pp. 362-370 ◽

Cited By ~ 1

Author(s):

Meiqi Wu ◽

Pengchao Lu ◽

Yingxi Yang ◽

Liwen Liu ◽

Hui Wang ◽

...

Keyword(s):

Support Vector Machine ◽

Sampling Technique ◽

Experimental Methods ◽

Position Specific Scoring Matrix ◽

Support Vector ◽

Post Translational Modification ◽

Independent Test ◽

Scoring Matrix ◽

Sample Ratio ◽

Fold Cross Validation

Background: Lysine lipoylation, a rare and highly conserved post-translational modification of proteins, has been considered one of the most important processes in the biological field. To obtain a comprehensive understanding of the regulatory mechanism of lysine lipoylation, the key is to identify lysine lipoylated sites. Because experimental methods are expensive, laborious and complex, computational ways to predict lipoylation sites are urgently needed. Methodology: In this work, a predictor named LipoSVM is developed to accurately predict lipoylation sites. To overcome the problem of unbalanced samples, the synthetic minority over-sampling technique (SMOTE) is utilized to balance negative and positive samples. Furthermore, different ratios of positive and negative samples are chosen as training sets. Results: After comparing five different encoding schemes and five classification algorithms, LipoSVM is finally constructed using a training set with a positive-to-negative sample ratio of 1:1, combining the position-specific scoring matrix with a support vector machine. The best performance achieves an accuracy of 99.98% and an AUC of 0.9996 in 10-fold cross-validation. The AUC on the independent test set reaches 0.9997, which demonstrates the robustness of LipoSVM. The analysis of lysine lipoylation and non-lipoylation fragments shows statistically significant differences. Conclusion: A good predictor for lysine lipoylation is built based on the position-specific scoring matrix and a support vector machine. The LipoSVM webserver can be freely downloaded from https://github.com/stars20180811/LipoSVM.
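
The SMOTE step mentioned in this abstract can be sketched in a few lines. This is a minimal pure-Python illustration of the general technique (interpolating between a minority sample and one of its nearest minority neighbours), not the LipoSVM code; the function name `smote`, its parameters, and the toy vectors are assumptions made for the example.

```python
import random
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(
    minority: List[Sequence[float]],
    n_synthetic: int,
    k: int = 3,
    seed: int = 0,
) -> List[List[float]]:
    """Generate n_synthetic points by interpolating between minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic: List[List[float]] = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest first.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: squared_distance(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy example: 4 positive samples to be balanced 1:1 against 10 negatives.
positives = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
n_negatives = 10
new_points = smote(positives, n_synthetic=n_negatives - len(positives))
print(len(positives) + len(new_points))
```

Synthetic points are usually generated from the training folds only, so that interpolated samples do not leak into the data used for evaluation.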


Improving PSI-BLAST’s Fold Recognition Performance through Combining Consensus Sequences and Support Vector Machine

Bioinformatics ◽

10.4018/978-1-4666-3604-0.ch087 ◽

2013 ◽

pp. 1667-1675

Author(s):

Ren-Xiang Yan ◽

Jing Liu ◽

Yi-Min Tao

Keyword(s):

Support Vector Machine ◽

Sequence Alignment ◽

Recognition Performance ◽

Consensus Sequence ◽

Early Time ◽

Fold Recognition ◽

Support Vector ◽

Sequence Information ◽

Consensus Sequences ◽

Profile Alignment

Profile-profile alignment may be the most sensitive and useful computational approach for identifying remote homologies and recognizing protein folds. However, profile-profile alignment is usually much more complex and slower than sequence-sequence or profile-sequence alignment. A profile, or PSSM (position-specific scoring matrix), represents the mutational variability at each sequence position of a protein as a vector of amino acid substitution frequencies, which makes it a much richer encoding of a protein sequence. The consensus sequence, which can be considered a simplified profile, was used in early work to improve sequence alignment accuracy. Recently, several studies have sought to improve PSI-BLAST’s fold recognition performance by using consensus sequence information. There are several ways to compute a consensus sequence. Based on these considerations, this chapter proposes a method that combines the information from different types of consensus sequences with the assistance of support vector machine learning. Benchmark results suggest that the method can further improve PSI-BLAST’s fold recognition performance.
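
As the abstract notes, there are several ways to compute a consensus sequence. The sketch below shows one simple variant, a majority-rule consensus over the columns of a multiple alignment. It is only an illustration and is not necessarily any of the consensus types the authors combine; the function name, the gap handling, the tie-breaking rule, and the toy alignment are all assumptions.

```python
from collections import Counter
from typing import List


def majority_consensus(alignment: List[str], gap: str = "-") -> str:
    """Return the most frequent non-gap residue in each alignment column."""
    if not alignment:
        raise ValueError("alignment is empty")
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    consensus = []
    for column in zip(*alignment):
        counts = Counter(residue for residue in column if residue != gap)
        if not counts:
            consensus.append(gap)  # column contains only gaps
            continue
        # Highest count wins; ties are broken alphabetically for determinism.
        consensus.append(min(counts, key=lambda r: (-counts[r], r)))
    return "".join(consensus)


# Hypothetical toy alignment, for illustration only.
aligned = ["MKV-LA", "MKI-LA", "MRVGLS", "MKV-LA"]
print(majority_consensus(aligned))  # prints "MKVGLA"
```

Ignoring gaps when counting means that a column with a single residue still contributes that residue; a stricter variant could emit a gap whenever gaps are in the majority.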


A Two-Layer SVM Ensemble-Classifier to Predict Interface Residue Pairs of Protein Trimers

Molecules ◽

10.3390/molecules25194353 ◽

2020 ◽

Vol 25 (19) ◽

pp. 4353

Author(s):

Yanfen Lyu ◽

Xinqi Gong

Keyword(s):

Amino Acids ◽

Support Vector Machine ◽

Biological Significance ◽

Correlation Coefficients ◽

Ensemble Classifier ◽

Interface Residue ◽

Support Vector ◽

Feature Combination ◽

Test Set ◽

Independent Test

The study of interface residue pairs is important for understanding the interactions between monomers inside a trimer protein–protein complex. We developed a two-layer support vector machine (SVM) ensemble-classifier that considers physicochemical and geometric properties of amino acids and the influence of surrounding amino acids. Different descriptors and different combinations may give different prediction results. We propose feature combination engineering based on correlation coefficients and F-values. The accuracy of our method is 65.38% on the independent test set, indicating biological significance. Our predictions are consistent with the experimental results, which shows the effectiveness and reliability of our method for predicting interface residue pairs of protein trimers.


Study on Prediction Model of Springback Based on Support Vector Machine

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.633-634.734 ◽

2014 ◽

Vol 633-634 ◽

pp. 734-737

Author(s):

Shao Juan Su ◽

Yong Hu ◽

Cheng Fang Wang

Keyword(s):

Support Vector Machine ◽

Prediction Models ◽

Structure Study ◽

Accurate Prediction ◽

Effective Control ◽

Support Vector ◽

Machine Method ◽

Cold Bending ◽

Neural Network Prediction ◽

Set Up

Accurate prediction and effective control of springback are of great significance for the cold bending forming of plates. Based on an analysis of the principle and structure of the support vector machine method, a MATLAB-based support vector machine prediction process is studied, and prediction models for radius and resilience are set up. The predicted results were compared with BP neural network predictions to verify their accuracy. The work provides an effective method for the accurate prediction of springback.
