Open-Source Essential Protein Prediction Model by Integrating Chi-Square and Support Vector Machine

S. R. Mani Sekhar;  Siddesh G. M.; Sunilkumar S. Manvi

doi:10.4018/ijossp.2020070103

International Journal of Open Source Software and Processes ◽

10.4018/ijossp.2020070103 ◽

2020 ◽

Vol 11 (3) ◽

pp. 38-56

Author(s):

S. R. Mani Sekhar ◽

Siddesh G. M. ◽

Sunilkumar S. Manvi

Keyword(s):

Support Vector Machine ◽

Open Source ◽

Vital Role ◽

Support Vector ◽

Chi Square ◽

Essential Proteins ◽

Topological Features ◽

Proposed Model ◽

Protein Prediction ◽

Svm Model

Identification and analysis of proteins play a vital role in drug design and disease prediction. Several open-source applications have been developed for identifying essential proteins based on biological or topological features. These techniques infer the likelihood that a protein is essential from network topology and feature selection, which can ignore some features to reduce complexity and consequently lowers accuracy. In this paper, the authors used a Selenium driver to scrape the dataset. They then integrated the chi-square method with a support vector machine to predict essential proteins in baker's yeast. Here, chi-square is a test of dissimilarity used for altering the records, after which the support vector machine classifies the test dataset. The results show that the proposed Chi-SVM model achieves an accuracy of 99.56%, whereas BC and CC achieved accuracies of 84.0% and 86.0%, respectively. Finally, the proposed model is validated using statistical performance measures such as PPA, NPA, SA, and STA.
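The chi-square step described in this abstract can be illustrated with a minimal sketch. It assumes binary (0/1) features and a binary essential/non-essential label, which the abstract does not specify; the function names and the toy feature names ("hub", "noise") are hypothetical and are not taken from the paper. Only the scoring step is shown; an SVM would then be trained on the top-ranked features.

```python
def chi_square_2x2(feature, label):
    """Pearson chi-square statistic for two binary (0/1) sequences."""
    if len(feature) != len(label):
        raise ValueError("feature and label must have the same length")
    n = len(feature)
    # Observed counts of the 2x2 contingency table, indexed [feature][label].
    observed = [[0, 0], [0, 0]]
    for f, y in zip(feature, label):
        observed[f][y] += 1
    row = [sum(observed[0]), sum(observed[1])]
    col = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected in this cell if feature and label were independent.
            expected = row[i] * col[j] / n
            if expected > 0:
                stat += (observed[i][j] - expected) ** 2 / expected
    return stat


def rank_features(columns, label):
    """Return feature names sorted by descending chi-square score."""
    scores = {name: chi_square_2x2(values, label)
              for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

A feature that is independent of the label scores 0, while a feature that matches the label exactly scores n, the number of samples.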

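Support Vector Machine Berbasis Feature Selection Untuk Sentiment Analysis Kepuasan Pelanggan Terhadap Pelayanan Warung dan Restoran Kuliner Kota Tegal [Feature Selection-Based Support Vector Machine for Sentiment Analysis of Customer Satisfaction with Food Stall and Culinary Restaurant Services in Tegal City]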

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.201855867 ◽

2018 ◽

Vol 5 (5) ◽

pp. 537 ◽

Cited By ~ 1

Author(s):

Oman Somantri ◽

Dyah Apriliani

Keyword(s):

Support Vector Machine ◽

Feature Selection ◽

Sentiment Analysis ◽

Information Gain ◽

Support Vector ◽

Chi Square ◽

Proposed Model ◽

Chi Squared ◽

The Difference ◽

Increase In Accuracy

Customers need decision support when choosing a restaurant or eatery that matches their preferences, for example in Tegal City. Sentiment analysis is used to address this problem by applying the Support Vector Machine (SVM) algorithm. The purpose of this research is to optimize the resulting model by applying feature selection with the Information Gain (IG) and Chi Square algorithms to the best model produced by SVM for classifying customer satisfaction with food stalls and culinary restaurants in Tegal City, so that the accuracy of the model increases. The results show that the best accuracy was produced by the SVM-IG model at 72.45%, an increase of about 3.08% over the initial 69.36%. The average gain in accuracy after optimizing SVM with feature selection is 2.51%. Based on these results, feature selection using Information Gain (SVM-IG) achieves better accuracy than SVM alone and SVM with Chi Squared (SVM-CS), so the proposed model improves the accuracy of SVM.
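As an illustration of the Information Gain criterion named in this abstract, the following sketch scores one discrete feature against class labels. The function names and the toy sentiment data are hypothetical; the abstract does not describe the authors' tooling or feature representation.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Weighted average entropy of the labels within each feature value.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

Features (for example, word-presence indicators) with the highest gain would be kept before training the SVM; a feature that separates the classes perfectly gains the full label entropy, and an uninformative one gains zero.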

Prediction of Classification of Rock Burst Risk Based on Genetic Algorithms with SVM

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.628.383 ◽

2014 ◽

Vol 628 ◽

pp. 383-389 ◽

Cited By ~ 6

Author(s):

Ya Hui Peng ◽

Kang Peng ◽

Jian Zhou ◽

Zhi Xiang Liu

Keyword(s):

Support Vector Machine ◽

Genetic Algorithms ◽

Rock Burst ◽

Support Vector ◽

Proposed Model ◽

Svm Model ◽

Rock Burst Risk ◽

Rock Burst Prediction ◽

Optimization Search

Because rock burst hazard assessment systems are complex, this study established a support vector machine (SVM) model for predicting the classification of rock burst, based on SVM theory and the actual characteristics of the project. The main factors of rock burst, such as coal seam, dip, buried depth, structure situation, change of pitch angle, change of coal thickness, gas concentration, roof management, pressure relief, and shooting, were defined as the criterion indices for rock burst prediction in the proposed model. To determine reasonable and efficient SVM parameters, an appropriate fitness function for the genetic algorithm (GA) was first determined, and the SVM parameters were then optimized by a real-coded GA, yielding the combined genetic algorithm and support vector machine (GSVM) model. The GSVM model was trained on 23 sets of measured data; cross-validation was introduced to verify its stability, and the mis-discrimination ratio was 0. The proposed model was then used to predict 12 new rock burst samples; the correct rate of the predictions was 91.6667%, in agreement with the actual situation. The results show that the genetic algorithm can speed up the SVM parameter optimization search and that the proposed model has high credibility for rock burst risk classification and can be applied in practical engineering.
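A real-coded GA parameter search of the kind described here can be sketched as follows. Everything in this block is an assumption for illustration: the fitness function is a hypothetical stand-in for a cross-validation error surface (the abstract does not give the authors' fitness function), and the selection, crossover, and mutation operators are common textbook choices rather than the ones used in the paper.

```python
import random


def surrogate_error(genes):
    """Hypothetical stand-in for a cross-validation error surface.

    genes = (log10 C, log10 gamma); the minimum is placed at (1, -2).
    """
    log_c, log_gamma = genes
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


def real_coded_ga(fitness, bounds, pop_size=20, generations=40,
                  mutation_scale=0.3, seed=0):
    """Minimize `fitness` over real-valued genes within `bounds`."""
    rng = random.Random(seed)

    def clip(value, low, high):
        return max(low, min(high, value))

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0][:]]          # elitism: keep the best as is
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            w = rng.random()
            # Arithmetic crossover followed by Gaussian mutation.
            child = [w * x + (1 - w) * y for x, y in zip(a, b)]
            child = [clip(g + rng.gauss(0.0, mutation_scale), lo, hi)
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = children
    best = min(population, key=fitness)
    history.append(fitness(best))
    return best, history
```

In a real setting the surrogate would be replaced by the cross-validated error of an SVM trained with the decoded parameters. Because the best individual is always carried over, the recorded best fitness never gets worse from one generation to the next.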

Combination of Support Vector Machine and K-Fold cross-validation for prediction of long-term degradation of the compressive strength of marine concrete

International Journal of Computational Physics Series ◽

10.29167/a1i1p120-130 ◽

2018 ◽

Vol 1 (1) ◽

pp. 120-130 ◽

Cited By ~ 1

Author(s):

Chunxiang Qian ◽

Wence Kang ◽

Hao Ling ◽

Hua Dong ◽

Chengyao Liang ◽

...

Keyword(s):

Support Vector Machine ◽

Environmental Factors ◽

Cross Validation ◽

Concrete Strength ◽

Simulation Method ◽

Support Vector ◽

Svm Model ◽

Artificial Neural Network Ann ◽

Influence Degree ◽

Fold Cross Validation

A Support Vector Machine (SVM) model optimized by K-fold cross-validation was built to predict and evaluate the degradation of concrete strength in a complicated marine environment. Several other models, such as an Artificial Neural Network (ANN) and a Decision Tree (DT), were also built and compared with the SVM to determine which made the most accurate predictions. Both material and environmental factors that influence the results were considered. The material factors mainly involved the original concrete strength and the amounts of cement replaced by fly ash and slag. The environmental factors consisted of the concentrations of Mg²⁺, SO₄²⁻, and Cl⁻, temperature, and exposure time. The prediction results indicated that the optimized SVM model performed better than the other models in predicting concrete strength. Based on the SVM model, a simulation method of variable limitation was used to determine the sensitivity of the various factors and their degree of influence on the degradation of concrete strength.
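The K-fold splitting that underlies this kind of model selection can be sketched as below. This is a generic illustration, not the authors' procedure: the function name is hypothetical, the samples are split in order without shuffling, and how the folds were used to tune the SVM is not stated in the abstract.

```python
def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k (train, test) index pairs.

    Folds are contiguous and differ in size by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # The first (n_samples % k) folds receive one extra sample.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds
```

Each candidate parameter setting would be trained on every train split and scored on the matching test split, and the setting with the best average score would be kept.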

Three-Dimensional Site Characterization Model of Bangalore Using Support Vector Machine

ISRN Soil Science ◽

10.5402/2012/346439 ◽

2012 ◽

Vol 2012 ◽

pp. 1-10

Author(s):

Pijush Samui

Keyword(s):

Support Vector Machine ◽

Three Dimensional ◽

Site Characterization ◽

Standard Penetration Test ◽

Support Vector ◽

Characterization Model ◽

Svm Model ◽

Input Variables ◽

Learning Machine ◽

A Site

The main objective of site characterization is the prediction of in situ soil properties at any half-space point at a site based on limited tests. In this study, the Support Vector Machine (SVM) has been used to develop a three-dimensional site characterization model for Bangalore, India, based on a large amount of Standard Penetration Test (SPT) data. SVM is a novel type of learning machine based on statistical learning theory that performs regression by introducing an ε-insensitive loss function. The database consists of 766 boreholes, with more than 2700 field SPT values spread over a 220 sq km area of Bangalore. The model is applied to corrected SPT values. The three input variables for the SVM model were the coordinates of a point in Bangalore, and the output of the SVM was the corrected SPT value. The results presented in this paper clearly highlight that the SVM is a robust tool for site characterization. A sensitivity analysis of the SVM parameters, including σ and ε, is also presented.
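The ε-insensitive loss mentioned in this abstract ignores residuals smaller than ε and grows linearly beyond that. A minimal sketch follows; the function names are hypothetical and the numbers in the usage note are made up for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Loss for one prediction: zero inside the epsilon tube, linear outside."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def mean_epsilon_insensitive_loss(targets, predictions, epsilon):
    """Average epsilon-insensitive loss over paired targets and predictions."""
    losses = [epsilon_insensitive_loss(t, p, epsilon)
              for t, p in zip(targets, predictions)]
    return sum(losses) / len(losses)
```

With ε = 1.0, a prediction of 10.5 for a target of 10.0 costs nothing, while a prediction of 13.0 costs 2.0. A wider tube tolerates more error without penalty, which is why ε is a natural subject for sensitivity analysis.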

A support vector machine with the tabu search algorithm for freeway incident detection

International Journal of Applied Mathematics and Computer Science ◽

10.2478/amcs-2014-0030 ◽

2014 ◽

Vol 24 (2) ◽

pp. 397-404 ◽

Cited By ~ 37

Author(s):

Baozhen Yao ◽

Ping Hu ◽

Mingheng Zhang ◽

Maoqing Jin

Keyword(s):

Support Vector Machine ◽

Tabu Search ◽

Traffic Management ◽

Search Algorithm ◽

Detection System ◽

Support Vector ◽

Incident Detection ◽

Tabu Search Algorithm ◽

Proposed Model ◽

Parameter Values

Automated Incident Detection (AID) is an important part of Advanced Traffic Management and Information Systems (ATMISs). An automated incident detection system can effectively provide information on an incident, which can help initiate the required measure to reduce the influence of the incident. To accurately detect incidents in expressways, a Support Vector Machine (SVM) is used in this paper. Since the selection of optimal parameters for the SVM can improve prediction accuracy, the tabu search algorithm is employed to optimize the SVM parameters. The proposed model is evaluated with data for two freeways in China. The results show that the tabu search algorithm can effectively provide better parameter values for the SVM, and SVM models outperform Artificial Neural Networks (ANNs) in freeway incident detection.
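Tabu search over SVM parameters can be sketched on a small grid of exponents. This is an illustrative sketch only: the grid ranges, the neighborhood definition, the tabu list length, and the objective (a hypothetical stand-in for validation error with its minimum placed at exponents 3 and -7) are all assumptions, since the abstract does not describe the authors' configuration.

```python
C_EXPONENTS = range(-5, 16)      # assumed grid: C = 2**c
GAMMA_EXPONENTS = range(-15, 4)  # assumed grid: gamma = 2**g


def validation_error_stub(point):
    """Hypothetical stand-in for validation error; minimum at (3, -7)."""
    c, g = point
    return (c - 3) ** 2 + (g + 7) ** 2


def grid_neighbors(point):
    """Grid points one step away in either exponent, kept within bounds."""
    c, g = point
    moves = [(c + 1, g), (c - 1, g), (c, g + 1), (c, g - 1)]
    return [(a, b) for a, b in moves
            if a in C_EXPONENTS and b in GAMMA_EXPONENTS]


def tabu_search(objective, start, neighbors, iterations=50, tabu_size=5):
    """Minimize `objective`, forbidding moves back to recently visited points."""
    current = start
    best, best_score = start, objective(start)
    tabu = [start]
    for _ in range(iterations):
        candidates = [n for n in neighbors(current) if n not in tabu]
        if not candidates:
            break
        # Take the best admissible move even if it is worse than the
        # current point; this is what lets the search leave local minima.
        current = min(candidates, key=objective)
        tabu.append(current)
        if len(tabu) > tabu_size:
            tabu.pop(0)
        score = objective(current)
        if score < best_score:
            best, best_score = current, score
    return best, best_score
```

In practice the stub would be replaced by the validation error of an SVM trained with the corresponding C and gamma, and the best point found is returned even though the search keeps moving after reaching it.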

Estimation of CO2 Diffusivity in Brine by Use of the Genetic Algorithm and Mixed Kernels-Based Support Vector Machine Model

Journal of Energy Resources Technology ◽

10.1115/1.4041724 ◽

2018 ◽

Vol 141 (4) ◽

Cited By ~ 4

Author(s):

Qihong Feng ◽

Ronghao Cui ◽

Sen Wang ◽

Jin Zhang ◽

Zhe Jiang

Keyword(s):

Genetic Algorithm ◽

Support Vector Machine ◽

Diffusion Coefficient ◽

Co2 Storage ◽

Co2 Injection ◽

Support Vector ◽

Hybrid Technique ◽

Machine Model ◽

Proposed Model ◽

Wide Range

The diffusion coefficient of carbon dioxide (CO2), a significant parameter describing the mass transfer process, exerts a profound influence on the safety of CO2 storage in depleted reservoirs, saline aquifers, and marine ecosystems. However, experimental determination of the diffusion coefficient in a CO2-brine system is time-consuming and complex because the procedure requires sophisticated laboratory equipment and reasonable interpretation methods. To facilitate the acquisition of more accurate values, an intelligent model, termed MKSVM-GA, is developed using a hybrid technique of support vector machine (SVM), mixed kernels (MK), and genetic algorithm (GA). Confirmed by the statistical evaluation indicators, our proposed model exhibits excellent performance with high accuracy and strong robustness in a wide range of temperatures (273–473.15 K), pressures (0.1–49.3 MPa), and viscosities (0.139–1.950 mPa·s). Our results show that the proposed model is more applicable than the artificial neural network (ANN) model at this sample size, which is superior to four commonly used traditional empirical correlations. The technique presented in this study can provide a fast and precise prediction of CO2 diffusivity in brine at reservoir conditions for the engineering design and the technical risk assessment during the process of CO2 injection.
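One common way to build a mixed kernel is a convex combination of a local kernel and a global kernel. The sketch below assumes an RBF kernel and a polynomial kernel; the abstract does not say which kernels the authors mixed, so this pairing, the function names, and the parameter names are assumptions.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def polynomial_kernel(x, z, degree, coef0=1.0):
    """Polynomial kernel: (x . z + coef0) ** degree."""
    return (sum(a * b for a, b in zip(x, z)) + coef0) ** degree


def mixed_kernel(x, z, weight, gamma, degree):
    """Convex combination of the RBF and polynomial kernels.

    weight = 1 gives the pure RBF kernel, weight = 0 the pure polynomial one.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return (weight * rbf_kernel(x, z, gamma)
            + (1.0 - weight) * polynomial_kernel(x, z, degree))
```

A convex combination of two valid kernels is itself a valid kernel, so the result can be passed to any SVM solver that accepts a custom kernel. In a GA-tuned setup, the mixing weight would be searched together with gamma and the degree.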

Simulating the Stress-Strain Relationship of Geomaterials by Support Vector Machine

Mathematical Problems in Engineering ◽

10.1155/2014/482672 ◽

2014 ◽

Vol 2014 ◽

pp. 1-7 ◽

Cited By ~ 1

Author(s):

Hongbo Zhao ◽

Zenghui Huang ◽

Zhengsheng Zou

Keyword(s):

Support Vector Machine ◽

Constitutive Relationship ◽

Support Vector ◽

Nonlinear Relationship ◽

Ann Model ◽

Stress Strain ◽

Svm Model ◽

Strain Relationship ◽

Artificial Neural Network Ann ◽

Relationship Of

The stress-strain relationship of geomaterials is important to numerical analysis in geotechnical engineering, but it is difficult to represent accurately with a conventional constitutive model. The artificial neural network (ANN) has been proposed as a more effective approach to represent this complex and nonlinear relationship, but ANN itself still has some limitations that restrict the applicability of the method. In this paper, an alternative method, the support vector machine (SVM), is proposed to simulate this type of complex constitutive relationship. The SVM model can overcome the limitations of the ANN model while still possessing its advantages over the traditional model. The application examples show that it is an effective and accurate modeling approach for representing the stress-strain relationship of geomaterials.

Prediction of Shear Strength of Soil Using Direct Shear Test and Support Vector Machine Model

The Open Construction and Building Technology Journal ◽

10.2174/1874836802014010041 ◽

2020 ◽

Vol 14 (1) ◽

pp. 41-50 ◽

Cited By ~ 2

Author(s):

Hai-Bang Ly ◽

Binh Thai Pham

Keyword(s):

Support Vector Machine ◽

Shear Strength ◽

Moisture Content ◽

Mean Squared Error ◽

Learning Algorithm ◽

Liquid Limit ◽

Support Vector ◽

Plastic Limit ◽

Svm Model ◽

Soil Shear Strength

Background: Shear strength of soil, the magnitude of shear stress that a soil can sustain, is an important factor in geotechnical engineering. Objective: This study develops a machine learning algorithm, namely the Support Vector Machine (SVM), to predict the shear strength of soil from six input variables: clay content, moisture content, specific gravity, void ratio, liquid limit, and plastic limit. Methods: A large number of experimental measurements, comprising more than 500 samples, was gathered from the technical reports of the Long Phu 1 power plant project. The accuracy of the proposed SVM was evaluated using statistical indicators such as the coefficient of correlation (R), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE) over 200 simulations to account for the random sampling effect. Finally, the most accurate SVM model was used to interpret the prediction results using Partial Dependence Plots (PDP). Results: Validation results showed that the SVM model performed well in predicting soil shear strength (R = 0.9 to 0.95), and the moisture content, liquid limit, and plastic limit were found to be the three features that most affect the prediction. Conclusion: This study might help in quick and accurate prediction of soil shear strength for practical purposes in civil engineering.
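The three indicators named in the Methods (R, RMSE, MAE) have standard definitions, sketched below in plain Python. The function names are hypothetical and the values in the usage note are illustrative, not data from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def pearson_r(y_true, y_pred):
    """Pearson coefficient of correlation between targets and predictions.

    Assumes neither sequence is constant (otherwise the variance is zero).
    """
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)
```

Predictions that are uniformly one unit too high, such as [2, 3, 4, 5] against targets [1, 2, 3, 4], give R = 1 with RMSE = MAE = 1. This shows why R alone does not capture a constant bias and why the error measures are reported alongside it.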

Predicting Students Grades Using Artificial Neural Networks and Support Vector Machine

Advanced Methodologies and Technologies in Modern Education Delivery - Advances in Educational Technologies and Instructional Design ◽

10.4018/978-1-5225-7365-4.ch059 ◽

2019 ◽

pp. 751-766

Author(s):

Sajid Umair ◽

Muhammad Majid Sharif

Keyword(s):

Neural Networks ◽

Support Vector Machine ◽

Artificial Neural Networks ◽

Student Performance ◽

Semantic Analysis ◽

Vital Role ◽

Support Vector ◽

Data Set ◽

Important Research Topic ◽

Artificial Neural

Prediction of student performance on the basis of habits has been a very important research topic in academics. Studies show that selection of the correct data set also plays a vital role in these predictions. In this chapter, the authors took data from different schools containing student habits and comments, analyzed it using latent semantic analysis to extract semantics, and then used a support vector machine to classify the data into two classes: important for prediction and not important. Finally, they used artificial neural networks to predict the grades of students. Regression was also applied to the data passed on by the support vector machine, using only the data classified as important for prediction.

Combustible Gas Classification Modeling using Support Vector Machine and Pairing Plot Scheme

Sensors ◽

10.3390/s19225018 ◽

2019 ◽

Vol 19 (22) ◽

pp. 5018 ◽

Cited By ~ 1

Author(s):

Kyu-Won Jang ◽

Jong-Hyeok Choi ◽

Ji-Hoon Jeon ◽

Hyun-Seok Kim

Keyword(s):

Support Vector Machine ◽

Gas Sensors ◽

Learning Algorithm ◽

Support Vector ◽

Detection Accuracy ◽

Minimum Concentration ◽

Industrial Sites ◽

Combustible Gas ◽

Proposed Model ◽

Combustible Gases

Combustible gases, such as CH4 and CO, directly or indirectly affect the human body. Thus, leakage detection of combustible gases is essential for various industrial sites and daily life. Many types of gas sensors are used to identify these combustible gases, but since gas sensors generally have low selectivity among gases, coupling issues often arise that adversely affect gas detection accuracy. To solve this problem, we built a decoupling algorithm that combines different gas sensors with a machine learning algorithm. Commercially available semiconductor sensors were employed to detect CH4 and CO, and a support vector machine (SVM) was then applied as a supervised learning algorithm for gas classification. We also introduced a pairing plot scheme to classify gas type more effectively. The proposed model classified CH4 and CO gases 100% correctly at all levels above the minimum concentration the gas sensors could detect. Consequently, SVM with the pairing plot scheme is a memory-efficient and promising method for more accurate gas classification.
