Using Stratified Sample and Grid Search to Improve Disease Prediction Accuracy of SVM

Efficient Data-Mining Algorithm for Predicting Heart Disease Based on an Angiographic Test

2021 ◽

Vol 28 (5) ◽

pp. 118-129

Author(s):

Alabi Waheed Banjoko ◽

Kawthar Opeyemi Abdulazeez ◽

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Heart Disease ◽

Cross Validation ◽

Classification Method ◽

Support Vector ◽

Data Mining Algorithm ◽

Machine Method ◽

Mining Algorithm ◽

Splitting Ratio

Background: The computerised classification and prediction of heart disease can help medical personnel reach a fast and accurate diagnosis. This study presents an efficient classification method for predicting heart disease using a data-mining algorithm. Methods: The algorithm uses the weighted support vector machine method to classify heart disease based on a binary response that indicates the presence or absence of heart disease as the result of an angiographic test. The optimal values of the support vector machine and Radial Basis Function kernel parameters for the heart disease classification were determined via 10-fold cross-validation. The heart disease data were partitioned into training and testing sets using different percentages of the splitting ratio. Each training set was used to train the classification method, while the predictive power of the method was evaluated on each test set using the Monte-Carlo cross-validation resampling technique. The effect of different percentages of the splitting ratio on the method was also observed. Results: The misclassification error rate was used to compare the method with three selected machine learning methods, and the proposed method performed best in all cases considered. Conclusion: The results illustrate that the classification algorithm presented can effectively predict the heart disease status of an individual based on the results of an angiographic test.
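
The resampling procedure described in the abstract above can be illustrated with a minimal sketch. It is not the authors' code: a nearest-centroid rule stands in for the weighted support vector machine, the data are synthetic, and every name and setting here (`monte_carlo_cv`, `fit_centroids`, the split ratios 0.6 to 0.8, 50 repeats) is an assumption made for illustration. It shows only how repeated random train/test splits at several splitting ratios yield an average misclassification error rate.

```python
import random


def misclassification_rate(y_true, y_pred):
    """Fraction of predictions that differ from the true labels."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def fit_centroids(X, y):
    """Stand-in classifier (NOT a weighted SVM): one mean vector per class."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def predict_centroids(model, X):
    """Assign each row to the class with the nearest centroid."""
    def dist2(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(model, key=lambda c: dist2(row, model[c])) for row in X]


def monte_carlo_cv(X, y, train_fraction, n_repeats=50, seed=0):
    """Average test error over repeated random train/test partitions."""
    rng = random.Random(seed)
    n = len(X)
    n_train = int(round(train_fraction * n))
    errors = []
    for _ in range(n_repeats):
        idx = list(range(n))
        rng.shuffle(idx)  # a fresh random partition on every repeat
        train, test = idx[:n_train], idx[n_train:]
        model = fit_centroids([X[i] for i in train], [y[i] for i in train])
        preds = predict_centroids(model, [X[i] for i in test])
        errors.append(misclassification_rate([y[i] for i in test], preds))
    return sum(errors) / len(errors)


# Synthetic two-class data (hypothetical, for illustration only).
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(100)]
X += [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(100)]
y = [0] * 100 + [1] * 100

# One averaged error rate per splitting ratio (fraction used for training).
results = {ratio: monte_carlo_cv(X, y, ratio) for ratio in (0.6, 0.7, 0.8)}
for ratio, err in results.items():
    print(f"train fraction {ratio:.1f}: mean misclassification rate {err:.3f}")
```

Replacing `fit_centroids` and `predict_centroids` with any other classifier leaves the resampling loop unchanged, which is what allows several methods to be compared on the same error measure.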

Data Mining Algorithm and the Effectiveness of Mathematics Classroom Teaching based on Support Vector Machine

International Journal of Database Theory and Application ◽

10.14257/ijdta.2016.9.11.15 ◽

2016 ◽

Vol 9 (11) ◽

pp. 163-174 ◽

Cited By ~ 1

Author(s):

Tang Qiang

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Mathematics Classroom ◽

Classroom Teaching ◽

Support Vector ◽

Data Mining Algorithm ◽

Mining Algorithm

A Method for Classification Using Data Mining Technique for Diabetes

Psychology and Mental Health ◽

10.4018/978-1-5225-0159-6.ch030 ◽

2016 ◽

pp. 738-761

Author(s):

Ahmad Al-Khasawneh

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Classification Accuracy ◽

Health Information System ◽

Parameters Optimization ◽

Support Vector ◽

Data Mining Algorithms ◽

Predictive Data Mining ◽

Severity Of The Disease ◽

Using Data

Many researchers in the health information system field have been drawn to developing computer applications that help in the diagnosis process. Data mining algorithms play a vital role in all of these applications, and many contributions have been made in this area. There has always been a debate about which algorithm gives the best classifier, which parameters to use, which dataset pre-processing steps to apply, and so on. In this paper, the author emphasizes that the best way to build a predictive model with relatively high classification accuracy is to build several predictive models and to choose the one that gives the best results through parameter optimization. Diagnosing diabetes mellitus has gained considerable attention in the last few decades due to the increased severity of the disease. In this research, the author reviews four predictive data mining approaches that are used in diagnosing diabetes. Four models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbour, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine achieved the highest classification accuracy, 78.83%, outperforming the others.
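
The idea of building several candidate models and keeping the one that scores best after parameter optimization can be sketched as follows. This is a hedged illustration, not the author's procedure: it tunes only the `k` of a k-nearest-neighbour classifier on synthetic data, and the helper names (`knn_predict`, `select_k`), the candidate values and the 80/40 train/validation split are all assumptions.

```python
import random
from collections import Counter


def knn_predict(train_X, train_y, row, k):
    """Majority vote among the k nearest training points (squared Euclidean)."""
    order = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], row)),
    )
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_X, train_y, val_X, val_y, k):
    """Share of validation rows whose predicted label matches the true one."""
    hits = sum(
        1 for row, label in zip(val_X, val_y)
        if knn_predict(train_X, train_y, row, k) == label
    )
    return hits / len(val_y)


def select_k(train_X, train_y, val_X, val_y, candidates):
    """Score every candidate k and keep the best (ties go to the smaller k)."""
    scores = {k: accuracy(train_X, train_y, val_X, val_y, k) for k in candidates}
    best_k = max(scores, key=lambda k: (scores[k], -k))
    return best_k, scores


# Synthetic two-class data (hypothetical; not the PIMA dataset).
rng = random.Random(7)
points = [([rng.gauss(0, 1), rng.gauss(0, 1)], 0) for _ in range(60)]
points += [([rng.gauss(2, 1), rng.gauss(2, 1)], 1) for _ in range(60)]
rng.shuffle(points)
train, val = points[:80], points[80:]
train_X, train_y = [p[0] for p in train], [p[1] for p in train]
val_X, val_y = [p[0] for p in val], [p[1] for p in val]

best_k, scores = select_k(train_X, train_y, val_X, val_y, candidates=(1, 3, 5, 7, 9))
print("validation accuracy per k:", scores)
print("selected k:", best_k)
```

Because the validation set is used to pick `k`, its accuracy is an optimistic estimate; a separate held-out test set would be needed to report the final figure.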

Support Vector Machine for Text Categorization using Principle Component Analysis in Data Mining

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d7350.018520 ◽

2020 ◽

Vol 8 (5) ◽

pp. 3164-3167

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Principle Component Analysis ◽

Classification Accuracy ◽

Text Categorization ◽

Computation Time ◽

Component Analysis ◽

Learning System ◽

Support Vector ◽

Principle Component

Data mining is the extraction of hidden predictive information, including previously unknown data, patterns, relationships and knowledge, by analysing large data sets in which these are hard to discover and identify with conventional statistical techniques. The major issues in text categorization are classification accuracy and computation time. To overcome these issues, an efficient classification method is needed that gives high classification accuracy as well as low computation time. In this work, we propose classifying data for text categorization using a support vector machine together with principal component analysis. Support vector machines are a supervised learning method with many attractive characteristics that make them a popular algorithm. Principal component analysis (PCA) is the feature extraction technique used to extract features from the text. Chi-square is an additional selection technique, used to select features from the extracted features. With the proposed approach, classification accuracy and computation time improve on other existing algorithms in many applications.
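
The chi-square selection step can be made concrete with a small sketch. It is a simplification under stated assumptions: the abstract applies chi-square to features already extracted by PCA, whereas this example scores raw term presence against a class label using the standard 2x2 contingency form, chi2 = N(AD - BC)^2 / ((A+B)(C+D)(A+C)(B+D)). The toy documents, labels and function names are hypothetical.

```python
def chi_square(docs, labels, term, target):
    """Chi-square statistic for term presence versus membership in `target`.

    A: in class, has term      B: other class, has term
    C: in class, lacks term    D: other class, lacks term
    """
    a = b = c = d = 0
    for tokens, label in zip(docs, labels):
        has_term = term in tokens
        in_class = label == target
        if has_term and in_class:
            a += 1
        elif has_term:
            b += 1
        elif in_class:
            c += 1
        else:
            d += 1
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # term or class is constant, so it carries no signal
    return n * (a * d - b * c) ** 2 / denom


def top_terms(docs, labels, target, k):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    vocab = sorted(set().union(*docs))
    ranked = sorted(vocab, key=lambda t: (-chi_square(docs, labels, t, target), t))
    return ranked[:k]


# Toy corpus: each document is a set of tokens (hypothetical data).
docs = [
    {"goal", "match", "team"},
    {"team", "score", "match"},
    {"election", "vote", "team"},
    {"vote", "policy", "election"},
]
labels = ["sport", "sport", "politics", "politics"]

print(chi_square(docs, labels, "match", "sport"))  # perfectly aligned with the class
print(chi_square(docs, labels, "team", "sport"))   # appears in both classes
print(top_terms(docs, labels, "sport", k=3))
```

Terms that occur in only one class receive the highest scores, so keeping the top-ranked terms shrinks the feature set handed to the classifier, which is the mechanism by which selection can reduce computation time.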

A Method for Classification Using Data Mining Technique for Diabetes

Nature-Inspired Computing ◽

10.4018/978-1-5225-0788-8.ch017 ◽

2016 ◽

pp. 426-449

Author(s):

Ahmad Al-Khasawneh

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Classification Accuracy ◽

Health Information System ◽

Parameters Optimization ◽

Support Vector ◽

Data Mining Algorithms ◽

Predictive Data Mining ◽

Severity Of The Disease ◽

Using Data

Many researchers in the health information system field have been drawn to developing computer applications that help in the diagnosis process. Data mining algorithms play a vital role in all of these applications, and many contributions have been made in this area. There has always been a debate about which algorithm gives the best classifier, which parameters to use, which dataset pre-processing steps to apply, and so on. In this paper, the author emphasizes that the best way to build a predictive model with relatively high classification accuracy is to build several predictive models and to choose the one that gives the best results through parameter optimization. Diagnosing diabetes mellitus has gained considerable attention in the last few decades due to the increased severity of the disease. In this research, the author reviews four predictive data mining approaches that are used in diagnosing diabetes. Four models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbour, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine achieved the highest classification accuracy, 78.83%, outperforming the others.

A Method for Classification Using Data Mining Technique for Diabetes: A Study of Health Care Information System

International Journal of Healthcare Information Systems and Informatics ◽

10.4018/ijhisi.2015070101 ◽

2015 ◽

Vol 10 (3) ◽

pp. 1-23 ◽

Cited By ~ 5

Author(s):

Ahmad Al-Khasawneh

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Information System ◽

Classification Accuracy ◽

Health Information System ◽

Computer Applications ◽

Parameters Optimization ◽

Support Vector ◽

Data Mining Technique ◽

Predictive Data Mining

Many researchers in the health information system field have been drawn to developing computer applications that help in the diagnosis process. Data mining algorithms play a vital role in all of these applications, and many contributions have been made in this area. There has always been a debate about which algorithm gives the best classifier, which parameters to use, which dataset pre-processing steps to apply, and so on. In this paper, the author emphasizes that the best way to build a predictive model with relatively high classification accuracy is to build several predictive models and to choose the one that gives the best results through parameter optimization. Diagnosing diabetes mellitus has gained considerable attention in the last few decades due to the increased severity of the disease. In this research, the author reviews four predictive data mining approaches that are used in diagnosing diabetes. Four models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbour, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine achieved the highest classification accuracy, 78.83%, outperforming the others.

A Method for Classification Using Data Mining Technique for Diabetes

Virtual and Mobile Healthcare ◽

10.4018/978-1-5225-9863-3.ch006 ◽

2020 ◽

pp. 127-150

Author(s):

Ahmad Al-Khasawneh

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Classification Accuracy ◽

Health Information System ◽

Parameters Optimization ◽

Support Vector ◽

Data Mining Algorithms ◽

Predictive Data Mining ◽

Severity Of The Disease ◽

Using Data

Many researchers in the health information system field have been drawn to developing computer applications that help in the diagnosis process. Data mining algorithms play a vital role in all of these applications, and many contributions have been made in this area. There has always been a debate about which algorithm gives the best classifier, which parameters to use, which dataset pre-processing steps to apply, and so on. In this paper, the author emphasizes that the best way to build a predictive model with relatively high classification accuracy is to build several predictive models and to choose the one that gives the best results through parameter optimization. Diagnosing diabetes mellitus has gained considerable attention in the last few decades due to the increased severity of the disease. In this research, the author reviews four predictive data mining approaches that are used in diagnosing diabetes. Four models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbour, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine achieved the highest classification accuracy, 78.83%, outperforming the others.

An Efficient System for Early Diagnosis of Breast Cancer using Support Vector Machine

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.a1626.109119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 7029-7035

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Support Vector Machine ◽

Low Income ◽

Low Income Countries ◽

Support Vector ◽

Test Accuracy ◽

Ultrasound Images ◽

Data Mining Algorithm ◽

Efficient System

Many lives are lost to cancer every year, and among women breast cancer causes the most deaths. Numerous studies have applied data mining techniques to improve the prediction of breast cancer risk. In 2004, 1.1 million cases of breast cancer were reported, and over the years the numbers have been seen to increase with industrialization and urbanization. It was earlier observed that the countries most affected by breast cancer were high-income countries such as America, but it is now also a very serious issue in middle- and low-income regions such as Africa, Latin America and Asia. The main objective of this paper is to create a model that can more efficiently and accurately categorize a cancer as malignant or benign by interpreting the numerical values of attributes of breast cancer ultrasound images. In this paper, a support vector machine (SVM) is used for prediction and compared with other data mining algorithms such as CART, logistic regression and KNN on training and test accuracy. The SVM gives the most accurate results of the algorithms compared.

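MODEL KLASIFIKASI KEPUASAN MAHASISWA TEKNIK TERHADAP SARANA PEMBELAJARAN MENGGUNAKAN DATA MINING [Classification Model of Engineering Students' Satisfaction with Learning Facilities Using Data Mining]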

Jurnal Teknologi Informasi Jurnal Keilmuan dan Aplikasi Bidang Teknik Informatika ◽

10.47111/jti.v14i2.1222 ◽

2020 ◽

Vol 14 (2) ◽

pp. 112-118

Author(s):

Ariesta Lestari ◽

Elga Mariati ◽

Widiatry Widiatry

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Data Mining Algorithm ◽

Prediction System ◽

Data Mining Approach

Students are among the stakeholders of a university. Their perception of the quality of learning facilities and infrastructure is therefore important to ensuring the university's performance. The Faculty of Engineering of the University of Palangka Raya has not comprehensively evaluated students' satisfaction with its learning facilities. In this research, data mining methods were applied to classify whether students are satisfied with the quality of the learning facilities in the Faculty of Engineering. Three data mining algorithms, Decision Tree C4.5, Support Vector Machine and Naïve Bayes, were compared to obtain the best algorithm for the prediction system. Of the 948 responses collected, 61% of respondents were satisfied with the quality of the learning facilities and infrastructure, while 39% were dissatisfied. Decision Tree C4.5 performed best, with an accuracy of 88% and a precision of 98%, compared with Naïve Bayes and the support vector machine.
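
Accuracy and precision, the two measures quoted above, are both derived from the confusion counts of a classifier. The sketch below shows the arithmetic on a hypothetical set of ten labels. The data are invented for illustration and do not reproduce the study's 948 responses or its reported figures, and the function names are assumptions.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (tp, fp, tn, fn) with `positive` treated as the positive class."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    """Share of all cases classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp, fp):
    """Share of predicted positives that are truly positive."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical labels: 6 satisfied and 4 dissatisfied respondents.
y_true = ["satisfied"] * 6 + ["dissatisfied"] * 4
y_pred = (
    ["satisfied"] * 5 + ["dissatisfied"]        # one satisfied respondent missed
    + ["satisfied"] + ["dissatisfied"] * 3      # one dissatisfied respondent mislabelled
)

tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive="satisfied")
print("accuracy:", accuracy(tp, fp, tn, fn))
print("precision:", precision(tp, fp))
```

Precision can exceed accuracy, as in the figures reported above, when the classifier makes few false-positive errors but misses some true positives or misjudges negatives.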

KLASIFIKASI SMS SPAM MENGGUNAKAN SUPPORT VECTOR MACHINE [SMS Spam Classification Using Support Vector Machine]

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v15i2.693 ◽

2019 ◽

Vol 15 (2) ◽

pp. 275-280

Author(s):

Agus Setiyono ◽

Hilman F Pardede

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Decision Tree ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

Spam Detection ◽

Support Vector Machine Algorithm ◽

Data Mining Techniques ◽

To Receive

It is now common for a cellphone to receive spam messages. The large number of messages received makes it difficult for a human to classify them as spam or not spam. One way to overcome this problem is to use data mining for automatic classification. In this paper, we investigate several data mining techniques, namely Support Vector Machine, Multinomial Naïve Bayes and Decision Tree, for automatic spam detection. Our experimental results show that the Support Vector Machine is the best of the three algorithms evaluated: it achieves 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.
