Decision Support System for Diabetes Classification Using Data Mining Techniques

Author(s):  
Ahmad M. Al-Khasawneh

The use of data mining algorithms in health information systems has played a significant role in developing applications that help diagnose diseases. The type of disease determines the choice of algorithm, the parameters to be used, and the dataset pre-processing steps. This chapter targets the diagnosis of diabetes mellitus, which has gained significant attention in the last few decades due to the increased severity of the disease. Four predictive data mining models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbor, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine outperformed the others, giving the highest classification accuracy at 78.83%.


2016 ◽  
pp. 738-761
Author(s):  
Ahmad Al-Khasawneh

Many researchers in the health information systems field have been drawn to developing computer applications that support the diagnosis process. Data mining algorithms play a vital role in all of these applications, and many contributions have been made in this area. There has always been debate over which algorithm gives the best classifier, which parameters to use, and which dataset pre-processing steps to apply. In this paper, the author argues that the best way to build a predictive model with relatively high classification accuracy is to build several predictive models and choose the one that gives the best results after parameter optimization. Diagnosing diabetes mellitus has gained considerable attention in the last few decades due to the increased severity of the disease. In this research, the author reviews four predictive data mining approaches used in diagnosing diabetes. Four models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbour, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine outperformed the others, giving the highest classification accuracy at 78.83%.
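The idea of building several candidate models and keeping the one that scores best after parameter tuning can be sketched in a few lines. The example below is a minimal illustration, not the author's implementation: it tunes only the k of a k-nearest-neighbour classifier using leave-one-out accuracy, and the data in `TOY_DATA` are invented two-feature points, not PIMA records. The names `knn_predict`, `loo_accuracy`, and `select_k` are hypothetical helpers introduced here.

```python
import math
from collections import Counter

# Hypothetical toy data: ((feature_1, feature_2), label). Not PIMA records.
TOY_DATA = [
    ((0.10, 0.20), 0), ((0.20, 0.10), 0), ((0.15, 0.25), 0), ((0.30, 0.20), 0),
    ((0.80, 0.90), 1), ((0.90, 0.80), 1), ((0.85, 0.75), 1), ((0.70, 0.90), 1),
]


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`."""
    neighbours = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy: predict each point from all the others."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        if knn_predict(rest, features, k) == label:
            hits += 1
    return hits / len(data)


def select_k(data, candidates):
    """Score every candidate k and return (best_k, all_scores).

    Ties are broken in favour of the smaller k (an arbitrary choice).
    """
    scores = {k: loo_accuracy(data, k) for k in candidates}
    best = max(scores, key=lambda k: (scores[k], -k))
    return best, scores


if __name__ == "__main__":
    best_k, all_scores = select_k(TOY_DATA, [1, 3, 5, 7])
    print("scores per k:", all_scores)
    print("selected k:", best_k)
```

The same loop extends to comparing different model families: score each candidate with the same validation procedure and keep the one with the highest score.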


2016 ◽  
pp. 426-449
Author(s):  
Ahmad Al-Khasawneh

Many researchers in the health information systems field have been drawn to developing computer applications that support the diagnosis process. Data mining algorithms play a vital role in all of these applications, and many contributions have been made in this area. There has always been debate over which algorithm gives the best classifier, which parameters to use, and which dataset pre-processing steps to apply. In this paper, the author argues that the best way to build a predictive model with relatively high classification accuracy is to build several predictive models and choose the one that gives the best results after parameter optimization. Diagnosing diabetes mellitus has gained considerable attention in the last few decades due to the increased severity of the disease. In this research, the author reviews four predictive data mining approaches used in diagnosing diabetes. Four models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbour, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine outperformed the others, giving the highest classification accuracy at 78.83%.


2020 ◽  
pp. 127-150
Author(s):  
Ahmad Al-Khasawneh

Many researchers in the health information systems field have been drawn to developing computer applications that support the diagnosis process. Data mining algorithms play a vital role in all of these applications, and many contributions have been made in this area. There has always been debate over which algorithm gives the best classifier, which parameters to use, and which dataset pre-processing steps to apply. In this paper, the author argues that the best way to build a predictive model with relatively high classification accuracy is to build several predictive models and choose the one that gives the best results after parameter optimization. Diagnosing diabetes mellitus has gained considerable attention in the last few decades due to the increased severity of the disease. In this research, the author reviews four predictive data mining approaches used in diagnosing diabetes. Four models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbour, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine outperformed the others, giving the highest classification accuracy at 78.83%.


Author(s):  
M. Jupri ◽  
Riyanarto Sarno

Achieving optimal tax revenue requires effective and efficient tax supervision, which can be achieved by classifying taxpayers by their compliance with tax regulations. This paper therefore proposes classifying taxpayer compliance using data mining algorithms, namely C4.5, Support Vector Machine, K-Nearest Neighbor, Naive Bayes, and Multilayer Perceptron, applied to taxpayer compliance data. Taxpayer compliance is classified into four classes: (1) formal and material compliant taxpayers, (2) formal compliant taxpayers, (3) material compliant taxpayers, and (4) formal and material non-compliant taxpayers. The results of the data mining algorithms are then compared using Fuzzy AHP and TOPSIS to determine the best-performing classifier based on the criteria of Accuracy, F-Score, and Time required. Within each compliance level, taxpayers are prioritized for more detailed supervision by ranking them with Fuzzy AHP and TOPSIS based on criteria drawn from the dataset variables. The results show that C4.5 is the best-performing classifier, with a preference value of 0.998, whereas the MLP algorithm has the lowest preference value, 0.131. Alternative taxpayer A233 is the top-priority taxpayer with a preference value of 0.433, whereas alternative taxpayer A051 is the lowest-priority taxpayer with a preference value of 0.036.
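To make the ranking step concrete, the following is a minimal sketch of plain TOPSIS over the three criteria named above (Accuracy, F-Score, Time). It is an illustration under stated assumptions, not the paper's procedure: the criterion weights are fixed by hand rather than derived with Fuzzy AHP, and the alternatives `model_a`, `model_b`, `model_c` and their scores are invented placeholders, not results from the study.

```python
import math

# Hypothetical alternatives: (accuracy, f_score, seconds). Invented values.
NAMES = ["model_a", "model_b", "model_c"]
MATRIX = [
    (0.90, 0.88, 1.0),
    (0.80, 0.79, 5.0),
    (0.85, 0.80, 2.0),
]
WEIGHTS = (0.5, 0.3, 0.2)       # assumed; the paper derives weights with Fuzzy AHP
BENEFIT = (True, True, False)   # accuracy and F-score: higher is better; time: lower


def topsis(matrix, weights, benefit):
    """Return one closeness score per row; values near 1 are closest to the ideal."""
    n_crit = len(weights)
    # Vector-normalise each column, then apply the criterion weight.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [
        [weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix
    ]
    columns = list(zip(*weighted))
    # Ideal takes the best value per criterion, anti-ideal the worst.
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.dist(row, ideal)
        d_neg = math.dist(row, anti)
        total = d_pos + d_neg
        # If every alternative is identical, both distances are zero.
        scores.append(d_neg / total if total else 0.5)
    return scores


if __name__ == "__main__":
    for name, score in zip(NAMES, topsis(MATRIX, WEIGHTS, BENEFIT)):
        print(f"{name}: preference value {score:.3f}")
```

In this toy input `model_a` is best on every criterion and `model_b` worst on every criterion, so they sit at the two ends of the preference scale and `model_c` falls in between.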


2021 ◽  
Vol 15 (6) ◽  
pp. 1812-1819
Author(s):  
Azita Yazdani ◽  
Ramin Ravangard ◽  
Roxana Sharifian

The new coronavirus has been spreading since the beginning of 2020, and many efforts have been made to develop vaccines to help patients recover. It is now clear that the world needs a rapid solution to curb the spread of COVID-19 using non-clinical approaches such as data mining, enhanced intelligence, and other artificial intelligence techniques. These approaches can reduce the burden on the health care system while providing the best possible means of diagnosing and predicting COVID-19. In this study, data mining models for early detection of COVID-19 were developed using an epidemiological dataset of patients and individuals suspected of having COVID-19 in Iran. C4.5, support vector machine, Naive Bayes, logistic regression, Random Forest, and k-nearest neighbor algorithms were applied directly to the dataset using RapidMiner to develop the models. Given clinical signs as input, the models estimate the risk of having contracted COVID-19. Evaluation of the models showed that the support vector machine, with 93.41% accuracy, was the best of the developed models for diagnosing patients with COVID-19. Keywords: COVID-19, Data mining, Machine Learning, Artificial Intelligence, Classification


Author(s):  
Reza Safdari ◽  
Peyman Rezaei-Hachesu ◽  
Marjan GhaziSaeedi ◽  
Taha Samad-Soltani ◽  
Maryam Zolnoori

Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. The process applies various techniques and algorithms that differ in accuracy and precision. The purpose of this article is to apply data mining techniques to the diagnosis of asthma. The sensitivity, specificity, and accuracy of K-nearest neighbor, Support Vector Machine, naive Bayes, Artificial Neural Network, classification tree, and CN2 algorithms were evaluated and compared with related studies. ROC curves were plotted to show the performance of the authors' approach. Support vector machine (SVM) algorithms achieved the highest accuracy at 98.59%, with a sensitivity of 98.59% and a specificity of 98.61% for class 1. The other algorithms all achieved accuracy above 87%. The results show that the authors can accurately diagnose asthma approximately 98% of the time based on demographic and clinical data. The approach also has higher sensitivity than expert and knowledge-based systems.
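For readers unfamiliar with the three measures reported above, the sketch below shows how sensitivity, specificity, and accuracy follow from the counts of a binary confusion matrix. The label lists are invented for illustration and are unrelated to the asthma data; `confusion_counts` and `diagnostic_metrics` are hypothetical helper names.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false negatives, true negatives, false positives."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp


def diagnostic_metrics(tp, fn, tn, fp):
    """Standard definitions; assumes each class has at least one example."""
    return {
        "sensitivity": tp / (tp + fn),   # share of actual positives detected
        "specificity": tn / (tn + fp),   # share of actual negatives cleared
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


if __name__ == "__main__":
    # Invented labels: 1 = disease present, 0 = disease absent.
    y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
    counts = confusion_counts(y_true, y_pred)
    print(diagnostic_metrics(*counts))
```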


2020 ◽  
Vol 6 (3) ◽  
pp. 337
Author(s):  
Seno Hartono ◽  
Anggi Perwitasari ◽  
Herry Sujaini

Classification is a data mining method used to organize and categorize data into distinct classes. This study aims to compare nonparametric algorithms and determine the best one for classifying face images. It uses the nonparametric classification algorithms k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Decision Tree, and AdaBoost to classify face images of Indonesian residents from the Batak, Dayak, Javanese (Jawa), Malay (Melayu), and Chinese Indonesian (Tionghoa) ethnic groups. The study uses the Orange Data Mining Tool to carry out the data mining process. Of the four algorithms applied, SVM gives better accuracy than the others. The average precision values of the four algorithms are, in order, Support Vector Machine 37.5%, followed by k-Nearest Neighbor 31.55%, AdaBoost 30.25%, and Decision Tree 29.75%.


The healthcare industry collects massive volumes of healthcare data that need to be turned into useful information. Several factors in everyday life affect human diseases, and hospitals produce large amounts of patient-related information. This paper describes various data mining algorithms, such as neural networks, support vector machines, KNN, and decision trees, and provides a brief overview of existing work. The major advantage of using data mining is its ability to identify structures in the data.


2020 ◽  
Author(s):  
L. J. Muhammad ◽  
Md. Milon Islam ◽  
Usman Sani Sharif ◽  
Safial Islam Ayon

The novel coronavirus (COVID-19 or 2019-nCoV) pandemic has neither a clinically proven vaccine nor drugs; however, patients are recovering with the aid of antibiotic medications, anti-viral drugs, and chloroquine, as well as vitamin C supplementation. It is now evident that the world needs a speedy solution to contain the further spread of COVID-19 with the aid of non-clinical approaches such as data mining, augmented intelligence, and other artificial intelligence techniques, so as to mitigate the huge burden on the healthcare system while providing the best possible means for patient diagnosis and prognosis. In this study, data mining models were developed to predict the recovery of COVID-19 infected patients using an epidemiological dataset of COVID-19 patients in South Korea. The decision tree, support vector machine, naive Bayes, logistic regression, random forest, and K-nearest neighbor algorithms were applied directly to the dataset using the Python programming language to develop the models. The models predicted the minimum and maximum number of days for COVID-19 patients to recover from the virus, the age group of patients at high risk of not recovering, those who are likely to recover, and those who are likely to recover quickly. The results show that the model developed with the decision tree algorithm is the most efficient at predicting the possibility of recovery of infected patients, with an overall accuracy of 99.85%, making it the best of the models developed, ahead of those based on support vector machine, naive Bayes, logistic regression, random forest, and K-nearest neighbor.

