Evaluation of Classification Algorithms vs Knowledge-Based Methods for Differential Diagnosis of Asthma in Iranian Patients

Medical data mining aims to solve real-world problems in the diagnosis and treatment of diseases. It applies various techniques and algorithms with differing levels of accuracy and precision. The purpose of this article is to apply data mining techniques to the diagnosis of asthma. The sensitivity, specificity, and accuracy of the K-nearest neighbor, support vector machine, naive Bayes, artificial neural network, classification tree, and CN2 algorithms were evaluated and compared with those reported in related studies. ROC curves were plotted to show the performance of the authors' approach. The support vector machine (SVM) algorithm achieved the highest accuracy at 98.59%, with a sensitivity of 98.59% and a specificity of 98.61% for class 1. The other algorithms all had accuracies above 87%. The results show that asthma can be diagnosed correctly approximately 98% of the time based on demographic and clinical data. The approach also has higher sensitivity than expert and knowledge-based systems.
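
The three measures reported in this abstract can be derived from the counts of a binary confusion matrix. The following is a minimal sketch, not the authors' code: the function name `binary_metrics` and the toy labels are assumptions made for illustration, and class 1 is treated as the positive (asthma) class.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity, accuracy) for binary labels.

    y_true and y_pred are equal-length sequences of labels; `positive`
    names the label treated as the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Sensitivity: share of true positives that the classifier finds.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of true negatives that the classifier finds.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Accuracy: share of all cases classified correctly.
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return sensitivity, specificity, accuracy


if __name__ == "__main__":
    # Hypothetical labels, not data from the study.
    truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    guess = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]
    print(binary_metrics(truth, guess))
```

On the toy labels there are 3 true positives, 1 false negative, 5 true negatives, and 1 false positive, so sensitivity is 0.75, specificity is 5/6, and accuracy is 0.8.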

Data Mining Approach to Analyze COVID-19 Clinical Dataset

10.53350/pjmhs211561812 ◽

2021 ◽

Vol 15 (6) ◽

pp. 1812-1819

Author(s):

Azita Yazdani ◽

Ramin Ravangard ◽

Roxana Sharifian

Keyword(s):

Artificial Intelligence ◽

Data Mining ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Clinical Signs ◽

Study Data ◽

Mining Machine ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Mining Approach

The new coronavirus has been spreading since the beginning of 2020, and many efforts have been made to develop vaccines to help patients recover. It is now clear that the world needs a rapid solution to curb the spread of COVID-19 using non-clinical approaches such as data mining, enhanced intelligence, and other artificial intelligence techniques. These approaches can reduce the burden on the health care system while providing the best possible means of diagnosing and predicting COVID-19. In this study, data mining models for early detection of COVID-19 were developed using an epidemiological dataset of patients and individuals suspected of having COVID-19 in Iran. C4.5, support vector machine, naive Bayes, logistic regression, random forest, and k-nearest neighbor algorithms were applied directly to the dataset using RapidMiner to develop the models. Given a patient's clinical signs, the models estimate the risk of COVID-19 infection. Evaluation of the models showed that the support vector machine, with 93.41% accuracy, was the most effective at diagnosing patients with COVID-19 and was the best of the developed models. Keywords: COVID-19, Data mining, Machine Learning, Artificial Intelligence, Classification

Komparasi Algoritma Nonparametrik untuk Klasifikasi Citra Wajah Berdasarkan Suku di Indonesia (Comparison of Nonparametric Algorithms for Classifying Face Images by Ethnic Group in Indonesia)

Jurnal Edukasi dan Penelitian Informatika (JEPIN) ◽

10.26418/jp.v6i3.43268 ◽

2020 ◽

Vol 6 (3) ◽

pp. 337

Author(s):

Seno Hartono ◽

Anggi Perwitasari ◽

Herry Sujaini

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Decision Tree ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Mining Tool ◽

Mining Tool

Classification is a data mining method that organizes and categorizes data into different classes. This study aims to compare nonparametric algorithms and determine the best one for classifying face images. The nonparametric classification algorithms used are k-Nearest Neighbor (kNN), Support Vector Machine (SVM), Decision Tree, and AdaBoost, applied to face images of Indonesian residents from the Batak, Dayak, Javanese, Malay, and Chinese-Indonesian (Tionghoa) ethnic groups. The study uses the Orange Data Mining Tool to carry out the data mining process. Of the four algorithms, SVM gave better accuracy than the others. The average precision values were 37.5% for Support Vector Machine, followed by 31.55% for k-Nearest Neighbor, 30.25% for AdaBoost, and 29.75% for Decision Tree.

Decision Support System for Diabetes Classification Using Data Mining Techniques

Research Anthology on Decision Support Systems and Decision Management in Healthcare, Business, and Engineering ◽

10.4018/978-1-7998-9023-2.ch053 ◽

2021 ◽

pp. 1091-1113

Author(s):

Ahmad M. Al-Khasawneh

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Mining Algorithms ◽

Use Of Data ◽

Predictive Data Mining ◽

Severity Of The Disease ◽

Using Data

The use of data mining algorithms in health information systems has played a significant role in developing applications that help diagnose different diseases. The type of disease determines the choice of algorithm, the parameters to be used, and the dataset pre-processing steps. In this chapter, the target is diagnosing diabetes mellitus, which has gained significant attention in the last few decades due to the increased severity of the disease. Four predictive data mining models were implemented to diagnose diabetes from the PIMA dataset: k-nearest neighbor, support vector machine, multilayer perceptron neural network, and naive Bayesian network. The support vector machine outperformed the others, giving the highest classification accuracy at 78.83%.

Decision Support System for Diabetes Classification Using Data Mining Techniques

Advances in Healthcare Information Systems and Administration - Handbook of Research on Emerging Perspectives on Healthcare Information Systems and Informatics ◽

10.4018/978-1-5225-5460-8.ch012 ◽

2018 ◽

pp. 281-303

Author(s):

Ahmad M. Al-Khasawneh

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Mining Algorithms ◽

Use Of Data ◽

Predictive Data Mining ◽

Severity Of The Disease ◽

Using Data

Predictive Data Mining Models for Novel Coronavirus (COVID-19) Infected Patients Recovery

10.21203/rs.3.rs-33247/v1 ◽

2020 ◽

Author(s):

L. J. Muhammad ◽

Md. Milon Islam ◽

Usman Sani Sharif ◽

Safial Islam Ayon

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Logistic Regression ◽

Random Forest ◽

Decision Tree ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

The World ◽

Novel Coronavirus

The novel coronavirus (COVID-19 or 2019-nCoV) pandemic has neither a clinically proven vaccine nor drugs; however, patients are recovering with the aid of antibiotics, anti-viral drugs, and chloroquine, as well as vitamin C supplementation. It is now evident that the world needs a speedy solution to contain and tackle the further spread of COVID-19 with the aid of non-clinical approaches such as data mining, augmented intelligence, and other artificial intelligence techniques, so as to mitigate the huge burden on the healthcare system while providing the best possible means for the diagnosis and prognosis of patients. In this study, data mining models were developed to predict the recovery of COVID-19 infected patients using an epidemiological dataset of COVID-19 patients in South Korea. The decision tree, support vector machine, naive Bayes, logistic regression, random forest, and K-nearest neighbor algorithms were applied directly to the dataset using the Python programming language to develop the models. The models predicted the minimum and maximum number of days for COVID-19 patients to recover, the age groups of patients at high risk of not recovering, those likely to recover, and those likely to recover quickly. The results show that the model developed with the decision tree algorithm was the most efficient at predicting the possibility of recovery, with an overall accuracy of 99.85%, making it the best of the models, which also included support vector machine, naive Bayes, logistic regression, random forest, and K-nearest neighbor.

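Analisis Data Bank Direct Marketing dengan Perbandingan Klasifikasi Data Mining Berbasis Optimize Selection (Evolutionary) (Analysis of Bank Direct Marketing Data by Comparing Data Mining Classifiers Based on Optimize Selection (Evolutionary))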

Jurnal Informatika Universitas Pamulang ◽

10.32493/informatika.v6i1.9291 ◽

2021 ◽

Vol 6 (1) ◽

pp. 102

Author(s):

Ahmad Fauzi

Keyword(s):

Data Mining ◽

Support Vector Machine ◽

Decision Maker ◽

Nearest Neighbor ◽

Direct Marketing ◽

Evolutionary Optimization ◽

Marketing Strategies ◽

Support Vector ◽

K Nearest Neighbor ◽

Bayes Algorithm

To determine marketing strategies, a bank classifies records from its customer database. The database is analyzed by a decision maker, which is not easy because of the volume and complexity of the data and its many attributes, and this becomes an obstacle in decision making. This can have a negative effect on the company's business processes because it delays the choice of marketing strategies. Data mining methods can classify large datasets, and the accuracy of that classification can be measured. To overcome these problems, the database needs to be analyzed to determine the accuracy level of classification on the company's data. In this study, classification is carried out on the Bank Direct Marketing dataset taken from the UCI Machine Learning Repository, using the Naïve Bayes, K-Nearest Neighbor, and Support Vector Machine algorithms with Optimize Selection (Evolutionary) optimization. The calculations were run in the data mining application RapidMiner 5.3 to find the algorithm with the highest accuracy, and the models were tested with 10-fold cross-validation. With Optimize Selection (Evolutionary) optimization, the highest accuracy was obtained by the Naïve Bayes algorithm at 90.18%, followed by Support Vector Machine at 89.40% and K-Nearest Neighbor at 86.66%.
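
The 10-fold cross-validation used as the test method here can be sketched in plain Python. This is an illustration under stated assumptions rather than the study's RapidMiner process: the helper names are hypothetical, a 1-nearest-neighbor rule stands in for the three algorithms the study compares, and the toy data are invented. It also assumes there are at least as many samples as folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle indices 0..n_samples-1 and deal them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def nearest_neighbor_predict(train_X, train_y, x):
    """Return the label of the closest training point (squared Euclidean)."""
    best = min(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], x)),
    )
    return train_y[best]


def cross_val_accuracy(X, y, k=10, seed=0):
    """Average accuracy over k folds; each fold is held out exactly once."""
    scores = []
    for held_out in k_fold_indices(len(X), k, seed):
        held = set(held_out)
        # Train on every sample that is not in the held-out fold.
        train_idx = [i for i in range(len(X)) if i not in held]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(
            nearest_neighbor_predict(train_X, train_y, X[i]) == y[i]
            for i in held_out
        )
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical, well-separated two-class data.
    X = [(i * 0.1, 0.0) for i in range(20)] + [(10 + i * 0.1, 0.0) for i in range(20)]
    y = [0] * 20 + [1] * 20
    print(cross_val_accuracy(X, y, k=10))
```

Each sample is used for testing exactly once and for training in the other nine rounds, so the averaged score is less sensitive to any single train/test split than a one-off holdout.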

Framing Twitter Public Sentiment on Nigerian Government COVID-19 Palliatives Distribution Using Machine Learning

Sustainability ◽

10.3390/su13063497 ◽

2021 ◽

Vol 13 (6) ◽

pp. 3497

Author(s):

Hassan Adamu ◽

Syaheerah Lebai Lutfi ◽

Nurul Hashimah Ahamed Hassain Malim ◽

Rohail Hassan ◽

Assunta Di Vaio ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Primary Objective ◽

Support Vector ◽

Standard English ◽

Emotion Classification ◽

K Nearest Neighbor ◽

The Public ◽

The Government

Sustainable development plays a vital role in information and communication technology. In times of pandemics such as COVID-19, vulnerable people need help to survive. This help includes the distribution of relief packages and materials by the government with the primary objective of lessening the economic and psychological effects on the citizens affected by disasters such as the COVID-19 pandemic. However, there has not been an efficient way to monitor public funds’ accountability and transparency, especially in developing countries such as Nigeria. The government's understanding of public emotions about the distributed palliatives is important, as it would indicate the reach and impact of the distribution exercise. Although several studies on English emotion classification have been conducted, these studies are not portable to a wider, inclusive Nigerian case. This is because Informal Nigerian English (Pidgin), which Nigerians widely speak, has quite a different vocabulary from Standard English, which limits the applicability of emotion classification models trained on Standard English. An Informal Nigerian English (Pidgin English) emotions dataset is constructed, pre-processed, and annotated. The dataset is then used to classify five emotion classes (anger, sadness, joy, fear, and disgust) on the COVID-19 palliatives and relief aid distribution in Nigeria using standard machine learning (ML) algorithms. Six ML algorithms are used in this study, and a comparative analysis of their performance is conducted. The algorithms are Multinomial Naïve Bayes (MNB), Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), K-Nearest Neighbor (KNN), and Decision Tree (DT). The experiments reveal that Support Vector Machine outperforms the remaining classifiers with the highest accuracy of 88%. The “disgust” emotion class had the highest count in the classification of the constructed dataset, surpassing sadness, joy, fear, and anger. Additionally, the correlation analysis shows a significant relationship between the emotion classes “Joy” and “Fear”, which implies that the public is excited about the palliatives’ distribution but afraid of inequality and a lack of transparency in the distribution process due to reasons such as corruption. In conclusion, the results of this experiment clearly show that public emotions about the distribution of COVID-19 support and relief aid packages in Nigeria were not satisfactory, considering that negative emotions outnumbered public happiness.

Ovarian cancer classification using K-Nearest Neighbor and Support Vector Machine

Journal of Physics Conference Series ◽

10.1088/1742-6596/1821/1/012007 ◽

2021 ◽

Vol 1821 (1) ◽

pp. 012007

Author(s):

V V P Wibowo ◽

Z Rustam ◽

S Hartini ◽

F Maulidina ◽

I Wirasati ◽

...

Keyword(s):

Ovarian Cancer ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Cancer Classification ◽

Support Vector ◽

K Nearest Neighbor

Recognition of 3D Objects from 2D Views Features

Journal of Electronic Commerce in Organizations ◽

10.4018/jeco.2015040105 ◽

2015 ◽

Vol 13 (2) ◽

pp. 50-58

Author(s):

R. Khadim ◽

R. El Ayachi ◽

Mohamed Fakir

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Color Image ◽

Recognition Rate ◽

Experimental Results ◽

Support Vector ◽

K Nearest Neighbor ◽

3D Objects ◽

Color Descriptor

This paper focuses on the recognition of 3D objects using 2D attributes. To increase the recognition rate, the authors present a hybridization of three approaches for computing the attributes of a color image, combining Zernike moments, Gist descriptors, and a color descriptor (statistical moments). In the classification phase, three methods are adopted: Neural Network (NN), Support Vector Machine (SVM), and k-nearest neighbor (KNN). The COIL-100 database is used in the experiments.

Aspect Term Extraction for Aspect Based Opinion Mining

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k2050.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 2228-2233

Keyword(s):

Support Vector Machine ◽

Sentiment Analysis ◽

Random Fields ◽

Opinion Mining ◽

Nearest Neighbor ◽

Conditional Random Fields ◽

International Workshop ◽

Support Vector ◽

K Nearest Neighbor ◽

Term Extraction

Opinion Mining (OM) is also called Sentiment Analysis (SA), and Aspect Based Opinion Mining (ABOM) is also called Aspect Based Sentiment Analysis (ABSA). In this paper, three new features are proposed for extracting aspect terms in ABSA. The influence of the proposed features is evaluated on five classifiers: Decision Tree (DT), Naive Bayes (NB), K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Conditional Random Fields (CRF). The features are evaluated on the two datasets, for the Restaurant and Laptop domains, from the International Workshop on Semantic Evaluation 2014 (SemEval 2014), using precision, recall, and F1 measures. The proposed features strongly influence aspect term extraction across the classifiers, and the SVM and CRF classifiers with the proposed features perform better at aspect term extraction than the NB, DT, and KNN classifiers.
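
Precision, recall, and F1 for term extraction can be computed by comparing the set of extracted terms with the set of annotated terms. The sketch below assumes exact string matching and uses invented example terms; it is not the SemEval 2014 scorer, and the function name is hypothetical.

```python
def precision_recall_f1(gold_terms, predicted_terms):
    """Score extracted aspect terms against gold annotations (exact match)."""
    gold = set(gold_terms)
    predicted = set(predicted_terms)
    true_positives = len(gold & predicted)

    # Precision: share of extracted terms that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of gold terms that were extracted.
    recall = true_positives / len(gold) if gold else 0.0
    # F1: harmonic mean of precision and recall.
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical laptop-review terms, not taken from the dataset.
    gold = ["battery life", "screen", "keyboard", "price"]
    predicted = ["battery life", "screen", "laptop"]
    print(precision_recall_f1(gold, predicted))
```

With two of three extracted terms correct and two of four gold terms found, precision is 2/3, recall is 1/2, and F1 is 4/7.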
