Heart disease prediction model with k-nearest neighbor algorithm

This paper presents an effective heart disease prediction model that detects anomalies, also known as outliers, in healthcare data using the unsupervised K-means clustering algorithm. Most existing approaches for detecting anomalies are based on constructing profiles of normal instances. However, such techniques require an adequate number of normal profiles to justify those models. Our proposed model first determines an optimal value of K using the Silhouette method. Next, it flags as anomalies the points that lie farther than a threshold distance from their clusters. Finally, five popular classification techniques, namely K-Nearest Neighbor (KNN), Random Forest (RF), Support Vector Machines (SVM), Naive Bayes (NB), and Logistic Regression (LR), are applied to build the resultant prediction model. The effectiveness of the proposed methodology is demonstrated on a benchmark heart disease dataset.
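The first two steps of this abstract (choosing K by silhouette score, then flagging points beyond a threshold distance) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the farthest-first initialisation, the use of the cluster centroid as the reference point, the toy 2-D points, and the threshold value of 2.0 are all hypothetical choices made to keep the sketch small and reproducible.

```python
import math

def init_centers(points, k):
    # Farthest-first start: deterministic, chosen only to keep the sketch reproducible.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def assign(points, centers):
    # Index of the nearest center for every point.
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j])) for p in points]

def kmeans(points, k, max_iter=100):
    centers = init_centers(points, k)
    for _ in range(max_iter):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster ends up empty.
            updated.append(tuple(sum(axis) / len(members) for axis in zip(*members))
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)

def silhouette(points, labels):
    # Mean silhouette coefficient; singleton clusters contribute 0.
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def choose_k(points, candidates):
    # Pick the candidate K with the highest mean silhouette score.
    return max(candidates, key=lambda k: silhouette(points, kmeans(points, k)[1]))

def find_outliers(points, centers, labels, threshold):
    # A point is an anomaly if it is farther than `threshold` from its own centroid.
    return [p for p, lab in zip(points, labels) if math.dist(p, centers[lab]) > threshold]

# Hypothetical toy data: two tight groups plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
          (10.5, 14.0)]
best_k = choose_k(points, range(2, 5))
centers, labels = kmeans(points, best_k)
outliers = find_outliers(points, centers, labels, threshold=2.0)
cleaned = [p for p in points if p not in outliers]
print(best_k, outliers)
```

The `cleaned` list is what would then be passed to a classifier such as KNN. In practice the threshold would have to be tuned for the dataset at hand; the abstract does not state how it is chosen.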

A Framework for Heart Disease Prediction Using K nearest Neighbor Algorithm

Research Journal of Applied Sciences Engineering and Technology ◽ 10.19026/rjaset.11.1669 ◽ 2015 ◽ Vol 11 (1) ◽ pp. 10-13

Author(s): R. Kavitha ◽ E. Kannan

Keyword(s): Heart Disease ◽ Nearest Neighbor ◽ Disease Prediction ◽ K Nearest Neighbor ◽ Nearest Neighbor Algorithm ◽ K Nearest Neighbor Algorithm

Efficient Heart Disease Prediction System using K-Nearest Neighbor Classification Technique

Proceedings of the International Conference on Big Data and Internet of Thing - BDIOT2017 ◽ 10.1145/3175684.3175703 ◽ 2017 ◽ Cited By ~ 3

Author(s): Nida Khateeb ◽ Muhammad Usman

Keyword(s): Heart Disease ◽ Nearest Neighbor ◽ Disease Prediction ◽ Prediction System ◽ K Nearest Neighbor ◽ Nearest Neighbor Classification ◽ Classification Technique ◽ Neighbor Classification

Heart Disease Prediction Using k-Nearest Neighbor Classifier Based on Handwritten Text

Advances in Intelligent Systems and Computing - Computational Intelligence in Data Mining—Volume 1 ◽

10.1007/978-81-322-2734-2_6 ◽ 2015 ◽ pp. 49-56 ◽ Cited By ~ 1

Author(s): Seema Kedar ◽ D. S. Bormane ◽ Vaishnavi Nair

Keyword(s): Heart Disease ◽ Nearest Neighbor ◽ Disease Prediction ◽ K Nearest Neighbor ◽ Nearest Neighbor Classifier ◽ Handwritten Text ◽ Neighbor Classifier

K-Nearest Neighbor Based URL Identification Model for Phishing Attack Detection

Indian Journal of Artificial Intelligence and Neural Networking ◽ 10.35940/ijainn.b1019.041221 ◽ 2021 ◽ Vol 1 (2) ◽ pp. 18-21

Author(s): Tsehay Admassu Assegie

Keyword(s): Nearest Neighbor ◽ Attack Detection ◽ Experimental Result ◽ Data Repository ◽ K Nearest Neighbor ◽ K Nearest Neighbors ◽ K Value ◽ Proposed Model ◽ Public Data ◽ Public Data Repository

Phishing causes many problems in the business industry. Electronic commerce and electronic banking, such as mobile banking, involve a large number of online transactions. In such transactions, we have to discriminate between the features of legitimate and phishing websites in order to ensure the security of the transaction. In this study, we collected data from the PhishTank public data repository and propose a K-Nearest Neighbors (KNN) based model for phishing attack detection. The proposed model detects phishing attacks through URL classification. The performance of the proposed model is tested empirically and the results are analyzed. Experimental results on the test set show that the model is effective at phishing attack detection. Furthermore, the K value that gives the best accuracy is determined in order to improve performance on phishing attack detection. Overall, the average accuracy of the proposed model is 85.08%.
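The idea of classifying URLs with KNN and then picking the K value with the best accuracy can be sketched as follows. This is only an illustration under stated assumptions: the two features (URL length and number of dots), the toy training and validation rows, and the candidate K values are all hypothetical and do not come from the study or from PhishTank.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    # train: list of (feature_vector, label); majority vote among the k nearest rows.
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, validation, k):
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)

def select_k(train, validation, candidates):
    # Returns the first candidate that reaches the highest validation accuracy.
    return max(candidates, key=lambda k: accuracy(train, validation, k))

# Hypothetical features: (URL length, number of dots in the URL).
train = [
    ((20, 1), "legitimate"), ((25, 2), "legitimate"),
    ((30, 2), "legitimate"), ((22, 1), "legitimate"),
    ((70, 5), "phishing"), ((80, 6), "phishing"),
    ((65, 4), "phishing"), ((75, 5), "phishing"),
    ((68, 4), "legitimate"),  # a deliberately noisy row
]
validation = [((68, 5), "phishing"), ((24, 1), "legitimate")]

best_k = select_k(train, validation, [1, 3, 5])
print(best_k, accuracy(train, validation, best_k))
```

With K = 1 the noisy row misleads the classifier on the first validation URL, while K = 3 outvotes it, which is the kind of effect a search over K is meant to expose. Odd K values are used so that a two-class vote cannot tie.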

Diagnosis Of Heart Disease Using K-Nearest Neighbor Method Based On Forward Selection

Journal of Applied Intelligent System ◽ 10.33633/jais.v4i2.2749 ◽ 2020 ◽ Vol 4 (2) ◽ pp. 39-47

Author(s): Junta Zeniarja ◽ Anisatawalanita Ukhifahdhina ◽ Abu Salam

Keyword(s): Data Mining ◽ Feature Selection ◽ Heart Disease ◽ Early Diagnosis ◽ Nearest Neighbor ◽ Selection Method ◽ K Nearest Neighbor ◽ Forward Selection ◽ Nearest Neighbor Algorithm ◽ K Nearest Neighbor Algorithm

The heart is one of the essential organs and plays a significant part in the human body. However, heart disease is also a major cause of death. World Health Organization (WHO) data from 2012 showed that, of all deaths from cardiovascular disease, 7.4 million (42.3%) were caused by heart disease. The rising number of heart disease cases calls for early prevention through early diagnosis. In this research, early diagnosis of heart disease is performed using a data mining process in the form of classification. The algorithm used is the K-Nearest Neighbor algorithm with the Forward Selection method. The K-Nearest Neighbor algorithm is used for classification in order to obtain a diagnosis decision, while forward selection is used for feature selection with the aim of increasing accuracy. Forward selection starts from an empty attribute set and adds attributes one at a time, so attributes that are irrelevant to the classification process are left out. In this research, the accuracy of heart disease diagnosis with the K-Nearest Neighbor algorithm alone is 73.44%, while the accuracy with the feature selection method is 78.66%. Combining the K-Nearest Neighbor algorithm with the forward selection method therefore improved the accuracy. Keywords: K-Nearest Neighbor, Classification, Heart Disease, Forward Selection, Data Mining
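Forward selection wrapped around a nearest-neighbor classifier can be sketched as below. The sketch assumes a 1-nearest-neighbor classifier scored by leave-one-out accuracy and a stopping rule of "stop when no attribute strictly improves the score"; these choices, the function names, and the six toy rows are hypothetical and are not taken from the paper.

```python
import math

def loo_accuracy(rows, labels, features):
    # Leave-one-out accuracy of a 1-nearest-neighbor classifier on the chosen columns.
    def project(row):
        return [row[f] for f in features]
    correct = 0
    for i, row in enumerate(rows):
        others = [j for j in range(len(rows)) if j != i]
        nearest = min(others, key=lambda j: math.dist(project(row), project(rows[j])))
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

def forward_selection(rows, labels):
    # Greedily add the attribute that raises accuracy most; stop when nothing helps.
    remaining = list(range(len(rows[0])))
    selected, best = [], 0.0
    while remaining:
        score, feature = max((loo_accuracy(rows, labels, selected + [f]), f)
                             for f in remaining)
        if score <= best:
            break
        selected.append(feature)
        remaining.remove(feature)
        best = score
    return selected, best

# Hypothetical data: column 0 separates the classes, columns 1 and 2 are noise.
rows = [
    (1.0, 5.0, 0.5), (1.2, 1.0, 0.7), (0.9, 9.0, 0.1),
    (3.0, 5.2, 0.4), (3.1, 1.1, 0.8), (2.9, 9.1, 0.2),
]
labels = [0, 0, 0, 1, 1, 1]

selected, selected_score = forward_selection(rows, labels)
all_score = loo_accuracy(rows, labels, [0, 1, 2])
print(selected, selected_score, all_score)
```

On this toy data the noisy, large-scale column 1 dominates the distance when all attributes are used, so the selected subset scores higher than the full attribute set. That mirrors the direction of the improvement reported in the abstract, though not its magnitude.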

Heart Disease Classification Model Using K-Nearest Neighbor Algorithm

10.1109/icic54025.2021.9632918 ◽ 2021

Author(s): Ben Rahman ◽ Harco Leslie Hendric Spits Warnars ◽ Boy Subirosa Sabarguna ◽ Widodo Budiharto

Keyword(s): Heart Disease ◽ Nearest Neighbor ◽ Disease Classification ◽ Classification Model ◽ K Nearest Neighbor ◽ Nearest Neighbor Algorithm ◽ K Nearest Neighbor Algorithm

Prediction of Citrullination Sites on the Basis of mRMR Method and SNN

Combinatorial Chemistry & High Throughput Screening ◽ 10.2174/1386207322666191129113508 ◽ 2020 ◽ Vol 22 (10) ◽ pp. 705-715 ◽ Cited By ~ 2

Author(s): Min Liu ◽ Guangzhong Liu

Keyword(s): Prediction Model ◽ Evaluation Method ◽ Nearest Neighbor ◽ Protein Sequences ◽ Support Vector ◽ Post Translational Modification ◽ K Nearest Neighbor ◽ Accurate Identification ◽ Citrullinated Proteins ◽ K Nearest Neighbor Algorithm

Background: Citrullination, an important post-translational modification of proteins, alters the molecular weight and electrostatic charge of the protein side chains. The formation of citrulline in protein sequences is catalyzed by a class of Peptidyl Arginine Deiminases (PADs). Dependent on Ca2+, PADs include five isozymes: PAD 1, 2, 3, 4/5, and 6. Citrullinated proteins have been identified in many biological and pathological processes. Among them, abnormal protein citrullination can lead to serious human diseases, including multiple sclerosis and rheumatoid arthritis. Objective: It is important to identify the citrullination sites in protein sequences. The accurate identification of citrullination sites may contribute to studies on the molecular functions and pathological mechanisms of related diseases. Methods and Results: In this study, after a training set (containing 116 positive and 348 negative samples) was encoded into a feature matrix, the mRMR method was used to analyze the 941-dimensional features, which were sorted on the basis of their importance. Then, a predictive model based on a self-normalizing neural network (SNN) was proposed to predict the citrullination sites in protein sequences. Incremental Feature Selection (IFS) and 10-fold cross-validation were used as the model evaluation method. Three classical machine learning models, namely random forest, support vector machine, and k-nearest neighbor algorithm, were selected and compared with the SNN prediction model using the same evaluation methods. SNN may be the best tool for citrullination site prediction. The maximum value of the Matthews Correlation Coefficient (MCC) reached 0.672404 with the optimal SNN classifier. Conclusion: The results showed that the SNN-based prediction methods performed better when evaluated by some common metrics, such as MCC, accuracy, and F1-Measure. The SNN prediction model also achieved a better balance in the classification and recognition of positive and negative samples from datasets compared with the other three models.
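The metrics named in this abstract (MCC, accuracy, and F1-Measure) are all computed from the four confusion-matrix counts. The sketch below uses the standard definitions; the example counts are hypothetical, chosen only so that they sum to 116 positives and 348 negatives like the training set described above, and they are not results from the paper.

```python
import math

def mcc(tp, tn, fp, fn):
    # Matthews Correlation Coefficient; defined as 0 when any marginal total is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def f1_measure(tp, fp, fn):
    # Harmonic mean of precision and recall, written in terms of raw counts.
    return 2 * tp / (2 * tp + fp + fn) if tp else 0.0

# Hypothetical counts for an imbalanced test: 116 positives, 348 negatives.
tp, fn = 90, 26
tn, fp = 320, 28
print(round(mcc(tp, tn, fp, fn), 3),
      round(accuracy(tp, tn, fp, fn), 3),
      round(f1_measure(tp, fp, fn), 3))
```

MCC uses all four counts, which is why it is often preferred over plain accuracy when the classes are imbalanced, as they are with a 1:3 ratio of positive to negative samples.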

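Peningkatan Akurasi Klasifikasi Backpropagation Menggunakan Artificial Bee Colony dan K-NN Pada Penyakit Jantung [Improving Backpropagation Classification Accuracy Using Artificial Bee Colony and K-NN for Heart Disease]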

JURNAL MEDIA INFORMATIKA BUDIDARMA ◽ 10.30865/mib.v5i1.2634 ◽ 2021 ◽ Vol 5 (1) ◽ pp. 208

Author(s): Pandito Dewa Putra ◽ Sukemi Sukemi ◽ Dian Palupi Rini

Keyword(s): Blood Pressure ◽ Heart Disease ◽ Artificial Bee Colony ◽ Nearest Neighbor ◽ K Nearest Neighbor ◽ Backpropagation Algorithm ◽ Model Accuracy ◽ Bee Colony ◽ The World ◽ K Nearest Neighbor Algorithm

Heart disease ranks as the leading cause of death in the world, accounting for around 17.3 million deaths per year. Contributing factors include high blood pressure, diabetes, cholesterol fluctuation, fatigue, and others, which are recorded in the dataset. The dataset used is the Cleveland heart disease dataset, which has fourteen (14) attributes. This research applies the Backpropagation algorithm to heart disease classification, combined with the Artificial Bee Colony and k-Nearest Neighbor algorithms for feature (attribute) selection, because this technique can increase classifier accuracy; the resulting model reaches an accuracy of 94.23%.