Diagnosis Of Heart Disease Using K-Nearest Neighbor Method Based On Forward Selection

2020 ◽  
Vol 4 (2) ◽  
pp. 39-47
Author(s):  
Junta Zeniarja ◽  
Anisatawalanita Ukhifahdhina ◽  
Abu Salam

The heart is one of the essential organs and plays a significant part in the human body. However, the heart is also subject to diseases that can be fatal. World Health Organization (WHO) data from 2012 showed that, of all deaths from cardiovascular disease, 7.4 million (42.3%) were caused by heart disease. The rising number of heart disease cases calls for early prevention, which early diagnosis makes possible. This research performs early diagnosis of heart disease using a data mining process in the form of classification. The algorithm used is K-Nearest Neighbor combined with the Forward Selection method. The K-Nearest Neighbor algorithm classifies each case to produce a diagnosis decision, while forward selection serves as the feature selection step whose purpose is to increase accuracy. Forward selection works by starting from an empty attribute set and adding attributes one at a time, so that attributes irrelevant to the classification process are left out. In this research, the accuracy of heart disease diagnosis with the K-Nearest Neighbor algorithm alone is 73.44%, while the accuracy with the forward selection method added is 78.66%. Combining the K-Nearest Neighbor algorithm with forward selection therefore improved accuracy. Keywords - K-Nearest Neighbor, Classification, Heart Disease, Forward Selection, Data Mining
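
The combination described in this abstract can be illustrated with a minimal sketch. It assumes a wrapper-style forward selection scored by leave-one-out accuracy, which the abstract does not specify; the function names and the toy data are hypothetical and are not the heart disease dataset used in the paper.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k, features):
    """Majority vote among the k nearest training rows, using only `features`."""
    nearest = sorted(
        (math.dist([row[f] for f in features], [x[f] for f in features]), label)
        for row, label in zip(train_X, train_y)
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(X, y, k, features):
    """Leave-one-out accuracy of KNN restricted to `features`."""
    hits = 0
    for i in range(len(X)):
        rest_X, rest_y = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k, features) == y[i]
    return hits / len(X)

def forward_selection(X, y, k):
    """Start with no features; repeatedly add the one that raises accuracy most."""
    remaining = list(range(len(X[0])))
    selected, best_acc = [], 0.0
    while remaining:
        acc, feat = max(
            ((loo_accuracy(X, y, k, selected + [f]), f) for f in remaining),
            key=lambda pair: pair[0],
        )
        if acc <= best_acc:  # stop when no candidate improves accuracy
            break
        selected.append(feat)
        remaining.remove(feat)
        best_acc = acc
    return selected, best_acc

# Hypothetical toy data: feature 0 separates the classes,
# feature 1 is large-scale noise that misleads the distance.
X = [[1.0, 50.0], [1.2, 10.0], [0.9, 90.0], [1.1, 30.0],
     [3.0, 52.0], [3.2, 12.0], [2.9, 88.0], [3.1, 28.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

print("all features:", loo_accuracy(X, y, 1, [0, 1]))
print("forward selection:", forward_selection(X, y, 1))
```

On this toy data the noisy feature dominates the Euclidean distance, so leaving it out is what lets the nearest neighbors share the query's class.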

Complexity ◽  
2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Hongyan Wang

This paper presents the concepts and algorithms of data mining and focuses on the linear regression algorithm. Using multiple linear regression, it analyzes the many factors that affect CET-4 results. Following a data mining approach, historical data were collected and suitably transformed, and statistical analysis techniques were applied to the factors influencing the CET-4 test, yielding the relationship between CET-4 results and those factors. The linear regression fit was found to be relatively high. The paper then improves the algorithm and establishes a partition-weighted K-nearest neighbor algorithm. The weighted K-nearest neighbor algorithm and the partition algorithm are used for classification prediction of CET-4 test scores, statistical methods are used to study the relevant factors that affect the scores, and the resulting classification predicts whether a student will pass, with comparison experiments for verification. The input features and the neighboring samples are weighted. Although the classification accuracy of the partition algorithm is not significantly better, its stability is better than that of the plain K-nearest neighbor algorithm, classification time is greatly reduced, and classification efficiency increases by 119%. To detect at-risk graduating students earlier, the paper proposes a timely early-warning classification model based on the K-nearest neighbor algorithm. Taking test scores, make-up exams, and retaken courses as input features, the model can effectively predict which students will fail to graduate.
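
The abstract does not define its partition-weighted scheme, so the following is only a generic distance-weighted K-nearest neighbor sketch that shows how weighting neighbors can change a prediction. The inverse-distance weights, the function name, and the pass/fail toy data are all assumptions for illustration, not the paper's method or data.

```python
import math
from collections import defaultdict

def weighted_knn_predict(train_X, train_y, x, k, eps=1e-9):
    """Each of the k nearest neighbors votes with weight 1 / (distance + eps)."""
    nearest = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )[:k]
    scores = defaultdict(float)
    for dist, label in nearest:
        scores[label] += 1.0 / (dist + eps)
    return max(scores, key=scores.get)

# Hypothetical two-feature training set with pass/fail labels.
train_X = [[1.0, 1.0], [4.0, 4.0], [4.5, 4.0], [9.0, 9.0]]
train_y = ["pass", "fail", "fail", "fail"]

# With k = 3 an unweighted vote would be 2 "fail" against 1 "pass",
# but the single very close "pass" neighbor outweighs the two distant ones.
print(weighted_knn_predict(train_X, train_y, [1.5, 1.0], k=3))
```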


2010 ◽  
Vol 44-47 ◽  
pp. 1130-1134
Author(s):  
Sheng Li ◽  
Pei Lin Zhang ◽  
Bing Li

Feature selection is a key step in hydraulic system fault diagnosis. Some of the collected features are unrelated to the classification model, and some are highly correlated with other features. Such features are harmful when building a classification model. To solve this problem, genetic algorithm-partial least squares (GA-PLS) is proposed for selecting representative and optimal features. The K-nearest neighbor algorithm (KNN) is used for diagnosing and classifying hydraulic system faults. To demonstrate the performance of GA-PLS, original data from a model engineering hydraulic system are used, and the results of GA-PLS are compared with those obtained using all features and using GA alone. The experimental results show that the proposed feature selection method can diagnose and classify hydraulic system faults more efficiently while using fewer features.


Author(s):  
Jeprianto Sinaga ◽  
Bosker Sinaga

Unsecured loans are a popular choice among members of the community who borrow from banks that provide these services. PT. RB Diori Ganda is a regional private banking company that offers savings and loans, including loans without collateral, to the community. An application for an unsecured loan must go through an assessor team, which analyzes the attributes that affect the customer's classification; the application is then submitted to the commissioner for credit approval. When many people apply for credit on the same day, credit analysis and approval take a long time. Given the community's high demand for loans without collateral, a classification application is needed to facilitate the assessor team's work in analyzing the attributes that affect customer classification. Customers who apply for unsecured loans are classified using data mining with the K-Nearest Neighbor algorithm. The result of this research is the classification of customers applying for credit without collateral as problematic or not.


2020 ◽  
Vol 3 (1) ◽  
pp. 27-41
Author(s):  
Achmad Saiful Rizal ◽  
Moch. Lutfi

Elections in Indonesia have undergone several changes from period to period. The election of legislative candidates was not determined by voters; instead it fell under the authority of the political elite, in accordance with the order of the list of legislative candidates and their sequence numbers. One way to make predictions is with data mining, which can be applied in the political sphere, for example to predict the results of a legislative election. The k-nearest neighbor algorithm is a data mining algorithm that classifies an object based on the training objects closest to it. Election-related research has been done with the k-nearest neighbor algorithm, but the accuracy obtained was still too low, so an additional algorithm is needed to improve it. This study proposes the k-nearest neighbor method combined with backward elimination as feature selection. The dataset comes from the KPU Sidoarjo and has 1 special attribute and 13 regular attributes. The analysis and computation lead to two conclusions for k-nearest neighbor combined with backward elimination. First, of the 14 attributes in the dataset, the 8 most influential attributes were retained. Second, the best accuracy is 96.03%, obtained when k = 2 and tested with 10-fold cross-validation.
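
Backward elimination evaluated by k-fold cross-validation, as this abstract describes, can be sketched as follows. The sketch assumes a wrapper-style elimination that drops an attribute whenever cross-validated accuracy does not fall. The toy data, the function names, and the choice of 4 folds with k = 1 are hypothetical, chosen only because the example has 8 rows; the study itself reports k = 2 with 10 folds.

```python
import math
from collections import Counter

def knn_predict(train, x, k, feats):
    """Majority vote among the k nearest (row, label) pairs, using only `feats`."""
    nearest = sorted(
        (math.dist([row[f] for f in feats], [x[f] for f in feats]), label)
        for row, label in train
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def kfold_accuracy(X, y, k, feats, folds):
    """Cross-validated accuracy; fold j holds out rows j, j + folds, ..."""
    n, hits = len(X), 0
    for fold in range(folds):
        test_idx = range(fold, n, folds)
        train = [(X[i], y[i]) for i in range(n) if i not in test_idx]
        for i in test_idx:
            hits += knn_predict(train, X[i], k, feats) == y[i]
    return hits / n

def backward_elimination(X, y, k, folds):
    """Start with all features; drop the least useful one while accuracy holds."""
    selected = list(range(len(X[0])))
    best_acc = kfold_accuracy(X, y, k, selected, folds)
    while len(selected) > 1:
        acc, drop = max(
            ((kfold_accuracy(X, y, k, [g for g in selected if g != f], folds), f)
             for f in selected),
            key=lambda pair: pair[0],
        )
        if acc < best_acc:  # every possible removal hurts accuracy: stop
            break
        selected.remove(drop)
        best_acc = acc
    return selected, best_acc

# Hypothetical toy data: feature 0 is informative, feature 1 is
# large-scale noise, feature 2 is constant and carries no information.
X = [[1.0, 50.0, 0.0], [3.0, 52.0, 0.0], [1.2, 10.0, 0.0], [3.2, 12.0, 0.0],
     [0.9, 90.0, 0.0], [2.9, 88.0, 0.0], [1.1, 30.0, 0.0], [3.1, 28.0, 0.0]]
y = [0, 1, 0, 1, 0, 1, 0, 1]

print("all features:", kfold_accuracy(X, y, 1, [0, 1, 2], 4))
print("backward elimination:", backward_elimination(X, y, 1, 4))
```

Allowing a removal when accuracy is unchanged favors the smaller attribute set, which matches the goal of keeping only the most influential attributes.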


2021 ◽  
Author(s):  
Ben Rahman ◽  
Harco Leslie Hendric Spits Warnars ◽  
Boy Subirosa Sabarguna ◽  
Widodo Budiharto

2020 ◽  
Vol 7 (2) ◽  
pp. 417
Author(s):  
Ikhsan Wisnuadji Gamadarenda ◽  
Indra Waspada

Chronic kidney disease (CKD) is a public health problem worldwide, with a steadily increasing incidence. According to BPJS Kesehatan, CKD care is the second-largest financing item after heart disease. Detecting CKD also requires many attributes and is therefore fairly expensive. For this reason, a web-based system built on data mining stages was created to make CKD detection easier, so that CKD can be prevented and managed, and so that the chance of receiving effective therapy is greater when the disease is identified early. The research uses the Knowledge Data Discovery (KDD) data mining framework. Within this framework, the system uses the Backward Elimination algorithm to reduce the number of attributes, with the aim of reducing the types of examination performed, and the k-Nearest Neighbor algorithm as the classification algorithm for detecting the disease. The best data mining model produced by the system uses Backward Elimination (α = 0.05) and kNN (k = 3), chosen on the basis of reduced examination cost and the highest sensitivity. The system recommends 10 attributes selected from the 24 initial attributes: specific gravity (sg), albumin (al), blood urea (bu), serum creatinine (sc), sodium (sod), hemoglobin (hemo), red blood cells (rbc), hypertension (htn), diabetes mellitus (dm), and appetite (appet). Using the selected attributes reduced examination costs by up to 73.36%. Disease detection with the k-Nearest Neighbor algorithm then yields an accuracy of 99.25%, a sensitivity of 99.5%, and a specificity of 98.745%.
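
Accuracy, sensitivity, and specificity of the kind reported in this abstract are all derived from confusion-matrix counts. The short sketch below shows the standard definitions; the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard metrics from confusion-matrix counts.

    tp/fn: diseased cases detected / missed
    tn/fp: healthy cases cleared / wrongly flagged
    """
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),  # share of diseased cases detected
        "specificity": tn / (tn + fp),  # share of healthy cases cleared
    }

# Hypothetical counts for illustration only.
metrics = classification_metrics(tp=90, fn=10, tn=45, fp=5)
print(metrics)
```

Sensitivity matters most in a screening setting like this one, since a missed case (a false negative) delays treatment.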

