Analysis and Prediction of CET4 Scores Based on Data Mining Algorithm

This paper presents the concept and algorithm of data mining and focuses on the linear regression algorithm. Based on the multiple linear regression algorithm, many factors affecting CET4 are analyzed. Ideas based on data mining, collecting history data and appropriate to transform, using statistical analysis techniques to the many factors influencing the CET-4 test were analyzed, and we have obtained the CET-4 test result and its influencing factors. It was found that the linear regression relationship between the degrees of fit was relatively high. We further improve the algorithm and establish a partition-weighted K-nearest neighbor algorithm. The K-weighted K nearest neighbor algorithm and the partition algorithm are used in the CET-4 test score classification prediction, and the statistical method is used to study the relevant factors that affect the CET-4 test score, and screen classification is performed to predict when the comparison verification will pass. The weight K of the input feature and the adjacent feature are weighted, although the allocation algorithm of the adjacent classification effect has not been significantly improved, but the stability classification is better than K-nearest neighbor algorithm, its classification efficiency is greatly improved, classification time is greatly reduced, and classification efficiency is increased by 119%. In order to detect potential risk graduating students earlier, this paper proposes an appropriate and timely early warning and preschool K-nearest neighbor algorithm classification model. Taking test scores or make-up exams and re-learning as input features, the classification model can effectively predict ordinary students who have not graduated.

Download Full-text

PREDIKSI HASIL PEMILU LEGISLATIF MENGGUNAKAN ALGORITMA K-NEAREST NEIGHBOR BERBASIS BACKWARD ELIMINATION

Jurnal RESISTOR (Rekayasa Sistem Komputer) ◽

10.31598/jurnalresistor.v3i1.517 ◽

2020 ◽

Vol 3 (1) ◽

pp. 27-41

Author(s):

Achmad Saiful Rizal ◽

Moch. Lutfi

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Political Elite ◽

Data Mining Algorithm ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Backward Elimination ◽

K Nearest Neighbor Algorithm ◽

Fold Cross Validation ◽

Selection Of

Elections in Indonesia from period to period have undergone some changes. Elections legislative candidates not determined voters, but instead became a political elite authority in accordance with the order of the list of legislative candidates and their number sequence. To perform a prediction one of them with data mining. Data mining can be applied in the political sphere for example to predict the results of the legislative election and others. K-nearest neighbor algorithm is one of the data mining algorithm that performs classification based on learning object against which are closest to the object. Election-related research has been done with the k-nearest neighbor algorithm, but accuracy is obtained that method is still too low, so it takes an additional algorithm to improve accuracy. In this study, the proposed method, namely the method of k-nearest neighbor method combined with backward elimination as a selection of features. The dataset that will be used in the study comes from the KPU Sidoarjo that has special attributes 1 and 13 regular attributes. From the results of the analysis and computation of some methods, it can be concluded that the method of k-nearest neighbor method combined with backward elimination produced some conclusions. First, of the 14 attributes in the dataset, retrieved 8 most influential attribute. Second, the best accuracy are of 96.03% when k = 2 and tested by 10 fold cross validation.

Download Full-text

Data Mining Classification Of Filing Credit Customers Without Collateral With K-Nearest Neighbor Algorithm (Case study: PT. BPR Diori Double)

Journal Of Computer Networks, Architecture and High Performance Computing ◽

10.47709/cnapc.v2i2.401 ◽

2020 ◽

Vol 2 (2) ◽

pp. 204-210

Author(s):

Jeprianto Sinaga ◽

Bosker Sinaga

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Savings And Loans ◽

Long Time ◽

K Nearest Neighbor Algorithm ◽

Using Data ◽

The Many

Unsecured loans are the community's choice for lending to banks that provide Reviews These services. PT. RB Diori Ganda is a regional private banking company that serves savings and loans and loans without collateral for the community. Submission of unsecured loans must go through an assessor team to process the analysis of the attributes that Affect the customer's classification so that credit can be approved, the which is then submitted to the commissioner for credit approval. But what if Reviews those who apply for credit on the same day in large amounts, of course this will the make the process of credit analysis and approval will take a long time. If it is seen from the many needs of the community to apply for loans without collateral, a classification application is needed, in order to Facilitate the work of the assessor team in the process of analyzing the attributes that Affect customer classification. To find out the classification of customers who apply for unsecured loans for using data mining with the K-Nearest Neighbor algorithm. The result of this research is the classification of problematic or non-performing customers for credit applications without collateral.

Download Full-text

Diagnosis Of Heart Disease Using K-Nearest Neighbor Method Based On Forward Selection

Journal of Applied Intelligent System ◽

10.33633/jais.v4i2.2749 ◽

2020 ◽

Vol 4 (2) ◽

pp. 39-47

Author(s):

Junta Zeniarja ◽

Anisatawalanita Ukhifahdhina ◽

Abu Salam

Keyword(s):

Data Mining ◽

Feature Selection ◽

Heart Disease ◽

Early Diagnosis ◽

Nearest Neighbor ◽

Selection Method ◽

K Nearest Neighbor ◽

Forward Selection ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm

Heart is one of the essential organs that assume a significant part in the human body. However, heart can also cause diseases that affect the death. World Health Organization (WHO) data from 2012 showed that all deaths from cardiovascular disease (vascular) 7.4 million (42.3%) were caused by heart disease. Increased cases of heart disease require a step as an early prevention and prevention efforts by making early diagnosis of heart disease. In this research will be done early diagnosis of heart disease by using data mining process in the form of classification. The algorithm used is K-Nearest Neighbor algorithm with Forward Selection method. The K-Nearest Neighbor algorithm is used for classification in order to obtain a decision result from the diagnosis of heart disease, while the forward selection is used as a feature selection whose purpose is to increase the accuracy value. Forward selection works by removing some attributes that are irrelevant to the classification process. In this research the result of accuracy of heart disease diagnosis with K-Nearest Neighbor algorithm is 73,44%, while result of K-Nearest Neighbor algorithm accuracy with feature selection method 78,66%. It is clear that the incorporation of the K-Nearest Neighbor algorithm with the forward selection method has improved the accuracy result. Keywords - K-Nearest Neighbor, Classification, Heart Disease, Forward Selection, Data Mining

Download Full-text

Heart Disease Classification Model Using K-Nearest Neighbor Algorithm

10.1109/icic54025.2021.9632918 ◽

2021 ◽

Author(s):

Ben Rahman ◽

Harco Leslie Hendric Spits Warnars ◽

Boy Subirosa Sabarguna ◽

Widodo Budiharto

Keyword(s):

Heart Disease ◽

Nearest Neighbor ◽

Disease Classification ◽

Classification Model ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm

Download Full-text

Implementasi Data Mining untuk Deteksi Penyakit Ginjal Kronis (PGK) menggunakan K-Nearest Neighbor (KNN) dengan Backward Elimination

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2020721896 ◽

2020 ◽

Vol 7 (2) ◽

pp. 417

Author(s):

Ikhsan Wisnuadji Gamadarenda ◽

Indra Waspada

Keyword(s):

Diabetes Mellitus ◽

Data Mining ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Data Discovery ◽

Nearest Neighbor Algorithm ◽

Web Based ◽

Elimination Algorithm ◽

Backward Elimination ◽

K Nearest Neighbor Algorithm

Penyakit ginjal kronis (PGK) merupakan masalah kesehatan publik di seluruh dunia dengan insiden yang terus meningkat. Berdasarkan sumber dari BPJS Kesehatan, perawatan PGK merupakan ranking kedua pembiayaan terbesar setelah penyakit jantung. Pendeteksian PGK juga memerlukan banyak atribut sehingga membutuhkan biaya yang cukup mahal. Oleh sebab itu dibuat sistem dengan tahapan data mining berbasis web yang memudahkan untuk melakukan deteksi PGK, sehingga PGK dapat dicegah, ditanggulangi, dan kemungkinan mendapatkan terapi yang efektif lebih besar jika diketahui lebih awal. Proses penelitian ini menggunakan sebuah rangka kerja data mining Knowledge Data Discovery (KDD). Dalam skenario rangka kerja yang digunakan, sistem ini menggunakan Algoritme Backward Elimination untuk mengurangi jumlah atribut yang dipakai dengan tujuan untuk mengurangi jenis pemeriksaan yang dilakukan, dan Algoritme k-Nearest Neighbor sebagai algoritme klasifikasi untuk mendeteksi penyakit. Hasil pemodelan terbaik data mining dari sistem yang dibuat menggunakan Backward Elimination (α = 0,05) dan kNN (k = 3) dengan pertimbangan penurunan biaya pemeriksaan dan sensitivity tertinggi. Rekomendasi sistem menghasilkan 10 atribut yang terpilih dari 24 atribut awal yang digunakan, yaitu: berat jenis (sg), albumin (al), urea darah (bu), kreatinin serum (sc), sodium (sod), hemoglobin (hemo), sel darah merah (rbc), hipertensi (htn), diabetes mellitus (dm), dan nafsu makan (appet). Penggunaan atribut yang telah terseleksi tersebut, berhasil menekan biaya pemeriksaan hingga 73,36%. Selanjutnya dilakukan pendeteksian penyakit menggunakan Algoritme k-Nearest Neighbor menghasilkan nilai akurasi sebesar 99,25%, sensitivity sebesar 99,5%, dan specificity sebesar 98,745%.AbstractChronic kidney disease (CKD) is a health problem for people around the world with increasing incidence. Based on sources from BPJS Kesehatan, CKD care is the second largest ranking of financing after heart disease. CKD detection also requires many attributes, so it requires quite expensive costs. Create a system with web-based data mining stages that makes it easy to detect CKD. Allowing CKD to be prevented, addressed, and advised to get effective therapy is greater if acknowledged earlier. The process of this research uses work methods of Data Mining Knowledge Data Discovery (KDD). In the framework of the framework used, this system uses the Backward Elimination Algorithm to reduce the number of attributes used to reduce the type of inspection performed, and the k-Nearest Neighbor Algorithm as an algorithm to update disease. The best data mining modeling results from the system are made using Backward Elimination (α = 0.05) and kNN (k = 3) by calculating the increase in inspection costs and the highest sensitivity. System recommendations produce 10 attributes selected from the 24 initial attributes used, namely: specific gravity (sg), albumin (al), blood urea (bu), serum creatinine (sc), sodium (soil), hemoglobin (hemo), cell red blood (rbc), hypertension (htn), diabetes mellitus (dm), and appetite (appetite). The use of the selected attributes succeeded in achieving inspection costs of up to 73.36%. Furthermore, disease detection using the k-Nearest Neighbor Algorithm produces an accuracy value of 99.25%, sensitivity of 99.5%, and specificity of 98.745%.

Download Full-text

Recent trends in big data using hadoop

International Journal of Informatics and Communication Technology (IJ-ICT) ◽

10.11591/ijict.v8i1.pp39-49 ◽

2019 ◽

Vol 8 (1) ◽

pp. 39

Author(s):

Chetna Kaushal ◽

Deepika Koundal

Keyword(s):

Data Mining ◽

Social Media ◽

Big Data ◽

Nearest Neighbor ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

Clustering Techniques ◽

Recent Trends ◽

K Nearest Neighbor Algorithm

Big data refers to huge set of data which is very common these days due to the increase of internet utilities. Data generated from social media is a very common example for the same. This paper depicts the summary on big data and ways in which it has been utilized in all aspects. Data mining is radically a mode of deriving the indispensable knowledge from extensively vast fractions of data which is quite challenging to be interpreted by conventional methods. The paper mainly focuses on the issues related to the clustering techniques in big data. For the classification purpose of the big data, the existing classification algorithms are concisely acknowledged and after that, k-nearest neighbor algorithm is discreetly chosen among them and described along with an example.

Download Full-text

APLIKASI PREDIKSI KELAYAKAN CALON ANGGOTA KREDIT PADA KSPPS BMT ARTA JIWA MANDIRI WONOGIRI MENGGUNAKAN ALGORITMA K-NEAREST NEIGHBOR

JISKA (Jurnal Informatika Sunan Kalijaga) ◽

10.14421/jiska.2018.32-03 ◽

2019 ◽

Vol 3 (2) ◽

pp. 84

Author(s):

Yogiek Indra Kurniawan ◽

Farida Angguntina

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Timely Manner ◽

Nearest Neighbor Algorithm ◽

Savings And Loans ◽

Research Results ◽

The People ◽

Savings And Loan ◽

K Nearest Neighbor Algorithm

An economy that tends to be unstable causes many people to make loans at banks and cooperatives to meet their increasing daily needs. But there are some people who cannot return the loan in a timely manner. These problems can be created or developed by an application that is used to predict whether the people who apply for loans can return loans smoothly, smoothly and stall. Use of attributes such as gender, age, type of work, number of loans, term of return, collateral and income and use the K-Nearest Neighbor algorithm to make predictions. From the research results obtained in the form of accuracy value of 80%, recall of 91% and preciison of 85%. Thus this application can be used to help the pinjman savings cooperative in considering prospective savings and loan credit members who deserve a capital loan. Keywords: data mining, K Nearest Neighbor, cooperatives, savings and loans.

Download Full-text

A Scalable K-Nearest Neighbor Algorithm for Recommendation System Problems

2020 43rd International Convention on Information, Communication and Electronic Technology (MIPRO) ◽

10.23919/mipro48935.2020.9245195 ◽

2020 ◽

Author(s):

A. Sagdic ◽

C. Tekinbas ◽

E. Arslan ◽

T. Kucukyilmaz

Keyword(s):

Recommendation System ◽

Nearest Neighbor ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm

Download Full-text

Perancangan Aplikasi Prediksi Kelulusan Tepat Waktu Bagi Mahasiswa Baru Dengan Teknik Data Mining (Studi Kasus: Data Akademik Mahasiswa STMIK Dipanegara Makassar)

Creative Information Technology Journal ◽

10.24076/citec.2014v1i4.27 ◽

2015 ◽

Vol 1 (4) ◽

pp. 270

Author(s):

Muhammad Syukri Mustafa ◽

I. Wayan Simpen

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Test Results ◽

K Nearest Neighbor ◽

Accuracy Rate ◽

Sample Data ◽

New Students ◽

K Nearest Neighbor Algorithm ◽

Using Data ◽

Existing Data

Penelitian ini dimaksudkan untuk melakukan prediksi terhadap kemungkian mahasiswa baru dapat menyelesaikan studi tepat waktu dengan menggunakan analisis data mining untuk menggali tumpukan histori data dengan menggunakan algoritma K-Nearest Neighbor (KNN). Aplikasi yang dihasilkan pada penelitian ini akan menggunakan berbagai atribut yang klasifikasikan dalam suatu data mining antara lain nilai ujian nasional (UN), asal sekolah/ daerah, jenis kelamin, pekerjaan dan penghasilan orang tua, jumlah bersaudara, dan lain-lain sehingga dengan menerapkan analysis KNN dapat dilakukan suatu prediksi berdasarkan kedekatan histori data yang ada dengan data yang baru, apakah mahasiswa tersebut berpeluang untuk menyelesaikan studi tepat waktu atau tidak. Dari hasil pengujian dengan menerapkan algoritma KNN dan menggunakan data sampel alumni tahun wisuda 2004 s.d. 2010 untuk kasus lama dan data alumni tahun wisuda 2011 untuk kasus baru diperoleh tingkat akurasi sebesar 83,36%.This research is intended to predict the possibility of new students time to complete studies using data mining analysis to explore the history stack data using K-Nearest Neighbor algorithm (KNN). Applications generated in this study will use a variety of attributes in a data mining classified among other Ujian Nasional scores (UN), the origin of the school / area, gender, occupation and income of parents, number of siblings, and others that by applying the analysis KNN can do a prediction based on historical proximity of existing data with new data, whether the student is likely to complete the study on time or not. From the test results by applying the KNN algorithm and uses sample data alumnus graduation year 2004 s.d 2010 for the case of a long and alumni data graduation year 2011 for new cases obtained accuracy rate of 83.36%.

Download Full-text

Handwritten Balinesse Character Recognition using K-Nearest Neighbor

10.31227/osf.io/z6m8u ◽

2018 ◽

Author(s):

I Wayan Agus Surya Darma

Keyword(s):

Feature Extraction ◽

Success Rate ◽

Character Recognition ◽

Nearest Neighbor ◽

Recognition System ◽

Extraction Process ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm ◽

Character Feature

Balinese character recognition is a technique to recognize feature or pattern of Balinese character. Feature of Balinese character is generated through feature extraction process. This research using handwritten Balinese character. Feature extraction is a process to obtain the feature of character. In this research, feature extraction process generated semantic and direction feature of handwritten Balinese character. Recognition is using K-Nearest Neighbor algorithm to recognize 81 handwritten Balinese character. The feature of Balinese character images tester are compared with reference features. Result of the recognition system with K=3 and reference=10 is achieved a success rate of 97,53%.

Download Full-text