PREDICTING THE INSTALLMENT PAYMENT RELIABILITY OF PROSPECTIVE DEBTORS USING THE K-NEAREST NEIGHBOR METHOD

Abstract: This research developed a prediction system that estimates whether a prospective borrower will repay installments smoothly or not. For each prospective borrower, an office clerk enters the data items that meet the defined criteria into the interface of a prediction application, which processes them with a data mining method, the K-Nearest Neighbor algorithm, implemented with the CodeIgniter 3 framework. Euclidean distances between the training data and the testing data are computed on the predetermined criteria, and the results are displayed in a table sorted from smallest to largest that lists the 9 nearest neighbors, matching the chosen K value of 9. The dominant category among these nine neighbors is taken as the prediction, which serves as a guideline that helps management decide on the next prospective borrower.
Keywords: Debtor; Data Mining; Euclidean; K-Nearest Neighbor; Prospective Borrowers
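
The procedure the abstract describes (Euclidean distance to every training record, a table sorted from smallest to largest, the 9 nearest neighbors, and the dominant category) can be sketched as follows. This is a minimal illustration in Python, not the authors' CodeIgniter implementation; the two features, their scaling, and the labels "smooth" and "not smooth" are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(training, query, k=9):
    """Return (dominant label, the k nearest rows sorted by distance)."""
    # Distance from the query to every training record, smallest first.
    ranked = sorted((euclidean(features, query), label)
                    for features, label in training)
    neighbors = ranked[:k]
    # The dominant (most frequent) category among the k neighbors wins.
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0], neighbors

# Hypothetical, already-scaled records: (income score, loan size score) -> label
training = [
    ((0.9, 0.2), "smooth"), ((0.8, 0.3), "smooth"), ((0.7, 0.1), "smooth"),
    ((0.85, 0.25), "smooth"), ((0.75, 0.35), "smooth"), ((0.6, 0.4), "smooth"),
    ((0.2, 0.9), "not smooth"), ((0.3, 0.8), "not smooth"),
    ((0.1, 0.7), "not smooth"), ((0.25, 0.85), "not smooth"),
    ((0.4, 0.6), "not smooth"),
]

label, neighbors = knn_predict(training, (0.8, 0.2), k=9)
for distance, neighbor_label in neighbors:
    print(f"{distance:.3f}  {neighbor_label}")
print("prediction:", label)
```

An odd K such as 9 avoids tied votes when there are only two categories, which is one plausible reason for that choice.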


Optimization of k value and lag parameter of k-nearest neighbor algorithm on the prediction of hotel occupancy rates

Jurnal Teknologi dan Sistem Komputer ◽ 10.14710/jtsiskom.2020.13648 ◽ 2020 ◽ Vol 8 (3) ◽ pp. 246-254

Author(s): Agus Subhan Akbar ◽ R. Hadapiningradja Kusumodestoni

Keyword(s): Nearest Neighbor ◽ Business Management ◽ Training Data ◽ K Nearest Neighbor ◽ Nearest Neighbor Algorithm ◽ K Value ◽ Sample Data ◽ K Nearest Neighbor Algorithm ◽ Occupancy Rates ◽ Fold Cross Validation

Hotel occupancy rates are the most important factor in hotel business management. Predictions of the rates for the next few months guide the manager's decisions on arranging and providing all the needed facilities. This study optimizes the lag parameter and the k value of the k-Nearest Neighbor (kNN) algorithm on historical hotel occupancy data. The historical data were arranged as supervised training data, with the number of columns per row set by the lag parameter and the number of prediction targets. The kNN algorithm was applied using 10-fold cross-validation with k values from 1 to 30. The optimal lag fell in the range 14-17 and the optimal k in the range 5-13 for predicting occupancy rates 1, 3, 6, 9, and 12 months ahead. The optimal k values obtained do not follow the rule of thumb that sets k to the square root of the number of samples.
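
To illustrate how a history series becomes supervised training data under a lag parameter, here is a minimal Python sketch. The function name `make_supervised`, the single-target layout, and the occupancy figures are assumptions for illustration; the abstract does not give the authors' exact row layout. The last line computes the square-root rule of thumb that the study found its optimal k did not follow.

```python
import math

def make_supervised(series, lag, horizon):
    """Turn a 1-D series into (inputs, target) rows.

    Each row holds `lag` consecutive observations as inputs and the value
    `horizon` steps after the last input as the prediction target.
    """
    rows = []
    for start in range(len(series) - lag - horizon + 1):
        inputs = series[start:start + lag]
        target = series[start + lag + horizon - 1]
        rows.append((inputs, target))
    return rows

# Hypothetical monthly occupancy rates in percent.
occupancy = [61, 58, 64, 70, 75, 72, 68, 66, 71, 77, 80, 74]

rows = make_supervised(occupancy, lag=3, horizon=1)
print(len(rows), rows[0])  # 9 rows; the first is ([61, 58, 64], 70)

# Rule-of-thumb k: square root of the number of training samples.
sqrt_rule_k = round(math.sqrt(len(rows)))
print(sqrt_rule_k)  # 3
```

A larger lag gives each row more context but leaves fewer rows, which is why lag and k are tuned together rather than separately.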


Implementation of the K-Nearest Neighbor Algorithm to Predict Study Programs for Prospective New Students

Infotek : Jurnal Informatika dan Teknologi ◽ 10.29408/jit.v4i2.3546 ◽ 2021 ◽ Vol 4 (2) ◽ pp. 131-141

Author(s): Ratna Rahmawati Rahayu ◽ Lidiawati Lidiawati

Keyword(s): Measurement Accuracy ◽ Nearest Neighbor ◽ High Accuracy ◽ Training Data ◽ K Nearest Neighbor ◽ Study Program ◽ Testing Data ◽ Study Programs ◽ New Students ◽ K Nearest Neighbor Algorithm

One factor in students graduating on time with good grades is a study program that matches their interests and competencies. For this reason, admitting new students requires selection, information, and guidance about the chosen study program. Using student data from previous years, data mining is applied to classify study programs for prospective new students. To get the best results, the data are preprocessed and then divided into training data and testing data. Both sets are processed with the K-Nearest Neighbor algorithm to determine the suitability of the study program class in the testing data, after which the accuracy is calculated. Because the model reached an accuracy of 74%, which the authors consider high, the training data were used to build an application with Java NetBeans that helps prospective new students predict an appropriate study program.


Comparison of Distance Models on K-Nearest Neighbor Algorithm in Stroke Disease Detection

Applied Technology and Computing Science Journal ◽ 10.33086/atcsj.v4i1.2097 ◽ 2021 ◽ Vol 4 (1) ◽ pp. 63-68

Author(s): Iswanto Iswanto ◽ Tulus Tulus ◽ Poltak Sihombing

Keyword(s): Nearest Neighbor ◽ The Other ◽ Training Data ◽ Machine Learning Method ◽ Test Results ◽ K Nearest Neighbor ◽ Minkowski Distance ◽ K Value ◽ Average Accuracy ◽ K Nearest Neighbor Algorithm

Stroke is a cardiovascular disease (CVD) in which brain cells fail to receive an oxygen supply, creating a risk of ischemic damage and death. The disease can be detected from the similarity of symptoms experienced by sufferers, so that early steps can be taken with appropriate counseling and treatment. Detecting stroke in this way requires a machine learning method. In this research, the authors used one of the supervised learning classification methods, K-Nearest Neighbor (K-NN). K-NN is a classification method based on calculating the distance to the training data. This research compares the Euclidean, Minkowski, Manhattan, and Chebyshev distance models to obtain optimal results. The distance models were tested on the stroke dataset from the Kaggle repository. Based on the test results, the Chebyshev model has the highest accuracy of the four distance models, with an average accuracy of 95.49% and a highest accuracy of 96.03% at K = 10. The Euclidean and Minkowski distance models have the same accuracy at each K value, with an average accuracy of 95.45% and a highest accuracy of 95.93% at K = 10. Manhattan has the lowest average accuracy of the distance models, 95.42%, but reaches a highest accuracy of 96.03% at K = 6.
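
The four distance models compared above belong to one family, which the following minimal Python sketch makes explicit. It is an illustration, not the authors' code. Manhattan and Euclidean are the Minkowski distance of order p = 1 and p = 2, and Chebyshev is its limit as p grows. The abstract does not state which order p was used for the Minkowski model; if it was p = 2, identical results for Euclidean and Minkowski at every K would be expected.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def manhattan(a, b):
    """Minkowski distance with p = 1 (sum of absolute differences)."""
    return minkowski(a, b, 1)

def euclidean(a, b):
    """Minkowski distance with p = 2 (straight-line distance)."""
    return minkowski(a, b, 2)

def chebyshev(a, b):
    """Limit of the Minkowski distance as p grows: the largest difference."""
    return max(abs(x - y) for x, y in zip(a, b))

# Two hypothetical feature vectors; coordinate differences are 3, 4 and 0.
a, b = (1, 2, 3), (4, 6, 3)
print(manhattan(a, b))  # 7.0
print(euclidean(a, b))  # 5.0
print(chebyshev(a, b))  # 4
```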


Comparison of Naive Bayes Method, K-NN (K-Nearest Neighbor) and Decision Tree for Predicting the Graduation of ‘Aisyiyah University Students of Yogyakarta

International Journal of Health Science and Technology ◽ 10.31101/ijhst.v2i1.1829 ◽ 2021 ◽ Vol 2 (1)

Author(s): Tikaridha Hardiani

Keyword(s): Data Mining ◽ Decision Tree ◽ Nearest Neighbor ◽ Naive Bayes ◽ Training Data ◽ K Nearest Neighbor ◽ Classification Technique ◽ Testing Data ◽ Student Graduation ◽ The University

The number of students at Universitas ‘Aisyiyah Yogyakarta (UNISA) has been increasing, including in the Faculty of Health Sciences. In 2016, UNISA had 1851 students in total. The growing number of students each year leads to large amounts of data stored in the university database. These data provide useful information for the university to predict student graduation or study period: whether students graduate on time, with a study period of 4 years, or late, with a study period of more than 4 years. This can be done with a data mining technique, namely classification. The classification technique needs data on students who have graduated as training data and data on students who are still studying at the university as testing data. The training data were 501 records with 10 goals, and the testing data were 428 records. The data mining process followed the Cross-Industry Standard Process for Data Mining (CRISP-DM). The algorithms used in this study were Naive Bayes, K-Nearest Neighbor (K-NN), and Decision Tree. The three algorithms were compared on accuracy using RapidMiner software. The K-NN algorithm was the best at predicting student graduation, with an accuracy of 91.82%. The K-NN algorithm predicted that 100% of the students in the Nursing study program of Universitas ‘Aisyiyah Yogyakarta will graduate on time.


Design of an On-Time Graduation Prediction Application for New Students Using Data Mining Techniques (Case Study: Academic Data of STMIK Dipanegara Makassar Students)

Creative Information Technology Journal ◽ 10.24076/citec.2014v1i4.27 ◽ 2015 ◽ Vol 1 (4) ◽ pp. 270

Author(s): Muhammad Syukri Mustafa ◽ I. Wayan Simpen

Keyword(s): Data Mining ◽ Nearest Neighbor ◽ Test Results ◽ K Nearest Neighbor ◽ Accuracy Rate ◽ Sample Data ◽ New Students ◽ K Nearest Neighbor Algorithm ◽ Using Data ◽ Existing Data

This research aims to predict whether new students are likely to complete their studies on time, using data mining analysis with the K-Nearest Neighbor (KNN) algorithm to explore accumulated historical data. The resulting application uses various attributes, including national exam (Ujian Nasional, UN) scores, school or region of origin, gender, parents' occupation and income, number of siblings, and others. By applying KNN analysis, a prediction is made based on the proximity of the existing historical data to the new data, indicating whether the student is likely to complete their studies on time. In tests applying the KNN algorithm, with sample data from alumni who graduated from 2004 to 2010 as the old cases and data from alumni who graduated in 2011 as the new cases, the accuracy rate was 83.36%.


Analysis and Prediction of CET4 Scores Based on Data Mining Algorithm

Complexity ◽ 10.1155/2021/5577868 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-11

Author(s): Hongyan Wang

Keyword(s): Data Mining ◽ Linear Regression ◽ Test Score ◽ Nearest Neighbor ◽ Classification Model ◽ Data Mining Algorithm ◽ K Nearest Neighbor ◽ Nearest Neighbor Algorithm ◽ K Nearest Neighbor Algorithm ◽ Classification Efficiency

This paper presents the concepts and algorithms of data mining and focuses on the linear regression algorithm. Based on multiple linear regression, the many factors affecting CET-4 are analyzed. Following data mining ideas, historical data were collected and appropriately transformed, and statistical analysis techniques were used to analyze the factors influencing the CET-4 test, yielding the CET-4 test results and their influencing factors. The degree of fit of the linear regression relationship was found to be relatively high. We further improve the algorithm and establish a partition-weighted K-nearest neighbor algorithm. The weighted K-nearest neighbor algorithm and the partition algorithm are used for classification prediction of CET-4 test scores, statistical methods are used to study the relevant factors that affect the scores, and screening classification is performed to predict whether a student will pass, with comparison for verification. The input features and the neighboring features are weighted. Although the classification accuracy of the partition algorithm is not significantly improved, its classification stability is better than that of the K-nearest neighbor algorithm, and its classification efficiency is greatly improved: classification time is greatly reduced and classification efficiency increases by 119%. To detect students at risk of not graduating earlier, this paper proposes a K-nearest neighbor classification model for appropriate and timely early warning. Taking test scores, make-up exams, and re-learning as input features, the classification model can effectively predict ordinary students who have not graduated.


Business Intelligence using the K-Nearest Neighbor Algorithm to Analyze Customer Behavior in Online Crowdfunding Systems

E3S Web of Conferences ◽ 10.1051/e3sconf/202020216005 ◽ 2020 ◽ Vol 202 ◽ pp. 16005

Author(s): Chashif Syadzali ◽ Suryono Suryono ◽ Jatmiko Endro Suseno

Keyword(s): Business Intelligence ◽ Nearest Neighbor ◽ Customer Behavior ◽ Training Data ◽ Business Strategies ◽ Intelligence Analysis ◽ K Nearest Neighbor ◽ Nearest Neighbor Algorithm ◽ K Nearest Neighbor Algorithm

Customer behavior classification can help companies conduct business intelligence analysis. Data mining techniques can classify customer behavior using the K-Nearest Neighbor algorithm based on the customer life cycle, which consists of prospect, responder, active, and former. The data used for classification include age, gender, number of donations, donation retention, and number of user visits. Classifying 2,114 records gave the following category shares: active 1.18%, prospect 8.99%, responder 4.26%, and former 85.57%. Testing K from 1 to 20, the highest system accuracy was 94.3731% at K = 4. The resulting classification of user behavior can serve as business intelligence analysis that helps companies determine business strategies by identifying the optimal target market.
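
Selecting K by sweeping a range and keeping the value with the highest accuracy can be sketched as below. This is a minimal Python illustration under stated assumptions: the abstract does not say how accuracy was measured, so leave-one-out evaluation is used here, and the ten two-feature records and their labels are hypothetical. `math.dist` requires Python 3.8 or later.

```python
import math
from collections import Counter

def predict(train, query, k):
    """Majority label among the k training rows nearest to the query."""
    ranked = sorted(train, key=lambda row: math.dist(row[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(data, k):
    """Leave-one-out accuracy: classify each row using all the other rows."""
    hits = 0
    for i, (features, label) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += predict(rest, features, k) == label
    return hits / len(data)

def best_k(data, k_values):
    """Score every candidate K and return (best K, all scores)."""
    scores = {k: loo_accuracy(data, k) for k in k_values}
    # max() keeps the first maximum, so ties go to the smallest K tried.
    return max(scores, key=scores.get), scores

# Hypothetical records: (number of donations, number of visits) -> life cycle
data = [
    ((1, 1), "former"), ((1, 2), "former"), ((2, 1), "former"),
    ((2, 2), "former"), ((1.5, 1.5), "former"),
    ((8, 8), "active"), ((8, 9), "active"), ((9, 8), "active"),
    ((9, 9), "active"), ((8.5, 8.5), "active"),
]

k_star, scores = best_k(data, range(1, 10))
print(k_star, scores)
```

In this toy data accuracy collapses at K = 9 because each held-out record is then outvoted by the five records of the other class, which shows why very large K values hurt when classes are small. That matters for skewed class shares such as the 1.18% active category reported above.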


Tone Classification Matches Kodàly Handsign with the K-Nearest Neighbor Method at Leap Motion Controller

International Journal on Information and Communication Technology (IJoICT) ◽ 10.21108/ijoict.2019.52.283 ◽ 2020 ◽ Vol 5 (2) ◽ pp. 40

Author(s): Muhammad Croassacipto ◽ Muhammad Ichwan ◽ Dina Budhi Utami

Keyword(s): Music Education ◽ Nearest Neighbor ◽ Human Interaction ◽ Training Data ◽ Leap Motion ◽ K Nearest Neighbor ◽ Motion Controller ◽ K Value ◽ Leap Motion Controller ◽ Natural Function

Hands can produce a variety of poses, each of which can carry a meaning or purpose and serve as a form of communication defined by general agreement or by those communicating. Hand poses can make human interaction with a computer faster, more intuitive, and in line with the natural function of the human body; such poses are called hand signs. One of them is the Kodály hand sign, created by the Hungarian composer Zoltán Kodály as a concept in music education in Hungary. This hand sign is used in interactive angklung performances to determine the tone to be played, through a K-Nearest Neighbor (KNN) classification of hand poses. The classification is performed on data extracted from the Leap Motion Controller, which provides pitch, roll, and yaw values based on basic aircraft principles. The research ran five tests with k = 1, 3, 5, 7, and 9, using test data consisting of 874 Do', 702 Si, 913 La, 612 Sol, 661 Fa, 526 Mi, 891 Re, and 1004 Do poses against 21099 training data. The tests recognized hand poses best at k = 1, with an accuracy of 94.87%.


Helmet Monitoring System using Hough Circle and HOG based on KNN

Lontar Komputer Jurnal Ilmiah Teknologi Informasi ◽ 10.24843/lkjiti.2021.v12.i01.p02 ◽ 2021 ◽ Vol 12 (1) ◽ pp. 13

Author(s): Rachmad Jibril Al Kautsar ◽ Fitri Utaminingrum ◽ Agung Setia Budi

Keyword(s): Police Officers ◽ Nearest Neighbor ◽ Training Data ◽ K Nearest Neighbor ◽ Accuracy Rate ◽ Histogram Of Oriented Gradient ◽ Surveillance Camera ◽ Testing Data ◽ Motorized Vehicles ◽ Traffic Operation

The number of Indonesian citizens who use motorized vehicles increases every year. Every motorcyclist in Indonesia must wear a helmet when riding. Even though the rules require helmets, many riders still disobey them. To address this, police officers have carried out various operations (such as traffic operations and warnings). These are not effective because of the limited number of police officers available and the chance that officers make mistakes when detecting violations, possibly because of fatigue. This study proposes a system that detects motorcyclists who do not wear helmets through a surveillance camera. For this purpose, the Circular Hough Transform (CHT), Histogram of Oriented Gradient (HOG), and K-Nearest Neighbor (KNN) are used. Testing used images taken from surveillance cameras, divided into 200 training data and 40 testing data, and obtained an accuracy rate of 82.5%.


Application of K-Nearest Neighbor Algorithm on Classification of Disk Hernia and Spondylolisthesis in Vertebral Column

Indonesian Journal of Information Systems ◽ 10.24002/ijis.v2i1.2352 ◽ 2019 ◽ Vol 2 (1) ◽ pp. 57 ◽ Cited By ~ 1

Author(s): Irma Handayani

Keyword(s): Vertebral Column ◽ Nearest Neighbor ◽ Average Length ◽ Data Classification ◽ The Body ◽ Training Data ◽ K Nearest Neighbor ◽ Sample Data ◽ K Nearest Neighbor Algorithm

The vertebral column, as part of the backbone, has an important role in the human body. Trauma to the vertebral column can affect the spinal cord's ability to send and receive messages between the brain and the body systems that control sensory and motor function. Disk hernia and spondylolisthesis are examples of pathologies of the vertebral column. Research on classifying pathologies or damage to the bones and joints of the skeletal system is rare, even though such a classification system can serve radiologists as a second opinion and thereby improve their productivity and diagnostic consistency. This research used the Vertebral Column dataset from the UCI Machine Learning Repository, which has three classes (Disk Hernia, Spondylolisthesis, and Normal). This research applied the K-NN algorithm to classify disk hernia and spondylolisthesis in the vertebral column. The data were then classified as “normal” or “abnormal”. The K-NN algorithm classifies data by optimizing sample data that can be used as a reference for the training data, producing a vertebral column classification based on the learning process. The results showed that the accuracy of the K-NN classifier was 83%. The average time the K-NN classifier needed to classify was 0.000212303 seconds.
