IMPLEMENTASI DATA MINING UNTUK MENENTUKAN KELAYAKAN PEMBERIAN KREDIT DENGAN MENGGUNAKAN ALGORITMA K-NEAREST NEIGHBORS (K-NN)

Tupan Tri Muryono; Irwansyah Irwansyah

doi:10.37365/jti.v6i1.78

Infotech: Journal of Technology Information ◽

10.37365/jti.v6i1.78 ◽

2020 ◽

Vol 6 (1) ◽

pp. 43-48

Author(s):

Tupan Tri Muryono ◽

Irwansyah Irwansyah

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Home Ownership ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

Analysis Process ◽

Status Number ◽

The Right ◽

Credit Analysis ◽

Loan Amount

Lending to customers is a routine banking activity that carries high risk. In practice, non-performing or bad loans often result from insufficiently careful credit analysis during the approval process, as well as from unreliable customers. The purpose of this study is to apply data mining to support the credit analysis process, so that it yields accurate information on whether a customer applying for credit is eligible and indicates the customer's repayment potential. The study uses 11 attributes: marital status, number of dependents, age, last education, occupation, monthly income, home ownership, collateral, loan amount, loan term, and a description that serves as the result (class) attribute. Data were collected through observation, interviews, and documentation. The method used is K-Nearest Neighbor (K-NN). Evaluation and validation with 5-fold cross-validation in RapidMiner gave a highest K-NN accuracy of 93.33%, obtained in the 5th test.
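
As an illustration of the evaluation described in the abstract above, the following minimal Python sketch runs K-NN under 5-fold cross-validation and reports one accuracy per fold. It is not the authors' RapidMiner workflow: the two-feature synthetic applicants, the class labels, the choice of k=3, and the function names `knn_predict` and `k_fold_accuracies` are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training rows."""
    # Euclidean distance from x to every training row, paired with its label.
    dists = sorted(
        (math.dist(row, x), label) for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def k_fold_accuracies(X, y, n_folds=5, k=3, seed=0):
    """Return one accuracy value per fold of an n_folds cross-validation."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    # Every n_folds-th shuffled index goes to the same fold.
    folds = [idx[i::n_folds] for i in range(n_folds)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_X = [X[i] for i in idx if i not in held_out]
        train_y = [y[i] for i in idx if i not in held_out]
        hits = sum(
            knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx
        )
        scores.append(hits / len(test_idx))
    return scores


# Hypothetical data: two scaled features per applicant (for example income
# and loan amount), drawn from two well-separated synthetic clusters.
rng = random.Random(42)
X, y = [], []
for _ in range(30):
    X.append((rng.gauss(0.8, 0.05), rng.gauss(0.2, 0.05)))
    y.append("eligible")
    X.append((rng.gauss(0.2, 0.05), rng.gauss(0.8, 0.05)))
    y.append("not eligible")

scores = k_fold_accuracies(X, y, n_folds=5, k=3)
print("per-fold accuracy:", [round(s, 4) for s in scores])
print("best fold:", max(scores), "mean:", sum(scores) / len(scores))
```

Printing the per-fold values next to their mean shows whether a headline figure refers to a single best fold or to the average over all folds.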

PERBANDINGAN ALGORITMA K-NEAREST NEIGHBOR, DECISION TREE, DAN NAIVE BAYES UNTUK MENENTUKAN KELAYAKAN PEMBERIAN KREDIT

Infotech: Journal of Technology Information ◽

10.37365/jti.v7i1.104 ◽

2021 ◽

Vol 7 (1) ◽

pp. 35-40

Author(s):

Tupan Tri Muryono ◽

Ahmad Taufik ◽

Irwansyah Irwansyah

Keyword(s):

Decision Tree ◽

Nearest Neighbor ◽

Naive Bayes ◽

Home Ownership ◽

Naïve Bayes ◽

K Nearest Neighbor ◽

Status Number ◽

Credit Analysis ◽

Credit Granting ◽

Loan Amount

Providing credit to customers is a routine banking activity that has a large impact. In practice, non-performing or bad loans often arise from poor credit analysis during the credit granting process, or from bad customers. The purpose of this study is to compare the accuracy of the K-Nearest Neighbor (K-NN), Decision Tree, and Naive Bayes algorithms; the algorithm with the best accuracy will be implemented to determine creditworthiness. The study uses 11 attributes: marital status, number of dependents, age, last education, occupation, monthly income, home ownership, collateral, loan amount, loan term, and a description as the result attribute. From evaluation and validation with 5-fold cross-validation in RapidMiner, the highest accuracy among the three algorithms was 98%, obtained by the decision tree (C4.5) in the 3rd test.

KLASIFIKASI CALON DEBITUR KREDIT PEMILIKAN RUMAH (KPR) MULTIGUNA TAKE OVER MENGGUNAKAN METODE k NEAREST NEIGHBOR DENGAN PEMBOBOTAN GLOBAL GINI DIVERSITY INDEX

Jurnal Gaussian ◽

10.14710/j.gauss.v8i4.26721 ◽

2019 ◽

Vol 8 (4) ◽

pp. 407-417

Author(s):

Inas Hasimah ◽

Moch. Abdul Mukid ◽

Hasbi Yasin

Keyword(s):

Nearest Neighbor ◽

Home Ownership ◽

Financial Institution ◽

Diversity Index ◽

Training Data ◽

Final Decision ◽

K Nearest Neighbor ◽

Data Set ◽

Independent Variable ◽

Credit Analysis

A home ownership loan (Kredit Pemilikan Rumah, KPR) is a credit facility for buying a house or meeting other consumptive needs, secured by a house. For a standard KPR, the collateral is the house being purchased. For a KPR multiguna take over, the collateral is the house that will be owned by the debtor, who then takes the KPR over to another financial institution. Credit is granted to a prospective debtor only after a credit application and credit analysis process. Credit analysis establishes the debtor's ability to repay the loan. The final decision on a credit application is classified as approved or refused. k-Nearest Neighbor with attribute weighting by the Global Gini Diversity Index is a statistical method that can be used to classify the credit decision for prospective debtors. This research uses 2443 records of prospective KPR multiguna take over debtors from 2018, with the credit decision as the dependent variable and four selected independent variables: home ownership status, job, loan amount, and income. The best classification result of k-NN with Global Gini Diversity Index weighting was obtained with 80% training data and 20% testing data at k=7, giving an apparent error rate (APER) of 0.0798 and an accuracy of 92.02%. Keywords: KPR Multiguna Take Over, Classification, KNN by Global Gini Diversity Index weighting, Evaluation of Classification
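
The two figures reported at the end of the abstract above are complementary, which can be checked with a short worked equation. The symbols $n_{\text{misclassified}}$ and $n_{\text{test}}$ are notation introduced here for illustration, on the assumption that APER is computed as the share of misclassified test observations; the abstract itself gives only the final values.

```latex
\[
\mathrm{APER} = \frac{n_{\text{misclassified}}}{n_{\text{test}}},
\qquad
\text{accuracy} = 1 - \mathrm{APER} = 1 - 0.0798 = 0.9202 = 92.02\%.
\]
```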

Sistem Pendukung Keputusan Kredit Usaha Rakyat PT. Bank Rakyat Indonesia Unit Kaliangkrik Magelang

Creative Information Technology Journal ◽

10.24076/citec.2014v2i1.33 ◽

2015 ◽

Vol 2 (1) ◽

pp. 1

Author(s):

Agung Nugroho ◽

Kusrini Kusrini ◽

M. Rudyanto Arief

Keyword(s):

Data Mining ◽

Decision Support ◽

Decision Maker ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

Training Data ◽

Classification Rule ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

Viable Solution

Many factors and variables affect credit risk in decisions on People's Business Credit (Kredit Usaha Rakyat, KUR). The factors used as the basis for KUR assessment at PT. Bank Rakyat Indonesia Unit Kaliangkrik follow the basic principle known as the "5 C's of Credit": Character, Capacity, Capital, Condition, and Collateral. Based on these assessment factors, the Classification Rule Mining method is used to build a decision support system for granting KUR. Several data mining algorithms can be used for classification, one of which is the k-nearest neighbor algorithm. The decision support system is designed to classify an object based on the training data closest to it and, from user input, to recommend which customers are eligible to receive KUR using the k-nearest neighbors (KNN) method. Payment transaction data of existing customers serve as training data, with classes assigned beforehand. Classes are assigned by categorizing customer status according to the amount of credit payment arrears. After the case similarity between a prospective new customer and the existing customers in the training data is computed with the K-Nearest Neighbor algorithm, the result with the highest value serves as a reference for the decision maker.

Perancangan Aplikasi Prediksi Kelulusan Tepat Waktu Bagi Mahasiswa Baru Dengan Teknik Data Mining (Studi Kasus: Data Akademik Mahasiswa STMIK Dipanegara Makassar)

Creative Information Technology Journal ◽

10.24076/citec.2014v1i4.27 ◽

2015 ◽

Vol 1 (4) ◽

pp. 270

Author(s):

Muhammad Syukri Mustafa ◽

I. Wayan Simpen

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Test Results ◽

K Nearest Neighbor ◽

Accuracy Rate ◽

Sample Data ◽

New Students ◽

K Nearest Neighbor Algorithm ◽

Using Data ◽

Existing Data

This research aims to predict whether new students are likely to complete their studies on time, using data mining with the K-Nearest Neighbor (KNN) algorithm to explore accumulated historical data. The resulting application uses various attributes, including national examination (Ujian Nasional, UN) scores, school or region of origin, gender, parents' occupation and income, number of siblings, and others. By applying KNN analysis, a prediction is made from the proximity of the existing historical data to the new data: whether or not the student is likely to complete the study on time. Testing the KNN algorithm with sample data of alumni who graduated from 2004 to 2010 as old cases and alumni who graduated in 2011 as new cases gave an accuracy rate of 83.36%.

Tropical Balls and Its Applications to K Nearest Neighbor over the Space of Phylogenetic Trees

Mathematics ◽

10.3390/math9070779 ◽

2021 ◽

Vol 9 (7) ◽

pp. 779

Author(s):

Ruriko Yoshida

Keyword(s):

Supervised Learning ◽

Phylogenetic Trees ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

High Dimensional ◽

Learning Method ◽

Dimensional Vector ◽

K Nearest Neighbor ◽

K Nearest Neighbors

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. Then we discuss its application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.
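
To make the idea in the abstract above concrete, here is a minimal Python sketch of K nearest neighbors under the tropical metric, assuming the usual definition $d_{tr}(u, v) = \max_i (u_i - v_i) - \min_i (u_i - v_i)$. The training vectors, labels, and function names are hypothetical; in particular the vectors are arbitrary points chosen for illustration and are not claimed to be valid tree metrics.

```python
from collections import Counter


def tropical_distance(u, v):
    """Tropical metric: spread (max minus min) of the coordinate differences."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)


def tropical_knn(train, query, k=3):
    """Classify query from (vector, label) pairs in train.

    Returns the majority label of the k nearest vectors and the radius of
    the smallest tropical ball centered at query that contains them.
    """
    nearest = sorted(
        train, key=lambda pair: tropical_distance(pair[0], query)
    )[:k]
    radius = tropical_distance(nearest[-1][0], query)
    label = Counter(lbl for _, lbl in nearest).most_common(1)[0][0]
    return label, radius


# Hypothetical labeled points (not real phylogenetic data).
train = [
    ((0, 0, 0), "A"),
    ((0, 1, 1), "A"),
    ((0, 5, 5), "B"),
    ((0, 6, 4), "B"),
    ((0, 1, 0), "A"),
]

# Adding a constant to every coordinate does not change tropical distances,
# which is why the metric lives on the tropical projective torus.
print(tropical_distance((0, 0, 0), (1, 1, 1)))
print(tropical_knn(train, (3, 3, 4), k=3))
```

Returning the radius alongside the label mirrors the ball-based description of KNN in the abstract: the prediction is read off the labeled points that fall inside a ball centered at the query.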

Data Mining Approach to Analyze COVID-19 Clinical Dataset

10.53350/pjmhs211561812 ◽

2021 ◽

Vol 15 (6) ◽

pp. 1812-1819

Author(s):

Azita Yazdani ◽

Ramin Ravangard ◽

Roxana Sharifian

Keyword(s):

Artificial Intelligence ◽

Data Mining ◽

Support Vector Machine ◽

Nearest Neighbor ◽

Clinical Signs ◽

Study Data ◽

Mining Machine ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Mining Approach

The new coronavirus has been spreading since the beginning of 2020 and many efforts have been made to develop vaccines to help patients recover. It is now clear that the world needs a rapid solution to curb the spread of COVID-19 with non-clinical approaches such as data mining, enhanced intelligence, and other artificial intelligence techniques. These approaches can reduce the burden on the health care system and provide the best possible way to diagnose and predict the COVID-19 epidemic. In this study, data mining models for early detection of COVID-19 in patients were developed using the epidemiological dataset of patients and individuals suspected of having COVID-19 in Iran. C4.5, support vector machine, Naive Bayes, logistic regression, Random Forest, and k-nearest neighbor algorithms were applied directly to the dataset using RapidMiner to develop the models. Given a patient's clinical signs, the model assesses the risk of contracting the COVID-19 virus. Examination of the models has shown that the support vector machine, with 93.41% accuracy, is the most efficient at diagnosing patients with COVID-19 and is the best of the developed models. Keywords: COVID-19, Data mining, Machine Learning, Artificial Intelligence, Classification

Performance of Naïve Bayes, C4.5 and KNN using Breast Cancer, Iris and Hypothyroid Datasets

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.c8795.019320 ◽

2020 ◽

Vol 9 (3) ◽

pp. 2193-2197

Keyword(s):

Breast Cancer ◽

Data Mining ◽

Nearest Neighbor ◽

Naive Bayes ◽

Naïve Bayes ◽

Specific Pattern ◽

K Nearest Neighbor ◽

Data Mining Technique ◽

Digital Format ◽

Tree Classifier

Data mining usually refers to discovering specific patterns in, or analyzing data from, a large dataset. Classification is an efficient data mining technique in which the classes into which data are sorted are predefined using existing datasets. Classifying medical records by their symptoms with computerized methods, and storing the predicted information in digital format, is of great importance for diagnosing various diseases. This paper concentrates on finding the algorithm with the highest accuracy so that a cost-effective algorithm can be identified. The data mining classification algorithms are compared on their accuracy in finding the correct data according to the diagnosis report, and on their execution rate, to identify how fast the records are classified. The classification algorithms used in this study are the Naive Bayes classifier, the C4.5 tree classifier, and K-Nearest Neighbor (KNN), compared to predict which algorithm is best suited for classifying any kind of medical dataset. The Breast Cancer, Iris, and Hypothyroid datasets are used to determine which of the three algorithms classifies the datasets with the highest accuracy in finding the records of patients with particular health problems. The experimental results, presented as tables and graphs, show the performance and importance of the Naïve Bayes, C4.5, and K-Nearest Neighbor algorithms. In the performance outcome, the C4.5 algorithm is considerably better than the Naïve Bayes and K-Nearest Neighbor algorithms.

Analysis and Prediction of CET4 Scores Based on Data Mining Algorithm

Complexity ◽

10.1155/2021/5577868 ◽

2021 ◽

Vol 2021 ◽

pp. 1-11

Author(s):

Hongyan Wang

Keyword(s):

Data Mining ◽

Linear Regression ◽

Test Score ◽

Nearest Neighbor ◽

Classification Model ◽

Data Mining Algorithm ◽

K Nearest Neighbor ◽

Nearest Neighbor Algorithm ◽

K Nearest Neighbor Algorithm ◽

Classification Efficiency

This paper presents the concepts and algorithms of data mining and focuses on the linear regression algorithm. Using multiple linear regression, many factors affecting CET-4 scores are analyzed. Following a data mining approach, historical data were collected and appropriately transformed, and statistical analysis techniques were applied to the many factors influencing the CET-4 test, yielding the CET-4 test results and their influencing factors. The goodness of fit of the linear regression relationship was found to be relatively high. We further improve the algorithm and establish a partition-weighted K-nearest neighbor algorithm. The weighted K-nearest neighbor algorithm and the partition algorithm are used for classification prediction of CET-4 test scores, statistical methods are used to study the relevant factors that affect the score, and screening classification is performed to predict whether the test will be passed. The input features and the adjacent features are weighted. Although the classification effect of the partition algorithm is not significantly improved, its classification stability is better than that of the K-nearest neighbor algorithm, classification time is greatly reduced, and classification efficiency is increased by 119%. To detect students at risk of not graduating earlier, this paper proposes a K-nearest neighbor classification model for appropriate and timely early warning. Taking test scores, make-up exams, and course retakes as input features, the classification model can effectively identify ordinary students who fail to graduate.

IntelliFin: Advanced Stock Prediction using Hybrid ML and LSTM Model with Financial Indicators powered by Sentiment Determination using NLP

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.d8437.069520 ◽

2020 ◽

Vol 9 (5) ◽

pp. 428-433

Keyword(s):

Stock Market ◽

Stock Prices ◽

Stock Price ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Majority Voting ◽

Support Vector ◽

K Nearest Neighbor ◽

Financial History ◽

The Right

Stock trading has been one of the most important parts of the financial world for decades. People investing in the share market analyze the financial history of a corporation and the news related to it, and study huge amounts of data in order to predict its stock price trend. The right investment, i.e. buying and selling a company's stock at the right time, leads to monetary benefits and can make one a millionaire overnight. The stock market is an extremely volatile platform in which data is produced in huge quantities and is influenced by numerous disparate factors such as socio-political issues, financial activities like splits and dividends, and news as well as rumors. This work proposes a novel system, "IntelliFin", to predict the share market trend. The system uses various stock market technical indicators along with the company's historical market data trends to predict share prices. It also applies sentiment determination to a company's financial and socio-political news for a more accurate prediction. The system is implemented using two models. The first is a hybrid LSTM model optimized by the Adam optimizer. The other is a hybrid ML model that integrates a Support Vector Regressor, a K-Nearest Neighbor classifier, an RF classifier, and a Linear Regressor using a Majority Voting algorithm. Both models employ an NLP-powered sentiment analyzer to account for news affecting stock prices. The models are trained continuously using Reinforcement Learning, implemented with the Q-Learning algorithm, to increase consistency and accuracy. The project aims to support inexperienced investors in the stock market and help them maximize their profit and minimize or eliminate losses. The developed system will also serve as a tool to aid professional investors in their decision making.
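
The majority voting step mentioned in the abstract above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the IntelliFin implementation: it assumes each component model's output has already been reduced to a trend label such as "up" or "down", and the model names and predictions below are invented.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most models.

    Ties are resolved in favor of the label that appears first in the
    input, because Counter.most_common keeps first-seen order among
    equal counts.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical per-model trend labels for one stock on one day.
model_outputs = {
    "support_vector": "up",
    "k_nearest_neighbor": "down",
    "random_forest": "up",
    "linear": "up",
}

combined = majority_vote(list(model_outputs.values()))
print(combined)
```

With an even number of voters a tie is possible, so a real system would need an explicit tie-breaking rule; the first-seen rule used here is just one simple choice.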
