Classification of Prospective Debtors for Multipurpose Take-Over Home Ownership Credit (KPR) Using the k-Nearest Neighbor Method with Global Gini Diversity Index Weighting

2019 ◽  
Vol 8 (4) ◽  
pp. 407-417
Author(s):  
Inas Hasimah ◽  
Moch. Abdul Mukid ◽  
Hasbi Yasin

House credit (KPR) is a credit facility for buying a house or meeting other consumptive needs with a house as collateral. The collateral for an ordinary KPR is the house to be purchased, while the collateral for a multipurpose take-over KPR is a house already owned by the debtor, whose existing KPR is then taken over by another financial institution. Before credit is granted, a prospective debtor passes through the processes of credit application and credit analysis; the credit analysis establishes the debtor's ability to repay. The final decision on a credit application is classified as approved or refused. k-Nearest Neighbor with attribute weighting based on the Global Gini Diversity Index is a statistical method that can be used to classify the credit decisions of prospective debtors. This research uses 2443 records of prospective multipurpose take-over KPR debtors from 2018, with the credit decision as the dependent variable and four selected independent variables: home ownership status, job, loan amount, and income. The best classification result for k-NN with Global Gini Diversity Index weighting is obtained with an 80% training set and a 20% testing set at k=7, giving an APER of 0.0798 and an accuracy of 92.02%. Keywords: KPR Multiguna Take Over, Classification, KNN with Global Gini Diversity Index weighting, Evaluation of Classification
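A minimal sketch (not the authors' code) of the weighting idea, assuming scikit-learn, a discretized attribute matrix X, and the approved/refused label y; it weights each attribute by its Gini diversity index reduction before running kNN with k = 7 on an 80%/20% split, and reports accuracy together with APER (the apparent error rate, 1 - accuracy):

```python
import numpy as np
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler


def gini(y):
    """Gini diversity index of a label vector."""
    _, counts = np.unique(y, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)


def gini_gain(x, y):
    """Impurity reduction when the labels are partitioned by the categories of x.
    Assumes x is a discretized attribute (categorical or binned numeric)."""
    gain = gini(y)
    for v in np.unique(x):
        mask = x == v
        gain -= mask.mean() * gini(y[mask])
    return gain


def weighted_knn_accuracy(X, y, k=7, test_size=0.2, seed=0):
    """80/20 split, Gini-gain attribute weights, weighted Euclidean kNN."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=seed)
    scaler = StandardScaler().fit(X_tr)
    X_tr_s, X_te_s = scaler.transform(X_tr), scaler.transform(X_te)
    # Weights are computed on the raw (discretized) training attributes only,
    # then applied to the standardized features before the distance computation.
    w = np.array([gini_gain(X_tr[:, j], y_tr) for j in range(X.shape[1])])
    knn = KNeighborsClassifier(n_neighbors=k).fit(X_tr_s * w, y_tr)
    acc = accuracy_score(y_te, knn.predict(X_te_s * w))
    return acc, 1.0 - acc  # accuracy and APER
```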

2021 ◽  
Vol 87 (6) ◽  
pp. 445-455
Author(s):  
Yi Ma ◽  
Zezhong Zheng ◽  
Yutang Ma ◽  
Mingcang Zhu ◽  
Ran Huang ◽  
...  

Many manifold learning algorithms conduct an eigenvector analysis on a data-similarity matrix of size N×N, where N is the number of data points, so the memory complexity of the analysis is no less than O(N²). We present in this article an incremental manifold learning approach for handling large hyperspectral data sets for land-use identification. In our method, the number of dimensions for the high-dimensional hyperspectral-image data set is obtained from the training data set. A local curvature variation algorithm is used to sample a subset of data points as landmarks, and a manifold skeleton is then identified from the landmarks. Our method is validated on three AVIRIS hyperspectral data sets, outperforming the comparison algorithms with a k-nearest-neighbor classifier and achieving the second-best performance with a support vector machine.
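A minimal sketch of the landmark idea, assuming scikit-learn; random landmark sampling stands in for the paper's local curvature variation criterion, and Isomap stands in for the authors' manifold skeleton, so this only illustrates how the full N×N similarity matrix is avoided:

```python
import numpy as np
from sklearn.manifold import Isomap
from sklearn.neighbors import KNeighborsClassifier


def landmark_embed_and_classify(X_train, y_train, X_test,
                                n_landmarks=2000, n_components=10, seed=0):
    """Embed via landmarks only, then classify land-use labels with kNN.
    n_components is assumed; the paper estimates it from the training set."""
    rng = np.random.default_rng(seed)
    # 1. Sample landmarks so the embedding never builds the full N x N matrix.
    idx = rng.choice(len(X_train), size=min(n_landmarks, len(X_train)), replace=False)
    landmarks = X_train[idx]
    # 2. Learn the manifold "skeleton" on the landmarks only.
    embedder = Isomap(n_neighbors=12, n_components=n_components).fit(landmarks)
    # 3. Project the remaining training pixels and the test pixels incrementally.
    Z_train = embedder.transform(X_train)
    Z_test = embedder.transform(X_test)
    # 4. Classify in the low-dimensional space.
    clf = KNeighborsClassifier(n_neighbors=5).fit(Z_train, y_train)
    return clf.predict(Z_test)
```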


Diagnostics ◽  
2019 ◽  
Vol 9 (3) ◽  
pp. 104 ◽  
Author(s):  
Ahmed ◽  
Yigit ◽  
Isik ◽  
Alpkocak

Leukemia is a fatal cancer with two main types, acute and chronic, each of which has two subtypes, lymphoid and myeloid; in total there are four subtypes of leukemia. This study proposes a new approach for diagnosing all subtypes of leukemia from microscopic blood-cell images using convolutional neural networks (CNN), which require a large training data set. Therefore, we also investigated the effect of data augmentation as a way of synthetically increasing the number of training samples. We used two publicly available leukemia data sources, ALL-IDB and the ASH Image Bank, and applied seven different image transformation techniques as data augmentation. We designed a CNN architecture capable of recognizing all subtypes of leukemia, and we also explored other well-known machine learning algorithms such as naive Bayes, support vector machine, k-nearest neighbor, and decision tree. To evaluate our approach, we set up a series of experiments and used 5-fold cross-validation. The results showed that our CNN model reaches 88.25% and 81.74% accuracy in leukemia-versus-healthy and multiclass classification of all subtypes, respectively. Finally, we also showed that the CNN model performs better than the other well-known machine learning algorithms.
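A minimal sketch, not the authors' architecture: a small Keras CNN for the four leukemia subtypes with on-the-fly augmentation; the image size, layer sizes, and the particular transformations are assumptions, not the seven transformations used in the paper:

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation pipeline: each parameter enables one illustrative transformation.
augmenter = ImageDataGenerator(
    rotation_range=20, width_shift_range=0.1, height_shift_range=0.1,
    shear_range=0.1, zoom_range=0.1, horizontal_flip=True, vertical_flip=True,
    rescale=1.0 / 255)

# Small CNN with a 4-way softmax over the leukemia subtypes.
model = models.Sequential([
    layers.Input(shape=(128, 128, 3)),
    layers.Conv2D(32, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Conv2D(128, 3, activation="relu"), layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(4, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])

# Training on augmented batches from a hypothetical directory of labelled cell images:
# train_gen = augmenter.flow_from_directory("leukemia/train", target_size=(128, 128))
# model.fit(train_gen, epochs=30)
```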


2021 ◽  
Vol 7 (1) ◽  
pp. 35-40
Author(s):  
Tupan Tri Muryono ◽  
Ahmad Taufik ◽  
Irwansyah Irwansyah

Providing credit to customers is a routine banking activity with large consequences. In practice, non-performing or bad loans often arise from poor credit analysis in the credit-granting process or from bad customers. The purpose of this study is to compare the accuracy of the K-Nearest Neighbor (K-NN), Decision Tree, and Naive Bayes algorithms; the algorithm with the best accuracy is then used to determine creditworthiness. The study uses 11 attributes: marital status, number of dependents, age, last education, occupation, monthly income, home ownership, collateral, loan amount, length of loan, and the decision as the outcome attribute. The methods used in this research are K-Nearest Neighbor, Decision Tree, and Naive Bayes. From the evaluation and validation with 5-fold cross-validation carried out in the RapidMiner tools, the highest accuracy among the three algorithms was obtained by the decision tree (C4.5), at 98% in the 3rd test.
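A minimal sketch of the comparison, assuming scikit-learn rather than RapidMiner; the synthetic data is a placeholder for the real 11-attribute credit records, and DecisionTreeClassifier with the entropy criterion stands in for C4.5:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Placeholder for the encoded 10 predictor attributes and the creditworthiness label.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

models = {
    "K-NN": KNeighborsClassifier(n_neighbors=5),
    "Decision Tree (entropy, C4.5-like)": DecisionTreeClassifier(criterion="entropy"),
    "Naive Bayes": GaussianNB(),
}
for name, clf in models.items():
    scores = cross_val_score(clf, X, y, cv=5, scoring="accuracy")
    print(f"{name}: mean accuracy {scores.mean():.3f}, per fold {scores.round(3)}")
```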


Author(s):  
Yeni Kustiyahningsih

A large cattle population increases the potential for cattle disease to develop, and a lack of knowledge about the various cattle diseases and how to handle them is one cause of decreasing productivity. The aim of this research is to classify cattle diseases quickly and accurately, helping breeders accelerate the detection and handling of disease. This study uses the K-Nearest Neighbour (KNN) classification method with F-Score feature selection. The KNN method classifies disease based on the distance between training data and test data, while F-Score feature selection reduces the attribute dimensions so that only the relevant attributes remain. The data set consists of 350 records of cattle disease in Madura with 21 features and 7 classes. The data were partitioned with k-fold cross-validation using k = 5. Based on the test results, the best performance was obtained with 18 features and KNN (k = 3), giving an accuracy of 94.28571%, a recall of 0.942857, and a precision of 0.942857.
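A minimal sketch, assuming scikit-learn: ANOVA F-value selection (a close stand-in for the paper's F-Score criterion) keeps 18 of the 21 features before kNN with k = 3 under 5-fold cross-validation; the synthetic data is a placeholder for the Madura cattle-disease records:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline

# Placeholder for the 350 x 21 symptom matrix with 7 disease classes.
X, y = make_classification(n_samples=350, n_features=21, n_informative=10,
                           n_classes=7, random_state=0)

pipeline = make_pipeline(
    SelectKBest(score_func=f_classif, k=18),   # keep the 18 most relevant features
    KNeighborsClassifier(n_neighbors=3),
)
results = cross_validate(pipeline, X, y, cv=5,
                         scoring=["accuracy", "recall_macro", "precision_macro"])
print(results["test_accuracy"].mean())
```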


2020 ◽  
Vol 6 (1) ◽  
pp. 43-48
Author(s):  
Tupan Tri Muryono ◽  
Irwansyah Irwansyah

Providing credit to customers is a routine banking activity that carries high risk. In practice, problematic or bad credit often results from insufficiently careful credit analysis during the credit-granting process, as well as from bad customers. The purpose of this study is to apply data mining to support the credit analysis process so that it produces the right information about whether a customer applying for credit is creditworthy, making it possible to see the customer's potential to repay. The study uses 11 attributes: marital status, number of dependents, age, last education, occupation, monthly income, home ownership, collateral, loan amount, length of loan, and the decision as the outcome attribute. Data were collected through observation, interviews, and documentation. The method used in this study is K-Nearest Neighbor (K-NN). From the evaluation and validation with 5-fold cross-validation carried out in the RapidMiner tools, the K-Nearest Neighbor (K-NN) method achieved its highest accuracy, 93.33%, in the 5th test.
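A minimal sketch, assuming scikit-learn rather than RapidMiner, of per-fold evaluation so the best fold (the 5th test in the article) can be read off directly; the synthetic data is a placeholder for the encoded credit attributes:

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import StratifiedKFold
from sklearn.neighbors import KNeighborsClassifier

# Placeholder for the encoded 10 predictor attributes and the creditworthiness label.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr, te) in enumerate(skf.split(X, y), start=1):
    knn = KNeighborsClassifier(n_neighbors=5).fit(X[tr], y[tr])
    acc = accuracy_score(y[te], knn.predict(X[te]))
    print(f"fold {fold}: accuracy = {acc:.4f}")
```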


2021 ◽  
Vol 32 (2) ◽  
pp. 20-25
Author(s):  
Efraim Kurniawan Dairo Kette

In pattern recognition, the k-Nearest Neighbor (kNN) algorithm is the simplest non-parametric algorithm. Because of this simplicity, its classification performance is usually influenced by the model cases and by the quality of the training data itself. This article therefore proposes a sparse correlation weight model combined with a Training Data Set Cleaning (TDC) method based on Classification Ability Ranking (CAR), called the CAR classification method based on Coefficient-Weighted kNN (CAR-CWKNN), to improve kNN classifier performance. Correlation weighting in Sparse Representation (SR) has been shown to increase classification accuracy; SR can expose the 'neighborhood' structure of the data, which makes it well suited to nearest-neighbor classification. In the cleaning stage, a Classification Ability (CA) function ranks the training samples, and the Leave One Out (LV1) concept within the CA removes samples that are likely to yield wrong classification results from the original training data, thereby reducing the influence of training-sample quality on kNN classification performance. Experiments with four public UCI data sets related to classification problems show that the CAR-CWKNN method provides better performance in terms of accuracy.
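A minimal sketch of the cleaning stage only (not the sparse correlation weighting), assuming scikit-learn: each training sample is given a leave-one-out kNN test, and samples whose prediction disagrees with their label are dropped before the final classifier is fitted; the refit-per-sample loop is illustrative, not efficient:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier


def clean_training_set(X, y, k=5):
    """Drop training samples whose leave-one-out kNN prediction disagrees with their label."""
    keep = np.ones(len(y), dtype=bool)
    for i in range(len(y)):
        mask = np.arange(len(y)) != i          # leave sample i out
        knn = KNeighborsClassifier(n_neighbors=k).fit(X[mask], y[mask])
        keep[i] = knn.predict(X[i:i + 1])[0] == y[i]
    return X[keep], y[keep]

# X_clean, y_clean = clean_training_set(X_train, y_train)
# final_knn = KNeighborsClassifier(n_neighbors=5).fit(X_clean, y_clean)
```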


2015 ◽  
Vol 4 (1) ◽  
pp. 61-81
Author(s):  
Mohammad Masoud Javidi

Multi-label classification is an extension of conventional classification in which a single instance can be associated with multiple labels. Problems of this type are ubiquitous in everyday life; a movie, for example, can be categorized as action, crime, and thriller. Most multi-label classification algorithms are designed for balanced data and do not work well on imbalanced data, yet in real applications most datasets are imbalanced. We therefore focus on improving multi-label classification performance on imbalanced datasets. In this paper, a state-of-the-art multi-label classification algorithm called IBLR_ML is employed. This algorithm combines the k-nearest neighbor and logistic regression algorithms, and its logistic regression part is combined with two ensemble learning algorithms, Bagging and Boosting; our approach is called IB-ELR. In this paper, for the first time, the ensemble bagging method with a stable learner as the base learner and imbalanced data sets as the training data is examined. Finally, to evaluate the proposed methods, they are implemented in the Java language. Experimental results show the effectiveness of the proposed methods. Keywords: Multi-label classification, Imbalanced data set, Ensemble learning, Stable algorithm, Logistic regression, Bagging, Boosting
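A minimal sketch, not the IBLR_ML/IB-ELR implementation and in Python rather than the paper's Java: binary-relevance multi-label classification in which each label gets a bagged logistic-regression ensemble, alongside a plain multi-label kNN baseline on synthetic placeholder data:

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import MultiOutputClassifier
from sklearn.neighbors import KNeighborsClassifier

# Placeholder multi-label data: Y is an n_samples x n_labels 0/1 indicator matrix.
X, Y = make_multilabel_classification(n_samples=200, n_features=20,
                                      n_classes=5, random_state=0)

# One bagged logistic-regression ensemble per label (binary relevance).
bagged_lr = MultiOutputClassifier(
    BaggingClassifier(LogisticRegression(max_iter=1000), n_estimators=10))
bagged_lr.fit(X, Y)

# kNN handles multi-label indicator targets natively.
knn_baseline = KNeighborsClassifier(n_neighbors=10).fit(X, Y)
```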


2021 ◽  
Author(s):  
tejaswini kambaiahgari ◽  
Uma Rao K

In the present world there are many songs available over the internet, but information retrieval on these songs can be complicated. This paper intends to classify songs based on emotions using deep learning. We propose a strategy to recognize the emotion present in a song by classifying its spectrogram, which contains both time and frequency information. According to human psychology, neurons within a subpopulation of our brain do not react the same way to all emotions, so only specific neurons need to be triggered to identify an emotion. Different deep learning and machine learning algorithms are implemented to build the music emotion recognizer. The main objectives of this study are to examine the features that are important for an audio file, to develop a music emotion classifier using a deep learning algorithm, and to validate the model. The datasets are split into training and testing sets, and the models are trained with the training set. The accuracy of the Artificial Neural Network (ANN) model is 79.7%, that of the K-Nearest Neighbor (KNN) model is 78.26%, and that of logistic regression for gender classification is 81%.
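A minimal sketch, assuming librosa and scikit-learn: each clip is turned into a log-mel spectrogram and the flattened spectrogram is fed to a kNN emotion classifier; the exact features, networks, and emotion taxonomy of the paper are not reproduced here, and the file paths are hypothetical:

```python
import numpy as np
import librosa
from sklearn.neighbors import KNeighborsClassifier


def song_to_features(path, sr=22050, duration=30.0):
    """Load a fixed-length clip and return a flattened log-mel spectrogram vector."""
    y, sr = librosa.load(path, sr=sr, duration=duration)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=64)
    return librosa.power_to_db(mel).flatten()

# Hypothetical usage with a list of training clips and their emotion labels:
# features = np.stack([song_to_features(p) for p in training_paths])
# knn = KNeighborsClassifier(n_neighbors=5).fit(features, emotion_labels)
```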

