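PERBANDINGAN KINERJA MUTUAL K-NEAREST NEIGHBOR (MKNN) DAN K-NEAREST NEIGHBOR (KNN) DALAM ANALISIS KLASIFIKASI KELAYAKAN KREDIT [Performance Comparison of Mutual K-Nearest Neighbor (MKNN) and K-Nearest Neighbor (KNN) in Credit Feasibility Classification Analysis]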

2019 ◽  
Vol 8 (3) ◽  
pp. 366-376
Author(s):  
Annisa Sugesti ◽  
Moch. Abdul Mukid ◽  
Tarno Tarno

Credit feasibility analysis is important for lenders to limit risk as the number of credit applications increases. This analysis can be carried out with classification techniques; this research uses instance-based classification. Such techniques tend to be simple but depend heavily on the choice of K, the number of nearest neighbors considered when assigning a class to new data. A small value of K is very sensitive to outliers. This weakness can be overcome with an algorithm that handles outliers, one of which is Mutual K-Nearest Neighbor (MKNN). MKNN first removes outliers, then predicts the class of a new observation from the majority class of its mutual nearest neighbors. The algorithm is compared with KNN without outliers. The models are evaluated by 10-fold cross-validation, and classification performance is measured by the geometric mean (G-Mean) of sensitivity and specificity. The optimal value of K is 9 for MKNN and 3 for KNN; the highest G-Mean, 0.718, is produced by KNN, while MKNN reaches 0.702. The best alternative for classifying credit feasibility in this study is the K-Nearest Neighbor (KNN) algorithm with K=3.
Keywords: Classification, Credit, MKNN, KNN, G-Mean.
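The G-Mean used as the performance measure above can be sketched in a few lines. This is a minimal illustration, assuming the usual definition (square root of sensitivity times specificity); the function name `g_mean` and the confusion counts are hypothetical and are not taken from the study.

```python
import math


def g_mean(tp, fn, tn, fp):
    """Geometric mean of sensitivity and specificity from confusion counts.

    Assumes both classes are present, so neither denominator is zero.
    """
    sensitivity = tp / (tp + fn)  # share of actual positives predicted positive
    specificity = tn / (tn + fp)  # share of actual negatives predicted negative
    return math.sqrt(sensitivity * specificity)


# Hypothetical counts for illustration only (not the paper's data)
print(round(g_mean(tp=80, fn=20, tn=45, fp=55), 3))  # prints 0.6
```

Because the two rates are multiplied, a classifier that neglects one class scores low even if its overall accuracy is high, which is presumably why the measure suits credit data with unequal class sizes.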

Author(s):  
Suwanto Sanjaya ◽  
Morina Lisa Pura ◽  
Siska Kurnia Gusti ◽  
Febi Yanto ◽  
Fadhilah Syafria

Tomatoes can be selected using several indicators, one of which is fruit color. In digital image processing, one color representation that can be used is Hue, Saturation, and Value (HSV). In this research, HSV is proposed as the color-model feature for determining the ripeness of tomatoes. The dataset consisted of 400 tomato images taken from four sides. Ripeness was divided into five levels: green, turning, pink, light red, and red. The data were split using K-Fold Cross Validation with ten folds, and classification was performed with k-Nearest Neighbor (kNN). The test scenario combined image size with the number of neighbors (k). The image sizes tested were 100x100, 300x300, 600x600, and 1000x1000 pixels, and the k values tested were 1, 3, 5, 7, 9, 11, and 13. The highest accuracy, 92.5%, was reached at an image size of 1000x1000 pixels with k = 3. The experiment showed that image size has a significant influence on accuracy, whereas the number of neighbors (k) has only a minor influence.
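The ten-fold split of the 400 images can be pictured with the following sketch. It is only an illustration of plain K-fold partitioning: the helper `k_fold_indices`, the shuffling, and the seed are assumptions, and the abstract does not say whether the authors shuffled or stratified the data.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for plain K-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffling is an assumption of this sketch
    folds = [indices[i::k] for i in range(k)]  # k disjoint folds covering all samples
    for i, test_idx in enumerate(folds):
        # Training set is every fold except the one held out for testing
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 400 samples and 10 folds, matching the sizes stated in the abstract
for train_idx, test_idx in k_fold_indices(400, k=10):
    assert len(train_idx) == 360 and len(test_idx) == 40
```

Each image is used for testing exactly once, and the reported accuracy would then be the average over the ten held-out folds.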


2021 ◽  
Vol 8 (6) ◽  
pp. 1287
Author(s):  
Adam Syarif Hidayatullah ◽  
Fitra Abdurrachman Bachtiar ◽  
Imam Cholissodin

The success of a company depends on how well it manages its human resources. One institution that manages human resources through Talent Management is the Badan Kepegawaian Daerah (BKD) of Malang city, which evaluates its employees annually after their work is completed. Evaluating only after the fact can lead to suboptimal work results, so employees performing below average need to be identified early, using a classification technique, so that they can be evaluated and suboptimal results minimized. This study uses the Nearest Centroid Neighbor Classifier Based on K Local Means Using Harmonic Mean Distance (LMKHNCN), a modification of K-Nearest Neighbor (KNN) that has been shown to perform better than the original KNN. F1-Score and accuracy are tested with K-Fold Cross Validation to examine the spread of accuracy, and the effect of normalization is also tested because previous research gives no information about normalization. The method yields good classification performance in this case, with accuracy and F1-Score reaching 98.8% and 98.1%, respectively.


Author(s):  
Marina Milosevic ◽  
Dragan Jankovic ◽  
Aleksandar Peulic

Abstract. In this paper, we present a system based on feature extraction techniques for detecting abnormal patterns in digital mammograms and thermograms. A comparative study of texture-analysis methods is performed for three image groups: mammograms from the Mammographic Image Analysis Society mammographic database, digital mammograms from a local database, and thermography images of the breast. We also present a procedure for automatically separating the breast region from the mammograms. Features computed from gray-level co-occurrence matrices are used to evaluate the effectiveness of the textural information possessed by mass regions. A total of 20 texture features are extracted from the region of interest. The ability of the feature set to differentiate abnormal from normal tissue is investigated using a support vector machine classifier, a Naive Bayes classifier, and a K-Nearest Neighbor classifier. Classification performance is evaluated with five-fold cross-validation and receiver operating characteristic analysis.


Author(s):  
Grassella Gunsyang ◽  
Ika Purnamasari ◽  
Fidia Deny Tisna Amijaya

The Neighbor Weighted K-Nearest Neighbor (NWKNN) algorithm is an extension of the K-Nearest Neighbor (KNN) algorithm that assigns a weight to each class to be classified. This study discusses classification with the NWKNN algorithm applied to premium payment status data. The aim is to find the optimal exponent (E) and neighborhood (K) values, as well as the accuracy of classifying premium payment status data at PT. Bumiputera, Samarinda City. The stages of the study are: determining E and K using k-fold cross validation, computing Euclidean distances, computing the weight and score of each class, taking the largest score to determine the classification result, and then computing the classification accuracy. The results show that the optimal values for classifying premium payment status at PT. Bumiputera, Samarinda City with NWKNN are K=3 and E=6, with an accuracy of 75%.
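The weighting and scoring stages listed above might look roughly like the sketch below. It assumes the class-weight form commonly given for NWKNN, weight = 1 / (class size / smallest class size)^(1/E), and, as a simplification, lets each of the K nearest neighbors contribute one unit to its class score; the abstract does not give the authors' exact scoring rule. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def nwknn_class_weights(labels, exponent):
    """Class weights that shrink as a class grows relative to the smallest class."""
    counts = Counter(labels)
    smallest = min(counts.values())
    return {c: 1.0 / (n / smallest) ** (1.0 / exponent) for c, n in counts.items()}


def nwknn_predict(train_x, train_y, query, k, exponent):
    """Predict by the largest weighted score among the k Euclidean-nearest neighbors."""
    weights = nwknn_class_weights(train_y, exponent)
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    scores = Counter()
    for _, label in nearest:
        scores[label] += weights[label]  # simplification: unit similarity per neighbor
    return max(scores, key=scores.get)


# Toy data for illustration only: class "a" is four times larger than class "b"
train_x = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
train_y = ["a", "a", "a", "a", "b"]
print(nwknn_class_weights(train_y, exponent=6))
print(nwknn_predict(train_x, train_y, query=(4, 4), k=1, exponent=6))
```

The smallest class always receives weight 1, and a larger exponent E pulls the weights of the bigger classes back toward 1, which makes the method behave more like plain KNN.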


2020 ◽  
Vol 37 (4) ◽  
pp. 563-569
Author(s):  
Dželila Mehanović ◽  
Jasmin Kevrić

Security is one of the most topical issues in the online world. Lists of security threats are constantly updated, and phishing websites are one such threat. In this work, we address the problem of classifying phishing websites. Three classifiers were used: K-Nearest Neighbor, Decision Tree, and Random Forest, together with feature selection methods from Weka. The achieved accuracy was 100%, and the number of features was reduced to seven. Reducing the number of features also reduced the time needed to build the models: for Random Forest, the time fell from an initial 2.88s and 3.05s for percentage split and 10-fold cross validation to 0.02s and 0.16s, respectively.


Author(s):  
Gede Aditra Pradnyana ◽  
I Komang Agus Suryantara ◽  
I Gede Mahendra Darmawiguna

An impression can be interpreted as a psychological feeling toward a product, and it plays an important role in decision making. Understanding data in the domain of impressions is therefore very useful. The objective of this research was to determine the performance of the K-Nearest Neighbors method in classifying the impressions of endek images, using K-Fold Cross Validation. The images were taken from three locations: CV. Artha Dharma, Agung Bali Collection, and Pengrajin Sri Rejeki. The image impressions were obtained by consulting an endek expert, Dr. D.A Tirta Ray, M.Si. Data mining was performed with the K-Nearest Neighbors method, which classifies new objects based on their attributes and on training samples that have already been classified. K-Fold Cross Validation testing yielded an accuracy of 91% for K values of 3, 4, 7, and 8 in K-Nearest Neighbors.


2019 ◽  
Vol 6 (2) ◽  
pp. 226-235
Author(s):  
Muhammad Rangga Aziz Nasution ◽  
Mardhiya Hayaty

Machine learning, a branch of computer science, has become a trend in recent times. Machine learning works by using data and algorithms to build models from the patterns in a data set. It also studies how a model, once built, can predict outputs based on existing patterns. Two types of machine learning methods can be used for sentiment analysis: supervised learning and unsupervised learning. This study compares two supervised classification algorithms, K-Nearest Neighbor and Support Vector Machine, by building a model with each algorithm on sentiment text. The comparison is made to determine which algorithm is better in terms of accuracy and processing time. The accuracy results show that Support Vector Machine is superior, with 89.70% without K-Fold Cross Validation and 88.76% with K-Fold Cross Validation. In processing time, K-Nearest Neighbor is superior, with 0.0160s without K-Fold Cross Validation and 0.1505s with K-Fold Cross Validation.


2016 ◽  
Vol 7 (4) ◽  
Author(s):  
Mochammad Yusa ◽  
Ema Utami ◽  
Emha T. Luthfi

Abstract. Readmission is associated with measures of the quality of patient care in hospitals. The many attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, and age, make the calculation of care quality complicated. Data mining classification techniques can address this problem. In this paper, three classifiers, Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes, are evaluated under various parameter settings using 10-Fold Cross Validation. Performance is evaluated in terms of Accuracy, Mean Absolute Error (MAE), and Kappa Statistic. The selected dataset consists of 47 attributes and 49,735 records. The results show that the k-NN classifier with k=100 performs better in terms of accuracy and Kappa Statistic, while Naive Bayes outperforms the other classifiers in terms of MAE.
Keywords: k-NN, naive bayes, diabetes, readmission

Abstrak. Readmission is associated with calculating the quality of patient care in hospitals. Differences in the attributes related to diabetic patients, such as medication, ethnicity, race, lifestyle, and age, make the quality calculation complicated. Data mining classification techniques can be a solution for this calculation. Classification is one of the data mining techniques that has developed significantly. In this study, the Decision Tree, k-Nearest Neighbor (k-NN), and Naive Bayes classification models are evaluated under various parameter settings on Accuracy, Mean Absolute Error (MAE), and Kappa Statistic using 10-Fold Cross Validation. The evaluated dataset has 47 attributes and 49,735 records. The results show that the best accuracy, MAE, and Kappa Statistic performance was obtained by the Naive Bayes model.
Keywords: k-NN, naive bayes, diabetes, readmission
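For readers unfamiliar with the Kappa Statistic named above, the sketch below assumes it refers to Cohen's kappa, which corrects the observed agreement between true and predicted labels for the agreement expected by chance. The function name and the labels are hypothetical and unrelated to the study's data.

```python
def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: (observed agreement - chance agreement) / (1 - chance agreement).

    Undefined when chance agreement equals 1 (a single label everywhere).
    """
    n = len(y_true)
    labels = set(y_true) | set(y_pred)
    observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the two marginal label frequencies, summed per label
    expected = sum((y_true.count(c) / n) * (y_pred.count(c) / n) for c in labels)
    return (observed - expected) / (1 - expected)


# Hypothetical labels for illustration only
print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))  # prints 0.5
```

A value of 1 means perfect agreement and a value near 0 means agreement no better than chance, which makes kappa stricter than raw accuracy on skewed class distributions.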


Teknika ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 96-103
Author(s):  
Mohammad Farid Naufal ◽  
Selvia Ferdiana Kusuma ◽  
Kevin Christian Tanus ◽  
Raynaldy Valentino Sukiwun ◽  
Joseph Kristiano ◽  
...  

The global Covid-19 pandemic, which emerged at the end of 2019, has become a major problem for every country in the world. Covid-19 is a virus that attacks the lungs and can cause death. Many Covid-19 patients have been treated in hospitals, so chest X-ray images of infected patients' lungs are available. Many studies have already classified chest X-ray images with a Convolutional Neural Network (CNN) to distinguish healthy lungs, Covid-19 infection, and other lung diseases, but no study has yet compared the performance of CNN with classical machine learning algorithms such as Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) to determine the gap in performance and required execution time. This study aims to compare the performance and execution time of the K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and CNN classification algorithms for detecting Covid-19 from chest X-ray images. Based on testing with 5-fold cross validation, CNN has the best average performance, with an accuracy of 0.9591, precision of 0.9592, recall of 0.9591, and F1 Score of 0.959, and an average execution time of 3102.562 seconds.

