IMPLEMENTASI METODE K-NEAREST NEIGHBOR (KNN) UNTUK SELEKSI CALON KARYAWAN BARU (Studi Kasus : BFI Finance Surabaya)

2020 ◽  
Vol 4 (2) ◽  
pp. 14-20
Author(s):  
Adhitya Rahmat Dian Nugraha ◽  
Karina Auliasari ◽  
Yosep Agus Pranoto

The selection process is an important means of choosing the best candidates. The selection process at BFI Finance Surabaya comprises several stages: CV screening, a psychological test, an interview, an offering letter, a medical check, and finally contract signing. Problems that arise in this process include the large volume of incoming applications, similar assessments among candidates, candidates who miss the interview call, do not complete the psychological test, have already been accepted elsewhere, or display a bad attitude. As a result, the company's selection process is perceived as time-consuming and ineffective, which hampers employee recruitment. These problems motivated the development of a web-based candidate selection program that simplifies classifying new-employee data into the categories pass and fail. The system is expected to help HRD process employee data precisely and accurately. Based on the system testing performed, manual calculation in Microsoft Excel agreed with the system's calculation at a rate of 100%. The K-Nearest Neighbor algorithm with parameter value K = 7 and the Euclidean distance metric achieved 91% accuracy, 87% precision, and 100% recall.
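The core classification step described above can be sketched as follows. This is a minimal illustration with hypothetical candidate features (psychological test, interview, and CV scores), not the authors' actual implementation or data:

```python
from collections import Counter
import math

def euclidean(a, b):
    # straight-line distance between two feature vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_classify(train, labels, query, k=7):
    # rank training samples by distance to the query, then take a
    # majority vote among the k nearest labels
    order = sorted(range(len(train)), key=lambda i: euclidean(train[i], query))
    top = [labels[i] for i in order[:k]]
    return Counter(top).most_common(1)[0][0]

# hypothetical candidate scores: [psych test, interview, CV score]
train = [[80, 85, 90], [55, 60, 50], [75, 80, 85], [40, 45, 50],
         [90, 88, 92], [50, 52, 48], [85, 79, 81], [45, 50, 55]]
labels = ["pass", "fail", "pass", "fail", "pass", "fail", "pass", "fail"]

decision = knn_classify(train, labels, [82, 84, 88], k=7)
```

With K = 7, a new candidate's label is decided by the majority class among its seven nearest labeled candidates.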

2018 ◽  
Vol 2018 ◽  
pp. 1-17 ◽  
Author(s):  
Hyung-Ju Cho

We investigate the k-nearest neighbor (kNN) join in road networks, which determines the k-nearest neighbors (NNs) from a dataset S for every object in another dataset R. The kNN join is a primitive operation widely used in many data mining applications. However, it is expensive because it combines the kNN query with the join operation, and most existing methods assume the Euclidean distance metric. We instead consider processing kNN joins in road networks, where the distance between two points is the length of the shortest path connecting them. We propose a shared execution-based approach called the group-nested loop (GNL) method, which efficiently evaluates kNN joins in road networks by exploiting grouping and shared execution. The GNL method can be easily implemented using existing kNN query algorithms. Extensive experiments on several real-life roadmaps confirm the superior performance and effectiveness of the proposed method across a wide range of problem settings.
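The grouping-and-shared-execution idea can be sketched as follows. This is a simplified illustration (not the paper's GNL algorithm): objects in R that lie on the same network node share a single shortest-path computation, so one kNN query serves the whole group:

```python
import heapq
from collections import defaultdict

def dijkstra(graph, src):
    # single-source shortest-path lengths over a weighted road network
    dist = {src: 0.0}
    pq = [(0.0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

def grouped_knn_join(graph, R, S, k):
    # group R objects by node, run one network-distance kNN query per
    # group, and share the answer among all members of that group
    groups = defaultdict(list)
    for obj, node in R:
        groups[node].append(obj)
    result = {}
    for node, members in groups.items():
        dist = dijkstra(graph, node)
        knn = sorted(S, key=lambda s: dist.get(s[1], float("inf")))[:k]
        for obj in members:
            result[obj] = [name for name, _ in knn]
    return result

# toy road network: adjacency list with edge lengths
graph = {
    "a": [("b", 1.0), ("c", 4.0)],
    "b": [("a", 1.0), ("c", 2.0), ("d", 5.0)],
    "c": [("a", 4.0), ("b", 2.0), ("d", 1.0)],
    "d": [("b", 5.0), ("c", 1.0)],
}
R = [("r1", "a"), ("r2", "a"), ("r3", "d")]   # query objects at nodes
S = [("s1", "b"), ("s2", "c"), ("s3", "d")]   # data objects at nodes

joined = grouped_knn_join(graph, R, S, k=2)
```

Here r1 and r2 sit on the same node, so only two Dijkstra runs serve three query objects; the savings grow with group size.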


2019 ◽  
Vol 2019 ◽  
pp. 1-11
Author(s):  
Zhiyuan Wang ◽  
Shouwen Ji ◽  
Bowen Yu

Short-term traffic volume forecasting is one of the most essential elements of an Intelligent Transportation System (ITS), providing predictions of traffic conditions for traffic management and control applications. Among the substantial body of previous forecasting approaches, K nearest neighbor (KNN) is a nonparametric, data-driven method popular for its conciseness, interpretability, and real-time performance. However, previous related research has rarely addressed the limitations of Euclidean distance or forecasting under asymmetric loss. This research aims to fill these gaps. The paper reconstructs the Euclidean distance to overcome its limitations and proposes a KNN forecasting algorithm with asymmetric loss. Correspondingly, an asymmetric loss index, the Imbalanced Mean Squared Error (IMSE), is proposed to test the effectiveness of the newly designed algorithm. The effect of the Loess technique and suitable parameter values for the dynamic KNN method are also tested. Compared with the traditional KNN algorithm, the proposed algorithm reduces the IMSE index by more than 10%, demonstrating its effectiveness when the costs of over- and under-prediction differ notably. This research expands the applicability of the KNN method in short-term traffic volume forecasting and provides a workable approach to forecasting under asymmetric loss.
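The idea of an asymmetric loss index and a KNN forecast biased against costly under-prediction can be sketched as below. Both the IMSE weighting form and the `shift` hedge are illustrative assumptions, not the paper's exact formulas:

```python
def imse(y_true, y_pred, penalty=2.0):
    # assumed Imbalanced MSE: squared errors where under-predictions
    # (pred < true) are weighted by `penalty`, over-predictions by 1
    total = 0.0
    for t, p in zip(y_true, y_pred):
        w = penalty if p < t else 1.0
        total += w * (t - p) ** 2
    return total / len(y_true)

def knn_forecast(history, pattern, k=3, shift=0.0):
    # match the most recent observations against past windows, average
    # the values that followed the k closest windows, then shift the
    # estimate upward to hedge against costly under-prediction
    m = len(pattern)
    candidates = []
    for i in range(len(history) - m):
        window = history[i:i + m]
        d = sum((a - b) ** 2 for a, b in zip(window, pattern)) ** 0.5
        candidates.append((d, history[i + m]))
    candidates.sort(key=lambda x: x[0])
    avg = sum(v for _, v in candidates[:k]) / k
    return avg + shift

# hypothetical 5-minute traffic volumes
history = [100, 120, 140, 110, 100, 125, 145, 115, 105, 130]
pred = knn_forecast(history, pattern=[105, 130], k=3, shift=5.0)
```

The `penalty` knob makes under-prediction twice as costly here; tuning `shift` against IMSE on held-out data is one simple way to trade bias for asymmetric cost.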


2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Miftachul Ulum ◽  
Ahmad Fiqhi Ibadillah ◽  
Adi Kurniawan Saputro

A manual attendance system is impractical for recapitulating attendance records to a central server, because the data must be processed by hand, leaving many opportunities for human error. Attendance data cannot be uploaded directly to the server, so an integrated attendance system is needed to minimize errors and fraud. In this study, a device was designed for an RFID-based system that identifies ID data as input for a database. The authors use the K-Nearest Neighbor method for classification; clock-in and clock-out times serve as input for the test and training data, which are obtained from ID readings by an RFID RC522 embedded in the attendance module. Built around a NodeMCU for Internet of Things connectivity and as the driver of all internal components, the device can be realized in a simple and attractive form. Testing of the system and attendance module yielded an average time below 10 seconds per attendance transaction, and classification using the K-Nearest Neighbor method with Euclidean distance achieved consistently high accuracy of 66.67% to 100%, so the attendance system and module can be said to run well and effectively.


Petir ◽  
2020 ◽  
Vol 13 (1) ◽  
pp. 80-85
Author(s):  
Yohannes Yohannes ◽  
Muhammad Ezar Al Rivan

Mammal species can be classified from the face, since every mammal's face has a different shape. A Histogram of Oriented Gradients (HOG) is used to extract shape features from the mammal's face. Before this step, Global Contrast Saliency is applied so the image focuses on the object, which yields better shape features. Classification then uses k-Nearest Neighbor (k-NN) with Euclidean and cityblock distances for k = 3, 5, 7, and 9. The results show that cityblock distance with k = 9 outperforms Euclidean distance at every k. Tigers are classified best under all distances, while sheep are classified poorly.
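The distance comparison at the core of the study can be sketched as follows. The feature vectors here are tiny made-up stand-ins for HOG descriptors of mammal faces, not the paper's data:

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cityblock(a, b):
    # L1 / Manhattan distance: sum of absolute coordinate differences
    return sum(abs(x - y) for x, y in zip(a, b))

def knn(train, labels, query, k, dist):
    # majority vote among the k training vectors closest under `dist`
    order = sorted(range(len(train)), key=lambda i: dist(train[i], query))
    votes = {}
    for i in order[:k]:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)

# toy stand-ins for HOG feature vectors (hypothetical values)
feats = [[0.9, 0.1, 0.2], [0.8, 0.2, 0.1], [0.1, 0.9, 0.8], [0.2, 0.8, 0.9]]
labels = ["tiger", "tiger", "sheep", "sheep"]
q = [0.85, 0.15, 0.15]

decision = knn(feats, labels, q, k=3, dist=cityblock)
```

Swapping `dist=cityblock` for `dist=euclidean` is the only change needed to reproduce the paper's two distance settings.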


2020 ◽  
Vol 5 (1) ◽  
pp. 33
Author(s):  
Rozzi Kesuma Dinata ◽  
Fajriana Fajriana ◽  
Zulfa Zulfa ◽  
Novia Hasdyna

This study implements the K-Nearest Neighbor algorithm to classify junior high schools (SMP) and their equivalents according to prospective students' interests. The aim is to make it easier for users to find a junior high school based on eight school criteria: accreditation, room facilities, sports facilities, laboratories, extracurricular activities, cost, grade levels, and study hours. The data used in this study were obtained from the Bireuen District Office of Education, Youth, and Sports. Using K-NN with the Euclidean distance approach and k = 3, the study obtained 63.67% precision, 68.95% recall, and 79.33% accuracy.
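The precision, recall, and accuracy figures quoted throughout these abstracts follow the standard confusion-matrix definitions, sketched here on hypothetical predictions (not the study's data):

```python
def confusion_counts(y_true, y_pred, positive):
    # tally true/false positives and negatives for one target class
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    return tp, fp, fn, tn

def precision_recall_accuracy(y_true, y_pred, positive):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp)   # of predicted positives, how many were right
    recall = tp / (tp + fn)      # of actual positives, how many were found
    accuracy = (tp + tn) / len(y_true)
    return precision, recall, accuracy

# hypothetical k-NN predictions vs. ground truth for one school category
truth = ["A", "A", "B", "A", "B", "B"]
preds = ["A", "B", "B", "A", "A", "B"]
p, r, a = precision_recall_accuracy(truth, preds, positive="A")
```

For multi-class problems like the eight-criteria school classification, these per-class scores are typically averaged across classes.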


Author(s):  
Mahinda Mailagaha Kumbure ◽  
Pasi Luukka

The fuzzy k-nearest neighbor (FKNN) algorithm, one of the most well-known and effective supervised learning techniques, has often been used in data classification problems but rarely in regression settings. This paper introduces a new, more general fuzzy k-nearest neighbor regression model. The generalization rests on using the Minkowski distance instead of the usual Euclidean distance. The Euclidean distance is often not the optimal choice for practical problems, and better results can be obtained by generalizing it. Using the Minkowski distance allows the proposed method to obtain more reasonable nearest neighbors to the target sample. Another key advantage is that the nearest neighbors are weighted by fuzzy weights based on their similarity to the target sample, leading to a more accurate prediction through a weighted average. The performance of the proposed method is tested on eight real-world datasets from different fields and benchmarked against the k-nearest neighbor method and three other state-of-the-art regression methods. The Manhattan distance- and Euclidean distance-based FKNNreg methods are also implemented, and the results are compared. The empirical results show that the proposed Minkowski distance-based fuzzy regression (Md-FKNNreg) method outperforms the benchmarks and can be a good algorithm for regression problems. In particular, the Md-FKNNreg model gave the significantly lowest overall average root mean square error (0.0769) of all the regression methods used. As a special case of the Minkowski distance, the Manhattan distance yielded the optimal conditions for Md-FKNNreg and achieved the best performance on most of the datasets.
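A fuzzy-weighted kNN regressor with a Minkowski distance can be sketched as below. The inverse-distance fuzzy weighting with fuzzifier m follows the classic FKNN scheme; the exact weighting in the paper's Md-FKNNreg model may differ:

```python
def minkowski(a, b, p):
    # L^p distance; p=1 gives Manhattan, p=2 gives Euclidean
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)

def fknn_regress(X, y, query, k=3, p=1.0, m=2.0, eps=1e-9):
    # fuzzy-weighted kNN regression: neighbors nearer to the query get
    # larger fuzzy weights w_i ~ 1 / d_i^(2/(m-1)); the prediction is
    # the weighted average of the neighbors' target values
    order = sorted(range(len(X)), key=lambda i: minkowski(X[i], query, p))
    idx = order[:k]
    w = [1.0 / (minkowski(X[i], query, p) ** (2.0 / (m - 1)) + eps) for i in idx]
    return sum(wi * y[i] for wi, i in zip(w, idx)) / sum(w)

# toy 1-D regression data (hypothetical)
X = [[1.0], [2.0], [3.0], [10.0]]
y = [1.0, 2.0, 3.0, 10.0]

pred = fknn_regress(X, y, [2.1], k=3, p=1.0)  # Manhattan (p=1) special case
```

Varying `p` reproduces the paper's comparison: `p=1` and `p=2` recover the Manhattan- and Euclidean-based FKNNreg variants as special cases of the Minkowski family.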


Author(s):  
Nashrulloh Khoiruzzaman ◽  
Rima Dias Ramadhani ◽  
Apri Junaidi

Cardiovascular disease is caused by abnormalities of the heart. It can affect people from young to old age, and 13 factors influence it: Age, Sex, Chest pain, Trestbps, Chol, Fbs, Restecg, Thalach, Exang, Oldpeak, Slope, Ca, and Thal. Cardiovascular disease takes many forms, including coronary heart disease, heart failure, high blood pressure, and low blood pressure. This study therefore aims to classify cardiovascular disease using the backpropagation algorithm and the K-nearest neighbor algorithm. The first step computes the Euclidean distance in K-NN to find the k nearest neighbors and assigns the category held by the majority of the chosen k, and trains new weights for the backpropagation algorithm to obtain the expected output values. System testing covers accuracy as a function of the value of K, K-fold cross validation, and the effect of the hidden layer. The results show that the backpropagation algorithm achieved 64% accuracy, 62% precision, and 64% recall, while the K-nearest neighbor algorithm achieved 66% accuracy, 61% precision, and 66% recall. The hidden layer strongly affects backpropagation's classification of cardiovascular disease: consistent with the experiments, when the number of hidden units is small the resulting scores are also low, yet when the number of hidden units is large the accuracy can even drop.


2020 ◽  
Vol 39 (4) ◽  
pp. 5359-5368
Author(s):  
B Ratna Raju ◽  
G.N Swamy ◽  
K. Padma Raju

Colorectal cancer has caused an increasing number of deaths in recent years; diagnosing it early makes the patient safer to treat. To identify and treat this type of cancer, colonoscopy is commonly applied. Feature selection-based methods are proposed to choose a subset of variables and attain better prediction. An Imperialist Competitive Algorithm (ICA) is proposed to select features for the identification and treatment of colon cancer. A K-Nearest Neighbor (KNN) classifier is also used, retaining a minimal Euclidean distance between the query feature vector and the prototype training data. Experimental results show that the proposed method is superior to other methods on its performance metrics, achieving better accuracy.


2020 ◽  
Vol 9 (1) ◽  
pp. 326-338 ◽  
Author(s):  
Arif Ridho Lubis ◽  
Muharman Lubis ◽  
Al- Khowarizmi

K-Nearest Neighbor (KNN) is a method for classifying objects based on the learning data closest to the object, comparing previous and current data. In the learning process, KNN calculates the distance to the nearest neighbors with the Euclidean distance formula, while other work has optimized the distance computation by comparing it with similar formulas to obtain optimal results. This study discusses the Euclidean distance calculation in KNN compared with the normalized Euclidean, Manhattan, and normalized Manhattan distances, seeking the optimal result when finding the distance to the nearest neighbor.
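The four distance variants under comparison can be sketched as follows; the normalization here rescales each feature by its observed range (one common convention, assumed for illustration), which keeps a large-scale feature from dominating the neighbor ordering:

```python
def euclidean(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))

def feature_ranges(data):
    # per-feature (max - min), used to put features on a comparable scale
    cols = list(zip(*data))
    return [max(c) - min(c) or 1.0 for c in cols]

def normalized(dist_fn, ranges):
    # wrap a distance so each coordinate is rescaled by its range first;
    # applying it to euclidean/manhattan yields the normalized variants
    def d(a, b):
        a2 = [x / r for x, r in zip(a, ranges)]
        b2 = [x / r for x, r in zip(b, ranges)]
        return dist_fn(a2, b2)
    return d

# hypothetical data where the second feature has a much larger scale
data = [[1.0, 100.0], [2.0, 900.0], [3.0, 500.0]]
ranges = feature_ranges(data)
q = [1.5, 500.0]

# the large-scale second feature dominates the raw distances ...
raw = [euclidean(q, x) for x in data]
# ... but contributes proportionally after normalization
norm = [normalized(euclidean, ranges)(q, x) for x in data]
```

On this toy data the nearest neighbor under the raw distance differs from the one under the normalized distance, which is exactly the effect such a comparison study measures.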

