Estimating the Posterior Probabilities Using the K-Nearest Neighbor Rule

In many pattern classification problems, an estimate of the posterior probabilities (rather than only a classification) is required. This is usually the case when some confidence measure in the classification is needed. In this article, we propose a new posterior probability estimator. The proposed estimator considers the K-nearest neighbors. It attaches a weight to each neighbor that contributes in an additive fashion to the posterior probability estimate. The weights corresponding to the K-nearest-neighbors (which add to 1) are estimated from the data using a maximum likelihood approach. Simulation studies confirm the effectiveness of the proposed estimator.
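
The additive weighting described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `weighted_knn_posterior`, the Euclidean distance, and the example weights are assumptions, and the maximum likelihood fitting of the weights is not shown (the weights are taken as given).

```python
import math

def weighted_knn_posterior(X_train, y_train, x, weights):
    """Posterior estimate as a weighted vote of the K nearest neighbors.

    weights[j] is the weight of the (j+1)-th nearest neighbor; K = len(weights).
    The weights are assumed to be non-negative and to sum to 1.
    """
    # Rank training points by Euclidean distance to the query x (assumed metric).
    order = sorted(range(len(X_train)), key=lambda i: math.dist(X_train[i], x))
    posterior = {label: 0.0 for label in set(y_train)}
    # The j-th nearest neighbor adds weights[j] to the class it belongs to.
    for w, i in zip(weights, order):
        posterior[y_train[i]] += w
    return posterior

X_train = [(0.0,), (1.0,), (2.0,), (10.0,)]
y_train = [0, 1, 0, 1]
weights = [0.5, 0.3, 0.2]  # hypothetical values; in the article these are estimated from data
print(weighted_knn_posterior(X_train, y_train, (0.1,), weights))
```

Because the weights sum to 1, the per-class values also sum to 1 and can be read directly as a posterior estimate.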

Tropical Balls and Its Applications to K Nearest Neighbor over the Space of Phylogenetic Trees

Mathematics ◽

10.3390/math9070779 ◽

2021 ◽

Vol 9 (7) ◽

pp. 779

Author(s):

Ruriko Yoshida

Keyword(s):

Supervised Learning ◽

Phylogenetic Trees ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

High Dimensional ◽

Learning Method ◽

Dimensional Vector ◽

K Nearest Neighbor ◽

K Nearest Neighbors

A tropical ball is a ball defined by the tropical metric over the tropical projective torus. In this paper we show several properties of tropical balls over the tropical projective torus and also over the space of phylogenetic trees with a given set of leaf labels. Then we discuss its application to the K nearest neighbors (KNN) algorithm, a supervised learning method used to classify a high-dimensional vector into given categories by looking at a ball centered at the vector, which contains K vectors in the space.
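
As a rough illustration of KNN under the tropical metric, the sketch below uses the usual definition d_tr(u, v) = max_i(u_i - v_i) - min_i(u_i - v_i), which is unchanged when a constant is added to every coordinate of a vector. The function names and the toy data are assumptions for illustration and are not taken from the paper.

```python
from collections import Counter

def tropical_distance(u, v):
    """Tropical metric: the spread (max minus min) of the coordinate differences."""
    diffs = [a - b for a, b in zip(u, v)]
    return max(diffs) - min(diffs)

def tropical_knn(points, labels, query, k):
    """Majority vote among the k points closest to query in the tropical metric."""
    ranked = sorted(range(len(points)),
                    key=lambda i: tropical_distance(points[i], query))
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two groups of vectors.
points = [(0, 0, 0), (0, 1, 0), (0, 5, 5), (0, 6, 5)]
labels = ["a", "a", "b", "b"]
print(tropical_knn(points, labels, (0, 0, 1), k=3))
```

The k points used in the vote are exactly those inside the smallest tropical ball centered at the query that contains k training vectors.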

Analisis Perbandingan Algoritma Klasifikasi Citra Chest X-ray Untuk Deteksi Covid-19 [Comparative Analysis of Chest X-ray Image Classification Algorithms for Covid-19 Detection]

Teknika ◽

10.34148/teknika.v10i2.331 ◽

2021 ◽

Vol 10 (2) ◽

pp. 96-103

Author(s):

Mohammad Farid Naufal ◽

Selvia Ferdiana Kusuma ◽

Kevin Christian Tanus ◽

Raynaldy Valentino Sukiwun ◽

Joseph Kristiano ◽

...

Keyword(s):

Neural Network ◽

Support Vector Machine ◽

Cross Validation ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

Support Vector ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

X Ray ◽

Chest X Ray

The global Covid-19 pandemic that emerged at the end of 2019 has become a major problem for every country in the world. Covid-19 is a virus that attacks the lungs and can cause death. Many Covid-19 patients have been treated in hospitals, so chest X-ray images of infected patients' lungs are available. Many studies have already classified chest X-ray images using a Convolutional Neural Network (CNN) to distinguish healthy lungs, lungs infected with Covid-19, and other lung diseases, but no study has yet compared the performance of CNN with classical machine learning algorithms such as Support Vector Machine (SVM) and K-Nearest Neighbor (KNN) to determine the gap in performance and execution time. This study aims to compare the performance and execution time of the K-Nearest Neighbors (KNN), Support Vector Machine (SVM), and CNN classification algorithms for detecting Covid-19 from chest X-ray images. Based on tests using 5-fold cross-validation, CNN had the best average performance, with accuracy 0.9591, precision 0.9592, recall 0.9591, and F1 score 0.959, and an average execution time of 3102.562 seconds.

Methods based on k-nearest neighbor regression in the prediction of basal area diameter distribution

Canadian Journal of Forest Research ◽

10.1139/x98-085 ◽

1998 ◽

Vol 28 (8) ◽

pp. 1107-1115 ◽

Cited By ~ 61

Author(s):

Matti Maltamo ◽

Annika Kangas

Keyword(s):

Nearest Neighbor ◽

Basal Area ◽

Nearest Neighbors ◽

Volume Estimation ◽

Diameter Distribution ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

Stand Growth ◽

Weighted Averages ◽

Growing Stock

In the Finnish compartmentwise inventory systems, growing stock is described with means and sums of tree characteristics, such as mean height and basal area, by tree species. In the calculations, growing stock is described in a treewise manner using a diameter distribution predicted from stand variables. The treewise description is needed for several reasons, e.g., for predicting log volumes or stand growth and for analyzing the forest structure. In this study, methods for predicting the basal area diameter distribution based on k-nearest neighbor (k-nn) regression are compared with methods based on parametric distributions. In the k-nn method, the predicted values for the variables of interest are obtained as weighted averages of the values of neighboring observations. Using k-nn based methods, the basal area diameter distribution of a stand is predicted with a weighted average of the distributions of the k nearest neighbors. The methods tested in this study include weighted averages of (i) Weibull distributions of the k nearest neighbors, (ii) distributions of the k nearest neighbors smoothed with the kernel method, and (iii) empirical distributions of the k nearest neighbors. These methods are compared for the accuracy of stand volume estimation, stand structure description, and stand growth prediction. Methods based on k-nn regression proved to give a more accurate description of the stand than the parametric methods.
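
The weighted-average prediction described above might look like the following sketch. It covers only the empirical-distribution variant, and the inverse-distance weighting, the Euclidean distance on stand variables, and all names and toy values are assumptions; the abstract does not state how the weights are chosen.

```python
import math

def knn_weighted_distribution(stand_vars, distributions, query, k):
    """Predict a diameter-class distribution for a stand as the weighted
    average of the distributions of its k nearest reference stands."""
    dists = [math.dist(v, query) for v in stand_vars]
    nearest = sorted(range(len(stand_vars)), key=lambda i: dists[i])[:k]
    # Assumed weighting: inverse distance, normalized to sum to 1.
    raw = [1.0 / (dists[i] + 1e-9) for i in nearest]
    total = sum(raw)
    weights = [w / total for w in raw]
    n_bins = len(distributions[0])
    return [sum(w * distributions[i][b] for w, i in zip(weights, nearest))
            for b in range(n_bins)]

# Hypothetical reference stands: one stand variable each, two diameter classes.
stand_vars = [(0.0,), (2.0,), (100.0,)]
distributions = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
print(knn_weighted_distribution(stand_vars, distributions, (1.0,), k=2))
```

If each neighbor distribution sums to 1, the prediction does too, since the weights are normalized.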

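Klasifikasi Sekolah Menengah Pertama/Sederajat Wilayah Bireuen Menggunakan Algoritma K-Nearest Neighbors Berbasis Web [Web-Based Classification of Junior High Schools and Equivalents in the Bireuen Region Using the K-Nearest Neighbors Algorithm]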

Computer Engineering Science and System Journal ◽

10.24114/cess.v5i1.14962 ◽

2020 ◽

Vol 5 (1) ◽

pp. 33

Author(s):

Rozzi Kesuma Dinata ◽

Fajriana Fajriana ◽

Zulfa Zulfa ◽

Novia Hasdyna

Keyword(s):

Euclidean Distance ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

K Nearest Neighbor ◽

K Nearest Neighbors

This study implements the K-Nearest Neighbor algorithm to classify junior high schools (Sekolah Menengah Pertama/SMP and equivalent) according to prospective students' preferences. The aim is to make it easier for users to find a junior high school based on 8 school criteria: accreditation, room facilities, sports facilities, laboratories, extracurricular activities, cost, grade levels, and study hours. The data were obtained from the Dinas Pendidikan Pemuda dan Olahraga (Education, Youth and Sports Office) of Bireuen Regency. Using K-NN with Euclidean distance and k=3, the study obtained a precision of 63.67%, a recall of 68.95%, and an accuracy of 79.33%.

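PREDIKSI KELULUSAN MAHASISWA MAGISTER TEKNIK INFORMATIKA UNIVERSITAS AMIKOM YOGYAKARTA MENGGUNAKAN METODE K-NEAREST NEIGHBOR [Predicting the Graduation of Master of Informatics Engineering Students at Universitas AMIKOM Yogyakarta Using the K-Nearest Neighbor Method]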

Respati ◽

10.35842/jtir.v13i2.260 ◽

2018 ◽

Vol 13 (2) ◽

Author(s):

Eri Sasmita Susanto ◽

Kusrini Kusrini ◽

Hanif Al Fatta

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbors ◽

Training Data ◽

K Nearest Neighbor ◽

Process Data ◽

K Nearest Neighbors ◽

Testing Data ◽

Estimation Scheme ◽

Student Graduation ◽

Feasibility Test

This research focuses on testing the feasibility of predicting student graduation at Universitas AMIKOM Yogyakarta. The authors chose the K-Nearest Neighbors (K-NN) algorithm because it can process numerical data and does not require a complicated iterative parameter estimation scheme, which means it can be applied to large datasets. The input of the system is sample data consisting of student data from 2014-2015. Testing in this research uses two data sets: testing data and training data. The criteria used in this study are the grade point index (IP) for semesters 1-4, credits (SKS) earned, and graduation status. The output of the system is a graduation prediction divided into two categories: on time and not on time. The test results show that k = 14 and k-fold = 5 give the best performance in predicting student graduation with the K-Nearest Neighbor method using the achievement index of 4 semesters, with accuracy = 98.46%, precision = 99.53%, and recall = 97.64%. Keywords: K-Nearest Neighbors Algorithm, Graduation Prediction, Testing Data, Training Data

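KLASIFIKASI JENIS BUAH APEL DENGAN METODE K-NEAREST NEIGHBORS DENGAN EKSTRAKSI FITUR HSV DAN LBP [Classification of Apple Varieties with the K-Nearest Neighbors Method Using HSV and LBP Feature Extraction]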

Jurnal Sisfokom (Sistem Informasi dan Komputer) ◽

10.32736/sisfokom.v8i1.610 ◽

2019 ◽

Vol 8 (1) ◽

Author(s):

Novan Wijaya

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbors ◽

K Nearest Neighbor ◽

K Nearest Neighbors

Apples are a superior type of fruit that is very popular and widely consumed. Apples have many varieties that can be distinguished by the color and shape of the fruit. In this study, Hue Saturation Value (HSV) and Local Binary Pattern (LBP) features are used to extract color and shape features, which then serve as the characteristics of the color and shape of the apples under study. The K-Nearest Neighbor (K-NN) method is an artificial intelligence method used in this study to classify the values obtained from HSV and LBP feature extraction. The data comprise 800 images: 600 training images and 200 test images. Overall, the evaluation of the K-Nearest Neighbor method shows an average precision of 94%, recall of 100%, and accuracy of 94%. Keywords: Hue Saturation Value, Local Binary Pattern, K-Nearest Neighbor

Classification of Mango Fruit Quality Based on Texture Characteristics of GLCM (Gray Level Co-Occurrence Matrices) with Algorithm K-NN (K-Nearest Neighbors)

Techno (Jurnal Fakultas Teknik Universitas Muhammadiyah Purwokerto) ◽

10.30595/techno.v20i1.3816 ◽

2019 ◽

Vol 20 (1) ◽

pp. 31

Author(s):

Wahyu Wijaya Widiyanto ◽

Eko Purwanto ◽

Kusrini Kusrini

Keyword(s):

Computer Vision ◽

Nearest Neighbor ◽

Nearest Neighbors ◽

Gray Level ◽

Nearest Neighbour ◽

K Nearest Neighbor ◽

Mango Fruit ◽

K Nearest Neighbors ◽

Texture Characteristics

Classifying mango quality conventionally by human eye has several weaknesses: it requires more labor for sorting, perceptions of mango quality differ between people, and human consistency in judging quality cannot be guaranteed because people become fatigued. This study aims to classify mango quality into three grades, namely Super, A, and B, using computer vision and the k-Nearest Neighbor algorithm. Tests using k = 9 neighbors show an accuracy of 88.88%. Keywords: Classification, GLCM, K-Nearest Neighbour, Mango

The k conditional nearest neighbor algorithm for classification and class probability estimation

PeerJ Computer Science ◽

10.7717/peerj-cs.194 ◽

2019 ◽

Vol 5 ◽

pp. e194 ◽

Cited By ~ 2

Author(s):

Hyukjun Gweon ◽

Matthias Schonlau ◽

Stefan H. Steiner

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbors ◽

Training Data ◽

Data Sets ◽

Posterior Probabilities ◽

Bayes Classifier ◽

K Nearest Neighbor ◽

Benchmark Data ◽

Nonparametric Classification ◽

Class Probability

The k nearest neighbor (kNN) approach is a simple and effective nonparametric algorithm for classification. One of the drawbacks of kNN is that the method can only give coarse estimates of class probabilities, particularly for low values of k. To avoid this drawback, we propose a new nonparametric classification method based on nearest neighbors conditional on each class: the proposed approach calculates the distance between a new instance and the kth nearest neighbor from each class, estimates posterior probabilities of class memberships using the distances, and assigns the instance to the class with the largest posterior. We prove that the proposed approach converges to the Bayes classifier as the size of the training data increases. Further, we extend the proposed approach to an ensemble method. Experiments on benchmark data sets show that both the proposed approach and the ensemble version of the proposed approach on average outperform kNN, weighted kNN, probabilistic kNN and two similar algorithms (LMkNN and MLM-kHNN) in terms of the error rate. A simulation shows that the proposed approach (kCNN) may be useful for estimating posterior probabilities when the class distributions overlap.
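
The class-conditional idea can be sketched as below. The abstract does not give the formula that turns distances into posteriors, so the rule used here (score each class by the inverse of its kth-neighbor distance raised to the feature dimension, then normalize) is an assumption chosen for illustration, as are the function name and the toy data.

```python
import math

def kcnn_posterior(X, y, x, k):
    """Posterior sketch from the distance to the k-th nearest neighbor
    within each class (hypothetical scoring rule, see lead-in)."""
    dim = len(x)
    scores = {}
    for c in set(y):
        # Sorted distances from x to the training points of class c only.
        d = sorted(math.dist(p, x) for p, lab in zip(X, y) if lab == c)
        kth = d[min(k, len(d)) - 1]  # fall back to the farthest point if class c has fewer than k
        scores[c] = 1.0 / (kth ** dim + 1e-12)  # small constant avoids division by zero
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}

X = [(0.0,), (1.0,), (4.0,), (5.0,)]
y = [0, 0, 1, 1]
posterior = kcnn_posterior(X, y, (2.0,), k=1)
print(posterior, max(posterior, key=posterior.get))
```

Unlike plain kNN vote fractions, which can only take k + 1 distinct values per class, the distance-based scores vary continuously with the query.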

Fast Approximate Complete-data k-nearest-neighbor Estimation

Austrian Journal of Statistics ◽

10.17713/ajs.v49i2.907 ◽

2020 ◽

Vol 49 (2) ◽

pp. 18-30

Author(s):

Alejandro Murua ◽

Nicolas Wicker

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbors ◽

Complete Data ◽

Fast Method ◽

K Nearest Neighbor ◽

K Nearest Neighbors ◽

Data Set ◽

Neighbor Graph ◽

Very Large Datasets ◽

Nearest Neighbor Graph

We introduce a fast method to estimate the complete-data set of k-nearest-neighbors. This is equivalent to finding an estimate of the k-nearest-neighbor graph of the data. The method relies on random normal projections. The k-nearest-neighbors are estimated by sorting points along a number of random lines. For very large datasets, the method is quasi-linear in the data size. As an application, we show that the intrinsic dimension of a manifold can be reliably estimated from the estimated set of k-nearest-neighbors in time about two orders of magnitude faster than when using the exact set of k-nearest-neighbors.
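
One way to picture the projection-and-sort idea is the sketch below. It is not the authors' algorithm: the candidate `window` around each point in the sorted order, the final re-ranking by exact distance, and all names and defaults are assumptions made for illustration.

```python
import math
import random

def approx_knn_graph(points, k, n_lines=5, window=3, seed=0):
    """Approximate k-nearest-neighbor graph from random normal projections."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    candidates = [set() for _ in range(n)]
    for _ in range(n_lines):
        # Random line with a standard normal direction vector.
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        proj = [sum(a * b for a, b in zip(p, direction)) for p in points]
        order = sorted(range(n), key=lambda i: proj[i])
        # Points adjacent in the sorted order become neighbor candidates.
        for pos, i in enumerate(order):
            lo, hi = max(0, pos - window), min(n, pos + window + 1)
            candidates[i].update(j for j in order[lo:hi] if j != i)
    # Keep the k closest candidates by exact Euclidean distance.
    return [sorted(c, key=lambda j: math.dist(points[i], points[j]))[:k]
            for i, c in enumerate(candidates)]

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
print(approx_knn_graph(points, k=1))
```

Each line costs one sort, so the candidate search avoids comparing every pair of points; a larger `window` or more lines trades speed for accuracy.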

Klasifikasi Berita Clickbait Menggunakan K-Nearest Neighbor (KNN) [Clickbait News Classification Using K-Nearest Neighbor (KNN)]

JOINS (Journal of Information System) ◽

10.33633/joins.v5i2.3705 ◽

2020 ◽

Vol 5 (2) ◽

pp. 230-239

Author(s):

Riska Sagita ◽

Ultach Enri ◽

Aji Primajaya

Keyword(s):

Nearest Neighbor ◽

Nearest Neighbors ◽

K Nearest Neighbor ◽

K Nearest Neighbors

Clickbait has become one way to make money by increasing visitor traffic. The practice of clickbait has now spread into journalism, while online news media operate differently from print media. As with other online media, clickbait strongly affects news providers because of readers' curiosity and the difficulty readers have in telling clickbait news from non-clickbait news. News sites rely heavily on clickbait, using misleading headlines to attract readers. To address this problem, this study classifies clickbait news using the K-Nearest Neighbors (K-NN) method. The best result was obtained at k = 11 using scenario 1 of the data split, with 800 data items and 200 test data items, yielding an accuracy of 71%, precision of 72%, and recall of 71%. This shows that clickbait news can be classified using K-Nearest Neighbor.
