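Perbandingan Performansi Model pada Algoritma K-NN terhadap Klasifikasi Berita Fakta Hoaks Tentang Covid-19 [Performance Comparison of Models in the K-NN Algorithm for Classifying Fact and Hoax News about Covid-19]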

Wahyu Hidayat; Ema Utami; Ahmad Fikri Iskandar; Anggit Dwi Hartanto; Agung Budi Prasetio

doi:10.29408/edumatic.v5i2.3664

EDUMATIC Jurnal Pendidikan Informatika ◽

10.29408/edumatic.v5i2.3664 ◽

2021 ◽

Vol 5 (2) ◽

pp. 167-176

Author(s):

Wahyu Hidayat ◽

Ema Utami ◽

Ahmad Fikri Iskandar ◽

Anggit Dwi Hartanto ◽

...

Keyword(s):

Nearest Neighbor ◽

Performance Comparison ◽

Test Results ◽

K Nearest Neighbor ◽

K Value ◽

Jaccard Distance ◽

Comparison Results ◽

Class 1 ◽

Correct Data

During the Covid-19 pandemic, a wide range of hoax news about Covid-19 circulated. Fact-checking platforms such as Jala Hoax and Saber Hoax clarify these hoaxes and categorize them as misinformation or disinformation. This study applies supervised classification to learn from the fact-check labels. The dataset comprises 559 items taken from Jala Hoax and Saber Hoax, grouped into Class 1 (Misleading Content, Satire/Parody, False Connection), Class 2 (False Context, Imposter Content), and Class 3 (Fabricated and Manipulated Content). K-Nearest Neighbor (K-NN) is used to classify the categories of misinformation and disinformation. The Jaccard distance is compared with the Euclidean, Manhattan, and Minkowski distances across a range of k values to compare the performance of each configuration. Jaccard distance at k = 4 scores higher than the other models, with accuracy 0.696, precision 0.710, recall 0.572, and F1-Score. The best results tend to fall in the class with the most data, Class 1 (Misleading Content, Satire or Parody, False Connection), with 58 of 61 test items classified correctly.
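
As a rough illustration of the approach the abstract describes, the following sketch runs K-NN with Jaccard distance over sets of tokens. It is not the authors' implementation: the toy documents, the tokenization into sets, and the function names `jaccard_distance` and `knn_predict` are assumptions made for this example.

```python
from collections import Counter


def jaccard_distance(a: set, b: set) -> float:
    """Return 1 - |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def knn_predict(train, query_tokens, k=4):
    """train is a list of (token_set, label) pairs.

    Returns the majority label among the k nearest training items.
    Ties are resolved in favour of the label seen first among the neighbours.
    """
    ranked = sorted(train, key=lambda item: jaccard_distance(item[0], query_tokens))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data; only the class names come from the abstract.
train = [
    ({"vaccine", "contains", "chip"}, "Class 3"),
    ({"vaccine", "chip", "tracking"}, "Class 3"),
    ({"garlic", "cures", "covid"}, "Class 1"),
    ({"hot", "water", "cures", "covid"}, "Class 1"),
    ({"lemon", "cures", "covid"}, "Class 1"),
]
print(knn_predict(train, {"ginger", "cures", "covid"}, k=3))  # Class 1
```

With an even k such as 4, two classes can receive the same number of votes, so a real implementation needs an explicit tie-breaking rule.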

Design and Analysis System of KNN and ID3 Algorithm for Music Classification based on Mood Feature Extraction

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v7i1.pp486-495 ◽

2017 ◽

Vol 7 (1) ◽

pp. 486

Author(s):

Made Sudarma ◽

I Gede Harsemadi

Keyword(s):

Feature Extraction ◽

Processing Time ◽

Nearest Neighbor ◽

Extraction Process ◽

Performance Comparison ◽

K Nearest Neighbor ◽

K Value ◽

Id3 Algorithm ◽

Analysis System ◽

Music Information

Every piece of music conveys its own mood, and much research in Music Information Retrieval (MIR) has addressed mood recognition in music. This research produced software that classifies music by mood using the K-Nearest Neighbor (KNN) and ID3 algorithms. It compares classification accuracy and average classification time, both obtained from the values produced by the music feature extraction process. Feature extraction uses 9 types of spectral analysis, and the data consist of 400 training samples and 400 test samples. The system outputs a mood label: contentment, exuberance, depression, or anxious. KNN performs reasonably well, with 86.55% accuracy at k = 3 and an average processing time of 0.01021 seconds, whereas ID3 yields 59.33% accuracy and an average processing time of 0.05091 seconds.
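
For readers unfamiliar with ID3, the sketch below shows the information-gain criterion that ID3 uses to pick the attribute to split on at each node. The two categorical features (tempo and energy) and the tiny data set are hypothetical; the paper's spectral features are continuous and would need discretizing first, and how the authors did that is not stated.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, feature_index):
    """Entropy reduction obtained by splitting on one categorical feature."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[feature_index], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical discretized features: (tempo, energy).
rows = [("fast", "high"), ("fast", "low"), ("slow", "high"), ("slow", "low")]
labels = ["exuberance", "exuberance", "depression", "depression"]

for index, name in enumerate(("tempo", "energy")):
    print(name, information_gain(rows, labels, index))
```

In this toy data, tempo separates the two moods perfectly and energy not at all, so ID3 would split on tempo first.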

Classification of Fish Species with Image Data Using K-Nearest Neighbor

International Journal of Computer and Information System (IJCIS) ◽

10.29040/ijcis.v2i2.33 ◽

2021 ◽

Vol 2 (2) ◽

pp. 54-58

Author(s):

Kaharuddin Kaharuddin ◽

Eka Wahyu Sholeha

Keyword(s):

Computer Science ◽

Everyday Life ◽

Fish Species ◽

Nearest Neighbor ◽

Image Data ◽

Test Results ◽

Shape Features ◽

K Nearest Neighbor ◽

The World

Classification is a technique many of us encounter in everyday life, and classification methods continue to grow and be applied to many types of data and cases. In computer science, classification has been developed to ease human work; one application is classifying fish species. The number of fish species in the world is so large that many people still struggle to tell them apart. This study therefore classifies fish species using the K-Nearest Neighbor method. The data cover 4 types of fish, 160 items in total. The purpose of the study was to test the K-Nearest Neighbor method for classifying fish species based on color, texture, and shape features. In the tests, the highest accuracy, 77.50%, was obtained with K = 7, and the second highest, 76.88%, with K = 10. The results suggest that the K-Nearest Neighbor method classifies reasonably well, and that further work could add variables, more data, and other types of fish.

Performance Comparison of Ensemble-based k-Nearest Neighbor and CART Classifiers for the Classification of Adaptive e-learning User Knowledge Levels

10.2991/assehr.k.211020.037 ◽

2021 ◽

Author(s):

Utomo Pujianto ◽

Harits Ar Rosyid ◽

Aditya Cahyadi Putra

Keyword(s):

Nearest Neighbor ◽

Performance Comparison ◽

K Nearest Neighbor ◽

E Learning ◽

Knowledge Levels ◽

User Knowledge

Nuclei Detection and Classification System Based On Speeded Up Robust Feature (SURF)

EMITTER International Journal of Engineering Technology ◽

10.24003/emitter.v7i1.288 ◽

2019 ◽

Vol 7 (1) ◽

pp. 1-13 ◽

Cited By ~ 1

Author(s):

Neneng Nur Amalina ◽

Kurniawan Nur Ramadhani ◽

Febryanti Sthevanie

Keyword(s):

Random Forest ◽

Nearest Neighbor ◽

Performance Comparison ◽

Cellular Heterogeneity ◽

Experimental Result ◽

Cell Type ◽

K Nearest Neighbor ◽

Nuclei Detection ◽

High Degree

Tumors contain a high degree of cellular heterogeneity. Various types of cells infiltrate organs rapidly because of uncontrolled cell division and the evolution of those cells. The types and quantity of heterogeneous cells in infiltrated organs determine the malignancy level of the tumor. Analysis of those cells through their nuclei is therefore needed to better understand the tumor and to specify proper treatment. In this paper, Speeded Up Robust Feature (SURF) is used to build a system that detects the centroid positions of nuclei in histopathology images of colon cancer. The system also extracts features for each nucleus to classify it as one of two types, inflammatory or non-inflammatory. Three classifiers are compared: k-Nearest Neighbor (k-NN), Random Forest (RF), and Support Vector Machine (SVM). In the experiments, the highest F1 score for nuclei detection is 0.722, with a Determinant of Hessian (DoH) threshold of 50. For nuclei classification, Random Forest produces an F1 score of 0.527, the highest among the classifiers.
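
Since the results above are reported as F1 scores, a short sketch of how precision, recall, and F1 follow from detection counts may help. The counts in the example are made up for illustration and are unrelated to the paper's data; the function name is likewise an assumption.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts: 8 nuclei found correctly, 2 false detections, 4 missed.
precision, recall, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(round(precision, 3), round(recall, 3), round(f1, 3))  # 0.8 0.667 0.727
```

Because F1 is a harmonic mean, it sits closer to the lower of the two values, which is why a detector with many missed nuclei scores poorly even when its precision is high.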

Classification of batik patterns using K-Nearest neighbor and support vector machine

Bulletin of Electrical Engineering and Informatics ◽

10.11591/eei.v9i3.1971 ◽

2020 ◽

Vol 9 (3) ◽

pp. 1260-1267

Author(s):

Agus Eko Minarno ◽

Fauzi Dwi Setiawan Sumadi ◽

Hardianto Wibowo ◽

Yuda Munarko

Keyword(s):

Support Vector Machine ◽

Data Sharing ◽

Cross Validation ◽

Nearest Neighbor ◽

Experimental Result ◽

Support Vector ◽

Test Results ◽

K Nearest Neighbor ◽

Average Accuracy

This study compares K-Nearest Neighbor and Support Vector Machine to determine which better classifies Batik images using a minimal set of GLCM features. The proposed steps begin by converting the image to grayscale and extracting colour features using four GLCM features. The features are Energy, Entropy, Contrast, and Correlation, each computed at 0°, 45°, 90°, and 135°, so the classifier uses 16 features in total. The experimental results are compared with previous work on KNN and SVM classification using multi texton histogram (MTH). The experiments measure accuracy under a data sharing scenario and a cross-validation scenario. In the cross-validation scenario, the average accuracy is 78.3% for KNN and 92.3% for SVM. In the data sharing scenario, the highest accuracy is at 70% for KNN and at 100% for SVM. The combination of GLCM and SVM for extracting and classifying batik motifs is thus effective and better than previous work.
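
The 16-feature vector described above (four GLCM statistics at four angles) can be sketched in plain Python as follows. This is an illustrative reconstruction, not the authors' code. The pixel distance of 1, the non-symmetric co-occurrence counts, the number of gray levels, and the choice of energy as the plain sum of squared entries are all assumptions. Some libraries define energy as the square root of that sum.

```python
import math

# Pixel offsets (row, col) for the four directions at distance 1 (an assumption).
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(image, levels, offset):
    """Normalized co-occurrence matrix of a 2-D list of integer gray levels.

    Assumes the image is at least 2x2 so that every offset yields some pairs.
    """
    dr, dc = offset
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def glcm_features(p):
    """Energy, entropy, contrast and correlation of a normalized GLCM."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    energy = sum(v * v for _, _, v in cells)
    entropy = -sum(v * math.log2(v) for _, _, v in cells if v > 0)
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    denom = math.sqrt(var_i * var_j)
    covariance = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    correlation = covariance / denom if denom else 0.0
    return [energy, entropy, contrast, correlation]


def feature_vector(image, levels):
    """Concatenate the four statistics for each of the four angles (16 values)."""
    vector = []
    for angle in (0, 45, 90, 135):
        vector.extend(glcm_features(glcm(image, levels, OFFSETS[angle])))
    return vector


# Hypothetical 3x3 image quantized to 2 gray levels.
image = [[0, 0, 1], [0, 1, 1], [1, 1, 0]]
print(len(feature_vector(image, levels=2)))  # 16
```

The resulting vector can be passed directly to a K-NN or SVM classifier; in practice, the image would be quantized to more than two gray levels.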

IDENTIFICATION OF HOAX BASED ON TEXT MINING USING K-NEAREST NEIGHBOR METHOD

JELIKU (Jurnal Elektronik Ilmu Komputer Udayana) ◽

10.24843/jlk.2021.v10.i02.p04 ◽

2022 ◽

Vol 10 (2) ◽

pp. 217

Author(s):

I Wayan Santiyasa ◽

Gede Putra Aditya Brahmantha ◽

I Wayan Supriana ◽

I GA Gede Arya Kadyanan ◽

I Ketut Gede Suhartana ◽

...

Keyword(s):

Nearest Neighbor ◽

The Internet ◽

Test Results ◽

K Nearest Neighbor ◽

K Value ◽

The Public ◽

A Value ◽

K Nearest Neighbor Algorithm ◽

Time Information ◽

Fold Cross Validation

Information is now very easy to obtain and can spread quickly to all corners of society. However, not all of the information that spreads is true. False information, commonly called a hoax, is also easily spread by the public, who tend to assume that everything circulating on the internet is true. It cannot be known directly whether a given news item published on the internet is a hoax or valid. The test uses 740 randomly selected content/issue items that have been verified by an institution, of which 370 are hoaxes and 370 are valid. Before classification, a preprocessing stage is performed and the TF-IDF equation is used to weight each feature. The items are then classified using the K-Nearest Neighbor algorithm, and the results are evaluated using 10-fold cross-validation. The test uses k values from 2 to 10. The optimal k in this implementation is k = 4, with precision, recall, and F-measure of 0.764856, 0.757583, and 0.751944 respectively, and an accuracy of 75.4%.
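
The TF-IDF weighting step mentioned above can be sketched as follows. TF-IDF has several variants and the abstract does not say which one the authors used. This example assumes term frequency as count divided by document length and inverse document frequency as the natural log of N / df. The documents are toy data.

```python
import math
from collections import Counter


def tf_idf(documents):
    """documents is a list of token lists; returns one {term: weight} dict per document."""
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical preprocessed (tokenized) items.
documents = [
    ["covid", "vaccine", "chip"],
    ["covid", "vaccine", "safe"],
    ["covid", "mask"],
]
for weights in tf_idf(documents):
    print(weights)
```

A term that occurs in every document, such as "covid" here, gets weight 0 under this variant, so it contributes nothing to the distance between items in the K-NN step.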

Comparison of Distance Models on K-Nearest Neighbor Algorithm in Stroke Disease Detection

Applied Technology and Computing Science Journal ◽

10.33086/atcsj.v4i1.2097 ◽

2021 ◽

Vol 4 (1) ◽

pp. 63-68

Author(s):

Iswanto Iswanto ◽

Tulus Tulus ◽

Poltak Sihombing

Keyword(s):

Nearest Neighbor ◽

The Other ◽

Training Data ◽

Machine Learning Method ◽

Test Results ◽

K Nearest Neighbor ◽

Minkowski Distance ◽

K Value ◽

Average Accuracy ◽

K Nearest Neighbor Algorithm

Stroke is a cardiovascular disease (CVD) caused by the failure of brain cells to receive an oxygen supply, which poses a risk of ischemic damage and can result in death. The disease can be detected from the similarity of symptoms experienced by sufferers, so that early steps can be taken with appropriate counseling and treatment. Stroke detection requires a machine learning method. In this research, the author used one of the supervised learning classification methods, namely K-Nearest Neighbor (K-NN). K-NN is a classification method based on calculating the distance to training data. This research compares the Euclidean, Minkowski, Manhattan, and Chebyshev distance models to obtain optimal results. The distance models were tested on a stroke dataset sourced from the Kaggle repository. Based on the test results, the Chebyshev model has the highest accuracy of the four distance models, with an average accuracy of 95.49% and a highest accuracy of 96.03% at K = 10. The Euclidean and Minkowski distance models have the same accuracy at every K value, with an average accuracy of 95.45% and a highest accuracy of 95.93% at K = 10. Manhattan has the lowest average of the distance models, 95.42%, but reaches a highest accuracy of 96.03% at K = 6.
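
The four distance models are closely related, which a small sketch makes concrete. Minkowski distance with order p = 1 is Manhattan distance, with p = 2 it is Euclidean distance, and Chebyshev distance is its limit as p grows without bound. The abstract does not state the Minkowski order used. The identical Euclidean and Minkowski accuracies would be consistent with p = 2, but that is an assumption.

```python
def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length numeric vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1 / p)


def manhattan(a, b):
    return minkowski(a, b, 1)


def euclidean(a, b):
    return minkowski(a, b, 2)


def chebyshev(a, b):
    """Largest coordinate-wise difference; the limit of Minkowski as p grows."""
    return max(abs(x - y) for x, y in zip(a, b))


a, b = (0, 0), (3, 4)
print(manhattan(a, b))   # 7.0
print(euclidean(a, b))   # 5.0
print(chebyshev(a, b))   # 4
```

Any of these functions can serve as the distance in a K-NN classifier. They differ only in how strongly the largest single-feature difference dominates the result.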

Machine Learning Verdict of EEG Signals in Brain Computer Interface

International Journal of Scientific Research in Computer Science Engineering and Information Technology ◽

10.32628/cseit1838114 ◽

2018 ◽

pp. 429-441

Author(s):

M. Jeyanthi ◽

C. Velayutham

Keyword(s):

Nearest Neighbor ◽

Technology Development ◽

Vital Role ◽

Svm Classifier ◽

K Nearest Neighbor ◽

Data Mining Technique ◽

Data Set ◽

Eeg Data ◽

Irrelevant Attributes

In science and technology development, BCI plays a vital role in research. Classification is a data mining technique used to predict group membership for data instances. Analysis of BCI data is challenging because feature extraction and classification of these data are more difficult as compared with those applied to raw data. In this paper, we extract features from the raw EEG data using statistical Haralick features. The features are then normalized, and binning is used to improve the accuracy of the predictive models by reducing noise and eliminating some irrelevant attributes. Classification is then performed on the BCI dataset using different techniques such as Naïve Bayes, the k-nearest neighbor classifier, and the SVM classifier. Finally, we propose the SVM classification algorithm for the BCI dataset.
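
The normalization and binning steps can be illustrated with a minimal sketch. The abstract does not specify which normalization or binning scheme was used. Min-max scaling and equal-width bins are assumed here purely as common choices, and the input values are invented.

```python
def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def equal_width_bins(values, n_bins):
    """Map values already in [0, 1] to bin indices 0 .. n_bins - 1."""
    # The value 1.0 would land in bin n_bins, so it is clamped into the last bin.
    return [min(int(v * n_bins), n_bins - 1) for v in values]


# Hypothetical values of one extracted feature.
feature = [2.0, 4.0, 6.0, 10.0]
normalized = min_max_normalize(feature)
print(normalized)                       # [0.0, 0.25, 0.5, 1.0]
print(equal_width_bins(normalized, 4))  # [0, 1, 2, 3]
```

Binning replaces small fluctuations within a bin by a single index, which is one way it can reduce noise before classification.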

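Perancangan Aplikasi Prediksi Kelulusan Tepat Waktu Bagi Mahasiswa Baru Dengan Teknik Data Mining (Studi Kasus: Data Akademik Mahasiswa STMIK Dipanegara Makassar) [Designing an Application to Predict On-Time Graduation for New Students Using Data Mining Techniques (Case Study: Student Academic Data of STMIK Dipanegara Makassar)]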

Creative Information Technology Journal ◽

10.24076/citec.2014v1i4.27 ◽

2015 ◽

Vol 1 (4) ◽

pp. 270

Author(s):

Muhammad Syukri Mustafa ◽

I. Wayan Simpen

Keyword(s):

Data Mining ◽

Nearest Neighbor ◽

Test Results ◽

K Nearest Neighbor ◽

Accuracy Rate ◽

Sample Data ◽

New Students ◽

K Nearest Neighbor Algorithm ◽

Using Data ◽

Existing Data

This research aims to predict whether new students are likely to complete their studies on time, using data mining analysis with the K-Nearest Neighbor (KNN) algorithm to explore accumulated historical data. The resulting application uses a variety of attributes classified in data mining, including national exam (Ujian Nasional, UN) scores, school/region of origin, gender, parents' occupation and income, number of siblings, and others. By applying KNN analysis, a prediction can be made from the proximity of existing historical data to new data: whether or not the student is likely to complete their studies on time. In testing, applying the KNN algorithm with sample data from alumni who graduated in 2004 through 2010 as the old cases and data from alumni who graduated in 2011 as the new cases yielded an accuracy of 83.36%.

Hydroponic Nutrient Control System Based on Internet of Things

CommIT (Communication and Information Technology) Journal ◽

10.21512/commit.v13i2.6016 ◽

2019 ◽

Vol 13 (2) ◽

Cited By ~ 1

Author(s):

Herman Herman ◽

Demi Adidrana ◽

Nico Surantha ◽

Suharjito Suharjito

Keyword(s):

Urban Areas ◽

Human Population ◽

Mineral Water ◽

Nearest Neighbor ◽

Total Dissolved Solids ◽

K Nearest Neighbor ◽

Turn On ◽

Nutrient Film Technique ◽

Planting Method

The human population is increasing significantly in crowded urban areas, which reduces the available farming land. A landless planting method is therefore needed to supply food for society. Hydroponics is one solution: a gardening method that does not use soil. It uses nutrient-enriched mineral water as a nutrient solution for plant growth. Traditionally, hydroponic farming is conducted manually by monitoring nutrition parameters such as acidity or basicity (pH), Total Dissolved Solids (TDS), Electrical Conductivity (EC), and nutrient temperature. In this research, the researchers propose a system that measures pH, TDS, and nutrient temperature in a Nutrient Film Technique (NFT) setup using a couple of sensors. The researchers use lettuce as the experimental object and apply the k-Nearest Neighbor (k-NN) algorithm to classify nutrient conditions. The prediction is used to command the microcontroller to turn the nutrition controller actuators on or off simultaneously. The experimental results show that the proposed k-NN algorithm achieves 93.3% accuracy at k = 5.
