scholarly journals Information Retrieval Document Classified with K-Nearest Neighbor

2016 ◽  
Vol 1 (2) ◽  
pp. 129
Author(s):  
Badruz Zaman ◽  
Endah Purwanti ◽  
Alifian Sukma

Along with the rapid advancement of technology development led to the amount of information available is also increasingly abundant. The aim of this study was to determine how the implementation of information retrieval system in the classification of the journal by using the cosine similarity and K-Nearest Neighbor (KNN). The data used as many as 160 documents with categories such as Physical Sciences and Engineering, Life Science, Health Science, and Social Sciences and Humanities. Construction stage begins with the use of text mining processing, the weighting of each token by using the term frequency-inverse document frequency (TF-IDF), calculate the degree of similarity of each document by using the cosine similarity and classification using k-Nearest Neighbor.Evaluation is done by using the testing documents as much as 20 documents, with a value of k = {37, 41, 43}. Evaluation system shows the level of success in classifying documents on the value of k = 43 with a value precision of 0501. System test results showed that 20 document testing used can be classified according to the actual category.

2018 ◽  
Vol 1 (2) ◽  
pp. 129 ◽  
Author(s):  
Alifian Sukma ◽  
Badruz Zaman ◽  
Endah Purwanti

Along with the rapid advancement of technology development led to the amount of information available is also increasingly abundant. The aim of this study was to determine how the implementation of information retrieval system in the classification of the journal by using the cosine similarity and K-Nearest Neighbor (KNN).The data used as many as 160 documents with categories such as Physical Sciences and Engineering, Life Science, Health Science, and Social Sciences and Humanities. Construction stage begins with the use of text mining processing, the weighting of each token by using the term frequency-inverse document frequency (TF-IDF), calculate the degree of similarity of each document by using the cosine similarity and classification using k-Nearest Neighbor.Evaluation is done by using the testing documents as much as 20 documents, with a value of k = {37, 41, 43}. Evaluation system shows the level of success in classifying documents on the value of k = 43 with a value precision of 0501. System test results showed that 20 document testing used can be classified according to the actual category


Author(s):  
M. Jeyanthi ◽  
C. Velayutham

In Science and Technology Development BCI plays a vital role in the field of Research. Classification is a data mining technique used to predict group membership for data instances. Analyses of BCI data are challenging because feature extraction and classification of these data are more difficult as compared with those applied to raw data. In this paper, We extracted features using statistical Haralick features from the raw EEG data . Then the features are Normalized, Binning is used to improve the accuracy of the predictive models by reducing noise and eliminate some irrelevant attributes and then the classification is performed using different classification techniques such as Naïve Bayes, k-nearest neighbor classifier, SVM classifier using BCI dataset. Finally we propose the SVM classification algorithm for the BCI data set.


2020 ◽  
Vol 8 (4) ◽  
pp. 367
Author(s):  
Muhammad Arief Budiman ◽  
Gst. Ayu Vida Mastrika Giri

The development of the music industry is currently growing rapidly, millions of music works continue to be issued by various music artists. As for the technologies also follows these developments, examples are mobile phones applications that have music subscription services, namely Spotify, Joox, GrooveShark, and others. Application-based services are increasingly in demand by users for streaming music, free or paid. In this paper, a music recommendation system is proposed, which the system itself can recommend songs based on the similarity of the artist that the user likes or has heard. This research uses Collaborative Filtering method with Cosine Similarity and K-Nearest Neighbor algorithm. From this research, a system that can recommend songs based on artists who are related to one another is generated.


JOUTICA ◽  
2021 ◽  
Vol 6 (2) ◽  
pp. 506
Author(s):  
Mustain Mustain Mustain

Kesulitan untuk mengorganisir data kuesioner yang bersifat konvensional melatarbelakangi penelitian ini. Oleh karena itu dibuat sistem yang memudahkan pengelompokan data kuesioner secara otomatis yang lengkap dengan sentimen yang terkandung didalamnya. Dataset yang digunakan dalam penelitian ini adalah data kuesioner rumah sakit Muhammadiyah lamongan. Penelitian ini hanya menangani kuesioner yang berbentuk teks. Data dengan fisik kertas direkap kemudian diinput ke database lengkap dengan kategori unit kerja dan sentiment. Selanjutnya dataset tersebut di dilakukan pre-prosesing yang meliputi penanganan negasi case folding, tokenizing, filtering dan stemming. Sebagai data uji komentar dari kuesioner akan dilakukan pre-prosesing selanjutnya dihitung tingkat kemiripan document dengan menggunakan metode K- Nearest Neighbor dan Vector Space Model. Jumlah data yang ditangani mempengaruhi performa system terutama dari akurasi dan kecepatan pada saat proses klasifikasi. Hasil dari sistem yang dibuat berupa ranking dokumen yang paling mirip dengan dataset berdasarkan urutan nilai cosine similarity. Ujicoba klasifikasi berdasarkan kelas kategori menghasilkan nilai akurasi 91 %. Ujicoba berdasarkan Kelas Sentimen sebesar 94 %.dari kombinasi keduanya system berhasil mendapat akurasi sebesar 86 %


Author(s):  
Danny Sebastian

E-marketplace has gained popularity with the Indonesian society resulting in the increment of products offered. Consequently, customers require more effort to search for products. In this study, we classified products from several e-marketplaces. The classification was carried out using TF-IDF method for the weighting, cosine similarity to calculate product similarity distance, and k-nearest neighbor algorithm. Based on the first testing result using 150 product data, the k-nearest neighbor method with k=5 successfully classified 146 data with 4 data classified into the wrong class. This k=5 value gives the best result for this case, with an accuracy of 97.33%. The second testing result using 150 mixed brand product data, the k-nearest neighbor method successfully classified 145 data with 5 data classified into the wrong class. The accuracy of the second testing is 96.67%.


2019 ◽  
Vol 24 (2) ◽  
pp. 129-133
Author(s):  
Hustinawaty ◽  
Rama Al Azis Dwiputra ◽  
Tavipia Rumambi

Pasar Lama Tangerang is a tourist attraction in the city of Tangerang. With the development of current technology, the public can provide an overview of how the facilities and services are provided by expressing opinions on the internet. However, it is difficult to distinguish which opinions belong to positive or negative opinions. Sentiment analysis is needed to overcome this problem. The stage in sentiment analysis starts with collecting data first, then the data is processed. Furthermore, the data that has been propagated is given a sentiment classification using the K-Nearest Neighbor (KNN) algorithm. Then the classification results obtained an accuracy of 83% with a value of k = 1 of 120 data divided by 92 positive and 28 negative comments. Sentiment analysis is made using the R and Rstudio programming languages as supporting software.


2020 ◽  
Vol 9 (2) ◽  
pp. 277
Author(s):  
Ayu Made Surya Indra Dewi ◽  
Ida Bagus Gede Dwidasmara

Obesity or overweight is a health problem that can affect anyone. In research in several journals, it was found that obesity can be influenced by many factors, but the most dominant factors are lifestyle and diet. Obesity should not only be considered as a consequence of an unhealthy lifestyle, but obesity is a disease that can lead to other dangerous diseases. Therefore, it is important to know the level of obesity in order to take early prevention. To determine the level of obesity, a classification method is used, namely K-Nearest Neighbor (KNN) to classify the level of obesity. In this study, classification was carried out with 16 test parameters, namely Gender, Age, Height, Weight, Family History With Overweight, FAVC, FCVC, NCP, CAEC, Smoke, CH2O, SCC, FAF, TUE, CALC, Mtrans and 1 class attribute, namely Nobesity. From tests carried out using the KNN algorithm, the results obtained are 78.98% accuracy with a value of k = 2. Keywords: Obesity, KNN, Classification


2021 ◽  
Vol 6 (1) ◽  
pp. 96
Author(s):  
Ikhsan Romli ◽  
Shanti Prameswari R ◽  
Antika Zahrotul Kamalia

Sentiment analysis is a data processing to recognize topics that people talk about and their sentiments toward the topics, one of which in this study is about large-scale social restrictions (PSBB). This study aims to classify negative and positive sentiments by applying the K-Nearest Neighbor algorithm to see the accuracy value of 3 types of distance calculation which are cosine similarity, euclidean, and manhattan distance for Indonesian language tweets about large-scale social restrictions (PSBB) from social media twitter. With the results obtained, the K-Nearest Neighbor accuracy by the Cosine Similarity distance 82% at k = 3, K-Nearest Neighbor by the Euclidean Distance with an accuracy of 81% at k = 11 and K-Nearest Neighbor by Manhattan Distance with an accuracy 80% at k = 5, 7, 9, 11, and 13. So, in this study the K-Nearest Neighbor algorithm with the Cosine Similarity Distance calculation gets the highest point.


Sign in / Sign up

Export Citation Format

Share Document