Introduction to Classification: Naïve Bayes and Nearest Neighbour

Data mining is a technique for processing large volumes of data in order to group them. It is used within the Knowledge Discovery in Databases (KDD) process. The technique offers several grouping methods, including Naïve Bayes and Nearest Neighbour, decision trees (KD-Tree), ID3, K-Means, text mining and DBSCAN. In this study, the author groups the data of new vocational high school students for the 2014/2015 academic year according to student criteria, applying the K-Means clustering algorithm. Admission to a major is usually decided on student grades alone; here the grouping instead uses several criteria: parental income, the number of children the parents support and the student's test score. These criteria are combined so that the resulting grouping is more optimal. The aim is to form major groups for the students using K-Means clustering. The grouping produced three groups: did not pass, software engineering and computer network engineering. The cluster centres are Cluster-1 = (1.4, 2.2, 2.2), Cluster-2 = (2.28, 1.64, 4) and Cluster-3 = (5, 2, 6). These centres were obtained over several iterations, until the cluster centres were optimal.
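The abstract reports cluster centres reached after several iterations, but it gives neither the data nor the starting centres. As a minimal sketch of how K-Means arrives at such centres, the Python example below runs the standard assign-and-update loop (Lloyd's algorithm) on a few invented records. The three features mirror the criteria named in the abstract; the values, their encoding, the initial centres and the function name `kmeans` are assumptions made for illustration only, not the author's data or implementation.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    centroids = [tuple(c) for c in initial_centroids]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its members.
        # An empty cluster keeps its previous centroid.
        new_centroids = [
            tuple(sum(col) / len(members) for col in zip(*members)) if members else c
            for members, c in zip(clusters, centroids)
        ]
        if new_centroids == centroids:  # no centroid moved, so we have converged
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical encoded records: (income level, dependants, test-score level).
points = [
    (1, 2, 2), (2, 2, 2), (1, 3, 2),
    (2, 2, 4), (3, 1, 4), (2, 2, 4),
    (5, 2, 6), (5, 2, 6),
]
# Assumed starting centres: one record picked from each apparent group.
initial = [points[0], points[3], points[6]]

centroids, clusters = kmeans(points, initial)
for number, (centre, members) in enumerate(zip(centroids, clusters), start=1):
    print(f"Cluster-{number}: centre={centre}, size={len(members)}")
```

The result of K-Means depends on the starting centres, so a real study would typically try several initialisations or state how the starting centres were chosen.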


Analysis of E-learner’s Opinion Using Automated Sentiment Analysis in E-learning and Comparison with Naive Bayes Classification, Random Forest and K-Nearest Neighbour Algorithms

10.1007/978-981-16-3153-5_30, 2021, pp. 265-277

Author(s): P. Rajesh, G. Suseendran

Keyword(s): Random Forest, Sentiment Analysis, Naïve Bayes, Nearest Neighbour, E-Learning, Naïve Bayes Classification


Introduction to Classification: Naïve Bayes and Nearest Neighbour

Principles of Data Mining - Undergraduate Topics in Computer Science, 10.1007/978-1-4471-7493-6_3, 2020, pp. 21-37

Author(s): Max Bramer

Keyword(s): Naïve Bayes, Nearest Neighbour


ARABIC PART OF SPEECH TAGGING USING K-NEAREST NEIGHBOUR AND NAIVE BAYES CLASSIFIERS COMBINATION

Journal of Computer Science, 10.3844/jcssp.2014.1865.1873, 2014, Vol 10 (9), pp. 1865-1873, Cited By ~ 2

Author(s): Mahafdah

Keyword(s): Naïve Bayes, Nearest Neighbour, Part Of Speech Tagging, Part Of Speech, Classifiers Combination, Speech Tagging


Sentiment analysis and sarcasm detection from social network to train health-care professionals

World Journal of Engineering, 10.1108/wje-02-2021-0108, 2021, Vol ahead-of-print (ahead-of-print)

Author(s): Jyoti Godara, Rajni Aron, Mohammad Shabaz

Keyword(s): Health Care, Support Vector Machine, Decision Tree, Sentiment Analysis, Health Care Professionals, Naïve Bayes, Support Vector, Nearest Neighbour, Content Type

Purpose: Sentiment analysis has attracted growing interest over the past decade in the field of social media analytics. With major advances in the volume, rationality and veracity of social networking data, the misunderstanding, uncertainty and inaccuracy within the data have multiplied. Locating sarcasm in textual data is a challenging task. Sarcasm is a different way of expressing sentiment, in which people write or say something other than what they actually intend. Researchers are therefore developing techniques for detecting sarcasm in text in order to improve the performance of sentiment analysis. This paper gives an overview of sentiment analysis, sarcasm and related work on sarcasm detection. Further, it provides training to health-care professionals for making decisions on patients' sentiments.

Design/methodology/approach: This paper compares the performance of five classifiers on a Twitter data set: support vector machine, naïve Bayes, decision tree, AdaBoost and K-nearest neighbour.

Findings: Naïve Bayes performed best, with the highest accuracy of 61.18%, and the decision tree performed worst, with an accuracy of 54.27%. The accuracies of AdaBoost, K-nearest neighbour and support vector machine were 56.13%, 54.81% and 59.55%, respectively.

Originality/value: This research work is original.
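The comparison described in this abstract amounts to training each classifier on the same labelled texts and scoring it by accuracy on held-out texts. As a minimal sketch of that procedure for two of the five classifiers, the Python example below implements a multinomial naïve Bayes with add-one smoothing and a 1-nearest-neighbour classifier based on Jaccard word overlap. The class names, the similarity measure and the tiny data set are all assumptions for illustration; they are not the paper's implementation or data, and the scores this toy example produces say nothing about the accuracies reported above.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(doc):
                if word in self.vocab:  # words never seen in training are skipped
                    score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


class NearestNeighbour:
    """1-nearest-neighbour using Jaccard similarity between word sets."""

    def fit(self, docs, labels):
        self.examples = [(set(tokenize(d)), l) for d, l in zip(docs, labels)]
        return self

    def predict(self, doc):
        words = set(tokenize(doc))

        def similarity(example):
            union = words | example[0]
            return len(words & example[0]) / len(union) if union else 0.0

        return max(self.examples, key=similarity)[1]


def accuracy(model, docs, labels):
    """Fraction of documents whose predicted label matches the true label."""
    return sum(model.predict(d) == l for d, l in zip(docs, labels)) / len(labels)


# Invented toy data, for illustration only.
train_docs = ["great helpful course", "love this clear lesson",
              "terrible boring lecture", "hate this confusing lesson"]
train_labels = ["pos", "pos", "neg", "neg"]
test_docs = ["great clear helpful lecture", "terrible boring confusing course"]
test_labels = ["pos", "neg"]

nb_accuracy = accuracy(NaiveBayes().fit(train_docs, train_labels), test_docs, test_labels)
nn_accuracy = accuracy(NearestNeighbour().fit(train_docs, train_labels), test_docs, test_labels)
print(f"naive Bayes: {nb_accuracy:.2%}, nearest neighbour: {nn_accuracy:.2%}")
```

A realistic comparison would use a much larger labelled corpus and a proper train/test split or cross-validation, since accuracy on a handful of examples is not informative.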
