A Chinese text classification system based on Naive Bayes algorithm

2016 ◽  
Vol 44 ◽  
pp. 01015 ◽  
Author(s):  
Wei Cui
2019 ◽  
Vol 26 (1) ◽  
pp. 1-12 ◽  
Author(s):  
Peng Liu ◽  
Hui-han Zhao ◽  
Jia-yu Teng ◽  
Yan-yan Yang ◽  
Ya-feng Liu ◽  
...  

Compiler ◽  
2016 ◽  
Vol 5 (2) ◽  
Author(s):  
Siti Anisah ◽  
Anton Setiawan Honggowibowo ◽  
Asih Pujiastuti

A comic has its own characteristics compared with other types of books. The differences between comics and other books can be seen in the categories of period, material, and physical characteristics. Comics and other books therefore need a classification system application. To address this problem, a classification system was built using Chi-Square Feature Selection and the Naive Bayes algorithm to identify comics based on period, material, and physical characteristics. The Delphi programming language and an Oracle database were used to build the classification system. Chi-Square Feature Selection gave a feature value of 0.10347 for comics and 1.9531 for non-comics. The data were then classified with the Naive Bayes algorithm. Of 120 titles, 60 comic and non-comic titles were used to build the classes for training and 60 comic and non-comic titles were used for testing. The Naive Bayes algorithm achieved 96.67% accuracy for comics, with a 3.33% error rate, and 90% for non-comics, with a 10% error rate. The classification system identifies comics well.
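As an illustration of the feature-selection step, the following is a minimal Python sketch of the standard chi-square statistic for a 2x2 feature/class contingency table. The abstract does not give the exact formulation or the underlying counts, so the function name `chi_square`, its parameters, and the counts in the usage lines are hypothetical, and the sketch is not expected to reproduce the values reported above.

```python
def chi_square(n11, n10, n01, n00):
    """Chi-square statistic for a 2x2 feature/class contingency table.

    n11: items in the class that have the feature
    n10: items outside the class that have the feature
    n01: items in the class that lack the feature
    n00: items outside the class that lack the feature
    """
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        # A degenerate table (an empty row or column) carries no signal.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


# Hypothetical counts: a feature that separates the classes perfectly,
# and a feature that is spread evenly across both classes.
perfect = chi_square(20, 0, 0, 20)
uninformative = chi_square(10, 10, 10, 10)
```

A higher score means the feature and the class are more strongly associated, so features can be ranked by this score before the classifier is trained.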


Author(s):  
Jonathan Radot Fernando ◽  
Raymond Budiraharjo ◽  
Emeraldi Haganusa

Text classification is used in many areas of technology, such as spam classification, news categorization, and auto-correct in texting. One of the most popular algorithms for text classification today is Multinomial Naïve Bayes. This paper explains how the Naïve Bayes assumption works to classify YouTube comments on the 2019 Indonesian election. The output prediction of the algorithm is spam or not spam. Spam messages are defined as racist comments, advertising comments, and unsolicited comments. The algorithm's text representation uses the bag-of-words method, which defines a text as the multiset of its words. The algorithm then calculates the probability of a word given the class, spam or not spam. The main difference between the normal Naïve Bayes algorithm and Multinomial Naïve Bayes is the way the algorithm treats the data: Multinomial Naïve Bayes treats the data as frequency data, hence it is suitable for text classification tasks.
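The pipeline described in this abstract (bag-of-words counts, per-class word probabilities, and a spam or not-spam decision) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: whitespace tokenization, add-one (Laplace) smoothing, the function names `train` and `predict`, and the four toy comments are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train(docs):
    """Build a Multinomial Naive Bayes model from (text, label) pairs."""
    class_docs = Counter()              # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()   # bag of words: order is discarded
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log posterior for the text."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        # Log prior: share of training documents in this class.
        score = math.log(class_docs[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # words never seen in training are skipped
            # Add-one smoothing avoids zero probabilities (an assumption).
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy comments, for illustration only.
training_data = [
    ("buy cheap followers now", "spam"),
    ("cheap promo click now", "spam"),
    ("great debate tonight", "not spam"),
    ("the candidate spoke well tonight", "not spam"),
]
model = train(training_data)
label_a = predict(model, "cheap followers click")
label_b = predict(model, "debate tonight")
```

Because the model stores word frequencies per class, a word that is repeated in a comment contributes once for every occurrence, which is the frequency-based treatment that the abstract attributes to the multinomial variant.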

