Chinese text classification by the Naïve Bayes Classifier and the associative classifier with multiple confidence threshold values

Document classification is of growing interest in text mining research. Documents can be classified by topic, language, and so on. This study was conducted to determine how Naive Bayes Updateable performs in classifying SBMPTN exam questions by theme. Naive Bayes is a classification algorithm often used in text classification, and its incremental model can learn from new data introduced to the system even after the classifier has been built from the existing data. The Naive Bayes classifier assigns each exam question to a field-of-study theme by analyzing the keywords that appear in the question. DF-Thresholding, a feature selection method, is implemented to improve classification performance. Evaluation of the Naive Bayes classifier yields an accuracy of 84.61%.
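The two ideas in this abstract, document-frequency thresholding and incremental Naive Bayes updates, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the study's implementation: the class `IncrementalNB`, the helper `df_threshold_vocab`, the `min_df` and `alpha` parameters, and the toy exam-question tokens are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def df_threshold_vocab(docs, min_df=2):
    """Keep terms that occur in at least `min_df` documents (DF thresholding)."""
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {term for term, n in df.items() if n >= min_df}


class IncrementalNB:
    """Multinomial Naive Bayes that stores only counts, so new batches can be added later."""

    def __init__(self, vocab, alpha=1.0):
        self.vocab = set(vocab)
        self.alpha = alpha                        # Laplace smoothing (assumed value)
        self.class_docs = Counter()               # documents seen per class
        self.term_counts = defaultdict(Counter)   # term counts per class
        self.class_tokens = Counter()             # in-vocabulary tokens per class

    def partial_fit(self, docs, labels):
        for tokens, label in zip(docs, labels):
            self.class_docs[label] += 1
            for t in tokens:
                if t in self.vocab:
                    self.term_counts[label][t] += 1
                    self.class_tokens[label] += 1
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_docs.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_docs.items():
            score = math.log(n_docs / total_docs)  # log prior
            denom = self.class_tokens[label] + self.alpha * len(self.vocab)
            for t in tokens:
                if t in self.vocab:
                    score += math.log((self.term_counts[label][t] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy tokenized questions (hypothetical data).
train_docs = [
    ["cell", "enzyme", "protein"],
    ["cell", "membrane", "protein"],
    ["force", "velocity", "mass"],
    ["force", "energy", "mass"],
]
train_labels = ["biology", "biology", "physics", "physics"]

vocab = df_threshold_vocab(train_docs, min_df=2)
model = IncrementalNB(vocab).partial_fit(train_docs, train_labels)
first = model.predict(["protein", "cell"])

# A later batch updates the counts without retraining on the old data.
model.partial_fit([["force", "mass", "force"]], ["physics"])
second = model.predict(["mass", "force"])
```

Because the model keeps only counts, an update costs time proportional to the new batch. In this sketch the vocabulary is fixed from the first batch, which is one simple design choice among several.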

Optimization Scheme for Text Classification Using Machine Learning Naïve Bayes Classifier

Lecture Notes in Electrical Engineering - ICDSMLA 2019 ◽ 10.1007/978-981-15-1420-3_61 ◽ 2020 ◽ pp. 576-586

Author(s): Venkatesh ◽ K. V. Ranjitha ◽ B. S. Venkatesh Prasad

Keyword(s): Machine Learning ◽ Text Classification ◽ Naive Bayes ◽ Naïve Bayes ◽ Naive Bayes Classifier ◽ Bayes Classifier ◽ Naïve Bayes Classifier ◽ Optimization Scheme

Text Classification for Student Data Set using Naive Bayes Classifier and KNN Classifier

International Journal of Computer Trends and Technology ◽ 10.14445/22312803/ijctt-v43p103 ◽ 2017 ◽ Vol 43 (1) ◽ pp. 8-12 ◽ Cited By ~ 7

Author(s): Rajeswari R.P ◽ Kavitha Juliet ◽ Aradhana

Keyword(s): Text Classification ◽ Naive Bayes ◽ Naïve Bayes ◽ Naive Bayes Classifier ◽ Bayes Classifier ◽ Naïve Bayes Classifier ◽ Data Set ◽ Student Data ◽ Knn Classifier

Statistical Analysis of Public Sentiment on the Ghanaian Government: A Machine Learning Approach

Advances in Human-Computer Interaction ◽ 10.1155/2021/5561204 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-7

Author(s): John Andoh ◽ Louis Asiedu ◽ Anani Lotsi ◽ Charlotte Chapman-Wardy

Keyword(s): Machine Learning ◽ Text Classification ◽ Naive Bayes ◽ Learning Algorithms ◽ Naïve Bayes ◽ Classification Systems ◽ Support Vector ◽ Naive Bayes Classifier ◽ Bayes Classifier ◽ Naïve Bayes Classifier

Gathering public opinions from the Internet and Internet-based applications like Twitter has become popular in recent times, as it provides decision-makers with uncensored public views on products, government policies, and programs. Through natural language processing and machine learning techniques, unstructured data from these sources can be analyzed using traditional statistical learning. A persistent challenge in machine-learning-based sentiment classification is the abundance of available data, which makes it difficult to train the learning algorithms in feasible time and eventually degrades their classification accuracy. The effect of training data size on classification tasks therefore cannot be overemphasized. This study statistically assessed the performance of the Naive Bayes, support vector machine (SVM), and random forest algorithms on a sentiment text classification task. It also investigated the conditions, such as data size, number of trees, and kernel type, under which each algorithm performed best. The study collected Twitter data from Ghanaian users that contained sentiments about the Ghanaian Government. The data was preprocessed, manually labeled by the researcher, and then used to train the aforementioned algorithms, which are three of the most popular learning algorithms and have had considerable success in diverse fields. The Naive Bayes classifier was adjudged the best algorithm for the task, as it outperformed the other two with an accuracy of 99%, an F1 score of 86.51%, and a Matthews correlation coefficient of 0.9906. It also performed well with increasing data sizes. The Naive Bayes classifier is recommended as viable for sentiment text classification, especially for text classification systems that work with Big Data.
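The three evaluation measures named in this abstract (accuracy, F1 score, and Matthews correlation coefficient) can all be derived from one confusion matrix. The sketch below uses the standard binary definitions. It assumes two sentiment classes encoded as 0 and 1, which the abstract does not state, and the function name `binary_metrics` and the toy labels are hypothetical.

```python
import math


def binary_metrics(y_true, y_pred):
    """Return (accuracy, F1, MCC) for binary labels encoded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)

    # F1 is the harmonic mean of precision and recall: 2TP / (2TP + FP + FN).
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC uses all four cells, so it stays informative when classes are imbalanced.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return accuracy, f1, mcc


# Toy labels (hypothetical): 1 = positive sentiment, 0 = negative sentiment.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
accuracy, f1, mcc = binary_metrics(y_true, y_pred)
print(accuracy, f1, mcc)  # 0.75 0.75 0.5
```

Accuracy and F1 range over [0, 1] while MCC ranges over [-1, 1], so the three values are not directly comparable to one another.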

WEIGHTED NAÏVE BAYES FOR TEXT CLASSIFICATION USING POSITIVE TERM-CLASS DEPENDENCY

International Journal of Artificial Intelligence Tools ◽ 10.1142/s0218213011004769 ◽ 2012 ◽ Vol 21 (01) ◽ pp. 1250008 ◽ Cited By ~ 11

Author(s): YANJUN LI ◽ CONGNAN LUO ◽ SOON M. CHUNG

Keyword(s): Text Classification ◽ Text Categorization ◽ Naive Bayes ◽ Real Data ◽ Naïve Bayes ◽ Data Sets ◽ Naive Bayes Classifier ◽ Bayes Classifier ◽ Naïve Bayes Classifier ◽ Positive Term

Naïve Bayes is a simple and efficient classification algorithm that performs well on text classification, also known as text categorization. Much research has been done to improve the performance of the naïve Bayes classifier by weighting correlated terms, in order to relax the strong assumption of independence between terms. In this paper, we first introduce a new χ² statistic, denoted by R_{w,c}, which can measure positive term-class dependency accurately, and then propose a new weighted naïve Bayes classifier that uses R_{w,c} in the training phase. Experimental results with real data sets show that our weighted naïve Bayes classifier performs much better than the basic naïve Bayes classifier in most cases.
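The abstract does not define R_{w,c}, so the sketch below shows only the standard χ² term-class statistic computed from a 2x2 contingency table, together with a sign check that separates positive from negative dependency. It is not the paper's statistic or its weighting scheme. The function name `chi2_term_class`, its argument names, and the toy counts are assumptions for illustration.

```python
def chi2_term_class(a, b, c, d):
    """Standard chi-square statistic for a term w and a class k.

    a: documents in class k that contain w
    b: documents outside class k that contain w
    c: documents in class k that do not contain w
    d: documents outside class k that do not contain w

    Returns (chi2, positive), where `positive` is True when w occurs in
    class k more often than independence would predict (a*d > b*c).
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0, False
    chi2 = n * (a * d - b * c) ** 2 / denom
    return chi2, a * d > b * c


# Toy counts (hypothetical): 10 documents, 4 of them in the class.
# The term occurs in 3 of the 4 in-class documents and 1 of the 6 others.
chi2_pos, is_pos = chi2_term_class(a=3, b=1, c=1, d=5)

# Same margins, but the term is concentrated outside the class.
chi2_neg, is_neg_positive = chi2_term_class(a=1, b=3, c=3, d=3)
```

The plain χ² value is unsigned, so a term that is strongly associated with the absence of a class can score as high as one associated with its presence. The sign check is one simple way to keep only the positive case.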
