Large Scale Text Classification Using Map Reduce and Naive Bayes Algorithm for Domain Specified Ontology Building

Text classification is used in many areas of technology, such as spam classification, news categorization, and auto-correct for texting. One of the most popular algorithms for text classification today is Multinomial Naïve-Bayes. This paper explains how the Naïve-Bayes assumption is used to classify YouTube comments on the 2019 Indonesian election. The algorithm predicts whether a comment is spam or not spam. Spam messages are defined as racist comments, advertising comments, and unsolicited comments. The algorithm represents text with the bag-of-words method, which defines a text as the multiset of its words. The algorithm then calculates the probability of a word given the class, spam or not spam. The main difference between the normal Naïve-Bayes algorithm and Multinomial Naïve-Bayes is the way the algorithm treats the data: Multinomial Naïve-Bayes treats the data as frequency data, which makes it suitable for text classification tasks.
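The bag-of-words and per-class word probability steps described in this abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the toy comments, the function names `train` and `predict`, the whitespace tokenizer, and the add-one (Laplace) smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """Count documents per class and word frequencies per class (bag-of-words)."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()  # assumed tokenizer: split on whitespace
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def predict(text, model):
    """Return the class with the highest log posterior under the naive assumption."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing is an assumption, not taken from the abstract.
            p = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(p)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy data, invented for this sketch.
toy_comments = [
    ("buy cheap followers now", "spam"),
    ("click here for free gift", "spam"),
    ("great debate last night", "not spam"),
    ("i agree with this candidate", "not spam"),
]
model = train(toy_comments)
```

Because each word contributes once per occurrence, repeated words shift the score further, which is the frequency-based treatment the abstract attributes to the multinomial variant.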

Acceleration of Naive-Bayes algorithm on multicore processor for massive text classification

2014 International Symposium on Integrated Circuits (ISIC) ◽

10.1109/isicir.2014.7029490 ◽

2014 ◽

Cited By ~ 3

Author(s):

Lijun Zhou ◽

Zhiyi Yu ◽

Jie Lin ◽

Shikai Zhu ◽

Weijing Shi ◽

...

Keyword(s):

Text Classification ◽

Naive Bayes ◽

Naïve Bayes ◽

Multicore Processor ◽

Bayes Algorithm

Development of Big Data App for Classification based on Map Reduce of Naive Bayes with or without Web and Mobile Interface by RESTful API Using Hadoop and Spark

Journal of Information Technology and Computer Science ◽

10.25126/jitecs.202053233 ◽

2020 ◽

Vol 5 (3) ◽

pp. 302

Author(s):

Imam Cholissodin ◽

Diajeng Sekar Seruni ◽

Junda Alfiah Zulqornain ◽

Audi Nuermey Hanafi ◽

Afwan Ghofur ◽

...

Keyword(s):

Big Data ◽

Naive Bayes ◽

End Users ◽

Naïve Bayes ◽

Mobile App ◽

Map Reduce ◽

Data Application ◽

Big Data Application ◽

Web App ◽

Bayes Algorithm

Big Data App is a framework that we developed based on our previous research project and have uploaded to GitHub; that project develops a lightweight serverless setup on both Windows and Linux, under the name EdUBig, as an open source Hadoop distribution. This study focuses on the difficulty of building frontend and backend models for a Big Data application, which by default only runs scripts through a console in the terminal. This is a considerable obstacle for end users once the Big Data application has been released and distributed to general users, and it also affects how end users test the performance of the Map Reduce Naive Bayes algorithm on several datasets. To address these problems, we created the Big Data App framework to make it easier for end users, especially developers, to build a Big Data application. The frontend integrates a Web App built with the Django framework and a native Mobile App, while the backend uses the Django framework, which can communicate directly and easily with scripts for Hadoop batch processing, streaming processing, or Spark streaming, as well as scripts for Pig, Hive, WebHDFS, Sqoop, Oozie, and so on; such applications can be built very quickly with reliable results. The test results show a very significant improvement in the ease of data computation for end users, and the highest classification accuracy obtained was 88.3576%.

Keywords: big data, map reduce of naive bayes, serverless, web and mobile app, restful api, django framework
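The abstract does not describe how the Map Reduce Naive Bayes job is structured, so the following is only a rough sketch of one common way to express the counting phase of Naive Bayes training as map and reduce steps. It runs in plain Python without Hadoop or Spark; the function names, key layout, and sample records are hypothetical.

```python
from collections import defaultdict
from itertools import chain

def mapper(record):
    """Emit (key, 1) pairs: one per document for the class, one per word occurrence."""
    label, text = record
    yield (("doc", label), 1)
    for word in text.lower().split():
        yield (("word", label, word), 1)

def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle phase."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)

def run_job(records):
    """Run map, shuffle, and reduce in-process over an iterable of records."""
    mapped = chain.from_iterable(mapper(r) for r in records)
    return dict(reducer(k, v) for k, v in shuffle(mapped).items())

# Hypothetical labelled records, invented for this sketch.
records = [
    ("spam", "free gift free"),
    ("ham", "meeting at noon"),
    ("spam", "free offer"),
]
counts = run_job(records)
```

The resulting counts are the sufficient statistics a Naive Bayes classifier needs (documents per class and word occurrences per class), which is why the training step parallelizes naturally across data partitions.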

Analisis Sentimen Pembelajaran Daring Pada Twitter di Masa Pandemi COVID-19 Menggunakan Metode Naïve Bayes

JURNAL MEDIA INFORMATIKA BUDIDARMA ◽

10.30865/mib.v5i1.2580 ◽

2021 ◽

Vol 5 (1) ◽

pp. 157

Author(s):

Samsir Samsir ◽

Ambiyar Ambiyar ◽

Unung Verawardina ◽

Firman Edi ◽

Ronal Watrianthos

Keyword(s):

Online Learning ◽

High Frequency ◽

Large Scale ◽

Naive Bayes ◽

Naïve Bayes ◽

Face To Face ◽

Face To Face Learning ◽

Bayes Algorithm ◽

Sudden Transition ◽

Negative Sentiment

The WHO announced that more than 52 million people had tested positive for Covid-19 and 1.2 million had died as of the second week of November 2020. Meanwhile, Indonesia recorded 463 thousand confirmed positive cases and 15,148 deaths. The strategy against the pandemic has included socialization measures. However, online learning, which was initially adopted as a technique, became controversial because the adaptation period was so brief. The sudden large-scale transition from face-to-face learning to online learning has produced a wide continuum of social reactions. This research focuses on public opinion about online learning during the COVID-19 pandemic in Indonesia in early November 2020. The analysis was carried out on Twitter by mining document-based text, which was interpreted using the Naïve Bayes algorithm. The results show that sentiment toward online learning over the period was 30 percent positive, 69 percent negative, and 1 percent neutral. The large share of negative sentiment reflects community dissatisfaction with online learning. Some tweets indicate disappointment, with the words "stress" and "lazy" among the high-frequency words in the conversation.

A Chinese text classification system based on Naive Bayes algorithm

MATEC Web of Conferences ◽

10.1051/matecconf/20164401015 ◽

2016 ◽

Vol 44 ◽

pp. 01015 ◽

Cited By ~ 1

Author(s):

Wei Cui

Keyword(s):

Chinese Text ◽

Text Classification ◽

Classification System ◽

Naive Bayes ◽

Naïve Bayes ◽

Chinese Text Classification ◽

Bayes Algorithm

IMPLEMENTASI ALGORITMA NAIVE BAYES UNTUK MEMPREDIKSI FREKUENSI TUNAI PADA MESIN ATM DI MASA TRANSISI PEMBATASAN SOSIAL BERSKALA BESAR (PSBB) PANDEMI COVID-19

SINTECH (Science and Information Technology) Journal ◽

10.31598/sintechjournal.v4i1.622 ◽

2021 ◽

Vol 4 (1) ◽

pp. 47-52

Author(s):

Saptari Wijaya Mulia ◽

Sujiharno Sujiharno ◽

Arief Wibowo

Keyword(s):

Quality Of Service ◽

Prediction Accuracy ◽

Large Scale ◽

Naive Bayes ◽

Naïve Bayes ◽

Seasonal Factors ◽

Bayes Algorithm ◽

The Right

The amount of money each ATM needs usually differs, which is one of the problems in managing ATM cash allocation. Seasonal factors such as holidays, and the implementation of transitional large-scale social restrictions related to the Covid-19 pandemic, can affect fluctuations in cash transactions. This paper aims to determine the frequency of cash withdrawals at ATMs since the enactment of transitional large-scale social restrictions in Jakarta using the Naive Bayes algorithm, so that the ATMs requiring a larger cash allocation can be identified. Providing the right allocation can improve the quality of service to customers and minimize unused money in ATMs. The analysis, which uses a Naive Bayes algorithm to predict cash withdrawal frequencies at ATMs, shows a prediction accuracy of up to 81%.

Emotion Identification between POMS and Multinomial Naive Bayes Algorithm Using Twitter API

International Journal of Computer Sciences and Engineering ◽

10.26438/ijcse/v7i7.1419 ◽

2019 ◽

Vol 7 (7) ◽

pp. 14-19 ◽

Cited By ~ 1

Author(s):

Asharani S Dandoti ◽

Sunil M Sangve

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Emotion Identification ◽

Bayes Algorithm

Algorithm Comparation of Naive Bayes and Support Vector Machine based on Particle Swarm Optimization in Sentiment Analysis of Freight Forwarding Services

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v4i2.1840 ◽

2020 ◽

Vol 4 (2) ◽

pp. 362-369

Author(s):

Sharazita Dyah Anggita ◽

Ikmah

Keyword(s):

Support Vector Machine ◽

Sentiment Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector ◽

The Public ◽

Svm Algorithm ◽

Bayes Algorithm ◽

Freight Forwarding ◽

Improved Accuracy

The community's need for freight forwarding is increasing with the growth of online marketplaces. Users currently express their opinions about freight forwarding services through many channels, one of which is the social media platform Twitter. Sentiment analysis makes it possible to see whether an opinion tends to be positive or negative. The methods that can be applied to sentiment analysis include the Naive Bayes algorithm and the Support Vector Machine (SVM). This research implements both algorithms, optimized using the Particle Swarm Optimization (PSO) algorithm, for sentiment analysis. Testing is done by setting the PSO parameters for each classifier. The results show an accuracy increase of 15.11% for the PSO-based Naive Bayes algorithm. The PSO-based SVM algorithm improved accuracy by 1.74% with the sigmoid kernel.

Analysis of Sentiment of Moving a National Capital with Feature Selection Naive Bayes Algorithm and Support Vector Machine

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v4i3.1942 ◽

2020 ◽

Vol 4 (3) ◽

pp. 504-512

Author(s):

Faried Zamachsari ◽

Gabriel Vangeran Saragih ◽

Susafa'ati ◽

Windu Gata

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Feature Selection ◽

Public Opinion ◽

Naive Bayes ◽

Naïve Bayes ◽

Capital City ◽

Support Vector ◽

National Capital ◽

Bayes Algorithm

The decision to move Indonesia's capital city to East Kalimantan received mixed responses on social media. The still-high poverty rate and the country's difficult finances are factors in disapproval of the relocation of the national capital. Twitter, as one of the popular social media platforms, is used by the public to express these opinions. The goals of this study are to identify the tendency of community responses to the move of the national capital, and to perform sentiment analysis of public opinion on the move using the Naive Bayes algorithm and the Support Vector Machine with feature selection in order to obtain the highest accuracy. The sentiment analysis data are taken from Indonesian-language public opinion in tweets, collected from Twitter by crawling. The search terms used are #IbuKotaBaru and #PindahIbuKota. The stages of the research consist of collecting data from Twitter, polarity labeling, and preprocessing, which comprises case transformation, cleansing, tokenizing, filtering, and stemming. Feature selection is used to increase accuracy, and the data are then divided into training and testing sets at predetermined ratios. The next step is a comparison between the Support Vector Machine and Naive Bayes methods to determine which is more accurate. In the data period studied, 24.26% of the sentiment about the move to a new capital city was positive and 75.74% was negative. Using the Rapid Miner software, the best accuracy for Naive Bayes with feature selection is 88.24% at a ratio of 9:1, while the best accuracy for the Support Vector Machine with feature selection is 78.77% at a ratio of 5:5.
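The preprocessing stages named in this abstract (case transformation, cleansing, tokenizing, filtering) can be sketched as a small pipeline. The study used Rapid Miner, so this is only an illustration and not the authors' setup: the cleansing rules, the tiny stopword list, the function names, and the sample tweet are assumptions. Stemming is left out because it needs a language-specific stemmer.

```python
import re

# Hypothetical, deliberately tiny Indonesian stopword list for illustration.
STOPWORDS = {"yang", "dan", "di", "ke"}

def transform_case(text):
    """Lowercase the tweet."""
    return text.lower()

def cleanse(text):
    """Drop URLs, mentions, hashtags, and non-letter characters (assumed rules)."""
    text = re.sub(r"https?://\S+", " ", text)
    text = re.sub(r"[@#]\w+", " ", text)
    return re.sub(r"[^a-z\s]", " ", text)

def tokenize(text):
    """Split on whitespace."""
    return text.split()

def filter_tokens(tokens):
    """Remove stopwords."""
    return [t for t in tokens if t not in STOPWORDS]

def preprocess(text):
    """Apply the stages in the order the abstract lists them, without stemming."""
    return filter_tokens(tokenize(cleanse(transform_case(text))))

# Hypothetical sample tweet, invented for this sketch.
sample = "Pindah ibu kota ke Kalimantan! #PindahIbuKota https://t.co/abc"
tokens = preprocess(sample)
```

Whether hashtags should be dropped or kept as features is a design choice; they are removed here only to keep the sketch short.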

Parallel naive Bayes algorithm for large-scale Chinese text classification based on spark

Spam Classification on 2019 Indonesian President Election Youtube Comments Using Multinomial Naïve-Bayes
