Parallel naive Bayes algorithm for large-scale Chinese text classification based on spark

2019 ◽  
Vol 26 (1) ◽  
pp. 1-12 ◽  
Author(s):  
Peng Liu ◽  
Hui-han Zhao ◽  
Jia-yu Teng ◽  
Yan-yan Yang ◽  
Ya-feng Liu ◽  
...  
Author(s):  
Jonathan Radot Fernando ◽  
Raymond Budiraharjo ◽  
Emeraldi Haganusa

Text classification is used in many technologies, such as spam classification, news categorization, and auto-correct in texting. One of the most popular algorithms for text classification today is Multinomial Naïve-Bayes. This paper explains how the Naïve-Bayes assumption is applied to classify YouTube comments on the 2019 Indonesian election. The algorithm's output prediction is spam or not spam. Spam messages are defined as racist comments, advertising comments, and unsolicited comments. The algorithm represents text with the bag-of-words method, which defines a text as the multiset of its words. The algorithm then calculates the probability of a word given the class, spam or not spam. The main difference between the standard Naïve-Bayes algorithm and Multinomial Naïve-Bayes is the way the algorithm treats the data: Multinomial Naïve-Bayes treats the data as frequency counts, which makes it suitable for text classification tasks.
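The bag-of-words and per-class word probability steps described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the whitespace tokenizer, the add-one (Laplace) smoothing, the function names `train` and `predict`, and the toy comments in `TOY` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy comments; not data from the paper.
TOY = [
    ("buy cheap followers now", "spam"),
    ("click here to win a prize now", "spam"),
    ("the debate was interesting", "not spam"),
    ("i liked the candidate speech", "not spam"),
]

def train(docs):
    """Count documents per class and word frequencies per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)  # class -> bag (multiset) of words
    vocab = set()
    for text, label in docs:
        tokens = text.lower().split()   # assumed tokenizer: whitespace split
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    n = sum(class_docs.values())
    log_prior = {c: math.log(k / n) for c, k in class_docs.items()}
    totals = {c: sum(word_counts[c].values()) for c in class_docs}
    return log_prior, word_counts, totals, vocab

def predict(model, text):
    """Return the class with the highest log posterior score."""
    log_prior, word_counts, totals, vocab = model
    # Bag-of-words for the new text; words unseen in training are ignored.
    bag = Counter(t for t in text.lower().split() if t in vocab)
    scores = {}
    for c in log_prior:
        score = log_prior[c]
        for word, freq in bag.items():
            # P(word | class) with add-one smoothing (an assumption here).
            p = (word_counts[c][word] + 1) / (totals[c] + len(vocab))
            # Multinomial model: each occurrence of the word contributes.
            score += freq * math.log(p)
        scores[c] = score
    return max(scores, key=scores.get)

model = train(TOY)
label = predict(model, "win cheap followers")
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied, and the `freq` multiplier is what makes the model treat text as frequency data rather than as word presence or absence.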


2021 ◽  
Vol 5 (1) ◽  
pp. 157
Author(s):  
Samsir Samsir ◽  
Ambiyar Ambiyar ◽  
Unung Verawardina ◽  
Firman Edi ◽  
Ronal Watrianthos

The WHO announced that more than 52 million people had tested positive for Covid-19, and 1.2 million had died, by the second week of November 2020. Meanwhile, Indonesia recorded 463 thousand confirmed positive cases and 15,148 deaths. Strategies against the pandemic have incorporated socialization. However, online learning, which was initially adopted as a technique, became controversial due to the briefness of the adaptation process. The sudden, large-scale transition from face-to-face learning to online learning has resulted in a wide continuum of social reactions. This research focuses on public opinion on online learning during the COVID-19 pandemic in Indonesia in early November 2020. The analysis was carried out on Twitter by mining document-based text, which was interpreted using the Naïve Bayes algorithm. The results show that sentiment toward online learning was 30 percent positive, 69 percent negative, and 1 percent neutral over the period. The significant amount of negative sentiment stems from community dissatisfaction with online learning. Some tweets indicate disappointment, with the words "stress" and "lazy" appearing as high-frequency words in the conversation.


2021 ◽  
Vol 4 (1) ◽  
pp. 47-52
Author(s):  
Saptari Wijaya Mulia ◽  
Sujiharno Sujiharno ◽  
Arief Wibowo

The amount of cash each ATM needs usually differs, which is one of the problems in managing ATM cash allocation. Seasonal factors such as holidays, and the implementation of transitional large-scale social restrictions related to the COVID-19 pandemic, can affect fluctuations in cash transactions. This paper aims to determine the frequency of cash withdrawals at ATMs since the enactment of transitional large-scale social restrictions in Jakarta using the Naive Bayes algorithm, so that ATMs requiring a larger cash allocation can be identified. Providing the right cash allocation can improve the quality of service to customers and minimize unused cash in ATMs. The analysis, which uses a Naive Bayes algorithm to predict cash withdrawal frequencies at ATMs, shows a prediction accuracy of up to 81%.

