Accelerating the EM Algorithm through Selective Sampling for Naive Bayes Text Classifier

2006 ◽  
Vol 13D (3) ◽  
pp. 369-376
Author(s):  
Jae-Young Chang ◽  
Han-Joon Kim
Author(s):  
Han-joon Kim

This chapter introduces two practical techniques for improving Naïve Bayes text classifiers, which are widely used for text classification. Naïve Bayes is regarded as a practical text classification algorithm because of its simple classification model, reasonable classification accuracy, and easily updated model. Many researchers therefore have a strong incentive to improve Naïve Bayes by combining it with meta-learning approaches such as EM (Expectation Maximization) and Boosting. The EM approach combines Naïve Bayes with the EM algorithm, and the Boosting approach uses Naïve Bayes as the base classifier in the AdaBoost algorithm. Both approaches use a special uncertainty measure suited to Naïve Bayes learning. Within the Naïve Bayes learning framework, these approaches are expected to be practical solutions to the shortage of training documents in text classification systems.
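As a rough illustration of the EM approach described above, the following is a minimal sketch of the generic combination of multinomial Naïve Bayes with EM over labeled and unlabeled documents. It is not the chapter's implementation: the uncertainty measure and any sampling step are omitted, and the function names (`train_nb`, `posterior`, `em_naive_bayes`), the Laplace smoothing constant `alpha`, and the fixed iteration count are all assumptions made for the example.

```python
import math

def train_nb(docs, weights, n_classes, vocab, alpha=1.0):
    """M-step: estimate log priors and log word likelihoods from soft labels.

    docs    : list of token lists
    weights : one list of class probabilities per document
    alpha   : Laplace smoothing constant (an assumption of this sketch)
    """
    class_mass = [alpha] * n_classes
    word_mass = [{w: alpha for w in vocab} for _ in range(n_classes)]
    for tokens, probs in zip(docs, weights):
        for c, p in enumerate(probs):
            class_mass[c] += p
            for w in tokens:
                word_mass[c][w] += p
    total = sum(class_mass)
    log_prior = [math.log(m / total) for m in class_mass]
    log_lik = []
    for c in range(n_classes):
        denom = sum(word_mass[c].values())
        log_lik.append({w: math.log(v / denom) for w, v in word_mass[c].items()})
    return log_prior, log_lik

def posterior(tokens, log_prior, log_lik):
    """E-step for one document: P(class | tokens) under the current model."""
    scores = [lp + sum(ll[w] for w in tokens if w in ll)
              for lp, ll in zip(log_prior, log_lik)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def em_naive_bayes(labeled, labels, unlabeled, n_classes, n_iter=10):
    """Fit on labeled docs, then alternate E- and M-steps over all docs."""
    vocab = {w for d in labeled + unlabeled for w in d}
    hard = [[1.0 if c == y else 0.0 for c in range(n_classes)] for y in labels]
    model = train_nb(labeled, hard, n_classes, vocab)
    for _ in range(n_iter):
        # E-step: soft-label the unlabeled documents with the current model.
        soft = [posterior(d, *model) for d in unlabeled]
        # M-step: re-estimate parameters from labeled and soft-labeled docs.
        model = train_nb(labeled + unlabeled, hard + soft, n_classes, vocab)
    return model
```

Calling `em_naive_bayes` with a handful of labeled token lists and a larger pool of unlabeled ones returns a `(log_prior, log_lik)` pair that can be passed to `posterior` to classify new documents. Words that appear only in unlabeled documents still receive class-specific weights, which is how the unlabeled pool can help when training documents are scarce.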


2021 ◽  
Vol 5 (2) ◽  
pp. 94-105
Author(s):  
Muhammad Danial Romadloni ◽  
Indra Gita Anugrah

Movies are familiar to everyone, from children and adolescents to adults, whether they watch out of interest, as a hobby, or to fill their spare time. Movies once could be watched only in the cinema or on television, months after release; with the development of technology they have become increasingly easy to enjoy and can now be watched through paid television services and on smartphones. One of the websites that viewers often use to review movies they have watched is IMDb. The review data can be used for opinion mining, to determine whether the audience considers the reviewed movie good or not. One algorithm often used for this is Naïve Bayes: apart from being easy to implement, it is known to be very fast and easy to use for predicting classes on a test dataset. The purpose of this study is to see how much the Expectation-Maximization algorithm increases accuracy in an opinion mining case study of movie reviews. The results show that using the Expectation-Maximization method increased accuracy by 4% compared to using Naïve Bayes alone.


Author(s):  
Indra Gita Anugrah ◽  
Harunur Rosyid

The rapid development of information technology today is accompanied by rapid growth in data. Data is highly valuable information, and its increasingly rapid growth makes it difficult to manage. One use of data is information retrieval on multimedia video portals. The more multimedia videos are stored in a repository, the harder searching becomes. When searching, users sometimes want correlations among the search results. Forming such correlations requires topic modeling that links the queries, words, and documents drawn from the multimedia video descriptions. One topic modeling method uses the Probabilistic Latent Semantic Analysis (PLSA) model with the Expectation-Maximization (EM) algorithm. EM is an algorithm for estimating parameters; its first stage computes the expectation values (Expectation). Computing the expectation requires topics as initial parameters, whose values are then updated by the Maximization step. The initial parameters are formed using the Naive Bayes algorithm, which predicts future events from previous experience.
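The abstract above describes PLSA fitted with EM. The following is a minimal sketch of the standard PLSA updates on a small document-word count matrix, intended only to show how the Expectation and Maximization steps alternate. It is not the authors' implementation: it starts from random parameters rather than the Naive Bayes initialization the abstract mentions, and the names `plsa` and `normalize`, the seed, and the iteration count are assumptions made for the example.

```python
import random

def normalize(values, eps=1e-12):
    """Scale non-negative values to sum to 1 (eps avoids division by zero)."""
    total = sum(values) + eps * len(values)
    return [(v + eps) / total for v in values]

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM. counts[d][w] = occurrences of word w in document d.

    Returns (p_z_d, p_w_z): P(topic | document) and P(word | topic).
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random start: an assumption of this sketch, not the abstract's method.
    p_z_d = [normalize([rng.random() for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() for _ in range(n_words)])
             for _ in range(n_topics)]
    for _ in range(n_iter):
        new_zd = [[0.0] * n_topics for _ in range(n_docs)]
        new_wz = [[0.0] * n_words for _ in range(n_topics)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: P(z | d, w) is proportional to P(z | d) * P(w | z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                norm = sum(joint)
                for z in range(n_topics):
                    resp = counts[d][w] * joint[z] / norm
                    new_zd[d][z] += resp
                    new_wz[z][w] += resp
        # M-step: renormalize the expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_zd]
        p_w_z = [normalize(row) for row in new_wz]
    return p_z_d, p_w_z
```

Each row of the returned `p_z_d` gives a document's topic mixture, and each row of `p_w_z` gives a topic's word distribution; documents and query terms can then be related through the topics they share.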


2019 ◽  
Vol 10 ◽  
pp. 1873-1885
Author(s):  
Guillermo Alfonso De la Torre Gea

The porous media approach has become more popular; it solves the equations of motion and energy numerically and thereby yields detailed distributions of temperature and airspeed. However, such models cannot forecast, in probabilistic terms, the relationships between the porosity of the crop volume and the variables that make up the climate in naturally ventilated greenhouses. A porous media model of the crop and its approximations were developed and analyzed through unsupervised Bayesian network clustering, with the aim of determining the influence of the porous medium, as a function of crop density, on the climate conditions in a naturally ventilated greenhouse. In addition, a naïve Bayes model was developed, trained without supervision by the EM algorithm and initialized with random parameters. The resulting model maximized the likelihood of the training data set. The relationships between the pressure drops at the flow limits of the crop were established. Porosity is directly influenced by humidity and temperature and, to a lesser extent, by CO2 concentration. Solar radiation, air speed and, to a lesser extent, height are inversely related to porosity. Applying naïve Bayes with EM to a CFD model has provided a greater understanding of the interactions between the variables.


Author(s):  
Agung Eddy Suryo Saputro ◽  
Khairil Anwar Notodiputro ◽  
Indahwati A

In 2018, Indonesia held Governor's Elections in 17 provinces. For several months before the elections, news and opinions about them were often trending topics on Twitter. This study aims to describe the results of sentiment mining and to determine the best method for predicting sentiment classes. Sentiment mining is lexicon-based, while the methods used for sentiment analysis are Naive Bayes and C5.0. The results showed that the percentage of positive sentiment in the 17 provinces was greater than that of negative and neutral sentiment. In addition, C5.0 produced better predictions than Naive Bayes.

