Handling missing data in software effort prediction with naive Bayes and EM algorithm

Author(s):  
Wen Zhang ◽  
Ye Yang ◽  
Qing Wang
2021 ◽  
Vol 5 (2) ◽  
pp. 94-105
Author(s):  
Muhammad Danial Romadloni ◽  
Indra Gita Anugrah

Movies are familiar to everyone, from children and adolescents to adults, whether watched out of curiosity, as a hobby, or to fill spare time. Movies could once be watched only in the cinema or, months after release, on television; with the development of technology they can now be watched through paid television services and on smartphones. One of the websites that viewers often use to review movies they have watched is IMDb. The review data can be used for opinion mining, that is, to determine whether the audience considers the reviewed movie good or not. One of the algorithms often used for this is Naïve Bayes: apart from being easy to implement, it is known to be very fast at predicting classes on a test dataset. The purpose of this study is to see how much the Expectation-Maximization algorithm improves accuracy in an opinion mining case study on movie reviews. The results show that using the Expectation-Maximization method increased accuracy by 4% compared to using Naïve Bayes alone.
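
The abstract does not say how EM is coupled with Naïve Bayes. One common arrangement, assumed here, is semi-supervised training: a multinomial Naïve Bayes model is fitted on labeled reviews, then EM alternates between soft-labeling unlabeled reviews (E-step) and refitting on all reviews (M-step). The sketch below illustrates that idea only; the function names, the Laplace smoothing constant `alpha`, and the toy reviews are hypothetical and are not taken from the study.

```python
import math
from collections import Counter

def train_nb(docs, weights, vocab, classes, alpha=1.0):
    """Fit multinomial Naive Bayes from soft class weights (one dict per doc)."""
    prior = {c: alpha for c in classes}
    counts = {c: Counter() for c in classes}
    for words, w in zip(docs, weights):
        for c in classes:
            prior[c] += w[c]
            for word in words:
                counts[c][word] += w[c]
    total = sum(prior.values())
    log_prior = {c: math.log(prior[c] / total) for c in classes}
    log_like = {}
    for c in classes:
        denom = sum(counts[c].values()) + alpha * len(vocab)
        log_like[c] = {w: math.log((counts[c][w] + alpha) / denom) for w in vocab}
    return log_prior, log_like

def posterior(words, model, classes):
    """Return P(class | words); words outside the vocabulary are ignored."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][w] for w in words if w in log_like[c])
              for c in classes}
    top = max(scores.values())
    exp = {c: math.exp(s - top) for c, s in scores.items()}
    z = sum(exp.values())
    return {c: v / z for c, v in exp.items()}

def em_naive_bayes(labeled, unlabeled, classes, n_iter=10):
    """Semi-supervised Naive Bayes: labeled docs keep hard labels throughout."""
    docs_l = [d for d, _ in labeled]
    hard = [{c: 1.0 if c == y else 0.0 for c in classes} for _, y in labeled]
    vocab = {w for d in docs_l + unlabeled for w in d}
    model = train_nb(docs_l, hard, vocab, classes)
    for _ in range(n_iter):
        soft = [posterior(d, model, classes) for d in unlabeled]           # E-step
        model = train_nb(docs_l + unlabeled, hard + soft, vocab, classes)  # M-step
    return model

# Hypothetical toy reviews, already tokenized.
classes = ["pos", "neg"]
labeled = [("great fun great".split(), "pos"), ("boring awful".split(), "neg")]
unlabeled = ["great brilliant".split(), "awful dull".split(),
             "brilliant fun".split(), "dull boring".split()]
model = em_naive_bayes(labeled, unlabeled, classes)
print(posterior(["brilliant"], model, classes))
```

The word "brilliant" never appears in a labeled review, so a model trained on labeled data alone treats it as neutral; after EM it leans positive because it co-occurs with "great" and "fun" in the unlabeled reviews.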


Author(s):  
M. Ilić ◽  
Z. Srdjević ◽  
B. Srdjević

In a fast-changing world with increasing water demand, water pollution, environmental problems, and related data, information on water quality and its suitability for any purpose should be prompt and reliable. Traditional approaches often fail to predict water quality classes, and new ones are needed to handle large amounts of data or missing data and to predict water quality in real time. One such approach is machine-learning (ML) based prediction. This paper presents the results of applying Naïve Bayes, a widely used ML method, to create a prediction model. The proposed model is based on nine water quality parameters: temperature, pH value, electrical conductivity, oxygen saturation, biological oxygen demand, suspended solids, nitrogen oxides, orthophosphates, and ammonium. It was created in the Netica software and tested and verified using data covering the period 2013–2019 from five locations in Vojvodina Province, Serbia. Forty-eight samples were used to train the model. Once trained, the Naïve Bayes model correctly predicted the class of the water sample in 64 out of 68 cases, including cases with missing data. This recommends it as a trustworthy tool in the transition from traditional to digital water management.
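
Why Naïve Bayes tolerates missing data can be shown in a few lines: because the features are assumed conditionally independent given the class, an unobserved feature marginalizes out and its factor can simply be left out of the product. The sketch below is a minimal illustration on discretized features, not the Netica model from the paper; the feature bands, class names, and sample rows are hypothetical, and `None` marks a missing value.

```python
import math
from collections import defaultdict

def fit(rows, labels, alpha=1.0):
    """Fit categorical Naive Bayes; None marks a missing value."""
    classes = sorted(set(labels))
    n_features = len(rows[0])
    values = [sorted({r[j] for r in rows if r[j] is not None})
              for j in range(n_features)]
    class_count = {c: labels.count(c) for c in classes}
    counts = defaultdict(float)  # (feature, value, class) -> count
    seen = defaultdict(float)    # (feature, class) -> observed values
    for r, y in zip(rows, labels):
        for j, v in enumerate(r):
            if v is not None:    # a missing training value adds no count
                counts[(j, v, y)] += 1
                seen[(j, y)] += 1

    def cond(j, v, c):
        # Laplace-smoothed estimate of P(feature j = v | class c)
        return (counts[(j, v, c)] + alpha) / (seen[(j, c)] + alpha * len(values[j]))

    return classes, class_count, cond

def predict_proba(model, row):
    """Posterior over classes, skipping the factor of every missing feature."""
    classes, class_count, cond = model
    n = sum(class_count.values())
    log_scores = {}
    for c in classes:
        s = math.log(class_count[c] / n)
        for j, v in enumerate(row):
            if v is not None:
                s += math.log(cond(j, v, c))
        log_scores[c] = s
    top = max(log_scores.values())
    z = sum(math.exp(s - top) for s in log_scores.values())
    return {c: math.exp(s - top) / z for c, s in log_scores.items()}

# Hypothetical samples: (pH band, oxygen saturation band).
rows = [("normal", "high"), ("normal", "high"), ("normal", None),
        ("acidic", "low"), ("acidic", "low"), ("normal", "low")]
labels = ["good", "good", "good", "poor", "poor", "poor"]
model = fit(rows, labels)
print(predict_proba(model, ("normal", None)))  # oxygen reading missing
print(predict_proba(model, (None, None)))      # falls back to the class prior
```

With every feature missing, the prediction reduces to the class prior, which is the expected behaviour of marginalization.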


Author(s):  
Indra Gita Anugrah ◽  
Harunur Rosyid

<p>The rapid development of information technology has been accompanied by rapid growth in data. Data is very valuable information, but its increasingly rapid growth makes it difficult to manage. One use of data is information retrieval on multimedia video portals. The more multimedia videos are stored in a repository, the harder searching becomes. When searching, users sometimes want the search results to be correlated. To establish correlation among search results, a topic model is needed that links the query, the words, and the documents drawn from the multimedia video descriptions. One topic modeling method is the <em>Probabilistic Latent Semantic Analysis</em> <em>(PLSA)</em> model with the <em>Expectation-Maximization algorithm (EM algorithm)</em>. The EM algorithm is an algorithm for estimating parameters; its first stage computes the expected values <em>(Expectation).</em> Computing the expected values requires topics as initial parameters, whose values are then updated in the <em>Maximization</em> step. The initial parameters are formed using the <em>Naive Bayes</em> algorithm, which predicts future events from previous experience.</p>
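
The two EM stages described above can be sketched for PLSA as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the initial parameters are drawn at random here rather than from Naive Bayes as in the abstract, and the function names, the small floor constant, and the toy count matrix are hypothetical.

```python
import random

def normalize(xs):
    total = sum(xs)
    return [x / total for x in xs]

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM. counts[d][w] is how often word w occurs in document d.

    Returns (p_w_z, p_z_d): P(word | topic) per topic, P(topic | doc) per doc.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])
    # Random initialization (assumption: replaces the Naive Bayes based start).
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(n_words)])
             for _ in range(n_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    for _ in range(n_iter):
        # Tiny floor keeps every probability strictly positive.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                if counts[d][w] == 0:
                    continue
                # E-step: P(topic | doc, word) is proportional to
                # P(topic | doc) * P(word | topic).
                resp = normalize([p_z_d[d][z] * p_w_z[z][w]
                                  for z in range(n_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(n_topics):
                    new_w_z[z][w] += counts[d][w] * resp[z]
                    new_z_d[d][z] += counts[d][w] * resp[z]
        # M-step: renormalize the expected counts into probabilities.
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d

# Hypothetical word counts for four video descriptions over four words.
counts = [[4, 3, 0, 0], [3, 4, 0, 0], [0, 0, 4, 3], [0, 0, 3, 4]]
p_w_z, p_z_d = plsa(counts, n_topics=2)
for d, row in enumerate(p_z_d):
    print(d, [round(p, 3) for p in row])
```

A query can then be related to documents through the shared topics, for example by comparing their `P(topic | doc)` vectors.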


Author(s):  
Lukman Syafie ◽  
Fitriyani Umar ◽  
Aliyazid Mude ◽  
Herdianti Darwis ◽  
Herman ◽  
...  

2019 ◽  
Vol 10 ◽  
pp. 1873-1885
Author(s):  
Guillermo Alfonso De la Torre Gea

The porous media approach has become more popular because it solves the equations of motion and energy numerically and therefore yields detailed distributions of temperature and airspeed. However, such models cannot forecast, in terms of probability, the relationships between the porosity of the crop volume and the variables that make up the climate in naturally ventilated greenhouses. A porous media model of the crop and its approximations were developed and analyzed through unsupervised Bayesian network clustering, with the aim of determining the influence of the porous medium, as a function of crop density, on the climate conditions in a naturally ventilated greenhouse. In addition, an unsupervised naïve Bayes model was trained with the EM algorithm, initialized with random parameters. The resulting model maximized the likelihood of the training data set. The relationships between the pressure drops at the flow limits of the crop were established. Porosity is directly influenced by humidity and temperature and, slightly, by CO2 concentration. Solar radiation, air speed and, slightly, height are inversely related to porosity. Applying naïve Bayes with EM to a CFD model has provided a greater understanding of the interactions between the variables.
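
An unsupervised naïve Bayes model of the kind mentioned above treats the class as a hidden cluster label and fits it by EM from a random start. The sketch below is a minimal illustration with binary features, not the model from the paper; the function name, the binarized observations, and the feature meanings are hypothetical. It also records the log-likelihood at each iteration, which EM never decreases.

```python
import math
import random

def em_latent_class(data, n_clusters, n_iter=30, seed=0):
    """Cluster binary feature vectors with a naive Bayes mixture fitted by EM.

    Returns (weights, theta, trace), where theta[k][j] = P(feature j = 1 |
    cluster k) and trace holds the log-likelihood before each M-step.
    """
    rng = random.Random(seed)
    n_feat = len(data[0])
    weights = [1.0 / n_clusters] * n_clusters
    # Random initial parameters, kept away from 0 and 1.
    theta = [[rng.uniform(0.25, 0.75) for _ in range(n_feat)]
             for _ in range(n_clusters)]
    trace = []
    for _ in range(n_iter):
        # E-step: responsibility of each cluster for each observation.
        resp, loglik = [], 0.0
        for x in data:
            joint = [weights[k] * math.prod(t if v else 1.0 - t
                                            for t, v in zip(theta[k], x))
                     for k in range(n_clusters)]
            total = sum(joint)
            loglik += math.log(total)
            resp.append([j / total for j in joint])
        trace.append(loglik)
        # M-step: responsibility-weighted maximum likelihood estimates.
        for k in range(n_clusters):
            mass = sum(r[k] for r in resp)
            weights[k] = mass / len(data)
            if mass > 0:
                theta[k] = [sum(r[k] * x[j] for r, x in zip(resp, data)) / mass
                            for j in range(n_feat)]
    return weights, theta, trace

# Hypothetical binarized observations: (humid, warm, high CO2).
data = [(1, 1, 1), (1, 1, 0), (1, 1, 1), (1, 0, 1),
        (0, 0, 0), (0, 0, 1), (0, 0, 0), (0, 1, 0)]
weights, theta, trace = em_latent_class(data, n_clusters=2)
print([round(w, 3) for w in weights])
print(round(trace[0], 3), round(trace[-1], 3))
```

Because EM only finds a local maximum of the likelihood, different seeds can give different clusterings; running several random starts and keeping the best final log-likelihood is a common precaution.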

