Handling missing data in software effort prediction with naive Bayes and EM algorithm

Movies are familiar to everyone, from children and adolescents to adults, whether they are watched out of interest, as a hobby, or to fill spare time. Movies once could be seen only at the cinema or on television months after release; with advances in technology they are now increasingly easy to enjoy, through paid television services and on smartphones. One website that viewers often use to review the movies they have watched is IMDb. The review data can be used for opinion mining, that is, to determine whether audiences consider the reviewed movie good or not. One of the algorithms often used for this task is Naïve Bayes, which is easy to implement and is known to be fast at predicting classes on a test dataset. The purpose of this study is to measure how much the Expectation-Maximization algorithm improves accuracy in opinion mining, using movie reviews as a case study. With the Expectation-Maximization method, accuracy increased by 4% compared to using Naïve Bayes alone.

Water quality prediction based on Naïve Bayes algorithm

Water Science & Technology ◽

10.2166/wst.2022.006 ◽

2022 ◽

Author(s):

M. Ilić ◽

Z. Srdjević ◽

B. Srdjević

Keyword(s):

Water Quality ◽

Missing Data ◽

Naive Bayes ◽

pH Value ◽

Oxygen Demand ◽

Quality Parameters ◽

Naïve Bayes ◽

Water Quality Prediction ◽

Related Data ◽

Bayes Algorithm

In a fast-changing world with increased water demand, water pollution, environmental problems, and growing volumes of related data, information on water quality and its suitability for any purpose should be prompt and reliable. Traditional approaches often fail to predict water quality classes, and new ones are needed that can handle large amounts of data or missing data and predict water quality in real time. One such approach is machine-learning (ML) based prediction. This paper presents the results of applying Naïve Bayes, a widely used ML method, to create a prediction model. The proposed model is based on nine water quality parameters: temperature, pH value, electrical conductivity, oxygen saturation, biological oxygen demand, suspended solids, nitrogen oxides, orthophosphates, and ammonium. It was created in the Netica software and tested and verified using data covering the period 2013–2019 from five locations in Vojvodina Province, Serbia. Forty-eight samples were used to train the model. Once trained, the Naïve Bayes model correctly predicted the class of the water sample in 64 out of 68 cases, including cases with missing data. This recommends it as a trustworthy tool in the transition from traditional to digital water management.
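As an illustration of how a naive Bayes classifier can still predict when some measurements are absent, the following is a minimal sketch, not the paper's Netica model. It assumes categorical features, marks a missing value with `None`, and simply leaves that feature out of the product of likelihoods (which marginalizes it out under the naive independence assumption). The feature values, class names, and function names are hypothetical.

```python
import math
from collections import Counter, defaultdict

def train_nb(rows, labels, alpha=1.0):
    """Fit a categorical naive Bayes model; None marks a missing value."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    levels = defaultdict(set)            # feature index -> values seen in training
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            if value is None:            # missing: contributes no count
                continue
            value_counts[(j, label)][value] += 1
            levels[j].add(value)
    return class_counts, value_counts, levels, alpha

def predict_nb(model, row):
    """Return the most probable class, skipping missing features."""
    class_counts, value_counts, levels, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        log_p = math.log(count / total)
        for j, value in enumerate(row):
            if value is None:            # missing: factor is left out
                continue
            counts = value_counts[(j, label)]
            n_levels = len(levels[j] | {value})  # avoids a zero denominator
            log_p += math.log((counts[value] + alpha) /
                              (sum(counts.values()) + alpha * n_levels))
        scores[label] = log_p
    return max(scores, key=scores.get)

# Hypothetical toy data: (pH band, oxygen band) -> water quality class.
rows = [("neutral", "high"), ("neutral", "high"), ("neutral", None),
        ("acidic", "low"), ("acidic", "low"), (None, "low")]
labels = ["good", "good", "good", "poor", "poor", "poor"]
model = train_nb(rows, labels)
print(predict_nb(model, (None, "high")))    # pH band is missing
print(predict_nb(model, ("acidic", None)))  # oxygen band is missing
```

Skipping a missing feature is only one option; imputing the value first is another, and the abstract does not say which mechanism the authors relied on.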

Quality Assessment of Affymetrix GeneChip Data using the EM Algorithm and a Naive Bayes Classifier

2007 IEEE 7th International Symposium on BioInformatics and BioEngineering ◽

10.1109/bibe.2007.4375557 ◽

2007 ◽

Cited By ~ 1

Author(s):

Brian E. Howard ◽

Beate Sick ◽

Imara Perera ◽

Yang Ju Im ◽

Heike Winter-Sederoff ◽

...

Keyword(s):

EM Algorithm ◽

Quality Assessment ◽

Naive Bayes ◽

Naïve Bayes ◽

Affymetrix GeneChip ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

GeneChip Data ◽

The EM Algorithm

A Hybrid Self Organizing Map Imputation (SOMI) with Naïve Bayes for Imputation Missing Data Classification

International Journal of Geomate ◽

10.21660/2019.62.71789 ◽

2019 ◽

Vol 17 (62) ◽

Author(s):

Bain Khusnul Khotimah

Keyword(s):

Missing Data ◽

Naive Bayes ◽

Data Classification ◽

Naïve Bayes ◽

Self Organizing Map ◽

Self Organizing

Naïve Bayes vs. Support Vector Machine: Resilience to Missing Data

Artificial Intelligence and Computational Intelligence - Lecture Notes in Computer Science ◽

10.1007/978-3-642-23887-1_86 ◽

2011 ◽

pp. 680-687 ◽

Cited By ~ 6

Author(s):

Hongbo Shi ◽

Yaqin Liu

Keyword(s):

Support Vector Machine ◽

Missing Data ◽

Naive Bayes ◽

Naïve Bayes ◽

Support Vector

Application of Information Retrieval Using Topic Modeling on Multimedia Portal Descriptions (Penerapan Information Retrieval Menggunakan Pemodelan Topik Pada Deskripsi Portal Multimedia)

Jurnal Nasional Komputasi dan Teknologi Informasi (JNKTI) ◽

10.32672/jnkti.v2i1.1057 ◽

2019 ◽

Vol 2 (1) ◽

pp. 48

Author(s):

Indra Gita Anugrah ◽

Harunur Rosyid

Keyword(s):

Information Retrieval ◽

EM Algorithm ◽

Latent Semantic Analysis ◽

Semantic Analysis ◽

Naive Bayes ◽

Naïve Bayes ◽

Probabilistic Latent Semantic Analysis

The rapid development of information technology has been accompanied by rapid growth in data. Data is highly valuable information, but its rapid growth makes it difficult to manage. One use of data is information retrieval on multimedia video portals. The more multimedia videos are stored in a repository, the harder searching becomes. When searching, users sometimes want the search results to be correlated. Forming such correlations requires a topic model that links the queries, words, and documents drawn from the multimedia video descriptions. One topic modeling method uses the Probabilistic Latent Semantic Analysis (PLSA) model with the Expectation-Maximization (EM) algorithm. The EM algorithm estimates parameters; its first stage computes expected values (Expectation). Computing the expectation requires topics as initial parameters, whose values are then updated in the Maximization step. The initial parameters are formed using the Naive Bayes algorithm, which predicts future events from prior experience.

Accelerating the EM Algorithm through Selective Sampling for Naive Bayes Text Classifier

The KIPS Transactions PartD ◽

10.3745/kipstd.2006.13d.3.369 ◽

2006 ◽

Vol 13D (3) ◽

pp. 369-376

Author(s):

Jae-Young Chang ◽

Han-Joon Kim

Keyword(s):

EM Algorithm ◽

Naive Bayes ◽

Naïve Bayes ◽

Selective Sampling ◽

The EM Algorithm

Missing Data Handling Using The Naive Bayes Logarithm (NBL) Formula

2018 2nd East Indonesia Conference on Computer and Information Technology (EIConCIT) ◽

10.1109/eiconcit.2018.8878538 ◽

2018 ◽

Author(s):

Lukman Syafie ◽

Fitriyani Umar ◽

Aliyazid Mude ◽

Herdianti Darwis ◽

Herman ◽

...

Keyword(s):

Missing Data ◽

Naive Bayes ◽

Naïve Bayes ◽

Data Handling

Porous Media in the Simulation of Greenhouse Crops Using the Naïves Bayes EM Algorithm

Journal of Advances in Agriculture ◽

10.24297/jaa.v10i0.8115 ◽

2019 ◽

Vol 10 ◽

pp. 1873-1885

Author(s):

Guillermo Alfonso De la Torre Gea

Keyword(s):

Porous Media ◽

EM Algorithm ◽

Natural Ventilation ◽

Naive Bayes ◽

Equations of Motion ◽

Naïve Bayes ◽

Training Data ◽

Data Set ◽

Climate Conditions ◽

Porous Media Model

The porous media approach has become more popular because it solves the equations of motion and energy numerically and thereby yields detailed distributions of temperature and airspeed. However, such models cannot forecast, in terms of probability, the relationships between the porosity of the crop volume and the variables that make up the climate in naturally ventilated greenhouses. A porous media model of the crop and its approximations were developed and analyzed through unsupervised Bayesian network clustering, with the aim of determining the influence of the porous medium, as a function of crop density, on the climate conditions in a naturally ventilated greenhouse. In addition, a naïve Bayes model trained without supervision by the EM algorithm, initialized with random parameters, was developed. The resulting model maximized the likelihood of the training data set. The relationships between the pressure drops at the flow limits of the crop were established. Porosity is directly influenced by humidity and temperature and, to a lesser extent, by CO2 concentration. Solar radiation, air speed and, to a lesser extent, height are inversely related to porosity. Applying naïve Bayes EM to a CFD model provided a greater understanding of the interactions between the variables.
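To make the idea of a naïve Bayes model fitted without labels by EM from random initial parameters more concrete, here is a minimal sketch under stated assumptions. It is not the author's greenhouse model: it assumes binary (0/1) features and a hypothetical toy data set, whereas the paper works with climate variables. The E-step computes each cluster's responsibility for each row, and the M-step re-estimates the parameters from those soft counts, so the log-likelihood of the training data should not decrease between iterations.

```python
import math
import random

def em_naive_bayes(data, k=2, iters=30, seed=0):
    """Fit a k-cluster Bernoulli naive Bayes mixture by EM.

    data: list of equal-length 0/1 tuples.
    Returns (priors, thetas, history); history[t] is the log-likelihood
    of the data under the parameters at the start of iteration t.
    """
    rng = random.Random(seed)
    d = len(data[0])
    priors = [1.0 / k] * k
    # Random initial parameters, kept away from 0 and 1.
    thetas = [[rng.uniform(0.25, 0.75) for _ in range(d)] for _ in range(k)]
    history = []
    for _ in range(iters):
        # E-step: responsibility of each cluster for each row.
        resp, log_lik = [], 0.0
        for x in data:
            joint = []
            for c in range(k):
                p = priors[c]
                for j in range(d):
                    p *= thetas[c][j] if x[j] else 1.0 - thetas[c][j]
                joint.append(p)
            total = sum(joint)
            log_lik += math.log(total)
            resp.append([p / total for p in joint])
        history.append(log_lik)
        # M-step: maximum-likelihood update from the soft counts.
        for c in range(k):
            weight = sum(r[c] for r in resp)
            priors[c] = weight / len(data)
            if weight == 0:  # empty cluster: leave its thetas unchanged
                continue
            for j in range(d):
                on = sum(r[c] * x[j] for r, x in zip(resp, data))
                thetas[c][j] = on / weight
    return priors, thetas, history

# Hypothetical toy data with two loose groups of rows.
data = [(1, 1, 0, 0), (1, 1, 1, 0), (1, 1, 0, 0), (1, 0, 0, 0),
        (0, 0, 1, 1), (0, 0, 1, 1), (0, 1, 1, 1), (0, 0, 1, 0)]
priors, thetas, history = em_naive_bayes(data)
print(history[0], history[-1])  # log-likelihood before and after fitting
```

EM only finds a local maximum of the likelihood, so different seeds may give different clusterings; running several random restarts and keeping the best log-likelihood is a common precaution.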

Improving Naïve Bayes Text Classifier with Modified EM Algorithm

Implementation of EM Algorithm in Opinion Mining Movies Review Case Studies
