Prediction of Solid Garbage Waste Generation in Smart Cities using Naive Bayes Algorithm

Smart cities are becoming overcrowded, which makes daily life harder for their residents. Overcrowding leads to large volumes of waste, which contributes to air pollution and in turn causes various diseases. Although various measures are taken to recycle waste, the rate at which it is produced keeps rising. This paper deals with the prediction of waste generation using the Naïve Bayes machine learning algorithm (classifier), based on statistics from previous waste datasets. The datasets used for prediction are obtained from reliable sources. The algorithm is implemented in PySpark using Anaconda Jupyter. The performance of the classifier on the datasets is analyzed with a confusion matrix, and accuracy is used as the metric to rate the classifier's efficiency. The accuracy obtained indicates that the algorithm can be used effectively for real-time prediction and that, under the independence assumption, it gives more accurate results for very large input datasets.
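As a rough illustration of the independence assumption mentioned in the abstract, the following is a minimal sketch of a categorical Naïve Bayes classifier written in plain Python. It is not the paper's PySpark implementation; the function names, the Laplace smoothing parameter `alpha`, and the toy rows (population density and season mapped to a waste level) are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels, alpha=1.0):
    """Count class and per-feature value frequencies (alpha = Laplace smoothing)."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
    values = defaultdict(set)              # feature index -> distinct values seen
    for row, label in zip(rows, labels):
        for i, v in enumerate(row):
            feature_counts[(i, label)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values, alpha


def predict_nb(model, row):
    """Return the label with the highest log posterior for one row."""
    class_counts, feature_counts, values, alpha = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in class_counts.items():
        score = math.log(n / total)  # log prior
        for i, v in enumerate(row):
            # Independence assumption: each feature contributes its own factor.
            num = feature_counts[(i, label)][v] + alpha
            den = n + alpha * len(values[i])
            score += math.log(num / den)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data, not taken from the paper's datasets.
rows = [("dense", "summer"), ("dense", "winter"), ("sparse", "summer"),
        ("sparse", "winter"), ("dense", "summer"), ("sparse", "winter")]
labels = ["high", "high", "low", "low", "high", "low"]
model = train_nb(rows, labels)
print(predict_nb(model, ("dense", "winter")))
```

Working in log space avoids underflow when many feature probabilities are multiplied, which matters more as the number of features grows.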


A Machine Learning Framework for Improving Classification Performance on Credit Approval

IJID (International Journal on Informatics for Development) ◽

10.14421/ijid.2021.2384 ◽

2021 ◽

Vol 10 (1) ◽

pp. 47-52

Author(s):

Pulung Hendro Prastyo ◽

Septian Eko Prasetyo ◽

Shindy Arti

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Information Gain ◽

Learning Algorithm ◽

Confusion Matrix ◽

Credit Scoring ◽

Research Work ◽

Classification Performance ◽

Naïve Bayes ◽

Bayes Algorithm

Credit scoring is a model commonly used in the decision-making process to refuse or accept loan requests. The credit score model depends on the type of loan or credit and is complemented by various credit factors. At present, there is no accurate model for determining which creditors are eligible for loans. Therefore, an accurate and automatic model is needed to make it easier for banks to determine appropriate creditors. To address the problem, we propose a new approach that combines a machine learning algorithm (Naïve Bayes), Information Gain (IG), and discretization to classify creditors. This research work employed an experimental method using the Weka application. The Australian Credit Approval dataset, which contains 690 instances, was used. In this study, Information Gain is employed for feature selection, to select relevant features so that the Naïve Bayes algorithm can work optimally. The confusion matrix is used as an evaluator and 10-fold cross-validation as a validator. Based on the experimental results, the proposed method improved classification performance, reaching its highest average accuracy, precision, recall, and F-measure at 86.29%, 86.33%, 86.29%, and 86.30%, respectively. The proposed method also obtains an ROC area of 91.52%. This indicates that the proposed method can be rated as an excellent classifier.
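Information Gain, used above for feature selection, can be sketched as the reduction in label entropy after splitting on a feature. The snippet below is a generic illustration in plain Python, not the Weka procedure used in the study; the function names and the four-row toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for v, y in zip(feature_values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data: one feature separates the classes, the other does not.
labels = ["approve", "approve", "reject", "reject"]
informative = ["x", "x", "y", "y"]
uninformative = ["x", "y", "x", "y"]
print(information_gain(informative, labels), information_gain(uninformative, labels))
```

Features would then be ranked by this score and the lowest-ranked ones dropped before training the classifier; the cut-off is a design choice that the abstract does not specify.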


Sentiment Analysis of the Odd-Even Policy on the Bekasi Toll Road Using the Naive Bayes Algorithm with Information Gain Optimization

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v15i2.705 ◽

2019 ◽

Vol 15 (2) ◽

pp. 247-254

Author(s):

Heru Sukma Utama ◽

Didi Rosiyadi ◽

Dedi Aridarma ◽

Bobby Suryo Prakoso

Keyword(s):

Social Media ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Confusion Matrix ◽

Naïve Bayes ◽

Support Vector ◽

Toll Road ◽

Textual Data ◽

Bayes Algorithm

Sentiment analysis of the odd-even system on the Bekasi toll road using the Naïve Bayes algorithm is a process of understanding, extracting, and processing textual data from social media automatically. The purpose of this study was to determine the accuracy, recall, and precision of opinion mining with the Naïve Bayes algorithm, in order to provide information on community sentiment on social media towards the effectiveness of the odd-even system on the Bekasi toll road. The research method was text mining of comments on posts about the odd-even policy on the Bekasi toll road on Twitter, Instagram, YouTube, and Facebook. The steps were preprocessing, transformation, data mining, and evaluation, followed by Information Gain feature selection, select by weight, and application of the NB model. The confusion matrix results obtained with the NB model were an accuracy of 79.55%, a precision of 80.51%, and a sensitivity (recall) of 80.91%. The study concludes that the use of Support Vector Machine Algorithms can analyze odd-even sentiment on the Bekasi toll road.
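Accuracy, precision, and recall of the kind reported above are all derived from the cells of a confusion matrix. The sketch below shows the standard binary formulas in plain Python; the function name and the counts passed to it are hypothetical and are not the study's actual confusion matrix.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts chosen only to illustrate the formulas.
metrics = binary_metrics(tp=40, fp=10, fn=10, tn=40)
print(metrics)
```

For more than two classes, precision and recall are usually computed per class and then averaged; which averaging scheme the study used is not stated in the abstract.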


Attribute Selection in Naive Bayes Algorithm Using Genetic Algorithms and Bagging for Prediction of Liver Disease

JOURNAL OF INFORMATICS AND TELECOMMUNICATION ENGINEERING ◽

10.31289/jite.v4i1.3793 ◽

2020 ◽

Vol 4 (1) ◽

pp. 76-85

Author(s):

Dwi Yuni Utami ◽

Elah Nurlelah ◽

Noer Hikmah

Keyword(s):

Genetic Algorithms ◽

Liver Disease ◽

Naive Bayes ◽

Confusion Matrix ◽

Naïve Bayes ◽

Attribute Selection ◽

World Health ◽

The Difference ◽

Bayes Algorithm ◽

Health Organization

Liver disease is an inflammatory disease of the liver that can leave the liver unable to function normally and can even cause death. According to WHO (World Health Organization) data, almost 1.2 million people per year, especially in Southeast Asia and Africa, die from liver disease. The usual problem is the difficulty of recognizing liver disease early on, even when the disease has spread. This study aims to compare and evaluate the plain Naive Bayes algorithm against a Naive Bayes algorithm based on a Genetic Algorithm (GA) and Bagging, to find out which has the higher accuracy in predicting liver disease, using a dataset taken from the UCI (University of California, Irvine) Machine Learning Repository. Evaluation with both the confusion matrix and the ROC curve showed that Naive Bayes optimized with Genetic Algorithms and Bagging has a higher accuracy than Naive Bayes alone. The accuracy of the Naive Bayes model is 66.66%, and the accuracy of the Naive Bayes model with attribute selection using Genetic Algorithms and Bagging is 72.02%, a difference of 5.36 percentage points. Keywords: Liver Disease, Naïve Bayes, Genetic Algorithms, Bagging.


The Use of Naive Bayes for Broiler Digestive Tract Disease Detection

Journal on Information Technology and Computer Engineering ◽

10.25077/jitce.3.01.1-7.2019 ◽

2019 ◽

Vol 3 (01) ◽

pp. 1-7

Author(s):

Hindriyanto Dwi Purnomo

Keyword(s):

Evaluation Method ◽

Naive Bayes ◽

Confusion Matrix ◽

Gastrointestinal Diseases ◽

Naïve Bayes ◽

Common Disease ◽

High Productivity ◽

Bayes Algorithm ◽

Tract Disease

The broiler is a type of chicken with high productivity. To obtain good-quality chickens, the breeding factors must be managed well so that the chickens are not easily infected by diseases. Gastrointestinal diseases are common among chickens, and the mortality they cause is considered high. This study addresses the problem by developing a detection system based on the Naive Bayes algorithm. Sixty chicken data samples were used, and the results show that Naive Bayes can be used to detect gastrointestinal diseases in chickens with an accuracy of 93.3%. The figure was confirmed with the confusion matrix evaluation method, measured against expert judgments.


Sentiment Analysis Using Naive Bayes Algorithm with Feature Selection Particle Swarm Optimization (PSO) and Genetic Algorithm

International Journal of Advances in Data and Information Systems ◽

10.25008/ijadis.v2i2.1224 ◽

2021 ◽

Vol 2 (2) ◽

Author(s):

Abi Rafdi ◽

Herman Mawengkang Herman ◽

Syahril Efendi

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Particle Swarm Optimization ◽

Sentiment Analysis ◽

Naive Bayes ◽

Confusion Matrix ◽

Particle Swarm ◽

Naïve Bayes ◽

Swarm Optimization ◽

Bayes Algorithm

This study analyzes sentiment to see the opinions, points of view, judgments, attitudes, and emotions towards entities and their aspects that are expressed through texts. Social media such as Twitter is among the most widely used means of communication and a frequent research topic. The main problem in sentiment analysis is selecting and using the best features for maximum results. One of the most widely known classification methods is Naive Bayes. However, Naive Bayes is very sensitive to the choice of features. For that reason, this study compares feature selection using Particle Swarm Optimization and a Genetic Algorithm to improve the accuracy of the Naive Bayes algorithm. The analysis compares results before and after feature selection. Validation uses a cross-validation technique, while the confusion matrix is used to measure accuracy. The results showed the highest increase in Naïve Bayes accuracy with Particle Swarm Optimization feature selection, from 60.26% to 77.50%, while the Genetic Algorithm raised it from 60.26% to 70.71%. Therefore, the best feature selection method is Particle Swarm Optimization, with an increase in accuracy of 17.24 percentage points.
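The cross-validation step can be sketched as splitting the sample indices into k folds and using each fold once as the test set. The abstract does not state the number of folds, so `k` and `n` below are assumptions, and `k_fold_indices` is a hypothetical helper rather than the tool used in the study.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over n samples."""
    # Spread the remainder over the first n % k folds so sizes differ by at most one.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size


# Hypothetical sizes: 10 samples split into 3 folds.
folds = list(k_fold_indices(10, 3))
for train, test in folds:
    print(len(train), len(test))
```

In practice the data would be shuffled before splitting, and any feature selection would be fitted on the training indices only, so that the test fold does not leak into the selection.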


Identification of Violent Response with Stretch Sensor Data from a Smart-Jacket using Naïve Bayes Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.a9244.119119 ◽

2019 ◽

Vol 9 (1) ◽

pp. 5265-5270

Keyword(s):

Supervised Learning ◽

Naive Bayes ◽

Learning Algorithm ◽

Pressure Sensors ◽

Naïve Bayes ◽

Sensor Data ◽

Body Movements ◽

Bayes Algorithm ◽

Do So

In this paper, a smart jacket with stretch sensors and pressure sensors was built to generate body-movement data and to record different kinds of signals and their distribution across the jacket. Every degree of motion generates voltage changes in the stretch sensors, as is their property. This data is collected on a Flora chipset, which is Arduino-based. The collected data is processed, pruned, and filtered for outliers. This paper is concerned with a supervised learning algorithm called Naive Bayes, which is applied to independent datasets, meaning that the sets of observations have no direct relation to each other. The sensors are placed on the shoulders and elbows, and the responses from each are independent of the others. Using Naive Bayes, the data has been classified into violent responses and normal actions.


Comparison Analysis of K-Nearest Neighbor and Naïve Bayes in Determining Talent of Adolescence

International Journal of Artificial Intelligence Research ◽

10.29099/ijair.v4i1.118 ◽

2020 ◽

Vol 4 (1) ◽

Author(s):

Yessi Jusman ◽

Widdya Rahmalina ◽

Juni Zarman

Keyword(s):

Nearest Neighbor ◽

Naive Bayes ◽

Confusion Matrix ◽

Naïve Bayes ◽

Training Data ◽

K Nearest Neighbor ◽

Combined Training ◽

Testing Data ◽

Bayes Algorithm ◽

Children's Interests

Adolescents are always searching for an identity that shapes their personality and character. This paper aims to use artificial intelligence analysis to determine the talents of adolescents. The study uses a sample of children aged 10-18 years, with testing data consisting of 100 respondents. The algorithms used for the analysis are K-Nearest Neighbor and Naive Bayes. The analysis compares the classification accuracy of the two algorithms. To find which algorithm is more accurate in determining children's interests and talents, accuracy is read from the confusion matrix produced by the RapidMiner software for the training data, the testing data, and the combined training and testing data. This study concludes that the K-Nearest Neighbor algorithm is better than Naive Bayes in terms of classification accuracy.


Classification of Public Opinion on the JNE Expedition Service with Naïve Bayes

JURNAL SISTEM INFORMASI BISNIS ◽

10.21456/vol8iss1pp92-98 ◽

2018 ◽

Vol 8 (1) ◽

pp. 92

Author(s):

Fithri Selva Jumeilah

Keyword(s):

Naive Bayes ◽

Probability Model ◽

Confusion Matrix ◽

Naïve Bayes ◽

Service Users ◽

Average Percentage ◽

Online Sales ◽

Bayes Algorithm ◽

Using Data

The large number of online sales transactions has increased the number of delivery service users. One of the companies engaged in delivery services in Indonesia is Tiki Nugraha Ekakurir, better known as JNE. Currently, JNE service users reach 14,000,000 per month. JNE uses many media to communicate with its customers, one of which is Twitter. JNECare has 108,000 followers and 375,000 tweets. These many comments can be used to see what people think of JNE, provided each comment is sorted into a negative, positive, or neutral category. To simplify the grouping of comments, the data is classified using the Naïve Bayes method available in RStudio. The data used consists of 1,725 tweets collected from the internet. The data is divided into 70% training data (1,208 items) and 30% testing data (517 items). Before classification, the data goes through preprocessing: changing all letters to lowercase and removing characters other than letters and spaces (case folding), tokenizing words, and removing common words (stopword removal). After cleaning, the data is labeled manually one by one, and the labeled data is used in the training process to obtain the probability model for each category. The probabilities are obtained using the Naïve Bayes algorithm. The model obtained from training is then applied to the testing data. The categories obtained from the test are evaluated with a confusion matrix, from which accuracy, precision, and recall are calculated. The classification results show that Naïve Bayes was able to classify the JNE comments well, as evidenced by an average accuracy of 85%, precision of 78%, and recall of 67%.
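The preprocessing steps and the 70/30 split described above can be sketched in a few lines. The study used RStudio; the Python below is only an illustration, with a hypothetical English stopword list, hypothetical function names, and a made-up example sentence. Rounding the training share up is an assumption chosen because it reproduces the reported 1,208 and 517 items.

```python
import re

# Hypothetical, tiny stopword list; a real one would be language-specific and far longer.
STOPWORDS = {"the", "is", "a", "to", "and"}


def preprocess(text):
    """Case folding, tokenizing, and stopword removal for one comment."""
    text = text.lower()                    # case folding: lowercase everything
    text = re.sub(r"[^a-z\s]", " ", text)  # keep only letters and whitespace
    tokens = text.split()                  # tokenizing on whitespace
    return [t for t in tokens if t not in STOPWORDS]


def split_70_30(items):
    """Split a list into 70% training and 30% testing, rounding the training part up."""
    cut = (len(items) * 7 + 9) // 10       # integer ceiling of 70%
    return items[:cut], items[cut:]


tokens = preprocess("JNE is GREAT, delivery to Bekasi!!")
train, test = split_70_30(list(range(1725)))
print(tokens, len(train), len(test))
```

The split here is positional; shuffling the labeled tweets first would avoid any ordering bias, though the abstract does not say whether that was done.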


A Machine Learning-Based Intelligent System for Predicting Diabetes

International Journal of Big Data and Analytics in Healthcare ◽

10.4018/ijbdah.2019070101 ◽

2019 ◽

Vol 4 (2) ◽

pp. 1-20

Author(s):

Nabila Shahnaz Khan ◽

Mehedi Hasan Muaz ◽

Anusha Kabir ◽

Muhammad Nazrul Islam

Keyword(s):

Machine Learning ◽

Naive Bayes ◽

Intelligent System ◽

Learning Algorithm ◽

Naïve Bayes ◽

Personal Health ◽

Diabetic Patients ◽

Technological Growth ◽

Bayes Algorithm ◽

The Comparative Study

In this era of technological growth, diagnosing diseases and finding cures, managing personal health parameters, and predicting susceptibility to some diseases have become accessible and easy. Although millions of people all over the world are falling victim to diabetes, in most cases they are not even aware of their condition because of the silent nature of the disease. Therefore, the objective of this research is to propose an intelligent system based on a machine learning algorithm to improve the accuracy of predicting diabetes. To attain this objective, first, an algorithm based on Naïve Bayes with prior clustering was proposed. Second, the performance of the proposed algorithm was evaluated using 532 data records related to diabetic patients. Finally, the performance of the existing Naïve Bayes algorithm was compared with that of the proposed algorithm. The results of the comparative study showed an apparent improvement in accuracy for the proposed algorithm.


Implementation of Particle Swarm Optimization (PSO) in Sentiment Analysis of Halodoc Application Reviews Using the Naïve Bayes Algorithm

Jurnal Teknologi Informasi ◽

10.52643/jti.v7i1.1330 ◽

2021 ◽

Vol 7 (1) ◽

pp. 17-23

Author(s):

Nuzuliarini Nuris ◽

Eka Rini Yulia ◽

Kusmayanti Solecha

Keyword(s):

Particle Swarm Optimization ◽

Naive Bayes ◽

Confusion Matrix ◽

Particle Swarm ◽

Roc Curves ◽

Naïve Bayes ◽

Swarm Optimization ◽

Classification Evaluation ◽

Bayes Algorithm ◽

Increase In Accuracy

Health is very important for humans. When we experience symptoms or feel pain, it is appropriate to have a health check at a hospital or clinic, but when leaving the house is not possible, an online health consultation application can be helpful. Before using such applications, it is useful to know the reviews from consumers, based on positive and negative opinions. This study applies the Naive Bayes algorithm to perform text classification and uses particle swarm optimization for feature selection to support an increase in accuracy. Classification evaluation and validation are performed using a confusion matrix and ROC curves. The results showed that accuracy rose from 88.50% (AUC 0.535) to 90.50% (AUC 0.525). It can be concluded that feature selection with particle swarm optimization succeeded in increasing the accuracy. Keywords: feature selection, naïve bayes, particle swarm optimization.
