Automated Space Classification for Network Robots in Ubiquitous Environments

2015 ◽  
Vol 2015 ◽  
pp. 1-11
Author(s):  
Jiwon Choi ◽  
Seoungjae Cho ◽  
Phuong Chu ◽  
Hoang Vu ◽  
Kyhyun Um ◽  
...  

Network robots provide services to users in smart spaces while being connected to ubiquitous instruments through wireless networks in ubiquitous environments. For more effective behavior planning of network robots, it is necessary to reduce the state space by recognizing a smart space as a set of spaces. This paper proposes a space classification algorithm based on automatic graph generation and naive Bayes classification. The proposed algorithm first filters spaces in order of priority using automatically generated graphs, thereby minimizing the number of tasks that need to be predefined by a human. The filtered spaces then induce the final space classification result using naive Bayes space classification. The results of experiments conducted using virtual agents in virtual environments indicate that the performance of the proposed algorithm is better than that of conventional naive Bayes space classification.
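The naive Bayes stage of such a pipeline can be sketched as a Gaussian naive Bayes classifier over numeric space features. The features and labels below (floor area, device count, room types) are illustrative assumptions, not the paper's actual attributes:

```python
import math
from collections import defaultdict

# Hypothetical training data: each smart-space sample is (features, label).
# Features (floor area in m^2, number of networked devices) are invented
# stand-ins for the sensor attributes a real deployment would provide.
train = [
    ((12.0, 2.0), "bedroom"),
    ((14.0, 3.0), "bedroom"),
    ((30.0, 8.0), "living_room"),
    ((28.0, 7.0), "living_room"),
    ((8.0, 5.0), "kitchen"),
    ((9.0, 6.0), "kitchen"),
]

def fit_gaussian_nb(samples):
    """Estimate per-class priors and per-feature mean/variance."""
    by_class = defaultdict(list)
    for x, y in samples:
        by_class[y].append(x)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        variances = [max(sum((v - m) ** 2 for v in col) / n, 1e-6)
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(samples), means, variances)
    return model

def predict(model, x):
    """Pick the class maximizing log prior + sum of log Gaussian likelihoods."""
    best, best_score = None, float("-inf")
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best, best_score = label, score
    return best

model = fit_gaussian_nb(train)
print(predict(model, (13.0, 2.5)))  # a small, sparsely instrumented space
```

The graph-based filtering step described in the abstract would shrink the candidate label set before this classifier runs, reducing the state space the robot has to reason over.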

2014 ◽  
Vol 2014 ◽  
pp. 1-16 ◽  
Author(s):  
Qingchao Liu ◽  
Jian Lu ◽  
Shuyan Chen ◽  
Kangjia Zhao

This study presents the applicability of a Naïve Bayes (NB) classifier ensemble for traffic incident detection. The standard NB classifier has been applied to traffic incident detection and has achieved good results. However, the detection result of a practically implemented NB depends on the choice of the optimal threshold, which is determined mathematically using Bayesian concepts in the incident-detection process. To avoid the burden of choosing the optimal threshold and tuning the parameters, and furthermore to improve the limited classification performance of NB and enhance detection performance, we propose an NB classifier ensemble for incident detection. In addition, we propose combining Naïve Bayes and decision trees (NBTree) to detect incidents. In this paper, we discuss extensive experiments performed to evaluate the performance of three algorithms: standard NB, the NB ensemble, and NBTree. The experimental results indicate that the performance of the five combination rules of the NB classifier ensemble is significantly better than that of standard NB and slightly better than that of NBTree on some indicators. More importantly, the performance of the NB classifier ensemble is very stable.
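The combination step of such an ensemble can be sketched with the classical combination rules (product, sum, max, min, majority vote) applied to each base classifier's posterior estimate. The posterior values below are made-up stand-ins for trained base NB models, and the five rules shown are the standard ones, assumed rather than confirmed to be the paper's exact set:

```python
import math

# P(incident | x) from five hypothetical base naive Bayes classifiers.
posteriors = [0.62, 0.55, 0.71, 0.48, 0.66]

def combine(ps, rule):
    """Combine per-classifier posteriors into one incident decision."""
    qs = [1 - p for p in ps]  # P(no incident) per classifier
    if rule == "product":
        inc, no = math.prod(ps), math.prod(qs)
    elif rule == "sum":
        inc, no = sum(ps), sum(qs)
    elif rule == "max":
        inc, no = max(ps), max(qs)
    elif rule == "min":
        inc, no = min(ps), min(qs)
    elif rule == "vote":
        inc = sum(p > 0.5 for p in ps)  # count classifiers flagging incident
        no = len(ps) - inc
    return "incident" if inc > no else "no incident"

for rule in ("product", "sum", "max", "min", "vote"):
    print(rule, "->", combine(posteriors, rule))
```

Because each rule aggregates several noisy posteriors, the combined decision varies less from run to run than any single base classifier, which is one way to read the abstract's stability claim.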


2017 ◽  
Vol 9 (4) ◽  
pp. 416 ◽  
Author(s):  
Nelly Indriani Widiastuti ◽  
Ednawati Rainarli ◽  
Kania Evita Dewi

Classification is the process of grouping objects that share features or characteristics into classes. Automatic document classification uses the frequencies of words appearing in the training data as features. A large number of documents increases the number of words that appear as features, so summaries are used to reduce the number of words involved in classification. The classification uses the multiclass Support Vector Machine (SVM) method, which has a good reputation in classification tasks. This research tests the effect of using summaries for feature selection in document classification. The summaries reduce each text to 50% of its length. The results show that summarization did not affect the classification accuracy of SVM, but it did improve the accuracy of the Simple Logistic classifier. The classification tests also show that Naïve Bayes Multinomial (NBM) achieves better accuracy than SVM.
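The Naïve Bayes Multinomial classifier mentioned at the end can be sketched from scratch on word-count features with Laplace smoothing; the toy corpus and categories below are invented for illustration:

```python
import math
from collections import Counter, defaultdict

# Toy corpus standing in for the (summarized) documents.
docs = [
    ("ball goal match team", "sport"),
    ("team win goal score", "sport"),
    ("market stock price trade", "finance"),
    ("stock trade profit market", "finance"),
]

class MultinomialNB:
    def fit(self, samples):
        self.word_counts = defaultdict(Counter)  # per-class word frequencies
        self.class_docs = Counter()              # documents per class
        vocab = set()
        for text, label in samples:
            words = text.split()
            self.class_docs[label] += 1
            self.word_counts[label].update(words)
            vocab.update(words)
        self.vocab_size = len(vocab)
        self.total_docs = len(samples)
        return self

    def predict(self, text):
        best, best_score = None, float("-inf")
        for label in self.class_docs:
            # log prior + Laplace-smoothed log likelihood of each token
            score = math.log(self.class_docs[label] / self.total_docs)
            total = sum(self.word_counts[label].values())
            for w in text.split():
                score += math.log(
                    (self.word_counts[label][w] + 1) / (total + self.vocab_size)
                )
            if score > best_score:
                best, best_score = label, score
        return best

nb = MultinomialNB().fit(docs)
print(nb.predict("goal team score"))
```

Summarizing a document before this step simply shrinks the token list fed to `predict`, which is why summarization acts as feature selection here.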


TEM Journal ◽  
2021 ◽  
pp. 1738-1744
Author(s):  
Joseph Teguh Santoso ◽  
Ni Luh Wiwik Sri Rahayu Ginantra ◽  
Muhammad Arifin ◽  
R Riinawati ◽  
Dadang Sudrajat ◽  
...  

The purpose of this research is to choose the best method by comparing two data mining classification methods, C4.5 and Naïve Bayes, on educational data mining; the data used is student graduation data consisting of 79 records. Both methods are validated with 10-fold cross-validation, and a t-test is performed to produce a table ranking the methods. Different results were obtained for each method: the outcome depends strongly on the dataset, and the area under the curve (AUC) of the Naïve Bayes method is better than that of the C4.5 method across the various datasets. The comparison using 10-fold cross-validation and the t-test shows that Naïve Bayes is better than C4.5, with an average accuracy of 73.41% and an area under the curve of 0.664.
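The cross-validated comparison with a paired t-test can be sketched as follows; the per-fold accuracies are hypothetical placeholders, not the paper's measurements:

```python
import math

# Hypothetical per-fold accuracies from 10-fold cross-validation; real values
# would come from running Naïve Bayes and C4.5 on the 79-record dataset.
nb_acc  = [0.75, 0.71, 0.78, 0.70, 0.74, 0.76, 0.72, 0.73, 0.75, 0.70]
c45_acc = [0.70, 0.68, 0.72, 0.66, 0.69, 0.71, 0.68, 0.67, 0.70, 0.66]

def paired_t_statistic(a, b):
    """t statistic over the per-fold accuracy differences."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

t = paired_t_statistic(nb_acc, c45_acc)
# Two-sided critical value for df = 9 at alpha = 0.05 is about 2.262.
print(f"t = {t:.2f}, significant = {abs(t) > 2.262}")
```

Pairing the folds matters: both classifiers see the same ten train/test splits, so the t-test is computed on the fold-by-fold differences rather than on two independent samples.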


2021 ◽  
Vol 11 (10) ◽  
pp. 2529-2537
Author(s):  
C. Murale ◽  
M. Sundarambal ◽  
R. Nedunchezhian

Coronary heart disease (CHD) is one of the dominant sources of death and morbidity worldwide. Identifying cardiac disease in clinical review is considered one of the main problems, and as the amount of data grows, interpretation and retrieval become even more complex. Ensemble learning prediction models are an important development in this area of study. The prime aim of this paper is to forecast CHD accurately. The paper offers a modern paradigm for the prediction of cardiovascular diseases using processes such as pre-processing, feature detection, feature selection, and classification. Pre-processing is initially performed using the ordinal encoding technique, and the statistical and higher-order features are extracted using the Fisher algorithm. Next, record and attribute reduction is performed, in which principal component analysis plays an extensive part in addressing the "curse of dimensionality." Lastly, prediction is carried out by different ensemble models (SVM, Gaussian Naïve Bayes, random forest, k-nearest neighbor, logistic regression, decision tree, and multilayer perceptron) that take the dimensionality-reduced features as input. Finally, the reliability of the proposed work is compared on these performance metrics, and its superiority is confirmed. From the analysis, Naïve Bayes, with an accuracy of 98.4%, is better than the other ensemble algorithms.
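The ordinal encoding pre-processing step can be sketched as mapping each category to its index in a fixed, meaningful order; the attribute and category order below are assumptions for illustration, not the paper's actual clinical features:

```python
# Hypothetical ordering of a chest-pain-type attribute, from least to most
# typical; a real pipeline would define one such order per categorical column.
chest_pain_order = ["asymptomatic", "non-anginal", "atypical", "typical"]

def ordinal_encode(values, order):
    """Map each category to its index in the given order."""
    index = {cat: i for i, cat in enumerate(order)}
    return [index[v] for v in values]

raw = ["typical", "asymptomatic", "atypical", "typical"]
encoded = ordinal_encode(raw, chest_pain_order)
print(encoded)  # [3, 0, 2, 3]
```

Unlike one-hot encoding, this keeps one numeric column per attribute, which suits the downstream PCA step since PCA operates on a numeric matrix.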


Author(s):  
Amir Ahmad ◽  
Hamza Abujabal ◽  
C. Aswani Kumar

A classifier ensemble is a combination of diverse and accurate classifiers; generally, it performs better than any single classifier in the ensemble. Naive Bayes classifiers are simple but popular classifiers for many applications. Because it is difficult to create diverse naive Bayes classifiers, naive Bayes ensembles have not been very successful. In this paper, we propose Random Subclasses (RS) ensembles for naive Bayes classifiers. In the proposed method, new subclasses for each class are created using a 1-Nearest Neighbor (1-NN) framework that uses randomly selected points from the training data; a classifier considers each subclass as a class of its own. As the method to create subclasses is random, diverse datasets are generated, and each classifier in the ensemble learns on one dataset from this pool. Diverse training datasets ensure diverse classifiers in the ensemble, and the new subclasses create easy-to-learn decision boundaries that in turn create accurate naive Bayes classifiers. We developed two variants of RS: in the first, RS(2), two subclasses per class were created, whereas in the second, RS(4), four subclasses per class were created. We studied the performance of these methods against other popular ensemble methods using naive Bayes as the base classifier, and RS(4) outperformed the other popular ensemble methods. A detailed study was carried out to understand the behavior of RS ensembles.
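The subclass-creation step can be sketched as follows for RS(2): pick two random anchor points per class and assign each point of that class to the subclass of its nearest anchor (a 1-NN split). The 2-D points below are synthetic:

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

def split_into_subclasses(points, label, k=2):
    """Relabel a class's points as k subclasses via 1-NN to random anchors."""
    anchors = random.sample(points, k)  # k randomly selected training points
    relabelled = []
    for p in points:
        dists = [sum((a - b) ** 2 for a, b in zip(p, anchor))
                 for anchor in anchors]
        sub = dists.index(min(dists))             # index of nearest anchor
        relabelled.append((p, f"{label}_{sub}"))  # e.g. "pos_0", "pos_1"
    return relabelled

positives = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9)]
new_data = split_into_subclasses(positives, "pos", k=2)
print(new_data)
```

Each ensemble member would repeat this split with its own random anchors, then train an ordinary naive Bayes model on the relabelled data; at prediction time, subclass posteriors are summed back into their parent class.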


2018 ◽  
Vol 5 (4) ◽  
pp. 455 ◽  
Author(s):  
Yogiek Indra Kurniawan

<p>In this paper, the <em>Naive Bayes</em> and <em>C4.5</em> algorithms were applied to four case studies: acceptance for the “Kartu Indonesia Sehat” program, credit card application decisions at a bank, determination of birth age, and eligibility of prospective credit members at a cooperative (koperasi), in order to find the best algorithm for each case. The algorithms were then compared in terms of <em>precision</em>, <em>recall</em>, and <em>accuracy</em> for each set of training and testing data. From the implementation, an application was built that applies the <em>Naive Bayes</em> and <em>C4.5</em> algorithms to the four cases. The application was tested with black-box and algorithm testing, with valid results, and implements both algorithms correctly. Based on the test results, the more training data used, the higher the precision, recall, and accuracy. In addition, the classification results of the <em>Naive Bayes</em> and <em>C4.5</em> algorithms do not yield an absolute winner in every case. For the Kartu Indonesia Sehat acceptance case, the two algorithms are equally effective. For the bank credit card application case, C4.5 is better than Naive Bayes. For the birth age determination case, Naive Bayes is better than C4.5. For the cooperative credit eligibility case, Naive Bayes gives better precision, but C4.5 gives better recall and accuracy. Therefore, to determine the best algorithm for a given case, one must consider the criteria, variables, and amount of data in that case.</p>


2021 ◽  
Vol 21 (1) ◽  
pp. 14-22
Author(s):  
Hary Sabita ◽  
Fitria Fitria ◽  
Riko Herwanto

This research was conducted using data provided by Kaggle containing features that describe job vacancies. The study used location-based data from the US, which covers 60% of all the data. Posted job vacancies are categorized as real or fake. The research followed five stages: defining the problem, collecting data, cleaning data (exploration and pre-processing), modeling, and evaluation. The evaluation and validation use Naïve Bayes as a baseline model and Stochastic Gradient Descent (SGD) as the end model. The Naïve Bayes model obtained an accuracy of 0.971 and an F1-score of 0.743, while Stochastic Gradient Descent obtained an accuracy of 0.977 and an F1-score of 0.81. These final results indicate that SGD performs slightly better than Naïve Bayes.
Keywords: NLP, Machine Learning, Naïve Bayes, SGD, Fake Jobs
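The reported accuracy and F1-score can be reproduced from a confusion matrix. The counts below are invented to illustrate the computation, and why accuracy stays high while F1 drops on imbalanced data like this, rather than being the paper's actual results:

```python
# Hypothetical confusion-matrix counts for fake-job detection, where fake
# postings (the positive class) are rare.
tp, fp, fn, tn = 40, 8, 12, 940

accuracy  = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)             # of flagged postings, how many are fake
recall    = tp / (tp + fn)             # of fake postings, how many are caught
f1 = 2 * precision * recall / (precision + recall)
print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

On such imbalanced data the many true negatives keep accuracy near 0.98 even though F1, which ignores true negatives, sits at 0.80, matching the gap between the two metrics in the abstract.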


Sentiment analysis of Twitter data using machine learning techniques has been performed, considering bigram and unigram features, with SVM and naïve Bayes classifiers hybridized with PSO and ACO for effective feature weighting. Fig. 4.9 compares all the experiments in one graph, which shows that SVM_ACO and SVM_PSO perform better than SVM, and NB_ACO and NB_PSO perform better than NB. Comparing the hybrid approaches, SVM_PSO shows 81.80% accuracy, 85% precision, and 80% recall, while for naïve Bayes, NB_PSO achieves 76.93% accuracy, 76.24% precision, and 82.55% recall. The experiments conclude that the hybrid approach improves recall for naïve Bayes, and improves precision and accuracy for SVM.


2020 ◽  
Vol 4 (1) ◽  
pp. 28-36
Author(s):  
Azminuddin I. S. Azis ◽  
Budy Santoso ◽  
Serwin

The Naïve Bayes (NB) algorithm is still among the top ten data mining algorithms because of its simplicity, efficiency, and performance. To handle classification on numerical data, the Gaussian distribution and kernel approaches can be applied to NB (GNB and KNB). However, in the NB classification process attributes are assumed to be independent, even though this assumption does not hold in many cases. The Absolute Correlation Coefficient can determine correlations between attributes and works on numerical attributes, so it can be applied for attribute weighting in GNB (ACW-NB). Furthermore, because the performance of NB does not increase on large datasets, ACW-NB can serve as the classifier in a local learning model, where other classification methods, such as K-Nearest Neighbor (K-NN), which is very well known in local learning, can be used to obtain the sub-dataset for ACW-NB training. To reduce noise/bias, missing-value replacement and data normalization are also applied. The proposed method is termed LL-KNN ACW-NB (Local Learning K-Nearest Neighbor in Absolute Correlation Weighted Naïve Bayes), with the objective of improving the performance of NB (GNB and KNB) in handling classification on numerical data. The results of this study indicate that LL-KNN ACW-NB improves the performance of NB, with an average accuracy of 91.48%, 1.92% better than GNB and 2.86% better than KNB.
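The absolute-correlation weighting idea can be sketched by weighting each numeric attribute by |Pearson r| against the 0/1-coded class label; the tiny dataset below is synthetic, and using the label as the correlation target is an assumption about how the weights are derived:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

rows = [  # (attr1, attr2, class): attr1 tracks the class, attr2 does not
    (1.0, 5.0, 0), (2.0, 5.1, 0), (3.0, 5.0, 1), (4.0, 5.1, 1),
]
labels = [r[2] for r in rows]
weights = [abs(pearson([r[i] for r in rows], labels)) for i in range(2)]
print([round(w, 3) for w in weights])
```

In a weighted GNB, each attribute's log-likelihood term would then be scaled by its weight, so attributes that carry no class signal (like `attr2` here, weight near 0) contribute little to the posterior.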


2019 ◽  
Vol 3 (2) ◽  
pp. 179 ◽  
Author(s):  
Agatha Deolika ◽  
Kusrini Kusrini ◽  
Emha Taufiq Luthfi

Abstract - In this era, text mining is needed to visualize or extract knowledge from large collections of document texts. Text mining is the process of obtaining high-quality information from text; such information is typically obtained by studying patterns and trends through statistical analysis. The text mining process includes term weighting, which aims to assign a value/weight to each term in a document. The weight given to a term depends on the method used. There are many term-weighting algorithms, such as TF, IDF, RF, TF-IDF, TF.RF, TF.CHI, and WIDF. This research analyzes and compares the TF-IDF, TF.RF, and WIDF algorithms. The naïve Bayes classification method is used for testing, and the comparison is analyzed using a confusion matrix, with a dataset of 130 documents: 100 training data and 30 test data. Based on the analysis of the classification results, TF.RF weighting with naïve Bayes classification is better than TF.IDF and WIDF weighting, with an accuracy of 98.67%, precision of 93.81%, and recall of 96.67%.
Keywords - Text Mining, TF-IDF, TF-RF, WIDF, Classification, Naïve Bayes.
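The difference between the IDF and RF factors can be sketched on a toy two-class corpus. The RF factor here follows the relevance-frequency formula rf = log2(2 + a / max(1, c)), with a and c the counts of positive- and negative-class documents containing the term; the exact variant the paper uses is an assumption:

```python
import math

# Toy two-class corpus: each document is its set of distinct terms.
pos_docs = [{"good", "great", "film"}, {"good", "plot"}]
neg_docs = [{"bad", "film"}, {"bad", "plot", "boring"}]
all_docs = pos_docs + neg_docs

def idf(term):
    """Inverse document frequency: log(N / df), class-blind."""
    df = sum(term in d for d in all_docs)
    return math.log(len(all_docs) / df)

def rf(term):
    """Relevance frequency: rewards terms concentrated in the positive class."""
    a = sum(term in d for d in pos_docs)   # positive docs containing the term
    c = sum(term in d for d in neg_docs)   # negative docs containing the term
    return math.log2(2 + a / max(1, c))

for term in ("good", "film"):
    print(term, round(idf(term), 3), round(rf(term), 3))
```

IDF gives "good" and "film" identical weights (same document frequency), while RF boosts "good", which concentrates in one class; this class awareness is the distinction TF.RF weighting is designed to capture.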

