The Effect of Best First and Spreadsubsample on Selection of a Feature Wrapper With Naïve Bayes Classifier for The Classification of the Ratio of Inpatients

Diabetes can lead to mortality and disability, so patients may need to be readmitted for further treatment. Previous research on feature selection with greedy stepwise forward search failed to predict the inpatient readmission class, yielding recall and precision of 0 with training splits of 60%, 75%, 80%, and 90%, and suggested handling the class imbalance in the data: 6293 readmitted records versus 64141 others. This research aimed to determine the effect of choosing the best model using best first instead of greedy stepwise forward, and of data sampling with spreadsubsample to resolve the class imbalance problem. The data were patient records from 130 American hospitals from 1999 to 2008, comprising 70434 records. The methods used were best first search and spreadsubsample. With the best first method, precision was 0.4 and 0.333 on the 75% and 90% training splits, while with spreadsubsample both precision and recall increased more markedly. Spreadsubsample therefore has a greater effect on precision and recall than the best first method.
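The class-balancing idea can be sketched as random undersampling with a cap on the ratio between the most common and the rarest class. This is a minimal Python illustration and not the Weka filter used in the study; the function name `spread_subsample`, the parameter `max_spread`, and the placeholder row ids are assumptions made for this example. Only the class counts (6293 and 64141) come from the abstract.

```python
import random
from collections import defaultdict

def spread_subsample(rows, labels, max_spread=1.0, seed=0):
    """Randomly undersample so that no class is larger than
    max_spread times the size of the rarest class."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)

    rarest = min(len(items) for items in by_class.values())
    cap = int(rarest * max_spread)  # largest allowed class size

    rng = random.Random(seed)
    sampled = []
    for label, items in by_class.items():
        kept = items if len(items) <= cap else rng.sample(items, cap)
        sampled.extend((row, label) for row in kept)
    rng.shuffle(sampled)
    return sampled

# Class counts taken from the abstract; the rows are placeholder ids,
# not real patient records.
labels = ["readmitted"] * 6293 + ["other"] * 64141
rows = list(range(len(labels)))

balanced = spread_subsample(rows, labels, max_spread=1.0)
counts = defaultdict(int)
for _, label in balanced:
    counts[label] += 1
```

With `max_spread=1.0` both classes end up the same size as the minority class; a larger value would keep more majority-class rows at the cost of a less balanced training set.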

Implementasi Algoritma Naïve Bayes Classifier (NBC) Untuk Analisis Sentimen Komentar Kebijakan Full Day School

MEANS (Media Informasi Analisa dan Sistem) ◽

10.54367/means.v6i1.1251 ◽

2021 ◽

pp. 61-66

Author(s):

Yarma Agustya Dewi Utami ◽

Volvo Sihombing ◽

Muhammad Halmi Dar

Keyword(s):

Feature Selection ◽

Training Model ◽

School Policy ◽

Day School ◽

Training Data ◽

Bayes Classifier ◽

Naïve Bayes ◽

A Value ◽

Full Day ◽

Selection Of

Sentiment analysis is an important and actively developing research topic. It is carried out to determine whether a person's opinion on a problem or object tends to be negative or positive. The main purpose of this research is to find out public sentiment towards the Full Day School policy from comments on the Facebook Page of the Ministry of Education and Culture of the Republic of Indonesia, and to determine the performance of the Naïve Bayes Classifier (NBC) algorithm. The results indicate that the public's negative sentiment towards the Full Day School policy is higher than positive or neutral sentiment. The highest accuracy, 80%, was obtained by the Naïve Bayes Classifier algorithm with trigram feature selection on the model trained with 300 data items. The simulation indicates that both the size of the training data and the feature selection used with the NBC algorithm affect the accuracy of the results. Meanwhile, the simulation results from 10 test data with 5 different NBC and Lexicon algorithms also show that the Full Day School policy proposed by the Indonesian Minister of Education and Culture draws more negative than positive or neutral sentiment from most Facebook users who express opinions through comments.
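To make the classifier concrete, the following is a minimal Python sketch of a multinomial Naïve Bayes model with add-one smoothing over trigram features. It is not the authors' implementation: the abstract does not say whether the trigrams are over words or characters, so word trigrams are assumed here, and the training sentences, labels, and names (`word_trigrams`, `NaiveBayes`) are invented for illustration.

```python
import math
from collections import Counter, defaultdict

def word_trigrams(text):
    """Split text into overlapping three-word features (an assumption)."""
    words = text.lower().split()
    return [" ".join(words[i:i + 3]) for i in range(len(words) - 2)]

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_docs = Counter(labels)            # documents per class
        self.feature_counts = defaultdict(Counter)   # feature counts per class
        self.vocab = set()
        for doc, label in zip(docs, labels):
            feats = word_trigrams(doc)
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, doc):
        # Features never seen in training are ignored.
        feats = [f for f in word_trigrams(doc) if f in self.vocab]
        total_docs = sum(self.class_docs.values())
        scores = {}
        for label, n_docs in self.class_docs.items():
            counts = self.feature_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)            # log prior
            for f in feats:
                score += math.log((counts[f] + 1) / denom)   # smoothed log likelihood
            scores[label] = score
        return max(scores, key=scores.get)

# Hypothetical training comments, not data from the study.
train_docs = [
    "the school day is too long",
    "children are too tired after school",
    "the school day is very useful",
    "children learn more at school",
]
train_labels = ["negative", "negative", "positive", "positive"]

model = NaiveBayes().fit(train_docs, train_labels)
prediction = model.predict("the school day is too long for children")
```

Working in log space avoids numerical underflow when many feature probabilities are multiplied, and smoothing keeps a single unseen feature-class pair from forcing a probability of zero.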

Real-Time Classification of Water Spray and Leaks for Robotic Firefighting

International Journal of Computer Vision and Image Processing ◽

10.4018/ijcvip.2015010101 ◽

2015 ◽

Vol 5 (1) ◽

pp. 1-26 ◽

Cited By ~ 5

Author(s):

Joshua G. McNeil ◽

Brian Y. Lattimer

Keyword(s):

Vision System ◽

Training Dataset ◽

Water Spray ◽

Bayes Classifier ◽

Test Dataset ◽

Hazardous Environments ◽

Water Classification ◽

Probabilistic Classification ◽

Suppression System

Robotic firefighting is an area of increased focus as a way of limiting the exposure of firefighters to hazardous environments. A suppression system must incorporate multiple functionalities to allow for closed-loop firefighting control. One area of development is classifying water spray as a way of correcting errors between suppressant placement and fire location. An IR vision system capable of identifying water is presented. Image segmentation is performed, followed by a process that classifies regions of interest as water or non-water objects. A probabilistic classification method, using a Naïve Bayes classifier, was applied to a varied dataset of differing water temperatures and sprays. Objects were segmented using frame differencing with image intensity and difference thresholds. Segments were manually labeled to create a training dataset. On a separate test dataset, the classifier's precision, recall, F-measure, and G-measure ranged from 86.1% to 97.4% for classifying water objects, with water classification alone reaching 94.2% to 97.4% accuracy.
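The segmentation step can be illustrated with a small Python sketch of frame differencing. The abstract does not state how the intensity and difference thresholds are combined, so this example assumes a pixel is kept only when it passes both. The function name, threshold values, and the tiny 3x3 frames are hypothetical.

```python
def segment_by_frame_difference(prev, curr, intensity_min, diff_min):
    """Return a binary mask that is 1 where a pixel in the current frame is
    bright enough AND has changed enough since the previous frame."""
    mask = []
    for prev_row, curr_row in zip(prev, curr):
        mask.append([
            1 if (c >= intensity_min and abs(c - p) >= diff_min) else 0
            for p, c in zip(prev_row, curr_row)
        ])
    return mask

# Hypothetical 3x3 grayscale frames (values 0-255), not data from the paper.
prev_frame = [
    [10, 10, 10],
    [10, 200, 10],
    [10, 10, 10],
]
curr_frame = [
    [10, 120, 10],
    [10, 205, 130],
    [10, 10, 10],
]

mask = segment_by_frame_difference(prev_frame, curr_frame,
                                   intensity_min=100, diff_min=50)
```

In this toy input the bright but static centre pixel is rejected by the difference threshold, while the two pixels that both brightened and changed are kept as candidate regions for the classifier.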

ON ONE APPROACH FOR FEATURE SELECTION BASED ON THE APPLICATION OF THE METHOD OF LOGICAL ANALYSIS OF DATA

СИСТЕМЫ УПРАВЛЕНИЯ И ИНФОРМАЦИОННЫЕ ТЕХНОЛОГИИ ◽

10.36622/vstu.2021.86.4.008 ◽

2021 ◽

pp. 37-41

Author(s):

Р.И. Кузьмич ◽

А.А. Ступина ◽

М.И. Цепкова ◽

С.Н. Ежеманская

Keyword(s):

Feature Selection ◽

Logical Analysis ◽

Classification Task ◽

Logical Analysis Of Data ◽

Selection Of

An approach is proposed for selecting important features when classifying observations. The approach builds logical rules (patterns) using the method of logical analysis of data and takes into account how frequently each feature is used in forming them for a specific classification task.
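The frequency-based ranking can be sketched in a few lines of Python. This only illustrates the counting step, assuming the patterns have already been built; the pattern contents, feature names, and the cut-off of three uses are hypothetical and not taken from the paper.

```python
from collections import Counter

def rank_features_by_pattern_usage(patterns):
    """Count in how many patterns each feature appears, most used first."""
    usage = Counter()
    for pattern in patterns:
        usage.update(pattern.keys())  # each feature counted once per pattern
    return usage.most_common()

# Hypothetical patterns: each maps a feature name to the condition it imposes.
patterns = [
    {"age": ">50", "bmi": ">30"},
    {"age": ">50", "glucose": ">120"},
    {"glucose": ">120"},
    {"age": "<=50", "bmi": ">30", "glucose": "<=120"},
]

ranking = rank_features_by_pattern_usage(patterns)
# Keep features used in at least 3 patterns (an arbitrary example cut-off).
top_features = sorted(name for name, count in ranking if count >= 3)
```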

Feature Selection of Combining ReliefF and Rough Set for Syndrome Classification of Chronic Gastritis in Traditional Chinese Medicine

Proceedings of the 2015 International Conference on Applied Science and Engineering Innovation ◽

10.2991/asei-15.2015.242 ◽

2015 ◽

Author(s):

Jianjun Yan ◽

Qiyue Chen ◽

Guoping Liu ◽

Xiong Lu ◽

Yiqin Wang ◽

...

Keyword(s):

Feature Selection ◽

Chinese Medicine ◽

Traditional Chinese Medicine ◽

Rough Set ◽

Chronic Gastritis ◽

Selection Of ◽

Syndrome Classification

Classification of surface water objects in visible spectrum images

Journal of «Almaz – Antey» Air and Defence Corporation ◽

10.38013/2542-0542-2020-1-87-95 ◽

2020 ◽

pp. 87-95

Author(s):

A. A. Artemyev ◽

E. A. Kazachkov ◽

S. N. Matyugin ◽

V. V. Sharonov

Keyword(s):

Neural Network ◽

Neural Networks ◽

Surface Water ◽

Convolutional Neural Network ◽

Convolutional Neural Networks ◽

Visible Spectrum ◽

Training Dataset ◽

And Training ◽

Selection Of

This paper considers the problem of classifying surface water objects, e.g. ships of different classes, in visible spectrum images using convolutional neural networks. A technique for forming a database of images of surface water objects and a special training dataset for creating a classification is presented. A method for forming and training a convolutional neural network is described. The dependence of the probability of correct recognition on the number and choice of specific classes of surface water objects is analysed. The results of recognizing different sets of classes are presented.

Perbandingan Seleksi Fitur Term Frequency & Tri-Gram Character Menggunakan Algoritma Naïve Bayes Classifier (NBC) Pada Tweet Hashtag #2019GantiPresiden

Kilat ◽

10.33322/kilat.v9i1.878 ◽

2020 ◽

Vol 9 (1) ◽

pp. 103-114

Author(s):

Arini - Arini ◽

Luh Kesuma Wardhani ◽

Dimas - Octaviano

Keyword(s):

Feature Selection ◽

Naive Bayes ◽

High Accuracy ◽

Naïve Bayes ◽

Training Data ◽

Naive Bayes Classifier ◽

Bayes Classifier ◽

Naïve Bayes Classifier ◽

Term Frequency ◽

Selection Of

Ahead of the 2019 election year, many mass campaigns were conducted through social media networks, including Twitter. One very popular online campaign used the hashtag #2019GantiPresiden. Sentiment analysis of the #2019GantiPresiden hashtag requires a classifier and a robust feature selection method that achieve high accuracy. One such combination is the Naïve Bayes Classifier (NBC) with Character Tri-Gram and Term-Frequency feature selection, which previous research found to give fairly high accuracy. The purpose of this study was to implement the NBC algorithm with each feature selection method, compare the two, and obtain the accuracy of the NBC algorithm with both. The authors used observation to collect data and ran a simulation. Using 1,000 tweets from the hashtag #2019GantiPresiden collected on 15 September 2018, the authors split the data into 950 tweets as training data and 50 tweets as test data, with labeling done using a lexicon-based sentiment method. The study found an NBC accuracy of 76% with Character Tri-Gram feature selection and 74% with Term-Frequency, showing that Character Tri-Gram feature selection performs better than Term-Frequency.
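The difference between the two feature types can be shown with a short Python sketch. It covers only feature extraction, not the classifier or the authors' preprocessing; the function names and the sample strings are made up for illustration.

```python
from collections import Counter

def term_frequency(text):
    """Count whole words separated by whitespace."""
    return Counter(text.lower().split())

def char_trigram_frequency(text):
    """Count overlapping three-character windows, spaces included."""
    cleaned = " ".join(text.lower().split())  # normalise whitespace
    return Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))

# Made-up sample text, not a tweet from the dataset.
sample = "ganti presiden ganti"
tf = term_frequency(sample)
tri = char_trigram_frequency(sample)

# A misspelled word shares no term with the correct spelling,
# but still shares several character trigrams with it.
correct, misspelled = "presiden", "presidn"
shared_terms = set(term_frequency(correct)) & set(term_frequency(misspelled))
shared_trigrams = (set(char_trigram_frequency(correct))
                   & set(char_trigram_frequency(misspelled)))
```

In this toy case the misspelling overlaps with the correct word on four trigrams and on no terms, which is one way character trigrams can behave differently from term frequency on informal text such as tweets.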

An integrated PSO for parameter determination and feature selection of ELM and its application in classification of power system disturbances

Applied Soft Computing ◽

10.1016/j.asoc.2015.03.036 ◽

2015 ◽

Vol 32 ◽

pp. 23-37 ◽

Cited By ~ 73

Author(s):

R. Ahila ◽

V. Sadasivam ◽

K. Manimala

Keyword(s):

Feature Selection ◽

Power System ◽

Parameter Determination ◽

Selection Of ◽

Power System Disturbances

Data-point and feature selection of motor imagery EEG signals for neural classification of cognitive tasks in car-driving

2015 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2015.7280831 ◽

2015 ◽

Cited By ~ 1

Author(s):

Anuradha Saha ◽

Amit Konar ◽

Pratyusha Das ◽

Basabdatta Sen Bhattacharya ◽

Atulya K. Nagar

Keyword(s):

Feature Selection ◽

Motor Imagery ◽

Cognitive Tasks ◽

EEG Signals ◽

Data Point ◽

Car Driving ◽

Selection Of

An Improved Naive Bayesian Classification Algorithm for Sentiment Classification of Microblogs

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.543-547.3614 ◽

2014 ◽

Vol 543-547 ◽

pp. 3614-3620

Author(s):

Zhi Qiang Li ◽

De Quan Yang ◽

Yuan Tan ◽

Yuan Ping Zou

Keyword(s):

Feature Selection ◽

Bayesian Classification ◽

Sentiment Classification ◽

Experimental Result ◽

Naive Bayesian ◽

Naïve Bayesian ◽

Correlation Degree ◽

Weight Calculation ◽

Selection Of

For attribute-weighted naive Bayesian classification algorithms, the selection of the weights directly affects the classification results. On this basis, the drawbacks of TF-IDF feature selection approaches in sentiment classification of microblogs are analyzed, and an improved algorithm named TF-D(t)-CHI is proposed, which applies statistical calculation to obtain the correlation degree between feature words and classes. It represents the distribution of feature items across classes by their variance, which addresses the problem that short texts contain few feature words while high-frequency feature words receive too high a weight. Experimental results indicate that TF-D(t)-CHI based naive Bayesian classification, used for feature selection and weight calculation, achieves better results in sentiment classification of microblogs.
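The CHI part of the name refers to the chi-square statistic, a standard measure of association between a term and a class. The Python sketch below shows only that standard statistic for a 2x2 term/class table; how the paper combines it with the TF and D(t) components is not given in the abstract and is not reconstructed here. The function name and the document counts are hypothetical.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 term/class contingency table.

    a: documents in the class that contain the term
    b: documents outside the class that contain the term
    c: documents in the class that do not contain the term
    d: documents outside the class that do not contain the term
    """
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    if denom == 0:
        return 0.0  # a row or column is empty, so no association can be measured
    return n * (a * d - b * c) ** 2 / denom

# Hypothetical counts for one term and the "positive" class.
score = chi_square(a=30, b=5, c=10, d=55)

# A term spread evenly across classes carries no class information.
neutral = chi_square(a=10, b=10, c=10, d=10)
```

A larger score means the term's presence is more strongly tied to the class, so such terms are natural candidates for selection or for a higher weight.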

Information Retrieval System for Determining The Title of Journal Trends in Indonesian Language Using TF-IDF and Naïve Bayes Classifier

Scientific Journal of Informatics ◽

10.15294/sji.v4i2.11876 ◽

2017 ◽

Vol 4 (2) ◽

pp. 179-190

Author(s):

Wandha Budhi Trihanto ◽

Riza Arifudin ◽

Much Aziz Muslim

Keyword(s):

Information Retrieval ◽

Retrieval System ◽

Information Retrieval System ◽

Journal Title ◽

Bayes Classifier ◽

Naïve Bayes ◽

Digital Format ◽

Library Users ◽

Weight Calculation

Journals are a form of serial literature that can support researchers in their work. Journals are now available to library users in two formats: printed and digital. However, the growing number of published journals has not been accompanied by a corresponding growth in the information and knowledge that can be retrieved from these documents. The TF-IDF method is one of the fastest and most efficient text mining methods for extracting useful words as the informational value of a document. It combines two weight-calculation concepts: the frequency of a word's appearance in a particular document and the inverse frequency of documents containing the word. The journal titles are then analyzed with the Naïve Bayes Classifier method. The purpose of the research is to build a website-based information retrieval system that helps classify Indonesian journal titles and identify trends among them. The resulting system classifies journal titles in the Indonesian language with an accuracy of 90.6% and an error rate of 9.4%. The category with the highest share, and thus the trend in title classification, was decision support systems at 24.7%.
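The two weighting concepts combine as weight = tf × idf. The Python sketch below uses one common variant, raw term count multiplied by the natural log of (number of documents / number of documents containing the term); the abstract does not say which variant the system uses, so this choice is an assumption, and the journal titles are invented examples.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)  # term frequency within this document
        weights.append({
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Invented journal titles, not data from the study.
titles = [
    "sistem pendukung keputusan pemilihan beasiswa",
    "sistem informasi perpustakaan",
    "klasifikasi judul jurnal",
]
weights = tf_idf(titles)
```

A word that appears in few titles, such as "keputusan" here, receives a higher weight than one shared by several titles, such as "sistem", and a word present in every title would receive a weight of zero.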
