Naive Bayes classification model for isotopologue detection in LC-HRMS data

Isotopologue identification or removal is a necessary step to reduce the number of features that need to be identified in samples analyzed with non-targeted analysis. Currently available approaches rely on either predicted isotopic patterns or an arbitrary mass tolerance, requiring information on the molecular formula or instrumental error, respectively. Therefore, a Naive Bayes isotopologue classification model was developed that does not depend on any thresholds or molecular formula information. This classification model uses elemental mass defects of six elemental ratios and can successfully identify isotopologues in both theoretical isotopic patterns and wastewater influent samples, outperforming one of the most commonly used approaches (i.e., 1.0033 Da mass difference method - CAMERA).
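
As a rough illustration of the mass-difference baseline mentioned above, the following sketch pairs features whose m/z values differ by 1.0033 Da within a tolerance. It is a simplified stand-in, not CAMERA's actual algorithm and not the Naive Bayes model from the paper; the function name, the 0.005 Da tolerance, the charge of 1 and the m/z values are all assumptions made for illustration.

```python
C13_C12_DELTA = 1.0033  # Da, the mass difference quoted in the abstract


def find_isotopologue_pairs(mz_values, tolerance=0.005, charge=1):
    """Return (lighter, heavier) m/z pairs spaced by about 1.0033/charge.

    tolerance and charge are hypothetical defaults, not values from the paper.
    """
    spacing = C13_C12_DELTA / charge
    ordered = sorted(mz_values)
    pairs = []
    for i, light in enumerate(ordered):
        for heavy in ordered[i + 1:]:
            # Keep the pair if the spacing matches within the chosen tolerance.
            if abs((heavy - light) - spacing) <= tolerance:
                pairs.append((light, heavy))
    return pairs


# Made-up m/z values for illustration only.
features = [180.0634, 181.0667, 182.0701, 250.1]
pairs = find_isotopologue_pairs(features)
```

The tolerance argument is the kind of threshold that the abstract says the Naive Bayes model is designed to avoid.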

Children’s Activity Classification for Domestic Risk Scenarios Using Environmental Sound and a Bayesian Network

Healthcare ◽

10.3390/healthcare9070884 ◽

2021 ◽

Vol 9 (7) ◽

pp. 884

Author(s):

Antonio García-Domínguez ◽

Carlos E. Galván-Tejada ◽

Ramón F. Brena ◽

Antonio A. Aguileta ◽

Jorge I. Galván-Tejada ◽

...

Keyword(s):

Feature Selection ◽

Naive Bayes ◽

Naïve Bayes ◽

Classification Model ◽

Activity Classification ◽

Environmental Sound ◽

Non Invasive ◽

Akaike Criterion ◽

Data Source ◽

Feature Selection Techniques

Children’s healthcare is a relevant issue, especially the prevention of domestic accidents, which has been defined as a global health problem. Children’s activity classification generally relies on sensors embedded in children’s clothing, which can yield erroneous measurements because of possible damage or mishandling. A non-invasive data source for a children’s activity classification model makes the monitoring system in which it is applied more reliable. This work proposes environmental sound as a data source for generating children’s activity classification models, implementing feature selection methods and classification techniques based on Bayesian networks. The focus is on recognizing activities that can potentially trigger domestic accidents, for use in child monitoring systems. Two feature selection techniques were used: the Akaike criterion and genetic algorithms. Likewise, models were generated using three classifiers: naive Bayes, semi-naive Bayes and tree-augmented naive Bayes. Most of the generated models, which combine the feature selection methods with the classifiers, achieve an accuracy greater than 97%, which supports the effectiveness of the proposal for recognizing activities that can potentially trigger domestic accidents.

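Perbandingan Optimasi Feature Selection pada Naïve Bayes untuk Klasifikasi Kepuasan Airline Passenger (Comparison of Feature Selection Optimization on Naïve Bayes for Airline Passenger Satisfaction Classification)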

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i3.3086 ◽

2021 ◽

Vol 5 (3) ◽

pp. 527-533

Author(s):

Yoga Religia ◽

Amali Amali

Keyword(s):

Feature Selection ◽

Customer Satisfaction ◽

Naive Bayes ◽

Naïve Bayes ◽

Point Of View ◽

Classification Model ◽

Passenger Satisfaction ◽

Airline Passenger ◽

Bayes Algorithm

The quality of an airline's services cannot be measured from the company's point of view; it must be seen from the point of view of customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated outstanding classification accuracy, but its independence assumption is currently rarely discussed. Some literature suggests attribute weighting to relax the independence assumption, which can be done through feature selection using particle swarm optimization (PSO) and genetic algorithms (GA). This study compared PSO and GA optimization of Naïve Bayes for classifying the Airline Passenger Satisfaction data taken from www.kaggle.com. In testing, the best performance was obtained by the Naïve Bayes model with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.

Improving Techniques for Naïve Bayes Text Classifiers

Handbook of Research on Text and Web Mining Technologies ◽

10.4018/978-1-59904-990-8.ch007 ◽

2010 ◽

pp. 111-127

Author(s):

Han-joon Kim

Keyword(s):

Text Classification ◽

Naive Bayes ◽

Naïve Bayes ◽

Classification Systems ◽

Classification Model ◽

Learning Approaches ◽

Learning Framework ◽

The Em Algorithm ◽

Meta Learning ◽

Text Classifiers

This chapter introduces two practical techniques for improving Naïve Bayes text classifiers, which are widely used for text classification. Naïve Bayes is regarded as a practical text classification algorithm because of its simple classification model, reasonable classification accuracy, and the ease of updating the model. Thus, many researchers have a strong incentive to improve Naïve Bayes by combining it with meta-learning approaches such as EM (Expectation Maximization) and Boosting. The EM approach combines Naïve Bayes with the EM algorithm, and the Boosting approach uses Naïve Bayes as the base classifier in the AdaBoost algorithm. For both approaches, a special uncertainty measure suited to Naïve Bayes learning is used. Within the Naïve Bayes learning framework, these approaches are expected to be practical solutions to the shortage of training documents in text classification systems.

Study of Learning System based on Naive Bayes Classification Model

Journal of Convergence Information Technology ◽

10.4156/jcit.vol7.issue22.88 ◽

2012 ◽

Vol 7 (22) ◽

pp. 746-753

Author(s):

Yingyan Wang ◽

Rui Zeng

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Learning System ◽

Classification Model ◽

Naive Bayes Classification ◽

Naïve Bayes Classification

Classifying the Level of Energy-Environmental Efficiency Rating of Brazilian Ethanol

Energies ◽

10.3390/en13082067 ◽

2020 ◽

Vol 13 (8) ◽

pp. 2067

Author(s):

Nilsa Duarte da Silva Lima ◽

Irenilza de Alencar Nääs ◽

João Gilberto Mendes dos Reis ◽

Raquel Baracat Tosi Rodrigues da Silva

Keyword(s):

Decision Tree ◽

High Efficiency ◽

Rating Scale ◽

Naive Bayes ◽

Naïve Bayes ◽

Environmental Efficiency ◽

Classification Model ◽

Bayes Algorithm ◽

J48 Decision Tree

The present study aimed to assess and classify energy-environmental efficiency levels to reduce greenhouse gas emissions in the production, commercialization, and use of biofuels certified by the Brazilian National Biofuel Policy (RenovaBio). The parameters of the level of energy-environmental efficiency were standardized and categorized according to the Energy-Environmental Efficiency Rating (E-EER). The rating scale ranged from lower efficiency (D) to the highest efficiency (A+). The J48 decision tree and naive Bayes algorithms were used to build the classification models. Classifying the E-EER scores with the J48 decision tree and with the naive Bayes classifier produced models that were efficient at estimating the efficiency level of Brazilian ethanol producers and importers certified by RenovaBio. The rules generated by the models can assess the level classes (efficiency scores) according to the scale discretized into high efficiency (Classification A), average efficiency (Classification B), and standard efficiency (Classification C). These results might be used to generate an ethanol energy-environmental efficiency label for end consumers and resellers of the product, to assist in purchase decisions concerning its performance. The best classification model was naive Bayes, compared to the J48 decision tree. The classification of the Energy Efficiency Note levels using the naive Bayes algorithm produced a model capable of estimating the efficiency level of Brazilian ethanol to create labels.

Naive Bayes Classification Model for the Student Performance Prediction

2019 2nd International Conference on Intelligent Computing, Instrumentation and Control Technologies (ICICICT) ◽

10.1109/icicict46008.2019.8993237 ◽

2019 ◽

Cited By ~ 2

Author(s):

Akarshita Tripathi ◽

Saumya Yadav ◽

Rajiv Rajan

Keyword(s):

Student Performance ◽

Performance Prediction ◽

Naive Bayes ◽

Naïve Bayes ◽

Classification Model ◽

Naive Bayes Classification ◽

Naïve Bayes Classification

Predicting the Enrollment and Dropout of Students in the Post-Graduation Degree using Machine Learning Classifier

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.k2435.0981119 ◽

2019 ◽

Vol 8 (11) ◽

pp. 3083-3088 ◽

Cited By ~ 1

Keyword(s):

Performance Evaluation ◽

Naive Bayes ◽

Confusion Matrix ◽

Dropout Rate ◽

Naïve Bayes ◽

Random Tree ◽

Drop Out ◽

Classification Model ◽

Evaluation Metrics ◽

Tree Classifier

In Bangladesh, the dropout rate at the post-graduation level, or non-completion of the post-graduation degree, is considered a serious problem in the education sector. This work can help identify the individual as well as institutional factors that may lead to enrollment in or dropout from a post-graduation degree. A real dataset is used. Seven classification algorithms, namely Naïve Bayes, Multilayer Perceptron, Logistic, Locally Weighted Learning (LWL), Random Forest, Random Tree, and Part, are applied. A confusion matrix is calculated for each classification model. Seven performance evaluation metrics (accuracy, sensitivity, precision, specificity, F1 score, FPR, and FNR) are then computed, and each classifier's performance is analyzed using them. The Naïve Bayes, LWL, and Part classifiers perform better than all the others, attaining 86.36% accuracy, whereas the Random Tree classifier performs worst, achieving 74.24% accuracy. After further analysis of the results based on the performance evaluation metrics, the LWL classifier is observed to perform best among all the classifiers in this context.
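
The seven metrics listed in the abstract can all be derived from the four cells of a binary confusion matrix. The sketch below shows one way to do so; the function names and the counts are hypothetical and are not taken from the paper, and a ratio with a zero denominator is returned as 0.0 by assumption.

```python
def _ratio(numerator, denominator):
    """Divide safely; a zero denominator yields 0.0 (an assumption of this sketch)."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, fn, tn):
    """Compute the seven metrics named in the abstract from confusion-matrix counts."""
    sensitivity = _ratio(tp, tp + fn)   # true positive rate (recall)
    precision = _ratio(tp, tp + fp)
    specificity = _ratio(tn, tn + fp)   # true negative rate
    return {
        "accuracy": _ratio(tp + tn, tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "precision": precision,
        "specificity": specificity,
        "f1": _ratio(2 * precision * sensitivity, precision + sensitivity),
        "fpr": _ratio(fp, fp + tn),     # false positive rate = 1 - specificity
        "fnr": _ratio(fn, fn + tp),     # false negative rate = 1 - sensitivity
    }


# Hypothetical counts chosen for illustration only.
metrics = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

Because FPR and FNR are the complements of specificity and sensitivity, two classifiers with the same accuracy can still differ on the remaining metrics, which is consistent with the abstract separating equally accurate classifiers by further analysis.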

Text Analysis of Applicants for Personality Classification Using Multinomial Naïve Bayes and Decision Tree

JURNAL INFOTEL ◽

10.20895/infotel.v12i3.505 ◽

2020 ◽

Vol 12 (3) ◽

Author(s):

Nanda Yonda Hutama ◽

Kemas Muslim Lhaksmana ◽

Isman Kurniawan

Keyword(s):

Decision Tree ◽

Personality Traits ◽

Big Five ◽

Naive Bayes ◽

Naïve Bayes ◽

Classification Model ◽

Big Five Personality ◽

Text Data ◽

Personality Classification ◽

Best Parameters

Employees' qualities affect companies' performance, and with a large number of applicants it is difficult to find suitable ones. To help with this, companies carry out psychological tests to learn applicants' personalities, since personality is considered to be related to work performance. But psychological testing requires a lot of effort, cost, and human resources. A system that can classify personality from text can therefore help reduce the effort needed. Similar studies took the big five personality traits as the theoretical basis but used only one of the traits, using the k-NN method with 65% accuracy. Based on these studies, accuracy can be improved by finding the best parameters using all of the big five personality traits. This research is based on the big five personality traits and related traits, namely conscientiousness and agreeableness. The data used is text data that has been labelled, pre-processed and feature selected. The clean text data is used to create classification models using multinomial naïve Bayes and decision trees. Six models are built based on 3 work cultures: decision trees with accuracies of 33%, 66%, and 80%, and multinomial naïve Bayes with accuracies of 83%, 50%, and 60%, the latter reported as the better performer.
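
For readers unfamiliar with multinomial naïve Bayes on text, a minimal sketch follows. It is not the authors' implementation: the function names, the add-one (Laplace) smoothing, the toy documents and the trait labels are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels, alpha=1.0):
    """Fit multinomial naive Bayes on tokenized documents with add-alpha smoothing."""
    class_docs = Counter(labels)
    word_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = {word for counts in word_counts.values() for word in counts}
    model = {}
    for label, n_docs in class_docs.items():
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model[label] = {
            "log_prior": math.log(n_docs / len(labels)),
            "log_lik": {w: math.log((word_counts[label][w] + alpha) / denom)
                        for w in vocab},
        }
    return model


def predict_mnb(model, tokens):
    """Return the class with the highest log posterior; unseen tokens are skipped."""
    def score(params):
        return params["log_prior"] + sum(
            params["log_lik"][t] for t in tokens if t in params["log_lik"])
    return max(model, key=lambda label: score(model[label]))


# Toy training data, invented for illustration only.
docs = [["kind", "helpful", "team"], ["friendly", "kind", "support"],
        ["organized", "careful", "plan"], ["plan", "deadline", "careful"]]
labels = ["agreeableness", "agreeableness",
          "conscientiousness", "conscientiousness"]
model = train_mnb(docs, labels)
prediction = predict_mnb(model, ["kind", "team"])
```

Working in log space avoids numerical underflow when many word probabilities are multiplied, and the smoothing keeps a word unseen in one class from forcing that class's probability to zero.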

Opinion Mining on Culinary Food Customer Satisfaction Using Naïve Bayes Based-on Hybrid Feature Selection

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v15.i1.pp468-475 ◽

2019 ◽

Vol 15 (1) ◽

pp. 468 ◽

Cited By ~ 3

Author(s):

Oman Somantri ◽

Dyah Apriliani

Keyword(s):

Feature Selection ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Classification Model ◽

Consumer Ratings ◽

Bayes Algorithm ◽

Restaurant Owners

Assessing consumer sentiment about culinary food from social media gives useful information to everyone who wants it, especially migrants and tourists; on the other hand, that information is very valuable to food stall and restaurant owners for improving food quality. To obtain this information, a sentiment analysis classification model using the naïve Bayes (NB) algorithm was applied. The problem is that the accuracy of classifying consumer ratings of culinary food is still not optimal because the weighting of values in data preprocessing is not optimal. This paper proposes a hybrid feature selection model that combines information gain (IG) and a genetic algorithm (GA) to overcome the suboptimal selection of feature attributes. The experiments showed that, compared with other algorithms, the proposed model produced the best accuracy, 93%.
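
The information gain half of the hybrid selection can be illustrated with a short sketch. It scores a single categorical feature against the class labels; the genetic algorithm stage is not shown, and the function names and the toy term-presence data are assumptions made for illustration rather than anything taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy after splitting on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy data: whether a review contains a given word (1) or not (0).
sentiment = ["pos", "pos", "neg", "neg"]
has_delicious = [1, 1, 0, 0]   # perfectly separates the classes
has_the = [1, 0, 1, 0]         # carries no information about the class

ig_delicious = information_gain(has_delicious, sentiment)
ig_the = information_gain(has_the, sentiment)
```

Features would typically be ranked by this score and the low-scoring ones dropped before training the classifier.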

Twitter Sentiment Analysis Using an Ensemble Majority Vote Classifier

Journal of Southwest Jiaotong University ◽

10.35741/issn.0258-2724.55.1.9 ◽

2020 ◽

Vol 55 (1) ◽

Cited By ~ 2

Author(s):

Alaa Khudhair Abbas ◽

Ali Khalil Salih ◽

Harith A. Hussein ◽

Qasim Mohammed Hussein ◽

Saba Alaa Abdulwahhab

Keyword(s):

Social Media ◽

Naive Bayes ◽

Majority Vote ◽

Ensemble Classifier ◽

Classification Performance ◽

Naïve Bayes ◽

Classification Model ◽

Media Messages ◽

Individual Classifier ◽

The Individual

Twitter social media data generally contains ambiguous text, which can make it difficult to identify positive or negative sentiment. More than one billion social media messages need to be stored in a proper database and processed correctly to be analyzed. In this paper, an ensemble majority vote classifier is proposed to enhance sentiment classification performance and accuracy. The proposed classification model combines four classifiers that use different techniques (naive Bayes, decision trees, multilayer perceptron and logistic regression) into a single ensemble classifier. In addition, the four classifiers are compared to evaluate their individual performance. The results show that, among the individual classifiers, naive Bayes performs best. However, when the proposed ensemble majority vote classifier is compared with the four individual classifiers, its performance is better than that of any individual classifier.
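
Majority voting over the outputs of several classifiers can be sketched in a few lines. The example below is not the paper's implementation: the function name and the per-classifier predictions are invented, and the tie-breaking rule (a tie goes to the label voted by the earliest classifier in the list) is an assumption, since the abstract does not say how ties among four voters are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by hard majority voting.

    predictions holds one list of labels per classifier, all of equal length.
    Ties go to the label seen first among the votes (an assumption of this
    sketch), because Counter keeps equal counts in first-encountered order.
    """
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


# Hypothetical predictions from four classifiers on three tweets.
naive_bayes = ["pos", "neg", "pos"]
decision_tree = ["pos", "pos", "neg"]
multilayer_perceptron = ["neg", "neg", "pos"]
logistic_regression = ["pos", "neg", "neg"]

ensemble = majority_vote(
    [naive_bayes, decision_tree, multilayer_perceptron, logistic_regression])
```

With an even number of voters, ties such as the third tweet here are possible, so listing the strongest individual classifier first gives it the casting vote under this rule.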
