Antlion Optimization-Based Feature Selection Scheme for Cloud Intrusion Detection Using Naïve Bayes Algorithm

An intrusion detection system (IDS) detects attacks or disturbances on a network or information system. Anomaly detection is a type of IDS that detects deviant network activity based on statistical probability. Growing internet use also brings more interference and attacks from intruders or crackers who exploit weak internet protocols and application software. The large volume of arriving data packets must be analyzed, and data mining is a suitable technique for doing so. This study classifies IDS anomalies using the Naïve Bayes classification algorithm on attributes selected by correlation-based feature selection. It uses the UNSW-NB15 intrusion detection data set, consisting of 49 attributes and 321,283 records. Performance is measured by accuracy, precision, F-measure, and ROC area. Correlation-based feature selection retains 4 attributes. Naïve Bayes classification without prior attribute selection achieved an accuracy of 71.2%, whereas classification preceded by correlation-based attribute selection achieved 74.8%. The accuracy of Naïve Bayes classification can therefore be improved by first selecting attributes with the correlation technique.
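
Correlation-based feature selection scores a candidate attribute subset by rewarding correlation with the class and penalizing correlation among the attributes themselves. The following is a minimal sketch of that scoring step only, assuming the commonly cited CFS merit heuristic with Pearson correlation as the measure. The function names `pearson` and `cfs_merit` and the toy data are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def cfs_merit(columns, labels, subset):
    """CFS merit: k*r_cf / sqrt(k + k*(k-1)*r_ff), using absolute correlations."""
    k = len(subset)
    # Mean feature-class correlation over the subset.
    r_cf = sum(abs(pearson(columns[i], labels)) for i in subset) / k
    # Mean feature-feature correlation over all pairs in the subset.
    pairs = list(combinations(subset, 2))
    r_ff = (sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
            if pairs else 0.0)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


# Toy data (assumed): feature 0 tracks the label, feature 1 duplicates
# feature 0, and feature 2 is uncorrelated with the label.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1.0, 2.0, 1.5, 8.0, 9.0, 8.5],
    [1.0, 2.0, 1.5, 8.0, 9.0, 8.5],
    [5.0, 1.0, 3.0, 5.0, 1.0, 3.0],
]

for subset in ([0], [0, 1], [0, 2]):
    print(subset, round(cfs_merit(columns, labels, subset), 4))
```

In this toy setup, adding the duplicate feature leaves the merit unchanged and adding the irrelevant feature lowers it, which is the behavior that lets a subset search settle on a small attribute set. A full CFS run would wrap this score in a search over subsets, which is not shown here.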

Analysis of Sentiment of Moving a National Capital with Feature Selection Naive Bayes Algorithm and Support Vector Machine

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v4i3.1942 ◽

2020 ◽

Vol 4 (3) ◽

pp. 504-512

Author(s):

Faried Zamachsari ◽

Gabriel Vangeran Saragih ◽

Susafa'ati ◽

Windu Gata

Keyword(s):

Social Media ◽

Support Vector Machine ◽

Feature Selection ◽

Public Opinion ◽

Naive Bayes ◽

Naïve Bayes ◽

Capital City ◽

Support Vector ◽

National Capital ◽

Bayes Algorithm

The decision to move Indonesia's capital city to East Kalimantan received mixed responses on social media. A still-high poverty rate and the country's difficult finances are factors in disapproval of the relocation. Twitter, one of the popular social media platforms, is used by the public to express these opinions. This study aims to determine the tendency of public responses to the relocation of the national capital, and to perform sentiment analysis of that public opinion using Naive Bayes and Support Vector Machine with feature selection to obtain the highest accuracy. The sentiment data are Indonesian-language tweets collected from Twitter by crawling, using the search terms #IbuKotaBaru and #PindahIbuKota. The research stages consist of collecting data from Twitter, determining polarity, and preprocessing comprising transform case, cleansing, tokenizing, filtering, and stemming. Feature selection is applied to increase accuracy, and the data are then split into training and testing sets at predetermined ratios. Next, the Support Vector Machine and Naive Bayes methods are compared to determine which is more accurate. In the data period studied, 24.26% of the sentiment about the move to a new capital city was positive and 75.74% was negative. Using RapidMiner, the best accuracy for Naive Bayes with feature selection was 88.24% at a 9:1 ratio, while the best accuracy for Support Vector Machine with feature selection was 78.77% at a 5:5 ratio.
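
The preprocessing stages named in the abstract above can be sketched roughly as follows. This is an illustration under assumptions, not the study's RapidMiner process: the tiny stopword set is hypothetical, treating hashtags, mentions, and URLs as noise is an assumed cleansing rule, and stemming is omitted because it would need an Indonesian stemmer that is not shown here.

```python
import re

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"yang", "di", "ke", "dan", "ini", "itu"}


def preprocess(tweet):
    """Apply transform case, cleansing, tokenizing, and filtering to one tweet."""
    text = tweet.lower()                                # transform case
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)   # cleansing: URLs, mentions, hashtags
    text = re.sub(r"[^a-z\s]", " ", text)               # cleansing: digits and punctuation
    tokens = text.split()                               # tokenizing
    return [t for t in tokens if t not in STOPWORDS]    # filtering (stopword removal)


print(preprocess("Pindah ibu kota ke Kalimantan! #PindahIbuKota https://t.co/x"))
```

The resulting token lists would then be turned into feature vectors before feature selection and the train/test split.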

Perbandingan Optimasi Feature Selection pada Naïve Bayes untuk Klasifikasi Kepuasan Airline Passenger

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i3.3086 ◽

2021 ◽

Vol 5 (3) ◽

pp. 527-533

Author(s):

Yoga Religia ◽

Amali Amali

Keyword(s):

Feature Selection ◽

Customer Satisfaction ◽

Naive Bayes ◽

Naïve Bayes ◽

Point Of View ◽

Classification Model ◽

Passenger Satisfaction ◽

Airline Passenger ◽

Bayes Algorithm

The quality of an airline's services cannot be measured from the company's point of view; it must be seen from the point of view of customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated outstanding classification accuracy, but its independence assumption is currently rarely discussed. Some literature suggests attribute weighting to relax the independence assumption, which can be done through feature selection using particle swarm optimization (PSO) and genetic algorithm (GA). This study compares PSO and GA optimization of Naïve Bayes for classifying the Airline Passenger Satisfaction data taken from www.kaggle.com. In testing, the best performance came from the Naïve Bayes model with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.
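
For reference, accuracy, precision, and recall of the kind reported above are all derived from the four cells of a binary confusion matrix. A minimal sketch follows; the function name `binary_metrics` and the counts are hypothetical and do not reproduce the study's figures.

```python
def binary_metrics(tp, fp, fn, tn):
    """Accuracy, precision, and recall from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,  # of predicted positives, how many are right
        "recall": tp / (tp + fn) if tp + fn else 0.0,     # of actual positives, how many are found
    }


# Hypothetical counts: 8 true positives, 2 false positives,
# 2 false negatives, 8 true negatives.
print(binary_metrics(tp=8, fp=2, fn=2, tn=8))
```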

Implementation of The Naïve Bayes Algorithm with Feature Selection using Genetic Algorithm for Sentiment Review Analysis of Fashion Online Companies

2018 6th International Conference on Cyber and IT Service Management (CITSM) ◽

10.1109/citsm.2018.8674286 ◽

2018 ◽

Cited By ~ 2

Author(s):

Siti Ernawati ◽

Eka Rini Yulia ◽

Frieyadie ◽

Samudi

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Naive Bayes ◽

Naïve Bayes ◽

Review Analysis ◽

Bayes Algorithm

An Intrusion Detection Model based on Hybrid Classification algorithm

MATEC Web of Conferences ◽

10.1051/matecconf/201824603027 ◽

2018 ◽

Vol 246 ◽

pp. 03027

Author(s):

Manfu Ma ◽

Wei Deng ◽

Hongtong Liu ◽

Xinmiao Yun

Keyword(s):

Intrusion Detection ◽

Detection Rate ◽

Naive Bayes ◽

Naïve Bayes ◽

Classification Algorithm ◽

Data Set ◽

Detection Model ◽

Performance Requirements ◽

Hybrid Classification ◽

Bayes Algorithm

Because a single classification algorithm cannot meet the performance requirements of intrusion detection, an intrusion detection model, KNN-NB, is proposed. It is based on a hybrid of KNN and Naive Bayes that combines the strength of KNN on numerical values with the advantage of naive Bayes in the structure of data. The model first preprocesses the NSL-KDD intrusion detection data set. Then, exploiting the advantages of the KNN algorithm on data values, the model calculates the distance between samples according to the feature items and selects the K samples with the smallest distance. Finally, naive Bayes is applied to obtain the final result. Experimental results on the NSL-KDD data set show that the KNN-NB algorithm achieves more balanced performance than the traditional KNN and Naive Bayes algorithms in terms of accuracy, sensitivity, false detection rate, specificity, and missed detection rate.
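
One possible reading of the two-stage procedure described above is sketched below: select the K nearest training samples, then fit a Gaussian naive Bayes model on those neighbours only and use it to classify the query. The abstract does not give the exact formulation, so the Euclidean distance, the Gaussian likelihood, the variance smoothing term `eps`, and the function name `knn_nb_predict` are all assumptions.

```python
from math import log, pi


def knn_nb_predict(train_x, train_y, query, k=5, eps=1e-3):
    """Pick the k nearest training samples, then classify with Gaussian naive Bayes."""
    # Step 1 (KNN): rank training samples by squared Euclidean distance to the query.
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)))
    neighbours = order[:k]

    # Step 2 (naive Bayes): fit per-class Gaussians on the neighbours only.
    best_label, best_score = None, float("-inf")
    for label in {train_y[i] for i in neighbours}:
        rows = [train_x[i] for i in neighbours if train_y[i] == label]
        score = log(len(rows) / len(neighbours))  # log prior within the neighbourhood
        for j, value in enumerate(query):
            col = [r[j] for r in rows]
            mean = sum(col) / len(col)
            var = sum((v - mean) ** 2 for v in col) / len(col) + eps  # smoothed variance
            score += -0.5 * log(2 * pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical two-feature toy data, not drawn from NSL-KDD.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["normal", "normal", "normal", "attack", "attack", "attack"]
print(knn_nb_predict(train_x, train_y, [0.5, 0.5], k=4))
```

Restricting the naive Bayes fit to the neighbourhood is one way to let distance information shape the probabilistic decision; other combinations, such as using KNN output as an extra feature, would be equally consistent with the abstract.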

Improving Text Categorization by Multicriteria Feature Selection

Journal of Advanced Computational Intelligence and Intelligent Informatics ◽

10.20965/jaciii.2005.p0570 ◽

2005 ◽

Vol 9 (5) ◽

pp. 570-575

Author(s):

Son Doan ◽

Susumu Horiguchi

Keyword(s):

Feature Selection ◽

Natural Language ◽

Text Categorization ◽

Naive Bayes ◽

Naïve Bayes ◽

Experimental Results ◽

Benchmark Data ◽

Bayes Algorithm

Text categorization involves assigning a natural language document to one or more predefined classes. One of the most interesting issues is feature selection. We propose an approach using multicriteria ranking of features, a new procedure for feature selection, and apply these to text categorization. Experimental results dealing with Reuters-21578 and 20Newsgroups benchmark data and the naive Bayes algorithm show that our proposal outperforms conventional feature selection in text categorization performance.

Naive Bayes Algorithm Using Selection of Correlation Based Featured Selections Features for Chronic Diagnosis Disease

IJIIS: International Journal of Informatics and Information Systems ◽

10.47738/ijiis.v2i2.14 ◽

2019 ◽

Vol 2 (2) ◽

pp. 56-60

Author(s):

Irfan Santiko ◽

Ikhsan Honggo

Keyword(s):

Chronic Kidney Disease ◽

Feature Selection ◽

Kidney Disease ◽

Naive Bayes ◽

Naïve Bayes ◽

Progressive Decline ◽

Bayes Algorithm ◽

Test Algorithm ◽

Selection Testing ◽

Selection Of

Chronic kidney disease can cause death because its pathophysiological etiology results in a progressive decline in renal function that ends in kidney failure. Chronic kidney disease (CKD) has become a serious problem worldwide. Kidney and urinary tract diseases cause the death of 850,000 people each year, which ranks them 12th in mortality. Several health studies, including studies of chronic kidney disease, have been carried out to detect the disease early. This study tests the Naive Bayes algorithm for detecting the disease in patients who tested positive or negative for CKD. The accuracy of the algorithm is compared before and after feature selection using correlation-based feature selection (CFS): Naive Bayes achieves 93.58% accuracy after feature selection and 93.54% without it. Both results fall in the excellent classification range, since the accuracy values lie between 0.90 and 1.00, with feature selection giving the slightly higher accuracy.

KLASIFIKASI TEKS MENGGUNAKAN CHI SQUARE FEATURE SELECTION UNTUK MENENTUKAN KOMIK BERDASARKAN PERIODE, MATERI DAN FISIK DENGAN ALGORITMA NAIVE BAYES

Compiler ◽

10.28989/compiler.v5i2.171 ◽

2016 ◽

Vol 5 (2) ◽

Author(s):

Siti Anisah ◽

Anton Setiawan Honggowibowo ◽

Asih Pujiastuti

Keyword(s):

Feature Selection ◽

Error Rate ◽

Classification System ◽

Naive Bayes ◽

Naïve Bayes ◽

Chi Square ◽

Oracle Database ◽

Category O ◽

The Difference ◽

Bayes Algorithm

A comic has its own characteristics compared with other types of books. The difference between comics and other books can be seen in the categories of period, material, and physical form. Distinguishing comics from other books calls for a classification system. To address this problem, a classification system was built using Chi Square Feature Selection and the Naive Bayes algorithm to identify comics based on period, material, and physical form. The Delphi programming language and an Oracle Database were used to build the classification system. Chi Square Feature Selection yielded a value of 0.10347 for the comic trait and 1.9531 for the non-comic trait. The data are then classified by the Naive Bayes algorithm. Of 120 comic and non-comic titles, 60 were used for training to build the classes and 60 were used for testing. The Naive Bayes algorithm achieved 96.67% accuracy for comics, with a 3.33% error rate, and 90% for non-comics, with a 10% error rate. The classification system identifies comics well.
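
As background for the chi-square step, the statistic for one feature against one class is often computed from a 2x2 contingency table. The sketch below uses that standard closed form; the function name `chi_square` and the counts are hypothetical, and the abstract does not state how its reported values were derived.

```python
def chi_square(a, b, c, d):
    """Chi-square statistic for a 2x2 contingency table.

    a: in-class items with the feature,    b: out-of-class items with it,
    c: in-class items without the feature, d: out-of-class items without it.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # Standard closed form: N * (ad - bc)^2 / product of the four marginals.
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


# Hypothetical counts for 60 comic and 60 non-comic titles: the feature
# appears in 50 comics and 5 non-comics.
print(chi_square(a=50, b=5, c=10, d=55))
```

A larger value indicates a stronger association between the feature and the class, so features are typically ranked by this score and the top ones retained.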

Opinion Mining on Culinary Food Customer Satisfaction Using Naïve Bayes Based-on Hybrid Feature Selection

Indonesian Journal of Electrical Engineering and Computer Science ◽

10.11591/ijeecs.v15.i1.pp468-475 ◽

2019 ◽

Vol 15 (1) ◽

pp. 468 ◽

Cited By ~ 3

Author(s):

Oman Somantri ◽

Dyah Apriliani

Keyword(s):

Feature Selection ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Naïve Bayes ◽

Classification Model ◽

Consumer Ratings ◽

Bayes Algorithm ◽

Restaurant Owners

Assessing consumer sentiment about culinary food from social media gives useful information to anyone who wants it, especially migrants and tourists; it is also valuable to food stall and restaurant owners for improving food quality. To obtain this information, a sentiment analysis classification model using the naïve Bayes (NB) algorithm was applied. The problem is that the accuracy of classifying consumer ratings of culinary food is still not optimal, because the weight values in the data preprocessing step are not optimal. This paper proposes a hybrid feature selection model that combines information gain (IG) and genetic algorithm (GA) to address the suboptimal selection of feature attributes. The experiments showed that, compared with other algorithms, the proposed model produced the best accuracy, 93%.
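
The information gain half of the hybrid can be illustrated with a short sketch: the gain of a feature is the entropy of the class labels minus the weighted entropy remaining after splitting on that feature. The function names and toy data below are hypothetical, and the genetic algorithm stage is not shown.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    remainder = 0.0
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical toy data: 1 means the word occurs in the review, 0 means it does not.
labels = ["pos", "pos", "neg", "neg"]
print(information_gain([1, 1, 0, 0], labels))  # word that separates the classes
print(information_gain([1, 0, 1, 0], labels))  # word that carries no class signal
```

Ranking features by this score gives a filter-style shortlist, which a search method such as a genetic algorithm could then refine.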

Sentiment Analysis Using Naive Bayes Algorithm with Feature Selection Particle Swarm Optimization (PSO) and Genetic Algorithm

International Journal of Advances in Data and Information Systems ◽

10.25008/ijadis.v2i2.1224 ◽

2021 ◽

Vol 2 (2) ◽

Author(s):

Abi Rafdi ◽

Herman Mawengkang Herman ◽

Syahril Efendi

Keyword(s):

Genetic Algorithm ◽

Feature Selection ◽

Particle Swarm Optimization ◽

Sentiment Analysis ◽

Naive Bayes ◽

Confusion Matrix ◽

Particle Swarm ◽

Naïve Bayes ◽

Swarm Optimization ◽

Bayes Algorithm

This study uses sentiment analysis to examine opinions, points of view, judgments, attitudes, and emotions toward entities and their aspects as expressed in text. Social media such as Twitter is one of the most widely used means of communication and a frequent research topic. The main problem in sentiment analysis is selecting and using the best features for maximum results. One of the most widely known classification methods is Naive Bayes; however, Naive Bayes is very sensitive to feature selection. In this test, therefore, feature selection using Particle Swarm Optimization and Genetic Algorithm is compared to improve the accuracy of the Naive Bayes algorithm. The analysis compares results before and after feature selection. Validation uses cross-validation, and a confusion matrix is used to measure accuracy. The results showed the largest increase in Naïve Bayes accuracy with Particle Swarm Optimization feature selection, from 60.26% to 77.50%, while the Genetic Algorithm raised it from 60.26% to 70.71%. Particle Swarm Optimization is therefore the better feature selection choice, with an accuracy increase of 17.24 percentage points.
