Analysis of the Effect of Clustering the Training Data in Naive Bayes Classifier for Anomaly Network Intrusion Detection

2014 ◽  
Vol 2 (1) ◽  
pp. 85-88 ◽  
Author(s):  
Uma Subramanian ◽  
Hang See Ong
2019 ◽  
Vol 8 (3) ◽  
pp. 5906-5910

The main goal of network intrusion detection is to prevent malicious processes in a network. Existing modules are outdated because of improper authentication, and networks remain exposed to new attacks and malware. In this research, a hybrid module (HCSO-NB) is formed from Chicken Swarm Optimization and a Naive Bayes classifier to classify intrusion data. The hybrid method is introduced to detect features efficiently in complex datasets, since the strategy is designed to handle large volumes of network data. Some traditional methods have serious limitations on complex datasets. The two algorithms share their properties to obtain better optimization results and classification precision. This paper examines feature selection performance on the NSLKDD-99 dataset, comparing Swarm Intelligence (SI), the Naïve Bayes classifier, and the proposed HCSO-NB algorithm. The proposed classification process was implemented in the NetBeans 8.2 tool. Experiments show that the proposed HCSO-NB successfully improved the accuracy.


2020 ◽  
Vol 17 (1) ◽  
pp. 37-42
Author(s):  
Yuris Alkhalifi ◽  
Ainun Zumarniansyah ◽  
Rian Ardianto ◽  
Nila Hardi ◽  
Annisa Elfina Augustia

Non-Cash Food Assistance, or Bantuan Pangan Non-Tunai (BPNT), is government food assistance given to Beneficiary Families (KPM) every month through an electronic account mechanism that can be used only to buy food at the Electronic Shop Mutual Assistance Joint Business Group Hope Family Program (e-Warong KUBE PKH) or from food traders working with Bank Himbara. Distribution of BPNT still poses problems for village officials, especially those of Desa Wanasari, in deciding which households are eligible to receive it (poor) and which are not (not poor). Data mining is one way to support this decision. This study compares two algorithms, the Naive Bayes Classifier and the C4.5 decision tree. The sample consists of 200 head-of-household records, split for validation into 90% training data and 10% test data. The proposed model is built in the RapidMiner application and evaluated with a confusion matrix to find which of the two methods is more accurate. The results indicate an accuracy of 98.89% for the Naive Bayes Classifier and 95.00% for the C4.5 decision tree. The study concludes that the Naive Bayes Classifier is the more accurate algorithm, by a margin of 3.89 percentage points.
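As a rough illustration of how accuracy is read off a confusion matrix, the sketch below sums the diagonal and divides by the total. The function name `accuracy` and the counts are hypothetical and are not taken from the study.

```python
def accuracy(confusion):
    """Accuracy = correctly classified samples / all samples.

    confusion[i][j] counts samples of true class i predicted as class j,
    so the diagonal holds the correct predictions.
    """
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Hypothetical counts for a two-class (poor / not poor) test set of 20
# households; these numbers are illustrative, not the study's results.
example = [[9, 1],
           [0, 10]]
print(f"{accuracy(example):.2%}")  # 19 of 20 correct
```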


SinkrOn ◽  
2020 ◽  
Vol 5 (1) ◽  
Author(s):  
Miftahul Kahfi Al Fath ◽  
Arini Arini ◽  
Nasrul Hakiem

Sentiment analysis is an important and emerging research topic. It is used to determine whether a person's opinion of a problem or object tends to be negative or positive. The main purpose of this study is to determine public sentiment toward the Full Day School policy from comments on the Facebook page of Kemendikbud RI, and to measure the performance of the Naïve Bayes Classifier algorithm. The authors used the Naïve Bayes Classifier with character trigram and quadgram feature selection, two different training data models, and training data labeled with a lexicon-based method to classify public sentiment toward the Full Day School policy. The results show that negative public sentiment toward the policy exceeds positive or neutral sentiment. The highest accuracy, 80%, was obtained by the Naïve Bayes Classifier with trigram feature selection on the 300-item training data model. The amount of training data and the feature selection used with the Naïve Bayes Classifier affected the accuracy.
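Character n-gram features of the kind mentioned above can be sketched as follows. This is a minimal illustration; the helper name `char_ngrams` is hypothetical, and the study's actual preprocessing (case folding, whitespace handling) is not stated in the abstract.

```python
def char_ngrams(text, n):
    """Return all overlapping character n-grams of `text`, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(char_ngrams("school", 3))  # ['sch', 'cho', 'hoo', 'ool']
print(char_ngrams("school", 4))  # ['scho', 'choo', 'hool']
```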


Repositor ◽  
2019 ◽  
Vol 1 (2) ◽  
pp. 125
Author(s):  
Vinna Rahmayanti ◽  
Setio Basuki ◽  
Hilman Hilman

It is undeniable that technology is advancing very quickly in the field of computing: work originally done by humans can now be taken over by computers. The case study of this research is a system that classifies text, such as a synopsis, into genre groups. Genre is the style of story in a novel. Novels come in many genres, such as romance, comedy, mystery, horror and others, and knowing the genre tells the reader the story style of the novel. The methods used in this research are TF-IDF (Term Frequency-Inverse Document Frequency) and the Naïve Bayes Classifier. TF-IDF is used to obtain the weight of each word in a document, and the resulting weights are used by the Naïve Bayes Classifier to classify the synopsis into a genre. An evaluation with a confusion matrix, using 600 training items and 200 test items, obtained an accuracy of 80.5%.
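For readers unfamiliar with TF-IDF weighting, here is a minimal sketch. It assumes one common variant (relative term frequency times the natural log of N over document frequency); the abstract does not say which variant the authors used, and the function name `tf_idf` and the toy synopses are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Weight each term in each tokenized document as tf * idf.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy synopses, already tokenized.
docs = [["love", "story", "love"], ["ghost", "story"], ["funny", "love"]]
weights = tf_idf(docs)
print(weights[1])  # "ghost" is rarer than "story", so it gets the larger weight
```

A term that occurs in every document gets weight zero under this variant, which is why frequent but uninformative words contribute little to the classifier.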


Author(s):  
Mohammad Zoqi Sarwani ◽  
Muhammad Shubkhan Salafudin ◽  
Dian Ahkam Sani

With the spread of social media among students, students use Facebook to communicate and to express what they feel in the form of status updates. Personality is a person's character, which shapes how the person adjusts to the surrounding environment in order to communicate smoothly. Psychological theory offers many ways to categorize personality. This work uses the Big Five theory, which describes personality in five traits: Openness, Conscientiousness, Extraversion, Agreeableness, and Neuroticism. The Naive Bayes Classifier is used to determine the category with the highest probability. Two sets of data are used, training data and testing data, both obtained from students' Facebook statuses. Testing the system on these data gives an accuracy of 88%.


Author(s):  
Jie Ji ◽  
Qiangfu Zhao

Document clustering partitions sets of unlabeled documents so that documents in clusters share common concepts. A Naive Bayes Classifier (BC) is a simple probabilistic classifier based on applying Bayes’ theorem with strong (naive) independence assumptions. BC requires a small amount of training data to estimate parameters required for classification. Since training data must be labeled, we propose an Iterative Bayes Clustering (IBC) algorithm. To improve IBC performance, we propose combining IBC with a Comparative Advantage-based (CA) initialization method. Experimental results show that our proposal improves performance significantly over classical clustering methods.
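The classifier described above can be sketched in a few lines. This is a plain multinomial naive Bayes with add-one smoothing, given as an assumption-laden illustration of the standard technique; it is not the IBC algorithm proposed by the authors, and the class name `NaiveBayes` and the toy documents are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over token lists, with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.term_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.term_counts[label].update(doc)
        self.vocab = {t for counts in self.term_counts.values() for t in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.term_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # log P(class) plus the sum of log P(term | class); treating the
            # terms as independent given the class is the "naive" assumption.
            score = math.log(n / total_docs)
            for term in doc:
                if term in self.vocab:  # skip terms never seen in training
                    score += math.log((counts[term] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled toy documents.
train_docs = [["goal", "match"], ["match", "team"],
              ["vote", "election"], ["election", "party"]]
train_labels = ["sport", "sport", "politics", "politics"]
model = NaiveBayes().fit(train_docs, train_labels)
print(model.predict(["team", "match"]))  # sport
```

Working in log space avoids numeric underflow when many small probabilities are multiplied.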


2012 ◽  
Vol 5s1 ◽  
pp. BII.S8945 ◽  
Author(s):  
Irena Spasić ◽  
Pete Burnap ◽  
Mark Greenwood ◽  
Michael Arribas-Ayllon

The authors present a system developed for the 2011 i2b2 Challenge on Sentiment Classification, whose aim was to automatically classify sentences in suicide notes using a scheme of 15 topics, mostly emotions. The system combines machine learning with a rule-based methodology. The features used to represent a problem were based on lexico–semantic properties of individual words in addition to regular expressions used to represent patterns of word usage across different topics. A naïve Bayes classifier was trained using the features extracted from the training data consisting of 600 manually annotated suicide notes. Classification was then performed using the naïve Bayes classifier as well as a set of pattern–matching rules. The classification performance was evaluated against a manually prepared gold standard consisting of 300 suicide notes, in which 1,091 out of a total of 2,037 sentences were associated with a total of 1,272 annotations. The competing systems were ranked using the micro-averaged F-measure as the primary evaluation metric. Our system achieved the F-measure of 53% (with 55% precision and 52% recall), which was significantly better than the average performance of 48.75% achieved by the 26 participating teams.
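The reported figures can be checked against the usual definition of the balanced F-measure as the harmonic mean of precision and recall. The sketch below uses the rounded values from the abstract, so it only confirms consistency; the function name `f_measure` is hypothetical.

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall as reported in the abstract.
print(round(f_measure(0.55, 0.52), 4))  # 0.5346, consistent with the reported 53%
```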


2021 ◽  
Vol 5 (4) ◽  
pp. 389
Author(s):  
Muhammad Ikbal ◽  
Septi Andryana ◽  
Ratih Titi Komala Sari

The COVID-19 virus became a pandemic in 2020. Cases have spread across the whole world, reaching 63 million in 190 countries as of November 2020. The general public needs information about the spread of COVID-19. This research produces a system that provides information on the geographic distribution of cases. The case distribution data were also analyzed with the Naive Bayes Classifier method, which uses probability calculations and can therefore classify the COVID-19 status of an area. The study succeeded in providing information on the status of the pandemic based on case data from around the world. The case data serve as training data for the Naive Bayes classifier, which then determines the pandemic status from test data provided by system users. The research helps users learn the pandemic status of an area because it has reliable training data.
Keywords: System, Covid, Naïve Bayes Classifier.


2020 ◽  
Vol 1 (3) ◽  
pp. 185-199
Author(s):  
Khoirul Zuhri ◽  
Nurul Adha Oktarini Saputri

Twitter is a currently popular social media platform where the public is free to comment and write anything. It is not uncommon for the public to comment with harsh words and even hate speech. The 2019 presidential election drew many comments, some praising, some criticizing, and some insulting. Sentiment analysis is needed to extract information from and classify such text. In this study, sentiment analysis is the process of classifying textual documents into two classes, negative and positive sentiment. Opinion data were obtained from Twitter in the form of tweets. The data comprised 3337 tweets, split into 80% training data and 20% test data. Training data is data with known sentiment. This study aims to determine whether a tweet written in Indonesian conveys positive or negative sentiment. The tweets are classified with the naïve Bayes classifier algorithm. The classification results on the test data show that the Naïve Bayes Classifier provides an accuracy of 71%. The accuracy for each sentiment is 71% for positive sentiment and 70% for negative sentiment.

