Naïve Bayes Algorithm for Classification of Student Major’s Specialization

Astia Weni Syaputri; Erno Irwandi; Mustakim Mustakim

doi:10.26714/jichi.v1i1.5570

Naïve Bayes Algorithm for Classification of Student Major’s Specialization

Journal of Intelligent Computing & Health Informatics ◽

10.26714/jichi.v1i1.5570 ◽

2020 ◽

Vol 1 (1) ◽

pp. 17

Author(s):

Astia Weni Syaputri ◽

Erno Irwandi ◽

Mustakim Mustakim

Keyword(s):

Social Sciences ◽

Naive Bayes ◽

Confusion Matrix ◽

High Accuracy ◽

Naïve Bayes ◽

Natural Sciences ◽

Training Data ◽

Average Value ◽

Bayes Algorithm

Choosing a major is an important step in determining a student's specialization. A student placed in the wrong major will feel the effects throughout their subsequent education. SMA Negeri 1 Kampar Timur offers two majors, Natural Sciences and Social Sciences. Students are assigned to a major based on their average grades from semester 3 to semester 5 in Islamic Religious Education, Indonesian, Citizenship Education, English, Natural Sciences, Social Sciences, and Mathematics. The Naive Bayes algorithm can be used to classify students into these majors. For the classification, the data are split into 70% training data and 30% test data. Accuracy is evaluated with a confusion matrix, which yields a fairly high accuracy of 96.19%. Given this accuracy, the Naive Bayes algorithm is well suited to determining the major of students at SMA Negeri 1 Kampar Timur.
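The workflow this abstract describes (a 70/30 split, a Naive Bayes classifier, and accuracy read off a confusion matrix) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Gaussian variant of Naive Bayes, and the two grade features and all data points are synthetic and hypothetical.

```python
import math
import random
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # The small constant keeps every variance strictly positive.
        variances = [sum((x - m) ** 2 for x in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model

def predict(model, row):
    """Return the class with the highest log-posterior for one row."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, v in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def confusion_matrix(actual, predicted, classes):
    """Count (actual, predicted) pairs in a nested dict."""
    matrix = {a: {p: 0 for p in classes} for a in classes}
    for a, p in zip(actual, predicted):
        matrix[a][p] += 1
    return matrix

def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    correct = sum(matrix[c][c] for c in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total

# Hypothetical data: [average science grade, average social studies grade].
rng = random.Random(0)
classes = ["Natural Sciences", "Social Sciences"]
rows, labels = [], []
for _ in range(100):
    rows.append([rng.gauss(85, 3), rng.gauss(75, 3)])
    labels.append(classes[0])
    rows.append([rng.gauss(75, 3), rng.gauss(85, 3)])
    labels.append(classes[1])

# Shuffle, then split 70% training / 30% test.
indices = list(range(len(rows)))
rng.shuffle(indices)
cut = int(0.7 * len(indices))
train_idx, test_idx = indices[:cut], indices[cut:]

model = fit_gaussian_nb([rows[i] for i in train_idx],
                        [labels[i] for i in train_idx])
predicted = [predict(model, rows[i]) for i in test_idx]
matrix = confusion_matrix([labels[i] for i in test_idx], predicted, classes)
print(matrix)
print(f"accuracy = {accuracy(matrix):.2%}")
```

The accuracy printed here reflects only the synthetic data and says nothing about the 96.19% reported in the study.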

Analisis Klasifikasi Kanker Payudara Menggunakan Algoritma Naive Bayes

INFORMAL: Informatics Journal ◽

10.19184/isj.v4i3.14170 ◽

2020 ◽

Vol 4 (3) ◽

pp. 117

Author(s):

Hardian Oktavianto ◽

Rahman Puji Handri

Keyword(s):

Breast Cancer ◽

Naive Bayes ◽

Naïve Bayes ◽

World Health ◽

Average Percentage ◽

Average Value ◽

Treatment Measures ◽

Bayes Algorithm ◽

Health Organization

Breast cancer is one of the leading causes of death among women, ranking second after lung cancer. According to the World Health Organization, 1 million women are diagnosed with breast cancer every year and half of them die. In general this is because early detection and treatment come too late, so the cancer is only detected after it has reached the final stage. In health and medicine, machine learning-based classification has been used to help doctors and health professionals classify types of cancer and determine which treatment measures should be taken. This study classifies breast cancer using the Naive Bayes algorithm to group the types of cancer. The dataset comes from the Wisconsin breast cancer database. The results show that the Naive Bayes algorithm classifies breast cancer well: on average, 96.9% of the data are classified correctly and only 3.1% incorrectly. The effectiveness of classification with Naive Bayes is also high, with average precision and recall of around 0.96. The highest precision and recall, 0.974 and 0.973 respectively, are obtained when the test data use a percentage split of 40%.
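The metrics quoted in this abstract relate to the confusion matrix in a fixed way: the correctly and incorrectly classified percentages always sum to 100%, while precision and recall each look at a single kind of error. A minimal sketch for the binary case follows; the counts are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Derive common metrics from the four cells of a binary confusion matrix."""
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,    # share classified correctly
        "error_rate": (fp + fn) / total,  # share classified incorrectly
        "precision": tp / (tp + fp),      # predicted positives that are right
        "recall": tp / (tp + fn),         # actual positives that are found
    }

# Hypothetical counts, for illustration only.
result = binary_metrics(tp=90, fp=10, fn=5, tn=95)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```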

Application of the Naïve Bayes Algorithm for Student Graduation Analysis

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i4.15.23596 ◽

2018 ◽

Vol 7 (4.15) ◽

pp. 421

Author(s):

Erick Akhmad Fahmi Alfa’izy ◽

Khairil Anam ◽

Naidah Naing ◽

Rosanita Tritias Utami ◽

Nur Anim Jauhariyah ◽

...

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Training Data ◽

College System ◽

Student Graduation ◽

Bayes Algorithm ◽

Using Data ◽

Analysis System ◽

Law Student

This study designs an analysis system that predicts graduation by comparing previous data with existing data, in order to overcome errors in a college system. Data records that are already available are processed using the naïve Bayes algorithm. The research was conducted at Universitas Maarif Hasyim Latif, and its object is to analyze student data with the naïve Bayes algorithm to determine graduation. Data from previous Faculty of Law students are used as training data, drawn from the records available at the Directorate of Information Systems. The naïve Bayes algorithm can be used to classify data in string or textual form; this is based on the researchers' trials with example calculations done previously. To evaluate the classification results of the graduation analysis, the naïve Bayes algorithm is tested on a data sample by comparing training data with testing data. The calculations give an accuracy of 77.78%.

Comparison Analysis of K-Nearest Neighbor and Naïve Bayes in Determining Talent of Adolescence

International Journal of Artificial Intelligence Research ◽

10.29099/ijair.v4i1.118 ◽

2020 ◽

Vol 4 (1) ◽

Author(s):

Yessi Jusman ◽

Widdya Rahmalina ◽

Juni Zarman

Keyword(s):

Nearest Neighbor ◽

Naive Bayes ◽

Confusion Matrix ◽

Naïve Bayes ◽

Training Data ◽

K Nearest Neighbor ◽

Combined Training ◽

Testing Data ◽

Bayes Algorithm ◽

Children's Interests

Adolescents are always searching for an identity to shape their personality and character. This paper aims to use artificial intelligence analysis to determine the talents of adolescents. The study uses a sample of children aged 10-18 years, with testing data consisting of 100 respondents. The algorithms used for the analysis are K-Nearest Neighbor and Naive Bayes, and the analysis compares the classification accuracy of both. To find the more accurate algorithm for determining children's interests and talents, accuracy is measured with a confusion matrix in the RapidMiner software on training data, testing data, and combined training and testing data. This study concludes that the K-Nearest Neighbor algorithm is better than Naive Bayes in terms of classification accuracy.

Klasifikasi Opini Masyarakat Terhadap Jasa Ekspedisi JNE dengan Naïve Bayes

JURNAL SISTEM INFORMASI BISNIS ◽

10.21456/vol8iss1pp92-98 ◽

2018 ◽

Vol 8 (1) ◽

pp. 92

Author(s):

Fithri Selva Jumeilah

Keyword(s):

Naive Bayes ◽

Probability Model ◽

Confusion Matrix ◽

Naïve Bayes ◽

Service Users ◽

Average Percentage ◽

Online Sales ◽

Bayes Algorithm ◽

Using Data

The large number of online sales transactions has increased the number of delivery service users. One of the companies engaged in delivery services in Indonesia is Tiki Nugraha Ekakurir, better known as JNE. Currently, JNE serves 14,000,000 users per month. JNE uses many media to communicate with its customers, one of which is Twitter. The JNECare account has 108,000 followers and 375,000 tweets. This large number of comments can be used to gauge what people think of JNE, with each comment falling into a negative, positive, or neutral category. To simplify the grouping of comments, the data are classified using the Naïve Bayes method available in RStudio. The data consist of 1,725 tweets collected from the internet, divided into 70% training data (1,208 tweets) and 30% testing data (517 tweets). Before classification, the data go through preprocessing: converting all letters to lowercase and removing characters other than letters and spaces (case folding), tokenizing words, and removing common words (stopword removal). After cleaning, the data are labeled manually one by one and can then be used in training to obtain the probability model for each category. The probabilities are obtained with the Naïve Bayes algorithm. The trained model is then applied to the testing data, and the predicted categories are evaluated with a confusion matrix to calculate accuracy, precision, and recall. The classification results show that Naïve Bayes was able to classify the JNE comments well, as evidenced by an average accuracy of 85%, precision of 78%, and recall of 67%.
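The preprocessing and training steps in this abstract (case folding, tokenizing, stopword removal, then a per-category probability model) can be sketched as follows. The study itself used RStudio; this Python version is only an illustration under stated assumptions: a multinomial Naive Bayes with add-one smoothing, a hypothetical English stopword list, and a handful of invented example comments.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stopword list; a real one would be far longer.
STOPWORDS = {"the", "is", "a", "my", "was", "and", "to"}

def preprocess(text):
    """Case-fold, drop everything except letters and spaces, tokenize, remove stopwords."""
    text = re.sub(r"[^a-z ]", " ", text.lower())
    return [token for token in text.split() if token not in STOPWORDS]

def train(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        tokens = preprocess(text)
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def classify(model, text):
    """Pick the class with the highest log-posterior (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)
        for token in preprocess(text):
            if token in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][token] + 1)
                                  / (total_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Invented, manually labeled comments standing in for the tweet data.
training_data = [
    ("Package arrived fast, great service!", "positive"),
    ("Courier was friendly and fast", "positive"),
    ("My package is late again", "negative"),
    ("Terrible service, package lost", "negative"),
    ("Where is the nearest branch?", "neutral"),
    ("What time does the branch open?", "neutral"),
]
model = train(training_data)
print(preprocess("My Package is LATE!!! #2"))
print(classify(model, "fast and friendly courier"))
print(classify(model, "package lost, terrible"))
```

Predictions from a model like this would then be compared with the manual labels of the held-out 30% to fill the confusion matrix.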

Determining Bullying Text Classification Using Naive Bayes Classification on Social Media

Jurnal Varian ◽

10.30812/varian.v4i2.1086 ◽

2021 ◽

Vol 4 (2) ◽

pp. 133-140

Author(s):

Ade Clinton Sitepu ◽

Wanayumini Wanayumini ◽

Zakarias Situmorang

Keyword(s):

Social Media ◽

Naive Bayes ◽

Rapid Development ◽

Confusion Matrix ◽

Area Under The Curve ◽

Naïve Bayes ◽

Cyber Bullying ◽

Training Data ◽

The Media ◽

Bayes Algorithm

Cyber-bullying includes repeated acts intended to scare, anger, or embarrass those who are targeted. It has spread along with the rapid development of technology and social media in society. The media and users need to filter out bullying comments because they can indirectly affect the mental health of those who read them, especially when the comments are aimed directly at that person. By utilizing information mining, the system is expected to be able to classify information circulating in the community. One of the classification techniques that can be applied to text-based classification is Naïve Bayes, an algorithm that performs well at classification. In this research, the precision of the algorithm was measured on a dataset of 1000 comments. The data are first labeled manually as "bully" or "not bully" and then divided into training data and test data. To test the system's ability, the classified data are analyzed using the confusion matrix method. The results showed that the Naïve Bayes algorithm achieved a precision of 87% and an area under the curve (AUC) of 88%. In terms of speed, the Naïve Bayes algorithm performs very well, with a completion time of 0.033 seconds.
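The area under the curve reported in this abstract can be read as the probability that a randomly chosen "bully" comment receives a higher classifier score than a randomly chosen "not bully" comment. A minimal sketch of that pairwise definition follows; the scores and labels are hypothetical, and the study does not say how its AUC was computed.

```python
def auc(scores, labels):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Labels are 1 (positive) or 0 (negative).
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

# Hypothetical classifier scores for four comments (1 = "bully").
example_scores = [0.9, 0.8, 0.4, 0.3]
example_labels = [1, 0, 1, 0]
print(auc(example_scores, example_labels))
```

This quadratic-time version is meant for clarity; sorting-based formulations give the same value on larger datasets.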

Perbandingan Optimasi Feature Selection pada Naïve Bayes untuk Klasifikasi Kepuasan Airline Passenger

Jurnal RESTI (Rekayasa Sistem dan Teknologi Informasi) ◽

10.29207/resti.v5i3.3086 ◽

2021 ◽

Vol 5 (3) ◽

pp. 527-533

Author(s):

Yoga Religia ◽

Amali Amali

Keyword(s):

Feature Selection ◽

Customer Satisfaction ◽

Naive Bayes ◽

Naïve Bayes ◽

Point Of View ◽

Classification Model ◽

Passenger Satisfaction ◽

Airline Passenger ◽

Bayes Algorithm

The quality of an airline's services cannot be measured from the company's point of view, but must be seen from the point of view of customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated outstanding classification accuracy, but its independence assumption is currently rarely discussed. Some literature suggests attribute weighting to relax the independence assumption, which can be done using particle swarm optimization (PSO) and genetic algorithm (GA) through feature selection. This study compares PSO and GA optimization on Naïve Bayes for the classification of Airline Passenger Satisfaction data taken from www.kaggle.com. In testing, the best performance comes from the Naïve Bayes model with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.

Analysis and Classification of Danger Level in Android Applications Using Naive Bayes Algorithm

2018 6th International Conference on Information and Communication Technology (ICoICT) ◽

10.1109/icoict.2018.8528733 ◽

2018 ◽

Author(s):

Ridho Alif Utama ◽

Parman Sukarno ◽

Erwid Musthofa Jadied

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Android Applications ◽

Bayes Algorithm ◽

Danger Level

SENTIMEN ANALISIS KEBIJAKAN GANJIL GENAP DI TOL BEKASI MENGGUNAKAN ALGORITMA NAIVE BAYES DENGAN OPTIMALISASI INFORMATION GAIN

Jurnal Pilar Nusa Mandiri ◽

10.33480/pilar.v15i2.705 ◽

2019 ◽

Vol 15 (2) ◽

pp. 247-254

Author(s):

Heru Sukma Utama ◽

Didi Rosiyadi ◽

Dedi Aridarma ◽

Bobby Suryo Prakoso

Keyword(s):

Social Media ◽

Opinion Mining ◽

Naive Bayes ◽

Information Gain ◽

Confusion Matrix ◽

Naïve Bayes ◽

Support Vector ◽

Toll Road ◽

Textual Data ◽

Bayes Algorithm

Sentiment analysis of the odd-even system on the Bekasi toll road using the Naïve Bayes algorithm is a process of understanding, extracting, and processing textual data automatically from social media. The purpose of this study was to determine the accuracy, recall, and precision of opinion mining with the Naïve Bayes algorithm, in order to provide information on community sentiment on social media toward the effectiveness of the odd-even system on the Bekasi toll road. The research method was text mining of comments on posts about the odd-even system on the Bekasi toll road on Twitter, Instagram, YouTube, and Facebook. The steps are preprocessing, transformation, data mining, and evaluation, followed by information gain feature selection, select by weight, and application of the NB algorithm model. The NB model yields confusion matrix results of 79.55% accuracy, 80.51% precision, and 80.91% sensitivity (recall). Thus this study concludes that the use of Support Vector Machine Algorithms can analyze even odd sentiments on the Bekasi toll road.
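Information gain feature selection, mentioned in this abstract, scores each term by how much knowing whether it appears in a comment reduces uncertainty about the sentiment label; terms can then be ranked and kept by weight. A minimal sketch follows, assuming documents are represented as sets of tokens. The tiny corpus is hypothetical and in English for readability.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(docs, labels, term):
    """Entropy of the labels minus the entropy left after splitting on `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    remainder = sum(len(part) / len(labels) * entropy(part)
                    for part in (with_term, without_term) if part)
    return entropy(labels) - remainder

# Hypothetical tokenized comments and their sentiment labels.
docs = [
    {"toll", "traffic", "smooth"},
    {"toll", "smooth", "faster"},
    {"toll", "traffic", "jam"},
    {"toll", "jam", "worse"},
]
labels = ["positive", "positive", "negative", "negative"]

# Rank every term by information gain, highest first.
vocabulary = sorted(set().union(*docs))
ranked = sorted(vocabulary,
                key=lambda t: information_gain(docs, labels, t),
                reverse=True)
for term in ranked:
    print(term, round(information_gain(docs, labels, term), 3))
```

In this toy corpus, "toll" appears in every comment and therefore carries no information about sentiment, while "smooth" and "jam" separate the classes completely.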

Prediction of Solid Garbage Waste Generation in Smart Cities using Naive Bayes Algorithm

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.b1031.1292s19 ◽

2019 ◽

Vol 9 (2S) ◽

pp. 53-56

Keyword(s):

Naive Bayes ◽

Learning Algorithm ◽

Smart Cities ◽

Confusion Matrix ◽

Daily Basis ◽

Naïve Bayes ◽

Human Beings ◽

Waste Generation ◽

Future Prediction ◽

Bayes Algorithm

Smart cities are becoming overcrowded, making human life miserable and prone to more challenges on a daily basis. Overcrowding leads to vast generation of waste, which contributes to air pollution and in turn affects health, causing various diseases. Even though various measures are taken to recycle waste, the rate at which it is produced keeps rising. This paper deals with predicting waste generation using the Naïve Bayes machine learning algorithm (classifier) based on statistics from previous waste datasets. The datasets used for the prediction are obtained from reliable sources. The algorithm is implemented in PySpark using Anaconda Jupyter. The performance of the classifier on the datasets is analyzed with a confusion matrix, and the accuracy metric is used to rate the efficiency of the classifier. The accuracy obtained indicates that the algorithm can be used effectively for real-time prediction and that, based on the independence assumption, it gives more accurate results for huge input datasets.

Attribute Selection in Naive Bayes Algorithm Using Genetic Algorithms and Bagging for Prediction of Liver Disease

JOURNAL OF INFORMATICS AND TELECOMMUNICATION ENGINEERING ◽

10.31289/jite.v4i1.3793 ◽

2020 ◽

Vol 4 (1) ◽

pp. 76-85

Author(s):

Dwi Yuni Utami ◽

Elah Nurlelah ◽

Noer Hikmah

Keyword(s):

Genetic Algorithms ◽

Liver Disease ◽

Naive Bayes ◽

Confusion Matrix ◽

Naïve Bayes ◽

Attribute Selection ◽

World Health ◽

The Difference ◽

Bayes Algorithm ◽

Health Organization

Liver disease is an inflammatory disease of the liver that can leave the liver unable to function normally and can even cause death. According to WHO (World Health Organization) data, almost 1.2 million people per year, especially in Southeast Asia and Africa, have died from liver disease. The problem that usually occurs is the difficulty of recognizing liver disease early on, even when the disease has spread. This study aims to compare and evaluate the plain Naive Bayes algorithm against Naive Bayes combined with a Genetic Algorithm (GA) and Bagging, to find out which has higher accuracy in predicting liver disease, using a dataset taken from the UCI Machine Learning Repository (University of California, Irvine). Evaluation with both the confusion matrix and the ROC curve showed that Naive Bayes optimized with the Genetic Algorithm and Bagging has a higher accuracy than Naive Bayes alone. The accuracy of the Naive Bayes model is 66.66%, and the accuracy of the Naive Bayes model with attribute selection using Genetic Algorithms and Bagging is 72.02%, a difference of 5.36%. Keywords: Liver Disease, Naïve Bayes, Genetic Algorithms, Bagging.
