A novel way to classify passenger data using Naïve Bayes algorithm (A real time anti-terrorism approach)

Author(s):  
Saurabh Singh ◽  
Shashikant Verma ◽  
Akhilesh Tiwari ◽  
Aditya Tiwari
Author(s):  
Ms. Shama Kabeer

Abstract: Cyberbullying is an online form of harassment. It involves posting, commenting on, sending, or distributing personal, derogatory, false, or malicious content about others in order to shame or humiliate them, with the goal of causing harm. Once such content is published on the internet, it remains accessible indefinitely. This activity is considered unlawful, and it is more widespread among children and teenagers. Cyberbullying is an online epidemic that can lead to devastating outcomes such as violence and suicide, so it must be dealt with swiftly and properly. To detect bullying behavior in textual messages, a real-time cyberbullying detection system based on a machine learning method, the Naïve Bayes algorithm, is presented. The model was created to determine whether a tweet is bullying or non-bullying in nature. It is also intended to help victims deal with bullying without revealing their identities. Keywords: Machine Learning, Cyberbullying, Naïve Bayes, Cybercrimes, Cyberbullying Detection
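As a rough illustration of how a Naïve Bayes text classifier of this kind can label a tweet, the following is a minimal sketch of a multinomial Naïve Bayes model with add-one smoothing. It is not the authors' implementation: the function names, the whitespace tokenizer, and the six toy tweets are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_docs[label] += 1
        for tok in text.lower().split():
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_docs, word_counts, vocab

def predict_nb(model, text):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Invented toy data, for illustration only.
TOY_TWEETS = [
    ("you are so stupid and ugly", "bullying"),
    ("nobody likes you loser", "bullying"),
    ("shut up you idiot", "bullying"),
    ("great game last night", "non-bullying"),
    ("happy birthday my friend", "non-bullying"),
    ("thanks for the help today", "non-bullying"),
]
model = train_nb(TOY_TWEETS)
```

Calling `predict_nb(model, "you are an idiot")` scores the text against both classes and returns the more probable label. A real system would need a far larger labeled corpus and proper text preprocessing.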


2020 ◽  
Vol 4 (2) ◽  
pp. 362-369
Author(s):  
Sharazita Dyah Anggita ◽  
Ikmah

Community demand for freight forwarding services is increasing with the growth of online marketplaces. The public currently voices opinions about freight forwarding services through many channels, one of which is the social media platform Twitter. Sentiment analysis reveals whether an opinion tends to be positive or negative. Methods applicable to sentiment analysis include the Naive Bayes algorithm and the Support Vector Machine (SVM). This research implements both algorithms, optimized with the PSO algorithm, for sentiment analysis. Testing is done by setting PSO parameters for each classifier. The results show an accuracy increase of 15.11% for the PSO-optimized Naive Bayes algorithm and an accuracy improvement of 1.74% for the PSO-based SVM algorithm with the sigmoid kernel.


2020 ◽  
Vol 4 (3) ◽  
pp. 504-512
Author(s):  
Faried Zamachsari ◽  
Gabriel Vangeran Saragih ◽  
Susafa'ati ◽  
Windu Gata

The decision to move Indonesia's capital city to East Kalimantan received mixed responses on social media. The still-high poverty rate and the country's difficult finances are factors in disapproval of the relocation of the national capital. Twitter, as one of the popular social media platforms, is used by the public to express these opinions. This study aims to determine the tendency of community responses to the move of the national capital, and to perform sentiment analysis of public opinion on the move using the Naive Bayes algorithm and Support Vector Machine with feature selection to obtain the highest accuracy. The sentiment analysis data are Indonesian-language public opinions crawled from Twitter tweets. The search terms used are #IbuKotaBaru and #PindahIbuKota. The stages of the research consist of collecting data from Twitter, polarity labeling, and preprocessing, which comprises case transformation, cleansing, tokenizing, filtering, and stemming. Feature selection is applied to increase accuracy, and the data are then divided into training and testing sets at predetermined ratios. The next step is a comparison between the Support Vector Machine and Naive Bayes methods to determine which is more accurate. In the data period studied, 24.26% of sentiment about the move to a new capital city was positive and 75.74% was negative. Using RapidMiner software, the best accuracy for Naive Bayes with feature selection is 88.24% at a ratio of 9:1, while the best accuracy for Support Vector Machine with feature selection is 78.77% at a ratio of 5:5.
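The preprocessing stages named above (case transformation, cleansing, tokenizing, filtering, stemming) and the train/test ratio split can be sketched as follows. This is only an illustration under stated assumptions: the study used RapidMiner, not this code, and the stopword list, the suffix-stripping "stemmer", and all function names here are hypothetical stand-ins (a real Indonesian stemmer is considerably more involved).

```python
import re

# Hypothetical, deliberately tiny resources for illustration only.
STOPWORDS = {"yang", "dan", "di", "ke", "dari"}
SUFFIXES = ("nya", "lah", "kah")

def strip_suffix(token):
    """Crude stand-in for stemming: drop one known suffix if enough remains."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(tweet):
    text = tweet.lower()                                # transform case
    text = re.sub(r"https?://\S+|[@#]\w+", " ", text)   # cleansing: URLs, mentions, hashtags
    text = re.sub(r"[^a-z\s]", " ", text)               # cleansing: punctuation, digits
    tokens = text.split()                               # tokenizing
    tokens = [t for t in tokens if t not in STOPWORDS]  # filtering
    return [strip_suffix(t) for t in tokens]            # stemming (toy)

def split_by_ratio(items, train_parts, test_parts):
    """Split a list into training and testing portions, e.g. 9:1 or 5:5."""
    cut = len(items) * train_parts // (train_parts + test_parts)
    return items[:cut], items[cut:]

tokens = preprocess("Ibu kotanya PINDAH ke Kalimantan! https://example.com #PindahIbuKota")
train, test = split_by_ratio(list(range(10)), 9, 1)
```

Whether hashtags should be removed or kept as features is a design choice the abstract does not specify; here they are removed as part of cleansing.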


2020 ◽  
Vol 41 (S1) ◽  
pp. s367-s368
Author(s):  
Michael Korvink ◽  
John Martin ◽  
Michael Long

Background: The Bundled Payment Care Improvement Program is a CMS initiative designed to encourage greater collaboration across settings of care, especially as it relates to an initial set of targeted clinical episodes, which include sepsis and pneumonia. As with many CMS incentive programs, performance evaluation is retrospective in nature, resulting in after-the-fact changes in operational processes to improve both efficiency and quality. Although retrospective performance evaluation is informative, care providers would ideally identify a patient’s potential clinical cohort during the index stay and implement care management procedures as necessary to prevent or reduce the severity of the condition. The primary challenge for real-time identification of a patient’s clinical cohort is that CMS-targeted cohorts are based on either MS-DRG (grouping of ICD-10 codes) or HCPCS coding, which occurs after discharge and is performed by clinical abstractors. Additionally, many informative data elements in the EHR lack standardization, and no simple and reliable heuristic rules can be employed to meaningfully identify those cohorts without human review. Objective: To share the results of an ensemble statistical model to predict patient risks of sepsis and pneumonia during their hospital (ie, index) stay. Methods: The predictive model uses a combination of Bernoulli Naïve Bayes natural language processing (NLP) classifiers, to reduce text dimensionality into a single probability value, and an eXtreme Gradient Boosting (XGBoost) algorithm as a meta-model to collectively evaluate standardized clinical elements alongside the NLP-based text probabilities. Results: Bernoulli Naïve Bayes classifiers have proven to perform well on short text strings and allow highly explanatory unstructured or semistructured text fields (eg, reason for visit, culture results) to be used in a way that is both comparative and generalizable within the larger XGBoost model.
Conclusions: The choice of XGBoost as the meta-model has the benefits of mitigating concerns of nonlinearity among clinical features and reducing the potential for overfitting, while allowing missing values to exist within the data. Both the Bayesian classifier and meta-model were trained using a patient-level integrated dataset extracted from both a patient-billing and EHR data warehouse maintained by Premier. The data set, joined by patient admission date, medical record number, date of birth, and hospital entity code, allows the coded clinical cohort (derived from the MS-DRG) and the explanatory features in the EHR to exist within a single patient encounter record. The resulting model produced F1 performance scores of .65 for the sepsis population and .61 for the pneumonia population. Funding: None. Disclosures: None.
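To make the first stage of the ensemble concrete, the sketch below shows how a Bernoulli Naïve Bayes classifier can collapse a short text field into a single probability value, which could then serve as one input column to a meta-model. It is a minimal illustration and not the authors' code: the function names, the smoothing value, and the four toy "reason for visit" strings are invented, and the XGBoost stage is not shown.

```python
import math

def fit_bernoulli_nb(docs, labels, alpha=1.0):
    """Estimate per-class word-presence probabilities with Laplace smoothing."""
    vocab = sorted({tok for d in docs for tok in d.lower().split()})
    model = {"vocab": vocab, "log_prior": {}, "p_word": {}}
    for c in set(labels):
        class_docs = [set(d.lower().split()) for d, y in zip(docs, labels) if y == c]
        n = len(class_docs)
        model["log_prior"][c] = math.log(n / len(docs))
        model["p_word"][c] = {
            w: (sum(w in d for d in class_docs) + alpha) / (n + 2 * alpha)
            for w in vocab
        }
    return model

def positive_probability(model, text, positive=1):
    """Posterior probability of the positive class for one text field."""
    present = set(text.lower().split())
    log_joint = {}
    for c, log_prior in model["log_prior"].items():
        total = log_prior
        for w in model["vocab"]:
            p = model["p_word"][c][w]
            # Bernoulli model: absent vocabulary words also contribute evidence.
            total += math.log(p if w in present else 1.0 - p)
        log_joint[c] = total
    top = max(log_joint.values())  # subtract the max for numerical stability
    norm = sum(math.exp(v - top) for v in log_joint.values())
    return math.exp(log_joint[positive] - top) / norm

# Invented toy "reason for visit" strings; 1 = target cohort, 0 = other.
docs = ["fever chills confusion", "fever low blood pressure",
        "ankle sprain", "routine checkup"]
labels = [1, 1, 0, 0]
nb = fit_bernoulli_nb(docs, labels)
text_feature = positive_probability(nb, "fever and chills")
```

In a stacked design of the kind described, `text_feature` would sit alongside the standardized clinical elements in the feature row passed to the meta-model.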


2020 ◽  
Vol 1 (2) ◽  
pp. 61-66
Author(s):  
Febri Astiko ◽  
Achmad Khodar

This study aims to design a machine learning sentiment analysis model for Indosat Ooredoo service reviews on the social media platform Twitter, using the Naive Bayes algorithm to classify reviews as positive or negative. The sentiment analysis uses machine learning to obtain patterns and a model that can be reused to predict new data.


Author(s):  
Lingchong Jia ◽  
B. Santhosh Kumar ◽  
R. Parthasarathy

Nowadays, artificial intelligence technology is applied effectively and successfully in various educational institutions, where it improves learning and students' academic performance. The conventional education approach faces several challenges that lead to lower learning performance: students' dependence on teachers for all study resources, the unavailability of professional instructors, and a greater focus on conditioned learning than on practical usefulness. In this paper, an integrated teaching-learning model approach (ITLMA) using artificial intelligence in student education is proposed. It aims to speed up the fulfillment of education targets by reducing barriers to entry, automating management processes, and maximizing learning performance. The proposed ITLMA method uses the naive Bayes algorithm to evaluate student ranking from a class score, task score, project score, and final exam. The artificial intelligence-based ITLMA with the naive Bayes algorithm achieves a high accuracy ratio of 80.1% with a lower error ratio of 15.7%, a high prediction ratio of 88.2%, and a precision of 98.2%, and it improves student and teacher interaction compared to other existing methods.


2021 ◽  
Vol 5 (3) ◽  
pp. 527-533
Author(s):  
Yoga Religia ◽  
Amali Amali

The quality of an airline's services cannot be measured from the company's point of view, but must be seen from the point of view of customer satisfaction. Data mining techniques make it possible to predict airline customer satisfaction with a classification model. The Naïve Bayes algorithm has demonstrated outstanding classification accuracy, but its independence assumption is currently rarely discussed. Some literature suggests using attribute weighting to relax the independence assumption, which can be done with particle swarm optimization (PSO) and genetic algorithm (GA) through feature selection. This study compared PSO and GA optimization of Naïve Bayes for the classification of Airline Passenger Satisfaction data taken from www.kaggle.com. After testing, the best performance was obtained from the Naïve Bayes model with PSO optimization, with an accuracy of 86.13%, a precision of 87.90%, a recall of 87.29%, and an AUC of 0.923.
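For readers less familiar with the reported measures, the sketch below shows how accuracy, precision, and recall follow from a classifier's predictions. The function name, the label strings, and the eight toy predictions are assumptions for illustration and are unrelated to the study's data; AUC is omitted because it requires predicted scores rather than hard labels.

```python
def classification_metrics(y_true, y_pred, positive="satisfied"):
    """Compute accuracy, precision, and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    tn = len(pairs) - tp - fp - fn                               # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Invented toy labels, for illustration only.
y_true = ["satisfied", "satisfied", "satisfied", "dissatisfied",
          "dissatisfied", "satisfied", "dissatisfied", "dissatisfied"]
y_pred = ["satisfied", "satisfied", "dissatisfied", "dissatisfied",
          "satisfied", "satisfied", "dissatisfied", "dissatisfied"]
metrics = classification_metrics(y_true, y_pred)
```

Precision answers "of the passengers predicted satisfied, how many were?", while recall answers "of the passengers who were satisfied, how many were found?".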

