Event classification from the Urdu language text on social media

Towards Improving Offline Signature Verification based Authentication Using Machine Learning Classifiers

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.j9910.0981119 ◽ 2019 ◽ Vol 8 (11) ◽ pp. 3393-3401

Keyword(s): Machine Learning ◽ Sample Size ◽ Nearest Neighbor ◽ Turnaround Time ◽ Signature Verification ◽ Support Vector ◽ Paper Machine ◽ K Nearest Neighbor ◽ Machine Learning Classifiers ◽ Learning Classifiers

Signatures have long been accepted in commercial transactions as a method of authentication. Digitizing credentials reduces the storage space required for the same information from a few cubic inches to a number of bytes on a server. The most frequent use of offline signature authentication is to reduce the turnaround time for cheque clearance. In this paper, machine learning classifiers are used to verify signatures using four image-based features. The BHsig260 dataset (Bangla and Hindi) is used, with signatures from 55 users each for Hindi and Bangla. Six classifiers are used: Boosted Tree, Random Forest classifier (RFC), K-nearest neighbor (KNN), Multilayer Perceptron (MLP), Support Vector Machine (SVM), and Naive Bayes. Results for the writer-independent model show that the accuracy of Hindi offline signature verification is 72.3% using MLP with a signature sample size of 20, and that of Bangla is 79% using RFC with a signature sample size of 23. In the user-dependent model, an accuracy of more than 92% was achieved for some users using KNN and SVM.

Comparative Analysis of Intrusion Detection Attack Based on Machine Learning Classifiers

Indian Journal of Artificial Intelligence and Neural Networking ◽ 10.54105/ijainn.b1025.041221 ◽ 2021 ◽ pp. 22-28

Author(s): Surafel Mehari Atnafu ◽ Prof (Dr.) Anuja Kumar Acharya

Keyword(s): Machine Learning ◽ Intrusion Detection ◽ Intrusion Detection System ◽ Nearest Neighbor ◽ Detection System ◽ Machine Learning Techniques ◽ K Nearest Neighbor ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ Learning Techniques

Today, information is transmitted from one place to another using network communication technology. Because of this, networked systems require a highly secure environment. The main strategy for securing this environment is to correctly identify each packet and detect whether it contains malicious content or whether any illegal activity has occurred in the network. An intrusion detection system (IDS) is used for this purpose. Intrusion detection is a security technology designed to detect intrusions and automatically alert or notify a responsible person. However, creating an efficient intrusion detection system faces a number of challenges, notably false detections and data with a high number of features. Many researchers currently use machine learning techniques to overcome the limitations of intrusion detection and to improve its efficiency in correctly identifying whether a packet is normal or malicious. Many machine learning techniques are used in intrusion detection; the question is which machine learning classifiers have the potential to address the intrusion detection problem in a network security environment. Choosing appropriate machine learning techniques is required to improve the accuracy of an intrusion detection system. In this work, three machine learning classifiers are analyzed: Support Vector Machine, Naïve Bayes, and K-Nearest Neighbor. These algorithms were tested on the NSL-KDD dataset using a combination of Chi-square and Extra Tree feature selection methods, and Python was used to implement, analyze, and evaluate the classifiers. Experimental results show that the K-Nearest Neighbor classifier outperforms the other methods in categorizing a packet as either normal or malicious.
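
As an illustration of the K-Nearest Neighbor classification described in this abstract, the following is a minimal standard-library sketch, not the authors' implementation. The function name `knn_predict`, the two numeric features, and the toy records are assumptions made for the example; the abstract does not give the paper's features, distance metric, or value of k.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Rank training points by Euclidean distance to the query.
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: math.dist(pair[0], query))
    # Count the labels of the k closest points and return the most common one.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical two-feature records scaled to [0, 1]; not taken from NSL-KDD.
train_X = [(0.10, 0.00), (0.20, 0.10), (0.15, 0.05),
           (0.90, 0.80), (0.80, 0.90), (0.95, 0.85)]
train_y = ["normal", "normal", "normal",
           "malicious", "malicious", "malicious"]

print(knn_predict(train_X, train_y, (0.12, 0.02)))  # normal
print(knn_predict(train_X, train_y, (0.85, 0.90)))  # malicious
```

In practice the features would first be reduced by a feature selection step such as the Chi-square and Extra Tree combination the abstract mentions; that step is omitted here.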

Physiological Stress Prediction using Machine Learning Classifiers

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽ 10.35940/ijitee.a4556.119119 ◽ 2019 ◽ Vol 9 (1) ◽ pp. 675-677

Keyword(s): Machine Learning ◽ Nearest Neighbor ◽ Physiological Stress ◽ Classification Algorithms ◽ K Nearest Neighbor ◽ Nearest Neighbor Algorithm ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ Stress Prediction ◽ K Nearest Neighbor Algorithm

The aim of this study is to predict the stress of a person using machine learning classifiers. The system classifies a person's stress as either High or Low. Of the various classification algorithms available, nine were chosen for this study: the K-Nearest Neighbor classifier, Support Vector Machine with an RBF kernel, Decision Tree, Random Forest, Bagging classifier, AdaBoost, Voting classifier, Logistic Regression, and MLP classifier. All algorithms are applied to the same dataset, which was obtained from a GitHub repository labelled Stress classifier with AutoML. The accuracy of each algorithm is measured, and the classification algorithm with the best accuracy is determined. On comparison, the K-Nearest Neighbor algorithm was found to have the best accuracy, at 79.3%, for physiological stress prediction. While the other algorithms had varying accuracies, the K-Nearest Neighbor algorithm was the most consistent.
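
The comparison procedure in this abstract (apply several classifiers to the same data, measure accuracy, pick the best) can be sketched as below. This is only an illustrative harness: the rule-based "classifiers", the feature tuples, and the names `accuracy` and `best_classifier` are hypothetical stand-ins, not the nine algorithms or the dataset used in the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def best_classifier(classifiers, X_test, y_test):
    """Score every classifier on the same test set and return the best one."""
    scores = {
        name: accuracy(y_test, [predict(x) for x in X_test])
        for name, predict in classifiers.items()
    }
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical samples: (heart rate in bpm, normalized skin conductance).
X_test = [(62, 0.2), (95, 0.7), (110, 0.9), (70, 0.3), (88, 0.8)]
y_test = ["Low", "High", "High", "Low", "High"]

# Stand-in prediction functions; real classifiers would be trained models.
classifiers = {
    "heart_rate_rule": lambda x: "High" if x[0] > 90 else "Low",
    "conductance_rule": lambda x: "High" if x[1] > 0.5 else "Low",
    "always_low": lambda x: "Low",
}

best, scores = best_classifier(classifiers, X_test, y_test)
print(best, scores)
```

Evaluating every candidate on the same held-out samples keeps the accuracy figures directly comparable, which is the point of applying all algorithms to one dataset.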

A textual-based featuring approach for depression detection using machine learning classifiers and social media texts

Computers in Biology and Medicine ◽ 10.1016/j.compbiomed.2021.104499 ◽ 2021 ◽ pp. 104499

Author(s): Raymond Chiong ◽ Gregorius Satia Budhi ◽ Sandeep Dhakal ◽ Fabian Chiong

Keyword(s): Machine Learning ◽ Social Media ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ Depression Detection ◽ Media Texts

A Hadoop Based Framework Integrating Machine Learning Classifiers for Anomaly Detection in the Internet of Things

Electronics ◽ 10.3390/electronics10161955 ◽ 2021 ◽ Vol 10 (16) ◽ pp. 1955

Author(s): Ikram Sumaiya Thaseen ◽ Vanitha Mohanraj ◽ Sakthivel Ramachandran ◽ Kishore Sanapala ◽ Sang-Soo Yeo

Keyword(s): Machine Learning ◽ Internet Of Things ◽ Experimental Analysis ◽ Parameter Tuning ◽ Computational Time ◽ Support Vector ◽ Learning Approaches ◽ K Nearest Neighbor ◽ Machine Learning Classifiers ◽ Learning Classifiers

In recent years, different botnet variants have been targeting government and private organizations, and there is a crucial need to develop a robust framework for securing the IoT (Internet of Things) network. In this paper, a Hadoop-based framework is proposed to identify malicious IoT traffic using a modified Tomek-link under-sampling integrated with automated hyper-parameter tuning of machine learning classifiers. The novelty of this paper is the use of a big data platform for benchmark IoT datasets to minimize computational time. The IoT benchmark datasets are loaded into the Hadoop Distributed File System (HDFS) environment. Three machine learning approaches, namely naive Bayes (NB), K-nearest neighbor (KNN), and support vector machine (SVM), are used for categorizing IoT traffic. Artificial immune network optimization is deployed during cross-validation to obtain the best classifier parameters. Experimental analysis is performed on the Hadoop platform. Average accuracies of 99% and 90% are obtained for the BoT_IoT and ToN_IoT datasets, respectively. The accuracy difference for the ToN-IoT dataset is due to the huge number of data samples captured at the edge layer and fog layer. However, for the BoT-IoT dataset, only 5% of the training and test samples from the complete dataset, as released by the dataset developers, are considered for experimental analysis. The overall accuracy is improved by 19% in comparison with state-of-the-art techniques. The computational times for the huge datasets are reduced by 3–4 hours through MapReduce in HDFS.
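
For readers unfamiliar with Tomek-link under-sampling, the sketch below shows the standard (unmodified) technique on a single machine: two samples of opposite classes form a Tomek link when each is the other's nearest neighbor, and the majority-class member of each link is dropped. The paper's modified variant and its Hadoop integration are not described in the abstract, so this is not the authors' method; the function names and toy points are assumptions.

```python
import math

def nearest_index(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    best, best_dist = None, math.inf
    for j, p in enumerate(points):
        if j != i:
            d = math.dist(points[i], p)
            if d < best_dist:
                best, best_dist = j, d
    return best

def tomek_links(points, labels):
    """Pairs (i, j), i < j, of mutual nearest neighbors with different labels."""
    links = []
    for i in range(len(points)):
        j = nearest_index(points, i)
        if (j is not None and i < j and labels[i] != labels[j]
                and nearest_index(points, j) == i):
            links.append((i, j))
    return links

def undersample(points, labels, majority):
    """Drop the majority-class member of every Tomek link."""
    drop = {k for pair in tomek_links(points, labels)
            for k in pair if labels[k] == majority}
    keep = [k for k in range(len(points)) if k not in drop]
    return [points[k] for k in keep], [labels[k] for k in keep]

# Hypothetical 2-D traffic features; "benign" is the majority class here.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0)]
labels = ["benign", "benign", "benign", "attack", "attack"]

print(tomek_links(points, labels))            # [(2, 3)]
print(undersample(points, labels, "benign"))
```

This brute-force version compares every pair of points, so it only suits small samples; distributing that search is the kind of cost a big data platform is meant to reduce.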

Sentiment Analysis for Iraqis Dialect in Social Media

Iraqi Journal of Information & Communications Technology ◽ 10.31987/ijict.1.2.17 ◽ 2018 ◽ Vol 1 (2) ◽ pp. 24-32

Author(s): Lamiaa Abd Habeeb

Keyword(s): Machine Learning ◽ Social Media ◽ Nearest Neighbor ◽ Learning Algorithm ◽ Learning Approach ◽ Machine Learning Algorithm ◽ K Nearest Neighbor ◽ Multinomial Models ◽ Ensemble Machine Learning ◽ Machine Learning Approach

In this paper, we designed a system that extracts citizens' opinions about the Iraqi government and Iraqi politicians by analyzing their comments on Facebook (a social media network). Since the data is random and contains noise, we cleaned the text and built a stemmer to stem the words as much as possible. Cleaning and stemming reduced the vocabulary size from 28968 to 17083 words, which in turn reduced memory size from 382858 bytes to 197102 bytes. Generally, there are two approaches to extracting users' opinions: the lexicon-based approach and the machine learning approach. In our work, the machine learning approach is applied with three machine learning algorithms: Naïve Bayes, K-Nearest Neighbor, and the AdaBoost ensemble machine learning algorithm. For Naïve Bayes, we apply two models, Bernoulli and Multinomial. We found that Naïve Bayes with the Multinomial model gives the highest accuracy.
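
To make the Multinomial model concrete, here is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing. It is an illustration under stated assumptions, not the paper's system: the English toy tokens, the labels, and the function names are hypothetical, and the paper's cleaning and stemming steps are not reproduced.

```python
import math
from collections import Counter, defaultdict

def train_multinomial_nb(docs, labels):
    """Estimate log priors and smoothed per-class word log-likelihoods."""
    vocab = {word for doc in docs for word in doc}
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)  # counts word occurrences, not presence
    model = {}
    for label, n_docs in class_counts.items():
        total = sum(word_counts[label].values())
        log_prior = math.log(n_docs / len(docs))
        # Add-one smoothing so unseen (word, class) pairs keep nonzero probability.
        log_like = {w: math.log((word_counts[label][w] + 1) / (total + len(vocab)))
                    for w in vocab}
        model[label] = (log_prior, log_like)
    return model

def predict(model, doc):
    """Return the class with the highest posterior log-score for `doc`."""
    def score(label):
        log_prior, log_like = model[label]
        # Words outside the training vocabulary are ignored.
        return log_prior + sum(log_like[w] for w in doc if w in log_like)
    return max(model, key=score)

# Hypothetical tokenized comments; real input would be stemmed Iraqi-dialect text.
docs = [["good", "service", "good"], ["great", "leader"],
        ["bad", "corrupt", "bad"], ["corrupt", "failure"]]
labels = ["pos", "pos", "neg", "neg"]

model = train_multinomial_nb(docs, labels)
print(predict(model, ["good", "leader"]))  # pos
print(predict(model, ["corrupt", "bad"]))  # neg
```

The Bernoulli model differs in that it scores each vocabulary word by presence or absence in the comment rather than by how many times it occurs.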

Identification and Detection of Cyberbullying on Facebook Using Machine Learning Algorithms

Journal of Cases on Information Technology ◽ 10.4018/jcit.296254 ◽ 2021 ◽ Vol 23 (4) ◽ pp. 1-21

Author(s): Nureni Ayofe AZEEZ ◽ Sanjay Misra ◽ Omotola Ifeoluwa LAWAL ◽ Jonathan Oluranti

Keyword(s): Machine Learning ◽ Social Media ◽ Nearest Neighbor ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ K Nearest Neighbor ◽ Chi Square ◽ Social Media Platforms ◽ Bayes Algorithm ◽ Use Of Social Media

The use of social media platforms such as Facebook, Twitter, Instagram, and WhatsApp has enabled many people to communicate effectively and frequently with each other, and this has also enabled cyberbullying to occur more frequently on these networks. Cyberbullying is known to cause serious health issues among social media users, so a way to identify and detect it is of significant importance. This paper examines features obtained from a Facebook dataset and develops a model that identifies and detects cyberbullying posts by applying machine learning algorithms (Naïve Bayes and K-Nearest Neighbor). The project also uses a feature selection algorithm, namely the χ² test (Chi-Square test), to select important features that can improve the performance of the classifiers and decrease classification time. The results indicate that the approach detects cyberbullying on Facebook with a high degree of accuracy and also improves the performance of the machine learning classifiers.
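
The Chi-Square feature selection step can be illustrated with the classic 2×2 contingency-table statistic for a term and a class, χ² = N(AD − BC)² / ((A+B)(C+D)(A+C)(B+D)), where A and B count posts containing the term in the positive and negative class, and C and D count posts without it. The sketch below is an assumption-laden example: the toy posts, labels, and function names are hypothetical, and the paper's actual features are not given in the abstract.

```python
def chi_square(term, docs, labels, positive):
    """Chi-square score of `term` against the `positive` class (2x2 table)."""
    a = b = c = d = 0
    for doc, label in zip(docs, labels):
        present, is_pos = term in doc, label == positive
        if present and is_pos:
            a += 1          # term present, positive class
        elif present:
            b += 1          # term present, other class
        elif is_pos:
            c += 1          # term absent, positive class
        else:
            d += 1          # term absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom

def select_top_k(docs, labels, positive, k):
    """Keep the k terms most strongly associated with the class split."""
    vocab = sorted({word for doc in docs for word in doc})
    ranked = sorted(vocab,
                    key=lambda w: chi_square(w, docs, labels, positive),
                    reverse=True)
    return ranked[:k]

# Hypothetical posts represented as sets of tokens.
docs = [{"you", "idiot"}, {"idiot", "loser"}, {"you", "rock"}, {"nice", "you"}]
labels = ["bully", "bully", "ok", "ok"]

print(chi_square("idiot", docs, labels, "bully"))  # 4.0
print(select_top_k(docs, labels, "bully", 1))      # ['idiot']
```

Terms with low scores are close to independent of the class and can be dropped before training, which shrinks the feature space the classifiers have to search.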

Detecting Fake News using Machine Learning: A Systematic Literature Review

Psychology and Education Journal ◽ 10.17762/pae.v58i1.1046 ◽ 2021 ◽ Vol 58 (1) ◽ pp. 1932-1939

Author(s): Alim Al Ayub Ahmed et al.

Keyword(s): Machine Learning ◽ Social Media ◽ Literature Review ◽ Systematic Literature Review ◽ Political Party ◽ Fake News ◽ Machine Learning Classifiers ◽ Online Platforms ◽ Learning Classifiers ◽ Social Media Platforms

The Internet is one of the most important inventions, and a large number of people use it for different purposes. Various social media platforms are accessible to these users. Any user can make a post or spread news through these online platforms, which do not verify users or their posts. As a result, some users try to spread fake news through these platforms. Such fake news can be propaganda against an individual, society, organization, or political party. A human being is unable to detect all of this fake news, so there is a need for machine learning classifiers that can detect it automatically. This systematic literature review describes the use of machine learning classifiers for detecting fake news.

Coding and Classifying Knowledge Exchange on Social Media: a Comparative Analysis of the #Twitterstorians and AskHistorians Communities

Computer Supported Cooperative Work (CSCW) ◽ 10.1007/s10606-020-09376-y ◽ 2020 ◽ Vol 29 (6) ◽ pp. 629-656

Author(s): Anatoliy Gruzd ◽ Priya Kumar ◽ Deena Abul-Fottouh ◽ Caroline Haythornthwaite

Keyword(s): Machine Learning ◽ Social Media ◽ Information Seeking ◽ Knowledge Exchange ◽ Information Communication ◽ High Agreement ◽ Online Learning Communities ◽ Machine Learning Classifiers ◽ Learning Classifiers ◽ In The Wild

As social media become a staple for knowledge discovery and sharing, questions arise about how self-organizing communities manage learning outside the domain of organized, authority-led institutions. Yet examination of such communities is challenged by the quantity of posts and variety of media now used for learning. This paper addresses the challenges of identifying (1) what information, communication, and discursive practices support successful online communities, (2) whether such practices are similar on Twitter and Reddit, and (3) whether machine learning classifiers can be successfully used to analyze larger datasets of learning exchanges. This paper builds on earlier work that used manual coding of learning and exchange in Reddit ‘Ask’ communities to derive a coding schema we refer to as ‘learning in the wild’. This schema comprises eight categories: explanation with disagreement, agreement, or neutral presentation; socializing with negative or positive intent; information seeking; providing resources; and comments about forum rules and norms. To compare across media, results from coding Reddit’s AskHistorians are compared to results from coding a sample of #Twitterstorians tweets (n = 594). High agreement between coders affirmed the applicability of the coding schema to this different medium. LIWC lexicon-based text analysis was used to build machine learning classifiers and apply these to code a larger dataset of tweets (n = 69,101). This research shows that the ‘learning in the wild’ coding schema holds across at least two different platforms, and is partially scalable to study larger online learning communities.
