Detecting Potential Insider Threat: Analyzing Insiders’ Sentiment Exposed in Social Media

In the era of the Internet of Things (IoT), the impact of social media is gradually increasing. With the rapid progress in IoT devices, insider threats are becoming much more dangerous. To find what kinds of people pose a high risk to an organization, about one million tweets were analyzed using sentiment analysis. A dataset produced by the web service “Sentiment140” was used to find possible malicious insiders. Based on the analysis of sentiment level, users with negative sentiments were classified by the criteria and then selected as possible malicious insiders according to their threat level. Machine learning algorithms in the open-source machine learning software “Weka (Waikato Environment for Knowledge Analysis)” were used to find the possible malicious insiders. Decision Tree had the highest accuracy among the supervised learning algorithms, and K-Means had the highest accuracy among the unsupervised learning algorithms. In addition, we extracted frequently used words using a topic modeling technique and then verified the analysis results by matching them to information security compliance elements. These findings can contribute to higher detection accuracy by combining individuals’ characteristics with the approaches of previous studies, such as analyzing system behavior.


Prediction of social media effects on students’ academic performance using Machine Learning Algorithms (MLAs)

Journal of Computers in Education ◽

10.1007/s40692-021-00201-z ◽

2021 ◽

Author(s):

Isaac Kofi Nti ◽

Samuel Akyeramfo-Sam ◽

Bright Bediako-Kyeremeh ◽

Sylvester Agyemang

Keyword(s):

Machine Learning ◽

Social Media ◽

Academic Performance ◽

Media Effects ◽

Learning Algorithms ◽

Machine Learning Algorithms


Heart disease prediction using machine learning techniques : a survey

International Journal of Engineering & Technology ◽

10.14419/ijet.v7i2.8.10557 ◽

2018 ◽

Vol 7 (2.8) ◽

pp. 684 ◽

Cited By ~ 12

Author(s):

V. V. Ramalingam ◽

Ayantan Dandapath ◽

M Karthik Raja

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

Complex Data ◽

Learning Techniques ◽

Vector Machines ◽

Supervised Learning Algorithms ◽

Life Threatening

Heart-related diseases, or cardiovascular diseases (CVDs), have been the main cause of a huge number of deaths over the last few decades and have emerged as the most life-threatening diseases, not only in India but in the whole world. There is therefore a need for a reliable, accurate and feasible system to diagnose such diseases in time for proper treatment. Machine learning algorithms and techniques have been applied to various medical datasets to automate the analysis of large and complex data. In recent times, many researchers have been using machine learning techniques to help the health care industry and its professionals diagnose heart-related diseases. This paper presents a survey of various models based on such algorithms and techniques and analyzes their performance. Models based on supervised learning algorithms such as Support Vector Machines (SVM), K-Nearest Neighbour (KNN), Naïve Bayes, Decision Trees (DT), Random Forest (RF) and ensemble models are found to be very popular among researchers.


Cyber Bullying Detection for Twitter Using ML Classification Algorithms

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.38701 ◽

2021 ◽

Vol 9 (11) ◽

pp. 24-29

Author(s):

Muskan Patidar

Keyword(s):

Machine Learning ◽

Social Media ◽

Natural Language ◽

Naive Bayes ◽

Learning Algorithms ◽

Naïve Bayes ◽

Cyber Bullying ◽

Machine Learning Algorithms ◽

Support Vector ◽

Classification Algorithms

Abstract: Social networking platforms have given us more opportunities than ever before, and their benefits are undeniable. Despite these benefits, people may be humiliated, insulted, bullied, and harassed by anonymous users, strangers, or peers. Cyberbullying refers to the use of technology to humiliate and slander other people. It takes the form of hate messages sent through social media and email. With the exponential increase in social media users, cyberbullying has emerged as a form of bullying through electronic messages. As a possible solution to this problem, our project aims to detect cyberbullying in tweets using ML classification algorithms such as Naïve Bayes, KNN, Decision Tree, Random Forest and Support Vector. We will also apply NLTK (Natural Language Toolkit) with unigram, bigram, trigram and n-gram features on Naïve Bayes to check its accuracy. Finally, we will compare the results of the proposed and baseline features with other machine learning algorithms. The findings of the comparison indicate the significance of the proposed features in cyberbullying detection. Keywords: Cyber bullying, Machine Learning Algorithms, Twitter, Natural Language Toolkit
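The unigram, bigram and trigram features mentioned in this abstract can be illustrated with a minimal sketch. It uses plain Python rather than NLTK, and the whitespace tokenizer, the function names `tokenize` and `ngrams`, and the sample sentence are assumptions made for illustration, not details taken from the paper.

```python
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return every contiguous window of n tokens, in order of appearance."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # range() is empty when the text has fewer than n tokens.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical sample tweet, used only to show the shape of the output.
tokens = tokenize("Nobody likes you at all")
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
trigrams = ngrams(tokens, 3)
```

Each tuple could then be counted and passed as a feature to a classifier such as Naïve Bayes; how the paper weights or selects these features is not stated in the abstract.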


Towards scaling Twitter for digital epidemiology of birth defects

npj Digital Medicine ◽

10.1038/s41746-019-0170-5 ◽

2019 ◽

Vol 2 (1) ◽

Cited By ~ 4

Author(s):

Ari Z. Klein ◽

Abeed Sarker ◽

Davy Weissenbacher ◽

Graciela Gonzalez-Hernandez

Keyword(s):

Machine Learning ◽

Social Media ◽

Language Processing ◽

Birth Defects ◽

Birth Defect ◽

Learning Algorithms ◽

Class Imbalance ◽

Machine Learning Algorithms ◽

Supervised Machine Learning ◽

Svm Classifier

Abstract Social media has recently been used to identify and study a small cohort of Twitter users whose pregnancies with birth defect outcomes—the leading cause of infant mortality—could be observed via their publicly available tweets. In this study, we exploit social media on a larger scale by developing natural language processing (NLP) methods to automatically detect, among thousands of users, a cohort of mothers reporting that their child has a birth defect. We used 22,999 annotated tweets to train and evaluate supervised machine learning algorithms—feature-engineered and deep learning-based classifiers—that automatically distinguish tweets referring to the user’s pregnancy outcome from tweets that merely mention birth defects. Because 90% of the tweets merely mention birth defects, we experimented with under-sampling and over-sampling approaches to address this class imbalance. An SVM classifier achieved the best performance for the two positive classes: an F1-score of 0.65 for the “defect” class and 0.51 for the “possible defect” class. We deployed the classifier on 20,457 unlabeled tweets that mention birth defects, which helped identify 542 additional users for potential inclusion in our cohort. Contributions of this study include (1) NLP methods for automatically detecting tweets by users reporting their birth defect outcomes, (2) findings that an SVM classifier can outperform a deep neural network-based classifier for highly imbalanced social media data, (3) evidence that automatic classification can be used to identify additional users for potential inclusion in our cohort, and (4) a publicly available corpus for training and evaluating supervised machine learning algorithms.
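To make the class imbalance and F1-score discussion concrete, the following is a minimal sketch of random under-sampling and of a per-class F1 computation. The label strings `"defect"` and `"mention"`, the function names, and the toy data are assumptions for illustration; the abstract does not describe the authors' actual sampling procedure in this detail.

```python
import random
from typing import List, Sequence, Tuple


def f1_score(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """F1 for one class: the harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def undersample(
    examples: List[Tuple[str, str]], majority: str, keep: int, seed: int = 0
) -> List[Tuple[str, str]]:
    """Keep all minority examples and at most `keep` random majority examples."""
    rng = random.Random(seed)
    majority_items = [e for e in examples if e[1] == majority]
    minority_items = [e for e in examples if e[1] != majority]
    kept = rng.sample(majority_items, min(keep, len(majority_items)))
    return minority_items + kept


# Toy data: (text, label) pairs with a 9:1 imbalance, mirroring the 90% figure.
data = [(f"tweet {i}", "mention") for i in range(9)] + [("tweet 9", "defect")]
balanced = undersample(data, majority="mention", keep=1)

# Hypothetical gold labels and predictions for four tweets.
score = f1_score(
    ["defect", "defect", "mention", "mention"],
    ["defect", "mention", "defect", "mention"],
    positive="defect",
)
```

Over-sampling would work in the opposite direction, duplicating or synthesizing minority examples instead of discarding majority ones.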


Application Based Cigarette Detection on Social Media Platforms Using Machine Learning Algorithms

10.1007/978-3-030-91387-8_5 ◽

2021 ◽

pp. 68-80

Author(s):

Muhammad Umer Hashmi ◽

Ngoc Duy Nguyen ◽

Michael Johnstone ◽

Kathryn Backholer ◽

Asim Bhatti

Keyword(s):

Machine Learning ◽

Social Media ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Social Media Platforms


Detecting “Clickbait” News on Social Media Using Machine Learning Algorithms

2019 27th Signal Processing and Communications Applications Conference (SIU) ◽

10.1109/siu.2019.8806257 ◽

2019 ◽

Author(s):

Sura Genc ◽

Elif Surer

Keyword(s):

Machine Learning ◽

Social Media ◽

Learning Algorithms ◽

Machine Learning Algorithms


Behavioral Analysis of User Data on Social Media Applications using Machine Learning Algorithms

International Journal for Modern Trends in Science and Technology - RTT2020 ◽

10.46501/ijmtstciet17 ◽

2020 ◽

Vol 6 (8S) ◽

pp. 89-94

Author(s):

Prof. Chethan Raj C ◽

Abhishek V Dhapte ◽

Namratha V

Keyword(s):

Machine Learning ◽

Social Media ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Behavioral Analysis ◽

User Data ◽

Media Applications


DETECTION OF FAKE REVIEWS ON SOCIAL MEDIA USING MACHINE LEARNING ALGORITHMS

Issues In Information Systems ◽

10.48009/1_iis_2020_185-194 ◽

2020 ◽

Keyword(s):

Machine Learning ◽

Social Media ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Fake Reviews


An Efficient Classifier for U2R, R2L, DoS Attack

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.a1942.059120 ◽

2020 ◽

Vol 9 (1) ◽

pp. 644-647

Keyword(s):

Machine Learning ◽

Network Security ◽

Learning Algorithms ◽

Research Area ◽

Attack Detection ◽

Machine Learning Algorithms ◽

The Internet ◽

Detection Accuracy ◽

Cyber Attack ◽

Detection Systems

The internet has become an irreplaceable tool for communication and information in the current world. With the ever-growing importance and massive use of the internet today, researchers have taken an interest in finding the ideal Cyber Attack Detection Systems (CADSs), also referred to as Intrusion Detection Systems (IDSs), to protect against network security vulnerabilities. CADSs presently exist in many variants but can be broadly categorized into two classes, signature-based detection and anomaly detection, based on their approach to recognizing attack packets. Signature-based CADSs use the well-known signatures or fingerprints of attack packets to signal their entry across the gateways of secured networks. Signature-based CADSs can only recognize threats that use a known signature; new attacks with unknown signatures can therefore strike without notice. Alternatively, anomaly-based CADSs are able to detect and report any abnormal traffic within the network. There are many ways of identifying anomalies, and different machine learning algorithms have been introduced to counter such threats. Most systems, however, fall short of complete attack prevention in the real world due to system administration and configuration, system complexity and abuse of authorized access. Several scholars and researchers have achieved significant milestones in the development of CADSs owing to the importance of computer and network security. This paper reviews current trends in CADSs, analyzing the efficiency or level of detection accuracy of machine learning algorithms for cyber-attack detection with the aim of identifying the best. CADS is a developing research area that continues to attract researchers due to its critical objective.
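The difference between the two detection families described above can be sketched in a few lines. This is an illustrative toy rather than a method from the paper: the signature strings, the use of packets-per-second as the monitored feature, and the z-score threshold of 3.0 are all assumptions.

```python
from statistics import mean, pstdev
from typing import Iterable, List, Set

# Hypothetical digests of known attack payloads.
KNOWN_SIGNATURES: Set[str] = {"deadbeef", "cafebabe"}


def signature_match(payload_digest: str, signatures: Set[str]) -> bool:
    """Signature-based check: flags only payloads already seen and catalogued."""
    return payload_digest in signatures


def anomaly_flags(
    baseline: List[float], observed: Iterable[float], threshold: float = 3.0
) -> List[bool]:
    """Anomaly-based check: flag values far from the baseline mean (z-score)."""
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A constant baseline has no spread, so any deviation is flagged.
        return [x != mu for x in observed]
    return [abs(x - mu) / sigma > threshold for x in observed]


# Hypothetical packets-per-second readings taken during normal operation.
baseline_rate = [100.0, 102.0, 98.0, 101.0, 99.0]

known_hit = signature_match("deadbeef", KNOWN_SIGNATURES)
novel_hit = signature_match("0badf00d", KNOWN_SIGNATURES)
flags = anomaly_flags(baseline_rate, [100.0, 150.0])
```

The unseen digest passes the signature check unnoticed, while the traffic spike is caught by the anomaly check even though nothing about it was known in advance. A machine learning based detector would typically replace the z-score rule with a trained model.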


Machine Learning Algorithms for Visualization and Prediction Modeling of Boston Crime Data

10.20944/preprints202002.0108.v1 ◽

2020 ◽

Author(s):

Jiarui Yin ◽

Inikuro Afa Michael ◽

Iduabo John Afa

Keyword(s):

Machine Learning ◽

Random Forest ◽

Decision Tree ◽

Data Science ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Crime Data ◽

Detection Analysis ◽

Supervised Learning Algorithms ◽

Supervised Methods

Machine learning plays a key role in present-day crime detection, analysis and prediction. The goal of this work is to propose methods for predicting crimes classified into different categories of severity. We implemented visualization and analysis of crime data statistics from recent years in the city of Boston. We then carried out a comparative study of two supervised learning algorithms, decision tree and random forest, based on the accuracy and processing time of the models, making predictions from geographical and temporal information after splitting the data into training and test sets. The results show that random forest, as expected, performs better, with 1.54% higher accuracy than decision tree, although this comes at a cost of at least 4.37 times the processing time. The study opens doors to the application of similar supervised methods in crime data analytics and other fields of data science.
