Bot Detection on Social Networks Using Persistent Homology

2020 ◽  
Vol 25 (3) ◽  
pp. 58
Author(s):  
Minh Nguyen ◽  
Mehmet Aktas ◽  
Esra Akbas

The growth of social media in recent years has contributed to an ever-increasing network of user data in every aspect of life. This volume of generated data is becoming a vital asset for the growth of companies and organizations as a powerful tool to gain insights and make crucial decisions. However, data is not always reliable, primarily because it can be manipulated and disseminated from unreliable sources. In the field of social network analysis, this problem can be tackled by implementing machine learning models that learn to distinguish between humans and bots, which are mostly harmful computer programs exploited to shape public opinion and circulate false information on social media. In this paper, we propose a novel topological feature extraction method for bot detection on social networks. We first create weighted ego networks of each user. We then encode the higher-order topological features of ego networks using persistent homology. Finally, we use these extracted features to train a machine learning model and use that model to classify users as bot vs. human. Our experimental results suggest that using the higher-order topological features coming from persistent homology is promising in bot detection and more effective than using classical graph-theoretic structural features.
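The pipeline described in this abstract (ego network, then persistence, then a feature vector) can be illustrated with a minimal sketch. It is not the authors' implementation: it computes only 0-dimensional persistence (connected components), whereas the paper refers to higher-order features, and it assumes that edge weights are dissimilarities, so edges enter the filtration in increasing weight order. The names `ego_network`, `h0_persistence`, `persistence_features` and the toy graph `example_adj` are hypothetical.

```python
def ego_network(adj, user):
    """Return the nodes and weighted edges among `user` and its direct neighbors."""
    nodes = {user} | set(adj.get(user, {}))
    edges = {}
    for u in nodes:
        for v, w in adj.get(u, {}).items():
            if v in nodes and u != v:
                edges[frozenset((u, v))] = w
    return nodes, edges


def h0_persistence(nodes, edges):
    """0-dimensional persistence pairs for an edge-weight filtration.

    Assumption: every vertex is born at 0 and edges are added in increasing
    weight order. A component dies when an edge merges it into another one.
    """
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for edge, weight in sorted(edges.items(), key=lambda kv: kv[1]):
        u, v = tuple(edge)
        root_u, root_v = find(u), find(v)
        if root_u != root_v:
            parent[root_u] = root_v
            pairs.append((0.0, weight))
    # Components that are never merged persist forever.
    survivors = len({find(n) for n in nodes})
    pairs.extend((0.0, float("inf")) for _ in range(survivors))
    return pairs


def persistence_features(pairs):
    """Summarize finite lifetimes as [count, total lifetime, longest lifetime]."""
    finite = [death - birth for birth, death in pairs if death != float("inf")]
    return [len(finite), sum(finite), max(finite, default=0.0)]


# Hypothetical toy graph: symmetric adjacency with dissimilarity weights.
example_adj = {
    "a": {"b": 1.0, "c": 2.0, "d": 5.0},
    "b": {"a": 1.0, "c": 3.0},
    "c": {"a": 2.0, "b": 3.0},
    "d": {"a": 5.0, "e": 1.0},
    "e": {"d": 1.0},
}
example_features = persistence_features(h0_persistence(*ego_network(example_adj, "a")))
```

A vector such as `example_features` could then be fed, one row per user, to any standard classifier. Capturing cycles and higher-dimensional features would require a full persistent homology library rather than this union-find shortcut.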

Social networks are a source of rich, interactive, textual, and other media. Users of social media generate data at a tremendous pace. This data, consisting of user opinions and attitudes, is so large that it has necessitated automated methods to analyze it and extract knowledge from it. Social networks have been studied and analyzed using various graph-based analysis techniques. Prominent analysis has centered on features such as ego-networks, distance, centrality, and sub-networks. The areas of study for social media analysis have centered on populations, boundaries, cohesion, centrality and brokerage, and prestige and ranking. In the past, several models have been proposed for machine learning based analytics of social networks, but there is a perceived need to study social networks for health data using ensemble learning, in which an array of machine learning techniques can be employed to achieve better classification or clustering results. We introduce an analytical model that identifies the most discussed health and healthcare terms and topics on social networks in order to predict emerging health trends. The model uses temporal datasets to deduce a multi-label classification of health-related topics. It applies temporal clustering (using machine learning) to the topic classification produced on the datasets by ensemble machine learning in order to deduce the most discussed topics. Using this model, we will see how an ensemble machine learning based analytical model for analyzing social network data on health topics is more efficient than traditional machine learning techniques.


2021 ◽  
pp. 1-13
Author(s):  
C S Pavan Kumar ◽  
L D Dhinesh Babu

Sentiment analysis is widely used to retrieve the hidden sentiments in medical discussions on online social networking platforms such as Twitter, Facebook, and Instagram. People often convey their feelings about their medical problems on social media platforms. Practitioners and health care workers have started to observe these discussions to assess the impact of health-related issues on people. This helps in providing better care to improve quality of life. Dementia is a serious disease in western countries such as the United States of America and the United Kingdom, and the respective governments are providing facilities to the affected people. There is much chatter on social media platforms concerning patients’ care, healthy measures to follow to avoid the disease, and early indications to check. This chatter has to be carefully monitored to help officials take the necessary precautions for the betterment of those affected. A novel feature engineering architecture that involves feature-split for sentiment analysis of medical chatter over online social networks is proposed, together with a pipeline that can be used with any machine learning model. The proposed model uses a fuzzy membership function to refine the outputs. The sentiment score obtained from the machine learning model is subjected to fuzzification and defuzzification using the trapezoid membership function and the center of sums method, respectively. Three datasets are considered for comparing the proposed model with the regular model. The proposed approach delivered better results than the regular approach and proved to be effective for sentiment analysis of medical discussions over online social networks.
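The fuzzification and defuzzification step mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' configuration: the three fuzzy sets in `SETS`, their breakpoints, the score range of [-1, 1] and the function name `refine_score` are all hypothetical. Center of sums is approximated on a discrete grid as the sum of x times the summed clipped memberships, divided by the sum of those memberships.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoid membership with feet at a and d and plateau on [b, c] (a <= b <= c <= d)."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)  # rising edge
    return (d - x) / (d - c)      # falling edge


# Hypothetical fuzzy sets over a sentiment score in [-1, 1].
SETS = {
    "negative": (-1.0, -1.0, -0.6, -0.1),
    "neutral": (-0.4, -0.1, 0.1, 0.4),
    "positive": (0.1, 0.6, 1.0, 1.0),
}


def refine_score(score, steps=200):
    """Fuzzify a crisp score, then defuzzify it with a discrete center of sums."""
    # Fuzzification: degree to which the score belongs to each set.
    strengths = {name: trapezoid(score, *params) for name, params in SETS.items()}

    numerator = denominator = 0.0
    for i in range(steps + 1):
        x = -1.0 + 2.0 * i / steps
        # Each output set is clipped at its firing strength; overlapping
        # areas are counted once per set, which is what center of sums does.
        total = sum(
            min(strengths[name], trapezoid(x, *SETS[name])) for name in SETS
        )
        numerator += x * total
        denominator += total
    # Fall back to the raw score if no set fires (outside the assumed range).
    return numerator / denominator if denominator else score
```

With these assumed sets, a score of 0.0 fires only the symmetric "neutral" set and is mapped back to approximately 0, while clearly positive or negative scores keep their sign after refinement.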


2020 ◽  
Vol 34 (10) ◽  
pp. 13971-13972
Author(s):  
Yang Qi ◽  
Farseev Aleksandr ◽  
Filchenkov Andrey

Nowadays, social networks play a crucial role in everyday human life and are no longer purely associated with spending spare time. In fact, instant communication with friends and colleagues has become an essential component of our daily interaction, giving rise to the emergence of multiple new types of social network. By participating in such networks, individuals generate a multitude of data points that describe their activities from different perspectives and can, for example, be further used for applications such as personalized recommendation or user profiling. However, the impact of different social media networks on machine learning model performance has not yet been studied comprehensively. In particular, the literature on modeling multi-modal data from multiple social networks is relatively sparse, which inspired us to take a deeper dive into the topic in this preliminary study. Specifically, in this work, we study the performance of different machine learning models when trained on multi-modal data from different social networks. Our initial experimental results reveal that the choice of social network impacts performance and that proper selection of the data source is crucial.


Technologies ◽  
2020 ◽  
Vol 8 (4) ◽  
pp. 64
Author(s):  
Panagiotis Kantartopoulos ◽  
Nikolaos Pitropakis ◽  
Alexios Mylonas ◽  
Nicolas Kylilis

Social media has become very popular and important in people’s lives, as personal ideas, beliefs and opinions are expressed and shared through it. Unfortunately, social networks, and specifically Twitter, suffer from the massive presence and perpetual creation of fake users. Their goal is to deceive other users by employing various methods, or even to create a stream of fake news and opinions in order to influence opinion on a specific subject, thus impairing the platform’s integrity. As such, machine learning techniques have been widely used in social networks to address this type of threat by automatically identifying fake accounts. Nonetheless, threat actors update their arsenal and launch a range of sophisticated attacks to undermine this detection procedure, either during the training or test phase, rendering machine learning algorithms vulnerable to adversarial attacks. Our work examines the propagation of adversarial attacks in machine learning based detection of fake Twitter accounts, which is based on AdaBoost. Moreover, we propose and evaluate the use of k-NN as a countermeasure to remedy the effects of the adversarial attacks that we have implemented.
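The abstract does not say how k-NN is applied as a countermeasure. One common use of k-NN against training-time label-flipping attacks is label sanitization, and the sketch below illustrates that idea only; treating it as the paper's method is an assumption. The function name `knn_sanitize`, the toy points and the labels are hypothetical.

```python
import math
from collections import Counter


def knn_sanitize(points, labels, k=3):
    """Relabel each training point to the strict majority label of its k nearest neighbors.

    Votes always use the ORIGINAL labels, so earlier relabeling decisions
    do not influence later ones. Without a strict majority the label is kept.
    """
    cleaned = []
    for i, point in enumerate(points):
        neighbors = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(point, points[j]),
        )[:k]
        majority, count = Counter(labels[j] for j in neighbors).most_common(1)[0]
        cleaned.append(majority if count > k / 2 else labels[i])
    return cleaned


# Hypothetical toy data: two well-separated clusters of account features,
# with one label in the "human" cluster flipped to "bot" by an attacker.
example_points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
example_labels = ["human", "human", "bot", "human", "bot", "bot", "bot", "bot"]
example_cleaned = knn_sanitize(example_points, example_labels, k=3)
```

The sanitized labels would then be used to train the downstream detector (AdaBoost in the abstract). This defense relies on flipped labels being locally outnumbered, so it weakens as the fraction of poisoned points grows.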


2021 ◽  
Author(s):  
◽  
Seungho Choe

Persistent homology is a powerful tool in topological data analysis (TDA) for efficiently computing, studying and encoding multi-scale topological features, and it is increasingly used in digital image classification. The topological features represent the number of connected components, cycles, and voids that describe the shape of data. Persistent homology extracts the birth and death of these topological features through a filtration process. The lifespan of these features can be represented using persistence diagrams (topological signatures). Cubical homology is a more efficient method for extracting topological features from a 2D image and uses a collection of cubes to compute the homology, which fits the grid structure of digital images. In this research, we propose a cubical homology-based algorithm for extracting topological features from 2D images to generate their topological signatures. Additionally, we propose a score that measures the significance of each of the sub-simplices in terms of persistence. Gray level co-occurrence matrix (GLCM) and contrast limited adaptive histogram equalization (CLAHE) are also used as supplementary methods for extracting features. Machine learning techniques are then employed to classify images using the topological signatures. Among the eight algorithms tested on six published image datasets with varying pixel sizes, classes, and distributions, our experiments demonstrate that cubical homology-based machine learning with a deep residual network (ResNet 1D) and Light Gradient Boosting Machine (lightGBM) shows promise with the extracted topological features.
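The birth and death of features through a filtration, as described in this abstract, can be made concrete with a small sketch. It is a simplification rather than the algorithm proposed in the work: pixels are treated as vertices with 4-connectivity, only connected components (dimension 0) are tracked, and the filtration is the sublevel filtration by gray value. The function name `image_h0_persistence` and the toy image `example_img` are hypothetical.

```python
def image_h0_persistence(img):
    """0-dimensional sublevel-set persistence of a 2D grayscale image.

    Pixels are added in increasing gray value. A component is born at a
    local minimum and dies when it merges into an older component (elder rule).
    """
    rows, cols = len(img), len(img[0])
    order = sorted((img[r][c], r, c) for r in range(rows) for c in range(cols))
    parent, birth = {}, {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        parent[(r, c)] = (r, c)
        birth[(r, c)] = value
        for neighbor in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if neighbor not in parent:
                continue  # outside the image or not yet in the filtration
            older, younger = find((r, c)), find(neighbor)
            if older == younger:
                continue
            if birth[older] > birth[younger]:
                older, younger = younger, older
            # Skip zero-persistence pairs created by the pixel just added.
            if birth[younger] < value:
                pairs.append((birth[younger], value))
            parent[younger] = older
    # The component of the global minimum never dies.
    pairs.append((order[0][0], float("inf")))
    return sorted(pairs)


# Hypothetical 3x3 image with a local minimum in each corner.
example_img = [
    [1, 5, 2],
    [6, 7, 6],
    [3, 6, 4],
]
example_diagram = image_h0_persistence(example_img)
```

Each `(birth, death)` pair in `example_diagram` is one point of the persistence diagram. Turning such a diagram into a fixed-length vector, for example by binning or summary statistics, is what would make it usable as classifier input.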


Author(s):  
Pushkar Dubey

Social networks are the main resources for gathering information about people’s opinions on different topics, as people spend hours daily on social media and share their opinions. Twitter is one of the social media platforms that is gaining popularity. Twitter offers organizations a fast and effective way to analyze customers’ perspectives on what is critical to success in the marketplace. Developing a program for sentiment analysis is one approach to computationally measuring customers’ perceptions. We use natural language processing and machine learning concepts to create a model for this analysis. In this paper, we discuss how to create a model for the analysis of tweets that is trained using various NLP, machine learning and deep learning approaches.


Author(s):  
Hardeo Kumar Thakur ◽  
Anand Gupta ◽  
Ayushi Bhardwaj ◽  
Devanshi Verma

This article describes how a rumor can be defined as a circulating unverified story or a doubtful truth. Rumor initiators seek social networks vulnerable to illimitable spread; therefore, online social media becomes their stage. Hence, this misinformation imposes colossal damage on individuals, organizations, the government, etc. Existing work analyzing the temporal and linguistic characteristics of rumors seems to give ample time for rumor propagation. Meanwhile, with the huge outburst of data on social media, studying these characteristics for each tweet becomes spatially complex. Therefore, in this article, a two-fold supervised machine-learning framework is proposed that detects rumors by filtering and then analyzing their linguistic properties. This method attempts to automate filtering by training multiple classification algorithms with accuracy higher than 81.079%. Finally, using textual characteristics on the filtered data, rumors are detected. The effectiveness of the proposed framework is shown through extensive experiments on over 10,000 tweets.


Author(s):  
T Heena Fayaz

Abstract: The way politicians communicate with the electorate and run electoral campaigns was reshaped by the emergence and popularization of contemporary social media (SM), such as the Facebook, Twitter, and Instagram social networks (SN). Due to inherent capabilities of SM, such as the large amount of available data accessed in real time, a new research subject has emerged, focusing on using SM data to predict election outcomes. Despite many studies conducted in the last decade, results are very controversial and often challenged. In this context, this work aims to investigate and summarize how research on predicting elections based on SM data has evolved since its beginning, to outline the state of both the art and the practice, and to identify research opportunities within this field. In terms of method, we performed a systematic literature review analyzing the quantity and quality of publications, the electoral context of studies, the main approaches to and characteristics of the successful studies, as well as their main strengths and challenges, and compared our results with previous reviews. We identified and analyzed 83 relevant studies, and challenges were identified in many areas such as process, sampling, modeling, performance evaluation and scientific rigor. Main findings include the low success of the most-used approach, namely volume and sentiment analysis on Twitter, and the better results obtained with new approaches, such as regression methods trained with traditional polls. Finally, a vision of future research on integrating advances in process definitions, modeling, and evaluation is also discussed, pointing out, among others, the need to better investigate the application of state-of-the-art machine learning approaches. Index Terms: Elections, Social Media, Social Networks, Machine Learning, Systematic Review

