Detecting Pharmaceutical Spam in Microblog Messages

Data Mining ◽  
2013 ◽  
pp. 1407-1420
Author(s):  
Kathy J. Liszka ◽  
Chien-Chung Chan ◽  
Chandra Shekar

Microblogs are one of a growing group of social network tools. Twitter is, at present, one of the most popular forums for microblogging in online social networks, and the fastest growing. Fifty million messages on a wide variety of topics flow through servers, computers, and cell phones daily. With this considerable volume, Twitter is a natural and obvious target for spreading spam via its messages, called tweets. The challenge is how to determine whether a tweet is spam and, more specifically, whether it belongs to a special category advertising pharmaceutical products. The authors look at the essential characteristics of spam tweets and at what distinguishes microblogging spam from email or other types of spam. They review methods and tools currently available to identify general spam tweets. Finally, this work introduces a new methodology that applies text mining and data mining techniques to generate classifiers for pharmaceutical spam detection in the context of microblogging.


2015 ◽  
pp. 1539-1556
Author(s):  
Dhiraj Murthy ◽  
Alexander Gross ◽  
Alex Takata

This chapter identifies a number of the most common data mining toolkits and evaluates their utility in the extraction of data from heterogeneous online social networks. It introduces not only the complexities of scraping data from the diverse forms of data manifested in these sources, but also critically evaluates currently available tools. This analysis is followed by a presentation and discussion on the development of a hybrid system, which builds upon the work of the open-source Web-Harvest framework, for the collection of information from online social networks. This tool, VoyeurServer, attempts to address the weaknesses of tools identified in earlier sections, as well as prototype the implementation of key functionalities thought to be missing from commonly available data extraction toolkits. The authors conclude the chapter with a case study and subsequent evaluation of the VoyeurServer system itself. This evaluation presents future directions, remaining challenges, and additional extensions thought to be important to the effective development of data mining tools for the study of online social networks.


2016 ◽  
Vol 18 (5) ◽  
pp. 459-477
Author(s):  
Sarah Whitcomb Laiola

This article addresses issues of user precarity and vulnerability in online social networks. As social media criticism by Jose van Dijck, Felix Stalder, and Geert Lovink describes, the social web is a predatory system that exploits users’ desires for connection. Although accurate, this critical description casts the social web as a zone where users are always already disempowered, so fails to imagine possibilities for users beyond this paradigm. This article examines Natalie Bookchin’s composite video series, Testament, as it mobilizes an alt-(ernative) social network of vernacular video on YouTube. In the first place, the alt-social network works as an iteration of “tactical media” to critically reimagine empowered user-to-user interactions on the social web. In the second place, it obfuscates YouTube’s data-mining functionality, so allows users to socialize online in a way that evades their direct translation into data and the exploitation of their social labor.


2021 ◽  
Author(s):  
Darshika Koggalahewa ◽  
Yue Xu ◽  
Ernest Foo

Online Social Networks (OSNs) are a popular platform for communication and collaboration. Spammers are highly active in OSNs. Uncovering spammers has become one of the most challenging problems in OSNs. Classification-based supervised approaches are the most commonly used method for detecting spammers. Classification-based systems suffer from the limitations of “data labelling”, “spam drift”, “imbalanced datasets” and “data fabrication”. These limitations affect the accuracy of a classifier’s detection. We present a pure unsupervised approach for spammer detection based on the peer acceptance of a user in a social network to distinguish spammers from genuine users. The peer acceptance of a user to another user is calculated based on common shared interests over multiple shared topics between the two users. The main contribution of this paper is the introduction of a pure unsupervised spammer detection approach based on users’ peer acceptance. Our approach does not require labelled training datasets. While it does not better the accuracy of supervised classification-based approaches, our approach has become a successful alternative to traditional classifiers for spam detection by achieving an accuracy of 96.9%.


2022 ◽  
Vol 9 (1) ◽  
Author(s):  
Darshika Koggalahewa ◽  
Yue Xu ◽  
Ernest Foo

Online Social Networks (OSNs) are a popular platform for communication and collaboration. Spammers are highly active in OSNs. Uncovering spammers has become one of the most challenging problems in OSNs. Classification-based supervised approaches are the most commonly used method for detecting spammers. Classification-based systems suffer from the limitations of “data labelling”, “spam drift”, “imbalanced datasets” and “data fabrication”. These limitations affect the accuracy of a classifier’s detection. An unsupervised approach does not require labelled datasets. We aim to address the limitations of data labelling and spam drift through an unsupervised approach. We present a pure unsupervised approach for spammer detection based on the peer acceptance of a user in a social network to distinguish spammers from genuine users. The peer acceptance of a user to another user is calculated based on common shared interests over multiple shared topics between the two users. The main contribution of this paper is the introduction of a pure unsupervised spammer detection approach based on users’ peer acceptance. Our approach does not require labelled training datasets. While it does not better the accuracy of supervised classification-based approaches, our approach has become a successful alternative to traditional classifiers for spam detection by achieving an accuracy of 96.9%.
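The abstract above does not give the peer-acceptance formula, so the following is only a rough illustration of the general idea, not the authors' method. It assumes each user is described by a set of interest terms per topic, scores a pair of users by the average Jaccard overlap across the topics they share, and flags users whose mean score against all peers falls below a threshold. The profile data, the function names, and the threshold value are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_acceptance(interests_u, interests_v):
    """Average interest overlap across the topics two users share.

    ASSUMPTION: this averaging rule is illustrative only; the paper's
    actual peer-acceptance measure is not specified in the abstract.
    """
    shared_topics = interests_u.keys() & interests_v.keys()
    if not shared_topics:
        return 0.0
    total = sum(jaccard(interests_u[t], interests_v[t]) for t in shared_topics)
    return total / len(shared_topics)


def peer_acceptance(user, profiles):
    """Mean pairwise acceptance of `user` against every other user."""
    peers = [p for p in profiles if p != user]
    if not peers:
        return 0.0
    return sum(pair_acceptance(profiles[user], profiles[p]) for p in peers) / len(peers)


# Hypothetical toy profiles: topic -> set of interest terms.
profiles = {
    "alice": {"sports": {"football", "tennis"}, "tech": {"python", "linux"}},
    "bob": {"sports": {"football", "cricket"}, "tech": {"python", "rust"}},
    "carol": {"sports": {"tennis", "football"}, "tech": {"linux", "python"}},
    "spambot": {"pharma": {"pills", "discount"}, "tech": {"click", "here"}},
}

scores = {user: peer_acceptance(user, profiles) for user in profiles}

THRESHOLD = 0.1  # hypothetical cut-off, chosen only for this toy data
flagged = sorted(user for user, score in scores.items() if score < THRESHOLD)
print(flagged)
```

No labelled examples are used anywhere in the sketch, which is the property the abstract emphasises; in practice the threshold would have to be chosen from the score distribution rather than fixed by hand.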


Author(s):  
Vijayaganth V.

Social networks have grown enormously in the last decade. Individuals depend on social networks for information, news, and the opinions of other users on various topics. These factors often make social network data too complex to analyze manually, resulting in the persistent use of computational means for analyzing them. Data mining provides a wide range of techniques for detecting useful knowledge, such as trends, patterns, and rules, in massive datasets. This chapter discusses different data mining techniques used in mining social networks.


2019 ◽  
Vol 15 (2) ◽  
pp. 275-280
Author(s):  
Agus Setiyono ◽  
Hilman F Pardede

It is now common for a cellphone to receive spam messages. The large number of received messages makes it difficult for humans to classify them as spam or not spam. One way to overcome this problem is to use data mining for automatic classification. In this paper, we investigate three data mining techniques, namely Support Vector Machine, Multinomial Naïve Bayes, and Decision Tree, for automatic spam detection. Our experimental results show that Support Vector Machine is the best of the three evaluated algorithms. Support Vector Machine achieves 98.33% accuracy, while Multinomial Naïve Bayes achieves 98.13% and Decision Tree 97.10%.
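To make one of the three techniques named above concrete, here is a minimal sketch of a Multinomial Naïve Bayes text classifier written with the Python standard library only. It is not the authors' implementation: the whitespace tokenizer, the Laplace smoothing value, the function names, and the four toy messages are all assumptions chosen for illustration, and the sketch makes no attempt to reproduce the reported accuracies.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def train_multinomial_nb(messages, labels, alpha=1.0):
    """Fit class priors and per-class word counts with Laplace smoothing alpha."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    vocab = set()
    for text, label in zip(messages, labels):
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocab.update(tokens)
    class_counts = Counter(labels)
    return {
        "alpha": alpha,
        "vocab": vocab,
        "word_counts": word_counts,
        "log_prior": {c: math.log(n / len(labels)) for c, n in class_counts.items()},
        "token_totals": {c: sum(word_counts[c].values()) for c in class_counts},
    }


def predict(model, text):
    """Return the label with the highest log posterior for `text`."""
    vocab_size = len(model["vocab"])
    alpha = model["alpha"]
    best_label, best_score = None, -math.inf
    for label, log_prior in model["log_prior"].items():
        score = log_prior
        denom = model["token_totals"][label] + alpha * vocab_size
        for token in tokenize(text):
            if token not in model["vocab"]:
                continue  # ignore words never seen in training
            score += math.log((model["word_counts"][label][token] + alpha) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy training data, not taken from the paper.
messages = [
    "win a free prize now",
    "free cash claim now",
    "are we meeting for lunch",
    "call me when you are home",
]
labels = ["spam", "spam", "ham", "ham"]

model = train_multinomial_nb(messages, labels)
print(predict(model, "claim your free prize"))
print(predict(model, "are you home for lunch"))
```

Working in log space avoids numeric underflow on longer messages, and the smoothing term keeps a word that was seen in only one class from driving the other class's probability to zero.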


2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Sunyoung Park ◽  
Lasse Gerrits

Although migration has long been an important topic in the social sciences, there is still a need for study of migrants’ unique and dynamic transnational identity, which heavily influences social integration in the host society. In Online Social Networks (OSNs), where contemporary migrants most actively communicate and share their stories, different challenges to migrants’ belonging and identity, and the ways they cope with or reconcile them, are likely to be evident. This paper aims to scrutinise how migrants manifest their belonging and identity via different technological types of online social networks, in order to understand the relations between online social networks and migrants’ multi-faceted transnational identity. The research introduces a comparative case study of an online social movement led by Koreans in Germany via their online communities, triggered by a German TV advertisement considered to stereotype East Asians from a white supremacist point of view. Starting with virtual ethnography on three OSNs, each representing one internet generation (Web 1.0 ~ Web 3.0), a two-step Qualitative Data Analysis is carried out to examine how Korean migrants manifest their belonging and identity via their views on “who we are” and “who are others”. The analysis reveals how Korean migrants’ transnational identities differ according to their expectations of the audience and the members in each online social network, which indicates that the distinctive features of an online platform may encourage or discourage them in shaping transnational identity as a group identity. The paper concludes with two main emphases. First, current OSNs comprising different generational technologies play a significant role in understanding migrants’ dynamic social values and, particularly, transnational identities. Second, the dynamics of migrants’ transnational identity engage diverse social and situational contexts.
(keywords: transnational identity, migrants’ online social networks, stereotyping migrants, technological evolution of online social network).

