Spam, a Digital Pollution and Ways to Eradicate It

With the growing popularity of microblogging and networking sites such as Twitter, Gmail, and Facebook, the number of spammers has increased. Spammers on Twitter appear to be more dangerous than email spammers because they exploit Twitter's character limit for their own purposes. Spammers have also become creative in framing their content to evade classifiers. This survey discusses and analyzes recent research on spam detection in social media sites such as Twitter. It reviews papers that tackle various problems faced on Twitter, as well as the limitations of the methods that have already been proposed. We then compare the methods presented in these papers to see which method, or combination of methods, gives the best result in detecting spam.

Author(s):  
Gauri Jain ◽  
Manisha Sharma ◽  
Basant Agarwal

This article describes how spam detection in social media text is becoming increasingly important because of the exponential increase in spam volume over the network. The task is challenging, especially for text restricted to a limited number of characters. Effective spam detection requires a larger number of effective features to be learned. In the current article, the use of a deep learning technique known as a convolutional neural network (CNN) is proposed for spam detection, with an added semantic layer on top of it. The resulting model is known as a semantic convolutional neural network (SCNN). The semantic layer is built by training randomly initialized word vectors with Word2vec to obtain semantically enriched word embeddings. WordNet and ConceptNet are used to find a word similar to a given word when that word is missing from the Word2vec vocabulary. The architecture is evaluated on two corpora: the SMS Spam dataset (UCI repository) and a Twitter dataset (tweets scraped from public live tweets). The authors' approach outperforms the state-of-the-art results, with 98.65% accuracy on the SMS Spam dataset and 94.40% accuracy on the Twitter dataset.
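The fallback step described in this abstract (use a similar word when the original is missing from the embedding vocabulary) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `EMBEDDINGS` table is a hypothetical toy stand-in for a trained Word2vec model, `SIMILAR_WORDS` is a hypothetical stand-in for WordNet/ConceptNet lookups, and the function names are invented for the example.

```python
from typing import Dict, List, Optional

# Hypothetical toy stand-in for a trained Word2vec vocabulary (3-d vectors).
EMBEDDINGS: Dict[str, List[float]] = {
    "free": [0.9, 0.1, 0.0],
    "prize": [0.8, 0.2, 0.1],
    "meeting": [0.0, 0.7, 0.6],
}

# Hypothetical stand-in for WordNet/ConceptNet: word -> similar words.
SIMILAR_WORDS: Dict[str, List[str]] = {
    "reward": ["bounty", "prize"],
    "gratis": ["free"],
}


def lookup_vector(word: str) -> Optional[List[float]]:
    """Return the embedding for `word`, falling back to a similar word."""
    key = word.lower()
    if key in EMBEDDINGS:
        return EMBEDDINGS[key]
    # Word is out of vocabulary: try each similar word in order.
    for candidate in SIMILAR_WORDS.get(key, []):
        if candidate in EMBEDDINGS:
            return EMBEDDINGS[candidate]
    return None  # No embedding and no usable similar word.


def embed_message(text: str) -> List[List[float]]:
    """Map a short message to the vectors a CNN input layer could consume."""
    vectors = []
    for token in text.split():
        vector = lookup_vector(token)
        if vector is not None:
            vectors.append(vector)
    return vectors


if __name__ == "__main__":
    print(embed_message("Free reward meeting unknownword"))
```

In this sketch, tokens with neither an embedding nor a usable similar word are simply dropped; how the original system treats such tokens is not stated in the abstract.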


2019 ◽  
Vol 11 (2) ◽  
pp. 144
Author(s):  
Danar Wido Seno ◽  
Arief Wibowo

The growth of written content on social media produces many new words and abbreviations on Twitter, which makes it increasingly difficult for sentiment analysis of Twitter text to achieve high accuracy. In this study, the authors analyze sentiment toward the pairs of candidates for President and Vice President of Indonesia in the 2019 elections. To obtain higher accuracy and accommodate the evolving vocabulary on Twitter, the authors combine an unsupervised method with a supervised one, namely the Lexicon Based method and the Support Vector Machine. The study used Twitter data from October 2018, collected with search keywords containing the names of each pair of candidates, totaling 800 data items. The best accuracy, 92.5%, was obtained with 80% of the data used for training and 20% for testing, with per-class precision between 85.7% and 97.2% and per-class recall between 78.2% and 93.5%. Because the Lexicon Based method labels the dataset, the training data for the Support Vector Machine no longer has to be labeled manually, and the lexicon dictionary can be extended as the content on Twitter evolves.
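The labeling pipeline described in this abstract can be sketched as a lexicon scorer followed by an 80/20 split. This is an illustrative sketch only: the word lists, the three class names, and the function names are assumptions made for the example, and the classifier trained on the labeled data (a Support Vector Machine in the study) is omitted.

```python
import random
from typing import List, Tuple

# Hypothetical miniature lexicon; a real one would be far larger and extensible.
POSITIVE = {"good", "great", "support", "win"}
NEGATIVE = {"bad", "corrupt", "fail", "lose"}


def lexicon_label(text: str) -> str:
    """Label a tweet by counting positive and negative lexicon hits."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"  # Assumed third class for ties and no hits.


def split_80_20(rows: List[Tuple[str, str]], seed: int = 0):
    """Shuffle labeled rows and split them into 80% train and 20% test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * 4 // 5
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    tweets = ["great win tonight", "corrupt and bad", "rally at noon",
              "we support them", "they will fail"]
    labeled = [(t, lexicon_label(t)) for t in tweets]
    train, test = split_80_20(labeled)
    print(len(train), len(test))
```

Because the lexicon is an ordinary set, new words and abbreviations can be added without relabeling by hand, which is the benefit the abstract attributes to this approach.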


2014 ◽  
Vol 10 (10) ◽  
pp. 2135-2140 ◽  
Author(s):  
Saini Jacob Soman ◽  
S. Murugappan

Author(s):  
Roman Pyrma

The study contributes to defining the impact of digital communication on civic and political participation, explaining how social media mediate public activism. Based on the concept of ‘digital citizenship’, the paper reveals the political aspect of the public activism of Russian youth online. The empirical model combines methods and procedures of applied research in order to reveal the details of civil and political participation and of the protest activism of youth online. The research model includes an analysis of social media and a large-scale online survey of the younger audience. Based on the analysis of social media information flows, the paper states that the youth’s civic participation prevails over political participation, and that the dynamics of social activity depend on events and the current agenda. The authors describe the level of civic and political activity of youth online based on sociological data. They also divide the audience of the protest theatre according to the following models: leaders, activists, followers, and spectators. In general, the study reveals the status and details of the younger generation’s communication activity online, where communities form and the implications of linking actions emerge.


2020 ◽  
Vol 8 (6) ◽  
pp. 5326-5329

The current use of social media has created unprecedented amounts of social data, as it is a cheap and popular platform for sharing information. Nowadays, a large percentage of people rely on the material available on social networks when making choices (e.g. comments and suggestions about a subject or product). This ability to exchange knowledge with a large number of users has quickly prompted social spammers to exploit the trust network to distribute spam messages and promote personal forums, advertising, phishing, scams and so on. Identifying these spammers and spam material is a hot research topic, and while a large number of studies have recently been conducted to this end, the methodologies so far are barely able to identify spam reviews, and none of them demonstrates the value of each extracted feature type. In this study, we propose a machine learning-based spam detection system that determines whether or not a specific message in the dataset is spam using a set of machine learning algorithms. Four main feature types are used (user-behavioral, user-linguistic, review-behavioral and review-linguistic) to improve the spam detection process and to gather reliable data.


Spam has become one of the growing issues on social media websites. Some of the users of these websites create spam news. On Twitter, users inject tweets into trending topics and reply with promotional messages containing links. A large amount of spam has been noticed on Twitter, and it is necessary to identify these spam tweets in a Twitter stream. Nowadays, a large share of people rely on content available on social media in their decisions, so detecting and deleting this spam is very important. A basic framework is suggested to detect malicious account holders on Twitter. At present, the methods for detecting these spam users or accounts are based on content-based features and graph-based features. The system to be created works with machine learning algorithms, which help to give accurate results. In this system, the Naïve Bayes classifier algorithm is going to be used. This algorithm is described as a family of methods relying upon Bayes' theorem, in which the methods share a common mode of working.
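To make the classifier mentioned in this abstract concrete, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one (Laplace) smoothing. It is not the system the abstract describes: the class name, the whitespace tokenization, and the four training messages are assumptions chosen only for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Set


class NaiveBayesText:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def __init__(self) -> None:
        self.class_counts: Counter = Counter()
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.vocab: Set[str] = set()

    def fit(self, texts: List[str], labels: List[str]) -> None:
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for token in text.lower().split():
                self.word_counts[label][token] += 1
                self.vocab.add(token)

    def predict(self, text: str) -> Optional[str]:
        total_docs = sum(self.class_counts.values())
        tokens = [t for t in text.lower().split() if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # log P(class) + sum of log P(token | class), per Bayes' theorem
            # under the assumption that tokens are conditionally independent.
            score = math.log(doc_count / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


if __name__ == "__main__":
    model = NaiveBayesText()
    model.fit(
        ["win free prize now", "free followers click link",
         "meeting at noon today", "lunch with the team"],
        ["spam", "spam", "ham", "ham"],
    )
    print(model.predict("free prize link"))
    print(model.predict("team meeting today"))
```

Tokens never seen in training are ignored at prediction time in this sketch; a production system would also need the content-based or graph-based features that the abstract mentions.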

