Spam Detection on Social Media Using Semantic Convolutional Neural Network

Author(s):  
Gauri Jain ◽  
Manisha Sharma ◽  
Basant Agarwal

This article describes how spam detection in social media text is becoming increasingly important because of the exponential increase in spam volume over the network. It is challenging, especially for text limited to a small number of characters. Effective spam detection requires a larger number of efficient features to be learned. In the current article, a deep learning technique known as a convolutional neural network (CNN) is proposed for spam detection, with an added semantic layer on top of it. The resultant model is known as a semantic convolutional neural network (SCNN). The semantic layer is built by training random word vectors with Word2vec to obtain semantically enriched word embeddings. WordNet and ConceptNet are used to find a word similar to a given word when that word is missing from the word2vec vocabulary. The architecture is evaluated on two corpora: the SMS Spam dataset (UCI repository) and a Twitter dataset (tweets scraped from public live tweets). The authors' approach outperforms state-of-the-art results, with 98.65% accuracy on the SMS Spam dataset and 94.40% accuracy on the Twitter dataset.
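
The fallback described in this abstract (use a similar word when the original is missing from the embedding vocabulary) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the `embeddings` table stands in for trained Word2vec vectors, the `related_words` table is a hypothetical stand-in for a WordNet or ConceptNet lookup, and the zero vector used when no similar word is found is an assumption the abstract does not specify.

```python
from typing import Dict, List, Optional, Sequence

Vector = List[float]


def lookup_with_fallback(
    word: str,
    embeddings: Dict[str, Vector],
    related_words: Dict[str, Sequence[str]],
) -> Optional[Vector]:
    """Return the vector for `word`, or for the first related word that has one."""
    key = word.lower()
    if key in embeddings:
        return embeddings[key]
    # Hypothetical stand-in for a WordNet/ConceptNet query: try each
    # related word in order until one is found in the vocabulary.
    for candidate in related_words.get(key, ()):
        if candidate in embeddings:
            return embeddings[candidate]
    return None


def embed_message(
    tokens: Sequence[str],
    embeddings: Dict[str, Vector],
    related_words: Dict[str, Sequence[str]],
    dim: int,
) -> List[Vector]:
    """Map each token to a vector; unknown tokens get a zero vector (assumption)."""
    rows: List[Vector] = []
    for token in tokens:
        vec = lookup_with_fallback(token, embeddings, related_words)
        rows.append(vec if vec is not None else [0.0] * dim)
    return rows


# Toy data, invented purely for illustration.
toy_embeddings = {"free": [1.0, 0.0], "prize": [0.0, 1.0]}
toy_related = {"reward": ["award", "prize"]}
matrix = embed_message(["Free", "reward", "xyzzy"], toy_embeddings, toy_related, dim=2)
```

Here "reward" is absent from the toy vocabulary, so it borrows the vector of "prize", while "xyzzy" has no related entry and falls back to zeros. The resulting list of rows is the kind of message matrix a CNN would then convolve over.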



2019 ◽  
Vol 8 (2S11) ◽  
pp. 3464-3468

Psychological stress, which is a mental illness, also causes physical problems. Nowadays, social media plays an important role in communication, letting people share their thoughts with friends and family. Social media analysis is the process of detecting and predicting users' thoughts and opinions, which is also an important perspective in the developing business environment. Overwhelming and long-term stress sometimes leads to suicidal ideation. Analyzing social media content to predict a user's overwhelming stress state at an early stage can help reduce both psychological stress and the suicide rate. In this paper, we address the problem of stress prediction using social media. Machine learning and deep learning methods are used to perform the stress classification. Both image and text tweet data are used: the images are processed with Optical Character Recognition, and the text data are processed using Natural Language Processing and a Convolutional Neural Network to classify the user's tweet content as stressed or non-stressed. Furthermore, advances in machine learning and deep learning classification methods give better results in terms of performance and prediction accuracy.


2021 ◽  
Author(s):  
Mayank Mishra ◽  
Tanupriya Choudhury ◽  
Tanmay Sarkar

In our work, we look to classify images that make their way onto our smartphones through various social media and text-messaging platforms. We aim to classify images into three broad categories: document-based images, quote-based images, and photographs. People, especially students, share many document-based images, including snapshots of essential emails, handwritten notes, articles, etc. Quote-based images, consisting of birthday wishes, motivational messages, festival greetings, etc., are among the most highly shared images on social media platforms. A significant share of images consists of photographs of people, including group photographs, selfies, portraits, etc. We train various convolutional neural network (CNN) based models on our self-made dataset and compare their results to find the optimal model for our task.


2021 ◽  
Vol 2137 (1) ◽  
pp. 012056
Author(s):  
Hongli Ma ◽  
Fang Xie ◽  
Tao Chen ◽  
Lei Liang ◽  
Jie Lu

Convolutional neural networks are a very important research direction in deep learning. Based on the current development of convolutional networks, this paper summarizes convolutional neural networks. First, it summarizes the development process of convolutional neural networks; then it introduces their structure and some typical convolutional neural networks. Finally, several examples of deep learning applications are introduced.


2020 ◽  
Vol 29 (05) ◽  
pp. 2050014
Author(s):  
Anupam Jamatia ◽  
Steve Durairaj Swamy ◽  
Björn Gambäck ◽  
Amitava Das ◽  
Swapan Debbarma

Sentiment analysis is a circumstantial analysis of text, identifying the social sentiment to better understand the source material. The article addresses sentiment analysis of an English-Hindi and English-Bengali code-mixed textual corpus collected from social media. Code-mixing is an amalgamation of multiple languages, which was previously associated mainly with spoken language. However, social media users also deploy it to communicate in ways that tend to be somewhat casual. The coarse nature of social media text poses challenges for many language processing applications. Here, the focus is on the low predictive power of traditional machine learners when compared to their Deep Learning counterparts, including the contextual language representation model BERT (Bidirectional Encoder Representations from Transformers), on the task of extracting user sentiment from code-mixed texts. Three deep learners (a BiLSTM CNN, a Double BiLSTM and an Attention-based model) attained accuracy 20–60% greater than traditional approaches on code-mixed data, and were also tested on monolingual English data for comparison.


2019 ◽  
Vol 10 (11) ◽  
pp. 3313-3325
Author(s):  
Rong Xiang ◽  
Qin Lu ◽  
Ying Jiao ◽  
Yufei Zheng ◽  
Wenhao Ying ◽  
...  

Affective analysis of social media text is in great demand. Online text written in Chinese communities often contains mixed scripts, including major text written in Chinese, an ideograph-based writing system, and minor text using Latin letters, an alphabet-based writing system. This phenomenon is referred to as writing systems changes (WSCs). Past studies have shown that WSCs often reflect unfiltered immediate affections. However, the use of WSCs poses more challenges in Natural Language Processing tasks because WSCs can break the syntax of the major text. In this work, we use WSCs as an effective feature in a hybrid deep learning model with an attention network. The WSC scripts are first identified by their encoding range. Then, the document representation of the text is learned through a Long Short-Term Memory model, and the minor text is learned by a separate Convolutional Neural Network model. To further highlight the WSC components, an attention mechanism is adopted to re-weight the feature vector before the classification layer. Experiments show that the proposed hybrid deep learning method, which better incorporates WSC features, can further improve performance compared to state-of-the-art classification models. The experimental results indicate that WSCs can serve as effective information in affective analysis of social media text.

