Deep Learning for Image Spam Detection

2019 ◽  
Author(s):  
Tazmina Sharmin
2021 ◽  
pp. 1-34
Author(s):  
Kadam Vikas Samarthrao ◽  
Vandana M. Rohokale

Email remains an essential part of our lives and a primary means of communication on the internet. A persistent challenge is that spam emails consume a large amount of storage space and bandwidth. A further weakness of state-of-the-art spam filtering methods, the misclassification of genuine emails as spam (false positives), is a growing problem for the internet world. The literature provides various algorithms for email spam classification, which differ in the classification technique they use. This paper aims to develop a novel spam detection model for improved cybersecurity. The proposed model involves several phases: dataset acquisition, feature extraction, optimal feature selection, and detection. Initially, a benchmark email dataset containing both text and image data is collected. Next, feature extraction is performed using two sets of features: text features and visual features. For the text features, Term Frequency-Inverse Document Frequency (TF-IDF) is extracted. For the visual features, the color correlogram and Gray-Level Co-occurrence Matrix (GLCM) are determined. Since the extracted feature vector is long, an optimal feature selection step follows. The optimal feature selection is performed by a new meta-heuristic algorithm called Fitness Oriented Levy Improvement-based Dragonfly Algorithm (FLI-DA). Once the optimal features are selected, detection is performed by a hybrid learning technique composed of two deep learning approaches, a Recurrent Neural Network (RNN) and a Convolutional Neural Network (CNN). To improve on the performance of existing deep learning approaches, the number of hidden neurons of the RNN and CNN is optimized by the same FLI-DA. Finally, the optimized hybrid of CNN and RNN classifies the data into spam and ham. The experimental outcomes show the ability of the proposed method to perform spam email classification based on improved deep learning.
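The text-feature step named in this abstract can be illustrated with a minimal TF-IDF sketch. The abstract does not state which TF-IDF variant the authors use, so the weighting below (term count divided by document length, multiplied by an unsmoothed log inverse document frequency) is an assumption, and the function name `tf_idf` and the sample emails are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample messages, for illustration only.
emails = ["win free money now", "meeting agenda for monday", "free money offer"]
weights = tf_idf(emails)
shared = tf_idf(["a b", "a c"])
```

A term that occurs in every message gets weight zero under this variant, while a term confined to a single message is weighted more heavily than one shared by several messages.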


2017 ◽  
Vol 77 (11) ◽  
pp. 13249-13278 ◽  
Author(s):  
Amiza Amir ◽  
Bala Srinivasan ◽  
Asad I. Khan

2021 ◽  
Author(s):  
Wei-Bang Chen ◽  
Yongjin Lu ◽  
Zanyah Ailsworth ◽  
Xiaoliang Wang ◽  
Chengcui Zhang

Spam features are the distinctive characteristics associated with spam that are used to differentiate it from genuine messages. Each message m is processed by a feature extraction module that represents m as an n-dimensional feature vector x = (x1, x2, …, xn) containing n features. In text-based spam filters, for example, a feature can be a word, and a feature vector may be composed of various words extracted from spam. Each spam message is associated with one feature vector. Based on the characteristics discussed in the previous chapter, we will try to extract features from image spam that capture those characteristics, in order to build robust spam detection algorithms. These features are broadly classified into high-level metadata features; low-level image features such as color features, grayscale features, and texture-related features; and embedded-text-related features.
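As an illustration of how an image can be mapped to a feature vector x = (x1, x2, …, xn), the sketch below combines a normalized grayscale histogram with two texture statistics (contrast and energy) computed from a gray-level co-occurrence matrix. The choice of these particular features, the horizontal pixel offset, and the names `glcm` and `feature_vector` are assumptions made for the example, not the feature set of the chapter. The image is assumed to be a rectangular list of rows holding integer gray levels in the range 0 to levels - 1.

```python
from collections import Counter


def glcm(image, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs offset by (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    # Avoid dividing by zero when the image has no valid pixel pairs.
    total = total or 1
    return [[v / total for v in row] for row in counts]


def feature_vector(image, levels):
    """Grayscale histogram followed by GLCM contrast and energy."""
    p = glcm(image, levels)
    contrast = sum(p[i][j] * (i - j) ** 2
                   for i in range(levels) for j in range(levels))
    energy = sum(v * v for row in p for v in row)
    n_pixels = sum(len(row) for row in image)
    hist = Counter(v for row in image for v in row)
    histogram = [hist[g] / n_pixels for g in range(levels)]
    return histogram + [contrast, energy]


# Small 4x4 test image with four gray levels (illustrative data).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
x = feature_vector(image, levels=4)
```

Here x has n = 6 entries: four histogram bins followed by contrast and energy. A real filter would append further entries, such as metadata or embedded-text features, to the same vector.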


To understand the never-ending fight between developers of spam detection techniques and spammers, it helps to know the history of spam mail. On May 3, 1978, Gary Thuerk, a marketing manager at Digital Equipment Corporation, sent his first mass email to more than 400 customers over the Arpanet in order to promote and sell Digital's new T-Series of VAX systems (Streitfeld, 2003). In this regard, he said, “It's too much work to send everyone an e-mail. So we'll send one e-mail to everyone”. He said with pride, “I was the pioneer. I saw a new way of doing things.” Just as every coin has two sides, any technology can be used with good or bad intentions. At the time, Gary Thuerk could hardly have imagined that this method of sending mail would one day become an area of research. He ended up being crowned the father of spam mail rather than the father of e-marketing. Today, Thuerk's spiritual followers send 2.5 billion pieces of spam a day over the internet.

