Enhanced Bio-Inspired Algorithms for Detecting and Filtering Spam

Author(s):  
Hadj Ahmed Bouarara

The internet era promotes electronic commerce and facilitates access to many services. In today's digital society, the explosion in communication has revolutionized the field of electronic communication. Unfortunately, this technology has also become a major source of malicious activity, especially the plague of undesirable email (SPAM), which has grown tremendously in the last few years. This chapter presents new bio-inspired techniques (artificial social cockroaches [ASC], artificial haemostasis system [AHS], and artificial heart lungs system [AHLS]) and their application to SPAM detection. For the experiments, the authors used the benchmark SMS Spam corpus V.0.1 and the validation measures recall, precision, f-measure, entropy, accuracy, and error. They optimize the sensitive parameters of each algorithm (text representation technique, distance measure, weightings, and threshold). The results compare favorably with those of artificial social bees and machine-learning algorithms (decision tree C4.5 and K-means).
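As an illustration of the validation measures named above, the following is a minimal sketch, not the authors' evaluation code. It assumes binary labels "spam" and "ham" and uses the standard confusion-matrix definitions of precision, recall, f-measure, accuracy, and error. The function name `binary_metrics` is hypothetical, and entropy is omitted because the abstract does not say how it is computed.

```python
def binary_metrics(y_true, y_pred, positive="spam"):
    """Compute standard binary classification measures.

    Hypothetical helper for illustration; the label names are assumptions.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
        "error": 1.0 - accuracy,  # error rate is the complement of accuracy
    }


if __name__ == "__main__":
    truth = ["spam", "spam", "ham", "ham", "spam"]
    predicted = ["spam", "ham", "ham", "spam", "spam"]
    for name, value in binary_metrics(truth, predicted).items():
        print(f"{name}: {value:.3f}")
```

Treating spam as the positive class means a false positive is a legitimate message flagged as spam, which is usually the costlier mistake for a filter.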

2020
pp. 693-726
Author(s):
Hadj Ahmed Bouarara
Reda Mohamed Hamou
Abdelmalek Amine

The internet era promotes electronic commerce and facilitates access to many services. In today's digital society, the explosion in communication has revolutionized the field of electronic communication. Unfortunately, this technology has also become a major source of malicious activity, especially the plague of undesirable email (SPAM), which has grown tremendously in the last few years. This paper presents new bio-inspired techniques (artificial social cockroaches (ASC), artificial haemostasis system (AHS), and artificial heart lungs system (AHLS)) and their application to SPAM detection. For the experiments, the authors used the benchmark SMS Spam corpus V.0.1 and the validation measures recall, precision, f-measure, entropy, accuracy, and error. They optimized the sensitive parameters of each algorithm (text representation technique, distance measure, weightings, and threshold). The results compare favorably with those of artificial social bees and machine learning algorithms (decision tree C4.5 and K-means).


2016
Vol 9 (2)
pp. 47-77
Author(s):
Hadj Ahmed Bouarara
Reda Mohamed Hamou
Abdelmalek Amine

The internet era promotes electronic commerce and facilitates access to many services. In today's digital society, the explosion in communication has revolutionized the field of electronic communication. Unfortunately, this technology has also become a major source of malicious activity, especially the plague of undesirable email (SPAM), which has grown tremendously in the last few years. This paper presents new bio-inspired techniques (artificial social cockroaches (ASC), artificial haemostasis system (AHS), and artificial heart lungs system (AHLS)) and their application to SPAM detection. For the experiments, the authors used the benchmark SMS Spam corpus V.0.1 and the validation measures recall, precision, f-measure, entropy, accuracy, and error. They optimized the sensitive parameters of each algorithm (text representation technique, distance measure, weightings, and threshold). The results compare favorably with those of artificial social bees and machine learning algorithms (decision tree C4.5 and K-means).


2021
Author(s):
Simarjeet Kaur
Meenakshi Bansal
Ashok Kumar Bathla

Due to the rise in the use of messaging and mailing services, spam detection is of much greater importance than before. Classifying such communications efficiently is a comparatively onerous job. Spam can be defined as redundant or trash email that the addressee does not want in the inbox. After pre-processing and feature extraction, various machine learning algorithms were applied to the Spambase dataset from the UCI Machine Learning Repository in order to classify incoming emails into two categories: spam and non-spam. The paper uses random forest, naive Bayes, support vector machine (SVM), logistic regression, and k-nearest neighbors (KNN) to classify email spam messages, and compares the outcomes of these algorithms. The main goal of this study is to improve the prediction accuracy of spam email filters.


2019
Vol 8 (3)
pp. 4148-4153

The swift growth of spam email has escalated the need to upgrade existing spam detection and filtration methods. Several machine learning methods exist for the classification and detection of email spam, but they fall short in some cases. In this research work, ensemble methods are adapted to detect email spam. The Multinomial Naïve Bayes and J48 Decision Tree algorithms are considered and combined into ensembles using bagging and boosting. The experiments are conducted on the CSDMC2010 Spam corpus. The results for this dataset are evaluated using the individual classifiers and the bagging and boosting ensemble approaches. System performance is assessed in terms of precision, recall, f-measure, and accuracy. The experimental outcomes indicate distinct results for the detection of email spam using ensemble methods.


10.29007/qshd
2020
Author(s):
N Sutta
Z Liu
X Zhang

Although different techniques have been developed to filter spam, we are still overwhelmed with spam emails because spammers adapt rapidly to new spam detection techniques. Currently, machine learning techniques are the most effective way to classify and filter spam emails. This paper presents a comprehensive comparison and analysis of the performance of various classification models on the 2007 TREC Public Spam Corpus, with and without N-Grams and using separate or combined datasets. It shows that including N-Grams in the pre-processing phase yields high accuracy for the classification models in most cases, and that models using the split approach with combined datasets give better results than models using the separate dataset.
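To make the N-Gram pre-processing step concrete, here is a minimal sketch of word-level N-Gram extraction. It is an illustration only: the abstract does not state whether the authors use word or character N-Grams or how they tokenize, so lowercasing and whitespace tokenization are assumptions, as are the hypothetical names `word_ngrams` and `ngram_features`.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the list of word-level n-grams in `text`.

    Assumes lowercasing and whitespace tokenization (illustrative choices).
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    tokens = text.lower().split()
    # A sliding window of width n; yields nothing if the text is shorter than n.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Count all n-grams of order 1..max_n, as a bag-of-n-grams feature map."""
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(word_ngrams(text, n))
    return features


if __name__ == "__main__":
    print(word_ngrams("Win a free prize", 2))
    print(ngram_features("free free prize"))
```

Bigrams such as "free prize" keep some word order that single-word counts discard, which is one plausible reason N-Grams can help a classifier.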


2021
pp. 1-34
Author(s):
Kadam Vikas Samarthrao
Vandana M. Rohokale

Email remains an essential part of our lives and a means of better communication on the internet. One challenge is that spam emails occupy a large amount of space and bandwidth. Another rising challenge is a defect of state-of-the-art spam filtering methods: the misclassification of genuine emails as spam (false positives). The literature provides various algorithms for email spam classification, depending on the classification technique used. This paper aims to develop a novel spam detection model for improved cybersecurity. The proposed model involves several phases: dataset acquisition, feature extraction, optimal feature selection, and detection. First, a benchmark email dataset containing both text and image data is collected. Next, feature extraction is performed on two sets of features, text features and visual features. For the text features, Term Frequency-Inverse Document Frequency (TF-IDF) is extracted. For the visual features, the color correlogram and the Gray-Level Co-occurrence Matrix (GLCM) are determined. Because the extracted feature vector tends to be long, optimal feature selection is then carried out by a new meta-heuristic algorithm called Fitness Oriented Levy Improvement-based Dragonfly Algorithm (FLI-DA). Once the optimal features are selected, detection is performed by a hybrid learning technique composed of two deep learning approaches, a Recurrent Neural Network (RNN) and a Convolutional Neural Network (CNN). To improve on existing deep learning approaches, the number of hidden neurons of the RNN and CNN is optimized by the same FLI-DA. Finally, the optimized hybrid of CNN and RNN classifies the data into spam and ham. The experimental outcomes show the ability of the proposed method to classify spam email based on improved deep learning.
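The TF-IDF text features mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses one common variant (term frequency normalized by document length, multiplied by the unsmoothed log of the inverse document frequency), whitespace tokenization, and a hypothetical function name `tf_idf`. Other libraries and papers use smoothed or differently normalized variants.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        vectors.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


if __name__ == "__main__":
    corpus = ["free prize now", "meeting at noon", "free meeting"]
    for doc, vec in zip(corpus, tf_idf(corpus)):
        print(doc, "->", {t: round(w, 3) for t, w in vec.items()})
```

With this variant, a term that occurs in every document receives weight zero, so only terms that distinguish one message from another contribute to the feature vector.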

