spam filters
Recently Published Documents


TOTAL DOCUMENTS

76
(FIVE YEARS 15)

H-INDEX

7
(FIVE YEARS 1)

2021 ◽  
Vol 21 (4) ◽  
pp. 1-27
Author(s):  
Di Wu ◽  
Wei Shi ◽  
Xiangyu Ma

As one of the most pervasive current modes of communication, email needs to be fast and reliable. However, spammers and attackers use it as a primary channel to conduct illegal activities. Although many approaches have been developed and evaluated for spam detection, they do not provide sufficient accuracy. This deficiency results in significant economic losses for organizations. In this article, we first propose a framework for creating novel spam filters using Keras to combine a Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM) classification models. We then use this framework to introduce a specific solution applicable to realistic scenarios involving dynamic incoming email data in real-time. This solution takes the form of a real-time content-based spam classifier. We evaluate its performance concerning accuracy, precision, recall, false-positive, and false-negative rates. Our experimental results show that our approach can significantly outperform existing solutions for real-time spam detection.
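The five evaluation measures named in this abstract can all be derived from the four cells of a confusion matrix. The following is a minimal sketch, assuming "spam" is treated as the positive class; the names `ConfusionCounts`, `count_outcomes`, and `evaluate` are illustrative and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int = 0  # spam correctly flagged as spam
    fp: int = 0  # ham wrongly flagged as spam
    tn: int = 0  # ham correctly let through
    fn: int = 0  # spam wrongly let through


def count_outcomes(y_true, y_pred, positive="spam"):
    """Tally the confusion-matrix cells for a binary spam/ham task."""
    counts = ConfusionCounts()
    for truth, guess in zip(y_true, y_pred):
        if guess == positive:
            if truth == positive:
                counts.tp += 1
            else:
                counts.fp += 1
        else:
            if truth == positive:
                counts.fn += 1
            else:
                counts.tn += 1
    return counts


def ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def evaluate(c):
    """Compute the measures listed in the abstract from the counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": ratio(c.tp + c.tn, total),
        "precision": ratio(c.tp, c.tp + c.fp),
        "recall": ratio(c.tp, c.tp + c.fn),
        "false_positive_rate": ratio(c.fp, c.fp + c.tn),
        "false_negative_rate": ratio(c.fn, c.fn + c.tp),
    }


# Hypothetical labels, for illustration only.
y_true = ["spam", "spam", "ham", "ham", "ham"]
y_pred = ["spam", "ham", "ham", "spam", "ham"]
print(evaluate(count_outcomes(y_true, y_pred)))
```

For a spam filter, the false-positive rate is often the most sensitive figure, since it measures how much legitimate mail is wrongly blocked.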


2021 ◽  
Vol 38 (5) ◽  
pp. 1413-1421
Author(s):  
Vallamchetty Sreenivasulu ◽  
Mohammed Abdul Wajeed

Image-based spam emails readily evade text-based spam filters, and more and more spammers are adopting the technique. Recognizing image content requires examining the substance of the email itself. Web-based social networking is a means of communication between information owners and end users, with online exchanges that carry social network data in the form of images and text. Information now reaches users faster through social networks, and the spread of fraudulent material on these networks has become a major issue. It is critical to assess and decide which features filters require to combat spammers. Spammers also embed text in images, causing text filters to fail. The detection of visual junk content has become a research hotspot for Internet spam filters. The suggested approach includes a supplementary detection engine that uses visual as well as text input. This paper proposes a system that assesses information, detects fraudulent emails, and prevents their distribution to end users, with the aim of enhancing data protection and preventing security problems. The proposed model uses machine learning and Convolutional Neural Network (CNN) methods to recognize fraudulent information and prevent it from being transmitted to end users.


Author(s):  
Syed Md. Minhaz Hossain ◽  
Iqbal H. Sarker

Recently, spam emails have become a significant problem with the expanding usage of the Internet. Filtering emails is, to some extent, an obvious need. A spam filter is a system that detects undesired and malicious emails and blocks them from reaching users' inboxes. Spam filters check emails for anything "suspicious" in the text, email address, header, attachments, and language. In our proposed approach, we use different features such as word2vec, word n-grams, character n-grams, and a combination of variable-length n-grams for comparative analysis. Different machine learning models, namely support vector machine (SVM), decision tree (DT), logistic regression (LR), and multinomial naïve Bayes (MNB), are trained on the extracted features. We use precision, recall, F1-score, and accuracy to evaluate the experimental results. Among the models, SVM achieves 97.6% accuracy, 98.8% precision, and a 94.9% F1-score using a combination of n-gram features.
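To make the n-gram features mentioned above concrete, here is a minimal sketch of word n-grams, character n-grams, and a combined variable-length feature count. The whitespace tokenizer, the lowercasing, and the function names are assumptions made for illustration; the abstract does not describe the authors' actual preprocessing.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return consecutive runs of n whitespace-separated words."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return consecutive runs of n characters (spaces included)."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def variable_length_ngrams(text, n_min, n_max, unit="word"):
    """Count every n-gram with n_min <= n <= n_max in one feature bag."""
    extract = word_ngrams if unit == "word" else char_ngrams
    features = Counter()
    for n in range(n_min, n_max + 1):
        features.update(extract(text, n))
    return features


print(word_ngrams("Win a free prize", 2))
# ['win a', 'a free', 'free prize']
print(char_ngrams("spam", 3))
# ['spa', 'pam']
print(variable_length_ngrams("free free prize", 1, 2))
```

The resulting counts could then be passed to any of the classifiers listed in the abstract after conversion to a fixed-length vector.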


Author(s):  
Anushka Srivastava

As the world develops at a rapid pace, we have seen enormous growth in various sectors of technology. Networking has played a crucial part in the exchange of technological culture around the globe, and the Internet, as the main medium of network growth, has reached every aspect of our society. Today, most professional communication is done by email. Although email has proven to be an efficient, professional, and easy way of communicating, it also brings the disadvantage of unwanted bulk spam, which has been a critical concern for email users. It has become very difficult for spam filters to remove unwanted emails efficiently, since emails are now written in such a manner that no existing algorithm can predict spam with 100% accuracy. This paper deals with the Naive Bayesian classifier, a machine learning algorithm for anti-spam filtering, which gives satisfactory results by automatically constructing anti-spam filters with extended conduct. The performance of the Naive Bayes algorithm is reviewed through investigations of Spam ham csv datasets and is evaluated in terms of accuracy, recall, and precision on those datasets. The technique gives 96-97% accuracy and 89% precision on the investigated dataset. The results also highlight that the content of the email and the number of instances in the dataset have an apparent effect on the performance of the algorithm.
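As an illustration of how a Naive Bayes filter of this kind can work, the sketch below implements a small multinomial variant with add-one (Laplace) smoothing. It is not the paper's implementation: the class name `NaiveBayesSpamFilter`, the whitespace tokenizer, the smoothing constant, and the four training messages are all assumptions chosen for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes over word counts, with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing constant (assumed)
        self.class_docs = Counter()             # documents seen per class
        self.word_counts = defaultdict(Counter)  # word counts per class
        self.vocab = set()

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_docs[label] += 1
            for token in self.tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def log_score(self, text, label):
        """Log prior plus summed log likelihoods of the known words."""
        total_docs = sum(self.class_docs.values())
        score = math.log(self.class_docs[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denominator = total_words + self.alpha * len(self.vocab)
        for token in self.tokenize(text):
            if token not in self.vocab:
                continue  # words never seen in training carry no evidence
            count = self.word_counts[label][token]
            score += math.log((count + self.alpha) / denominator)
        return score

    def predict(self, text):
        return max(self.class_docs, key=lambda label: self.log_score(text, label))


# Hypothetical training messages, for illustration only.
texts = ["win free prize now", "free cash offer",
         "meeting agenda for monday", "lunch on monday"]
labels = ["spam", "spam", "ham", "ham"]
model = NaiveBayesSpamFilter().fit(texts, labels)
print(model.predict("free prize offer"))   # spam
print(model.predict("monday meeting"))     # ham
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.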


2021 ◽  
Vol 19 (2) ◽  
pp. 1926-1943
Author(s):  
Khan Farhan Rafat ◽  
Qin Xin ◽  
Abdul Rehman Javed ◽  
Zunera Jalil ◽  
...  

Spam is any form of annoying and unsought digital communication sent in bulk, and it may contain offensive content, viruses, and cyber-attacks. The voluminous increase in spam has necessitated developing more reliable and robust artificial intelligence-based anti-spam filters. Besides text, an email sometimes contains multimedia content such as audio, video, and images. However, text-centric email spam filtering employing text classification techniques remains today's preferred choice. In this paper, we show that text pre-processing techniques nullify the detection of malicious contents in an obscure communication framework. We use the Spamassassin corpus with and without text pre-processing and examine it using machine learning (ML) and deep learning (DL) algorithms to classify emails as ham or spam. The proposed DL-based approach consistently outperforms ML models. In the first stage, using pre-processing techniques, the long short-term memory (LSTM) model achieves the highest results of 93.46% precision, 96.81% recall, and 95% F1-score. In the second stage, without using pre-processing techniques, LSTM achieves the best results of 95.26% precision, 97.18% recall, and 96% F1-score. Results show the supremacy of DL algorithms over the standard ones in filtering spam. However, the effects are unsatisfactory for detecting encrypted communication for both forms of ML algorithms.
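The F1-scores quoted in this abstract can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. A short sketch follows; the function name `f1_score` is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures as reported in the abstract, in percent.
with_preprocessing = f1_score(93.46, 96.81)
without_preprocessing = f1_score(95.26, 97.18)

print(round(with_preprocessing, 1))     # 95.1
print(round(without_preprocessing, 1))  # 96.2
```

Both values round to the 95% and 96% stated in the abstract, so the three reported figures are mutually consistent.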


2020 ◽  
Vol 7 (4) ◽  
pp. 91-95
Author(s):  
Venkata RamiReddy Chirra ◽  
Hoolda Daniel Maddiboyina ◽  
Yakobu Dasari ◽  
Ranganadhareddy Aluru

Spam arrives in an inbox for purposes such as advertising, collecting personal information, or introducing malware through websites or scripts. Most often, spammers send junk mail with the intention of committing email fraud. Today spam accounts for 45% of all email, and hence there is an ever-increasing need to build efficient spam filters that identify and block it. However, the spam filters in use today are built with traditional approaches such as statistical and content-based techniques. These techniques do not improve their performance when handling huge volumes of data, and they require considerable domain expertise and human intervention. They also neglect the relation between words in context and consider only the occurrence of each word. To address these limitations, we developed a spam filter using deep neural networks. In this work, various deep neural networks such as RNN, LSTM, GRU, Bidirectional RNN, Bidirectional LSTM, and Bidirectional GRU are used to build a spam filter. The experimentation was carried out on two datasets: the 20 newsgroup dataset, which contains 20,000 documents in multiple classes, and ENRON, a dataset containing 5,000 emails. The custom-designed models performed well on both benchmark datasets and attained greater accuracy.
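The limitation described above, that occurrence-based techniques ignore the relation between words in context, can be seen in a few lines. In this sketch (the two sentences are invented for illustration), a bag-of-words representation cannot tell the messages apart, while the ordered token sequence that a recurrent model would consume can.

```python
from collections import Counter


def bag_of_words(text):
    """Word-occurrence counts; all ordering information is discarded."""
    return Counter(text.lower().split())


def token_sequence(text):
    """Ordered tokens, the kind of input a sequence model reads."""
    return text.lower().split()


a = "send money to claim prize"
b = "claim prize to send money"

print(bag_of_words(a) == bag_of_words(b))      # True: same word counts
print(token_sequence(a) == token_sequence(b))  # False: different order
```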


2020 ◽  
Author(s):  
Sriram Srinivasan ◽  
vinayakumar R ◽  
Sowmya V ◽  
Moez Krichen ◽  
Dhouha Ben Noureddine ◽  
...  

With the tremendous growth of the internet, cyberspace is facing several threats from attackers. Threats like spam emails account for 55% of total emails according to the Symantec monthly threat report. Over time, attackers moved on to image spam to evade text-based spam filters. To deal with this, researchers have proposed several machine learning and deep learning approaches that use features such as metadata, color, shape, and texture. However, Deep Convolutional Neural Network (DCNN) models and transfer learning-based pre-trained CNN models have not been explored much for image spam classification. Therefore, in this work, 2 DCNN models along with a few pre-trained ImageNet architectures such as VGG19 and Xception are trained on 3 different datasets. The effect of employing a cost-sensitive learning approach to handle data imbalance is also studied. Some of the proposed models achieve an accuracy of up to 99% with a zero false positive rate in the best case.
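One common way to realize cost-sensitive learning on imbalanced data is to weight each class inversely to its frequency, so that mistakes on the rare class cost more. The sketch below uses the weight N / (K * n_c), where N is the number of samples, K the number of classes, and n_c the size of class c. This particular weighting scheme is an assumption for illustration; the abstract does not state which cost scheme the authors used.

```python
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: N / (K * n_c) for each class c."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {c: n_samples / (n_classes * n) for c, n in counts.items()}


def weighted_error(y_true, y_pred, weights):
    """Share of total class weight that falls on misclassified samples."""
    total = sum(weights[t] for t in y_true)
    wrong = sum(weights[t] for t, p in zip(y_true, y_pred) if t != p)
    return wrong / total if total else 0.0


# Hypothetical imbalanced dataset: 90 ham images, 10 spam images.
y_true = ["ham"] * 90 + ["spam"] * 10
weights = class_weights(y_true)
print(weights["spam"])  # 5.0

# A model that always answers "ham" has a 10% unweighted error,
# but roughly half of the total weight is misclassified.
always_ham = ["ham"] * 100
print(round(weighted_error(y_true, always_ham, weights), 3))  # 0.5
```

With such weights, a trivial majority-class predictor no longer looks good, which is the point of the cost-sensitive setup.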


Spam emails, also known as non-self, are unsolicited commercial or fraudulent emails sent to a particular individual or company, or to a group of individuals. Machine learning algorithms are commonly used in the area of spam filtering. Considerable effort has gone into making ML classifiers more efficient at classifying emails as either ham (valid messages) or spam (unwanted messages). We may recognize the distinguishing features of the content of documents. Much important work in spam filtering cannot be adapted to varying conditions and problems because it is limited to certain domains. Our analysis contrasts the strengths and shortcomings of current ML methods and outlines open research challenges for spam filters. We suggest some recent deep learning approaches as potential tactics that can tackle the challenge of spam emails efficiently.

