spam filter Latest Research Papers

Content-based Spam Email Detection Using N-gram Machine Learning Approach

10.20944/preprints202109.0236.v1 ◽

2021 ◽

Author(s):

Syed Md. Minhaz Hossain ◽

Iqbal H. Sarker

Keyword(s):

Machine Learning ◽

Support Vector ◽

Learning Approach ◽

Naïve Bayes ◽

Spam Filter ◽

Email Address ◽

Machine Learning Approach ◽

Spam Filters ◽

N Gram ◽

Machine Learning Models

Recently, spam emails have become a significant problem with the expanding usage of the Internet, so filtering emails is, to some extent, an obvious need. A spam filter is a system that detects undesired and malicious emails and blocks them from reaching users' inboxes. Spam filters check emails for anything "suspicious" in the text, email address, header, attachments, and language. In our proposed approach, we use different features, such as word2vec, word n-grams, character n-grams, and a combination of variable-length n-grams, for comparative analysis. Different machine learning models, namely support vector machine (SVM), decision tree (DT), logistic regression (LR), and multinomial naïve Bayes (MNB), are trained on the extracted features. We use precision, recall, F1-score, and accuracy to evaluate the experimental results. Among the models, SVM achieves 97.6% accuracy, 98.8% precision, and a 94.9% F1-score using a combination of n-gram features.
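
The word and character n-gram features named in this abstract can be illustrated with a minimal sketch. The function names, the `w1:`/`c3:` feature prefixes, and the default n-gram lengths are assumptions made for illustration; the abstract does not specify the authors' tokenization or which lengths they combined.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of a lowercased, whitespace-tokenized text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return the character n-grams of a lowercased text (spaces included)."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def combined_features(text, word_ns=(1, 2), char_ns=(3,)):
    """Count n-grams of several lengths in one feature bag.

    Prefixes such as "w2:" keep word and character n-grams apart.
    The default lengths are illustrative, not the paper's settings.
    """
    feats = Counter()
    for n in word_ns:
        feats.update(f"w{n}:{g}" for g in word_ngrams(text, n))
    for n in char_ns:
        feats.update(f"c{n}:{g}" for g in char_ngrams(text, n))
    return feats
```

A feature bag like this would typically be turned into a numeric vector before being passed to a classifier such as an SVM.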

Boyer Moore string-match framework for a hybrid short message service spam filtering technique

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v10.i3.pp519-527 ◽

2021 ◽

Vol 10 (3) ◽

pp. 519

Author(s):

Arnold Adimabua Ojugo ◽

David Ademola Oyemade

Keyword(s):

Network Structure ◽

Text Message ◽

Disruptive Technology ◽

Spam Filtering ◽

Markov Network ◽

String Match ◽

Short Text ◽

Spam Filter ◽

Message Service ◽

Filtering Technique

Advances in technology and the proliferation of mobile devices continue to extend the ubiquity of computing, and the improved features of these devices make them a disruptive technology that aids information sharing among online users. The ease of use and adoption, mobility, and portability of mobile smartphones account for their acceptance and popularity. Smartphones continue to rely on short message services, which creates an environment in which spamming thrives. Spam consists of unsolicited messages or inappropriate content. Studies of effective spam filters are limited because short message service (SMS) texts are restricted to 140 bytes, or 160 characters, and are riddled with abbreviations and slang, which further inhibits the effective training of models. The study proposes a string-match algorithm used as a deep learning ensemble within a hybrid spam filtering technique to normalize noisy features, expand text, and use semantic disambiguation dictionaries to train the underlying learning heuristics and effectively classify SMS messages into legitimate and spam classes. The study uses a profile hidden Markov network to select and train the network structure and employs a deep neural network as the classifier. The model achieves an accuracy of 97% with an error rate of 1.2%.
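
The abstract names Boyer-Moore string matching but gives no implementation details. As a rough illustration of the idea, the sketch below uses the Horspool simplification of Boyer-Moore, which keeps only the bad-character shift table. It is not the authors' framework, and the function name and return convention are assumptions.

```python
def horspool_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1.

    Horspool simplification of Boyer-Moore: after each failed comparison,
    the window is shifted according to the text character aligned with the
    last position of the pattern.
    """
    m, n = len(pattern), len(text)
    if m == 0:
        return 0  # assumption: an empty pattern matches at position 0
    if m > n:
        return -1
    # Shift for each pattern character except the last: the distance from
    # its rightmost occurrence to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    i = 0
    while i <= n - m:
        if text[i:i + m] == pattern:
            return i
        # Characters absent from the table allow a full shift of m.
        i += shift.get(text[i + m - 1], m)
    return -1
```

For example, `horspool_search("win a free prize now", "free")` locates the keyword without comparing against every position in the message.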

Detecting spam e-mails using stop word TF-IDF and stemming algorithm with Naïve Bayes classifier on the multicore GPU

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v11i4.pp3168-3175 ◽

2021 ◽

Vol 11 (4) ◽

pp. 3168

Author(s):

Manjit Jaiswal ◽

Sukriti Das ◽

Khushboo Khushboo

Keyword(s):

Naive Bayes ◽

Naïve Bayes ◽

Testing Time ◽

Training Time ◽

Spam Filter ◽

Time Period ◽

Stop Word ◽

Working Principle ◽

Testing Accuracy ◽

Bayes Algorithm

A spam filter is a program that identifies unwanted emails and prevents those messages from reaching a user's mailbox. The study focuses on how the algorithms can be applied to a collection of e-mails containing both ham and spam. First, the working principle and implementation steps of stop-word removal, TF-IDF, and a stemming algorithm on NVIDIA's Tesla P100 GPU are discussed, and the findings are verified by running the Naïve Bayes algorithm. After complete training and testing on a spam e-mail dataset taken from Kaggle, the proposed method achieved a training accuracy of 99.67% and a testing accuracy of about 99.03% on the multicore GPU, which also reduced the training and testing time. This is an improvement of about 0.22% and 0.18% in training and testing accuracy, respectively, over applying only Naïve Bayes (the conventional method) to the same dataset, which gave training and testing accuracies of 99.45% and 98.85%. Training took 1.361 seconds on the GPU, about 1.49X faster than the 2.029 seconds taken on the CPU, and testing took 1.978 seconds on the GPU, about 1.15X faster than the 2.280 seconds taken on the CPU.
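
The preprocessing steps listed in this abstract (stop-word removal, stemming, TF-IDF weighting) can be sketched on the CPU as follows. The stop-word list, the suffix-stripping stemmer, and all function names are simplified assumptions for illustration; the paper's GPU implementation and its actual stemming algorithm are not described here.

```python
import math
from collections import Counter

# A tiny illustrative stop-word list, not the one used in the paper.
STOP_WORDS = {"a", "the", "is", "to", "and", "of", "you", "your"}


def crude_stem(word):
    """Strip one common suffix; a stand-in for a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, strip basic punctuation, drop stop words, then stem."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    return [crude_stem(t) for t in tokens if t and t not in STOP_WORDS]


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf is the relative term frequency in the document and
    idf is log(N / df), with no smoothing.
    """
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return vectors
```

With unsmoothed idf, a term that occurs in every document receives weight zero, which is why very common words contribute little to the resulting vectors.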

Experiment Research on Spam Filter Classifier Based on Naive Bayesian Algorithm

10.1109/icaa53760.2021.00146 ◽

2021 ◽

Author(s):

Teng Lv ◽

Ping Yan ◽

Hongwu Yuan ◽

Weimin He

Keyword(s):

Experiment Research ◽

Bayesian Algorithm ◽

Naive Bayesian ◽

Spam Filter ◽

Naïve Bayesian

A COMPARISON OF MACHINE LEARNING ALGORITHMS FOR INDIAN YOUTUBE SPAM FILTER

Journal of Research in Engineering and Applied Sciences ◽

10.46565/jreas.2021.v06i02.003 ◽

2021 ◽

Vol 6 (2) ◽

pp. 70-74

Author(s):

Hasan Asif ◽

Masood Anzar

Keyword(s):

Machine Learning ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Spam Filter

Hybrid categorical expert system for the use in content aggregation

Программные системы и вычислительные методы ◽

10.7256/2454-0714.2021.4.37019 ◽

2021 ◽

pp. 1-22

Author(s):

Denis Aleksandrovich Kiryanov

Keyword(s):

Expert System ◽

Expert Systems ◽

Main Category ◽

Aggregated Data ◽

Advantages And Disadvantages ◽

Spam Filter ◽

Hybrid Solutions ◽

The Cost ◽

Use Of Knowledge

The subject of this research is the development of an expert system architecture for a distributed content aggregation system, the main purpose of which is the categorization of aggregated data. The author examines the advantages and disadvantages of expert systems, the toolsets for developing them, their classification, and their application to the categorization of data. Special attention is given to the architecture of the proposed expert system, which consists of a spam filter, a component that determines the main category for each type of processed content, and components that determine subcategories, one of which is based on domain rules, while the other uses machine learning methods and complements the first. The conclusion is made that an expert system can be applied effectively to data categorization problems in content aggregation systems. The author establishes that hybrid solutions, which combine an approach based on a knowledge base and rules with neural networks, reduce the cost of the expert system. The novelty of this research lies in the proposed architecture of the system, which is easily extensible and adaptable to workloads by scaling existing modules or adding new ones. The proposed spam detection module adapts the behavioral algorithm for detecting spam in emails. The proposed module for determining the key categories of content uses two types of algorithms: fuzzy fingerprints and Twitter topic fuzzy fingerprints, the latter initially applied to the categorization of messages in the social network Twitter. The module that determines subcategories from keywords works in interaction with the thesaurus database. The latter classifier uses the reference vector algorithm for the final determination of subcategories.

Performance Evaluation of Email Spam Text Classification Using Deep Neural Networks

Review of Computer Engineering Studies ◽

10.18280/rces.070403 ◽

2020 ◽

Vol 7 (4) ◽

pp. 91-95

Author(s):

Venkata RamiReddy Chirra ◽

Hoolda Daniel Maddiboyina ◽

Yakobu Dasari ◽

Ranganadhareddy Aluru

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Personal Information ◽

Spam Filter ◽

Huge Data ◽

Junk Mail ◽

Spam Filters ◽

Benchmark Datasets ◽

Domain Expertise ◽

Traditional Approaches

Spam arrives in the email inbox as advertising, as an attempt to collect personal information, or as a way to deliver malware through websites or scripts. Most often, spammers send junk mail with the intention of committing email fraud. Today spam accounts for 45% of all email, and hence there is an ever-increasing need to build efficient spam filters that identify and block it. However, the spam filters in use today are built with traditional approaches such as statistical and content-based techniques. These techniques do not improve their performance when handling huge data, they need a lot of domain expertise and human intervention, and they consider only the occurrence of a word while neglecting the relation between words in context. To address these limitations, we developed a spam filter using deep neural networks. In this work, various deep neural networks, namely RNN, LSTM, GRU, Bidirectional RNN, Bidirectional LSTM, and Bidirectional GRU, are used to build a spam filter. The experimentation was carried out on two datasets: the 20 newsgroup dataset, which contains 20,000 documents in multiple classes, and ENRON, a dataset containing 5,000 emails. The custom-designed models performed well on both benchmark datasets and attained greater accuracy.

Personalised Spam Filter for Social Networks Using Machine Learning Algorithms

Bioscience Biotechnology Research Communications ◽

10.21786/bbrc/13.14/87 ◽

2020 ◽

Vol 13 (14) ◽

pp. 376-380

Author(s):

Mohammed Husain

Keyword(s):

Machine Learning ◽

Social Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Spam Filter

Naïve Bayes Filter for Communication & Enhancing Semantic in Email

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.d4904.119420 ◽

2020 ◽

Vol 9 (4) ◽

pp. 282-288

Keyword(s):

Computer Security ◽

Information Exchange ◽

Naive Bayes ◽

Online Communication ◽

Vital Role ◽

Naïve Bayes ◽

Security Issues ◽

Spam Filter ◽

Python Language ◽

Bayes Algorithm

Due to the current COVID-19 pandemic, the world has moved online, with an increase in online communication and, with it, in information exchange and the sharing of useful data through emails and social media. Addressing security issues therefore plays a vital role in computer security and should be a priority. We need a security check on the inbox so that important information or emails do not end up in the spam box. In this paper, to improve filtering techniques, we adopt the Naïve Bayes approach to implement and enhance the spam filter in email. The Bayes approach is efficient, accurate, and simple to implement in the proposed algorithm. The Bayes algorithm is used to verify the correct semantic information of the email and avoids the pass-to-pass approach if the incoming mail is important. The Python language is used to develop the proposed algorithm.
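
Since the abstract states that the filter is written in Python and based on Naïve Bayes, a minimal multinomial Naïve Bayes classifier with Laplace smoothing is sketched below for illustration. The class name, method names, whitespace tokenization, and smoothing value are assumptions; this is not the authors' algorithm, and it assumes at least one training message per class.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes over whitespace tokens (illustrative only)."""

    LABELS = ("spam", "ham")

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing constant (assumed value)
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()
        self.vocab = set()

    def train(self, text, label):
        """Add one labeled message to the model's counts."""
        tokens = text.lower().split()
        self.word_counts[label].update(tokens)
        self.doc_counts[label] += 1
        self.vocab.update(tokens)

    def log_score(self, text, label):
        """Return log P(label) + sum of log P(token | label)."""
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        denom = total_words + self.alpha * len(self.vocab)
        for tok in text.lower().split():
            if tok in self.vocab:  # tokens never seen in training are skipped
                count = self.word_counts[label][tok]
                score += math.log((count + self.alpha) / denom)
        return score

    def predict(self, text):
        """Return the label with the higher log score."""
        return max(self.LABELS, key=lambda label: self.log_score(text, label))
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied together.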

Effective spam filter based on a hybrid method of header checking and content parsing

IET Networks ◽

10.1049/iet-net.2019.0191 ◽

2020 ◽

Vol 9 (6) ◽

pp. 338-347

Author(s):

Ko-Tsung Chu ◽

Hua-Ting Hsu ◽

Jyh-Jian Sheu ◽

Wei-Pang Yang ◽

Cheng-Chi Lee

Keyword(s):

Hybrid Method ◽

Spam Filter
