spam filter
Recently Published Documents


TOTAL DOCUMENTS

124
(FIVE YEARS 22)

H-INDEX

12
(FIVE YEARS 2)

Author(s):  
Syed Md. Minhaz Hossain ◽  
Iqbal H. Sarker

Spam emails have become a significant problem with the expanding use of the Internet, and the need to filter email is, to some extent, obvious. A spam filter is a system that detects undesired and malicious emails and blocks them from reaching users' inboxes. Spam filters check emails for anything "suspicious" in the text, email address, header, attachments, and language. In our proposed approach, we use different features, namely word2vec, word n-grams, character n-grams, and a combination of variable-length n-grams, for comparative analysis. Different machine learning models, namely support vector machine (SVM), decision tree (DT), logistic regression (LR), and multinomial naïve Bayes (MNB), are trained on the extracted features. We evaluate the experimental results using precision, recall, F1-score, and accuracy. Among the models, SVM achieves 97.6% accuracy, 98.8% precision, and 94.9% F1-score using a combination of n-gram features.
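
As a rough illustration of the feature families and metrics named in this abstract, the following sketch extracts word and character n-grams over a range of lengths and computes accuracy, precision, recall, and F1. The function names, the n-gram ranges, and the `w:`/`c:` key prefixes are assumptions made for this example; the abstract does not specify the authors' implementation.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the list of word n-grams of a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def char_ngrams(text, n):
    """Return the list of character n-grams of a lowercased text."""
    s = text.lower()
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def combined_ngram_counts(text, word_range=(1, 2), char_range=(2, 3)):
    """Count word and character n-grams over inclusive length ranges.

    Keys are prefixed with 'w:' or 'c:' (an assumption of this sketch) so
    the two feature families cannot collide in one feature dictionary.
    """
    counts = Counter()
    for n in range(word_range[0], word_range[1] + 1):
        counts.update("w:" + g for g in word_ngrams(text, n))
    for n in range(char_range[0], char_range[1] + 1):
        counts.update("c:" + g for g in char_ngrams(text, n))
    return counts


def binary_metrics(y_true, y_pred, positive="spam"):
    """Return accuracy, precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": correct / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}
```

The resulting count dictionaries could be fed to any of the classifiers listed in the abstract after conversion to vectors; that step is omitted here.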


Author(s):  
Arnold Adimabua Ojugo ◽  
David Ademola Oyemade

Advances in technology and the proliferation of mobile devices continue to make computing more ubiquitous, and the improved features of these devices make them a disruptive technology that aids information sharing among online users. The ease of use and adoption, mobility, and portability of mobile smartphones account for their acceptance and popularity. Smartphones continue to rely on the short message service (SMS), which creates an environment in which spamming thrives. Spam consists of unsolicited messages or inappropriate content. Studies of effective spam filters are limited because SMS messages are restricted to 140 bytes (160 characters) and are riddled with abbreviations and slang, which further inhibit the effective training of models. The study proposes a string-match algorithm used as a deep learning ensemble on a hybrid spam filtering technique to normalize noisy features, expand text, and use semantic dictionaries for disambiguation in order to train the underlying learning heuristics and effectively classify SMS messages into legitimate and spam classes. The study uses a profile hidden Markov network to select and train the network structure and employs a deep neural network as the classifier. The model achieves an accuracy of 97% with an error rate of 1.2%.
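
The normalization and text-expansion step described in this abstract can be pictured with a minimal sketch such as the one below. The slang dictionary, the function name, and the rule for squeezing repeated characters are hypothetical choices for illustration; the study's actual dictionaries and string-match algorithm are not given in the abstract.

```python
import re

# Hypothetical, hand-written slang dictionary for illustration only;
# the study's actual semantic dictionaries are not described here.
SLANG = {
    "u": "you",
    "ur": "your",
    "txt": "text",
    "pls": "please",
    "msg": "message",
    "b4": "before",
}


def normalize_sms(text, slang=SLANG):
    """Lowercase, squeeze repeated characters, and expand known abbreviations."""
    text = text.lower()
    # Collapse runs of three or more identical characters to two
    # ("freeee" -> "free"); a naive rule that can also alter valid strings.
    text = re.sub(r"(.)\1{2,}", r"\1\1", text)
    # Keep alphanumeric tokens only, dropping punctuation.
    tokens = re.findall(r"[a-z0-9]+", text)
    return " ".join(slang.get(tok, tok) for tok in tokens)
```

For example, `normalize_sms("Pls TXT ur msg 2 WIN freeee!!!")` yields `"please text your message 2 win free"`, which gives a downstream classifier fewer spelling variants to learn.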


Author(s):  
Manjit Jaiswal ◽  
Sukriti Das ◽  
Khushboo Khushboo

A spam filter is a program that identifies unwanted emails and prevents those messages from reaching a user's inbox. This study focuses on how such algorithms can be applied to a collection of e-mails containing both ham and spam. First, we discuss the working principle and implementation steps of stop-word removal, TF-IDF, and stemming on NVIDIA's Tesla P100 GPU, and we verify the findings by running the Naïve Bayes algorithm. After training and testing on a spam e-mail dataset taken from Kaggle, the proposed method reached a training accuracy of 99.67% and a testing accuracy of about 99.03% on the multicore GPU, which also shortened training and testing time. These accuracies are about 0.22 and 0.18 percentage points higher, respectively, than those obtained by applying only Naïve Bayes (the conventional method) to the same dataset, which gave training and testing accuracies of 99.45% and 98.85%. Training on the GPU took 1.361 seconds, about 1.49X faster than the 2.029 seconds taken on the CPU, and testing on the GPU took 1.978 seconds, about 1.15X faster than the 2.280 seconds taken on the CPU.
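
The preprocessing steps named in this abstract (stop-word removal, stemming, TF-IDF) can be sketched on the CPU in plain Python as follows. This is not the authors' GPU implementation: the stop-word list is a tiny illustrative sample, `crude_stem` is a stand-in for a real stemmer, and the TF-IDF variant (term frequency normalized by document length, natural-log inverse document frequency) is one common choice assumed for this example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much larger.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "for", "you"}


def crude_stem(word):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, and stem the remaining tokens."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def tfidf(documents):
    """Return one {term: weight} dict per document.

    Uses tf = count / document length and idf = ln(N / df), one of
    several common TF-IDF variants.
    """
    docs = [preprocess(d) for d in documents]
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (c / len(doc)) * math.log(n_docs / df[term])
            for term, c in counts.items()
        })
    return weights
```

A term that occurs in every document receives weight zero under this variant, which is why common words contribute little to the classifier. The speedups quoted in the abstract follow directly from the reported timings: 2.029 / 1.361 ≈ 1.49 for training and 2.280 / 1.978 ≈ 1.15 for testing.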


Author(s):  
Denis Aleksandrovich Kiryanov

The subject of this research is the development of an expert system architecture for a distributed content aggregation system, the main purpose of which is the categorization of aggregated data. The author examines the advantages and disadvantages of expert systems, the toolsets for developing them, their classification, and their application to data categorization. Special attention is given to the architecture of the proposed expert system, which consists of a spam filter, a component that determines the main category for each type of processed content, and two components that determine subcategories: one is based on domain rules, and the other uses machine learning methods and complements the first. The author concludes that an expert system can be effectively applied to data categorization problems in content aggregation systems, and establishes that hybrid solutions, which combine a knowledge base and rules with neural networks, reduce the cost of the expert system. The novelty of this research lies in the proposed architecture, which is easily extensible and adapts to workloads by scaling existing modules or adding new ones. The proposed spam detection module adapts a behavioral algorithm for detecting spam in emails. The proposed module for determining the key categories of content uses two types of algorithms: fuzzy fingerprints and Twitter topic fuzzy fingerprints, the latter initially applied to the categorization of messages on the social network Twitter. The module that determines subcategories based on keywords works in interaction with the thesaurus database. The latter classifier uses the reference vector algorithm for the final determination of subcategories.


2020 ◽  
Vol 7 (4) ◽  
pp. 91-95
Author(s):  
Venkata RamiReddy Chirra ◽  
Hoolda Daniel Maddiboyina ◽  
Yakobu Dasari ◽  
Ranganadhareddy Aluru

Spam arrives in email inboxes for purposes such as advertising, collecting personal information, or delivering malware through websites or scripts. Most often, spammers send junk mail with the intention of committing email fraud. Today spam accounts for 45% of all email, so there is an ever-increasing need to build efficient spam filters that identify and block it. However, the spam filters in use today are built with traditional approaches such as statistical and content-based techniques. These techniques do not improve their performance when handling huge volumes of data, they require considerable domain expertise and human intervention, and they consider only the occurrence of a word while neglecting the relations between words in context. To address these limitations, we developed a spam filter using deep neural networks. In this work, various deep neural networks, namely RNN, LSTM, GRU, Bidirectional RNN, Bidirectional LSTM, and Bidirectional GRU, are used to build the spam filter. The experiments were carried out on two datasets: the 20 Newsgroups dataset, which contains 20,000 documents in multiple classes, and ENRON, a dataset containing 5,000 emails. The custom-designed models performed well on both benchmark datasets and attained greater accuracy.


Due to the current COVID-19 pandemic, the world has moved online, and online communication has increased, along with the exchange of information and the sharing of useful data through email and other social media. Addressing security issues therefore plays a vital role in computer security and should be given priority. A security check is needed to protect the inbox so that important information or emails do not end up in the spam box. In this paper, to improve filtering techniques, we adopt the Naïve Bayes approach to implement and enhance the email spam filter. The Bayes approach is efficient, accurate, and simple to implement in the proposed algorithm. The Bayes algorithm is used to verify the correct semantic information of the email and avoids the pass-to-pass approach if the incoming mail is important. The proposed algorithm is developed in Python.
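
A minimal sketch of a Naïve Bayes spam filter in Python is shown below. It is a generic multinomial Naïve Bayes with add-one (Laplace) smoothing and whitespace tokenization, written for illustration; the class and method names are assumptions, and the abstract does not state which Naïve Bayes variant or tokenization the authors use.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSpamFilter:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        """Count documents per class and word occurrences per class."""
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, text):
        """Return the unnormalized log posterior for each class."""
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in text.lower().split():
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        """Return the class with the highest log posterior."""
        scores = self.log_scores(text)
        return max(scores, key=scores.get)
```

A short usage example on made-up training messages:

```python
nb = NaiveBayesSpamFilter().fit(
    ["win free money now", "free prize claim now",
     "meeting agenda for monday", "lunch on monday"],
    ["spam", "spam", "ham", "ham"],
)
print(nb.predict("free money"))      # spam
print(nb.predict("monday meeting"))  # ham
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together.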


IET Networks ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 338-347
Author(s):  
Ko-Tsung Chu ◽  
Hua-Ting Hsu ◽  
Jyh-Jian Sheu ◽  
Wei-Pang Yang ◽  
Cheng-Chi Lee