E-Mail Spam Detection using Machine Learning and Deep Learning

Email is an essential means of communication in nearly every sector of day-to-day life. Given its significance, it is important to filter spam out of incoming mail. Spam, also known as junk email, consists of unwanted messages sent electronically in large quantities. Most spam is commercial in nature, and it is not only irritating but also harmful, because it can carry malicious scams, links to malware-hosting sites, or viruses attached to the message. In this paper, we identify spam emails and show how they can be distinguished from legitimate emails. We deployed four machine learning models and two deep learning models over the datasets, including the combined dataset. We also try to find the important keywords that occur repeatedly in a repository of spam emails. This knowledge will help us detect spam emails for personal and community security purposes.
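
The keyword search described in the abstract above can be illustrated with a minimal sketch. It is not the authors' method: the toy messages, the function names (`tokenize`, `document_frequency`, `spam_keywords`) and the scoring rule (share of spam messages containing a word minus share of legitimate messages containing it) are all assumptions chosen for illustration.

```python
import re
from collections import Counter

# Hypothetical toy messages; the paper's datasets are not reproduced here.
SPAM = [
    "Win free prize now, click here",
    "Free offer: claim your prize today",
    "Click now for free cash bonus",
]
HAM = [
    "Are we still meeting for lunch today",
    "Please review the attached report",
    "Click the link in the report to see the agenda",
]


def tokenize(text):
    """Lowercase a message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def document_frequency(messages):
    """Count, for each word, how many messages contain it at least once."""
    counts = Counter()
    for message in messages:
        counts.update(set(tokenize(message)))
    return counts


def spam_keywords(spam, ham, top_n=3):
    """Rank words by how much more often they appear in spam than in ham."""
    spam_df = document_frequency(spam)
    ham_df = document_frequency(ham)
    # Score = share of spam messages with the word minus share of ham messages.
    scores = {
        word: spam_df[word] / len(spam) - ham_df[word] / len(ham)
        for word in spam_df
    }
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:top_n]]


print(spam_keywords(SPAM, HAM))  # ['free', 'now', 'prize']
```

Subtracting the legitimate-mail share keeps words such as "click", which also appear in normal mail, from ranking as highly as words that occur only in spam.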

An Observation and Experimental Evaluation of Image Spam Detection

International Journal of Recent Technology and Engineering - 2 | 10.35940/ijrte.c4733.098319 | 2019 | Vol 8 (3) | pp. 5892-5896

Keyword(s): Machine Learning, Experimental Evaluation, Empirical Investigation, Classification Systems, Spam Detection, Data Sets, Font Size, Image Spam, E Mail, Background Pattern

In their ongoing contest with researchers working on image spam detection, spammers have recently developed the image-based spam trick to defeat analysis of the text content of e-mails. The trick consists of embedding the unsolicited message in an attached image, which is often randomly modified to evade signature-based detection. Detecting image-based spam is a challenging problem because the text embedded in the images is subjected to noise, such as background patterns, colour and font variations, and irregular font sizes, to reduce the chance of being identified as unsolicited e-mail by classification techniques. In this paper we give a comprehensive review and categorization of the machine learning and classification systems proposed so far against image-based spam, and make an empirical investigation and comparison of a few of them on real, publicly available data sets.

A Comparative Study of E-Mail Spam Detection using Various Machine Learning Techniques

10.21467/proceedings.114.56 | 2021

Author(s): Simarjeet Kaur, Meenakshi Bansal, Ashok Kumar Bathla

Keyword(s): Machine Learning, Prediction Accuracy, Learning Algorithm, Machine Learning Algorithms, Machine Learning Techniques, Support Vector, Spam Detection, Learning Techniques, E Mail, Email Spam

With the rise in the use of messaging and mailing services, spam detection has become far more important than before. Classifying such a volume of communication efficiently is a comparatively onerous job. Spam can be defined as unwanted or junk email: any message that the recipient does not want in their inbox. After pre-processing and feature extraction, various machine learning algorithms were applied to the Spambase dataset from the UCI Machine Learning Repository to classify incoming emails into two categories: spam and non-spam. The outcomes of the algorithms were then compared. This paper used random forest, naive Bayes, support vector machine (SVM), logistic regression, and k-nearest neighbors (KNN) to classify email spam messages. The main goal of this study is to improve the prediction accuracy of spam email filters.
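
To make one of the listed algorithms concrete, here is a minimal from-scratch sketch of a naive Bayes spam classifier. It is an illustration under stated assumptions, not the paper's implementation: the training messages are invented, the class name `NaiveBayes` and the helper `tokenize` are hypothetical, and the sketch works on raw word counts rather than on the numeric features of the UCI dataset.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        # Number of training messages per class, used for the class prior.
        self.class_counts = Counter(labels)
        # Word occurrence counts per class, used for the word likelihoods.
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set()
        for counts in self.word_counts.values():
            self.vocab.update(counts)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # log P(label) + sum over words of log P(word | label)
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # skip words never seen during training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training set (not the Spambase data).
TRAIN = [
    ("win free cash prize now", "spam"),
    ("claim your free prize today", "spam"),
    ("cheap loans click now", "spam"),
    ("meeting agenda for monday", "non-spam"),
    ("please review the project report", "non-spam"),
    ("lunch with the team today", "non-spam"),
]
texts, labels = zip(*TRAIN)
model = NaiveBayes().fit(texts, labels)

print(model.predict("free cash now"))          # spam
print(model.predict("project meeting today"))  # non-spam
```

Working in log space avoids numeric underflow when many small word probabilities are multiplied, and the add-one smoothing keeps a word that is absent from one class from driving that class's probability to zero.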

Spam-Detection with Comparative Analysis and Spamming Words Extractions

10.36227/techrxiv.16832320 | 2021

Author(s): Md Khairul Islam, Md Al Amin, Md Rakibul Islam, Md Nosin Ibna Mahbub, Md Imran Hossain Showrov, ...

Keyword(s): Machine Learning, Deep Learning, Comparative Analysis, Spam Detection, Learning Models, Electronic Medium, Machine Learning Models

Email is an essential means of communication in nearly every sector of day-to-day life. Given its significance, it is important to filter spam out of incoming mail. Spam, also known as junk email, consists of unwanted messages sent electronically in large quantities. Most spam is commercial in nature, and it is not only irritating but also harmful, because it can carry malicious scams, links to malware-hosting sites, or viruses attached to the message. In this paper, we identify spam emails and show how they can be distinguished from legitimate emails. We deployed four machine learning models and two deep learning models over the datasets, including the combined dataset. We also try to find the important keywords that occur repeatedly in a repository of spam emails. This knowledge will help us detect spam emails for personal and community security purposes.

Twitter Spam Detection using Pre-trained Model

International Journal of Recent Technology and Engineering - 2 | 10.35940/ijrte.d4228.118419 | 2019 | Vol 8 (4) | pp. 10620-10623

Keyword(s): Machine Learning, Social Media, Language Processing, Spam Detection, Social Media Platform, Machine Learning Model, The Social, Media Platform, E Mail, Day By Day

In the age of technology, social media platforms have become a great companion for expressing thoughts, information, and opinions. They are a powerful tool for anyone who wants to expand their network of people beyond physical boundaries. We live in an age in which different categories of social media platform are available for different needs, such as Facebook, LinkedIn, WhatsApp, and Twitter. We focus our work on Twitter, also known as a microblogging site, which lets users express opinions in a limited number of words. As Twitter's popularity grows day by day, users are joining the platform quickly, and at the same time many spammers are taking undue advantage of it. For any social media platform, it is very important to maintain a secure, safe, and trustworthy environment for its legitimate users. Twitter spam is more harmful than e-mail spam because of its higher click-through rate: in a social network, if someone mistakes a spam post for a genuine one, there is a higher chance that other people in their network will also trust that post and click on it. Many methods are available for Twitter spam detection; we address the problem at the tweet level. Pre-trained models are a breakthrough in machine learning and natural language processing and have been of great help since their introduction. Here we use the Bidirectional Encoder Representations from Transformers (BERT) model, because our task involves an imbalanced dataset as well as a multilingual dataset, and BERT makes a clear difference in this type of task. The main advantage of this type of model is that we do not have to collect millions of data points for the machine learning model to perform well.
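
The abstract above mentions an imbalanced dataset but does not say how the model was evaluated. As a hedged illustration of why imbalance matters, the sketch below compares accuracy with precision, recall, and F1 on invented labels; the function name `binary_metrics`, the 95/5 class split, and both sets of predictions are assumptions, not results from the paper.

```python
def binary_metrics(y_true, y_pred, positive="spam"):
    """Return accuracy, precision, recall and F1 for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical imbalanced labels: 95 legitimate tweets followed by 5 spam tweets.
y_true = ["ham"] * 95 + ["spam"] * 5

# A useless detector that never flags anything still looks accurate.
always_ham = ["ham"] * 100

# A detector with 2 false alarms that catches 4 of the 5 spam tweets.
detector = ["ham"] * 93 + ["spam"] * 2 + ["spam"] * 4 + ["ham"] * 1

print(binary_metrics(y_true, always_ham))
print(binary_metrics(y_true, detector))
```

On these invented labels the never-flag baseline reaches 95% accuracy with zero recall, which is why class-sensitive metrics such as F1 are commonly reported for spam tasks with skewed class ratios.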

Regulatory-approved Deep Learning/Machine Learning-Based Medical Devices in Japan as of 2020: A Systematic Review

10.1101/2021.02.19.21252031 | 2021

Author(s): Nao Aisu, Masahiro Miyake, Kohei Takeshita, Masato Akiyama, Ryo Kawasaki, ...

Keyword(s): Machine Learning, Systematic Review, Deep Learning, Medical Devices, The Public, Search Service, Public Announcements, The Status, Learning Machine, E Mail

Machine learning (ML) and deep learning (DL) are changing the world and reshaping the medical field. Thus, we conducted a systematic review to determine the status of regulatory-approved ML/DL-based medical devices in Japan, a leading stakeholder in international regulatory harmonization. Information about the medical devices was obtained from the Japan Association for the Advancement of Medical Equipment search service. The use of ML/DL methodology in the medical devices was confirmed using public announcements or, when the public announcements were insufficient for confirmation, by contacting the marketing authorization holders via e-mail. Among the 114,150 medical devices found, 11 were regulatory-approved ML/DL-based Software as a Medical Device (SaMD), with 6 products (54.5%) related to radiology and 5 products (45.5%) related to gastroenterology. The domestic ML/DL-based SaMD were mostly related to health check-ups, which are common in Japan. Our review provides a global overview that can foster international competitiveness and further tailored advancements.

Estudo de Abordagens para Classificação de Textos sobre Dúvidas Tributárias Utilizando Mineração de Texto (A Study of Approaches for Classifying Texts on Tax Questions Using Text Mining)

10.21528/cbic2021-155 | 2021

Author(s): Rodrigo Dantas, Karla Figueiredo, Leonardo Andrade

Keyword(s): Machine Learning, Deep Learning, E Mail

SEFAZ-RJ operates a service channel, the "Fale Conosco" ("Contact Us") system, used to answer taxpayers' questions sent by e-mail. With the social distancing brought about by COVID-19, automating responses to a larger volume of incoming messages became essential. This work therefore presents an investigation of text mining techniques aimed, as a first step, at classifying taxpayer messages using machine learning/deep learning techniques, in order to automate the process of responding to taxpayers. The results contributed to a proposal for redesigning the taxpayer inquiry form and also indicated the most promising techniques for starting to automate SEFAZ-RJ's "Fale Conosco".
