Detection Approaches for Categorization of Spam and Legitimate E-Mail
The internet has become very popular, and electronic mail has made it easy and cheap to communicate with many people. However, users also receive many undesired messages, and a high percentage of these e-mails is spam. The goal of spam classification is to distinguish spam from legitimate e-mail messages. As internet use grows, it is challenging to develop spam filters that automatically and effectively eliminate the increasing volume of unwanted e-mail before it enters a user's mailbox. The main objective of this chapter is to examine detection approaches for spam categorization and identify the best one. Different types of algorithms and data mining models are proposed, implemented, and evaluated on data sets. To improve spam filtering, the authors analyze feature selection methods and give recommendations for their use. The chapter concludes that data mining models that combine supervised learning algorithms provide better results than single models.