A New Multi-Criteria Quadratic-Programming Linear Classification Model for VIP E-Mail Analysis

Author(s):  
Peng Zhang ◽  
Juliang Zhang ◽  
Yong Shi
2016 ◽  
Vol 1 (2) ◽  
pp. 13 ◽  
Author(s):  
Mahmud Dwi Sulistiyo ◽  
Rita Rismala

Classification is one of the classic problems encountered in artificial intelligence and data mining. The problem in classification is how to build a classifier model through a training or learning process. Building the classifier model can be seen as an optimization problem, so optimization algorithms can be used as an alternative way to generate classifier models. In this study, learning is done using one of the Evolutionary Algorithms (EAs), namely Evolution Strategies (ES). Observation and analysis are conducted on several parameters that influence the ES, as well as on how far the general classifier model used in this study solves the problem. The experiments and analysis show that ES performs fairly well in optimizing the linear classification model used. For Fisher’s Iris dataset, the easiest to classify, the best test accuracy is 94.4%; for the KK Selection dataset it is 84%; and for the SMK Major Election dataset, the hardest to classify, it reaches only 49.2%.
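As a rough illustration of treating classifier training as an optimization problem, the sketch below evolves the weights of a linear classifier with a (mu + lambda) Evolution Strategy. The abstract does not state which ES variant, fitness function, or parameter values the authors used, so the selection scheme, the fixed mutation step `sigma`, the binary labels, and the toy data are all assumptions made for this example.

```python
import random


def predict(weights, x):
    """Linear classifier: class 1 if w.x + b >= 0, else class 0.

    The bias b is stored as the last element of `weights`.
    """
    score = sum(w * xi for w, xi in zip(weights, x)) + weights[-1]
    return 1 if score >= 0 else 0


def accuracy(weights, data):
    """Fraction of (features, label) pairs classified correctly."""
    return sum(predict(weights, x) == y for x, y in data) / len(data)


def evolve(data, n_features, mu=5, lam=20, sigma=0.5, generations=50, seed=0):
    """(mu + lambda) ES (assumed variant): parents and offspring compete,
    and the best mu individuals survive to the next generation."""
    rng = random.Random(seed)
    parents = [[rng.gauss(0, 1) for _ in range(n_features + 1)]
               for _ in range(mu)]
    for _ in range(generations):
        # Each offspring is a Gaussian mutation of a randomly chosen parent.
        offspring = [[w + rng.gauss(0, sigma) for w in rng.choice(parents)]
                     for _ in range(lam)]
        pool = parents + offspring
        pool.sort(key=lambda w: accuracy(w, data), reverse=True)
        parents = pool[:mu]
    return parents[0]


# Hypothetical toy data: two features, binary labels.
toy_data = [([0.0, 0.2], 0), ([0.3, 0.1], 0), ([0.2, 0.4], 0),
            ([2.0, 2.1], 1), ([2.3, 1.8], 1), ([1.9, 2.4], 1)]

best = evolve(toy_data, n_features=2)
print("training accuracy:", accuracy(best, toy_data))
```

Because parents survive alongside their offspring, the best training accuracy in the population never decreases from one generation to the next. A multi-class problem such as Iris would need one weight vector per class or a one-vs-rest scheme, which this sketch leaves out.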


2021 ◽  
Vol 27 (4) ◽  
pp. 364-386
Author(s):  
Marcia Henke ◽  
Eulanda Santos ◽  
Eduardo Souto ◽  
Altair O Santin

Electronic messages are still considered the most significant tools in business and personal applications due to their low cost and easy access. However, e-mails have become a major problem owing to the high amount of junk mail, named spam, which fills users' e-mail boxes. Several approaches have been proposed to detect spam, such as filters implemented in e-mail servers and user-based spam message classification mechanisms. A major problem with these approaches is spam detection in the presence of concept drift, especially as a result of changes in features over time. To overcome this problem, this work proposes a new spam detection system based on analyzing the evolution of features. The proposed method is divided into three steps: 1) spam classification model training; 2) concept drift detection; and 3) knowledge transfer learning. The first step generates classification models, as commonly conducted in machine learning. The second step introduces a new strategy to avoid concept drift: SFS (Similarity-based Features Selection), which analyzes the evolution of the features by taking into account the similarity between the feature vectors extracted from training data and test data. Finally, the third step focuses on the following questions: what, how, and when to transfer acquired knowledge? The proposed method is evaluated using two public datasets. The results of the experiments show that it is possible to infer a threshold to detect changes (drift) in order to ensure that the spam classification model is updated through knowledge transfer. Moreover, our anomaly detection system is able to perform spam classification and concept drift detection as two parallel and independent tasks.
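The idea of flagging drift when training and test feature vectors stop resembling each other can be sketched as follows. This is not the SFS method itself, whose details the abstract does not give: the use of cosine similarity between mean feature vectors, the function names, and the `threshold` value of 0.8 are all assumptions chosen for illustration.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def cosine_similarity(a, b):
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def drift_detected(train_rows, test_rows, threshold=0.8):
    """Flag drift when the similarity falls below an assumed threshold."""
    similarity = cosine_similarity(mean_vector(train_rows),
                                   mean_vector(test_rows))
    return similarity < threshold


# Hypothetical term-frequency vectors over three features.
train = [[5, 1, 0], [4, 2, 0]]
stable_batch = [[5, 2, 0], [4, 1, 0]]
shifted_batch = [[0, 1, 6], [0, 2, 5]]

print(drift_detected(train, stable_batch))
print(drift_detected(train, shifted_batch))
```

In a pipeline like the one described, a positive drift signal would be the trigger for the knowledge-transfer step that updates the classification model.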


2019 ◽  
Vol 2019 ◽  
pp. 1-14
Author(s):  
Jerry Opoku-Ansah ◽  
Moses Jojo Eghan ◽  
Benjamin Anderson ◽  
Johnson Nyarko Boampong ◽  
Raymond Edziah ◽  
...  

Estimation of the degree of Plasmodium falciparum (P. falciparum) malarial infection, termed parasite density (PD), is vital for point-of-care diagnosis and treatment of the disease. In this work, we present the application of optical techniques, namely optical absorption and multispectral imaging, for detecting the P. falciparum malarial byproduct (hemozoin) in human‐infected blood samples to estimate PD. The blood samples were collected from volunteers who tested positive for P. falciparum infections (i-blood), and after treatment, another set of blood samples (u-blood) was also taken. The i-blood samples were grouped based on PD (+, ++, +++, and ++++). Optical densities (ODs) of u-blood samples and i-blood samples at blood absorption bands of 405 nm, 541 nm, and 577 nm showed different optical absorption characteristics. Empirical computation of the ratio of the ODs for the blood absorption bands revealed a reduction in the ODs with increasing PD. Multispectral images containing uninfected red blood cells (u-RBCs) and P. falciparum‐infected red blood cells (i-RBCs) on unstained blood smear slides exhibited a spectrally determined decrease in both reflected and scattered pixel intensities and an increase in transmitted pixel intensities with increasing PD. We further propose a linear classification model based on Fisher’s approach using reflected, scattered, and transmitted pixel intensities for easy and inexpensive estimation of PD as an alternative to manual estimation of PD, which is currently the widely used technique. Application of the optical techniques and the proposed linear classification model is therefore recommended for improved malaria diagnosis and therapy.
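A linear classifier based on Fisher's approach can be sketched for the two-class case as below. The abstract gives no implementation details, so this is the textbook Fisher linear discriminant (direction w = Sw^-1 (mu_a - mu_b), threshold at the midpoint of the projected class means), not the authors' model. The three-feature sample values standing in for reflected, scattered, and transmitted intensities are invented, and the within-class scatter matrix is assumed to be non-singular.

```python
def mean(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, mu):
    """Scatter matrix: sum of outer products of the mean-centered rows."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for row in rows:
        c = [ri - mi for ri, mi in zip(row, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += c[i] * c[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes `a` is square and non-singular.
    """
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fisher_discriminant(class_a, class_b):
    """Return (w, t): projection direction and midpoint decision threshold."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    s_a, s_b = scatter(class_a, mu_a), scatter(class_b, mu_b)
    s_w = [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(s_a, s_b)]
    w = solve(s_w, [x - y for x, y in zip(mu_a, mu_b)])
    t = sum(wi * (x + y) / 2 for wi, x, y in zip(w, mu_a, mu_b))
    return w, t


def classify(w, t, x):
    """Label 'a' if the projection of x exceeds the threshold, else 'b'."""
    return "a" if sum(wi * xi for wi, xi in zip(w, x)) > t else "b"


# Invented (reflected, scattered, transmitted) intensities for two groups.
low_pd = [[0.9, 0.8, 0.2], [0.8, 0.9, 0.3], [0.9, 0.7, 0.1], [0.7, 0.8, 0.2]]
high_pd = [[0.4, 0.3, 0.7], [0.3, 0.4, 0.8], [0.5, 0.3, 0.6], [0.4, 0.2, 0.9]]

w, t = fisher_discriminant(low_pd, high_pd)
print(classify(w, t, [0.85, 0.8, 0.2]))
```

Separating four PD grades plus uninfected samples would require the multi-class form of Fisher's discriminant or a set of pairwise classifiers, which is beyond this two-class sketch.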


Author(s):  
Temitayo O. Oyegoke ◽  
Kehinde K. Akomolede ◽  
Adesola G. Aderounmu ◽  
Emmanuel R. Adagunodo

This study developed an e-mail classification model to preempt fraudulent activities. E-mail is so widely used that it is well suited to adoption by cyber-fraudsters. This research combined two datasets, the CLAIR fraudulent e-mail dataset and the Spambase dataset, to create the training and testing data. The CLAIR dataset consists of raw e-mails from users’ inboxes, which were pre-processed into structured form using Natural Language Processing (NLP) techniques. This dataset was then consolidated with the Spambase dataset into a single dataset. The study deployed the Multi-Layer Perceptron (MLP) architecture, which used a back-propagation algorithm for training the fraud detection model. The model was simulated using 70% and 80% of the dataset for training, with the remaining 30% and 20%, respectively, used for testing. The performance of the resulting models was compared using a number of evaluation metrics. The study concluded that the MLP yields an effective model for fraud detection in e-mail datasets.


2021 ◽  
Vol 11 (24) ◽  
pp. 11968
Author(s):  
Ghizlane Hnini ◽  
Jamal Riffi ◽  
Mohamed Adnane Mahraz ◽  
Ali Yahyaouy ◽  
Hamid Tairi

Hybrid spam is an undesirable e-mail (electronic mail) that contains both image and text parts. It is more harmful and complex as compared to image-based and text-based spam e-mail. Thus, an efficient and intelligent approach is required to distinguish between spam and ham. To our knowledge, a small number of studies have been aimed at detecting hybrid spam e-mails. Most of these multimodal architectures adopted the decision-level fusion method, whereby the classification scores of each modality were concatenated and fed to another classification model to make a final decision. Unfortunately, this method not only demands many learning steps, but it also loses correlation in mixed feature space. In this paper, we propose a deep multimodal feature-level fusion architecture that concatenates two embedding vectors to have a strong representation of e-mails and increase the performance of the classification. The paragraph vector distributed bag of words (PV-DBOW) and the convolutional neural network (CNN) were used as feature extraction techniques for text and image parts, respectively, of the same e-mail. The extracted feature vectors were concatenated and fed to the random forest (RF) model to classify a hybrid e-mail as either spam or ham. The experiments were conducted on three hybrid datasets made using three publicly available corpora: Enron, Dredze, and TREC 2007. According to the obtained results, the proposed model provides a higher accuracy of 99.16% compared to recent state-of-the-art methods.


Author(s):  
P Sai Teja

Unsolicited e-mail, also known as spam, has become a huge concern for every e-mail user. Spam has recently become very difficult to filter because these e-mails are written in a special manner so that anti-spam filters cannot detect them. This paper compares and reviews performance metrics of several supervised machine learning techniques, namely SVM (Support Vector Machine), Random Forest, Decision Tree, CNN (Convolutional Neural Network), KNN (K-Nearest Neighbor), MLP (Multi-Layer Perceptron), AdaBoost (Adaptive Boosting), and the Naïve Bayes algorithm, for classifying e-mails as spam. The objective of this study is to consider the content of the e-mails, learn from a finite available dataset, and develop a classification model that can predict whether an e-mail is spam or not.


Author(s):  
Nestor J. Zaluzec

The Information SuperHighway, Email, The Internet, FTP, BBS, Modems: all buzzwords which are becoming more and more routine in our daily life. Confusing terminology? Hopefully it won't be in a few minutes; all you need is to have a handle on a few basic concepts and terms and you will be on-line with the rest of the "telecommunication experts". These terms all refer to some type or aspect of tools associated with a range of computer-based communication software and hardware. They are in fact far less complex than the instruments we use on a day-to-day basis as microscopists and microanalysts. The key is for each of us to know what each is and how to make use of the wealth of information which they can make available to us for the asking. Basically, all of these items relate to mechanisms and protocols by which we as scientists can exchange information rapidly and efficiently with colleagues in the office down the hall, or half-way around the world, using computers and various communications media. The purpose of this tutorial/paper is to outline and demonstrate the basic ideas of some of the major information systems available to all of us today. For the sake of simplicity we will break this presentation down into two distinct (but, as we shall see later, connected) areas: telecommunications over conventional phone lines, and telecommunications by computer networks. Live tutorial/demonstrations of both procedures will be presented in the Computer Workshop/Software Exchange during the course of the meeting.

