Cloud e-mail security: An accurate e-mail spam classification based on enhanced binary differential evolution (BDE) algorithm

2021 ◽  
pp. 1-13
Author(s):  
Nadir O. Hamed ◽  
Ahmed H. Samak ◽  
Mostafa A. Ahmad

The evolution of technology has brought new challenges and opportunities across the different dimensions of the feature space. High dimensionality of the feature space is one of the most critical issues in e-mail classification because of its effect on accuracy. Finding the subset of features that significantly influences the performance of e-mail spam classification has therefore become an important challenge. To address this problem, this paper proposes an intelligent approach, the Binary Differential Evolution Support Vector Machine (BDE-SVM). The proposed approach enhances the Binary Differential Evolution (BDE) algorithm by using the correlation coefficient as a fitness function to select a significant feature subset, which is then evaluated by an SVM classifier. To the best of our knowledge, the correlation coefficient has not previously been used as the fitness function in a differential evolution algorithm. The selected feature subset is used to assess which features contribute most to the reliability of e-mail spam classification. The enhanced BDE achieves high accuracy. The tests were conducted on the “Spambase” and “SpamAssassin” benchmark datasets to assess the feasibility of the proposed solution. Accuracy with the full feature set was 93.55 percent, compared with 93.99 percent for the proposed BDE-SVM approach. Empirical findings also show that our method is capable of effectively reducing the number of features required to enhance the reliability of e-mail spam classification.
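The feature-selection idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact operators, so everything here is an assumption: the fitness is taken to be the mean absolute Pearson correlation between the selected features and the label, the binary mutation flips a bit where two random population members disagree, and the SVM evaluation step is omitted. The names `pearson`, `fitness`, and `binary_de` are hypothetical.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def fitness(mask, columns, labels):
    """Assumed fitness: mean |correlation| of the selected feature columns with the labels."""
    chosen = [col for bit, col in zip(mask, columns) if bit]
    if not chosen:
        return 0.0
    return sum(abs(pearson(col, labels)) for col in chosen) / len(chosen)

def binary_de(columns, labels, pop_size=10, generations=30, f=0.5, cr=0.7, seed=0):
    """Binary DE over 0/1 feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    n = len(columns)
    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    scores = [fitness(m, columns, labels) for m in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            # Assumed binary mutation: where b and c disagree, flip a's bit with probability f.
            mutant = [bit ^ 1 if (bb != cb and rng.random() < f) else bit
                      for bit, bb, cb in zip(a, b, c)]
            # Binomial crossover; one forced position always comes from the mutant.
            forced = rng.randrange(n)
            trial = [m if (rng.random() < cr or k == forced) else t
                     for k, (m, t) in enumerate(zip(mutant, pop[i]))]
            s = fitness(trial, columns, labels)
            if s >= scores[i]:  # greedy selection keeps the better mask
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]
```

In a fuller pipeline the mask returned by `binary_de` would be used to subset the feature matrix before training and scoring a classifier; that step is left out here to keep the sketch dependency-free.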

2018 ◽  
Vol 8 (9) ◽  
pp. 1621 ◽  
Author(s):  
Fan Jiang ◽  
Zhencai Zhu ◽  
Wei Li ◽  
Yong Ren ◽  
Gongbo Zhou ◽  
...  

Acceleration sensors are frequently used to collect vibration signals for bearing fault diagnosis. To make full use of the vibration signals from multiple sensors, this paper proposes a new approach that fuses multi-sensor information for bearing fault diagnosis using ensemble empirical mode decomposition (EEMD), correlation coefficient analysis, and a support vector machine (SVM). First, EEMD is applied to decompose the vibration signal into a set of intrinsic mode functions (IMFs), and a correlation coefficient ratio factor (CCRF) is defined to select sensitive IMFs from which new vibration signals are reconstructed for further feature fusion analysis. Second, an original feature space is constructed from the reconstructed signal. Afterwards, weights are assigned according to the correlation coefficients among the vibration signals of the sensors considered, and the fused features are computed from these weights and the original feature space. Finally, a trained SVM is employed as the classifier for bearing fault diagnosis. The diagnostic accuracies obtained with the original vibration signals, the first IMF, the proposed reconstructed signal, and the proposed method are 73.33%, 74.17%, 95.83% and 100%, respectively. The experiments therefore show that the proposed method has the highest diagnostic accuracy, and it can be regarded as a new way to improve diagnosis results for bearings.
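The correlation-weighted fusion step described in this abstract could look roughly like the sketch below. It is only an illustration under stated assumptions: the EEMD decomposition and the CCRF selection are skipped, each sensor's weight is assumed to be its mean absolute correlation with the other sensors (normalised to sum to 1), and the per-sensor features are just RMS and peak amplitude. The names `sensor_weights`, `features`, and `fuse` are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0

def sensor_weights(signals):
    """Assumed weighting: mean |correlation| with the other sensors, normalised to sum to 1.

    Requires at least two signals of equal length.
    """
    n = len(signals)
    raw = [sum(abs(pearson(signals[i], signals[j])) for j in range(n) if j != i) / (n - 1)
           for i in range(n)]
    total = sum(raw)
    return [r / total for r in raw] if total > 0 else [1.0 / n] * n

def features(signal):
    """Two simple time-domain features: RMS and peak amplitude."""
    rms = math.sqrt(sum(v * v for v in signal) / len(signal))
    peak = max(abs(v) for v in signal)
    return [rms, peak]

def fuse(signals):
    """Fused feature vector: weighted sum of the per-sensor feature vectors."""
    weights = sensor_weights(signals)
    feats = [features(s) for s in signals]
    return [sum(w * f[k] for w, f in zip(weights, feats)) for k in range(len(feats[0]))]
```

The fused vector would then be passed to a classifier; a sensor whose signal disagrees with the others receives a smaller weight under this assumed scheme.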


2020 ◽  
Vol 24 (18) ◽  
pp. 14221-14234
Author(s):  
Amir Karbassi Yazdi ◽  
Mohamad Amin Kaviani ◽  
Thomas Hanne ◽  
Andres Ramos

2015 ◽  
Vol 31 (4) ◽  
pp. 361-380 ◽  
Author(s):  
Alfonso Martinez Cruz ◽  
Ricardo Barrón Fernández ◽  
Herón Molina Lozano ◽  
Marco Antonio Ramírez Salinas ◽  
Luis Alfonso Villa Vargas

Electronics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 1443
Author(s):  
Mai Ramadan Ibraheem ◽  
Shaker El-Sappagh ◽  
Tamer Abuhmed ◽  
Mohammed Elmogy

The formation of a malignant neoplasm can be seen as the deterioration of a pre-malignant skin neoplasm in its functionality and structure. Distinguishing melanocytic skin neoplasms is a challenging task because of their high visual similarity to other types of lesions and the intra-structural variants of melanocytic neoplasms. In addition, different lesion types with inhomogeneous features and fuzzy boundaries look very much alike. The abnormal growth of melanocytic neoplasms takes various forms, from a uniform typical pigment network to an irregular atypical shape, which can be described by the border irregularity of the melanocyte lesion image. This work proposes analytical reasoning for this human-observable phenomenon as a high-level feature to determine the neoplasm growth phase using a novel pixel-based feature space. The pixel-based feature space, which comprises high-level features together with color and texture features, is fed into the classifier to classify different melanocyte neoplasm phases. The proposed system was evaluated on the PH2 dermoscopic image benchmark dataset. It achieved an average accuracy of 95.1% using a support vector machine (SVM) classifier with the radial basis function (RBF) kernel. Furthermore, it reached an average Dice similarity coefficient (DSC) of 95.1%, an area under the curve (AUC) of 96.9%, and a sensitivity of 99%. The results of the proposed system outperform those of other state-of-the-art multiclass techniques.
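Two of the metrics reported in this abstract have simple closed forms, sketched below for binary masks given as flat 0/1 lists. The Dice similarity coefficient is 2|A ∩ B| / (|A| + |B|) and sensitivity is TP / (TP + FN). The function names and the list-based mask representation are illustrative assumptions, not taken from the paper.

```python
def dice_coefficient(pred, truth):
    """Dice similarity 2|A ∩ B| / (|A| + |B|) for two binary masks of equal length."""
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    size = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks are treated as a perfect match (a convention, assumed here).
    return 2.0 * inter / size if size else 1.0

def sensitivity(pred, truth):
    """True positive rate TP / (TP + FN); returns 0.0 when there are no positives."""
    tp = sum(1 for p, t in zip(pred, truth) if p and t)
    fn = sum(1 for p, t in zip(pred, truth) if not p and t)
    return tp / (tp + fn) if (tp + fn) else 0.0
```

For example, with `pred = [1, 1, 0, 0]` and `truth = [1, 0, 1, 0]` the masks overlap in one position out of two positives each, so both functions return 0.5.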


2016 ◽  
Vol 26 (3) ◽  
pp. 293-313 ◽  
Author(s):  
André Vellino ◽  
Inge Alberts

Purpose: This paper aims to investigate how automatic classification can assist employees and records managers with the appraisal of e-mails as records of value for the organization.
Design/methodology/approach: The study performed a qualitative analysis of the appraisal behaviours of eight records management experts to train a series of support vector machine classifiers to replicate the decision process for identifying e-mails of business value. Automatic classification experiments were performed on a corpus of 846 e-mails from two of these experts’ mailboxes.
Findings: Despite the highly contextual nature of record value, these experiments show that classifiers have a high degree of accuracy. Unlike existing manual practices in corporate e-mail archiving, machine classification models are not highly dependent on features such as the identity of the sender and receiver or on threading, forwarding or importance flags. Rather, the dominant discriminating features are textual features from the e-mail body and subject field.
Research limitations/implications: The need to automatically classify corporate e-mails is growing in importance, as e-mail remains one of the prevalent recordkeeping challenges.
Practical implications: Automated methods for identifying e-mail records promise to be of significant benefit to organizations that need to appraise e-mail for long-term preservation and access on demand.
Social implications: The research adopts an innovative approach to assist employees and records managers with the appraisal of digital records. By doing so, the research fosters new insights on the adoption of technological strategies to automate recordkeeping tasks, an important research gap.
Originality/value: Our experiments show that an SVM classifier can be trained to replicate an expert's decision process for identifying e-mails of business value with a reasonably high degree of accuracy. In principle, such a classifier could be integrated into a corporate Electronic Document and Records Management System (EDRMS) to improve the quality of e-mail records appraisal.
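As a rough illustration of classifying e-mails from textual features alone, the sketch below trains a small linear SVM on a bag of words using sub-gradient descent on the regularised hinge loss. It is not the study's setup: the training sentences, the labels (+1 for business value, -1 otherwise), the hyperparameters, and the function names are all made up for the example.

```python
def tokenize(text):
    """Lower-case whitespace tokenisation of subject plus body text."""
    return text.lower().split()

def train_linear_svm(docs, labels, lr=0.1, lam=0.01, epochs=50):
    """Linear SVM on binary bag-of-words features via hinge-loss sub-gradient steps."""
    w, b = {}, 0.0
    for _ in range(epochs):
        for doc, y in zip(docs, labels):
            words = sorted(set(tokenize(doc)))
            margin = y * (sum(w.get(t, 0.0) for t in words) + b)
            # L2 regularisation: shrink every weight a little on each step.
            for t in w:
                w[t] *= 1.0 - lr * lam
            # Hinge loss is active only when the example is inside the margin.
            if margin < 1.0:
                for t in words:
                    w[t] = w.get(t, 0.0) + lr * y
                b += lr * y
    return w, b

def predict(model, doc):
    """Return +1 or -1; words never seen in training contribute nothing."""
    w, b = model
    score = sum(w.get(t, 0.0) for t in sorted(set(tokenize(doc)))) + b
    return 1 if score >= 0 else -1

# Hypothetical toy corpus: two e-mails of business value, two without.
docs = ["contract invoice approval", "budget contract signed",
        "lunch friday pizza", "party friday lunch"]
labels = [1, 1, -1, -1]
model = train_linear_svm(docs, labels)
```

Because the model only sees words, sender identity and flags play no role in this sketch, which mirrors the kind of feature set the abstract reports as dominant.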


2012 ◽  
Vol 433-440 ◽  
pp. 1692-1700
Author(s):  
Zhong Hua Han ◽  
Xiang Bin Meng ◽  
Bin Ma ◽  
Chang Tao Wang

A job scheduling method based on a differential evolution (DE) algorithm is presented, whose optimization target is production cost. A cost optimization model of the hybrid flow-shop is constructed by treating production cost as a factor in the hybrid flow-shop scheduling problem. In the method, DE performs the global optimization and determines which machine each job is assigned to at each stage, which is also called the process route of the job; local assignment rules are then used to determine each job’s starting time and processing sequence at each stage. The time-based scheduling results are converted into a fitness function that jointly considers the processing cost, the waiting costs, and the product storage costs, so that cost is taken as the optimization objective. Numerical results from a comparison of multiple groups of programs show the effectiveness of the algorithm.
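The two-level scheme in this abstract (DE chooses the process route, a local rule fixes start times, and cost is the fitness) can be sketched as follows. The abstract does not give the encoding or the rules, so these are assumptions: each gene in [0, 1) is decoded to a machine index, jobs at each stage are served in order of arrival, and storage cost is charged for the time a finished job waits until the last job completes. All names and parameters are hypothetical.

```python
import random

def decode(vector, n_jobs, machines_per_stage):
    """Map one gene per (job, stage) to a machine index at that stage."""
    n_stages = len(machines_per_stage)
    return [[min(int(vector[j * n_stages + s] * machines_per_stage[s]),
                 machines_per_stage[s] - 1) for s in range(n_stages)]
            for j in range(n_jobs)]

def schedule_cost(route, proc_time, rate, wait_cost, store_cost):
    """Assumed local rule: first come, first served per stage. Returns the total cost."""
    n_jobs, n_stages = len(route), len(route[0])
    ready = [0.0] * n_jobs  # time at which each job is available for its next stage
    total = 0.0
    for s in range(n_stages):
        free = {}  # machine index -> time the machine becomes free
        for j in sorted(range(n_jobs), key=lambda k: ready[k]):
            m = route[j][s]
            start = max(ready[j], free.get(m, 0.0))
            total += wait_cost * (start - ready[j]) + rate[s][m] * proc_time[j][s]
            ready[j] = free[m] = start + proc_time[j][s]
    # Storage: finished jobs are held until the last job completes (an assumption).
    return total + store_cost * sum(max(ready) - r for r in ready)

def de_schedule(proc_time, rate, wait_cost, store_cost,
                pop_size=12, generations=40, f=0.5, cr=0.9, seed=0):
    """DE/rand/1/bin over continuous genes; returns (route, cost)."""
    rng = random.Random(seed)
    n_jobs, machines = len(proc_time), [len(r) for r in rate]
    dim = n_jobs * len(machines)

    def cost(v):
        return schedule_cost(decode(v, n_jobs, machines), proc_time, rate,
                             wait_cost, store_cost)

    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    costs = [cost(v) for v in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([p for k, p in enumerate(pop) if k != i], 3)
            forced = rng.randrange(dim)
            trial = [min(max(a[d] + f * (b[d] - c[d]), 0.0), 0.999999)
                     if (rng.random() < cr or d == forced) else pop[i][d]
                     for d in range(dim)]
            t = cost(trial)
            if t <= costs[i]:  # greedy selection keeps the cheaper schedule
                pop[i], costs[i] = trial, t
    best = min(range(pop_size), key=costs.__getitem__)
    return decode(pop[best], n_jobs, machines), costs[best]
```

Here `proc_time[j][s]` is the processing time of job `j` at stage `s` and `rate[s][m]` is the cost per unit time of machine `m` at stage `s`; swapping in a different local rule only requires changing `schedule_cost`.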

