Using an Artificial Neural Network to Improve Email Security

2022 ◽  
pp. 1465-1477
Author(s):  
Mohamed Abdulhussain Ali Madan Maki ◽  
Suresh Subramanian

Email is one of the most widely used services on the internet, and it is the most convenient method of transferring messages electronically. However, email productivity has decreased because of phishing attacks, spam emails, and viruses. Filtering the email flow has recently become a challenging task for researchers because of the techniques spammers use to evade spam detection. This research proposes an email spam filtering system that filters spam emails using a back-propagation neural network (BPNN). The Enron1 dataset was used, and after preprocessing, the TF-IDF algorithm was used to extract features and convert them into frequency values. To select the best features, the mutual information technique was applied. The performance of the classifiers was measured using BoW, n-gram, and chi-squared methods. The BPNN model was compared with Naïve Bayes and a support vector machine on accuracy, precision, recall, and F1-score. The results show that the proposed email spam system achieved 98.6% accuracy with cross-validation.
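
The TF-IDF step mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes one common variant (relative term frequency multiplied by log(N/df)); the abstract does not say which variant the authors used, and the function name `tf_idf` and the toy messages are hypothetical, not drawn from the Enron1 dataset.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        # Relative term frequency times inverse document frequency.
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy messages used only for illustration.
emails = [
    "win free prize now".split(),
    "meeting schedule for monday".split(),
    "free meeting now".split(),
]
vectors = tf_idf(emails)
```

Under this variant, a term that occurs in every message receives a weight of zero, while a term confined to a single message is weighted most heavily, which is why rare spam-specific words tend to stand out as features.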


Symmetry ◽  
2020 ◽  
Vol 12 (7) ◽  
pp. 1163
Author(s):  
Wei Peng ◽  
Qing Wang ◽  
Xudong Zhang ◽  
Xiaohui Sun ◽  
Yongchao Li ◽  
...  

With the increase in transportation emissions, road diseases in the saline soil area of Jilin Province have become a problem that requires serious attention. To improve subgrade performance, this study investigates the structural yield strength (SYS) of remolded soil and its factor sensitivity. Saline soils in Western Jilin are structural in the sense that the bonding strength of the soil skeleton is mainly provided by the solidification bond formed by physicochemical interaction between particles. The SYS is influenced by cementation type, genetic characteristics, original rock structure, and environment. Because of the high clay content in Zhenlai saline soil, the specific surface area of the soil particles is large, and their surface adsorption capacity is strong. In addition, the main cation is Na+. The cementation strength of the bound water film between soil particles is thus easily affected by water content and salt content, and compaction is also an important factor affecting soil strength. Therefore, in this study, a back-propagation neural network (BPNN) model and a support vector machine (SVM) are used to explore the relationship of the saline soil's SYS with its compactness, water content, and salt content. In total, 120 data points collected from high-pressure consolidation experiments are used to build the BPNN and SVM models. To eliminate redundant features, the Pearson correlation coefficient (rPCC) is used as the evaluation criterion for feature selection. K-fold cross-validation was used to avoid overfitting. To compare the performance of the BPNN and SVM models, three statistical parameters were used: the determination coefficient (R2), root mean square error (RMSE), and mean absolute percentage deviation (MAPD). The results show that the average values of R2, RMSE, and MAPD of the BPNN model are superior to those of the SVM model.
We conclude that the BPNN model is slightly better than the SVM for predicting the SYS of saline soil. Thus, the BPNN model is used to analyze the factor sensitivity of SYS. The results indicate that the influence degrees of the three parameters are as follows: water content > compactness > salt content. This study can provide a basis for estimating the structural yield pressure of soil from its basic properties, and can provide a new way to obtain parameters for geotechnical engineering, ensuring safety while maintaining symmetry in engineering costs.
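
The three comparison statistics named in this abstract can be computed as in the sketch below. It assumes the standard textbook definitions of R2, RMSE, and MAPD; the abstract does not give the authors' exact formulas, and the function names and sample values are hypothetical rather than taken from the study's 120 data points.

```python
import math


def r2(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean square error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def mapd(y_true, y_pred):
    """Mean absolute percentage deviation, in percent (y_true must be nonzero)."""
    ratios = [abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical measured and predicted values, for illustration only.
measured = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
scores = (r2(measured, predicted), rmse(measured, predicted), mapd(measured, predicted))
```

A better model has R2 closer to 1 and smaller RMSE and MAPD, so "superior" in the comparison above means higher on the first statistic and lower on the other two.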


Author(s):  
Bo Huang

This study analyzed three prediction models: the ID3 model, the GM (1, 1) model, and the back-propagation neural network (BPNN) model. First, the principles and prediction methods of the three models were introduced. Then, taking enterprise A as an example, the demand for human resources was predicted, and the prediction results of the three models were compared. The maximum and minimum errors were 240 people and 12 people respectively for the ID3 model and 64 people and 37 people respectively for the GM (1, 1) model; the errors of the BPNN model were smaller than ten people, with a minimum of three people, which was in good agreement with the actual values. The BPNN prediction of enterprise A's human resource demand over the next five years suggested that the demand for employees would grow rapidly. The results show that the BPNN model has better reliability and can be popularized and applied in practice.
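
For readers unfamiliar with the GM (1, 1) grey model compared in this abstract, the sketch below shows its usual textbook form: accumulate the series, fit the two parameters by least squares, and difference the fitted exponential to forecast. The function names and the headcount series are hypothetical; the abstract does not describe the authors' implementation or the data of enterprise A.

```python
import math


def gm11_fit(x0):
    """Estimate GM(1,1) parameters (a, b) from x0(k) + a*z1(k) = b."""
    # Accumulated generating sequence x1(k) = x0(1) + ... + x0(k).
    x1, running = [], 0.0
    for value in x0:
        running += value
        x1.append(running)
    # Background values: mean of consecutive accumulated values.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x1))]
    y = x0[1:]
    # Ordinary least squares for y = slope * z + b, with slope = -a.
    mean_z = sum(z) / len(z)
    mean_y = sum(y) / len(y)
    slope = (sum((zi - mean_z) * (yi - mean_y) for zi, yi in zip(z, y))
             / sum((zi - mean_z) ** 2 for zi in z))
    return -slope, mean_y - slope * mean_z


def gm11_forecast(x0, steps):
    """Forecast the next `steps` values (assumes the fitted a is nonzero)."""
    a, b = gm11_fit(x0)

    def x1_hat(k):
        # Time response of the accumulated series; k = 0 is the first point.
        return (x0[0] - b / a) * math.exp(-a * k) + b / a

    n = len(x0)
    # Difference the accumulated forecast to recover the original scale.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


# Hypothetical headcounts growing about 10% per period.
headcount = [100 * 1.1 ** k for k in range(6)]
forecast = gm11_forecast(headcount, 2)
```

Because GM (1, 1) fits a single exponential trend, it tracks smooth growth well but cannot follow irregular fluctuations, which is one plausible reason a neural network could give smaller errors on real staffing data.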


2011 ◽  
Vol 11 (04) ◽  
pp. 897-915 ◽  
Author(s):  
ROSHAN JOY MARTIS ◽  
CHANDAN CHAKRABORTY

This work presents a methodology for electrocardiogram (ECG)-based arrhythmia detection using genetic algorithm (GA)-optimized k-means clustering. The open-source ECG data from the MIT-BIH arrhythmia database and the MIT-BIH normal sinus rhythm database are subjected to a sequence of steps including segmentation using R-point detection, feature extraction using principal component analysis (PCA), and pattern classification. The classical classifiers, viz., k-means clustering, error back propagation neural network (EBPNN), and support vector machine (SVM), are attempted first, and m-fold (m = 3) cross-validation is then used to reduce bias during classifier training. The average classification accuracy is computed as the average over all three folds. It is observed that EBPNN and SVM with polynomial kernels of different orders provide significantly higher accuracies than k-means. The parameters (centroids) of the k-means algorithm are only locally optimized when its objective function is minimized. To overcome this limitation, a global optimization technique, viz., GA, is suggested and implemented to find more robust parameters for k-means clustering. Finally, it is shown that the GA-optimized k-means algorithm raises its accuracy to a level comparable with that of the other classifiers. The results are discussed and compared. It is concluded that the GA-optimized k-means algorithm is an alternative approach to classification whose accuracy is close to that of supervised classifiers (viz., EBPNN and SVM).
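
The k-means objective that this abstract refers to is, in its standard form, the sum of squared distances from each point to its nearest centroid. The sketch below computes that quantity so it could serve as a fitness function for a global optimizer such as a GA; the function name and the toy points are hypothetical, and the authors' actual GA encoding and fitness definition are not given in the abstract.

```python
def kmeans_objective(points, centroids):
    """Sum of squared distances from each point to its nearest centroid."""
    total = 0.0
    for point in points:
        # Squared Euclidean distance to the closest centroid.
        total += min(
            sum((pi - ci) ** 2 for pi, ci in zip(point, centroid))
            for centroid in centroids
        )
    return total


# Hypothetical 2-D feature vectors forming two obvious groups.
points = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
good = kmeans_objective(points, [(0.0, 1.0), (10.0, 1.0)])
poor = kmeans_objective(points, [(5.0, 0.0), (5.0, 2.0)])
```

The second centroid pair is a stable but poor configuration for standard k-means updates on these points, which illustrates the local-optimum limitation: a global search that scores candidate centroid sets with this objective can escape it.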

