Classifying Spam Emails Using Artificial Intelligent Techniques

Author(s):  
Sanjiban Sekhar Roy ◽  
V. Madhu Viswanatham

Spam emails have become a growing problem for all web users. These unsolicited messages waste network resources unnecessarily. Customarily, machine learning techniques are adopted for filtering email spam. This article examines the capabilities of the extreme learning machine (ELM) and support vector machine (SVM) for the classification of spam emails with the class level (d). The ELM method is an efficient model based on a single-layer feed-forward neural network whose hidden-layer weights can be chosen randomly. The support vector machine is a robust method grounded in statistical learning theory and is used frequently for classification. The performance of ELM has been compared with that of SVM. The comparative study examines accuracy, precision, recall, false positives, and true positives. Moreover, a sensitivity analysis has been performed with ELM and SVM for spam email classification.
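The comparison criteria named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming labels of 1 for spam and 0 for legitimate mail; the function name `binary_metrics` and the toy label lists are hypothetical and do not come from the article.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and derived rates for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero when a class is absent.
        return num / den if den else 0.0

    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),  # also called the true positive rate
        "false_positive_rate": ratio(fp, fp + tn),
    }


# Toy labels (assumed): 1 = spam, 0 = legitimate.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
print(binary_metrics(y_true, y_pred))
```

For a spam filter the false positive rate is often watched most closely, since a legitimate message sent to the spam folder is usually costlier than a spam message that slips through.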

2016 ◽  
Vol 25 (01) ◽  
pp. 1550026 ◽  
Author(s):  
Juan J. Carrasco ◽  
Mónica Millán-Giraldo ◽  
Juan Caravaca ◽  
Pablo Escandell-Montero ◽  
José M. Martínez-Martínez ◽  
...  

Extreme Learning Machine (ELM) is a recently proposed algorithm that is efficient and fast at learning the parameters of single-layer neural structures. One of the main problems of this algorithm is choosing the optimal architecture for a given problem. To address this limitation, several solutions have been proposed in the literature, including the regularization of the structure. However, to the best of our knowledge, there are no works where such adjustment is applied to classification problems in the presence of a non-linearity in the output; all published works tackle modelling or regression problems. Our proposal has been applied to a series of standard databases for the evaluation of machine learning techniques. The results, in terms of classification success rate and training time, are compared with the original ELM, with the well-known Least Square Support Vector Machine (LS-SVM) algorithm, and with two other methods based on ELM regularization: Optimally Pruned Extreme Learning Machine (OP-ELM) and Bayesian Extreme Learning Machine (BELM). The obtained results clearly demonstrate the usefulness of the proposed method and its superiority over a classical approach.
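The basic ELM idea, random hidden-layer weights followed by a least-squares fit of the output weights, can be sketched in a few lines of plain Python. This is an illustrative sketch only: the ridge (L2) penalty used here is a generic stabilizer and is not the regularization method proposed in the abstract, and the names `solve_linear`, `train_elm`, `predict_elm`, and the toy data are all hypothetical.

```python
import math
import random


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def hidden_layer(X, W, b):
    """Map each input row to tanh activations of the random hidden units."""
    return [[math.tanh(sum(w * v for w, v in zip(W[j], x)) + b[j])
             for j in range(len(W))] for x in X]


def train_elm(X, y, n_hidden=10, ridge=1e-3, seed=0):
    """Draw hidden weights at random, then fit only the output weights."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in X[0]] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    H = hidden_layer(X, W, b)
    # Normal equations with an L2 penalty: (H^T H + ridge * I) beta = H^T y
    A = [[sum(h[i] * h[j] for h in H) + (ridge if i == j else 0.0)
          for j in range(n_hidden)] for i in range(n_hidden)]
    rhs = [sum(h[i] * t for h, t in zip(H, y)) for i in range(n_hidden)]
    return W, b, solve_linear(A, rhs)


def predict_elm(model, X):
    """Return +1 or -1 for each row according to the sign of the output."""
    W, b, beta = model
    return [1 if sum(h * w for h, w in zip(row, beta)) >= 0 else -1
            for row in hidden_layer(X, W, b)]


# Toy two-class data (assumed): targets are coded as -1 and +1.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8], [0.8, 0.9]]
y = [-1, -1, -1, 1, 1, 1]
model = train_elm(X, y, n_hidden=8)
print(predict_elm(model, X))
```

Because only the output weights are solved for, training reduces to one linear system; the number of hidden units (`n_hidden` here) is the architecture choice the abstract identifies as the main difficulty.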


2021 ◽  
Author(s):  
Lekshmi Kalinathan ◽  
Deepika Sivasankaran ◽  
Janet Reshma Jeyasingh ◽  
Amritha Sennappa Sudharsan ◽  
Hareni Marimuthu

Hepatocellular carcinoma (HCC) is challenging to detect and to classify by stage, mainly because of the lack of disparity between cancerous and non-cancerous cells. This work focuses on detecting hepatic cancer stages from histopathology data using machine learning techniques. It aims to develop a prototype that helps pathologists deliver a report quickly and detect the stage of the cancer cell. Hence we propose a system to identify and classify HCC based on features obtained by deep learning using pre-trained models such as VGG-16, ResNet-50, DenseNet-121, InceptionV3, InceptionResNet50 and Xception, followed by machine learning using a support vector machine (SVM) to learn from these features. The system composed of DenseNet-121 for feature extraction and SVM for classification achieves an accuracy of 82%.


2020 ◽  
Vol 13 (1) ◽  
pp. 130-149
Author(s):  
Puneet Misra ◽  
Siddharth Chaurasia

Stock market movements are affected by numerous factors, making them one of the most challenging problems for forecasting. This article attempts to predict the direction of movement of stocks and stock indices. The study uses three classifiers (Artificial Neural Network, Random Forest and Support Vector Machine) with four different representations of inputs. The first representation uses raw data (open, high, low, close and volume). The second uses ten features in the form of technical indicators generated by technical analysis. The third and fourth representations present two different ways of converting the indicator data into discrete trend data. Experimental results suggest that for raw data the support vector machine provides the best results. For the other representations, there is no clear winner among the models applied, but the representation of data by the proposed approach gave the best overall results for all the models and financial series. The consistency of the results highlights the importance of feature generation and the right representation of the dataset for machine learning techniques.
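The abstract does not say how indicator values are converted into discrete trend data, so the following is only one plausible sketch: a simple moving average (SMA) is turned into a +1/-1 trend signal by comparing it with the closing price, and the prediction target is the next-day direction. The function names `sma_trend` and `direction_labels`, the window length, and the price series are all assumptions made for illustration.

```python
def sma_trend(closes, window=3):
    """Return +1 where the close is above its simple moving average, else -1.

    The first value corresponds to index window - 1 of the input series.
    """
    trend = []
    for i in range(window - 1, len(closes)):
        sma = sum(closes[i - window + 1:i + 1]) / window
        trend.append(1 if closes[i] > sma else -1)
    return trend


def direction_labels(closes):
    """Label each day 1 if the next close is higher, else 0."""
    return [1 if nxt > cur else 0 for cur, nxt in zip(closes, closes[1:])]


# Hypothetical closing prices.
closes = [10, 11, 12, 11, 10, 9]
print(sma_trend(closes))         # discrete trend feature
print(direction_labels(closes))  # up/down target for a classifier
```

A discrete representation like this hands the classifier the indicator's interpretation (trending up or down) rather than its raw magnitude.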


RSC Advances ◽  
2014 ◽  
Vol 4 (106) ◽  
pp. 61624-61630 ◽  
Author(s):  
N. S. Hari Narayana Moorthy ◽  
Silvia A. Martins ◽  
Sergio F. Sousa ◽  
Maria J. Ramos ◽  
Pedro A. Fernandes

Classification models to predict the solvation free energies of organic molecules were developed using decision tree, random forest and support vector machine approaches and with MACCS fingerprints, MOE and PaDEL descriptors.


2020 ◽  
Vol 13 (1-2) ◽  
pp. 43-52
Author(s):  
Boudewijn van Leeuwen ◽  
Zalán Tobak ◽  
Ferenc Kovács

Classification of multispectral optical satellite data using machine learning techniques to derive land use/land cover thematic data is important for many applications. Comparing the latest algorithms, our research aims to determine the best option to classify land use/land cover with special focus on temporary inundated land in a flat area in the south of Hungary. These inundations disrupt agricultural practices and can cause large financial loss. Sentinel 2 data with a high temporal and medium spatial resolution is classified using open source implementations of a random forest, support vector machine and an artificial neural network. Each classification model is applied to the same data set and the results are compared qualitatively and quantitatively. The accuracy of the results is high for all methods and does not show large overall differences. A quantitative spatial comparison demonstrates that the neural network gives the best results, but that all models are strongly influenced by atmospheric disturbances in the image.


Advances in cyber-attack technologies have ushered in various new attacks that are difficult to detect using traditional intrusion detection systems (IDS). Existing IDS are trained to detect known patterns, so newer attacks bypass them and go undetected. In this paper, a two-level framework is proposed that can be used to detect unknown new attacks using machine learning techniques. In the first level, the known classes of attacks are determined using supervised machine learning algorithms such as Support Vector Machine (SVM) and Neural Networks (NN). The second level uses unsupervised machine learning algorithms such as K-means. The experimentation is carried out with four models on the NSL-KDD dataset in an OpenStack cloud environment. The model with Support Vector Machine for supervised machine learning, Gradual Feature Reduction (GFR) for feature selection and K-means as the unsupervised algorithm provided the optimum efficiency of 94.56%.
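The unsupervised second level can be illustrated with a minimal K-means (Lloyd's algorithm) sketch. How records are routed from the first level to the second is not described in the abstract, so this sketch covers the clustering step only; the function name `kmeans`, the explicit initial centroids, and the two-feature toy records are assumptions.

```python
def kmeans(points, centroids, n_iter=20):
    """Cluster points by alternating assignment and centroid update steps."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2
                                  for a, b in zip(p, centroids[k])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        updated = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append([sum(col) / len(members)
                                for col in zip(*members)])
            else:
                updated.append(centroids[k])  # keep an empty cluster in place
        if updated == centroids:
            break  # converged: no centroid moved
        centroids = updated
    return labels, centroids


# Hypothetical two-feature records forming two well-separated groups.
records = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
           (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
labels, centers = kmeans(records, centroids=[records[0], records[3]])
print(labels)
```

In an intrusion detection setting, records that fall far from every centroid, or into small isolated clusters, are natural candidates for previously unseen attack types.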


2020 ◽  
Vol 12 (22) ◽  
pp. 3708 ◽  
Author(s):  
Ziyi Feng ◽  
Guanhua Huang ◽  
Daocai Chi

Many approaches have been developed to analyze remote sensing images. However, for large-scale classification problems, most algorithms have shown low computational efficiency and low accuracy. In this paper, the newly developed semi-supervised extreme learning machine (SS-ELM) framework, with a k-means clustering algorithm for image segmentation and a co-training algorithm to enlarge the sample sets, was used to classify the agricultural planting structure over large-scale areas. Data sets collected from a small-scale area within the Hetao Irrigation District (HID) at the upper reaches of the Yellow River basin were used to evaluate the SS-ELM framework. The results of the SS-ELM algorithm were compared with those of the random forest (RF), ELM, support vector machine (SVM) and semi-supervised support vector machine (S-SVM) algorithms. Then the SS-ELM algorithm was applied to analyze the complex planting structure of HID in 1986–2010 by comparing the remote sensing estimated results with the statistical data. In the small-scale case, the SS-ELM algorithm performed better than the RF, ELM, SVM, and S-SVM algorithms. For the SS-ELM algorithm, the average overall accuracy (OA) was in a range of 83.00–92.17%. By contrast, for the other four algorithms, the average OA values ranged from 56.97% to 92.84%. In the classification of planting structure in HID, the SS-ELM algorithm performed excellently in classification accuracy and computational efficiency for three major crops: maize, wheat, and sunflowers. The areas estimated by the SS-ELM algorithm from the remote sensing images were consistent with the statistical data, with differences within a range of 3–25%. This implies that the SS-ELM framework could serve as an effective method for the classification of complex planting structures with relatively fast training, good generalization, universal approximation capability, and reasonable learning accuracy.


2020 ◽  
Vol 10 (19) ◽  
pp. 6750
Author(s):  
Ditsuhi Iskandaryan ◽  
Francisco Ramos ◽  
Denny Asarias Palinggi ◽  
Sergio Trilles

The growing popularity of soccer has led to the prediction of match results becoming of interest to the research community. The aim of this research is to detect the effects of weather on the result of matches by implementing Random Forest, Support Vector Machine, K-Nearest Neighbors Algorithm, and Extremely Randomized Trees Classifier. The analysis was executed using the Spanish La Liga and Segunda division from the seasons 2013–2014 to 2017–2018 in combination with weather data. Two tasks were proposed as part of this study: the first was to find out whether the game will end in a draw, a win by the hosts or a victory by the guests, and the second was to determine whether the match will end in a draw or if one of the teams will win. The results show that, for the first task, Extremely Randomized Trees Classifier is a better method, with an accuracy of 65.9%, and, for the second task, Support Vector Machine yielded better results with an accuracy of 79.3%. Moreover, it is possible to predict whether the game will end in a draw or not with 0.85 AUC-ROC. Additionally, for comparative purposes, the analysis was also performed without weather data.
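The two prediction tasks differ only in how the target label is derived from the final score. A minimal sketch of that labelling, assuming each match record carries home and away goal counts; the function names and label strings are hypothetical.

```python
def three_way_label(home_goals, away_goals):
    """First task: home win, draw, or away win."""
    if home_goals > away_goals:
        return "home"
    if home_goals < away_goals:
        return "away"
    return "draw"


def draw_label(home_goals, away_goals):
    """Second task: draw versus a win by either team."""
    return "draw" if home_goals == away_goals else "win"


# Hypothetical final scores as (home goals, away goals).
scores = [(2, 1), (0, 0), (1, 3)]
print([three_way_label(h, a) for h, a in scores])
print([draw_label(h, a) for h, a in scores])
```

Merging the two win classes makes the second task a binary problem, which is one reason its reported accuracy is not directly comparable with that of the three-class task.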


Author(s):  
Babita Majhi ◽  
Sachin Singh Rajput ◽  
Ritanjali Majhi

The principal objective of this chapter is to build a churn prediction model that helps telecom operators foresee which customers are most likely to churn. Many studies have affirmed that AI technology is highly effective at anticipating this situation because it learns from past data. The prediction procedure involves three main stages: normalization of the data, then feature selection based on information gain, and finally classification using different AI methods, namely back propagation neural network (BPNNM), naïve Bayesian, k-nearest neighborhood (KNN), support vector machine (SVM), discriminant analysis (DA), decision tree (DT), and extreme learning machine (ELM). The simulation study shows that, of these seven methods, ELM ranks first with 92.10% accuracy, SVM with a polynomial kernel reaches 91.33%, and the MLANN-based CCP model ranks third with 90.4%. A similar observation is noted for 10-fold cross-validation.
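The first two stages of the procedure, normalization and information-gain feature selection, can be sketched as follows. This assumes min-max scaling (the chapter's exact normalization is not stated in the abstract) and features that are already discrete; the function names and the toy customer data are hypothetical.

```python
import math
from collections import Counter


def min_max(values):
    """Scale a numeric column to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


def entropy(labels):
    """Shannon entropy of a label list, in bits."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(feature, labels):
    """Reduction in label entropy after splitting on a discrete feature."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature):
        subset = [lab for f, lab in zip(feature, labels) if f == value]
        remainder += (len(subset) / total) * entropy(subset)
    return entropy(labels) - remainder


# Hypothetical customers: churned = 1, retained = 0.
churn = [1, 1, 0, 0]
features = {
    "contract": ["monthly", "monthly", "yearly", "yearly"],
    "gender": ["f", "m", "f", "m"],
}
ranking = sorted(features, key=lambda name: information_gain(features[name], churn),
                 reverse=True)
print(ranking)
print(min_max([2, 4, 6]))
```

Features would then be kept in ranked order up to a chosen cutoff before being passed to any of the seven classifiers listed in the abstract.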

