Machine Learning Applications based on SVM Classification: A Review

Advances in technology and the growth of data have created a need for faster and more reliable processing of massive data sets, and machine learning techniques are now used extensively. This paper therefore reviews data processing with the support vector machine (SVM) algorithm in different fields, since it is a reliable and efficient classification method in machine learning. Many works are surveyed to cover the use of the SVM classifier. SVM-based classification has been used in many fields, such as face recognition, disease diagnosis, text recognition, sentiment analysis, plant disease identification, and intrusion detection systems for network security. Based on this study, it can be concluded that the SVM classifier has achieved high accuracy in most of these applications, particularly face recognition and disease identification.

Decision Tree: A Machine Learning for Intrusion Detection

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f1234.0486s419 ◽

2019 ◽

Vol 8 (6S4) ◽

pp. 1126-1130

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Detection System ◽

Research Work ◽

Machine Learning Techniques ◽

Data Sets ◽

Legitimate User ◽

Learning Techniques ◽

Three Stages

Intrusion is a major threat in which data or a legitimate network is accessed without authorization, using a legitimate user's identity or any of the back doors and vulnerabilities in the network. IDS mechanisms are developed to detect intrusions at various levels. The objective of this research work is to improve Intrusion Detection System performance by applying machine learning techniques based on decision trees to detect and classify attacks. The adopted methodology processes the data sets in three stages. The experiments are conducted on KDDCUP99 data sets, based on the number of features. The three Bayesian modes are analyzed for data sets of different sizes, based on the total number of attacks. The time the classifier takes to build the model and its accuracy are analyzed.

Highly accurate and efficient two phase-intrusion detection system (TP-IDS) using distributed processing of HADOOP and machine learning techniques

Journal Of Big Data ◽

10.1186/s40537-021-00521-y ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Abhijit Dnyaneshwar Jadhav ◽

Vidyullatha Pellakuri

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Phase Ii ◽

Intrusion Detection System ◽

Detection System ◽

Research Work ◽

Machine Learning Techniques ◽

Support Vector ◽

Network Connections ◽

Learning Techniques

Abstract: Network security and data security are the biggest concerns nowadays. Every organization decides its future business processes based on past and day-to-day transactional data. This data may include consumers' confidential data, which needs to be kept secure. Also, when network connections are established with external communication devices or entities, care should be taken to authenticate them and block unwanted access. This involves identifying malicious connection nodes and normal connection nodes. To do so, the network's input traffic is continuously monitored to recognize malicious connection requests, called intrusions; this type of monitoring system is called an intrusion detection system (IDS). An IDS helps protect a network and its data from insecure and malicious network connections. Many such systems exist in real-time scenarios, but they have critical performance issues such as accuracy and efficiency. These issues are addressed in this research work on IDS using machine learning techniques and HDFS. The TP-IDS is designed in two phases to increase accuracy. In phase I of TP-IDS, Support Vector Machine (SVM) and k Nearest Neighbor (kNN) are used. In phase II of TP-IDS, Decision Tree (DT) and Naïve Bayes (NB) are used; phase II is the validation phase of the system, intended to increase accuracy. Both phases use the Hadoop Distributed File System as the underlying data storage and processing architecture, which allows parallel processing to increase the speed of the system and hence achieve efficiency in TP-IDS.

Blog Backlinks Malicious Domain Name Detection via Supervised Learning

International Journal on Semantic Web and Information Systems ◽

10.4018/ijswis.2021070101 ◽

2021 ◽

Vol 17 (3) ◽

pp. 1-17

Author(s):

Abdulrahman A. Alshdadi ◽

Ahmed S. Alghamdi ◽

Ali Daud ◽

Saqib Hussain

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Financial Loss ◽

Domain Name ◽

Hybrid Features ◽

Learning Techniques ◽

Web Spam ◽

Social Media Platforms

Web spam consists of unwanted requests on websites, low-quality backlinks, emails, and reviews generated by an automated program. It is a big threat for website owners: because of it, they can lose their top keyword rankings in search engines, which results in huge financial loss to the business. Over the years, researchers have tried to identify malicious domains based on specific features. However, features from the Lighthouse plugin, the Ahrefs tool, and social media platforms have been ignored. In this paper, the authors focus on detecting spam domain names in a dataset that mixes legitimate and spam domain names. The dataset is taken from Google webmaster tools. Machine learning models are applied to individual, distributed, and hybrid features, which significantly improves on the performance of existing machine learning techniques for malicious domain detection. The support vector machine (SVM) classifier achieves better accuracy than Naïve Bayes, C4.5, AdaBoost, and LogitBoost.

Prediction of venous thromboembolism with machine learning techniques in young-middle-aged inpatients

Scientific Reports ◽

10.1038/s41598-021-92287-9 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Hua Liu ◽

Hua Yuan ◽

Yongmei Wang ◽

Weiwei Huang ◽

Hui Xue ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Support Vector Machine Model ◽

Adverse Outcomes ◽

Machine Learning Techniques ◽

Support Vector ◽

Data Sets ◽

Middle Aged ◽

Machine Model ◽

Learning Techniques

Abstract: Accumulating studies appear to suggest that the risk factors for venous thromboembolism (VTE) among young-middle-aged inpatients are different from those among elderly people. Therefore, the current prediction models for VTE are not applicable to young-middle-aged inpatients. The aim of this study was to develop and externally validate a new prediction model for young-middle-aged people using machine learning methods. The clinical data sets of 167 inpatients with deep venous thrombosis (DVT) and/or pulmonary embolism (PE) and 406 patients without DVT or PE were compared and analysed with machine learning techniques. Five algorithms (logistic regression, decision tree, feed-forward neural network, support vector machine, and random forest) were used to train and prepare the models. The support vector machine model had the best performance, with an AUC of 0.806–0.944 (95% CI), 59% sensitivity, 99% specificity, and an accuracy of 87%. Although different top predictors of adverse outcomes appeared in the different models, life-threatening illness, fibrinogen, RBCs, and PT were featured more consistently across the models as top predictors of adverse outcomes. Clinical data sets of young and middle-aged inpatients can be used to accurately predict the risk of VTE with a support vector machine model.
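As a rough consistency check on the figures above, the sketch below shows how sensitivity, specificity, and accuracy relate through the class sizes. It assumes, purely for illustration, that the reported rates apply to the whole cohort of 167 VTE and 406 non-VTE patients; the abstract does not say which data split the metrics were computed on, and the function names are hypothetical.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # share of true cases that are detected
    specificity = tn / (tn + fp)          # share of non-cases correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def implied_accuracy(sensitivity, specificity, n_pos, n_neg):
    """Accuracy implied by sensitivity and specificity at the given class sizes."""
    return (sensitivity * n_pos + specificity * n_neg) / (n_pos + n_neg)


# Class sizes and rates are taken from the abstract; applying the rates to the
# full cohort is an assumption made only for this illustration.
acc = implied_accuracy(0.59, 0.99, n_pos=167, n_neg=406)
print(f"implied accuracy: {acc:.3f}")  # prints "implied accuracy: 0.873"
```

Under that assumption the implied accuracy is about 87%, in line with the reported value. It also shows why a model can reach high accuracy with modest sensitivity when the negative class is much larger than the positive class.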

Identification of bipolar disorder using a combination of multimodality magnetic resonance imaging and machine learning techniques

10.21203/rs.3.rs-15480/v4 ◽

2020 ◽

Author(s):

Hao Li ◽

Liqian Cui ◽

Liping Cao ◽

Yizhi Zhang ◽

Yueheng Liu ◽

...

Keyword(s):

Machine Learning ◽

Bipolar Disorder ◽

Functional Mri ◽

Added Value ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Accurate Identification ◽

Learning Techniques ◽

Svm Model

Abstract. Background: Bipolar disorder (BPD) is a common mood disorder that often goes misdiagnosed or undiagnosed. Recently, machine learning techniques have been combined with neuroimaging methods to aid in the diagnosis of BPD. However, most studies have focused on the construction of classifiers based on single-modality MRI. Hence, in this study, we aimed to construct a support vector machine (SVM) model using a combination of structural and functional MRI, which could be used to accurately identify patients with BPD. Methods: In total, 44 patients with BPD and 36 healthy controls were enrolled in the study. Clinical evaluation and MRI scans were performed for each subject. Next, image pre-processing, VBM and ReHo analyses were performed. The ReHo values of each subject in the clusters showing significant differences were extracted. Further, the LASSO approach was used to screen features. Based on the selected features, the SVM model was established, and discriminant analysis was performed. Results: After using the two-sample t-test with multiple comparisons, a total of 8 clusters were extracted from the data (VBM = 6; ReHo = 2). Next, we used both VBM and ReHo data to construct the new SVM classifier, which could effectively identify patients with BPD at an accuracy of 87.5% (95% CI: 72.5-95.3%), sensitivity of 86.4% (95% CI: 64.0-96.4%), and specificity of 88.9% (95% CI: 63.9-98.0%) in the test data (p=0.0022). Conclusions: A combination of structural and functional MRI can be of added value in the construction of SVM classifiers to aid in the accurate identification of BPD in the clinic.

Ensemble of SVM Classifiers for Spam Filtering

Encyclopedia of Artificial Intelligence ◽

10.4018/978-1-59904-849-9.ch086 ◽

2011 ◽

pp. 561-566

Author(s):

Ángela Blanco ◽

Manuel Martín-Merino

Keyword(s):

Machine Learning ◽

False Positive ◽

Machine Learning Techniques ◽

Support Vector ◽

Applied Machine Learning ◽

Internet Users ◽

Learning Techniques ◽

Svm Algorithm ◽

Misclassification Errors ◽

Voting Strategy

Unsolicited commercial email, also known as spam, is becoming a serious problem for Internet users and providers (Fawcett, 2003). Several researchers have applied machine learning techniques to improve the detection of spam messages. Naive Bayes models are the most popular (Androutsopoulos, 2000), but other authors have applied Support Vector Machines (SVM) (Drucker, 1999), boosting, and decision trees (Carreras, 2001) with remarkable results. SVM has proved particularly attractive in this application because it is robust against noise and can handle a large number of features (Vapnik, 1998). Errors in anti-spam email filtering are strongly asymmetric: false positive errors, that is, valid messages that are blocked, are prohibitively expensive. Several authors have proposed new versions of the original SVM algorithm that help to reduce false positive errors (Kolz, 2001, Valentini, 2004 & Kittler, 1998). In particular, it has been suggested that combining non-optimal classifiers can help to reduce the variance of the predictor (Valentini, 2004 & Kittler, 1998) and consequently the misclassification errors. To achieve this goal, different versions of the classifier are usually built by sampling the patterns or the features (Breiman, 1996). However, in our application it is expected that aggregating strong classifiers will further reduce false positive errors (Provost, 2001 & Hershop, 2005). In this paper, we address the problem of reducing false positive errors by combining classifiers based on multiple dissimilarities. To this end, a diverse set of classifiers is built from dissimilarities that reflect different features of the data. The dissimilarities are first embedded into a Euclidean space, where an SVM is fitted for each measure. Next, the classifiers are aggregated using a voting strategy (Kittler, 1998). The proposed method has been applied to the Spam UCI machine learning database (Hastie, 2001) with remarkable results.
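The aggregation step can be illustrated with a minimal voting sketch. It is not the authors' implementation: the per-classifier labels below are hypothetical stand-ins for the outputs of SVMs trained on different dissimilarities, and the stricter `conservative_spam_vote` rule is an added assumption meant only to show how a vote can be biased against false positives.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by the largest number of classifiers."""
    return Counter(labels).most_common(1)[0][0]


def conservative_spam_vote(labels, min_spam_votes):
    """Flag a message as spam only if enough classifiers agree (assumed rule)."""
    spam_votes = sum(1 for label in labels if label == "spam")
    return "spam" if spam_votes >= min_spam_votes else "ham"


# Hypothetical predictions: one row per message, one column per classifier,
# each classifier standing in for an SVM fitted on a different dissimilarity.
per_message_votes = [
    ["spam", "ham", "spam"],
    ["ham", "ham", "spam"],
    ["spam", "spam", "spam"],
]

plain = [majority_vote(votes) for votes in per_message_votes]
strict = [conservative_spam_vote(votes, min_spam_votes=3) for votes in per_message_votes]
print(plain)   # ['spam', 'ham', 'spam']
print(strict)  # ['ham', 'ham', 'spam']
```

Requiring unanimity blocks fewer messages, which trades some missed spam for fewer valid messages lost; that matches the asymmetric error costs described above, though the threshold itself is an illustrative choice.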

Prediction of Severity of Non Proliferated Diabetic Retinopathy Using Machine Learning Techniques

Journal of Computational and Theoretical Nanoscience ◽

10.1166/jctn.2020.9049 ◽

2020 ◽

Vol 17 (9) ◽

pp. 4219-4222

Author(s):

ManjulaSri Rayudu ◽

Srujana Pendam ◽

Srilaxmi Dasari

Keyword(s):

Machine Learning ◽

Diabetic Retinopathy ◽

Vision Loss ◽

Diversity Index ◽

Research Work ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Linear Discriminant ◽

Learning Techniques

All patients with Type 1 diabetes and more than 60% of patients with Type 2 diabetes suffer from diabetic retinopathy (DR). Diabetic retinopathy damages the retina of the eye and slowly leads to complete vision loss. The longer a patient has had diabetes, the higher the probability of DR. Hence, diabetic retinopathy must be identified at an early stage to avoid blindness. The objective of this research work is to predict the severity of non-proliferated diabetic retinopathy using machine learning techniques. Proliferated diabetic retinopathy (the later stage) is characterized by neovasculature in the retinal veins and is the final stage. Non-proliferated DR (the earlier stage) is identified by any of the following abnormalities: microaneurysms, hard exudates, and hemorrhages. Machine learning techniques are then employed to identify the class of DR. The following classification and regression techniques are employed to categorize the DR: the Gini Diversity Index method, linear discriminant analysis, ensemble methods with bagged and boosted trees, K-Nearest Neighbor, and Support Vector Machine classification. 89 images from the DRIVE database (DiaRet DB1) are classified using the machine learning techniques cited above. The maximum accuracy observed is 88.8%, achieved with the linear SVM classifier.
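The Gini diversity index named above is commonly used as a split criterion in decision trees: for class proportions p_i at a node, it equals 1 minus the sum of p_i squared. The sketch below computes it for hypothetical severity labels; the label names and the split are made up for illustration and do not come from the paper.

```python
from collections import Counter


def gini_diversity_index(labels):
    """Gini diversity index of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def weighted_split_impurity(left, right):
    """Size-weighted Gini index of the two child nodes produced by a split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_diversity_index(left) + (
        len(right) / n
    ) * gini_diversity_index(right)


# Hypothetical severity labels at a tree node (not data from the study).
node = ["mild", "mild", "severe", "severe"]
print(gini_diversity_index(node))                                   # 0.5
print(weighted_split_impurity(["mild", "mild"], ["severe", "severe"]))  # 0.0
```

A split that lowers the weighted index, as the perfectly separating one here does, is preferred when growing the tree.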

Multistage System-Based Machine Learning Techniques for Intrusion Detection in WiFi Network

Journal of Computer Networks and Communications ◽

10.1155/2019/4708201 ◽

2019 ◽

Vol 2019 ◽

pp. 1-13 ◽

Cited By ~ 1

Author(s):

Vu Viet Thang ◽

F. F. Pashchenko

Keyword(s):

Machine Learning ◽

Intrusion Detection ◽

Local Density ◽

Detection System ◽

Machine Learning Techniques ◽

Data Sets ◽

Incremental Clustering ◽

Multistage System ◽

Learning Techniques ◽

Semisupervised Clustering

The aim of machine learning is to develop algorithms that can learn from data and solve specific problems in some context as humans do. This paper presents some machine learning models applied to intrusion detection systems in WiFi networks. Firstly, we present an incremental semisupervised clustering method based on a graph. Incremental clustering, or one-pass clustering, is very useful when working with data streams or dynamic data. In fact, many incremental versions of traditional clustering methods such as K-means, Fuzzy C-Means, and DBSCAN have been developed. However, to the best of our knowledge, there is no incremental semisupervised clustering in the literature. Secondly, by combining a K-means algorithm and a local density score, we propose a fast outlier detection algorithm named FLDS. The complexity of FLDS is O(n^1.5), while the results obtained are comparable with those of the LOF algorithm. Thirdly, we introduce a multistage system based on machine learning techniques for mining intrusion detection data from 802.11 WiFi networks. Finally, experiments conducted on data sets extracted from 802.11 networks and on UCI data sets show the effectiveness of our proposed methods.

Performance evaluation of 2D face recognition techniques under image processing attacks

Modern Physics Letters B ◽

10.1142/s0217984918502123 ◽

2018 ◽

Vol 32 (19) ◽

pp. 1850212 ◽

Cited By ~ 7

Author(s):

Sahil Sharma ◽

Vijay Kumar

Keyword(s):

Machine Learning ◽

Image Processing ◽

Face Recognition ◽

Recognition System ◽

Two Dimensions ◽

Machine Learning Techniques ◽

Support Vector ◽

Ensemble Techniques ◽

Learning Techniques ◽

Face Recognition System

Face recognition is a widely researched topic in the field of computer vision. A lot of work has been done on facial recognition in two and three dimensions. However, very little work has addressed face recognition that is invariant to image processing attacks. This paper presents three classes of image processing attacks on face recognition systems, namely image enhancement attacks, geometric attacks, and image noise attacks. Well-known machine learning techniques have been used to train and test the face recognition system using two different databases, namely the Bosphorus Database and the University of Milano Bicocca three-dimensional (3D) Face Database (UMBDB). Three classes of classification models, namely discriminant analysis, support vector machine, and k-nearest neighbor, have been implemented along with ensemble techniques. The significance of machine learning techniques is discussed. Visual verification has been carried out with multiple image processing attacks.

Identification of bipolar disorder using a combination of multimodality magnetic resonance imaging and machine learning techniques

10.21203/rs.3.rs-15480/v3 ◽

2020 ◽

Author(s):

Hao Li ◽

Liqian Cui ◽

Liping Cao ◽

Yizhi Zhang ◽

Yueheng Liu ◽

...

Keyword(s):

Machine Learning ◽

Bipolar Disorder ◽

Functional Mri ◽

Added Value ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Accurate Identification ◽

Learning Techniques ◽

Svm Model

Abstract. Background: Bipolar disorder (BPD) is a common mood disorder that often goes misdiagnosed or undiagnosed. Recently, machine learning techniques have been combined with neuroimaging methods to aid in the diagnosis of BPD. However, most studies have focused on the construction of classifiers based on single-modality MRI. Hence, in this study, we aimed to construct a support vector machine (SVM) model using a combination of structural and functional MRI, which could be used to accurately identify patients with BPD. Methods: In total, 44 patients with BPD and 36 healthy controls were enrolled in the study. Clinical evaluation and MRI scans were performed for each subject. Next, image pre-processing, VBM and ReHo analyses were performed. The ReHo values of each subject in the clusters showing significant differences were extracted. Further, the LASSO approach was used to screen features. Based on the selected features, the SVM model was established, and discriminant analysis was performed. Results: After using the two-sample t-test with multiple comparisons, a total of 8 clusters were extracted from the data (VBM = 6; ReHo = 2). Next, we used both VBM and ReHo data to construct the new SVM classifier, which could effectively identify patients with BPD at an accuracy of 87.5% (95% CI: 72.5-95.3%), sensitivity of 86.4% (95% CI: 64.0-96.4%), and specificity of 88.9% (95% CI: 63.9-98.0%) in the test data (p=0.0022). Limitations: The sample size was small, and we were unable to eliminate the potential effects of medications. Conclusions: A combination of structural and functional MRI can be of added value in the construction of SVM classifiers to aid in the accurate identification of BPD in the clinic.
