Classification model for accuracy and intrusion detection using machine learning approach

Performance Evaluation of Different Machine Learning Classification Algorithms for Disease Diagnosis

International Journal of E-Health and Medical Communications ◽

10.4018/ijehmc.20211101.oa5 ◽

2021 ◽

Vol 12 (6) ◽

pp. 1-28

Author(s):

Munder Abdulatef Al-Hashem ◽

Ali Mohammad Alqudah ◽

Qasem Qananwah

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Performance Metrics ◽

Confusion Matrix ◽

Learning Algorithms ◽

Disease Diagnosis ◽

Machine Learning Algorithms ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Machine Learning Classification

Knowledge extraction in healthcare is a very challenging task because of problems such as noise and imbalanced datasets. These datasets are obtained from clinical studies, where uncertainty and variability are common. Recently, a large number of machine learning algorithms have been considered and evaluated to check their validity for use in the medical field. Usually, the classification algorithms are compared against medical experts who specialize in diagnosing certain diseases, and performance metrics provide an effective methodological evaluation of the classifiers. The performance metrics include accuracy, sensitivity, and specificity, derived from the confusion matrix of each algorithm. We used eight well-known machine learning algorithms and evaluated their performance on six medical datasets. Based on the experimental results, we conclude that the XGBoost and K-Nearest Neighbor classifiers were the best overall across the datasets and can be used for diagnosing various diseases.
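As a rough illustration of the metrics named in this abstract, the sketch below derives accuracy, sensitivity, and specificity from a binary confusion matrix. The function name `binary_metrics` and the toy labels are assumptions made for this example; they do not come from the paper.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Derive accuracy, sensitivity, and specificity from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        # Rows are true classes (negative, positive); columns are predictions.
        "confusion_matrix": [[tn, fp], [fn, tp]],
        "accuracy": (tp + tn) / total if total else 0.0,
        # Sensitivity: share of actual positives that were detected.
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Specificity: share of actual negatives that were rejected.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
    }


# Toy labels (hypothetical): 1 = disease present, 0 = disease absent.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
result = binary_metrics(y_true, y_pred)
print(result)
```

On imbalanced datasets, which the abstract names as a recurring problem, sensitivity and specificity are usually read together with accuracy, because accuracy alone can look high when one class dominates.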


A novel ensemble modeling for intrusion detection system

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v10i2.pp1963-1971 ◽

2020 ◽

Vol 10 (2) ◽

pp. 1963

Author(s):

Pullagura Indira Priyadarsini ◽

G. Anuradha

Keyword(s):

Feature Selection ◽

Intrusion Detection ◽

Intrusion Detection System ◽

Nearest Neighbor ◽

Detection System ◽

Distance Functions ◽

Classification Model ◽

Support Vector ◽

K Nearest Neighbor ◽

Data Set

The vast increase in data through internet services has made computer systems more vulnerable and more difficult to protect from malicious attacks. Intrusion detection systems (IDSs) must therefore be more effective in monitoring intrusions. An effective IDS architecture is built that employs a simple classification model and achieves low false alarm rates and high accuracy. Notably, IDSs handle enormous amounts of data traffic containing redundant and irrelevant features, which degrade IDS performance. Good feature selection approaches reduce unrelated and redundant features and attain better classification accuracy in IDSs. This paper proposes a novel ensemble model for IDS based on two algorithms: Fuzzy Ensemble Feature Selection (FEFS) and Fusion of Multiple Classifier (FMC). FEFS is a unification of five feature scores, which are obtained using feature-class distance functions and aggregated using the fuzzy union operation. FMC, on the other hand, is the fusion of three classifiers and works based on an ensemble decisive function. Experiments on the KDD Cup 99 data set show that the proposed system outperforms well-known methods such as Support Vector Machines (SVMs), K-Nearest Neighbor (KNN), and Artificial Neural Networks (ANNs). Our experiments clearly confirm the value of ensemble methodology for modeling IDSs, and hence our system is robust and efficient.
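The aggregation step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the standard fuzzy union (element-wise maximum) is applied to min-max normalized scores, three made-up score lists stand in for the paper's five, and the names `min_max_normalize`, `fuzzy_union`, and `top_k_features` are hypothetical. The paper's actual distance functions and normalization are not given in the abstract.

```python
def min_max_normalize(scores):
    """Rescale one list of feature scores to [0, 1] so it acts as a membership grade."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def fuzzy_union(score_lists):
    """Standard fuzzy union: per-feature maximum over all normalized score lists."""
    normalized = [min_max_normalize(scores) for scores in score_lists]
    return [max(column) for column in zip(*normalized)]


def top_k_features(score_lists, k):
    """Return the indices of the k features with the highest aggregated score."""
    combined = fuzzy_union(score_lists)
    ranked = sorted(range(len(combined)), key=lambda i: combined[i], reverse=True)
    return ranked[:k]


# Hypothetical scores for four features from three scoring functions.
scores = [
    [0.2, 0.8, 0.5, 0.1],
    [10.0, 30.0, 50.0, 20.0],
    [1.0, 3.0, 2.0, 0.0],
]
combined = fuzzy_union(scores)
selected = top_k_features(scores, 2)
print(combined, selected)
```

With the maximum as the union operator, a feature is kept if any single scoring function rates it highly, which favors recall of useful features over strict agreement between scores.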


Determining the Tiers of a Supply Chain Using Machine Learning Algorithms

Symmetry ◽

10.3390/sym13101934 ◽

2021 ◽

Vol 13 (10) ◽

pp. 1934

Author(s):

Kyoung Jong Park

Keyword(s):

Machine Learning ◽

Supply Chain ◽

Nearest Neighbor ◽

Confusion Matrix ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Accurate Information ◽

K Nearest Neighbor ◽

Multi Class Classification

Companies in the same supply chain influence each other, so sharing information enables more efficient supply chain management. An efficient supply chain must have a symmetry of information between participating entities, but in reality, the information is asymmetric, causing problems. The sustainability of the supply chain continues to be threatened because companies are reluctant to disclose information to others. If companies participating in the supply chain do not disclose accurate information, the next best way to improve the sustainability of the supply chain is to use data from the supply chain to determine each enterprise’s information. This study takes data from the supply chain and then uses machine learning algorithms to find which enterprise the data refer to when new data from unknown sources arise. The machine learning algorithms used are logistic regression, random forest, naive Bayes, decision tree, support vector machine, k-nearest neighbor, and multi-layer perceptron. Indicators for evaluating the performance of multi-class classification machine learning methods are accuracy, confusion matrix, precision, recall, and F1-score. The experimental results showed that LR and MLP accurately predicted companies (tiers), but NB, DT, RF, SVM, and K-NN did not accurately predict companies. In addition, the performance similarity of machine learning algorithms through experiments was classified into LR and MLP groups, NB and DT groups, and RF, SVM, and K-NN groups.
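The multi-class indicators listed in this abstract (accuracy, precision, recall, and F1-score) can be computed per class from the counts of true and predicted labels. The sketch below is a minimal illustration; the tier labels, the predictions, and the function name `per_class_report` are invented for the example and are not the study's data.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Compute precision, recall, and F1 for every class in a multi-class problem."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = Counter(zip(y_true, y_pred))  # (true, predicted) -> count
    report = {}
    for label in labels:
        tp = pairs[(label, label)]
        fp = sum(pairs[(other, label)] for other in labels if other != label)
        fn = sum(pairs[(label, other)] for other in labels if other != label)
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if (precision + recall) else 0.0)
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


# Hypothetical tiers and predictions.
y_true = ["tier1", "tier1", "tier2", "tier2", "tier3", "tier3"]
y_pred = ["tier1", "tier2", "tier2", "tier2", "tier3", "tier1"]

report = per_class_report(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
# Macro F1: unweighted mean of the per-class F1 scores.
macro_f1 = sum(r["f1"] for r in report.values()) / len(report)
print(report, accuracy, macro_f1)
```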


Detection of Ovarian Tumor Using Machine Learning Approaches A Review

10.46532/978-81-950008-1-4_103 ◽

2020 ◽

pp. 471-476

Author(s):

Gitanjali Wadhwa ◽

Mansi Mathur

Keyword(s):

Machine Learning ◽

Ovarian Cancer ◽

Deep Learning ◽

Nearest Neighbor ◽

Performance Metrics ◽

Machine Learning Algorithms ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbor ◽

Female Sex Hormones

The ovaries are an important part of the female reproductive system. The importance of these tiny glands derives from their production of female sex hormones and female gametes. These ductless, almond-shaped glandular organs lie on opposite sides of the uterus, attached by the ovarian ligament. Ovarian cancer can arise for several reasons, and it can be classified using a number of different techniques. Early prediction of ovarian cancer can slow its progression and may save countless lives. Computer-aided diagnosis (CAD) systems are a noninvasive means of finding ovarian cancer in its initial stages, which can spare patients anxiety and unnecessary biopsies. This review paper describes how different techniques can be used to classify ovarian tumors. The survey also discusses a comparison of different machine learning algorithms, such as K-Nearest Neighbor and Support Vector Machine, and deep learning techniques used in the classification of ovarian cancer. After comparing the different techniques for this type of cancer detection, it appears that deep learning techniques have provided good results, with good accuracy and other performance metrics.


Performance Evaluation of Different Machine Learning Classification Algorithms for Diseases Diagnosis

International Journal of E-Health and Medical Communications ◽

10.4018/ijehmc.20211101oa09 ◽

2021 ◽

Vol 12 (6) ◽

pp. 0-0

Keyword(s):

Machine Learning ◽

Nearest Neighbor ◽

Performance Metrics ◽

Confusion Matrix ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Classification Algorithms ◽

K Nearest Neighbor ◽

Machine Learning Classification ◽

Nearest Neighbor Classifiers

Knowledge extraction in healthcare is a very challenging task because of problems such as noise and imbalanced datasets. These datasets are obtained from clinical studies, where uncertainty and variability are common. Recently, a large number of machine learning algorithms have been considered and evaluated to check their validity for use in the medical field. Usually, the classification algorithms are compared against medical experts who specialize in diagnosing certain diseases, and performance metrics provide an effective methodological evaluation of the classifiers. The performance metrics include accuracy, sensitivity, and specificity, derived from the confusion matrix of each algorithm. We used eight well-known machine learning algorithms and evaluated their performance on six medical datasets. Based on the experimental results, we conclude that the XGBoost and K-Nearest Neighbor classifiers were the best overall across the datasets and can be used for diagnosing various diseases.


Optimizing Error Rate in Intrusion Detection System Using Artificial Neural Network Algorithm

International Journal of Emerging Research in Management and Technology ◽

10.23956/ijermt.v6i9.102 ◽

2018 ◽

Vol 6 (9) ◽

pp. 152

Author(s):

S. Vijaya Rani ◽

G. N. K. Suresh Babu

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Intrusion Detection ◽

Error Rate ◽

Learning Process ◽

Nearest Neighbor ◽

Detection System ◽

Support Vector ◽

K Nearest Neighbor ◽

Artificial Neural

Illegal hackers penetrate the servers and networks of corporate and financial institutions to gain money and extract vital information. Hacking ranges from a single computing system to many systems. Attackers gain access by sending malicious packets into the network through viruses, worms, Trojan horses, etc. Hackers scan a network with various tools and collect information about the network and its hosts. Hence it is essential to detect attacks as they enter a network. The methods available for intrusion detection include Naive Bayes, Decision Tree, Support Vector Machine, K-Nearest Neighbor, and Artificial Neural Networks. A neural network consists of processing units arranged in a complex manner and is able to store information and make it available for use. It acts like the human brain and acquires knowledge from the environment through a training and learning process. Many algorithms are available for the learning process. This work analyzes malicious packets and predicts the error rate in the detection of injured packets using artificial neural network algorithms.


Supervised Classifier Approach for Intrusion Detection on KDD with Optimal MapReduce Framework Model in Cloud Computing

Recent Patents on Computer Science ◽

10.2174/1573401315666190619113510 ◽

2019 ◽

Vol 12 ◽

Author(s):

M. Ilayaraja ◽

S. Hemalatha ◽

P. Manickam ◽

K. Sathesh Kumar ◽

K. Shankar

Keyword(s):

Machine Learning ◽

Cloud Computing ◽

Intrusion Detection ◽

Decision Tree ◽

Learning Strategies ◽

Nearest Neighbor ◽

Detection System ◽

K Nearest Neighbor ◽

Mapreduce Model ◽

The Web

Cloud computing is characterized as the provision of resources or services, available over the web, to clients on demand by cloud providers. It delivers everything as a service over the web based on client demand, for example operating systems, network equipment, storage, resources, and software. Nowadays, the Intrusion Detection System (IDS) is a powerful tool that enables experts to take action when the system is hacked through some intrusion. Most intrusion detection frameworks are built on machine learning strategies. The dataset used for intrusion detection here is Knowledge Discovery in Databases (KDD). This paper detects or classifies the intruded data using Machine Learning (ML) with the MapReduce model. The first phase uses the Hadoop MapReduce model to reduce the size of the database, with optimal weights chosen for the reducer model, and the second phase uses a Decision Tree (DT) classifier to detect the data. This DT classifier uses an appropriate classifier to decide the class labels for the non-homogeneous leaf nodes. The decision tree fragment gives a coarse segment profile, while the leaf-level classifier can give information about the attributes that influence the label within a segment. The proposed approach achieves a detection accuracy of 96.21%, compared with existing classifiers such as Neural Network (NN), Naive Bayes (NB), and K-Nearest Neighbor (KNN).


Efficient detection of hacker community based on twitter data using complex networks and machine learning algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-210458 ◽

2021 ◽

pp. 1-17

Author(s):

Ahmed Al-Tarawneh ◽

Ja’afer Al-Saraireh

Keyword(s):

Machine Learning ◽

Complex Networks ◽

Nearest Neighbor ◽

Learning Algorithm ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Support Vector ◽

K Nearest Neighbor ◽

Efficient Detection ◽

Suggested Keywords

Twitter is one of the most popular platforms used to share and post ideas. Hackers and anonymous attackers use these platforms maliciously, and their behavior can be used to predict the risk of future attacks by gathering and classifying hackers’ tweets using machine learning techniques. Previous approaches for detecting infected tweets are based on human effort or text analysis, so they are limited in capturing the hidden text between tweet lines. The main aim of this research paper is to enhance the efficiency of hacker detection on the Twitter platform using the complex networks technique with adapted machine learning algorithms. This work presents a methodology that collects a list of users, together with their followers who share posts with similar interests, from a hackers’ community on Twitter. The list is built from a set of suggested keywords that are commonly used by hackers in their tweets. After that, a complex network is generated for all users to find relations among them in terms of network centrality, closeness, and betweenness. After extracting these values, a dataset of the most influential users in the hacker community is assembled. Subsequently, tweets belonging to users in the extracted dataset are gathered and classified into positive and negative classes. The output of this process is used in a machine learning process that applies different algorithms. This research builds and investigates an accurate dataset containing real users who belong to a hackers’ community. Correctly classified instances were measured for accuracy using the average values of the K-nearest neighbor, Naive Bayes, Random Tree, and support vector machine techniques, demonstrating about 90% and 88% accuracy for cross-validation and percentage split respectively. Consequently, the proposed network cyber Twitter model is able to detect hackers and determine whether tweets pose a risk to institutions and individuals in the future, providing early warning of possible attacks.
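The network measures mentioned in this abstract can be illustrated on a tiny graph. The sketch below computes degree and closeness centrality with a breadth-first search; betweenness is left out for brevity. The graph, the user names, and the function names are all hypothetical, and the abstract does not say which tools or definitions the authors used.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def degree_centrality(graph):
    """Fraction of the other nodes that each node is directly linked to."""
    n = len(graph)
    return {node: len(neighbors) / (n - 1) for node, neighbors in graph.items()}


def closeness_centrality(graph):
    """Reachable node count divided by the summed distance to those nodes."""
    result = {}
    for node in graph:
        dist = bfs_distances(graph, node)
        total = sum(dist.values())
        result[node] = (len(dist) - 1) / total if total else 0.0
    return result


# Hypothetical undirected follower graph (adjacency sets).
graph = {
    "a": {"b", "c", "d"},
    "b": {"a"},
    "c": {"a"},
    "d": {"a", "e"},
    "e": {"d"},
}
degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
most_central = max(closeness, key=closeness.get)
print(degree, closeness, most_central)
```

Ranking users by such scores is one plausible way to shortlist the "most influential" accounts before collecting their tweets.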


A Comparative Analysis of Machine Learning Algorithms Modeled from Machine Vision-Based Lettuce Growth Stage Classification in Smart Aquaponics

International Journal of Environmental Science and Development ◽

10.18178/ijesd.2020.11.9.1288 ◽

2020 ◽

Vol 11 (9) ◽

pp. 442-449 ◽

Cited By ~ 1

Author(s):

Sandy C. Lauguico ◽

Ronnie S. Concepcion II ◽

Jonnel D. Alejandrino ◽

Rogelio Ruzcko Tobias ◽

...

Keyword(s):

Machine Learning ◽

Comparative Analysis ◽

Machine Vision ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

Urban Farming ◽

K Nearest Neighbor ◽

Lettuce Growth

The growing problem of food scarcity drives innovation in urban farming. One urban farming method is smart aquaponics. However, for a smart aquaponics system to yield crops successfully, it needs intensive monitoring, control, and automation. An efficient way of implementing this is to use vision systems and machine learning algorithms to optimize the capabilities of the farming technique. To realize this, a comparative analysis of three machine learning estimators was conducted: Logistic Regression (LR), K-Nearest Neighbor (KNN), and Linear Support Vector Machine (L-SVM). Each algorithm was modeled on features extracted by machine vision from images of lettuce raised in a smart aquaponics setup. Each model was optimized to improve its cross-validation and hold-out validation results. The results showed that KNN, with the tuned hyperparameters n_neighbors=24, weights='distance', algorithm='auto', leaf_size=10, was the most effective model for the given dataset, yielding a cross-validation mean accuracy of 87.06% and a classification accuracy of 91.67%.
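The hyperparameter names in this abstract match those of scikit-learn's `KNeighborsClassifier`, although the abstract does not name the library. To show what `weights='distance'` changes, the following stdlib-only sketch implements distance-weighted voting by hand. The two-dimensional feature points, the growth-stage labels, and the function name `knn_predict` are invented for illustration and are not the paper's data or code.

```python
import math
from collections import defaultdict


def knn_predict(train_points, train_labels, query, n_neighbors=3):
    """Classify query by inverse-distance-weighted voting among its nearest neighbors."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = defaultdict(float)
    for distance, label in distances[:n_neighbors]:
        if distance == 0.0:
            return label  # An exact match decides the class outright.
        votes[label] += 1.0 / distance  # Closer neighbors count for more.
    return max(votes, key=votes.get)


# Hypothetical two-feature samples for two lettuce growth stages.
train_points = [(1.0, 1.0), (1.2, 0.8), (3.0, 3.0), (3.2, 2.9), (2.9, 3.1)]
train_labels = ["vegetative", "vegetative", "harvest", "harvest", "harvest"]

# With all five neighbors, an unweighted majority vote would pick "harvest" (3 vs 2),
# but the two much closer "vegetative" points outweigh them under distance weighting.
weighted = knn_predict(train_points, train_labels, (1.5, 1.5), n_neighbors=5)
print(weighted)
```

A large neighbor count such as the reported `n_neighbors=24` is easier to tolerate with distance weighting, since far-away neighbors contribute little to the vote.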


Book Genre Categorization Using Machine Learning Algorithms (K-Nearest Neighbor, Support Vector Machine and Logistic Regression) using Customized Dataset

International Journal of Computer Science and Mobile Computing ◽

10.47760/ijcsmc.2021.v10i03.002 ◽

2021 ◽

Vol 10 (3) ◽

pp. 14-25

Author(s):

Parilkumar Shiroya

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Logistic Regression ◽

Nearest Neighbor ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor

