IoT Dataset Validation Using Machine Learning Techniques for Traffic Anomaly Detection

With advances in engineering and science, smart systems are increasingly deployed, driving faster growth in IoT network traffic. The limited power and computing capacity of IoT devices also raise concerns about security vulnerabilities. Machine learning-based techniques have recently gained credibility for detecting network anomalies, including in IoT networks. However, machine learning techniques cannot work without representative data. Given the scarcity of IoT datasets, the DAD dataset emerged as an instrument for characterizing the behavior of dedicated IoT-MQTT networks. This paper aims to validate the DAD dataset by applying Logistic Regression, Naive Bayes, Random Forest, AdaBoost, and Support Vector Machine to detect traffic anomalies in IoT. To obtain the best results, techniques for handling imbalanced data, feature selection, and grid search for hyperparameter optimization were used. The experimental results show that the proposed dataset supports a high detection rate in all the experiments, with the tree-based models providing the best mean accuracy of 0.99 and a low false-positive rate, ensuring effective anomaly detection.
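As a rough illustration of the grid search step mentioned in this abstract, the sketch below scores every combination of candidate hyperparameters and keeps the best one. It is a minimal, library-free sketch and not the authors' setup: the toy samples, the threshold rule, the parameter names `threshold` and `margin`, and the balanced-accuracy scoring function are all assumptions made for the example.

```python
from itertools import product


def grid_search(score_fn, param_grid):
    """Score every parameter combination; return (best_params, best_score)."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        # Strict comparison: the first combination reaching the top score wins.
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical toy data: (feature value, label), label 1 = anomaly.
# The classes are deliberately imbalanced (5 normal, 2 anomalous).
samples = [(0.1, 0), (0.2, 0), (0.3, 0), (0.35, 0), (0.4, 0), (0.8, 1), (0.9, 1)]


def balanced_accuracy(threshold, margin):
    """Mean per-class recall for the rule 'anomaly if x > threshold + margin'."""
    cut = threshold + margin
    tp = sum(1 for x, y in samples if y == 1 and x > cut)
    tn = sum(1 for x, y in samples if y == 0 and x <= cut)
    pos = sum(1 for _, y in samples if y == 1)
    neg = len(samples) - pos
    return 0.5 * (tp / pos + tn / neg)


best_params, best_score = grid_search(
    balanced_accuracy,
    {"threshold": [0.2, 0.5, 0.7], "margin": [0.0, 0.05]},
)
print(best_params, best_score)
```

Balanced accuracy is used here instead of plain accuracy because, with imbalanced classes, a rule that always predicts the majority class can still score a high plain accuracy.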

Early Weed Detection Using Image Processing and Machine Learning Techniques in an Australian Chilli Farm

Agriculture ◽ 10.3390/agriculture11050387 ◽ 2021 ◽ Vol 11 (5) ◽ pp. 387

Author(s): Nahina Islam ◽ Md Mamunur Rashid ◽ Santoso Wibowo ◽ Cheng-Yuan Xu ◽ Ahsan Morshed ◽ ...

Keyword(s): Machine Learning ◽ False Positive Rate ◽ Learning Algorithms ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Support Vector ◽ Weed Detection ◽ Learning Techniques ◽ Positive Rate ◽ Uav Images

This paper explores the potential of machine learning algorithms for weed and crop classification from UAV images. The identification of weeds in crops is a challenging task that has been addressed through orthomosaicing of images, feature extraction and labelling of images to train machine learning algorithms. In this paper, the performances of several machine learning algorithms, random forest (RF), support vector machine (SVM) and k-nearest neighbours (KNN), are analysed to detect weeds using UAV images collected from a chilli crop field located in Australia. The evaluation metrics used to compare performance were accuracy, precision, recall, false positive rate and kappa coefficient. MATLAB is used to simulate the machine learning algorithms, and the achieved weed detection accuracies are 96% using RF, 94% using SVM and 63% using KNN. Based on this study, RF and SVM algorithms are efficient and practical to use, and can be implemented easily for detecting weeds from UAV images.
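The five evaluation metrics named in this abstract can all be derived from a binary confusion matrix. The following is a minimal sketch in Python (the paper itself used MATLAB). The counts passed in are hypothetical and are not taken from the study.

```python
def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, FPR and Cohen's kappa from counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    false_positive_rate = fp / (fp + tn)
    # Cohen's kappa: observed agreement corrected for agreement expected by chance.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    p_chance = p_yes + p_no
    kappa = (accuracy - p_chance) / (1 - p_chance)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "false_positive_rate": false_positive_rate,
        "kappa": kappa,
    }


# Hypothetical counts for a weed (positive) vs. crop (negative) classifier.
metrics = binary_metrics(tp=45, fp=5, fn=5, tn=45)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Kappa is worth reporting alongside accuracy because it discounts the agreement a classifier would reach by chance given the class proportions.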

A Review of Machine Learning Techniques for Anomaly Detection in Static Graphs

Implementing Computational Intelligence Techniques for Security Systems Design - Advances in Computational Intelligence and Robotics ◽ 10.4018/978-1-7998-2418-3.ch007 ◽ 2020 ◽ pp. 146-162

Author(s): Hesham M. Al-Ammal

Keyword(s): Machine Learning ◽ Neural Networks ◽ Anomaly Detection ◽ Real Life ◽ Machine Learning Techniques ◽ Support Vector ◽ Learning Methods ◽ Data Set ◽ Learning Techniques ◽ Vector Machines

Detecting anomalies in a given data set is a vital step in several cybersecurity applications, including intrusion detection, fraud, and social network analysis. Many of these techniques detect anomalies by examining graph-based data. Analyzing graphs makes it possible to capture relationships, communities, and anomalies. The advantage of using graphs is that many real-life situations can be easily modeled by a graph that captures their structure and inter-dependencies. Although anomaly detection in graphs dates back to the 1990s, recent research has applied machine learning methods to anomaly detection over graphs. This chapter concentrates on static graphs (both labeled and unlabeled) and summarizes some of these recent studies in machine learning for anomaly detection in graphs. These include methods such as support vector machines, neural networks, generative neural networks, and deep learning methods. The chapter reflects on the successes and challenges of using these methods in the context of graph-based anomaly detection.

Detecting Website Defacements Based on Machine Learning Techniques and Attack Signatures

Computers ◽ 10.3390/computers8020035 ◽ 2019 ◽ Vol 8 (2) ◽ pp. 35 ◽ Cited By ~ 2

Author(s): Xuan Dau Hoang ◽ Ngoc Tuong Nguyen

Keyword(s): Machine Learning ◽ Web Applications ◽ False Positive Rate ◽ Training Data ◽ Machine Learning Techniques ◽ Web Pages ◽ Government Organizations ◽ Detection Model ◽ Learning Techniques ◽ Positive Rate

Defacement attacks have long been considered one of the prime threats to websites and web applications of companies, enterprises, and government organizations. Defacement attacks can bring serious consequences to website owners, including immediate interruption of website operations and damage to the owner's reputation, which may result in huge financial losses. Many solutions have been researched and deployed for monitoring and detecting website defacement attacks, such as those based on checksum comparison, diff comparison, DOM tree analysis, and complicated algorithms. However, some solutions only work on static websites and others demand extensive computing resources. This paper proposes a hybrid defacement detection model that combines machine learning-based detection with signature-based detection. The machine learning-based detection first constructs a detection profile using training data of both normal and defaced web pages. Then, it uses the profile to classify monitored web pages as either normal or attacked. The machine learning-based component can effectively detect defacements for both static pages and dynamic pages. The signature-based detection, on the other hand, is used to boost the model's processing performance for common types of defacements. Extensive experiments show that our model produces an overall accuracy of more than 99.26% and a false positive rate of about 0.27%. Moreover, our model is suitable for implementing a real-time website defacement monitoring system because it does not demand extensive computing resources.
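One way to picture the hybrid idea is a cheap signature lookup that handles common defacements, with a trained classifier handling everything else. The sketch below is only an illustration under stated assumptions: the abstract does not specify the order in which the two components run, and the signature strings, the `make_hybrid_detector` helper, and the `toy_classifier` stand-in are hypothetical.

```python
from typing import Callable, Iterable


def make_hybrid_detector(
    signatures: Iterable[str], classifier: Callable[[str], bool]
) -> Callable[[str], str]:
    """Build a detector that tries signature matching before the classifier."""
    known = [signature.lower() for signature in signatures]

    def detect(page_html: str) -> str:
        text = page_html.lower()
        # Fast path (assumed ordering): known attack signatures are plain substrings.
        if any(signature in text for signature in known):
            return "attacked"
        # Slow path: defer to the (here hypothetical) machine learning model.
        return "attacked" if classifier(text) else "normal"

    return detect


def toy_classifier(text: str) -> bool:
    """Hypothetical stand-in for a model trained on normal and defaced pages."""
    return "defaced" in text


detect = make_hybrid_detector(["hacked by"], toy_classifier)
print(detect("<h1>HACKED BY someone</h1>"))
print(detect("<p>Welcome to our site</p>"))
```

Running the signature check first means that pages matching a known pattern never reach the more expensive classifier, which is one plausible reading of how signatures could boost processing performance.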

An Ensemble-Based Malware Detection Model Using Minimum Feature Set

MENDEL ◽ 10.13164/mendel.2019.2.001 ◽ 2019 ◽ Vol 25 (2) ◽ pp. 1-10 ◽ Cited By ~ 2

Author(s): Ivan Zelinka ◽ Eslam Amer

Keyword(s): Machine Learning ◽ False Positive Rate ◽ Malware Detection ◽ Machine Learning Techniques ◽ Detection Methods ◽ Detection Model ◽ Learning Techniques ◽ Proposed Model ◽ Positive Rate ◽ Minimum Number

Current commercial antivirus detection engines still rely on signature-based methods. However, with the huge increase in the volume of new malware, current detection methods are becoming unsuitable. In this paper, we introduce a malware detection model based on ensemble learning. The model is trained using the minimum number of significant features that are extracted from the file header. Evaluations show that the ensemble models slightly outperform individual classification models. Experimental evaluations show that our model can predict unseen malware with an accuracy rate of 0.998 and a false positive rate of 0.002. The paper also includes a comparison between the performance of the proposed model and that of different machine learning techniques. We emphasize the use of machine learning-based approaches to replace conventional signature-based methods.

Detection of Touchscreen-Based Urdu Braille Characters Using Machine Learning Techniques

Mobile Information Systems ◽ 10.1155/2021/7211419 ◽ 2021 ◽ Vol 2021 ◽ pp. 1-16

Author(s): Sana Shokat ◽ Rabia Riaz ◽ Sanam Shahla Rizvi ◽ Inayat Khan ◽ Anand Paul

Keyword(s): Machine Learning ◽ Support Vector Machine ◽ Predictive Value ◽ Visually Impaired ◽ Machine Learning Techniques ◽ Receiver Operating Curve ◽ Support Vector ◽ Total N ◽ Learning Techniques ◽ Positive Rate

Advances in technology are making it easier for visually impaired people to read and write Braille. Learning Braille in one's native language can be more convenient for its users. This study proposes an improved backend processing algorithm for an earlier developed touchscreen-based Braille text entry application. This application is used to collect Urdu Braille data, which is then converted to Urdu text. Braille to text conversion has been done on Hindi, Arabic, Bangla, Chinese, English, and other languages. For this study, multiclass Urdu Braille Grade 1 data were collected, with the 39 characters of Urdu represented by class 1, Alif (ﺍ), to class 39, Bri Yay (ے). A total of N = 144 cases were collected for each class. The dataset was collected from visually impaired students from The National Special Education School. Visually impaired users entered the Urdu Braille alphabets using touchscreen devices. The final dataset contained N = 5638 cases. A Reconstruction Independent Component Analysis (RICA)-based feature extraction model was created for Braille to Urdu text classification. The classes were divided into three groups of 13 each, i.e., category-1 (1–13), Alif-Zaal (ﺫ - ﺍ), category-2 (14–26), Ray-Fay (ﻒ - ﺮ), and category-3 (27–39), Kaaf-Bri Yay (ے - ﻕ), to aid visualization and understanding. The performance was evaluated in terms of true positive rate, true negative rate, positive predictive value, negative predictive value, false positive rate, total accuracy, and area under the receiver operating curve. For comparison, robust machine learning techniques, such as support vector machine, decision tree, and K-nearest neighbors, were used. Among all the classifiers, support vector machine achieved the highest performance, with 99.73% accuracy. Currently, this work has been done only on Grade 1 Urdu Braille. In the future, we plan to extend this work to Grade 2 Urdu Braille with text and speech feedback on touchscreen-based Android phones.

Challenges for Tractogram Filtering

Mathematics and Visualization - Anisotropy Across Fields and Scales ◽ 10.1007/978-3-030-56215-1_7 ◽ 2021 ◽ pp. 149-168

Author(s): Daniel Jörgens ◽ Maxime Descoteaux ◽ Rodrigo Moreno

Keyword(s): Machine Learning ◽ White Matter ◽ False Positive Rate ◽ Machine Learning Techniques ◽ Post Processing ◽ Learning Techniques ◽ Processing Step ◽ Positive Rate ◽ Neural Fiber ◽ Modern Machine

Tractography aims at describing the most likely neural fiber paths in white matter. A general issue of current tractography methods is their large false-positive rate. An approach to deal with this problem is tractogram filtering, in which anatomically implausible streamlines are discarded as a post-processing step after tractography. In this chapter, we review the main approaches and methods from the literature that are relevant for the application of tractogram filtering. Moreover, we give a perspective on the central challenges for the development of new methods, including modern machine learning techniques, in this field in the next few years.

Using Machine Learning Algorithms on Prediction of Stock Price

Journal of Modeling and Optimization ◽ 10.32732/jmo.2020.12.2.84 ◽ 2020 ◽ Vol 12 (2) ◽ pp. 84-99

Author(s): Li-Pang Chen

Keyword(s): Machine Learning ◽ Stock Price ◽ Short Term Memory ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Support Vector ◽ Short Term ◽ Learning Techniques ◽ Historical Database ◽ Long Short Term Memory

In this paper, we investigate the analysis and prediction of time-dependent data. We focus our attention on four different stocks selected from the Yahoo Finance historical database. To build models and predict the future stock price, we consider three different machine learning techniques: Long Short-Term Memory (LSTM), Convolutional Neural Networks (CNN) and Support Vector Regression (SVR). By treating close price, open price, daily low, daily high, adjusted close price, and volume of trades as predictors in machine learning methods, it can be shown that the prediction accuracy is improved.

A Comparative Study of Different Machine Learning Algorithms for Disease Prediction

International Journal of Advanced Research in Computer Science and Software Engineering ◽ 10.23956/ijarcsse/v7i7/0177 ◽ 2017 ◽ Vol 7 (7) ◽ pp. 172

Author(s): Anantvir Singh Romana

Keyword(s): Machine Learning ◽ Subsequent Treatment ◽ Machine Learning Algorithms ◽ Machine Learning Techniques ◽ Support Vector ◽ Disease Prediction ◽ Classification Problems ◽ Learning Techniques ◽ Neural Network Classifiers ◽ Diagnostic Detection

Accurate diagnostic detection of disease in a patient is critical and may alter the subsequent treatment and increase the chances of survival. Machine learning techniques have been instrumental in disease detection and are currently being used in various classification problems due to their accurate prediction performance. Different techniques may provide different accuracies, and it is therefore imperative to use the most suitable method, the one that provides the best results. This research seeks to provide a comparative analysis of Support Vector Machine, Naïve Bayes, J48 Decision Tree and neural network classifiers on breast cancer and diabetes datasets.

Predictive modeling for peri-implantitis by using machine learning techniques

Scientific Reports ◽ 10.1038/s41598-021-90642-4 ◽ 2021 ◽ Vol 11 (1)

Author(s): Tomoaki Mameno ◽ Masahiro Wada ◽ Kazunori Nozaki ◽ Toshihito Takahashi ◽ Yoshitaka Tsujioka ◽ ...

Keyword(s): Machine Learning ◽ Demographic Data ◽ Risk Indicators ◽ Machine Learning Techniques ◽ Support Vector ◽ Machine Learning Methods ◽ Complex Interactions ◽ Learning Techniques ◽ Increased Risk ◽ Vector Machines

The purpose of this retrospective cohort study was to create a model for predicting the onset of peri-implantitis by using machine learning methods and to clarify interactions between risk indicators. This study evaluated 254 implants, 127 with and 127 without peri-implantitis, from among 1408 implants with at least 4 years in function. Demographic data and parameters known to be risk factors for the development of peri-implantitis were analyzed with three models: logistic regression, support vector machines, and random forests (RF). RF had the highest performance in predicting the onset of peri-implantitis (AUC: 0.71, accuracy: 0.70, precision: 0.72, recall: 0.66, and f1-score: 0.69). The factor that had the most influence on prediction was implant functional time, followed by oral hygiene. In addition, PCR of more than 50% to 60%, smoking more than 3 cigarettes/day, KMW less than 2 mm, and the presence of less than two occlusal supports tended to be associated with an increased risk of peri-implantitis. Moreover, these risk indicators were not independent and had complex effects on each other. The results of this study suggest that peri-implantitis onset was predicted in 70% of cases by RF, which allows consideration of nonlinear relational data with complex interactions.
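The reported f1-score can be cross-checked against the reported precision and recall, since F1 is their harmonic mean. The short sketch below uses only the figures quoted in the abstract. The function name `f1_score` and the chance-level baseline for the balanced cohort are illustrative additions.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values reported for the random forest model in the abstract.
reported_precision = 0.72
reported_recall = 0.66
computed_f1 = f1_score(reported_precision, reported_recall)

# With 127 positive and 127 negative implants, always guessing one class
# would give an accuracy of 0.5, which is the baseline for the reported 0.70.
chance_accuracy = 127 / (127 + 127)

print(f"F1 from precision and recall: {computed_f1:.2f}")
print(f"Chance-level accuracy for the balanced cohort: {chance_accuracy:.2f}")
```

The computed value rounds to the f1-score quoted in the abstract, so the three reported figures are mutually consistent.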
