An Evaluation of Machine Learning Classifiers for Prediction of Attacks to Secure Green IoT Infrastructure

The Internet of Things (IoT) is an emerging technology that allows many devices to be connected in an unparalleled way. Despite having many beneficial applications, IoT technology presents significant emission risks due to the large number of devices used in the applications. Therefore, to gain maximum benefit from IoT, we must move towards green IT. On the other hand, cloud computing has been successfully used to provide limitless computational storage and other resources for a variety of IoT devices across the internet. Unfortunately, security in cloud computing for IoT remains a concern. Motivated by the goal of creating a better atmosphere for IoT and ensuring its resilience to risks and attacks, this report reveals ways to decrease the impact of energy use by IoT on the environment. Additionally, it addresses research concerns for IoT security and reflects on how to protect green IoT networks through the use of an effective machine learning intrusion detection technology to deter attacks on IoT platforms. To do that, we first evaluated some existing ML classifiers, such as Artificial Neural Network (ANN), Support Vector Machine (SVM), Gaussian Naïve Bayes (NB), Decision Tree (DT) and Random Forest (RF), with the older KDD’99 dataset. The accuracy was extremely high for all classifiers except Gaussian NB, whose accuracy was below 90%. The SVM was the highest at 99.24% accuracy, with a loss of 4.68% in the last epoch of training. However, when the same ML classifiers were evaluated on a more recent dataset (ISCXIDS2012), all of them dropped in predictive accuracy, even after the hyper-parameters were altered. The ANN fell to its lowest accuracy, 85.92%, and the SVM, which had been relatively accurate, dropped to 90.02%. The NB algorithm produced approximately 67.9% accuracy, making it the less accurate choice on both datasets.
Based on these findings, we proceeded to propose an efficient model with enough hidden layers and nodes to increase the detection accuracy and to outperform the existing ML classifiers when evaluated with a more recent dataset.

An Experimental Analysis of Attack Classification Using Machine Learning in IoT Networks

Sensors ◽

10.3390/s21020446 ◽

2021 ◽

Vol 21 (2) ◽

pp. 446

Author(s):

Andrew Churcher ◽

Rehmat Ullah ◽

Jawad Ahmad ◽

Sadaqat ur Rehman ◽

Fawad Masood ◽

...

Keyword(s):

Machine Learning ◽

Binary Classification ◽

Denial Of Service ◽

Intrusion Detection Systems ◽

Support Vector ◽

Distributed Denial Of Service ◽

Detection Systems ◽

Iot Devices ◽

Artificial Neural Network Ann ◽

Multi Class Classification

In recent years, there has been a massive increase in the number of Internet of Things (IoT) devices as well as the data generated by such devices. The participating devices in IoT networks can be problematic due to their resource-constrained nature, and integrating security on these devices is often overlooked. This has resulted in attackers having an increased incentive to target IoT devices. As the number of attacks possible on a network increases, it becomes more difficult for traditional intrusion detection systems (IDS) to cope with these attacks efficiently. In this paper, we highlight several machine learning (ML) methods such as k-nearest neighbour (KNN), support vector machine (SVM), decision tree (DT), naive Bayes (NB), random forest (RF), artificial neural network (ANN), and logistic regression (LR) that can be used in IDS. In this work, ML algorithms are compared for both binary and multi-class classification on the Bot-IoT dataset. We experimentally compared the aforementioned ML algorithms on several metrics: accuracy, precision, recall, F1 score, and log loss. In the case of the HTTP distributed denial-of-service (DDoS) attack, the accuracy of RF is 99%. Furthermore, the precision, recall, F1 score, and log loss results reveal that RF outperforms the other algorithms on all types of attacks in binary classification. However, in multi-class classification, KNN outperforms the other ML algorithms with an accuracy of 99%, which is 4% higher than RF.
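The five metrics named in the abstract above can be computed from true labels and predicted probabilities. The following is a minimal sketch for the binary case using only the Python standard library; the function name `binary_metrics`, the 0.5 decision threshold, and the toy labels are assumptions made for illustration and are not taken from the paper.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Compute accuracy, precision, recall, F1 and log loss for 0/1 labels.

    y_true: list of 0/1 ground-truth labels (1 = attack, by assumption).
    y_prob: list of predicted probabilities for class 1.
    threshold: hypothetical decision threshold, not from the paper.
    """
    if not y_true or len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must be non-empty and equal length")

    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))

    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Clip probabilities so log(0) never occurs.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "log_loss": log_loss,
    }


# Toy example: six flows, three of them attacks.
metrics = binary_metrics([1, 0, 1, 1, 0, 0], [0.9, 0.2, 0.4, 0.8, 0.6, 0.1])
print(metrics)
```

Accuracy alone can mislead when attack and benign traffic are imbalanced, which is one reason to report precision, recall, F1, and log loss alongside it.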

Modeling of Cutting Force in the Turning of AISI 4340 Using Gaussian Process Regression Algorithm

Applied Sciences ◽

10.3390/app11094055 ◽

2021 ◽

Vol 11 (9) ◽

pp. 4055

Author(s):

Mahdi S. Alajmi ◽

Abdullah M. Almeshal

Keyword(s):

Gaussian Process ◽

Cutting Force ◽

Predictive Accuracy ◽

Gaussian Process Regression ◽

Machining Process ◽

Support Vector ◽

Process Data ◽

Cutting Force Prediction ◽

Artificial Neural Network Ann ◽

Aisi 4340

Machining process data can be utilized to predict cutting force and optimize process parameters. Cutting force is an essential parameter that has a significant impact on the metal turning process. In this study, a cutting force prediction model for turning AISI 4340 alloy steel was developed using Gaussian process regression (GPR), support vector machines (SVM), and artificial neural network (ANN) methods. The GPR simulations demonstrated a reliable prediction of surface roughness for the dry turning method with R2 = 0.9843, MAPE = 5.12%, and RMSE = 1.86%. Performance comparisons between GPR, SVM, and ANN show that GPR is an effective method that can ensure high predictive accuracy of the cutting force in the turning of AISI 4340.
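The three scores quoted in the abstract above (R2, MAPE, RMSE) follow standard definitions, sketched below with the Python standard library. The function name `regression_metrics` and the sample values are assumptions for illustration. The abstract reports RMSE as a percentage without saying how it was normalized, so this sketch returns RMSE in the units of the target instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAPE (in percent) and RMSE (in target units)."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and equal length")

    n = len(y_true)
    mean_true = sum(y_true) / n

    # Residual and total sums of squares.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")

    r2 = 1 - ss_res / ss_tot
    rmse = math.sqrt(ss_res / n)
    # MAPE requires non-zero targets.
    mape = 100 / n * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))

    return {"r2": r2, "mape": mape, "rmse": rmse}


# Hypothetical measured vs. predicted values (not data from the paper).
scores = regression_metrics([100, 200, 300, 400], [110, 190, 310, 390])
print(scores)
```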

Machine Learning-Based Prediction of Air Quality

Applied Sciences ◽

10.3390/app10249151 ◽

2020 ◽

Vol 10 (24) ◽

pp. 9151

Author(s):

Yun-Chia Liang ◽

Yona Maimury ◽

Angela Hsiang-Ling Chen ◽

Josue Rodolfo Cuevas Juarez

Keyword(s):

Machine Learning ◽

Air Quality ◽

Random Forest ◽

Prediction Models ◽

Superior Performance ◽

Support Vector ◽

Economic Activities ◽

Adaptive Boosting ◽

Series Of Experiments ◽

Artificial Neural Network Ann

Air, an essential natural resource, has been compromised in terms of quality by economic activities. Considerable research has been devoted to predicting instances of poor air quality, but most studies are limited by insufficient longitudinal data, making it difficult to account for seasonal and other factors. Several prediction models have been developed using an 11-year dataset collected by Taiwan’s Environmental Protection Administration (EPA). Machine learning methods, including adaptive boosting (AdaBoost), artificial neural network (ANN), random forest, stacking ensemble, and support vector machine (SVM), produce promising results for air quality index (AQI) level predictions. A series of experiments, using datasets for three different regions to obtain the best prediction performance from the stacking ensemble, AdaBoost, and random forest, found that the stacking ensemble delivers consistently superior performance for R2 and RMSE, while AdaBoost provides the best results for MAE.

Prediction of Healing Performance of Autogenous Healing Concrete Using Machine Learning

Materials ◽

10.3390/ma14154068 ◽

2021 ◽

Vol 14 (15) ◽

pp. 4068

Author(s):

Xu Huang ◽

Mirna Wasouf ◽

Jessada Sresakoolchai ◽

Sakdirat Kaewunruen

Keyword(s):

Machine Learning ◽

Search Algorithm ◽

Weather Conditions ◽

Prediction Performance ◽

Machine Learning Algorithms ◽

Coefficient Of Determination ◽

Gradient Boosting ◽

Support Vector ◽

Self Healing ◽

Artificial Neural Network Ann

Cracks typically develop in concrete due to shrinkage, loading actions, and weather conditions, and may occur at any time in its life span. Autogenous healing concrete is a type of self-healing concrete that can automatically heal cracks based on physical or chemical reactions in the concrete matrix. It is imperative to investigate the healing performance that autogenous healing concrete possesses, to assess the extent of the cracking and to predict the extent of healing. In the research of self-healing concrete, testing the healing performance of concrete in a laboratory is costly, and a large number of instances may be needed to explore reliable concrete design. This study is thus the world’s first to establish six types of machine learning algorithms that are capable of predicting the healing performance (HP) of self-healing concrete. These algorithms are an artificial neural network (ANN), k-nearest neighbours (kNN), gradient boosting regression (GBR), decision tree regression (DTR), support vector regression (SVR) and random forest (RF). Parameters of these algorithms are tuned using a grid search algorithm (GSA) and a genetic algorithm (GA). The prediction performance of these algorithms, indicated by the coefficient of determination (R2) and root mean square error (RMSE), is evaluated on the basis of 1417 data sets from the open literature. The results show that GSA-GBR achieves higher prediction performance (R2 = 0.958) and stronger robustness (RMSE = 0.202) than the other five types of algorithms employed to predict the healing performance of autogenous healing concrete. Therefore, reliable prediction accuracy of the healing performance and efficient assistance in the design of autogenous healing concrete can be achieved.
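The grid search tuning mentioned in the abstract above amounts to scoring every combination of candidate hyperparameter values and keeping the best one. The sketch below shows that idea with the Python standard library only. The function name `grid_search`, the parameter names, and the toy scoring function are all hypothetical; in the study's setting the scoring function would be a validation error such as RMSE for a trained model, which is not reproduced here.

```python
import itertools
import math


def grid_search(param_grid, score_fn):
    """Exhaustively evaluate every parameter combination.

    param_grid: dict mapping parameter name -> list of candidate values.
    score_fn: callable taking the parameters as keyword arguments and
              returning a score where lower is better (e.g. an RMSE).
    Returns (best_params, best_score).
    """
    best_params, best_score = None, math.inf
    names = sorted(param_grid)
    for values in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_validation_error(learning_rate, max_depth):
    """Stand-in for a real validation RMSE (purely illustrative)."""
    return (learning_rate - 0.1) ** 2 + (max_depth - 3) ** 2


# Hypothetical search space for a boosting-style regressor.
grid = {"learning_rate": [0.01, 0.1, 0.5], "max_depth": [2, 3, 4]}
best, score = grid_search(grid, toy_validation_error)
print(best, score)
```

The cost grows with the product of the list lengths, which is one reason a study might also try a genetic algorithm on larger search spaces.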

Exploring Impact of Age and Gender on Sentiment Analysis Using Machine Learning

Electronics ◽

10.3390/electronics9020374 ◽

2020 ◽

Vol 9 (2) ◽

pp. 374 ◽

Cited By ~ 2

Author(s):

Sudhanshu Kumar ◽

Monika Gahalawat ◽

Partha Pratim Roy ◽

Debi Prosad Dogra ◽

Byung-Gyu Kim

Keyword(s):

Machine Learning ◽

Sentiment Analysis ◽

Short Term Memory ◽

Age Groups ◽

Modern World ◽

Support Vector ◽

Digital Information ◽

Age And Gender ◽

And Gender ◽

The Impact

Sentiment analysis is a rapidly growing field of research due to the explosive growth in digital information. In the modern world of artificial intelligence, sentiment analysis is one of the essential tools to extract emotion information from massive data. Sentiment analysis is applied to a variety of user data from customer reviews to social network posts. To the best of our knowledge, there is little work on sentiment analysis based on the categorization of users by demographics. Demographics play an important role in deciding the marketing strategies for different products. In this study, we explore the impact of age and gender in sentiment analysis, as this can help e-commerce retailers to market their products based on specific demographics. The dataset is created by collecting reviews on books from Facebook users by asking them to answer a questionnaire containing questions about their preferences in books, along with their age groups and gender information. Next, the paper analyzes the segmented data for sentiments based on each age group and gender. Finally, sentiment analysis is done using different Machine Learning (ML) approaches including maximum entropy, support vector machine, convolutional neural network, and long short-term memory to study the impact of age and gender on user reviews. Experiments have been conducted to identify new insights into the effect of age and gender for sentiment analysis.

Explainable machine learning can outperform Cox regression predictions and provide insights in breast cancer survival

Scientific Reports ◽

10.1038/s41598-021-86327-7 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Arturo Moncada-Torres ◽

Marissa C. van Maaren ◽

Mathijs P. Hendriks ◽

Sabine Siesling ◽

Gijs Geleijnse

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Explicit Knowledge ◽

Cox Regression ◽

Metastatic Breast ◽

Gradient Boosting ◽

Support Vector ◽

Netherlands Cancer Registry ◽

Extreme Gradient Boosting ◽

The Impact

Cox Proportional Hazards (CPH) analysis is the standard for survival analysis in oncology. Recently, several machine learning (ML) techniques have been adapted for this task. Although they have been shown to yield results at least as good as classical methods, they are often disregarded because of their lack of transparency and little to no explainability, which are key for their adoption in clinical settings. In this paper, we used data from the Netherlands Cancer Registry of 36,658 non-metastatic breast cancer patients to compare the performance of CPH with ML techniques (Random Survival Forests, Survival Support Vector Machines, and Extreme Gradient Boosting [XGB]) in predicting survival using the c-index. We demonstrated that in our dataset, ML-based models can perform at least as well as the classical CPH regression (c-index ~0.63), and in the case of XGB even better (c-index ~0.73). Furthermore, we used Shapley Additive Explanation (SHAP) values to explain the models’ predictions. We concluded that the difference in performance can be attributed to XGB’s ability to model nonlinearities and complex interactions. We also investigated the impact of specific features on the models’ predictions as well as their corresponding insights. Lastly, we showed that explainable ML can generate explicit knowledge of how models make their predictions, which is crucial in increasing the trust and adoption of innovative ML techniques in oncology and healthcare overall.
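The c-index used in the abstract above measures how often a model ranks two patients in the correct order of survival. The sketch below implements one common variant, Harrell's concordance index, with the Python standard library. The abstract does not say which variant or library the authors used, so the choice of Harrell's definition, the function name `concordance_index`, the handling of ties, and the toy data are assumptions.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's c-index for right-censored survival data.

    times: observed follow-up time per patient.
    events: 1 if the event (e.g. death) was observed, 0 if censored.
    risk_scores: model output where a higher score means higher risk,
                 i.e. a shorter expected survival.

    A pair (i, j) is comparable when patient i had an observed event and
    a strictly shorter time than patient j. The pair is concordant when
    the model gives patient i the higher risk; tied scores count as 0.5.
    Pairs with tied times are skipped in this simplified sketch.
    """
    if not (len(times) == len(events) == len(risk_scores)):
        raise ValueError("all inputs must have the same length")

    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy cohort of four patients; the third one is censored.
c = concordance_index([2, 4, 6, 8], [1, 1, 0, 1], [0.9, 0.7, 0.8, 0.1])
print(c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the reported values of roughly 0.63 and 0.73.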

Remote sensing inversion of water quality in coastal sea area based on machine learning: a case study of Shenzhen bay, China

10.5194/egusphere-egu21-1972 ◽

2021 ◽

Author(s):

Xiaotong Zhu ◽

Jinhui Jeanne Huang

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Water Quality ◽

Predictive Accuracy ◽

Water Environment ◽

Quality Parameters ◽

Machine Learning Algorithms ◽

Dynamic Monitoring ◽

Support Vector ◽

Seawater Quality

Remote sensing monitoring offers a wide monitoring range, speed, and low cost for long-term dynamic monitoring of the water environment. With the rise of artificial intelligence, machine learning has enabled remote sensing inversion of seawater quality to achieve higher prediction accuracy. However, due to the physicochemical properties of the water quality parameters, the performance of algorithms differs considerably. In order to improve the predictive accuracy of seawater quality parameters, we proposed a technical framework to identify the optimal machine learning algorithms using Sentinel-2 satellite and in-situ seawater sample data. In the study, we select three algorithms, i.e. support vector regression (SVR), XGBoost and deep learning (DL), and four seawater quality parameters, i.e. dissolved oxygen (DO), total dissolved solids (TDS), turbidity (TUR) and chlorophyll-a (Chla). The results show that SVR is the more precise algorithm for DO inversion (R2 = 0.81). XGBoost has the best accuracy for Chla and TUR inversion (R2 = 0.75 and 0.78 respectively), while DL performs better for TDS (R2 = 0.789). Overall, this research provides theoretical support for high-precision remote sensing inversion of offshore seawater quality parameters based on machine learning.

Application of machine learning in predicting construction project profit in Ghana using Support Vector Regression Algorithm (SVRA)

Engineering Construction & Architectural Management ◽

10.1108/ecam-08-2020-0618 ◽

2021 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Emmanuel Adinyira ◽

Emmanuel Akoi-Gyebi Adjei ◽

Kofi Agyekum ◽

Frank Desmond Kofi Fugar

Keyword(s):

Machine Learning ◽

Support Vector Regression ◽

Cash Flow ◽

Predictive Accuracy ◽

Model Development ◽

Construction Project ◽

Support Vector ◽

Sensitivity Index ◽

Content Type ◽

Hyperparameter Selection

Purpose: Knowledge of the effect of various cash-flow factors on expected project profit is important to effectively manage productivity on construction projects. This study was conducted to develop and test the sensitivity of a Machine Learning Support Vector Regression Algorithm (SVRA) to predict construction project profit in Ghana. Design/methodology/approach: The study relied on data from 150 institutional projects executed within the past five years (2014–2018) in developing the model. Eighty percent (80%) of the data from the 150 projects was used at hyperparameter selection and final training phases of the model development and the remaining 20% for model testing. Using MATLAB for Support Vector Regression, the parameters available for tuning were the epsilon values, the kernel scale, the box constraint and standardisations. The sensitivity index was computed to determine the degree to which the independent variables impact the dependent variable. Findings: The developed model's predictions perfectly fitted the data and explained all the variability of the response data around its mean. Average predictive accuracy of 73.66% was achieved with all the variables on the different projects in validation. The developed SVR model was sensitive to labour and loan. Originality/value: The developed SVRA combines variation, defective works and labour with other financial constraints, which have been the variables used in previous studies. It will aid contractors in predicting profit on completion at commencement and also provide information on the effect of changes to cash-flow factors on profit.

Development of Machine Learning Models to Evaluate the Toughness of OPH Alloys

Materials ◽

10.3390/ma14216713 ◽

2021 ◽

Vol 14 (21) ◽

pp. 6713

Author(s):

Omid Khalaj ◽

Moslem Ghobadi ◽

Ehsan Saebnoori ◽

Alireza Zarezadeh ◽

Mohammadreza Shishesaz ◽

...

Keyword(s):

Machine Learning ◽

Mechanical Properties ◽

Mechanical Alloying ◽

Fuzzy Inference ◽

Oxide Dispersion Strengthened ◽

Machine Learning Techniques ◽

Support Vector ◽

Anfis Model ◽

Inference Systems ◽

The Impact

Oxide Precipitation-Hardened (OPH) alloys are a new generation of Oxide Dispersion-Strengthened (ODS) alloys recently developed by the authors. The mechanical properties of this group of alloys are significantly influenced by the chemical composition and appropriate heat treatment (HT). The main steps in producing OPH alloys consist of mechanical alloying (MA) and consolidation, followed by hot rolling. Toughness was obtained from standard tensile test results for different variants of OPH alloy to understand their mechanical properties. Three machine learning techniques were developed using experimental data to simulate different outcomes. The effect of each parameter on the toughness of OPH alloys is discussed. Using the authors' experimental results, the composition of OPH alloys (Al, Mo, Fe, Cr, Ta, Y, and O), HT conditions, and mechanical alloying (MA) were used as inputs to train the models, and toughness was set as the output. The results demonstrated that all three models are suitable for predicting the toughness of OPH alloys, and the models fulfilled all the desired requirements. However, several criteria validated the fact that the adaptive neuro-fuzzy inference systems (ANFIS) model results in better conditions and has a better ability to simulate. The mean square error (MSE) for artificial neural networks (ANN), ANFIS, and support vector regression (SVR) models was 459.22, 0.0418, and 651.68 respectively. After performing the sensitivity analysis (SA), an optimized ANFIS model was achieved with an MSE value of 0.003 and demonstrated that HT temperature is the most significant of these parameters, and this acts as a critical rule in training the data sets.

Fault detection for air conditioning system using machine learning

IAES International Journal of Artificial Intelligence (IJ-AI) ◽

10.11591/ijai.v9.i1.pp109-116 ◽

2020 ◽

Vol 9 (1) ◽

pp. 109

Author(s):

Noor Asyikin Sulaiman ◽

Md Pauzi Abdullah ◽

Hayati Abdullah ◽

Muhammad Noorazlan Shah Zainudin ◽

Azdiana Md Yusop

Keyword(s):

Machine Learning ◽

Supervised Learning ◽

Air Conditioning ◽

Machine Learning Algorithms ◽

Coefficient Of Performance ◽

Support Vector ◽

Air Conditioning System ◽

Learning Classifier ◽

Negative Impacts ◽

The Impact

An air conditioning system is a complex system and consumes the most energy in a building. Any fault in the system operation, such as a faulty cooling tower fan, compressor failure, or a stuck damper, could lead to energy wastage and a reduction in the system’s coefficient of performance (COP). Due to the complexity of the air conditioning system, detecting those faults is hard as it requires exhaustive inspections. This paper consists of two parts: i) to investigate the impact of different faults related to the air conditioning system on COP and ii) to analyse the performance of machine learning algorithms in classifying those faults. Three supervised learning classifier models were developed: deep learning, support vector machine (SVM) and multi-layer perceptron (MLP). The performance of each classifier was investigated in terms of six different classes of faults. Results showed that different faults have different negative impacts on the COP. Also, the three supervised learning classifier models were able to classify all faults with more than 94% accuracy, and MLP produced the highest accuracy and precision among all.
