Design of Machine Learning Prediction System Based on the Internet of Things Framework for Monitoring Fine PM Concentrations

Shun-Yuan Wang; Wen-Bin Lin; Yu-Chieh Shu

doi:10.3390/environments8100099

Design of Machine Learning Prediction System Based on the Internet of Things Framework for Monitoring Fine PM Concentrations

Environments ◽

10.3390/environments8100099 ◽

2021 ◽

Vol 8 (10) ◽

pp. 99

Author(s):

Shun-Yuan Wang ◽

Wen-Bin Lin ◽

Yu-Chieh Shu

Keyword(s):

Machine Learning ◽

Air Pollution ◽

Particulate Matter ◽

Random Forest ◽

Internet Of Things ◽

Random Forest Model ◽

The Internet ◽

Learning Models ◽

Forest Model ◽

The Internet Of Things

In this study, a mobile air pollution sensing unit based on the Internet of Things framework was designed for monitoring the concentration of fine particulate matter in three urban areas. This unit was developed using the NodeMCU-32S microcontroller, PMS5003-G5 (particulate matter sensing module), and Ublox NEO-6M V2 (GPS positioning module). The sensing unit transmits data of the particulate matter concentration and coordinates of a polluted location to the backend server through 3G and 4G telecommunication networks for data collection. This system will complement the government’s PM2.5 data acquisition system. Mobile monitoring stations meet the air pollution monitoring needs of some areas that require special observation. For example, an AIoT development system will be installed. At intersections with intensive traffic, it can be used as a reference for government transportation departments or environmental inspection departments for environmental quality monitoring or evacuation of traffic flow. Furthermore, the particulate matter distributions in three areas, namely Xinzhuang, Sanchong, and Luzhou Districts, which are all in New Taipei City of Taiwan, were estimated using machine learning models, the data of stationary monitoring stations, and the measurements of the mobile sensing system proposed in this study. Four types of learning models were trained, namely the decision tree, random forest, multilayer perceptron, and radial basis function neural network, and their prediction results were evaluated. The root mean square error was used as the performance indicator, and the learning results indicate that the random forest model outperforms the other models for both the training and testing sets. To examine the generalizability of the learning models, the models were verified in relation to data measured on three days: 15 February, 28 February, and 1 March 2019. A comparison between the model predicted and the measured data indicates that the random forest model provides the most stable and accurate prediction values and could clearly present the distribution of highly polluted areas. The results of these models are visualized in the form of maps by using a web application. The maps allow users to understand the distribution of polluted areas intuitively.

Application of Traditional Machine Learning Models to Detect Abnormal Traffic in the Internet of Things Networks

10.1007/978-3-030-88081-1_55 ◽

2021 ◽

pp. 735-744

Author(s):

Evgeniya Istratova ◽

Mikhail Grif ◽

Dmitry Dostovalov

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

The Internet ◽

Learning Models ◽

The Internet Of Things ◽

Machine Learning Models

Machine Learning Models That Integrate Tumor Texture and Perfusion Characteristics Using Low-Dose Breast Computed Tomography Are Promising for Predicting Histological Biomarkers and Treatment Failure in Breast Cancer Patients

Cancers ◽

10.3390/cancers13236013 ◽

2021 ◽

Vol 13 (23) ◽

pp. 6013

Author(s):

Hyun-Soo Park ◽

Kwang-sig Lee ◽

Bo-Kyoung Seo ◽

Eun-Sil Kim ◽

Kyu-Ran Cho ◽

...

Keyword(s):

Breast Cancer ◽

Machine Learning ◽

Random Forest ◽

Cancer Patients ◽

Low Dose ◽

Random Forest Model ◽

Breast Cancer Patients ◽

Learning Models ◽

Forest Model ◽

Machine Learning Models

This prospective study enrolled 147 women with invasive breast cancer who underwent low-dose breast CT (80 kVp, 25 mAs, 1.01–1.38 mSv) before treatment. From each tumor, we extracted eight perfusion parameters using the maximum slope algorithm and 36 texture parameters using the filtered histogram technique. Relationships between CT parameters and histological factors were analyzed using five machine learning algorithms. Performance was compared using the area under the receiver-operating characteristic curve (AUC) with the DeLong test. The AUCs of the machine learning models increased when using both features instead of the perfusion or texture features alone. The random forest model that integrated texture and perfusion features was the best model for prediction (AUC = 0.76). In the integrated random forest model, the AUCs for predicting human epidermal growth factor receptor 2 positivity, estrogen receptor positivity, progesterone receptor positivity, ki67 positivity, high tumor grade, and molecular subtype were 0.86, 0.76, 0.69, 0.65, 0.75, and 0.79, respectively. Entropy of pre- and postcontrast images and perfusion, time to peak, and peak enhancement intensity of hot spots are the five most important CT parameters for prediction. In conclusion, machine learning using texture and perfusion characteristics of breast cancer with low-dose CT has potential value for predicting prognostic factors and risk stratification in breast cancer patients.

Detecting Myocardial Infarction by Electrocardiogram Machine Learning Models With Greater Accuracy; A Technical Advance Article

10.21203/rs.3.rs-150700/v1 ◽

2021 ◽

Author(s):

M.D.S. Sudaraka ◽

I. Abeyagunawardena ◽

E. S. De Silva ◽

S Abeyagunawardena

Keyword(s):

Machine Learning ◽

Myocardial Infarction ◽

Random Forest ◽

Decision Trees ◽

Sensitivity And Specificity ◽

Random Forest Model ◽

Multi Layer Perceptron ◽

Learning Models ◽

Forest Model ◽

Machine Learning Models

Abstract BackgroundElectrocardiogram (ECG) is a key diagnostic test in cardiac investigation. Interpretation of ECG is based on the understanding of normal electrical patterns produced by the heart and alterations of those patterns in specific disease conditions. With machine learning techniques, it is possible to interpret ECGs with increased accuracy. However, there is a lacuna in machine learning models to detect myocardial infarction (MI) coupled with the affected territories of the heart. MethodsThe dataset was obtained from the University of California, Irvine, Machine Learning Repository. It was filtered to obtain observations categorized as Normal, Ischemic changes, Old Anterior MI and Old Inferior MI. The dataset was randomly split into a training set (70%) and a test set (30%). 73 out of the 270 ECG features were selected based on the changes observed following MI, after excluding predictors that had near zero variance across the observations. Three machine learning classification models (Bootstrap Aggregation Decision Trees, Random Forest, Multi-layer Perceptron) were trained using the training dataset, optimizing for the Kappa statistic and the parameter tuning was achieved with repeated 10-fold cross validation. Accuracy and Kappa of the samples were used to evaluate performance between the models. ResultsThe Random Forest model identified old anterior and old inferior MIs with 100% sensitivity and specificity and all 4 categorized observations with an overall accuracy of 0.9167 (95% CI 0.8424 - 0.9633). Both the Bootstrap Aggregation Decision Trees and the Multi-layer Perceptron models identified old anterior MIs with 100% sensitivity and specificity and their overall accuracies for all 4 observations were 0.8958 (95% CI 0.8168 - 0.9489) and 0.8542 (95% CI 0.7674 - 0.9179) respectively.Conclusion With a medically informed feature selection we were able to identify old anterior MI with 100% sensitivity and specificity by all three models in this study, and old inferior MI with 100% sensitivity and specificity by Random Forest Model. If the data set can be improved it is possible to utilize these machine learning models in hospital setting to identify cardiac emergencies by incorporating them into cardiac monitors, until trained personnel become available.

On the Performance of Machine Learning Models for Anomaly-Based Intelligent Intrusion Detection Systems for the Internet of Things

IEEE Internet of Things Journal ◽

10.1109/jiot.2021.3103829 ◽

2021 ◽

pp. 1-1

Author(s):

Ghada Abdelmoumin ◽

Danda B. Rawat ◽

Abdul Rahman

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Intrusion Detection ◽

Intrusion Detection Systems ◽

The Internet ◽

Learning Models ◽

Detection Systems ◽

The Internet Of Things ◽

Machine Learning Models

A Comparative Analysis of Machine Learning Models for the Prediction of Insurance Uptake in Kenya

Data ◽

10.3390/data6110116 ◽

2021 ◽

Vol 6 (11) ◽

pp. 116

Author(s):

Nelson Kemboi Yego ◽

Juma Kasozi ◽

Joseph Nkurunziza

Keyword(s):

Machine Learning ◽

Random Forest ◽

Phase I ◽

Survey Data ◽

Low Income ◽

Random Forest Model ◽

Learning Models ◽

Forest Model ◽

Learning Classifiers ◽

Machine Learning Models

The role of insurance in financial inclusion and economic growth, in general, is immense and is increasingly being recognized. However, low uptake impedes the growth of the sector, hence the need for a model that robustly predicts insurance uptake among potential clients. This study undertook a two phase comparison of machine learning classifiers. Phase I had eight machine learning models compared for their performance in predicting the insurance uptake using 2016 Kenya FinAccessHousehold Survey data. Taking Phase I as a base in Phase II, random forest and XGBoost were compared with four deep learning classifiers using 2019 Kenya FinAccess Household Survey data. The random forest model trained on oversampled data showed the highest F1-score, accuracy, and precision. The area under the receiver operating characteristic curve was furthermore highest for random forest; hence, it could be construed as the most robust model for predicting the insurance uptake. Finally, the most important features in predicting insurance uptake as extracted from the random forest model were income, bank usage, and ability and willingness to support others. Hence, there is a need for a design and distribution of low income based products, and bancassurance could be said to be a plausible channel for the distribution of insurance products.

VariantSpark, A Random Forest Machine Learning Implementation for Ultra High Dimensional Data

10.1101/702902 ◽

2019 ◽

Cited By ~ 1

Author(s):

Arash Bayat ◽

Piotr Szul ◽

Aidan R. O’Brien ◽

Robert Dunne ◽

Oscar J. Luo ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Internet Of Things ◽

Random Forests ◽

Life Sciences ◽

High Dimensional ◽

The Internet ◽

Machine Learning Methods ◽

High Dimensional Datasets ◽

The Internet Of Things

AbstractThe demands on machine learning methods to cater for ultra high dimensional datasets, datasets with millions of features, have been increasing in domains like life sciences and the Internet of Things (IoT). While Random Forests are suitable for “wide” datasets, current implementations such as Google’s PLANET lack the ability to scale to such dimensions. Recent improvements by Yggdrasil begin to address these limitations but do not extend to Random Forest. This paper introduces CursedForest, a novel Random Forest implementation on top of Apache Spark and part of the VariantSpark platform, which parallelises processing of all nodes over the entire forest. CursedForest is 9 and up to 89 times faster than Google’s PLANET and Yggdrasil, respectively, and is the first method capable of scaling to millions of features.

Robustness Evaluations of Sustainable Machine Learning Models against Data Poisoning Attacks in the Internet of Things

Sustainability ◽

10.3390/su12166434 ◽

2020 ◽

Vol 12 (16) ◽

pp. 6434 ◽

Cited By ~ 1

Author(s):

Corey Dunn ◽

Nour Moustafa ◽

Benjamin Turnbull

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Large Scale ◽

Gradient Boosting ◽

The Internet ◽

Learning Models ◽

Detection Rates ◽

Ongoing Research ◽

The Internet Of Things ◽

Machine Learning Models

With the increasing popularity of the Internet of Things (IoT) platforms, the cyber security of these platforms is a highly active area of research. One key technology underpinning smart IoT systems is machine learning, which classifies and predicts events from large-scale data in IoT networks. Machine learning is susceptible to cyber attacks, particularly data poisoning attacks that inject false data when training machine learning models. Data poisoning attacks degrade the performances of machine learning models. It is an ongoing research challenge to develop trustworthy machine learning models resilient and sustainable against data poisoning attacks in IoT networks. We studied the effects of data poisoning attacks on machine learning models, including the gradient boosting machine, random forest, naive Bayes, and feed-forward deep learning, to determine the levels to which the models should be trusted and said to be reliable in real-world IoT settings. In the training phase, a label modification function is developed to manipulate legitimate input classes. The function is employed at data poisoning rates of 5%, 10%, 20%, and 30% that allow the comparison of the poisoned models and display their performance degradations. The machine learning models have been evaluated using the ToN_IoT and UNSW NB-15 datasets, as they include a wide variety of recent legitimate and attack vectors. The experimental results revealed that the models’ performances will be degraded, in terms of accuracy and detection rates, if the number of the trained normal observations is not significantly larger than the poisoned data. At the rate of data poisoning of 30% or greater on input data, machine learning performances are significantly degraded.

Edge Machine Learning for AI-Enabled IoT Devices: A Review

Sensors ◽

10.3390/s20092533 ◽

2020 ◽

Vol 20 (9) ◽

pp. 2533 ◽

Cited By ~ 6

Author(s):

Massimo Merenda ◽

Carlo Porcaro ◽

Demetrio Iero

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Machine Learning Algorithms ◽

The Internet ◽

Learning Models ◽

Iot Devices ◽

High Level ◽

And Behavior ◽

The Internet Of Things ◽

Machine Learning Models

In a few years, the world will be populated by billions of connected devices that will be placed in our homes, cities, vehicles, and industries. Devices with limited resources will interact with the surrounding environment and users. Many of these devices will be based on machine learning models to decode meaning and behavior behind sensors’ data, to implement accurate predictions and make decisions. The bottleneck will be the high level of connected things that could congest the network. Hence, the need to incorporate intelligence on end devices using machine learning algorithms. Deploying machine learning on such edge devices improves the network congestion by allowing computations to be performed close to the data sources. The aim of this work is to provide a review of the main techniques that guarantee the execution of machine learning models on hardware with low performances in the Internet of Things paradigm, paving the way to the Internet of Conscious Things. In this work, a detailed review on models, architecture, and requirements on solutions that implement edge machine learning on Internet of Things devices is presented, with the main goal to define the state of the art and envisioning development requirements. Furthermore, an example of edge machine learning implementation on a microcontroller will be provided, commonly regarded as the machine learning “Hello World”.

Detecting Abnormal Behavior of an IoT Device in the Network Based on a Traffic Model

Telecom IT ◽

10.31854/2307-1303-2019-7-3-50-55 ◽

2019 ◽

Vol 7 (3) ◽

pp. 50-55

Author(s):

D. Saharov ◽

D. Kozlov

Keyword(s):

Machine Learning ◽

Internet Of Things ◽

Mobile Networks ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Abnormal Behavior ◽

Traffic Model ◽

The Internet ◽

Wide Spread ◽

The Internet Of Things

The article deals with the СoAP Protocol that regulates the transmission and reception of information traf-fic by terminal devices in IoT networks. The article describes a model for detecting abnormal traffic in 5G/IoT networks using machine learning algorithms, as well as the main methods for solving this prob-lem. The relevance of the article is due to the wide spread of the Internet of things and the upcoming update of mobile networks to the 5g generation.

Understanding and personalising smart city services using machine learning, The Internet-of-Things and Big Data

2017 IEEE 26th International Symposium on Industrial Electronics (ISIE) ◽

10.1109/isie.2017.8001570 ◽

2017 ◽

Cited By ~ 17

Author(s):

Jeannette Chin ◽

Vic Callaghan ◽

Ivan Lam

Keyword(s):

Machine Learning ◽

Big Data ◽

Internet Of Things ◽

Smart City ◽

The Internet ◽

The Internet Of Things