Applying Machine Learning for Sensor Data Analysis in Interactive Systems

Thomas Plötz

doi:10.1145/3459666

Applying Machine Learning for Sensor Data Analysis in Interactive Systems

ACM Computing Surveys ◽

10.1145/3459666 ◽

2021 ◽

Vol 54 (6) ◽

pp. 1-25

Author(s):

Thomas Plötz

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Interactive Systems ◽

Sensor Data ◽

Paradigm Shifts ◽

The Core ◽

Massive Growth ◽

Interactive Computing ◽

Practical Guidelines ◽

Research Domains

With the widespread proliferation of (miniaturized) sensing facilities and the massive growth and popularity of the field of machine learning (ML) research, new frontiers in automated sensor data analysis have been explored that lead to paradigm shifts in many application domains. In fact, many practitioners now employ and rely more and more on ML methods as an integral part of their sensor data analysis workflows, without necessarily being ML experts or having an interest in becoming one. The availability of toolkits that can readily be used by practitioners has led to immense popularity and widespread adoption and, in essence, pragmatic use of ML methods. ML having become mainstream helps push the core agenda of practitioners, yet it comes with the danger of misusing methods and thus the risk of misleading, if not flawed, results. Based on years of observations in the ubiquitous and interactive computing domain that extensively relies on sensors and automated sensor data analysis, and on having taught and worked with numerous students in the field, in this article I advocate a considerate use of ML methods by practitioners, i.e., non-ML experts, and elaborate on pitfalls of an overly pragmatic use of ML techniques. The article not only identifies and illustrates the most common issues, it also offers ways and practical guidelines to avoid these, which should help practitioners benefit from employing ML in their core research domains and applications.

IoT Sensor Data Analysis and Fusion Applying Machine Learning and Meta-Heuristic Approaches

Enabling AI Applications in Data Science - Studies in Computational Intelligence ◽

10.1007/978-3-030-52067-0_20 ◽

2020 ◽

pp. 441-469

Author(s):

Anindita Saha ◽

Chandreyee Chowdhury ◽

Mayurakshi Jana ◽

Suparna Biswas

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Sensor Data ◽

Heuristic Approaches

Object Detection in Fog Computing Using Machine Learning Algorithms

Advances in Computer and Electrical Engineering - Architecture and Security Issues in Fog Computing Applications ◽

10.4018/978-1-7998-0194-8.ch006 ◽

2020 ◽

pp. 90-107

Author(s):

Peyakunta Bhargavi ◽

Singaraju Jyothi

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Object Detection ◽

Fog Computing ◽

Machine Learning Algorithms ◽

Sensor Data ◽

Core Network ◽

The Core ◽

The Moment ◽

Algorithmic Approaches

The moment we live in today demands the convergence of cloud computing, fog computing, machine learning, and IoT to explore new technological solutions. Fog computing is an emerging architecture intended for alleviating the network burdens at the cloud and the core network by moving resource-intensive functionalities such as computation, communication, storage, and analytics closer to the end users. Machine learning is a subfield of computer science and is a type of artificial intelligence (AI) that provides machines with the ability to learn without explicit programming. IoT has the ability to make decisions and take actions autonomously based on algorithmic sensing to acquire sensor data. These embedded capabilities will range across the entire spectrum of algorithmic approaches associated with machine learning. Here the authors explore how machine learning methods have been used for object detection and for text detection in images, and how they have been incorporated to better fulfill requirements in fog computing.

Machine Learning based Improved Gaussian Mixture Model for IoT Real-Time Data Analysis

Ingeniería solidaria ◽

10.16925/2357-6014.2020.01.02 ◽

2020 ◽

Vol 16 (1) ◽

Author(s):

Sivadi Sivadi ◽

Moorthy Moorthy ◽

Vijender Solanki

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Real Time ◽

Gaussian Mixture Model ◽

Gaussian Mixture ◽

Sensor Data ◽

Cloud Platform ◽

Time Data ◽

Huge Amount ◽

Real Time Data

Introduction: The article is the product of the research “Due to the increase in popularity of Internet of Things (IoT), a huge amount of sensor data is being generated from various smart city applications”, developed at Pondicherry University in the year 2019. Problem: To acquire and analyze the huge amount of sensor-generated data effectively is a significant problem when processing the data. Objective: To propose a novel framework for IoT sensor data analysis using a machine learning based improved Gaussian Mixture Model (GMM) on acquired real-time data. Methodology: In this paper, clustering-based GMM models are used to find the density patterns on a daily or weekly basis for user requirements. The ThingSpeak cloud platform is used for performing analysis and visualizations. Results: An analysis has been performed on the proposed mechanism implemented on real-time traffic data with Accuracy, Precision, Recall, and F-Score as measures. Conclusions: The results indicate that the proposed mechanism is efficient when compared with the state-of-the-art schemes. Originality: Applying GMM and the ThingSpeak cloud platform to perform analysis on IoT real-time data is the first approach to find traffic density patterns on busy roads. Restrictions: There is a need to develop the application for mobile users to find the optimal traffic routes based on density patterns. The authors could not concentrate on the security aspect for finding density patterns.
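To make the clustering step in the abstract above more concrete, the following is a minimal sketch of a standard one-dimensional Gaussian mixture fitted with plain expectation-maximization. It is not the authors' "improved" GMM and does not use ThingSpeak; the function name `fit_gmm_1d`, the synthetic "vehicles per interval" readings, and the choice of two components are all assumptions made purely for illustration.

```python
import math
import random


def fit_gmm_1d(data, k, iterations=50):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative only)."""
    ordered = sorted(data)
    n = len(data)
    # Initialise means at evenly spaced quantiles, with a shared variance
    # and uniform mixing weights.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    mean_all = sum(data) / n
    var_all = sum((x - mean_all) ** 2 for x in data) / n
    variances = [var_all] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances


# Hypothetical readings: vehicles per interval in light and heavy traffic.
random.seed(0)
readings = [random.gauss(20, 3) for _ in range(200)]
readings += [random.gauss(80, 5) for _ in range(200)]

weights, means, variances = fit_gmm_1d(readings, k=2)
low_mean, high_mean = sorted(means)
```

On this synthetic data the two fitted means land near the light-traffic and heavy-traffic levels, which is the kind of density pattern the abstract refers to; real traffic data would need model selection for the number of components.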

Industry 4.0: Sensor Data Analysis Using Machine Learning

Communications in Computer and Information Science - Data Management Technologies and Applications ◽

10.1007/978-3-030-54595-6_3 ◽

2020 ◽

pp. 37-58

Author(s):

Nadeem Iftikhar ◽

Finn Ebertsen Nordbjerg ◽

Thorkil Baattrup-Andersen ◽

Karsten Jeppesen

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Industry 4.0 ◽

Sensor Data

A Survey on Distributed Fibre Optic Sensor Data Modelling Techniques and Machine Learning Algorithms for Multiphase Fluid Flow Estimation

Sensors ◽

10.3390/s21082801 ◽

2021 ◽

Vol 21 (8) ◽

pp. 2801

Author(s):

Hasan Asy’ari Arief ◽

Tomasz Wiktorski ◽

Peter James Thomas

Keyword(s):

Machine Learning ◽

Fluid Flow ◽

Data Analysis ◽

Real Time ◽

Machine Learning Algorithms ◽

Sensor Data ◽

Support Vector ◽

Measurement Technology ◽

Fibre Optic ◽

Multiphase Fluid

Real-time monitoring of multiphase fluid flows with distributed fibre optic sensing has the potential to play a major role in industrial flow measurement applications. One such application is the optimization of hydrocarbon production to maximize short-term income, and prolong the operational lifetime of production wells and the reservoir. While the measurement technology itself is well understood and developed, a key remaining challenge is the establishment of robust data analysis tools that are capable of providing real-time conversion of enormous data quantities into actionable process indicators. This paper provides a comprehensive technical review of the data analysis techniques for distributed fibre optic technologies, with a particular focus on characterizing fluid flow in pipes. The review encompasses classical methods, such as the speed of sound estimation and Joule-Thomson coefficient, as well as their data-driven machine learning counterparts, such as Convolutional Neural Network (CNN), Support Vector Machine (SVM), and Ensemble Kalman Filter (EnKF) algorithms. The study aims to help end-users establish reliable, robust, and accurate solutions that can be deployed in a timely and effective way, and pave the way for future developments in the field.

A Novel Algorithm to Reduce Machine Learning Efforts in Real-Time Sensor Data Analysis

Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering - Wireless Mobile Communication and Healthcare ◽

10.1007/978-3-319-98551-0_10 ◽

2018 ◽

pp. 83-90

Author(s):

Majid Janidarmian ◽

Atena Roshan Fekr ◽

Katarzyna Radecka ◽

Zeljko Zilic

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Real Time ◽

Sensor Data ◽

Novel Algorithm

Bipolar Disorder and Oxidative Stress Injury Mechanism - Clinical Big Data Analysis Based on Machine Learning

Case Medical Research ◽

10.31525/ct1-nct03949218 ◽

2019 ◽

Author(s):

Keyword(s):

Oxidative Stress ◽

Machine Learning ◽

Bipolar Disorder ◽

Big Data ◽

Data Analysis ◽

Big Data Analysis ◽

Injury Mechanism ◽

Stress Injury ◽

Oxidative Stress Injury ◽

And Oxidative Stress

Binary Spectrum Feature for Improved Classifier Performance

10.36227/techrxiv.12993122 ◽

2020 ◽

Author(s):

Nalika Ulapane ◽

Karthick Thiyagarajan ◽

Sarath Kodagoda

Keyword(s):

Machine Learning ◽

Classification Performance ◽

Feature Reduction ◽

Sensor Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Svm Classifier ◽

Monitoring Task ◽

Classifier Performance ◽

Spectrum Feature

Classification has become a vital task in modern machine learning and Artificial Intelligence applications, including smart sensing. Numerous machine learning techniques are available to perform classification. Similarly, numerous practices, such as feature selection (i.e., selection of a subset of descriptor variables that optimally describe the output), are available to improve classifier performance. In this paper, we consider the case of a given supervised learning classification task that has to be performed making use of continuous-valued features. It is assumed that an optimal subset of features has already been selected. Therefore, no further feature reduction, or feature addition, is to be carried out. Then, we attempt to improve the classification performance by passing the given feature set through a transformation that produces a new feature set which we have named the “Binary Spectrum”. Via a case study example done on some Pulsed Eddy Current sensor data captured from an infrastructure monitoring task, we demonstrate how the classification accuracy of a Support Vector Machine (SVM) classifier increases through the use of this Binary Spectrum feature, indicating the feature transformation’s potential for broader usage.

Machine Learning Based Predictive Action on Categorical Non-Sequential Data

Recent Advances in Computer Science and Communications ◽

10.2174/2213275912666190417150421 ◽

2020 ◽

Vol 13 (5) ◽

pp. 1020-1030

Author(s):

Pradeep S. ◽

Jagadish S. Kallimani

Keyword(s):

Machine Learning ◽

Data Analysis ◽

Categorical Data ◽

Numerical Data ◽

Processing Technique ◽

Machine Learning Algorithms ◽

Sequential Data ◽

Industry Standard ◽

Robust Model ◽

Future Work

Background: With the advent of data analysis and machine learning, there is a growing impetus for analyzing and generating models on historic data. The data comes in numerous forms and shapes with an abundance of challenges. The most sought-after form of data for analysis is numerical data. With the plethora of algorithms and tools, it is quite manageable to deal with such data. Another form of data is of categorical nature, which is subdivided into ordinal (order wise) and nominal (name wise). This data can be broadly classified as Sequential and Non-Sequential. Sequential data analysis is easier to preprocess using algorithms. Objective: The challenge of applying machine learning algorithms to categorical data of non-sequential nature is dealt with in this paper. Methods: Upon implementing several data analysis algorithms on such data, we end up getting a biased result, which makes it impossible to generate a reliable predictive model. In this paper, we address this problem by walking through a handful of techniques which during our research helped us in dealing with a large categorical dataset of non-sequential nature. In subsequent sections, we discuss the possible implementable solutions and shortfalls of these techniques. Results: The methods are applied to sample datasets available in the public domain and the results with respect to accuracy of classification are satisfactory. Conclusion: The best pre-processing technique we observed in our research is one-hot encoding, which facilitates breaking down the categorical features into binary ones and feeding them into an algorithm to predict the outcome. The example that we took is not abstract but is a real-time production services dataset, which had many complex variations of categorical features. Our future work includes creating a robust model on such data and deploying it into industry standard applications.
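As a small illustration of the one-hot encoding mentioned in the conclusion above, the sketch below expands each categorical column into one binary indicator per observed category. The helper `one_hot_encode`, the column names, and the example rows are hypothetical and are not taken from the paper's dataset.

```python
def one_hot_encode(rows, columns):
    """One-hot encode the given categorical columns of a list of dict rows."""
    # Collect the sorted set of categories observed in each column.
    categories = {c: sorted({row[c] for row in rows}) for c in columns}
    encoded = []
    for row in rows:
        out = {}
        for c in columns:
            for cat in categories[c]:
                # Exactly one indicator per column is 1 for each row.
                out[f"{c}={cat}"] = 1 if row[c] == cat else 0
        encoded.append(out)
    return encoded, categories


# Hypothetical nominal features from a services dataset.
rows = [
    {"service": "db", "region": "eu"},
    {"service": "web", "region": "us"},
    {"service": "db", "region": "us"},
]
encoded, categories = one_hot_encode(rows, ["service", "region"])
```

Because no order is imposed on the categories, this representation suits nominal features; for high-cardinality columns the number of indicator features grows with the number of distinct values.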

Automatic Identification of Upper Extremity Rehabilitation Exercise Type and Dose Using Body-Worn Sensors and Machine Learning: A Pilot Study

Digital Biomarkers ◽

10.1159/000516619 ◽

2021 ◽

pp. 158-166

Author(s):

Noah Balestra ◽

Gaurav Sharma ◽

Linda M. Riek ◽

Ania Busza

Keyword(s):

Machine Learning ◽

Upper Extremity ◽

Sensor Data ◽

Inpatient Setting ◽

Accelerometer Data ◽

Data Set ◽

Machine Learning Classification ◽

Exercise Type ◽

Exercise Dose ◽

Rehabilitation Exercises

Background: Prior studies suggest that participation in rehabilitation exercises improves motor function poststroke; however, studies on optimal exercise dose and timing have been limited by the technical challenge of quantifying exercise activities over multiple days. Objectives: The objectives of this study were to assess the feasibility of using body-worn sensors to track rehabilitation exercises in the inpatient setting and investigate which recording parameters and data analysis strategies are sufficient for accurately identifying and counting exercise repetitions. Methods: MC10 BioStampRC® sensors were used to measure accelerometer and gyroscope data from upper extremities of healthy controls (n = 13) and individuals with upper extremity weakness due to recent stroke (n = 13) while the subjects performed 3 preselected arm exercises. Sensor data were then labeled by exercise type and this labeled data set was used to train a machine learning classification algorithm for identifying exercise type. The machine learning algorithm and a peak-finding algorithm were used to count exercise repetitions in non-labeled data sets. Results: We achieved a repetition counting accuracy of 95.6% overall, and 95.0% in patients with upper extremity weakness due to stroke when using both accelerometer and gyroscope data. Accuracy was decreased when using fewer sensors or using accelerometer data alone. Conclusions: Our exploratory study suggests that body-worn sensor systems are technically feasible, well tolerated in subjects with recent stroke, and may ultimately be useful for developing a system to measure total exercise “dose” in poststroke patients during clinical rehabilitation or clinical trials.
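The abstract above mentions a peak-finding algorithm for counting repetitions without describing it. A generic sketch of the idea, assuming one dominant peak per repetition in a single sensor channel, could look as follows; the function `count_repetitions`, its thresholds, and the synthetic sine signal are illustrative assumptions rather than the study's actual method or data.

```python
import math


def count_repetitions(signal, min_height, min_distance):
    """Count local maxima >= min_height that are at least min_distance samples apart."""
    peaks = []
    for i in range(1, len(signal) - 1):
        is_local_max = signal[i - 1] < signal[i] >= signal[i + 1]
        if is_local_max and signal[i] >= min_height:
            # Skip peaks that follow the previous accepted peak too closely.
            if not peaks or i - peaks[-1] >= min_distance:
                peaks.append(i)
    return len(peaks), peaks


# Synthetic stand-in for one accelerometer axis: 5 repetitions, 40 samples each.
samples_per_rep = 40
signal = [
    math.sin(2 * math.pi * i / samples_per_rep)
    for i in range(5 * samples_per_rep)
]
count, peak_indices = count_repetitions(signal, min_height=0.5, min_distance=20)
```

Real body-worn sensor data is noisy, so in practice the signal would typically be smoothed first and the height and distance thresholds tuned per exercise.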