Relation Inference among Sensor Time Series in Smart Buildings with Metric Learning

Smart building technologies hold promise for better livability for residents and lower energy footprints. Yet the rollout of these technologies, from demand response controls to fault detection and diagnosis, lags significantly and is impeded by the current practice of manually identifying sensing point relationships, e.g., how equipment is connected or which sensors are co-located in the same space. This manual process is costly and laborious, yet still error-prone. We study relation inference among sensor time series. Our key insight is that, because connected equipment and co-located sensors share the same physical environment, they are affected by the same real-world events, e.g., a fan turning on or a person entering the room, and thus exhibit correlated changes in their time series data. To this end, we develop a deep metric learning solution that first converts the raw sensor time series to the frequency domain and then optimizes a representation of sensors that encodes their relations. Built upon the learned representation, our solution pinpoints the relationships among sensors by solving a combinatorial optimization problem. Extensive experiments on real-world buildings demonstrate the effectiveness of our solution.
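The two stages named in the abstract, a frequency-domain transform followed by a metric learning objective, can be illustrated with a minimal sketch. Everything concrete here is an assumption rather than the authors' implementation: the short-time Fourier transform with a Hann window, the fixed `embed` function standing in for a trained deep encoder, the triplet loss as the metric learning objective, and the synthetic "room" signals.

```python
import numpy as np

def stft_magnitude(x, win=64, hop=32):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(win)
    starts = range(0, len(x) - win + 1, hop)
    frames = np.stack([x[s:s + win] * window for s in starts])
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, win // 2 + 1)

def embed(x):
    """Hypothetical stand-in for a learned encoder: time-averaged log spectrum, L2-normalized."""
    spec = np.log1p(stft_magnitude(x)).mean(axis=0)
    return spec / (np.linalg.norm(spec) + 1e-12)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge loss that pushes the related pair closer than the unrelated pair by `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# Synthetic example: two sensors driven by the same periodic event, one driven by another.
rng = np.random.default_rng(0)
t = np.arange(1024)
event = np.sin(2 * np.pi * t / 16)
same_room_a = event + 0.1 * rng.standard_normal(t.size)
same_room_b = 0.8 * event + 0.1 * rng.standard_normal(t.size)
other_room = np.sin(2 * np.pi * t / 5) + 0.1 * rng.standard_normal(t.size)

loss = triplet_loss(embed(same_room_a), embed(same_room_b), embed(other_room))
print("triplet loss:", loss)
```

In a trained system the encoder parameters would be updated to reduce this loss over many triplets; the sketch only evaluates the loss for one fixed embedding.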

Detection and diagnosis of dynamics in time series data: Theory of noise reduction

10.1063/1.45298 ◽

1994 ◽

Author(s):

Robert Cawley ◽

Guan-Hson Hsu ◽

Liming W. Salvino

Keyword(s):

Time Series ◽

Noise Reduction ◽

Time Series Data ◽

Series Data ◽

Detection And Diagnosis

A Metric Learning-Based Univariate Time Series Classification Method

Information ◽

10.3390/info11060288 ◽

2020 ◽

Vol 11 (6) ◽

pp. 288

Author(s):

Kuiyong Song ◽

Nianbin Wang ◽

Hongbin Wang

Keyword(s):

Time Series ◽

Time Series Data ◽

Multivariate Time Series ◽

Metric Learning ◽

Classification Method ◽

Series Data ◽

Classification Error ◽

Time Series Classification ◽

Classification Error Rate ◽

Univariate Time Series

High-dimensional time series classification is a challenging problem. Distance-based similarity measures are one family of methods for time series classification. This paper proposes a metric learning-based univariate time series classification method (ML-UTSC), which uses a Mahalanobis matrix obtained by metric learning to calculate the local distance between multivariate time series and combines Dynamic Time Warping (DTW) with nearest neighbor classification to produce the final classification. In this method, the features of the univariate time series are represented as multivariate time series data consisting of a mean value, variance, and slope. Next, a three-dimensional Mahalanobis matrix is learned from the data by metric learning. The time series is divided into segments of equal length so that the Mahalanobis matrix can describe the features of the time series data more accurately. Compared with the most effective measurement method, the experimental results show that the proposed algorithm achieves a lower classification error rate on most of the test datasets.
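The pipeline described above (per-segment mean, variance, and slope; a Mahalanobis local distance; DTW; nearest neighbor) can be sketched as follows. This is an illustration under stated assumptions, not the ML-UTSC implementation: the Mahalanobis matrix `M` is set to the identity instead of being learned, and the function names, segment length, and toy data are hypothetical.

```python
import numpy as np

def segment_features(x, seg_len):
    """Map a univariate series to one (mean, variance, slope) row per equal-length segment."""
    x = np.asarray(x, dtype=float)
    t = np.arange(seg_len)
    feats = []
    for k in range(len(x) // seg_len):
        seg = x[k * seg_len:(k + 1) * seg_len]
        slope = np.polyfit(t, seg, 1)[0]  # least-squares slope of the segment
        feats.append((seg.mean(), seg.var(), slope))
    return np.array(feats)

def mahalanobis_dtw(a, b, M):
    """DTW whose local cost is the squared Mahalanobis distance (u - v)^T M (u - v)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diff = a[i - 1] - b[j - 1]
            cost = diff @ M @ diff
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m]

def classify_1nn(query, train_series, train_labels, M, seg_len=8):
    """Label the query with the label of its nearest training series under Mahalanobis DTW."""
    q = segment_features(query, seg_len)
    dists = [mahalanobis_dtw(q, segment_features(s, seg_len), M) for s in train_series]
    return train_labels[int(np.argmin(dists))]

# Assumption: the identity matrix stands in for a matrix learned by metric learning.
M = np.eye(3)
t = np.linspace(0, 1, 64)
train = [np.sin(2 * np.pi * t), t.copy()]
labels = ["sine", "ramp"]
pred = classify_1nn(np.sin(2 * np.pi * t + 0.1), train, labels, M)
print(pred)
```

With the identity matrix the local cost reduces to squared Euclidean distance; a learned matrix would instead reweight and correlate the three features.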

Enhancing Interpretability of Data-Driven Fault Detection and Diagnosis Methodology with Maintainability Rules in Smart Building Management

Journal of Sensors ◽

10.1155/2022/5975816 ◽

2022 ◽

Vol 2022 ◽

pp. 1-48

Author(s):

Michael Yit Lin Chew ◽

Ke Yan

Keyword(s):

Fault Detection ◽

Real World ◽

Data Science ◽

Building Design ◽

Facility Management ◽

Fault Detection And Diagnosis ◽

Data Driven ◽

Main Concern ◽

Smart Building ◽

Detection And Diagnosis

Data-driven fault detection and diagnosis (FDD) methods, referring to the newer generation of artificial intelligence (AI) empowered classification methods, such as data science analysis, big data, Internet of things (IoT), industry 4.0, etc., are becoming increasingly important for facility management in smart building design and smart city construction. While data-driven FDD methods nowadays outperform the majority of traditional FDD approaches, such as physically based and mathematically based models, in terms of both efficiency and accuracy, the interpretability of those methods has not improved significantly. Instead, according to the literature survey, the interpretability of data-driven FDD methods has become the main concern and creates barriers to their adoption in real-world industrial applications. In this study, we review the existing data-driven FDD approaches for faults in building mechanical & electrical engineering (M&E) services and discuss the interpretability of modern data-driven FDD methods. Two data-driven FDD strategies that integrate expert reasoning about the faults are proposed. Lists of expert rules, maintainability knowledge, and international/local standards are compiled for various M&E services, including heating, ventilation, and air-conditioning (HVAC), plumbing, fire safety, electrical, and elevator systems, based on surveys of 110 buildings in Singapore. The survey results significantly enhance the interpretability of data-driven FDD methods for M&E services, potentially enhance FDD performance in terms of accuracy, and promote the adoption of data-driven FDD approaches in real-world facility management practice.

Sensors to Events: Semantic Modeling and Recognition of Events from Data Streams

International Journal of Semantic Computing ◽

10.1142/s1793351x16400171 ◽

2016 ◽

Vol 10 (04) ◽

pp. 461-501 ◽

Cited By ~ 4

Author(s):

Om Prasad Patri ◽

Anand V. Panangadan ◽

Vikrambhai S. Sorathia ◽

Viktor K. Prasanna

Keyword(s):

Time Series ◽

Real World ◽

Data Streams ◽

Time Series Data ◽

Semantic Representation ◽

Expressive Power ◽

Sensor Data ◽

Series Data ◽

Formal Approach ◽

Semantic Computing

Detecting and responding to real-world events is an integral part of any enterprise or organization, but Semantic Computing has been largely underutilized for complex event processing (CEP) applications. A primary reason for this gap is the difference in the level of abstraction between the high-level semantic models for events and the low-level raw data values received from sensor data streams. In this work, we investigate the need for Semantic Computing in various aspects of CEP and aim to bridge this gap by utilizing recent advances in time series analytics and machine learning. We build upon the Process-oriented Event Model, which provides a formal approach to modeling real-world objects and events and specifies the process of moving from sensors to events. We extend this model to facilitate Semantic Computing and time series data mining directly over the sensor data, which provides the advantage of automatically learning the required background knowledge without domain expertise. We illustrate the expressive power of our model in case studies from diverse applications, with particular emphasis on non-intrusive load monitoring in smart energy grids. We also demonstrate that this powerful semantic representation is still highly accurate and performs on par with existing approaches for event detection and classification.

Modelling the Working Week for Multi-Step Forecasting using Gaussian Process Regression

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/277 ◽

2017 ◽

Author(s):

Pasan Karunaratne ◽

Masud Moshtaghi ◽

Shanika Karunasekera ◽

Aaron Harwood ◽

Trevor Cohn

Keyword(s):

Time Series ◽

Gaussian Process ◽

Real World ◽

Time Series Data ◽

Gaussian Process Regression ◽

Series Data ◽

Combination Methods ◽

The University ◽

The City ◽

Source Of Information

In time-series forecasting, regression is a popular method, with Gaussian Process Regression widely held to be the state of the art. The versatility of Gaussian Processes has led to them being used in many varied application domains. However, though many real-world applications involve data which follows a working-week structure, where weekends exhibit substantially different behavior to weekdays, methods for explicit modelling of working-week effects in Gaussian Process Regression models have not been proposed. Not explicitly modelling the working week fails to incorporate a significant source of information which can be invaluable in forecasting scenarios. In this work we provide novel kernel-combination methods to explicitly model working-week effects in time-series data for more accurate predictions using Gaussian Process Regression. Further, we demonstrate that prediction accuracy can be improved by constraining the non-convex optimization process of finding optimal hyperparameter values. We validate the effectiveness of our methods by performing multi-step prediction on two real-world publicly available time-series datasets - one relating to electricity Smart Meter data of the University of Melbourne, and the other relating to the counts of pedestrians in the City of Melbourne.
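One way to picture a kernel combination that encodes working-week structure is to multiply a smooth kernel over time by a kernel over day type. The sketch below is an assumption made for illustration and is not the kernel construction proposed in the paper: the day-type kernel with cross-correlation `rho`, the RBF length scale, the convention that day 0 is a Monday, and the toy load data are all hypothetical.

```python
import numpy as np

def is_weekend(day_index):
    """Assumes day 0 is a Monday, so days 5 and 6 of each week are the weekend."""
    return (np.asarray(day_index) % 7) >= 5

def rbf(a, b, length=3.0):
    """Squared-exponential kernel over day indices."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length) ** 2)

def day_type_kernel(a, b, rho=0.1):
    """1 when both days share a type (weekday or weekend), rho otherwise."""
    same = is_weekend(a)[:, None] == is_weekend(b)[None, :]
    return np.where(same, 1.0, rho)

def working_week_kernel(a, b):
    """Product of two valid kernels, hence itself a valid kernel."""
    return rbf(a, b) * day_type_kernel(a, b)

def gp_predict(x_train, y_train, x_test, kernel, noise=1e-2):
    """Posterior mean of a zero-mean Gaussian process with observation noise."""
    K = kernel(x_train, x_train) + noise * np.eye(len(x_train))
    alpha = np.linalg.solve(K, y_train)
    return kernel(x_test, x_train) @ alpha

# Toy daily load: high on weekdays, low on weekends, over four weeks.
days = np.arange(28, dtype=float)
load = np.where(is_weekend(days), 2.0, 10.0)
future = np.arange(28, 35, dtype=float)  # multi-step horizon: the following week
forecast = gp_predict(days, load - load.mean(), future, working_week_kernel) + load.mean()
print(np.round(forecast, 2))
```

The product form lets a weekend observation inform other weekend days strongly and weekdays only weakly, which is the kind of effect a plain RBF kernel cannot express.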

Extreme Value Statistics

10.1093/oso/9780198782933.003.0011 ◽

2018 ◽

Author(s):

Ray Huffaker ◽

Marco Bittelli ◽

Rodolfo Rosa

Keyword(s):

Infectious Disease ◽

New York ◽

Time Series ◽

Real World ◽

Scarlet Fever ◽

Time Series Data ◽

Extreme Value Statistics ◽

Series Data ◽

Disease Dynamics ◽

Infectious Disease Dynamics

This Capstone chapter illustrates how concepts in the book come together to diagnose real-world dynamics from observed time series data. In particular, we apply NLTS to diagnose multi-strain infectious disease dynamics from weekly cases of scarlet fever, measles, and pertussis in New York during the pre-vaccine period 1924-1948.

Data Preprocessing

10.1093/oso/9780198782933.003.0006 ◽

2018 ◽

Author(s):

Ray Huffaker ◽

Marco Bittelli ◽

Rodolfo Rosa

Keyword(s):

Stochastic Process ◽

Time Series ◽

Signal Processing ◽

Real World ◽

Null Hypothesis ◽

Time Series Data ◽

Nonlinear Behavior ◽

Surrogate Data ◽

Series Data ◽

Linear Behavior

Successful reconstruction of a shadow attractor provides preliminary empirical evidence that a signal isolated from observed time series data may be generated by deterministic dynamics. However, because we cannot reasonably expect signal processing to purge the signal of all noise in practice, and because noisy linear behavior can be visually indistinguishable from nonlinear behavior, the possibility remains that noticeable regularity detected in a shadow attractor may be fortuitously reconstructed from data generated by a linear-stochastic process. This chapter investigates how we can test this null hypothesis using surrogate data testing. The combination of a noticeably regular shadow attractor, along with strong statistical rejection of fortuitous regularity, increases the probability that observed data are generated by deterministic real-world dynamics.
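Surrogate data testing of a linear-stochastic null hypothesis can be sketched with phase-randomized surrogates, which keep the power spectrum of the observed series while scrambling its Fourier phases. The chapter's exact procedure is not given in this abstract, so the choices below are assumptions: the phase randomization scheme, the time-reversal asymmetry statistic, the rank-based p-value, and the logistic-map test signal.

```python
import numpy as np

def phase_randomized_surrogate(x, rng):
    """Surrogate series with the same power spectrum as x but randomized Fourier phases."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    spectrum = np.fft.rfft(x)
    phases = rng.uniform(0, 2 * np.pi, len(spectrum))
    phases[0] = 0.0            # leave the DC term (the mean) unchanged
    if n % 2 == 0:
        phases[-1] = 0.0       # the Nyquist term must stay real for even n
    return np.fft.irfft(spectrum * np.exp(1j * phases), n=n)

def time_reversal_asymmetry(x, lag=1):
    """Normalized third moment of increments; near zero for time-reversible linear processes."""
    d = x[lag:] - x[:-lag]
    return np.mean(d ** 3) / (np.mean(d ** 2) ** 1.5)

def surrogate_test(x, statistic, n_surrogates=99, seed=0):
    """Rank-based p-value: fraction of surrogates at least as extreme as the observed series."""
    rng = np.random.default_rng(seed)
    observed = abs(statistic(x))
    surrogate_stats = [abs(statistic(phase_randomized_surrogate(x, rng)))
                       for _ in range(n_surrogates)]
    exceed = sum(s >= observed for s in surrogate_stats)
    return (exceed + 1) / (n_surrogates + 1)

# Test signal (assumption): the logistic map, a deterministic nonlinear system.
x = np.empty(512)
x[0] = 0.3
for i in range(1, len(x)):
    x[i] = 4.0 * x[i - 1] * (1.0 - x[i - 1])

p_value = surrogate_test(x, time_reversal_asymmetry)
print("p-value under the linear-stochastic null:", p_value)
```

A small p-value indicates that the observed statistic is unusual among series sharing the same linear correlations, which is evidence against the linear-stochastic null hypothesis.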

Unsupervised Pre-training of a Deep LSTM-based Stacked Autoencoder for Multivariate Time Series Forecasting Problems

Scientific Reports ◽

10.1038/s41598-019-55320-6 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 14

Author(s):

Alaa Sagheer ◽

Mostafa Kotb

Keyword(s):

Time Series ◽

Case Studies ◽

Real World ◽

Time Series Data ◽

Short Term Memory ◽

Multivariate Time Series ◽

Series Data ◽

Recurrent Networks ◽

Stacked Autoencoder ◽

Real World Datasets

Currently, most real-world time series datasets are multivariate and are rich in dynamical information of the underlying system. Such datasets are attracting much attention; therefore, the need for accurate modelling of such high-dimensional datasets is increasing. Recently, the deep architecture of the recurrent neural network (RNN) and its variant long short-term memory (LSTM) have been proven to be more accurate than traditional statistical methods in modelling time series data. Despite the reported advantages of the deep LSTM model, its performance in modelling multivariate time series (MTS) data has not been satisfactory, particularly when attempting to process highly non-linear and long-interval MTS datasets. The reason is that the supervised learning approach initializes the neurons randomly in such recurrent networks, disabling the neurons that ultimately must properly learn the latent features of the correlated variables included in the MTS dataset. In this paper, we propose a pre-trained LSTM-based stacked autoencoder (LSTM-SAE) approach in an unsupervised learning fashion to replace the random weight initialization strategy adopted in deep LSTM recurrent networks. For evaluation purposes, two different case studies that include real-world datasets are investigated, where the performance of the proposed approach compares favourably with the deep LSTM approach. In addition, the proposed approach outperforms several reference models investigating the same case studies. Overall, the experimental results clearly show that the unsupervised pre-training approach improves the performance of deep LSTM and leads to better and faster convergence than other models.

Importance of data preprocessing in time series prediction using SARIMA: A case study

International Journal of Knowledge-based and Intelligent Engineering Systems ◽

10.3233/kes-200065 ◽

2021 ◽

Vol 24 (4) ◽

pp. 331-342

Author(s):

Amir Hossein Adineh ◽

Zahra Narimani ◽

Suresh Chandra Satapathy

Keyword(s):

Time Series ◽

Data Analysis ◽

Real World ◽

Missing Values ◽

Time Series Data ◽

Working Hours ◽

Series Data ◽

Real World Data ◽

Seasonal Behavior ◽

Time Series Data Analysis

Over the last decades, time series data analysis has been of particular practical importance. Domains such as financial data analysis, biological data analysis, and speech recognition inherently deal with time-dependent signals. Monitoring the past behavior of signals is key to precisely predicting the behavior of a system in the near future. In scenarios such as financial data prediction, the predominant signal has a periodic behavior (starting from the beginning of the month, week, etc.), and a general trend and seasonal behavior can also be assumed. The Autoregressive Integrated Moving Average (ARIMA) model and its seasonal extension, SARIMA, have been widely used in forecasting time-series data and are also capable of dealing with seasonal behavior and trend in the data. Although the behavior of the data may be autoregressive, and trends and seasonality can be detected and handled by SARIMA, the data is not always exactly compatible with SARIMA (or, more generally, ARIMA) assumptions. In addition, SARIMA does not assume missing data, while in the real world data can be missing for various reasons, such as holidays on which no data is recorded. Different working hours on different weekdays may also cause irregular patterns compared to what SARIMA assumes. In this paper, we investigate the effectiveness of applying SARIMA to such real-world data and demonstrate preprocessing methods that make the data more suitable for modeling with SARIMA. The data in this study is derived from the transactions of a mutual fund investment company; it contains missing values (single points and intervals) as well as irregularities caused by the number of working hours differing between weekdays, which makes the data inconsistent and leads to poor results without preprocessing. In addition, the number of data points was not adequate at the time of analysis to fit a SARIMA model. Preprocessing steps, such as filling missing values and techniques to make the data consistent, are proposed to deal with these problems. Results show that the prediction performance of SARIMA on this set of real-world data is significantly improved by applying the preprocessing steps introduced to deal with these circumstances. The proposed preprocessing steps can be used in other real-world time-series data analysis.
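The abstract mentions filling missing values (single points and intervals) before fitting SARIMA but does not say how. As one possible approach, and purely as an assumption, the sketch below fills each gap by interpolating between observations that fall at the same position in the seasonal cycle; the function name and the toy weekly data are hypothetical.

```python
import numpy as np

def fill_missing_seasonal(y, period):
    """Fill NaNs by interpolating across observations at the same seasonal position."""
    y = np.asarray(y, dtype=float).copy()
    for phase in range(period):
        idx = np.arange(phase, len(y), period)   # e.g. every Monday when period = 7
        vals = y[idx]
        missing = np.isnan(vals)
        if missing.any() and (~missing).any():
            vals[missing] = np.interp(idx[missing], idx[~missing], vals[~missing])
            y[idx] = vals
    return y

# Toy series with a 3-step season, one missing point, and one missing interval.
series = np.array([1.0, 2.0, 3.0,
                   1.0, np.nan, 3.0,
                   np.nan, np.nan, np.nan,
                   1.0, 2.0, 3.0])
filled = fill_missing_seasonal(series, period=3)
print(filled)
```

Interpolating within a seasonal position keeps the periodic pattern intact, whereas plain linear interpolation across a long gap would flatten it.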

TE-ESN: Time Encoding Echo State Network for Prediction Based on Irregularly Sampled Time Series Data

Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2021/414 ◽

2021 ◽

Author(s):

Chenxi Sun ◽

Shenda Hong ◽

Moxian Song ◽

Yen-Hsiu Chou ◽

Yongyue Sun ◽

...

Keyword(s):

Time Series ◽

Real World ◽

Time Series Data ◽

Series Data ◽

Echo State Network ◽

Time Intervals ◽

Time Encoding ◽

Real World Datasets ◽

Ordinary Time ◽

Time Information

Prediction based on Irregularly Sampled Time Series (ISTS) is of wide concern in real-world applications. For more accurate prediction, methods should capture more of the data's characteristics. Unlike ordinary time series, ISTS is characterized by irregular time intervals within a series and different sampling rates across series. However, existing methods produce suboptimal predictions because, when modeling these two characteristics, they artificially introduce new dependencies in a time series and learn biased relations among time series. In this work, we propose a novel Time Encoding (TE) mechanism. TE embeds time information as time vectors in the complex domain. It has the properties of absolute distance and relative distance under different sampling rates, which helps to represent the two irregularities. We also create a new model named Time Encoding Echo State Network (TE-ESN). It is the first ESN-based model that can process ISTS data. In addition, TE-ESN incorporates long short-term memories and series fusion to capture horizontal and vertical relations. Experiments on one chaotic system and three real-world datasets show that TE-ESN performs better than all baselines and has better reservoir properties.
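The idea of embedding timestamps as vectors in the complex domain can be illustrated with unit-modulus complex exponentials. The paper's exact TE definition is not given in this abstract, so the form below is an assumption: the function name, the geometric frequency schedule, and the `dim` and `base` parameters are hypothetical. The sketch shows why such an encoding carries relative distance: the product of one time vector with the conjugate of another depends only on the time difference.

```python
import numpy as np

def time_encoding(t, dim=8, base=10000.0):
    """Encode timestamps as complex vectors exp(i * omega_k * t), one frequency per dimension."""
    t = np.asarray(t, dtype=float)[..., None]
    omega = base ** (-np.arange(dim) / dim)   # hypothetical geometric frequency schedule
    return np.exp(1j * omega * t)

# Irregularly spaced timestamps, as in an irregularly sampled series.
timestamps = np.array([0.0, 0.7, 3.1, 3.2, 9.5])
te = time_encoding(timestamps)                # shape: (5, 8)

# Relative distance: te(t2) * conj(te(t1)) equals the encoding of the interval t2 - t1.
relative = te[2] * np.conj(te[1])
print(np.allclose(relative, time_encoding(3.1 - 0.7)))
```

Because each component has modulus one, the encoding stores time purely as phase, so arbitrary real-valued gaps between samples need no resampling onto a regular grid.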
