Automatic identification of differences in behavioral co-occurrence between groups

2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Yiming Tian ◽  
Takuya Maekawa ◽  
Joseph Korpela ◽  
Daichi Amagata ◽  
Takahiro Hara ◽  
...  

Abstract Background Recent advances in sensing technologies have enabled us to attach small loggers to animals in their natural habitat. These loggers allow us to measure the animals’ behavior, along with associated environmental and physiological data, and to unravel the adaptive significance of the behavior. However, because animal-borne loggers can now record multi-dimensional (here defined as multimodal) time-series information from a variety of sensors, it is becoming increasingly difficult to identify biologically important patterns hidden in the high-dimensional long-term data. In particular, it is important to identify co-occurrences of several behavioral modes recorded by different sensors in order to understand an animal’s internal hidden state, because the observed behavioral modes reflect that hidden state. This study proposed a method for automatically detecting co-occurrences of behavioral modes that differ between two groups (e.g., males vs. females) from multimodal time-series sensor data. The proposed method first extracted behavioral modes from time-series data (e.g., resting and cruising modes in GPS trajectories, or relaxed and stressed modes in heart rates) and then identified pairs of behavioral modes that frequently co-occurred (e.g., co-occurrence of the cruising mode and the relaxed mode). Finally, behavioral modes whose frequency of co-occurrence differed between the two groups were identified. Results We demonstrated the effectiveness of our method on animal-locomotion data collected from male and female Streaked Shearwaters by showing co-occurrences of locomotion modes and diving behavior recorded by GPS and water-depth sensors. For example, we found that the behavioral mode of high-speed locomotion and that of multiple dives into the sea were highly correlated in male seabirds. In addition, compared to the naive method, the proposed method reduced the computation cost by about 99.9%. Conclusion Because our method can automatically mine meaningful behavioral modes from multimodal time-series data, it can potentially be applied to analyzing co-occurrences of locomotion modes and behavioral modes from various environmental and physiological data.
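A minimal sketch of the co-occurrence idea in Python (not the authors' implementation; the streams, mode counts, and group data below are synthetic stand-ins): behavioral modes are extracted from each sensor stream by clustering, co-occurrences of a mode pair are counted over aligned time steps, and a chi-square test checks whether the co-occurrence frequency differs between groups.

```python
# Synthetic stand-ins for aligned GPS-speed and dive-depth streams.
import numpy as np
from scipy.stats import chi2_contingency
from sklearn.cluster import KMeans

def extract_modes(series, n_modes=2, seed=0):
    """Label each time step with a behavioral mode (cluster index)."""
    km = KMeans(n_clusters=n_modes, random_state=seed, n_init=10)
    return km.fit_predict(np.asarray(series).reshape(-1, 1))

def cooccurrence(modes_a, modes_b, mode_a=1, mode_b=1):
    """(# steps where both modes occur together, # remaining steps)."""
    both = int(np.sum((modes_a == mode_a) & (modes_b == mode_b)))
    return both, len(modes_a) - both

rng = np.random.default_rng(0)
male = cooccurrence(extract_modes(rng.normal(5, 2, 1000)),
                    extract_modes(rng.normal(1, 0.5, 1000)))
female = cooccurrence(extract_modes(rng.normal(3, 2, 1000)),
                      extract_modes(rng.normal(1, 0.5, 1000)))

# Chi-square test: does the co-occurrence frequency differ between groups?
chi2, p, _, _ = chi2_contingency([male, female])
print(f"chi2={chi2:.2f}, p={p:.4f}")
```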

AI ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 48-70
Author(s):  
Wei Ming Tan ◽  
T. Hui Teo

Prognostic techniques attempt to predict the Remaining Useful Life (RUL) of a subsystem or a component. Such techniques often use sensor data that are periodically measured and recorded into a time-series data set. Such multivariate data sets form complex and non-linear inter-dependencies across recorded time steps and between sensors. Many existing prognostic algorithms have started to explore Deep Neural Networks (DNNs) and their effectiveness in the field. Although Deep Learning (DL) techniques outperform traditional prognostic algorithms, the networks are generally complex to deploy or train. This paper proposes a Multi-variable Time Series (MTS) focused approach to prognostics that implements a lightweight Convolutional Neural Network (CNN) with an attention mechanism. The convolution filters extract abstract temporal patterns from the multiple time series, while the attention mechanism reviews the information across the time axis and selects the relevant information. The results suggest that the proposed method not only produces superior RUL-estimation accuracy but also trains many times faster than previously reported works. Deployment on a lightweight hardware platform further demonstrates the network's advantage: it is not just more compact but also more efficient in resource-restricted environments.
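A hedged sketch of such a lightweight CNN-with-attention regressor in PyTorch; the layer sizes, kernel widths, and window length are illustrative assumptions, not the paper's exact architecture.

```python
# Small 1D CNN extracts temporal features from multivariate windows; a
# dot-product attention layer pools them over the time axis before the
# RUL regression head.
import torch
import torch.nn as nn

class CNNAttentionRUL(nn.Module):
    def __init__(self, n_sensors, hidden=32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(n_sensors, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(hidden, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        self.attn_score = nn.Linear(hidden, 1)   # one weight per time step
        self.head = nn.Linear(hidden, 1)         # RUL regression head

    def forward(self, x):                         # x: (batch, sensors, time)
        h = self.conv(x).transpose(1, 2)          # (batch, time, hidden)
        w = torch.softmax(self.attn_score(h), 1)  # attention over time axis
        ctx = (w * h).sum(dim=1)                  # weighted temporal pooling
        return self.head(ctx).squeeze(-1)

model = CNNAttentionRUL(n_sensors=14)
rul = model(torch.randn(8, 14, 30))               # e.g., 30-cycle windows
print(rul.shape)                                  # torch.Size([8])
```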


Author(s):  
Baher Azzam ◽  
Ralf Schelenz ◽  
Björn Roscher ◽  
Abdul Baseer ◽  
Georg Jacobs

Abstract A current development trend in wind energy is characterized by the installation of wind turbines (WT) with increasing rated power output. Higher towers and larger rotor diameters increase the rated power, leading to an intensification of the load situation on the drive train and the main gearbox. However, current main-gearbox condition monitoring systems (CMS) do not record the 6-degree-of-freedom (6-DOF) input loads to the transmission, as doing so is too expensive. Therefore, this investigation aims to present an approach to develop and validate a low-cost virtual sensor for measuring the input loads of a WT main gearbox. A prototype of the virtual sensor system was developed in a virtual environment using a multi-body simulation (MBS) model of a WT drivetrain and artificial neural network (ANN) models. Simulated wind fields according to IEC 61400-1, covering a variety of wind speeds, were generated and applied to an MBS model of a Vestas V52 wind turbine. The turbine contains a high-speed drivetrain with 4-point bearing suspension, a common drivetrain configuration. The simulation was used to generate time-series data of the target and input parameters for the virtual sensor algorithm, an ANN model. After the ANN was trained on the time-series data collected from the MBS, the developed virtual sensor algorithm was tested by comparing the estimated 6-DOF transmission input loads from the ANN to the simulated 6-DOF transmission input loads from the MBS. The results show high potential for virtual sensing of 6-DOF wind turbine transmission input loads using the presented method.
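A minimal sketch of the virtual-sensor workflow, assuming the MBS time series have already been exported; the synthetic data, input channels, and network size below are illustrative stand-ins, not the paper's setup.

```python
# Train a feed-forward ANN to map cheap-to-measure drivetrain channels
# (assumed inputs) to the six simulated transmission input loads
# (Fx, Fy, Fz, Mx, My, Mz) produced by the MBS model.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

# Stand-in for exported MBS time-series data.
rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 10))                       # measurable channels
y = X @ rng.normal(size=(10, 6)) + 0.01 * rng.normal(size=(5000, 6))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

ann = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0)
ann.fit(X_tr, y_tr)

# Validate the virtual sensor against held-out simulated loads.
print(f"R^2 on held-out MBS data: {ann.score(X_te, y_te):.3f}")
```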


Author(s):  
Meenakshi Narayan ◽  
Ann Majewicz Fey

Abstract Sensor data predictions could significantly improve the accuracy and effectiveness of modern control systems; however, existing machine learning and advanced statistical techniques for forecasting time-series data require significant computational resources, which is not ideal for real-time applications. In this paper, we propose a novel forecasting technique called Compact Form Dynamic Linearization Model-Free Prediction (CFDL-MFP), which is derived from the existing model-free adaptive control framework. This approach enables near real-time forecasts of seconds' worth of time-series data due to its basis as an optimal control problem. The performance of the CFDL-MFP algorithm was evaluated on four real datasets, including force sensor readings from a surgical needle, ECG measurements for heart rate, atmospheric temperature, and Nile water-level recordings. On average, the forecast accuracy of CFDL-MFP was 28% better than the benchmark Autoregressive Integrated Moving Average (ARIMA) algorithm. The maximum computation time of CFDL-MFP was 49.1 ms, which was 170 times faster than ARIMA. Forecasts were best for deterministic data patterns, such as the ECG data, with a minimum average root mean squared error of (0.2±0.2).
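The paper's exact algorithm is not reproduced here, but a heavily hedged sketch of the compact-form dynamic-linearization idea it builds on looks roughly like this: locally linearize the series as y(k+1) ≈ y(k) + φ(k)·Δy(k) and update the pseudo partial derivative φ online from the latest increments. The gains below are arbitrary illustration values.

```python
import numpy as np

def cfdl_forecast(y, eta=0.5, mu=1.0, phi0=1.0):
    """One-step-ahead forecasts of a 1-D series (illustrative only)."""
    phi = phi0
    preds = np.empty(len(y))
    preds[:3] = y[:3]                      # no forecast for the first steps
    for k in range(2, len(y) - 1):
        d_prev = y[k - 1] - y[k - 2]
        d_curr = y[k] - y[k - 1]
        # Projection-style update of the pseudo partial derivative phi.
        phi += eta * d_prev / (mu + d_prev ** 2) * (d_curr - phi * d_prev)
        preds[k + 1] = y[k] + phi * d_curr  # local dynamic linearization
    return preds

t = np.linspace(0, 8 * np.pi, 400)
y = np.sin(t) + 0.05 * np.random.default_rng(1).normal(size=400)
preds = cfdl_forecast(y)
print(f"RMSE: {np.sqrt(np.mean((preds[3:] - y[3:]) ** 2)):.4f}")
```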


Author(s):  
Shaolong Zeng ◽  
Yiqun Liu ◽  
Junjie Ding ◽  
Danlu Xu

This paper aims to identify the relationship among energy consumption, FDI, and economic development in China from 1993 to 2017, taking Zhejiang as an example. FDI is the main factor in the rapid development of Zhejiang’s open economy: it promotes the development of the economy, but also leads to growth in energy consumption. Based on time-series data on energy consumption, FDI inflow, and GDP in Zhejiang from 1993 to 2017, we choose the vector auto-regression (VAR) model and identify the relationship among energy consumption, FDI, and economic development. The results indicate that there is a long-run equilibrium relationship among them. FDI inflow promotes energy consumption, and energy consumption in turn promotes FDI inflow. FDI promotes economic growth indirectly through energy consumption. Therefore, improving the quality of FDI and energy efficiency has become an inevitable choice for achieving the transition of Zhejiang’s economy from high-speed growth to high-quality growth.
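A minimal sketch of this kind of VAR analysis with statsmodels; the file name, column names, and the use of growth rates for stationarity are assumptions for illustration, not the paper's specification.

```python
import pandas as pd
from statsmodels.tsa.api import VAR

# Hypothetical file with annual observations for 1993-2017.
df = pd.read_csv("zhejiang_1993_2017.csv", index_col="year")
data = df[["energy", "fdi", "gdp"]].pct_change().dropna()  # growth rates

model = VAR(data)
lag = model.select_order(maxlags=3).aic     # lag length chosen by AIC
results = model.fit(lag)

# Does FDI help predict energy consumption, and vice versa?
print(results.test_causality("energy", ["fdi"], kind="f").summary())
print(results.test_causality("fdi", ["energy"], kind="f").summary())
```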


2022 ◽  
Vol 3 (1) ◽  
pp. 1-26
Author(s):  
Omid Hajihassani ◽  
Omid Ardakanian ◽  
Hamzeh Khazaei

The abundance of data collected by sensors in Internet of Things devices and the success of deep neural networks in uncovering hidden patterns in time series data have led to mounting privacy concerns. This is because private and sensitive information can be potentially learned from sensor data by applications that have access to this data. In this article, we aim to examine the tradeoff between utility and privacy loss by learning low-dimensional representations that are useful for data obfuscation. We propose deterministic and probabilistic transformations in the latent space of a variational autoencoder to synthesize time series data such that intrusive inferences are prevented while desired inferences can still be made with sufficient accuracy. In the deterministic case, we use a linear transformation to move the representation of input data in the latent space such that the reconstructed data is likely to have the same public attribute but a different private attribute than the original input data. In the probabilistic case, we apply the linear transformation to the latent representation of input data with some probability. We compare our technique with autoencoder-based anonymization techniques and additionally show that it can anonymize data in real time on resource-constrained edge devices.
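A hedged sketch of the deterministic transformation step, assuming a trained VAE is available as encode/decode functions; the centroid-difference direction used here is an illustrative stand-in for the paper's learned linear transformation.

```python
import numpy as np

def obfuscate(x, encode, decode, centroids, private_label, p=1.0, rng=None):
    """Shift x's latent code toward the opposite private-attribute class.

    `encode`/`decode` are the trained VAE's encoder and decoder;
    `centroids[c]` is the mean latent code of private class c.
    With p < 1 this becomes the probabilistic variant.
    """
    rng = rng or np.random.default_rng()
    z = encode(x)
    direction = centroids[1 - private_label] - centroids[private_label]
    if rng.random() < p:
        z = z + direction          # linear move in the latent space
    return decode(z)               # same public attribute, flipped private one
```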


Mathematics ◽  
2021 ◽  
Vol 9 (17) ◽  
pp. 2146
Author(s):  
Mikhail Zymbler ◽  
Elena Ivanova

Currently, big sensor data arise in a wide spectrum of Industry 4.0, Internet of Things, and Smart City applications. In such subject domains, sensors tend to have a high frequency and produce massive time series in a relatively short time interval. The data collected from the sensors are subject to mining in order to make strategic decisions. In this article, we consider the problem of choosing a Time Series Database Management System (TSDBMS) to provide efficient storage and mining of big sensor data. We overview InfluxDB, OpenTSDB, and TimescaleDB, which are among the most popular state-of-the-art TSDBMSs and represent different categories of such systems, namely native systems, add-ons over NoSQL systems, and add-ons over relational DBMSs (RDBMSs), respectively. Our overview shows that, at present, TSDBMSs offer a modest built-in toolset for mining big sensor data. This leads to the use of third-party mining systems and unwanted overhead costs due to exporting data outside the TSDBMS, data conversion, and so on. We propose an approach to managing and mining sensor data inside RDBMSs that exploits the Matrix Profile concept. A Matrix Profile is a data structure that annotates a time series with the index of, and the distance to, the nearest neighbor of each subsequence of the time series, and it serves as a basis for discovering motifs, anomalies, and other time-series data mining primitives. This approach is implemented as a PostgreSQL extension that allows an application programmer both to compute matrix profiles and mining primitives and to represent them as relational tables. Experimental case studies show that our approach surpasses the above-mentioned out-of-TSDBMS competitors in terms of performance, since sensor data are mined inside the TSDBMS at no significant overhead cost.
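Outside the proposed extension, the Matrix Profile concept itself can be illustrated with the open-source stumpy library; this snippet shows how the profile exposes motifs and anomalies, and is not the extension's own API.

```python
import numpy as np
import stumpy

ts = np.random.default_rng(0).normal(size=2000)   # stand-in sensor series
m = 50                                            # subsequence length
mp = stumpy.stump(ts, m)                          # col 0: nn distance, col 1: nn index

motif_idx = int(np.argmin(mp[:, 0]))              # best-matching subsequence
anomaly_idx = int(np.argmax(mp[:, 0]))            # most isolated subsequence
print(f"motif at {motif_idx} (nn={mp[motif_idx, 1]}), anomaly at {anomaly_idx}")
```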


When analyzing IoT projects, it is very expensive to buy many sensors, the corresponding processor boards, power supplies, and so on. Moreover, the entire setup must be replicated to cater to large topologies, and the whole experiment has to be planned at a large scale before any analytics can be observed. At a smaller scale, this can be implemented as a simulation program on Linux, where the sensor data are created using a random number generator and scaled appropriately for each sensor type to mimic representative data. The data are then encrypted before being sent over the network to the edge nodes. At the server, a socket stream continuously awaits sensor data; the required sensor data are retrieved and decrypted to recover the true time series. This time series is then given to an analytics engine, which calculates trends and cyclicity and is used to train a neural network, and the anomalies so found are properly deciphered. The multiplicity of nodes can be represented by running several client programs in separate terminals. A simple client-server architecture is thus able to simulate a large IoT infrastructure and perform analytics on a scaled model.
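A runnable miniature of the described setup (encryption elided; in a real run the payload would be wrapped with a library such as cryptography): a client fabricates scaled random sensor readings and streams them over a TCP socket to a server that collects them as a time series.

```python
import json
import random
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 9009

def server():
    with socket.socket() as s:
        s.bind((HOST, PORT))
        s.listen()
        conn, _ = s.accept()
        with conn, conn.makefile() as lines:
            for raw in lines:
                reading = json.loads(raw)      # decrypt here in a real setup
                print("received", reading)     # feed the analytics engine

def client(sensor_id, scale):
    time.sleep(0.2)                            # let the server start
    with socket.create_connection((HOST, PORT)) as s:
        for t in range(5):
            # Random reading scaled to mimic a real sensor's range.
            reading = {"sensor": sensor_id, "t": t,
                       "value": random.random() * scale}
            s.sendall((json.dumps(reading) + "\n").encode())
            time.sleep(0.1)

threading.Thread(target=server, daemon=True).start()
client("temp-1", scale=40.0)                   # e.g., a 0-40 degC sensor
time.sleep(0.5)                                # give the server time to drain
```

Node multiplicity is simulated by launching several such clients in separate terminals against the same server.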


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Mahbubul Alam ◽  
Laleh Jalali ◽  
Ahmed Farahat ◽  
Chetan Gupta

Abstract Prognostics aims to predict the degradation of equipment by estimating its remaining useful life (RUL) and/or the failure probability within a specific time horizon. The high demand for equipment prognostics in industry has propelled researchers to develop robust and efficient prognostics techniques. Among data-driven techniques for prognostics, machine learning and deep learning (DL) based techniques, particularly Recurrent Neural Networks (RNNs), have gained significant attention due to their ability to effectively represent degradation progress by modeling dynamic temporal behavior. RNNs are well known for handling sequential data, especially continuous time-series data where the data follow a certain pattern. Such data are usually obtained from sensors attached to the equipment. However, in many scenarios sensor data are not readily available and are often very tedious to acquire. Conversely, event data are more common and can easily be obtained from the error logs saved by the equipment and transmitted to a backend for further processing. Nevertheless, performing prognostics using event data is substantially more difficult than with sensor data due to the unique nature of event data. Though event data are sequential, they differ from other seminal sequential data such as time series and natural language in the following ways: i) unlike time-series data, events may appear at any time, i.e., the appearance of events lacks periodicity; ii) unlike natural languages, event data do not follow any specific linguistic rule. Additionally, there may be significant variability in the event types appearing within the same sequence. Therefore, this paper proposes an RUL estimation framework to effectively handle such intricate and novel event data. The proposed framework takes discrete events generated by equipment (e.g., type, time, etc.) as input and, for each new event, generates an estimate of the remaining operating cycles in the life of a given component. To evaluate the efficacy of our proposed method, we conduct extensive experiments using benchmark datasets such as the CMAPSS data after converting the time-series data in these datasets to sequential event data. The event data conversion is carried out by careful exploration and application of appropriate transformation techniques to the time series. To the best of our knowledge, this is the first time such an event-based RUL estimation problem has been introduced to the community. Furthermore, we propose several deep learning and machine learning based solutions for the event-based RUL estimation problem. Our results suggest that the deep learning models, 1D-CNN, LSTM, and multi-head attention, show similar RMSE, MAE, and Score performance. Foreseeably, the XGBoost model achieves lower performance compared to the deep learning models, since it fails to capture ordering information from the sequence of events.
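Since the paper's exact transformation is not spelled out here, the following is one plausible sketch of converting CMAPSS-style time series into discrete events: emit an event whenever a sensor crosses a per-sensor quantile threshold, producing (event type, cycle) pairs an event-based RUL model can consume.

```python
import numpy as np
import pandas as pd

def series_to_events(df, sensors, q=0.95):
    """Return a time-ordered list of (event_type, cycle) threshold crossings."""
    events = []
    for s in sensors:
        hi = df[s].quantile(q)
        # An event fires only on the upward crossing, not while above it.
        crossed = (df[s] > hi) & (df[s].shift(fill_value=df[s].iloc[0]) <= hi)
        events += [(f"{s}_high", int(c)) for c in df.index[crossed]]
    return sorted(events, key=lambda e: e[1])

# Stand-in for one engine's degradation trajectory.
rng = np.random.default_rng(0)
df = pd.DataFrame({"s2": np.cumsum(rng.normal(0.05, 1, 200)),
                   "s7": np.cumsum(rng.normal(0.03, 1, 200))})
print(series_to_events(df, ["s2", "s7"])[:5])
```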


In this paper, we analyze, model, predict, and cluster Global Active Power, a time series obtained at one-minute intervals from electricity sensors of a household. We analyze changes in seasonality and trends to model the data. We then compare various forecasting methods, such as SARIMA and LSTM, to forecast sensor data for the household, and combine them into a hybrid model that captures nonlinear variations better than either SARIMA or LSTM used in isolation. Finally, we cluster slices of the time-series data using a novel clustering algorithm that combines density-based and centroid-based approaches, to discover relevant subtle clusters from sensor data. Our experiments have yielded meaningful insights from the data at both a micro, day-to-day granularity and a macro, weekly-to-monthly granularity.
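A compact sketch of the hybrid idea: SARIMA models the linear/seasonal component and a second learner is fit to its residuals (the paper uses an LSTM; a small MLP stands in here purely for brevity). The orders, window size, and data are assumptions for illustration.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor      # stand-in for the LSTM
from statsmodels.tsa.statespace.sarimax import SARIMAX

rng = np.random.default_rng(0)
y = np.sin(np.linspace(0, 60, 600)) + 0.1 * rng.normal(size=600)
train, test = y[:500], y[500:]

# Linear component: SARIMA fit on the training series.
sarima = SARIMAX(train, order=(2, 0, 1)).fit(disp=False)
resid = sarima.resid

# Residual model: learn nonlinear structure left over by SARIMA.
L = 10
X = np.lib.stride_tricks.sliding_window_view(resid[:-1], L)
nn = MLPRegressor(hidden_layer_sizes=(32,), max_iter=500,
                  random_state=0).fit(X, resid[L:])

# Hybrid forecast: SARIMA forecast plus recursive residual correction.
linear = np.asarray(sarima.forecast(len(test)))
window, corr = resid[-L:].copy(), []
for _ in range(len(test)):
    r_hat = nn.predict(window.reshape(1, -1))[0]
    corr.append(r_hat)
    window = np.append(window[1:], r_hat)

hybrid = linear + np.array(corr)
print(f"hybrid RMSE: {np.sqrt(np.mean((hybrid - test) ** 2)):.4f}")
```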


2021 ◽  
Vol 14 (13) ◽  
pp. 3253-3266
Author(s):  
Jian Liu ◽  
Kefei Wang ◽  
Feng Chen

Time-series databases are becoming an indispensable component in today's data centers. In order to manage the rapidly growing time-series data, we need an effective and efficient system solution to handle the huge traffic of time-series data queries. A promising solution is to deploy a high-speed, large-capacity cache system to relieve the burden on the backend time-series databases and accelerate query processing. However, time-series data is drastically different from other traditional data workloads, bringing both challenges and opportunities. In this paper, we present a flash-based cache system design for time-series data, called TSCache. By exploiting the unique properties of time-series data, we have developed a set of optimization schemes, such as slab-based data management, a two-layered data indexing structure, an adaptive time-aware caching policy, and a low-cost compaction process. We have implemented a prototype based on Twitter's Fatcache. Our experimental results show that TSCache can significantly improve client query performance, increasing bandwidth by a factor of up to 6.7 and reducing latency by up to 84.2%.
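TSCache itself is a flash-based C system, but the bucketed caching idea can be illustrated with a toy in-memory sketch: slabs are keyed by (series, time bucket), range queries are served from cached buckets, and misses fall through to the backend database. The LRU policy below is a simplification of TSCache's time-aware policy.

```python
from collections import OrderedDict

class TinyTSCache:
    def __init__(self, bucket=60, capacity=1024):
        self.bucket, self.capacity = bucket, capacity
        self.slabs = OrderedDict()                 # LRU order (simplified)

    def query(self, series, t0, t1, backend):
        """Serve points in [t0, t1) from cached slabs, filling misses."""
        points = []
        for b in range(t0 // self.bucket, t1 // self.bucket + 1):
            key = (series, b)
            if key not in self.slabs:              # miss: fill from backend
                lo = b * self.bucket
                self.slabs[key] = backend(series, lo, lo + self.bucket)
                if len(self.slabs) > self.capacity:
                    self.slabs.popitem(last=False)  # evict coldest slab
            self.slabs.move_to_end(key)
            points += [p for p in self.slabs[key] if t0 <= p[0] < t1]
        return points

# Example backend returning (timestamp, value) tuples for a range.
backend = lambda s, lo, hi: [(t, 0.0) for t in range(lo, hi)]
cache = TinyTSCache()
print(len(cache.query("cpu", 30, 150, backend)))   # 120 points
```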

