Choosing the content of textual summaries of large time-series data sets

2006 ◽  
Vol 13 (1) ◽  
pp. 25-49 ◽  
Author(s):  
JIN YU ◽  
EHUD REITER ◽  
JIM HUNTER ◽  
CHRIS MELLISH

Natural Language Generation (NLG) can be used to generate textual summaries of numeric data sets. In this paper we develop an architecture for generating short (a few sentences) summaries of large (100KB or more) time-series data sets. The architecture integrates pattern recognition, pattern abstraction, selection of the most significant patterns, microplanning (especially aggregation), and realisation. We also describe and evaluate SumTime-Turbine, a prototype system which uses this architecture to generate textualsummaries of sensor data from gas turbines.

AI ◽  
2021 ◽  
Vol 2 (1) ◽  
pp. 48-70
Author(s):  
Wei Ming Tan ◽  
T. Hui Teo

Prognostic techniques attempt to predict the Remaining Useful Life (RUL) of a subsystem or a component. Such techniques often use sensor data which are periodically measured and recorded into a time series data set. Such multivariate data sets form complex and non-linear inter-dependencies through recorded time steps and between sensors. Many current existing algorithms for prognostic purposes starts to explore Deep Neural Network (DNN) and its effectiveness in the field. Although Deep Learning (DL) techniques outperform the traditional prognostic algorithms, the networks are generally complex to deploy or train. This paper proposes a Multi-variable Time Series (MTS) focused approach to prognostics that implements a lightweight Convolutional Neural Network (CNN) with attention mechanism. The convolution filters work to extract the abstract temporal patterns from the multiple time series, while the attention mechanisms review the information across the time axis and select the relevant information. The results suggest that the proposed method not only produces a superior accuracy of RUL estimation but it also trains many folds faster than the reported works. The superiority of deploying the network is also demonstrated on a lightweight hardware platform by not just being much compact, but also more efficient for the resource restricted environment.


2017 ◽  
Author(s):  
Anthony Szedlak ◽  
Spencer Sims ◽  
Nicholas Smith ◽  
Giovanni Paternostro ◽  
Carlo Piermarocchi

AbstractModern time series gene expression and other omics data sets have enabled unprecedented resolution of the dynamics of cellular processes such as cell cycle and response to pharmaceutical compounds. In anticipation of the proliferation of time series data sets in the near future, we use the Hopfield model, a recurrent neural network based on spin glasses, to model the dynamics of cell cycle in HeLa (human cervical cancer) and S. cerevisiae cells. We study some of the rich dynamical properties of these cyclic Hopfield systems, including the ability of populations of simulated cells to recreate experimental expression data and the effects of noise on the dynamics. Next, we use a genetic algorithm to identify sets of genes which, when selectively inhibited by local external fields representing gene silencing compounds such as kinase inhibitors, disrupt the encoded cell cycle. We find, for example, that inhibiting the set of four kinases BRD4, MAPK1, NEK7, and YES1 in HeLa cells causes simulated cells to accumulate in the M phase. Finally, we suggest possible improvements and extensions to our model.Author SummaryCell cycle – the process in which a parent cell replicates its DNA and divides into two daughter cells – is an upregulated process in many forms of cancer. Identifying gene inhibition targets to regulate cell cycle is important to the development of effective therapies. Although modern high throughput techniques offer unprecedented resolution of the molecular details of biological processes like cell cycle, analyzing the vast quantities of the resulting experimental data and extracting actionable information remains a formidable task. Here, we create a dynamical model of the process of cell cycle using the Hopfield model (a type of recurrent neural network) and gene expression data from human cervical cancer cells and yeast cells. We find that the model recreates the oscillations observed in experimental data. Tuning the level of noise (representing the inherent randomness in gene expression and regulation) to the “edge of chaos” is crucial for the proper behavior of the system. We then use this model to identify potential gene targets for disrupting the process of cell cycle. This method could be applied to other time series data sets and used to predict the effects of untested targeted perturbations.


Author(s):  
Meenakshi Narayan ◽  
Ann Majewicz Fey

Abstract Sensor data predictions could significantly improve the accuracy and effectiveness of modern control systems; however, existing machine learning and advanced statistical techniques to forecast time series data require significant computational resources which is not ideal for real-time applications. In this paper, we propose a novel forecasting technique called Compact Form Dynamic Linearization Model-Free Prediction (CFDL-MFP) which is derived from the existing model-free adaptive control framework. This approach enables near real-time forecasts of seconds-worth of time-series data due to its basis as an optimal control problem. The performance of the CFDL-MFP algorithm was evaluated using four real datasets including: force sensor readings from surgical needle, ECG measurements for heart rate, and atmospheric temperature and Nile water level recordings. On average, the forecast accuracy of CFDL-MFP was 28% better than the benchmark Autoregressive Integrated Moving Average (ARIMA) algorithm. The maximum computation time of CFDL-MFP was 49.1ms which was 170 times faster than ARIMA. Forecasts were best for deterministic data patterns, such as the ECG data, with a minimum average root mean squared error of (0.2±0.2).


Author(s):  
Pritpal Singh

Forecasting using fuzzy time series has been applied in several areas including forecasting university enrollments, sales, road accidents, financial forecasting, weather forecasting, etc. Recently, many researchers have paid attention to apply fuzzy time series in time series forecasting problems. In this paper, we present a new model to forecast the enrollments in the University of Alabama and the daily average temperature in Taipei, based on one-factor fuzzy time series. In this model, a new frequency based clustering technique is employed for partitioning the time series data sets into different intervals. For defuzzification function, two new principles are also incorporated in this model. In case of enrollments as well daily temperature forecasting, proposed model exhibits very small error rate.


2020 ◽  
Vol 496 (1) ◽  
pp. 629-637
Author(s):  
Ce Yu ◽  
Kun Li ◽  
Shanjiang Tang ◽  
Chao Sun ◽  
Bin Ma ◽  
...  

ABSTRACT Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernova in time domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming extremely hard and infeasible for continuously analysing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or data bases, match each item to determine which object it belongs to, and finally produce time series data sets. To support the high-performance parallel processing of large-scale data sets, AstroCatR uses the extract-transform-load (ETL) pre-processing module to create sky zone files and balance the workload. The matching module uses the overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or be transformed other into formats as needed. Simultaneously, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from The three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3× faster than methods using relational data base management systems at matching massive catalogues.


Fractals ◽  
2006 ◽  
Vol 14 (03) ◽  
pp. 165-170 ◽  
Author(s):  
ATIN DAS ◽  
PRITHA DAS

In this paper, we attempt musical analysis by measuring fractal dimension (D) of musical pieces played by several musical instruments. We collected solo performances of popular instruments of Western and Eastern origin as samples. We attempted usual spectral analysis of the selected clips to observe peaks of fundamental and harmonics in frequency regime. After appropriate processing, we converted them into time series data sets and computed their fractal dimension. Based on our results, we conclude that instrumental musical sounds may have higher Ds than those computed from vocal performances of different types of Indian songs.


Sign in / Sign up

Export Citation Format

Share Document