Deep LSTM-Based Transfer Learning Approach for Coherent Forecasts in Hierarchical Time Series

Hierarchical time series is a set of data sequences organized by aggregation constraints to represent many real-world applications in research and the industry. Forecasting of hierarchical time series is a challenging and time-consuming problem owing to ensuring the forecasting consistency among the hierarchy levels based on their dimensional features. The excellent empirical performance of our Deep Long Short-Term Memory (DLSTM) approach on various forecasting tasks motivated us to extend it to solve the forecasting problem through hierarchical architectures. Toward this target, we develop the DLSTM model in auto-encoder (AE) fashion and take full advantage of the hierarchical architecture for better time series forecasting. DLSTM-AE works as an alternative approach to traditional and machine learning approaches that have been used to manipulate hierarchical forecasting. However, training a DLSTM in hierarchical architectures requires updating the weight vectors for each LSTM cell, which is time-consuming and requires a large amount of data through several dimensions. Transfer learning can mitigate this problem by training first the time series at the bottom level of the hierarchy using the proposed DLSTM-AE approach. Then, we transfer the learned features to perform synchronous training for the time series of the upper levels of the hierarchy. To demonstrate the efficiency of the proposed approach, we compare its performance with existing approaches using two case studies related to the energy and tourism domains. An evaluation of all approaches was based on two criteria, namely, the forecasting accuracy and the ability to produce coherent forecasts through through the hierarchy. In both case studies, the proposed approach attained the highest accuracy results among all counterparts and produced more coherent forecasts.

Download Full-text

Pre-treating GNSS time series using a recurrent neural network to improve the automated detection of jump discontinuities

10.5194/egusphere-egu21-14488 ◽

2021 ◽

Author(s):

Luca Tavasci ◽

Pasquale Cascarano ◽

Stefano Gandolfi

Keyword(s):

Neural Network ◽

Time Series ◽

Time Series Analysis ◽

Ad Hoc ◽

Short Term Memory ◽

Learning Approaches ◽

Jump Detection ◽

Series Analysis ◽

Jump Discontinuities ◽

Gnss Time Series

<p>Ground motion monitoring is one of the main goals in the geoscientist community and at the time it is mainly performed by analyzing time series of data. Our capability of describing the most significant features characterizing the time evolution of a point-position is affected by the presence of undetected discontinuities in the time series. One of the most critical aspects in the automated time series analysis, which is quite necessary since the amount of data is increasing more and more, is still the detection of discontinuities and in particular the definition of their epoch. A number of algorithms have already been developed and proposed to the community in the last years, following different statistical approaches and different hypotheses on the coordinates behavior. In this work, we have chosen to analyze GNSS time series and to use an already published algorithm (STARS) for jump detection as a benchmark to test our approach, consisting of pre-treating the time series to be analyzed using a neural network. In particular, we chose a Long Short Term Memory (LSTM) neural network belonging to the class of the Recurrent Neural Networks (RNNs), ad hoc modified for the GNSS time series analysis. We focused both on the training algorithm and the testing one. The latter has been the object of a parametric test to find out the number of predicted data that mostly emphasize our capability of detecting jump discontinuities. Results will be presented considering several GNSS time series of daily positions. Finally, a discussion on the possible integration of machine learning approaches and classical deterministic approaches will be done.</p>

Download Full-text

Amsterdam to Dublin Eventually Delayed? LSTM and Transfer Learning for Predicting Delays of Low Cost Airlines

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33019541 ◽

2019 ◽

Vol 33 ◽

pp. 9541-9546

Author(s):

Nicholas McCarthy ◽

Mohammad Karzand ◽

Freddy Lecue

Keyword(s):

Transfer Learning ◽

Short Term Memory ◽

Low Cost ◽

Learning Approaches ◽

Flight Delays ◽

Commercial Aviation ◽

Feature Spaces ◽

Flight Delay ◽

Delay Prediction ◽

Heterogeneous Feature

Flight delays impact airlines, airports and passengers. Delay prediction is crucial during the decision-making process for all players in commercial aviation, and in particular for airlines to meet their on-time performance objectives. Although many machine learning approaches have been experimented with, they fail in (i) predicting delays in minutes with low errors (less than 15 minutes), (ii) being applied to small carriers i.e., low cost companies characterized by a small amount of data. This work presents a Long Short-Term Memory (LSTM) approach to predicting flight delay, modeled as a sequence of flights across multiple airports for a particular aircraft throughout the day. We then suggest a transfer learning approach between heterogeneous feature spaces to train a prediction model for a given smaller airline using the data from another larger airline. Our approach is demonstrated to be robust and accurate for low cost airlines in Europe.

Download Full-text

Single and Multi-Sequence Deep Learning Models for Short and Medium Term Electric Load Forecasting

Energies ◽

10.3390/en12010149 ◽

2019 ◽

Vol 12 (1) ◽

pp. 149 ◽

Cited By ~ 11

Author(s):

Salah Bouktif ◽

Ali Fiaz ◽

Ali Ouni ◽

Mohamed Adel Serhani

Keyword(s):

Neural Networks ◽

Time Series ◽

Deep Learning ◽

Short Term Memory ◽

Load Forecasting ◽

Forecasting Model ◽

Time Lags ◽

Electric Load ◽

Learning Approaches ◽

Electric Load Forecasting

Time series analysis using long short term memory (LSTM) deep learning is a very attractive strategy to achieve accurate electric load forecasting. Although it outperforms most machine learning approaches, the LSTM forecasting model still reveals a lack of validity because it neglects several characteristics of the electric load exhibited by time series. In this work, we propose a load-forecasting model based on enhanced-LSTM that explicitly considers the periodicity characteristic of the electric load by using multiple sequences of inputs time lags. An autoregressive model is developed together with an autocorrelation function (ACF) to regress consumption and identify the most relevant time lags to feed the multi-sequence LSTM. Two variations of deep neural networks, LSTM and gated recurrent unit (GRU) are developed for both single and multi-sequence time-lagged features. These models are compared to each other and to a spectrum of data mining benchmark techniques including artificial neural networks (ANN), boosting, and bagging ensemble trees. France Metropolitan’s electricity consumption data is used to train and validate our models. The obtained results show that GRU- and LSTM-based deep learning model with multi-sequence time lags achieve higher performance than other alternatives including the single-sequence LSTM. It is demonstrated that the new models can capture critical characteristics of complex time series (i.e., periodicity) by encompassing past information from multiple timescale sequences. These models subsequently achieve predictions that are more accurate.

Download Full-text

Forecasting adverse surgical events using self-supervised transfer learning for physiological signals

npj Digital Medicine ◽

10.1038/s41746-021-00536-y ◽

2021 ◽

Vol 4 (1) ◽

Author(s):

Hugh Chen ◽

Scott M. Lundberg ◽

Gabriel Erion ◽

Jerry H. Kim ◽

Su-In Lee

Keyword(s):

Time Series ◽

Transfer Learning ◽

Short Term Memory ◽

Computational Cost ◽

Physiological Signals ◽

Short Term ◽

Physiological Signal ◽

Data Phase ◽

Unseen Data ◽

Long Short Term Memory

AbstractHundreds of millions of surgical procedures take place annually across the world, which generate a prevalent type of electronic health record (EHR) data comprising time series physiological signals. Here, we present a transferable embedding method (i.e., a method to transform time series signals into input features for predictive machine learning models) named PHASE (PHysiologicAl Signal Embeddings) that enables us to more accurately forecast adverse surgical outcomes based on physiological signals. We evaluate PHASE on minute-by-minute EHR data of more than 50,000 surgeries from two operating room (OR) datasets and patient stays in an intensive care unit (ICU) dataset. PHASE outperforms other state-of-the-art approaches, such as long-short term memory networks trained on raw data and gradient boosted trees trained on handcrafted features, in predicting six distinct outcomes: hypoxemia, hypocapnia, hypotension, hypertension, phenylephrine, and epinephrine. In a transfer learning setting where we train embedding models in one dataset then embed signals and predict adverse events in unseen data, PHASE achieves significantly higher prediction accuracy at lower computational cost compared to conventional approaches. Finally, given the importance of understanding models in clinical applications we demonstrate that PHASE is explainable and validate our predictive models using local feature attribution methods.

Download Full-text

Unsupervised Pre-training of a Deep LSTM-based Stacked Autoencoder for Multivariate Time Series Forecasting Problems

Scientific Reports ◽

10.1038/s41598-019-55320-6 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 14

Author(s):

Alaa Sagheer ◽

Mostafa Kotb

Keyword(s):

Time Series ◽

Case Studies ◽

Real World ◽

Time Series Data ◽

Short Term Memory ◽

Multivariate Time Series ◽

Series Data ◽

Recurrent Networks ◽

Stacked Autoencoder ◽

Real World Datasets

AbstractCurrently, most real-world time series datasets are multivariate and are rich in dynamical information of the underlying system. Such datasets are attracting much attention; therefore, the need for accurate modelling of such high-dimensional datasets is increasing. Recently, the deep architecture of the recurrent neural network (RNN) and its variant long short-term memory (LSTM) have been proven to be more accurate than traditional statistical methods in modelling time series data. Despite the reported advantages of the deep LSTM model, its performance in modelling multivariate time series (MTS) data has not been satisfactory, particularly when attempting to process highly non-linear and long-interval MTS datasets. The reason is that the supervised learning approach initializes the neurons randomly in such recurrent networks, disabling the neurons that ultimately must properly learn the latent features of the correlated variables included in the MTS dataset. In this paper, we propose a pre-trained LSTM-based stacked autoencoder (LSTM-SAE) approach in an unsupervised learning fashion to replace the random weight initialization strategy adopted in deep LSTM recurrent networks. For evaluation purposes, two different case studies that include real-world datasets are investigated, where the performance of the proposed approach compares favourably with the deep LSTM approach. In addition, the proposed approach outperforms several reference models investigating the same case studies. Overall, the experimental results clearly show that the unsupervised pre-training approach improves the performance of deep LSTM and leads to better and faster convergence than other models.

Download Full-text

Reconstructing coupled time series in climate systems by machine learning

10.5194/esd-2019-63 ◽

2019 ◽

Cited By ~ 1

Author(s):

Yu Huang ◽

Lichao Yang ◽

Zuntao Fu

Keyword(s):

Machine Learning ◽

Time Series ◽

Short Term Memory ◽

Explanatory Variable ◽

Pearson Correlation ◽

The Other ◽

Great Success ◽

Learning Approaches ◽

Linear Coupling ◽

Reconstruction Quality

Abstract. Despite the great success of machine learning, its applications in climate dynamics have not been well developed. One concern might be how well the trained neural networks could learn a dynamical system and what can be the potential applications of this kind of learning. Detailed studies show that the coupling relations or dynamics among variables in linear or nonlinear systems can be well learnt by reservoir computer (RC) and long short-term memory (LSTM) machine learning, and these learnt coupling relations can be further applied to reconstruct one series from the other dominated by common coupling dynamics. In order to validate the above conclusions, toy models are applied to address the following three questions: (i) what can be learnt from different dynamical time series by machine learning; (ii) what factors significantly influence machine learning reconstruction; and (iii) how to select suitable explanatory or input variables for the reconstructed variable for machine learning. The results from these toy models show that both of RC and LSTM can indeed learn coupling relations among variables, and the learnt implicit coupling relation can be applied to accurately reconstruct one series from the other. Both of linear and nonlinear coupling relations between variables can influence the quality of the reconstructed series. If there is a strong linear coupling between variables, all of variables can be taken as explanatory variables for the reconstructed variable, and the reconstruction can be bi-directional. However, when the linear coupling among variables is much weaker, but with stronger nonlinear causality among variables, the reconstruction quality is direction-dependent and it may be only uni-directional. We propose using convergent cross mapping causality (CCM) index ρa→b to determine which variable can be taken as the reconstructed one and which can be taken as the explanatory variable. For example, the Pearson correlation between the average Tropical Surface Air Temperature (TSAT) and the average Northern Hemispheric SAT (NHSAT) is as weak as 0.08, but the CCM index of that NHSAT cross maps TSAT is ρN→T = 0.70, it means that NHSAT could be taken as the explanatory variable. Then we find that TSAT can be well reconstructed from NHSAT by means of RC. However, the reconstruction quality in the opposite direction is poor, because the CCM index of that TSAT cross maps NHSAT is only ρT→N = 0.24. These results also provide insights on machine learning approaches for paleoclimate reconstruction, parameterization scheme, and prediction in related climate studies.

Download Full-text

Forecasting of Coalbed Methane Daily Production Based on T-LSTM Neural Networks

Symmetry ◽

10.3390/sym12050861 ◽

2020 ◽

Vol 12 (5) ◽

pp. 861

Author(s):

Xijie Xu ◽

Xiaoping Rui ◽

Yonglei Fan ◽

Tian Yu ◽

Yiwen Ju

Keyword(s):

Time Series ◽

Transfer Learning ◽

Coalbed Methane ◽

Short Term Memory ◽

Early Stage ◽

Production Data ◽

Production Forecast ◽

Daily Production ◽

Cbm Production ◽

Lstm Network

Accurately forecasting the daily production of coalbed methane (CBM) is important forformulating associated drainage parameters and evaluating the economic benefit of CBM mining. Daily production of CBM depends on many factors, making it difficult to predict using conventional mathematical models. Because traditional methods do not reflect the long-term time series characteristics of CBM production, this study first used a long short-term memory neural network (LSTM) and transfer learning (TL) method for time series forecasting of CBM daily production. Based on the LSTM model, we introduced the idea of transfer learning and proposed a Transfer-LSTM (T-LSTM) CBM production forecasting model. This approach first uses a large amount of data similar to the target to pretrain the weights of the LSTM network, then uses transfer learning to fine-tune LSTM network parameters a second time, so as to obtain the final T-LSTM model. Experiments were carried out using daily CBM production data for the Panhe Demonstration Zone at southern Qinshui basin in China. Based on the results, the idea of transfer learning can solve the problem of insufficient samples during LSTM training. Prediction results for wells that entered the stable period earlier were more accurate, whereas results for types with unstable production in the early stage require further exploration. Because CBM wells daily production data have symmetrical similarities, which can provide a reference for the prediction of other wells, so our proposed T-LSTM network can achieve good results for the production forecast and can provide guidance for forecasting production of CBM wells.

Download Full-text

Deep Learning for text in limted data settings

10.36227/techrxiv.12100692 ◽

2020 ◽

Author(s):

Pathikkumar Patel ◽

Bhargav Lad ◽

Jinan Fiaidhi

Keyword(s):

Machine Learning ◽

Time Series ◽

Deep Learning ◽

Sentiment Analysis ◽

Transfer Learning ◽

Text Classification ◽

State Of The Art ◽

Time Series Forecasting ◽

Text Data ◽

Performance Levels

During the last few years, RNN models have been extensively used and they have proven to be better for sequence and text data. RNNs have achieved state-of-the-art performance levels in several applications such as text classification, sequence to sequence modelling and time series forecasting. In this article we will review different Machine Learning and Deep Learning based approaches for text data and look at the results obtained from these methods. This work also explores the use of transfer learning in NLP and how it affects the performance of models on a specific application of sentiment analysis.

Download Full-text

Detecting subtle change from dense Landsat time series: Case studies of mountain pine beetle and spruce beetle disturbance

Remote Sensing of Environment ◽

10.1016/j.rse.2021.112560 ◽

2021 ◽

Vol 263 ◽

pp. 112560

Author(s):

Su Ye ◽

John Rogan ◽

Zhe Zhu ◽

Todd J. Hawbaker ◽

Sarah J. Hart ◽

...

Keyword(s):

Time Series ◽

Case Studies ◽

Mountain Pine Beetle ◽

Subtle Change ◽

Mountain Pine ◽

Spruce Beetle ◽

Pine Beetle

Download Full-text

Leveraging Label Information in a Knowledge-Driven Approach for Rolling-Element Bearings Remaining Useful Life Prediction

Energies ◽

10.3390/en14082163 ◽

2021 ◽

Vol 14 (8) ◽

pp. 2163

Author(s):

Tarek Berghout ◽

Mohamed Benbouzid ◽

Leïla-Hayet Mouss

Keyword(s):

Transfer Learning ◽

Short Term Memory ◽

Remaining Useful Life ◽

Accelerated Life Tests ◽

Learning Path ◽

Accelerated Life ◽

Unseen Data ◽

Label Information ◽

Life Tests ◽

Ill Posed

Since bearing deterioration patterns are difficult to collect from real, long lifetime scenarios, data-driven research has been directed towards recovering them by imposing accelerated life tests. Consequently, insufficiently recovered features due to rapid damage propagation seem more likely to lead to poorly generalized learning machines. Knowledge-driven learning comes as a solution by providing prior assumptions from transfer learning. Likewise, the absence of true labels was able to create inconsistency related problems between samples, and teacher-given label behaviors led to more ill-posed predictors. Therefore, in an attempt to overcome the incomplete, unlabeled data drawbacks, a new autoencoder has been designed as an additional source that could correlate inputs and labels by exploiting label information in a completely unsupervised learning scheme. Additionally, its stacked denoising version seems to more robustly be able to recover them for new unseen data. Due to the non-stationary and sequentially driven nature of samples, recovered representations have been fed into a transfer learning, convolutional, long–short-term memory neural network for further meaningful learning representations. The assessment procedures were benchmarked against recent methods under different training datasets. The obtained results led to more efficiency confirming the strength of the new learning path.

Download Full-text