Efficient and data-driven prediction of water breakthrough in subsurface systems using deep long short-term memory machine learning

Data-driven predictions of a multiscale Lorenz 96 chaotic system using machine-learning methods: reservoir computing, artificial neural network, and long short-term memory network

Nonlinear Processes in Geophysics ◽

10.5194/npg-27-373-2020 ◽

2020 ◽

Vol 27 (3) ◽

pp. 373-389 ◽

Cited By ~ 7

Author(s):

Ashesh Chattopadhyay ◽

Pedram Hassanzadeh ◽

Devika Subramanian

Keyword(s):

Neural Network ◽

Machine Learning ◽

Short Term Memory ◽

Data Driven ◽

Reservoir Computing ◽

Short Term ◽

Term Memory ◽

Machine Learning Methods ◽

Long Short Term Memory ◽

Lorenz 96

Abstract. In this paper, the performance of three machine-learning methods for predicting short-term evolution and for reproducing the long-term statistics of a multiscale spatiotemporal Lorenz 96 system is examined. The methods are an echo state network (ESN, which is a type of reservoir computing; hereafter RC–ESN), a deep feed-forward artificial neural network (ANN), and a recurrent neural network (RNN) with long short-term memory (LSTM; hereafter RNN–LSTM). This Lorenz 96 system has three tiers of nonlinearly interacting variables representing slow/large-scale (X), intermediate (Y), and fast/small-scale (Z) processes. For training or testing, only X is available; Y and Z are never known or used. We show that RC–ESN substantially outperforms ANN and RNN–LSTM for short-term predictions, e.g., accurately forecasting the chaotic trajectories for hundreds of numerical solver's time steps equivalent to several Lyapunov timescales. The RNN–LSTM outperforms ANN, and both methods show some prediction skills too. Furthermore, even after losing the trajectory, data predicted by RC–ESN and RNN–LSTM have probability density functions (pdf's) that closely match the true pdf – even at the tails. The pdf of the data predicted using ANN, however, deviates from the true pdf. Implications, caveats, and applications to data-driven and data-assisted surrogate modeling of complex nonlinear dynamical systems, such as weather and climate, are discussed.

Download Full-text

The performance of LSTM models from basin to continental scales

10.5194/egusphere-egu2020-8855 ◽

2020 ◽

Author(s):

Frederik Kratzert ◽

Daniel Klotz ◽

Günter Klambauer ◽

Grey Nearing ◽

Sepp Hochreiter

Keyword(s):

Machine Learning ◽

Large Scale ◽

Short Term Memory ◽

Regional Scale ◽

Data Driven ◽

Hydrological Models ◽

Short Term ◽

Term Memory ◽

Basin Scale ◽

Long Short Term Memory

Simulation accuracy among traditional hydrological models usually degrades significantly when going from single basin to regional scale. Hydrological models perform best when calibrated for specific basins, and do worse when a regional calibration scheme is used.&#160;One reason for this is that these models do not (have to) learn hydrological processes from data. Rather, they have a predefined model structure and only a handful of parameters adapt to specific basins. This often yields less-than-optimal parameter values when the loss is not determined by a single basin, but by many through regional calibration.The opposite is true for data driven approaches where models tend to get better with more and diverse training data. We examine whether this holds true when modeling rainfall-runoff processes with deep learning, or if, like their process-based counterparts, data-driven hydrological models degrade when going from basin to regional scale.Recently, Kratzert et al. (2018) showed that the Long Short-Term Memory network (LSTM), a special type of recurrent neural network, achieves comparable performance to the SAC-SMA at basin scale. In follow up work Kratzert et al. (2019a) trained a single LSTM for hundreds of basins in the continental US, which outperformed a set of hydrological models significantly, even compared to basin-calibrated hydrological models. On average, a single LSTM is even better in out-of-sample predictions (ungauged) compared to the SAC-SMA in-sample (gauged) or US National Water Model (Kratzert et al. 2019b).LSTM-based approaches usually involve tuning a large number of hyperparameters, such as the number of neurons, number of layers, and learning rate, that are critical for the predictive performance. Therefore, large-scale hyperparameter search has to be performed to obtain a proficient LSTM network.&#160;&#160;However, in the abovementioned studies, hyperparameter optimization was not conducted at large scale and e.g. in Kratzert et al. (2018) the same network hyperparameters were used in all basins, instead of tuning hyperparameters for each basin separately. It is yet unclear whether LSTMs follow the same trend of traditional hydrological models to degrade performance from basin to regional scale.&#160;In the current study, we performed a computational expensive, basin-specific hyperparameter search to explore how site-specific LSTMs differ in performance compared to regionally calibrated LSTMs. We compared our results to the mHM and VIC models, once calibrated per-basin and once using an MPR regionalization scheme. These benchmark models were calibrated individual research groups, to eliminate bias in our study. We analyse whether differences in basin-specific vs regional model performance can be linked to basin attributes or data set characteristics.References:Kratzert, F., Klotz, D., Brenner, C., Schulz, K., and Herrnegger, M.: Rainfall&#8211;runoff modelling using Long Short-Term Memory (LSTM) networks, Hydrol. Earth Syst. Sci., 22, 6005&#8211;6022, https://doi.org/10.5194/hess-22-6005-2018, 2018.&#160;Kratzert, F., Klotz, D., Shalev, G., Klambauer, G., Hochreiter, S., and Nearing, G.: Towards learning universal, regional, and local hydrological behaviors via machine learning applied to large-sample datasets, Hydrol. Earth Syst. Sci., 23, 5089&#8211;5110, https://doi.org/10.5194/hess-23-5089-2019, 2019a.&#160;Kratzert, F., Klotz, D., Herrnegger, M., Sampson, A. K., Hochreiter, S., & Nearing, G. S.: Toward improved predictions in ungauged basins: Exploiting the power of machine learning. Water Resources Research, 55. https://doi.org/10.1029/2019WR026065, 2019b.

Download Full-text

How Google's Flood Forecasting Initiative Leverages Deep Learning Hydrologic Models

10.5194/egusphere-egu2020-4134 ◽

2020 ◽

Author(s):

Asher Metzger ◽

Zach Moshe ◽

Guy Shalev ◽

Ofir Reich ◽

Zvika Ben-Haim ◽

...

Keyword(s):

Machine Learning ◽

Hydrologic Modeling ◽

Short Term Memory ◽

State Of The Art ◽

Flood Forecasting ◽

Data Driven ◽

Hydrologic Models ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory

One of the major natural disasters is flooding, which causes thousands of fatalities, affects the lives of hundreds of millions, and results in huge economic damages annually. Google&#8217;s Flood Forecasting Initiative aims at providing high-resolution flood forecasts and timely warnings around the globe, while focusing first on developing countries where most of the fatalities occur. The high level structure of Google&#8217;s flood forecasting framework follows the natural hydrologic-hydraulic coupling, where the hydrologic modeling predicts discharge (or other proxies for discharge) based on rainfall-runoff relationships, and the hydraulic model produces high resolution inundation maps based on those discharge predictions.&#160; Within this general partition, both the hydraulic and hydrologic modules benefit by the use of advanced machine learning techniques allowing for precision and global scale.Classical conceptual hydrologic models such as the Sacramento Soil Moisture Accounting Model explicitly model the dynamics of water volumes based on explicit measurements and estimates of the variables (parameters) involved. These models are, however, inherently challenged by the lack of accurate estimates of model parameters and by inaccurate/incomplete description of the complex non-linear rules that govern the underlying dynamics. In contrast, machine learning models, driven by data alone, are potentially capable of describing complex functional dynamics without explicit modelling.&#160; Both the hydrologic and hydraulic models employed by Google rely on data-driven machine learning technologies to achieve superior and scalable performance. In this presentation we focus on describing one of the deep neural hydrologic models proposed by Google.&#160;As was already shown in a recent work by Kratzert et al. (2018, 2019)[1], a deep neural model can achieve high performance hydrologic forecasts using deep recurrent models such as long short-term memory networks (LSTMs). Moreover, it was shown by Shalev et al. (2019)[2] that a single globally shared LSTM can achieve state-of-the-art performance by utilizing a data-driven learned embedding without the need for geographical-specific attributes.&#160; While the need for explicit rules in pure conceptual modeling is likely to impede the creation of scalable and accurate hydrologic models, an agnostic approach that ignores reliable and available physical properties of water networks is also likely to be sub-optimal. HydroNet is one of Google&#8217;s hydrologic models that leverages the known water network structure as well as deep neural technology to create a scalable and reliable hydrologic model. HydroNet builds a globally shared model together with regional adaptation sub-models at each site by utilizing the tree structure of river flow network, and is shown to achieve state-of-the-art scalable hydrologic modeling in several large basins in India and the USA.&#160;&#160;[1] Kratzert, Frederik, Daniel Klotz, Guy Shalev, G&#252;nter Klambauer, Sepp Hochreiter, and Grey Nearing. "Benchmarking a catchment-aware Long Short-Term Memory Network (LSTM) for large-scale hydrological modeling." arXiv preprint arXiv:1907.08456 (2019).[2] Shalev, Guy, Ran El-Yaniv, Daniel Klotz, Frederik Kratzert, Asher Metzger, and Sella Nevo. "Accurate Hydrologic Modeling Using Less Information." arXiv preprint arXiv:1911.09427 (2019).

Download Full-text

Prediction of Head Movement in 360-Degree Videos Using Attention Model

Sensors ◽

10.3390/s21113678 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3678

Author(s):

Dongwon Lee ◽

Minji Choi ◽

Joohyun Lee

Keyword(s):

Machine Learning ◽

Short Term Memory ◽

Moving Average ◽

The Other ◽

Learning Models ◽

Short Term ◽

Term Memory ◽

Attention Model ◽

Long Short Term Memory ◽

Machine Learning Models

In this paper, we propose a prediction algorithm, the combination of Long Short-Term Memory (LSTM) and attention model, based on machine learning models to predict the vision coordinates when watching 360-degree videos in a Virtual Reality (VR) or Augmented Reality (AR) system. Predicting the vision coordinates while video streaming is important when the network condition is degraded. However, the traditional prediction models such as Moving Average (MA) and Autoregression Moving Average (ARMA) are linear so they cannot consider the nonlinear relationship. Therefore, machine learning models based on deep learning are recently used for nonlinear predictions. We use the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) neural network methods, originated in Recurrent Neural Networks (RNN), and predict the head position in the 360-degree videos. Therefore, we adopt the attention model to LSTM to make more accurate results. We also compare the performance of the proposed model with the other machine learning models such as Multi-Layer Perceptron (MLP) and RNN using the root mean squared error (RMSE) of predicted and real coordinates. We demonstrate that our model can predict the vision coordinates more accurately than the other models in various videos.

Download Full-text

Random forest and long short-term memory based machine learning models for classification of ion mobility spectrometry spectra

Chemical, Biological, Radiological, Nuclear, and Explosives (CBRNE) Sensing XXII ◽

10.1117/12.2585829 ◽

2021 ◽

Author(s):

Patrick C. Riley ◽

Samir V. Deshpande ◽

Brian S. Ince ◽

Brian C. Hauck ◽

Kyle P. O'Donnell ◽

...

Keyword(s):

Machine Learning ◽

Random Forest ◽

Ion Mobility ◽

Short Term Memory ◽

Learning Models ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Machine Learning Models

Download Full-text

Data-Driven Phone Selection for Language Identification via Bidirectional Long Short-Term Memory Modeling

Communications in Computer and Information Science - Computational Intelligence and Intelligent Systems ◽

10.1007/978-981-13-1648-7_26 ◽

2018 ◽

pp. 301-312

Author(s):

Xiao Song ◽

Qiang Cheng ◽

Jingping Xing ◽

Yuexian Zou

Keyword(s):

Short Term Memory ◽

Language Identification ◽

Data Driven ◽

Short Term ◽

Term Memory ◽

Long Short Term Memory ◽

Selection For ◽

Memory Modeling

Download Full-text

Development and implementation of a predictive method for the stock market analysis, using the long short-term memory machine learning method

Journal of Physics Conference Series ◽

10.1088/1742-6596/1514/1/012009 ◽

2020 ◽

Vol 1514 ◽

pp. 012009

Author(s):

D Pedrozo ◽

F Barajas ◽

A Estupiñán ◽

K L Cristiano ◽

D A Triana

Keyword(s):

Machine Learning ◽

Stock Market ◽

Short Term Memory ◽

Market Analysis ◽

Machine Learning Method ◽

Learning Method ◽

Short Term ◽

Term Memory ◽

Predictive Method ◽

Long Short Term Memory

Download Full-text

Data-Driven Predictive Modeling of Neuronal Dynamics Using Long Short-Term Memory

Algorithms ◽

10.3390/a12100203 ◽

2019 ◽

Vol 12 (10) ◽

pp. 203

Author(s):

Benjamin Plaster ◽

Gautam Kumar

Keyword(s):

Neural Network ◽

Time Horizon ◽

Short Term Memory ◽

Temporal Dynamics ◽

Data Driven ◽

Brain Dynamics ◽

Short Term ◽

Term Memory ◽

Brain Functions ◽

Long Short Term Memory

Modeling brain dynamics to better understand and control complex behaviors underlying various cognitive brain functions have been of interest to engineers, mathematicians and physicists over the last several decades. With the motivation of developing computationally efficient models of brain dynamics to use in designing control-theoretic neurostimulation strategies, we have developed a novel data-driven approach in a long short-term memory (LSTM) neural network architecture to predict the temporal dynamics of complex systems over an extended long time-horizon in future. In contrast to recent LSTM-based dynamical modeling approaches that make use of multi-layer perceptrons or linear combination layers as output layers, our architecture uses a single fully connected output layer and reversed-order sequence-to-sequence mapping to improve short time-horizon prediction accuracy and to make multi-timestep predictions of dynamical behaviors. We demonstrate the efficacy of our approach in reconstructing the regular spiking to bursting dynamics exhibited by an experimentally-validated 9-dimensional Hodgkin-Huxley model of hippocampal CA1 pyramidal neurons. Through simulations, we show that our LSTM neural network can predict the multi-time scale temporal dynamics underlying various spiking patterns with reasonable accuracy. Moreover, our results show that the predictions improve with increasing predictive time-horizon in the multi-timestep deep LSTM neural network.

Download Full-text

Enhancing Machine Learning Prediction in Cybersecurity Using Dynamic Feature Selector

Journal of Cybersecurity and Privacy ◽

10.3390/jcp1010011 ◽

2021 ◽

Vol 1 (1) ◽

pp. 199-218

Author(s):

Mostofa Ahsan ◽

Rahul Gomes ◽

Md. Minhaz Chowdhury ◽

Kendall E. Nygard

Keyword(s):

Machine Learning ◽

Short Term Memory ◽

Computational Time ◽

Dynamic Feature ◽

Short Term ◽

Feature Size ◽

Learning Stage ◽

Term Memory ◽

Feature Selector ◽

Long Short Term Memory

Machine learning algorithms are becoming very efficient in intrusion detection systems with their real time response and adaptive learning process. A robust machine learning model can be deployed for anomaly detection by using a comprehensive dataset with multiple attack types. Nowadays datasets contain many attributes. Such high dimensionality of datasets poses a significant challenge to information extraction in terms of time and space complexity. Moreover, having so many attributes may be a hindrance towards creation of a decision boundary due to noise in the dataset. Large scale data with redundant or insignificant features increases the computational time and often decreases goodness of fit which is a critical issue in cybersecurity. In this research, we have proposed and implemented an efficient feature selection algorithm to filter insignificant variables. Our proposed Dynamic Feature Selector (DFS) uses statistical analysis and feature importance tests to reduce model complexity and improve prediction accuracy. To evaluate DFS, we conducted experiments on two datasets used for cybersecurity research namely Network Security Laboratory (NSL-KDD) and University of New South Wales (UNSW-NB15). In the meta-learning stage, four algorithms were compared namely Bidirectional Long Short-Term Memory (Bi-LSTM), Gated Recurrent Units, Random Forest and a proposed Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) for accuracy estimation. For NSL-KDD, experiments revealed an increment in accuracy from 99.54% to 99.64% while reducing feature size of one-hot encoded features from 123 to 50. In UNSW-NB15 we observed an increase in accuracy from 90.98% to 92.46% while reducing feature size from 196 to 47. The proposed approach is thus able to achieve higher accuracy while significantly lowering number of features required for processing.

Download Full-text

COMPARATIVE ANALYSIS AND EVALUATION OF THE APPLICATION OF DEEP LEARNING TECHNIQUES TO CYBERSECURITY DATASETS

DYNA INGENIERIA E INDUSTRIA ◽

10.6036/10007 ◽

2021 ◽

Vol 96 (5) ◽

pp. 528-533

Author(s):

XAVIER LARRIVA NOVO ◽

MARIO VEGA BARBAS ◽

VICTOR VILLAGRA ◽

JULIO BERROCAL

Keyword(s):

Machine Learning ◽

Deep Learning ◽

High Performance ◽

New Technologies ◽

Short Term Memory ◽

Machine Learning Techniques ◽

Short Term ◽

Term Memory ◽

Learning Techniques ◽

Long Short Term Memory

Cybersecurity has stood out in recent years with the aim of protecting information systems. Different methods, techniques and tools have been used to make the most of the existing vulnerabilities in these systems. Therefore, it is essential to develop and improve new technologies, as well as intrusion detection systems that allow detecting possible threats. However, the use of these technologies requires highly qualified cybersecurity personnel to analyze the results and reduce the large number of false positives that these technologies presents in their results. Therefore, this generates the need to research and develop new high-performance cybersecurity systems that allow efficient analysis and resolution of these results. This research presents the application of machine learning techniques to classify real traffic, in order to identify possible attacks. The study has been carried out using machine learning tools applying deep learning algorithms such as multi-layer perceptron and long-short-term-memory. Additionally, this document presents a comparison between the results obtained by applying the aforementioned algorithms and algorithms that are not deep learning, such as: random forest and decision tree. Finally, the results obtained are presented, showing that the long-short-term-memory algorithm is the one that provides the best results in relation to precision and logarithmic loss.

Download Full-text