Reproducible parallel inference and simulation of stochastic state space models using odin, dust, and mcstate

2021 ◽  
Vol 5 ◽  
pp. 288
Author(s):  
Richard G. FitzJohn ◽  
Edward S. Knock ◽  
Lilith K. Whittles ◽  
Pablo N. Perez-Guzman ◽  
Sangeeta Bhatia ◽  
...  

State space models, including compartmental models, are used to model physical, biological and social phenomena in a broad range of scientific fields. A common way of representing the underlying processes in these models is as a system of stochastic processes which can be simulated forwards in time. Inference of model parameters based on observed time-series data can then be performed using sequential Monte Carlo techniques. However, using these methods for routine inference problems can be made difficult due to various engineering considerations: allowing model design to change in response to new data and ideas, writing model code which is highly performant, and incorporating all of this with up-to-date statistical techniques. Here, we describe a suite of packages in the R programming language designed to streamline the design and deployment of state space models, targeted at infectious disease modellers but suitable for other domains. Users describe their model in a familiar domain-specific language, which is converted into parallelised C++ code. A fast, parallel, reproducible random number generator is then used to run large numbers of model simulations in an efficient manner. We also provide standard inference and prediction routines, though the model simulator can be used directly if these do not meet the user’s needs. These packages provide guarantees on reproducibility and performance, allowing the user to focus on the model itself, rather than the underlying computation. The ability to automatically generate high-performance code that would be tedious and time-consuming to write and verify manually, particularly when adding further structure to compartments, is crucial for infectious disease modellers. Our packages have been critical to the development cycle of our ongoing real-time modelling efforts in the COVID-19 pandemic, and have the potential to do the same for models used in a number of different domains.
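The workflow described in this abstract (simulate a stochastic compartmental model forwards in time, then score it against observed data with sequential Monte Carlo) can be illustrated with a minimal bootstrap particle filter. This is a sketch in plain Python, not the odin, dust, or mcstate API: the SIR structure, the Poisson observation model, the function names, and the case counts are all assumptions chosen for illustration.

```python
import math
import random


def binomial(rng, n, p):
    """Draw from Binomial(n, p) by summing Bernoulli trials (adequate for small n)."""
    return sum(1 for _ in range(n) if rng.random() < p)


def sir_step(rng, state, beta, gamma, dt=1.0):
    """Advance a stochastic SIR model one step; returns (new_state, new_infections)."""
    s, i, r = state
    n = s + i + r
    p_inf = 1.0 - math.exp(-beta * i / n * dt)  # per-susceptible infection probability
    p_rec = 1.0 - math.exp(-gamma * dt)         # per-infective recovery probability
    new_inf = binomial(rng, s, p_inf)
    new_rec = binomial(rng, i, p_rec)
    return (s - new_inf, i + new_inf - new_rec, r + new_rec), new_inf


def poisson_logpmf(k, lam):
    """Log Poisson probability of observing k given mean lam (assumed observation model)."""
    lam = max(lam, 1e-6)  # guard against log(0) when a particle simulates no cases
    return k * math.log(lam) - lam - math.lgamma(k + 1)


def particle_filter(rng, data, beta, gamma, init, n_particles=100):
    """Bootstrap particle filter; returns an estimate of the log-likelihood."""
    particles = [init] * n_particles
    loglik = 0.0
    for y in data:
        stepped = [sir_step(rng, p, beta, gamma) for p in particles]
        logw = [poisson_logpmf(y, cases) for _, cases in stepped]
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]      # rescale to avoid underflow
        loglik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling in proportion to the weights.
        particles = rng.choices([s for s, _ in stepped], weights=w, k=n_particles)
    return loglik


# Illustrative (made-up) daily case counts; seeding the generator makes the run repeatable.
cases = [3, 5, 8, 12, 15]
estimate = particle_filter(random.Random(1), cases, beta=0.5, gamma=0.2,
                           init=(990, 10, 0), n_particles=50)
```

Passing an explicitly seeded generator into every function, rather than relying on global state, is what makes repeated runs return the same estimate. That is a much simpler mechanism than the parallel random number generator the abstract describes.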

2020 ◽  
Vol 5 ◽  
pp. 288
Author(s):  
Edward S. Knock ◽  
Lilith K. Whittles ◽  
Pablo N. Perez-Guzman ◽  
Sangeeta Bhatia ◽  
Fernando Guntoro ◽  
...  


2017 ◽  
Vol 9 (4) ◽  
pp. 2036-2042 ◽  
Author(s):  
Suman Suman ◽  
Urmil Verma

Box and Jenkins’ Autoregressive Integrated Moving Average (ARIMA) models are widely used for analyzing and forecasting time-series data. This approach assumes that the underlying parameters are constant; however, agricultural data are generally collected over time, and their parameters are therefore time-dependent. Such data can be analyzed using state space (SS) procedures through the Kalman filtering technique. The purpose of this article is to illustrate the usefulness of state space models in sugarcane yield forecasting and to provide some empirical evidence for their superiority over classical time-series analysis. ARIMA and state space models could each provide suitable relationship(s) to reliably forecast the sugarcane yield in the Karnal, Ambala, Kurukshetra, Yamunanagar and Panipat districts of Haryana (India). However, the state space models, with lower error metrics, proved superior to the ARIMA models in this empirical study. The sugarcane yield forecasts based on SS models in the districts under consideration showed good agreement with State Department of Agriculture (DOA) yields, with average absolute deviations of 3-6 percent.
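As a rough illustration of Kalman filtering for a state space model, the sketch below filters a scalar local-level model (a random-walk level observed with noise). The abstract does not state which model form the article uses, so the local-level form, the function name, the noise variances, and the yield figures are all assumptions.

```python
def local_level_filter(ys, q, r, m0=0.0, p0=1e6):
    """Kalman filter for x_t = x_{t-1} + w_t, y_t = x_t + v_t.

    q and r are the process and observation noise variances; m0 and p0 give
    a vague prior on the initial level. Returns filtered means and variances.
    """
    m, p = m0, p0
    means, variances = [], []
    for y in ys:
        p = p + q                # predict: the level is a random walk
        k = p / (p + r)          # Kalman gain
        m = m + k * (y - m)      # update the mean with the new observation
        p = (1.0 - k) * p        # update the variance
        means.append(m)
        variances.append(p)
    return means, variances


# Made-up yield series, for illustration only.
yields = [61.2, 63.0, 62.1, 64.5, 66.0, 65.2]
means, variances = local_level_filter(yields, q=0.5, r=1.0)
one_step_forecast = means[-1]  # for a local-level model, the forecast is the last filtered level
```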


The R Journal ◽  
2012 ◽  
Vol 4 (1) ◽  
pp. 11 ◽  
Author(s):  
Elizabeth E. Holmes ◽  
Eric J. Ward ◽  
Kellie Wills

Sensors ◽  
2018 ◽  
Vol 18 (12) ◽  
pp. 4112 ◽  
Author(s):  
Se-Min Lim ◽  
Hyeong-Cheol Oh ◽  
Jaein Kim ◽  
Juwon Lee ◽  
Jooyoung Park

Recently, wearable devices have become a prominent health care application domain by incorporating a growing number of sensors and adopting smart machine learning technologies. One closely related topic is the strategy of combining the wearable device technology with skill assessment, which can be used in wearable device apps for coaching and/or personal training. Particularly pertinent to skill assessment based on high-dimensional time series data from wearable sensors is classifying whether a player is an expert or a beginner, which skills the player is exercising, and extracting some low-dimensional representations useful for coaching. In this paper, we present a deep learning-based coaching assistant method, which can provide useful information in supporting table tennis practice. Our method uses a combination of LSTM (Long short-term memory) with a deep state space model and probabilistic inference. More precisely, we use the expressive power of LSTM when handling high-dimensional time series data, and state space model and probabilistic inference to extract low-dimensional latent representations useful for coaching. Experimental results show that our method can yield promising results for characterizing high-dimensional time series patterns and for providing useful information when working with wearable IMU (Inertial measurement unit) sensors for table tennis coaching.


2013 ◽  
Vol 347-350 ◽  
pp. 3331-3335
Author(s):  
Qian Ru Wang ◽  
Xi Wei Chen ◽  
Da Shi Luo ◽  
Yu Feng Wei ◽  
Li Ya Jin ◽  
...  

Grey system theory has been widely used to forecast economic data, which are often highly nonlinear, irregular and non-stationary. Many models based on grey system theory can adapt to various economic time series data. However, some of these models do not consider the impact of the model parameters, or consider only a simple change of the model parameters for the prediction. In this paper, we propose a PSO-based GM(1,1) model that uses optimized parameters to improve forecasting accuracy. The experiment shows that the PSO-based GM(1,1) model achieves much better forecasting accuracy than other widely used grey models on actual chaotic economic data.
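For orientation, the baseline GM(1,1) model can be sketched as follows. This is the standard least-squares formulation, not the PSO-based variant proposed in the paper. The function names and the test series are assumptions for illustration, and the fit assumes at least three distinct positive values and a nonzero development coefficient.

```python
import math


def gm11_fit(x0):
    """Estimate the GM(1,1) development coefficient a and grey input b."""
    # Accumulated generating operation (1-AGO): running sums of the series.
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values: means of consecutive accumulated terms.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, len(x0))]
    y = x0[1:]
    # Ordinary least squares for the grey equation x0(k) = -a * z(k) + b.
    n = len(z)
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zv * yv for zv, yv in zip(z, y))
    slope = (n * szy - sz * sy) / (n * szz - sz * sz)
    b = (sy - slope * sz) / n
    return -slope, b


def gm11_predict(x0, a, b, k):
    """Predicted value of the original series at 0-based index k."""
    if k == 0:
        return x0[0]
    c = x0[0] - b / a
    # Difference of consecutive values of the fitted accumulated series.
    return c * (math.exp(-a * k) - math.exp(-a * (k - 1)))


# Made-up series growing 10% per step; forecast the next, unseen value.
series = [100 * 1.1 ** k for k in range(5)]
a, b = gm11_fit(series)
forecast = gm11_predict(series, a, b, len(series))
```

A parameter-optimizing variant such as the one in the paper would replace or refine the least-squares estimates of `a` and `b`; the abstract does not say how, so that step is left out here.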


2007 ◽  
Vol 9 (1) ◽  
pp. 30-41 ◽  
Author(s):  
Nikhil S. Padhye ◽  
Sandra K. Hanneman

The application of cosinor models to long time series requires special attention. With increasing length of the time series, the presence of noise and drifts in rhythm parameters from cycle to cycle leads to rapid deterioration of cosinor models. The sensitivity of amplitude and model-fit to the data length is demonstrated for body temperature data from ambulatory menstrual cycling and menopausal women and from ambulatory male swine. It follows that amplitude comparisons between studies cannot be made independent of consideration of the data length. Cosinor analysis may be carried out on serial-sections of the series for improved model-fit and for tracking changes in rhythm parameters. Noise and drift reduction can also be achieved by folding the series onto a single cycle, which leads to substantial gains in the model-fit but lowers the amplitude. Central values of model parameters are negligibly changed by consideration of the autoregressive nature of residuals.
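A single-component cosinor model has the form y(t) = M + A cos(2πt/τ + φ), with mesor M, amplitude A, acrophase φ and period τ. The sketch below fits it under a simplifying assumption that is not taken from the article: the samples are evenly spaced and cover a whole number of cycles, so the cosine and sine regressors are orthogonal and the least-squares estimates reduce to simple projections. The function name and the synthetic temperature data are illustrative.

```python
import math


def cosinor_fit(ts, ys, period):
    """Fit y = M + A*cos(2*pi*t/period + phi); returns (M, A, phi).

    Assumes evenly spaced samples spanning a whole number of cycles, which
    makes the regressors orthogonal. General sampling needs a full
    least-squares solve instead.
    """
    n = len(ys)
    omega = 2.0 * math.pi / period
    mesor = sum(ys) / n
    beta = 2.0 / n * sum(y * math.cos(omega * t) for t, y in zip(ts, ys))
    gamma = 2.0 / n * sum(y * math.sin(omega * t) for t, y in zip(ts, ys))
    amplitude = math.hypot(beta, gamma)
    acrophase = math.atan2(-gamma, beta)
    return mesor, amplitude, acrophase


# Synthetic hourly temperatures over two 24-hour cycles, for illustration only.
hours = list(range(48))
temps = [37.0 + 0.5 * math.cos(2.0 * math.pi * t / 24.0 - 1.0) for t in hours]
mesor, amplitude, acrophase = cosinor_fit(hours, temps, period=24.0)
```

Applying the same function to successive windows of a longer series is one way to carry out the serial-section analysis the abstract mentions.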


2020 ◽  
Vol 496 (1) ◽  
pp. 629-637
Author(s):  
Ce Yu ◽  
Kun Li ◽  
Shanjiang Tang ◽  
Chao Sun ◽  
Bin Ma ◽  
...  

Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernovae in time domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming extremely difficult or infeasible for continuously analysing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or databases, match each item to determine which object it belongs to, and finally produce time series data sets. To support the high-performance parallel processing of large-scale data sets, AstroCatR uses the extract-transform-load (ETL) pre-processing module to create sky zone files and balance the workload. The matching module uses the overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or be transformed into other formats as needed. In addition, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from the three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3× faster than methods using relational database management systems at matching massive catalogues.

