Statistical modeling of COVID-19 deaths with excess zero counts

2021 ◽  
Vol 10 (s1) ◽  
Author(s):  
Sami Khedhiri

Abstract Objectives: Modeling and forecasting possible trajectories of COVID-19 infections and deaths with statistical methods is one of the most important topics at present. However, statistical models rest on different assumptions and methods and thus yield different results. One issue in monitoring disease progression over time is how to handle excess zero counts. In this research, we assess the empirical performance of these models in terms of their fit and forecast accuracy for COVID-19 deaths. Methods: Two types of models are suggested in the literature for count time series data. The first type is based on Poisson and negative binomial conditional probability distributions to account for overdispersion in the data, and uses autoregression to account for dependence among the responses. The second type is based on zero-inflated mixed autoregression and also uses exponential-family conditional distributions. We study the goodness of fit and forecast accuracy of these count time series models, which are based on autoregressive conditional count distributions with and without zero inflation. Results: We illustrate these methods using recently published online COVID-19 data for Tunisia, which report daily death counts from March 2020 to February 2021. We perform an empirical analysis and compare the fit and forecast performance of these models for death counts in the presence of an intervention policy. Our findings show that models that account for zero inflation fit better and forecast pandemic deaths more accurately. Conclusions: This paper shows that infectious disease data with excess zero counts are better modelled with zero-inflated models. These models yield more accurate predictions of pandemic-related deaths than the generalized count data models. In addition, our results indicate that the lifting of travel restrictions had a significant impact on the surge in COVID-19 deaths. One plausible explanation for the outperformance of zero-inflated models is that the zero values are related to an intervention policy and are therefore structural.
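To make the idea of structural zeros concrete, the following is a minimal sketch of a zero-inflated Poisson probability mass function. It is an illustration under simplifying assumptions, not the autoregressive specification used in the paper: the function names and the values of `lam` and `pi` are hypothetical, and the rate is held fixed instead of depending on past counts.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability P(Y = k) for a rate lam > 0."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Zero-inflated Poisson: a structural zero occurs with probability pi,
    otherwise the count is an ordinary Poisson(lam) draw."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base

# Illustrative values only (assumed, not estimates from the paper).
lam, pi = 3.0, 0.4
print(poisson_pmf(0, lam))  # probability of a zero under a plain Poisson
print(zip_pmf(0, lam, pi))  # larger, because of the structural zeros
```

With the same rate, the zero-inflated version assigns more probability to zero, which is the kind of excess that a plain Poisson or negative binomial model has to absorb elsewhere.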

Author(s):  
Saleh Ibrahim Musa ◽  
N. O. Nweze

Count time series with over-dispersion are often encountered in biomedical and public health applications, and statistical modelling of such series has been a great challenge. Although the Poisson and negative binomial distributions have been widely used in practice for discrete count time series data, their forms are too simplistic to accommodate features such as over-dispersion. Failure to account for these features when analyzing such data may result in incorrect and sometimes misleading inferences, as well as the detection of spurious associations. Hence the need for further investigation of count time series models suited to series with different levels of over-dispersion. The study therefore proposes the model best able to fit and forecast count time series with different levels of over-dispersion and different sample sizes. Simulation studies were conducted using the R statistical package to investigate the performance of the Autoregressive Conditional Poisson (ACP) and Poisson Autoregressive (PAR) models. The predictive ability of the models was observed at different steps ahead. The relative performance of the models was examined using the Akaike Information Criterion (AIC) and the Hannan-Quinn Information Criterion (HQIC). In conclusion, the best-fitting model was the ACP at all sample sizes. The predictive abilities of the four fitted models increased as the sample size and the number of steps ahead increased.
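As a rough illustration of the kind of series such a simulation study generates, the sketch below simulates a first-order Autoregressive Conditional Poisson process in Python (the study itself used R). The recursion lam_t = omega + alpha * y_{t-1} + beta * lam_{t-1} is the commonly cited ACP(1,1) form and is assumed here, since the abstract does not state the exact specification; the function names and parameter values are hypothetical.

```python
import math
import random

def draw_poisson(lam: float, rng: random.Random) -> int:
    """Draw one Poisson(lam) variate with Knuth's multiplication method
    (suitable for moderate lam only)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_acp(n, omega, alpha, beta, seed=0):
    """Simulate y_t ~ Poisson(lam_t) with
    lam_t = omega + alpha * y_{t-1} + beta * lam_{t-1} (assumed ACP(1,1) form).
    Requires alpha + beta < 1 so that the mean is finite."""
    rng = random.Random(seed)
    lam = omega / (1.0 - alpha - beta)  # start at the unconditional mean
    y_prev = draw_poisson(lam, rng)
    series = [y_prev]
    for _ in range(n - 1):
        lam = omega + alpha * y_prev + beta * lam
        y_prev = draw_poisson(lam, rng)
        series.append(y_prev)
    return series

# Illustrative parameters (assumed): unconditional mean is 1 / (1 - 0.7).
counts = simulate_acp(200, omega=1.0, alpha=0.3, beta=0.4, seed=1)
print(counts[:10])
```

Because the conditional mean reacts to past counts, the simulated series is over-dispersed relative to an independent Poisson sample with the same mean.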


2018 ◽  
Vol 7 (3.7) ◽  
pp. 51
Author(s):  
Maria Elena Nor ◽  
Norsoraya Azurin Wahir ◽  
G P. Khuneswari ◽  
Mohd Saifullah Rusiman

Outliers are an example of aberrant data that can strongly distort statistical methods that assume normality, and they affect estimation. This paper introduces interpolation as an alternative outlier treatment for time series and compares two interpolation methods using a performance indicator. Treating an outlier as a missing value allows an interpolation method to fill in that value, and the results are then compared in terms of forecast accuracy. Monthly Malaysia Tourist Arrivals data from January 1998 to December 2015 were used. The results show that cubic spline interpolation gave better results than linear interpolation, and that the treated time series produced better Box-Jenkins forecasts than the original series.
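The treatment described above, replacing an outlier as if it were a missing value, can be sketched for the linear case as follows. This is a minimal illustration only: the function name and the sample values are hypothetical, the outlier positions are assumed to be known in advance, and the cubic spline variant that the paper found to perform better is not shown.

```python
def interpolate_flagged(series, flagged):
    """Replace flagged points by linear interpolation between the nearest
    unflagged neighbours on each side (positions are 0-based indices)."""
    flagged = set(flagged)
    cleaned = list(series)
    for i in sorted(flagged):
        left = i - 1
        while left in flagged:
            left -= 1
        right = i + 1
        while right in flagged:
            right += 1
        if left < 0 or right >= len(series):
            raise ValueError("flagged point has no neighbour on one side")
        weight = (i - left) / (right - left)
        cleaned[i] = series[left] + weight * (series[right] - series[left])
    return cleaned

# Made-up monthly values with one obvious outlier at index 2.
arrivals = [100.0, 104.0, 950.0, 112.0, 116.0]
print(interpolate_flagged(arrivals, flagged=[2]))
```

The cleaned series can then be passed to the forecasting model in place of the original one, and the two sets of forecasts compared.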


Author(s):  
Yisu Jia ◽  
Robert Lund ◽  
James Livsey

Abstract This paper probabilistically explores a class of stationary count time series models built by superpositioning (or otherwise combining) independent copies of a binary stationary sequence of zeroes and ones. Superpositioning methods have proven useful in devising stationary count time series having prespecified marginal distributions. Here, basic properties of this model class are established and the idea is further developed. Specifically, stationary series with binomial, Poisson, negative binomial, discrete uniform, and multinomial marginal distributions are constructed; other marginal distributions are possible. Our primary goal is to derive the autocovariance function of the resulting series.


Author(s):  
Mahua Bose ◽  
Kalyani Mali

In recent years, several methods for forecasting fuzzy time series have been presented in different areas, such as stock prices, student enrollments, climatology, and the production sector. The choice of data partitioning technique is a central factor and strongly influences forecast accuracy. In all existing work on fuzzy time series models, the cluster with the highest membership is used to form fuzzy logical relationships, but the position of the element within the cluster is not considered. The present study incorporates the ideas of fuzzy discretization and shadowed set theory in defining intervals, and uses the positional information of elements within a cluster when selecting rules for decision making. The objective of this work is to show the effect on the forecast of elements lying outside the core area. The performance of the presented model is evaluated on standard datasets.


2019 ◽  
Vol 11 (3) ◽  
pp. 793 ◽  
Author(s):  
Rashad Aliyev ◽  
Sara Salehi ◽  
Rafig Aliyev

Achieving adequate forecast accuracy is important in many countries’ economic activities, and developing an effective and precise time series model is a critical issue in tourism demand forecasting. In this paper, a fuzzy rule-based system model for hotel occupancy forecasting is developed by analyzing 40 months of time series data and applying the fuzzy c-means clustering algorithm. Based on the root mean square error and the mean absolute percentage error, which are metrics of forecast accuracy, the model with 7 clusters and 4 inputs is determined to be the optimal forecasting model for hotel occupancy.
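For reference, the two accuracy metrics named above can be computed as in the following sketch. The function names and the sample numbers are illustrative assumptions, not data from the paper.

```python
import math

def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    squared = [(a - f) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(squared) / len(squared))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent.
    Every actual value must be nonzero."""
    ratios = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)

# Made-up occupancy rates (percent) and forecasts.
actual = [62.0, 70.0, 81.0, 75.0]
forecast = [60.0, 73.0, 78.0, 77.0]
print(rmse(actual, forecast), mape(actual, forecast))
```

Candidate models (for example, different numbers of clusters and inputs) would each be scored this way on the same hold-out data, and the one with the lowest errors selected.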


2018 ◽  
Vol 7 (4) ◽  
pp. 421-431
Author(s):  
Fitrawaty Fitrawaty

This study analyzes the interdependence between monetary policy instruments and unemployment in Indonesia over the period 2000 to 2011. The data are time series obtained from Bank Indonesia, the Central Statistics Bureau (Biro Pusat Statistik) and other institutions. The study uses the Vector Autoregression (VAR) method followed by Structural Vector Autoregression (SVAR). Based on the VAR and SVAR results, the relationships between the monetary instruments and unemployment (UNEMP) differ in direction. Open market operations (OPT), the discount rate (rDiskonto), and the domestic interest rate (rDom) are negatively related to unemployment, while the statutory reserve requirement (GWM) and the exchange rate (EXC) are positively related. Taken individually, none of the monetary instruments has a significant effect on UNEMP. Likewise, after a shock that raises OPT by 5% in 2010, OPT, GWM, rDiskonto, rDom, and EXC still have no significant effect on unemployment.


2014 ◽  
Vol 1 (1) ◽  
pp. 841-876 ◽  
Author(s):  
H. R. Wang ◽  
C. Wang ◽  
X. Lin ◽  
J. Kang

Abstract. The Auto Regressive Integrated Moving Average (ARIMA) model is often used to model time series data formed by inter-annual variations of monthly data. However, the influence of inter-monthly variations within each year is ignored. Based on monthly data classified by clustering analysis, the characteristics of the time series data are extracted. An improved ARIMA model is developed that accounts for both the inter-annual and the inter-monthly variation. The correlation between the characteristic quantity and the monthly data within each year is first constructed by regression analysis. The model can then be used to predict the characteristic quantity once its time series has been made stationary by differencing. A case study is conducted to predict precipitation at the Lanzhou precipitation station, China, using the model. The results show that the accuracy of the improved model is significantly higher than that of the seasonal model, with the mean residual reaching 9.41 mm and the forecast accuracy increasing by 21%.

