An ARFIMA-based model for daily precipitation amounts with direct access to fluctuations

2020, Vol. 34 (10), pp. 1487–1505
Author(s): Katja Polotzek, Holger Kantz

Abstract Correlations in models for daily precipitation are often generated by elaborate numerics that employ a high number of hidden parameters. We propose a parsimonious parametric stochastic model for European mid-latitude daily precipitation amounts, with a focus on the influence of correlations on the statistics. Our method is meta-Gaussian: it applies a truncated-Gaussian-power (tGp) transformation to a Gaussian ARFIMA model. A special feature of this approach is that ARFIMA(1, d, 0) processes provide synthetic time series with both long-range correlations (LRC, meaning the sum of all autocorrelations diverges) and short-range correlations (SRC), each controlled by a single parameter. Our model requires fitting only five parameters overall, each with a clear interpretation. For model time series of finite length we deduce an effective sample size for the sample mean, whose variance is increased due to correlations. For example, the statistical uncertainty of the mean daily amount over 103 years of daily records at the Fichtelberg mountain in Germany equals that of about 14 years of independent daily data. Our effective sample size approach also yields theoretical confidence intervals for annual total amounts and allows for proper model validation in terms of the empirical mean and fluctuations of annual totals. We evaluate probability plots for the daily amounts, confidence intervals based on the effective sample size for the daily mean and annual totals, and the Mahalanobis distance for the annual maxima distribution. For reproducing annual maxima, the way the marginal distribution is fitted is more crucial than the presence of correlations, whereas the reverse holds for annual totals. Our alternative to rainfall simulation proves capable of modeling daily precipitation amounts, as the statistics of a random selection of 20 data sets are well reproduced.
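As a rough sketch of this meta-Gaussian construction, one can simulate an ARFIMA(1, d, 0) series by fractionally integrating white noise and passing it through an AR(1) filter, then map it through a truncated-Gaussian-power transform. The threshold z0, scale c, and exponent nu below are illustrative placeholders (one plausible form of the tGp map), not the paper's fitted values:

```python
import numpy as np

rng = np.random.default_rng(42)

def arfima_1_d_0(n, d=0.2, phi=0.3, burn=1000):
    """Simulate an ARFIMA(1, d, 0) series: fractionally integrate
    white noise with (1 - B)^(-d), then apply an AR(1) filter.
    For 0 < d < 0.5 the series has long-range correlations."""
    m = n + burn
    eps = rng.standard_normal(m)
    # Binomial-expansion coefficients of (1 - B)^(-d):
    # psi_0 = 1, psi_k = psi_{k-1} * (k - 1 + d) / k
    psi = np.ones(m)
    for k in range(1, m):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    x = np.convolve(eps, psi)[:m]      # fractional integration (LRC)
    y = np.empty(m)                    # AR(1) recursion (SRC)
    y[0] = x[0]
    for t in range(1, m):
        y[t] = phi * y[t - 1] + x[t]
    return y[burn:]

def tgp(z, z0=0.8, c=1.0, nu=1.6):
    """Hypothetical truncated-Gaussian-power map: Gaussian values
    below the threshold z0 become dry days; wet-day amounts are a
    power of the exceedance (placeholder parameters)."""
    return np.where(z > z0, c * (z - z0) ** nu, 0.0)

z = arfima_1_d_0(103 * 365)            # ~103 years of daily values
z /= z.std()                           # standardize before mapping
rain = tgp(z)
```

The parameter d alone controls the long-range part and phi the short-range part, which is what makes the model parsimonious.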

2019, Vol. 7 (3), pp. 334–364
Author(s): Carolina Franco, Roderick J. A. Little, Thomas A. Louis, Eric V. Slud

Abstract The most widespread method of computing confidence intervals (CIs) in complex surveys is to add and subtract the margin of error (MOE) from the point estimate, where the MOE is the estimated standard error multiplied by the suitable Gaussian quantile. This Wald-type interval is used by the American Community Survey (ACS), the largest US household sample survey. For inferences on small proportions with moderate sample sizes, this method often results in marked under-coverage and a lower CI endpoint below 0. We assess via simulation the coverage and width, in complex sample surveys, of seven alternatives to the Wald interval for a binomial proportion with sample size replaced by the 'effective sample size', that is, the sample size divided by the design effect. Building on previous work by the present authors, our simulations address the impact of clustering, stratification, different stratum sampling fractions, and stratum-specific proportions. We show that all intervals undercover when there is clustering and design effects are computed from a simple design-based estimator of sampling variance. Coverage can be better calibrated for the alternatives to Wald by improving estimation of the effective sample size through superpopulation modeling. This approach is more effective in our simulations than previously proposed modifications of effective sample size. We recommend intervals of the Wilson or Bayes uniform prior form, with the Jeffreys prior interval not far behind.
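For concreteness, a minimal sketch of the effective-sample-size substitution for two of the intervals discussed (Wald and Wilson), assuming a made-up design effect of 2.5:

```python
import numpy as np
from scipy import stats

def wald_ci(p_hat, n_eff, level=0.95):
    """Wald interval with the effective sample size in place of n."""
    z = stats.norm.ppf(0.5 + level / 2)
    moe = z * np.sqrt(p_hat * (1 - p_hat) / n_eff)
    return p_hat - moe, p_hat + moe

def wilson_ci(p_hat, n_eff, level=0.95):
    """Wilson (score) interval with the effective sample size."""
    z = stats.norm.ppf(0.5 + level / 2)
    center = (p_hat + z**2 / (2 * n_eff)) / (1 + z**2 / n_eff)
    half = (z / (1 + z**2 / n_eff)) * np.sqrt(
        p_hat * (1 - p_hat) / n_eff + z**2 / (4 * n_eff**2))
    return center - half, center + half

# Hypothetical example: p_hat = 0.02 from n = 400, design effect 2.5
n_eff = 400 / 2.5
print(wald_ci(0.02, n_eff))    # lower endpoint can fall below 0
print(wilson_ci(0.02, n_eff))  # stays within [0, 1]
```

For small proportions the Wald endpoints can leave [0, 1], while the Wilson interval stays inside the unit interval by construction.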


2017, Vol. 23 (2), p. 33
Author(s): José W. Camero Jiménez, Jahaziel G. Ponce Sánchez

Abstract Current methods for estimating the mean are based on the confidence interval of the average, or sample mean. This work aims to help choose which estimator (mean or median) to use depending on the sample size. To this end, samples with a normal distribution, together with confidence intervals for both estimators, were generated via simulation in Excel, and hypothesis tests for the difference of proportions show which method is better depending on the sample size. Keywords: sample size, confidence interval, mean, median.
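A loose re-creation of such a simulation (in Python rather than Excel, with illustrative parameters; the median interval uses the large-sample normal approximation with standard error about 1.2533·s/√n):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def mean_ci(x, level=0.95):
    """t-based confidence interval for the mean."""
    n = len(x)
    t = stats.t.ppf(0.5 + level / 2, df=n - 1)
    half = t * x.std(ddof=1) / np.sqrt(n)
    return x.mean() - half, x.mean() + half

def median_ci(x, level=0.95):
    """Large-sample interval for the median of a normal population:
    its standard error is sqrt(pi/2) * sigma / sqrt(n)."""
    n = len(x)
    z = stats.norm.ppf(0.5 + level / 2)
    half = z * 1.2533 * x.std(ddof=1) / np.sqrt(n)
    med = np.median(x)
    return med - half, med + half

# Empirical coverage for both estimators at several sample sizes
for n in (10, 30, 100):
    cover_mean = cover_med = 0
    reps = 5000
    for _ in range(reps):
        x = rng.normal(loc=5.0, scale=2.0, size=n)
        lo, hi = mean_ci(x);   cover_mean += lo <= 5.0 <= hi
        lo, hi = median_ci(x); cover_med  += lo <= 5.0 <= hi
    print(n, cover_mean / reps, cover_med / reps)
```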


2011, Vol. 28 (2), pp. 471–481
Author(s): Tucker McElroy, Dimitris N. Politis

This paper considers the problem of variance estimation for the sample mean in the context of long memory and negative memory time series dynamics, adopting the fixed-bandwidth approach now popular in the econometrics literature. The distribution theory generalizes the short memory results of Kiefer and Vogelsang (2005, Econometric Theory 21, 1130–1164). In particular, our results highlight the dependence on the kernel (we include flat-top kernels), on whether or not the kernel is nonzero at the boundary, and, most importantly, on whether or not the process is short memory. Simulation studies support the importance of accounting for memory in the construction of confidence intervals for the mean.
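As an illustration of the fixed-bandwidth convention, the sketch below computes a Bartlett-kernel estimate of the long-run variance with the bandwidth pinned at a fixed fraction b of the sample size. Under fixed-b asymptotics the resulting t-statistic requires the nonstandard critical values of Kiefer and Vogelsang rather than Gaussian quantiles, and under long or negative memory the normalization changes further; the code only shows the estimator itself:

```python
import numpy as np

def fixed_b_lrv(x, b=0.5):
    """Bartlett-kernel estimate of the long-run variance of the
    sample mean, with bandwidth M = b * n held at a fixed fraction
    of the sample size (the fixed-b convention)."""
    n = len(x)
    xc = x - x.mean()
    M = max(int(b * n), 1)
    lrv = np.dot(xc, xc) / n                 # lag-0 autocovariance
    for k in range(1, M):
        w = 1.0 - k / M                      # Bartlett weights
        lrv += 2.0 * w * np.dot(xc[k:], xc[:-k]) / n
    return lrv

# t-statistic for the mean; compare against fixed-b critical
# values, not standard normal quantiles
x = np.random.default_rng(0).standard_normal(500)
t_stat = np.sqrt(len(x)) * x.mean() / np.sqrt(fixed_b_lrv(x))
```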


Water, 2021, Vol. 13 (16), Article 2156
Author(s): George Pouliasis, Gina Alexandra Torres-Alves, Oswaldo Morales-Napoles

The generation of synthetic time series is important in contemporary water sciences because of its wide applicability and its ability to model environmental uncertainty. Hydroclimatic variables often exhibit highly skewed distributions, intermittency (that is, alternating dry and wet intervals), and spatial and temporal dependencies that pose a particular challenge to their study. Vine copula models offer an appealing approach to generate synthetic time series because of their ability to preserve any marginal distribution while modeling a variety of probabilistic dependence structures. In this work, we focus on the stochastic modeling of hydroclimatic processes using vine copula models. We provide an approach to model intermittency by coupling Markov chains with vine copula models. Our approach preserves first-order auto- and cross-dependencies (correlations). Moreover, we present a novel framework that is able to model multiple processes simultaneously. This method is based on the coupling of temporal and spatial dependence models through repetitive sampling. The result is a parsimonious and flexible method that can adequately account for temporal and spatial dependencies. Our method is illustrated within the context of a recent reliability assessment of a historical hydraulic structure in central Mexico. Our results show that by ignoring important characteristics of probabilistic dependence that are well captured by our approach, the reliability of the structure could be severely underestimated.
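The intermittency coupling can be sketched in a few lines: a two-state Markov chain decides wet versus dry, and wet-day amounts are drawn through a copula that preserves first-order autocorrelation. The sketch below uses a single lag-1 Gaussian copula with an illustrative gamma marginal, whereas the paper's vine copulas generalize this to flexible trees of pair-copulas; all parameter values are placeholders:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def simulate_wet_dry(n, p_wd=0.3, p_ww=0.6):
    """First-order Markov chain for occurrence:
    p_wd = P(wet | dry), p_ww = P(wet | wet)."""
    s = np.zeros(n, dtype=int)
    for t in range(1, n):
        p = p_ww if s[t - 1] else p_wd
        s[t] = rng.random() < p
    return s

def simulate_amounts(s, rho=0.5, shape=0.7, scale=8.0):
    """Wet-day amounts with lag-1 dependence via a bivariate
    Gaussian copula; the gamma marginal is an illustrative choice."""
    n = len(s)
    z = np.empty(n)
    z[0] = rng.standard_normal()
    for t in range(1, n):
        z[t] = rho * z[t - 1] + np.sqrt(1 - rho**2) * rng.standard_normal()
    u = stats.norm.cdf(z)                       # uniform scores
    amounts = stats.gamma.ppf(u, a=shape, scale=scale)
    return np.where(s == 1, amounts, 0.0)       # zero on dry days

occ = simulate_wet_dry(365 * 10)
r = simulate_amounts(occ)
```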


Biometrika, 2020
Author(s): Oliver Dukes, Stijn Vansteelandt

Summary Eliminating the effect of confounding in observational studies typically involves fitting a model for an outcome adjusted for covariates. When, as is often the case, these covariates are high-dimensional, this necessitates the use of sparse estimators, such as the lasso, or other regularization approaches. Naïve use of such estimators yields confidence intervals for the conditional treatment effect parameter that are not uniformly valid. Moreover, as the number of covariates grows with the sample size, correctly specifying a model for the outcome is nontrivial. In this article we deal with both of these concerns simultaneously, obtaining confidence intervals for conditional treatment effects that are uniformly valid, regardless of whether the outcome model is correct. This is done by incorporating an additional model for the treatment selection mechanism. When both models are correctly specified, we can weaken the standard conditions on model sparsity. Our procedure extends to multivariate treatment effect parameters and complex longitudinal settings.
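The flavor of combining an outcome model with a treatment-selection model can be sketched as follows. This is the familiar augmented inverse-probability-weighted (AIPW) estimate of a marginal treatment effect with lasso-fitted nuisance models on simulated data, not the authors' procedure for conditional effects; all names and parameter values are illustrative:

```python
import numpy as np
from sklearn.linear_model import LassoCV, LogisticRegressionCV

rng = np.random.default_rng(2)

# Simulated high-dimensional data (hypothetical example)
n, p = 500, 200
X = rng.standard_normal((n, p))
ps = 1 / (1 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
A = rng.binomial(1, ps)                       # treatment
Y = 1.0 * A + X[:, 0] + 0.5 * X[:, 1] + rng.standard_normal(n)

# Nuisance models: lasso for the outcome, penalized logistic
# regression for the treatment selection mechanism
out = LassoCV(cv=5).fit(np.column_stack([A, X]), Y)
trt = LogisticRegressionCV(cv=5, penalty="l1", solver="saga",
                           max_iter=5000).fit(X, A)
pi_hat = trt.predict_proba(X)[:, 1]
mu1 = out.predict(np.column_stack([np.ones(n), X]))   # E[Y | A=1, X]
mu0 = out.predict(np.column_stack([np.zeros(n), X]))  # E[Y | A=0, X]

# Doubly robust (AIPW) scores: consistent if either nuisance
# model is correctly specified
psi = (mu1 - mu0
       + A / pi_hat * (Y - mu1)
       - (1 - A) / (1 - pi_hat) * (Y - mu0))
ate = psi.mean()
se = psi.std(ddof=1) / np.sqrt(n)
print(f"ATE ~ {ate:.3f} +/- {1.96 * se:.3f}")
```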


2016, Vol. 407, pp. 371–386
Author(s): Krzysztof Bartoszek
