Open Statistics | ScienceGate

Asymptotic Inference for Optimal Rerandomization Designs

Open Statistics ◽

10.1515/stat-2020-0102 ◽

2021 ◽

Vol 1 (1) ◽

pp. 49-58

Author(s):

Mårten Schultzberg ◽

Per Johansson

Keyword(s):

Experimental Design ◽

Normal Distribution ◽

Mahalanobis Distance ◽

Design Strategy ◽

Sampling Distribution ◽

Optimal Designs ◽

Asymptotic Inference ◽

Optimal Allocations ◽

Asymptotic Sampling

Abstract Recently a computational-based experimental design strategy called rerandomization has been proposed as an alternative or complement to traditional blocked designs. The idea of rerandomization is to remove, from consideration, those allocations with large imbalances in observed covariates according to a balance criterion, and then randomize within the set of acceptable allocations. Based on the Mahalanobis distance criterion for balancing the covariates, we show that asymptotic inference to the population, from which the units in the sample are randomly drawn, is possible using only the set of best, or ‘optimal’, allocations. Finally, we show that for the optimal and near optimal designs, the quite complex asymptotic sampling distribution derived by Li et al. (2018), is well approximated by a normal distribution.
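The rerandomization idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it uses plain accept/reject rerandomization with a Mahalanobis-distance threshold, whereas the paper studies designs restricted to the best allocations. All function names, the toy data, and the threshold value are hypothetical and chosen only for illustration.

```python
import random

def mean_vector(rows):
    """Column means of a list of equal-length rows."""
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]

def covariance_matrix(rows):
    """Sample covariance matrix (divisor n - 1) of the covariates."""
    n, k = len(rows), len(rows[0])
    m = mean_vector(rows)
    return [[sum((r[a] - m[a]) * (r[b] - m[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def mahalanobis_imbalance(covariates, assignment):
    """Mahalanobis distance between treated and control covariate means,
    scaled by n1 * n0 / n (assumes a non-singular covariance matrix)."""
    treated = [x for x, w in zip(covariates, assignment) if w == 1]
    control = [x for x, w in zip(covariates, assignment) if w == 0]
    diff = [t - c for t, c in zip(mean_vector(treated), mean_vector(control))]
    scale = len(treated) * len(control) / len(covariates)
    z = solve(covariance_matrix(covariates), diff)
    return scale * sum(d * zi for d, zi in zip(diff, z))

def rerandomize(covariates, n_treated, threshold, rng, max_draws=10000):
    """Draw random allocations until one has imbalance <= threshold."""
    base = [1] * n_treated + [0] * (len(covariates) - n_treated)
    for _ in range(max_draws):
        assignment = base[:]
        rng.shuffle(assignment)
        m = mahalanobis_imbalance(covariates, assignment)
        if m <= threshold:
            return assignment, m
    raise RuntimeError("no acceptable allocation found within max_draws")

# Toy example: 20 units, 2 simulated covariates, illustrative threshold.
rng = random.Random(0)
covariates = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
threshold = 1.0
assignment, imbalance = rerandomize(covariates, 10, threshold, rng)
```

Lowering `threshold` shrinks the set of acceptable allocations; the optimal designs discussed in the abstract correspond to the limiting case where only the best-balanced allocations remain, which in general calls for a search rather than accept/reject sampling.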

A note on an extreme left skewed unit distribution: Theory, modelling and data fitting

Open Statistics ◽

10.1515/stat-2020-0103 ◽

2021 ◽

Vol 2 (1) ◽

pp. 1-23

Author(s):

Christophe Chesneau

Keyword(s):

Probability Density Function ◽

Probability Density ◽

Density Function ◽

Power Distribution ◽

Stochastic Order ◽

Cumulative Distribution ◽

Closed Form Expression ◽

Survival Times ◽

Data Set ◽

Wide Range

Abstract In probability and statistics, unit distributions are used to model proportions, rates, and percentages, among other things. This paper is about a new one-parameter unit distribution, whose probability density function is defined by an original ratio of power and logarithmic functions. This function has a wide range of J shapes, some of which are more angular than others. In this sense, the proposed distribution can be thought of as an “extremely left skewed alternative” to the traditional power distribution. We discuss its main characteristics, including other features of the probability density function, some stochastic order results, the closed-form expression of the cumulative distribution function involving special integral functions, the quantile and hazard rate functions, simple expressions for the ordinary moments, skewness, kurtosis, moments generating function, incomplete moments, logarithmic moments and logarithmically weighted moments. Subsequently, a simple example of an application is given by the use of simulated data, with fair comparison to the power model supported by numerical and graphical illustrations. A new modelling strategy beyond the unit domain is also proposed and developed, with an application to a survival times data set.

Prediction Regions for Poisson and Over-Dispersed Poisson Regression Models with Applications in Forecasting the Number of Deaths during the COVID-19 Pandemic

Open Statistics ◽

10.1515/stat-2020-0106 ◽

2021 ◽

Vol 2 (1) ◽

pp. 81-112

Author(s):

Taeho Kim ◽

Benjamin Lieberman ◽

George Luta ◽

Edsel A. Peña

Keyword(s):

Regression Model ◽

Poisson Regression ◽

Negative Binomial ◽

The United States ◽

Additional Parameter ◽

Cumulative Number ◽

Poisson Regression Model ◽

Prediction Region ◽

Over Dispersion ◽

Prediction Regions

Abstract Motivated by the Coronavirus Disease (COVID-19) pandemic, which is due to the SARS-CoV-2 virus, and the important problem of forecasting the number of daily deaths and the number of cumulative deaths, this paper examines the construction of prediction regions or intervals under the no-covariate or intercept-only Poisson model, the Poisson regression model, and a new over-dispersed Poisson regression model. These models are useful for settings with events of interest that are rare. For the no-covariate Poisson and the Poisson regression model, several prediction regions are developed and their performances are compared through simulation studies. The methods are applied to the problem of forecasting the number of daily deaths and the number of cumulative deaths in the United States (US) due to COVID-19. To examine their predictive accuracy in light of what actually happened, daily deaths data until May 15, 2020 were used to forecast cumulative deaths by June 1, 2020. It was observed that there is over-dispersion in the observed data relative to the Poisson regression model. A novel over-dispersed Poisson regression model is therefore proposed. This new model, which is distinct from the negative binomial regression (NBR) model, builds on frailty ideas in Survival Analysis and over-dispersion is quantified through an additional parameter. It has the flavor of a discrete measurement error model and with a viable physical interpretation in contrast to the NBR model. The Poisson regression model is a hidden model in this over-dispersed Poisson regression model, obtained as a limiting case when the over-dispersion parameter increases to infinity. A prediction region for the cumulative number of US deaths due to COVID-19 by October 1, 2020, given the data until September 1, 2020, is presented. Realized daily and cumulative deaths values from September 1st until September 25th are compared to the prediction region limits. 
Finally, the paper discusses limitations of the proposed procedures and mentions open research problems. It also pinpoints dangers and pitfalls when forecasting on a long horizon, especially during a pandemic where events, both foreseen and unforeseen, could impact point predictions and prediction regions.

GARCH with generalized Pareto tail

Open Statistics ◽

10.1515/stat-2020-0105 ◽

2021 ◽

Vol 2 (1) ◽

pp. 37-80

Author(s):

Hiroyuki Kawakatsu

Keyword(s):

Risk Management ◽

Financial Risk ◽

Heavy Tail ◽

Conditional Heteroskedasticity ◽

Return Distribution ◽

Tail Risk ◽

Asset Return ◽

Generalized Pareto ◽

Estimated Parameters ◽

Pareto Tail

Abstract This paper proposes the use of a spliced distribution with generalized Pareto tail for financial risk management. The proposed distribution is tailored to flexibly capture the heavy tail in asset return distribution. The parameters of the distribution can be estimated jointly with a conditional heteroskedasticity model. The estimated parameters can then be used to produce tail risk forecasts for risk management purposes. The use of the proposed distribution is illustrated by evaluating tail risk forecasts for a number of major stock indices.

Orthonormal Canonical Correlation Analysis

Open Statistics ◽

10.1515/stat-2020-0104 ◽

2021 ◽

Vol 2 (1) ◽

pp. 24-36

Author(s):

Stan Lipovetsky

Keyword(s):

Correlation Analysis ◽

Canonical Correlation Analysis ◽

Canonical Correlation ◽

Data Sets ◽

Multivariate Statistical ◽

Canonical Correlations ◽

Robust Version ◽

Multiple Variables ◽

Value Decomposition ◽

Further Development

Abstract Complex managerial problems are usually described by datasets with multiple variables, and in lack of a theoretical model, the data structures can be found by special multivariate statistical techniques. For two datasets, the canonical correlation analysis and its robust version are known as good working research tools. This paper presents their further development via the orthonormal approximation of data matrices which corresponds to using singular value decomposition in the canonical correlations. The features of the new method are described and applications considered. This type of multivariate analysis is useful for solving various practical problems of applied statistics requiring operating with two data sets, and can be helpful in managerial estimations and decision making.

Dependence and dependence structures: estimation and visualization using the unifying concept of distance multivariance

Open Statistics ◽

10.1515/stat-2020-0001 ◽

2019 ◽

Vol 1 (1) ◽

pp. 1-48 ◽

Cited By ~ 1

Author(s):

Björn Böttcher

Keyword(s):

R Package ◽

Higher Order ◽

Dependence Structure ◽

Moment Conditions ◽

Dependence Measure ◽

Multivariate Dependence ◽

Distance Covariance ◽

Dependence Measures ◽

Pairwise Independence ◽

Rv Coefficient

Abstract Distance multivariance is a multivariate dependence measure, which can detect dependencies between an arbitrary number of random vectors each of which can have a distinct dimension. Here we discuss several new aspects, present a concise overview and use it as the basis for several new results and concepts: in particular, we show that distance multivariance unifies (and extends) distance covariance and the Hilbert-Schmidt independence criterion HSIC, moreover also the classical linear dependence measures: covariance, Pearson’s correlation and the RV coefficient appear as limiting cases. Based on distance multivariance several new measures are defined: a multicorrelation which satisfies a natural set of multivariate dependence measure axioms and m-multivariance which is a dependence measure yielding tests for pairwise independence and independence of higher order. These tests are computationally feasible and under very mild moment conditions they are consistent against all alternatives. Moreover, a general visualization scheme for higher order dependencies is proposed, including consistent estimators (based on distance multivariance) for the dependence structure. Many illustrative examples are provided. All functions for the use of distance multivariance in applications are published in the R-package multivariance.


Published By Walter De Gruyter Gmbh
