Sequentially Estimating the Approximate Conditional Mean Using Extreme Learning Machines

Entropy ◽  
2020 ◽  
Vol 22 (11) ◽  
pp. 1294
Author(s):  
Lijuan Huo ◽  
Jin Seo Cho

This study examined the extreme learning machine (ELM) applied to the Wald test statistic for the model specification of the conditional mean, which we call the WELM testing procedure. The omnibus test statistics available in the literature weakly converge to a Gaussian stochastic process under the null hypothesis that the model is correct, which makes their application inconvenient. By contrast, the WELM testing procedure is straightforwardly applicable for detecting model misspecification. We applied the WELM testing procedure to the sequential testing procedure formed by a set of polynomial models and estimated an approximate conditional expectation. We then conducted extensive Monte Carlo experiments to evaluate the performance of the sequential WELM testing procedure and verified that it consistently estimates the most parsimonious conditional mean when the set of polynomial models contains a correctly specified model; otherwise, it consistently rejects all the models in the set.
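
A minimal sketch of the sequential idea may help. In the hypothetical Python fragment below, each polynomial order is checked by augmenting the regression with random ELM features (a random-weight sigmoid hidden layer) and Wald-testing their joint significance; the first order that is not rejected is accepted. The statistic, the sigmoid activation, and the 5% threshold are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def elm_features(x, n_hidden=10):
    """Random-weight sigmoid hidden layer: the ELM part, weights drawn, not trained."""
    w = rng.normal(size=(1, n_hidden))
    b = rng.normal(size=n_hidden)
    return 1.0 / (1.0 + np.exp(-(x[:, None] * w + b)))

def wald_misspec_test(x, y, order, n_hidden=10):
    """Wald test that ELM features add nothing beyond a degree-`order` polynomial."""
    X = np.column_stack([x**k for k in range(order + 1)])
    Z = np.column_stack([X, elm_features(x, n_hidden)])   # augmented regression
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
    resid = y - Z @ beta
    sigma2 = resid @ resid / (len(y) - Z.shape[1])
    cov = sigma2 * np.linalg.inv(Z.T @ Z)
    gamma = beta[order + 1:]                              # ELM coefficients under test
    W = gamma @ np.linalg.solve(cov[order + 1:, order + 1:], gamma)
    return stats.chi2.sf(W, df=n_hidden)                  # p-value under H0

# Sequential procedure: accept the first (most parsimonious) order not rejected.
x = rng.uniform(-2, 2, 500)
y = 1 + 2 * x - x**2 + rng.normal(scale=0.5, size=500)    # true mean is quadratic
for order in range(4):
    p = wald_misspec_test(x, y, order)
    print(f"order {order}: p = {p:.3f}")
    if p > 0.05:
        print(f"selected polynomial order: {order}")
        break
```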

2020 ◽  
Vol 29 (12) ◽  
pp. 3653-3665
Author(s):  
Wei-Wen Hsu ◽  
David Todem ◽  
Nadeesha R Mawella ◽  
KyungMann Kim ◽  
Richard R Rosenkranz

In many applications of zero-inflated models, score tests are often used to evaluate whether the population heterogeneity implied by these models is consistent with the data. The most frequently cited justification for using score tests is that they only require estimation under the null hypothesis. Because this estimation involves specifying a plausible model consistent with the null hypothesis, the testing procedure can lead to unreliable inferences under model misspecification. In this paper, we propose a score test of homogeneity for zero-inflated models that is robust against certain model misspecifications. Because the true model is unknown in practical settings, our proposal is developed under a general framework of mixture models in which a layer of randomness is imposed on the model to account for uncertainty in the model specification. We exemplify this approach on the class of zero-inflated Poisson models, where a random term is imposed on the Poisson mean to adjust for relevant covariates missing from the mean model or for a misspecified functional form. For this example, we show through simulations that the resulting score test of zero inflation maintains its empirical size at all levels, albeit with a loss of power for the well-specified non-random mean model under the null. Frequencies of health promotion activities among young Girl Scouts and dental caries indices among inner-city children are used to illustrate the robustness of the proposed testing procedure.
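
A sense of what such a score test looks like can be given for the simplest case. The sketch below implements the classical score test of zero inflation for an intercept-only Poisson model (in the spirit of van den Broek, 1995, the kind of non-robust test this paper improves on); the paper's robust version, which randomizes the Poisson mean, is not reproduced here.

```python
import numpy as np
from scipy import stats

def zip_score_test(y):
    """Score test of H0: no zero inflation, against a ZIP alternative."""
    n = len(y)
    lam = y.mean()                        # MLE of the Poisson mean under H0
    n0 = np.sum(y == 0)                   # observed zeros
    num = (n0 * np.exp(lam) - n) ** 2
    den = n * (np.exp(lam) - 1.0 - lam)   # score variance, adjusted for estimating lam
    S = num / den
    return S, stats.chi2.sf(S, df=1)

rng = np.random.default_rng(1)
poisson = rng.poisson(2.0, size=1000)
zip_y = np.where(rng.random(1000) < 0.2, 0, rng.poisson(2.0, size=1000))
print("Poisson data:", zip_score_test(poisson))   # should not reject
print("ZIP data:    ", zip_score_test(zip_y))     # should reject
```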


2021 ◽  
Author(s):  
Yayi Yan ◽  
Tingting Cheng

This paper introduces a factor-augmented forecasting regression model in the presence of threshold effects. We consider least squares estimation of the regression parameters and establish asymptotic theories for the estimators of both the slope coefficients and the threshold parameter. Prediction intervals are also constructed for the factor-augmented forecasts. Moreover, we develop a likelihood ratio statistic for testing the threshold parameter and a sup-Wald statistic for testing the presence of threshold effects. Simulation results show that the proposed estimation method and testing procedures work very well in finite samples. Finally, we demonstrate the usefulness of the proposed model through an application to forecasting stock market returns.
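
The sup-Wald scan itself is straightforward to sketch. The hypothetical fragment below splits the sample at each trimmed candidate threshold, Wald-tests equality of the two regimes' slopes, and takes the supremum; the factor-extraction step (principal components) and the non-standard critical values are omitted, so the regressors below are stand-in assumptions.

```python
import numpy as np

def sup_wald(y, X, q, trim=0.15):
    """Scan candidate thresholds of q and return the largest Wald statistic."""
    n, k = X.shape
    grid = np.quantile(q, np.linspace(trim, 1 - trim, 50))   # trimmed candidates
    best = -np.inf
    for gamma in grid:
        d = (q <= gamma).astype(float)
        Z = np.column_stack([X * d[:, None], X * (1 - d)[:, None]])  # two regimes
        beta, *_ = np.linalg.lstsq(Z, y, rcond=None)
        resid = y - Z @ beta
        sigma2 = resid @ resid / (n - 2 * k)
        cov = sigma2 * np.linalg.inv(Z.T @ Z)
        R = np.hstack([np.eye(k), -np.eye(k)])               # H0: equal slopes
        diff = R @ beta
        best = max(best, diff @ np.linalg.solve(R @ cov @ R.T, diff))
    return best

rng = np.random.default_rng(2)
n = 400
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # stand-in for estimated factors
q = rng.normal(size=n)                                 # threshold variable
y = X @ np.array([1.0, 0.5]) + 1.5 * X[:, 1] * (q > 0) + rng.normal(size=n)
print("sup-Wald statistic:", sup_wald(y, X, q))
# Critical values are non-standard and must be simulated or bootstrapped,
# which this sketch does not implement.
```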


Sensors ◽  
2020 ◽  
Vol 20 (5) ◽  
pp. 1508 ◽  
Author(s):  
Jinqi Zhao ◽  
Yonglei Chang ◽  
Jie Yang ◽  
Yufen Niu ◽  
Zhong Lu ◽  
...  

Unsupervised change detection approaches, which are relatively straightforward to implement and interpret and require no human intervention, are widely used in change detection. Polarimetric synthetic aperture radar (PolSAR), which has an all-weather response capability with increased polarimetric information, is a key tool for change detection. However, for PolSAR data, inadequate evaluation of the difference image (DI) map makes threshold-based algorithms incompatible with the true distribution model, which renders the change detection results ineffective and inaccurate. In this paper, to solve these problems, we focus on the generation of the DI map and the selection of the optimal threshold. An omnibus test statistic is used to generate the DI map from multi-temporal PolSAR images, and an improved Kittler and Illingworth algorithm based on either the Weibull or the gamma distribution is used to obtain the optimal threshold for generating the change detection map. Multi-temporal PolSAR data obtained by the Radarsat-2 sensor over Wuhan, China, are used to verify the efficiency of the proposed method. Our approach achieved the best performance in the East Lake and Yanxi Lake regions, with false alarm rates of 1.59% and 1.80%, total errors of 2.73% and 4.33%, overall accuracies of 97.27% and 95.67%, and Kappa coefficients of 0.6486 and 0.6275, respectively. These results demonstrate that the proposed method is more suitable for multi-temporal PolSAR data than the other methods compared, and that it can obtain both effective and accurate results.
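
For the thresholding step, the classical Kittler and Illingworth minimum-error criterion is easy to sketch on a DI histogram. The version below assumes Gaussian class-conditional densities; the paper's improvement replaces these with Weibull or gamma models, which is not reproduced here, and the toy data are purely synthetic.

```python
import numpy as np

def kittler_illingworth(di, bins=256):
    """Minimum-error threshold on a histogram, assuming two Gaussian classes."""
    hist, edges = np.histogram(di.ravel(), bins=bins)
    p = hist / hist.sum()
    centers = 0.5 * (edges[:-1] + edges[1:])
    best_J, best_T = np.inf, None
    for t in range(1, bins - 1):
        P1, P2 = p[:t].sum(), p[t:].sum()            # class priors at this split
        if P1 < 1e-6 or P2 < 1e-6:
            continue
        mu1 = (p[:t] * centers[:t]).sum() / P1
        mu2 = (p[t:] * centers[t:]).sum() / P2
        var1 = (p[:t] * (centers[:t] - mu1) ** 2).sum() / P1
        var2 = (p[t:] * (centers[t:] - mu2) ** 2).sum() / P2
        if var1 <= 0 or var2 <= 0:
            continue
        # Minimum-error criterion: within-class log-spread penalized by priors
        # (constant factors dropped; they do not change the argmin).
        J = 1 + P1 * np.log(var1) / 2 + P2 * np.log(var2) / 2 \
              - P1 * np.log(P1) - P2 * np.log(P2)
        if J < best_J:
            best_J, best_T = J, centers[t]
    return best_T

rng = np.random.default_rng(3)
di = np.concatenate([rng.normal(0.2, 0.05, 9000),    # unchanged pixels
                     rng.normal(0.8, 0.10, 1000)])   # changed pixels
T = kittler_illingworth(di)
print("optimal threshold:", T)   # the change map would be di > T
```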


1992 ◽  
Vol 17 (1) ◽  
pp. 1-26
Author(s):  
Douglas E. Critchlow ◽  
Joseph S. Verducci

Paired rankings arise when each subject in a study independently ranks a set of items, undergoes a treatment, and afterwards ranks the same set of items. For such data, a statistical test is proposed to detect whether the subjects' post-treatment rankings have moved systematically toward some unknown ranking or set of rankings. The null hypothesis for this test is that each subject's post-treatment ranking is symmetrically distributed about his pre-treatment ranking. The exact and asymptotic null distributions of the test statistic are simulated and compared, and the power of the test is studied. Using paired rankings from an experimental course in literary criticism, we also offer some graphical methods for representing such data that help to interpret the test results.
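
A permutation sketch in the same spirit: if post-treatment rankings moved toward a common ranking, they should cluster, i.e. have a small average pairwise Kendall distance, and under a symmetry null the pre/post labels can be swapped within each subject. The statistic and swap scheme below are illustrative assumptions, not Critchlow and Verducci's exact test.

```python
import itertools
import numpy as np

def kendall_distance(r1, r2):
    """Number of item pairs ordered differently by the two rankings."""
    return sum((r1[i] - r1[j]) * (r2[i] - r2[j]) < 0
               for i, j in itertools.combinations(range(len(r1)), 2))

def mean_pairwise_distance(rankings):
    return np.mean([kendall_distance(a, b)
                    for a, b in itertools.combinations(rankings, 2)])

def paired_ranking_test(pre, post, n_perm=2000, seed=0):
    rng = np.random.default_rng(seed)
    observed = mean_pairwise_distance(post)
    count = 0
    for _ in range(n_perm):
        swap = rng.random(len(pre)) < 0.5          # relabel pre/post per subject
        perm = [pre[i] if s else post[i] for i, s in enumerate(swap)]
        count += mean_pairwise_distance(perm) <= observed
    return count / n_perm                          # small => post rankings clustered

# Toy data: 6 subjects ranking 4 items (vectors of ranks, item -> rank).
pre = [np.random.default_rng(i).permutation(4) for i in range(6)]
post = [np.array([0, 1, 2, 3]) for _ in range(6)]  # all moved to one ranking
print("p-value:", paired_ranking_test(pre, post))
```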


1979 ◽  
Vol 8 (1) ◽  
pp. 13-16 ◽  
Author(s):  
P. Geoffrey Allen ◽  
Thomas H. Stevens

Bias in estimating recreational values may result if congestion is ignored in the demand model specification. Theoretical and empirical considerations pertaining to recreation congestion are summarized. Empirical results for camping in Western Massachusetts are presented that demonstrate the potential degree of bias from demand model misspecification. The results indicate that recreational values may be strongly influenced by congestion effects and that camping areas with relatively low densities may have a higher economic value than high-density areas with similar facilities.
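
The mechanism behind this bias is ordinary omitted-variable bias, which a purely synthetic sketch can illustrate: when congestion both depresses visits and is correlated with price, dropping it from the demand equation distorts the estimated price response, and hence any value estimate built on it. All numbers below are made up.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
price = rng.uniform(1, 10, n)
congestion = 0.5 * price + rng.normal(size=n)            # correlated with price
trips = 20 - 1.0 * price - 2.0 * congestion + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

X_full = np.column_stack([np.ones(n), price, congestion])
X_miss = np.column_stack([np.ones(n), price])            # congestion omitted
print("price coeff, congestion included:", ols(X_full, trips)[1])  # ~ -1.0
print("price coeff, congestion omitted: ", ols(X_miss, trips)[1])  # ~ -2.0, biased
```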


2015 ◽  
Vol 32 (5) ◽  
pp. 1216-1252 ◽  
Author(s):  
Anil K. Bera ◽  
Antonio F. Galvao ◽  
Liang Wang ◽  
Zhijie Xiao

We study the asymptotic covariance function of the sample mean and sample quantile, and derive a new and surprising characterization of the normal distribution: the asymptotic covariance between the sample mean and sample quantile is constant across all quantiles if and only if the underlying distribution is normal. This is a powerful result that facilitates statistical inference. Utilizing this result, we develop a new omnibus test for normality based on the quantile-mean covariance process. Compared to existing normality tests, the proposed testing procedure has several attractive features. Monte Carlo evidence shows that the proposed test possesses good finite sample properties. In addition to the formal test, we suggest a graphical procedure that is easy to implement and visualize in practice. Finally, we illustrate the use of the suggested techniques with an application to stock return datasets.
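
The characterization is easy to check by simulation. The sketch below estimates n·Cov(sample mean, sample τ-quantile) by Monte Carlo: for normal data the row is flat (equal to the variance) across τ, while for a skewed distribution it varies with τ. This only illustrates the characterization; the formal omnibus test is built on the whole quantile-mean covariance process.

```python
import numpy as np

def scaled_cov(sampler, taus, n=500, reps=4000, seed=5):
    """Monte Carlo estimate of n * Cov(sample mean, sample tau-quantile)."""
    rng = np.random.default_rng(seed)
    means = np.empty(reps)
    quants = np.empty((reps, len(taus)))
    for r in range(reps):
        x = sampler(rng, n)
        means[r] = x.mean()
        quants[r] = np.quantile(x, taus)
    return [n * np.cov(means, quants[:, j])[0, 1] for j in range(len(taus))]

taus = [0.1, 0.25, 0.5, 0.75, 0.9]
print("normal:", np.round(scaled_cov(lambda rng, n: rng.normal(0, 1, n), taus), 2))
print("expon :", np.round(scaled_cov(lambda rng, n: rng.exponential(1.0, n), taus), 2))
# The normal row is flat (close to the variance, 1); the exponential row
# increases with tau, signalling non-normality.
```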


1994 ◽  
Vol 10 (1) ◽  
pp. 70-90 ◽  
Author(s):  
R.M. de Jong ◽  
H.J. Bierens

In this paper, a consistent model specification test is proposed. Some consistent model specification tests have been discussed in the econometrics literature. Those tests are consistent by randomization, display a discontinuity in the sample size, or have an asymptotic distribution that depends on the data-generating process and on the model, whereas our test has none of these disadvantages. Our test can be viewed as a conditional moment test, as proposed by Newey, but instead of a fixed number of conditional moments, an asymptotically infinite number of moment conditions is employed. The use of an asymptotically infinite number of conditional moments makes it possible to obtain a consistent test. Computation of the test statistic is particularly simple, since in finite samples our statistic is equivalent to a chi-square conditional moment test with a finite number of conditional moments.
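
A fixed-K version of such a chi-square conditional moment test is simple to sketch: residuals from the fitted model are checked against a battery of bounded weight functions of the regressors (here cosines and sines of a bounded transform, in the Bierens spirit). The paper lets the number of moments grow with the sample size; the sketch below fixes K and, for brevity, ignores the correction for parameter estimation, so it is an illustration rather than the authors' statistic.

```python
import numpy as np
from scipy import stats

def cm_test(x, resid, K=8):
    """Chi-square conditional moment test with K bounded weight functions."""
    n = len(x)
    phi = np.arctan(x)                             # bounded one-to-one transform of x
    freqs = np.arange(1, K // 2 + 1)
    W = np.column_stack([f(k * phi) for k in freqs for f in (np.cos, np.sin)])
    m = W.T @ resid / n                            # sample moment conditions
    V = (W * resid[:, None] ** 2).T @ W / n        # heteroskedasticity-robust variance
    T = n * m @ np.linalg.solve(V, m)
    return T, stats.chi2.sf(T, df=K)

rng = np.random.default_rng(6)
x = rng.normal(size=800)
y = 1 + x + 0.8 * x**2 + rng.normal(size=800)      # true conditional mean is quadratic
X = np.column_stack([np.ones_like(x), x])          # fitted model: linear only
beta = np.linalg.lstsq(X, y, rcond=None)[0]
print(cm_test(x, y - X @ beta))                    # should reject the linear model
```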


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
A.S. Al-Moisheer

Testing the number of components in a finite mixture is a challenging problem. In this paper, finite exponential mixtures are used, and a sequential testing procedure is adopted based on the likelihood ratio test (LRT) statistic. The distribution of the test statistic under the null hypothesis is obtained using a resampling technique based on B bootstrap samples, from which the quantiles of the test statistic's distribution are evaluated. The performance of the test is examined through its empirical power and through applications to two real datasets. The proposed procedure is used not only for testing the number of components but also for estimating the optimal number of components in a finite exponential mixture distribution. The innovation of this paper is the sequential test, which tests the more general hypothesis of a finite exponential mixture of k components versus a mixture of k + 1 components; the special case of testing an exponential mixture of one component versus two components is the one commonly used in the literature.
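
The procedure can be sketched directly. The fragment below fits k- and (k+1)-component exponential mixtures by EM, computes the LRT statistic, calibrates it against B parametric-bootstrap replicates generated under the fitted k-component null, and tests sequentially until a null is not rejected. The EM details, B = 200, and the 5% level are illustrative choices, not the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(7)

def em_exp_mixture(x, k, n_iter=200):
    """EM for a k-component exponential mixture; returns (weights, rates, loglik)."""
    w = np.full(k, 1.0 / k)
    lam = 1.0 / np.quantile(x, (np.arange(k) + 0.5) / k)   # spread-out starting rates
    for _ in range(n_iter):
        dens = w * lam * np.exp(-np.outer(x, lam))         # (n, k) component densities
        resp = dens / dens.sum(axis=1, keepdims=True)      # E-step: responsibilities
        w = resp.mean(axis=0)                              # M-step: weights and rates
        lam = resp.sum(axis=0) / np.maximum((resp * x[:, None]).sum(axis=0), 1e-12)
    ll = np.log((w * lam * np.exp(-np.outer(x, lam))).sum(axis=1)).sum()
    return w, lam, ll

def bootstrap_lrt(x, k, B=200):
    """LRT of k vs k+1 components, calibrated by parametric bootstrap under H0."""
    w0, lam0, ll0 = em_exp_mixture(x, k)
    lrt = 2 * (em_exp_mixture(x, k + 1)[2] - ll0)
    boot = np.empty(B)
    for b in range(B):
        comp = rng.choice(k, size=len(x), p=w0)            # simulate under fitted H0
        xb = rng.exponential(1.0 / lam0[comp])
        boot[b] = 2 * (em_exp_mixture(xb, k + 1)[2] - em_exp_mixture(xb, k)[2])
    return lrt, np.mean(boot >= lrt)                       # bootstrap p-value

x = np.concatenate([rng.exponential(1.0, 300), rng.exponential(10.0, 300)])
k = 1
while True:                                                # sequential testing
    lrt, p = bootstrap_lrt(x, k)
    print(f"H0: {k} components vs {k + 1}: LRT = {lrt:.2f}, p = {p:.3f}")
    if p > 0.05:
        print(f"selected number of components: {k}")
        break
    k += 1
```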


2015 ◽  
Author(s):  
Mark Abney

This article discusses problems with, and solutions to, performing valid permutation tests for quantitative trait loci in the presence of polygenic effects. Although permutation testing is a popular approach for determining the statistical significance of a test statistic with an unknown distribution (for instance, the maximum of multiple correlated statistics, or some omnibus test statistic for a gene, gene set, or pathway), naive application of permutations may result in an invalid test. The risk of performing an invalid permutation test is particularly acute in complex trait mapping, where polygenicity may combine with a structured population resulting from the presence of families, cryptic relatedness, admixture, or population stratification. I give both analytical derivations and a conceptual understanding of why typical permutation procedures fail, and suggest an alternative permutation-based algorithm, MVNpermute, that succeeds. In particular, I examine the case where a linear mixed model is used to analyze a quantitative trait and show that both phenotype and genotype permutations may result in an invalid permutation test. I provide a formula that predicts the amount of inflation of the type I error rate as a function of the degree of misspecification of the covariance structure of the polygenic effect and the heritability of the trait. I validate this formula through simulations, showing that the permutation distribution matches the theoretical expectation and that my suggested permutation-based test obtains the correct null distribution. Finally, I discuss situations where naive permutations of the phenotype or genotype are valid, and the applicability of the results to other test statistics.
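
The core of the suggested fix is a decorrelate-permute-recolor scheme, which can be sketched as follows: under a linear mixed model y ~ N(Xb, V), residuals are whitened with a square root of V, permuted as (approximately) exchangeable draws, and recolored. The sketch treats V as known; in practice it is estimated, which is part of what MVNpermute handles, so this conveys only the intuition.

```python
import numpy as np

def mvn_permutations(y, X, V, n_perm=1000, seed=8):
    """Whiten residuals with chol(V), permute, recolor; returns permuted phenotypes."""
    L = np.linalg.cholesky(V)
    Linv = np.linalg.inv(L)
    Vinv = Linv.T @ Linv
    beta = np.linalg.solve(X.T @ Vinv @ X, X.T @ Vinv @ y)   # GLS fit of fixed effects
    e = Linv @ (y - X @ beta)                                # whitened residuals
    rng = np.random.default_rng(seed)
    return np.array([X @ beta + L @ rng.permutation(e) for _ in range(n_perm)])

# Toy structured sample: two "families" with within-family correlation 0.5.
rng = np.random.default_rng(9)
n = 100
V = 0.5 * np.eye(n) + np.kron(np.eye(2), np.full((n // 2, n // 2), 0.5))
X = np.ones((n, 1))
y = rng.multivariate_normal(np.zeros(n), V)
perm_ys = mvn_permutations(y, X, V)      # replicates preserving the covariance
print(perm_ys.shape)                     # (1000, 100); feed these to the test statistic
```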


2020 ◽  
Author(s):  
Takuya Kawahara ◽  
Tomohiro Shinozaki ◽  
Yutaka Matsuyama

Background: In the presence of dependent censoring, even after stratification on baseline covariates, the Kaplan–Meier estimator provides an inconsistent estimate of risk. To account for dependent censoring, time-varying covariates can be used along with two statistical methods: the inverse probability of censoring weighted (IPCW) Kaplan–Meier estimator and the parametric g-formula estimator. The consistency of the IPCW Kaplan–Meier estimator depends on the correct specification of the censoring hazard model, whereas that of the parametric g-formula estimator depends on the correct specification of the models for the event hazard and the time-varying covariates. Methods: We combined the IPCW Kaplan–Meier estimator and the parametric g-formula estimator into a doubly robust estimator that can adjust for dependent censoring. The estimator is theoretically more robust to model misspecification than either the IPCW Kaplan–Meier estimator or the parametric g-formula estimator alone. We conducted simulation studies with a time-varying covariate that affected both time-to-event and censoring, under correct and incorrect models for censoring, the event, and the time-varying covariate. We applied our proposed estimator to data from a large clinical trial with censoring before the end of follow-up. Results: Simulation studies demonstrated that our proposed estimator is doubly robust: it is consistent if either the model for the IPCW Kaplan–Meier estimator or the models for the parametric g-formula estimator, but not necessarily both, are correctly specified. The simulation studies and the data application also demonstrated that our estimator can be more efficient than the IPCW Kaplan–Meier estimator. Conclusions: The proposed estimator is useful for estimating risk when censoring is affected by time-varying risk factors.
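
The IPCW Kaplan–Meier half of the proposal can be sketched in discrete time: at each time point, subjects still under observation are reweighted by the inverse of their estimated probability of having remained uncensored given covariates (here, censoring hazards estimated within strata of a single binary covariate). The doubly robust combination with the parametric g-formula estimator is not reproduced, and the censoring model below is a toy assumption.

```python
import numpy as np

def ipcw_km(time, event, z, t_max):
    """IPCW product-limit curve; z is a binary covariate affecting censoring."""
    # Discrete-time censoring hazards within strata of z -> K_c(t | z).
    Kc = np.ones((2, t_max + 1))
    for g in (0, 1):
        for t in range(1, t_max + 1):
            at_risk = np.sum((z == g) & (time >= t))
            cens = np.sum((z == g) & (time == t) & (event == 0))
            h = cens / at_risk if at_risk > 0 else 0.0
            Kc[g, t] = Kc[g, t - 1] * (1.0 - h)
    # Weighted product-limit estimator for the event of interest.
    surv = np.ones(t_max + 1)
    for t in range(1, t_max + 1):
        w = 1.0 / Kc[z, t - 1]              # weight: inverse prob. of being uncensored
        at_risk = np.sum(w[time >= t])
        died = np.sum(w[(time == t) & (event == 1)])
        surv[t] = surv[t - 1] * (1.0 - died / at_risk if at_risk > 0 else 1.0)
    return surv

rng = np.random.default_rng(10)
n, t_max = 2000, 10
z = rng.integers(0, 2, n)                                  # affects both hazards
event_t = rng.geometric(np.where(z == 1, 0.25, 0.10))      # latent event times
cens_t = rng.geometric(np.where(z == 1, 0.30, 0.05))       # dependent censoring times
time = np.minimum(np.minimum(event_t, cens_t), t_max)
event = ((event_t <= cens_t) & (event_t <= t_max)).astype(int)
print(np.round(ipcw_km(time, event, z, t_max), 3))
```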

