Empirical Performance
Recently Published Documents

TOTAL DOCUMENTS: 409 (five years: 123)
H-INDEX: 27 (five years: 3)

Econometrics, 2022, Vol. 10 (1), p. 5
Author(s): Ron Mittelhammer, George Judge, Miguel Henry

In this paper, we introduce a flexible and widely applicable nonparametric entropy-based testing procedure that can be used to assess the validity of simple hypotheses about a specific parametric population distribution. The testing methodology relies on the characteristic function of the population probability distribution being tested and is attractive in that, regardless of the null hypothesis being tested, it provides a unified framework for conducting such tests. The testing procedure is also computationally tractable and relatively straightforward to implement. In contrast to some alternative test statistics, the proposed entropy test is free from user-specified kernel and bandwidth choices, idiosyncratic and complex regularity conditions, and choices of evaluation grids. Several simulation exercises document the empirical performance of the proposed test, including a regression example illustrating how, in some contexts, the approach can be applied to composite hypothesis-testing situations via data transformations. Overall, the testing procedure shows notable promise, exhibiting power that increases appreciably with sample size for a number of alternative distributions when contrasted with hypothesized null distributions. Possible general extensions of the approach to composite hypothesis-testing contexts and directions for future work are also discussed.
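The abstract does not spell out the entropy statistic itself, so the sketch below only illustrates the general characteristic-function idea: it compares the empirical characteristic function of the data against that of a hypothesized N(0, 1) null, using a fixed grid and weight (choices the paper's method explicitly avoids) and a parametric bootstrap for the p-value. All names and constants are illustrative assumptions, not the authors' procedure.

```python
# Illustrative sketch only: a simple characteristic-function-based
# goodness-of-fit test (NOT the entropy statistic from the paper).
# Null hypothesis: X ~ N(0, 1).
import numpy as np

rng = np.random.default_rng(0)

def cf_distance(x, t):
    """Weighted L2 distance between the empirical CF of x and the N(0,1) CF."""
    ecf = np.exp(1j * np.outer(t, x)).mean(axis=1)   # empirical CF on grid t
    ncf = np.exp(-t**2 / 2)                          # standard normal CF
    w = np.exp(-t**2)                                # weight so the integral converges
    return float(np.sum(np.abs(ecf - ncf)**2 * w) * (t[1] - t[0]))

def cf_test(x, n_boot=500):
    t = np.linspace(-5, 5, 201)                      # fixed grid (assumed choice)
    stat = cf_distance(x, t)
    # Parametric bootstrap: distribution of the statistic under the null.
    boot = np.array([cf_distance(rng.standard_normal(len(x)), t)
                     for _ in range(n_boot)])
    return stat, float(np.mean(boot >= stat))        # statistic, p-value

stat, p = cf_test(rng.standard_normal(300))          # data drawn under H0
print(f"stat={stat:.4g}, p={p:.3f}")                 # p should be large here
```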


2022
Author(s): Changjian Shui, Boyu Wang, Christian Gagné

Abstract: A crucial aspect of reliable machine learning is designing a deployable system that generalizes to new, related but unobserved environments. Domain generalization aims to close this prediction gap between observed and unseen environments. Previous approaches commonly learned an invariant representation to achieve good empirical performance. In this paper, we show that merely learning an invariant representation remains vulnerable to related but unseen environments. We therefore derive a novel theoretical analysis that bounds the error on unseen test environments in representation learning, highlighting the importance of controlling the smoothness of the representation. In practice, our analysis further inspires an efficient regularization method that improves robustness in domain generalization. The proposed regularization is orthogonal to, and can be straightforwardly adopted in, existing domain generalization algorithms that ensure invariant representation learning. Empirical results show that our algorithm outperforms the base versions across various datasets and invariance criteria.
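The exact regularizer is not given in the abstract; as a hedged sketch of how one might penalize representation smoothness in PyTorch, the following approximates the feature extractor's Jacobian norm with finite differences. The network shape, `eps`, and the 0.1 weight are arbitrary placeholders, not values from the paper.

```python
# Illustrative sketch: encouraging a smooth (low-Lipschitz) representation,
# in the spirit of the abstract, by penalizing a finite-difference estimate
# of the feature extractor's Jacobian norm.
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 32))

def smoothness_penalty(encoder, x, eps=1e-2):
    """Approximate E||J_phi(x)||^2 by perturbing inputs and comparing features."""
    noise = eps * torch.randn_like(x)
    diff = encoder(x + noise) - encoder(x)           # ~ J_phi(x) @ noise
    return (diff.pow(2).sum(dim=1) / eps**2).mean()

x = torch.randn(128, 16)
task_loss = torch.tensor(0.0)                        # placeholder for ERM/invariance loss
loss = task_loss + 0.1 * smoothness_penalty(encoder, x)
loss.backward()                                      # gradients flow into the encoder
```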


Author(s): Umberto Amato, Anestis Antoniadis, Italia De Feis, Irène Gijbels

Abstract: Nonparametric univariate regression via wavelets is usually implemented under the assumptions of dyadic sample size, equally spaced fixed sample points, and i.i.d. normal errors. In this work, we propose, study, and compare wavelet-based nonparametric estimation methods designed to recover a one-dimensional regression function from data that do not necessarily satisfy these requirements. These methods apply appropriate regularization by penalizing the decomposition of the unknown regression function in a wavelet basis evaluated at the sampling design. Exploiting the sparsity of wavelet decompositions for signals belonging to homogeneous Besov spaces, we use efficient proximal gradient descent algorithms from the recent literature to compute the estimates quickly. Our wavelet-based procedures, in both the standard and robust regression cases, have favorable theoretical properties, thanks in large part to the separable nature of the (non-convex) regularization they are based on. We establish asymptotic global optimal rates of convergence under weak conditions; such rates are, in general, unattainable by smoothing splines or other linear nonparametric smoothers. Lastly, we present several experiments examining the empirical performance of our procedures and comparing them with other proposals in the literature. Regression analyses of real data using these procedures demonstrate their effectiveness.
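As a minimal illustration of the proximal machinery the abstract refers to (not the authors' non-convex, irregular-design estimators), here is wavelet denoising with a single soft-thresholding proximal step using PyWavelets; the `db4` basis and the universal threshold are standard textbook choices assumed for the example.

```python
# Illustrative sketch: penalized wavelet regression via soft thresholding,
# the proximal operator of the l1 penalty (the paper's penalties are
# non-convex and its designs irregular; this equally-spaced version only
# shows the mechanics). Requires PyWavelets (pip install pywavelets).
import numpy as np
import pywt

rng = np.random.default_rng(1)
n = 256                                              # dyadic sample size
x = np.linspace(0, 1, n)
y = np.sin(6 * np.pi * x**2) + 0.3 * rng.standard_normal(n)   # noisy signal

coeffs = pywt.wavedec(y, "db4", mode="periodization")
sigma = np.median(np.abs(coeffs[-1])) / 0.6745       # MAD noise estimate
lam = sigma * np.sqrt(2 * np.log(n))                 # universal threshold

# Soft-thresholding the detail coefficients = proximal step for the l1 penalty.
denoised = [coeffs[0]] + [pywt.threshold(c, lam, mode="soft") for c in coeffs[1:]]
y_hat = pywt.waverec(denoised, "db4", mode="periodization")
print(f"residual std: {np.std(y - y_hat):.3f}")
```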


Sensors, 2021, Vol. 21 (23), p. 8009
Author(s): Abdulmajid Murad, Frank Alexander Kraemer, Kerstin Bach, Gavin Taylor

Data-driven forecasts of air quality have recently achieved more accurate short-term predictions. However, despite their success, most current data-driven solutions lack proper quantification of model uncertainty that communicates how much to trust the forecasts. Several practical tools for estimating uncertainty have recently been developed in probabilistic deep learning, but there have been no empirical applications or extensive comparisons of these tools in the domain of air quality forecasting. This work therefore applies state-of-the-art uncertainty quantification techniques in a real-world setting of air quality forecasts. Through extensive experiments, we describe training probabilistic models and evaluate their predictive uncertainties based on empirical performance, reliability of confidence estimates, and practical applicability. We also propose improving these models using "free" adversarial training and exploiting the temporal and spatial correlation inherent in air quality data. Our experiments demonstrate that the proposed models outperform previous works in quantifying uncertainty in data-driven air quality forecasts. Overall, Bayesian neural networks provide a more reliable uncertainty estimate but can be challenging to implement and scale. Other scalable methods, such as deep ensembles, Monte Carlo (MC) dropout, and stochastic weight averaging-Gaussian (SWAG), can perform well if applied correctly, though with different tradeoffs and slight variations in performance metrics. Finally, our results show the practical impact of uncertainty estimation and demonstrate that probabilistic models are indeed more suitable for making informed decisions.
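Of the methods compared, MC dropout is the easiest to sketch. A minimal, hedged PyTorch version, with an assumed toy architecture rather than anything from the paper, keeps dropout active at prediction time and averages stochastic forward passes:

```python
# Illustrative sketch: Monte Carlo (MC) dropout for predictive uncertainty.
# Multiple stochastic forward passes yield a mean forecast plus a spread
# that serves as an uncertainty estimate.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(8, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 64), nn.ReLU(), nn.Dropout(p=0.2),
    nn.Linear(64, 1),
)

@torch.no_grad()
def mc_dropout_predict(model, x, n_samples=50):
    model.train()                                    # keep dropout stochastic at inference
    preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)       # forecast, predictive spread

x = torch.randn(32, 8)                               # e.g., a batch of sensor features
mean, std = mc_dropout_predict(model, x)
print(mean.shape, std.shape)
```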


2021, Vol. 20 (12)
Author(s): Phillip C. Lotshaw, Travis S. Humble, Rebekah Herrman, James Ostrowski, George Siopsis

2021
Author(s): Robert Wang, Richard Y Zhang, Alex Khodaverdian, Nir Yosef

CRISPR-Cas9 lineage tracing technologies have emerged as a powerful tool for investigating development in single-cell contexts, but exact reconstruction of the underlying clonal relationships in experiments is plagued by data-related complications. These complications are functions of the experimental parameters in these systems, such as the Cas9 cutting rate, the diversity of indel outcomes, and the rate of missing data. In this paper, we develop two theoretically grounded algorithms for reconstructing the underlying phylogenetic tree, as well as asymptotic bounds on the number of recording sites necessary for exact recapitulation of the ground-truth phylogeny with high probability. In doing so, we explore the relationship between problem difficulty and the experimental parameters, with implications for experimental design. Lastly, we provide simulations validating these bounds and showing the empirical performance of the algorithms. Overall, this work provides a first theoretical analysis of phylogenetic reconstruction for CRISPR-Cas9 lineage tracing technologies.
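The paper's algorithms and bounds are not reproduced in the abstract; as a hedged illustration of the data layout only, the sketch below builds a tree from a toy character matrix (cells x recording sites, with -1 marking missing data) using plain UPGMA on a shared-site dissimilarity, a far simpler heuristic than the authors' theoretically grounded methods.

```python
# Illustrative sketch: distance-based tree building from a toy CRISPR-Cas9
# character matrix. Encoding: 0 = uncut site, positive integers = indel
# states, -1 = missing data. NOT the paper's algorithms.
import numpy as np
from scipy.cluster.hierarchy import linkage

chars = np.array([                                   # 4 cells, 6 recording sites
    [1, 0, 2, -1, 0, 3],
    [1, 0, 2,  4, 0, 3],
    [5, 0, 2, -1, 1, 0],
    [5, 6, 2,  0, 1, 0],
])

def dissimilarity(a, b):
    """Fraction of mutually observed sites on which two cells disagree."""
    observed = (a != -1) & (b != -1)
    return float(np.mean(a[observed] != b[observed]))

n = len(chars)
condensed = [dissimilarity(chars[i], chars[j])       # condensed distance vector
             for i in range(n) for j in range(i + 1, n)]
tree = linkage(condensed, method="average")          # UPGMA
print(tree)                                          # merge order ~ clonal history
```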


Author(s): Hendrik van der Wurp, Andreas Groll

Abstract: In this work, we propose an extension of the versatile joint regression framework for bivariate count responses implemented in the package by Marra and Radice (R package version 0.2-3, 2020), incorporating an (adaptive) LASSO-type penalty. The underlying estimation algorithm is based on a quadratic approximation of the penalty. The method enables variable selection, and the corresponding estimates guarantee shrinkage and sparsity; the approach is therefore particularly useful in high-dimensional count response settings. The proposal's empirical performance is investigated in a simulation study and in an application to FIFA World Cup football data.
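The bivariate count model itself is beyond a short sketch, but the adaptive LASSO idea at its core can be illustrated in the simple Gaussian case: coefficient-specific weights from an initial fit are absorbed into the design so that a standard LASSO solver performs the adaptive selection. Everything below is an assumed toy setup, not the paper's estimator.

```python
# Illustrative sketch: adaptive LASSO in its simplest (Gaussian) form.
# Weights w_j = 1/|beta_init_j| are absorbed by rescaling columns, so the
# penalty lambda * sum_j w_j |beta_j| becomes a plain l1 penalty.
import numpy as np
from sklearn.linear_model import LinearRegression, Lasso

rng = np.random.default_rng(2)
n, p = 200, 10
X = rng.standard_normal((n, p))
beta = np.array([2.0, -1.5, 0, 0, 1.0, 0, 0, 0, 0, 0])  # sparse ground truth
y = X @ beta + rng.standard_normal(n)

beta_init = LinearRegression().fit(X, y).coef_           # initial estimate
w = 1.0 / (np.abs(beta_init) + 1e-8)                     # adaptive weights
lasso = Lasso(alpha=0.1).fit(X / w, y)                   # weighted l1 via rescaling
beta_hat = lasso.coef_ / w                               # undo the rescaling
print(np.round(beta_hat, 2))                             # exact zeros -> dropped variables
```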


Author(s): Ayman El Tarabishy, Won-Sik Hwang, John Laurence Enriquez, Ki-Chan Kim

2021
Author(s): MohammadHossein Bateni, Yiwei Chen, Dragos Florin Ciocan, Vahab Mirrokni

In settings where a platform must allocate finite supplies of goods to buyers, balancing overall platform revenue against the fairness of individual allocations is paramount to the well-functioning of the platform. This is made more difficult by the fact that the supply of goods is, in practice, stochastic and hard to forecast, as in online ad allocation, where the platform manages a supply of impressions that varies over time. In this paper, we design a fair allocation scheme that works in the presence of supply uncertainty. Algorithmically, the scheme repeatedly solves for Fisher market equilibria in a model predictive control fashion and is proven to admit constant-factor guarantees relative to the offline optimum. In addition, the scheme is tested on a sequence of real ad datasets, showing strong empirical performance.
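A Fisher market equilibrium with linear utilities can be computed via the Eisenberg-Gale convex program, which is the subproblem the scheme repeatedly solves; a minimal sketch with assumed toy budgets and valuations (requires cvxpy) follows.

```python
# Illustrative sketch: one Fisher market equilibrium via the Eisenberg-Gale
# convex program. Buyer budgets B, linear valuations V (buyers x goods),
# unit supply per good. Toy numbers only.
import cvxpy as cp
import numpy as np

B = np.array([1.0, 2.0, 1.0])                        # buyer budgets
V = np.array([[2.0, 1.0],                            # valuations: 3 buyers, 2 goods
              [1.0, 3.0],
              [1.5, 1.5]])

X = cp.Variable(V.shape, nonneg=True)                # allocation matrix
utilities = cp.sum(cp.multiply(V, X), axis=1)        # linear utilities per buyer
objective = cp.Maximize(cp.sum(cp.multiply(B, cp.log(utilities))))
constraints = [cp.sum(X, axis=0) <= 1]               # each good has unit supply
cp.Problem(objective, constraints).solve()
print(np.round(X.value, 3))                          # equilibrium allocation
```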

