Sample Complexity
Recently Published Documents

Total documents: 260 (five years: 91)
H-index: 20 (five years: 6)

2022, Vol 63 (1), pp. 012701
Author(s): Kexiang Yang, Ercai Chen, Xiaoyao Zhou

Author(s): Philipp Trunschke, Martin Eigel, Reinhold Schneider

We consider best approximation problems in a nonlinear subset $\mathcal{M}$ of a Banach space of functions $(\mathcal{V}, \|\cdot\|)$. The norm is assumed to be a generalization of the $L^2$-norm for which only a weighted Monte Carlo estimate $\|\cdot\|_n$ can be computed. The objective is to obtain an approximation $v \in \mathcal{M}$ of an unknown function $u \in \mathcal{V}$ by minimizing the empirical norm $\|u - v\|_n$. We consider this problem for general nonlinear subsets and establish error bounds for the empirical best approximation error. Our results are based on a restricted isometry property (RIP) which holds in probability and is independent of the nonlinear least squares setting. Several model classes are examined where analytical statements can be made about the RIP, and the results are compared to existing sample complexity bounds from the literature. We find that for well-studied model classes our general bound is weaker but exhibits many of the same properties as these specialized bounds. Notably, we demonstrate the advantage of an optimal sampling density (as known for linear spaces) for sets of functions with sparse representations.
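To make the setting concrete, the sketch below solves the simplest instance of this problem: empirical best approximation in a linear model class, where minimizing the empirical norm reduces to a weighted least-squares problem. The target function, the monomial basis, the uniform sampling density, and the uniform weights are illustrative choices, not the paper's setup.

```python
# Minimal sketch: empirical best approximation in a linear model class.
# Assumptions (not from the paper): target u, monomial basis, uniform density.
import numpy as np

rng = np.random.default_rng(0)

def u(x):                                 # unknown function to approximate (assumed)
    return np.exp(-x) * np.sin(4 * x)

n = 200                                   # number of Monte Carlo samples
x = rng.uniform(0.0, 1.0, size=n)         # sampling density: uniform on [0, 1]
w = np.ones(n) / n                        # weights of the empirical norm

d = 8                                     # model class: span of the first d monomials
V = np.vander(x, d, increasing=True)      # design matrix, V[i, j] = x_i**j

# Minimize ||u - v||_n^2 = sum_i w_i (u(x_i) - v(x_i))^2, a weighted least-squares problem.
sw = np.sqrt(w)
coef, *_ = np.linalg.lstsq(sw[:, None] * V, sw * u(x), rcond=None)

x_test = np.linspace(0, 1, 1000)
err = np.max(np.abs(u(x_test) - np.vander(x_test, d, increasing=True) @ coef))
print(f"sup-norm error of the empirical best approximation: {err:.2e}")
```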


Author(s): Junyu Zhang, Lin Xiao, Shuzhong Zhang

The cubic regularized Newton method of Nesterov and Polyak has become increasingly popular for nonconvex optimization because of its capability of finding an approximate local solution with a second-order guarantee and its low iteration complexity. Several recent works extend this method to the setting of minimizing the average of N smooth functions by replacing the exact gradients and Hessians with subsampled approximations. It is shown that the total Hessian sample complexity can be made sublinear in N per iteration by leveraging stochastic variance reduction techniques. We present an adaptive variance reduction scheme for a subsampled Newton method with cubic regularization and show that the expected Hessian sample complexity is $\mathcal{O}(N + N^{2/3}\epsilon^{-3/2})$ for finding an $(\epsilon, \sqrt{\epsilon})$-approximate local solution (in terms of first- and second-order guarantees, respectively). Moreover, we show that the same Hessian sample complexity is retained with fixed sample sizes if exact gradients are used. The techniques of our analysis differ from previous works in that we do not rely on high-probability bounds based on matrix concentration inequalities. Instead, we derive and utilize new bounds on the third and fourth order moments of the average of random matrices, which are of independent interest on their own.
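A minimal sketch of a subsampled cubic-regularized Newton iteration is given below, using exact gradients and a fixed Hessian sample size as in the fixed-sample variant mentioned above. The quadratic test functions, the sample size, and the inner solver (plain gradient descent on the cubic model) are illustrative assumptions; the paper's adaptive variance reduction scheme is not reproduced.

```python
# Sketch: one subsampled cubic-regularized Newton step for minimizing an
# average of N smooth functions. Problem instance and solver are assumptions.
import numpy as np

rng = np.random.default_rng(1)
N, d, M = 500, 5, 10.0                     # components, dimension, cubic parameter

Q = rng.normal(size=(d, d))
base = Q @ Q.T / d + np.eye(d)             # shared positive-definite part of every Hessian
noise = rng.normal(size=(N, d, d))
A = base + 0.3 * (noise + noise.transpose(0, 2, 1)) / 2   # f_i(x) = 0.5 x^T A_i x + b_i^T x
b = rng.normal(size=(N, d))

def full_grad(x):
    return (A @ x).mean(axis=0) + b.mean(axis=0)

def cubic_newton_step(x, n_hess=50, inner_iters=300, lr=0.05):
    g = full_grad(x)                       # exact gradient (fixed-sample variant)
    S = rng.choice(N, size=n_hess, replace=False)
    H = A[S].mean(axis=0)                  # subsampled Hessian estimate
    s = np.zeros(d)
    for _ in range(inner_iters):           # gradient descent on the cubic model
        # model m(s) = g^T s + 0.5 s^T H s + (M/6)||s||^3, so grad m = g + Hs + (M/2)||s||s
        s -= lr * (g + H @ s + 0.5 * M * np.linalg.norm(s) * s)
    return x + s

x = rng.normal(size=d)
for _ in range(20):
    x = cubic_newton_step(x)
print("gradient norm after 20 steps:", np.linalg.norm(full_grad(x)))
```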


2021, Vol 12 (5), pp. 1-29
Author(s): Qiong Wu, Adam Hare, Sirui Wang, Yuwei Tu, Zhenming Liu, ...

Existing topic modeling and text segmentation methodologies generally require large datasets for training, limiting their capabilities when only small collections of text are available. In this work, we reexamine the inter-related problems of “topic identification” and “text segmentation” for sparse document learning, when there is a single new text of interest. In developing a methodology to handle single documents, we face two major challenges. First is sparse information: with access to only one document, we cannot train traditional topic models or deep learning algorithms. Second is significant noise: a considerable portion of words in any single document will produce only noise and not help discern topics or segments. To tackle these issues, we design an unsupervised, computationally efficient methodology called Biclustering Approach to Topic modeling and Segmentation (BATS). BATS leverages three key ideas to simultaneously identify topics and segment text: (i) a new mechanism that uses word order information to reduce sample complexity, (ii) a statistically sound graph-based biclustering technique that identifies latent structures of words and sentences, and (iii) a collection of effective heuristics that remove noise words and award important words to further improve performance. Experiments on six datasets show that our approach outperforms several state-of-the-art baselines on topic coherence, topic diversity, segmentation, and runtime metrics.
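As an illustration of the biclustering idea in (ii), the sketch below co-clusters a toy sentence-word count matrix so that each bicluster pairs a group of sentences (a candidate segment) with a group of words (its topic). It uses scikit-learn's SpectralCoclustering as a stand-in and omits BATS's word-order mechanism and noise-word heuristics.

```python
# Sketch: graph-based biclustering of a single document's sentence-word
# matrix. SpectralCoclustering is a stand-in, not the authors' BATS pipeline.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import SpectralCoclustering

sentences = [
    "the model learns topics from word counts",
    "topic models need large training corpora",
    "stocks fell as markets reacted to rate news",
    "investors sold shares amid rate fears",
]

vec = CountVectorizer(stop_words="english")
X = vec.fit_transform(sentences)             # sentences x words count matrix
vocab = vec.get_feature_names_out()

model = SpectralCoclustering(n_clusters=2, random_state=0).fit(X)
for k in range(2):
    sent_ids = np.where(model.row_labels_ == k)[0].tolist()
    words = vocab[model.column_labels_ == k].tolist()
    print(f"bicluster {k}: sentences {sent_ids}, topic words {words}")
```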


2021, Vol 20 (8)
Author(s): Wooyeong Song, Marcin Wieśniak, Nana Liu, Marcin Pawłowski, Jinhyoung Lee, ...

Author(s): Hongchang Gao, Hanzi Xu, Slobodan Vucetic

Continuous DR-submodular maximization is an important machine learning problem which covers numerous popular applications. With the emergence of large-scale distributed data, developing efficient algorithms for continuous DR-submodular maximization, such as the decentralized Frank-Wolfe method, has become an important challenge. However, existing decentralized Frank-Wolfe methods for this kind of problem have a sample complexity of $\mathcal{O}(1/\epsilon^3)$, incurring a large computational overhead. In this paper, we propose two novel sample-efficient decentralized Frank-Wolfe methods to address this challenge. Our theoretical results demonstrate that the sample complexity of the two proposed methods is $\mathcal{O}(1/\epsilon^2)$, which is better than the $\mathcal{O}(1/\epsilon^3)$ of the existing methods. As far as we know, this is the first published result achieving such a favorable sample complexity. Extensive experimental results confirm the effectiveness of the proposed methods.
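A minimal sketch of a decentralized Frank-Wolfe (continuous greedy) update with gossip averaging over a ring of agents follows. The separable toy objective, the gossip matrix, and the step schedule are illustrative assumptions; the variance-reduced gradient estimators that yield the improved sample complexity are not reproduced.

```python
# Sketch: decentralized Frank-Wolfe over a ring network for a monotone
# DR-submodular toy objective on the box [0, 1]^d. All parameters assumed.
import numpy as np

n_agents, d, T = 8, 6, 50
rng = np.random.default_rng(2)
a = rng.uniform(0.5, 2.0, size=(n_agents, d))      # agent-local parameters

def local_grad(i, x):
    # gradient of f_i(x) = sum_j a[i, j] * (1 - exp(-x_j)), monotone DR-submodular
    return a[i] * np.exp(-x)

# doubly stochastic gossip matrix for a ring (self plus two neighbours)
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = W[i, (i + 1) % n_agents] = 0.25

x = np.zeros((n_agents, d))                        # one iterate per agent
for t in range(T):
    g = W @ np.array([local_grad(i, x[i]) for i in range(n_agents)])  # gossip gradients
    v = (g > 0).astype(float)                      # linear maximization oracle over [0, 1]^d
    x = W @ x + v / T                              # consensus step plus Frank-Wolfe update

print("consensus gap:", np.abs(x - x.mean(axis=0)).max())
```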


Author(s): Yi He, Fudong Lin, Xu Yuan, Nian-Feng Tzeng

This paper proposes a novel oversampling approach that strives to balance the class priors under a considerably imbalanced, high-dimensional data distribution. The crux of our approach lies in learning interpretable latent representations that can model the synthetic mechanism of the minority samples by using a generative adversarial network (GAN). A Bayesian regularizer is imposed to guide the GAN to extract a set of salient features that are either disentangled or intentionally entangled, with their interplay controlled by a prescribed structure defined with a human in the loop. As such, our GAN enjoys an improved sample complexity, being able to synthesize high-quality minority samples even if the sizes of the minority classes are extremely small during training. Empirical studies substantiate that our approach can empower simple classifiers to achieve superior imbalanced classification performance over state-of-the-art competitors and is robust across various imbalance settings. Code is released at github.com/fudonglin/IMSIC.
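The sketch below illustrates only the downstream use of such a model: synthesizing minority samples from a learned generator until the class priors are balanced. The Gaussian "generator" is a stub standing in for the Bayesian-regularized GAN; all names and shapes are hypothetical.

```python
# Sketch: balance class priors by sampling from a learned generative model.
# The Gaussian generator is a stub, NOT the paper's GAN; data is synthetic.
import numpy as np

rng = np.random.default_rng(3)
X_maj = rng.normal(0.0, 1.0, size=(1000, 16))       # majority class features
X_min = rng.normal(2.0, 0.5, size=(30, 16))         # extremely small minority class

# Stand-in generator: Gaussian fitted to the minority class.
mu = X_min.mean(axis=0)
cov = np.cov(X_min, rowvar=False) + 1e-6 * np.eye(16)
def generate(n):
    return rng.multivariate_normal(mu, cov, size=n)

n_needed = len(X_maj) - len(X_min)
X_syn = generate(n_needed)                           # synthetic minority samples

X = np.vstack([X_maj, X_min, X_syn])
y = np.concatenate([np.zeros(len(X_maj)), np.ones(len(X_min) + n_needed)])
print("class priors after oversampling:", np.bincount(y.astype(int)) / len(y))
```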


Author(s): Weinan Zhang, Xihuai Wang, Jian Shen, Ming Zhou

This paper investigates model-based methods in multi-agent reinforcement learning (MARL). We specify the dynamics sample complexity and the opponent sample complexity in MARL and conduct a theoretical analysis of an upper bound on the return discrepancy. To reduce this upper bound, and thereby keep sample complexity low throughout learning, we propose a novel decentralized model-based MARL method, named Adaptive Opponent-wise Rollout Policy Optimization (AORPO). In AORPO, each agent builds its own multi-agent environment model, consisting of a dynamics model and multiple opponent models, and trains its policy with adaptive opponent-wise rollouts. We further prove the theoretical convergence of AORPO under reasonable assumptions. Experiments on competitive and cooperative tasks demonstrate that AORPO achieves improved sample efficiency with asymptotic performance comparable to that of the compared MARL methods.
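A minimal sketch of the rollout structure described above follows: the learning agent queries its learned opponent models for their actions and its learned dynamics model for the next state, producing simulated transitions for policy training. All models are random stubs, and AORPO's adaptive rollout-length scheduling is omitted.

```python
# Sketch: model-based rollout with a dynamics model and opponent models.
# Every model here is a stub; shapes and reward are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
d_state, n_opponents = 4, 2

def policy(state):                                 # learning agent's policy (stub)
    return rng.normal(size=2)

opponent_models = [lambda s, i=i: rng.normal(loc=i, size=2)   # learned opponent policies (stubs)
                   for i in range(n_opponents)]

def dynamics_model(state, joint_action):          # learned dynamics model (stub)
    return state + 0.1 * joint_action[:d_state], float(-np.sum(state ** 2))

def rollout(state, horizon):
    transitions = []
    for _ in range(horizon):
        joint = [policy(state)] + [m(state) for m in opponent_models]
        next_state, reward = dynamics_model(state, np.concatenate(joint))
        transitions.append((state, joint, reward, next_state))
        state = next_state
    return transitions

sim_data = rollout(rng.normal(size=d_state), horizon=5)
print(f"generated {len(sim_data)} simulated transitions for policy training")
```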


Author(s): Mathieu Seurin, Florian Strub, Philippe Preux, Olivier Pietquin

Sparse rewards are double-edged training signals in reinforcement learning: easy to design but hard to optimize. Intrinsic motivation methods have thus been developed to alleviate the resulting exploration problem. They usually incentivize agents to look for new states through novelty signals. Yet such methods encourage exhaustive exploration of the state space rather than focusing on the environment's salient interaction opportunities. We propose a new exploration method, called Don't Do What Doesn't Matter (DoWhaM), that shifts the emphasis from state novelty to states with relevant actions. While most actions consistently change the state when used, e.g., moving the agent, some actions are only effective in specific states, e.g., opening a door or grabbing an object. DoWhaM detects and rewards actions that seldom affect the environment. We evaluate DoWhaM on the procedurally-generated environment MiniGrid against state-of-the-art methods. Experiments consistently show that DoWhaM greatly reduces sample complexity, establishing a new state of the art in MiniGrid.
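A simplified sketch of the underlying bookkeeping follows: track, per action, how often it is used and how often it actually changes the state, and pay an intrinsic bonus when a rarely-effective action succeeds. The exact bonus shape and the episodic normalization from the paper are not reproduced.

```python
# Simplified sketch of the DoWhaM idea: reward rare-but-effective actions.
# The rarity score below is an illustrative stand-in for the paper's bonus.
from collections import defaultdict

used = defaultdict(int)        # times each action was taken
effective = defaultdict(int)   # times it changed the state

def intrinsic_bonus(action, state, next_state):
    used[action] += 1
    if next_state == state:                  # action had no effect here
        return 0.0
    effective[action] += 1
    return 1.0 - effective[action] / used[action]   # rarely effective => larger bonus

# toy trace: "move" always works, "open" works only in front of a door
trace = [("move", 0, 1), ("move", 1, 2), ("open", 2, 2), ("open", 2, 3)]
for a, s, s2 in trace:
    print(a, round(intrinsic_bonus(a, s, s2), 2))
```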

