Asymptotic Performance
Recently Published Documents


TOTAL DOCUMENTS: 386 (FIVE YEARS: 47)

H-INDEX: 29 (FIVE YEARS: 3)

2021 ◽  
pp. 1-36
Author(s):  
Joris Pinkse ◽  
Karl Schurter

We estimate the density and its derivatives using a local polynomial approximation to the logarithm of an unknown density function f. The estimator is guaranteed to be non-negative and achieves the same optimal rate of convergence in the interior as on the boundary of the support of f. The estimator is therefore well suited to applications in which non-negative density estimates are required, such as semiparametric maximum likelihood estimation. In addition, we show that our estimator compares favorably with other kernel-based methods in terms of both asymptotic performance and computational ease. Simulation results confirm that in finite samples our method performs comparably to, or better than, these alternative methods when they are used with optimal inputs, that is, an Epanechnikov kernel and an optimally chosen bandwidth sequence. We provide code in several languages.
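As a hedged illustration of the approach the abstract describes (a local polynomial fit to log f, in the spirit of local likelihood density estimation, cf. Loader 1996), the Python sketch below estimates f at a single point. The function names, optimizer, and defaults are our own illustrative choices, not the authors' published code; note that degree 0 reduces to the ordinary kernel density estimate.

```python
import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize

def local_log_density(x0, data, h, degree=2):
    """Estimate f(x0) by fitting a degree-`degree` polynomial to log f near x0
    via local likelihood with an Epanechnikov kernel (illustrative sketch)."""
    def kern(u):  # Epanechnikov kernel, supported on [-1, 1]
        return np.where(np.abs(u) <= 1, 0.75 * (1 - u**2), 0.0)

    n = len(data)
    w = kern((data - x0) / h)  # kernel weight for each observation

    def neg_local_loglik(a):
        # P(u) = a[0] + a[1]*u + ... locally approximates log f(x0 + u).
        poly = np.polynomial.Polynomial(a)
        fit_term = np.sum(w * poly(data - x0))
        # Normalization term: n * integral of K(u/h) * exp(P(u)) du over [-h, h].
        norm, _ = quad(lambda u: kern(u / h) * np.exp(poly(u)), -h, h)
        return -(fit_term - n * norm)

    res = minimize(neg_local_loglik, np.zeros(degree + 1), method="Nelder-Mead")
    return np.exp(res.x[0])  # exp(P(0)): non-negative by construction

# Toy check: the N(0, 1) density at 0 is about 0.3989.
rng = np.random.default_rng(1)
print(local_log_density(0.0, rng.normal(size=500), h=0.6))
```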


2021 ◽  
Author(s):  
Xiangshuai Zeng ◽  
Laurenz Wiskott ◽  
Sen Cheng

Episodic memory has been studied extensively in the past few decades, but little is so far understood about how it is used to affect behavior. Here we postulate three learning paradigms: one-shot learning, replay learning, and online learning. In the first two paradigms, episodic memory is retrieved for decision-making or replayed to the neocortex for extracting semantic knowledge, respectively. In the third paradigm, the neocortex extracts information directly from online experiences as they occur, but has no access to these experiences afterwards. Using visually driven reinforcement learning in simulations, we found that the three paradigms differ in how an agent's ability to solve a task depends on the number of learning trials and the complexity of the task. Episodic memory can, but does not always, provide a major benefit for spatial learning, and its effect differs between the two modes of accessing episodic information: one-shot learning is initially faster than replay learning, but the latter reaches a better asymptotic performance. We believe that understanding how episodic memory drives behavior will be an important step towards elucidating the nature of episodic memory.
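For concreteness, here is a toy sketch (ours, not the authors' simulation code) of how the three paradigms could be instantiated for a tabular agent: online learning updates values from each transition and then discards it, replay learning re-applies stored transitions to the same slow weights, and one-shot learning acts directly on retrieved episodes.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, ACTIONS = 0.1, 0.99, (0, 1)

def online_update(Q, s, a, r, s2):
    # Online learning: learn from the transition as it occurs, then discard it.
    target = r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])

def replay_learning(Q, episodic_buffer, n_updates=100):
    # Replay learning: stored episodes are replayed to slow (neocortical) weights.
    for s, a, r, s2 in random.choices(episodic_buffer, k=n_updates):
        online_update(Q, s, a, r, s2)

def one_shot_policy(episodic_buffer, s):
    # One-shot learning: retrieve the best-rewarded stored experience for the
    # current state and act on it directly, with no gradual consolidation.
    matches = [(r, a) for (s0, a, r, _) in episodic_buffer if s0 == s]
    return max(matches)[1] if matches else random.choice(ACTIONS)

# Toy usage with a two-transition episodic buffer.
Q = defaultdict(float)
buffer = [(0, 1, 1.0, 1), (0, 0, 0.0, 1)]
replay_learning(Q, buffer, n_updates=10)
print(one_shot_policy(buffer, 0))  # -> 1, the higher-reward action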


Mathematics ◽  
2021 ◽  
Vol 9 (23) ◽  
pp. 3057
Author(s):  
Mohammad Arashi ◽  
Mina Norouzirad ◽  
Mahdi Roozbeh ◽  
Naushad Mamode Khan

The ridge regression estimator is a commonly used procedure for dealing with multicollinear data. This paper proposes an alternative estimation procedure for high-dimensional multicollinear data. The proposal yields a continuum of estimates that includes the ridge estimator as a particular case. We study its asymptotic performance for growing dimension, i.e., p→∞ with n fixed. Under some mild regularity conditions, we prove the proposed estimator's consistency and derive its asymptotic properties. Monte Carlo simulation experiments are conducted to assess its performance, and an implementation is used to analyze a high-dimensional genetic dataset.
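Since the proposed continuum of estimators is said to include ridge regression as a particular case, that particular case is easy to sketch. The code below is a minimal illustration of ordinary ridge in a high-dimensional, collinear design (p > n), not the authors' proposed estimator; the data-generating choices are ours.

```python
import numpy as np

def ridge(X, y, lam):
    """Ridge estimate (X'X + lam*I)^{-1} X'y. lam = 0 recovers least squares
    when X'X is invertible; for lam > 0 the system stays solvable even when
    p > n or the columns of X are nearly collinear."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

# Toy example: p > n with two nearly identical columns.
rng = np.random.default_rng(0)
n, p = 30, 100
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 1e-3 * rng.normal(size=n)  # induce multicollinearity
beta = np.zeros(p); beta[:3] = 1.0
y = X @ beta + 0.1 * rng.normal(size=n)
print(ridge(X, y, lam=1.0)[:5])  # shrunken estimates of the first coefficients
```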


2021 ◽  
Vol. 23, no. 3 (Discrete Algorithms) ◽ 
Author(s):  
Yoshiharu Kohayakawa ◽  
Flávio Keidi Miyazawa ◽  
Yoshiko Wakabayashi

In the $d$-dimensional hypercube bin packing problem, a given list of $d$-dimensional hypercubes must be packed into the smallest number of hypercube bins. Epstein and van Stee [SIAM J. Comput. 35 (2005)] showed that the asymptotic performance ratio $\rho$ of the online bounded space variant is $\Omega(\log d)$ and $O(d/\log d)$, and conjectured that it is $\Theta(\log d)$. We show that $\rho$ is in fact $\Theta(d/\log d)$, using probabilistic arguments.
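For readers outside the bin packing literature, the quantity being bounded is the standard asymptotic performance ratio of an online algorithm $A$; the display below is our paraphrase of the usual definition, not taken from the paper:

$$\rho_A \;=\; \limsup_{n \to \infty} \; \sup_{L \,:\, \mathrm{OPT}(L) = n} \frac{A(L)}{\mathrm{OPT}(L)},$$

where $A(L)$ is the number of bins $A$ uses on the list $L$ and $\mathrm{OPT}(L)$ is the offline optimum. The result above pins down how $\rho$ grows with the dimension $d$ for the bounded space variant.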


Author(s):  
Yi Zhang ◽  
Zhengzheng Xiang ◽  
Lei Lu ◽  
Shuai Han ◽  
Weixiao Meng

Entropy ◽  
2021 ◽  
Vol 23 (8) ◽  
pp. 1043
Author(s):  
Zijian Gao ◽  
Kele Xu ◽  
Bo Ding ◽  
Huaimin Wang

Recently, deep reinforcement learning (RL) algorithms have achieved significant progress in the multi-agent domain. However, training for increasingly complex tasks is time-consuming and resource-intensive. To alleviate this problem, it is essential to leverage historical experience efficiently, a goal previous studies have left under-explored: owing to their complicated designs, most existing methods fail to achieve it in a continuously dynamic system. In this paper, we propose a knowledge-reuse method called "KnowRU", which can be easily deployed in the majority of multi-agent reinforcement learning (MARL) algorithms without complicated hand-coded design. We employ the knowledge distillation paradigm to transfer knowledge among agents, shortening the training phase for new tasks while improving the asymptotic performance of agents. To empirically demonstrate the robustness and effectiveness of KnowRU, we perform extensive experiments with state-of-the-art MARL algorithms in collaborative and competitive scenarios. The results show that KnowRU outperforms recently reported methods: it not only accelerates the training phase but also improves training performance, underlining the importance of knowledge reuse for MARL.
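The core mechanism named in the abstract, knowledge distillation between agents, is commonly implemented as an auxiliary loss. The sketch below uses the standard temperature-scaled KL formulation; the weight, temperature, and function names are our assumptions, not KnowRU's exact objective.

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature-scaled softmax over the last axis (actions).
    z = z / T
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, T=2.0):
    """KL(teacher || student) over action distributions: the standard
    distillation objective. KnowRU's exact formulation may differ."""
    p_t = softmax(teacher_logits, T)
    p_s = softmax(student_logits, T)
    return np.sum(p_t * (np.log(p_t + 1e-8) - np.log(p_s + 1e-8)), axis=-1).mean()

def total_loss(rl_loss, student_logits, teacher_logits, beta=0.5):
    # Combined objective: the ordinary MARL loss plus the knowledge-transfer term.
    return rl_loss + beta * distill_loss(student_logits, teacher_logits)

# Toy usage: a single state with three actions.
s = np.array([[2.0, 0.5, -1.0]])
t = np.array([[1.5, 1.0, -0.5]])
print(total_loss(rl_loss=0.3, student_logits=s, teacher_logits=t))
```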


Author(s):  
Weinan Zhang ◽  
Xihuai Wang ◽  
Jian Shen ◽  
Ming Zhou

This paper investigates model-based methods in multi-agent reinforcement learning (MARL). We specify the dynamics sample complexity and the opponent sample complexity in MARL, and conduct a theoretical analysis of an upper bound on the return discrepancy. To reduce this upper bound, and thereby keep sample complexity low throughout learning, we propose a novel decentralized model-based MARL method, named Adaptive Opponent-wise Rollout Policy Optimization (AORPO). In AORPO, each agent builds its own multi-agent environment model, consisting of a dynamics model and multiple opponent models, and trains its policy with adaptive opponent-wise rollouts. We further prove the theoretical convergence of AORPO under reasonable assumptions. Empirical experiments on competitive and cooperative tasks demonstrate that AORPO achieves improved sample efficiency while matching the asymptotic performance of the compared MARL methods.
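A minimal sketch of the rollout structure the abstract describes: one agent's learned dynamics model plus per-opponent models generate simulated experience for policy training. The interfaces and the fixed horizon are our simplifications (AORPO adapts each opponent's rollout length to its model error), not the published algorithm.

```python
def model_rollout(policy, dynamics_model, opponent_models, s, horizon):
    """Generate one simulated trajectory from the agent's multi-agent
    environment model (illustrative interfaces, not AORPO's code)."""
    trajectory = []
    for _ in range(horizon):
        a_self = policy(s)
        # Opponent-wise rollout: each opponent j is simulated with its own
        # learned model; a fixed horizon stands in for AORPO's adaptive one.
        a_opps = [m(s) for m in opponent_models]
        s_next, r = dynamics_model(s, a_self, a_opps)
        trajectory.append((s, a_self, a_opps, r, s_next))
        s = s_next
    return trajectory

# Toy usage with stub callables standing in for learned models.
traj = model_rollout(
    policy=lambda s: 0,
    dynamics_model=lambda s, a, opps: (s + 1, 1.0),
    opponent_models=[lambda s: 1],
    s=0, horizon=3,
)
print(traj)
```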

