Sampling Distributions

2021 ◽  
pp. 81-104
Author(s):  
Alan Agresti ◽  
Maria Kateri
2012 ◽  
Vol 44 (2) ◽  
pp. 391-407 ◽  
Author(s):  
Anand Bhaskar ◽  
Yun S. Song

Obtaining a closed-form sampling distribution for the coalescent with recombination is a challenging problem. In the case of two loci, a new framework based on an asymptotic series has recently been developed to derive closed-form results when the recombination rate is moderate to large. In this paper, an arbitrary number of loci is considered and combinatorial approaches are employed to find closed-form expressions for the first couple of terms in an asymptotic expansion of the multi-locus sampling distribution. These expressions are universal in the sense that their functional form in terms of the marginal one-locus distributions applies to all finite- and infinite-alleles models of mutation.


Author(s):  
Joan B. Garfield ◽  
Dani Ben-Zvi ◽  
Beth Chance ◽  
Elsa Medina ◽  
Cary Roseth ◽  
...  

Author(s):  
Yangchen Pan ◽  
Hengshuai Yao ◽  
Amir-massoud Farahmand ◽  
Martha White

Dyna is an architecture for model based reinforcement learning (RL), where simulated experience from a model is used to update policies or value functions. A key component of Dyna is search control, the mechanism to generate the state and action from which the agent queries the model, which remains largely unexplored. In this work, we propose to generate such states by using the trajectory obtained from Hill Climbing (HC) the current estimate of the value function. This has the effect of propagating value from high value regions and of preemptively updating value estimates of the regions that the agent is likely to visit next. We derive a noisy projected natural gradient algorithm for hill climbing, and highlight a connection to Langevin dynamics. We provide an empirical demonstration on four classical domains that our algorithm, HC Dyna, can obtain significant sample efficiency improvements. We study the properties of different sampling distributions for search control, and find that there appears to be a benefit specifically from using the samples generated by climbing on current value estimates from low value to high value region.


Sign in / Sign up

Export Citation Format

Share Document