partially observable
Recently Published Documents

Total documents: 907 (five years: 281)
H-index: 39 (five years: 4)

2022 ◽  
Vol 418 ◽  
pp. 126838
Author(s):  
Fengqiang Gao ◽  
Yuyue Yan ◽  
Zhihao Chen ◽  
Linxiao Zheng ◽  
Huan Ren

Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 636
Author(s):  
Lingli Yu ◽  
Shuxin Huo ◽  
Keyi Li ◽  
Yadong Wei

An intelligent land vehicle uses onboard sensors to acquire observed states at a disorderly intersection. However, sensor noise makes the environment only partially observable, which easily leads to decision failures. A collision relationship-based driving behavior decision-making method via a deep recurrent Q network (CR-DRQN) is proposed for intelligent land vehicles. First, the collision relationship between the intelligent land vehicle and surrounding vehicles is designed as the input; it is extracted from the observed states, which contain sensor noise. This avoids a dimension explosion in CR-DRQN and speeds up network training. Then, DRQN is used to attenuate the impact of the input noise and to make the driving behavior decisions. Finally, comparative experiments are conducted to verify the effectiveness of the proposed method. CR-DRQN maintains a high decision success rate at a disorderly intersection with partially observable states. In addition, the proposed method stands out in terms of safety, collision risk prediction, and comfort.


2022 ◽  
Vol 73 ◽  
pp. 277-327
Author(s):  
Samer Nashed ◽  
Shlomo Zilberstein

Opponent modeling is the ability to use prior knowledge and observations in order to predict the behavior of an opponent. This survey presents a comprehensive overview of existing opponent modeling techniques for adversarial domains, many of which must address stochastic, continuous, or concurrent actions, and sparse, partially observable payoff structures. We discuss all the components of opponent modeling systems, including feature extraction, learning algorithms, and strategy abstractions. These discussions lead us to propose a new form of analysis for describing and predicting the evolution of game states over time. We then introduce a new framework that facilitates method comparison, analyze a representative selection of techniques using the proposed framework, and highlight common trends among recently proposed methods. Finally, we list several open problems and discuss future research directions inspired by AI research on opponent modeling and related research in other disciplines.


Mathematics ◽  
2022 ◽  
Vol 10 (2) ◽  
pp. 184
Author(s):  
Andrey Borisov ◽  
Alexey Bosov ◽  
Gregory Miller

The paper presents an optimal control problem for a partially observable stochastic differential system driven by an external Markov jump process. The available controlled observations are indirect and corrupted by Wiener noise. The goal is to optimize a linear function of the state (output) given a general quadratic criterion. The separation principle, verified for the system at hand, allows the control problem to be examined separately from the filter optimization. The solution to the latter problem is provided by the Wonham filter. The solution to the former control problem is obtained by formulating an equivalent control problem with a linear drift/nonlinear diffusion stochastic process and with complete information. This problem, in turn, is solved directly by the dynamic programming method. The applicability of the obtained theoretical results is illustrated by a numerical example, where an optimal amplification/stabilization problem is solved for an unstable externally controlled step-wise mechanical actuator.


Mathematics ◽  
2022 ◽  
Vol 10 (1) ◽  
pp. 157
Author(s):  
Zehra Eksi ◽  
Daniel Schreitl

The Bitcoin market exhibits characteristics of a market with pricing bubbles. The price is very volatile, and it carries the risk of quickly increasing to a peak and decreasing from the peak even faster. In this context, it is vital for investors to close their long positions optimally. In this study, we investigate the performance of the partially observable digital-drift model of Ekström and Lindberg and the corresponding optimal exit strategy on a Bitcoin trade. To estimate the unknown intensity of the random drift change time, we refer to Bitcoin halving events, which are considered pivotal events that push the price up. The out-of-sample performance analysis of the model yields return values ranging between 9% and 1153%. We conclude that the return of the initiated Bitcoin momentum trades depends heavily on the entry date: the earlier we entered, the higher the expected return at the optimal exit time suggested by the model. Overall, to the extent of our analysis, the model provides a supporting framework for exit decisions, but it is far from being the ultimate tool to succeed in every trade.


Author(s):  
Hanna Kurniawati

Planning under uncertainty is critical to robotics. The partially observable Markov decision process (POMDP) is a mathematical framework for such planning problems. POMDPs are powerful because of their careful quantification of the nondeterministic effects of actions and the partial observability of the states. But for the same reason, they are notorious for their high computational complexity and have been deemed impractical for robotics. However, over the past two decades, the development of sampling-based approximate solvers has led to tremendous advances in POMDP-solving capabilities. Although these solvers do not generate the optimal solution, they can compute good POMDP solutions that significantly improve the robustness of robotics systems within reasonable computational resources, thereby making POMDPs practical for many realistic robotics problems. This article presents a review of POMDPs, emphasizing computational issues that have hindered their practicality in robotics and ideas in sampling-based solvers that have alleviated such difficulties, together with lessons learned from applying POMDPs to physical robots. Expected final online publication date for the Annual Review of Control, Robotics, and Autonomous Systems, Volume 5 is May 2022. Please see http://www.annualreviews.org/page/journal/pubdates for revised estimates.
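As an illustration of the partial observability that the POMDP abstract above refers to, the following is a minimal sketch of a discrete Bayesian belief update, the step a planner typically uses to track a distribution over hidden states. It is not taken from the reviewed article. The array layout (`T[a, s, s']` for transitions, `Z[a, s', o]` for observations), the function name `belief_update`, and the two-state example values are all assumptions made for illustration.

```python
import numpy as np


def belief_update(belief, action, observation, T, Z):
    """One Bayes-filter step over hidden states.

    Assumed (hypothetical) layout:
      T[a, s, s2] = P(next state s2 | state s, action a)
      Z[a, s2, o] = P(observation o | next state s2, action a)
    Computes b'(s2) proportional to Z[a, s2, o] * sum_s T[a, s, s2] * b(s).
    """
    predicted = belief @ T[action]                      # prediction step
    unnormalized = Z[action, :, observation] * predicted  # correction step
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return unnormalized / total


# Illustrative two-state, one-action, two-observation model (made-up numbers).
T = np.array([[[1.0, 0.0],
               [0.0, 1.0]]])        # the single action leaves the state unchanged
Z = np.array([[[0.85, 0.15],
               [0.15, 0.85]]])      # the sensor reports the true state 85% of the time

b0 = np.array([0.5, 0.5])           # uniform prior over the hidden state
b1 = belief_update(b0, action=0, observation=0, T=T, Z=Z)
b2 = belief_update(b1, action=0, observation=0, T=T, Z=Z)
```

Under these assumed numbers, two consistent observations move the belief progressively toward state 0. Sampling-based solvers of the kind the abstract mentions generally avoid enumerating this update over all beliefs; the sketch only shows the exact update for a small discrete model.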


Entropy ◽  
2021 ◽  
Vol 24 (1) ◽  
pp. 59
Author(s):  
Baihan Lin

Inspired by the adaptation phenomenon of neuronal firing, we propose regularity normalization (RN) as an unsupervised attention mechanism (UAM) that computes the statistical regularity in the implicit space of neural networks under the Minimum Description Length (MDL) principle. Treating the neural network optimization process as a partially observable model selection problem, regularity normalization constrains the implicit space by a normalization factor, the universal code length. We compute this universal code incrementally across neural network layers and demonstrate the flexibility to include data priors such as top-down attention and other oracle information. Empirically, our approach outperforms existing normalization methods in tackling limited, imbalanced, and non-stationary input distributions in image classification, classic control, procedurally generated reinforcement learning, generative modeling, handwriting generation, and question answering tasks with various neural network architectures. Lastly, the unsupervised attention mechanism is a useful probing tool for neural networks, tracking the dependency and critical learning stages across layers and recurrent time steps of deep networks.


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Tomoaki Kimura ◽  
Kodai Shiba ◽  
Chih-Chieh Chen ◽  
Masaru Sogabe ◽  
Katsuyoshi Sakamoto ◽  
...  

Variational quantum circuits have been proposed for applications in supervised learning and reinforcement learning to harness potential quantum advantage. However, many practical applications in robotics and time-series analysis take place in partially observable environments. In this work, we propose an algorithm based on variational quantum circuits for reinforcement learning in partially observable environments. Simulations suggest a learning advantage over several classical counterparts. The learned parameters are then tested on IBMQ systems to demonstrate the applicability of our approach for real-machine-based predictions.

