Q function
Recently Published Documents

TOTAL DOCUMENTS: 241 (FIVE YEARS: 64)
H-INDEX: 28 (FIVE YEARS: 3)

Automatica ◽ 2022 ◽ Vol 137 ◽ pp. 110060
Author(s): Milad Farjadnasab ◽ Maryam Babazadeh

2021 ◽ Vol 5 (2) ◽ pp. 63-68
Author(s): Irvan Pramana ◽ Asrizal Deri Futra

Using the Q-learning algorithm allows a robot to improve its behavior without having to update its rules from outside, because the method is off-policy (it can follow any policy while still arriving at an optimal solution). In operation, the robot learns from the line track it traverses, assigning a value to the action taken in each detected state. The goal of this research is to make the robot move according to the highest Q-function value produced by the Q-learning algorithm. In tests of the Q-learning algorithm on a line-follower robot, the success rates obtained were 100% in the first trial, 66.67% in the second, 100% in the third, 66.67% in the fourth, and 100% in the fifth, giving an average success rate of 86.67%.
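As background, a minimal sketch of the tabular, off-policy Q-learning update this abstract relies on. The state/action sizes and hyper-parameter values are illustrative assumptions, not taken from the paper:

```python
import numpy as np

# Hypothetical sizes for a line-follower: states encode the line sensors,
# actions are e.g. forward / turn left / turn right.
N_STATES, N_ACTIONS = 8, 3
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1   # learning rate, discount, exploration

Q = np.zeros((N_STATES, N_ACTIONS))     # tabular Q-function

def choose_action(state, rng=np.random.default_rng()):
    """Epsilon-greedy: mostly the highest-Q action, occasionally explore."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(Q[state]))

def q_update(state, action, reward, next_state):
    """Off-policy Q-learning update: bootstrap from the best next action."""
    td_target = reward + GAMMA * np.max(Q[next_state])
    Q[state, action] += ALPHA * (td_target - Q[state, action])
```

The robot then acts greedily with respect to the learned table, i.e. it follows the highest Q-function value in each detected state, as described in the abstract.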


2021 ◽ Vol 11 (23) ◽ pp. 11162
Author(s): Bonwoo Gu ◽ Yunsick Sung

A Deep Q-Network (DQN) controls a virtual agent at the level of a human player using only screenshots as inputs. The replay memory selects a limited number of experience replays according to an arbitrary batch size and updates them using the associated Q-function. Hence, relatively few experience replays of different states are utilized when the number of states is fixed and the states of the randomly selected transitions become identical or similar. The DQN may therefore not be applicable in environments where learning requires more experience replays than the limited batch size allows. In addition, because it is unknown whether each action can be executed, the amount of repetitive learning grows as more non-executable actions are selected. In this study, an enhanced DQN framework is proposed to resolve the batch-size problem and to reduce the learning time of a DQN in an environment with numerous non-executable actions. In the proposed framework, non-executable actions are filtered out to reduce the number of selectable actions and identify the optimal action for the current state. The proposed method was validated in Gomoku, a strategy board game in which applying a traditional DQN would be difficult.
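A minimal sketch of the action-filtering idea described here, applied to greedy action selection over a Q-network's outputs. The network interface and the Gomoku-style validity mask are illustrative assumptions, not the paper's implementation:

```python
import torch

def select_action(q_network, state, valid_mask):
    """Pick the highest-Q action among executable actions only.

    state:      tensor of shape (1, C, H, W), e.g. an encoded board screenshot
    valid_mask: bool tensor of shape (num_actions,), True for executable moves
    """
    with torch.no_grad():
        q_values = q_network(state).squeeze(0)          # (num_actions,)
    # Filter non-executable actions by sending their Q-values to -inf,
    # so they can never be selected and never trigger repeated re-learning.
    masked_q = q_values.masked_fill(~valid_mask, float("-inf"))
    return int(torch.argmax(masked_q).item())
```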


2021 ◽ Vol 2021 ◽ pp. 1-9
Author(s): Yanyan Zhang ◽ Ghulam Farid ◽ Zabidin Salleh ◽ Ayyaz Ahmad

The aim of this paper is to unify the extended Mittag-Leffler function and the generalized Q function by defining a unified Mittag-Leffler function. Both the extended Mittag-Leffler function and the generalized Q function can be recovered from the unified Mittag-Leffler function. The Laplace, Euler beta, and Whittaker transformations are applied to this function, and generalized formulas are obtained. These formulas reproduce the integral transformations of various deduced Mittag-Leffler functions and of the Q function. The convergence of the unified Mittag-Leffler function is also proved, and an associated fractional integral operator is constructed.
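For reference, the classical two-parameter Mittag-Leffler function that the extended and unified variants generalize has the standard definition below; this is background only, not the paper's unified definition:

```latex
% Classical two-parameter Mittag-Leffler function (standard background;
% the paper's unified function specializes to extensions of this case).
E_{\alpha,\beta}(z) \;=\; \sum_{k=0}^{\infty} \frac{z^{k}}{\Gamma(\alpha k + \beta)},
\qquad \alpha,\beta \in \mathbb{C},\ \Re(\alpha) > 0 .
```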


Author(s): Alessandro Soranzo ◽ Francesca Vatta ◽ Massimiliano Comisso ◽ Giulia Buttazzoni ◽ Fulvio Babich

2021 ◽ pp. 2150188
Author(s): N. H. Abd El-Wahab ◽ R. A. Zait

We consider a four-level double V-type atom with two closely separated upper levels and two closely separated lower levels interacting with a single-mode field via multi-photon processes in the presence of a Kerr medium. We show that this atomic system possesses a supersymmetric structure and construct its supersymmetric generators. We diagonalize the Hamiltonian of the system using a supersymmetric unitary transformation and obtain the corresponding eigenstates and eigenvalues. The atom-field wave functions are obtained for two different initial conditions of the atom and the field mode. The evolution of both the quasi-probability distribution Q-function and the Mandel Q-parameter of the field is studied when the input field is in a coherent state. The influence of the Kerr medium and the detuning parameters on these quantum effects is analyzed. The results show that they play a prominent role in the Poissonian statistics of the field, and that the Kerr medium changes the behavior of the quasi-probability distribution Q-function. We end with a discussion and conclusions.
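For reference, the two quantities tracked in this abstract have the following standard definitions (background only, not the paper's derivation):

```latex
% Husimi quasi-probability Q-function of the field state \rho,
% evaluated on coherent states |\alpha\rangle:
Q(\alpha) \;=\; \frac{1}{\pi}\,\langle \alpha | \rho | \alpha \rangle .

% Mandel Q-parameter (Q_M < 0: sub-Poissonian, Q_M = 0: Poissonian,
% Q_M > 0: super-Poissonian photon statistics):
Q_{M} \;=\; \frac{\langle \hat{n}^{2}\rangle - \langle \hat{n}\rangle^{2}
            - \langle \hat{n}\rangle}{\langle \hat{n}\rangle} .
```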


Author(s): Yue Guan ◽ Qifan Zhang ◽ Panagiotis Tsiotras

We explore the use of policy approximations to reduce the computational cost of learning Nash equilibria in zero-sum stochastic games. We propose a new Q-learning-type algorithm that uses a sequence of entropy-regularized soft policies to approximate the Nash policy during the Q-function updates. We prove that, under certain conditions, by updating the entropy regularization the algorithm converges to a Nash equilibrium. We also demonstrate the proposed algorithm's ability to transfer previous training experiences, enabling the agents to adapt quickly to new environments. We provide a dynamic hyper-parameter scheduling scheme to further expedite convergence. Empirical results on a number of stochastic games verify that the proposed algorithm converges to the Nash equilibrium while exhibiting a major speed-up over existing algorithms.
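A minimal sketch of the entropy-regularized ("soft") policy that such Q-function updates typically use, shown for one player's Q-values at a single state. The temperature value and array shapes are illustrative assumptions, not the paper's algorithm:

```python
import numpy as np

def soft_policy(q_values, temperature=1.0):
    """Entropy-regularized policy: softmax over Q-values.

    As temperature -> 0 this approaches the greedy (unregularized) policy;
    larger temperatures give smoother, more exploratory policies.
    """
    z = q_values / temperature
    z = z - z.max()                  # numerical stability
    p = np.exp(z)
    return p / p.sum()

def soft_value(q_values, temperature=1.0):
    """Soft (log-sum-exp) state value consistent with the soft policy."""
    z = q_values / temperature
    return temperature * (np.log(np.sum(np.exp(z - z.max()))) + z.max())

# Example: Q-values for one player's three actions at some state.
q = np.array([1.0, 0.5, -0.2])
print(soft_policy(q, temperature=0.5))
print(soft_value(q, temperature=0.5))
```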

