action space
Recently Published Documents

TOTAL DOCUMENTS: 309 (five years: 119)
H-INDEX: 20 (five years: 6)

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Baolai Wang ◽  
Shengang Li ◽  
Xianzhong Gao ◽  
Tao Xie

With the development of unmanned aerial vehicle (UAV) technology, UAV swarm confrontation has attracted many researchers’ attention. However, the situation faced by a UAV swarm involves substantial uncertainty and dynamic variability. The state space and action space grow exponentially with the number of UAVs, so autonomous decision-making becomes a difficult problem in the confrontation environment. In this paper, a multiagent reinforcement learning method with macro actions and human expertise is proposed for autonomous decision-making of UAVs. In the proposed approach, the UAV swarm is modeled as a large multiagent system (MAS) with each UAV as an agent, and the sequential decision-making problem in swarm confrontation is modeled as a Markov decision process. Agents are trained on macro actions, which effectively mitigates the sparse and delayed rewards and the large state and action spaces. The key to the success of this method is the design of macro actions that allow the high-level policy to find a near-optimal solution; we further leverage human expertise to design a set of good macro actions. Extensive empirical experiments in our constructed swarm confrontation environment show that our method outperforms the other algorithms.
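The macro-action idea above can be sketched as a high-level policy that learns over a small set of expert-designed macro actions rather than primitive controls. The macro-action names and the tabular Q-learning update below are illustrative assumptions, not the paper's actual action set or training algorithm:

```python
import random

# Hypothetical macro actions distilled from human expertise; the paper's
# actual macro-action set is not specified here.
MACRO_ACTIONS = {
    "pursue":  ["accelerate", "turn_toward_target", "fire"],
    "evade":   ["turn_away", "accelerate", "climb"],
    "regroup": ["turn_toward_leader", "decelerate"],
}

class MacroActionAgent:
    """Tabular Q-learning over macro actions (illustrative sketch only).

    The high-level policy picks one macro action per decision step; the
    environment then executes its primitive sequence, which shortens the
    effective horizon and mitigates sparse, delayed rewards.
    """
    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.95):
        self.q = {}  # (state, macro) -> value estimate
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma

    def act(self, state):
        # Epsilon-greedy selection over the macro-action set.
        if random.random() < self.epsilon:
            return random.choice(list(MACRO_ACTIONS))
        return max(MACRO_ACTIONS, key=lambda m: self.q.get((state, m), 0.0))

    def update(self, state, macro, reward, next_state):
        # One-step Q-learning backup at the macro-action timescale.
        best_next = max(self.q.get((next_state, m), 0.0) for m in MACRO_ACTIONS)
        key = (state, macro)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.alpha * (reward + self.gamma * best_next - old)
```

Because each macro action spans several primitive steps, the learner receives reward feedback over far fewer decisions, which is the mechanism the abstract credits for overcoming sparse rewards.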


2021 ◽  
Author(s):  
Laha Ale ◽  
Scott King ◽  
Ning Zhang ◽  
Abdul Sattar ◽  
Janahan Skandaraniyam

<div>Mobile Edge Computing (MEC) has been regarded as a promising paradigm to reduce service latency for data processing in the Internet of Things by provisioning computing resources at the network edge. In this work, we jointly optimize task partitioning and computational power allocation for computation offloading in a dynamic environment with multiple IoT devices and multiple edge servers. We formulate the problem as a Markov decision process with a constrained hybrid action space, which cannot be handled well by existing deep reinforcement learning (DRL) algorithms. We therefore develop a novel DRL algorithm, Dirichlet Deep Deterministic Policy Gradient (D3PG), built on Deep Deterministic Policy Gradient (DDPG), to solve the problem.</div><div>The developed model can learn to solve multi-objective optimization: maximizing the number of tasks processed before expiration while minimizing energy cost and service latency. More importantly, D3PG can effectively handle a constrained distribution-continuous hybrid action space, where the distribution variables govern task partitioning and offloading and the continuous variables control computational frequency. Moreover, D3PG can address many similar problems in MEC and in general reinforcement learning. Extensive simulation results show that the proposed D3PG outperforms state-of-the-art methods.</div>
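The distribution-continuous hybrid action space can be illustrated with a two-branch actor head: a Dirichlet branch whose sample is a task-partition vector that sums to 1 by construction, and a bounded continuous branch for CPU frequency. The weights, feature vector, and frequency range below are assumptions for illustration, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

def actor_head(features, w_alpha, w_freq, f_max=2.0e9):
    """Sketch of a D3PG-style hybrid action head (illustrative only).

    - Dirichlet branch: softplus keeps concentration parameters positive;
      a Dirichlet sample yields task-partition fractions that sum to 1,
      satisfying the simplex constraint without any projection step.
    - Continuous branch: sigmoid maps to (0, 1), scaled to a hypothetical
      CPU-frequency range [0, f_max], satisfying the box constraint.
    """
    logits = features @ w_alpha
    alpha = np.log1p(np.exp(logits)) + 1e-3        # softplus, strictly > 0
    partition = rng.dirichlet(alpha)               # fractions over edge servers
    freq = 1.0 / (1.0 + np.exp(-(features @ w_freq))) * f_max
    return partition, float(freq)
```

Sampling from a Dirichlet is one natural way to keep partition actions on the probability simplex; the continuous frequency output passes through unchanged, which is what makes the action space "hybrid."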




2021 ◽  
pp. 65-90
Author(s):  
Johannes M. Lehner ◽  
Eva Born ◽  
Peter Kelemen ◽  
Rainer Born

Abstract: This chapter develops a model of resilient action in situations where established rules or behavioural routines are either unavailable or misleading, exposing actors to high means-end ambiguity. The model suggests that an ‘action space’ must be created by stabilizing the action system and expanding options for action. It is based on our qualitative research in the Austrian Military (high degree of publicness) on cases of resilient field action, especially ‘bouncing back’ incidents. We contend that different types of drill, combined with the acquisition of background knowledge, are essential for organizational resilience, the management of unexpected situations and the explanation of success, leading to controlled, reproducible solutions to typical problems. As such, the model is intended to explain exploitation-type learning. However, as an antecedent for installing the action space, we explore the so-called ‘exaptation’ of drilled procedures, pertaining to the transfer of procedures to serve novel requirements and thus located in the exploration domain. This phenomenon leads to properties that contribute to recovery from shock in critical situations, through innovation. In short, the chapter provides novel empirical evidence that applying rules does not lead to resilient action in unknown or unexpected situations. Instead, we show robust evidence that a corrective understanding and reflective use of rules and routines is causally related to the ability to deal with surprise and to foster resilience.


SOROT ◽  
2021 ◽  
Vol 16 (2) ◽  
pp. 99
Author(s):  
Eki Karsani Apriliyadi ◽  
Tommy Hendrix

This study aims to examine the Covid-19 pandemic in Indonesia as a social phenomenon that provides space for interpretation by the community within an action space. The article analyzes the pandemic qualitatively using Foucault's perspective on discourse, knowledge and power. The results show that society's various interpretations involve concrete actions related to an understanding of power that is subjective, horizontal, and present in a public interaction space involving multiple parties. From this perspective, we can see how power works through various mechanisms in an interaction space viewed horizontally. The interpretive space of power relations surrounding the Covid-19 pandemic in Indonesia reveals three distinct concerns: power relations as strategy, governmentality power relations, and power relations as domination.


2021 ◽  
Vol 4 ◽  
Author(s):  
Yifu Qiu ◽  
Yitao Qiu ◽  
Yicong Yuan ◽  
Zheng Chen ◽  
Raymond Lee

Reinforcement Learning (RL) based machine trading has attracted substantial interest. However, in existing research, RL in the day-trading task suffers from noisy financial movements at short time scales, difficulty in order settlement, and expensive action search in a continuous-value space. This paper introduces an end-to-end RL intraday trading agent, QF-TraderNet, based on quantum finance theory (QFT) and deep reinforcement learning. We propose a novel design for the intraday RL trader’s action space, inspired by Quantum Price Levels (QPLs); the design also gives the model a learnable profit-and-loss control strategy. QF-TraderNet comprises two neural networks: (1) a long short-term memory (LSTM) network for feature learning on financial time series and (2) a policy generator network (PGN) for generating the distribution of actions. The profitability and robustness of QF-TraderNet have been verified on multiple types of financial datasets, including FOREX, metals, crude oil, and financial indices. The experimental results demonstrate that QF-TraderNet outperforms other baselines in terms of cumulative price returns and Sharpe ratio, as well as in robustness under accidental market shifts.
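The policy-generator idea can be sketched as a softmax head over a QPL-indexed discrete action set, fed by features from an upstream sequence encoder (the paper uses an LSTM). The action names, feature dimensionality, and linear head below are illustrative assumptions, not the published architecture:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D logit vector.
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

class PolicyGenerator:
    """Minimal sketch of a policy-generator head over QPL-indexed actions.

    Assumes `features` come from an upstream sequence encoder; the action
    set (hold plus long/short at a few hypothetical quantum price levels)
    is illustrative only.
    """
    ACTIONS = ["hold", "long@QPL1", "long@QPL2", "short@QPL1", "short@QPL2"]

    def __init__(self, n_features, seed=0):
        rng = np.random.default_rng(seed)
        self.w = rng.normal(scale=0.01, size=(n_features, len(self.ACTIONS)))

    def distribution(self, features):
        # Categorical distribution over the discrete QPL-based actions.
        return softmax(features @ self.w)

    def act(self, features, rng):
        probs = self.distribution(features)
        return self.ACTIONS[rng.choice(len(self.ACTIONS), p=probs)]
```

Anchoring the discrete actions to price levels is what sidesteps the expensive search over a continuous price space mentioned in the abstract: the policy only has to pick among a handful of level-indexed orders.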


2021 ◽  
Author(s):  
Jun Ma ◽  
Shunyi Yao ◽  
Guangda Chen ◽  
Jiakai Song ◽  
Jianmin Ji

2021 ◽  
Vol 21 (4) ◽  
Author(s):  
Malin Gütschow ◽  
Bartosz Bartkowski ◽  
María R. Felipe-Lucia

Abstract: The urgency of addressing climate change, biodiversity loss, and natural resource degradation requires major changes in agricultural practices. Agricultural policy in Germany has so far failed to generate such changes; meanwhile, public demands for new regulations are met by widespread farmers’ protests. Against this background, an improved understanding of the factors influencing farmers’ uptake of sustainable agricultural practices is necessary. This study introduces the concept of action space to analyze the role of barriers to change that lie beyond farmers’ perceived immediate control. We apply this conceptual framework to the case of diversified crop rotations in Saxony (Germany) and combine semi-structured interviews and a survey to identify key barriers to change and their relative weights. We find that farmers feel strongly restricted in their action space to implement diversified crop rotations for sustainable agriculture. The most important barriers pertain to the market environment, which severely limits the feasibility of many crops. In addition, limited regulatory predictability, regulatory incoherence, and limited flexibility restrict farmers in their action space. The role of resource availability within farm businesses, as well as the availability and accessibility of knowledge, differs between the interview and survey results. The analysis of interactions indicates that multiple barriers form a self-reinforcing system in which farmers perceive that they have little leeway to implement sustainable practices. These results emphasize the need to create an enabling market and regulatory environment in which sustainable practices pay off.

