continuous actions
Recently Published Documents



Symmetry ◽  
2021 ◽  
Vol 13 (12) ◽  
pp. 2411
Author(s):  
Chayoung Kim

Artificial intelligence (AI) techniques in power grid control and energy management in building automation require both deep Q-networks (DQNs) and deep deterministic policy gradients (DDPGs) in deep reinforcement learning (DRL) as off-policy algorithms. Most studies on improving the stability of DRL have addressed these with replay buffers and a target network using a delayed temporal difference (TD) backup, which is known for minimizing a loss function at every iteration. Loss functions were developed for DQN and DDPG, but few studies have sought to improve the loss functions used in both. Therefore, we modified the loss function based on a temporal consistency (TC) loss and adapted the proposed TC loss function for the target network update in both DQN and DDPG. The proposed TC loss function showed effective results, particularly in the critic network in DDPG. In this work, we demonstrate that, on both the “cart-pole” and “pendulum” environments in OpenAI Gym, the proposed TC loss function greatly improves convergence speed and performance, particularly in the critic network in DDPG.


2021 ◽  
pp. 1-15
Author(s):  
Mario Hervault ◽  
Pier-Giorgio Zanone ◽  
Jean-Christophe Buisson ◽  
Raoul Huys

Most studies contributing to identifying the brain network for inhibitory control have investigated the cancelation of prepared–discrete actions, thus focusing on an isolated and short-lived chunk of human behavior. Aborting ongoing–continuous actions is an equally crucial ability but remains little explored. Although discrete and ongoing–continuous rhythmic actions are associated with partially overlapping yet largely distinct brain activations, it is unknown whether the inhibitory network operates similarly in both situations. Thus, distinguishing between action types constitutes a powerful means to investigate whether inhibition is a generic function. We therefore used independent component analysis (ICA) of EEG data and show that canceling a discrete action and aborting a rhythmic action rely on independent brain components. The ICA showed that a delta/theta power increase generically indexed inhibitory activity, whereas N2 and P3 ERP waves did so in an action-specific fashion. The action-specific components were generated by partially distinct brain sources, which indicates that the inhibitory network is engaged differently when canceling a prepared–discrete action versus aborting an ongoing–continuous action. In particular, increased activity was estimated in precentral gyri and posterior parts of the cingulate cortex for action canceling, whereas enhanced activity was found in more frontal gyri and anterior parts of the cingulate cortex for action aborting. Overall, the present findings support the idea that inhibitory control is differentially implemented according to the type of action to be revised.


2021 ◽  
Vol 11 (21) ◽  
pp. 9789
Author(s):  
Jiaqi Dong ◽  
Zeyang Xia ◽  
Qunfei Zhao

Augmented reality assisted assembly training (ARAAT) is an effective and affordable technique for labor training in the automobile and electronic industries. In general, most tasks of ARAAT are conducted by real-time hand operations. In this paper, we propose an algorithm for dynamic gesture recognition and prediction that aims to evaluate the standard and achievement of the hand operations for a given task in ARAAT. We consider that the given task can be decomposed into a series of hand operations, and each hand operation further into several continuous actions. Each action is then related to a standard gesture based on the practical assembly task, such that the standard and achievement of the actions included in the operations can be identified and predicted from the sequences of gestures instead of the performance throughout the whole task. Based on practical industrial assembly, we specified five typical tasks, three typical operations, and six standard actions. We used Zernike moments combined with histograms of oriented gradients, and linear-interpolation motion trajectories, to represent the 2D static and 3D dynamic features of standard gestures, respectively, and chose the directional pulse-coupled neural network as the classifier to recognize the gestures. In addition, we defined an action unit to reduce the dimensions of features and the computational cost. During gesture recognition, we optimized the gesture boundaries iteratively by calculating the score probability density distribution to reduce interference from invalid gestures and improve precision. The proposed algorithm was evaluated on four datasets, and the experimental results showed that it increases recognition accuracy and reduces the computational cost.


2021 ◽  
pp. 108201
Author(s):  
Fumie Sugimoto ◽  
Motohiro Kimura ◽  
Yuji Takeda

ACTA IMEKO ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 7
Author(s):  
András Kalapos ◽  
Csaba Gór ◽  
Róbert Moni ◽  
István Harmati

The present study focused on vision-based end-to-end reinforcement learning in relation to vehicle control problems such as lane following and collision avoidance. The controller policy presented in this paper is able to control a small-scale robot to follow the right-hand lane of a real two-lane road, although its training has only been carried out in a simulation. This model, realised by a simple, convolutional network, relies on images of a forward-facing monocular camera and generates continuous actions that directly control the vehicle. To train this policy, proximal policy optimization was used, and to achieve the generalisation capability required for real performance, domain randomisation was used. A thorough analysis of the trained policy was conducted by measuring multiple performance metrics and comparing these to baselines that rely on other methods. To assess the quality of the simulation-to-reality transfer learning process and the performance of the controller in the real world, simple metrics were measured on a real track and compared with results from a matching simulation. Further analysis was carried out by visualising salient object maps.


2021 ◽  
Vol 9 (3) ◽  
pp. 250-254
Author(s):  
Dr. Oman Sefaj

Change management is a key issue for the survival and realization of organizational objectives in today's business environment, which is changing in different ways. Change has become a necessary constant for companies that need to change in order to remain competitive in the market. The ability to manage this change is seen as a basic skill of successful enterprises in contemporary entrepreneurship. These changes are undoubtedly manifested both locally and internationally. Economic globalization as an integrator has caused these two levels to merge into one, causing the difference between them to fade. This integration has been very intense lately. Leadership is the process by which an individual influences a group to achieve a common goal. Process here means the systematic and continuous actions and ways by which the leader influences subordinates. It should be noted that leadership is not a linear process but an interactive one, and it requires adaptation and innovation. Entrepreneurship and innovation, in developing and developed countries alike, now influence all aspects of business development. This research on conditions in Kosovo confirms that the change process is an integral and necessary part of day-to-day management in these enterprises, and that identifying leadership challenges in implementing the change process and adapting enterprises to the contemporary environment continues to be vital to performance, profitability, and competitiveness. Answering the research questions requires testing the relationships between variables (type of change, process factors, and success of the change). Testing these relationships is enabled by quantitative methods. Accordingly, the use of questionnaires for data collection in this paper enables research and analysis of possible relationships between the variables studied, and tends to 'open' issues that will be of interest for future exploration.
The leadership activities researched during the processes of change in the enterprise, which adapt efforts to maximize existing opportunities in an environment of strong competition and follow contemporary development trends, are: creating a strong leadership team to lead the processes; developing a vision to assist and guide efforts to achieve strategic objectives; designing and communicating the strategy to achieve the planned results; and providing training and career development for employees so that they understand the change, resist it less, and are motivated to achieve the mission and vision of the enterprise.


Author(s):  
Simge Nur Aslan ◽  
Burak Taşçı ◽  
Ayşegül Uçar ◽  
Cüneyt Güzeliş

This paper proposes an algorithm by which humanoid robots learn to move a desired object. The algorithm combines a semantic segmentation algorithm with Deep Reinforcement Learning (DRL) algorithms. The semantic segmentation algorithm is used to detect and recognize the object to be moved. DRL algorithms are used in the walking and grasping steps. A Deep Q Network (DQN) is used to walk towards the target object by means of the previously defined actions of the gait manager and the different head positions of the robot. A Deep Deterministic Policy Gradient (DDPG) network is used for grasping by means of continuous actions. Finally, previously defined commands are issued for the robot to stand up, turn to the left, and move forward together with the object. The experimental setup uses the Robotis-Op3 humanoid robot. The obtained results show that the proposed algorithm worked successfully.


Author(s):  
Camille Horbez ◽  
Yulan Qing ◽  
Kasra Rafi

We address the question of determining which mapping class groups of infinite-type surfaces admit nonelementary continuous actions on hyperbolic spaces. More precisely, let $\Sigma$ be a connected, orientable surface of infinite type with tame endspace whose mapping class group is generated by a coarsely bounded subset. We prove that $\mathrm{Map}(\Sigma)$ admits a continuous nonelementary action on a hyperbolic space if and only if $\Sigma$ contains a finite-type subsurface which intersects all its homeomorphic translates. When $\Sigma$ contains such a nondisplaceable subsurface $K$ of finite type, the hyperbolic space we build is constructed from the curve graphs of $K$ and its homeomorphic translates via a construction of Bestvina, Bromberg and Fujiwara. Our construction has several applications: first, the second bounded cohomology of $\mathrm{Map}(\Sigma)$ contains an embedded $\ell^1$; second, using work of Dahmani, Guirardel and Osin, we deduce that $\mathrm{Map}(\Sigma)$ contains nontrivial normal free subgroups (while it does not if $\Sigma$ has no nondisplaceable subsurface of finite type), has uncountably many quotients and is SQ-universal.


2021 ◽  
Vol 7 (2) ◽  
pp. e001091
Author(s):  
Alli Gokeler ◽  
Anne Benjaminse ◽  
Francesco Della Villa ◽  
Fillippo Tosarelli ◽  
Evert Verhagen ◽  
...  

Athletes in team sports have to quickly visually perceive actions of opponents and teammates while executing their own movements. These continuous actions are performed under time pressure and may contribute to a non-contact ACL injury. However, ACL injury screening and prevention programmes are primarily based on standardised movements in a predictable environment. The sports environment imposes much greater cognitive demand because athletes must direct their attention to numerous external stimuli and inhibit impulsive actions. Any deficit or delay in attentional processing may contribute to an inability to correct potential errors in complex coordination, resulting in knee positions that increase the ACL injury risk. In this viewpoint, we advocate that ACL injury screening should include sport-specific neurocognitive demands.


2021 ◽  
Vol 11 (9) ◽  
pp. 4135
Author(s):  
Chi-Kai Hsieh ◽  
Kun-Lin Chan ◽  
Feng-Tsun Chien

This paper studies the problem of joint power allocation and user association in wireless heterogeneous networks (HetNets) with a deep reinforcement learning (DRL)-based approach. This is a challenging problem since the action space is hybrid, consisting of continuous actions (power allocation) and discrete actions (device association). Instead of quantizing the continuous space (i.e., possible values of powers) into a set of discrete alternatives and applying traditional deep reinforcement learning approaches such as deep Q learning, we propose working on the hybrid space directly by using the novel parameterized deep Q-network (P-DQN) to update the learning policy and maximize the average cumulative reward. Furthermore, we incorporate the constraints of limited wireless backhaul capacity and the quality-of-service (QoS) of each user equipment (UE) into the learning process. Simulation results show that the proposed P-DQN outperforms the traditional approaches, such as the DQN and distance-based association, in terms of energy efficiency while satisfying the QoS and backhaul capacity constraints. The average improvement in energy efficiency of the proposed P-DQN may reach 77.6% and 140.6% over the traditional DQN and distance-based association approaches, respectively, in a HetNet with three SBSs and five UEs.

