Combination of reinforcement learning and bee algorithm for controlling two-link arm with six muscles: simplified human arm model in the horizontal plane

2019 · Vol 43 (1) · pp. 135-142 · Author(s): Fereidoun Nowshiravan Rahatabad, Parisa Rangraz

2021 · Vol 33 (1) · pp. 129-156 · Author(s): Masami Iwamoto, Daichi Kato

This letter proposes a new idea for improving learning efficiency in reinforcement learning (RL) with the actor-critic method used as a muscle controller for posture stabilization of the human arm. Actor-critic RL (ACRL) is used in simulations to realize posture control in humans or robots through muscle tension control; however, it incurs very high computational costs to acquire a good muscle control policy for desirable postures. To make ACRL efficient, we focused on embodiment, which is thought to enable efficient control in artificial intelligence and robotics. According to the neurophysiology of motion control derived from experimental studies on animals and humans, the pedunculopontine tegmental nucleus (PPTn) induces muscle tone suppression, and the midbrain locomotor region (MLR) induces muscle tone promotion. PPTn and MLR modulate the activation levels of mutually antagonizing muscles, such as flexors and extensors, in the process by which control signals are relayed from the substantia nigra pars reticulata to the brain stem. We therefore hypothesized that PPTn and MLR control muscle tone, that is, the maximum activation levels of mutually antagonizing muscles, through a different sigmoidal function for each muscle. Based on this hypothesis, we introduced antagonism function models (AFMs) of PPTn and MLR for individual muscles into the process that determines each muscle's activation level from the output of the actor in ACRL. ACRL with AFMs, representing the embodiment of muscle tone, achieved posture stabilization in five joint motions of the right arm of an adult male under gravity at predetermined target angles, and did so at an earlier stage of learning than methods without AFMs. These results suggest that introducing the embodiment of muscle tone can enhance learning efficiency in posture stabilization of humans or humanoid robots.
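
To make the hypothesis concrete, here is a minimal Python sketch of how such antagonism function models might scale an actor's output. Everything in it, including the names afm_sigmoid and apply_afm and the per-muscle gain/threshold parameters, is an illustrative assumption, not the authors' implementation.

```python
import numpy as np

def afm_sigmoid(c, gain, threshold):
    """Sigmoidal antagonism function model (AFM): maps a descending
    control signal c to a muscle-tone ceiling in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-gain * (c - threshold)))

def apply_afm(actor_output, c, flexor_params, extensor_params):
    """Scale the raw actor outputs for an antagonistic muscle pair by
    tone ceilings from two opposed sigmoids: an MLR-like promoting
    function for the flexor and a PPTn-like suppressing function
    (driven by -c) for the extensor. Returns final activation levels."""
    a_flex, a_ext = np.clip(actor_output, 0.0, 1.0)
    tone_flex = afm_sigmoid(c, *flexor_params)    # muscle tone promotion
    tone_ext = afm_sigmoid(-c, *extensor_params)  # muscle tone suppression
    return a_flex * tone_flex, a_ext * tone_ext

# Example: with c = 0.5 the flexor's tone ceiling rises while the
# extensor's falls, biasing the antagonistic pair toward flexion.
print(apply_afm(np.array([0.8, 0.8]), 0.5, (4.0, 0.0), (4.0, 0.0)))
```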


2008 · Vol 2008.21 (0) · pp. 524-525 · Author(s): Kyuengbo Min, Hideyuki Kimpara, Takahiko Sugiyama, Chikara Nagai, Masami Iwamoto

Mechatronics · 2021 · Vol 78 · pp. 102630 · Author(s): Aolei Yang, Yanling Chen, Wasif Naeem, Minrui Fei, Ling Chen

2011 · Vol 21 (11) · pp. 3293-3303 · Author(s): Fereydoon Nowshiravan Rahatabad, Ali Fallah, Amir Homayoun Jafari

In this paper, the feasibility of observing chaotic behavior in a model of the human arm is discussed. The Two-Link Arm driven by Six Muscles (TLASM), a well-known model of planar human arm reaching movements in the horizontal plane, is investigated. Reinforcement learning (RL), which is considered a model of dopamine-based learning in the brain, is used to control the TLASM. Finally, the existence of chaos in the TLASM model controlled with RL is investigated using tools such as bifurcation maps, Lyapunov exponents, phase-plane trajectories, and FFT-based spectral analysis. The results show that chaos may occur in the overall system when certain internal muscle parameters, which have a physiological interpretation, are changed.
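
As an illustration of one of the tools mentioned, the sketch below estimates the largest Lyapunov exponent by tracking the divergence of two nearby trajectories with per-step renormalization (the Benettin approach). The step function is a hypothetical placeholder for one integration step of the RL-controlled TLASM dynamics; this is a generic sketch, not the authors' code.

```python
import numpy as np

def largest_lyapunov(step, x0, d0=1e-8, n_steps=5000, dt=0.01):
    """Benettin-style estimate of the largest Lyapunov exponent of a
    discrete-time system x_{k+1} = step(x_k): evolve a reference and a
    perturbed trajectory, accumulate log stretch factors, and rescale
    the separation back to d0 after every step. A positive estimate
    indicates sensitive dependence on initial conditions (chaos)."""
    x = np.asarray(x0, dtype=float)
    y = x.copy()
    y[0] += d0                       # tiny perturbation on one state
    log_sum = 0.0
    for _ in range(n_steps):
        x, y = step(x), step(y)
        d = np.linalg.norm(y - x)
        log_sum += np.log(d / d0)
        y = x + (d0 / d) * (y - x)   # renormalize separation to d0
    return log_sum / (n_steps * dt)  # exponent per unit time

# Sanity check on a known chaotic map (logistic map, r = 4), where the
# exponent should approach ln(2) ≈ 0.693 with dt = 1:
lam = largest_lyapunov(lambda x: 4.0 * x * (1.0 - x), np.array([0.3]), dt=1.0)
print(lam)
```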


2007 · Vol 362 (1479) · pp. 383-401 · Author(s): Francesco Mannella, Gianluca Baldassarre

Previous experiments have shown that when domestic chicks (Gallus gallus) are first trained to locate food elements hidden at the centre of a closed square arena and then are tested in a square arena of double the size, they search for food both at its centre and at a distance from walls similar to the distance of the centre from the walls experienced during training. This paper presents a computational model that successfully reproduces these behaviours. The model is based on a neural-network implementation of the reinforcement-learning actor–critic architecture (in this architecture the ‘critic’ learns to evaluate perceived states in terms of predicted future rewards, while the ‘actor’ learns to increase the probability of selecting the actions that lead to higher evaluations). The analysis of the model suggests which type of information and cognitive mechanisms might underlie chicks' behaviours: (i) the tendency to explore the area at a specific distance from walls might be based on the processing of the height of walls' horizontal edges, (ii) the capacity to generalize the search at the centre of square arenas independently of their size might be based on the processing of the relative position of walls' vertical edges on the horizontal plane (equalization of walls' width), and (iii) the whole behaviour exhibited in the large square arena can be reproduced by assuming the existence of an attention process that, at each time, focuses chicks' internal processing on either one of the two previously discussed information sources. The model also produces testable predictions regarding the generalization capabilities that real chicks should exhibit if trained in circular arenas of varying size. The paper also highlights the potentialities of the model to address other experiments on animals' navigation and analyses its strengths and weaknesses in comparison to other models.
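
The parenthetical description of the actor–critic architecture maps directly onto a few lines of code. The tabular sketch below is an assumption for illustration (the paper's model is a neural-network implementation driven by visual input, not a tabular one); it shows the two learning rules: the critic updates state evaluations from the TD error, and the actor raises the selection probability of actions followed by positive TD errors.

```python
import numpy as np

def softmax(h):
    """Convert action preferences into selection probabilities."""
    e = np.exp(h - h.max())
    return e / e.sum()

class ActorCritic:
    """Tabular TD actor-critic: the critic learns state evaluations
    V(s) from the TD error, and the actor nudges up the preference
    (hence selection probability) of actions followed by a positive
    TD error."""
    def __init__(self, n_states, n_actions, alpha=0.1, beta=0.1, gamma=0.95):
        self.V = np.zeros(n_states)               # critic: state evaluations
        self.H = np.zeros((n_states, n_actions))  # actor: action preferences
        self.alpha, self.beta, self.gamma = alpha, beta, gamma

    def act(self, s):
        return np.random.choice(self.H.shape[1], p=softmax(self.H[s]))

    def learn(self, s, a, r, s_next):
        delta = r + self.gamma * self.V[s_next] - self.V[s]  # TD error
        self.V[s] += self.alpha * delta                      # critic update
        self.H[s, a] += self.beta * delta                    # actor update
```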


2020 · Vol 7 (3) · pp. 138-146 · Author(s): Fereidoun Nowshiravan Rahatabad

Background: The central nervous system (CNS) optimizes arm movements to minimize some cost function. Simulating parts of the nervous system is one way of obtaining accurate information about neural function and the treatment of neuromuscular diseases. The primary purpose of this paper is to model and control the human arm in a reaching movement based on reinforcement learning (RL) theory. Methods: First, Zajac's muscle model was improved with a fuzzy system. Second, the proposed muscle model was applied to the six muscles that drive a two-link arm moving in the horizontal plane. Third, the model parameters were estimated with a genetic algorithm (GA). Experimental data were recorded from healthy subjects to assess the approach. Finally, the RL algorithm was used to guide the arm in reaching tasks. Results: The results show that: (1) the proposed system is temporally similar to real arm movement; (2) the RL algorithm can generate motor commands comparable to those obtained from electromyographies (EMGs); (3) the activation functions obtained from the system are similar to those derived from the real data, which supports the plausibility of RL in the CNS (basal ganglia). Finally, to provide a graphical and effective representation of the arm model, the virtual reality environment of MATLAB was used. Conclusion: Since the RL method is representative of the brain's control function, it shows features such as good settling time, no peak overshoot, and robustness.
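
For readers unfamiliar with the base model, the sketch below implements a common first-order form of Zajac-style activation dynamics, the starting point of the Methods; the time constants are illustrative assumptions, and the paper's fuzzy augmentation and GA-fitted parameters are not reproduced here.

```python
def zajac_activation_step(u, a, dt=0.001, tau_act=0.01, tau_deact=0.04):
    """One Euler step of first-order activation dynamics,
    da/dt = (u - a) / tau, where u is neural excitation and a is muscle
    activation (both in [0, 1]). The time constant is shorter during
    activation (u >= a) than during deactivation, as in Zajac-style
    formulations. Constants are illustrative; the paper augments this
    model with a fuzzy system and fits parameters with a genetic
    algorithm."""
    tau = tau_act if u >= a else tau_deact
    return a + dt * (u - a) / tau

# Example: activation rising toward a step excitation u = 1.
a = 0.0
for _ in range(100):              # 100 ms at dt = 1 ms
    a = zajac_activation_step(1.0, a)
print(a)                          # approaches 1 with ~10 ms time constant
```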


Author(s): Nives Klopcar, Jadran Lenarcic

In this paper, we studied the maximum inclination angles of the human shoulder girdle and compared them with the motion abilities of the humanoid robotic shoulder proposed in [1, 2]. We found that the extreme inclination angles of the human shoulder girdle differ depending on whether the arm is held constantly stretched downward or executes an inclination co-planar with the girdle's motion. The humanoid robotic shoulder covers the range of maximum inclination angles for the second type of motion, but its workspace is clearly too small for the first. We also measured the working cone of the human arm and found that the central axis of the working cone is inclined by about 60° in the horizontal plane.

