Model-free control method based on reinforcement learning for building cooling water systems: Validation by measured data-based simulation

2020 ◽ Vol 218 ◽ pp. 110055
Author(s): Shunian Qiu, Zhenhai Li, Zhengwei Li, Jiajie Li, Shengping Long, ...

2021 ◽ Vol 11 (18) ◽ pp. 8419
Author(s): Jiang Zhao, Jiaming Sun, Zhihao Cai, Longhong Wang, Yingxun Wang

To achieve perception-based autonomous control of UAVs, state-of-the-art work favors schemes with onboard sensing and computing, which often consist of several separate modules, each with its own complicated algorithm. Most methods depend on handcrafted designs and prior models, with little capacity for adaptation and generalization. Inspired by research on deep reinforcement learning, this paper proposes a new end-to-end autonomous control method that collapses the separate modules of the traditional control pipeline into a single neural network. An image-based reinforcement learning framework is established, built around the design of the network architecture and the reward function. Training is performed with model-free algorithms tailored to the specific mission, and the resulting control policy network maps the input image directly to continuous actuator control commands. A simulation environment for the UAV landing scenario was built, and results under typical cases, including both small and large initial lateral or heading-angle offsets, show that the proposed end-to-end method is feasible for perception-based autonomous control.
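As a rough illustration of the end-to-end idea, the sketch below maps a camera image directly to a continuous actuator command through a small convolutional policy network. It is a minimal sketch under stated assumptions, not the authors' architecture: the use of PyTorch, the layer sizes, the 84x84 input resolution and the four-dimensional action (e.g. attitude rates and thrust) are all illustrative choices, and training would still require a model-free algorithm such as PPO or SAC together with a mission-specific reward.

```python
# Minimal sketch (not the paper's exact network): a CNN policy mapping an
# image observation directly to continuous actuator commands.
import torch
import torch.nn as nn

class ImagePolicy(nn.Module):
    def __init__(self, action_dim=4):  # action dimension is an assumption
        super().__init__()
        self.encoder = nn.Sequential(          # convolutional feature extractor
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(),
        )
        self.head = nn.Sequential(             # maps features to actions in [-1, 1]
            nn.LazyLinear(256), nn.ReLU(),
            nn.Linear(256, action_dim), nn.Tanh(),
        )

    def forward(self, image):
        # image: (batch, 3, H, W), pixel values scaled to [0, 1]
        return self.head(self.encoder(image))

# Example: one forward pass on an 84x84 camera frame
policy = ImagePolicy()
action = policy(torch.rand(1, 3, 84, 84))   # continuous actuator command
print(action.shape)                          # torch.Size([1, 4])
```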


2018 ◽ Vol 10 (10) ◽ pp. 168781401880776
Author(s): Yan Zhang, Jianzhou Wang, Wei Li, Jie Wang, Peng Yang

This article describes a model-free adaptive control method for a knee-joint exoskeleton that avoids the complexity of human–exoskeleton modeling. An important feature of the proposed controller is that it uses only the input and output data of the knee-joint angle to control the exoskeleton. Furthermore, a discrete sliding-mode control law and a prior torque are introduced to improve the accuracy and robustness of the system. The prior torque of the knee joint is obtained through walking simulation of the human–exoskeleton model. The experiment is carried out via co-simulation between ADAMS (Automatic Dynamic Analysis of Mechanical Systems) and MATLAB. Data from these assessments indicate that the proposed strategy enables the knee exoskeleton to track the angle trajectory well and performs well in walking assistance.
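For concreteness, the sketch below shows a standard compact-form model-free adaptive controller that, like the approach described here, uses only measured input (torque) and output (knee angle) data. It is a textbook MFAC scheme, not the authors' exact law: the pseudo-partial-derivative update and gains are generic defaults, the discrete sliding-mode term is omitted, and the prior-torque feedforward is added as a simple illustrative term.

```python
# Hedged sketch of compact-form model-free adaptive control (CFDL-MFAC) for a
# knee exoskeleton; gains and the prior-torque handling are illustrative.
class MFACKneeController:
    def __init__(self, eta=0.5, mu=1.0, rho=0.6, lam=2.0, phi0=0.5):
        self.eta, self.mu, self.rho, self.lam = eta, mu, rho, lam
        self.phi = phi0            # pseudo-partial-derivative estimate
        self.u_prev = 0.0          # applied torque at step k-1
        self.u_prev2 = 0.0         # applied torque at step k-2
        self.y_prev = 0.0          # measured knee angle at step k-1

    def step(self, y_meas, y_ref_next, tau_prior=0.0):
        """One control step: returns the commanded knee torque."""
        dy = y_meas - self.y_prev          # output increment from I/O data
        du = self.u_prev - self.u_prev2    # last input increment
        # Update the pseudo-partial derivative from measured I/O data only
        self.phi += self.eta * du / (self.mu + du ** 2) * (dy - self.phi * du)
        # Compact-form MFAC law plus an illustrative prior-torque feedforward
        u = self.u_prev + self.rho * self.phi / (self.lam + self.phi ** 2) * (y_ref_next - y_meas)
        u += tau_prior
        self.u_prev2, self.u_prev, self.y_prev = self.u_prev, u, y_meas
        return u
```

In use, y_ref_next would be the next sample of the desired gait-angle trajectory and tau_prior the prior torque obtained from the walking simulation.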


2013 ◽ Vol 397-400 ◽ pp. 1373-1377
Author(s): Bao Hua Cheng, Ai Guo Wu, Yu Wen You

This paper applies a model-free control method to minimum stable superheat control of a refrigeration system evaporator. Compared with conventional PID control, the model-free control method converges faster and has stronger disturbance rejection. It also reduces the coupling between evaporator superheat and evaporating temperature and adapts well to the variable-load control requirements of the refrigeration system, yielding good steady-state and dynamic performance.
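The abstract does not specify which model-free formulation is used; one common choice is an ultra-local model with an intelligent-PI law, sketched below for superheat tracking via the expansion-valve command. The parameter alpha, the PI gains, the sampling time and the valve-command interpretation are illustrative assumptions, not the authors' design.

```python
# Hedged sketch of one common model-free control scheme (ultra-local model with
# an intelligent-PI law) applied to evaporator superheat tracking.
class IntelligentPI:
    def __init__(self, alpha=10.0, kp=2.0, ki=0.5, dt=1.0):
        self.alpha, self.kp, self.ki, self.dt = alpha, kp, ki, dt
        self.y_prev = 0.0       # previous superheat measurement [K]
        self.u_prev = 0.0       # previous valve-opening command
        self.e_int = 0.0        # integral of tracking error

    def step(self, y, y_ref):
        """Return the next expansion-valve command for measured superheat y."""
        dy = (y - self.y_prev) / self.dt
        # Estimate the lumped unknown dynamics F from the ultra-local model:
        #   dy/dt = F + alpha * u   =>   F_hat = dy/dt - alpha * u_prev
        f_hat = dy - self.alpha * self.u_prev
        e = y_ref - y
        self.e_int += e * self.dt
        # Intelligent-PI law: cancel F_hat, then apply PI action on the error
        u = (-f_hat + self.kp * e + self.ki * self.e_int) / self.alpha
        self.y_prev, self.u_prev = y, u
        return u
```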


2016
Author(s): Nils B. Kroemer, Ying Lee, Shakoor Pooseh, Ben Eppinger, Thomas Goschke, ...

Dopamine is a key neurotransmitter in reinforcement learning and action control. Recent findings suggest that these components are inherently entangled. Here, we tested whether increases in dopamine tone through administration of L-DOPA upregulate deliberative "model-based" control of behavior or reflexive "model-free" control, as predicted by dual-control reinforcement-learning models. Alternatively, L-DOPA may impair learning, as suggested by "value" or "thrift" theories of dopamine. To this end, we employed a two-stage Markov decision task to investigate the effect of L-DOPA (randomized cross-over) on behavioral control while brain activation was measured using fMRI. L-DOPA led to attenuated model-free control of behavior, as indicated by the reduced impact of reward on choice and the increased stochasticity of model-free choices. Correspondingly, in the brain, L-DOPA decreased the effect of reward while prediction-error signals were unaffected. Taken together, our results suggest that L-DOPA reduces model-free control of behavior by attenuating the transfer of value to action.
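For readers unfamiliar with the dual-control framework, the sketch below implements only the model-free (SARSA(lambda)) learner for the two-stage Markov decision task in the style of Daw et al. (2011); the model-based component and its weighting are omitted, and all parameter values are illustrative rather than fitted to this study. A lower softmax inverse temperature corresponds to the "increased stochasticity of model-free choices" reported here.

```python
# Hedged sketch of the model-free learner in the two-step task (illustrative
# parameters; not the study's fitted model).
import numpy as np

rng = np.random.default_rng(0)
alpha, lam, beta = 0.3, 0.6, 3.0   # learning rate, eligibility trace, inverse temperature
q1 = np.zeros(2)                   # model-free values, stage-1 actions
q2 = np.zeros((2, 2))              # model-free values, stage-2 states x actions

def softmax(q):
    # Lower beta -> noisier, less value-driven ("more stochastic") choices
    p = np.exp(beta * (q - q.max()))
    return p / p.sum()

def run_trial(reward_probs):
    """One two-step trial with SARSA(lambda) model-free updates."""
    a1 = rng.choice(2, p=softmax(q1))
    common = rng.random() < 0.7                     # 70% common transition
    s2 = a1 if common else 1 - a1
    a2 = rng.choice(2, p=softmax(q2[s2]))
    r = float(rng.random() < reward_probs[s2, a2])  # binary reward

    delta1 = q2[s2, a2] - q1[a1]                    # stage-1 prediction error
    q1[a1] += alpha * delta1
    delta2 = r - q2[s2, a2]                         # reward prediction error
    q2[s2, a2] += alpha * delta2
    q1[a1] += alpha * lam * delta2                  # eligibility backup to stage 1
    return a1, s2, r

reward_probs = rng.uniform(0.25, 0.75, size=(2, 2))
for _ in range(200):
    run_trial(reward_probs)
```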

