A Multi-Step Neural Control for Motor Brain-Machine Interface by Reinforcement Learning

2013 ◽  
Vol 461 ◽  
pp. 565-569 ◽  
Author(s):  
Fang Wang ◽  
Kai Xu ◽  
Qiao Sheng Zhang ◽  
Yi Wen Wang ◽  
Xiao Xiang Zheng

Brain-machine interfaces (BMIs) decode the cortical neural spikes of paralyzed patients to control external devices for movement restoration. Neuroplasticity induced by performing a relatively complex, multistep task helps improve the performance of a BMI system. Reinforcement learning (RL) allows the BMI system to interact with the environment and learn the task adaptively without a teacher signal, which is better suited to the situation of paralyzed patients. In this work, we proposed applying Q(λ)-learning to multistep goal-directed tasks driven by the user's neural activity. Neural data were recorded from the primary motor cortex (M1) of a monkey manipulating a joystick in a center-out task. Compared with a supervised learning approach, significant BMI control was achieved, with correct directional decoding in 84.2% and 81% of trials starting from naïve states. The results demonstrate that the BMI system was able to complete the task by interacting with the environment, indicating that RL-based methods have the potential to yield more natural BMI systems.
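
The abstract does not spell out the update rule, but a minimal sketch of tabular Watkins Q(λ)-learning, the family of update the work builds on, could look as follows. The environment interface (reset/step), state and action encoding, and all hyperparameters here are illustrative assumptions, not values from the paper.

```python
import numpy as np

def epsilon_greedy(Q, s, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.integers(Q.shape[1])
    return int(np.argmax(Q[s]))

def q_lambda(env, n_states, n_actions, episodes=500,
             alpha=0.1, gamma=0.95, lam=0.8, epsilon=0.1, seed=0):
    """Tabular Watkins Q(lambda) with accumulating eligibility traces.

    `env` is assumed to expose reset() -> state and step(a) -> (state, reward, done),
    with discrete integer states and actions (an assumption for this sketch).
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((n_states, n_actions))
    for _ in range(episodes):
        E = np.zeros_like(Q)                  # eligibility traces
        s = env.reset()
        a = epsilon_greedy(Q, s, epsilon, rng)
        done = False
        while not done:
            s_next, r, done = env.step(a)
            a_next = epsilon_greedy(Q, s_next, epsilon, rng)
            a_star = int(np.argmax(Q[s_next]))        # greedy action (Watkins' variant)
            delta = r + gamma * Q[s_next, a_star] * (not done) - Q[s, a]
            E[s, a] += 1.0                            # accumulate trace for visited pair
            Q += alpha * delta * E                    # propagate TD error along traces
            # cut traces whenever the behaviour action is exploratory
            E *= gamma * lam if a_next == a_star else 0.0
            s, a = s_next, a_next
    return Q
```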

Sensors ◽  
2020 ◽  
Vol 20 (19) ◽  
pp. 5528
Author(s):  
Peng Zhang ◽  
Lianying Chao ◽  
Yuting Chen ◽  
Xuan Ma ◽  
Weihua Wang ◽  
...  

Background: Because of the nonstationarity of neural recordings in intracortical brain–machine interfaces, daily supervised retraining is generally required to maintain decoder performance. This problem can be mitigated by using a reinforcement learning (RL) based self-recalibrating decoder; however, quickly acquiring new knowledge while maintaining good performance remains a challenge for RL-based decoders. Methods: To address this problem, we proposed an attention-gated RL-based algorithm that combines transfer learning, mini-batch training, and a weight updating scheme to accelerate weight updating and avoid over-fitting. The proposed algorithm was tested on intracortical neural data recorded from two monkeys to decode their reaching positions and grasping gestures. Results: Our algorithm achieved an approximately 20% increase in classification accuracy over the non-retrained classifier and even exceeded the accuracy of a daily retrained classifier. Moreover, compared with a conventional RL method, our algorithm improved accuracy by approximately 10% and online weight updating speed by approximately 70 times. Conclusions: This paper proposed a self-recalibrating decoder that achieves good, robust decoding performance with fast weight updating, which may facilitate its application in wearable devices and clinical practice.
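
As a rough illustration of the kind of update the abstract describes, the sketch below combines an attention-gated (AGREL-style) weight change with mini-batch averaging and warm-started weights for transfer. The single-layer network, learning rate, and binary reward scheme are assumptions made for illustration, not the authors' actual architecture or settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class AGRELDecoder:
    """Attention-gated RL-style classifier with mini-batch updates (sketch)."""

    def __init__(self, n_features, n_classes, lr=0.05, W_init=None):
        # Transfer learning: start from previously learned weights when available.
        if W_init is not None:
            self.W = W_init.copy()
        else:
            self.W = 0.01 * rng.standard_normal((n_features, n_classes))
        self.lr = lr

    def update(self, X, y):
        """One mini-batch update; reward is 1 when the sampled class is correct."""
        p = softmax(X @ self.W)                         # action (class) probabilities
        acts = np.array([rng.choice(p.shape[1], p=pi) for pi in p])
        reward = (acts == y).astype(float)
        # Reward prediction error relative to the probability of the chosen class.
        delta = reward - p[np.arange(len(y)), acts]
        grad = np.zeros_like(self.W)
        for i, a in enumerate(acts):
            # Attention gating: only the chosen output unit's weights are updated.
            grad[:, a] += delta[i] * X[i]
        self.W += self.lr * grad / len(y)               # averaged over the mini-batch
        return reward.mean()
```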


2015 ◽  
Vol 35 (19) ◽  
pp. 7374-7387 ◽  
Author(s):  
B. T. Marsh ◽  
V. S. A. Tarigoppula ◽  
C. Chen ◽  
J. T. Francis

2017 ◽  
Vol 14 (6) ◽  
pp. 066004 ◽  
Author(s):  
Z T Irwin ◽  
K E Schroeder ◽  
P P Vu ◽  
A J Bullard ◽  
D M Tat ◽  
...  

Author(s):  
Xiongqing Liu ◽  
Yan Jin

In this paper, a deep reinforcement learning approach was implemented to achieve autonomous collision avoidance. A transfer reinforcement learning (TRL) approach was proposed by introducing two concepts: transfer belief, which quantifies how much confidence the agent places in the expert's experience, and transfer period, which determines how long the agent's decisions are influenced by the expert's experience. Case studies were conducted on transfer from a simple task (a single static obstacle) to a complex task (multiple dynamic obstacles). It was found that when the two tasks have low similarity, it is better to decrease the initial transfer belief and keep a relatively longer transfer period, in order to reduce negative transfer and accelerate learning. The student agent's learning variance grows significantly if the transfer period is too short.
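
The transfer-belief/transfer-period idea can be illustrated with a small action-selection sketch that blends an expert policy with the student's own policy. The linear decay schedule, tabular Q-values, and the `expert_policy` callable below are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def transfer_action(student_q, expert_policy, state, step,
                    belief0=0.8, transfer_period=5000,
                    rng=np.random.default_rng(0)):
    """Choose an action by mixing expert advice with the student's own policy.

    `belief0` is the initial transfer belief; it decays linearly to zero over
    `transfer_period` steps, after which the student acts entirely on its own.
    `student_q` is assumed to be a tabular Q array indexed by discrete state.
    """
    belief = max(0.0, belief0 * (1.0 - step / transfer_period))
    if rng.random() < belief:
        return expert_policy(state)            # follow the expert's suggestion
    return int(np.argmax(student_q[state]))    # otherwise act greedily on own Q
```

Under this sketch, a lower `belief0` or longer `transfer_period` corresponds to the paper's recommendation for low-similarity task pairs: lean less on the expert early on, but phase out the expert's influence gradually rather than abruptly.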


Author(s):  
Jack DiGiovanna ◽  
Babak Mahmoudi ◽  
Jeremiah Mitzelfelt ◽  
Justin C. Sanchez ◽  
Jose C. Principe

2009 ◽  
Vol 56 (1) ◽  
pp. 54-64 ◽  
Author(s):  
J. DiGiovanna ◽  
B. Mahmoudi ◽  
J. Fortes ◽  
J.C. Principe ◽  
J.C. Sanchez
