binary feedback
Recently Published Documents


TOTAL DOCUMENTS

50
(FIVE YEARS 7)

H-INDEX

13
(FIVE YEARS 1)

2020 ◽  
Author(s):  
Shoeb Shaikh ◽  
Rosa So ◽  
Tafadzwa Sibindi ◽  
Camilo Libedinsky ◽  
Arindam Basu

Abstract Intra-cortical Brain Machine Interfaces (iBMIs) with wireless capability could scale the number of recording channels by integrating an intention decoder to reduce data rates. However, the need for frequent retraining due to neural signal non-stationarity is a major impediment. This paper presents an alternative paradigm of online reinforcement learning (RL) with binary evaluative feedback in iBMIs to tackle this issue. This paradigm eliminates time-consuming calibration procedures; instead, the model is updated on a sequential, sample-by-sample basis from an instantaneous binary evaluative feedback signal. However, batch weight updates in popular deep networks are very resource-intensive and incompatible with the constraints of an implant. In this work, using offline open-loop analysis of pre-recorded data, we show the application of a simple RL algorithm, Banditron, in discrete-state iBMIs and compare it against previously reported state-of-the-art RL algorithms: Hebbian RL, attention-gated RL, and deep Q-learning. Owing to its simple single-layer architecture, Banditron is found to yield at least two orders of magnitude reduction in power dissipation compared to the state-of-the-art RL algorithms. At the same time, post hoc analysis of four pre-recorded experimental datasets procured from the motor cortex of two non-human primates performing joystick-based movement tasks indicates that Banditron performs significantly better than the state-of-the-art RL algorithms by at least 5%, 10%, 7%, and 7% in experiments 1, 2, 3, and 4, respectively. Furthermore, we propose a non-linear variant of Banditron, Banditron-RP, which gives an average improvement of 6% and 2% in decoding accuracy in experiments 2 and 4, respectively, with only a moderate increase in power consumption.
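The appeal of Banditron here is that its update uses only the binary correct/incorrect signal the abstract describes. A minimal sketch of the standard Banditron step (Kakade et al.'s single-layer, importance-weighted perceptron update); the exploration rate and toy problem sizes below are illustrative, not values from the paper:

```python
import numpy as np

def banditron_step(W, x, true_label, gamma, rng):
    """One Banditron update: multiclass learning from binary
    (correct/incorrect) feedback only. W is a (K, d) weight matrix."""
    K = W.shape[0]
    y_hat = int(np.argmax(W @ x))          # greedy prediction
    probs = np.full(K, gamma / K)          # uniform exploration mass
    probs[y_hat] += 1.0 - gamma            # ...plus greedy mass
    y_tilde = int(rng.choice(K, p=probs))  # sampled (played) action
    correct = (y_tilde == true_label)      # the only feedback available
    # importance-weighted perceptron-style update
    update = np.zeros(K)
    if correct:
        update[y_tilde] += 1.0 / probs[y_tilde]
    update[y_hat] -= 1.0
    W += np.outer(update, x)
    return y_tilde, correct
```

Because the update touches a single (K, d) matrix, per-sample cost is one matrix-vector product plus a rank-one update, which is what makes the implant-friendly power budget plausible.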


The topology of an association network is always active, but the associations between nodes may not always be connected, and their properties are restricted. Node failures can occur at any time, and detecting them is important. Two node-failure detection schemes are implemented: a binary and a non-binary feedback scheme. These schemes combine locality estimation, localized monitoring, and node association, and the results apply to both connected and disconnected networks. The schemes achieve high failure detection rates, low false positive rates, and low communication overhead.


2019 ◽  
Vol 122 (2) ◽  
pp. 797-808 ◽  
Author(s):  
Shintaro Uehara ◽  
Firas Mawase ◽  
Amanda S. Therrien ◽  
Kendra M. Cherry-Allen ◽  
Pablo Celnik

Motor exploration, a trial-and-error process in search for better motor outcomes, is known to serve a critical role in motor learning. This is particularly relevant during reinforcement learning, where actions leading to a successful outcome are reinforced while unsuccessful actions are avoided. Although early on motor exploration is beneficial to finding the correct solution, maintaining high levels of exploration later in the learning process might be deleterious. Whether and how the level of exploration changes over the course of reinforcement learning, however, remains poorly understood. Here we evaluated temporal changes in motor exploration while healthy participants learned a reinforcement-based motor task. We defined exploration as the magnitude of trial-to-trial change in movements as a function of whether the preceding trial resulted in success or failure. Participants were required to find the optimal finger-pointing direction using binary feedback of success or failure. We found that the magnitude of exploration gradually increased over time when participants were learning the task. Conversely, exploration remained low in participants who were unable to correctly adjust their pointing direction. Interestingly, exploration remained elevated when participants underwent a second training session, which was associated with faster relearning. These results indicate that the motor system may flexibly upregulate the extent of exploration during reinforcement learning as if acquiring a specific strategy to facilitate subsequent learning. Also, our findings showed that exploration affects reinforcement learning and vice versa, indicating an interactive relationship between them. Reinforcement-based tasks could be used as primers to increase exploratory behavior leading to more efficient subsequent learning. NEW & NOTEWORTHY Motor exploration, the ability to search for the correct actions, is critical to learning motor skills. 
Despite this, whether and how the level of exploration changes over the course of training remains poorly understood. We showed that exploration increased and remained high throughout training of a reinforcement-based motor task. Interestingly, elevated exploration persisted and facilitated subsequent learning. These results suggest that the motor system upregulates exploration as if learning a strategy to facilitate subsequent learning.
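The study's exploration measure, as described, is the trial-to-trial change in movement conditioned on whether the preceding trial succeeded. A minimal sketch of that computation (the function name and the data in the usage test are hypothetical, not from the study):

```python
import numpy as np

def exploration_by_outcome(angles, successes):
    """Mean absolute trial-to-trial change in reach angle, split by
    whether the *preceding* trial was a success or a failure."""
    angles = np.asarray(angles, dtype=float)
    successes = np.asarray(successes, dtype=bool)
    deltas = np.abs(np.diff(angles))   # change from trial t to trial t+1
    prev = successes[:-1]              # binary outcome of trial t
    after_success = deltas[prev].mean() if prev.any() else np.nan
    after_failure = deltas[~prev].mean() if (~prev).any() else np.nan
    return after_success, after_failure
```

A large change after failure relative to after success is the signature of exploration; tracking this ratio in a sliding window over trials gives the temporal profile the study analyzes.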


Technologies ◽  
2019 ◽  
Vol 7 (1) ◽  
pp. 22 ◽  
Author(s):  
Ramiro Sámano-Robles

This paper investigates backlog retransmission strategies for a class of random access protocols with retransmission diversity (i.e., network diversity multiple access, or NDMA) combined with multiple-antenna-based multi-packet reception (MPR). This paper proposes NDMA-MPR as a candidate for 5G contention-based and ultra-low latency multiple access. This proposal is based on the following known features of NDMA-MPR: (1) near collision-free performance, (2) very low latency, and (3) reduced feedback complexity (binary feedback). These features match the machine-type traffic, real-time, and dense object connectivity requirements of 5G. This work extends previous work by using a multiple-antenna receiver with correlated Rician channels and co-channel interference modelled as a Rayleigh fading variable. Two backlog retransmission strategies are implemented: persistent and randomized. Performance bounds and an extended analysis of the system are obtained for different network and channel conditions. Average delay is evaluated using the M/G/1 queue model with statistically independent vacations. The results suggest that NDMA-MPR can achieve very low latency that can guarantee real- or near-real-time performance for multiple access in 5G, even in scenarios with high correlation and moderate co-channel interference.
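The delay evaluation rests on the classic decomposition result for M/G/1 queues with independent (multiple) vacations: the mean wait is the ordinary Pollaczek-Khinchine wait plus the mean residual vacation time. A small sketch of that formula (the numeric values in the test are hypothetical, not taken from the paper):

```python
def mg1_vacation_wait(lam, es, es2, ev, ev2):
    """Mean waiting time in an M/G/1 queue with multiple independent
    vacations: ordinary M/G/1 (Pollaczek-Khinchine) wait plus the mean
    residual vacation time.
    lam: Poisson arrival rate; es, es2: first two moments of service
    time; ev, ev2: first two moments of vacation time."""
    rho = lam * es
    assert rho < 1.0, "queue must be stable (rho < 1)"
    pk_wait = lam * es2 / (2.0 * (1.0 - rho))  # standard M/G/1 wait
    residual_vacation = ev2 / (2.0 * ev)       # added vacation penalty
    return pk_wait + residual_vacation
```

In the NDMA-MPR setting the "vacation" models intervals in which the receiver is busy resolving a collision epoch rather than serving the tagged backlog, which is why the independent-vacations model fits.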


2019 ◽  
Author(s):  
Peter Holland ◽  
Olivier Codol ◽  
Elizabeth Oxley ◽  
Madison Taylor ◽  
Elizabeth Hamshere ◽  
...  

Abstract The addition of rewarding feedback to motor learning tasks has been shown to increase the retention of learning, spurring interest in its possible utility for rehabilitation. However, laboratory-based motor tasks employing rewarding feedback have repeatedly been shown to lead to great inter-individual variability in performance. Understanding the causes of such variability is vital for maximising the potential benefits of reward-based motor learning. Thus, using a large cohort (n=241), we examined whether spatial (SWM), verbal (VWM) and mental rotation (RWM) working memory capacity and dopamine-related genetic profiles were associated with performance in two reward-based motor tasks. The first task assessed participants' ability to follow a hidden and slowly shifting reward region based on hit/miss (binary) feedback. The second task investigated participants' capacity to preserve performance with binary feedback after adapting to the rotation with full visual feedback. Our results demonstrate that higher SWM is associated with greater success and a greater capacity to reproduce a successful motor action, measured as change in reach angle following reward, whereas higher RWM was predictive of an increased propensity to express an explicit strategy when required to make large adjustments in reach angle. Therefore, both SWM and RWM were reliable predictors of success during reward-based motor learning. Change in reach direction following failure was also a strong predictor of success rate, although we observed no consistent relationship with any type of working memory. Surprisingly, no dopamine-related genotypes predicted performance. Therefore, working memory capacity plays a pivotal role in determining individual ability in reward-based motor learning.
Significance statement Reward-based motor learning tasks have repeatedly been shown to lead to idiosyncratic behaviours that cause varying degrees of task success. Yet, the factors determining an individual's capacity to use reward-based feedback are unclear. Here, we assessed a wide range of possible candidate predictors and demonstrate that domain-specific working memory plays an essential role in determining individual capacity to use reward-based feedback. Surprisingly, genetic variations in dopamine availability were not found to play a role. This is in stark contrast with seminal work in the reinforcement and decision-making literature, which shows strong and replicated effects of the same dopaminergic genes in decision-making. Therefore, our results provide novel insights into reward-based motor learning, highlighting a key role for domain-specific working memory capacity.


Author(s):  
Yaroslav Toroshanko

A congestion control scheme using feedback on the sign of the sensitivity function of telecommunication network performance is considered. The sign of the performance sensitivity (the target function) provides the optimal direction for adjusting the data source rate. The most common scheme for binary feedback notification of possible network congestion is to detect congestion and set a congestion bit based on the length of the incoming packet queue. The main advantage of a queue-based scheme is its low complexity: the queue length can be tracked by a single counter. However, this method can create large queues in network nodes: detecting the onset of congestion is delayed by the time it takes the queue to build up, and, similarly, detecting the relief of congestion is delayed by the time required to drain the queue. To determine the sensitivity function of network performance, a simple neural-network model of the dynamic system is proposed. The developed optimal control algorithm forms the control signal so that the system output is as close as possible to the pre-established characteristics of the network. A model for predicting the queue state is also proposed. Simulation results show that the sensitivity-based scheme performs better than the usual threshold-based scheme: it gives smaller variations in queue size and source rates, and the fluctuations do not grow significantly even under large changes in the service rates of bottlenecks in the telecommunication network or its fragments.
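The queue-threshold baseline the abstract describes (set a congestion bit from the incoming queue length; the source adjusts its rate on the binary feedback) can be sketched in a few lines. The marking threshold and the multiplicative-decrease/additive-increase constants below are illustrative assumptions in the style of DECbit-like schemes, not values from the paper:

```python
def congestion_bit(queue_len, threshold=10):
    """Queue-threshold marking: set the binary congestion bit when the
    queue length meets or exceeds the threshold (a single counter)."""
    return queue_len >= threshold

def adjust_rate(rate, bit, decrease=0.875, increase=0.5, max_rate=10.0):
    """Binary-feedback source control: multiplicative decrease when the
    congestion bit is set, additive increase otherwise."""
    return rate * decrease if bit else min(rate + increase, max_rate)
```

The lag the abstract criticizes is visible here: the bit flips only after the queue has already built up past the threshold, which is exactly what the sensitivity-based alternative is meant to avoid.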


Author(s):  
Yanan Sui ◽  
Masrour Zoghi ◽  
Katja Hofmann ◽  
Yisong Yue

The dueling bandits problem is an online learning framework where learning happens "on-the-fly" through preference feedback, i.e., from comparisons between a pair of actions. Unlike conventional online learning settings that require absolute feedback for each action, the dueling bandits framework assumes only the presence of (noisy) binary feedback about the relative quality of each pair of actions. The dueling bandits problem is well-suited for modeling settings that elicit subjective or implicit human feedback, which is typically more reliable in preference form. In this survey, we review recent results in the theory, algorithms, and applications of the dueling bandits problem. As an emerging domain, dueling bandits has seen its theory and algorithms intensively studied during the past few years. We provide an overview of recent advancements, including algorithmic advances and applications. We discuss extensions to the standard problem formulation and novel application areas, highlighting key open research questions in our discussion.
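To make the feedback model concrete: each round the learner picks a pair of arms and observes only which one won the duel. A minimal explore-then-commit sketch under an assumed logistic preference model (both the model and the Borda-score selection rule are illustrative choices, not a specific algorithm from the survey):

```python
import numpy as np

def duel(i, j, qualities, rng):
    """Noisy binary preference: True iff arm i beats arm j. The win
    probability is a logistic function of the quality gap (assumption)."""
    p = 1.0 / (1.0 + np.exp(-(qualities[i] - qualities[j])))
    return rng.random() < p

def empirical_winner(qualities, rounds, rng):
    """Duel uniformly random pairs and return the arm with the highest
    empirical Borda score (fraction of its duels won)."""
    K = len(qualities)
    wins = np.zeros(K)
    plays = np.zeros(K)
    for _ in range(rounds):
        i, j = rng.choice(K, size=2, replace=False)
        winner = i if duel(i, j, qualities, rng) else j
        wins[winner] += 1
        plays[i] += 1
        plays[j] += 1
    return int(np.argmax(wins / np.maximum(plays, 1)))
```

Note that the learner never sees the `qualities` vector, only the binary duel outcomes, which is precisely the restriction that distinguishes dueling bandits from conventional bandit feedback.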


2017 ◽  
Vol 28 (10) ◽  
pp. 3478-3490 ◽  
Author(s):  
Shintaro Uehara ◽  
Firas Mawase ◽  
Pablo Celnik

Abstract Humans can acquire new motor behavior via different forms of learning. The two forms most commonly studied have been the development of internal models based on sensory-prediction errors (error-based learning) and success-based feedback (reinforcement learning). Human behavioral studies suggest these are distinct learning processes, though the neurophysiological mechanisms involved have not been characterized. Here, we evaluated physiological markers from the cerebellum and the primary motor cortex (M1) using noninvasive brain stimulation while healthy participants trained on finger-reaching tasks. We manipulated the extent to which subjects relied on error-based or reinforcement mechanisms by providing either vector or binary feedback about task performance. Our results demonstrated a double dissociation: learning the task mainly via error-based mechanisms led to cerebellar plasticity modifications but not long-term potentiation (LTP)-like plasticity changes in M1, whereas learning a similar action via reinforcement mechanisms elicited M1 LTP-like plasticity but not cerebellar plasticity changes. Our findings indicate that learning complex motor behavior is mediated by the interplay of different forms of learning, weighing distinct neural mechanisms in M1 and the cerebellum. Our study provides insights for designing effective interventions to enhance human motor learning.


2015 ◽  
Vol 809 (1) ◽  
pp. L16 ◽  
Author(s):  
Stephen Justham ◽  
Eric W. Peng ◽  
Kevin Schawinski
