matching law
Recently Published Documents


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Ethan Trepka ◽  
Mehran Spitmaan ◽  
Bilal A. Bari ◽  
Vincent D. Costa ◽  
Jeremiah Y. Cohen ◽  
...  

For decades, behavioral scientists have used the matching law to quantify how animals distribute their choices between multiple options in response to reinforcement they receive. More recently, many reinforcement learning (RL) models have been developed to explain choice by integrating reward feedback over time. Despite reasonable success of RL models in capturing choice on a trial-by-trial basis, these models cannot capture variability in matching behavior. To address this, we developed metrics based on information theory and applied them to choice data from dynamic learning tasks in mice and monkeys. We found that a single entropy-based metric can explain 50% and 41% of variance in matching in mice and monkeys, respectively. We then used limitations of existing RL models in capturing entropy-based metrics to construct more accurate models of choice. Together, our entropy-based metrics provide a model-free tool to predict adaptive choice behavior and reveal underlying neural mechanisms.
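As a rough illustration of what an entropy-based choice metric can look like, the sketch below computes the conditional entropy of a stay/switch strategy given the previous trial's reward. This is an assumption for illustration only: the abstract does not define its metrics, and the function names and the toy data here are hypothetical.

```python
import math
from collections import Counter


def stay_switch(choices):
    """Label each trial after the first as 'stay' or 'switch' relative to the previous choice."""
    return ["stay" if a == b else "switch" for a, b in zip(choices, choices[1:])]


def conditional_entropy(strategy, reward):
    """Return H(strategy | reward) in bits for two equal-length sequences."""
    if len(strategy) != len(reward):
        raise ValueError("sequences must have equal length")
    n = len(strategy)
    joint = Counter(zip(strategy, reward))
    marginal = Counter(reward)
    h = 0.0
    for (_, r), count in joint.items():
        p_joint = count / n            # P(strategy, reward)
        p_cond = count / marginal[r]   # P(strategy | reward)
        h -= p_joint * math.log2(p_cond)
    return h


# Hypothetical session: choices between two options and binary reward outcomes.
choices = ["L", "L", "R", "R", "R", "L", "L", "R"]
rewards = [1, 0, 1, 1, 0, 1, 0, 1]

strategy = stay_switch(choices)   # strategy on trial t+1 ...
prev_reward = rewards[:-1]        # ... paired with the reward on trial t
print(conditional_entropy(strategy, prev_reward))
```

A value of 0 bits means the previous reward fully determines whether the subject stays or switches (for example, strict win-stay/lose-switch); 1 bit means the previous reward carries no information about the next stay/switch decision.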


2021 ◽  
Author(s):  
Heather L Peters

Self-control has been extensively studied using procedures in which subjects choose between two reinforcer alternatives. Traditionally, one of those alternatives delivers a small reinforcer after a short delay (SI) and the other a larger reinforcer after a long delay (LD). Choosing the SI is defined as impulsivity because it requires forfeiting the larger reinforcer, and choosing the LD is termed self-control. Four experiments were conducted to examine behaviour using non-human animal analogues of self-control situations. The subjects used for all four experiments were Norway-hooded rats. Experiment 1 used an SI - LD self-control paradigm to examine the effect of manipulating reinforcer quality on response distribution. Behaviour became more impulsive as the delay ratio became more extreme, and this tendency was more systematic when reinforcers of different quality were used for the SI and LD alternatives. Experiments 2 and 3 introduced a novel self-control paradigm designed as an analogue of choice situations in which individuals choose between two competing, immediately available reinforcers, each associated with a different delayed reinforcer. The procedure used was a concurrent-chains schedule that delivered primary reinforcement in the initial and the terminal links. The initial reinforcers were of equal amount and unequal quality; the terminal reinforcers were of unequal amount and equal quality. An impulsive choice was defined as choosing the alternative that delivered the most-valuable reinforcer in the initial link and the least-valuable reinforcer in the terminal link. A self-controlled choice was defined as choosing the alternative that delivered the least-valuable reinforcer in the initial link and the most-valuable reinforcer in the terminal link.
The results indicated that behaviour was more self-controlled when the terminal reinforcer quality was ethanol solution, and that increasing the delay between the initial and terminal links increased subjects' responding on the impulsive choice. Behaviour allocation in Experiment 3 was well described by the Contextual Choice Model (Grace, 1994) when the temporal context scaling parameter (k) was allowed to vary. Subjects that were relatively more impulsive had lower derived k values. The final experiment presented the subjects from Experiment 3 with concurrent variable interval (VI) VI schedules in which one alternative delivered plain-sucrose solution and the other ethanol-sucrose solution. Preference measures obtained from Experiment 4 were negatively correlated with the values obtained for the scaling parameter in Experiment 3, indicating that subjects that were more impulsive in the MN - ML paradigm had a stronger preference for ethanol. In summary, the findings indicate that reinforcer quality may change the discriminability of reinforcer alternatives, and that the influence of reinforcer quality on response allocation is well described by quantitative models based on the Matching Law.


2021 ◽  
Vol 25 (12) ◽  
pp. 1665-1665
Author(s):  
Emi Furukawa ◽  
Brent Alsop ◽  
Shizuka Shimabukuro ◽  
Paula Sowerby ◽  
Stephanie Jensen ◽  
...  

Background: Research on altered motivational processes in ADHD has focused on reward. The sensitivity of children with ADHD to punishment has received limited attention. We evaluated the effects of punishment on the behavioral allocation of children with and without ADHD from the United States, New Zealand, and Japan, applying the generalized matching law. Methods: Participants in two studies (Furukawa et al., 2017, 2019) were 210 English-speaking (145 ADHD) and 93 Japanese-speaking (34 ADHD) children. They completed an operant task in which they chose between playing two simultaneously available games. Rewards became available every 10 seconds on average, arranged equally across the two games. Responses on one game were punished four times as often as responses on the other. The asymmetrical punishment schedules should bias responding to the less punished alternative. Results: Compared with controls, children with ADHD from both samples allocated significantly more responses to the less frequently punished game, suggesting greater behavioral sensitivity to punishment. For these children, the bias toward the less punished alternative increased with time on task. Avoiding the more punished game resulted in missed reward opportunities and reduced earnings. English-speaking controls showed some preference for the less punished game. The behavior of Japanese controls was not significantly influenced by the frequency of punishment, despite slowed response times after punished trials and immediate shifts away from the punished game, indicating awareness of punishment. Conclusion: Punishment exerted greater control over the behavior of children with ADHD, regardless of their cultural background. This may be a common characteristic of the disorder. Avoidance of punishment led to poorer task performance. Caution is required in the use of punishment, especially with children with ADHD. 
The group difference in punishment sensitivity was more pronounced in the Japanese sample; this may create a negative halo effect for children with ADHD in this culture.
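For readers unfamiliar with the generalized matching law, it is commonly written as log(B1/B2) = a·log(R1/R2) + log b, where B denotes responses, R denotes obtained reinforcers, a is sensitivity, and b is bias. The sketch below fits a and b by ordinary least squares on hypothetical data; the function name and all numbers are assumptions for illustration and are not taken from the study, whose exact fitting procedure is not given in the abstract.

```python
import math


def fit_generalized_matching(responses_1, responses_2, reinforcers_1, reinforcers_2):
    """Least-squares fit of log10(B1/B2) = a * log10(R1/R2) + log10(b).

    Returns (a, b): sensitivity to reinforcement and bias toward alternative 1.
    """
    xs = [math.log10(r1 / r2) for r1, r2 in zip(reinforcers_1, reinforcers_2)]
    ys = [math.log10(b1 / b2) for b1, b2 in zip(responses_1, responses_2)]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two conditions")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("reinforcer ratios must vary across conditions to estimate sensitivity")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sensitivity = sxy / sxx
    bias = 10 ** (mean_y - sensitivity * mean_x)
    return sensitivity, bias


# Hypothetical conditions generated with a = 0.8 and b = 1.5 (not real data).
reinforcers_1 = [10, 20, 40, 80]
reinforcers_2 = [80, 40, 20, 10]
responses_2 = [100.0, 100.0, 100.0, 100.0]
responses_1 = [100.0 * 1.5 * (r1 / r2) ** 0.8 for r1, r2 in zip(reinforcers_1, reinforcers_2)]

a, b = fit_generalized_matching(responses_1, responses_2, reinforcers_1, reinforcers_2)
print(round(a, 3), round(b, 3))
```

When rewards are arranged equally across the two alternatives, log(R1/R2) is close to zero, so under this equation any systematic preference appears in the bias term b rather than in the sensitivity a.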


PLoS ONE ◽  
2021 ◽  
Vol 16 (8) ◽  
pp. e0252540
Author(s):  
Andrew W. Lo ◽  
Katherine P. Marlowe ◽  
Ruixun Zhang

Probability matching, also known as the “matching law” or Herrnstein’s Law, has long puzzled economists and psychologists because of its apparent inconsistency with basic self-interest. We conduct an experiment with real monetary payoffs in which each participant plays a computer game to guess the outcome of a binary lottery. In addition to finding strong evidence for probability matching, we document different tendencies towards randomization in different payoff environments—as predicted by models of the evolutionary origin of probability matching—after controlling for a wide range of demographic and socioeconomic variables. We also find several individual differences in the tendency to maximize or randomize, correlated with wealth and other socioeconomic factors. In particular, subjects who have taken probability and statistics classes and those who self-reported finding a pattern in the game are found to have randomized more, contrary to the common wisdom that those with better understanding of probabilistic reasoning are more likely to be rational economic maximizers. Our results provide experimental evidence that individuals—even those with experience in probability and investing—engage in randomized behavior and probability matching, underscoring the role of the environment as a driver of behavioral anomalies.
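The apparent inconsistency with self-interest can be made concrete with a short calculation. The sketch below assumes a binary lottery in which one outcome occurs with a fixed probability and guesses are made independently of the outcome; the function name and the value 0.7 are hypothetical and are not taken from the experiment.

```python
def expected_accuracy(p_outcome, p_guess):
    """Probability of a correct guess when outcome A occurs with probability
    p_outcome and the participant independently guesses A with probability p_guess."""
    return p_outcome * p_guess + (1 - p_outcome) * (1 - p_guess)


p = 0.7  # hypothetical probability of the more likely outcome

matching = expected_accuracy(p, p)       # guess A as often as A occurs
maximizing = expected_accuracy(p, 1.0)   # always guess the more likely outcome

print(f"matching:   {matching:.2f}")
print(f"maximizing: {maximizing:.2f}")
```

Under these assumptions, matching yields an expected accuracy of p² + (1 − p)², which is below the maximizing accuracy of p whenever 0.5 < p < 1.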


2021 ◽  
Author(s):  
Ethan Trepka ◽  
Mehran Spitmaan ◽  
Bilal A Bari ◽  
Vincent D Costa ◽  
Jeremiah Y Cohen ◽  
...  

For decades, behavioral scientists have used the matching law to quantify how animals distribute their choices between multiple options in response to reinforcement they receive. More recently, many reinforcement learning (RL) models have been developed to explain choice by integrating reward feedback over time. Despite reasonable success of RL models in capturing choice on a trial-by-trial basis, these models cannot capture variability in matching. To address this, we developed novel metrics based on information theory and applied them to choice data from dynamic learning tasks in mice and monkeys. We found that a single entropy-based metric can explain 50% and 41% of variance in matching in mice and monkeys, respectively. We then used limitations of existing RL models in capturing entropy-based metrics to construct a more accurate model of choice. Together, our novel entropy-based metrics provide a powerful, model-free tool to predict adaptive choice behavior and reveal underlying neural mechanisms.

