Value Learning through Reinforcement

2014 ◽  
pp. 283-298 ◽  
Author(s):  
Nathaniel D. Daw ◽  
Philippe N. Tobler
Keyword(s):  
Author(s):  
Joan E. Grusec

This chapter surveys how behavior, affect, and cognition with respect to parenting and moral development have been conceptualized over time. It moves to a discussion of domains of socialization; that is, different contexts in which socialization occurs and where different mechanisms operate. Domains include protection where the child is experiencing negative affect, reciprocity where there is an exchange of favors, group participation or learning through observing others and engaging with them in positive action, guided learning where values are taught in the child’s zone of proximal development, and control where values are learned through discipline and reward. Research using narratives of young adults about value-learning events suggests that inhibition of antisocial behavior is more likely learned in the control domain, and prosocial behavior more likely in the group participation domain. Internalization of values, measured by narrative meaningfulness, is most likely in the group participation domain.


2020 ◽  
Vol 209 ◽  
pp. 103134 ◽  
Author(s):  
Jonathan Rittmo ◽  
Rickard Carlsson ◽  
Pierre Gander ◽  
Robert Lowe

2018 ◽  
Author(s):  
Hilary Don ◽  
A Ross Otto ◽  
Astin Cornwall ◽  
Tyler Davis ◽  
Darrell A. Worthy

Learning about reward and expected values of choice alternatives is critical for adaptive behavior. Although human choice is affected by the presentation frequency of reward-related alternatives, this is overlooked by some dominant models of value learning. For instance, the delta rule learns average rewards, whereas the decay rule learns cumulative rewards for each option. In a binary-outcome choice task, participants chose between pairs of options that had reward probabilities of .65 (A) versus .35 (B) or .75 (C) versus .25 (D). Crucially, during training there were twice as many AB trials as CD trials, so option A was associated with higher cumulative reward, while option C gave higher average reward. Participants then decided between novel combinations of options (e.g., AC). Participants preferred option A, a result predicted by the Decay model but not the Delta model. This suggests that expected values are based more on total reward than on average reward.
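
The contrast between the two rules can be illustrated with a minimal sketch. It is not the authors' implementation. The learning rate `alpha`, the decay parameter `decay`, the function names, and the repeating AB, AB, CD trial order are all assumptions made for illustration. To keep the example deterministic, each trial delivers the expected reward (the reward probability) instead of a sampled binary outcome, and only options A and C are tracked, as if they were chosen whenever their pair appeared.

```python
def delta_update(q, reward, alpha=0.1):
    """Delta rule: move the estimate toward the reward (tracks the average)."""
    return q + alpha * (reward - q)


def decay_update(q, reward, decay=0.9):
    """Decay rule: shrink the old value, then add the reward (tracks a running total)."""
    return decay * q + reward


def simulate(n_blocks=100, alpha=0.1, decay=0.9):
    # Reward probabilities from the abstract; used here as expected rewards.
    p = {"A": 0.65, "C": 0.75}
    # Assumed schedule: twice as many AB trials as CD trials.
    schedule = ["A", "A", "C"]
    delta_q = {"A": 0.0, "C": 0.0}
    decay_q = {"A": 0.0, "C": 0.0}
    for _ in range(n_blocks):
        for shown in schedule:
            # Delta rule: only the option experienced on this trial is updated.
            delta_q[shown] = delta_update(delta_q[shown], p[shown], alpha)
            # Decay rule: every option decays on every trial;
            # only the experienced option receives reward.
            for opt in decay_q:
                reward = p[opt] if opt == shown else 0.0
                decay_q[opt] = decay_update(decay_q[opt], reward, decay)
    return delta_q, decay_q


if __name__ == "__main__":
    delta_q, decay_q = simulate()
    print("delta:", delta_q)
    print("decay:", decay_q)
```

Under these assumptions the delta-rule values settle near the reward probabilities, so C ends above A, while the decay-rule values reflect how often each option was rewarded, so A ends above C. This matches the qualitative pattern described in the abstract; the exact numbers depend on the assumed parameters.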

