In uncertain environments we must balance our need to gather information with our desire to reap rewards by exploiting current knowledge. Achieving this balance is further complicated in reactive environments where actions produce long-lasting change to the system. In four experiments, we investigate how people learn to make effective decisions from experience in a dynamic multi-armed bandit task. In contrast to the typical exploitation-dependent diminishing rewards found in previous studies, options were framed as skills that developed greater rewards the more they were chosen. In Experiment 1, we provide a proof of concept, and in Experiments 2-4 we explore the boundaries of participants’ sensitivity to reactive dynamics. Our results suggest that most individuals can learn effective strategies for coping with these reactive environments. A two-part comparison of several competing psychological models supports several conclusions: 1) a sizeable minority of individuals learned that their environment was reactive, 2) several distinct groups of individuals employed unique decision strategies, and 3) testing models with the simulation method reveals qualitative misfits that motivate future theory development.