Visual and Affective Grounding in Language and Mind
One of the main limitations of natural language-based approaches to meaning is that they are not grounded. In this study, we evaluate how well different kinds of models account for people’s representations of both concrete and abstract concepts. The models are unimodal (language-based only) models and multimodal distributional semantic models (which additionally incorporate perceptual and/or affective information). The language-based models include both external language (based on text corpora) and internal language (derived from word associations). We present two new studies and a re-analysis of a series of previous studies demonstrating that unimodal performance is substantially higher for internal models, especially when comparisons at the basic level are considered. For multimodal models, our findings suggest that additional visual and affective features yield only slightly more accurate mental representations of word meaning than what is already encoded in internal language models; for abstract concepts, however, visual and affective features improve the predictions of external text-based models. Our work presents new evidence that the grounding problem extends to abstract words as well and is therefore more widespread than previously suggested. Implications for both embodied and distributional views are discussed.
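To make the evaluation paradigm concrete, below is a minimal sketch of how unimodal and multimodal similarity predictions can be compared against human judgments. All data are synthetic, the concatenation-based fusion is only one of several possible strategies, and nothing here reflects the actual models, feature sets, or datasets used in the paper; it is an illustration of the general approach, not the reported method.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)

# Hypothetical data: 50 words with 300-d text embeddings (e.g., from a
# corpus-based model), 10-d visual feature vectors, and 3-d affective
# ratings (e.g., valence, arousal, dominance). All values are synthetic.
n_words = 50
text_vecs = rng.standard_normal((n_words, 300))
visual_vecs = rng.standard_normal((n_words, 10))
affective_vecs = rng.standard_normal((n_words, 3))

def pairwise_cosine(matrix):
    """Cosine similarity between every pair of row vectors."""
    normed = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return normed @ normed.T

def unique_pairs(sim):
    """Flatten the upper triangle (unique word pairs) of a similarity matrix."""
    i, j = np.triu_indices(sim.shape[0], k=1)
    return sim[i, j]

# Unimodal model: text embeddings alone.
unimodal_sims = unique_pairs(pairwise_cosine(text_vecs))

# Multimodal model: a simple concatenation of text, visual, and
# affective features (a hypothetical fusion strategy for illustration).
multimodal = np.hstack([text_vecs, visual_vecs, affective_vecs])
multimodal_sims = unique_pairs(pairwise_cosine(multimodal))

# Hypothetical human similarity judgments for the same word pairs.
human_sims = rng.uniform(0, 1, size=unimodal_sims.shape)

# Evaluate each model by rank-correlating its predicted similarities
# with the human judgments.
for name, sims in [("unimodal", unimodal_sims), ("multimodal", multimodal_sims)]:
    rho, _ = spearmanr(sims, human_sims)
    print(f"{name}: Spearman rho = {rho:.3f}")
```

Under this kind of setup, the question the abstract raises reduces to whether the multimodal correlation reliably exceeds the unimodal one, and whether that gap differs for concrete versus abstract word pairs.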