Animacy and object size are reflected in perceptual similarity computations by the preschool years

2019
Author(s): Bria Long, Mariko Moher, Susan Carey, Talia Konkle

By adulthood, animacy and object size jointly structure neural responses in visual cortex and influence perceptual similarity computations. Here, we take a first step in asking about the development of these aspects of cognitive architecture by probing whether animacy and object size are reflected in perceptual similarity computations by the preschool years. We used visual search performance as an index of perceptual similarity, as research with adults suggests search is slower when distractors are perceptually similar to the target. Preschoolers found target pictures more quickly when the targets differed from the distractor pictures in either animacy (Experiment 1) or real-world size (Experiment 2; the pictures themselves were all the same size) than when they did not. Taken together, these results suggest that by the preschool years the visual system has abstracted perceptual features for animates vs. inanimates and big vs. small objects as classes, and they call for further research exploring the development of these perceptual representations and their consequences for neural organization in childhood.

Author(s): Gwendolyn Rehrig, Reese A. Cullimore, John M. Henderson, Fernanda Ferreira

Abstract: According to the Gricean Maxim of Quantity, speakers provide the amount of information listeners require to correctly interpret an utterance, and no more (Grice in Logic and conversation, 1975). However, speakers often violate the Maxim of Quantity, especially when the redundant information improves reference precision (Degen et al. in Psychol Rev 127(4):591–621, 2020). Redundant (non-contrastive) information may facilitate real-world search if it narrows the spatial scope under consideration or improves target template specificity. The current study investigated whether non-contrastive modifiers that improve reference precision facilitate visual search in real-world scenes. In two visual search experiments, we compared search performance when perceptually relevant but non-contrastive modifiers were included in the search instruction. Participants (N = 48 in each experiment) searched for a unique target object following a search instruction that contained either no modifier, a location modifier (Experiment 1: on the top left; Experiment 2: on the shelf), or a color modifier (the black lamp). In Experiment 1 only, the target was located faster when the verbal instruction included either modifier, and there was an overall benefit of color modifiers in a combined analysis of the scenes and conditions common to both experiments. The results suggest that violations of the Maxim of Quantity can facilitate search when the violations include task-relevant information that either augments the target template or constrains the search space, and when at least one modifier provides a highly reliable cue. Consistent with Degen et al. (2020), we conclude that listeners benefit from non-contrastive information that improves reference precision, and engage in rational reference comprehension.

Significance statement: This study investigated whether providing more information than someone needs to find an object in a photograph helps them to find that object more easily, even though it means they need to interpret a more complicated sentence. Before searching a scene, participants were either given information about where the object would be located in the scene, told what color the object was, or only told what object to search for. The results showed that additional information helped participants locate an object in an image more easily only when at least one piece of information communicated what part of the scene the object was in, which suggests that more information can be beneficial as long as it is specific and helps the recipient achieve a goal. We conclude that people will pay attention to redundant information when it supports their task. In practice, our results suggest that instructions in other contexts (e.g., real-world navigation, using a smartphone app, prescription instructions) can benefit from the inclusion of what appears to be redundant information.


2017 · Vol 117 (1) · pp. 388-402
Author(s): Michael A. Cohen, George A. Alvarez, Ken Nakayama, Talia Konkle

Visual search is a ubiquitous visual behavior, and efficient search is essential for survival. Different cognitive models have explained the speed and accuracy of search based either on the dynamics of attention or on the similarity of item representations. Here, we examined the extent to which performance on a visual search task can be predicted from the stable representational architecture of the visual system, independent of attentional dynamics. Participants performed a visual search task with 28 conditions reflecting different pairs of categories (e.g., searching for a face among cars, a body among hammers, etc.). The time it took participants to find the target item varied as a function of category combination. In a separate group of participants, we measured the neural responses to these object categories when items were presented in isolation. Using representational similarity analysis, we then examined whether the similarity of neural responses across different subdivisions of the visual system had the requisite structure to predict visual search performance. Overall, we found strong brain/behavior correlations across most of the higher-level visual system, including both the ventral and dorsal pathways, at the level of both macroscale sectors and smaller mesoscale regions. These results suggest that visual search for real-world object categories is well predicted by the stable, task-independent architecture of the visual system.

NEW & NOTEWORTHY: Here, we ask which neural regions have neural response patterns that correlate with behavioral performance in a visual processing task. We found that the representational structure across all of high-level visual cortex has the requisite structure to predict behavior. Furthermore, when directly comparing different neural regions, we found that they all had highly similar category-level representational structures. These results point to a ubiquitous and uniform representational structure in high-level visual cortex underlying visual object processing.
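To make the representational similarity logic concrete, here is a minimal sketch, not the authors' pipeline: with eight categories, the pairwise neural dissimilarities form the same 28 pairings as the behavioral conditions, and the brain/behavior link is a rank correlation between those dissimilarities and the corresponding search times. All data and variable names below are hypothetical placeholders.

```python
# Sketch of a brain/behavior representational similarity analysis.
# All data here are random placeholders, not the study's measurements.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_categories, n_voxels = 8, 100

# Hypothetical mean response pattern per category in one region of interest.
patterns = rng.normal(size=(n_categories, n_voxels))

# Neural dissimilarity for every category pair (1 - Pearson correlation);
# 8 categories yield 28 pairs, matching the 28 search conditions.
neural_rdm = pdist(patterns, metric="correlation")

# Hypothetical mean search RT for the same 28 category pairings; the
# prediction is that more similar pairs (lower dissimilarity) are searched slower.
search_rts = rng.normal(loc=1.0, scale=0.2, size=neural_rdm.shape)

# Rank correlation between the two vectors is the brain/behavior correlation.
rho, p = spearmanr(neural_rdm, search_rts)
print(f"brain/behavior correlation: rho = {rho:.2f}, p = {p:.3f}")
```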


2019 · Vol 5 (1)
Author(s): Alejandro Lleras, Zhiyuan Wang, Anna Madison, Simona Buetti

Recently, Wang, Buetti, and Lleras (2017) developed an equation to predict search performance in heterogeneous visual search scenes (i.e., scenes with multiple types of non-target objects simultaneously present) based on parameters observed when participants perform search in homogeneous scenes (i.e., when all non-target objects are identical to one another). The equation was based on a computational model in which every item in the display is processed with unlimited capacity and independently of the others, with the goal of determining whether the item is likely to be a target or not. The model was tested in two experiments using real-world objects. Here, we extend those findings by testing the predictive power of the equation on simpler objects. Further, we compare the model's performance under two stimulus arrangements: spatially intermixed displays (items randomly placed around the scene) and spatially segregated displays (identical items presented near each other). This comparison allowed us to isolate and quantify the facilitatory effect of processing displays that contain identical items (homogeneity facilitation), a factor that improves performance in visual search above and beyond target-distractor dissimilarity. The results suggest that homogeneity facilitation effects in search arise from local item-to-item interactions (rather than from rejecting items as "groups") and that the strength of those interactions might be determined by stimulus complexity (with simpler stimuli producing stronger interactions and thus stronger homogeneity facilitation effects).
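As a rough illustration of the kind of prediction at stake, the sketch below assumes the model's usual form, in which each distractor type contributes a logarithmic term whose slope reflects its similarity to the target. This is our paraphrase of Wang, Buetti, and Lleras (2017), not a verbatim reproduction of their equation, and all parameter values are hypothetical.

```python
# Sketch: predict heterogeneous-search RT from homogeneous-search parameters.
# Functional form and all parameter values are illustrative assumptions.
import math

def predict_heterogeneous_rt(intercept, slopes, counts):
    """Predict mean RT (ms) for a display mixing several distractor types.

    intercept -- baseline RT estimated from homogeneous search
    slopes    -- {distractor_type: D}, log-slopes from homogeneous search,
                 larger when that distractor type resembles the target
    counts    -- {distractor_type: N}, how many of each type are present
    """
    return intercept + sum(
        slopes[kind] * math.log(counts[kind] + 1) for kind in counts
    )

# Hypothetical parameters: high-similarity lures slow search more (larger D).
rt = predict_heterogeneous_rt(
    intercept=450.0,
    slopes={"similar_lure": 40.0, "dissimilar_lure": 8.0},
    counts={"similar_lure": 6, "dissimilar_lure": 6},
)
print(f"predicted RT: {rt:.0f} ms")
```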


2017 · Vol 3 (1)
Author(s): Zhiyuan Wang, Simona Buetti, Alejandro Lleras

Previous work in our lab has demonstrated that efficient visual search with a fixed target produces a reaction time by set size function that is best characterized by logarithmic curves. Further, the steepness of these logarithmic curves is determined by the similarity between target and distractor items (Buetti et al., 2016). A theoretical account of these findings was proposed, namely that a parallel, unlimited-capacity, exhaustive processing architecture underlies such data. Here, we conducted two experiments to extend these findings to a set of real-world stimuli, in both homogeneous and heterogeneous search displays. We used computational simulations of this architecture to identify a way to predict RT performance in heterogeneous search using parameters estimated from homogeneous search data. Further, by examining systematic deviations of the observed data from our predictions, we found evidence that early visual processing of individual items is not independent. Instead, items in homogeneous displays seemed to facilitate each other's processing by a multiplicative factor. These results challenge previous accounts of heterogeneity effects in visual search, and demonstrate the explanatory and predictive power of an approach that combines computational simulations and behavioral data to better understand performance in visual search.
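For concreteness, the logarithmic function described above is typically written as RT = a + D·log(N + 1), where N is set size and D grows with lure-target similarity. The sketch below fits that curve to made-up homogeneous-search data; the data points and parameter values are illustrative, not the study's.

```python
# Sketch: fit the logarithmic RT-by-set-size function RT = a + D * log(N + 1)
# to homogeneous-search data. The data points below are made up for illustration.
import numpy as np
from scipy.optimize import curve_fit

def log_rt(n, a, d):
    # a: intercept (ms); d: log-slope, larger when lures resemble the target.
    return a + d * np.log(n + 1)

set_sizes = np.array([1, 4, 9, 19, 31])
mean_rts = np.array([480.0, 512.0, 540.0, 565.0, 580.0])  # hypothetical (ms)

(a_hat, d_hat), _ = curve_fit(log_rt, set_sizes, mean_rts)
print(f"intercept = {a_hat:.0f} ms, log-slope D = {d_hat:.1f}")
```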


2021
Author(s): Thomas L. Botch, Brenda D. Garcia, Yeo Bi Choi, Caroline E. Robertson

Visual search is a universal human activity in naturalistic environments. Traditionally, visual search is investigated under tightly controlled conditions, where head-restricted participants locate a minimalistic target in a cluttered array presented on a computer screen. Do classic findings of visual search extend to naturalistic settings, where participants actively explore complex, real-world scenes? Here, we leverage advances in virtual reality (VR) technology to relate individual differences in classic visual search paradigms to naturalistic search behavior. In a naturalistic visual search task, participants looked for an object within their environment via a combination of head turns and eye movements using a head-mounted display. Then, in a classic visual search task, participants searched for a target within a simple array of colored letters using only eye movements. We tested how set size, a property known to limit visual search within computer displays, predicts the efficiency of search behavior inside immersive, real-world scenes that vary in levels of visual clutter. We found that participants' search performance was impacted by the level of visual clutter within real-world scenes. Critically, we also observed that individual differences in visual search efficiency in classic search predicted efficiency in real-world search, but only when the comparison was limited to the forward-facing field of view for real-world search. These results demonstrate that set size is a reliable predictor of individual performance across computer-based and active, real-world visual search behavior.


2020
Author(s): Caterina Magri, Talia Konkle, Alfonso Caramazza

Abstract: In human occipitotemporal cortex, brain responses to depicted inanimate objects show a large-scale organization by real-world object size. Critically, the size of objects in the world is systematically related to behaviorally relevant properties: small objects are often grasped and manipulated (e.g., forks), while large objects tend to be less motor-relevant (e.g., tables), though this relationship does not always hold (e.g., picture frames and wheelbarrows). To determine how these two dimensions interact, we measured brain activity with functional magnetic resonance imaging while participants viewed a stimulus set of small and large objects with either low or high motor-relevance. The results revealed that the size organization was evident for objects with both low and high motor-relevance; further, a motor-relevance map was also evident across both large and small objects. Targeted contrasts revealed that typical combinations (small motor-relevant vs. large non-motor-relevant) yielded more robust topographies than the atypical covariance contrast (small non-motor-relevant vs. large motor-relevant). In subsequent exploratory analyses, a factor analysis revealed that the construct of motor-relevance was better explained by two underlying factors: one more related to manipulability, and the other to whether an object moves or is stable. The factor related to manipulability better explained responses in lateral small-object-preferring regions, while the factor related to object stability (lack of movement) better explained responses in ventromedial large-object-preferring regions. Taken together, these results reveal that the structure of neural responses to objects of different sizes further reflects the behaviorally relevant properties of manipulability and stability, and they contribute to a deeper understanding of some of the factors underlying the large-scale organization of object representation in high-level visual cortex.

Highlights:
- Examined the relationship between real-world size and motor-relevant properties in the structure of responses to inanimate objects.
- Large-scale topography was more robust for the contrast that followed the natural covariance (small motor-relevant vs. large non-motor-relevant) than for the contrast that went against it.
- Factor analysis revealed that manipulability and stability were, respectively, better explanatory predictors of responses in small- and large-object regions.
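For readers unfamiliar with the technique, here is a minimal sketch of a two-factor analysis of the kind described above, using scikit-learn. The rating dimensions and data are hypothetical placeholders, not the study's stimuli or measures.

```python
# Minimal sketch of a two-factor analysis; data are random placeholders.
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(0)
n_objects, n_ratings = 72, 6

# Hypothetical behavioral ratings of each object on motor-relevance questions
# (e.g., graspability, frequency of hand use, likelihood of movement, stability).
ratings = rng.normal(size=(n_objects, n_ratings))

fa = FactorAnalysis(n_components=2, random_state=0)
factor_scores = fa.fit_transform(ratings)  # per-object scores on each factor

# Loadings show how each rating dimension contributes to each factor; in the
# study, one factor tracked manipulability and the other movement/stability.
print(fa.components_.round(2))
```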


2019
Author(s): Daria Kvasova, Laia Garcia-Vernet, Salvador Soto-Faraco

Abstract: Real-world multisensory events provide not only temporally and spatially correlated information, but also semantic correspondences about object identity. Semantically consistent sounds can enhance visual detection, identification, and search performance, but these effects have always been demonstrated in simple and stereotyped displays that lack ecological validity. In order to address identity-based crossmodal relationships in real-world scenarios, we designed a visual search task using complex, dynamic scenes. Participants searched for objects in video clips of real-life scenes with background sounds. Auditory cues embedded in the background sounds could be target-consistent, distractor-consistent, or neutral, or there was no cue (just background noise). We found that characteristic sounds enhance visual search for relevant objects in natural scenes but fail to increase the salience of irrelevant distractors. Our findings generalize previous results on object-based crossmodal interactions with simple stimuli and shed light on how audiovisual semantically congruent relationships play out in real-life contexts.


Author(s): Viktoria R. Mileva, Peter J. B. Hancock, Stephen R. H. Langton

Abstract: Finding an unfamiliar person in a crowd of others is an integral task for police officers, CCTV operators, and security staff who may be looking for a suspect or missing person; however, research suggests that this task is difficult and accuracy is low. In two real-world visual search experiments, we examined whether being provided with four images, versus one image, of an unfamiliar target person would improve accuracy when searching for that person through video footage. In Experiment 1, videos were taken from above and at a distance to simulate CCTV, and images of the target showed their face and torso. In Experiment 2, videos were taken from approximately shoulder height, such as one would expect from body-camera or mobile phone recordings, and target images included only the face. Our findings suggest that having four images as exemplars leads to higher accuracy in the visual search tasks, although this advantage reached significance only in Experiment 2. There also appears to be a conservative bias whereby participants are more likely to respond that the target is not in the video when presented with only one image as opposed to four. These results point to an advantage of providing multiple images of targets for use in video visual search.


Author(s): Amit Barde, Matt Ward, William S. Helton, Mark Billinghurst, Gun Lee

Visual search performance was studied using auditory cues delivered over a bone-conduction headset. Two types of auditory cues were employed to evaluate their effectiveness in an attention-redirection task. Participants were required to locate and shoot targets at one of four locations on a screen when one of the two audio cues was delivered. Reaction and target acquisition times were significantly reduced when binaurally spatialised cues were used compared to unlocalisable, monophonic cues. This suggests that an auditory cue carrying directional information is far more effective for aiding search tasks, or for alerting the user to redirect attention in real-world space, than a centred monophonic cue. The results demonstrate the effectiveness of a binaurally spatialised, dynamic cue and point to its potential use in information-rich environments to provide useful and actionable information.

