Temporal Limitations in Object Processing Across the Human Ventral Visual Pathway

2007 ◽  
Vol 98 (1) ◽  
pp. 382-393 ◽  
Author(s):  
Thomas J. McKeeff ◽  
David A. Remus ◽  
Frank Tong

Behavioral studies have shown that object recognition becomes severely impaired at fast presentation rates, indicating a limitation in temporal processing capacity. Here, we studied whether this behavioral limit in object recognition reflects limitations in the temporal processing capacity of early visual areas tuned to basic features or high-level areas tuned to complex objects. We used functional MRI (fMRI) to measure the temporal processing capacity of multiple areas along the ventral visual pathway progressing from the primary visual cortex (V1) to high-level object-selective regions, specifically the fusiform face area (FFA) and parahippocampal place area (PPA). Subjects viewed successive images of faces or houses at presentation rates varying from 2.3 to 37.5 items/s while performing an object discrimination task. Measures of the temporal frequency response profile of each visual area revealed a systematic decline in peak tuning across the visual hierarchy. Areas V1–V3 showed peak activity at rapid presentation rates of 18–25 items/s, area V4v peaked at intermediate rates (9 items/s), and the FFA and PPA peaked at the slowest temporal rates (4–5 items/s). Our results reveal a progressive loss in the temporal processing capacity of the human visual system as information is transferred from early visual areas to higher areas. These data suggest that temporal limitations in object recognition likely result from the limited processing capacity of high-level object-selective areas rather than that of earlier stages of visual processing.
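As a rough illustration of the kind of analysis this abstract describes, the sketch below fits a log-Gaussian tuning curve to mean BOLD responses measured at several presentation rates and reads off the peak rate. The rates, response values, and curve form are illustrative assumptions, not the study's data or fitting procedure.

```python
# Hypothetical sketch: estimating the peak temporal tuning of a visual area
# from mean BOLD responses at several presentation rates. All values are
# made up for illustration; only the logic is meant to match the abstract.
import numpy as np
from scipy.optimize import curve_fit

rates = np.array([2.3, 4.7, 9.4, 18.7, 37.5])   # items/s (log-spaced)
bold = np.array([0.4, 0.7, 1.1, 1.3, 0.9])      # mean % signal change (simulated)

def log_gaussian(f, amp, peak, width):
    """Gaussian tuning curve on a log frequency axis."""
    return amp * np.exp(-(np.log(f) - np.log(peak)) ** 2 / (2 * width ** 2))

params, _ = curve_fit(log_gaussian, rates, bold, p0=[1.0, 10.0, 1.0])
print(f"Estimated peak tuning: {params[1]:.1f} items/s")
```

An early area would yield a peak near 20 items/s under this scheme, a face- or place-selective area a peak near 4-5 items/s.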

2020 ◽  
Author(s):  
Yaoda Xu ◽  
Maryam Vaziri-Pashkam

ABSTRACT
Any given visual object input is characterized by multiple visual features, such as identity, position and size. Despite the usefulness of identity and nonidentity features in vision and their joint coding throughout the primate ventral visual processing pathway, they have so far been studied relatively independently. Here we document the relative coding strength of object identity and nonidentity features in a brain region and how this may change across the human ventral visual pathway. We examined a total of four nonidentity features, including two Euclidean features (position and size) and two non-Euclidean features (image statistics and spatial frequency content of an image). Overall, identity representation increased and nonidentity feature representation decreased along the ventral visual pathway, with identity outweighing the non-Euclidean features, but not the Euclidean ones, at higher levels of visual processing. A similar analysis was performed in 14 convolutional neural networks (CNNs) pretrained to perform object categorization, varying in architecture, depth, and the presence of recurrent processing. While the relative coding strength of object identity and nonidentity features in lower CNN layers matched well with that in early human visual areas, the match between higher CNN layers and higher human visual regions was limited. Similar results were obtained regardless of whether a CNN was trained with real-world or stylized object images that emphasized shape representation. Together, by measuring the relative coding strength of object identity and nonidentity features, our approach provided a new tool to characterize feature coding in the human brain and the correspondence between the brain and CNNs.

SIGNIFICANCE STATEMENT
This study documented the relative coding strength of object identity compared to four types of nonidentity features along the human ventral visual processing pathway and compared brain responses with those of 14 CNNs pretrained to perform object categorization. Overall, identity representation increased and nonidentity feature representation decreased along the ventral visual pathway, with the coding strength of the different nonidentity features differing at higher levels of visual processing. While feature coding in lower CNN layers matched well with that of early human visual areas, the match between higher CNN layers and higher human visual regions was limited. Our approach provided a new tool to characterize feature coding in the human brain and the correspondence between the brain and CNNs.
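One plausible way to operationalize "relative coding strength" is cross-validated decoding of each feature from the same response patterns, as in the minimal sketch below. The data are simulated and the decoder choice is an assumption for illustration, not necessarily the authors' method.

```python
# Hedged sketch: compare how strongly identity vs. a nonidentity feature
# (here, position) can be read out from simulated voxel patterns.
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_trials, n_voxels = 200, 50
X = rng.normal(size=(n_trials, n_voxels))       # fake voxel patterns
identity = rng.integers(0, 2, n_trials)         # two object identities
position = rng.integers(0, 2, n_trials)         # two positions
X[:, :10] += identity[:, None] * 0.8            # inject identity signal
X[:, 10:20] += position[:, None] * 0.4          # inject weaker position signal

acc_id = cross_val_score(LinearSVC(dual=False), X, identity, cv=5).mean()
acc_pos = cross_val_score(LinearSVC(dual=False), X, position, cv=5).mean()
print(f"identity: {acc_id:.2f}, position: {acc_pos:.2f}")  # relative strength
```

Running the same comparison in each region (or CNN layer) yields a profile of relative feature coding along the hierarchy.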


2016 ◽  
Vol 28 (9) ◽  
pp. 1295-1302 ◽  
Author(s):  
Stephanie Kristensen ◽  
Frank E. Garcea ◽  
Bradford Z. Mahon ◽  
Jorge Almeida

Visual processing of complex objects is supported by the ventral visual pathway in the service of object identification and by the dorsal visual pathway in the service of object-directed reaching and grasping. Here, we address how these two streams interact during tool processing, by exploiting the known asymmetry in projections of subcortical magnocellular and parvocellular inputs to the dorsal and ventral streams. The ventral visual pathway receives both parvocellular and magnocellular input, whereas the dorsal visual pathway receives largely magnocellular input. We used fMRI to measure tool preferences in parietal cortex when the images were presented at either high or low temporal frequencies, exploiting the fact that parvocellular channels project principally to the ventral but not dorsal visual pathway. We reason that regions of parietal cortex that exhibit tool preferences for stimuli presented at frequencies characteristic of the parvocellular pathway receive their inputs from the ventral stream. We found that the left inferior parietal lobule, in the vicinity of the supramarginal gyrus, exhibited tool preferences for images presented at low temporal frequencies, whereas superior and posterior parietal regions exhibited tool preferences for images present at high temporal frequencies. These data indicate that object identity, processed within the ventral stream, is communicated to the left inferior parietal lobule and may there combine with inputs from the dorsal visual pathway to allow for functionally appropriate object manipulation.
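A schematic of the logic, under illustrative assumptions: test the tool-minus-nontool contrast in an ROI separately for the low- and high-temporal-frequency conditions. The subject counts, contrast values, and test are placeholders, not the study's pipeline.

```python
# Illustrative sketch: does a parietal ROI prefer tools at parvocellular-range
# (low) presentation rates, magnocellular-range (high) rates, or both?
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n_subjects = 16
# Simulated tool-minus-nontool contrast values per subject, per condition
low_tf_contrast = rng.normal(0.3, 0.4, n_subjects)   # low temporal frequency
high_tf_contrast = rng.normal(0.0, 0.4, n_subjects)  # high temporal frequency

for name, contrast in [("low TF", low_tf_contrast), ("high TF", high_tf_contrast)]:
    t, p = stats.ttest_1samp(contrast, 0.0)          # contrast > 0 => tool preference
    print(f"{name}: t = {t:.2f}, p = {p:.3f}")
```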


2017 ◽  
Author(s):  
Ilya Kuzovkin ◽  
Raul Vicente ◽  
Mathilde Petton ◽  
Jean-Philippe Lachaux ◽  
Monica Baciu ◽  
...  

Previous work demonstrated a direct correspondence between the hierarchy of the human visual areas and layers of deep convolutional neural networks (DCNNs) trained on visual object recognition. We used DCNNs to investigate which frequency bands correlate with feature transformations of increasing complexity along the ventral visual pathway. By capitalizing on intracranial depth recordings from 100 patients and 11,293 electrodes, we assessed the alignment between the DCNN and signals at different frequency bands in different time windows. We found that gamma activity, especially in the low gamma-band (30–70 Hz), matched the increasing complexity of visual feature representations in the DCNN. These findings show that the activity of the DCNN captures the essential characteristics of biological object recognition not only in space and time, but also in the frequency domain. These results also demonstrate the potential of modern artificial intelligence algorithms to advance our understanding of the brain.

Significance Statement
Recent advances in the field of artificial intelligence have revealed principles about neural processing, in particular about vision. Previous work has demonstrated a direct correspondence between the hierarchy of human visual areas and layers of deep convolutional neural networks (DCNNs), suggesting that the DCNN is a good model of visual object recognition in the primate brain. Studying intracranial depth recordings allowed us to extend previous work by assessing when and at which frequency bands the activity of the visual system corresponds to the DCNN. Our key finding is that signals in gamma frequencies along the ventral visual pathway are aligned with the layers of the DCNN. Gamma frequencies play a major role in transforming visual input into coherent object representations.
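The core alignment idea can be sketched with representational similarity analysis: compare the representational dissimilarity matrix (RDM) of an electrode's gamma-band responses against the RDM of each DCNN layer. Everything below is a placeholder (random arrays, layer count, distance metrics), shown only to make the logic concrete.

```python
# Schematic sketch: which DCNN layer best matches one electrode's
# gamma-band representational geometry across a set of images?
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n_images = 40
gamma_responses = rng.normal(size=(n_images, 1))   # one electrode, per image
layer_acts = [rng.normal(size=(n_images, 256)) for _ in range(8)]  # 8 layers

electrode_rdm = pdist(gamma_responses, metric="euclidean")
for i, acts in enumerate(layer_acts):
    layer_rdm = pdist(acts, metric="correlation")
    rho, _ = spearmanr(electrode_rdm, layer_rdm)
    print(f"layer {i}: rho = {rho:.2f}")   # best-matching layer = assignment
```

Repeating this per frequency band and time window, and relating each electrode's best-matching layer to its anatomical position, gives the kind of alignment test the abstract reports.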


2020 ◽  
Author(s):  
Florence Campana ◽  
Jacob G. Martin ◽  
Levan Bokeria ◽  
Simon Thorpe ◽  
Xiong Jiang ◽  
...  

Abstract
The commonly accepted “simple-to-complex” model of visual processing in the brain posits that visual tasks on complex objects such as faces are based on representations in high-level visual areas. Yet, recent experimental data showing the visual system’s ability to localize faces in natural images within 100 ms (Crouzet et al., 2010) challenge the prevalent hierarchical description of the visual system and instead suggest the hypothesis of face selectivity in early visual areas. In the present study, we tested this hypothesis with human participants in two eye tracking experiments, an fMRI experiment, and an EEG experiment. We found converging evidence for neural representations selective for upright faces in V1/V2, with latencies starting around 40 ms post-stimulus onset. Our findings suggest a revision of the standard “simple-to-complex” model of hierarchical visual processing.

Significance statement
Visual processing in the brain is classically described as a series of stages with increasingly complex object representations: early visual areas encode simple visual features (such as oriented bars), and high-level visual areas encode representations for complex objects (such as faces). In the present study, we provide behavioral, fMRI, and EEG evidence for representations of complex objects – namely faces – in early visual areas. Our results challenge the standard “simple-to-complex” model of visual processing, suggesting that it needs to be revised to include neural representations for faces at the lowest levels of the visual hierarchy. Such early object representations would permit the rapid and precise localization of complex objects, as has previously been reported for the object class of faces.
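A latency claim like "starting around 40 ms" is often established with time-resolved decoding, sketched below on simulated EEG data. The trial counts, classifier, and threshold are illustrative assumptions, not the study's exact analysis.

```python
# Hedged sketch: estimate the onset of face information via time-resolved
# decoding. Data are simulated; a signal is injected from sample 30 onward.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n_trials, n_channels, n_times = 100, 32, 60
labels = rng.integers(0, 2, n_trials)          # face vs. non-face
eeg = rng.normal(size=(n_trials, n_channels, n_times))
eeg[labels == 1, :, 30:] += 0.5                # inject signal from sample 30 on

acc = np.array([
    cross_val_score(LogisticRegression(max_iter=1000),
                    eeg[:, :, t], labels, cv=5).mean()
    for t in range(n_times)
])
onset = np.argmax(acc > 0.6)                   # first sample above threshold
print(f"decoding onset at sample {onset}")
```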


2015 ◽  
Vol 35 (36) ◽  
pp. 12412-12424 ◽  
Author(s):  
A. Stigliani ◽  
K. S. Weiner ◽  
K. Grill-Spector

2020 ◽  
Author(s):  
Haider Al-Tahan ◽  
Yalda Mohsenzadeh

Abstract
While vision evokes a dense network of feedforward and feedback neural processes in the brain, visual processes are primarily modeled with feedforward hierarchical neural networks, leaving the computational role of feedback processes poorly understood. Here, we developed a generative autoencoder neural network model and adversarially trained it on a categorically diverse data set of images. We hypothesized that the feedback processes in the ventral visual pathway can be represented by reconstruction of the visual information performed by the generative model. We compared representational similarity of the activity patterns in the proposed model with temporal (magnetoencephalography) and spatial (functional magnetic resonance imaging) visual brain responses. The proposed generative model identified two segregated neural dynamics in the visual brain: a temporal hierarchy of processes transforming low-level visual information into high-level semantics in the feedforward sweep, and a temporally later dynamic of inverse processes reconstructing low-level visual information from a high-level latent representation in the feedback sweep. Our results add to previous studies of neural feedback processes by presenting new insight into the algorithmic function of, and the information carried by, the feedback processes in the ventral visual pathway.

Author summary
It has been shown that the ventral visual cortex consists of a dense network of regions with feedforward and feedback connections. The feedforward path processes visual inputs along a hierarchy of cortical areas that starts in early visual cortex (an area tuned to low-level features, e.g., edges/corners) and ends in inferior temporal cortex (an area that responds to higher-level categorical content, e.g., faces/objects). The feedback connections, in turn, modulate neuronal responses in this hierarchy by broadcasting information from higher to lower areas. In recent years, deep neural network models trained on object recognition tasks have achieved human-level performance and shown activation patterns similar to those of the visual brain. In this work, we developed a generative neural network model that consists of encoding and decoding sub-networks. By comparing this computational model with the human brain's temporal (magnetoencephalography) and spatial (functional magnetic resonance imaging) response patterns, we found that the encoder processes resemble the brain's feedforward processing dynamics and the decoder shares similarity with the brain's feedback processing dynamics. These results provide an algorithmic insight into the spatiotemporal dynamics of feedforward and feedback processes in biological vision.
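The model-brain comparison rests on representational similarity over time: correlate a sub-network's RDM with the RDM of brain responses at each MEG time point, then ask when the similarity peaks. The sketch below uses random placeholders for both model activations and MEG data; sensor count and metrics are assumptions.

```python
# Illustrative sketch: time course of similarity between one model
# sub-network (encoder or decoder) and MEG responses.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(4)
n_images = 30
model_rdm = pdist(rng.normal(size=(n_images, 128)), metric="correlation")

n_times = 100
meg = rng.normal(size=(n_times, n_images, 306))   # time x image x sensor
similarity = np.array([
    spearmanr(model_rdm, pdist(meg[t], metric="correlation"))[0]
    for t in range(n_times)
])
print(f"peak model-brain similarity at time index {similarity.argmax()}")
```

Under the abstract's hypothesis, the encoder's similarity would peak early (feedforward sweep) and the decoder's later (feedback sweep).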


2020 ◽  
Vol 117 (23) ◽  
pp. 13145-13150 ◽  
Author(s):  
Insub Kim ◽  
Sang Wook Hong ◽  
Steven K. Shevell ◽  
Won Mok Shim

Color is a perceptual construct that arises from neural processing in hierarchically organized cortical visual areas. Previous research, however, often failed to distinguish between neural responses driven by stimulus chromaticity and those driven by perceptual color experience. An unsolved question is whether the neural responses at each stage of cortical processing represent the physical stimulus or the color we see. The present study dissociated the perceptual domain of color experience from the physical domain of chromatic stimulation at each stage of cortical processing by using a switch rivalry paradigm that caused the color percept to vary over time without changing the retinal stimulation. Using functional MRI (fMRI) and a model-based encoding approach, we found that neural representations in higher visual areas, such as V4 and VO1, corresponded to the perceived color, whereas responses in early visual areas V1 and V2 were modulated by the chromatic light stimulus rather than by color perception. Our findings support a transition in the ascending human ventral visual pathway, from a representation of the chromatic stimulus at the retina in early visual areas to responses that correspond to perceptually experienced colors in higher visual areas.
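A model-based encoding approach of the kind mentioned can be sketched as a forward model of hypothetical color-tuned channels: fit channel-to-voxel weights on training data, then invert the model to recover the represented hue. The channel count, tuning shape, and all data below are illustrative assumptions.

```python
# Minimal sketch of a forward (inverted) encoding model for hue,
# on simulated voxel data.
import numpy as np

n_channels, n_voxels, n_trials = 6, 40, 120
rng = np.random.default_rng(5)
hues = rng.uniform(0, 2 * np.pi, n_trials)
centers = np.linspace(0, 2 * np.pi, n_channels, endpoint=False)

def channel_responses(h):
    """Half-wave-rectified cosine tuning, raised to a power."""
    return np.maximum(np.cos(h[:, None] - centers[None, :]), 0) ** 5

C = channel_responses(hues)                    # trials x channels
W = rng.normal(size=(n_channels, n_voxels))    # true channel-to-voxel weights
B = C @ W + rng.normal(0, 0.1, (n_trials, n_voxels))  # simulated voxel data

W_hat = np.linalg.lstsq(C, B, rcond=None)[0]   # training: fit weights
C_hat = B @ np.linalg.pinv(W_hat)              # test: reconstruct channels
print("reconstructed channel profile, trial 0:", np.round(C_hat[0], 2))
```

Applying such a model during rivalry lets one ask whether a region's reconstructed hue tracks the physical stimulus or the reported percept.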


2017 ◽  
Vol 118 (1) ◽  
pp. 564-573 ◽  
Author(s):  
Sonia Poltoratski ◽  
Sam Ling ◽  
Devin McCormack ◽  
Frank Tong

The visual system employs a sophisticated balance of attentional mechanisms: salient stimuli are prioritized for visual processing, yet observers can also ignore such stimuli when their goals require directing attention elsewhere. A powerful determinant of visual salience is local feature contrast: if a local region differs from its immediate surround along one or more feature dimensions, it will appear more salient. We used high-resolution functional MRI (fMRI) at 7T to characterize the modulatory effects of bottom-up salience and top-down voluntary attention within multiple sites along the early visual pathway, including visual areas V1–V4 and the lateral geniculate nucleus (LGN). Observers viewed arrays of spatially distributed gratings, where one of the gratings immediately to the left or right of fixation differed from all other items in orientation or motion direction, making it salient. To investigate the effects of directed attention, observers were cued to attend to the grating to the left or right of fixation, which was either salient or nonsalient. Results revealed reliable additive effects of top-down attention and stimulus-driven salience throughout visual areas V1–hV4. In comparison, the LGN exhibited significant attentional enhancement but was not reliably modulated by orientation- or motion-defined salience. Our findings indicate that top-down effects of spatial attention can influence visual processing at the earliest possible site along the visual pathway, including the LGN, whereas the processing of orientation- and motion-driven salience primarily involves feature-selective interactions that take place in early cortical visual areas.

NEW & NOTEWORTHY
While spatial attention allows for specific, goal-driven enhancement of stimuli, salient items outside of the current focus of attention must also be prioritized. We used 7T fMRI to compare salience and spatial attentional enhancement along the early visual hierarchy. We report additive effects of attention and bottom-up salience in early visual areas, suggesting that salience enhancement is not contingent on the observer’s attentional state.
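"Additive effects" has a concrete statistical reading: in a 2x2 design (attended/unattended x salient/nonsalient), the two factors produce main effects with no interaction. The sketch below tests the interaction contrast on simulated ROI responses; all numbers are placeholders.

```python
# Illustrative sketch: interaction contrast in a 2x2 attention-by-salience
# design. Additivity predicts a near-zero interaction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(6)
n_subjects = 12
attn = np.array([0.0, 0.5])[None, :, None]   # unattended, attended
sal = np.array([0.0, 0.3])[None, None, :]    # nonsalient, salient
# Simulated responses: additive main effects plus noise, no built-in interaction
resp = 1.0 + attn + sal + rng.normal(0, 0.15, (n_subjects, 2, 2))

# Interaction: (A,S) - (A,nS) - (nA,S) + (nA,nS); ~0 if effects are additive
interaction = resp[:, 1, 1] - resp[:, 1, 0] - resp[:, 0, 1] + resp[:, 0, 0]
t, p = stats.ttest_1samp(interaction, 0.0)
print(f"interaction: t = {t:.2f}, p = {p:.3f}")  # nonsignificant -> additivity
```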


2005 ◽  
Vol 93 (6) ◽  
pp. 3453-3462 ◽  
Author(s):  
Mary-Ellen Large ◽  
Adrian Aldcroft ◽  
Tutis Vilis

Perceptual continuity is an important aspect of our experience of the visual world. In this study, we focus on an example of perceptual continuity involving the maintenance of figure-ground segregation despite the removal of the binding cues that initiated the segregation. Fragmented line drawings of objects were superimposed on a background of randomly oriented lines. Global forms could be discriminated from the background based on differences in motion or differences in color/brightness. Furthermore, perception of a global form persisted after the binding cue had been removed. A comparison between forms constructed from motion and forms constructed from color demonstrated that both produced persistence after the object-defining cues were removed. Functional imaging showed a gradual increase in the persistence of brain activity across the lower visual areas (V1, V2, VP), which reached significance in V4v and peaked in the lateral occipital area. There was no difference in the location of persistence for color- or motion-defined forms. These results suggest that the retention of a global percept is an emergent property of the ventral visual processing stream and that the maintenance of grouped visual elements is independent of cue type. We postulate that perceptual persistence depends on a system of perceptual memory reflecting the state of perceptual organization.
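One simple way to quantify "persistence of brain activity" is to fit an exponential decay to an area's post-offset time course and compare time constants across regions. The sketch below does this on simulated data; the sampling rate, decay form, and values are assumptions, not the study's analysis.

```python
# Schematic sketch: persistence as the decay time constant of the response
# after the binding cue is removed. Data are simulated.
import numpy as np
from scipy.optimize import curve_fit

t = np.arange(0, 20, 2.0)                        # seconds after cue offset
rng = np.random.default_rng(7)
signal = 1.2 * np.exp(-t / 6.0) + rng.normal(0, 0.05, t.size)

def decay(t, amp, tau):
    return amp * np.exp(-t / tau)

(amp, tau), _ = curve_fit(decay, t, signal, p0=[1.0, 4.0])
print(f"persistence time constant: {tau:.1f} s")  # larger tau = more persistence
```

A gradient of increasing tau from V1 through V4v to lateral occipital cortex would correspond to the pattern the abstract reports.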

