Efficient coding of natural scenes improves neural system identification

2022
Author(s): Yongrong Qiu, David A. Klindt, Klaudia P. Szatko, Dominic Gonschorek, Larissa Hoefling, et al.

Neural system identification aims to learn the response function of neurons to arbitrary stimuli from experimentally recorded data, but typically does not leverage coding principles such as efficient coding of natural environments. Visual systems, however, have evolved to efficiently process input from the natural environment. Here, we present a normative network regularization for system identification models that incorporates, as a regularizer, the efficient coding hypothesis, which states that neural response properties of sensory representations are strongly shaped by the need to preserve most of the stimulus information with limited resources. Using this approach, we explored whether a system identification model can be improved by sharing its convolutional filters with those of an autoencoder that aims to efficiently encode natural stimuli. To this end, we built a hybrid model to predict the responses of retinal neurons to noise stimuli. This approach not only yielded higher performance than the stand-alone system identification model, it also produced more biologically plausible filters. We found these results to be consistent for retinal responses to different stimuli and across model architectures. Moreover, our normatively regularized model performed particularly well in predicting responses of direction-of-motion sensitive retinal neurons. In summary, our results support the hypothesis that efficiently encoding environmental inputs can improve system identification models of early visual processing.
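
One way to picture the shared-filter idea is a minimal sketch in PyTorch: a hybrid model whose convolutional filters feed both a neural-response readout and an autoencoder decoder, trained on a weighted sum of the two losses. Layer sizes, the Poisson loss, and the weight lambda_ec are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HybridModel(nn.Module):
    """System identification model and autoencoder sharing one bank of
    convolutional filters (sketch; sizes are illustrative)."""
    def __init__(self, n_neurons, n_filters=16):
        super().__init__()
        # Shared convolutional filters couple the two pathways.
        self.shared_conv = nn.Conv2d(1, n_filters, kernel_size=9, padding=4)
        # System identification readout: filter activations -> firing rates.
        self.readout = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(n_neurons), nn.Softplus())
        # Autoencoder decoder: filter activations -> reconstructed image.
        self.decoder = nn.ConvTranspose2d(n_filters, 1, kernel_size=9, padding=4)

    def forward(self, stimulus, natural_image):
        responses = self.readout(F.relu(self.shared_conv(stimulus)))
        recon = self.decoder(F.relu(self.shared_conv(natural_image)))
        return responses, recon

def hybrid_loss(pred, target, recon, image, lambda_ec=0.5):
    # Response-prediction loss plus reconstruction (efficient-coding) loss,
    # weighted by the hypothetical hyperparameter lambda_ec.
    return (F.poisson_nll_loss(pred, target, log_input=False)
            + lambda_ec * F.mse_loss(recon, image))
```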

2008 · Vol. 275 (1649) · pp. 2299-2308
Author(s): M. To, P. G. Lovell, T. Troscianko, D. J. Tolhurst

Natural visual scenes are rich in information, and any neural system analysing them must piece together the many messages from large arrays of diverse feature detectors. It is known how threshold detection of compound visual stimuli (sinusoidal gratings) is determined by their components' thresholds. We investigate whether similar combination rules apply to the perception of the complex and suprathreshold visual elements in naturalistic visual images. Observers gave magnitude estimations (ratings) of the perceived differences between pairs of images made from photographs of natural scenes. Images in some pairs differed along one stimulus dimension, such as object colour, location, size or blur. For other image pairs, however, there were composite differences along two dimensions (e.g. both colour and object location might change). We examined whether the ratings for such composite pairs could be predicted from the two ratings for the respective pairs in which only one stimulus dimension had changed. We found a pooling relationship similar to that proposed for simple stimuli: Minkowski summation with exponent 2.84 yielded the best predictive power (r = 0.96), an exponent similar to that generally reported for compound grating detection. This suggests that theories based on detecting simple stimuli can encompass visual processing of complex, suprathreshold stimuli.
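
The pooling rule can be written down directly; a minimal sketch, with the exponent fixed at the best-fitting value reported above (the example ratings are made up):

```python
def minkowski_pool(rating_a, rating_b, m=2.84):
    """Predict the rating for a composite (two-dimension) change from the
    two single-dimension ratings via Minkowski summation with exponent m."""
    return (rating_a ** m + rating_b ** m) ** (1.0 / m)

# Illustrative numbers: a colour-only rating of 4.0 and a location-only
# rating of 3.0 predict a composite rating of roughly 4.5.
print(minkowski_pool(4.0, 3.0))
```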


2016
Author(s): Alexander Heitman, Nora Brackbill, Martin Greschner, Alexander Sher, Alan M. Litke, et al.

A central goal of systems neuroscience is to develop accurate quantitative models of how neural circuits process information. Prevalent models of light response in retinal ganglion cells (RGCs) usually begin with linear filtering over space and time, which reduces the high-dimensional visual stimulus to a simpler and more tractable scalar function of time that in turn determines the model output. Although these pseudo-linear models can accurately replicate RGC responses to stochastic stimuli, it is unclear whether the strong linearity assumption captures the function of the retina in the natural environment. This paper tests how accurately one pseudo-linear model, the generalized linear model (GLM), explains the responses of primate RGCs to naturalistic visual stimuli. Light responses from macaque RGCs were obtained using large-scale multi-electrode recordings, and two major cell types, ON and OFF parasol, were examined. Visual stimuli consisted of images of natural environments with simulated saccadic and fixational eye movements. The GLM accurately reproduced RGC responses to white noise stimuli, as observed previously, but did not generalize to predict RGC responses to naturalistic stimuli. It also failed to capture RGC responses when fitted and tested with naturalistic stimuli alone. Fitted scalar nonlinearities before and after the linear filtering stage were insufficient to correct the failures. These findings suggest that retinal signaling under natural conditions cannot be captured by models that begin with linear filtering, and emphasize the importance of additional spatial nonlinearities, gain control, and/or peripheral effects in the first stage of visual processing.
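
For concreteness, a sketch of one time step of a generic GLM of the kind tested above: linear spatiotemporal filtering collapses the stimulus to a scalar drive, a fixed exponential nonlinearity maps drive to rate, and spikes are drawn from a Poisson process. The spike-history term follows the standard GLM recipe; all shapes and sizes are assumptions.

```python
import numpy as np

def glm_step(stimulus_window, st_filter, spike_history, history_filter,
             rng, dt=0.001):
    """One time step of a generalized linear model of an RGC (sketch)."""
    drive = np.sum(stimulus_window * st_filter)       # linear filtering -> scalar
    drive += np.sum(spike_history * history_filter)   # post-spike feedback
    rate = np.exp(drive)                              # fixed scalar nonlinearity
    return rng.poisson(rate * dt)                     # stochastic spike count

# Usage sketch: rng = np.random.default_rng(); call glm_step per time bin.
```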


2017
Author(s): Warrick Roseboom, Zafeirios Fountas, Kyriacos Nikiforou, David Bhowmik, Murray Shanahan, et al.

Despite being a fundamental dimension of experience, how the human brain generates the perception of time remains unknown. Here, we provide a novel explanation for how human time perception might be accomplished, based on non-temporal perceptual classification processes. To demonstrate this proposal, we built an artificial neural system centred on a feed-forward image classification network, functionally similar to human visual processing. In this system, input videos of natural scenes drive changes in network activation, and the accumulation of salient changes in activation is used to estimate duration. Estimates produced by this system match human reports made about the same videos, replicating key qualitative biases, including differentiating between scenes of walking around a busy city or sitting in a cafe or office. Our approach provides a working model of duration perception from stimulus to estimation and presents a new direction for examining the foundations of this central aspect of human experience.
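
A schematic sketch of the accumulation mechanism: run each video frame through a (pretrained) classification network, count salient changes in activation layer by layer, and read the count out as duration. The Euclidean change measure, per-layer thresholds, and the final linear readout are illustrative assumptions.

```python
import numpy as np

def estimate_duration(frames, network, thresholds, seconds_per_count):
    """Accumulate salient activation changes over a video (sketch).
    network(frame) is assumed to return a list of per-layer activations."""
    count = 0
    prev = None
    for frame in frames:
        acts = network(frame)
        if prev is not None:
            for a, b, th in zip(acts, prev, thresholds):
                if np.linalg.norm(a - b) > th:   # salient change in this layer
                    count += 1
        prev = acts
    return count * seconds_per_count             # counts -> estimated seconds
```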


2018
Author(s): Samuel A. Ocko, Jack Lindsey, Surya Ganguli, Stephane Deny

One of the most striking aspects of early visual processing in the retina is the immediate parcellation of visual information into multiple parallel pathways, formed by different retinal ganglion cell types each tiling the entire visual field. Existing theories of efficient coding have been unable to account for the functional advantages of such cell-type diversity in encoding natural scenes. Here we go beyond previous theories to analyze how a simple linear retinal encoding model with different convolutional cell types efficiently encodes naturalistic spatiotemporal movies given a fixed firing rate budget. We find that optimizing the receptive fields and cell densities of two cell types makes them match the properties of the two main cell types in the primate retina, midget and parasol cells, in terms of spatial and temporal sensitivity, cell spacing, and their relative ratio. Moreover, our theory gives a precise account of how the ratio of midget to parasol cells decreases with retinal eccentricity. We also train a nonlinear encoding model with a rectifying nonlinearity to efficiently encode naturalistic movies, and again find emergent receptive fields resembling those of midget and parasol cells that are now further subdivided into ON and OFF types. Thus our work provides a theoretical justification, based on the efficient coding of natural movies, for the existence of the four most dominant cell types in the primate retina that together comprise 70% of all ganglion cells.
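
The optimization problem has a compact form: minimize reconstruction error of naturalistic movies subject to a firing-rate budget. A minimal sketch of such an objective, with the constraint handled as a soft penalty (the penalty formulation and weights are assumptions; the paper's exact treatment may differ):

```python
import torch

def efficient_coding_loss(movie, reconstruction, firing_rates,
                          rate_budget, penalty=10.0):
    """Reconstruction error plus a soft penalty for exceeding the
    firing-rate budget (sketch)."""
    error = torch.mean((movie - reconstruction) ** 2)
    overrun = torch.clamp(firing_rates.mean() - rate_budget, min=0.0)
    return error + penalty * overrun
```

Optimizing the encoder's receptive fields and cell densities under such an objective is what drives the emergence of the midget- and parasol-like types described above.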


2017
Author(s): David W. Hunter, Paul B. Hibbard

Visual acuity is greatest in the centre of the visual field, peaking in the fovea and degrading significantly towards the periphery. The rate of decay of visual performance with eccentricity depends strongly on the stimuli and task used in measurement. While detailed measures of this decay have been made across a broad range of tasks, a comprehensive theoretical account of this phenomenon is lacking. We demonstrate that the decay in visual performance can be attributed to the efficient encoding of binocular information in natural scenes. The efficient coding hypothesis holds that the early stages of visual processing attempt to form an efficient coding of ecologically valid stimuli. Using Independent Component Analysis to learn an efficient coding of stereoscopic images, we show that the ratio of binocular to monocular components varied with eccentricity at the same rate as human stereo acuity and Vernier acuity. Our results demonstrate that the organisation of the visual cortex is dependent on the underlying statistics of binocular scenes and, strikingly, that monocular acuity depends on the mechanisms by which the visual cortex processes binocular information. This result has important theoretical implications for understanding the encoding of visual information in the brain.
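
A sketch of the analysis pipeline using scikit-learn's FastICA: learn components from concatenated left/right image patches and classify each as binocular or monocular by comparing the energy in its two eye halves. The 0.5 cutoff and the energy-based classification rule are assumptions, not the authors' exact criterion.

```python
import numpy as np
from sklearn.decomposition import FastICA

def fraction_binocular(stereo_patches, n_components=64):
    """stereo_patches: (n_samples, 2 * patch_dim), left half then right half.
    Returns the fraction of ICA components judged binocular (sketch)."""
    ica = FastICA(n_components=n_components, max_iter=1000)
    ica.fit(stereo_patches)
    half = stereo_patches.shape[1] // 2
    left = np.sum(ica.components_[:, :half] ** 2, axis=1)   # left-eye energy
    right = np.sum(ica.components_[:, half:] ** 2, axis=1)  # right-eye energy
    dominance = np.abs(left - right) / (left + right)       # 0 = binocular
    return np.mean(dominance < 0.5)                         # assumed cutoff
```

Repeating this for patches drawn at different eccentricities would trace out the binocular-to-monocular ratio as a function of eccentricity.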


2021 · Vol. 21 (1)
Author(s): Emmanuelle Sophie Briolat, Lina María Arenas, Anna E. Hughes, Eric Liggins, Martin Stevens

Background: Crypsis by background-matching is a critical form of anti-predator defence for animals exposed to visual predators, but achieving effective camouflage in patchy and variable natural environments is not straightforward. To cope with heterogeneous backgrounds, animals could either specialise on particular microhabitat patches, appearing cryptic in some areas but mismatching others, or adopt a compromise strategy, providing partial matching across different patch types. Existing studies have tested the effectiveness of compromise strategies in only a limited set of circumstances, primarily with small targets varying in pattern, and usually in screen-based tasks. Here, we measured the detection risk associated with different background-matching strategies for relatively large targets, with human observers searching for them in natural scenes, and focusing on colour. Model prey were designed to either ‘specialise’ on the colour of common microhabitat patches, or ‘generalise’ by matching the average colour of the whole visual scene.
Results: In both the field and an equivalent online computer-based search task, targets adopting the generalist strategy were more successful in evading detection than those matching microhabitat patches. This advantage occurred because, across all possible locations in these experiments, targets were typically viewed against a patchwork of different microhabitat areas; the putatively generalist targets were thus more similar on average to their various immediate surroundings than were the specialists.
Conclusions: Demonstrating close agreement between the results of field and online search experiments provides useful validation of online citizen science methods commonly used to test principles of camouflage, at least for human observers. In finding a survival benefit to matching the average colour of the visual scene in our chosen environment, our results highlight the importance of relative scales in determining optimal camouflage strategies, and suggest how compromise coloration can succeed in nature.
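
The two strategies reduce to a simple colour computation, sketched here: a ‘generalist’ target takes the mean colour of the whole scene, a ‘specialist’ the mean colour of one patch type, and detectability is proxied by the average colour distance to the target's immediate surround. The distance metric and the proxy are assumptions, not the study's detection model.

```python
import numpy as np

def strategy_mismatch(scene_pixels, surround_means, patch_mean):
    """Mean colour distance between a target colour and its possible
    immediate surrounds (sketch). scene_pixels: (n_pixels, 3);
    surround_means: (n_locations, 3) mean colour around each location."""
    generalist = scene_pixels.mean(axis=0)        # whole-scene average colour
    def cost(target):
        return np.mean(np.linalg.norm(surround_means - target, axis=1))
    return cost(generalist), cost(patch_mean)     # lower = harder to detect
```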


2009 · Vol. 26 (1) · pp. 35-49
Author(s): Thorsten Hansen, Karl R. Gegenfurtner

Form vision is traditionally regarded as processing primarily achromatic information. Previous investigations into the statistics of color and luminance in natural scenes have claimed that luminance and chromatic edges are not independent of each other and that any chromatic edge most likely occurs together with a luminance edge of similar strength. Here we computed the joint statistics of luminance and chromatic edges in over 700 calibrated color images from natural scenes. We found that isoluminant edges exist in natural scenes and were not rarer than pure luminance edges. Most edges combined luminance and chromatic information but to varying degrees such that luminance and chromatic edges were statistically independent of each other. Independence increased along successive stages of visual processing from cones via postreceptoral color-opponent channels to edges. The results show that chromatic edge contrast is an independent source of information that can be linearly combined with other cues for the proper segmentation of objects in natural and artificial vision systems. Color vision may have evolved in response to the natural scene statistics to gain access to this independent information.
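
A sketch of the kind of joint edge statistic involved: compute luminance and chromatic edge maps with a gradient operator and measure their association across pixels. The Sobel-based edge definition and the use of a simple correlation are assumptions; the paper's analysis pipeline differs in detail.

```python
import numpy as np
from scipy import ndimage

def edge_association(luminance, chromatic):
    """Correlation between luminance and chromatic edge strength (sketch).
    luminance, chromatic: 2-D arrays (e.g. one colour-opponent channel)."""
    def gradient_magnitude(img):
        gx = ndimage.sobel(img, axis=0)
        gy = ndimage.sobel(img, axis=1)
        return np.hypot(gx, gy)
    e_lum = gradient_magnitude(luminance)
    e_chr = gradient_magnitude(chromatic)
    # Near-zero correlation across many images would indicate statistical
    # independence in the sense discussed above.
    return np.corrcoef(e_lum.ravel(), e_chr.ravel())[0, 1]
```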


2018
Author(s): Niru Maheswaranathan, Lane T. McIntosh, Hidenori Tanaka, Satchel Grant, David B. Kastner, et al.

Understanding how the visual system encodes natural scenes is a fundamental goal of sensory neuroscience. We show here that a three-layer network model predicts the retinal response to natural scenes with an accuracy nearing the fundamental limits of predictability. The model's internal structure is interpretable, in that model units are highly correlated with interneurons recorded separately and not used to fit the model. We further show the ethological relevance to natural visual processing of a diverse set of phenomena, including complex motion encoding, adaptation and predictive coding. Our analysis uncovers a fast timescale of visual processing that is inaccessible directly from experimental data, showing unexpectedly that ganglion cells signal in distinct modes by rapidly (< 0.1 s) switching their selectivity for direction of motion, orientation, location and the sign of intensity. A new approach that decomposes ganglion cell responses into the contributions of interneurons reveals how the latent effects of parallel retinal circuits generate the response to any possible stimulus. These results reveal extremely flexible and rapid dynamics of the retinal code for natural visual stimuli, explaining the need for a large set of interneuron pathways to generate the dynamic neural code for natural scenes.
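
A minimal sketch of a three-layer CNN of the kind described, in PyTorch; the filter counts, kernel sizes, and the convention of feeding temporal history as input channels are assumptions, not the authors' exact architecture.

```python
import torch
import torch.nn as nn

class ThreeLayerRetina(nn.Module):
    """Two convolutional stages whose hidden units can be compared with
    recorded interneurons, plus a fully connected readout (sketch)."""
    def __init__(self, n_cells, n_frames=40):
        super().__init__()
        self.conv1 = nn.Conv2d(n_frames, 8, kernel_size=15)  # history as channels
        self.conv2 = nn.Conv2d(8, 8, kernel_size=11)
        self.readout = nn.Sequential(
            nn.Flatten(), nn.LazyLinear(n_cells), nn.Softplus())

    def forward(self, clip):                  # clip: (batch, n_frames, H, W)
        x = torch.relu(self.conv1(clip))      # units to compare with interneurons
        x = torch.relu(self.conv2(x))
        return self.readout(x)                # one predicted firing rate per cell
```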


Author(s): N. Seijdel, N. Tsakmakidis, E. H. F. De Haan, S. M. Bohte, H. S. Scholte

Feedforward deep convolutional neural networks (DCNNs) are, under specific conditions, matching and even surpassing human performance in object recognition in natural scenes. This performance suggests that the analysis of a loose collection of image features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Research in humans, however, suggests that while feedforward activity may suffice for sparse scenes with isolated objects, additional visual operations (‘routines’) that aid the recognition process (e.g. segmentation or grouping) are needed for more complex scenes. Linking human visual processing to the performance of DCNNs with increasing depth, we here explored whether, how, and when object information is differentiated from the backgrounds it appears on. To this end, we controlled the information in both objects and backgrounds, as well as the relationship between them, by adding noise, manipulating background congruence and systematically occluding parts of the image. Results indicate that with an increase in network depth, there is an increase in the distinction between object and background information. For shallower networks, results indicated a benefit of training on segmented objects. Overall, these results indicate that, in effect, scene segmentation can be performed by a network of sufficient depth. We conclude that the human brain could perform scene segmentation in the context of object identification without an explicit mechanism, by selecting or “binding” features that belong to the object and ignoring other features, in a manner similar to a very deep convolutional neural network.
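
One simple way to quantify the depth effect described above, sketched here: present the same object on two different backgrounds and correlate the activations layer by layer; rising correlation with depth indicates that object information has been separated from the background. This metric is an assumption, not the study's analysis.

```python
import numpy as np

def invariance_by_depth(acts_bg1, acts_bg2):
    """acts_bg1, acts_bg2: lists of per-layer activation arrays for the
    same object on two different backgrounds (sketch)."""
    scores = []
    for a, b in zip(acts_bg1, acts_bg2):
        r = np.corrcoef(a.ravel(), b.ravel())[0, 1]
        scores.append(r)                # higher = more background-invariant
    return scores                       # one score per layer, shallow to deep
```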

