Tactile object recognition performance on graspable objects, but not texture-like objects, relates to visual object recognition ability

2020 ◽  
Vol 20 (11) ◽  
pp. 188
Author(s):  
Jason Chow ◽  
Thomas Palmeri ◽  
Isabel Gauthier
2021 ◽  
Author(s):  
David Miralles ◽  
Guillem Garrofé ◽  
Carlota Parés ◽  
Alejandro González ◽  
Gerard Serra ◽  
...  

Abstract The cognitive connection between the senses of touch and vision is probably the best-known case of cross-modality. Recent discoveries suggest that the mapping between the two senses is learned rather than innate. This evidence opens the door to a dynamic cross-modality that allows individuals to adaptively develop within their environment. Mimicking this aspect of human learning, we propose a new cross-modal mechanism that allows artificial cognitive systems (ACS) to adapt quickly to unforeseen perceptual anomalies generated by the environment or by the system itself. In this context, visual recognition systems have advanced remarkably in recent years thanks to the creation of large-scale datasets together with the advent of deep learning algorithms. However, such advances have not occurred in the haptic modality, mainly due to the lack of two-handed dexterous datasets that would allow learning systems to process the tactile information of human object exploration. This data imbalance limits the creation of synchronized multimodal datasets that would enable the development of cross-modality in ACS during object exploration. In this work, we use a recently generated multimodal dataset in which tactile sensors placed on a collection of objects capture haptic data from human manipulation, together with the corresponding visual counterpart. Using these data, we create a cross-modal learning-transfer mechanism capable of detecting both sudden and permanent anomalies in the visual channel and of maintaining visual object recognition performance by retraining the visual modality for a few minutes using haptic information. Here we show the importance of cross-modality for perceptual awareness and its ecological capacity to self-adapt to different environments.
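As a rough illustration of the kind of mechanism the abstract describes (a minimal numpy sketch on invented toy data, not the authors' system): a nearest-prototype "visual" classifier collapses when the visual channel is suddenly corrupted, and labels supplied by an unaffected "haptic" channel are used to re-estimate the visual prototypes under the anomaly.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy visual features: 3 object classes with well-separated prototypes in R^4.
centroids = 4.0 * np.eye(3, 4)
y = np.repeat(np.arange(3), 30)
X = centroids[y] + 0.1 * rng.normal(size=(90, 4))

def classify(X, protos):
    # Nearest-prototype "visual recognition".
    d = ((X[:, None, :] - protos[None, :, :]) ** 2).sum(-1)
    return d.argmin(1)

protos = np.stack([X[y == c].mean(0) for c in range(3)])

# A sudden, permanent visual anomaly: every feature channel is inverted.
X_anom = -X
acc_anom = (classify(X_anom, protos) == y).mean()

# The haptic channel is unaffected, so haptic recognition supplies the
# labels used to retrain (re-estimate) the visual prototypes.
y_haptic = y
protos_new = np.stack([X_anom[y_haptic == c].mean(0) for c in range(3)])
acc_recovered = (classify(X_anom, protos_new) == y).mean()
```

Under the anomaly the old prototypes misclassify nearly everything, while the haptically supervised retraining restores recognition, mirroring the self-adaptation the paper reports.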


2019 ◽  
Vol 31 (9) ◽  
pp. 1354-1367
Author(s):  
Yael Holzinger ◽  
Shimon Ullman ◽  
Daniel Harari ◽  
Marlene Behrmann ◽  
Galia Avidan

Visual object recognition is performed effortlessly by humans notwithstanding the fact that it requires a series of complex computations, which are, as yet, not well understood. Here, we tested a novel account of the representations used for visual recognition and their neural correlates using fMRI. The rationale is based on previous research showing that a set of representations, termed “minimal recognizable configurations” (MIRCs), which are computationally derived and have unique psychophysical characteristics, serve as the building blocks of object recognition. We contrasted the BOLD responses elicited by three classes of images: MIRCs derived from different categories (faces, objects, and places); sub-MIRCs, which are visually similar to MIRCs but result in poor recognition; and scrambled, unrecognizable images. Stimuli were presented in blocks, and participants indicated yes/no recognition for each image. We confirmed that MIRCs elicited higher recognition performance than sub-MIRCs for all three categories. Whereas fMRI activation in early visual cortex for both MIRCs and sub-MIRCs of each category did not differ from that elicited by scrambled images, high-level visual regions exhibited overall greater activation for MIRCs than for sub-MIRCs or scrambled images. Moreover, MIRCs and sub-MIRCs from each category elicited enhanced activation in the corresponding category-selective regions, including the fusiform face area and occipital face area (faces), lateral occipital cortex (objects), and parahippocampal place area and transverse occipital sulcus (places). These findings reveal the psychological and neural relevance of MIRCs and enable us to make progress in developing a more complete account of object recognition.


Perception ◽  
10.1068/p6038 ◽  
2008 ◽  
Vol 37 (12) ◽  
pp. 1867-1878 ◽  
Author(s):  
Andrew T Woods ◽  
Allison Moore ◽  
Fiona N Newell

Previous investigations of visual object recognition have found that some views of both familiar and unfamiliar objects promote more efficient recognition performance than other views. These views are considered canonical and are often the views that present the most information about an object's 3-D structure and features in the image. Although objects can also be recognised efficiently by touch alone, little is known about whether some views promote more efficient haptic recognition than others. This may seem unlikely, given that an object's structure and features are readily available to the hand during exploration. We conducted two experiments to investigate whether canonical views exist in haptic object recognition. In the first, participants were required to position each object in the way that would present the best view for learning the object through touch alone. We found a large degree of consistency in viewpoint position across participants for both familiar and unfamiliar objects. In a second experiment, we found that these consistent, or canonical, views promoted better haptic recognition performance than other, random views of the objects. Interestingly, these haptic canonical views were not necessarily the same as the canonical views normally found in visual perception. Nevertheless, our findings provide support for the idea that the visual and tactile systems are functionally equivalent in terms of how objects are represented in memory and subsequently recognised.


2017 ◽  
Author(s):  
Courtney J. Spoerer ◽  
Patrick McClure ◽  
Nikolaus Kriegeskorte

Feedforward neural networks provide the dominant model of how the brain performs visual object recognition. However, these networks lack the lateral and feedback connections, and the resulting recurrent neuronal dynamics, of the ventral visual pathway in the human and nonhuman primate brain. Here we investigate recurrent convolutional neural networks with bottom-up (B), lateral (L), and top-down (T) connections. Combining these types of connections yields four architectures (B, BT, BL, and BLT), which we systematically test and compare. We hypothesized that recurrent dynamics might improve recognition performance in the challenging scenario of partial occlusion. We introduce two novel occluded object recognition tasks to test the efficacy of the models: digit clutter, in which multiple target digits occlude one another, and digit debris, in which target digits are occluded by digit fragments. We find that recurrent neural networks outperform feedforward control models (approximately matched in parametric complexity) at recognising objects, both in the absence of occlusion and in all occlusion conditions. Recurrent networks were also more robust to additive Gaussian noise. Recurrent neural networks are thus better in two respects: they are more neurobiologically realistic than their feedforward counterparts, and they recognise objects more accurately, especially under challenging conditions. This work shows that computer vision can benefit from recurrent convolutional architectures and suggests that the ubiquitous recurrent connections in biological brains are essential for task performance.
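To make the B/L/T distinction concrete, here is a minimal sketch of a BLT-style unrolled recurrence, with assumed toy layer sizes and dense weight matrices standing in for the paper's convolutions: layer 1 combines bottom-up input, its own lateral state, and top-down feedback from layer 2.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical sizes: input dimension 8, two hidden layers of width 6.
n_in, n_h = 8, 6
W_b1 = rng.normal(scale=0.3, size=(n_h, n_in))  # bottom-up (B) into layer 1
W_l1 = rng.normal(scale=0.3, size=(n_h, n_h))   # lateral (L) recurrence in layer 1
W_b2 = rng.normal(scale=0.3, size=(n_h, n_h))   # bottom-up (B) into layer 2
W_t21 = rng.normal(scale=0.3, size=(n_h, n_h))  # top-down (T) from layer 2 to layer 1

def blt_forward(x, steps=4):
    """Unroll a two-layer BLT network for a fixed number of time steps."""
    h1 = np.zeros(n_h)
    h2 = np.zeros(n_h)
    for _ in range(steps):
        # Layer 1 sums bottom-up drive, its own previous (lateral) state,
        # and top-down feedback from layer 2's previous state.
        h1 = relu(W_b1 @ x + W_l1 @ h1 + W_t21 @ h2)
        h2 = relu(W_b2 @ h1)
    return h2

out = blt_forward(rng.normal(size=n_in))
print(out.shape)  # (6,)
```

Dropping the `W_l1` term gives a BT architecture, dropping `W_t21` gives BL, and dropping both reduces the loop to a plain feedforward (B) pass, which is the comparison the paper makes.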


Author(s):  
Albert L. Rothenstein

Most biologically inspired models of object recognition rely on a feed-forward architecture in which abstract representations are gradually built from simpler ones, but recognition performance in such systems drops when multiple objects are present in the input. This chapter puts forward the proposal that, by using multiple passes through the visual processing hierarchy, both bottom-up and top-down, it is possible to address the limitations of feed-forward architectures and explain the different recognition behaviors that primate vision exhibits. The model relies on the reentrant connections that are ubiquitous in the primate brain to recover spatial information and thus allow for the selective processing of stimuli. The chapter ends with a discussion of the implications of this work, its explanatory power, and a number of predictions for future experimental work.
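A schematic sketch of the multi-pass idea, using hypothetical one-hot templates rather than any model from the chapter: a first feed-forward pass over a two-object scene yields ambiguous responses, a top-down pass builds a spatial gate from the strongest response, and a second feed-forward pass then recognises the selected object alone.

```python
import numpy as np

# Hypothetical toy scene: 4 object classes coded as one-hot feature vectors,
# two objects present at two spatial locations.
templates = np.eye(4)
scene = np.zeros((2, 4))
scene[0] = templates[1]  # object of class 1 at location 0
scene[1] = templates[3]  # object of class 3 at location 1

def feedforward(scene, gate):
    # Pool the gated locations, then match the pooled features
    # against every class template.
    pooled = (gate[:, None] * scene).sum(0)
    return templates @ pooled

# Pass 1 (purely feed-forward): both objects contribute, so the
# class responses are ambiguous -- two classes tie.
resp1 = feedforward(scene, gate=np.ones(2))

# Top-down / reentrant pass: take the most active class and recover a
# spatial gate that keeps only the location supporting it.
target = resp1.argmax()
gate = (scene @ templates[target] > 0).astype(float)

# Pass 2: with spatial selection applied, a single object dominates.
resp2 = feedforward(scene, gate)
```

After the second pass only the selected class responds, illustrating how reentrant spatial information can resolve the multi-object ambiguity that defeats a single feed-forward sweep.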


2020 ◽  
Vol 34 (6) ◽  
pp. 1369-1378
Author(s):  
Ann J. Carrigan ◽  
Paul Stoodley ◽  
Fernando Fernandez ◽  
Mackenzie A. Sunday ◽  
Mark W. Wiggins

2020 ◽  
Vol 20 (11) ◽  
pp. 139
Author(s):  
Ann Carrigan ◽  
Paul Stoodley ◽  
Fernando Fernandez ◽  
Mackenzie Sunday ◽  
Mark Wiggins
