Exploring Perceptual Illusions in Deep Neural Networks

Mapping Intimacies ◽

10.1101/687905 ◽

2019 ◽

Author(s):

Emily J. Ward

Keyword(s):

Neural Networks ◽

Object Recognition ◽

Visual Processing ◽

Visual Information ◽

Deep Neural Networks ◽

Perceptual System ◽

Deep Convolutional Neural Networks ◽

Perceptual Illusions ◽

Trained Neural Network ◽

Level Performance

AbstractPerceptual illusions—discrepancies between what exists externally and what we actually see—reveal a great deal about how the perceptual system functions. Rather than failures of perception, illusions expose automatic computations and biases in visual processing that help make better decisions from visual information to achieve our perceptual goals. Recognizing objects is one such perceptual goal that is shared between humans and certain Deep Convolutional Neural Networks, which can reach human-level performance. Do neural networks trained exclusively for object recognition “perceive” visual illusions, simply as a result of solving this one perceptual problem? Here, I showed four classic illusions to humans and a pre-trained neural network to see if the network exhibits similar perceptual biases. I found that deep neural networks trained exclusively for object recognition exhibit the Müller-Lyer illusion, but not other illusions. This result shows that some perceptual computations that are similar to humans’ may come “for free” in a system with perceptual goals similar to humans’.

Download Full-text

Noise-robust recognition of objects by humans and deep neural networks

10.1101/2020.08.03.234625 ◽

2020 ◽

Author(s):

Hojin Jang ◽

Devin McCormack ◽

Frank Tong

Keyword(s):

Neural Networks ◽

Visual Processing ◽

Deep Neural Networks ◽

Signal To Noise Ratio ◽

Human Vision ◽

Training Procedure ◽

Robust Recognition ◽

Recognition Of Objects ◽

Noise Robust ◽

Level Performance

ABSTRACTDeep neural networks (DNNs) can accurately recognize objects in clear viewing conditions, leading to claims that they have attained or surpassed human-level performance. However, standard DNNs are severely impaired at recognizing objects in visual noise, whereas human vision remains robust. We developed a noise-training procedure, generating noisy images of objects with low signal-to-noise ratio, to investigate whether DNNs can acquire robustness that better matches human vision. After noise training, DNNs outperformed human observers while exhibiting more similar patterns of performance, and provided a better model for predicting human recognition thresholds on an image-by-image basis. Noise training also improved DNN recognition of vehicles in noisy weather. Layer-specific analyses revealed that the contaminating effects of noise were dampened, rather than amplified, across successive stages of the noise-trained network, with greater benefit at higher levels of the network. Our findings indicate that DNNs can learn noise-robust representations that better approximate human visual processing.

Download Full-text

Beyond Core Object Recognition: Recurrent processes account for object recognition under occlusion

10.1101/302034 ◽

2018 ◽

Cited By ~ 8

Author(s):

Karim Rajaei ◽

Yalda Mohsenzadeh ◽

Reza Ebrahimpour ◽

Seyed-Mahdi Khaligh-Razavi

Keyword(s):

Neural Networks ◽

Deep Learning ◽

Object Recognition ◽

Human Brain ◽

Deep Neural Networks ◽

Temporal Dynamics ◽

Computational Modelling ◽

Mechanistic Explanation ◽

The Core ◽

Level Performance

AbstractCore object recognition, the ability to rapidly recognize objects despite variations in their appearance, is largely solved through the feedforward processing of visual information. Deep neural networks are shown to achieve human-level performance in these tasks, and explain the primate brain representation. On the other hand, object recognition under more challenging conditions (i.e. beyond the core recognition problem) is less characterized. One such example is object recognition under occlusion. It is unclear to what extent feedforward and recurrent processes contribute in object recognition under occlusion. Furthermore, we do not know whether the conventional deep neural networks, such as AlexNet, which were shown to be successful in solving core object recognition, can perform similarly well in problems that go beyond the core recognition. Here, we characterize neural dynamics of object recognition under occlusion, using magnetoencephalography (MEG), while participants were presented with images of objects with various levels of occlusion. We provide evidence from multivariate analysis of MEG data, behavioral data, and computational modelling, demonstrating an essential role for recurrent processes in object recognition under occlusion. Furthermore, the computational model with local recurrent connections, used here, suggests a mechanistic explanation of how the human brain might be solving this problem.Author SummaryIn recent years, deep-learning-based computer vision algorithms have been able to achieve human-level performance in several object recognition tasks. This has also contributed in our understanding of how our brain may be solving these recognition tasks. However, object recognition under more challenging conditions, such as occlusion, is less characterized. Temporal dynamics of object recognition under occlusion is largely unknown in the human brain. Furthermore, we do not know if the previously successful deep-learning algorithms can similarly achieve human-level performance in these more challenging object recognition tasks. By linking brain data with behavior, and computational modeling, we characterized temporal dynamics of object recognition under occlusion, and proposed a computational mechanism that explains both behavioral and the neural data in humans. This provides a plausible mechanistic explanation for how our brain might be solving object recognition under more challenging conditions.

Download Full-text

Faculty Opinions recommendation of Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence.

Faculty Opinions – Post-Publication Peer Review of the Biomedical Literature ◽

10.3410/f.726413891.793534418 ◽

2017 ◽

Author(s):

Odelia Schwartz

Keyword(s):

Neural Networks ◽

Object Recognition ◽

Deep Neural Networks ◽

Visual Object ◽

Visual Object Recognition ◽

Cortical Dynamics ◽

Spatio Temporal

Download Full-text

Human and bird detection and classification based on Doppler radar spectrograms and vision images using convolutional neural networks

International Journal of Advanced Robotic Systems ◽

10.1177/17298814211010569 ◽

2021 ◽

Vol 18 (3) ◽

pp. 172988142110105

Author(s):

Jnana Sai Abhishek Varma Gokaraju ◽

Weon Keun Song ◽

Min-Ho Ka ◽

Somyot Kaitwanidvilai

Keyword(s):

Neural Networks ◽

Object Recognition ◽

Convolutional Neural Networks ◽

Doppler Radar ◽

Autonomous Vehicle ◽

Sampling Point ◽

Body Segment ◽

Kinematic Modeling ◽

Deep Convolutional Neural Networks ◽

Kinematic Models

The study investigated object detection and classification based on both Doppler radar spectrograms and vision images using two deep convolutional neural networks. The kinematic models for a walking human and a bird flapping its wings were incorporated into MATLAB simulations to create data sets. The dynamic simulator identified the final position of each ellipsoidal body segment taking its rotational motion into consideration in addition to its bulk motion at each sampling point to describe its specific motion naturally. The total motion induced a micro-Doppler effect and created a micro-Doppler signature that varied in response to changes in the input parameters, such as varying body segment size, velocity, and radar location. Micro-Doppler signature identification of the radar signals returned from the target objects that were animated by the simulator required kinematic modeling based on a short-time Fourier transform analysis of the signals. Both You Only Look Once V3 and Inception V3 were used for the detection and classification of the objects with different red, green, blue colors on black or white backgrounds. The results suggested that clear micro-Doppler signature image-based object recognition could be achieved in low-visibility conditions. This feasibility study demonstrated the application possibility of Doppler radar to autonomous vehicle driving as a backup sensor for cameras in darkness. In this study, the first successful attempt of animated kinematic models and their synchronized radar spectrograms to object recognition was made.

Download Full-text

RGB-D object recognition with multimodal deep convolutional neural networks

2017 IEEE International Conference on Multimedia and Expo (ICME) ◽

10.1109/icme.2017.8019538 ◽

2017 ◽

Cited By ~ 12

Author(s):

Mohammad Muntasir Rahman ◽

Yanhao Tan ◽

Jian Xue ◽

Ke Lu

Keyword(s):

Neural Networks ◽

Object Recognition ◽

Convolutional Neural Networks ◽

Deep Convolutional Neural Networks

Download Full-text

Depth in convolutional neural networks solves scene segmentation

10.1101/2019.12.16.877753 ◽

2019 ◽

Cited By ~ 1

Author(s):

N Seijdel ◽

N Tsakmakidis ◽

EHF De Haan ◽

SM Bohte ◽

HS Scholte

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Visual Processing ◽

Human Performance ◽

Object Identification ◽

Image Features ◽

Background Information ◽

Natural Scenes ◽

Scene Segmentation ◽

Deep Convolutional Neural Networks

AbstractFeedforward deep convolutional neural networks (DCNNs) are, under specific conditions, matching and even surpassing human performance in object recognition in natural scenes. This performance suggests that the analysis of a loose collection of image features could support the recognition of natural object categories, without dedicated systems to solve specific visual subtasks. Research in humans however suggests that while feedforward activity may suffice for sparse scenes with isolated objects, additional visual operations (‘routines’) that aid the recognition process (e.g. segmentation or grouping) are needed for more complex scenes. Linking human visual processing to performance of DCNNs with increasing depth, we here explored if, how, and when object information is differentiated from the backgrounds they appear on. To this end, we controlled the information in both objects and backgrounds, as well as the relationship between them by adding noise, manipulating background congruence and systematically occluding parts of the image. Results indicate that with an increase in network depth, there is an increase in the distinction between object- and background information. For more shallow networks, results indicated a benefit of training on segmented objects. Overall, these results indicate that, de facto, scene segmentation can be performed by a network of sufficient depth. We conclude that the human brain could perform scene segmentation in the context of object identification without an explicit mechanism, by selecting or “binding” features that belong to the object and ignoring other features, in a manner similar to a very deep convolutional neural network.

Download Full-text

CircConv: A Structured Convolution with Low Complexity

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33014287 ◽

2019 ◽

Vol 33 ◽

pp. 4287-4294

Author(s):

Siyu Liao ◽

Bo Yuan

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Neural Networks ◽

Computational Cost ◽

Low Complexity ◽

Deep Convolutional Neural Networks ◽

Significant Saving ◽

Machine Learning Applications ◽

Fast Multiplication ◽

Large Model

Deep neural networks (DNNs), especially deep convolutional neural networks (CNNs), have emerged as the powerful technique in various machine learning applications. However, the large model sizes of DNNs yield high demands on computation resource and weight storage, thereby limiting the practical deployment of DNNs. To overcome these limitations, this paper proposes to impose the circulant structure to the construction of convolutional layers, and hence leads to circulant convolutional layers (CircConvs) and circulant CNNs. The circulant structure and models can be either trained from scratch or re-trained from a pre-trained non-circulant model, thereby making it very flexible for different training environments. Through extensive experiments, such strong structureimposing approach is proved to be able to substantially reduce the number of parameters of convolutional layers and enable significant saving of computational cost by using fast multiplication of the circulant tensor.

Download Full-text