Deep Convolutional Neural Networks Based on Image Data Augmentation for Visual Object Recognition

Author(s):  
Khaoula Jayech
2019 ◽  
Author(s):  
Astrid A. Zeman ◽  
J. Brendan Ritchie ◽  
Stefania Bracci ◽  
Hans Op de Beeck

AbstractDeep Convolutional Neural Networks (CNNs) are gaining traction as the benchmark model of visual object recognition, with performance now surpassing humans. While CNNs can accurately assign one image to potentially thousands of categories, network performance could be the result of layers that are tuned to represent the visual shape of objects, rather than object category, since both are often confounded in natural images. Using two stimulus sets that explicitly dissociate shape from category, we correlate these two types of information with each layer of multiple CNNs. We also compare CNN output with fMRI activation along the human visual ventral stream by correlating artificial with biological representations. We find that CNNs encode category information independently from shape, peaking at the final fully connected layer in all tested CNN architectures. Comparing CNNs with fMRI brain data, early visual cortex (V1) and early layers of CNNs encode shape information. Anterior ventral temporal cortex encodes category information, which correlates best with the final layer of CNNs. The interaction between shape and category that is found along the human visual ventral pathway is echoed in multiple deep networks. Our results suggest CNNs represent category information independently from shape, much like the human visual system.


2017 ◽  
Author(s):  
Courtney J. Spoerer ◽  
Patrick McClure ◽  
Nikolaus Kriegeskorte

Feedforward neural networks provide the dominant model of how the brain performs visual object recognition. However, these networks lack the lateral and feedback connections, and the resulting recurrent neuronal dynamics, of the ventral visual pathway in the human and nonhuman primate brain. Here we investigate recurrent convolutional neural networks with bottom-up (B), lateral (L), and top-down (T) connections. Combining these types of connections yields four architectures (B, BT, BL, and BLT), which we systematically test and compare. We hypothesized that recurrent dynamics might improve recognition performance in the challenging scenario of partial occlusion. We introduce two novel occluded object recognition tasks to test the efficacy of the models, digit clutter (where multiple target digits occlude one another) and digit debris (where target digits are occluded by digit fragments). We find that recurrent neural networks outperform feedforward control models (approximately matched in parametric complexity) at recognising objects, both in the absence of occlusion and in all occlusion conditions. Recurrent networks were also found to be more robust to the inclusion of additive Gaussian noise. Recurrent neural networks are better in two respects: (1) they are more neurobiologically realistic than their feedforward counterparts; (2) they are better in terms of their ability to recognise objects, especially under challenging conditions. This work shows that computer vision can benefit from using recurrent convolutional architectures and suggests that the ubiquitous recurrent connections in biological brains are essential for task performance.


2021 ◽  
Vol 11 (15) ◽  
pp. 6721
Author(s):  
Jinyeong Wang ◽  
Sanghwan Lee

In increasing manufacturing productivity with automated surface inspection in smart factories, the demand for machine vision is rising. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision. With that, many machine vision systems adopt CNNs to surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can apply to grayscale industrial images, and we demonstrated outstanding performance in the image classification and the object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be performed when training CNNs with industrial images taken with mono cameras. (2) We demonstrate that image classification or object detection performance is better when training with the industrial image data augmented by the proposed method. Through the proposed method, many machine-vision-related problems using mono cameras can be effectively solved by using CNNs.


2021 ◽  
Vol 18 (3) ◽  
pp. 172988142110105
Author(s):  
Jnana Sai Abhishek Varma Gokaraju ◽  
Weon Keun Song ◽  
Min-Ho Ka ◽  
Somyot Kaitwanidvilai

The study investigated object detection and classification based on both Doppler radar spectrograms and vision images using two deep convolutional neural networks. The kinematic models for a walking human and a bird flapping its wings were incorporated into MATLAB simulations to create data sets. The dynamic simulator identified the final position of each ellipsoidal body segment taking its rotational motion into consideration in addition to its bulk motion at each sampling point to describe its specific motion naturally. The total motion induced a micro-Doppler effect and created a micro-Doppler signature that varied in response to changes in the input parameters, such as varying body segment size, velocity, and radar location. Micro-Doppler signature identification of the radar signals returned from the target objects that were animated by the simulator required kinematic modeling based on a short-time Fourier transform analysis of the signals. Both You Only Look Once V3 and Inception V3 were used for the detection and classification of the objects with different red, green, blue colors on black or white backgrounds. The results suggested that clear micro-Doppler signature image-based object recognition could be achieved in low-visibility conditions. This feasibility study demonstrated the application possibility of Doppler radar to autonomous vehicle driving as a backup sensor for cameras in darkness. In this study, the first successful attempt of animated kinematic models and their synchronized radar spectrograms to object recognition was made.


Animals ◽  
2021 ◽  
Vol 11 (5) ◽  
pp. 1263
Author(s):  
Zhaojun Wang ◽  
Jiangning Wang ◽  
Congtian Lin ◽  
Yan Han ◽  
Zhaosheng Wang ◽  
...  

With the rapid development of digital technology, bird images have become an important part of ornithology research data. However, due to the rapid growth of bird image data, it has become a major challenge to effectively process such a large amount of data. In recent years, deep convolutional neural networks (DCNNs) have shown great potential and effectiveness in a variety of tasks regarding the automatic processing of bird images. However, no research has been conducted on the recognition of habitat elements in bird images, which is of great help when extracting habitat information from bird images. Here, we demonstrate the recognition of habitat elements using four DCNN models trained end-to-end directly based on images. To carry out this research, an image database called Habitat Elements of Bird Images (HEOBs-10) and composed of 10 categories of habitat elements was built, making future benchmarks and evaluations possible. Experiments showed that good results can be obtained by all the tested models. ResNet-152-based models yielded the best test accuracy rate (95.52%); the AlexNet-based model yielded the lowest test accuracy rate (89.48%). We conclude that DCNNs could be efficient and useful for automatically identifying habitat elements from bird images, and we believe that the practical application of this technology will be helpful for studying the relationships between birds and habitat elements.


Mathematics ◽  
2021 ◽  
Vol 9 (6) ◽  
pp. 624
Author(s):  
Stefan Rohrmanstorfer ◽  
Mikhail Komarov ◽  
Felix Mödritscher

With the always increasing amount of image data, it has become a necessity to automatically look for and process information in these images. As fashion is captured in images, the fashion sector provides the perfect foundation to be supported by the integration of a service or application that is built on an image classification model. In this article, the state of the art for image classification is analyzed and discussed. Based on the elaborated knowledge, four different approaches will be implemented to successfully extract features out of fashion data. For this purpose, a human-worn fashion dataset with 2567 images was created, but it was significantly enlarged by the performed image operations. The results show that convolutional neural networks are the undisputed standard for classifying images, and that TensorFlow is the best library to build them. Moreover, through the introduction of dropout layers, data augmentation and transfer learning, model overfitting was successfully prevented, and it was possible to incrementally improve the validation accuracy of the created dataset from an initial 69% to a final validation accuracy of 84%. More distinct apparel like trousers, shoes and hats were better classified than other upper body clothes.


Sign in / Sign up

Export Citation Format

Share Document