scholarly journals Optimized convolutional neural network architectures for efficient on-device vision-based object detection

Author(s):  
Ivan Rodriguez-Conde ◽  
Celso Campos ◽  
Florentino Fdez-Riverola

AbstractConvolutional neural networks have pushed forward image analysis research and computer vision over the last decade, constituting a state-of-the-art approach in object detection today. The design of increasingly deeper and wider architectures has made it possible to achieve unprecedented levels of detection accuracy, albeit at the cost of both a dramatic computational burden and a large memory footprint. In such a context, cloud systems have become a mainstream technological solution due to their tremendous scalability, providing researchers and practitioners with virtually unlimited resources. However, these resources are typically made available as remote services, requiring communication over the network to be accessed, thus compromising the speed of response, availability, and security of the implemented solution. In view of these limitations, the on-device paradigm has emerged as a recent yet widely explored alternative, pursuing more compact and efficient networks to ultimately enable the execution of the derived models directly on resource-constrained client devices. This study provides an up-to-date review of the more relevant scientific research carried out in this vein, circumscribed to the object detection problem. In particular, the paper contributes to the field with a comprehensive architectural overview of both the existing lightweight object detection frameworks targeted to mobile and embedded devices, and the underlying convolutional neural networks that make up their internal structure. More specifically, it addresses the main structural-level strategies used for conceiving the various components of a detection pipeline (i.e., backbone, neck, and head), as well as the most salient techniques proposed for adapting such structures and the resulting architectures to more austere deployment environments. Finally, the study concludes with a discussion of the specific challenges and next steps to be taken to move toward a more convenient accuracy–speed trade-off.

2019 ◽  
Vol 11 (18) ◽  
pp. 2176 ◽  
Author(s):  
Chen ◽  
Zhong ◽  
Tan

Detecting objects in aerial images is a challenging task due to multiple orientations and relatively small size of the objects. Although many traditional detection models have demonstrated an acceptable performance by using the imagery pyramid and multiple templates in a sliding-window manner, such techniques are inefficient and costly. Recently, convolutional neural networks (CNNs) have successfully been used for object detection, and they have demonstrated considerably superior performance than that of traditional detection methods; however, this success has not been expanded to aerial images. To overcome such problems, we propose a detection model based on two CNNs. One of the CNNs is designed to propose many object-like regions that are generated from the feature maps of multi scales and hierarchies with the orientation information. Based on such a design, the positioning of small size objects becomes more accurate, and the generated regions with orientation information are more suitable for the objects arranged with arbitrary orientations. Furthermore, another CNN is designed for object recognition; it first extracts the features of each generated region and subsequently makes the final decisions. The results of the extensive experiments performed on the vehicle detection in aerial imagery (VEDAI) and overhead imagery research data set (OIRDS) datasets indicate that the proposed model performs well in terms of not only the detection accuracy but also the detection speed.


2021 ◽  
Vol 10 (6) ◽  
pp. 377
Author(s):  
Chiao-Ling Kuo ◽  
Ming-Hua Tsai

The importance of road characteristics has been highlighted, as road characteristics are fundamental structures established to support many transportation-relevant services. However, there is still huge room for improvement in terms of types and performance of road characteristics detection. With the advantage of geographically tiled maps with high update rates, remarkable accessibility, and increasing availability, this paper proposes a novel simple deep-learning-based approach, namely joint convolutional neural networks (CNNs) adopting adaptive squares with combination rules to detect road characteristics from roadmap tiles. The proposed joint CNNs are responsible for the foreground and background image classification and various types of road characteristics classification from previous foreground images, raising detection accuracy. The adaptive squares with combination rules help efficiently focus road characteristics, augmenting the ability to detect them and provide optimal detection results. Five types of road characteristics—crossroads, T-junctions, Y-junctions, corners, and curves—are exploited, and experimental results demonstrate successful outcomes with outstanding performance in reality. The information of exploited road characteristics with location and type is, thus, converted from human-readable to machine-readable, the results will benefit many applications like feature point reminders, road condition reports, or alert detection for users, drivers, and even autonomous vehicles. We believe this approach will also enable a new path for object detection and geospatial information extraction from valuable map tiles.


2021 ◽  
Vol 11 (15) ◽  
pp. 6721
Author(s):  
Jinyeong Wang ◽  
Sanghwan Lee

In increasing manufacturing productivity with automated surface inspection in smart factories, the demand for machine vision is rising. Recently, convolutional neural networks (CNNs) have demonstrated outstanding performance and solved many problems in the field of computer vision. With that, many machine vision systems adopt CNNs to surface defect inspection. In this study, we developed an effective data augmentation method for grayscale images in CNN-based machine vision with mono cameras. Our method can apply to grayscale industrial images, and we demonstrated outstanding performance in the image classification and the object detection tasks. The main contributions of this study are as follows: (1) We propose a data augmentation method that can be performed when training CNNs with industrial images taken with mono cameras. (2) We demonstrate that image classification or object detection performance is better when training with the industrial image data augmented by the proposed method. Through the proposed method, many machine-vision-related problems using mono cameras can be effectively solved by using CNNs.


2019 ◽  
Vol 345 ◽  
pp. 3-14 ◽  
Author(s):  
Guanzhong Tian ◽  
Liang Liu ◽  
JongHyok Ri ◽  
Yong Liu ◽  
Yiran Sun

Sign in / Sign up

Export Citation Format

Share Document