Domain-Specific On-Device Object Detection Method

Entropy ◽  
2022 ◽  
Vol 24 (1) ◽  
pp. 77
Author(s):  
Seongju Kang ◽  
Jaegi Hwang ◽  
Kwangsue Chung

Object detection is a fundamental task in computer vision, and various approaches have been proposed to detect diverse objects using deep neural networks (DNNs). However, because DNNs are computation-intensive, it is difficult to apply them to resource-constrained devices. Here, we propose an on-device object detection method using domain-specific models. In the proposed method, we define object of interest (OOI) groups that contain objects with a high frequency of appearance in specific domains. Compared with the existing DNN model, the layers of the domain-specific models are shallower and narrower, reducing the number of trainable parameters and thus speeding up object detection. To ensure a lightweight network design, we combine various network structures to obtain the best-performing lightweight detection model. The experimental results reveal that the size of the proposed lightweight model is 21.7 MB, which is 91.35% and 36.98% smaller than YOLOv3-SPP and Tiny-YOLO, respectively. The f-measures achieved on the MS COCO 2017 dataset were 18.3%, 11.9%, and 20.3% higher than those of YOLOv3-SPP, Tiny-YOLO, and YOLO-Nano, respectively. The results demonstrate that the lightweight model achieves higher efficiency and better performance on non-GPU devices, such as mobile devices and embedded boards, than conventional models.
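The abstract does not include code; as an illustration only, the following minimal PyTorch sketch shows one common way to make a detection backbone "shallower and narrower", using depthwise-separable convolutions to cut trainable parameters. The class names, channel widths, and layer counts are assumptions, not the authors' architecture.

```python
# Illustrative sketch (not the authors' code): a shallow, narrow backbone
# built from depthwise-separable convolutions, a common way to reduce
# trainable parameters for on-device detection.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise 3x3 followed by pointwise 1x1 instead of a full 3x3 conv.
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride, 1, groups=in_ch, bias=False)
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU6(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class LightweightBackbone(nn.Module):
    """Hypothetical narrow backbone for a domain-specific OOI detector."""
    def __init__(self, width=32):
        super().__init__()
        self.stem = nn.Conv2d(3, width, 3, stride=2, padding=1)
        self.stages = nn.Sequential(
            DepthwiseSeparableConv(width, width * 2, stride=2),
            DepthwiseSeparableConv(width * 2, width * 4, stride=2),
            DepthwiseSeparableConv(width * 4, width * 4, stride=1),
        )

    def forward(self, x):
        return self.stages(self.stem(x))

if __name__ == "__main__":
    model = LightweightBackbone()
    n_params = sum(p.numel() for p in model.parameters())
    print(f"trainable parameters: {n_params / 1e6:.2f}M")
```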

2021 ◽  
Vol 66 (3) ◽  
pp. 2493-2507
Author(s):  
Hyun Kyu Shin ◽  
Si Woon Lee ◽  
Goo Pyo Hong ◽  
Sael Lee ◽  
Sang Hyo Lee ◽  
...  

Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5323
Author(s):  
Yongsu Kim ◽  
Hyoeun Kang ◽  
Naufal Suryanto ◽  
Harashta Tatimma Larasati ◽  
Afifatul Mukaroh ◽  
...  

Deep neural networks (DNNs), especially those used in computer vision, are highly vulnerable to adversarial attacks, such as adversarial perturbations and adversarial patches. Adversarial patches, often considered more suitable for real-world attacks, are attached to the target object or its surroundings to deceive the target system. However, most previous research employed adversarial patches that are conspicuous to human vision, making them easy to identify and counter. Previously, the spatially localized perturbation GAN (SLP-GAN) was proposed, in which the perturbation is added only to the most representative area of the input images, creating a spatially localized adversarial camouflage patch that excels in visual fidelity and is therefore difficult for human vision to detect. In this study, that method, termed eSLP-GAN, was extended to deceive not only classifiers but also object detection systems. Specifically, the loss function was modified for greater compatibility with attacks on object detection models and to increase robustness in the real world. Furthermore, the applicability of the proposed method was tested on the CARLA simulator for a more authentic real-world attack scenario.
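As a rough illustration of the kind of objective such an attack optimizes (not the paper's eSLP-GAN loss), the sketch below combines a detection-confidence term with a visual-fidelity term; the function name, weighting, and tensor shapes are assumptions.

```python
# A minimal sketch (assumed, not the paper's implementation) of a combined
# objective for a camouflage-patch attack: suppress the detector's confidence
# on the target object while keeping the patch close to the original pixels.
import torch

def adversarial_patch_loss(det_scores, patch, original_region, fidelity_weight=0.1):
    """det_scores: detector confidences for the target object, shape (N,).
    patch / original_region: image tensors of the patched area, shape (C, H, W)."""
    # Attack term: push the target confidences toward zero.
    attack_loss = det_scores.mean()
    # Fidelity term: keep the camouflage patch visually close to the original region.
    fidelity_loss = torch.nn.functional.mse_loss(patch, original_region)
    return attack_loss + fidelity_weight * fidelity_loss

if __name__ == "__main__":
    scores = torch.rand(5, requires_grad=True)
    patch = torch.rand(3, 64, 64, requires_grad=True)
    region = torch.rand(3, 64, 64)
    loss = adversarial_patch_loss(scores, patch, region)
    loss.backward()  # gradients flow to the patch for iterative optimization
    print(float(loss))
```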


2019 ◽  
Vol 11 (16) ◽  
pp. 1921 ◽  
Author(s):  
Zijun Duo ◽  
Wenke Wang ◽  
Huizan Wang

Oceanic mesoscale eddies greatly influence energy and matter transport and acoustic propagation. However, the traditional detection method for oceanic mesoscale eddies relies heavily on threshold values and is therefore highly subjective. Existing machine learning methods are not yet mature or purposeful enough, as their training sets lack authoritative annotations. In view of these problems, this paper constructs a mesoscale eddy automatic identification and positioning network, OEDNet, based on an object detection network. First, 2D image processing techniques are used to augment a small number of accurate eddy samples annotated by marine experts and thereby generate the training set. Then, an object detection model with a deep residual network and a feature pyramid network as its main structure is designed and optimized for the small samples and complex regions characteristic of oceanic mesoscale eddies. Experimental results show that the model achieves better recognition than the traditional detection method and exhibits good generalization across different sea areas.
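For orientation only, here is a minimal sketch in the same spirit as OEDNet: a detector with a deep residual backbone and a feature pyramid network, plus a simple 2D augmentation step for a small expert-annotated sample set. It assumes torchvision >= 0.13 and a two-class (eddy vs. background) setup; it is not the authors' implementation.

```python
# Illustrative sketch: ResNet-50 + FPN detector and a simple flip augmentation
# for a small training set. Class count and box format (x1, y1, x2, y2) are assumed.
import torch
import torchvision

def build_eddy_detector(num_classes=2):
    # Deep residual backbone with a feature pyramid network, heads sized for the eddy class.
    return torchvision.models.detection.fasterrcnn_resnet50_fpn(
        weights=None, weights_backbone=None, num_classes=num_classes
    )

def augment(image, boxes):
    """Simple 2D augmentation (random horizontal flip) to enlarge a small train set."""
    if torch.rand(1).item() < 0.5:
        _, _, w = image.shape
        image = torch.flip(image, dims=[2])
        boxes = boxes.clone()
        boxes[:, [0, 2]] = w - boxes[:, [2, 0]]
    return image, boxes

if __name__ == "__main__":
    model = build_eddy_detector().eval()
    img, boxes = augment(torch.rand(3, 256, 256), torch.tensor([[10.0, 20.0, 50.0, 60.0]]))
    with torch.no_grad():
        print(model([img])[0].keys())  # boxes, labels, scores
```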


2018 ◽  
Vol 8 (9) ◽  
pp. 1488 ◽  
Author(s):  
Alexander Pacha ◽  
Jan Hajič ◽  
Jorge Calvo-Zaragoza

Deep learning is bringing breakthroughs to many computer vision subfields, including Optical Music Recognition (OMR), which has seen a series of improvements in musical symbol detection achieved by using generic deep learning models. However, so far, each such proposal has been based on a specific dataset and different evaluation criteria, which has made it difficult to quantify the new deep learning-based state of the art and assess the relative merits of these detection models on music scores. In this paper, a baseline for general detection of musical symbols with deep learning is presented. We consider three datasets of heterogeneous typology but with the same annotation format and three neural models of different nature, and we establish their performance in terms of a common evaluation standard. The experimental results confirm that direct music object detection with deep learning is indeed promising, but at the same time they illustrate some of the domain-specific shortcomings of general detectors. A qualitative comparison then suggests avenues for OMR improvement, based both on properties of the detection models and on how the datasets are defined. To the best of our knowledge, this is the first time that competing music object detection systems from the machine learning paradigm have been directly compared to each other. We hope that this work will serve as a reference for measuring the progress of future developments in OMR music object detection.
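A common evaluation standard for object detectors typically reduces to IoU-based matching of predicted and ground-truth boxes. The sketch below shows such a matching step; the 0.5 IoU threshold and (x1, y1, x2, y2) box format are assumptions, and the snippet is not tied to the paper's evaluation code.

```python
# Minimal sketch of IoU-based matching used by standard detection metrics.
def iou(box_a, box_b):
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def match_detections(predictions, ground_truth, iou_threshold=0.5):
    """Greedy matching; returns (true positives, false positives, false negatives)."""
    matched = set()
    tp = 0
    for pred in predictions:
        best, best_iou = None, iou_threshold
        for i, gt in enumerate(ground_truth):
            if i not in matched and iou(pred, gt) >= best_iou:
                best, best_iou = i, iou(pred, gt)
        if best is not None:
            matched.add(best)
            tp += 1
    return tp, len(predictions) - tp, len(ground_truth) - len(matched)

print(match_detections([(0, 0, 10, 10)], [(1, 1, 10, 10), (20, 20, 30, 30)]))
```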


2019 ◽  
Vol 8 (4) ◽  
pp. 9420-9429

Complex backgrounds, illumination changes, and blurring are among the major causes of poor text detection results in video frames, and overcoming these problems remains an important challenge for researchers. Motivated by this observation from a recent survey, we propose a text detection method based on a deep neural network known as TextBoxes, which is capable of detecting text in video frames with improved performance compared to state-of-the-art techniques. In parallel, we also propose text candidate detection for video frames and scene images that extracts words through automatic window detection, using the Discrete Wavelet Transform (DWT) with a sliding window to extract high-frequency sub-bands for each window. A k-means clustering technique is used to obtain the text components and to reduce background complexity and noise. A six-layer convolutional neural network model is designed to recognize the text in multilingual images. Text detection experiments are conducted on our own multilingual South Indian database and on the ICDAR 2015 Video, YVT video, SVT, and MSRA scene datasets, with results reported in terms of recall, precision, and F-measure; recognition is evaluated on ICDAR 2015 Video, ICDAR 2011, and SVT scene images.
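To make the DWT-plus-clustering step concrete, the following sketch computes high-frequency sub-band energy over a sliding window and separates text candidates from background with k-means. The Haar wavelet, window size, and two-cluster setup are assumptions rather than the authors' exact configuration.

```python
# Illustrative sketch: sliding-window DWT high-frequency energy + k-means
# to split an image into text-candidate and background regions.
import numpy as np
import pywt
from sklearn.cluster import KMeans

def text_candidate_mask(gray, win=32, step=16):
    h, w = gray.shape
    energy = np.zeros_like(gray, dtype=float)
    for y in range(0, h - win + 1, step):
        for x in range(0, w - win + 1, step):
            window = gray[y:y + win, x:x + win]
            # High-frequency sub-bands (LH, HL, HH) of a single-level DWT.
            _, (lh, hl, hh) = pywt.dwt2(window, "haar")
            energy[y:y + win, x:x + win] += np.abs(lh).mean() + np.abs(hl).mean() + np.abs(hh).mean()
    # Two clusters: text candidates vs. background, based on sub-band energy.
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(energy.reshape(-1, 1)).reshape(h, w)
    # The cluster with higher mean energy is treated as text.
    text_cluster = int(energy[labels == 1].mean() > energy[labels == 0].mean())
    return labels == text_cluster

if __name__ == "__main__":
    demo = np.random.rand(64, 64)  # stand-in for a grayscale frame
    print(text_candidate_mask(demo).sum())
```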


Author(s):  
L. Lou ◽  
S. Zhang ◽  
S. Zhang

Abstract. With the development of deep learning, object detection has improved significantly. However, most algorithms focus only on detection accuracy and speed; they do not consider the difficulty of building training datasets or the time consumed in training detection models, which adversely affects the detection model when the object classes change frequently. This paper proposes a method named double network detection (DN detection) that improves the efficiency of building training datasets and shortens model training time. At the same time, experiments show that DN detection achieves good accuracy and speed.


2018 ◽  
Author(s):  
Zhi Zhang

Despite being a core topic for several decades, object detection is still receiving increasing attention due to its irreplaceable importance in a wide variety of applications. Abundant object detectors based on deep neural networks have shown significantly improved accuracy in recent years. However, it is still early days for these models to be effectively deployed in the real world. In this dissertation, we focus on object detection models that tackle real-world problems that were out of reach only a few years ago. We also aim to make object detectors work on the go, meaning detectors are no longer required to run on workstations or cloud services, which are latency-unfriendly. To achieve these goals, we address the problem in two phases, application and deployment, and have conducted thorough research in both areas. Our contributions involve inter-frame information fusion, model knowledge distillation, advanced model flow control for progressive inference, and hardware-oriented model design and optimization. More specifically, we propose a novel cross-frame verification scheme for a spatio-temporally fused object detection model for sequential images and videos, operating in a propose-and-reject fashion. To compress models through learning and to resolve domain-specific training data shortages, we improve the learning algorithm to handle insufficient labeled data by searching for optimal guidance paths from pre-trained models. To further reduce model inference cost, we design a progressive neural network that runs at flexible cost, enabled by an RNN-style decision controller at runtime. We also recognize the awkward model deployment problem, especially for object detection models that require excessive customized layers. In response, we propose an end-to-end neural network that uses pure neural network components to substitute for traditional post-processing operations. We also apply operator decomposition together with graph-level and on-device optimization toward real-time object detection on low-power edge devices. All these works have achieved state-of-the-art performance and have been converted into successful applications.
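As an illustration of the knowledge distillation idea mentioned above (not the dissertation's code), the sketch below shows a standard distillation objective in which a compact student matches a pre-trained teacher's softened outputs while also fitting the scarce hard labels; the temperature and weighting are assumptions.

```python
# Minimal sketch of a standard knowledge distillation loss: the student is
# guided by the teacher's soft class distribution plus the available labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, temperature=4.0, alpha=0.7):
    # Soft-target term: KL divergence to the teacher's softened distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        F.softmax(teacher_logits / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)
    # Hard-target term: standard cross-entropy on the (possibly few) labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

if __name__ == "__main__":
    s = torch.randn(8, 10, requires_grad=True)   # student logits
    t = torch.randn(8, 10)                       # teacher logits (frozen)
    y = torch.randint(0, 10, (8,))               # hard labels
    loss = distillation_loss(s, t, y)
    loss.backward()
    print(float(loss))
```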

