A New Method for Pedestrian Detection with Lightweight Backbone based on Yolov3 Network

2019 ◽  
Vol 3 (5) ◽  
Author(s):  
Qirui Dong

YOLOv3 aims to improve on the detection speed and accuracy of current detection models. It predicts the center coordinates (x, y) of each bounding box, together with its length and width, through a multi-layer VGG convolutional neural network (VGG-CNN), and it uses the lightweight Darknet framework to process images at a faster speed. More specifically, our model removes some of YOLOv3’s complex and computationally intensive procedures and improves its algorithms while maintaining the efficiency and accuracy of object detection. With this method, it achieves higher quality on mass object detection tasks, with fewer detection errors.
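The abstract does not spell out how the center coordinates and box size are recovered from the network outputs. As a minimal sketch, the following uses the box parameterisation published for the original YOLOv3 (a sigmoid offset within a grid cell, and an exponential scale applied to an anchor box). The function name `decode_box`, its argument names, and the convention that anchors are given in pixels are assumptions made for illustration, not details taken from this paper.

```python
import math


def sigmoid(v):
    """Logistic function, used to keep the center offset inside its grid cell."""
    return 1.0 / (1.0 + math.exp(-v))


def decode_box(tx, ty, tw, th, cell_x, cell_y, anchor_w, anchor_h, stride):
    """Map raw outputs (tx, ty, tw, th) to a box (cx, cy, w, h) in pixels.

    cell_x, cell_y: integer index of the grid cell that made the prediction.
    anchor_w, anchor_h: anchor box size in pixels (assumed convention).
    stride: pixels per grid cell (assumed, e.g. 32 for the coarsest scale).
    """
    cx = (sigmoid(tx) + cell_x) * stride   # center x in pixels
    cy = (sigmoid(ty) + cell_y) * stride   # center y in pixels
    w = anchor_w * math.exp(tw)            # width scales the anchor
    h = anchor_h * math.exp(th)            # height scales the anchor
    return cx, cy, w, h
```

With all raw outputs at zero, the decoded box sits at the middle of its grid cell and has exactly the anchor's size, which is a convenient sanity check.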

Sensors ◽  
2020 ◽  
Vol 20 (23) ◽  
pp. 6779
Author(s):  
Byung-Gil Han ◽  
Joon-Goo Lee ◽  
Kil-Taek Lim ◽  
Doo-Hyun Choi

As research on applying convolutional neural network (CNN)-based object detection grows, studies on lightweight CNN models that can run in real time on edge-computing devices are also increasing. This paper proposes scalable convolutional blocks that make it easy to design CNN networks for the You Only Look Once (YOLO) detector; by simply exchanging the proposed blocks, the networks can balance processing speed and accuracy for target edge-computing devices of differing performance. The maximum number of kernels in a convolutional layer was determined through simple but intuitive speed comparison tests on the three edge-computing devices considered. The scalable convolutional blocks were designed around this limit on the number of kernels so that objects can be detected in real time on these devices. Three scalable and fast YOLO detectors (SF-YOLO), designed using the proposed blocks, were compared in processing speed and accuracy with several conventional lightweight YOLO detectors on the edge-computing devices. Compared with YOLOv3-tiny, SF-YOLO was 2 times faster with the same accuracy, and it was 48% faster than YOLOv3-tiny-PRN, a model aimed at improving processing speed. Even the large SF-YOLO model, which focuses on accuracy, achieved a 10% faster processing speed than the YOLOv4-tiny model, with a better accuracy of 40.4% mAP@0.5 on the MS COCO dataset.


Author(s):  
Ziyu Shi ◽  
Haichang Gao ◽  
Yiwen Tang ◽  
Han Zheng ◽  
Shuai Kang ◽  
...  

With the development of deep learning technologies, object detection algorithms have made significant progress in terms of detection speed and detection performance. However, the detection speed of current detection networks still does not meet the requirements of real-world applications in some scenarios. In this paper, we propose a faster non-maximum suppression (FNMS) algorithm that reduces the processing time by a large margin while achieving the same detection precision compared with the traditional non-maximum suppression (NMS) algorithm. Moreover, an attempt is made to adopt additional lightweight network structures to improve the speed of the detection network. By combining our FNMS algorithm with other network optimization strategies, we are able to improve the detection speed of YOLO v3 on the DOTA dataset by 165%.
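For reference, the traditional NMS baseline that the abstract compares against can be sketched as a greedy loop: keep the highest-scoring box, drop every remaining box that overlaps it too much, and repeat. This is the standard algorithm only; the abstract does not describe how FNMS works, so nothing below should be read as the authors' method. The function names and the default threshold of 0.5 are assumptions for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy non-maximum suppression; returns indices of the kept boxes."""
    # Visit candidates from highest to lowest confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        # Drop every remaining box that overlaps the kept box too strongly.
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_threshold]
    return keep
```

The loop is quadratic in the number of candidate boxes in the worst case, which is one reason post-processing can become a noticeable share of detection time on images with many objects.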


2019 ◽  
Vol 17 (1) ◽  
pp. 69-76
Author(s):  
Mohammad Shiddiq Ghozali

Information and Communication Technology is developing very rapidly today, and this has been accompanied by developments in Artificial Intelligence (AI). In Indonesia itself, AI is not yet very popular among the public, but IT companies are competing to create innovations in AI and to apply it in every aspect of life. Consider the case of the Automated Teller Machine (ATM): crimes such as PIN spying, skimming, the Lebanese loop, and others often occur at ATMs. Although ATMs are equipped with CCTV, criminals use items such as helmets, hats, masks, and sunglasses to cover their faces. Usually a notice at the ATM entrance prohibits wearing helmets, hats, masks, and sunglasses, as well as bringing cigarettes. However, the prohibition is still violated, because there is no follow-up when someone uses items that are prohibited inside the ATM. The author therefore built an AI-based object detection system to detect items that are prohibited while at an ATM. One method used for object detection is You Only Look Once (YOLO). An implementation of this idea is available in DARKNET (an open source neural network). YOLO works by looking at the whole image once and passing it through the neural network once to directly detect the objects present; hence the name You Only Look Once (YOLO). The system built in this study is still under development, so it is still run from the command prompt. Keywords: Automated Teller Machine (ATM), Artificial Intelligence, Object Detection, You Only Look Once (YOLO)


2021 ◽  
Vol 18 (1) ◽  
pp. 172988142199332
Author(s):  
Xintao Ding ◽  
Boquan Li ◽  
Jinbao Wang

Indoor object detection is a very demanding and important task for robot applications. Object knowledge, such as two-dimensional (2D) shape and depth information, may be helpful for detection. In this article, we focus on region-based convolutional neural network (CNN) detector and propose a geometric property-based Faster R-CNN method (GP-Faster) for indoor object detection. GP-Faster incorporates geometric property in Faster R-CNN to improve the detection performance. In detail, we first use mesh grids that are the intersections of direct and inverse proportion functions to generate appropriate anchors for indoor objects. After the anchors are regressed to the regions of interest produced by a region proposal network (RPN-RoIs), we then use 2D geometric constraints to refine the RPN-RoIs, in which the 2D constraint of every classification is a convex hull region enclosing the width and height coordinates of the ground-truth boxes on the training set. Comparison experiments are implemented on two indoor datasets SUN2012 and NYUv2. Since the depth information is available in NYUv2, we involve depth constraints in GP-Faster and propose 3D geometric property-based Faster R-CNN (DGP-Faster) on NYUv2. The experimental results show that both GP-Faster and DGP-Faster increase the performance of the mean average precision.
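The 2D constraint described above, a convex hull enclosing the (width, height) pairs of a class's ground-truth boxes, can be illustrated with a small sketch. The hull is built with the standard monotone chain algorithm, and a region of interest is then tested by checking whether its own (width, height) point falls inside that hull. How GP-Faster actually uses the hull to refine RPN-RoIs is not detailed in the abstract, so the accept/reject test here is an assumption, as are the function names.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Monotone chain convex hull; returns vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def inside_hull(hull, point):
    """True if point lies inside or on the boundary of a counter-clockwise hull."""
    n = len(hull)
    if n < 3:
        return False
    return all(cross(hull[i], hull[(i + 1) % n], point) >= 0 for i in range(n))
```

In use, one hull would be computed per class from the training set, for example `hull = convex_hull(gt_sizes)` with `gt_sizes` a hypothetical list of (width, height) tuples, and each candidate region checked with `inside_hull(hull, (roi_w, roi_h))`.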


2021 ◽  
Vol 443 ◽  
pp. 292-301
Author(s):  
Gangyi Tian ◽  
Jianran Liu ◽  
Wenyuan Yang

2021 ◽  
Vol 104 (2) ◽  
pp. 003685042110113
Author(s):  
Xianghua Ma ◽  
Zhenkun Yang

Real-time object detection on mobile platforms is a crucial but challenging computer vision task. It is widely recognized that although lightweight object detectors have a high detection speed, their detection accuracy is relatively low. To improve detection accuracy, it is beneficial to extract complete multi-scale image features in visual cognitive tasks. Asymmetric convolutions have a useful quality: they have different aspect ratios, which can be used to extract image features of objects, especially objects with multi-scale characteristics. In this paper, we exploit three different asymmetric convolutions in parallel and propose a new multi-scale asymmetric convolution unit, the MAC block, to enhance the multi-scale representation ability of CNNs. In addition, the MAC block can adaptively merge features of different scales by assigning learnable weight parameters to the three asymmetric convolution branches. The proposed MAC blocks can be inserted into a state-of-the-art backbone such as ResNet-50 to form a new multi-scale backbone network for object detectors. To evaluate the performance of the MAC block, we conduct experiments on the CIFAR-100, PASCAL VOC 2007, PASCAL VOC 2012 and MS COCO 2014 datasets. Experimental results show that detection precision can be greatly improved while a fast detection speed is maintained.


Electronics ◽  
2021 ◽  
Vol 10 (14) ◽  
pp. 1737
Author(s):  
Wooseop Lee ◽  
Min-Hee Kang ◽  
Jaein Song ◽  
Keeyeon Hwang

As automated vehicles have come to be considered one of the important trends in intelligent transportation systems, a variety of research is being conducted to enhance their safety. In particular, technologies for designing preventive automated driving systems, such as detecting surrounding objects and estimating the distance between vehicles, are growing in importance. Object detection is mainly performed with cameras and LiDAR, but because of LiDAR’s cost and limited recognition distance, there is an increasing need to improve camera recognition techniques, which are relatively convenient to commercialize. This study trained the convolutional neural network (CNN)-based faster regions with CNN (Faster R-CNN) and You Only Look Once (YOLO) V2 models to improve the recognition techniques of vehicle-mounted monocular cameras for the design of preventive automated driving systems, recognizing surrounding vehicles in black box highway driving videos and estimating distances to them with the model more suitable for automated driving systems. Moreover, we trained on the PASCAL visual object classes (VOC) dataset for model comparison. Faster R-CNN showed accuracy similar to YOLO V2, with a mean average precision (mAP) of 76.4 against 78.6, but at 5 frames per second (FPS) it was much slower than YOLO V2 at 40 FPS, and Faster R-CNN also had difficulty with detection. As a result, YOLO V2, which performed better in both accuracy and processing speed, was judged the more suitable model for automated driving systems and was carried forward to estimate the distance between vehicles. For distance estimation, we converted coordinate values through camera calibration and a perspective transform, set the threshold to 0.7, and performed object detection and distance estimation, achieving more than 80% accuracy for near-distance vehicles.
This study is expected to help prevent accidents involving automated vehicles, and further research is expected to provide various accident prevention alternatives, such as calculating and securing appropriate safety distances depending on the vehicle type.
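The coordinate conversion by perspective transform mentioned in this abstract can be illustrated as follows. A 3x3 homography maps an image pixel, such as the bottom-center of a detected vehicle's bounding box, to a point on the road plane, and the distance is then the length of that ground-plane vector. The matrix `H`, the choice of reference pixel, and the placement of the camera at the ground-plane origin are all assumptions for this sketch; the study's actual calibration procedure is not given in the abstract.

```python
import math


def apply_homography(H, x, y):
    """Map pixel (x, y) through a 3x3 homography H given as nested lists."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if w == 0:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to get plane coordinates.
    return u / w, v / w


def ground_distance(H, x, y):
    """Distance from the ground-plane origin (assumed camera position)."""
    gx, gy = apply_homography(H, x, y)
    return math.hypot(gx, gy)
```

In practice `H` would come from camera calibration, for instance by matching at least four image points to road-plane positions measured in meters.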

