Exploring a Multimodal Mixture-Of-YOLOs Framework for Advanced Real-Time Object Detection

2020 ◽  
Vol 10 (2) ◽  
pp. 612
Author(s):  
Jinsoo Kim ◽  
Jeongho Cho

To construct a safe and sound autonomous driving system, object detection is essential, and research on sensor fusion is being actively conducted to increase the object detection rate in dynamic environments where safety must be secured. Recently, considerable performance improvements in object detection have been achieved with the advent of the convolutional neural network (CNN). In particular, the YOLO (You Only Look Once) architecture, which is suited to real-time object detection because it simultaneously predicts and classifies the bounding boxes of objects, is receiving great attention. However, securing the robustness of object detection systems across varied environments remains a challenge. In this paper, we propose a weighted mean-based adaptive object detection strategy that enhances detection performance by fusing the individual detection results of an RGB camera and a LiDAR (Light Detection and Ranging) sensor for autonomous driving. The proposed system utilizes the YOLO framework to perform object detection independently on image data and point cloud data (PCD). The individual detection results are then fused at the decision level via the weighted mean scheme to reduce the number of missed objects. To evaluate the performance of the proposed object detection system, tests on vehicles and pedestrians were carried out using the KITTI Benchmark Suite. The test results demonstrate that the proposed strategy achieves a higher mean average precision (mAP) for the targeted objects than an RGB camera alone and is also robust against external environmental changes.
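As an illustration of the decision-level weighted-mean fusion described above, the following minimal Python sketch merges two detectors' outputs; the modality weights, IoU threshold, and greedy matching rule are assumptions for illustration, not the paper's calibrated values.

```python
import numpy as np

def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def fuse_detections(rgb_dets, lidar_dets, w_rgb=0.6, w_lidar=0.4, iou_thr=0.5):
    """Decision-level fusion sketch: average matched boxes and scores with
    modality weights; keep unmatched detections from either sensor so that
    objects missed by one modality are not lost."""
    fused, used = [], set()
    for rb, rs in rgb_dets:                      # each det is (box, score)
        match = None
        for j, (lb, ls) in enumerate(lidar_dets):
            if j not in used and iou(rb, lb) >= iou_thr:
                match = j
                break
        if match is None:
            fused.append((rb, rs * w_rgb))       # RGB-only detection
        else:
            lb, ls = lidar_dets[match]
            used.add(match)
            box = tuple(w_rgb * r + w_lidar * l for r, l in zip(rb, lb))
            fused.append((box, w_rgb * rs + w_lidar * ls))
    fused += [(lb, ls * w_lidar)                 # LiDAR-only detections
              for j, (lb, ls) in enumerate(lidar_dets) if j not in used]
    return fused
```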

Sensors ◽  
2021 ◽  
Vol 21 (4) ◽  
pp. 1066
Author(s):  
Peng Jia ◽  
Fuxiang Liu

At present, one-stage detectors based on lightweight models can achieve real-time speed, but their detection performance remains limited. To enhance the discriminability and robustness of the features the model extracts, and to improve the detector's performance on small objects, we propose two modules in this work. First, we propose a receptive field enhancement method, referred to as adaptive receptive field fusion (ARFF). It enhances the model's feature representation ability by adaptively learning the fusion weights of the different receptive field branches in the receptive field module. Second, we propose an enhanced up-sampling (EU) module to reduce the information loss caused by up-sampling the feature map. Finally, we assemble the ARFF and EU modules on top of YOLO v3 to build a real-time, high-precision, lightweight object detection system referred to as the ARFF-EU network. We achieve a state-of-the-art speed-accuracy trade-off on both the Pascal VOC and MS COCO data sets, reporting 83.6% AP at 37.5 FPS and 42.5% AP at 33.7 FPS, respectively. The experimental results show that the proposed ARFF and EU modules improve the detection performance of the ARFF-EU network, approaching the accuracy of advanced, very deep detectors while maintaining real-time speed.
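The abstract does not give the exact ARFF architecture; the PyTorch sketch below shows one plausible reading of the idea, where parallel dilated-convolution branches with different receptive fields are blended by learned, softmax-normalized fusion weights. The branch count and dilation rates are assumptions.

```python
import torch
import torch.nn as nn

class ARFF(nn.Module):
    """Sketch of adaptive receptive field fusion: parallel 3x3 dilated-conv
    branches whose outputs are combined with learnable fusion weights
    (dilations and branch count are illustrative assumptions)."""
    def __init__(self, channels, dilations=(1, 3, 5)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, 3, padding=d, dilation=d)
            for d in dilations
        ])
        # One learnable fusion weight per receptive-field branch.
        self.weights = nn.Parameter(torch.zeros(len(dilations)))

    def forward(self, x):
        w = torch.softmax(self.weights, dim=0)   # adaptive, normalized weights
        feats = [conv(x) for conv in self.branches]
        return sum(wi * f for wi, f in zip(w, feats))

# Usage on a typical YOLO v3 feature map:
# y = ARFF(256)(torch.randn(1, 256, 52, 52))
```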


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8381
Author(s):  
Duarte Fernandes ◽  
Tiago Afonso ◽  
Pedro Girão ◽  
Dibet Gonzalez ◽  
António Silva ◽  
...  

Recently released research on deep learning applications for autonomous driving perception focuses heavily on LiDAR point cloud data as input to the neural networks, highlighting the importance of LiDAR technology in the field of Autonomous Driving (AD). Accordingly, a large share of the vehicle platforms used to create the datasets released for developing these neural networks, as well as some commercial AD solutions on the market, invest heavily in extensive sensor arrays spanning several modalities. However, these costs create a barrier to entry for low-cost solutions performing critical perception tasks such as Object Detection and SLAM. This paper explores current vehicle platforms and proposes a low-cost, LiDAR-based test vehicle platform capable of running critical perception tasks (Object Detection and SLAM) in real time. Additionally, we propose a deep learning-based inference model for Object Detection deployed on a resource-constrained device, together with a graph-based SLAM implementation. We present the design considerations explored under the real-time processing requirement and report results demonstrating the usability of the developed work in the context of the proposed low-cost platform.
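The abstract does not detail the graph-based SLAM back end. As a hedged illustration of the core idea only, the toy sketch below optimizes a one-dimensional pose graph with SciPy, where odometry edges and a loop-closure edge jointly constrain the poses; the edge values are made up.

```python
import numpy as np
from scipy.optimize import least_squares

# Toy 1-D pose graph: 4 poses, three odometry edges, one loop closure.
# Each edge (i, j, z) means "pose j minus pose i was measured as z".
edges = [(0, 1, 1.0), (1, 2, 1.1), (2, 3, 0.9), (0, 3, 3.0)]

def residuals(x):
    r = [x[j] - x[i] - z for i, j, z in edges]
    r.append(x[0])          # prior anchoring the first pose at the origin
    return r

# Least-squares optimization spreads the loop-closure error over the graph.
x_opt = least_squares(residuals, np.zeros(4)).x
print(x_opt)
```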


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1677
Author(s):  
Jongwon Kim ◽  
Jeongho Cho

An object detection device is an essential component for the autonomous flight or air-to-ground surveillance of a UAV. It must offer high detection accuracy and real-time data processing to be employed for tasks such as search and rescue, object tracking, and disaster analysis. With recent advances in multimodal data-based object detection architectures, autonomous driving technology has improved significantly, and the latest algorithms achieve an average precision of up to 96%. However, these remarkable advances may be unsuitable for onboard processing of UAV aerial imagery for object detection because of the following major problems: (1) objects in aerial views are generally smaller than in ground-level images and are unevenly and sparsely distributed across the image; (2) objects are exposed to various environmental changes, such as occlusion and background interference; and (3) the payload weight of a UAV is limited. Thus, we propose a new real-time onboard object detection architecture, an RGB aerial image and point cloud data (PCD) depth map image network (RGDiNet). A faster region-based convolutional neural network was used as the baseline detection network, and an RGD input, an integration of the RGB aerial image and the depth map reconstructed from the light detection and ranging PCD, was utilized for computational efficiency. Performance tests and evaluation of the proposed RGDiNet were conducted under various operating conditions using hand-labeled aerial datasets. The results show that the proposed method outperforms conventional vision-based methods in detecting vehicles and pedestrians.
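To make the RGD input concrete, the sketch below projects LiDAR points into a camera depth map and combines it with the RGB image. How RGDiNet actually integrates the two is not specified in the abstract; replacing the blue channel with depth is an assumption here, chosen so a standard three-channel Faster R-CNN backbone can ingest the result unchanged.

```python
import numpy as np

def pcd_to_depth_map(points, K, h, w):
    """Project LiDAR points (N, 3), already in the camera frame, onto an
    h x w image plane with intrinsics K, keeping the nearest depth per pixel."""
    depth = np.zeros((h, w), dtype=np.float32)
    pts = points[points[:, 2] > 0]            # points in front of the camera
    uvw = (K @ pts.T).T                       # pinhole projection
    u = (uvw[:, 0] / uvw[:, 2]).astype(int)
    v = (uvw[:, 1] / uvw[:, 2]).astype(int)
    ok = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    for ui, vi, zi in zip(u[ok], v[ok], pts[ok, 2]):
        if depth[vi, ui] == 0 or zi < depth[vi, ui]:
            depth[vi, ui] = zi
    return depth

def make_rgd(rgb, depth):
    """Assumed fusion: normalized depth replaces the blue channel to form
    a three-channel RGD image."""
    d = depth / (depth.max() + 1e-9)
    return np.dstack([rgb[..., 0], rgb[..., 1], (255 * d).astype(rgb.dtype)])
```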


Sensors ◽  
2021 ◽  
Vol 21 (16) ◽  
pp. 5279
Author(s):  
Dong-Hoon Kwak ◽  
Guk-Jin Son ◽  
Mi-Kyung Park ◽  
Young-Duk Kim

The consumption of seaweed is increasing worldwide year by year, so foreign object inspection of seaweed is becoming increasingly important. Seaweed is mixed with various materials such as laver and Sargassum fusiforme, so even a single sheet varies in color. In addition, its surface is uneven and greasy, frequently causing diffuse reflections. For these reasons, detecting foreign objects in seaweed is difficult, and the accuracy of conventional foreign object detectors used at real manufacturing sites is below 80%. Real-time inspection must also be considered: since seaweed is mass-produced, rapid inspection is essential, yet hyperspectral imaging techniques are generally unsuitable for high-speed inspection. In this study, we overcome this limitation through dimensionality reduction and simplified operations. To improve accuracy, the proposed algorithm operates in two stages. First, a subtraction method is used to clearly distinguish the seaweed from the conveyor belt and to detect some relatively conspicuous foreign objects. Second, a standardization inspection is performed on the result of the subtraction method. Throughout, the proposed scheme adopts simple, low-cost calculations such as subtraction, division, and one-by-one matching, achieving both high accuracy and low latency. In the evaluation experiment, 60 normal seaweed samples and 60 samples containing foreign objects were used, and the proposed algorithm achieved an accuracy of 95%. Finally, by implementing the proposed algorithm as a foreign object detection platform, we confirmed that it supports real-time operation during rapid inspection and could be deployed at real manufacturing sites.
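A minimal NumPy sketch of the two-stage idea follows, assuming single-band (already dimensionality-reduced) frames; the thresholds and the outlier rule are illustrative placeholders, not the paper's calibrated values.

```python
import numpy as np

def subtraction_stage(frame, belt_ref, seaweed_thr=40, fo_thr=90):
    """Stage 1 sketch: subtract a conveyor-belt reference image to separate
    the seaweed from the belt and flag high-contrast foreign objects."""
    diff = np.abs(frame.astype(np.int16) - belt_ref.astype(np.int16))
    seaweed_mask = diff > seaweed_thr          # pixels differing from the belt
    foreign_hint = diff > fo_thr               # strongly deviating pixels
    return seaweed_mask, foreign_hint

def standardization_stage(frame, seaweed_mask):
    """Stage 2 sketch: standardize intensities over the seaweed region using
    only subtraction and division, then flag statistical outliers."""
    region = frame[seaweed_mask].astype(np.float32)
    z = (frame - region.mean()) / (region.std() + 1e-9)
    return (np.abs(z) > 3) & seaweed_mask      # outlier pixels within seaweed
```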


2021 ◽  
Vol 11 (13) ◽  
pp. 6016
Author(s):  
Jinsoo Kim ◽  
Jeongho Cho

For autonomous vehicles, it is critical to be aware of the driving environment to avoid collisions and drive safely. The recent evolution of convolutional neural networks has contributed significantly to accelerating the development of object detection techniques that enable autonomous vehicles to handle rapid changes in various driving environments. However, collisions in an autonomous driving environment can still occur due to undetected obstacles and various perception problems, particularly occlusion. Thus, we propose a robust object detection algorithm for environments in which objects are truncated or occluded, employing RGB images and light detection and ranging (LiDAR) bird's eye view (BEV) representations. This structure combines independent detection results obtained in parallel through "you only look once" networks, one using an RGB image and one using a height map converted from the BEV representation of the LiDAR's point cloud data (PCD). The region proposal for an object is determined via non-maximum suppression, which suppresses the bounding boxes of adjacent regions. A performance evaluation of the proposed scheme was performed using the KITTI vision benchmark suite dataset. The results demonstrate that the detection accuracy achieved by integrating the PCD BEV representations is superior to that of an RGB camera alone. In addition, robustness is improved through significantly enhanced detection accuracy even when the target objects are partially occluded from the front view, demonstrating that the proposed algorithm outperforms the conventional RGB-based model.
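Since the fusion step above rests on non-maximum suppression over the pooled RGB and BEV detections, here is a standard NMS sketch in NumPy; the IoU threshold is a typical default rather than the paper's value.

```python
import numpy as np

def nms(boxes, scores, iou_thr=0.5):
    """Non-maximum suppression: repeatedly keep the highest-scoring box and
    suppress remaining boxes that overlap it above the IoU threshold."""
    order = np.argsort(scores)[::-1]
    keep = []
    while order.size:
        i = order[0]
        keep.append(i)
        rest = order[1:]
        # IoU of the kept box against all remaining boxes (x1, y1, x2, y2).
        x1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        y1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        x2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        y2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_r = (boxes[rest, 2] - boxes[rest, 0]) * (boxes[rest, 3] - boxes[rest, 1])
        iou = inter / (area_i + area_r - inter + 1e-9)
        order = rest[iou < iou_thr]
    return keep
```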


2019 ◽  
Author(s):  
Jimut Bahan Pal

Detecting objects in real time has long been a challenge for computers with low computing power and memory. Since the invention of Convolutional Neural Networks (CNNs), it has become much easier for computers to detect and recognize images. Several technologies and models can detect objects in real time, but most require high-end hardware such as GPUs and TPUs. Recently, however, many new algorithms and models that run on limited resources have been proposed. In this paper, we study MobileNets for detecting objects through a webcam to build a real-time object detection system. We used a model pre-trained on the well-known MS COCO dataset and employed Google's open-source TensorFlow as our back end. This real-time object detection system may help solve various complex vision problems in the future.
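A sketch of such a pipeline follows; the abstract does not name the exact model variant, so the SSD MobileNet V2 module from TensorFlow Hub and the 0.5 score cutoff are assumptions standing in for the authors' setup.

```python
import cv2                      # pip install opencv-python
import tensorflow as tf
import tensorflow_hub as hub

# SSD MobileNet V2 pre-trained on MS COCO, served via TensorFlow Hub.
detector = hub.load("https://tfhub.dev/tensorflow/ssd_mobilenet_v2/2")

cap = cv2.VideoCapture(0)       # default webcam
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    result = detector(tf.expand_dims(rgb, 0))      # model expects a uint8 batch
    boxes = result["detection_boxes"][0].numpy()   # normalized [ymin, xmin, ymax, xmax]
    scores = result["detection_scores"][0].numpy()
    h, w = frame.shape[:2]
    for box, score in zip(boxes, scores):
        if score < 0.5:
            continue
        y1, x1, y2, x2 = box
        cv2.rectangle(frame, (int(x1 * w), int(y1 * h)),
                      (int(x2 * w), int(y2 * h)), (0, 255, 0), 2)
    cv2.imshow("detections", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```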


2020 ◽  
Vol 20 (20) ◽  
pp. 11959-11966
Author(s):  
Jiachen Yang ◽  
Chenguang Wang ◽  
Huihui Wang ◽  
Qiang Li
