Align-Yolact: a one-stage semantic segmentation network for real-time object detection

Author(s):  
Shaodan Lin ◽  
Kexin Zhu ◽  
Chen Feng ◽  
Zhide Chen

Author(s):  
Miao Cheng ◽  
Jianan Bai ◽  
Luyi Li ◽  
Qing Chen ◽  
Xiangming Zhou ◽  
...  
Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5080
Author(s):  
Baohua Qiang ◽  
Ruidong Chen ◽  
Mingliang Zhou ◽  
Yuanchao Pang ◽  
Yijie Zhai ◽  
...  

In recent years, ever-larger volumes of image data have come from a variety of sensors, and object detection plays a vital role in image understanding. For object detection in complex scenes, more detailed information must be extracted from the image to improve the accuracy of the detection task. In this paper, we propose an object detection algorithm for images that incorporates joint semantic segmentation (SSOD). First, we construct a feature extraction network that integrates an hourglass structure with an attention mechanism layer to extract and fuse multi-scale features, generating high-level features rich in semantic information. Second, semantic segmentation is used as an auxiliary task so that the algorithm performs multi-task learning. Finally, the multi-scale features are used to predict the location and category of each object. Experimental results show that our algorithm substantially enhances object detection performance, consistently outperforms the three comparison algorithms, and reaches real-time detection speed, making it suitable for real-time applications.
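
To make the multi-task idea concrete, the following is a minimal PyTorch-style sketch of a detector with an auxiliary segmentation head sharing one backbone, in the spirit of the abstract above. All layer sizes, module names and the 0.5 loss weight are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SSODSketch(nn.Module):
    """Shared backbone (a stand-in for the hourglass + attention extractor)
    feeding a detection head and an auxiliary segmentation head."""
    def __init__(self, num_classes: int = 21):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.cls_head = nn.Conv2d(64, num_classes, 1)  # per-location class scores
        self.box_head = nn.Conv2d(64, 4, 1)            # per-location box offsets
        self.seg_head = nn.Conv2d(64, num_classes, 1)  # auxiliary segmentation logits

    def forward(self, x):
        feats = self.backbone(x)
        return self.cls_head(feats), self.box_head(feats), self.seg_head(feats)

def multitask_loss(cls_out, box_out, seg_out, cls_gt, box_gt, seg_gt, seg_w=0.5):
    # Detection loss plus a down-weighted auxiliary segmentation loss
    # (the 0.5 weight is an assumption, not the paper's value).
    det = F.cross_entropy(cls_out, cls_gt) + F.smooth_l1_loss(box_out, box_gt)
    return det + seg_w * F.cross_entropy(seg_out, seg_gt)
```

In this reading, the segmentation head only shapes the shared features during training; at inference one would typically keep just the detection outputs.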


2019 ◽  
Vol 277 ◽  
pp. 02005
Author(s):  
Ning Feng ◽  
Le Dong ◽  
Qianni Zhang ◽  
Ning Zhang ◽  
Xi Wu ◽  
...  

Real-time semantic segmentation has become crucial in many applications, such as medical image analysis and autonomous driving. In this paper, we introduce a single semantic segmentation network, called DNS, for the joint object detection and segmentation task. We take advantage of a multi-scale deconvolution mechanism to perform real-time computation. To this end, down-scale and up-scale streams are used to combine multi-scale features for the final detection and segmentation tasks. With the proposed DNS, not only the trade-off between accuracy and cost but also the balance between detection and segmentation performance is settled. Experimental results on the PASCAL VOC datasets show competitive performance on the joint object detection and segmentation task.
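
As a rough illustration of the down-scale/up-scale idea, here is a hypothetical PyTorch sketch in which strided convolutions form the down-scale stream, deconvolutions (transposed convolutions) form the up-scale stream, and features at matching resolutions are fused by addition. Channel counts and depths are assumptions, not the DNS design.

```python
import torch
import torch.nn as nn

class DownUpSketch(nn.Module):
    """Toy down-scale/up-scale feature fusion; sizes are illustrative."""
    def __init__(self, ch: int = 32):
        super().__init__()
        # Down-scale stream: strided convolutions halve the resolution.
        self.down1 = nn.Sequential(nn.Conv2d(3, ch, 3, stride=2, padding=1), nn.ReLU())
        self.down2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
        # Up-scale stream: deconvolutions restore the resolution.
        self.up1 = nn.ConvTranspose2d(ch * 2, ch, 2, stride=2)
        self.up2 = nn.ConvTranspose2d(ch, ch, 2, stride=2)

    def forward(self, x):
        d1 = self.down1(x)       # 1/2 resolution
        d2 = self.down2(d1)      # 1/4 resolution
        u1 = self.up1(d2) + d1   # fuse multi-scale features by addition
        return self.up2(u1)      # back to input resolution

feats = DownUpSketch()(torch.randn(1, 3, 64, 64))
print(feats.shape)  # torch.Size([1, 32, 64, 64])
```

The fused map can then feed both the detection and the segmentation outputs, which is one way a single network can serve the joint task.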


2021 ◽  
Vol 13 (12) ◽  
pp. 307
Author(s):  
Vijayakumar Varadarajan ◽  
Dweepna Garg ◽  
Ketan Kotecha

Deep learning is a relatively new branch of machine learning in which computers are taught to recognize patterns in massive volumes of data. It primarily describes learning at various levels of representation, which aids in understanding data that includes text, voice and visuals. Convolutional neural networks have been used to solve challenges in computer vision, including object identification, image classification, semantic segmentation and more. Object detection in videos involves confirming the presence of an object in the image or video and then locating it accurately for recognition. Video modelling techniques suffer from high computation and memory costs, which can degrade performance measures such as accuracy and efficiency when identifying objects in real time. Current object detection techniques based on deep convolutional neural networks execute multilevel convolution and pooling operations on the entire image to extract deep semantic features. Detection models can deliver superior results for large objects; however, they fail to detect objects of varying sizes that have low resolution and are heavily affected by noise, because the features produced by the repeated convolution operations of existing models do not fully represent the essential characteristics of those objects in real time. With the help of multi-scale anchor boxes, the approach proposed in this paper enhances detection accuracy by extracting features of the object at multiple convolution levels. The major contribution of this paper is a model designed to better understand the parameters and hyper-parameters that affect the detection and recognition of objects of varying sizes and shapes, and to achieve real-time object detection and recognition speeds with improved accuracy. The proposed model achieves 84.49 mAP at 11 FPS on the test set of the Pascal VOC-2007 dataset, which is comparatively better than other real-time object detection models.
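
The multi-scale anchor idea can be sketched briefly: each feature level, from fine to coarse, is paired with one anchor scale, and several aspect ratios are laid down at every spatial position. The level-to-scale pairing, the scale values and the ratios below are common conventions, not necessarily this paper's exact scheme.

```python
import itertools
import torch

def make_anchors(feat_sizes, img_size=416, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """Centre-form (cx, cy, w, h) anchors: one scale per feature level,
    several aspect ratios per cell (illustrative convention)."""
    anchors = []
    for (fh, fw), scale in zip(feat_sizes, scales):
        stride_y, stride_x = img_size / fh, img_size / fw
        for i, j in itertools.product(range(fh), range(fw)):
            cx, cy = (j + 0.5) * stride_x, (i + 0.5) * stride_y
            for r in ratios:
                # w/h equals r while the box area stays roughly scale^2.
                anchors.append([cx, cy, scale * r ** 0.5, scale / r ** 0.5])
    return torch.tensor(anchors)

# Three feature levels, fine to coarse, for a 416x416 input.
boxes = make_anchors([(52, 52), (26, 26), (13, 13)])
print(boxes.shape)  # torch.Size([10647, 4])
```

For a 416 x 416 input with 52-, 26- and 13-cell feature maps, this yields (52² + 26² + 13²) × 3 = 10,647 candidate boxes to be classified and regressed, with small anchors on the fine map catching low-resolution objects.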


Impact ◽  
2020 ◽  
Vol 2020 (2) ◽  
pp. 9-11
Author(s):  
Tomohiro Fukuda

Mixed reality (MR) is rapidly becoming a vital tool, not just in gaming but also in education, medicine, construction and environmental management. The term refers to systems in which computer-generated content is superimposed over objects in a real-world environment across one or more sensory modalities. Although most of us have heard of the use of MR in computer games, it also has applications in military and aviation training, as well as tourism, healthcare and more. In addition, it has potential for use in architecture and design, where proposed buildings can be superimposed on existing locations to render 3D visualisations of plans. However, one major challenge that remains in MR development is the issue of real-time occlusion: hiding 3D virtual objects behind real objects. Dr Tomohiro Fukuda, who is based at the Division of Sustainable Energy and Environmental Engineering, Graduate School of Engineering at Osaka University in Japan, is an expert in this field. Researchers led by Dr Fukuda are tackling the issue of occlusion in MR. They are currently developing an MR system that achieves real-time occlusion by harnessing deep learning, targeting outdoor landscape design simulation with a semantic segmentation technique. This methodology can be used to automatically estimate the visual environment before and after construction projects.
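
A minimal sketch of how segmentation-driven occlusion can work at the compositing step, assuming a per-pixel mask of real foreground occluders (e.g. trees or people) produced by a segmentation network: virtual content is suppressed wherever the mask says a real object is in front. The function name and the alpha-blending rule are illustrative, not the project's implementation.

```python
import numpy as np

def composite_with_occlusion(real_rgb, virtual_rgb, virtual_alpha, occluder_mask):
    """Blend a rendered virtual layer over a camera frame, letting real
    occluders (mask == 1) hide the virtual geometry behind them."""
    # Virtual content is only drawn where no real occluder is present.
    alpha = virtual_alpha * (1.0 - occluder_mask)
    return alpha[..., None] * virtual_rgb + (1.0 - alpha[..., None]) * real_rgb

h, w = 4, 6                                   # toy frame size
real = np.random.rand(h, w, 3)                # camera image
virtual = np.random.rand(h, w, 3)             # rendered building model
v_alpha = np.ones((h, w))                     # virtual layer covers the frame
mask = np.zeros((h, w)); mask[:, :3] = 1.0    # left half labelled as occluder
frame = composite_with_occlusion(real, virtual, v_alpha, mask)
```

Because the mask comes from per-frame semantic segmentation rather than a prebuilt 3D model of the site, the occlusion can keep up with a live outdoor scene.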


Author(s):  
Mhafuzul Islam ◽  
Mashrur Chowdhury ◽  
Hongda Li ◽  
Hongxin Hu

Vision-based navigation of autonomous vehicles primarily depends on deep neural network (DNN)-based systems, in which the controller takes input from sensors/detectors, such as cameras, and produces a vehicle control output, such as a steering wheel angle, to navigate the vehicle safely in a roadway traffic environment. Typically, these DNN-based systems in the autonomous vehicle are trained through supervised learning; however, recent studies show that a trained DNN-based system can be compromised by perturbation or adverse inputs. Similarly, such perturbations can be introduced into the DNN-based systems of autonomous vehicles by unexpected roadway hazards, such as debris or roadblocks. In this study, we first introduce a hazardous roadway environment that can compromise the DNN-based navigational system of an autonomous vehicle and cause it to produce an incorrect steering wheel angle, which could lead to crashes resulting in fatality or injury. Then, we develop a DNN-based autonomous vehicle driving system that uses object detection and semantic segmentation to mitigate the adverse effect of this type of hazard, helping the autonomous vehicle navigate safely around it. We find that our DNN-based driving system, which includes hazardous object detection and semantic segmentation, improves the navigational ability of an autonomous vehicle to avoid a potential hazard by 21% compared with a traditional DNN-based autonomous vehicle driving system.
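
One plausible way to read the fusion of the two perception outputs is a rule in which detected hazard boxes veto the DNN's steering command and the drivable-area segmentation picks the clearer side. The sketch below is purely illustrative; the thresholds, the steering rule and all names are assumptions, not the study's method.

```python
import numpy as np

def safe_steering(base_angle, hazard_boxes, drivable_mask, img_width):
    """Override the end-to-end DNN's steering angle when a detected
    hazard box sits in the vehicle's path; use the drivable-area
    segmentation to choose an avoidance direction (hypothetical rule)."""
    center = img_width / 2
    for x1, y1, x2, y2 in hazard_boxes:
        if x1 < center < x2:                      # hazard straddles our path
            # Steer toward whichever side has more drivable pixels.
            left = drivable_mask[:, : int(center)].sum()
            right = drivable_mask[:, int(center):].sum()
            return -0.3 if left > right else 0.3  # radians, hypothetical
    return base_angle                             # no hazard: keep DNN output

mask = np.ones((240, 320)); mask[:, 200:] = 0.0   # right side not drivable
print(safe_steering(0.05, [(140, 80, 220, 200)], mask, 320))  # -> -0.3
```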

