scholarly journals FOREGROUND DETECTION ON DEPTH MAPS USING SKELETAL REPRESENTATION OF OBJECT SILHOUETTES

Author(s):  
D. Beloborodov ◽  
L. Mestetskiy

This article considers the problem of foreground detection on depth maps. The problem of finding objects of interest on images appears in many object detection, recognition and tracking applications as one of the first steps. However, this problem becomes too complicated for RGB images with multicolored or constantly changing background and in presence of occlusions. Depth maps provide valuable information about distance to the camera for each point of the scene, making it possible to explore object detection methods, based on depth features. We define foreground as a set of objects silhouettes, nearest to the camera relative to the local background. We propose a method of foreground detection on depth maps based on medial representation of objects silhouettes which does not require any machine learning procedures and is able to detect foreground in near real-time in complex scenes with occlusions, using a single depth map. Proposed method is implemented to depth maps, obtained from Kinect sensor.

2021 ◽  
Author(s):  
Hung-Hao Chen ◽  
Chia-Hung Wang ◽  
Hsueh-Wei Chen ◽  
Pei-Yung Hsiao ◽  
Li-Chen Fu ◽  
...  

The current fusion-based methods transform LiDAR data into bird’s eye view (BEV) representations or 3D voxel, leading to information loss and heavy computation cost of 3D convolution. In contrast, we directly consume raw point clouds and perform fusion between two modalities. We employ the concept of region proposal network to generate proposals from two streams, respectively. In order to make two sensors compensate the weakness of each other, we utilize the calibration parameters to project proposals from one stream onto the other. With the proposed multi-scale feature aggregation module, we are able to combine the extracted regionof-interest-level (RoI-level) features of RGB stream from different receptive fields, resulting in fertilizing feature richness. Experiments on KITTI dataset show that our proposed network outperforms other fusion-based methods with meaningful improvements as compared to 3D object detection methods under challenging setting.


2020 ◽  
Vol 2020 (2) ◽  
pp. 140-1-140-6
Author(s):  
Yuzhong Jiao ◽  
Kayton Wai Keung Cheung ◽  
Mark Ping Chan Mok ◽  
Yiu Kei Li

Computer generated 2D plus Depth (2D+Z) images are common input data for 3D display with depth image-based rendering (DIBR) technique. Due to their simplicity, linear interpolation methods are usually used to convert low-resolution images into high-resolution images for not only depth maps but also 2D RGB images. However linear methods suffer from zigzag artifacts in both depth map and RGB images, which severely affects the 3D visual experience. In this paper, spatial distance-based interpolation algorithm for computer generated 2D+Z images is proposed. The method interpolates RGB images with the help of depth and edge information from depth maps. Spatial distance from interpolated pixel to surrounding available pixels is utilized to obtain the weight factors of surrounding pixels. Experiment results show that such spatial distance-based interpolation can achieve sharp edges and less artifacts for 2D RGB images. Naturally, it can improve the performance of 3D display. Since bilinear interpolation is used in homogenous areas, the proposed algorithm keeps low computational complexity.


2020 ◽  
Vol 3 (2) ◽  
pp. 99
Author(s):  
Albar Albar ◽  
Hendrick Hendrick ◽  
Rahmad Hidayat

Face detection is mostly applied in RGB images. The object detection usually applied the Deep Learning method for model creation. One method face spoofing is by using a thermal camera. The famous object detection methods are Yolo, Fast RCNN, Faster RCNN, SSD, and Mask RCNN. We proposed a segmentation Mask RCNN method to create a face model from thermal images. This model was able to locate the face area in images. The dataset was established using 1600 images. The images were created from direct capturing and collecting from the online dataset. The Mask RCNN was configured to train with 5 epochs and 131 iterations. The final model predicted and located the face correctly using the test image.


2018 ◽  
Author(s):  
Pallabi Ghosh ◽  
Domenic Forte ◽  
Damon L. Woodard ◽  
Rajat Subhra Chakraborty

Abstract Counterfeit electronics constitute a fast-growing threat to global supply chains as well as national security. With rapid globalization, the supply chain is growing more and more complex with components coming from a diverse set of suppliers. Counterfeiters are taking advantage of this complexity and replacing original parts with fake ones. Moreover, counterfeit integrated circuits (ICs) may contain circuit modifications that cause security breaches. Out of all types of counterfeit ICs, recycled and remarked ICs are the most common. Over the past few years, a plethora of counterfeit IC detection methods have been created; however, most of these methods are manual and require highly-skilled subject matter experts (SME). In this paper, an automated bent and corroded pin detection methodology using image processing is proposed to identify recycled ICs. Here, depth map of images acquired using an optical microscope are used to detect bent pins, and segmented side view pin images are used to detect corroded pins.


Author(s):  
M. N. Favorskaya ◽  
L. C. Jain

Introduction:Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches was developed since 1990s. In recent years, the saliency detection has become one of actively studied topic in the theory of Convolutional Neural Network (CNN). Many original decisions using CNNs were proposed for salient object detection and, even, event detection.Purpose:A detailed survey of saliency detection methods in deep learning era allows to understand the current possibilities of CNN approach for visual analysis conducted by the human eyes’ tracking and digital image processing.Results:A survey reflects the recent advances in saliency detection using CNNs. Different models available in literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection are discussed in the chronological order. It is worth noting that automatic salient event detection in durable videos became possible using the recently appeared 3D CNN combining with 2D CNN for salient audio detection. Also in this article, we have presented a short description of public image and video datasets with annotated salient objects or events, as well as the often used metrics for the results’ evaluation.Practical relevance:This survey is considered as a contribution in the study of rapidly developed deep learning methods with respect to the saliency detection in the images and videos.


Entropy ◽  
2021 ◽  
Vol 23 (5) ◽  
pp. 546
Author(s):  
Zhenni Li ◽  
Haoyi Sun ◽  
Yuliang Gao ◽  
Jiao Wang

Depth maps obtained through sensors are often unsatisfactory because of their low-resolution and noise interference. In this paper, we propose a real-time depth map enhancement system based on a residual network which uses dual channels to process depth maps and intensity maps respectively and cancels the preprocessing process, and the algorithm proposed can achieve real-time processing speed at more than 30 fps. Furthermore, the FPGA design and implementation for depth sensing is also introduced. In this FPGA design, intensity image and depth image are captured by the dual-camera synchronous acquisition system as the input of neural network. Experiments on various depth map restoration shows our algorithms has better performance than existing LRMC, DE-CNN and DDTF algorithms on standard datasets and has a better depth map super-resolution, and our FPGA completed the test of the system to ensure that the data throughput of the USB 3.0 interface of the acquisition system is stable at 226 Mbps, and support dual-camera to work at full speed, that is, 54 fps@ (1280 × 960 + 328 × 248 × 3).


Electronics ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 517
Author(s):  
Seong-heum Kim ◽  
Youngbae Hwang

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easier to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.


2021 ◽  
Vol 11 (13) ◽  
pp. 6006
Author(s):  
Huy Le ◽  
Minh Nguyen ◽  
Wei Qi Yan ◽  
Hoa Nguyen

Augmented reality is one of the fastest growing fields, receiving increased funding for the last few years as people realise the potential benefits of rendering virtual information in the real world. Most of today’s augmented reality marker-based applications use local feature detection and tracking techniques. The disadvantage of applying these techniques is that the markers must be modified to match the unique classified algorithms or they suffer from low detection accuracy. Machine learning is an ideal solution to overcome the current drawbacks of image processing in augmented reality applications. However, traditional data annotation requires extensive time and labour, as it is usually done manually. This study incorporates machine learning to detect and track augmented reality marker targets in an application using deep neural networks. We firstly implement the auto-generated dataset tool, which is used for the machine learning dataset preparation. The final iOS prototype application incorporates object detection, object tracking and augmented reality. The machine learning model is trained to recognise the differences between targets using one of YOLO’s most well-known object detection methods. The final product makes use of a valuable toolkit for developing augmented reality applications called ARKit.


Sign in / Sign up

Export Citation Format

Share Document