FOREGROUND DETECTION ON DEPTH MAPS USING SKELETAL REPRESENTATION OF OBJECT SILHOUETTES

This article considers the problem of foreground detection on depth maps. Finding objects of interest in images is one of the first steps in many object detection, recognition and tracking applications. However, this problem becomes very difficult for RGB images with multicolored or constantly changing backgrounds and in the presence of occlusions. Depth maps provide valuable information about the distance to the camera for each point of the scene, making it possible to explore object detection methods based on depth features. We define foreground as the set of object silhouettes nearest to the camera relative to the local background. We propose a method of foreground detection on depth maps based on a medial representation of object silhouettes. The method does not require any machine learning procedures and is able to detect foreground in near real time in complex scenes with occlusions, using a single depth map. The proposed method is applied to depth maps obtained from a Kinect sensor.
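
As a rough illustration of the foreground definition in this abstract (pixels nearer to the camera than their local background), the sketch below marks a pixel as foreground when it is closer than the farthest valid depth in a small surrounding window by more than a margin. This is a simple local-threshold stand-in and not the medial representation method the article proposes. The function name, the window size, the margin and the convention that a depth of 0 means "no measurement" are all assumptions made for the example.

```python
def local_foreground_mask(depth, window=2, margin=300):
    """Mark pixels that are much nearer than their local background.

    depth  -- 2D list of depth values (e.g. millimetres); 0 is treated as
              "no measurement" (an assumption of this sketch)
    window -- half-size of the square neighbourhood (assumed value)
    margin -- minimum depth gap to the local background (assumed value)
    """
    rows, cols = len(depth), len(depth[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            d = depth[r][c]
            if d == 0:
                continue  # skip pixels without a depth reading
            # Valid depths in the window around (r, c), clipped at the borders.
            neighbours = [
                depth[rr][cc]
                for rr in range(max(0, r - window), min(rows, r + window + 1))
                for cc in range(max(0, c - window), min(cols, c + window + 1))
                if depth[rr][cc] != 0
            ]
            # The farthest valid depth nearby stands in for the local background.
            local_background = max(neighbours)
            mask[r][c] = (local_background - d) > margin
    return mask


# Toy scene: a near object (900) in front of a far wall (2000), one missing pixel.
toy_depth = [
    [2000, 2000, 2000, 2000],
    [2000,  900,  900, 2000],
    [2000,  900,    0, 2000],
    [2000, 2000, 2000, 2000],
]
toy_mask = local_foreground_mask(toy_depth, window=1, margin=300)
```

A per-pixel test like this ignores silhouette shape entirely, which is the information a skeletal representation is meant to capture.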

A Novel Regional Fusion Network for 3D Object Detection based on RGB Images and Point Clouds

10.5121/csit.2021.111812 ◽ 2021

Author(s): Hung-Hao Chen ◽ Chia-Hung Wang ◽ Hsueh-Wei Chen ◽ Pei-Yung Hsiao ◽ Li-Chen Fu ◽ ...

Keyword(s): Object Detection ◽ Receptive Fields ◽ Point Clouds ◽ Detection Methods ◽ LiDAR Data ◽ 3D Object ◽ Multi Scale ◽ Interest Level ◽ RGB Images ◽ 3D Object Detection

Current fusion-based methods transform LiDAR data into bird’s eye view (BEV) representations or 3D voxels, leading to information loss and the heavy computation cost of 3D convolution. In contrast, we directly consume raw point clouds and perform fusion between the two modalities. We employ the concept of a region proposal network to generate proposals from each of the two streams. To make the two sensors compensate for each other's weaknesses, we utilize the calibration parameters to project proposals from one stream onto the other. With the proposed multi-scale feature aggregation module, we are able to combine the extracted region-of-interest-level (RoI-level) features of the RGB stream from different receptive fields, which enriches the features. Experiments on the KITTI dataset show that our proposed network outperforms other fusion-based methods, with meaningful improvements over 3D object detection methods under challenging settings.
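
The projection step mentioned in this abstract can be pictured with a standard pinhole-camera sketch: 3D points are multiplied by a 3×4 projection matrix and divided by the resulting depth to obtain pixel coordinates. The code below is only an illustration under stated assumptions and not the network's actual implementation. The function names, the example matrix values, and the assumption that the points are already expressed in the camera coordinate frame are all hypothetical.

```python
def project_point(P, point):
    """Project a 3D point (camera frame) to pixel coordinates with a 3x4 matrix P."""
    x, y, z = point
    u, v, w = (row[0] * x + row[1] * y + row[2] * z + row[3] for row in P)
    if w <= 0:
        raise ValueError("point is behind the camera")
    return u / w, v / w


def project_box(P, corners):
    """Return the 2D bounding box (u_min, v_min, u_max, v_max) of projected corners."""
    pixels = [project_point(P, corner) for corner in corners]
    us = [p[0] for p in pixels]
    vs = [p[1] for p in pixels]
    return min(us), min(vs), max(us), max(vs)


# Hypothetical calibration: focal length 700 px, principal point (600, 180).
P_example = [
    [700.0, 0.0, 600.0, 0.0],
    [0.0, 700.0, 180.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
# Two opposite corners of a 3D proposal, 10 units in front of the camera.
box_2d = project_box(P_example, [(-1.0, -1.0, 10.0), (1.0, 1.0, 10.0)])
```

With real LiDAR data an extra rigid transform from the LiDAR frame to the camera frame would be applied first; it is omitted here to keep the sketch short.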

Spatial Distance-based Interpolation Algorithm for Computer Generated 2D+Z Images

Electronic Imaging ◽ 10.2352/issn.2470-1173.2020.2.sda-140 ◽ 2020 ◽ Vol 2020 (2) ◽ pp. 140-1-140-6

Author(s): Yuzhong Jiao ◽ Kayton Wai Keung Cheung ◽ Mark Ping Chan Mok ◽ Yiu Kei Li

Keyword(s): Depth Map ◽ Linear Interpolation ◽ Spatial Distance ◽ Depth Image ◽ Interpolation Algorithm ◽ 3D Display ◽ Common Input ◽ Depth Maps ◽ High Resolution Images ◽ RGB Images

Computer generated 2D plus Depth (2D+Z) images are common input data for 3D displays that use the depth image-based rendering (DIBR) technique. Due to their simplicity, linear interpolation methods are usually used to convert low-resolution images into high-resolution images, not only for depth maps but also for 2D RGB images. However, linear methods suffer from zigzag artifacts in both depth maps and RGB images, which severely affect the 3D visual experience. In this paper, a spatial distance-based interpolation algorithm for computer generated 2D+Z images is proposed. The method interpolates RGB images with the help of depth and edge information from depth maps. The spatial distance from the interpolated pixel to the surrounding available pixels is used to obtain the weight factors of the surrounding pixels. Experimental results show that such spatial distance-based interpolation can achieve sharp edges and fewer artifacts for 2D RGB images. Naturally, it can improve the performance of 3D displays. Since bilinear interpolation is used in homogeneous areas, the proposed algorithm keeps computational complexity low.
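
The idea of deriving weight factors from spatial distance can be illustrated with plain inverse-distance weighting, where nearer pixels contribute more to the interpolated value. The abstract does not give the exact weight function or say how depth and edge information select the available pixels, so the sketch below is a generic stand-in and not the authors' algorithm. The function name and the `power` parameter are assumptions.

```python
import math


def distance_weighted_value(target, samples, power=1.0):
    """Interpolate a value at `target` from nearby samples.

    target  -- (x, y) position of the pixel to interpolate
    samples -- list of ((x, y), value) pairs for the available pixels
    power   -- exponent applied to the distance (assumed parameter)
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (sx, sy), value in samples:
        dist = math.hypot(sx - target[0], sy - target[1])
        if dist == 0.0:
            return float(value)  # target coincides with a known pixel
        weight = 1.0 / dist ** power  # nearer pixels get larger weights
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no samples given")
    return weighted_sum / weight_total


# A pixel one unit from a sample of value 0 and two units from a sample of value 30.
example = distance_weighted_value((1, 0), [((0, 0), 0), ((3, 0), 30)])
```

In an edge-aware scheme the `samples` list would presumably contain only pixels on the same side of a depth edge as the target, which is what keeps edges sharp.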

Segmentation Method for Face Modelling in Thermal Images

Knowledge Engineering and Data Science ◽ 10.17977/um018v3i22020p99-105 ◽ 2020 ◽ Vol 3 (2) ◽ pp. 99

Author(s): Albar Albar ◽ Hendrick Hendrick ◽ Rahmad Hidayat

Keyword(s): Object Detection ◽ Detection Methods ◽ Segmentation Method ◽ Thermal Camera ◽ Final Model ◽ Face Model ◽ Thermal Images ◽ Face Area ◽ The Face ◽ RGB Images

Face detection is mostly applied to RGB images. Object detection usually relies on deep learning for model creation. One approach to the face spoofing problem is to use a thermal camera. Well-known object detection methods include YOLO, Fast R-CNN, Faster R-CNN, SSD, and Mask R-CNN. We propose a Mask R-CNN segmentation method to create a face model from thermal images. This model was able to locate the face area in images. The dataset consisted of 1600 images, which were captured directly and collected from an online dataset. Mask R-CNN was configured to train for 5 epochs and 131 iterations. The final model predicted and located the face correctly in the test image.

Automated Detection of Pin Defects on Counterfeit Microelectronics

10.31399/asm.cp.istfa2018p0057 ◽ 2018

Author(s): Pallabi Ghosh ◽ Domenic Forte ◽ Damon L. Woodard ◽ Rajat Subhra Chakraborty

Keyword(s): Integrated Circuits ◽ Depth Map ◽ Optical Microscope ◽ Automated Detection ◽ Detection Methods ◽ Global Supply Chains ◽ Subject Matter Experts ◽ Security Breaches ◽ The Past ◽ Skilled Subject

Counterfeit electronics constitute a fast-growing threat to global supply chains as well as national security. With rapid globalization, the supply chain is growing more and more complex, with components coming from a diverse set of suppliers. Counterfeiters are taking advantage of this complexity and replacing original parts with fake ones. Moreover, counterfeit integrated circuits (ICs) may contain circuit modifications that cause security breaches. Of all types of counterfeit ICs, recycled and remarked ICs are the most common. Over the past few years, a plethora of counterfeit IC detection methods have been created; however, most of these methods are manual and require highly skilled subject matter experts (SMEs). In this paper, an automated bent and corroded pin detection methodology using image processing is proposed to identify recycled ICs. Here, depth maps of images acquired using an optical microscope are used to detect bent pins, and segmented side-view pin images are used to detect corroded pins.

Saliency detection in deep learning era: trends of development

Information and Control Systems ◽ 10.31799/1684-8853-2019-3-10-36 ◽ 2019 ◽ pp. 10-36 ◽ Cited By ~ 2

Author(s): M. N. Favorskaya ◽ L. C. Jain

Keyword(s): Deep Learning ◽ Object Detection ◽ Event Detection ◽ Visual Analysis ◽ Saliency Detection ◽ Salient Object Detection ◽ Public Image ◽ Detection Methods ◽ Salient Object ◽ Salient Event

Introduction: Saliency detection is a fundamental task of computer vision. Its ultimate aim is to localize the objects of interest that grab human visual attention with respect to the rest of the image. A great variety of saliency models based on different approaches have been developed since the 1990s. In recent years, saliency detection has become one of the actively studied topics in the theory of Convolutional Neural Networks (CNNs). Many original solutions using CNNs have been proposed for salient object detection and even event detection. Purpose: A detailed survey of saliency detection methods in the deep learning era makes it possible to understand the current capabilities of the CNN approach for visual analysis conducted through human eye tracking and digital image processing. Results: The survey reflects the recent advances in saliency detection using CNNs. Different models available in the literature, such as static and dynamic 2D CNNs for salient object detection and 3D CNNs for salient event detection, are discussed in chronological order. It is worth noting that automatic salient event detection in long videos became possible using recently introduced 3D CNNs combined with 2D CNNs for salient audio detection. This article also presents a short description of public image and video datasets with annotated salient objects or events, as well as the metrics often used for evaluating results. Practical relevance: This survey is a contribution to the study of rapidly developing deep learning methods for saliency detection in images and videos.

A Survey on Object Detection, Annotation and Anomaly Detection Methods for Endoscopic Videos

2020 5th International Conference on Computing, Communication and Security (ICCCS) ◽ 10.1109/icccs49678.2020.9277436 ◽ 2020

Author(s): Tejas Chheda ◽ Soumya Koppaka ◽ Rithvika Iyer ◽ Dhananjay Kalbande

Keyword(s): Anomaly Detection ◽ Object Detection ◽ Detection Methods ◽ Endoscopic Videos

A Residual Network and FPGA Based Real-Time Depth Map Enhancement System

Entropy ◽ 10.3390/e23050546 ◽ 2021 ◽ Vol 23 (5) ◽ pp. 546

Author(s): Zhenni Li ◽ Haoyi Sun ◽ Yuliang Gao ◽ Jiao Wang

Keyword(s): Real Time ◽ Super Resolution ◽ Depth Map ◽ Acquisition System ◽ Depth Image ◽ FPGA Design ◽ Depth Sensing ◽ Residual Network ◽ Real Time Processing ◽ Depth Maps

Depth maps obtained through sensors are often unsatisfactory because of their low resolution and noise interference. In this paper, we propose a real-time depth map enhancement system based on a residual network, which uses dual channels to process depth maps and intensity maps respectively and eliminates the preprocessing step; the proposed algorithm achieves real-time processing at more than 30 fps. Furthermore, the FPGA design and implementation for depth sensing is also introduced. In this FPGA design, the intensity image and depth image are captured by the dual-camera synchronous acquisition system as the input of the neural network. Experiments on various depth map restoration tasks show that our algorithm performs better than the existing LRMC, DE-CNN and DDTF algorithms on standard datasets and achieves better depth map super-resolution. Tests of the system on our FPGA confirm that the data throughput of the USB 3.0 interface of the acquisition system is stable at 226 Mbps and supports the dual cameras working at full speed, that is, 54 fps @ (1280 × 960 + 328 × 248 × 3).

Research on object detection method based on FF-YOLO for complex scenes

IEEE Access ◽ 10.1109/access.2021.3108398 ◽ 2021 ◽ pp. 1-1

Author(s): Chen Baoyuan ◽ Liu Yitong ◽ Sun Kun

Keyword(s): Object Detection ◽ Detection Method ◽ Complex Scenes

A Survey on Deep Learning Based Methods and Datasets for Monocular 3D Object Detection

Electronics ◽ 10.3390/electronics10040517 ◽ 2021 ◽ Vol 10 (4) ◽ pp. 517

Author(s): Seong-heum Kim ◽ Youngbae Hwang

Keyword(s): Deep Learning ◽ Object Detection ◽ Low Cost ◽ Detection Methods ◽ Future Research ◽ 3D Object ◽ Practical Applications ◽ Depth Sensors ◽ Significant Research ◽ 3D Object Detection

Owing to recent advancements in deep learning methods and relevant databases, it is becoming increasingly easy to recognize 3D objects using only RGB images from single viewpoints. This study investigates the major breakthroughs and current progress in deep learning-based monocular 3D object detection. For relatively low-cost data acquisition systems without depth sensors or cameras at multiple viewpoints, we first consider existing databases with 2D RGB photos and their relevant attributes. Based on this simple sensor modality for practical applications, deep learning-based monocular 3D object detection methods that overcome significant research challenges are categorized and summarized. We present the key concepts and detailed descriptions of representative single-stage and multiple-stage detection solutions. In addition, we discuss the effectiveness of the detection models on their baseline benchmarks. Finally, we explore several directions for future research on monocular 3D object detection.

Augmented Reality and Machine Learning Incorporation Using YOLOv3 and ARKit

Applied Sciences ◽ 10.3390/app11136006 ◽ 2021 ◽ Vol 11 (13) ◽ pp. 6006

Author(s): Huy Le ◽ Minh Nguyen ◽ Wei Qi Yan ◽ Hoa Nguyen

Keyword(s): Machine Learning ◽ Augmented Reality ◽ Object Detection ◽ Feature Detection ◽ Detection Methods ◽ Detection Accuracy ◽ Data Annotation ◽ Machine Learning Model ◽ Potential Benefits ◽ Feature Detection And Tracking

Augmented reality is one of the fastest growing fields and has received increased funding over the last few years as people realise the potential benefits of rendering virtual information in the real world. Most of today’s marker-based augmented reality applications use local feature detection and tracking techniques. The disadvantage of these techniques is that the markers must be modified to match the unique classified algorithms or they suffer from low detection accuracy. Machine learning is an ideal solution to overcome the current drawbacks of image processing in augmented reality applications. However, traditional data annotation requires extensive time and labour, as it is usually done manually. This study incorporates machine learning to detect and track augmented reality marker targets in an application using deep neural networks. We first implement an auto-generated dataset tool, which is used to prepare the machine learning dataset. The final iOS prototype application incorporates object detection, object tracking and augmented reality. The machine learning model is trained to recognise the differences between targets using YOLO, one of the most well-known object detection methods. The final product makes use of ARKit, a valuable toolkit for developing augmented reality applications.
