Method of Using RealSense Camera to Estimate the Depth Map of Any Monocular Camera

2021 ◽  
Vol 2021 ◽  
pp. 1-9
Author(s):  
Li-fen Tu ◽  
Qi Peng

Robot detection, recognition, positioning, and similar applications require not only real-time video but also the distance from the target to the camera, that is, depth information. This paper proposes a method that automatically generates a depth map for any monocular camera from RealSense camera data. With this method, an existing single-camera detection system can be upgraded online: without changing the original system, depth information for the original monocular camera is obtained and the system moves from 2D detection to 3D detection. To verify the method, a hardware system was built from the Micro-vision RS-A14K-GC8 industrial camera and the Intel RealSense D415 depth camera, and the proposed depth map fitting algorithm was tested on it. The results show that, apart from a few areas with missing depth, the generated depth map is good and broadly describes the differences in distance between targets and the camera. To verify scalability, a second hardware system was built with different cameras, and images were collected in a complex farmland environment. The depth map generated there was also good and likewise described the differences in distance between targets and the camera.
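The abstract does not spell out the depth map fitting algorithm. As a rough illustration of the geometry such a method has to handle, the sketch below reprojects a depth image from a depth camera into the view of a second (monocular) camera with a plain pinhole model. It is not the authors' algorithm. The function name `reproject_depth`, the intrinsics tuples, and the rotation `R` and translation `t` between the cameras are all assumptions, taken here as already known from a prior calibration.

```python
def reproject_depth(depth, K_d, K_m, R, t, out_shape):
    """Map a depth image from the depth camera into the monocular camera view.

    depth     : list of rows, depth in metres, 0 means "no measurement"
    K_d, K_m  : (fx, fy, cx, cy) pinhole intrinsics of depth / monocular camera
    R, t      : 3x3 rotation and 3-vector translation, depth frame -> mono frame
    out_shape : (height, width) of the monocular image
    """
    fx_d, fy_d, cx_d, cy_d = K_d
    fx_m, fy_m, cx_m, cy_m = K_m
    h, w = out_shape
    out = [[0.0] * w for _ in range(h)]

    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # skip pixels with missing depth
            # Back-project the pixel to a 3D point in the depth camera frame.
            x = (u - cx_d) * z / fx_d
            y = (v - cy_d) * z / fy_d
            # Move the point into the monocular camera frame.
            xm = R[0][0] * x + R[0][1] * y + R[0][2] * z + t[0]
            ym = R[1][0] * x + R[1][1] * y + R[1][2] * z + t[1]
            zm = R[2][0] * x + R[2][1] * y + R[2][2] * z + t[2]
            if zm <= 0:
                continue  # point lies behind the monocular camera
            # Project onto the monocular image plane.
            um = int(round(fx_m * xm / zm + cx_m))
            vm = int(round(fy_m * ym / zm + cy_m))
            if 0 <= um < w and 0 <= vm < h:
                # When several points land on one pixel, keep the nearest.
                if out[vm][um] == 0 or zm < out[vm][um]:
                    out[vm][um] = zm
    return out


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
K = (1.0, 1.0, 0.0, 0.0)
same_view = reproject_depth([[1.0, 0.0], [2.0, 3.0]], K, K, identity, (0, 0, 0), (2, 2))
shifted = reproject_depth([[1.0, 0.0]], K, K, identity, (1.0, 0.0, 0.0), (1, 2))
```

With identical cameras the depth image is returned unchanged, and a sideways offset between the cameras shifts the depth values across the image. Pixels that receive no reprojected point stay at 0, which is one way depth-missing areas like those mentioned in the abstract can arise.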

2021 ◽  
Vol 13 ◽  
pp. 175682932110048
Author(s):  
Huajun Song ◽  
Yanqi Wu ◽  
Guangbing Zhou

The rapid development of drones has brought problems such as invasion of privacy and threats to security. To achieve effective detection and robust tracking of small targets such as unmanned aerial vehicles, a biologically inspired binocular vision detection system is designed. The system consists of long-focus and wide-angle cameras, a servo pan-tilt unit, and dual processors for detecting and identifying targets. The spatio-temporal context target tracking algorithm cannot adapt to scale changes and is prone to tracking failure in complex scenes; to address these shortcomings, a scale filter and a loss criterion are introduced. Qualitative and quantitative experiments show that the designed system adapts to scale changes and partial occlusion during detection and meets real-time requirements. Both the hardware system and the algorithm have reference value for anti-unmanned aerial vehicle systems.


2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Hasan Mahmud ◽  
Md. Kamrul Hasan ◽  
Abdullah-Al-Tariq ◽  
Md. Hasanul Kabir ◽  
M. A. Mottalib

Symbolic gestures are hand postures with conventionalized meanings. They are static gestures that can be performed without voice in very complex environments containing variations in rotation and scale. The gestures may be produced under different illumination conditions or against occluding backgrounds. Any hand gesture recognition system should find sufficiently discriminative features, such as hand-finger contextual information. In existing approaches, however, the depth information that represents finger shapes is used only to a limited extent to extract discriminative features of the fingers. If finger bending information (i.e., a finger that overlaps the palm) is extracted from the depth map and used as local features, static gestures that differ only slightly can become distinguishable. Our work corroborates this idea: we generated depth silhouettes with variation in contrast to obtain more discriminative keypoints, which improved the recognition accuracy up to 96.84%. We applied the Scale-Invariant Feature Transform (SIFT) algorithm, which takes the generated depth silhouettes as input and produces robust feature descriptors as output. These features, after conversion into feature vectors of unified dimension, are fed into a multiclass Support Vector Machine (SVM) classifier to measure the accuracy. We tested our approach on a standard dataset containing 10 symbolic gestures representing the 10 numeric symbols (0-9). We then verified and compared our results among depth images, binary images, and images consisting of the hand-finger edge information generated from the same dataset. Our results show higher accuracy when SIFT features are applied to depth images. Accurately recognizing numeric symbols performed through hand gestures has a large impact on Human-Computer Interaction (HCI) applications, including augmented reality, virtual reality, and other fields.


Author(s):  
Hyun Jun Park ◽  
Kwang Baek Kim

The Intel RealSense depth camera provides a depth image using an infrared projector and an infrared camera. Infrared radiation makes it possible to measure depth with high accuracy, but infrared shadows leave regions where depth is not measured. The Intel RealSense SDK provides a postprocessing algorithm to correct this. However, this algorithm is not sufficient in practice and needs to be improved. Therefore, we propose a method to correct the depth image using image processing techniques. The proposed method corrects the depth using the adjacent depth information. Experimental results showed that the proposed method corrects the depth image more accurately than the Intel RealSense SDK.
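The abstract says only that unmeasured depth is corrected from adjacent depth information. One simple way to do that, shown here as a minimal sketch and not as the authors' method, is to fill each zero-valued pixel with the median of its valid 8-neighbours and repeat until the holes close. The function name `fill_depth_holes`, the use of 0 as the "unmeasured" marker, and the `max_passes` limit are assumptions made for illustration.

```python
from statistics import median


def fill_depth_holes(depth, max_passes=10):
    """Fill unmeasured pixels (value 0) from neighbouring valid depth values.

    depth      : list of rows of depth values, 0 means "not measured"
    max_passes : upper bound on fill iterations (hypothetical safety limit)
    """
    h, w = len(depth), len(depth[0])
    current = [list(row) for row in depth]

    for _ in range(max_passes):
        updated = [list(row) for row in current]
        changed = False
        for r in range(h):
            for c in range(w):
                if current[r][c] != 0:
                    continue  # measured pixels are never modified
                # Collect valid values in the 3x3 window around the hole.
                neighbours = [
                    current[rr][cc]
                    for rr in range(max(0, r - 1), min(h, r + 2))
                    for cc in range(max(0, c - 1), min(w, c + 2))
                    if current[rr][cc] != 0
                ]
                if neighbours:
                    updated[r][c] = median(neighbours)
                    changed = True
        current = updated
        if not changed:
            break  # nothing left to fill, or no valid data to fill from
    return current


single_hole = fill_depth_holes([[5, 5, 5], [5, 0, 5], [5, 5, 5]])
wide_hole = fill_depth_holes([[1, 0, 0]])
```

Each pass reads from the previous pass only, so a wide hole is filled inward one ring at a time. A median is used instead of a mean so that a hole on an object boundary takes one side's depth and does not become a blend of foreground and background.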


Sensors ◽  
2019 ◽  
Vol 19 (13) ◽  
pp. 3008 ◽  
Author(s):  
Zhe Liu ◽  
Zhaozong Meng ◽  
Nan Gao ◽  
Zonghua Zhang

Depth cameras play a vital role in three-dimensional (3D) shape reconstruction, machine vision, augmented/virtual reality and other visual information-related fields. However, a single depth camera cannot obtain complete information about an object by itself due to the limitation of the camera’s field of view. Multiple depth cameras can solve this problem by acquiring depth information from different viewpoints. In order to do so, they need to be calibrated to be able to accurately obtain the complete 3D information. However, traditional chessboard-based planar targets are not well suited for calibrating the relative orientations between multiple depth cameras, because the coordinates of different depth cameras need to be unified into a single coordinate system, and the multiple camera systems with a specific angle have a very small overlapping field of view. In this paper, we propose a 3D target-based multiple depth camera calibration method. Each plane of the 3D target is used to calibrate an independent depth camera. All planes of the 3D target are unified into a single coordinate system, which means the feature points on the calibration plane are also in one unified coordinate system. Using this 3D target, multiple depth cameras can be calibrated simultaneously. In this paper, a method of precise calibration using lidar is proposed. This method is not only applicable to the 3D target designed for the purposes of this paper, but it can also be applied to all 3D calibration objects consisting of planar chessboards. This method can significantly reduce the calibration error compared with traditional camera calibration methods. In addition, in order to reduce the influence of the infrared transmitter of the depth camera and improve its calibration accuracy, the calibration process of the depth camera is optimized. A series of calibration experiments were carried out, and the experimental results demonstrated the reliability and effectiveness of the proposed method.
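Once calibration has produced a rotation and translation for each depth camera, unifying the cameras into a single coordinate system comes down to applying a rigid transform to each camera's points. The sketch below shows only that final step, under the assumption that the per-camera extrinsics are already known. It does not implement the 3D-target calibration itself, and the names `to_common_frame` and `merge_clouds` are hypothetical.

```python
def to_common_frame(points, R, t):
    """Apply p' = R p + t to every 3D point.

    points : list of (x, y, z) tuples in one camera's own frame
    R, t   : 3x3 rotation and 3-vector translation, camera frame -> common frame
    """
    return [
        tuple(sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))
        for p in points
    ]


def merge_clouds(clouds, extrinsics):
    """Merge per-camera point lists into one list in the common frame.

    clouds     : list of point lists, one per camera
    extrinsics : list of (R, t) pairs, in the same order as `clouds`
    """
    merged = []
    for points, (R, t) in zip(clouds, extrinsics):
        merged.extend(to_common_frame(points, R, t))
    return merged


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
rot_z_90 = [[0, -1, 0], [1, 0, 0], [0, 0, 1]]  # 90 degrees about the z axis

merged = merge_clouds(
    [[(1, 0, 0)], [(1, 0, 0)]],
    [(identity, (0, 0, 0)), (rot_z_90, (1, 0, 0))],
)
```

Here the first camera defines the common frame, and the second camera's point (1, 0, 0) is rotated to (0, 1, 0) and then shifted to (1, 1, 0). In a real system the accuracy of the merged result depends entirely on how well each (R, t) pair was calibrated, which is the problem the abstract addresses.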


Author(s):  
Nadia Baha ◽  
Eden Beloudah ◽  
Mehdi Ousmer

Falls are a major health problem among older people who live alone at home. In the past few years, several studies have addressed this problem, especially through video surveillance. In this paper, to allow older adults to continue living safely in their home environments, the authors propose a method that combines two different configurations of the Microsoft Kinect. The first, a ceiling-mounted Kinect, is based on the person's depth information and velocity. The second, a frontal Kinect, is based on the variation of bounding box parameters and its velocity. Experiments are conducted on real datasets, and the results are compared with state-of-the-art methods. The results show that the authors' method accurately detects several types of falls in real time, significantly reduces false alarms, and improves detection rates.
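The abstract does not give the decision rule for the frontal configuration. As a loose illustration of what "variation of bounding box parameters and its velocity" could mean in code, the sketch below flags a fall when the person's bounding box is shrinking in height quickly and has become wider than it is tall. This is not the authors' method; the function name `detect_fall`, the metric box sizes, and both thresholds are assumptions chosen only for the example.

```python
def detect_fall(boxes, fps, ratio_threshold=1.0, speed_threshold=1.0):
    """Return the index of the first frame that looks like a fall, else None.

    boxes           : list of (width, height) of the person's box, in metres
    fps             : frames per second of the sequence
    ratio_threshold : height/width below this counts as "lying" (hypothetical)
    speed_threshold : height loss in m/s above this counts as "fast" (hypothetical)
    """
    for i in range(1, len(boxes)):
        prev_height = boxes[i - 1][1]
        width, height = boxes[i]
        if width <= 0:
            continue  # ignore degenerate boxes
        # Speed at which the box height is shrinking between two frames.
        shrink_speed = (prev_height - height) * fps
        aspect_ratio = height / width
        if shrink_speed > speed_threshold and aspect_ratio < ratio_threshold:
            return i
    return None


falling = [(0.5, 1.7), (0.5, 1.7), (0.9, 1.0), (1.6, 0.5)]
standing = [(0.5, 1.7)] * 5

fall_frame = detect_fall(falling, fps=10)
no_fall = detect_fall(standing, fps=10)
```

Requiring both conditions at once is what keeps slow movements, such as lying down on a bed, from being flagged. A fixed-threshold rule like this one would still need tuning and validation on real data before any comparison with the results reported in the abstract.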

