Recognition of Fingerspelling Sequences in Polish Sign Language Using Point Clouds Obtained from Depth Images

Sensors ◽  
2019 ◽  
Vol 19 (5) ◽  
pp. 1078 ◽  
Author(s):  
Dawid Warchoł ◽  
Tomasz Kapuściński ◽  
Marian Wysocki

The paper presents a method for recognizing sequences of static letters of the Polish finger alphabet using point cloud descriptors: the viewpoint feature histogram, eigenvalue-based descriptors, the ensemble of shape functions, and the global radius-based surface descriptor. Each sequence is understood as a quick, highly coarticulated motion, and classification is performed by networks of hidden Markov models trained on transitions between postures corresponding to particular letters. Three kinds of left-to-right Markov models of the transitions, two networks of the transition models (one independent of and one dependent on a dictionary), as well as various combinations of point cloud descriptors are examined on a publicly available dataset of 4200 executions (registered as depth map sequences) prepared by the authors. The hand shape representation proposed in our method can also be applied to recognition of hand postures in single frames. We confirmed this using a known, challenging American finger alphabet dataset with about 60,000 depth images.
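
One of the named descriptors, the eigenvalue-based descriptor, is simple enough to sketch: the eigenvalues of a point cloud's covariance matrix yield linearity, planarity, and sphericity features that summarize hand shape. A minimal numpy sketch using the common formulation (the paper's exact feature set may differ):

```python
import numpy as np

def eigenvalue_descriptor(points: np.ndarray) -> np.ndarray:
    """Eigenvalue-based shape features for one hand point cloud.

    `points` is an (N, 3) array. This follows the common
    linearity/planarity/sphericity formulation; the paper's
    exact feature set is not reproduced here.
    """
    centered = points - points.mean(axis=0)
    cov = centered.T @ centered / len(points)
    # Eigenvalues of the 3x3 covariance, sorted descending: l1 >= l2 >= l3.
    l1, l2, l3 = np.linalg.eigvalsh(cov)[::-1]
    linearity = (l1 - l2) / l1
    planarity = (l2 - l3) / l1
    sphericity = l3 / l1
    return np.array([linearity, planarity, sphericity])
```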

2015 ◽  
Vol 764-765 ◽  
pp. 1375-1379 ◽  
Author(s):  
Cheng Tiao Hsieh

This paper presents a simple approach utilizing a Kinect-based scanner to create models ready for 3D printing or other digital manufacturing machines. The output of a Kinect-based scanner is a depth map, which usually requires complicated computational processes to prepare it for digital fabrication. The necessary processes include noise filtering, point cloud alignment, and surface reconstruction, and each may require several functions and algorithms to accomplish its specific tasks. For instance, the Iterative Closest Point (ICP) algorithm is frequently used for 3D registration, and the bilateral filter is often used for noise point filtering. This paper attempts to develop a simple Kinect-based scanner and a specific modeling approach that avoids the above complicated processes. The developed scanner consists of an ASUS Xtion Pro and a rotation table. The scanner generates a set of organized point clouds, which can be aligned precisely by a simple transformation matrix instead of ICP. The surface quality of raw point clouds captured by Kinect is usually rough; to address this drawback, this paper introduces a solution for obtaining a smooth surface model. In addition, these processes have been efficiently developed with free open-source libraries: VTK, Point Cloud Library, and OpenNI.
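
The key simplification, aligning the organized point clouds with a known transformation rather than ICP, can be illustrated with a rotation about the turntable axis. A minimal numpy sketch, assuming the axis is the vertical Y axis through the origin (in practice the axis must be calibrated once for the rig):

```python
import numpy as np

def align_by_table_angle(points: np.ndarray, angle_deg: float) -> np.ndarray:
    """Rotate one scan into the reference frame of the first view.

    Assumes the turntable axis is the world Y axis through the
    origin; a real rig needs a one-time axis calibration.
    """
    a = np.radians(angle_deg)
    # Rotation about the Y (vertical) axis by the known table angle.
    rot = np.array([[ np.cos(a), 0.0, np.sin(a)],
                    [ 0.0,       1.0, 0.0      ],
                    [-np.sin(a), 0.0, np.cos(a)]])
    return points @ rot.T

# Usage: merge six views captured every 60 degrees.
# merged = np.vstack([align_by_table_angle(p, i * 60.0)
#                     for i, p in enumerate(scans)])
```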


2014 ◽  
Vol 513-517 ◽  
pp. 4193-4196
Author(s):  
Wen Bao Qiao ◽  
Ming Guo ◽  
Jun Jie Liu

In this paper, we propose an efficient way to produce an initial transformation matrix for two point clouds, which effectively avoids the local minima that the standard Iterative Closest Point (ICP) algorithm can fall into when matching two point clouds. In our approach, the correspondences used to calculate the transformation matrix are confirmed before the point clouds are formed. We use depth images that have been carefully target-segmented to find the boundaries of the shapes reflecting different views of the same target object. The curvature scale space (CSS) method is then applied to each contour to find a sequence of characteristic points, and our method matches the most similar pairs among these characteristic points. Finally, we convert the matched characteristic points to 3D points, thereby confirming the correspondences. From them we can compute an initial transformation matrix that tells the computer which part of the first point cloud should be matched to the second. In this way, the two point clouds are placed in a correct initial location, so that the local minima of ICP and its variants can be avoided.
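
Once the matched characteristic points are lifted to 3D, the initial transformation can be estimated in closed form. The sketch below uses the standard Kabsch/SVD solution with numpy; the paper does not specify its estimator, so this is an illustrative stand-in:

```python
import numpy as np

def rigid_transform_from_matches(src: np.ndarray, dst: np.ndarray):
    """Least-squares rotation R and translation t with dst ~ src @ R.T + t.

    `src` and `dst` are (N, 3) arrays of matched 3D characteristic
    points (e.g., the CSS-matched contour points lifted to 3D).
    Standard Kabsch/SVD solution.
    """
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    h = (src - src_c).T @ (dst - dst_c)        # 3x3 cross-covariance
    u, _, vt = np.linalg.svd(h)
    d = np.sign(np.linalg.det(vt.T @ u.T))     # guard against reflection
    r = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    t = dst_c - r @ src_c
    return r, t
```

The resulting (R, t) pair serves as the initial pose handed to ICP, replacing the identity initialization that causes the local-minima problem.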


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5331
Author(s):  
Ouk Choi ◽  
Min-Gyu Park ◽  
Youngbae Hwang

We present two algorithms for aligning two colored point clouds. The two algorithms are designed to minimize a probabilistic cost based on the color-supported soft matching of points in a point cloud to their K-closest points in the other point cloud. The first algorithm, like prior iterative closest point algorithms, refines the pose parameters to minimize the cost. Assuming that the point clouds are obtained from RGB-depth images, our second algorithm regards the measured depth values as variables and minimizes the cost to obtain refined depth values. Experiments with our synthetic dataset show that our pose refinement algorithm gives better results compared to the existing algorithms. Our depth refinement algorithm is shown to achieve more accurate alignments from the outputs of the pose refinement step. Our algorithms are applied to a real-world dataset, providing accurate and visually improved results.
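
The color-supported soft matching can be pictured as follows: each point is softly assigned to its K closest points in the other cloud, with weights damped by both spatial and color distance. A numpy/scipy sketch of the idea only; the paper's exact probabilistic cost and the parameters k, sigma_d, and sigma_c here are assumptions:

```python
import numpy as np
from scipy.spatial import cKDTree

def soft_match_weights(src_xyz, src_rgb, dst_xyz, dst_rgb,
                       k=8, sigma_d=0.02, sigma_c=0.1):
    """Soft correspondence weights to the K closest points,
    modulated by color similarity. Illustrative only; not the
    paper's exact cost or normalization.
    """
    dist, idx = cKDTree(dst_xyz).query(src_xyz, k=k)      # both (N, k)
    color_diff = np.linalg.norm(src_rgb[:, None, :] - dst_rgb[idx], axis=2)
    w = np.exp(-(dist / sigma_d) ** 2) * np.exp(-(color_diff / sigma_c) ** 2)
    w /= w.sum(axis=1, keepdims=True) + 1e-12             # normalize per point
    return idx, w
```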


2020 ◽  
Vol 12 (7) ◽  
pp. 1142
Author(s):  
Jeonghoon Kwak ◽  
Yunsick Sung

To provide a realistic environment for remote sensing applications, point clouds are used to realize a three-dimensional (3D) digital world for the user. Motion recognition of objects, e.g., humans, is required to provide realistic experiences in the 3D digital world. To recognize a user's motions, 3D landmarks are obtained by analyzing a 3D point cloud collected by a light detection and ranging (LiDAR) system or a red green blue (RGB) image collected by a camera. However, manual supervision is required to extract 3D landmarks, whether they originate from the RGB image or the 3D point cloud; thus, a method for extracting 3D landmarks without manual supervision is needed. Herein, an RGB image and a 3D point cloud are used together to extract 3D landmarks. The 3D point cloud provides the relative distance between the LiDAR and the user. Because the point cloud is too sparse to cover the user's entire body, it cannot by itself yield a dense depth image that delineates the boundary of the user's body. Therefore, up-sampling is performed to increase the density of the depth image generated from the 3D point cloud. This paper proposes a system for extracting 3D landmarks from 3D point clouds and RGB images without manual supervision. A depth image, which provides the boundary of the user's motion, is generated using the 3D point cloud and the RGB image collected by a LiDAR and an RGB camera, respectively. To extract 3D landmarks automatically, an encoder-decoder model is trained with the generated depth images and the RGB images, and 3D landmarks are extracted with the trained encoder model. The method of extracting 3D landmarks from RGB-depth (RGBD) images was verified experimentally, and the extracted landmarks were used to evaluate the user's motions. In this manner, landmarks could be extracted according to the user's motions rather than from the RGB images alone. The depth images generated by the proposed method were 1.832 times denser than the up-sampling-based depth images generated with bilateral filtering.
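
The densification step, turning the sparse depth image obtained by projecting LiDAR points into a dense one, can be approximated generically. A scipy sketch assuming the points are already projected through the camera model; the paper's own up-sampling (and the bilateral-filter baseline it is compared against) differ:

```python
import numpy as np
from scipy.interpolate import griddata

def densify_depth(uv: np.ndarray, z: np.ndarray, h: int, w: int) -> np.ndarray:
    """Turn projected LiDAR samples into a dense depth image.

    `uv` holds (N, 2) pixel coordinates of points already projected
    through the camera model; `z` holds their depths. Plain linear
    interpolation, an illustrative stand-in for the paper's method.
    """
    grid_u, grid_v = np.meshgrid(np.arange(w), np.arange(h))
    dense = griddata(uv, z, (grid_u, grid_v), method="linear")
    # Pixels outside the convex hull of samples stay NaN; fill nearest.
    nearest = griddata(uv, z, (grid_u, grid_v), method="nearest")
    return np.where(np.isnan(dense), nearest, dense)
```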


2020 ◽  
Vol 12 (7) ◽  
pp. 1125 ◽  
Author(s):  
Helia Farhood ◽  
Stuart Perry ◽  
Eva Cheng ◽  
Juno Kim

The importance of three-dimensional (3D) point cloud technologies in agricultural and environmental research has increased in recent years. Dense and accurate 3D reconstructions of plants and urban areas provide useful information for remote sensing. In this paper, we propose a novel strategy for the enhancement of 3D point clouds from a single 4D light field (LF) image. Using a light field camera in this way offers an easy way to obtain 3D point clouds from one snapshot, enabling diverse monitoring and modelling applications for remote sensing. Taking an LF image and its associated depth map as input, we first apply histogram equalization and histogram stretching to enhance the separation between depth planes. We then apply multi-modal edge detection, using feature matching and fuzzy logic, to the central sub-aperture LF image and the depth map. These two depth map enhancement steps constitute the main novelty of this work. After combining the two previous steps and transforming the point-plane correspondence, we obtain the 3D point cloud. We tested our method on synthetic and real-world image databases. To verify its accuracy, we compared our results with two state-of-the-art algorithms. The results showed that our method reliably mitigates noise and achieves the highest level of detail among the compared methods.
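
The first enhancement step is standard image processing applied to the depth map. A numpy sketch of histogram equalization followed by percentile-based stretching; the bin count and percentiles are illustrative assumptions, not values from the paper:

```python
import numpy as np

def equalize_and_stretch(depth: np.ndarray, bins: int = 256) -> np.ndarray:
    """Histogram-equalize then contrast-stretch a depth map so that
    depth planes spread apart. Generic formulation for illustration.
    """
    valid = np.isfinite(depth)
    hist, edges = np.histogram(depth[valid], bins=bins)
    cdf = hist.cumsum() / hist.sum()              # normalized CDF
    eq = np.interp(depth, edges[:-1], cdf)        # equalized to [0, 1]
    lo, hi = np.percentile(eq[valid], (1, 99))    # stretch to full range
    return np.clip((eq - lo) / (hi - lo + 1e-12), 0.0, 1.0)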


Author(s):  
Xiaowen Teng ◽  
Guangsheng Zhou ◽  
Yuxuan Wu ◽  
Chenglong Huang ◽  
Wanjing Dong ◽  
...  

The 3D reconstruction method using an RGB-D camera offers a good balance of hardware cost, point cloud quality, and automation. However, due to limitations of the inherent structure and imaging principle, the acquired point cloud suffers from heavy noise and difficult registration. This paper proposes a three-dimensional reconstruction method using the Azure Kinect to solve these inherent problems. Color maps, depth maps, and near-infrared images of the target are captured from six perspectives with the Azure Kinect sensor. The binarized 8-bit infrared image is multiplied with the general RGB-D image alignment result provided by Microsoft to remove ghost images and most of the background noise. To filter the floating points and outlier noise of the point cloud, a neighborhood maximum filtering method is proposed that filters out abrupt points in the depth map. The floating points are thus removed before the point cloud is generated, and a pass-through filter then removes the outlier noise. To address the shortcomings of the classic ICP algorithm, an improved method is proposed: by continuously reducing the size of the down-sampling grid and the distance threshold between corresponding points, the point clouds of each view are registered in three successive passes until a complete color point cloud is obtained. Extensive experiments on rape plants show that the point cloud accuracy obtained by this method is 0.739 mm, a complete scan takes 338.4 seconds, and the color reproduction is faithful. Compared with a laser scanner, the proposed method achieves comparable reconstruction accuracy at a significantly faster reconstruction speed, with much lower hardware cost and easy automation of the scanning system. This research demonstrates a low-cost, high-precision 3D reconstruction technique with the potential to be widely used for non-destructive measurement of crop phenotypes.
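
The coarse-to-fine registration idea, shrinking both the down-sampling grid and the correspondence distance threshold over three passes, maps naturally onto off-the-shelf ICP. A sketch with Open3D as a stand-in for the authors' implementation; the millimetre schedules below are assumptions, not the paper's values:

```python
import numpy as np
import open3d as o3d

def coarse_to_fine_icp(source, target,
                       voxel_sizes=(8.0, 4.0, 2.0),
                       dist_thresholds=(20.0, 10.0, 5.0)):
    """Three ICP passes with shrinking voxel size and correspondence
    distance (values in mm, illustrative only).
    """
    current = np.eye(4)
    for voxel, dist in zip(voxel_sizes, dist_thresholds):
        src = source.voxel_down_sample(voxel)
        tgt = target.voxel_down_sample(voxel)
        result = o3d.pipelines.registration.registration_icp(
            src, tgt, dist, current,
            o3d.pipelines.registration.TransformationEstimationPointToPoint())
        current = result.transformation  # feed into the next, finer pass
    return current
```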


2016 ◽  
Vol 23 (2) ◽  
pp. 91-98 ◽  
Author(s):  
Tomasz Dziubich ◽  
Julian Szymański ◽  
Adam Brzeski ◽  
Jan Cychnerski ◽  
Waldemar Korłub

In this paper, we propose a distributed system for processing point clouds and transferring them via a computer network with regard to effectiveness-related requirements. We compare point cloud filters focusing on their usage for streaming optimization. For the filtering step of the stream processing pipeline, we evaluate four filters: Voxel Grid, Radius Outlier Removal, Statistical Outlier Removal, and Pass Through. For each filter we perform a series of tests evaluating the impact on point cloud size and transmission frequency (analysed for various fps rates). We present results of the optimization process used for point cloud consolidation in a distributed environment and describe the processing of the point clouds before and after transmission. Pre- and post-processing allow the user to send the cloud over the network without delays. The proposed pre-processing compression of the cloud and post-processing reconstruction focus on ensuring that the end-user application obtains the cloud with a given precision.
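
Two of the evaluated filters have direct Open3D counterparts, which makes the pre-transmission reduction easy to sketch (the paper's pipeline uses PCL; the parameter values below are illustrative, not the tested configurations):

```python
import open3d as o3d

def shrink_for_streaming(pcd: o3d.geometry.PointCloud,
                         voxel: float = 0.01,
                         nb_neighbors: int = 20,
                         std_ratio: float = 2.0) -> o3d.geometry.PointCloud:
    """Voxel-grid then statistical-outlier filtering before transmission.

    Open3D stand-ins for two of the four evaluated filters.
    """
    down = pcd.voxel_down_sample(voxel_size=voxel)    # Voxel Grid
    filtered, _ = down.remove_statistical_outlier(    # Statistical Outlier Removal
        nb_neighbors=nb_neighbors, std_ratio=std_ratio)
    return filtered
```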


Author(s):  
Yi-Chen Chen ◽  
Chao-Hung Lin

With the development of Web 2.0 and cyber city modeling, an increasing number of 3D models have become available on web-based model-sharing platforms, with many applications such as navigation, urban planning, and virtual reality. Based on the concept of data reuse, a 3D model retrieval system is proposed to retrieve building models similar to a user-specified query. The basic idea behind this system is to reuse existing 3D building models instead of reconstructing them from point clouds. To retrieve models efficiently, the models in the database are generally encoded compactly using a shape descriptor. However, most geometric descriptors in related works are applied to polygonal models. In this study, the input query of the model retrieval system is a point cloud acquired by a Light Detection and Ranging (LiDAR) system, owing to its efficient scene scanning and spatial information collection. Using point clouds, with their sparse, noisy, and incomplete sampling, as input queries is more difficult than using 3D models. Because the building roof is more informative than other parts in an airborne LiDAR point cloud, an image-based approach is proposed to encode both the point clouds from input queries and the 3D models in the database, so that both can be encoded consistently. Firstly, top-view depth images of buildings are generated to represent the geometry of a building roof. Secondly, geometric features are extracted from the depth images based on the heights, edges, and planes of the building. Finally, descriptors are extracted via spatial histograms and used in the 3D model retrieval system. For retrieval, models are matched by comparing the encoding coefficients of point clouds and building models. In experiments, a database of about 900,000 3D models collected from the Internet is used to evaluate retrieval. The results of the proposed method show a clear superiority over related methods.
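
The top-view encoding can be sketched directly: rasterize the roof points into a height image whose cells keep the maximum height, so that LiDAR queries and mesh-sampled database models are encoded the same way. A numpy sketch with an assumed grid resolution:

```python
import numpy as np

def topview_depth_image(points: np.ndarray, res: int = 64) -> np.ndarray:
    """Rasterize a roof point cloud into a top-view height image.

    `points` is an (N, 3) array; `res` is an assumed grid resolution.
    Each cell keeps the maximum normalized height falling into it.
    """
    xy = points[:, :2]
    z = points[:, 2] - points[:, 2].min()            # heights from ground
    mins, maxs = xy.min(axis=0), xy.max(axis=0)
    cells = np.floor((xy - mins) / (maxs - mins + 1e-9) * res).astype(int)
    cells = np.clip(cells, 0, res - 1)
    img = np.zeros((res, res))
    np.maximum.at(img, (cells[:, 1], cells[:, 0]), z)
    return img

# A simple spatial-histogram descriptor could then be the concatenated
# row/column histograms of this height image, matched by L2 distance.
```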



