An Improved Estimation Algorithm of Space Targets Pose Based on Multi-Modal Feature Fusion

Mathematics ◽  
2021 ◽  
Vol 9 (17) ◽  
pp. 2085
Author(s):  
Jiang Hua ◽  
Tonglin Hao ◽  
Liangcai Zeng ◽  
Gui Yu

Traditional methods for estimating the pose of space targets rely on hand-crafted features to match the transformation between the image and the object model. With the rapid growth of deep learning, approaches based on deep neural networks (DNNs) have significantly improved pose estimation performance. However, current methods still suffer from complex computation, low accuracy, and poor real-time performance. This paper therefore proposes a new pose estimation algorithm. First, the mask image of the target is obtained by an instance segmentation algorithm, and its point cloud is obtained from a depth map combined with the camera parameters. Finally, correlations among points are established to predict the pose based on multi-modal feature fusion. Experimental results on the YCB-Video dataset show that the proposed algorithm can recognize complex images at about 24 images per second with an accuracy of more than 80%. In conclusion, the proposed algorithm achieves fast pose estimation for complex stacked objects and is stable across different objects.
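The step that turns a masked depth map into a point cloud can be sketched with the standard pinhole camera model. This is a minimal illustration, not the authors' implementation: the function name `depth_to_point_cloud` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumed names, and the depth values are assumed to be in metres.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_point_cloud(
    depth: Sequence[Sequence[float]],
    mask: Sequence[Sequence[int]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project masked depth pixels into 3D camera coordinates.

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); these names are illustrative, not from the paper.
    """
    points: List[Point] = []
    for v, (depth_row, mask_row) in enumerate(zip(depth, mask)):
        for u, (z, keep) in enumerate(zip(depth_row, mask_row)):
            # Skip pixels outside the instance mask or without valid depth.
            if not keep or z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Tiny 2x2 example with made-up intrinsics.
depth = [[0.0, 1.0], [2.0, 2.0]]
mask = [[1, 1], [0, 1]]
cloud = depth_to_point_cloud(depth, mask, fx=2.0, fy=2.0, cx=0.0, cy=0.0)
print(cloud)
```

Only pixels that are both inside the mask and have a positive depth contribute a point, so the size of the cloud depends on the segmentation quality as well as on the sensor.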


Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2828
Author(s):  
Mhd Rashed Al Koutayni ◽  
Vladimir Rybalkin ◽  
Jameel Malik ◽  
Ahmed Elhayek ◽  
Christian Weis ◽  
...  

The estimation of human hand pose has become the basis for many vital applications in which the user relies mainly on hand pose as a system input. Virtual reality (VR) headsets, the Shadow Dexterous Hand, and in-air signature verification are a few examples of applications that require tracking hand movements in real time. State-of-the-art 3D hand pose estimation methods are based on convolutional neural networks (CNNs). These methods are implemented on graphics processing units (GPUs), mainly because of their extensive computational requirements. However, GPUs are not suitable for practical application scenarios in which low power consumption is crucial. Furthermore, the difficulty of embedding a bulky GPU into a small device limits the portability of such applications to mobile devices. The goal of this work is to provide an energy-efficient solution for an existing depth-camera-based hand pose estimation algorithm. First, we compress the deep neural network model by applying dynamic quantization techniques to different layers to achieve maximum compression without compromising accuracy. Afterwards, we design a custom hardware architecture. We selected the FPGA as the target platform because FPGAs provide high energy efficiency and can be integrated into portable devices. Our solution, implemented on a Xilinx UltraScale+ MPSoC FPGA, is 4.2× faster and 577.3× more energy efficient than the original implementation of the hand pose estimation algorithm on an NVIDIA GeForce GTX 1070.
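To make the compression idea concrete, the sketch below shows plain symmetric uniform quantization of one layer's weights to a chosen bit width. It is a generic technique swapped in for illustration; the abstract does not describe the authors' dynamic quantization scheme in detail, and the names `quantize_symmetric` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def quantize_symmetric(weights: List[float], num_bits: int = 8) -> Tuple[List[int], float]:
    """Map float weights to signed integers in [-qmax, qmax] with one shared scale."""
    qmax = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero layer, where any scale works.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights from the integers and the scale."""
    return [q * scale for q in quantized]


# Illustrative weights only; real layers hold thousands of values.
layer_weights = [-0.42, 0.0, 0.13, 1.27]
q_weights, q_scale = quantize_symmetric(layer_weights, num_bits=8)
restored = dequantize(q_weights, q_scale)
print(q_weights, q_scale, restored)
```

Choosing `num_bits` per layer trades model size against reconstruction error, which is bounded by half the scale for every weight in this scheme.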



2019 ◽  
Vol 2 (1) ◽  
pp. 1
Author(s):  
Jamal Firmat Banzi ◽  
Isack Bulugu ◽  
Zhongfu Ye

Recent hand pose estimation methods require large amounts of annotated training data to extract the dynamic information from a hand representation. Nevertheless, precise and dense annotation of real data is difficult to come by, and the amount of information passed to the training algorithm is significantly higher. This paper presents an approach to developing a hand pose estimation system that can accurately regress a 3D pose in an unsupervised manner. The whole process is performed in three stages. First, the hand is modelled by a novel latent tree dependency model (LTDM), which transforms internal joint locations into an explicit representation. Second, we perform predictive coding of image sequences of hand poses to capture the latent features underlying a given image without supervision. A mapping is then performed between the image depth and a generated representation. Third, the hand joints are regressed using convolutional neural networks to estimate the latent pose given a depth map. Finally, an unsupervised error term, which is part of the recurrent architecture, ensures smooth estimates of the final pose. To demonstrate the performance of the proposed system, a complete experiment is conducted on three challenging public datasets: ICVL, MSRA, and NYU. The empirical results show that the performance of our method is comparable to or better than that of state-of-the-art approaches.





2020 ◽  
Vol 12 (23) ◽  
pp. 3857
Author(s):  
Junjie Xu ◽  
Bin Song ◽  
Xi Yang ◽  
Xiaoting Nan

On-board pose estimation of uncooperative targets is an essential capability for close-proximity formation flying missions, on-orbit servicing, active debris removal, and space exploration. However, this research faces two main issues. First, traditional pose determination algorithms suffer from a semantic gap and poor generalization. Second, specific pose information cannot be accurately known in a complicated space target imaging environment. Deep learning methods can effectively address these problems; thus, we propose a pose estimation algorithm based on deep learning. We use a keypoint detection method to estimate the pose of space targets. For complicated space target imaging environments, we combine the high-resolution network with dilated convolution and an online hard keypoint mining strategy. The improved network pays more attention to obscured keypoints, has a larger receptive field, and improves detection accuracy. Extensive experiments demonstrate that the proposed algorithm effectively reduces the error rate of pose estimation and that, compared with related pose estimation methods, the proposed model achieves higher detection accuracy and a lower pose determination error rate on the SPEED dataset.
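Online hard keypoint mining is commonly implemented by computing a loss per keypoint and back-propagating only the largest ones. The sketch below illustrates that idea on flattened heatmaps; it assumes a mean-squared-error loss and a hypothetical `top_k` parameter, and it is not taken from the paper.

```python
from typing import List, Sequence


def ohkm_loss(
    pred_heatmaps: Sequence[Sequence[float]],
    target_heatmaps: Sequence[Sequence[float]],
    top_k: int,
) -> float:
    """Average the MSE of the top_k hardest keypoints (largest per-keypoint loss).

    Each heatmap is given as a flat sequence of pixel values, one per keypoint.
    """
    per_keypoint: List[float] = []
    for pred, target in zip(pred_heatmaps, target_heatmaps):
        squared = [(p - t) ** 2 for p, t in zip(pred, target)]
        per_keypoint.append(sum(squared) / len(squared))
    # Keep only the keypoints the network currently predicts worst.
    hardest = sorted(per_keypoint, reverse=True)[:top_k]
    return sum(hardest) / len(hardest)


# Three keypoints with two-pixel "heatmaps"; values are illustrative.
pred = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5]]
target = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
loss = ohkm_loss(pred, target, top_k=2)
print(loss)
```

Because easy keypoints drop out of the average, the gradient concentrates on occluded or ambiguous keypoints, which matches the stated goal of paying more attention to obscured keypoints.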



Symmetry ◽  
2020 ◽  
Vol 12 (10) ◽  
pp. 1636
Author(s):  
Yiqi Wu ◽  
Shichao Ma ◽  
Dejun Zhang ◽  
Jun Sun

Hand pose estimation from 3D data is a key challenge in computer vision as well as an essential step for human–computer interaction. A lot of deep learning-based hand pose estimation methods have made significant progress but give less consideration to the inner interactions of input data, especially when consuming hand point clouds. Therefore, this paper proposes an end-to-end capsule-based hand pose estimation network (Capsule-HandNet), which processes hand point clouds directly with the consideration of structural relationships among local parts, including symmetry, junction, relative location, etc. Firstly, an encoder is adopted in Capsule-HandNet to extract multi-level features into the latent capsule by dynamic routing. The latent capsule represents the structural relationship information of the hand point cloud explicitly. Then, a decoder recovers a point cloud to fit the input hand point cloud via a latent capsule. This auto-encoder procedure is designed to ensure the effectiveness of the latent capsule. Finally, the hand pose is regressed from the combined feature, which consists of the global feature and the latent capsule. The Capsule-HandNet is evaluated on public hand pose datasets under the metrics of the mean error and the fraction of frames. The mean joint errors of Capsule-HandNet on MSRA and ICVL datasets reach 8.85 mm and 7.49 mm, respectively, and Capsule-HandNet outperforms the state-of-the-art methods on most thresholds under the fraction of frames metric. The experimental results demonstrate the effectiveness of Capsule-HandNet for 3D hand pose estimation.
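The two evaluation metrics named above can be sketched as follows. This assumes the usual definitions in the hand pose literature: mean joint error is the average Euclidean distance over all joints and frames, and the fraction of frames metric counts frames whose worst joint error stays within a threshold. Function names and the toy data are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Joint = Tuple[float, float, float]
Frame = Sequence[Joint]


def joint_errors(pred: Frame, truth: Frame) -> List[float]:
    """Euclidean distance for each joint in one frame."""
    return [math.dist(p, t) for p, t in zip(pred, truth)]


def mean_joint_error(preds: Sequence[Frame], truths: Sequence[Frame]) -> float:
    """Average per-joint error over every joint in every frame."""
    errors = [e for p, t in zip(preds, truths) for e in joint_errors(p, t)]
    return sum(errors) / len(errors)


def fraction_of_frames(preds: Sequence[Frame], truths: Sequence[Frame], threshold: float) -> float:
    """Share of frames in which all joints are within the threshold."""
    good = sum(
        1 for p, t in zip(preds, truths) if max(joint_errors(p, t)) <= threshold
    )
    return good / len(preds)


# Two frames with two joints each; units are assumed to be millimetres.
truths = [[(0, 0, 0), (0, 0, 0)], [(0, 0, 0), (0, 0, 0)]]
preds = [[(3, 4, 0), (0, 0, 0)], [(0, 0, 1), (0, 0, 2)]]
mje = mean_joint_error(preds, truths)
frac = fraction_of_frames(preds, truths, threshold=2.0)
print(mje, frac)
```

The fraction of frames metric is stricter than the mean error because a single badly placed joint fails the whole frame, which is why results are usually reported as a curve over thresholds.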



Author(s):  
Annapoorani Gopal ◽  
Lathaselvi Gandhimaruthian ◽  
Javid Ali

Deep neural networks have gained prominence in the biomedical domain, becoming the most commonly used networks after machine learning technology. Mammograms can be used to detect breast cancer with high precision with the help of a convolutional neural network (CNN), a deep learning technology. Training a CNN from scratch requires an exhaustive labeled dataset. This requirement can be eased by deploying a generative adversarial network (GAN), which needs comparatively less training data for mammogram screening. In this study, the application of GANs to breast density estimation, high-resolution mammogram synthesis for clustered microcalcification analysis, effective breast tumor segmentation, breast tumor shape analysis, feature extraction, and image augmentation during mammogram classification is extensively reviewed.



Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1579
Author(s):  
Dongqi Wang ◽  
Qinghua Meng ◽  
Dongming Chen ◽  
Hupo Zhang ◽  
Lisheng Xu

Automatic detection of arrhythmia is of great significance for the early prevention and diagnosis of cardiovascular disease. Traditional feature engineering methods based on expert knowledge lack multidimensional and multi-view information abstraction and data representation ability, so traditional pattern recognition research on arrhythmia detection cannot achieve satisfactory results. Recently, with the rise of deep learning technology, automatic feature extraction from ECG data based on deep neural networks has been widely discussed. To exploit the complementary strengths of different schemes, in this paper we propose an arrhythmia detection method based on the multi-resolution representation (MRR) of ECG signals. This method uses four different up-to-date deep neural networks as four channel models for learning ECG vector representations. The deep learning-based representations, together with hand-crafted ECG features, form the MRR, which is the input to the downstream classification strategy. Experimental results for multi-label classification on a large ECG dataset confirm that the F1 score of the proposed method is 0.9238, which is 1.31%, 0.62%, 1.18%, and 0.6% higher than that of each channel model. From an architectural perspective, the proposed method is highly scalable and can serve as an example for arrhythmia recognition.



2021 ◽  
Vol 10 (3) ◽  
pp. 157
Author(s):  
Paul-Mark DiFrancesco ◽  
David A. Bonneau ◽  
D. Jean Hutchinson

Key to the quantification of rockfall hazard is an understanding of its magnitude-frequency behaviour. Remote sensing has allowed for the accurate observation of rockfall activity, with methods being developed for digitally assembling the monitored occurrences into a rockfall database. A prevalent challenge is the quantification of rockfall volume while fully considering the 3D information stored in each of the extracted rockfall point clouds. Surface reconstruction is used to construct a 3D digital surface representation, allowing for an estimation of the volume of space that a point cloud occupies. Given various point cloud imperfections, it is difficult for methods to generate digital surface representations of rockfall with detailed geometry and correct topology. In this study, we tested four different computational geometry-based surface reconstruction methods on a database comprising 3668 rockfalls. The database was derived from a 5-year LiDAR monitoring campaign of an active rock slope in interior British Columbia, Canada. Each method resulted in a different magnitude-frequency distribution of rockfall. The implications of 3D volume estimation were demonstrated using surface mesh visualization, cumulative magnitude-frequency plots, power-law fitting, and projected annual frequencies of rockfall occurrence. The 3D volume estimation methods caused a notable shift in the magnitude-frequency relations, while the power-law scaling parameters remained relatively similar. We determined that the optimal 3D volume calculation approach is a hybrid methodology combining the Power Crust reconstruction and the Alpha Solid reconstruction. The Alpha Solid approach is to be used on small-scale point clouds, characterized by high curvatures relative to their sampling density, which challenge the Power Crust sampling assumptions.
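The power-law fitting mentioned above can be illustrated with an ordinary least-squares fit in log-log space, assuming the cumulative magnitude-frequency relation has the form N(V) = a · V^(-b). This is a generic sketch with synthetic data; the function name `fit_power_law` is hypothetical and the study's actual fitting procedure is not specified in the abstract.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(volumes: Sequence[float], cumulative_counts: Sequence[float]) -> Tuple[float, float]:
    """Fit N(V) = a * V**(-b) by linear regression on log10 values.

    Returns (a, b). All inputs must be strictly positive.
    """
    xs = [math.log10(v) for v in volumes]
    ys = [math.log10(n) for n in cumulative_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # The slope in log-log space is -b; the intercept is log10(a).
    return 10 ** intercept, -slope


# Synthetic data generated from a = 100, b = 0.5 (not from the study).
volumes = [0.01, 0.1, 1.0, 10.0]
counts = [100.0 * v ** -0.5 for v in volumes]
a, b = fit_power_law(volumes, counts)
print(a, b)
```

Since different volume estimation methods shift every V, they move the fitted curve along the volume axis while the exponent b can stay nearly the same, which is consistent with the reported result.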



Sensors ◽  
2021 ◽  
Vol 21 (12) ◽  
pp. 4068
Author(s):  
Zheshuo Zhang ◽  
Jie Zhang ◽  
Jiawen Dai ◽  
Bangji Zhang ◽  
Hengmin Qi

Vehicle parameters are essential for dynamic analysis and control systems. One problem with current vehicle parameter estimation algorithms is that real-time estimation methods identify only some of the vehicle parameters, whereas other parameters, such as suspension damping coefficients and suspension and tire stiffnesses, are assumed to be known in advance by means of an inertial parameter measurement device (IPMD). In this study, a fusion algorithm is proposed for identifying comprehensive vehicle parameters without the help of an IPMD, and vehicle parameters are divided into time-independent parameters (TIPs) and time-dependent parameters (TDPs) based on whether they change over time. TIPs are identified by a hybrid-mass state-variable (HMSV). A dual unscented Kalman filter (DUKF) is applied to update both TDPs and online states. An experiment is conducted on a real two-axle vehicle, and the test data are used to estimate both TIPs and TDPs to validate the accuracy of the proposed algorithm. Numerical simulations are performed to further investigate the algorithm's performance with respect to sprung mass variation, model error due to linearization, and various road conditions. The results from both the experiment and the simulations show that the proposed algorithm can estimate TIPs and update TDPs and online states with high accuracy and quick convergence, without requiring road information.


