An Approach to Tracking Deformable Hand Gesture for Real-Time Interaction

2007 ◽  
Vol 18 (10) ◽  
pp. 2423 ◽  
Author(s):  
Xi-Ying WANG


2021 ◽
Vol 12 ◽  
Author(s):  
Chengming Ma ◽  
Qian Liu ◽  
Yaqi Dang

This paper provides an in-depth study and analysis of human artistic poses through intelligently enhanced multimodal artistic pose recognition. A complementary network architecture for multimodal information based on motion energy is proposed. The network exploits both the rich appearance features provided by RGB data and the depth information provided by depth data, which is robust to changes in luminance and observation angle; multimodal fusion is accomplished through the complementary information of the two modalities. Moreover, to better model long-range temporal structure while accounting for action classes that share sub-actions, an energy-guided video segmentation method is employed. In the feature fusion stage, a cross-modal cross-fusion approach is proposed that enables the convolutional network to share local features of the two modalities in the shallow layers and to fuse global features in the deep convolutional layers by connecting the feature maps of multiple convolutional layers. First, a Kinect camera acquires color images of the human body, depth images, and the 3D coordinates of the skeletal points via the OpenPose open-source framework. Keyframes are then extracted automatically based on the distance between the hand and the head; relative distance features are extracted from the keyframes to describe the action, while local occupancy pattern features and HSV color-space features describe the object; finally, the features are fused to complete the complex action recognition task. To solve the consistency problem of virtual-reality fusion, the mapping between hand joint coordinates and the virtual scene is determined in the augmented reality setting, and a coordinate consistency model between the natural hand and the virtual model is established. The resulting system achieves real-time interaction between hand gestures and the virtual model, with an average gesture recognition accuracy of 99.04%, improving both the robustness and the real-time performance of hand gesture recognition.
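
As a rough, non-authoritative sketch of the cross-modal cross-fusion idea described above, the following PyTorch snippet exchanges shallow local features between an RGB stream and a depth stream, then concatenates deeper feature maps for global fusion. All layer sizes, names, and the classification head are illustrative assumptions, not the paper's actual architecture:

```python
import torch
import torch.nn as nn

class CrossModalFusion(nn.Module):
    """Minimal two-stream RGB+depth sketch with shallow and deep fusion.

    Hypothetical layer sizes; loosely follows the abstract's idea of
    sharing local features early and connecting feature maps late.
    """

    def __init__(self, num_classes=10):
        super().__init__()
        self.rgb_conv1 = nn.Conv2d(3, 32, 3, padding=1)    # RGB stem
        self.depth_conv1 = nn.Conv2d(1, 32, 3, padding=1)  # depth stem
        # Each second-stage conv receives both modalities' shallow features.
        self.rgb_conv2 = nn.Conv2d(64, 64, 3, stride=2, padding=1)
        self.depth_conv2 = nn.Conv2d(64, 64, 3, stride=2, padding=1)
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(128, num_classes)
        )

    def forward(self, rgb, depth):
        r1 = torch.relu(self.rgb_conv1(rgb))      # shallow RGB features
        d1 = torch.relu(self.depth_conv1(depth))  # shallow depth features
        # Shallow cross-fusion: each stream sees both modalities' local features.
        r2 = torch.relu(self.rgb_conv2(torch.cat([r1, d1], dim=1)))
        d2 = torch.relu(self.depth_conv2(torch.cat([d1, r1], dim=1)))
        # Deep global fusion: connect the feature maps of both streams.
        return self.head(torch.cat([r2, d2], dim=1))
```

Channel-wise concatenation is only one possible fusion operator; the paper's scheme of connecting multiple convolutional layers may differ in detail.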


2021 ◽  
Vol 11 (11) ◽  
pp. 5067
Author(s):  
Paulo Veloso Gomes ◽  
António Marques ◽  
João Donga ◽  
Catarina Sá ◽  
António Correia ◽  
...  

The interactivity of an immersive environment arises from the relationship established between the user and the system, which results in a set of data exchanges between human and technological actors. Real-time biofeedback devices collect the biodata generated by the user during the exhibition as it is produced. Analyzing, processing, and converting these biodata into multimodal data makes it possible to relate stimuli to the emotions they trigger. This work describes an adaptive model for managing biofeedback data flows in the design of interactive immersive systems. An affective algorithm identifies the types of emotions felt by the user and their respective intensities. Mapping stimuli to emotions creates a set of biodata that can be used as elements of interaction that readjust the stimuli generated by the system. The real-time interplay between the evolution of the user's emotional state and the stimuli generated by the system allows the user to adapt attitudes and behaviors to the situations they face.
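
The adaptive loop described above can be summarized in a short Python sketch. The signal names, thresholds, and the rule-based classifier below are purely hypothetical placeholders for the authors' affective algorithm:

```python
from dataclasses import dataclass

@dataclass
class EmotionEstimate:
    label: str        # e.g., "calm" or "anxious"
    intensity: float  # normalized to 0..1

def classify_affect(biodata: dict) -> EmotionEstimate:
    """Placeholder affective algorithm mapping biosignals to an emotion.

    A real system would use a trained model; this threshold rule on a
    hypothetical heart-rate field is purely illustrative.
    """
    hr = biodata.get("heart_rate", 70)
    if hr > 100:
        return EmotionEstimate("anxious", min((hr - 100) / 40, 1.0))
    return EmotionEstimate("calm", 1.0 - max(hr - 60, 0) / 40)

def biofeedback_step(read_biodata, render_stimuli):
    """One iteration of the stimulus -> emotion -> adapted-stimulus cycle."""
    biodata = read_biodata()            # real-time biosignals from the user
    emotion = classify_affect(biodata)  # emotion type and intensity
    render_stimuli(emotion)             # readjust the stimuli the system generates
    return emotion
```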


2021 ◽  
Vol 2021 (1) ◽  
Author(s):  
Samy Bakheet ◽  
Ayoub Al-Hamadi

Robust vision-based hand pose estimation is highly sought after but remains a challenging task, partly because of the inherent difficulty of self-occlusion among the fingers. In this paper, an innovative framework for real-time static hand gesture recognition is introduced, based on an optimized shape representation built from multiple shape cues. The framework incorporates a dedicated module for hand pose estimation from depth map data, in which the hand silhouette is first extracted from the highly detailed and accurate depth map captured by a time-of-flight (ToF) depth sensor. A hybrid multi-modal descriptor that integrates multiple affine-invariant boundary-based and region-based features is then computed from the hand silhouette to obtain a reliable and representative description of individual gestures. Finally, an ensemble of one-vs.-all support vector machines (SVMs) is trained independently on each of these learned feature representations to perform gesture classification. When evaluated on a publicly available dataset containing a relatively large and diverse collection of egocentric hand gestures, the approach yields encouraging results that compare favorably with those reported in the literature while maintaining real-time operation.
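
To make the pipeline concrete, here is a minimal Python sketch of the depth-based silhouette extraction and the one-vs.-all SVM stage, using scikit-learn. The depth thresholds and the toy boundary/region descriptor are assumptions standing in for the paper's affine-invariant feature set:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.multiclass import OneVsRestClassifier

def hand_silhouette(depth_map, near=200, far=600):
    """Segment the hand as the depth band nearest the ToF sensor.

    near/far are hypothetical millimeter thresholds; a real pipeline
    would localize the hand before thresholding.
    """
    return ((depth_map > near) & (depth_map < far)).astype(np.uint8)

def boundary_region_descriptor(mask, n_bins=32):
    """Toy hybrid descriptor: radial boundary histogram plus region stats."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    radii = np.hypot(ys - cy, xs - cx)           # boundary-style cue
    hist, _ = np.histogram(radii, bins=n_bins, density=True)
    region = [mask.mean(),                        # region-style cues
              radii.mean() / mask.shape[0],
              radii.std() / mask.shape[0]]
    return np.concatenate([hist, region])

# One-vs.-all SVM ensemble: OneVsRestClassifier trains one binary SVM
# per gesture class, mirroring the classification setup in the abstract.
clf = OneVsRestClassifier(SVC(kernel="rbf", probability=True))
# X = np.stack([boundary_region_descriptor(m) for m in masks]); clf.fit(X, labels)
```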


Author(s):  
Koichi Ishibuchi ◽  
Keisuke Iwasaki ◽  
Haruo Takemura ◽  
Fumio Kishino
