human action recognition
Recently Published Documents


Total documents: 1846 (five years: 614)
H-index: 57 (five years: 10)

ETRI Journal ◽  
2022 ◽  
Author(s):  
Nudrat Nida ◽  
Muhammad Haroon Yousaf ◽  
Aun Irtaza ◽  
Sergio A. Velastin

2022 ◽  
Vol 2022 ◽  
pp. 1-18
Author(s):  
Chao Tang ◽  
Anyang Tong ◽  
Aihua Zheng ◽  
Hua Peng ◽  
Wei Li

Traditional human action recognition (HAR) methods are based on RGB video. Recently, with the introduction of Microsoft Kinect and other consumer-grade depth cameras, HAR based on RGB-D (RGB-Depth) data has drawn increasing attention from researchers and industry. Compared with the traditional approach, RGB-D-based HAR offers higher accuracy and stronger robustness. This paper proposes a selective ensemble support vector machine that fuses multimodal features for human action recognition. The algorithm combines improved HOG features from the RGB modality, depth motion map-based local binary pattern features (DMM-LBP), and hybrid joint features (HJF) from the joints modality. In addition, a frame-based selective ensemble support vector machine classification model (SESVM) is proposed, which integrates the selective ensemble strategy with the selection of SVM base classifiers, thus increasing the diversity among the base classifiers. Experimental results on public datasets show that the proposed method is simple, fast, and efficient in comparison with other action recognition algorithms.
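The selective ensemble idea in this abstract can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so this example assumes base classifiers are ranked by validation accuracy and the best few are combined by majority vote. The per-modality classifiers (`rgb_clf`, `depth_clf`, `joints_clf`, `constant_clf`) are hypothetical threshold rules standing in for trained SVMs, not the paper's models.

```python
from collections import Counter

def accuracy(predict, samples, labels):
    """Fraction of samples for which predict(sample) equals the label."""
    correct = sum(1 for x, y in zip(samples, labels) if predict(x) == y)
    return correct / len(labels)

def select_classifiers(classifiers, val_samples, val_labels, keep):
    """Keep the `keep` classifiers with the highest validation accuracy.

    Assumption: accuracy-based ranking; the paper's criterion may differ.
    """
    ranked = sorted(
        classifiers,
        key=lambda clf: accuracy(clf, val_samples, val_labels),
        reverse=True,
    )
    return ranked[:keep]

def ensemble_predict(classifiers, sample):
    """Majority vote over the selected base classifiers."""
    votes = Counter(clf(sample) for clf in classifiers)
    return votes.most_common(1)[0][0]

# Hypothetical stand-ins for trained per-modality SVMs.
def rgb_clf(s):
    return "wave" if s["rgb"] > 0.5 else "sit"

def depth_clf(s):
    return "wave" if s["depth"] > 0.5 else "sit"

def joints_clf(s):
    return "wave" if s["joints"] > 0.5 else "sit"

def constant_clf(s):
    return "wave"  # a weak classifier that ignores its input
```

Selecting before voting means a weak base classifier such as `constant_clf` can be dropped instead of diluting the vote.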


2022 ◽  
pp. 116424
Author(s):  
Arya Sarkar ◽  
Avinandan Banerjee ◽  
Pawan Kumar Singh ◽  
Ram Sarkar

Micromachines ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 72
Author(s):  
Dengshan Li ◽  
Rujing Wang ◽  
Peng Chen ◽  
Chengjun Xie ◽  
Qiong Zhou ◽  
...  

Video object and human action detection are applied in many fields, such as video surveillance and face recognition. Video object detection includes object classification and object localization within the frame. Human action recognition is the detection of human actions. Video detection is usually more challenging than image detection, since video frames are often blurrier than still images. Video detection also faces other difficulties, such as defocus, motion blur, and partial occlusion. Current video detection technology can achieve real-time detection or high-accuracy detection on blurry video frames. In this paper, various video object and human action detection approaches are reviewed and discussed, many of which have achieved state-of-the-art results. We mainly review and discuss classic video detection methods based on supervised learning. In addition, frequently used video object detection and human action recognition datasets are reviewed. Finally, a summary of video detection is presented: video object and human action detection methods can be classified into frame-by-frame (frame-based) detection, key-frame-extraction detection, and temporal-information-based detection; the methods that exploit temporal information from adjacent video frames are mainly optical flow, Long Short-Term Memory, and convolution across adjacent frames.
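As a rough illustration of the key-frame-extraction category named in this abstract, the sketch below selects a frame whenever it differs enough from the last selected key frame. The mean-absolute-difference criterion and the `threshold` parameter are assumptions chosen for simplicity; the review does not prescribe a specific rule. Frames are represented as nested lists of grayscale values.

```python
def mean_abs_diff(frame_a, frame_b):
    """Mean absolute pixel difference between two equally sized frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += abs(a - b)
            count += 1
    return total / count

def select_key_frames(frames, threshold):
    """Return indices of frames that differ from the previous key frame
    by more than `threshold` (the first frame is always a key frame)."""
    if not frames:
        return []
    key_indices = [0]
    for i in range(1, len(frames)):
        last_key = frames[key_indices[-1]]
        if mean_abs_diff(frames[i], last_key) > threshold:
            key_indices.append(i)
    return key_indices
```

A detector would then run only on the selected indices, trading some temporal resolution for lower computation compared with frame-by-frame detection.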


2021 ◽  
Author(s):  
Shibin Xuan ◽  
Kuan Wang ◽  
Lixia Liu ◽  
Chang Liu ◽  
Jiaxiang Li

Skeleton-based human action recognition has been a research hotspot in recent years, but most research focuses on spatio-temporal feature extraction with convolutional neural networks. To improve the recognition accuracy of these models, this paper proposes three strategies: using an algebraic method to remove redundant video frames, adding auxiliary edges to the joint adjacency graph to improve the skeleton graph structure, and adding virtual classes to disperse recognition errors. Experimental results on the NTU-RGB-D60, NTU-RGB-D120 and Kinetics Skeleton 400 databases show that the proposed strategies can effectively improve the accuracy of the original algorithm.
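The effect of adding auxiliary edges to a joint adjacency graph can be sketched as follows. The seven-joint toy skeleton, its joint indices, and the choice of a hand-to-hand auxiliary edge are hypothetical; the abstract does not say which edges the authors add. The example only shows that an extra edge shortens the graph distance between joints that are far apart along the bones.

```python
from collections import deque

def build_adjacency(num_joints, edges):
    """Symmetric adjacency matrix with self-loops for an undirected skeleton graph."""
    adj = [[0] * num_joints for _ in range(num_joints)]
    for i in range(num_joints):
        adj[i][i] = 1
    for a, b in edges:
        adj[a][b] = 1
        adj[b][a] = 1
    return adj

def hop_distance(adj, src, dst):
    """Number of edges on the shortest path from src to dst (breadth-first search)."""
    seen = {src}
    queue = deque([(src, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == dst:
            return dist
        for nxt, connected in enumerate(adj[node]):
            if connected and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # dst is unreachable

# Hypothetical toy skeleton: 0 torso, 1-3 left shoulder/elbow/hand,
# 4-6 right shoulder/elbow/hand.
BONES = [(0, 1), (1, 2), (2, 3), (0, 4), (4, 5), (5, 6)]
AUXILIARY = [(3, 6)]  # assumed extra edge linking the two hands

base_graph = build_adjacency(7, BONES)
augmented_graph = build_adjacency(7, BONES + AUXILIARY)
```

In a graph-based model, information travels one hop per layer, so a shorter hop distance lets related joints influence each other in fewer layers.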


Sensors ◽  
2021 ◽  
Vol 21 (24) ◽  
pp. 8309
Author(s):  
Inwoong Lee ◽  
Doyoung Kim ◽  
Dongyoon Wee ◽  
Sanghoon Lee

In recent years, human action recognition has been studied by many computer vision researchers. Recent studies have used two-stream networks that combine appearance and motion features, but most of these approaches focus on clip-level video action recognition. In contrast to traditional methods, which generally use entire images, we propose a new human instance-level video action recognition framework. In this framework, we represent instance-level features using human boxes and keypoints, and our action region features are used as inputs to the temporal action head network, which makes our framework more discriminative. We also propose novel temporal action head networks consisting of various modules that capture a range of temporal dynamics. In the experiments, the proposed models achieve performance comparable to state-of-the-art approaches on two challenging datasets. Furthermore, we evaluate the proposed features and networks to verify their effectiveness. Finally, we analyze the confusion matrix and visualize the recognized actions at the human instance level when several people are present.
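One simple way to turn a person box and its keypoints into a per-instance vector is sketched below. This is a purely geometric illustration and an assumption on our part: the abstract does not describe how its action region features are computed, and the function names `normalize_keypoints` and `instance_feature` are hypothetical.

```python
def normalize_keypoints(box, keypoints):
    """Map image-space keypoints into the unit square of a person box.

    box is (x1, y1, x2, y2); keypoints is a list of (x, y) pairs.
    """
    x1, y1, x2, y2 = box
    width, height = x2 - x1, y2 - y1
    if width <= 0 or height <= 0:
        raise ValueError("box must have positive width and height")
    return [((x - x1) / width, (y - y1) / height) for x, y in keypoints]

def instance_feature(box, keypoints):
    """Concatenate box geometry and box-normalized keypoints into one flat vector."""
    x1, y1, x2, y2 = box
    feature = [x1, y1, x2 - x1, y2 - y1]
    for nx, ny in normalize_keypoints(box, keypoints):
        feature.extend([nx, ny])
    return feature
```

Normalizing keypoints by the box makes the pose part of the vector independent of where the person stands in the image, while the box terms keep the location available.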


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Zaosheng Ma

Smart cultural tourism is the development trend of the future tourism industry, and virtual reality is an important tool for realizing it. The realism of virtual reality mainly comes from human-computer interaction, which is closely related to human action recognition technology. This research therefore takes human action recognition as its direction: it uses a self-organizing map (SOM) neural network to extract key frames from action videos, combines this with a multi-feature vector method to recognize human actions, and compares the recognition rate and user satisfaction of different recognition methods. The results show that the recognition rate of the multi-feature voting human action recognition algorithm based on the SOM neural network is 93.68% on UT-Kinect action and 59.06% on MSRDailyActivity3D, and the overall action recognition time is only 3.59 s. Within six months, the total profit of a human-computer interactive virtual reality tourism project with the SOM neural network multi-feature vector method as its core algorithm reached 422,000 yuan, and 88% of users expressed satisfaction after use. This shows that the proposed method has a good recognition rate and can give users effective and timely feedback. It is hoped that this research has reference value in promoting the development of human action recognition technology.
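Key-frame extraction with a self-organizing map can be sketched in a few lines. The abstract gives no SOM configuration, so this is a generic 1-D SOM under stated assumptions: deterministic initialization from evenly spaced frames, a linearly decaying learning rate, a Gaussian neighborhood, and key frames defined as the frames closest to each trained node. Each frame is assumed to be already reduced to a small feature vector.

```python
import math

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_som(samples, num_nodes, epochs=20, lr0=0.5, sigma=0.5):
    """Train a 1-D SOM and return the list of node weight vectors."""
    # Assumption: initialize nodes from evenly spaced samples (no randomness).
    step = (len(samples) - 1) / max(num_nodes - 1, 1)
    weights = [list(samples[round(i * step)]) for i in range(num_nodes)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)  # linearly decaying learning rate
        for sample in samples:
            # Best matching unit: the node closest to the current sample.
            bmu = min(range(num_nodes),
                      key=lambda n: squared_distance(weights[n], sample))
            for n in range(num_nodes):
                # Gaussian neighborhood over node positions on the 1-D map.
                influence = math.exp(-((n - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(len(sample)):
                    weights[n][d] += lr * influence * (sample[d] - weights[n][d])
    return weights

def key_frames(samples, weights):
    """Indices of the frames nearest to each SOM node, sorted and deduplicated."""
    picked = {
        min(range(len(samples)), key=lambda i: squared_distance(samples[i], w))
        for w in weights
    }
    return sorted(picked)
```

With two nodes and a video whose frame features form two groups, each node settles near one group, so one key frame is drawn from each.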


2021 ◽  
Vol 11 (23) ◽  
pp. 11481
Author(s):  
Junjie Chen ◽  
Wei Yang ◽  
Chenqi Liu ◽  
Leiyue Yao

In recent years, skeleton-based human action recognition (HAR) approaches using convolutional neural network (CNN) models have made tremendous progress in computer vision applications. However, depicting human actions with relative features while preventing overfitting when the CNN model is trained on few samples remains a challenge. In this paper, a new motion image is introduced to transform spatial-temporal motion information into image-based representations. For each skeleton sequence, three relative features are extracted to describe human actions: relative coordinates, immediate displacement, and immediate motion orientation. In particular, the relative coordinates introduced in our paper not only depict the spatial relations of human skeleton joints but also provide long-term temporal information. To address the problem of small sample sizes, a data augmentation strategy consisting of three simple but effective methods is proposed to expand the training set. Because the generated color images are small, a shallow CNN model is suitable for extracting deep features from the generated motion images. Two small-scale but challenging skeleton datasets were used to evaluate the method, which scored 96.59% and 97.48% on the Florence 3D Actions dataset and UTkinect-Action 3D dataset, respectively. The results show that the proposed method achieves competitive performance compared with state-of-the-art methods. Furthermore, the augmentation strategy proposed in this paper effectively mitigates the overfitting problem and can be widely adopted in skeleton-based action recognition.
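The three relative features named in this abstract can be illustrated with a minimal 2-D sketch. The exact definitions are assumptions, since the abstract does not give formulas: relative coordinates are taken with respect to a root joint in the same frame, immediate displacement is the per-joint difference between consecutive frames, and immediate motion orientation is the angle of that displacement. A skeleton sequence is a list of frames, each a list of `(x, y)` joint positions.

```python
import math

def relative_coordinates(sequence, root=0):
    """Joint positions relative to the root joint of the same frame (assumed definition)."""
    return [
        [(x - frame[root][0], y - frame[root][1]) for x, y in frame]
        for frame in sequence
    ]

def immediate_displacement(sequence):
    """Per-joint displacement between each pair of consecutive frames."""
    return [
        [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(prev, curr)]
        for prev, curr in zip(sequence, sequence[1:])
    ]

def immediate_orientation(sequence):
    """Angle in radians of each joint's displacement between consecutive frames."""
    return [
        [math.atan2(dy, dx) for dx, dy in frame]
        for frame in immediate_displacement(sequence)
    ]
```

A sequence of T frames yields T frames of relative coordinates but only T - 1 frames of displacement and orientation, which matters when the three features are stacked into one image.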

