Semi-Supervised Image-to-Video Adaptation for Video Action Recognition

Author(s):  
Rohan Munshi

Given a sequence of images, i.e. a video, the task of action recognition is to identify the closest matching action among the action sequences learned by the system. Such human action recognition is based on evidence gathered from videos. It has many applications, including surveillance, video indexing, biometrics, telehealth, and human-computer interaction. Vision-based human activity recognition is plagued by numerous challenges, including viewpoint changes, occlusion, variation in execution rate, camera motion, and background clutter. In this survey, we provide an overview and assessment of the existing methods based on their ability to handle these challenges, how well they generalize, and their ability to detect abnormal actions. Such a systematic classification can help researchers identify the methods available to address each of these challenges, along with their limitations. In addition, we identify the public datasets and the challenges posed by them. From this survey, we draw conclusions regarding how well each challenge has been resolved, and we identify potential research areas that require further work.

2020 ◽  
pp. 1202-1214
Author(s):  
Riyadh Sahib Abdul Ameer ◽  
Mohammed Al-Taei

Human action recognition has gained popularity because of its wide applicability, such as in patient monitoring systems, surveillance systems, and a wide diversity of systems that involve interactions between people and electrical devices, including human-computer interfaces. The proposed method includes sequential stages of object segmentation, feature extraction, action detection, and then action recognition. Obtaining effective results for human actions using different features of unconstrained videos is a challenging task due to camera motion, cluttered backgrounds, occlusions, the complexity of human movements, and the variety of ways the same action is performed by distinct subjects. The proposed method overcomes such problems by fusing features to develop a powerful human action descriptor. This descriptor is then quantized into a visual word vocabulary (or codebook), which yields a Bag-of-Words representation. The True Positive Rate (TPR) and False Positive Rate (FPR) measures give a reliable indication of the proposed HAR system's behavior, and the computed Accuracy (Ar) and Error (misclassification) Rate (Er) reveal the effectiveness of the system on the dataset used.
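The evaluation measures named above follow directly from per-class confusion counts. A minimal sketch (the counts are invented for illustration):

```python
# Compute TPR, FPR, Accuracy (Ar), and Error Rate (Er) from the four
# confusion-matrix counts of a binary (one-vs-rest) classification.
def rates(tp, fp, tn, fn):
    tpr = tp / (tp + fn)                    # True Positive Rate (recall)
    fpr = fp / (fp + tn)                    # False Positive Rate
    ar = (tp + tn) / (tp + fp + tn + fn)    # Accuracy
    er = 1.0 - ar                           # Error (misclassification) rate
    return tpr, fpr, ar, er

# Example: 100 test clips for one action class.
tpr, fpr, ar, er = rates(tp=45, fp=5, tn=40, fn=10)
```

For a multi-class recognizer, these rates are typically computed per class (treating each action as the positive class in turn) and then averaged.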


2021 ◽  
Vol 104 (2) ◽  
pp. 003685042110054
Author(s):  
Cherry A. Aly ◽  
Fazly S. Abas ◽  
Goh H. Ann

Introduction: Action recognition is a challenging time series classification task that has received much attention in the recent past due to its importance in critical applications, such as surveillance, visual behavior study, topic discovery, security, and content retrieval. Objectives: The main objective of the research is to develop robust, high-performance human action recognition techniques. A combination of local and holistic feature extraction methods is used, analyzing which features are most effective to extract, followed by simple, high-performance machine learning algorithms. Methods: This paper presents three robust action recognition techniques based on a series of image analysis methods to detect activities in different scenes. The general scheme architecture consists of shot boundary detection, shot frame rate re-sampling, and compact feature vector extraction. This process emphasizes variations and extracts strong patterns in the feature vectors before classification. Results: The proposed schemes are tested on datasets with cluttered backgrounds, low- or high-resolution videos, different viewpoints, and different camera motion conditions, namely, the Hollywood-2, KTH, UCF11 (YouTube actions), and Weizmann datasets. The proposed schemes yield highly accurate video analysis results compared to those of other works on these four widely used datasets. The First, Second, and Third Schemes provide recognition accuracies of 57.8%, 73.6%, and 52.0% on Hollywood-2; 94.5%, 97.0%, and 59.3% on KTH; 94.5%, 95.6%, and 94.2% on UCF11; and 98.9%, 97.8%, and 100% on Weizmann. Conclusion: Each of the proposed schemes provides high recognition accuracy compared to other state-of-the-art methods, especially the Second Scheme, which gives excellent results comparable to other benchmarked approaches.
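The first pipeline stage above, shot boundary detection, is commonly implemented by thresholding the histogram difference between consecutive frames. The abstract does not give the authors' exact method, so the following is an illustrative sketch under that assumption, with frames represented as flat lists of 8-bit intensities:

```python
# Detect shot boundaries (hard cuts) by thresholding the normalized
# histogram difference between consecutive frames.
def histogram(frame, bins=16):
    h = [0] * bins
    for p in frame:                          # p in [0, 255]
        h[min(p * bins // 256, bins - 1)] += 1
    return h

def shot_boundaries(frames, threshold=0.5):
    cuts = []
    for i in range(1, len(frames)):
        h1, h2 = histogram(frames[i - 1]), histogram(frames[i])
        n = len(frames[i])
        # Normalized L1 histogram distance, in [0, 1].
        diff = sum(abs(a - b) for a, b in zip(h1, h2)) / (2 * n)
        if diff > threshold:                 # abrupt change => new shot
            cuts.append(i)
    return cuts

# Two synthetic "shots": three dark frames followed by three bright ones.
frames = [[10] * 100] * 3 + [[200] * 100] * 3
cuts = shot_boundaries(frames)               # boundary at frame index 3
```

Real systems refine this with adaptive thresholds and gradual-transition handling, but the thresholded-difference core is the same.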


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Meng Li ◽  
Qiumei Sun

Smart homes have become central in the sustainability of buildings. Recognizing human activity in smart homes is the key tool to achieve home automation. Recently, two-stream Convolutional Neural Networks (CNNs) have shown promising performance for video-based human action recognition. However, such models cannot act directly on 3D skeletal sequences because they are limited to 2D image and video inputs. Considering the power of 3D skeletal data for describing human activity, in this study we present a novel method to recognize skeletal human activity in sustainable smart homes using a CNN fusion model. Our proposed method represents the spatiotemporal information of each 3D skeletal sequence as three images and three image sequences through gray-value encoding, referred to as skeletal trajectory shape images (STSIs) and skeletal pose image (SPI) sequences, and builds a CNN fusion model that takes the three STSIs and three SPI sequences as input for skeletal activity recognition. The three STSIs and three SPI sequences are generated in three orthogonal planes, making them complementary to each other. The proposed CNN fusion model allows the hierarchical learning of spatiotemporal features, offering better action recognition performance. Experimental results on three public datasets show that our method outperforms state-of-the-art methods.
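The core encoding idea, projecting each 3D joint onto the three orthogonal planes and mapping the normalized coordinates to gray values, can be sketched as follows. The details (normalization per plane, 8-bit gray range) are assumptions for illustration, not the paper's exact procedure:

```python
# Project a 3D skeleton onto one orthogonal plane and encode the
# normalized 2D coordinates as gray values in [0, 255].
def encode_plane(skeleton, axes):
    # skeleton: list of (x, y, z) joints; axes: pair of axis indices,
    # e.g. (0, 1) for the xy plane.
    pts = [(j[axes[0]], j[axes[1]]) for j in skeleton]
    lo0, hi0 = min(p[0] for p in pts), max(p[0] for p in pts)
    lo1, hi1 = min(p[1] for p in pts), max(p[1] for p in pts)

    def gray(v, lo, hi):
        return 0 if hi == lo else round(255 * (v - lo) / (hi - lo))

    return [(gray(a, lo0, hi0), gray(b, lo1, hi1)) for a, b in pts]

# Toy 3-joint skeleton; the three planes (xy, yz, xz) give three
# complementary encodings of the same pose.
skeleton = [(0.0, 1.7, 0.2), (0.1, 1.4, 0.25), (0.0, 0.9, 0.2)]
planes = [encode_plane(skeleton, ax) for ax in ((0, 1), (1, 2), (0, 2))]
```

Stacking such per-frame encodings over time yields the image sequences that the CNN fusion model consumes.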


Author(s):  
Souhila Kahlouche ◽  
Mahmoud Belhocine ◽  
Abdallah Menouar

In this work, an efficient human activity recognition (HAR) algorithm based on a deep learning architecture is proposed to classify activities into seven different classes. To learn spatial and temporal features from only the 3D skeleton data captured by a "Microsoft Kinect" camera, the proposed algorithm combines convolutional neural network (CNN) and long short-term memory (LSTM) architectures. This combination takes advantage of LSTM for modeling temporal data and of CNN for modeling spatial data. The captured skeleton sequences are used to create a specific dataset of interactive activities; these data are then transformed according to a view-invariance and a symmetry criterion. To demonstrate the effectiveness of the developed algorithm, it has been tested on several public datasets, where it has matched and sometimes surpassed state-of-the-art performance. To assess the reliability of the proposed algorithm, tools are provided and discussed to ensure its efficiency for continuous human action recognition in real time.
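The abstract mentions a view-invariant transform of the skeleton data but not its exact form. A common choice, assumed here purely for illustration, is to translate all joints so a reference joint (e.g. the hip center) sits at the origin and to scale by a body-length reference (e.g. the hip-to-neck distance), removing dependence on where and how large the subject appears in the camera frame:

```python
import math

# Normalize a skeleton for (partial) view invariance: hip at the origin,
# hip-to-neck distance scaled to 1. Joint indices are assumptions.
def normalize_skeleton(joints, hip_idx=0, neck_idx=1):
    hx, hy, hz = joints[hip_idx]
    shifted = [(x - hx, y - hy, z - hz) for x, y, z in joints]
    nx, ny, nz = shifted[neck_idx]
    scale = math.sqrt(nx * nx + ny * ny + nz * nz) or 1.0
    return [(x / scale, y / scale, z / scale) for x, y, z in shifted]

# Toy skeleton: hip, neck, and one limb joint.
joints = [(1.0, 1.0, 2.0), (1.0, 1.5, 2.0), (1.2, 0.5, 2.1)]
norm = normalize_skeleton(joints)
```

Full view invariance would additionally rotate the skeleton into a canonical body-centered frame; the translation-and-scale step above is the minimal version of the idea.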


2021 ◽  
Vol 14 (2) ◽  
pp. 106-124
Author(s):  
A. F. M. Saifuddin Saif ◽  
Md. Akib Shahriar Khan ◽  
Abir Mohammad Hadi ◽  
Rahul Proshad Karmoker ◽  
Joy Julian Gomes

Recent years have seen a rise in the use of various machine learning techniques in computer vision, particularly in pose feature-based human action recognition, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNN-based methods are useful for recognizing composite human actions (e.g., standing up, hand shaking, walking). However, under camera motion, occlusion, and multiple people, a CNN suppresses important feature information and is not efficient enough to recognize variations in human action. Moreover, an RNN with long short-term memory (LSTM) requires more computational power to retain the memories needed to classify human actions. This research proposes an extended framework based on a capsule network that uses silhouette pose features to recognize human actions. The proposed extended framework achieves a high accuracy of 95.64%, which is higher than that of previous methods. Extensive experimental validation of the proposed extended framework reveals an efficiency that is expected to contribute significantly to action recognition research.


2013 ◽  
Vol 18 (2-3) ◽  
pp. 49-60 ◽  
Author(s):  
Damian Dudziński ◽ 
Tomasz Kryjak ◽  
Zbigniew Mikrut

Abstract In this paper, a human action recognition algorithm is described that uses background generation with shadow elimination, silhouette description based on simple geometrical features, and a finite state machine for recognizing particular actions. The performed tests indicate that this approach achieves an 81% correct recognition rate while allowing real-time processing of a 360 × 288 video stream.
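The finite-state-machine idea above can be sketched in miniature: per-frame silhouette measurements drive state transitions, and particular transitions report actions. The feature (bounding-box aspect ratio), states, and thresholds below are invented for illustration, not the paper's actual design:

```python
# Toy finite state machine for silhouette-based action recognition.
# Each frame contributes one geometric feature: the silhouette's
# height-to-width aspect ratio.
def classify_frame(aspect_ratio):
    return "upright" if aspect_ratio > 1.5 else "low"

def recognize(aspect_ratios):
    state, actions = "idle", []
    for r in aspect_ratios:
        pose = classify_frame(r)
        if state == "idle" and pose == "upright":
            state = "standing"
        elif state == "standing" and pose == "low":
            state = "idle"
            actions.append("sit_down")   # upright -> low transition
    return actions

# Six frames: stand, stand, sit, sit, stand, sit.
events = recognize([2.0, 2.1, 1.0, 0.9, 2.0, 1.1])
```

The appeal of the FSM approach, as the abstract's real-time figures suggest, is that each frame costs only a feature computation and a table lookup, with no learned model to evaluate.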


2018 ◽  
Vol 6 (10) ◽  
pp. 323-328
Author(s):  
K. Kiruba ◽ 
D. Shiloah Elizabeth ◽  
C Sunil Retmin Raj
