Instructor Activity Recognition through Deep Spatiotemporal Features and Feedforward Extreme Learning Machines

Mathematical Problems in Engineering ◽

10.1155/2019/2474865 ◽

2019 ◽

Vol 2019 ◽

pp. 1-13 ◽

Cited By ~ 3

Author(s):

Nudrat Nida ◽

Muhammad Haroon Yousaf ◽

Aun Irtaza ◽

Sergio A. Velastin

Keyword(s):

Activity Recognition ◽

Teaching Style ◽

Human Action Recognition ◽

Human Action ◽

Connected Components ◽

Feature Maps ◽

Feature Representations ◽

Lecture Room ◽

Learning Machine ◽

Motion Profile

Human action recognition has the potential to predict the activities of an instructor within the lecture room. Evaluation of lecture delivery can help teachers analyze shortcomings and plan lectures more effectively. However, manual or peer evaluation is time-consuming and tedious, and it is sometimes difficult to remember all the details of a lecture. Automating the evaluation of lecture delivery can therefore help improve teaching style. In this paper, we propose a feedforward learning model for instructor activity recognition in the lecture room. The proposed scheme represents a video sequence as a single frame that captures the motion profile of the instructor by observing the spatiotemporal relations within the video frames. First, we segment the instructor silhouettes from input videos using graph-cut segmentation and generate a motion profile. These motion profiles are centered by extracting the largest connected component and then normalized. Next, a deep convolutional neural network represents the motion profiles as feature maps. Finally, an extreme learning machine (ELM) classifier is trained on the resulting feature representations to recognize eight different activities of the instructor within the classroom. To evaluate the proposed method, we created an instructor activity video (IAVID-1) dataset and compared our method against different state-of-the-art activity recognition methods. Two standard datasets, MuHAVI and IXMAS, were also used to evaluate the proposed scheme.
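The final classification stage described in this abstract can be illustrated with a minimal ELM sketch. This is not the authors' implementation: the function names `train_elm` and `predict_elm`, the tanh activation, the ridge term `reg`, and the synthetic 16-dimensional vectors standing in for CNN features are all assumptions made for illustration. The hidden layer is drawn at random and left untrained; only the output weights are solved in closed form.

```python
import numpy as np

def train_elm(X, y, n_hidden=64, reg=1e-3, seed=0):
    """Fit a single-hidden-layer ELM; X is (n_samples, n_features), y holds integer labels."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    n_classes = int(y.max()) + 1
    # Hidden weights and biases are drawn once at random and never trained.
    W = rng.standard_normal((n_features, n_hidden)) / np.sqrt(n_features)
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)          # hidden-layer activations
    T = np.eye(n_classes)[y]        # one-hot targets
    # Output weights from ridge-regularised least squares:
    # beta = (H^T H + reg * I)^-1 H^T T
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ T)
    return W, b, beta

def predict_elm(model, X):
    """Return the index of the highest-scoring class for each row of X."""
    W, b, beta = model
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Synthetic stand-in for CNN feature vectors (assumption): 8 classes, 40 samples each.
rng = np.random.default_rng(1)
centres = 2.0 * rng.standard_normal((8, 16))
y = np.repeat(np.arange(8), 40)
X = centres[y] + 0.1 * rng.standard_normal((y.size, 16))

model = train_elm(X, y)
train_accuracy = float(np.mean(predict_elm(model, X) == y))
```

Because training reduces to one linear solve, fitting is fast compared with iterative backpropagation, which is the usual motivation for pairing an ELM with fixed deep features.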

Low-Cost Embedded System Using Convolutional Neural Networks-Based Spatiotemporal Feature Map for Real-Time Human Action Recognition

Applied Sciences ◽

10.3390/app11114940 ◽

2021 ◽

Vol 11 (11) ◽

pp. 4940

Author(s):

Jinsoo Kim ◽

Jeongho Cho

Keyword(s):

Embedded System ◽

Real Time ◽

Action Recognition ◽

Processing Speed ◽

Recognition Accuracy ◽

Low Cost ◽

Human Action Recognition ◽

Human Action ◽

Video Data ◽

Feature Maps

Research on video data faces the difficulty of extracting temporal as well as spatial features, and human action recognition (HAR) is a representative field that applies convolutional neural networks (CNNs) to video data. Action recognition performance has improved, but owing to model complexity, some limitations to real-time operation persist. Therefore, a lightweight CNN-based single-stream HAR model that can operate in real time is proposed. The proposed model extracts spatial feature maps by applying a CNN to the images that make up the video and uses the frame change rate of sequential images as temporal information. Spatial feature maps are weighted-averaged by frame change, transformed into spatiotemporal features, and input into multilayer perceptrons, which have relatively lower complexity than other HAR models; thus, our method has high utility in a single embedded system connected to CCTV. Evaluation of action recognition accuracy and data processing speed on the challenging action recognition benchmark UCF-101 showed higher accuracy than an HAR model using long short-term memory with a small number of video frames, and the fast data processing speed confirmed that real-time operation is possible. In addition, the proposed weighted mean-based HAR model was tested on a Jetson Nano to confirm that it can be used in low-cost GPU-based embedded systems.
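The weighted-averaging step in this abstract can be sketched as follows. The abstract does not define the frame change rate precisely, so this sketch assumes it is the mean absolute pixel difference between consecutive grayscale frames; the function names and the uniform-weight fallback for static clips are likewise hypothetical. The per-frame feature maps would come from a CNN in practice and are replaced here by constant placeholder arrays.

```python
import numpy as np

def frame_change_rates(frames):
    """Mean absolute difference between consecutive frames; frames is (T, H, W)."""
    frames = np.asarray(frames, dtype=float)
    return np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))  # shape (T - 1,)

def weighted_feature(feature_maps, rates):
    """Average per-frame feature maps (T - 1, C, h, w) using change rates as weights."""
    feature_maps = np.asarray(feature_maps, dtype=float)
    rates = np.asarray(rates, dtype=float)
    total = rates.sum()
    if total > 0:
        weights = rates / total
    else:
        # Static clip: fall back to a plain mean (assumption, not from the abstract).
        weights = np.full(len(rates), 1.0 / len(rates))
    return np.tensordot(weights, feature_maps, axes=(0, 0))   # shape (C, h, w)

# Toy clip: nothing changes between frames 0 and 1, everything changes at frame 2.
frames = np.zeros((3, 4, 4))
frames[2] = 1.0
rates = frame_change_rates(frames)

# Placeholder feature maps for frames 1 and 2 (a real system would use CNN outputs).
feature_maps = np.stack([np.full((2, 2, 2), 2.0), np.full((2, 2, 2), 5.0)])
fused = weighted_feature(feature_maps, rates)
```

In this toy clip all the weight falls on the frame where motion occurs, so the fused map equals that frame's feature map. The fused array would then be flattened and passed to the multilayer perceptron.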

A Set of New Hermite Kernel Functions in Kernel Extreme Learning Machine and Application in Human Action Recognition

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s0218001419550140 ◽

2019 ◽

Vol 33 (12) ◽

pp. 1955014 ◽

Cited By ~ 1

Author(s):

Xueping Liu ◽

Xingzuo Yue

Keyword(s):

Extreme Learning Machine ◽

Action Recognition ◽

Structural Information ◽

Image Data ◽

Human Action Recognition ◽

Human Action ◽

Kernel Functions ◽

Support Vector ◽

Learning Speed ◽

Learning Machine

The kernel function has been successfully utilized in the extreme learning machine (ELM), where it provides stable and generalized performance and greatly reduces the computational complexity. However, the selection and optimization of the parameters constituting the most common kernel functions are tedious and time-consuming. In this study, a set of new Hermite kernel functions derived from the generalized Hermite polynomials has been proposed. The significant contributions of the proposed kernel include only one parameter selected from a small set of natural numbers; thus, the parameter optimization is greatly facilitated and excessive structural information of the sample data is retained. Consequently, the new kernel functions can be used as optimal alternatives to other common kernel functions for ELM at a rapid learning speed. The experimental results showed that the proposed kernel ELM method tends to have similar or better robustness and generalized performance at a faster learning speed than the other common kernel ELM and support vector machine methods. Consequently, when applied to human action recognition by depth video sequence, the method also achieves excellent performance, demonstrating its time-based advantage on the video image data.

Minimum Class Variance Extreme Learning Machine for Human Action Recognition

IEEE Transactions on Circuits and Systems for Video Technology ◽

10.1109/tcsvt.2013.2269774 ◽

2013 ◽

Vol 23 (11) ◽

pp. 1968-1979 ◽

Cited By ~ 81

Author(s):

Alexandros Iosifidis ◽

Anastasios Tefas ◽

Ioannis Pitas

Keyword(s):

Extreme Learning Machine ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Learning Machine

ACTION RECOGNITION USING UNDECIMATED DUAL TREE COMPLEX WAVELET TRANSFORM FROM DEPTH MOTION MAPS / DEPTH SEQUENCES

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-2-w12-203-2019 ◽

2019 ◽

Vol XLII-2/W12 ◽

pp. 203-209

Author(s):

B. H. Shekar ◽

P. Rathnakara Shetty ◽

M. Sharmila Kumari ◽

L. Mestetsky

Keyword(s):

Wavelet Transform ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Feature Descriptor ◽

Motion Information ◽

Complex Wavelet Transform ◽

Benchmark Datasets ◽

Complex Wavelet ◽

Learning Machine

Abstract. Accumulating the motion information from a video sequence is one of the most challenging and significant phases in Human Action Recognition. To achieve this, several classical and compact representations have been proposed by the research community with proven applicability. In this paper, we propose a compact Depth Motion Map (DMM) based representation methodology with hasty striding, concisely accumulating the motion information. We extract Undecimated Dual Tree Complex Wavelet Transform features from the proposed DMM to form an efficient feature descriptor. We designate a Sequential Extreme Learning Machine for classifying the human action sequences on benchmark datasets, the MSR Action 3D dataset and the DHA dataset. We empirically demonstrate the feasibility of our method under standard protocols, achieving proven results.

Patient Monitoring by Abnormal Human Activity Recognition Based on CNN Architecture

Electronics ◽

10.3390/electronics9121993 ◽

2020 ◽

Vol 9 (12) ◽

pp. 1993

Author(s):

Malik Ali Gul ◽

Muhammad Haroon Yousaf ◽

Shah Nawaz ◽

Zaka Ur Rehman ◽

HyungWon Kim

Keyword(s):

Activity Recognition ◽

Action Recognition ◽

Human Activity ◽

Patient Monitoring ◽

Human Action Recognition ◽

Confidence Score ◽

Human Action ◽

Human Activity Recognition ◽

Video Sequences ◽

Human Actions

Human action recognition has emerged as a challenging research domain for video understanding and analysis. Consequently, extensive research has been conducted to improve the recognition of human actions. Human activity recognition has various real-time applications, such as patient monitoring, in which patients are monitored among a group of normal people and identified based on their abnormal activities. Our goal is to provide multi-class abnormal action detection for individuals as well as groups in video sequences, so as to differentiate multiple abnormal human actions. In this paper, the You Only Look Once (YOLO) network is utilized as the backbone CNN model. For training the CNN model, we constructed a large dataset of patient videos by labeling each frame with a set of patient actions and the patient's positions. We retrained the backbone CNN model with 23,040 labeled images of patients' actions for 32 epochs. The proposed model assigned a confidence score and action label to each frame and labeled the video sequence with the most recurrent action label. The present study shows that the accuracy of abnormal action recognition is 96.8%. Our proposed approach differentiated abnormal actions with an improved F1-score of 89.2%, which is higher than that of state-of-the-art techniques. The results indicate that the proposed framework can be beneficial to hospitals and elder care homes for patient monitoring.
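The step of labeling a video by its recurrent action label can be sketched as a majority vote over per-frame predictions. The abstract does not say how the sequence-level confidence is derived, so averaging the confidences of the winning label is an assumption here, and the function name `video_label` and the action names are hypothetical.

```python
from collections import Counter

def video_label(frame_predictions):
    """Pick the most frequent per-frame label.

    frame_predictions is a list of (label, confidence) pairs, one per frame.
    Returns the winning label and the mean confidence of the frames that
    carry it (the averaging rule is an assumption, not from the abstract).
    """
    counts = Counter(label for label, _ in frame_predictions)
    label, _ = counts.most_common(1)[0]
    confidences = [conf for lab, conf in frame_predictions if lab == label]
    return label, sum(confidences) / len(confidences)

# Hypothetical per-frame detector outputs for a three-frame clip.
predictions = [("fall", 0.9), ("walk", 0.6), ("fall", 0.7)]
label, confidence = video_label(predictions)
```

Voting over frames smooths out isolated misdetections, at the cost of missing actions that last only a few frames.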

Minimum Variance Extreme Learning Machine for human action recognition

2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ◽

10.1109/icassp.2014.6854640 ◽

2014 ◽

Cited By ~ 13

Author(s):

Alexandros Iosifidis ◽

Anastasios Tefas ◽

Ioannis Pitas

Keyword(s):

Extreme Learning Machine ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Minimum Variance ◽

Learning Machine

Human action recognition using extreme learning machine based on visual vocabularies

Neurocomputing ◽

10.1016/j.neucom.2010.01.020 ◽

2010 ◽

Vol 73 (10-12) ◽

pp. 1906-1917 ◽

Cited By ~ 106

Author(s):

Rashid Minhas ◽

Aryaz Baradarani ◽

Sepideh Seifzadeh ◽

Q.M. Jonathan Wu

Keyword(s):

Extreme Learning Machine ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Learning Machine

Human action recognition using extreme learning machine via multiple types of features

10.1117/12.853031 ◽

2010 ◽

Cited By ~ 1

Author(s):

Rashid Minhas ◽

Aryaz Baradarani ◽

Sepideh Seifzadeh ◽

Q. M. J. Wu

Keyword(s):

Extreme Learning Machine ◽

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Learning Machine

Human Action Recognition Using Median Background and Max Pool Convolution with Nearest Neighbor

International Journal of Ambient Computing and Intelligence ◽

10.4018/ijaci.2019040103 ◽

2019 ◽

Vol 10 (2) ◽

pp. 34-47 ◽

Cited By ~ 1

Author(s):

Bagavathi Lakshmi ◽

S. Parthasarathy

Keyword(s):

Machine Learning ◽

Activity Recognition ◽

Action Recognition ◽

Human Activity ◽

Nearest Neighbor ◽

Human Action Recognition ◽

Human Action ◽

Human Activity Recognition ◽

Machine Learning Algorithms ◽

Support Vector

Discovering human activities on mobile devices is a challenging task for human action recognition. The ability of a device to recognize its user's activity is important because it enables context-aware applications and behavior. Recently, machine learning algorithms have been increasingly used for human action recognition. During the past few years, principal component analysis and support vector machines have been widely used for robust human activity recognition (HAR). However, given global dynamic tendencies and the complex tasks involved, these approaches suffer from errors and complexity. To deal with this problem, we propose a machine learning algorithm and explore its application to HAR. In this article, a Max Pool Convolution Neural Network based on Nearest Neighbor (MPCNN-NN) is proposed to perform efficient and effective HAR using smartphone sensors by exploiting their inherent characteristics. The MPCNN-NN framework for HAR consists of three steps. In the first step, for each activity, the features of interest, or foreground frame, are detected using Median Background Subtraction. The second step organizes the features (i.e., postures) that represent the strongest generic discriminating features based on Max Pool. The third and final step performs HAR with Nearest Neighbor, selecting the posture that maximizes the probability. Experiments have been conducted to demonstrate the superiority of the proposed MPCNN-NN framework on a human action dataset, KARD (Kinect Activity Recognition Dataset).
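The first two steps named in this abstract, median background subtraction and max pooling, can be illustrated with a short NumPy sketch. It is not the MPCNN-NN implementation: the function names, the threshold value and the 2x2 pooling window are assumptions, and the toy clip is synthetic.

```python
import numpy as np

def foreground_masks(frames, threshold=25.0):
    """Flag pixels that differ from the per-pixel temporal median by more than threshold."""
    frames = np.asarray(frames, dtype=float)   # float avoids uint8 wrap-around
    background = np.median(frames, axis=0)     # background estimate, shape (H, W)
    return np.abs(frames - background) > threshold

def max_pool2d(image, k=2):
    """Non-overlapping k x k max pooling; trailing rows/columns that do not fit are dropped."""
    h, w = image.shape
    h2, w2 = h // k, w // k
    blocks = image[:h2 * k, :w2 * k].reshape(h2, k, w2, k)
    return blocks.max(axis=(1, 3))

# Toy clip: five 4x4 frames of a static background with one bright pixel in frame 2.
frames = np.full((5, 4, 4), 10.0)
frames[2, 1, 2] = 200.0

masks = foreground_masks(frames)
pooled = max_pool2d(masks[2].astype(float))
```

The median is robust to a pixel that is occupied by the subject in fewer than half of the frames, which is why the bright pixel does not leak into the background estimate. The pooled map could then be flattened and compared against stored examples with a nearest-neighbor rule.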

Human action recognition based on multi-scale feature maps from depth video sequences

Multimedia Tools and Applications ◽

10.1007/s11042-021-11193-4 ◽

2021 ◽

Author(s):

Chang Li ◽

Qian Huang ◽

Xing Li ◽

Qianhan Wu

Keyword(s):

Action Recognition ◽

Human Action Recognition ◽

Human Action ◽

Video Sequences ◽

Feature Maps ◽

Scale Feature ◽

Multi Scale ◽

Depth Video
