scholarly journals A deep neural network model for multi-view human activity recognition

PLoS ONE ◽  
2022 ◽  
Vol 17 (1) ◽  
pp. e0262181
Author(s):  
Prasetia Utama Putra ◽  
Keisuke Shima ◽  
Koji Shimatani

Multiple cameras are used to resolve occlusion problem that often occur in single-view human activity recognition. Based on the success of learning representation with deep neural networks (DNNs), recent works have proposed DNNs models to estimate human activity from multi-view inputs. However, currently available datasets are inadequate in training DNNs model to obtain high accuracy rate. Against such an issue, this study presents a DNNs model, trained by employing transfer learning and shared-weight techniques, to classify human activity from multiple cameras. The model comprised pre-trained convolutional neural networks (CNNs), attention layers, long short-term memory networks with residual learning (LSTMRes), and Softmax layers. The experimental results suggested that the proposed model could achieve a promising performance on challenging MVHAR datasets: IXMAS (97.27%) and i3DPost (96.87%). A competitive recognition rate was also observed in online classification.

2020 ◽  
Vol 2020 ◽  
pp. 1-12 ◽  
Author(s):  
Huaijun Wang ◽  
Jing Zhao ◽  
Junhuai Li ◽  
Ling Tian ◽  
Pengjia Tu ◽  
...  

Human activity recognition (HAR) can be exploited to great benefits in many applications, including elder care, health care, rehabilitation, entertainment, and monitoring. Many existing techniques, such as deep learning, have been developed for specific activity recognition, but little for the recognition of the transitions between activities. This work proposes a deep learning based scheme that can recognize both specific activities and the transitions between two different activities of short duration and low frequency for health care applications. In this work, we first build a deep convolutional neural network (CNN) for extracting features from the data collected by sensors. Then, the long short-term memory (LTSM) network is used to capture long-term dependencies between two actions to further improve the HAR identification rate. By combing CNN and LSTM, a wearable sensor based model is proposed that can accurately recognize activities and their transitions. The experimental results show that the proposed approach can help improve the recognition rate up to 95.87% and the recognition rate for transitions higher than 80%, which are better than those of most existing similar models over the open HAPT dataset.


2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Yu Zhao ◽  
Rennong Yang ◽  
Guillaume Chevalier ◽  
Ximeng Xu ◽  
Zhenxing Zhang

Human activity recognition (HAR) has become a popular topic in research because of its wide application. With the development of deep learning, new ideas have appeared to address HAR problems. Here, a deep network architecture using residual bidirectional long short-term memory (LSTM) is proposed. The advantages of the new network include that a bidirectional connection can concatenate the positive time direction (forward state) and the negative time direction (backward state). Second, residual connections between stacked cells act as shortcut for gradients, effectively avoiding the gradient vanishing problem. Generally, the proposed network shows improvements on both the temporal (using bidirectional cells) and the spatial (residual connections stacked) dimensions, aiming to enhance the recognition rate. When testing with the Opportunity dataset and the public domain UCI dataset, the accuracy is significantly improved compared with previous results.


2021 ◽  
Vol 15 (6) ◽  
pp. 1-17
Author(s):  
Chenglin Li ◽  
Carrie Lu Tong ◽  
Di Niu ◽  
Bei Jiang ◽  
Xiao Zuo ◽  
...  

Deep learning models for human activity recognition (HAR) based on sensor data have been heavily studied recently. However, the generalization ability of deep models on complex real-world HAR data is limited by the availability of high-quality labeled activity data, which are hard to obtain. In this article, we design a similarity embedding neural network that maps input sensor signals onto real vectors through carefully designed convolutional and Long Short-Term Memory (LSTM) layers. The embedding network is trained with a pairwise similarity loss, encouraging the clustering of samples from the same class in the embedded real space, and can be effectively trained on a small dataset and even on a noisy dataset with mislabeled samples. Based on the learned embeddings, we further propose both nonparametric and parametric approaches for activity recognition. Extensive evaluation based on two public datasets has shown that the proposed similarity embedding network significantly outperforms state-of-the-art deep models on HAR classification tasks, is robust to mislabeled samples in the training set, and can also be used to effectively denoise a noisy dataset.


Sensors ◽  
2020 ◽  
Vol 20 (20) ◽  
pp. 5770 ◽  
Author(s):  
Keshav Thapa ◽  
Zubaer Md. Abdullah Al ◽  
Barsha Lamichhane ◽  
Sung-Hyun Yang

Human activity recognition has become an important research topic within the field of pervasive computing, ambient assistive living (AAL), robotics, health-care monitoring, and many more. Techniques for recognizing simple and single activities are typical for now, but recognizing complex activities such as concurrent and interleaving activity is still a major challenging issue. In this paper, we propose a two-phase hybrid deep machine learning approach using bi-directional Long-Short Term Memory (BiLSTM) and Skip-Chain Conditional random field (SCCRF) to recognize the complex activity. BiLSTM is a sequential generative deep learning inherited from Recurrent Neural Network (RNN). SCCRFs is a distinctive feature of conditional random field (CRF) that can represent long term dependencies. In the first phase of the proposed approach, we recognized the concurrent activities using the BiLSTM technique, and in the second phase, SCCRF identifies the interleaved activity. Accuracy of the proposed framework against the counterpart state-of-art methods using the publicly available datasets in a smart home environment is analyzed. Our experiment’s result surpasses the previously proposed approaches with an average accuracy of more than 93%.


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7853
Author(s):  
Aleksej Logacjov ◽  
Kerstin Bach ◽  
Atle Kongsvold ◽  
Hilde Bremseth Bårdstu ◽  
Paul Jarle Mork

Existing accelerometer-based human activity recognition (HAR) benchmark datasets that were recorded during free living suffer from non-fixed sensor placement, the usage of only one sensor, and unreliable annotations. We make two contributions in this work. First, we present the publicly available Human Activity Recognition Trondheim dataset (HARTH). Twenty-two participants were recorded for 90 to 120 min during their regular working hours using two three-axial accelerometers, attached to the thigh and lower back, and a chest-mounted camera. Experts annotated the data independently using the camera’s video signal and achieved high inter-rater agreement (Fleiss’ Kappa =0.96). They labeled twelve activities. The second contribution of this paper is the training of seven different baseline machine learning models for HAR on our dataset. We used a support vector machine, k-nearest neighbor, random forest, extreme gradient boost, convolutional neural network, bidirectional long short-term memory, and convolutional neural network with multi-resolution blocks. The support vector machine achieved the best results with an F1-score of 0.81 (standard deviation: ±0.18), recall of 0.85±0.13, and precision of 0.79±0.22 in a leave-one-subject-out cross-validation. Our highly professional recordings and annotations provide a promising benchmark dataset for researchers to develop innovative machine learning approaches for precise HAR in free living.


Sign in / Sign up

Export Citation Format

Share Document