scholarly journals Similarity Embedding Networks for Robust Human Activity Recognition

2021 ◽  
Vol 15 (6) ◽  
pp. 1-17
Author(s):  
Chenglin Li ◽  
Carrie Lu Tong ◽  
Di Niu ◽  
Bei Jiang ◽  
Xiao Zuo ◽  
...  

Deep learning models for human activity recognition (HAR) based on sensor data have been heavily studied recently. However, the generalization ability of deep models on complex real-world HAR data is limited by the availability of high-quality labeled activity data, which are hard to obtain. In this article, we design a similarity embedding neural network that maps input sensor signals onto real vectors through carefully designed convolutional and Long Short-Term Memory (LSTM) layers. The embedding network is trained with a pairwise similarity loss, encouraging the clustering of samples from the same class in the embedded real space, and can be effectively trained on a small dataset and even on a noisy dataset with mislabeled samples. Based on the learned embeddings, we further propose both nonparametric and parametric approaches for activity recognition. Extensive evaluation based on two public datasets has shown that the proposed similarity embedding network significantly outperforms state-of-the-art deep models on HAR classification tasks, is robust to mislabeled samples in the training set, and can also be used to effectively denoise a noisy dataset.

Sensors ◽  
2019 ◽  
Vol 19 (7) ◽  
pp. 1716 ◽  
Author(s):  
Seungeun Chung ◽  
Jiyoun Lim ◽  
Kyoung Ju Noh ◽  
Gague Kim ◽  
Hyuntae Jeong

In this paper, we perform a systematic study about the on-body sensor positioning and data acquisition details for Human Activity Recognition (HAR) systems. We build a testbed that consists of eight body-worn Inertial Measurement Units (IMU) sensors and an Android mobile device for activity data collection. We develop a Long Short-Term Memory (LSTM) network framework to support training of a deep learning model on human activity data, which is acquired in both real-world and controlled environments. From the experiment results, we identify that activity data with sampling rate as low as 10 Hz from four sensors at both sides of wrists, right ankle, and waist is sufficient in recognizing Activities of Daily Living (ADLs) including eating and driving activity. We adopt a two-level ensemble model to combine class-probabilities of multiple sensor modalities, and demonstrate that a classifier-level sensor fusion technique can improve the classification performance. By analyzing the accuracy of each sensor on different types of activity, we elaborate custom weights for multimodal sensor fusion that reflect the characteristic of individual activities.


Sensors ◽  
2019 ◽  
Vol 19 (15) ◽  
pp. 3434 ◽  
Author(s):  
Nattaya Mairittha ◽  
Tittaya Mairittha ◽  
Sozo Inoue

Labeling activity data is a central part of the design and evaluation of human activity recognition systems. The performance of the systems greatly depends on the quantity and “quality” of annotations; therefore, it is inevitable to rely on users and to keep them motivated to provide activity labels. While mobile and embedded devices are increasingly using deep learning models to infer user context, we propose to exploit on-device deep learning inference using a long short-term memory (LSTM)-based method to alleviate the labeling effort and ground truth data collection in activity recognition systems using smartphone sensors. The novel idea behind this is that estimated activities are used as feedback for motivating users to collect accurate activity labels. To enable us to perform evaluations, we conduct the experiments with two conditional methods. We compare the proposed method showing estimated activities using on-device deep learning inference with the traditional method showing sentences without estimated activities through smartphone notifications. By evaluating with the dataset gathered, the results show our proposed method has improvements in both data quality (i.e., the performance of a classification model) and data quantity (i.e., the number of data collected) that reflect our method could improve activity data collection, which can enhance human activity recognition systems. We discuss the results, limitations, challenges, and implications for on-device deep learning inference that support activity data collection. Also, we publish the preliminary dataset collected to the research community for activity recognition.


Sensors ◽  
2021 ◽  
Vol 21 (18) ◽  
pp. 6316
Author(s):  
Dinis Moreira ◽  
Marília Barandas ◽  
Tiago Rocha ◽  
Pedro Alves ◽  
Ricardo Santos ◽  
...  

With the fast increase in the demand for location-based services and the proliferation of smartphones, the topic of indoor localization is attracting great interest. In indoor environments, users’ performed activities carry useful semantic information. These activities can then be used by indoor localization systems to confirm users’ current relative locations in a building. In this paper, we propose a deep-learning model based on a Convolutional Long Short-Term Memory (ConvLSTM) network to classify human activities within the indoor localization scenario using smartphone inertial sensor data. Results show that the proposed human activity recognition (HAR) model accurately identifies nine types of activities: not moving, walking, running, going up in an elevator, going down in an elevator, walking upstairs, walking downstairs, or going up and down a ramp. Moreover, predicted human activities were integrated within an existing indoor positioning system and evaluated in a multi-story building across several testing routes, with an average positioning error of 2.4 m. The results show that the inclusion of human activity information can reduce the overall localization error of the system and actively contribute to the better identification of floor transitions within a building. The conducted experiments demonstrated promising results and verified the effectiveness of using human activity-related information for indoor localization.


Sensors ◽  
2022 ◽  
Vol 22 (2) ◽  
pp. 635
Author(s):  
Yong Li ◽  
Luping Wang

Due to the wide application of human activity recognition (HAR) in sports and health, a large number of HAR models based on deep learning have been proposed. However, many existing models ignore the effective extraction of spatial and temporal features of human activity data. This paper proposes a deep learning model based on residual block and bi-directional LSTM (BiLSTM). The model first extracts spatial features of multidimensional signals of MEMS inertial sensors automatically using the residual block, and then obtains the forward and backward dependencies of feature sequence using BiLSTM. Finally, the obtained features are fed into the Softmax layer to complete the human activity recognition. The optimal parameters of the model are obtained by experiments. A homemade dataset containing six common human activities of sitting, standing, walking, running, going upstairs and going downstairs is developed. The proposed model is evaluated on our dataset and two public datasets, WISDM and PAMAP2. The experimental results show that the proposed model achieves the accuracy of 96.95%, 97.32% and 97.15% on our dataset, WISDM and PAMAP2, respectively. Compared with some existing models, the proposed model has better performance and fewer parameters.


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 2141
Author(s):  
Ohoud Nafea ◽  
Wadood Abdul ◽  
Ghulam Muhammad ◽  
Mansour Alsulaiman

Human activity recognition (HAR) remains a challenging yet crucial problem to address in computer vision. HAR is primarily intended to be used with other technologies, such as the Internet of Things, to assist in healthcare and eldercare. With the development of deep learning, automatic high-level feature extraction has become a possibility and has been used to optimize HAR performance. Furthermore, deep-learning techniques have been applied in various fields for sensor-based HAR. This study introduces a new methodology using convolution neural networks (CNN) with varying kernel dimensions along with bi-directional long short-term memory (BiLSTM) to capture features at various resolutions. The novelty of this research lies in the effective selection of the optimal video representation and in the effective extraction of spatial and temporal features from sensor data using traditional CNN and BiLSTM. Wireless sensor data mining (WISDM) and UCI datasets are used for this proposed methodology in which data are collected through diverse methods, including accelerometers, sensors, and gyroscopes. The results indicate that the proposed scheme is efficient in improving HAR. It was thus found that unlike other available methods, the proposed method improved accuracy, attaining a higher score in the WISDM dataset compared to the UCI dataset (98.53% vs. 97.05%).


2020 ◽  
Vol 10 (15) ◽  
pp. 5293 ◽  
Author(s):  
Rebeen Ali Hamad ◽  
Longzhi Yang ◽  
Wai Lok Woo ◽  
Bo Wei

Human activity recognition has become essential to a wide range of applications, such as smart home monitoring, health-care, surveillance. However, it is challenging to deliver a sufficiently robust human activity recognition system from raw sensor data with noise in a smart environment setting. Moreover, imbalanced human activity datasets with less frequent activities create extra challenges for accurate activity recognition. Deep learning algorithms have achieved promising results on balanced datasets, but their performance on imbalanced datasets without explicit algorithm design cannot be promised. Therefore, we aim to realise an activity recognition system using multi-modal sensors to address the issue of class imbalance in deep learning and improve recognition accuracy. This paper proposes a joint diverse temporal learning framework using Long Short Term Memory and one-dimensional Convolutional Neural Network models to improve human activity recognition, especially for less represented activities. We extensively evaluate the proposed method for Activities of Daily Living recognition using binary sensors dataset. A comparative study on five smart home datasets demonstrate that our proposed approach outperforms the existing individual temporal models and their hybridization. Furthermore, this is particularly the case for minority classes in addition to reasonable improvement on the majority classes of human activities.


Author(s):  
Rebeen Ali Hamad ◽  
Masashi Kimura ◽  
Longzhi Yang ◽  
Wai Lok Woo ◽  
Bo Wei

AbstractSystems of sensor human activity recognition are becoming increasingly popular in diverse fields such as healthcare and security. Yet, developing such systems poses inherent challenges due to the variations and complexity of human behaviors during the performance of physical activities. Recurrent neural networks, particularly long short-term memory have achieved promising results on numerous sequential learning problems, including sensor human activity recognition. However, parallelization is inhibited in recurrent networks due to sequential operation and computation that lead to slow training, occupying more memory and hard convergence. One-dimensional convolutional neural network processes input temporal sequential batches independently that lead to effectively executed operations in parallel. Despite that, a one-dimensional Convolutional Neural Network is not sensitive to the order of the time steps which is crucial for accurate and robust systems of sensor human activity recognition. To address this problem, we propose a network architecture based on dilated causal convolution and multi-head self-attention mechanisms that entirely dispense recurrent architectures to make efficient computation and maintain the ordering of the time steps. The proposed method is evaluated for human activities using smart home binary sensors data and wearable sensor data. Results of conducted extensive experiments on eight public and benchmark HAR data sets show that the proposed network outperforms the state-of-the-art models based on recurrent settings and temporal models.


Sensors ◽  
2020 ◽  
Vol 20 (24) ◽  
pp. 7179
Author(s):  
Orhan Konak ◽  
Pit Wegner ◽  
Bert Arnrich

Recent trends in ubiquitous computing have led to a proliferation of studies that focus on human activity recognition (HAR) utilizing inertial sensor data that consist of acceleration, orientation and angular velocity. However, the performances of such approaches are limited by the amount of annotated training data, especially in fields where annotating data is highly time-consuming and requires specialized professionals, such as in healthcare. In image classification, this limitation has been mitigated by powerful oversampling techniques such as data augmentation. Using this technique, this work evaluates to what extent transforming inertial sensor data into movement trajectories and into 2D heatmap images can be advantageous for HAR when data are scarce. A convolutional long short-term memory (ConvLSTM) network that incorporates spatiotemporal correlations was used to classify the heatmap images. Evaluation was carried out on Deep Inertial Poser (DIP), a known dataset composed of inertial sensor data. The results obtained suggest that for datasets with large numbers of subjects, using state-of-the-art methods remains the best alternative. However, a performance advantage was achieved for small datasets, which is usually the case in healthcare. Moreover, movement trajectories provide a visual representation of human activities, which can help researchers to better interpret and analyze motion patterns.


Electronics ◽  
2021 ◽  
Vol 10 (2) ◽  
pp. 111
Author(s):  
Pengjia Tu ◽  
Junhuai Li ◽  
Huaijun Wang ◽  
Ting Cao ◽  
Kan Wang

Human activity recognition (HAR) has vital applications in human–computer interaction, somatosensory games, and motion monitoring, etc. On the basis of the human motion accelerate sensor data, through a nonlinear analysis of the human motion time series, a novel method for HAR that is based on non-linear chaotic features is proposed in this paper. First, the C-C method and G-P algorithm are used to, respectively, compute the optimal delay time and embedding dimension. Additionally, a Reconstructed Phase Space (RPS) is formed while using time-delay embedding for the human accelerometer motion sensor data. Subsequently, a two-dimensional chaotic feature matrix is constructed, where the chaotic feature is composed of the correlation dimension and largest Lyapunov exponent (LLE) of attractor trajectory in the RPS. Next, the classification algorithms are used in order to classify and recognize the two different activity classes, i.e., basic and transitional activities. The experimental results show that the chaotic feature has a higher accuracy than traditional time and frequency domain features.


Sign in / Sign up

Export Citation Format

Share Document