Combining CNN and LSTM for activity of daily living recognition with a 3D matrix skeleton representation

Author(s):  
Giovanni Ercolano
Silvia Rossi

In socially assistive robotics, human activity recognition plays a central role when the robot's behavior must adapt to the human's. In this paper, we present an activity recognition approach for activities of daily living based on deep learning and skeleton data. In the literature, ad hoc feature extraction/selection algorithms combined with supervised classification methods have been deployed, reaching excellent classification performance. Here, we propose a deep learning approach, combining a CNN and an LSTM, that exploits both the learning of spatial dependencies correlating the limbs in a 3D grid representation of the skeleton and the learning of temporal dependencies from instances with a periodic pattern; it works on raw data and therefore requires no explicit feature extraction process. These models are proposed for real-time activity recognition and are tested on the CAD-60 dataset. Results show that the proposed model outperforms an LSTM model thanks to the automatic feature extraction of the limbs' correlations. "New Person" results show that the CNN-LSTM model achieves 95.4% precision and 94.4% recall, while the "Have Seen" results are 96.1% precision and 94.7% recall.
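The abstract does not include the authors' implementation, but a minimal sketch can make the architecture concrete: a small CNN applied per frame to a 3-channel (x, y, z) skeleton grid, feeding an LSTM over time. The grid size (5×5 joints), layer widths, and class count below are illustrative assumptions, not the paper's configuration.

```python
# Illustrative CNN-LSTM for skeleton-grid activity recognition (PyTorch).
# Grid size (5x5), channel count (3 = x, y, z), and layer widths are
# assumptions for this sketch, not the paper's actual configuration.
import torch
import torch.nn as nn

class CNNLSTM(nn.Module):
    def __init__(self, num_classes: int, hidden: int = 128):
        super().__init__()
        # Per-frame CNN: learns spatial correlations between limbs
        # laid out on a 3-channel 5x5 joint grid.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch*time, 64, 1, 1)
        )
        # LSTM: learns temporal dependencies across frames.
        self.lstm = nn.LSTM(64, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, x):  # x: (batch, time, 3, 5, 5)
        b, t = x.shape[:2]
        f = self.cnn(x.reshape(b * t, *x.shape[2:])).reshape(b, t, 64)
        out, _ = self.lstm(f)          # per-frame hidden states
        return self.head(out[:, -1])   # classify from the last step

# Smoke test on random data shaped like a 30-frame skeleton sequence.
model = CNNLSTM(num_classes=12)
logits = model(torch.randn(8, 30, 3, 5, 5))
print(logits.shape)  # torch.Size([8, 12])
```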

Author(s):  
Daniela Micucci
Marco Mobilio
Paolo Napoletano

Smartphones, smartwatches, fitness trackers, and ad hoc wearable devices are increasingly used to monitor human activities. Data acquired by the hosted sensors are usually processed by machine-learning algorithms to classify human activities. The success of those algorithms mostly depends on the availability of labeled training data that, if made publicly available, would allow researchers to make objective comparisons between techniques. Nowadays, publicly available datasets are few, often contain samples from subjects with too-similar characteristics, and very often lack the specific information needed to select subsets of samples according to specific criteria. In this article, we present a new smartphone accelerometer dataset designed for activity recognition. The dataset includes 11,771 activities performed by 30 subjects ranging in age from 18 to 60 years. Activities are divided into 17 fine-grained classes grouped into two coarse-grained classes: 9 types of activities of daily living (ADL) and 8 types of falls. The dataset has been stored so as to include all the information useful for selecting samples according to different criteria, such as the type of ADL performed, age, gender, and so on. Finally, the dataset has been benchmarked with two different classifiers and with different configurations. The best results are achieved with k-NN classifying ADLs only, considering personalization, and with windows of both 51 and 151 samples.
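As a rough illustration of the benchmark described above, the sketch below slices a tri-axial accelerometer stream into fixed windows of 51 samples (one of the two window lengths the abstract reports) and classifies them with k-NN. The flattened-raw-sample features, k = 5, and the synthetic stream are assumptions, not the paper's protocol.

```python
# Sketch of a k-NN window benchmark: cut a tri-axial accelerometer
# stream into fixed windows (51 samples, per the abstract) and classify
# each window. Flattened raw samples and k=5 are assumptions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def make_windows(signal: np.ndarray, labels: np.ndarray, size: int):
    """Cut a (n_samples, 3) stream into non-overlapping windows."""
    n = (len(signal) // size) * size
    X = signal[:n].reshape(-1, size * 3)       # flatten each window
    y = labels[:n].reshape(-1, size)[:, 0]     # label = window start
    return X, y

# Fake stream standing in for one subject's recording.
rng = np.random.default_rng(0)
stream = rng.normal(size=(10_200, 3))
stream_labels = np.repeat(np.arange(17), 600)  # 17 activity classes

X, y = make_windows(stream, stream_labels, size=51)
clf = KNeighborsClassifier(n_neighbors=5).fit(X[::2], y[::2])
print("held-out accuracy:", clf.score(X[1::2], y[1::2]))
```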


Sensors
2020
Vol 20 (17)
pp. 4756
Author(s):  
Irvin Hussein Lopez-Nava
Luis M. Valentín-Coronado
Matias Garcia-Constantino
Jesus Favela

Activity recognition is one of the most active areas of research in ubiquitous computing. In particular, gait activity recognition is useful for identifying various risk factors in people's health that are directly related to their physical activity. One issue in activity recognition, and in gait recognition in particular, is that datasets are often unbalanced (i.e., the distribution of classes is not uniform), and due to this disparity, models tend to favor the classes with more instances. In the present study, two methods for classifying gait activities using accelerometer and gyroscope data from a large-scale public dataset were evaluated and compared. The gait activities in this dataset are: (i) going down an incline, (ii) going up an incline, (iii) walking on level ground, (iv) going down stairs, and (v) going up stairs. The proposed methods are based on conventional (shallow) and deep learning techniques. In addition, three data treatments were evaluated: the original unbalanced data, sampled data, and augmented data. The latter is based on the generation of synthetic data from segmented gait data. The best results were obtained with classifiers built from augmented data, with F-measure results of 0.812 (σ = 0.078) for the shallow learning approach and 0.927 (σ = 0.033) for the deep learning approach. In addition, the data augmentation strategy proposed to deal with class imbalance increased classification performance for both techniques.
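The abstract does not specify the exact augmentation algorithm, but a hedged sketch of one common strategy for synthesizing inertial segments (jitter plus per-channel scaling, applied until every class matches the majority count) illustrates the idea. The noise and scale parameters below are assumptions.

```python
# Sketch of synthetic-segment augmentation for imbalanced gait data:
# oversample minority-class accelerometer/gyroscope segments by adding
# jitter and random per-channel scaling. Parameters are assumptions;
# the paper generates synthetic data from segmented gait cycles.
import numpy as np

def augment_segment(seg: np.ndarray, rng, sigma=0.02, scale_range=0.1):
    """Return a synthetic variant of one (time, channels) segment."""
    scale = 1.0 + rng.uniform(-scale_range, scale_range, seg.shape[1])
    return seg * scale + rng.normal(0.0, sigma, seg.shape)

def balance_classes(segments, labels, rng):
    """Oversample every class up to the majority-class count."""
    segments, labels = list(segments), list(labels)
    counts = {c: labels.count(c) for c in set(labels)}
    target = max(counts.values())
    for c, n in counts.items():
        idx = [i for i, lab in enumerate(labels) if lab == c]
        for _ in range(target - n):
            src = segments[rng.choice(idx)]
            segments.append(augment_segment(src, rng))
            labels.append(c)
    return segments, labels

rng = np.random.default_rng(1)
segs = [rng.normal(size=(128, 6)) for _ in range(30)]  # acc+gyro, 6 axes
labs = [0] * 20 + [1] * 7 + [2] * 3                    # unbalanced
segs, labs = balance_classes(segs, labs, rng)
print({c: labs.count(c) for c in set(labs)})           # {0: 20, 1: 20, 2: 20}
```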


2021
Vol 13 (4)
pp. 729
Author(s):  
Pedro J. Navarro
Leanne Miller
Alberto Gila-Navarro
María Victoria Díaz-Galián
Diego J. Aguila
...  

Current predefined architectures for deep learning are computationally very heavy and use tens of millions of parameters, so their computational costs may be prohibitive for many experimental or technological setups. We developed an ad hoc architecture for the classification of multispectral images using deep learning techniques. The architecture, called 3DeepM, is composed of 3D filter banks especially designed for the extraction of spatial-spectral features in multichannel images. The new architecture has been tested on a sample of 12,210 multispectral images of seedless table grape varieties: Autumn Royal, Crimson Seedless, Itum4, Itum5, and Itum9. 3DeepM classified 100% of the images and obtained the best overall results in terms of accuracy, number of classes, number of parameters, and training time compared to similar work. In addition, this paper presents a flexible and reconfigurable computer vision system designed for the acquisition of multispectral images in the range of 400 nm to 1000 nm. The vision system enabled the creation of the first dataset consisting of 12,210 37-channel multispectral images (12 VIS + 25 IR) of five seedless table grape varieties, which was used to validate the 3DeepM architecture. Compared to predefined classification architectures such as AlexNet or ResNet, or to ad hoc architectures with a very high number of parameters, 3DeepM shows the best classification performance despite using 130-fold fewer parameters than the architectures it was compared against. 3DeepM can be used in a multitude of applications involving multispectral images, such as remote sensing or medical diagnosis. In addition, its small number of parameters makes 3DeepM ideal for online classification systems aboard autonomous robots or unmanned vehicles.
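A hedged sketch of a spatial-spectral 3D-convolution classifier in the spirit of 3DeepM follows. The 37 bands and five classes come from the abstract; the kernel shapes, layer widths, and 64×64 input resolution are assumptions, not the published architecture.

```python
# Illustrative 3D-convolution classifier for multispectral cubes, in the
# spirit of 3DeepM's spatial-spectral filter banks. Kernel sizes, widths,
# and input resolution are assumptions, not the published architecture.
import torch
import torch.nn as nn

class SpectralCNN3D(nn.Module):
    def __init__(self, num_classes: int = 5):
        super().__init__()
        self.features = nn.Sequential(
            # One input "channel"; the depth axis holds the 37 spectral
            # bands, so each 3D kernel mixes neighbouring bands and pixels.
            nn.Conv3d(1, 8, kernel_size=(7, 3, 3), padding=(3, 1, 1)),
            nn.ReLU(),
            nn.MaxPool3d((2, 2, 2)),
            nn.Conv3d(8, 16, kernel_size=(5, 3, 3), padding=(2, 1, 1)),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
        )
        self.head = nn.Linear(16, num_classes)

    def forward(self, x):  # x: (batch, 1, 37 bands, H, W)
        return self.head(self.features(x).flatten(1))

model = SpectralCNN3D()
print(sum(p.numel() for p in model.parameters()), "parameters")
print(model(torch.randn(2, 1, 37, 64, 64)).shape)  # torch.Size([2, 5])
```

The parameter-count print shows why such compact 3D filter banks can undercut predefined architectures by orders of magnitude.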


Due to advances in technology, the availability of resources, and the increased use of on-node sensors, enormous amounts of data are being collected. This physiological information must be analyzed and classified with efficient and effective approaches such as deep learning and artificial intelligence. Human activity recognition (HAR) plays a dominant role in sports, security, anti-crime, and healthcare, as well as in environmental applications such as wildlife observation. Most techniques work well for offline processing rather than real-time processing, and few approaches provide high accuracy for real-time processing of large-scale data; deep learning is one of the most promising. Limited resources restrict the use of deep learning on low-power body-worn devices, even though deep learning implementations are known to produce precise results across computing systems. In this paper, we propose a deep learning approach that integrates features learned from inertial sensor data with complementary knowledge obtained from a set of shallow features, making accurate real-time activity classification feasible. The aim of this integrated design is to remove the obstacles that deep learning methods pose for real-time analysis. Before passing the data into the deep learning framework, we perform spectral analysis to optimize the proposed methodology for on-node computation. The accuracy of the combined approach is tested on datasets collected in laboratory and real-world, controlled and uncontrolled, environments. Our results demonstrate the validity of the methodology on various human activity datasets, outperforming other techniques, including the two methods used within our combined pipeline. We also show that the classification times of our integrated design are consistent with on-node, real-time analysis criteria on smartphones and wearable technology.
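A minimal sketch of such an integrated design, assuming FFT magnitudes as the shallow spectral features and a small 1D CNN as the deep branch, is shown below; all names and layer sizes are illustrative, not the paper's pipeline.

```python
# Sketch of a hybrid design: shallow spectral features (FFT magnitudes
# here, as a stand-in for on-node spectral analysis) concatenated with
# features learned from raw inertial windows. All sizes are assumed.
import torch
import torch.nn as nn

class HybridHAR(nn.Module):
    def __init__(self, window=128, channels=3, num_classes=6):
        super().__init__()
        self.deep = nn.Sequential(       # learned features from raw data
            nn.Conv1d(channels, 32, kernel_size=9, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        n_fft = (window // 2 + 1) * channels
        self.head = nn.Sequential(       # fuse deep + shallow features
            nn.Linear(32 + n_fft, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, x):                # x: (batch, channels, window)
        shallow = torch.fft.rfft(x, dim=-1).abs().flatten(1)
        return self.head(torch.cat([self.deep(x), shallow], dim=1))

model = HybridHAR()
print(model(torch.randn(4, 3, 128)).shape)  # torch.Size([4, 6])
```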


Author(s):  
Yuejun Liu
Yifei Xu
Xiangzheng Meng
Xuguang Wang
Tianxu Bai

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multi-dimensional deep learning algorithms are widely used in image classification and recognition and have achieved great success. Objective: A method based on multi-dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases from SPECT images, and the performances of different deep learning models are evaluated and compared. Methods: Thyroid SPECT images of three types are collected: hyperthyroidism, normal, and hypothyroidism. In pre-processing, the thyroid region of interest is segmented and the data samples are augmented. Four models, a standard CNN, Inception, VGG16, and an RNN, are used to evaluate deep learning methods. Results: The deep-learning-based methods show good classification performance, with accuracy of 92.9%-96.2% and AUC of 97.8%-99.6%. The VGG16 model performs best, with an accuracy of 96.2% and an AUC of 99.6%; in particular, the VGG16 model with a changing learning rate works best. Conclusion: The standard CNN, Inception, VGG16, and RNN deep learning models are all effective for the classification of thyroid diseases from SPECT images. The accuracy of the deep-learning-based assisted diagnostic method is higher than that of other methods reported in the literature.
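As an illustration of the reported best configuration, the sketch below fine-tunes a torchvision VGG16 on three classes with a step-decayed ("changing") learning rate. Only VGG16, the three classes, and the varying learning rate come from the abstract; the optimizer, schedule, and input shapes are assumptions.

```python
# Sketch of VGG16 fine-tuning on three thyroid SPECT classes with a
# decaying learning rate. Optimizer, schedule, and image size assumed.
import torch
import torch.nn as nn
from torchvision.models import vgg16

model = vgg16(weights=None)                # or pretrained weights
model.classifier[6] = nn.Linear(4096, 3)   # hyper / normal / hypo

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=5, gamma=0.5)
criterion = nn.CrossEntropyLoss()

# One illustrative step on random tensors shaped like SPECT ROI crops.
images = torch.randn(4, 3, 224, 224)
labels = torch.randint(0, 3, (4,))
loss = criterion(model(images), labels)
loss.backward()
optimizer.step()
scheduler.step()                           # learning rate decays over epochs
print(float(loss), scheduler.get_last_lr())
```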


Author(s):  
Chandni
Alok Kumar Singh Kushwaha
Jagwinder Kaur Dhillon

2020
Vol 13 (4)
pp. 627-640
Author(s):  
Avinash Chandra Pandey
Dharmveer Singh Rajpoot

Background: Sentiment analysis is contextual mining of text that determines the viewpoint of users with respect to sentiment-bearing topics commonly discussed on social networking websites. Twitter is one such site, where people express their opinions about any topic in the form of tweets. These tweets can be examined using various sentiment classification methods to find the opinions of users. Traditional sentiment analysis methods use manually extracted features for opinion classification. Manual feature extraction is a complicated task since it requires predefined sentiment lexicons. Deep learning methods, on the other hand, automatically extract relevant features from data; hence, they provide better performance and richer representation capability than traditional methods. Objective: The main aim of this paper is to improve sentiment classification accuracy and reduce computational cost. Method: To achieve this objective, a hybrid deep learning model based on a convolutional neural network and a bi-directional long short-term memory network has been introduced. Results: The proposed sentiment classification method achieves the highest accuracy on most of the datasets. Furthermore, statistical analysis validates the efficacy of the proposed method. Conclusion: Sentiment classification accuracy can be improved by creating effective hybrid models, and performance can be further enhanced by tuning the hyperparameters of deep learning models.
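A minimal sketch of a CNN plus bidirectional-LSTM text classifier of the kind proposed follows; the vocabulary size, embedding width, and layer sizes are assumptions, and the paper's exact hybrid may differ.

```python
# Illustrative CNN + bidirectional-LSTM sentiment classifier. Vocabulary
# size, embedding width, and layer sizes are assumptions for the sketch.
import torch
import torch.nn as nn

class CNNBiLSTM(nn.Module):
    def __init__(self, vocab=20_000, emb=100, hidden=64, classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab, emb)
        # CNN extracts local n-gram features from the embeddings...
        self.conv = nn.Conv1d(emb, 128, kernel_size=3, padding=1)
        # ...and the BiLSTM models their order in both directions.
        self.lstm = nn.LSTM(128, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, classes)

    def forward(self, tokens):                   # tokens: (batch, seq_len)
        x = self.emb(tokens).transpose(1, 2)     # (batch, emb, seq)
        x = torch.relu(self.conv(x)).transpose(1, 2)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])             # last-step features

model = CNNBiLSTM()
print(model(torch.randint(0, 20_000, (8, 40))).shape)  # torch.Size([8, 2])
```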

