Body Joints and Trajectory Guided 3D Deep Convolutional Descriptors for Human Activity Identification

Human Activity Identification (HAI) in videos is one of the most active research fields in computer vision. Among the various HAI techniques, Joints-pooled 3D-Deep convolutional Descriptors (JDD) have achieved strong performance by learning body joints and capturing spatiotemporal characteristics concurrently. However, estimating the locations of body joints on large-scale datasets is time-consuming, and the computational cost of the skeleton estimation algorithm is high. The recognition accuracy of traditional approaches also needs to be improved by considering body joints and trajectory points together. Therefore, the key goal of this work is to improve recognition accuracy using optical flow integrated with a two-stream bilinear model, namely Joints and Trajectory-pooled 3D-Deep convolutional Descriptors (JTDD). In this model, optical-flow/trajectory points between video frames are also extracted at the body joint positions as input to the proposed JTDD. Two streams of a Convolutional 3D network (C3D), combined via a bilinear product, are used to extract features, generate joint descriptors for video sequences, and capture spatiotemporal features. The whole network is then trained end-to-end on the two-stream bilinear C3D model to obtain video descriptors. These video descriptors are classified by a linear Support Vector Machine (SVM) to recognize human activities. By exploiting both body joints and trajectory points, action recognition is achieved efficiently. Finally, the recognition accuracy of the JTDD model is compared with that of the JDD model.
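As an illustration of the two-stream bilinear pooling step, the following minimal NumPy/scikit-learn sketch (not the authors' C3D implementation; all toy data and dimensions are illustrative) pools a joint-feature stream and a trajectory-feature stream with a per-frame outer product, averages over time, and feeds the flattened descriptor to a linear SVM:

```python
import numpy as np
from sklearn.svm import LinearSVC

def bilinear_descriptor(stream_a, stream_b):
    """Bilinear-pool two per-frame feature streams of shape (T, Da) and (T, Db):
    outer product per frame, averaged over time, then flattened."""
    T = stream_a.shape[0]
    outer = np.einsum('ti,tj->ij', stream_a, stream_b) / T  # (Da, Db)
    v = outer.ravel()
    return v / (np.linalg.norm(v) + 1e-12)  # L2-normalise, common for bilinear features

rng = np.random.default_rng(0)
# toy videos: 40 clips, 16 frames, 32-dim joint features, 32-dim trajectory features
X, y = [], []
for label in (0, 1):
    for _ in range(20):
        joints = rng.normal(loc=label, size=(16, 32))       # joint stream
        trajs = rng.normal(loc=-label, size=(16, 32))       # trajectory stream
        X.append(bilinear_descriptor(joints, trajs))
        y.append(label)
X, y = np.array(X), np.array(y)

# linear SVM on the pooled video descriptors
clf = LinearSVC(C=1.0).fit(X, y)
print(clf.score(X, y))
```

In the real model the two streams would be learned C3D feature maps rather than raw arrays, but the pooling and classification stages have this shape.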

2011 ◽  
Vol 188 ◽  
pp. 629-635
Author(s):  
Xia Yue ◽  
Chun Liang Zhang ◽  
Jian Li ◽  
H.Y. Zhu

A hybrid support vector machine (SVM) and hidden Markov model (HMM) was introduced for pump fault diagnosis. The model has two layers: the first layer uses an HMM for a preliminary classification that narrows down the set of possible faults; the second layer uses this information to activate the corresponding SVMs, improving recognition accuracy. The structure of this hybrid model is clear and feasible, and its good scalability gives it potential for large-scale multiclass fault diagnosis. Recognition experiments on 26 statuses of the ZLH600-2 pump showed that the model performs well on multiclass problems. The recognition rate for one bearing eccentricity increased from the SVM's 84.42% to 89.61%, while the average recognition rate of the hybrid model reached 95.05%. Although some of the goals set during model construction were not fully realized, the model remains very useful in practical applications.
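A minimal sketch of the two-layer idea, with a nearest-centroid screen standing in for the HMM layer (the HMM itself is not reproduced here; all data and names are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# toy data: 6 fault classes in an 8-D feature space
centres = rng.normal(size=(6, 8)) * 4
X = np.vstack([c + rng.normal(size=(30, 8)) for c in centres])
y = np.repeat(np.arange(6), 30)

centroids = np.array([X[y == k].mean(axis=0) for k in range(6)])

def classify(x, top_k=2):
    # layer 1: coarse screen keeps the top_k most plausible classes
    # (a cheap stand-in for the HMM's preliminary classification)
    d = np.linalg.norm(centroids - x, axis=1)
    cand = np.argsort(d)[:top_k]
    # layer 2: an SVM restricted to the candidate classes makes the final call
    mask = np.isin(y, cand)
    svm = SVC(kernel='rbf').fit(X[mask], y[mask])
    return svm.predict(x[None, :])[0]

preds = np.array([classify(x) for x in X[::15]])
print(preds)
```

In a real diagnosis system the per-group SVMs would be trained once offline rather than fitted per query as in this toy version.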


2021 ◽  
Vol 6 (12) ◽  
pp. 13931-13953
Author(s):  
Yunfeng Shi ◽  
Shu Lv ◽  
Kaibo Shi ◽  
...  

<abstract><p>Support vector machine (SVM) is one of the most powerful machine learning technologies and has attracted wide attention because of its remarkable performance. However, when dealing with the classification of large-scale datasets, the high complexity of the SVM model leads to low efficiency and becomes impractical. Exploiting the sparsity of SVM in the sample space, this paper presents a new parallel data geometry analysis (PDGA) algorithm that reduces the SVM training set and thereby improves training efficiency. PDGA introduces the Mahalanobis distance to measure the distance from each sample to its centroid and, based on this, proposes a method that identifies non-support vectors and outliers at the same time, helping to remove redundant data. When the training set is reduced further, a cosine angle distance analysis method is proposed to determine whether samples are redundant, ensuring that valuable data are not removed. Unlike previous data geometry analysis methods, the PDGA algorithm is implemented in parallel, which greatly reduces the computational cost. Experimental results on an artificial dataset and six real datasets show that the algorithm adapts to different sample distributions, significantly reducing training time and memory requirements without sacrificing classification accuracy, and that its performance is clearly better than that of five competitive algorithms.</p></abstract>
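A serial, single-class sketch of the Mahalanobis-distance pruning step (NumPy only; the parallel implementation and the cosine angle analysis of PDGA are not reproduced, and the keep/outlier fractions are illustrative thresholds):

```python
import numpy as np

def mahalanobis_prune(X, keep_frac=0.5, outlier_frac=0.05):
    """Rank samples of one class by Mahalanobis distance to the centroid and
    drop both the innermost points (likely non-support-vectors) and the extreme
    tail (likely outliers), keeping only the boundary region."""
    mu = X.mean(axis=0)
    cov = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularised
    inv = np.linalg.inv(cov)
    diff = X - mu
    d = np.sqrt(np.einsum('ij,jk,ik->i', diff, inv, diff))  # per-sample distance
    order = np.argsort(d)                     # ascending distance from centroid
    n = len(X)
    n_out = int(outlier_frac * n)
    inner_cut = n - int(keep_frac * n) - n_out
    keep = order[inner_cut:n - n_out]         # indices of the boundary band
    return keep

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 3))
idx = mahalanobis_prune(X)
print(len(idx))
```

The retained indices would then be used to train the SVM on a much smaller set.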


Author(s):  
Huan Wu ◽  
Yong-Ping Zhao ◽  
Tan Hui-Jun

Inlet flow pattern recognition is one of the most crucial issues in, and the foundation of, protection control for supersonic air-breathing propulsion systems. This article proposes a hybrid algorithm combining fast K-nearest neighbors (F-KNN) and an improved directed acyclic graph support vector machine (I-DAGSVM) to solve this problem using a large amount of experimental data. The basic idea is to combine F-KNN and I-DAGSVM so as to reduce both classification error and computational cost when dealing with big data. The proposed algorithm first quickly finds a small set of nearest samples from the training set using F-KNN and then trains a local I-DAGSVM classifier on these samples. Compared with standard KNN, which must compare each test sample with the entire training set, F-KNN uses an efficient index-based strategy to find nearest samples quickly; however, misclassification can still occur when equal numbers of nearest samples belong to different classes. To cope with this, I-DAGSVM is adopted, and its tree structure is improved by a measure of class separability to overcome the sequential randomization in classifier generation and to reduce classification error. In addition, the proposed algorithm compensates for the expensive computational cost of I-DAGSVM, since it only needs to train a local classifier on the small number of samples found by F-KNN rather than on all training samples. With these strategies, the proposed algorithm combines the advantages of F-KNN and I-DAGSVM and can be applied to large-scale supersonic inlet flow pattern recognition. The experimental results demonstrate the effectiveness of the proposed algorithm in terms of classification accuracy and test time.
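The interplay of the two stages can be sketched as follows, with a brute-force nearest-neighbour search standing in for the indexed F-KNN and a plain RBF SVM standing in for I-DAGSVM (toy 2-D data; all names and sizes are illustrative):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(3)
# toy flow-pattern data: 3 classes in 2-D
X = np.vstack([rng.normal(c, 0.5, size=(50, 2)) for c in [(0, 0), (3, 0), (0, 3)]])
y = np.repeat(np.arange(3), 50)

def predict(x, k=8):
    # stage 1: nearest-neighbour search (stand-in for the indexed F-KNN)
    nn = np.argsort(np.linalg.norm(X - x, axis=1))[:k]
    votes = np.bincount(y[nn], minlength=3)
    winners = np.flatnonzero(votes == votes.max())
    if len(winners) == 1:              # clear majority: KNN decides alone
        return winners[0]
    # stage 2: vote tie -> train a local SVM on just the k neighbours
    svm = SVC(kernel='rbf').fit(X[nn], y[nn])
    return svm.predict(x[None, :])[0]

preds = np.array([predict(x) for x in X[::10]])
print((preds == y[::10]).mean())
```

Because the SVM only ever sees the k neighbours, its training cost stays small regardless of the full training-set size, which is the efficiency argument made in the article.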


Recognition of human emotions is a fascinating research field that has motivated many researchers to explore various approaches, such as facial expression, speech, or body gesture. Electroencephalography (EEG) is another approach, recognizing human emotion through brain signals, and has offered promising findings. Although EEG signals provide detailed information on human emotional states, analyzing their non-linear and chaotic characteristics is a substantial problem. The main challenge remains extracting relevant features from EEG signals in order to achieve optimum classification performance. Various feature extraction methods have been developed, which can be broadly categorized as time-, frequency-, or time-frequency-based. Yet there are numerous settings that can affect the performance of any model. In this paper, we investigated the performance of the Discrete Wavelet Transform (DWT) and the Discrete Wavelet Packet Transform (DWPT), both time-frequency domain methods, using Support Vector Machine (SVM) and k-Nearest Neighbor (KNN) classifiers. Different SVM kernel functions and KNN distance metrics were tested using subject-dependent and subject-independent approaches. The experiments were implemented on the publicly available DEAP dataset. The results show that DWT is most suitable with a weighted KNN classifier, while DWPT achieved better results with a linear SVM classifier, for accurately classifying EEG signals in the subject-dependent approach. Consistent results were observed for DWT-KNN in the subject-independent approach; however, SVM performed better with a quadratic kernel. These results indicate that further investigation is warranted to examine the impact of different method settings when analyzing large-scale EEG data.
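For reference, one DWT level is just a pair of filtered, downsampled sequences. A minimal Haar-wavelet sketch (not the exact wavelet, level count, or band layout used in the paper) that turns an EEG-like signal into relative band-energy features:

```python
import numpy as np

def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform:
    returns (approximation, detail) coefficients."""
    s = np.asarray(signal, dtype=float)
    if len(s) % 2:                          # pad to even length
        s = np.append(s, s[-1])
    a = (s[0::2] + s[1::2]) / np.sqrt(2)    # low-pass half
    d = (s[0::2] - s[1::2]) / np.sqrt(2)    # high-pass half
    return a, d

def band_energies(signal, levels=4):
    """Relative energy per sub-band, a common EEG feature vector."""
    feats, a = [], np.asarray(signal, dtype=float)
    for _ in range(levels):
        a, d = haar_dwt(a)
        feats.append(np.sum(d ** 2))        # detail-band energy per level
    feats.append(np.sum(a ** 2))            # final approximation energy
    feats = np.array(feats)
    return feats / feats.sum()

t = np.linspace(0, 1, 512, endpoint=False)
eeg = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 40 * t)
print(band_energies(eeg).round(3))
```

DWPT differs in that the detail branches are also decomposed recursively, giving a full binary tree of sub-bands instead of the one-sided cascade shown here.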


2021 ◽  
Author(s):  
Bahareh Nikpour ◽  
Narges Armanfard

<div>Skeleton-based human activity recognition has attracted much attention due to its wide range of applications. Skeleton data consist of the two- or three-dimensional coordinates of body joints. Not all body joints are effective in recognizing different activities, so finding the key joints within a video and across different activities plays a significant role in improving performance. In this paper we propose a novel framework that performs joint selection in skeleton video frames for the purpose of human activity recognition. To this end, we formulate joint selection as a Markov Decision Process (MDP) and employ deep reinforcement learning to find the most informative joints per frame. The proposed joint selection method is a general framework that can be employed to improve human activity classification methods. Experimental results on two benchmark activity recognition data sets, using three different classifiers, demonstrate the effectiveness of the proposed method.</div>
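The RL formulation is beyond a short sketch, but the underlying premise that a few informative joints suffice can be illustrated with simple greedy forward selection (explicitly not the authors' MDP method; toy skeleton data and a nearest-centroid criterion stand in for the real classifiers):

```python
import numpy as np

rng = np.random.default_rng(4)
# toy skeleton clips: 60 clips x 20 frames x 10 joints x 2 coords, 2 activities
n_joints = 10
y = np.repeat([0, 1], 30)
X = rng.normal(size=(60, 20, n_joints, 2))
X[y == 1, :, :3, :] += 1.5          # only joints 0-2 carry the class signal

def score(joints):
    """Nearest-centroid training accuracy using only the chosen joints."""
    feats = X[:, :, joints, :].reshape(len(X), -1)
    c0, c1 = feats[y == 0].mean(0), feats[y == 1].mean(0)
    pred = (np.linalg.norm(feats - c1, axis=1)
            < np.linalg.norm(feats - c0, axis=1)).astype(int)
    return (pred == y).mean()

# greedy forward selection: add the joint that helps most at each step
selected = []
for _ in range(3):
    best = max((j for j in range(n_joints) if j not in selected),
               key=lambda j: score(selected + [j]))
    selected.append(best)
print(sorted(selected))
```

On this toy data the procedure recovers (most of) the planted informative joints; the paper's contribution is doing this per frame, adaptively, with a learned policy.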


Sensors ◽  
2020 ◽  
Vol 20 (15) ◽  
pp. 4178
Author(s):  
Yaguang Kong ◽  
Chuang Li ◽  
Zhangping Chen ◽  
Xiaodong Zhao

The recognition of the non-line-of-sight (NLOS) state is a prerequisite for mitigating NLOS errors and is crucial for positioning accuracy. Recent studies identify only the line-of-sight (LOS) and NLOS states, ignoring the contribution of occlusion categories to spatial information perception. This paper proposes a bidirectional search algorithm based on maximum correlation, minimum redundancy, and minimum computational cost (BS-mRMRMC). The optimal channel impulse response (CIR) feature set, which can distinguish the NLOS and LOS states as well as the occlusion categories, is determined by setting constraint thresholds on both the maximum evaluation index and the computational cost. Identifying occlusion categories provides more effective information for ultra-wideband (UWB) indoor space perception. Based on the vector projection method, a hierarchical decision tree support vector machine (DT-SVM) is designed to verify the recognition accuracy of each category. Experiments show that the proposed algorithm achieves an average recognition accuracy of 96.7% across occlusion categories, better than the other three algorithms based on the same number of UWB CIR signal characteristics.
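The max-relevance/min-redundancy idea behind the feature-set search can be sketched with absolute Pearson correlation as a simplified criterion (this is not the BS-mRMRMC algorithm itself, which also weighs computational cost; data and feature roles are illustrative):

```python
import numpy as np

def mrmr_select(X, y, n_select=3):
    """Greedy max-relevance / min-redundancy feature ranking using absolute
    Pearson correlation as the (simplified) relevance and redundancy measure."""
    n_feat = X.shape[1]
    C = np.abs(np.corrcoef(np.column_stack([X, y]), rowvar=False))
    relevance = C[:-1, -1]          # |corr(feature_i, label)|
    redundancy = C[:-1, :-1]        # |corr(feature_i, feature_j)|
    selected = [int(np.argmax(relevance))]
    while len(selected) < n_select:
        rest = [j for j in range(n_feat) if j not in selected]
        scores = [relevance[j] - redundancy[j, selected].mean() for j in rest]
        selected.append(rest[int(np.argmax(scores))])
    return selected

rng = np.random.default_rng(5)
n = 2000
y = rng.integers(0, 2, size=n).astype(float)
f0 = y + 0.3 * rng.normal(size=n)       # informative CIR feature
f1 = f0 + 0.05 * rng.normal(size=n)     # informative but redundant with f0
f2 = rng.normal(size=n)                 # pure noise
f3 = -y + 0.3 * rng.normal(size=n)      # informative, less redundant with f0
X = np.column_stack([f0, f1, f2, f3])
print(mrmr_select(X, y, n_select=2))
```

The greedy ranking skips the near-duplicate feature f1 in favour of the complementary f3, which is the behaviour the mRMR criterion is designed to produce.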


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2654
Author(s):  
Xue Ding ◽  
Ting Jiang ◽  
Yi Zhong ◽  
Yan Huang ◽  
Zhiwei Li

Wi-Fi-based device-free human activity recognition has recently become a vital underpinning for various emerging applications, ranging from the Internet of Things (IoT) to Human–Computer Interaction (HCI). Although this technology has been successfully demonstrated for location-dependent sensing, it relies on sufficient data samples for large-scale sensing, which is enormously labor-intensive and time-consuming. However, in real-world applications, location-independent sensing is crucial and indispensable. Therefore, how to alleviate adverse effects on recognition accuracy caused by location variations with the limited dataset is still an open question. To address this concern, we present a location-independent human activity recognition system based on Wi-Fi named WiLiMetaSensing. Specifically, we first leverage a Convolutional Neural Network and Long Short-Term Memory (CNN-LSTM) feature representation method to focus on location-independent characteristics. Then, in order to well transfer the model across different positions with limited data samples, a metric learning-based activity recognition method is proposed. Consequently, not only the generalization ability but also the transferable capability of the model would be significantly promoted. To fully validate the feasibility of the presented approach, extensive experiments have been conducted in an office with 24 testing locations. The evaluation results demonstrate that our method can achieve more than 90% in location-independent human activity recognition accuracy. More importantly, it can adapt well to the data samples with a small number of subcarriers and a low sampling rate.
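A minimal sketch of the metric-learning intuition: with only a few labelled samples at a new location, each activity is represented by a prototype in embedding space and queries are matched by cosine similarity (the CNN-LSTM encoder is not reproduced; the embeddings here are synthetic stand-ins):

```python
import numpy as np

def prototype_classify(support_x, support_y, query_x):
    """Few-shot, prototype-based classification: each class is the mean of its
    few support embeddings; queries take the label of the nearest prototype
    under cosine similarity."""
    classes = np.unique(support_y)
    protos = np.array([support_x[support_y == c].mean(axis=0) for c in classes])
    protos /= np.linalg.norm(protos, axis=1, keepdims=True)
    q = query_x / np.linalg.norm(query_x, axis=1, keepdims=True)
    sims = q @ protos.T               # cosine similarity to each prototype
    return classes[np.argmax(sims, axis=1)]

rng = np.random.default_rng(6)
# embeddings of 3 activities at a *new* location, 5 labelled samples each
centres = rng.normal(size=(3, 16)) * 3
support_x = np.vstack([c + rng.normal(size=(5, 16)) for c in centres])
support_y = np.repeat(np.arange(3), 5)
query_x = np.vstack([c + rng.normal(size=(20, 16)) for c in centres])
query_y = np.repeat(np.arange(3), 20)

acc = (prototype_classify(support_x, support_y, query_x) == query_y).mean()
print(acc)
```

The metric-learning training stage would shape the encoder so that embeddings cluster by activity regardless of location; this sketch shows only the inference step that makes few-shot transfer possible.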



2020 ◽  
Vol 10 (23) ◽  
pp. 8606
Author(s):  
Alfonso Monaco ◽  
Nicola Amoroso ◽  
Loredana Bellantuono ◽  
Ester Pantaleo ◽  
Sabina Tangaro ◽  
...  

The COVID-19 pandemic has amplified the urgency of developments in computer-assisted medicine and, in particular, the need for automated tools supporting the clinical diagnosis and assessment of respiratory symptoms. This need was already clear to the scientific community, which launched an international challenge in 2017 at the International Conference on Biomedical Health Informatics (ICBHI) for the implementation of accurate algorithms for respiratory sound classification. In this work, we present a framework for respiratory sound classification based on two kinds of features: (i) short-term features, which summarize sound properties on a time scale of tenths of a second, and (ii) long-term features, which assess sound properties on a time scale of seconds. Using the publicly available dataset provided by ICBHI, we cross-validated the classification performance of a neural network model over 6895 respiratory cycles and 126 subjects. The proposed model reached an accuracy of 85%±3% and a precision of 80%±8%, which compare well with the body of literature. The robustness of the predictions was assessed by comparison with state-of-the-art machine learning tools, such as the support vector machine, Random Forest, and deep neural networks. The model presented here is therefore suitable for large-scale applications and for adoption in clinical practice. Finally, an interesting observation is that both short-term and long-term features are necessary for accurate classification, which could be the subject of future studies related to its clinical interpretation.
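The two feature families can be sketched as follows: frame-level (≈0.1 s) descriptors, then clip-level statistics over them (illustrative features only: RMS energy and zero-crossing rate, not the paper's exact feature set):

```python
import numpy as np

def short_term_features(x, sr, frame_s=0.1):
    """Frame-level (≈0.1 s) RMS energy and zero-crossing rate."""
    n = int(frame_s * sr)
    frames = x[:len(x) // n * n].reshape(-1, n)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    zcr = (np.diff(np.sign(frames), axis=1) != 0).mean(axis=1)
    return np.column_stack([rms, zcr])          # (n_frames, 2)

def long_term_features(st):
    """Clip-level summary of the short-term trajectory (mean/std per feature),
    capturing behaviour over seconds."""
    return np.concatenate([st.mean(axis=0), st.std(axis=0)])

sr = 4000                                       # illustrative sampling rate
t = np.arange(0, 2.0, 1 / sr)
# a 2 s synthetic "respiratory cycle": 200 Hz tone with slow amplitude modulation
cycle = np.sin(2 * np.pi * 200 * t) * (1 + 0.5 * np.sin(2 * np.pi * 0.5 * t))
st = short_term_features(cycle, sr)
print(st.shape, long_term_features(st).shape)
```

Concatenating both feature vectors per cycle gives the kind of mixed-time-scale input the classifier in the paper operates on.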


2019 ◽  
Vol 9 (20) ◽  
pp. 4397 ◽  
Author(s):  
Soad Almabdy ◽  
Lamiaa Elrefaei

Face recognition (FR) is the process of identifying people from facial images. The technology is applied broadly in biometrics, information security, access control, law enforcement, smart cards, and surveillance. A facial recognition system is built in two steps: first, the facial features are extracted, and second, pattern classification is performed. Deep learning, specifically the convolutional neural network (CNN), has recently made commendable progress in FR. This paper investigates the performance of a pre-trained CNN with a multi-class support vector machine (SVM) classifier, and the performance of transfer learning using the AlexNet model, for classification. The study considers CNN architectures that have recorded the best outcomes in the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) in past years, specifically AlexNet and ResNet-50. Recognition accuracy was used as the criterion for performance optimization of the CNN algorithm. Improved classification rates were seen in comprehensive experiments on the ORL, GTAV face, Georgia Tech face, labeled faces in the wild (LFW), frontalized labeled faces in the wild (F_LFW), YouTube face, and FEI face datasets. The results show that our model achieves higher accuracy than most state-of-the-art models, with accuracies ranging from 94% to 100% across all databases and an improvement in recognition accuracy of up to 39%.
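A sketch of the pre-trained-CNN-plus-SVM pipeline: in practice the feature matrix would hold penultimate-layer activations from AlexNet or ResNet-50; here synthetic 4096-D vectors stand in for those activations, so only the classifier stage is real:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(7)
# Stand-in for deep features (e.g. AlexNet fc7 activations, 4096-D):
# in a real pipeline these come from a frozen pre-trained CNN.
n_classes, per_class, dim = 5, 10, 4096
feats = np.vstack([rng.normal(loc=k, scale=5.0, size=(per_class, dim))
                   for k in range(n_classes)])
labels = np.repeat(np.arange(n_classes), per_class)

# Multi-class SVM on the frozen deep features
# (scikit-learn's SVC handles multi-class via one-vs-one internally)
clf = SVC(kernel='linear', C=1.0).fit(feats, labels)
print(clf.score(feats, labels))
```

Freezing the CNN and training only the SVM is what makes this transfer-learning variant cheap: no backpropagation through the network is needed on the target face dataset.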

