NPU RGB+D Dataset and a Feature-Enhanced LSTM-DGCN Method for Action Recognition of Basketball Players

2021 ◽  
Vol 11 (10) ◽  
pp. 4426
Author(s):  
Chunyan Ma ◽  
Ji Fan ◽  
Jinghao Yao ◽  
Tao Zhang

Computer vision-based action recognition of basketball players in training and competition has gradually become a research hotspot. However, owing to complex technical actions, diverse backgrounds, and limb occlusion, it remains a challenging task without effective solutions or public dataset benchmarks. In this study, we defined 32 kinds of atomic actions covering most of the complex actions of basketball players and built the NPU RGB+D dataset (a large-scale basketball action recognition dataset with RGB and depth data captured at Northwestern Polytechnical University) covering 12 kinds of actions performed by 10 professional basketball players, comprising 2,169 RGB+D videos and 75,000 frames, including RGB frame sequences, depth maps, and skeleton coordinates. By extracting spatial features from the distances and angles between the joint points of basketball players, we created a new feature-enhanced skeleton-based method for basketball player action recognition, called LSTM-DGCN, built on the deep graph convolutional network (DGCN) and long short-term memory (LSTM) methods. Many advanced action recognition methods were evaluated on our dataset and compared with the proposed method. The experimental results show that the NPU RGB+D dataset is a competitive benchmark for current action recognition algorithms and that LSTM-DGCN outperforms state-of-the-art action recognition methods on various evaluation criteria on our dataset. Our action classification scheme and the NPU RGB+D dataset are valuable resources for basketball player action recognition research. The feature enhancement improves the motion expression ability of the skeleton data, giving LSTM-DGCN more accurate action recognition.
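The feature enhancement described above can be illustrated with a short sketch. The (T, V, 3) skeleton layout and the bone list are assumptions for illustration, not the authors' exact format; the idea is simply to derive pairwise joint distances and inter-bone angles from the skeleton coordinates.

```python
# Hypothetical sketch of distance/angle feature enhancement for skeleton data.
# The (T, V, 3) layout and bone_pairs list are assumptions, not the authors' format.
import numpy as np

def enhance_skeleton_features(skeleton, bone_pairs):
    """skeleton: (T, V, 3) array of 3D joint coordinates for one clip.
    bone_pairs: list of (parent, child) joint index tuples (at least two bones)."""
    T, V, _ = skeleton.shape
    # Pairwise Euclidean distances between all joints, per frame: (T, V, V)
    diff = skeleton[:, :, None, :] - skeleton[:, None, :, :]
    distances = np.linalg.norm(diff, axis=-1)
    # Angles between consecutive bone vectors, per frame
    angles = []
    for (a, b), (c, d) in zip(bone_pairs[:-1], bone_pairs[1:]):
        v1 = skeleton[:, b] - skeleton[:, a]
        v2 = skeleton[:, d] - skeleton[:, c]
        cos = np.sum(v1 * v2, axis=-1) / (
            np.linalg.norm(v1, axis=-1) * np.linalg.norm(v2, axis=-1) + 1e-8)
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    angles = np.stack(angles, axis=-1)          # (T, len(bone_pairs) - 1)
    flat_dist = distances.reshape(T, V * V)     # (T, V*V)
    return np.concatenate([flat_dist, angles], axis=-1)
```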

2020 ◽  
Vol 34 (03) ◽  
pp. 2669-2676 ◽  
Author(s):  
Wei Peng ◽  
Xiaopeng Hong ◽  
Haoyu Chen ◽  
Guoying Zhao

Human action recognition from skeleton data, fuelled by the graph convolutional network (GCN) with its powerful capability of modeling non-Euclidean data, has attracted considerable attention. However, many existing GCNs use a pre-defined graph structure shared across the entire network, which can lose implicit joint correlations, especially in higher-level features. Moreover, the mainstream spectral GCN is approximated with a first-order hop, so higher-order connections are not well captured. All of this demands substantial effort to design a better GCN architecture. To address these problems, we turn to neural architecture search (NAS) and propose the first automatically designed GCN for this task. Specifically, we explore the spatial-temporal correlations between nodes and build a search space with multiple dynamic graph modules. We also introduce multiple-hop modules, which are expected to break the limitation on representational capacity caused by the first-order approximation. Moreover, a sampling- and memory-efficient evolution strategy is proposed to search this space. The resulting architecture demonstrates the effectiveness of the higher-order approximation and of the layer-wise dynamic graph modules. To evaluate the searched model, we conduct extensive experiments on two very large-scale skeleton-based action recognition datasets. The results show that our model achieves state-of-the-art results on the given metrics.
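A minimal sketch of a multi-hop graph convolution, in the spirit of the multiple-hop modules mentioned above, is shown below. The hop count, row normalization, and layer sizes are assumptions; the point is that powers of the adjacency matrix let a single layer aggregate beyond first-order neighbors.

```python
# Illustrative multi-hop graph convolution layer; hop count and sizes are assumptions.
import torch
import torch.nn as nn

class MultiHopGCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim, adj, max_hop=3):
        super().__init__()
        # Row-normalize the adjacency, then pre-compute A^1 ... A^K (K-hop connectivity).
        A = adj / adj.sum(dim=1, keepdim=True).clamp(min=1e-6)
        hops = [torch.linalg.matrix_power(A, k) for k in range(1, max_hop + 1)]
        self.register_buffer("hops", torch.stack(hops))      # (K, V, V)
        self.weights = nn.ModuleList(
            [nn.Linear(in_dim, out_dim, bias=False) for _ in range(max_hop)])

    def forward(self, x):                                     # x: (N, V, in_dim)
        out = 0
        for k, lin in enumerate(self.weights):
            # Aggregate k-hop neighbors, then apply the hop-specific projection.
            out = out + lin(torch.einsum("uv,nvc->nuc", self.hops[k], x))
        return torch.relu(out)
```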


2019 ◽  
Author(s):  
Jalal Mirakhorli ◽  
Mojgan Mirakhorli

Functional neuroimaging techniques using resting-state functional MRI (rs-fMRI) have accelerated progress in the study of brain disorders and dysfunction. Because the differences between healthy and disordered brains are slight, investigating the complex topology of human brain functional networks is a difficult and complicated task, and the number of evaluation criteria keeps growing. Recently, graph theory and deep learning have been widely applied to understanding human cognitive functions that are linked to gene expression and the associated distributed spatial patterns. Irregular graph analysis has been applied in many brain recognition domains; these applications can involve both node-centric and graph-centric tasks. In this paper, we discuss the use of a variational autoencoder and a graph convolutional network (GCN) to identify regions of interest in the brain whose connectivity deviates from normal when certain tasks are performed. We present a graph auto-encoder (GAE) framework with a hypersphere distribution for functional data analysis in brain imaging studies, where the data have an underlying non-Euclidean structure, to learn strong, rigid graphs from large-scale data. In addition, we distinguish the possible mode correlations in abnormal brain connections.
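As a point of reference for the GAE framework mentioned above, here is a minimal graph auto-encoder sketch with a two-step GCN-style encoder and an inner-product decoder that reconstructs a connectivity matrix. The layer sizes and the plain (non-hypersphere) latent space are assumptions; this is not the authors' model.

```python
# Minimal graph auto-encoder sketch: GCN-style encoder, inner-product decoder.
import torch
import torch.nn as nn

class GraphAutoEncoder(nn.Module):
    def __init__(self, n_features, hidden_dim, latent_dim):
        super().__init__()
        self.enc1 = nn.Linear(n_features, hidden_dim)
        self.enc2 = nn.Linear(hidden_dim, latent_dim)

    def encode(self, adj_norm, x):
        # Two propagation steps: Z = A_norm * relu(A_norm * X * W1) * W2
        h = torch.relu(adj_norm @ self.enc1(x))
        return adj_norm @ self.enc2(h)

    def forward(self, adj_norm, x):            # adj_norm: (V, V), x: (V, n_features)
        z = self.encode(adj_norm, x)           # node embeddings: (V, latent_dim)
        # Inner-product decoder reconstructs the functional connectivity matrix.
        return torch.sigmoid(z @ z.t())
```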


Sensors ◽  
2020 ◽  
Vol 20 (18) ◽  
pp. 5260 ◽  
Author(s):  
Fanjia Li ◽  
Juanjuan Li ◽  
Aichun Zhu ◽  
Yonggang Xu ◽  
Hongsheng Yin ◽  
...  

In the skeleton-based human action recognition domain, spatial-temporal graph convolution networks (ST-GCNs) have made great progress recently. However, they use only one fixed temporal convolution kernel, which is not enough to extract temporal cues comprehensively. Moreover, simply connecting the spatial graph convolution layer (GCL) and the temporal GCL in series is not the optimal solution. To this end, we propose a novel enhanced spatial and extended temporal graph convolutional network (EE-GCN) in this paper. Three convolution kernels with different sizes are chosen to extract discriminative temporal features from shorter to longer terms. The corresponding GCLs are then concatenated by a powerful yet efficient one-shot aggregation (OSA) + effective squeeze-excitation (eSE) structure. The OSA module aggregates the features from each layer once into the output, and the eSE module explores the interdependency between the channels of the output. In addition, we propose a new connection paradigm to enhance the spatial features, which expands the serial connection into a combination of serial and parallel connections by adding a spatial GCL in parallel with the temporal GCLs. The proposed method is evaluated on three large-scale datasets, and the experimental results show that its performance exceeds previous state-of-the-art methods.
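A rough sketch of the multi-kernel temporal branch with OSA aggregation and an eSE channel gate is given below. The kernel sizes (3, 5, 7), channel widths, and the 1x1 fusion convolution are assumptions for illustration, not the paper's exact configuration.

```python
# Sketch of multi-kernel temporal convolutions fused by one-shot aggregation (OSA)
# and re-weighted by an effective squeeze-excitation (eSE) gate; sizes are assumptions.
import torch
import torch.nn as nn

class MultiKernelTemporalOSA(nn.Module):
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        # One temporal convolution per kernel size, applied along the frame axis.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=(k, 1), padding=(k // 2, 0))
            for k in kernel_sizes])
        # OSA: concatenate every branch output once, then fuse with a 1x1 conv.
        self.fuse = nn.Conv2d(channels * len(kernel_sizes), channels, kernel_size=1)
        # eSE: channel attention computed from globally pooled features.
        self.ese = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                 nn.Conv2d(channels, channels, kernel_size=1),
                                 nn.Sigmoid())

    def forward(self, x):                      # x: (N, C, T, V) skeleton feature maps
        out = self.fuse(torch.cat([b(x) for b in self.branches], dim=1))
        return out * self.ese(out)             # channel-wise re-weighting
```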


2020 ◽  
Vol 2020 ◽  
pp. 1-14
Author(s):  
Jinyue Zhang ◽  
Lijun Zi ◽  
Yuexian Hou ◽  
Mingen Wang ◽  
Wenting Jiang ◽  
...  

To support smart construction, the digital twin has become a well-recognized concept for virtually representing a physical facility. It is equally important to recognize human actions and the movement of construction equipment in virtual construction scenes. Compared to the extensive research on human action recognition (HAR), which can be applied to identify construction workers, research in the field of construction equipment action recognition (CEAR) is very limited, mainly due to the lack of available datasets with videos showing the actions of construction equipment. The contributions of this research are as follows: (1) the development of a comprehensive video dataset of 2,064 clips with five action types for excavators and dump trucks; (2) a new deep learning-based CEAR approach (known as the simplified temporal convolutional network, or STCN) that combines a convolutional neural network (CNN) with long short-term memory (LSTM, an artificial recurrent neural network), where the CNN is used to extract image features and the LSTM is used to extract temporal features from video frame sequences; and (3) a comparison between this proposed approach, a similar CEAR method, and two of the best-performing HAR approaches, namely three-dimensional (3D) convolutional networks (ConvNets) and two-stream ConvNets, to evaluate the performance of STCN and investigate the possibility of directly transferring HAR approaches to the field of CEAR.
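A hedged sketch of a CNN-plus-LSTM pipeline of the kind described in contribution (2) is given below: a small CNN extracts per-frame features and an LSTM models their temporal order. The layer sizes, feature dimension, and five-class output head are illustrative assumptions, not the authors' STCN configuration.

```python
# Illustrative CNN + LSTM video classifier; sizes and class count are assumptions.
import torch
import torch.nn as nn

class CNNLSTMClassifier(nn.Module):
    def __init__(self, n_classes=5, feat_dim=128, hidden_dim=64):
        super().__init__()
        self.cnn = nn.Sequential(                       # per-frame feature extractor
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, feat_dim))
        self.lstm = nn.LSTM(feat_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_classes)

    def forward(self, clips):                           # clips: (N, T, 3, H, W)
        n, t = clips.shape[:2]
        feats = self.cnn(clips.flatten(0, 1)).view(n, t, -1)   # per-frame features
        _, (h, _) = self.lstm(feats)                    # final hidden state
        return self.head(h[-1])                         # class logits per clip
```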


Electronics ◽  
2021 ◽  
Vol 10 (18) ◽  
pp. 2198
Author(s):  
Chaoyue Li ◽  
Lian Zou ◽  
Cien Fan ◽  
Hao Jiang ◽  
Yifeng Liu

Graph convolutional networks (GCNs), which model human actions as a series of spatial-temporal graphs, have recently achieved superior performance in skeleton-based action recognition. However, existing methods mostly use the physical connections of joints to construct the spatial graph, resulting in limited topological information about the human skeleton. In addition, action features in the time domain have not been fully explored. To better extract spatial-temporal features, we propose a multi-stage attention-enhanced sparse graph convolutional network (MS-ASGCN) for skeleton-based action recognition. To capture more abundant joint dependencies, we propose a new strategy for constructing skeleton graphs that simulates bidirectional information flow between neighboring joints and pays greater attention to information transmission between sparse joints. In addition, a part attention mechanism is proposed to learn the weight of each body part and enhance part-level feature learning. We introduce multiple streams at different stages and merge them in specific layers of the network to further improve model performance. Our model is verified on two large-scale datasets, namely NTU-RGB+D and Skeleton-Kinetics. Experiments demonstrate that the proposed MS-ASGCN outperforms previous state-of-the-art methods on both datasets.
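The graph construction idea above can be illustrated with a short sketch that builds a normalized adjacency containing bidirectional bone edges plus extra links between selected non-adjacent ("sparse") joints. The joint pairs passed in are placeholders; the paper's actual graph definition may differ.

```python
# Illustrative skeleton adjacency with bidirectional bone edges and extra
# long-range links between selected non-adjacent joints; pairs are placeholders.
import numpy as np

def build_skeleton_adjacency(num_joints, bone_pairs, sparse_pairs):
    A = np.eye(num_joints, dtype=np.float32)         # self-loops
    for i, j in bone_pairs:                          # bidirectional bone edges
        A[i, j] = A[j, i] = 1.0
    for i, j in sparse_pairs:                        # long-range joint links
        A[i, j] = A[j, i] = 1.0
    # Symmetric normalization: D^{-1/2} A D^{-1/2}
    d = A.sum(axis=1)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return d_inv_sqrt @ A @ d_inv_sqrt
```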


2021 ◽  
Vol 13 (9) ◽  
pp. 5108
Author(s):  
Navin Ranjan ◽  
Sovit Bhandari ◽  
Pervez Khan ◽  
Youn-Sik Hong ◽  
Hoon Kim

The transportation system, especially the road network, is the backbone of any modern economy. However, with rapid urbanization, congestion levels have surged drastically, directly affecting the quality of urban life, the environment, and the economy. In this paper, we propose (i) an inexpensive and efficient traffic congestion pattern analysis algorithm based on image processing, which identifies the group of roads in a network that suffers from recurring congestion; and (ii) a deep neural network architecture, built from a convolutional autoencoder, which learns both spatial and temporal relationships from a sequence of image data to predict the city-wide grid congestion index. Our experiments show that both algorithms are efficient: the pattern analysis relies only on basic arithmetic operations, while the prediction algorithm outperforms two other deep neural networks (a convolutional recurrent autoencoder and ConvLSTM) in large-scale traffic network prediction performance. A case study was conducted on a dataset from the city of Seoul.
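A minimal convolutional-autoencoder-style sketch for predicting the next congestion grid from a short window of past grids is shown below. The window length, channel widths, and the choice to stack past frames on the channel axis are assumptions, not the authors' architecture.

```python
# Illustrative convolutional autoencoder for next-step congestion-grid prediction.
# Assumes grid height and width are divisible by 4; all sizes are assumptions.
import torch
import torch.nn as nn

class CongestionConvAE(nn.Module):
    def __init__(self, seq_len=4):
        super().__init__()
        # Past frames are stacked on the channel axis to keep temporal context.
        self.encoder = nn.Sequential(
            nn.Conv2d(seq_len, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid())

    def forward(self, grids):                       # grids: (N, seq_len, H, W)
        return self.decoder(self.encoder(grids))    # (N, 1, H, W) next-step index
```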


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Sungmin O. ◽  
Rene Orth

While soil moisture information is essential for a wide range of hydrologic and climate applications, spatially continuous soil moisture data are only available from satellite observations or model simulations. Here we present SoMo.ml, a global, long-term dataset of soil moisture derived through machine learning trained with in-situ measurements. We train a long short-term memory (LSTM) model to extrapolate daily soil moisture dynamics in space and in time, based on in-situ data collected from more than 1,000 stations across the globe. SoMo.ml provides multi-layer soil moisture data (0–10 cm, 10–30 cm, and 30–50 cm) at 0.25° spatial and daily temporal resolution over the period 2000–2019. The performance of the resulting dataset is evaluated through cross-validation and inter-comparison with existing soil moisture datasets. SoMo.ml performs especially well in terms of temporal dynamics, making it particularly useful for applications requiring time-varying soil moisture, such as anomaly detection and memory analyses. SoMo.ml complements the existing suite of modelled and satellite-based datasets given its distinct derivation, and supports large-scale hydrological, meteorological, and ecological analyses.
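The LSTM setup described above can be sketched as a sequence-to-one regressor mapping a window of daily predictor variables to soil moisture for the three depth layers. The predictor set, window length, and hidden size are assumptions for illustration, not the SoMo.ml configuration.

```python
# Illustrative LSTM regressor for multi-layer soil moisture; sizes are assumptions.
import torch
import torch.nn as nn

class SoilMoistureLSTM(nn.Module):
    def __init__(self, n_predictors=8, hidden_dim=64, n_depth_layers=3):
        super().__init__()
        self.lstm = nn.LSTM(n_predictors, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_depth_layers)  # 0-10, 10-30, 30-50 cm

    def forward(self, forcing):               # forcing: (N, T_days, n_predictors)
        out, _ = self.lstm(forcing)
        return self.head(out[:, -1])          # soil moisture for the final day
```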


Cancers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 2111
Author(s):  
Bo-Wei Zhao ◽  
Zhu-Hong You ◽  
Lun Hu ◽  
Zhen-Hao Guo ◽  
Lei Wang ◽  
...  

Identification of drug-target interactions (DTIs) is a significant step in the drug discovery or repositioning process. Compared with time-consuming and labor-intensive in vivo experimental methods, computational models can provide high-quality DTI candidates almost instantly. In this study, we propose a novel method called LGDTI to predict DTIs based on large-scale graph representation learning. LGDTI captures both local and global structural information of the graph. Specifically, the first-order neighbor information of nodes is aggregated by a graph convolutional network (GCN), while the high-order neighbor information of nodes is learned by the graph embedding method DeepWalk. Finally, the two kinds of features are fed into a random forest classifier to train and predict potential DTIs. The results show that our method obtained an area under the receiver operating characteristic curve (AUROC) of 0.9455 and an area under the precision-recall curve (AUPR) of 0.9491 under 5-fold cross-validation. We also compare the presented method with several existing state-of-the-art methods. These results imply that LGDTI can efficiently and robustly capture undiscovered DTIs. The proposed model is expected to bring new inspiration and provide novel perspectives to relevant researchers.
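The fusion step, local GCN-style embeddings concatenated with global DeepWalk-style embeddings and fed to a random forest, might look roughly like the sketch below. The embeddings are assumed to be precomputed and passed in by the caller; the feature layout and forest size are illustrative, not the authors' pipeline.

```python
# Illustrative feature fusion and random-forest training for drug-target pairs.
# gcn_emb and walk_emb are assumed precomputed node embeddings; layout is an assumption.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_dti_classifier(gcn_emb, walk_emb, pairs, labels):
    """gcn_emb, walk_emb: (n_nodes, d) arrays; pairs: list of (drug_id, target_id)."""
    feats = np.array([
        np.concatenate([gcn_emb[d], walk_emb[d], gcn_emb[t], walk_emb[t]])
        for d, t in pairs])
    clf = RandomForestClassifier(n_estimators=500, n_jobs=-1)
    clf.fit(feats, labels)                     # labels: 1 = known interaction, 0 = none
    return clf
```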


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 411
Author(s):  
Yunkai Zhang ◽  
Yinghong Tian ◽  
Pingyi Wu ◽  
Dongfan Chen

The recognition of stereotyped actions is one of the core diagnostic criteria of Autism Spectrum Disorder (ASD). However, it mainly relies on parent interviews and clinical observations, which lead to a long diagnosis cycle and prevent ASD children from receiving timely treatment. To speed up the recognition of stereotyped actions, a method based on skeleton data and long short-term memory (LSTM) is proposed in this paper. In the first stage of our method, the OpenPose algorithm is used to obtain the initial skeleton data from videos of ASD children, and four denoising methods are proposed to eliminate the noise in the initial skeleton data. In the second stage, we track multiple ASD children in the same scene by matching the distances between current and previous skeletons. In the last stage, an LSTM-based neural network is used to classify the children's actions. The experiments show that the proposed method is effective for action recognition in ASD children. Compared to previous traditional schemes, our scheme achieves higher accuracy and is almost non-invasive for ASD children.
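The tracking stage described above, assigning each skeleton in the current frame to the closest skeleton from the previous frame, can be sketched with a simple greedy matcher on the mean joint distance. The greedy rule and the 2D joint layout are assumptions, not necessarily the authors' exact matching procedure.

```python
# Illustrative greedy skeleton tracker: match each current skeleton to the nearest
# unused previous skeleton by mean joint distance. A placeholder rule, not the paper's.
import numpy as np

def match_skeletons(prev, curr):
    """prev, curr: lists of (V, 2) arrays of 2D joint coordinates.
    Returns a list of (curr_idx, prev_idx) assignments."""
    assignments, used = [], set()
    for ci, c in enumerate(curr):
        dists = [np.inf if pi in used else np.linalg.norm(c - p, axis=1).mean()
                 for pi, p in enumerate(prev)]
        pi = int(np.argmin(dists)) if dists else -1
        if pi >= 0 and np.isfinite(dists[pi]):
            assignments.append((ci, pi))
            used.add(pi)
    return assignments
```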


Mathematics ◽  
2021 ◽  
Vol 9 (7) ◽  
pp. 772
Author(s):  
Seonghun Kim ◽  
Seockhun Bae ◽  
Yinhua Piao ◽  
Kyuri Jo

Genomic profiles of cancer patients, such as gene expression, have become a major source for predicting responses to drugs in the era of personalized medicine. As large-scale drug screening data for cancer cell lines are available, a number of computational methods have been developed for drug response prediction. However, few methods incorporate both gene expression data and the biological network, which can harbor essential information about the underlying process of the drug response. We propose an analysis framework called DrugGCN for the prediction of drug response using a graph convolutional network (GCN). DrugGCN first generates a gene graph by combining a protein-protein interaction (PPI) network and gene expression data with feature selection of drug-related genes, and the GCN model detects local features, such as subnetworks of genes that contribute to the drug response, by localized filtering. We demonstrate the effectiveness of DrugGCN on biological data, where it shows high prediction accuracy among the competing methods.
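A minimal sketch of a GCN regressor over a gene graph, in the spirit of the DrugGCN description, is given below: a normalized PPI adjacency provides the graph, gene expression provides the node features, and a pooled readout predicts the drug response. The two-layer depth, dimensions, and mean-pooling readout are assumptions, not the paper's model.

```python
# Illustrative GCN regressor over a gene graph; depth and readout are assumptions.
import torch
import torch.nn as nn

class GeneGraphRegressor(nn.Module):
    def __init__(self, n_features, hidden_dim=64):
        super().__init__()
        self.gc1 = nn.Linear(n_features, hidden_dim)
        self.gc2 = nn.Linear(hidden_dim, hidden_dim)
        self.out = nn.Linear(hidden_dim, 1)            # predicted drug response

    def forward(self, adj_norm, expr):                 # adj_norm: (V, V), expr: (V, F)
        h = torch.relu(adj_norm @ self.gc1(expr))      # localized filtering, hop 1
        h = torch.relu(adj_norm @ self.gc2(h))         # localized filtering, hop 2
        return self.out(h.mean(dim=0))                 # graph-level readout
```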

