STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

2020 ◽  
Vol 34 (02) ◽  
pp. 1342-1350 ◽  
Author(s):  
Uttaran Bhattacharya ◽  
Trisha Mittal ◽  
Rohan Chandra ◽  
Tanmay Randhavane ◽  
Aniket Bera ◽  
...  

We present STEP, a novel classifier network that classifies perceived human emotion from gaits, based on a Spatial Temporal Graph Convolutional Network (ST-GCN) architecture. Given an RGB video of an individual walking, our formulation implicitly exploits the gait features to classify the perceived emotion of the human into one of four emotions: happy, sad, angry, or neutral. We train STEP on annotated real-world gait videos, augmented with annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN based Conditional Variational Autoencoder (CVAE). We incorporate a novel push-pull regularization loss in the CVAE formulation of STEP-Gen to generate realistic gaits and improve the classification accuracy of STEP. We also release a novel dataset (E-Gait), which consists of 4,227 human gaits annotated with perceived emotions along with thousands of synthetic gaits. In practice, STEP can learn the affective features and exhibits a classification accuracy of 88% on E-Gait, which is 14–30% more accurate than prior methods.
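As a rough illustration of the spatial temporal graph convolution idea named in this abstract, the following minimal sketch aggregates joint features over a skeleton graph and then over time. It is not the STEP implementation: the symmetric normalization with self-loops is the common GCN form and is assumed here, learned weights are omitted, and the 3-joint chain, the function names, and the feature values are all hypothetical.

```python
import math

def normalized_adjacency(edges, num_joints):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph."""
    a = [[0.0] * num_joints for _ in range(num_joints)]
    for i in range(num_joints):
        a[i][i] = 1.0  # self-loop so each joint keeps its own feature
    for i, j in edges:
        a[i][j] = 1.0
        a[j][i] = 1.0
    degree = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(num_joints)]
            for i in range(num_joints)]

def spatial_step(a_hat, frame):
    """Mix each joint's scalar feature with those of its graph neighbours."""
    n = len(frame)
    return [sum(a_hat[i][j] * frame[j] for j in range(n)) for i in range(n)]

def temporal_step(frames, window=3):
    """Average each joint over a sliding window of consecutive frames."""
    out = []
    for t in range(len(frames) - window + 1):
        chunk = frames[t:t + window]
        out.append([sum(f[j] for f in chunk) / window for j in range(len(chunk[0]))])
    return out

# Hypothetical 3-joint chain (hip - knee - ankle), one scalar feature per joint.
edges = [(0, 1), (1, 2)]
a_hat = normalized_adjacency(edges, 3)
gait = [[0.0, 1.0, 2.0], [0.1, 1.1, 2.1], [0.2, 1.2, 2.2], [0.3, 1.3, 2.3]]
features = temporal_step([spatial_step(a_hat, frame) for frame in gait])
```

A real ST-GCN layer would multiply the aggregated features by learned weight matrices and use a learned temporal kernel in place of the plain average.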

2021 ◽  
Vol 2078 (1) ◽  
pp. 012051
Author(s):  
Rong Liu ◽  
Luan Chen

To predict the load of a power system with a known network structure, this paper proposes a novel attention-based spatial-temporal graph convolutional network (ASTGCN) model that predicts the node load in the power grid. The experimental results show the good performance of ASTGCN.


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Min Zhang ◽  
Haijie Yang ◽  
Pengfei Li ◽  
Ming Jiang

Skeleton-based human action recognition has attracted much attention in the field of computer vision. Most previous studies are based on fixed skeleton graphs, so only the local physical dependencies among joints can be captured and implicit joint correlations are omitted. In addition, the appearance of the same action differs greatly across views, and in some views keypoints are occluded, which causes recognition errors. In this paper, an action recognition method based on a distance vector and multihigh view adaptive network (DV-MHNet) is proposed to address this challenging task. The multihigh (MH) view adaptive networks are constructed to automatically determine the best observation view at different heights, obtain complete keypoint information for the current frame, and enhance the robustness and generalization of the model when recognizing actions at different heights. The distance vector (DV) mechanism is then introduced to establish the relative distance and relative orientation between different keypoints in the same frame and between the same keypoints in different frames, giving the global potential relationship of each keypoint. Finally, a spatial temporal graph convolutional network is constructed to take both spatial and temporal information into account and learn the characteristics of the action. Ablation studies comparing against a traditional spatial temporal graph convolutional network, with and without the multihigh view adaptive networks, support the effectiveness of the model. The model is evaluated on two widely used action recognition benchmarks (NTU-RGB + D and PKU-MMD), and our method achieves better performance on both datasets.
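The relative distance and relative orientation between keypoints in one frame can be pictured with a small sketch. This is only an illustration of the general idea, not the DV-MHNet formulation: it assumes 2D keypoints, whereas the benchmarks named above also provide 3D joints, and the function name and coordinates are hypothetical.

```python
import math

def distance_vectors(keypoints):
    """Map each keypoint pair (i, j), i < j, to (distance, orientation in radians)."""
    out = {}
    for i in range(len(keypoints)):
        for j in range(i + 1, len(keypoints)):
            dx = keypoints[j][0] - keypoints[i][0]
            dy = keypoints[j][1] - keypoints[i][1]
            out[(i, j)] = (math.hypot(dx, dy), math.atan2(dy, dx))
    return out

# Hypothetical frame with three 2D keypoints.
frame = [(0.0, 0.0), (3.0, 4.0), (0.0, 2.0)]
dv = distance_vectors(frame)
```

The same function applied to one keypoint's positions in consecutive frames would give the between-frame counterpart that the abstract mentions.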


2020 ◽  
Vol 10 (1) ◽  
Author(s):  
Wataru Kudo ◽  
Mao Nishiguchi ◽  
Fujio Toriumi

Rating platforms provide users with useful information on products or other users. However, fake ratings are sometimes generated by fraudulent users. In this paper, we tackle the task of fraudulent user detection on rating platforms. We propose GCNEXT (Graph Convolutional Network with Expanded Balance Theory), an end-to-end framework based on graph convolutional networks (GCNs) and expanded balance theory, which properly incorporates both the signs and directions of edges. The experimental results on seven real-world datasets show that the proposed framework performs better, or even best, in most settings. In particular, this framework shows remarkable stability in inductive settings, which is associated with the detection of new fraudulent users on rating platforms. Furthermore, using expanded balance theory, we provide new insight into the behavior of users in rating networks: fraudulent users form a faction to deal with the negative ratings from other users. By using the proposed framework, the owner of a rating platform can detect fraudulent users earlier and constantly provide users with more credible information.
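For background, classical structural balance theory calls a triangle of signed edges balanced when the product of its three signs is positive. The sketch below checks only that classical, undirected rule. The expanded theory in this abstract also accounts for edge direction, which is not reproduced here, and the user names and ratings are hypothetical.

```python
from itertools import combinations

def triangle_balance(signs):
    """signs maps frozenset({u, v}) to +1 or -1; returns {triangle: is_balanced}."""
    nodes = sorted({node for edge in signs for node in edge})
    result = {}
    for a, b, c in combinations(nodes, 3):
        edges = [frozenset((a, b)), frozenset((b, c)), frozenset((a, c))]
        if all(edge in signs for edge in edges):  # skip incomplete triangles
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            result[(a, b, c)] = product > 0
    return result

# Hypothetical signed ratings between four users.
signs = {
    frozenset(("u1", "u2")): +1,
    frozenset(("u2", "u3")): -1,
    frozenset(("u1", "u3")): -1,
    frozenset(("u1", "u4")): +1,
    frozenset(("u2", "u4")): -1,
}
balance = triangle_balance(signs)
```

Here the triangle (u1, u2, u3) is balanced because two mutually positive users share a negative relation with the third, while (u1, u2, u4) is not.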


2021 ◽  
Vol 11 (15) ◽  
pp. 6975
Author(s):  
Tao Zhang ◽  
Lun He ◽  
Xudong Li ◽  
Guoqing Feng

Lipreading aims to recognize sentences being spoken by a talking face. In recent years, lipreading methods have achieved a high level of accuracy on large datasets and made breakthrough progress. However, lipreading is still far from being solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing gradients during training and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model, using an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. It can partly eliminate the vanishing-gradient and insufficient-performance defects of RNNs (LSTM, GRU), which yields a notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than with the state-of-the-art method and that accuracy improves by 2.4% on the GRID dataset.
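To make the role of CTC concrete, here is a minimal sketch of greedy (best-path) CTC decoding: take the top symbol per frame, collapse consecutive repeats, then drop blanks. This is one standard way to read out a CTC-trained model and is not claimed to be the decoder used in the paper. The alphabet and per-frame scores are hypothetical.

```python
def ctc_greedy_decode(frame_scores, alphabet, blank=0):
    """Pick the top symbol per frame, merge consecutive repeats, remove blanks."""
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)

# Hypothetical alphabet; index 0 is the CTC blank symbol.
alphabet = ["-", "a", "b"]
frame_scores = [
    [0.1, 0.8, 0.1],     # a
    [0.2, 0.7, 0.1],     # a (repeat, merged with the previous frame)
    [0.9, 0.05, 0.05],   # blank separates the two a's
    [0.1, 0.8, 0.1],     # a
    [0.1, 0.2, 0.7],     # b
    [0.1, 0.1, 0.8],     # b (repeat, merged)
    [0.9, 0.05, 0.05],   # blank
]
text = ctc_greedy_decode(frame_scores, alphabet)
```

The blank between the two runs of "a" is what lets the output keep a doubled letter, so this input decodes to "aab".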


Water ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1247
Author(s):  
Lydia Tsiami ◽  
Christos Makropoulos

Prompt detection of cyber–physical attacks (CPAs) on a water distribution system (WDS) is critical to avoid irreversible damage to the network infrastructure and disruption of water services. However, the complex interdependencies of the water network’s components make CPA detection challenging. To better capture the spatiotemporal dimensions of these interdependencies, we represented the WDS as a mathematical graph and approached the problem by utilizing graph neural networks. We presented an online, one-stage, prediction-based algorithm that implements the temporal graph convolutional network and makes use of the Mahalanobis distance. The algorithm exhibited strong detection performance and was capable of localizing the targeted network components for several benchmark attacks. We suggested that an important property of the proposed algorithm was its explainability, which allowed the extraction of useful information about how the model works and as such it is a step towards the creation of trustworthy AI algorithms for water applications. Additional insights into metrics commonly used to rank algorithm performance were also presented and discussed.
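The Mahalanobis distance mentioned above measures how far a point lies from a reference distribution while accounting for correlations between variables. The sketch below applies it to two-dimensional prediction residuals using a closed-form 2x2 covariance inverse. The residual values, the threshold of 3, and the function names are assumptions for illustration, not details from the paper.

```python
import math

def mean_and_cov_2d(points):
    """Sample mean and covariance (sxx, sxy, syy) of 2D residuals."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_2d(point, mean, cov):
    """sqrt(d^T S^{-1} d), using the closed-form inverse of a 2x2 covariance S."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    quadratic = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    return math.sqrt(quadratic)

# Hypothetical residuals (observed minus predicted) from attack-free operation.
normal_residuals = [(0.1, -0.1), (-0.2, 0.1), (0.0, 0.2), (0.2, 0.0), (-0.1, -0.2)]
mean, cov = mean_and_cov_2d(normal_residuals)
near = mahalanobis_2d((0.1, 0.0), mean, cov)
far = mahalanobis_2d((1.0, 1.0), mean, cov)
THRESHOLD = 3.0  # assumed cut-off, chosen only for this example
alarms = [d > THRESHOLD for d in (near, far)]
```

With these values the small residual stays below the threshold and the large one exceeds it, which is the behaviour a prediction-based detector relies on.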


Author(s):  
Shengsheng Qian ◽  
Jun Hu ◽  
Quan Fang ◽  
Changsheng Xu

In this article, we focus on the fake news detection task and aim to automatically identify fake news from the vast amount of social media posts. To date, many approaches have been proposed to detect fake news, including traditional learning methods and deep learning-based models. However, three challenges remain: (i) how to represent social media posts effectively, since post content is varied and highly complicated; (ii) how to design a data-driven method that increases the flexibility of the model in dealing with samples from different contexts and news backgrounds; and (iii) how to fully utilize the additional auxiliary information (the background knowledge and multi-modal information) of posts for better representation learning. To tackle these challenges, we propose a novel model, Knowledge-aware Multi-modal Adaptive Graph Convolutional Networks (KMAGCN), which captures semantic representations by jointly modeling textual information, knowledge concepts, and visual information in a unified framework for fake news detection. We model posts as graphs and use a knowledge-aware multi-modal adaptive graph learning principle for effective feature learning. Compared with existing methods, the proposed KMAGCN addresses the challenges from three aspects: (1) it models posts as graphs to capture non-consecutive and long-range semantic relations; (2) it proposes a novel adaptive graph convolutional network to handle the variability of graph data; and (3) it leverages textual information, knowledge concepts, and visual information jointly for model learning. We have conducted extensive experiments on three public real-world datasets, and the superior results demonstrate the effectiveness of KMAGCN compared with other state-of-the-art algorithms.


Author(s):  
Yinong Zhang ◽  
Shanshan Guan ◽  
Cheng Xu ◽  
Hongzhe Liu

In the era of intelligent education, human behavior recognition based on computer vision is an important branch of pattern recognition. Human behavior recognition is a basic technology in the fields of intelligent monitoring and human-computer interaction in education. The dynamic changes of the human skeleton provide important information for the recognition of educational behavior. Traditional methods usually rely only on manually labeled information or rule traversal, resulting in limited representation capability and poor generalization of the model. In this paper, a dynamic skeleton model with residuals is adopted: a spatio-temporal graph convolutional network based on residual connections, which overcomes the limitations of previous methods and can learn spatio-temporal patterns from skeleton data. On the large-scale skeleton dataset NTU-RGB + D, the network model improves both the representation of human behavior characteristics and the generalization ability, and achieves better recognition results than the existing model. In addition, this paper compares behavior recognition results on subsets of different joint points and finds that spatial structure division has better effects.


2021 ◽  
pp. 1-13
Author(s):  
Jing Bai ◽  
Wentao Yu ◽  
Zhu Xiao ◽  
Vincent Havyarimana ◽  
Amelia C. Regan ◽  
...  
