Zero-Shot Action Recognition with Three-Stream Graph Convolutional Networks

Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3793
Author(s):  
Nan Wu ◽  
Kazuhiko Kawamoto

Large datasets are often used to improve the accuracy of action recognition. However, very large datasets are problematic because, for example, annotating them is labor-intensive. This has encouraged research in zero-shot action recognition (ZSAR). Presently, most ZSAR methods recognize actions frame by frame. These methods are affected by lighting, camera angle, and background, and most cannot process time-series data, which reduces their accuracy. To address these problems, we propose a three-stream graph convolutional network that processes both RGB and skeleton data. Our model has two parts. One part processes RGB data, which contains extensive useful information. The other part processes skeleton data, which is not affected by lighting or background. By combining the two outputs with a weighted sum, our model predicts the final results for ZSAR. Experiments conducted on three datasets demonstrate that our model is more accurate than a baseline model. Moreover, we show that our model can learn from human experience, which can make it more accurate.
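
The weighted-sum fusion of the RGB and skeleton outputs can be illustrated with a minimal sketch. The abstract does not give the weights or the form of the per-class scores, so the function names, the `rgb_weight` parameter, and the example scores below are all hypothetical.

```python
def fuse_scores(rgb_scores, skeleton_scores, rgb_weight=0.5):
    """Weighted sum of per-class scores from two streams.

    rgb_weight is a hypothetical hyperparameter in [0, 1]; the skeleton
    stream receives the remaining weight (1 - rgb_weight).
    """
    if len(rgb_scores) != len(skeleton_scores):
        raise ValueError("both streams must score the same set of classes")
    if not 0.0 <= rgb_weight <= 1.0:
        raise ValueError("rgb_weight must lie in [0, 1]")
    return [
        rgb_weight * r + (1.0 - rgb_weight) * s
        for r, s in zip(rgb_scores, skeleton_scores)
    ]


def predict(rgb_scores, skeleton_scores, rgb_weight=0.5):
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(rgb_scores, skeleton_scores, rgb_weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Made-up scores for two unseen classes.
rgb = [0.6, 0.4]
skeleton = [0.1, 0.9]
predicted_class = predict(rgb, skeleton, rgb_weight=0.5)
```

With equal weights the skeleton stream's strong preference for the second class decides the outcome; setting `rgb_weight=1.0` reduces the model to the RGB stream alone.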

Water ◽  
2021 ◽  
Vol 13 (14) ◽  
pp. 1944
Author(s):  
Haitham H. Mahmoud ◽  
Wenyan Wu ◽  
Yonghao Wang

This work develops a MATLAB toolbox called WDSchain that can simulate blockchain on water distribution systems (WDS). WDSchain can import data from Excel and the EPANET water modelling software. It extends EPANET to enable blockchain simulation of the hydraulic data at any intended nodes. Using WDSchain will strengthen network automation and security in WDS. WDSchain can process time-series data with two simulation modes: (1) static blockchain, which takes a snapshot of the data of all nodes in the WDS for one time interval as input and outputs it into chained blocks at a time, and (2) dynamic blockchain, which takes all simulated time-series data of all the nodes as input and establishes chained blocks at the simulated time. Five consensus mechanisms (PoW, PoT, PoV, PoA, and PoAuth) are developed in WDSchain to provide data at different security levels. Five different sizes of WDS are simulated in WDSchain for performance evaluation. The results show that a trade-off is needed between system complexity and security level for data validation. WDSchain provides a methodology to further explore data validation using blockchain in WDS. Its limitations are that, compared to commercial blockchain platforms, it does not consider the selection of blockchain nodes or broadcasting delay.
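
The idea of chaining hydraulic snapshots into blocks can be sketched generically. This is not WDSchain's MATLAB code or API, which the abstract does not show. It is a toy Python illustration that assumes one block per snapshot, a SHA-256 hash link, and a simplified proof of work (a hash with a fixed number of leading zeros). The node names and pressure values are made up.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first block


def block_hash(index, prev_hash, data, nonce):
    """SHA-256 over a canonical JSON encoding of the block contents."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "data": data, "nonce": nonce},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def mine_block(index, prev_hash, data, difficulty=2):
    """Toy proof of work: search for a nonce whose hash starts with zeros."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = block_hash(index, prev_hash, data, nonce)
        if digest.startswith(prefix):
            return {"index": index, "prev_hash": prev_hash, "data": data,
                    "nonce": nonce, "hash": digest}
        nonce += 1


def build_chain(snapshots, difficulty=2):
    """Chain one block per snapshot (an assumption of this sketch)."""
    chain, prev_hash = [], GENESIS_HASH
    for index, snapshot in enumerate(snapshots):
        block = mine_block(index, prev_hash, snapshot, difficulty)
        chain.append(block)
        prev_hash = block["hash"]
    return chain


def is_valid(chain, difficulty=2):
    """Recompute every hash and check each link to the previous block."""
    prev_hash = GENESIS_HASH
    for block in chain:
        expected = block_hash(block["index"], block["prev_hash"],
                              block["data"], block["nonce"])
        if (block["prev_hash"] != prev_hash or block["hash"] != expected
                or not expected.startswith("0" * difficulty)):
            return False
        prev_hash = block["hash"]
    return True


# Hypothetical pressure readings at two nodes for two time intervals.
snapshots = [{"N1": 52.3, "N2": 48.7}, {"N1": 51.9, "N2": 49.1}]
chain = build_chain(snapshots)
```

Altering any stored reading after the chain is built changes the recomputed hash, so `is_valid` returns `False`; this is the data-validation property that a consensus mechanism protects.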


2020 ◽  
Vol 08 (04) ◽  
pp. 2050020
Author(s):  
Shenning QU

As an analytical framework for studying the characteristics of changes in things and their action mechanisms, the decomposition analysis of greenhouse gas emissions has been increasingly used in environmental economics research. The author introduces several decomposition methods commonly used at present and compares them. The index decomposition analysis (IDA) of carbon emissions usually uses energy identities to express carbon emissions as the product of several factor indexes, and decomposes them according to different weight-determining methods to clarify the incremental share of each index. In this way, it is possible to decompose models that contain fewer factors, process time series data, and conduct cross-country comparisons. IDA mainly includes the Laspeyres index decomposition and the Divisia index decomposition. Among them, the LMDI I method has been widely used for advantages such as generating no residuals and being easy to use. The structural decomposition analysis (SDA) can be used to conduct a more systematic analysis, decompose models with more influencing factors, and analyze the impacts of various factors on emissions, but this method has higher requirements for data collection. The biggest difference between the SDA method and the IDA methods of carbon emissions is that the former is based on an input–output system, while the latter only needs to use sectors’ aggregate data.
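
The "no residuals" property of LMDI I can be checked numerically. The sketch below implements the additive form, in which the effect of factor k is the sum over sectors of L(V_T, V_0) · ln(x_k,T / x_k,0), with L(a, b) = (a − b) / (ln a − ln b) the logarithmic mean. The three factors (activity, energy intensity, emission factor) and all numbers are hypothetical illustrations, not data from the article.

```python
import math


def log_mean(a, b):
    """Logarithmic mean L(a, b) of two positive numbers."""
    if a <= 0 or b <= 0:
        raise ValueError("logarithmic mean requires positive values")
    if math.isclose(a, b, rel_tol=1e-12):
        return a  # limit of the formula as a approaches b
    return (a - b) / (math.log(a) - math.log(b))


def lmdi_additive(factors_0, factors_t):
    """Additive LMDI I effects, one value per factor.

    factors_0 and factors_t list, for each sector, the positive factor
    values in the base and target period; a sector's emissions are the
    product of its factors.
    """
    n_factors = len(factors_0[0])
    effects = [0.0] * n_factors
    for f0, ft in zip(factors_0, factors_t):
        weight = log_mean(math.prod(ft), math.prod(f0))
        for k in range(n_factors):
            effects[k] += weight * math.log(ft[k] / f0[k])
    return effects


# Two hypothetical sectors: [activity, energy intensity, emission factor].
base = [[100.0, 0.5, 2.0], [50.0, 0.8, 2.5]]
target = [[120.0, 0.45, 1.9], [60.0, 0.75, 2.4]]

effects = lmdi_additive(base, target)
total_change = (sum(math.prod(f) for f in target)
                - sum(math.prod(f) for f in base))
residual = total_change - sum(effects)  # zero up to floating-point error
```

Because the per-factor log ratios within a sector sum to ln(V_T / V_0), multiplying by the logarithmic mean recovers V_T − V_0 exactly, which is why the effects add up to the total change.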


2019 ◽  
Vol 11 (2) ◽  
pp. 42 ◽  
Author(s):  
Sheeraz Arif ◽  
Jing Wang ◽  
Tehseen Ul Hassan ◽  
Zesong Fei

Human activity recognition is an active field of research in computer vision with numerous applications. Recently, deep convolutional networks and recurrent neural networks (RNNs) have received increasing attention in multimedia studies, and have yielded state-of-the-art results. In this research work, we propose a new framework which intelligently combines 3D-CNN and LSTM networks. First, we integrate discriminative information from a video into a map called a ‘motion map’ by using a deep 3-dimensional convolutional network (C3D). A motion map and the next video frame can be integrated into a new motion map, and this technique can be trained by increasing the training video length iteratively; then, the final acquired network can be used for generating the motion map of the whole video. Next, a linear weighted fusion scheme is used to fuse the network feature maps into spatio-temporal features. Finally, we use a long short-term memory (LSTM) encoder-decoder for final predictions. This method is simple to implement and retains discriminative and dynamic information. The improved results on public benchmark datasets demonstrate the effectiveness and practicability of the proposed method.


Author(s):  
Chen Li ◽  
Junjun Zheng

Malicious software, called malware, can perform harmful actions on computer systems, which may cause economic damage and information leakage. Therefore, malware classification is meaningful and required to prevent malware attacks. Application programming interface (API) call sequences are easily observed and are good choices as features for malware classification. However, one of the main issues is how to generate a suitable feature for classification algorithms to achieve high classification accuracy. Different malware samples produce API call sequences of different lengths, and these lengths may reach millions, which may incur high computation cost and time complexity. Recurrent neural networks (RNNs) are one of the most versatile approaches to processing time series data and can be used for API call-based malware classification. In this paper, we propose a malware classification model with RNNs, especially the long short-term memory (LSTM) and the gated recurrent unit (GRU), to classify variants of malware by using long sequences of API calls. In numerical experiments, a benchmark dataset is used to illustrate the proposed approach and validate its accuracy. The numerical results show that the proposed RNN model works well on malware classification.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Omobolanle Ruth Ogunseiju ◽  
Johnson Olayiwola ◽  
Abiola Abosede Akanmu ◽  
Chukwuma Nnaji

Purpose: Construction action recognition is essential to efficiently manage productivity, health and safety risks. These can be achieved by tracking and monitoring construction work. This study aims to examine the performance of a variant of deep convolutional neural networks (CNNs) for recognizing actions of construction workers from images of signals of time-series data.
Design/methodology/approach: This paper adopts Inception v1 to classify actions involved in carpentry and painting activities from images of motion data. Augmented time-series data from wearable sensors attached to workers' lower arms are converted to signal images to train an Inception v1 network. Performance of Inception v1 is compared with the highest performing supervised learning classifier, k-nearest neighbor (KNN).
Findings: Results show that the performance of the Inception v1 network improved when trained with signal images of the augmented data but at a high computational cost. The Inception v1 network and KNN achieved an accuracy of 95.2% and 99.8%, respectively, when trained with the 50-fold augmented carpentry dataset. The accuracy of Inception v1 and KNN with the 10-fold augmented painting dataset is 95.3% and 97.1%, respectively.
Research limitations/implications: Only acceleration data of the lower arm of the two trades were used for action recognition. Each signal image comprises 20 datasets.
Originality/value: Little has been reported on recognizing construction workers' actions from signal images. This study adds value to the existing literature, in particular by providing insights into the extent to which a deep CNN can classify subtasks from patterns in signal images compared to a traditional best performing shallow network.


2019 ◽  
Vol 11 (7) ◽  
pp. 783 ◽  
Author(s):  
Chunyong Ma ◽  
Siqing Li ◽  
Anni Wang ◽  
Jie Yang ◽  
Ge Chen

Eddies can be identified and tracked based on satellite altimeter data. However, few studies have focused on nowcasting the evolution of eddies using remote sensing data. In this paper, an improved Convolutional Long Short-Term Memory (Conv-LSTM) network named Prednet is used for eddy nowcasting. Prednet, which uses a deep, recurrent convolutional network with both bottom-up and top-down connections, has the ability to learn the temporal and spatial relationships associated with time series data. The network can effectively simulate and reconstruct the spatiotemporal characteristics of future sea level anomaly (SLA) data. Based on the SLA data products provided by Archiving, Validation, and Interpretation of Satellite Oceanographic (AVISO) from 1993 to 2018, combined with an SLA-based eddy detection algorithm, seven-day eddy nowcasting experiments are conducted on eddies in the South China Sea. The matching ratio is defined as the percentage of true eddies that are successfully predicted by the Conv-LSTM network. On the first day of nowcasting, the matching ratio for eddies with diameters greater than 100 km is 95%, and the average matching ratio over the seven-day nowcast is approximately 60%. To verify the performance of the nowcasting method, two experiments were set up. A typical anticyclonic eddy shedding from the Kuroshio in January 2017 was used to verify the algorithm’s performance on a single eddy, yielding a mean eddy center error of 11.2 km. Moreover, compared with the eddies detected in the Hybrid Coordinate Ocean Model (HYCOM) data set, the eddies predicted with Conv-LSTM networks are closer to the eddies detected in the AVISO SLA data set, indicating that the deep learning method can effectively nowcast eddies.


Electronics ◽  
2019 ◽  
Vol 8 (8) ◽  
pp. 876 ◽  
Author(s):  
Renzhuo Wan ◽  
Shuping Mei ◽  
Jun Wang ◽  
Min Liu ◽  
Fan Yang

Multivariable time series prediction has been widely studied in power energy, aerology, meteorology, finance, transportation, etc. Traditional modeling methods have complex patterns and are inefficient at capturing the long-term multivariate dependencies of data needed for the desired forecasting accuracy. To address such concerns, various deep learning models based on Recurrent Neural Network (RNN) and Convolutional Neural Network (CNN) methods have been proposed. To improve the prediction accuracy and minimize the multivariate time series data dependence for aperiodic data, in this article, the Beijing PM2.5 and ISO-NE datasets are analyzed by a novel Multivariate Temporal Convolution Network (M-TCN) model. In this model, multi-variable time series prediction is constructed as a sequence-to-sequence scenario for non-periodic datasets. Multichannel residual blocks in parallel with an asymmetric structure based on a deep convolution neural network are proposed. The results are compared with a range of competitive algorithms, namely long short-term memory (LSTM), convolutional LSTM (ConvLSTM), Temporal Convolution Network (TCN), and Multivariate Attention LSTM-FCN (MALSTM-FCN), and indicate significant improvement in the prediction accuracy, robustness, and generalization of our model.
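
Framing multivariate forecasting as a sequence-to-sequence problem usually starts by slicing the series into input and target windows. The sketch below shows one common way to do this; the abstract does not state the window lengths or preprocessing used for M-TCN, so `input_len`, `output_len`, and the sample readings are hypothetical.

```python
def make_windows(series, input_len, output_len):
    """Slice a multivariate series into (input, target) window pairs.

    series is a list of time steps, each a list with one value per
    variable. Each pair holds input_len consecutive steps followed by
    the next output_len steps.
    """
    if input_len <= 0 or output_len <= 0:
        raise ValueError("window lengths must be positive")
    pairs = []
    last_start = len(series) - input_len - output_len
    for start in range(last_start + 1):
        split = start + input_len
        pairs.append((series[start:split], series[split:split + output_len]))
    return pairs


# Six hypothetical time steps of two variables (e.g. PM2.5, temperature).
series = [[10.0, 1.0], [12.0, 1.5], [11.0, 2.0],
          [13.0, 2.5], [15.0, 3.0], [14.0, 3.5]]
pairs = make_windows(series, input_len=3, output_len=2)
```

A series shorter than `input_len + output_len` yields no pairs, so callers can check for an empty result instead of handling an exception.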


Author(s):  
Junyu Gao ◽  
Tianzhu Zhang ◽  
Changsheng Xu

Recently, with the ever-growing number of action categories, zero-shot action recognition (ZSAR) has been achieved by automatically mining the underlying concepts (e.g., actions, attributes) in videos. However, most existing methods only exploit the visual cues of these concepts but ignore external knowledge information for modeling explicit relationships between them. In fact, humans have a remarkable ability to transfer knowledge learned from familiar classes to recognize unfamiliar classes. To narrow the knowledge gap between existing methods and humans, we propose an end-to-end ZSAR framework based on a structured knowledge graph, which can jointly model the action-attribute, action-action, and attribute-attribute relationships. To effectively leverage the knowledge graph, we design a novel Two-Stream Graph Convolutional Network (TS-GCN) consisting of a classifier branch and an instance branch. Specifically, the classifier branch takes the semantic-embedding vectors of all the concepts as input, then generates the classifiers for action categories. The instance branch maps the attribute embeddings and scores of each video instance into an attribute-feature space. Finally, the generated classifiers are evaluated on the attribute features of each video, and a classification loss is adopted for optimizing the whole network. In addition, a self-attention module is utilized to model the temporal information of videos. Extensive experimental results on three realistic action benchmarks (Olympic Sports, HMDB51, and UCF101) demonstrate the favorable performance of our proposed framework.
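
A single graph-convolution step over a knowledge graph can be sketched as follows. The abstract does not specify the propagation rule used in TS-GCN, so this assumes the widely used symmetric normalization with self-loops, H' = ReLU(D^(-1/2) (A + I) D^(-1/2) H W). The tiny two-node graph and identity weights are illustrative only.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def normalize_adjacency(adj):
    """Return D^(-1/2) (A + I) D^(-1/2) for an adjacency without self-loops."""
    n = len(adj)
    with_loops = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
                  for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in with_loops]
    return [[inv_sqrt_deg[i] * with_loops[i][j] * inv_sqrt_deg[j]
             for j in range(n)]
            for i in range(n)]


def gcn_layer(adj, features, weights):
    """One layer: propagate over the graph, apply weights, then ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    transformed = matmul(propagated, weights)
    return [[max(0.0, value) for value in row] for row in transformed]


# Two connected concept nodes with one-hot embeddings (hypothetical).
adjacency = [[0.0, 1.0], [1.0, 0.0]]
embeddings = [[1.0, 0.0], [0.0, 1.0]]
identity_weights = [[1.0, 0.0], [0.0, 1.0]]
output = gcn_layer(adjacency, embeddings, identity_weights)
```

Each node's output mixes its own embedding with its neighbor's, which is how related concepts in the graph come to share information.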

