Multi-source Spatio-temporal Hybrid Dilated Graph Convolutional Network for Traffic Speed Forecasting

Author(s): Lei Zhang, Quansheng Guo, Dong Li, Jiaxing Pan, Chuyuan Wei, ...
Author(s): Sophia Bano, Francisco Vasconcelos, Emmanuel Vander Poorten, Tom Vercauteren, Sebastien Ourselin, ...

Abstract
Purpose: Fetoscopic laser photocoagulation is a minimally invasive surgery for the treatment of twin-to-twin transfusion syndrome (TTTS). Using a lens/fibre-optic scope inserted into the amniotic cavity, the abnormal placental vascular anastomoses are identified and ablated to regulate blood flow to both fetuses. A limited field of view, occlusions due to the presence of the fetus and low visibility make it difficult to identify all vascular anastomoses. Automatic computer-assisted techniques may provide a better understanding of the anatomical structure during surgery for risk-free laser photocoagulation and may help improve mosaicking of fetoscopic videos.
Methods: We propose FetNet, a combined convolutional neural network (CNN) and long short-term memory (LSTM) recurrent neural network architecture for the spatio-temporal identification of fetoscopic events. We adapt an existing CNN architecture for spatial feature extraction and integrate it with the LSTM network for end-to-end spatio-temporal inference. We introduce differential learning rates during model training to effectively utilise the pre-trained CNN weights. This may support computer-assisted interventions (CAI) during fetoscopic laser photocoagulation.
Results: We perform a quantitative evaluation of our method using 7 in vivo fetoscopic videos captured from different human TTTS cases. The total duration of these videos was 5551 s (138,780 frames). To test the robustness of the proposed approach, we perform 7-fold cross-validation in which each video in turn serves as the hold-out (test) set and training is performed on the remaining videos.
Conclusion: FetNet achieved superior performance compared to existing CNN-based methods and provided improved inference thanks to its spatio-temporal modelling. Online testing of FetNet on a Tesla V100-DGXS-32GB GPU achieved a frame rate of 114 fps. These results show that our method could potentially provide a real-time solution for CAI and for automating occlusion and photocoagulation identification during fetoscopic procedures.
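
As a rough illustration of this kind of CNN+LSTM pipeline, the PyTorch sketch below wires a ResNet-18 feature extractor into an LSTM and gives the pre-trained backbone a smaller learning rate than the newly added layers, mirroring the differential-learning-rate idea. The backbone choice, layer sizes and event count (num_events) are assumptions for illustration, not the authors' implementation.

import torch
import torch.nn as nn
from torchvision import models

class FetNetSketch(nn.Module):
    # Hypothetical CNN+LSTM event classifier; not the authors' FetNet code.
    def __init__(self, num_events=4, hidden=256):
        super().__init__()
        backbone = models.resnet18(weights=None)  # pre-trained weights would be loaded in practice
        self.cnn = nn.Sequential(*list(backbone.children())[:-1])  # drop the classification head
        self.lstm = nn.LSTM(input_size=512, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_events)

    def forward(self, clip):                     # clip: (B, T, 3, H, W) frame sequence
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1))     # (B*T, 512, 1, 1) per-frame spatial features
        feats = feats.flatten(1).view(b, t, -1)  # (B, T, 512) feature sequence
        out, _ = self.lstm(feats)                # temporal modelling across frames
        return self.head(out[:, -1])             # classify from the last time step

model = FetNetSketch()
# Differential learning rates: small for the pre-trained CNN, larger for new layers.
optimizer = torch.optim.Adam([
    {"params": model.cnn.parameters(), "lr": 1e-5},
    {"params": list(model.lstm.parameters()) + list(model.head.parameters()), "lr": 1e-3},
])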


Author(s): Yinong Zhang, Shanshan Guan, Cheng Xu, Hongzhe Liu

In the era of intelligent education, human behavior recognition based on computer vision is an important branch of pattern recognition. Human behavior recognition is a basic technology in intelligent monitoring and human-computer interaction in education. The dynamic changes of the human skeleton provide important information for the recognition of educational behavior. Traditional methods usually rely on manually labelled information or hand-crafted traversal rules alone, resulting in limited representation capability and poor generalization performance. In this paper, a dynamic skeleton model with residual connections is adopted: a spatio-temporal graph convolutional network based on residual connections, which not only overcomes the limitations of previous methods but can also learn the spatio-temporal model directly from the skeleton data. On the large-scale NTU-RGB+D dataset, the network model improved both the representation ability of human behavior characteristics and the generalization ability, and achieved better recognition results than existing models. In addition, this paper compares behavior recognition results on subsets of different joint points, and finds that spatial-structure partitioning has a better effect.
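
To make the residual spatio-temporal graph convolution concrete, the minimal PyTorch sketch below applies a graph convolution over skeleton joints, a temporal convolution over frames, and a skip connection that adds the input back. The adjacency handling, channel counts and kernel size are illustrative assumptions rather than the paper's exact network.

import torch
import torch.nn as nn

class ResidualSTGCNBlock(nn.Module):
    # Minimal residual spatio-temporal graph conv block; illustrative only.
    def __init__(self, channels, A, t_kernel=9):
        super().__init__()
        self.register_buffer("A", A)                     # (V, V) normalized skeleton adjacency
        self.spatial = nn.Conv2d(channels, channels, 1)  # 1x1 conv mixes channels per joint
        self.temporal = nn.Conv2d(channels, channels, (t_kernel, 1), padding=(t_kernel // 2, 0))
        self.bn = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):                                # x: (B, C, T, V) joint features
        res = x                                          # residual (skip) connection
        y = torch.einsum("bctv,vw->bctw", self.spatial(x), self.A)  # graph conv over joints
        y = self.bn(self.temporal(y))                    # temporal conv over frames
        return self.relu(y + res)

# Trivial usage with an identity adjacency standing in for the skeleton graph:
block = ResidualSTGCNBlock(channels=64, A=torch.eye(25))
out = block(torch.randn(2, 64, 30, 25))  # batch of 2, 64 channels, 30 frames, 25 joints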


Author(s): Xiaobin Zhu, Zhuangzi Li, Xiao-Yu Zhang, Changsheng Li, Yaqi Liu, ...

Video super-resolution is a challenging task which has attracted great attention in the research and industry communities. In this paper, we propose a novel end-to-end architecture, called the Residual Invertible Spatio-Temporal Network (RISTN), for video super-resolution. RISTN sufficiently exploits spatial information in mapping from low resolution to high resolution, and effectively models the temporal consistency across consecutive video frames. Compared with existing recurrent convolutional network based approaches, RISTN is much deeper yet more efficient. It consists of three major components. In the spatial component, a lightweight residual invertible block is designed to reduce information loss during feature transformation and provide robust feature representations. In the temporal component, a novel recurrent convolutional model with residual dense connections is proposed to construct a deeper network and avoid feature degradation. In the reconstruction component, a new fusion method based on a sparse strategy is proposed to integrate the spatial and temporal features. Experiments on public benchmark datasets demonstrate that RISTN outperforms the state-of-the-art methods.
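
One standard way to obtain such an invertible residual transform is additive coupling, sketched below in PyTorch: the input channels are split in half and each half is updated from the other, so the mapping is exactly reversible and loses no information during feature transformation. This is a generic coupling-layer construction under assumed 3x3 convolutions; RISTN's actual block design may differ.

import torch
import torch.nn as nn

class InvertibleResidualBlock(nn.Module):
    # Additive-coupling block: forward() is exactly undone by inverse().
    def __init__(self, channels):
        super().__init__()
        half = channels // 2
        self.f = nn.Sequential(nn.Conv2d(half, half, 3, padding=1), nn.ReLU(),
                               nn.Conv2d(half, half, 3, padding=1))
        self.g = nn.Sequential(nn.Conv2d(half, half, 3, padding=1), nn.ReLU(),
                               nn.Conv2d(half, half, 3, padding=1))

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)   # split along the channel axis
        y1 = x1 + self.f(x2)         # each half is updated from the other,
        y2 = x2 + self.g(y1)         # which keeps the mapping bijective
        return torch.cat([y1, y2], dim=1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        x2 = y2 - self.g(y1)         # subtract in reverse order to recover the input
        x1 = y1 - self.f(x2)
        return torch.cat([x1, x2], dim=1)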


2019, Vol 155, pp. 551-558
Author(s): Kuldeep Kurte, Srinath Ravulaparthy, Anne Berres, Melissa Allen, Jibonananda Sanyal

2019, Vol 11 (2), pp. 42
Author(s): Sheeraz Arif, Jing Wang, Tehseen Ul Hassan, Zesong Fei

Human activity recognition is an active field of research in computer vision with numerous applications. Recently, deep convolutional networks and recurrent neural networks (RNN) have received increasing attention in multimedia studies and have yielded state-of-the-art results. In this research work, we propose a new framework which intelligently combines 3D-CNN and LSTM networks. First, we integrate the discriminative information from a video into a map called a 'motion map' by using a deep 3-dimensional convolutional network (C3D). A motion map and the next video frame can be integrated into a new motion map, and this technique can be trained by iteratively increasing the training video length; the final network can then be used to generate the motion map of the whole video. Next, a linear weighted fusion scheme is used to fuse the network feature maps into spatio-temporal features. Finally, we use a long short-term memory (LSTM) encoder-decoder for the final predictions. This method is simple to implement and retains discriminative and dynamic information. Improved results on public benchmark datasets demonstrate the effectiveness and practicality of the proposed method.
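
As a minimal sketch of the linear weighted fusion step, the PyTorch snippet below blends per-frame spatial and temporal feature vectors with a single learnable weight before an LSTM produces the prediction. The equal feature dimensions, the scalar weight and the class count are assumptions for illustration, not the paper's configuration.

import torch
import torch.nn as nn

class WeightedFusionLSTM(nn.Module):
    # Illustrative linear weighted fusion of two feature streams plus an LSTM.
    def __init__(self, feat_dim=512, hidden=256, num_classes=10):
        super().__init__()
        self.alpha = nn.Parameter(torch.tensor(0.5))  # learnable fusion weight
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_classes)

    def forward(self, spatial_feats, temporal_feats):  # both: (B, T, feat_dim)
        fused = self.alpha * spatial_feats + (1 - self.alpha) * temporal_feats
        out, _ = self.lstm(fused)                      # temporal encoding of fused features
        return self.head(out[:, -1])                   # predict from the final state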


2020, Vol 10 (4), pp. 1509
Author(s): Liang Ge, Siyu Li, Yaqian Wang, Feng Chang, Kunyan Wu

Traffic speed prediction plays a significant role in intelligent transportation systems (ITS). However, due to the complex spatial-temporal correlations of traffic data, it is very challenging to predict traffic speed promptly and accurately. Traffic speed exhibits not only short-term neighboring and multiple long-term periodic dependencies in the temporal dimension, but also local and global dependencies in the spatial dimension. To address this problem, we propose a novel deep-learning-based model, the Global Spatial-Temporal Graph Convolutional Network (GSTGCN), for urban traffic speed prediction. The model consists of three spatial-temporal components with the same structure and an external component. The three spatial-temporal components model the recent, daily-periodic, and weekly-periodic spatial-temporal correlations of the traffic data, respectively. More specifically, each spatial-temporal component consists of a dynamic temporal module and a globally correlated spatial module. The former contains multiple residual blocks stacked from dilated causal convolutions, while the latter contains a localized graph convolution and a global correlation mechanism. The external component extracts the effect of external factors, such as holidays and weather conditions, on the traffic speed. Experimental results on two real-world traffic datasets demonstrate that the proposed GSTGCN outperforms state-of-the-art baselines.
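
The dilated causal convolutions in the dynamic temporal module can be sketched as follows in PyTorch: each block left-pads its input so that no future time step leaks into the output, and stacking blocks with dilations 1, 2, 4, 8 grows the temporal receptive field exponentially. Kernel size, channel count and the dilation schedule are illustrative assumptions, not GSTGCN's exact configuration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DilatedCausalBlock(nn.Module):
    # One residual block of dilated causal 1-D convolution; illustrative only.
    def __init__(self, channels, dilation):
        super().__init__()
        self.pad = dilation          # (kernel_size - 1) * dilation for kernel_size = 2
        self.conv = nn.Conv1d(channels, channels, kernel_size=2, dilation=dilation)

    def forward(self, x):            # x: (B, C, T) traffic-speed sequence features
        y = F.pad(x, (self.pad, 0))  # left-only padding keeps the convolution causal
        return x + torch.relu(self.conv(y))  # residual connection around the conv

# Stacking with growing dilations covers long histories with few layers:
net = nn.Sequential(*[DilatedCausalBlock(64, d) for d in (1, 2, 4, 8)])
out = net(torch.randn(2, 64, 48))    # e.g. 48 past time steps of speed features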

