Multiscale feature fusion for surveillance video diagnosis

To improve the accuracy of pedestrian behavior recognition in video sequences with complex backgrounds, this paper proposes an improved spatial-temporal two-stream network. First, a deep differential network replaces the temporal-stream network to improve the representation ability and extraction efficiency of spatiotemporal features. Then, the model is trained with an improved Softmax loss function based on a decision-level feature fusion mechanism, which retains more of the spatiotemporal characteristics of images between different network frames and reflects the pedestrian action category more faithfully. Simulation results show that the proposed network achieves 87% recognition accuracy on the self-built infrared dataset and improves computational efficiency by 15.1%.
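
The abstract does not specify the deep differential network or the improved Softmax loss, so the following is only a minimal sketch of the two general ideas it names: a motion cue built from differences between consecutive frames, and decision-level fusion of the class scores from two streams. The function names, the plain frame differencing, and the `spatial_weight` parameter are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)


def frame_difference(frames):
    """Differences between consecutive frames.

    frames: array of shape (T, H, W). Returns shape (T - 1, H, W).
    Casting to float32 first avoids wraparound on uint8 images.
    This is a hypothetical stand-in for a learned differential network.
    """
    return np.diff(frames.astype(np.float32), axis=0)


def fuse_decisions(spatial_logits, temporal_logits, spatial_weight=0.5):
    """Decision-level fusion: weighted average of per-stream softmax scores.

    spatial_weight is an assumed hyperparameter, not taken from the paper.
    Returns the fused probability vector and the predicted class index.
    """
    p_spatial = softmax(spatial_logits)
    p_temporal = softmax(temporal_logits)
    fused = spatial_weight * p_spatial + (1.0 - spatial_weight) * p_temporal
    return fused, int(np.argmax(fused))


if __name__ == "__main__":
    # Three tiny 2x2 "frames" and made-up logits for three action classes.
    frames = np.arange(12, dtype=np.uint8).reshape(3, 2, 2)
    print(frame_difference(frames).shape)
    fused, label = fuse_decisions(np.array([2.0, 1.0, 0.0]),
                                  np.array([0.0, 3.0, 0.0]))
    print(fused, label)
```

Averaging probabilities rather than raw logits keeps the two streams on a comparable scale, which is one common reason to fuse at the decision level.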


Enhancing forensic human factors/ergonomics analyses using digital surveillance video

PsycEXTRA Dataset ◽ 10.1037/e577982012-016 ◽ 2007

Author(s): Joseph Cohen ◽ H. Harvey Cohen

Keyword(s): Human Factors ◽ Surveillance Video ◽ Digital Surveillance


Multi-Feature Fusion Identification of Important Nodes in Traffic Network

International Conference on Transportation and Development 2020 ◽ 10.1061/9780784483152.015 ◽ 2020

Author(s): Yuxin Xiao ◽ Jianming Hu ◽ Zuo Zhang ◽ Yi Zhang

Keyword(s): Feature Fusion ◽ Traffic Network ◽ Important Nodes


Traffic Congestion Net (TCNet): An Accurate Traffic Congestion Level Estimation Method Based on Traffic Surveillance Video Feature Extraction

CICTP 2020 ◽ 10.1061/9780784483053.001 ◽ 2020

Author(s): Jiakang Li ◽ Yuxian Pang ◽ Xiying Li

Keyword(s): Feature Extraction ◽ Traffic Congestion ◽ Estimation Method ◽ Surveillance Video ◽ Traffic Surveillance ◽ Video Feature


A Study on Utilization of Three-Dimensional Sensor Lip Image for Developing a Pronunciation Recognition System

Journal of Imaging Science and Technology ◽ 10.2352/j.imagingsci.technol.2019.63.5.050402 ◽ 2019 ◽ Vol 63 (5) ◽ pp. 50402-1-50402-9 ◽ Cited By ~ 1

Author(s): Ing-Jr Ding ◽ Chong-Min Ruan

Keyword(s): Principal Component Analysis ◽ Automatic Speech Recognition ◽ Feature Fusion ◽ Three Dimensional ◽ Principal Component ◽ Recognition System ◽ Geometrical Characteristics ◽ 3D Geometry ◽ Different Types ◽ The Disabled

Abstract: Acoustic-based automatic speech recognition (ASR) is a mature technique used in numerous applications. However, acoustic-based ASR does not maintain standard performance for disabled speakers with atypical facial geometry, that is, atypical eye or mouth geometrical characteristics. To address this problem, this article develops a pronunciation recognition system based on three-dimensional (3D) sensor lip images, in which the 3D sensor captures the variations in lip shape as a speaker pronounces. Two types of 3D lip features for pronunciation recognition are presented: a 3D-(x, y, z) coordinate lip feature and 3D geometry lip feature parameters. For the 3D-(x, y, z) coordinate lip feature, 18 location points around the outer and inner lips are defined, each with 3D coordinates. For the 3D geometry lip features, eight types of features describing the geometrical space characteristics of the inner lip are developed. In addition, feature fusion combining the 3D-(x, y, z) coordinate and 3D geometry lip features is considered. The performance and effectiveness of the presented features are evaluated using a principal component analysis based classification approach. Experimental results on two datasets, Mandarin syllables and Mandarin phrases, demonstrate the competitive performance of the presented 3D sensor lip image based pronunciation recognition system.
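
The abstract gives the feature sizes (18 lip points with three coordinates each, plus eight geometry features) but not how they are fused or classified. The sketch below assumes the simplest reading: fusion by concatenation into a 62-dimensional vector, projection onto a few principal components, and nearest-centroid classification in that subspace. The function names, the single-frame treatment of each sample, and the nearest-centroid rule are hypothetical choices for illustration, not the authors' pipeline.

```python
import numpy as np


def fuse_features(coords, geometry):
    """Concatenate coordinate and geometry features per sample.

    coords: (N, 18, 3) lip point coordinates; geometry: (N, 8).
    Returns an (N, 62) matrix. Concatenation is an assumed fusion rule.
    """
    flat = coords.reshape(len(coords), -1)
    return np.concatenate([flat, geometry], axis=1)


def fit_pca(X, n_components):
    """Return the feature mean and the top principal axes (rows) of X."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]


def project(X, mean, components):
    """Project samples onto the principal axes."""
    return (X - mean) @ components.T


def nearest_centroid(train_z, train_y, test_z):
    """Assign each test sample the label of the closest class mean."""
    labels = np.unique(train_y)
    centroids = np.stack([train_z[train_y == c].mean(axis=0) for c in labels])
    dists = np.linalg.norm(test_z[:, None, :] - centroids[None, :, :], axis=2)
    return labels[np.argmin(dists, axis=1)]


if __name__ == "__main__":
    # Synthetic, well-separated data for two made-up pronunciation classes.
    rng = np.random.default_rng(0)
    coords = np.concatenate([rng.normal(0.0, 0.1, (10, 18, 3)),
                             rng.normal(5.0, 0.1, (10, 18, 3))])
    geometry = np.concatenate([rng.normal(0.0, 0.1, (10, 8)),
                               rng.normal(5.0, 0.1, (10, 8))])
    y = np.array([0] * 10 + [1] * 10)

    X = fuse_features(coords, geometry)
    mean, components = fit_pca(X, n_components=2)
    Z = project(X, mean, components)
    print(X.shape, nearest_centroid(Z, y, Z))
```

In practice the two feature groups have different scales, so standardizing each column before PCA would be a reasonable addition to this sketch.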
