Integrated Multiscale Appearance Features and Motion Information Prediction Network for Anomaly Detection

2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Ting Liu ◽  
Chengqing Zhang ◽  
Liming Wang

The rise of video-prediction algorithms has greatly promoted the development of anomaly detection in video surveillance for smart cities and public security. However, most current methods rely on single-scale information to extract appearance (spatial) features and lack motion (temporal) continuity between video frames. This can discard part of the spatiotemporal information that is valuable for predicting future frames, degrading the accuracy of anomaly detection. Thus, we propose a novel prediction network to improve the performance of anomaly detection. Because objects appear at various scales in each video, we use different receptive fields in a hybrid dilated convolution (HDC) module to extract detailed appearance features. Meanwhile, a deeper bidirectional convolutional long short-term memory (DB-ConvLSTM) module remembers the motion information between consecutive frames. Furthermore, we replace the optical flow loss with an RGB difference loss as the temporal constraint, which greatly reduces the time required for optical flow extraction. Experiments show that, compared with state-of-the-art methods on the anomaly detection task, our method detects abnormalities in various video surveillance scenes more accurately.
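The abstract does not include an implementation; the sketch below is a minimal PyTorch illustration of the two ingredients it names, a hybrid dilated convolution block and an RGB difference loss. The channel widths, dilation rates, and the choice of an L1 penalty are assumptions rather than the authors' exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HDCBlock(nn.Module):
    """Hybrid dilated convolution: parallel 3x3 branches with different
    dilation rates capture appearance features at several scales."""
    def __init__(self, channels, dilations=(1, 2, 4)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=d, dilation=d)
            for d in dilations
        ])
        self.fuse = nn.Conv2d(channels * len(dilations), channels, kernel_size=1)

    def forward(self, x):
        feats = [F.relu(branch(x)) for branch in self.branches]
        return self.fuse(torch.cat(feats, dim=1))

def rgb_difference_loss(pred, target):
    """Temporal constraint on consecutive-frame RGB differences, used in
    place of an optical flow loss. pred, target: (B, T, C, H, W)."""
    return F.l1_loss(pred[:, 1:] - pred[:, :-1],
                     target[:, 1:] - target[:, :-1])
```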

Electronics ◽  
2021 ◽  
Vol 10 (5) ◽  
pp. 534
Author(s):  
Huogen Wang

The paper proposes an effective continuous gesture recognition method that includes two modules: segmentation and recognition. In the segmentation module, video frames are divided into gesture frames and transitional frames using hand motion and appearance information, and continuous gesture sequences are segmented into isolated sequences. In the recognition module, our method exploits the spatiotemporal information embedded in RGB and depth sequences. For the RGB modality, our method adopts Convolutional Long Short-Term Memory networks to learn long-term spatiotemporal features from the short-term spatiotemporal features produced by a 3D convolutional neural network. For the depth modality, our method converts each sequence into Dynamic Images and Motion Dynamic Images through weighted rank pooling and feeds them into convolutional neural networks, respectively. Our method has been evaluated on both the ChaLearn LAP Large-scale Continuous Gesture Dataset and the Montalbano Gesture Dataset and achieves state-of-the-art performance.
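As a rough illustration of the depth branch, the snippet below computes a Dynamic Image from a frame sequence using the closed-form approximate rank-pooling weights (alpha_t = 2t - T - 1); the exact pooling formulation and the rescaling step are assumptions, and a Motion Dynamic Image would be produced the same way from consecutive-frame differences.

```python
import numpy as np

def dynamic_image(frames):
    """Collapse a depth (or RGB) sequence into one Dynamic Image by
    weighted rank pooling. frames: (T, H, W) or (T, H, W, C)."""
    T = frames.shape[0]
    t = np.arange(1, T + 1, dtype=np.float64)
    alphas = 2.0 * t - T - 1.0                    # approximate rank-pooling weights
    di = np.tensordot(alphas, frames.astype(np.float64), axes=(0, 0))
    di -= di.min()                                # rescale to 0..255 so the result
    if di.max() > 0:                              # can be fed to a CNN like an image
        di = 255.0 * di / di.max()
    return di.astype(np.uint8)

# A Motion Dynamic Image would use np.diff(frames, axis=0) as input instead.
```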


2020 ◽  
Vol 2020 ◽  
pp. 1-11
Author(s):  
Jie Shen ◽  
Mengxi Xu ◽  
Xinyu Du ◽  
Yunbo Xiong

Video surveillance is an important data source for urban computing and intelligence. The low resolution of many existing video surveillance devices limits the efficiency of urban computing and intelligence, so improving the resolution of surveillance video is one of the important tasks in this field. In this paper, the resolution of video is improved by superresolution reconstruction based on a learning method. Unlike the superresolution reconstruction of static images, the superresolution reconstruction of video is characterized by the use of motion information, yet there are few studies in this area so far. Aiming to fully exploit motion information to improve video superresolution, this paper proposes a superresolution reconstruction method based on an efficient subpixel convolutional neural network, in which optical flow is introduced into the deep learning network. Fusing optical flow features between successive frames compensates for missing information between frames and generates high-quality superresolution results. In addition, a subpixel convolution layer is added after the deep convolutional network to further improve the superresolution. Finally, experimental evaluations demonstrate that our method performs favorably compared with previous methods and other deep learning networks while being more efficient.
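For reference, here is a minimal PyTorch sketch of the sub-pixel convolution (pixel shuffle) upsampling step that an ESPCN-style network ends with; the channel counts and scale factor are placeholder assumptions, and the optical-flow fusion stage described in the abstract is not shown.

```python
import torch.nn as nn

class SubPixelUpsampler(nn.Module):
    """ESPCN-style sub-pixel convolution: a convolution produces
    r*r*C_out channels at low resolution, and PixelShuffle rearranges
    them into an image upscaled by factor r."""
    def __init__(self, in_channels=64, out_channels=3, scale=4):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels * scale * scale,
                              kernel_size=3, padding=1)
        self.shuffle = nn.PixelShuffle(scale)

    def forward(self, x):                    # x: (B, in_channels, H, W)
        return self.shuffle(self.conv(x))    # (B, out_channels, H*r, W*r)
```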


Sensors ◽  
2021 ◽  
Vol 21 (8) ◽  
pp. 2811
Author(s):  
Waseem Ullah ◽  
Amin Ullah ◽  
Tanveer Hussain ◽  
Zulfiqar Ahmad Khan ◽  
Sung Wook Baik

Video anomaly recognition in smart cities is an important computer vision task that plays a vital role in smart surveillance and public safety, but it is challenging because anomalies are diverse, complex, and infrequent in real-time surveillance environments. Many deep learning models require large amounts of training data, generalize poorly, and have high time complexity. To overcome these problems, in the current work, we present an efficient, light-weight convolutional neural network (CNN)-based anomaly recognition framework that is functional in a surveillance environment with reduced time complexity. We extract spatial CNN features from a series of video frames and feed them to the proposed residual attention-based long short-term memory (LSTM) network, which can precisely recognize anomalous activity in surveillance videos. The representative CNN features combined with the residual-block concept in the LSTM for sequence learning prove effective for anomaly detection and recognition, validating our model's usage in smart-city video surveillance. Extensive experiments on the real-world benchmark UCF-Crime dataset validate the effectiveness of the proposed model within complex surveillance environments and demonstrate that it outperforms state-of-the-art models with a 1.77%, 0.76%, and 8.62% increase in accuracy on the UCF-Crime, UMN, and Avenue datasets, respectively.
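The architecture is only described at a high level, so the following PyTorch sketch is an assumption of how per-frame CNN features could be fed to a residual attention-based LSTM classifier; the feature dimension, hidden size, number of classes, and the specific attention/residual wiring are all hypothetical.

```python
import torch.nn as nn
import torch.nn.functional as F

class ResidualAttentionLSTM(nn.Module):
    """Classify a sequence of per-frame CNN features: an LSTM with a soft
    temporal attention head plus a residual skip from the pooled input."""
    def __init__(self, feat_dim=2048, hidden=512, num_classes=14):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, 1)         # one attention score per frame
        self.skip = nn.Linear(feat_dim, hidden)  # residual path around the LSTM
        self.cls = nn.Linear(hidden, num_classes)

    def forward(self, feats):                    # feats: (B, T, feat_dim)
        h, _ = self.lstm(feats)                  # (B, T, hidden)
        w = F.softmax(self.attn(h), dim=1)       # temporal attention weights
        context = (w * h).sum(dim=1)             # attention-pooled summary
        residual = self.skip(feats.mean(dim=1))  # pooled-input shortcut
        return self.cls(context + residual)      # anomaly-class logits
```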


2019 ◽  
Vol 9 (16) ◽  
pp. 3337
Author(s):  
Ming Xu ◽  
Xiaosheng Yu ◽  
Dongyue Chen ◽  
Chengdong Wu ◽  
Yang Jiang

Anomaly detection in crowded scenes is an important and challenging part of intelligent video surveillance systems. As deep neural networks have succeeded at feature representation, the features they extract describe the appearance and motion patterns of different scenes more specifically than the hand-crafted features typically used in traditional anomaly detection approaches. In this paper, we propose a new baseline framework of anomaly detection for complex surveillance scenes based on a variational auto-encoder with convolution kernels to learn feature representations. First, the raw frame series is provided as input to our variational auto-encoder without any preprocessing to learn the appearance and motion features of the receptive fields. Then, multiple Gaussian models are used to predict the anomaly scores of the corresponding receptive fields. Our proposed two-stage anomaly detection system is evaluated on a video surveillance dataset for a large scene, the UCSD pedestrian datasets, and yields competitive performance compared with state-of-the-art methods.
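As a simplified illustration of the second stage, the snippet below fits a single Gaussian to the latent features of normal training data for one receptive field and scores test features by Mahalanobis distance; the paper uses multiple Gaussian models, so a mixture model would be the closer analogue.

```python
import numpy as np

def fit_gaussian(latents):
    """Fit one Gaussian to the latent vectors of normal frames for a
    single receptive field. latents: (N, D)."""
    mu = latents.mean(axis=0)
    cov = np.cov(latents, rowvar=False) + 1e-6 * np.eye(latents.shape[1])
    return mu, np.linalg.inv(cov)

def anomaly_score(z, mu, cov_inv):
    """Mahalanobis distance of a test latent vector; large values mean the
    observed appearance/motion pattern is unlikely under the normal model."""
    d = z - mu
    return float(d @ cov_inv @ d)
```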


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4419
Author(s):  
Hao Li ◽  
Tianhao Xiezhang ◽  
Cheng Yang ◽  
Lianbing Deng ◽  
Peng Yi

In the construction of smart cities, more and more video surveillance systems are being deployed for traffic, office buildings, shopping malls, and homes, so the security of video surveillance systems has attracted increasing attention. At present, many researchers focus on how to select the region of interest (RoI) accurately and then protect privacy in videos through selective encryption. However, relatively few researchers focus on building a security framework by analyzing a video surveillance system from the perspectives of the system and the data life cycle. By analyzing surveillance video protection and the attack surface of a video surveillance system in a smart city, we construct a secure surveillance framework in this manuscript. Within this framework, a secure video surveillance model is proposed, and a secure authentication protocol that can resist man-in-the-middle (MITM) and replay attacks is implemented. For the management of the video encryption key, we introduce the Chinese remainder theorem (CRT) on the basis of group key management to provide efficient and secure key updates. In addition, we build a decryption suite based on transparent encryption to ensure the security of the decryption environment. The security analysis proves that our system can guarantee the forward and backward security of key updates. In the experimental environment, the average decryption speed of our system reaches 91.47 Mb/s, which meets the real-time requirements of practical applications.
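The protocol details are not given in the abstract; the sketch below is a generic CRT-based group rekeying illustration (not the authors' exact scheme), in which each member holds a private, pairwise-coprime modulus and recovers the new group key from a single broadcast value. It requires Python 3.8+ for pow(x, -1, m).

```python
from functools import reduce

def crt(remainders, moduli):
    """Solve x ≡ r_i (mod m_i) for pairwise-coprime moduli via the CRT."""
    M = reduce(lambda a, b: a * b, moduli)
    x = 0
    for r, m in zip(remainders, moduli):
        Mi = M // m
        x += r * Mi * pow(Mi, -1, m)     # modular inverse, Python 3.8+
    return x % M

def rekey(group_key, member_moduli):
    """Broadcast one value; every current member recovers the new group key
    as broadcast % own_modulus. Departed members' moduli are simply dropped
    from the next rekey, which is what yields forward/backward secrecy."""
    assert all(group_key < m for m in member_moduli)
    return crt([group_key] * len(member_moduli), member_moduli)

# Hypothetical member secrets (pairwise-coprime primes) and a new group key.
moduli = [10007, 10009, 10037]
broadcast = rekey(1234, moduli)
assert all(broadcast % m == 1234 for m in moduli)
```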


Sensors ◽  
2021 ◽  
Vol 21 (11) ◽  
pp. 3722
Author(s):  
Byeongkeun Kang ◽  
Yeejin Lee

Motion in videos refers to the pattern of apparent movement of objects, surfaces, and edges over image sequences caused by the relative movement between a camera and a scene. Motion, like scene appearance, is an essential feature for estimating a driver's visual attention allocation in computer vision. However, while driver attention prediction models focusing on scene appearance have been well studied, the fact that motion can be a crucial factor in driver attention estimation has not been thoroughly examined in the literature. Therefore, in this work, we investigate the usefulness of motion information in estimating a driver's visual attention. To analyze the effectiveness of motion information, we develop a deep neural network framework that provides attention locations and attention levels using optical flow maps, which represent the movements of content in videos. We validate the performance of the proposed motion-based prediction model by comparing it to that of current state-of-the-art prediction models using RGB frames. The experimental results on a real-world dataset confirm our hypothesis that motion contributes to prediction accuracy and that there is further room for improvement by using motion features.
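The abstract does not say which flow estimator is used; as an assumed example of producing the optical flow maps such a model would consume, the snippet below uses OpenCV's Farnebäck dense flow on consecutive grayscale frames.

```python
import cv2

def flow_maps(video_path):
    """Yield a dense optical-flow map of shape (H, W, 2) for each pair of
    consecutive frames: per-pixel (dx, dy) motion vectors."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Farneback parameters: pyr_scale, levels, winsize, iterations,
        # poly_n, poly_sigma, flags
        flow = cv2.calcOpticalFlowFarneback(prev_gray, gray, None,
                                            0.5, 3, 15, 3, 5, 1.2, 0)
        yield flow
        prev_gray = gray
    cap.release()
```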


2021 ◽  
Vol 11 (7) ◽  
pp. 2925
Author(s):  
Edgar Cortés Gallardo Medina ◽  
Victor Miguel Velazquez Espitia ◽  
Daniela Chípuli Silva ◽  
Sebastián Fernández Ruiz de las Cuevas ◽  
Marco Palacios Hirata ◽  
...  

Autonomous vehicles are increasingly seen as a necessary step towards building the smart cities of the future. Numerous proposals have been presented in recent years to tackle particular aspects of the working pipeline towards creating a functional end-to-end system, such as object detection, tracking, path planning, and sentiment or intent detection, amongst others. Nevertheless, few efforts have been made to systematically compile all of these systems into a single proposal that also considers the real challenges these systems will face on the road, such as real-time computation and hardware capabilities. This paper reviews the latest techniques towards creating our own end-to-end autonomous vehicle system, considering state-of-the-art methods for object detection and the possible incorporation of distributed systems and parallelization to deploy them. Our findings show that while techniques such as convolutional neural networks, recurrent neural networks, and long short-term memory can effectively handle the initial detection and path planning tasks, more effort is required to implement cloud computing to reduce the computational time that these methods demand. Additionally, we have mapped different strategies to handle the parallelization task, both within and between the networks.


Author(s):  
Taku Wakui ◽  
Takao Kondo ◽  
Fumio Teraoka

This paper proposes a general-purpose anomaly detection mechanism for Internet backbone traffic named GAMPAL (General-purpose Anomaly detection Mechanism using Prefix Aggregate without Labeled data). GAMPAL does not require labeled data to achieve general-purpose anomaly detection. For scalability with the number of entries in the BGP RIB (Border Gateway Protocol Routing Information Base), GAMPAL introduces prefix aggregates. The BGP RIB entries are classified into prefix aggregates, each of which is identified by the first three AS (Autonomous System) numbers in the AS_PATH attribute. GAMPAL establishes a prediction model of traffic size based on past traffic sizes. It adopts an LSTM-RNN (Long Short-Term Memory Recurrent Neural Network) model that focuses on the weekly periodicity of Internet traffic patterns. The validity of GAMPAL is evaluated using real traffic information and BGP RIBs exported from the WIDE backbone network (AS2500), a nationwide backbone network for research and educational organizations in Japan, as well as the dataset of an ISP (Internet Service Provider) in Spain. As a result, GAMPAL successfully detects anomalies such as increased traffic due to an event, DDoS (Distributed Denial of Service) attacks targeting a stub organization, a connection failure, an SSH (Secure Shell) scan attack, and spam-related anomalies.
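To make the prefix-aggregate idea concrete, here is a small Python sketch that maps a RIB entry's AS_PATH to its aggregate key and flags traffic that deviates from the LSTM prediction; the thresholded absolute-error rule and the example AS_PATH are assumptions, not GAMPAL's exact scoring.

```python
def prefix_aggregate_key(as_path):
    """Identify a prefix aggregate by the first three AS numbers in the
    BGP AS_PATH attribute, as GAMPAL does when classifying RIB entries."""
    return tuple(as_path.split()[:3])

def is_anomalous(observed_size, predicted_size, threshold):
    """Flag a traffic size that deviates from the LSTM-RNN prediction by
    more than a threshold (assumed scoring rule)."""
    return abs(observed_size - predicted_size) > threshold

# e.g. AS_PATH "2500 2914 3356 15169" falls into aggregate ('2500', '2914', '3356')
assert prefix_aggregate_key("2500 2914 3356 15169") == ("2500", "2914", "3356")
```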

