A Deep Learning-Based Approach for Camera Motion Classification

Author(s):  
Kaouther Ouenniche ◽  
Ruxandra Tapu ◽  
Titus Zaharia
Sensors ◽  
2019 ◽  
Vol 19 (23) ◽  
pp. 5310
Author(s):  
Lai Kang ◽  
Yingmei Wei ◽  
Jie Jiang ◽  
Yuxiang Xie

Cylindrical panorama stitching can generate high-resolution images of a scene with a wide field of view (FOV), making it a useful scene representation for applications such as environmental sensing and robot localization. Traditional image stitching methods based on hand-crafted features are effective for constructing a cylindrical panorama from a sequence of images when the scene contains sufficient reliable features. However, these methods cannot handle low-texture environments where no reliable feature correspondences can be established. This paper proposes a novel two-step image alignment method based on deep learning and iterative optimization to address this issue. In particular, a lightweight, end-to-end trainable convolutional neural network (CNN) architecture called ShiftNet is proposed to estimate the initial shifts between images, which are further optimized in a sub-pixel refinement procedure based on a specified camera motion model. Extensive experiments on a synthetic dataset, rendered photo-realistic images, and real images were carried out to evaluate the performance of the proposed method. Both qualitative and quantitative results demonstrate that cylindrical panorama stitching based on the proposed image alignment method yields significant improvements over traditional feature-based methods and recent deep learning-based methods in challenging low-texture environments.
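The ShiftNet architecture itself is not given in the abstract; as an illustrative stand-in (assumed, not the authors' code), the sketch below mimics the two-step idea: a coarse integer shift estimate between two images, followed by a sub-pixel refinement via a quadratic fit of the alignment error around the best integer shift.

```python
import numpy as np

def coarse_shift(ref, img, max_shift=10):
    """Stand-in for a learned shift estimator: exhaustive 1-D search
    over horizontal shifts minimising the mean squared error."""
    shifts = list(range(-max_shift, max_shift + 1))
    errors = np.array([np.mean((ref - np.roll(img, s, axis=1)) ** 2)
                       for s in shifts])
    best = int(np.argmin(errors))
    return shifts[best], errors

def subpixel_refine(best_shift, errors, best_idx):
    """Quadratic (parabolic) fit through the error at the best integer
    shift and its two neighbours gives a sub-pixel correction."""
    e_m, e_0, e_p = errors[best_idx - 1], errors[best_idx], errors[best_idx + 1]
    denom = e_m - 2.0 * e_0 + e_p
    if denom == 0:
        return float(best_shift)
    return best_shift + 0.5 * (e_m - e_p) / denom
```

In the paper the coarse step is learned by a CNN and the refinement follows a camera motion model; the quadratic fit here only illustrates why sub-pixel accuracy is recoverable from a discrete error curve.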


Sensors ◽  
2020 ◽  
Vol 20 (2) ◽  
pp. 547
Author(s):  
Abu Md Niamul Taufique ◽  
Breton Minnehan ◽  
Andreas Savakis

In recent years, deep learning-based visual object trackers have achieved state-of-the-art performance on several visual object tracking benchmarks. However, most tracking benchmarks focus on ground-level videos, whereas aerial tracking presents a new set of challenges. In this paper, we compare ten deep learning-based trackers on four aerial datasets. We choose top-performing trackers that utilize different approaches, specifically tracking by detection, discriminative correlation filters, Siamese networks, and reinforcement learning. In our experiments, we use a subset of the OTB2015 dataset with aerial-style videos; the UAV123 dataset without synthetic sequences; the UAV20L dataset, which contains 20 long sequences; and the DTB70 dataset as our benchmark datasets. We compare the advantages and disadvantages of different trackers in the various tracking situations encountered in aerial data. Our findings indicate that the trackers perform significantly worse on aerial datasets than on standard ground-level videos. We attribute this effect to smaller target sizes, camera motion, significant camera rotation with respect to the target, out-of-view movement, and clutter in the form of occlusions or similar-looking distractors near the tracked object.
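The abstract does not detail the evaluation protocol; benchmarks like OTB2015 and UAV123 commonly report an overlap-based success score. A minimal sketch of that measure (hypothetical box data, not the benchmark toolkits themselves):

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection-over-union of two [x, y, w, h] boxes."""
    xa, ya = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    xb = min(box_a[0] + box_a[2], box_b[0] + box_b[2])
    yb = min(box_a[1] + box_a[3], box_b[1] + box_b[3])
    inter = max(0.0, xb - xa) * max(0.0, yb - ya)
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def success_auc(pred_boxes, gt_boxes, thresholds=np.linspace(0, 1, 21)):
    """Fraction of frames whose overlap exceeds each threshold,
    averaged over thresholds (area under the success curve)."""
    overlaps = np.array([iou(p, g) for p, g in zip(pred_boxes, gt_boxes)])
    return float(np.mean([(overlaps > t).mean() for t in thresholds]))
```

Lower success scores on aerial sequences would directly reflect the failure modes the paper lists (small targets, camera rotation, out-of-view movement).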


2020 ◽  
Author(s):  
Aqib Mumtaz ◽  
Allah Bux Sargano ◽  
Zulfiqar Habib

Violence detection is mostly achieved through handcrafted feature descriptors, while some researchers have also employed deep learning-based representation models for violent activity recognition. Deep learning-based models have achieved encouraging results for fight activity recognition on benchmark datasets such as hockey and movies. However, these models have limitations in learning discriminative features for violent activity classification under abrupt camera motion. This work investigates deep representation models using transfer learning to handle the issue of abrupt camera motion. Consequently, a novel deep multi-net (DMN) architecture based on AlexNet and GoogLeNet is proposed for violence detection in videos. AlexNet and GoogLeNet are top-ranked pre-trained models for image classification with distinct pre-learnt features, and their fusion can yield superior performance. The proposed DMN unleashes this integrated potential by coalescing both networks concurrently. The results confirm that DMN outperforms state-of-the-art methods by learning highly discriminative features, achieving 99.82% and 100% accuracy on the hockey and movies datasets, respectively. Moreover, DMN learns faster, i.e. 1.33 and 2.28 times faster than AlexNet and GoogLeNet, respectively, which makes it an effective learning architecture for images and videos.
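The abstract does not specify the fusion mechanism; one common way to coalesce two pre-trained backbones is to concatenate their feature vectors and train a classifier on top. A minimal sketch of that assumption, with random vectors standing in for AlexNet/GoogLeNet activations (the names and dimensions here are illustrative, not the paper's):

```python
import numpy as np

def fuse_features(feat_a, feat_b):
    """Concatenation-based fusion of two backbone feature vectors,
    e.g. AlexNet's fc7 output and GoogLeNet's pooled output."""
    return np.concatenate([feat_a, feat_b])

def softmax(z):
    z = z - z.max()  # numerical stability
    e = np.exp(z)
    return e / e.sum()

def classify(feat_a, feat_b, weights, bias):
    """Linear classifier on the fused representation; in a trained
    architecture this layer would be learned end to end."""
    fused = fuse_features(feat_a, feat_b)
    return softmax(weights @ fused + bias)
```

Concatenation keeps the distinct pre-learnt features of both networks intact and lets the classifier weight whichever representation discriminates better per class.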


2007 ◽  
Vol 25 (11) ◽  
pp. 1737-1747 ◽  
Author(s):  
M. Guironnet ◽  
D. Pellerin ◽  
M. Rombaut

2012 ◽  
Vol 562-564 ◽  
pp. 1887-1890
Author(s):  
Shao Shuai Lei ◽  
Chang Qing Cao ◽  
G. Xie

This paper proposes a robust, hierarchical camera motion classification approach based on an integrated histogram. First, a motion direction histogram is built, and its information entropy is used to classify camera motion into scaling and non-scaling operations. Then, static and translation operations are distinguished by the distribution information of the histogram, and the direction of the translation operation is also identified. Experimental results show that the new approach achieves efficient global motion estimation.
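The hierarchical rule described above (entropy separates scaling from non-scaling; the histogram's distribution then separates static from translation and gives the translation direction) can be sketched as follows. The thresholds and bin count are illustrative assumptions, not values from the paper:

```python
import numpy as np

def direction_histogram(motion_vectors, n_bins=8):
    """Normalised histogram of motion-vector directions.
    motion_vectors: array of shape (N, 2) with (dx, dy) per block."""
    angles = np.arctan2(motion_vectors[:, 1], motion_vectors[:, 0])
    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)

def entropy(hist):
    """Shannon entropy of the direction histogram: near-uniform
    directions (high entropy) suggest a scaling/zoom, whose motion
    vectors point radially in all directions."""
    p = hist[hist > 0]
    return float(-(p * np.log2(p)).sum())

def classify_motion(motion_vectors, entropy_thresh=2.0, mag_thresh=0.5):
    """Hierarchical rule: entropy separates scaling from non-scaling;
    mean magnitude then separates static from translation, whose
    direction is read off the histogram peak."""
    hist = direction_histogram(motion_vectors)
    if entropy(hist) > entropy_thresh:
        return "scaling"
    if np.linalg.norm(motion_vectors, axis=1).mean() < mag_thresh:
        return "static"
    peak = int(np.argmax(hist))
    return f"translation (bin {peak})"
```

A pure translation concentrates all directions in one bin (entropy near zero), while a zoom spreads them radially, which is why a single entropy threshold can split the two cases first.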

