Cross-modality person re-identification with triple-attentive feature aggregation

This paper addresses the detection of small objects in remote sensing images, captured by satellites or other aerial vehicles, by applying image super-resolution to enhance spatial resolution ahead of a deep-learning-based detector. It motivates super-resolution for small objects and improves the current super-resolution (SR) framework by incorporating a cyclic generative adversarial network (GAN) and residual feature aggregation (RFA) to improve detection performance. The novelty of the method is threefold: first, the proposed framework is independent of the final object detector, i.e., YOLOv3 could be replaced with Faster R-CNN or any other object detector; second, a residual feature aggregation network is used in the generator, which significantly improved detection performance because the RFA network detected complex features; and third, the whole network is transformed into a cyclic GAN. The image super-resolution cyclic GAN with RFA and YOLO as the detection network is termed SRCGAN-RFA-YOLO, and its detection accuracy is compared with that of other methods. Rigorous experiments on both satellite and aerial images (ISPRS Potsdam, VAID, and Draper Satellite Image Chronology datasets) showed that detection performance increased when super-resolution methods were used for spatial resolution enhancement; at an IoU of 0.10, an AP of 0.7867 was achieved for a scale factor of 16.
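The reported AP is tied to an IoU threshold of 0.10. As a reminder of what such a threshold means, the following is a minimal Python sketch of intersection over union, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max). The function names are illustrative and are not taken from the paper, whose evaluation code is not shown here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this layout is an assumption.
    """
    # Corners of the overlapping rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_match(pred_box, gt_box, threshold=0.10):
    """Hypothetical helper: a prediction counts as a hit if IoU >= threshold."""
    return iou(pred_box, gt_box) >= threshold
```

A low threshold such as 0.10 accepts loosely localized predictions, which is a common choice when the objects cover only a few pixels and small offsets would otherwise dominate the score.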

Large-Scale Video Retrieval via Deep Local Convolutional Features

Advances in Multimedia, 2020, Vol 2020, pp. 1-8. DOI: 10.1155/2020/7862894

Author(s): Chen Zhang, Bin Hu, Yucong Suo, Zhiqiang Zou, Yimu Ji

Keyword(s): Large Scale, Video Retrieval, Video Data, Query Image, Key Frame Extraction, Key Frame, Storage Cost, Extraction Algorithm, Feature Aggregation, And Storage

In this paper, we study the challenge of image-to-video retrieval, which uses a query image to search for relevant frames in a large collection of videos. A novel framework based on convolutional neural networks (CNNs) is proposed to perform large-scale video retrieval with low storage cost and high search efficiency. Our framework consists of a key-frame extraction algorithm and a feature aggregation strategy. Specifically, the key-frame extraction algorithm uses clustering to remove redundant information from the video data, which greatly reduces storage cost. The feature aggregation strategy adopts average pooling to encode deep local convolutional features, followed by coarse-to-fine retrieval, which allows rapid retrieval in a large-scale video database. The results from extensive experiments on two publicly available datasets demonstrate that the proposed method achieves superior efficiency as well as accuracy over other state-of-the-art visual search methods.
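The aggregation and retrieval steps described above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the authors' implementation: it assumes local descriptors are plain lists of floats, uses cosine similarity as the ranking score, and interprets "coarse-to-fine" as first shortlisting videos by a video-level descriptor and then ranking key frames inside the shortlist. All function names and the `top_videos` parameter are hypothetical.

```python
import math


def average_pool(descriptors):
    """Average a list of equal-length descriptors into a single vector."""
    n = len(descriptors)
    dim = len(descriptors[0])
    return [sum(d[i] for d in descriptors) / n for i in range(dim)]


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


def coarse_to_fine_search(query, videos, top_videos=2):
    """Return (score, video_id, frame_index) of the best-matching key frame.

    `videos` maps a video id to a list of key-frame descriptors.
    Coarse stage: rank videos by the average of their key-frame descriptors.
    Fine stage: compare the query with each key frame of the shortlisted videos.
    """
    shortlist = sorted(
        videos,
        key=lambda vid: cosine(query, average_pool(videos[vid])),
        reverse=True,
    )[:top_videos]
    candidates = [
        (cosine(query, frame), vid, idx)
        for vid in shortlist
        for idx, frame in enumerate(videos[vid])
    ]
    return max(candidates)


if __name__ == "__main__":
    # Toy database: three videos with 2-D key-frame descriptors.
    database = {
        "a": [[1.0, 0.0], [0.9, 0.1]],
        "b": [[0.0, 1.0], [0.1, 0.9]],
        "c": [[-1.0, 0.0]],
    }
    print(coarse_to_fine_search([0.0, 1.0], database))
```

The coarse stage touches one vector per video, so only the key frames of the few shortlisted videos need a full comparison, which is where the efficiency gain in a large database would come from under these assumptions.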

Cross-modality person re-identification with triple-attentive feature aggregation

Feature Aggregation Networks for Image Steganalysis

Multi-FAN: multi-spectral mosaic super-resolution via multi-scale feature aggregation network

Facial expression recognition from videos using CNN and feature aggregation

Real-Time Semantic Segmentation via Auto Depth, Downsampling Joint Decision and Feature Aggregation

Visual Localization Based on Remote Sensing Scene Matching with Siamese Feature Aggregation Network

Nuclei Segmentation in Histopathology Images Using Rotation Equivariant and Multi-level Feature Aggregation Neural Network

FANet: Feature Aggregation Network for Semantic Segmentation

Towards Accurate Estimation for Visual Object Tracking with Multi-hierarchy Feature Aggregation

Small Object Detection in Remote Sensing Images with Residual Feature Aggregation-Based Super-Resolution and Object Detector Network

Large-Scale Video Retrieval via Deep Local Convolutional Features