Mask Sparse Representation Based on Semantic Features for Thermal Infrared Target Tracking

Thermal infrared (TIR) target tracking is challenging because it requires learning an effective model that identifies the target under poor visibility and against a cluttered background. Sparse representation, a typical appearance modeling approach, has been successfully applied to TIR target tracking. However, the discriminative information of the target and its surrounding background is usually neglected in the sparse coding process. To address this issue, we propose a mask sparse representation (MaskSR) model, which combines sparse coding with high-level semantic features for TIR target tracking. We first obtain pixel-wise labeling results for the target and its surrounding background in the last frame, and then use these results to train target-specific deep networks in a supervised manner. From the output features of the deep networks, we obtain a high-level pixel-wise discriminative map of the target area. We introduce the binarized discriminative map as a mask template into the sparse representation and develop a novel algorithm that collaboratively represents the reliable and unreliable target parts partitioned by the mask template, whose labels 1 and 0 explicitly indicate their different discriminative capabilities. The proposed MaskSR model controls the dominance of the reliable target part in the reconstruction process via a weighting scheme. We solve this multi-parameter constrained problem with a customized alternating direction method of multipliers (ADMM). The model is applied to TIR target tracking within the particle filter framework. To improve sampling effectiveness and reduce computational cost at the same time, we propose a discriminative particle selection strategy based on the kernelized correlation filter, which replaces the previous random sampling when searching for useful candidates. The proposed tracking method was tested on the VOT-TIR2016 benchmark. The experimental results show that the proposed method is significantly superior to various state-of-the-art methods in TIR target tracking.
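
The idea of weighting reliable and unreliable pixels differently during sparse reconstruction can be illustrated with a small sketch. This is not the MaskSR model itself: it is a plain weighted lasso solved with standard ADMM, and the function name `masked_sparse_code`, the weights `w_reliable` and `w_unreliable`, and the convention that mask label 1 marks the reliable part are all assumptions made for illustration.

```python
import numpy as np

def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def masked_sparse_code(D, y, mask, w_reliable=1.0, w_unreliable=0.2,
                       lam=0.05, rho=1.0, n_iter=200):
    """Solve min_x 0.5 * sum_i w_i (y_i - (D x)_i)^2 + lam * ||x||_1 with ADMM.

    D    : (n_pixels, n_atoms) dictionary of target templates
    y    : (n_pixels,) candidate patch, vectorized
    mask : (n_pixels,) binary template; 1 is ASSUMED to mark reliable pixels
    """
    # Per-pixel weights: reliable pixels dominate the reconstruction error.
    w = np.where(np.asarray(mask).astype(bool), w_reliable, w_unreliable)
    DtW = D.T * w                              # D^T W without forming diag(w)
    A = DtW @ D + rho * np.eye(D.shape[1])     # fixed system matrix of the x-update
    b = DtW @ y
    z = np.zeros(D.shape[1])
    u = np.zeros_like(z)
    for _ in range(n_iter):
        x = np.linalg.solve(A, b + rho * (z - u))  # quadratic sub-problem
        z = soft_threshold(x + u, lam / rho)       # sparsity sub-problem
        u = u + x - z                              # scaled dual update
    return z
```

In a tracker, a function like this would be evaluated once per candidate particle, and the weighted reconstruction error would serve as the observation likelihood. That use is likewise an assumption about how such a model is typically plugged into a particle filter.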


Pansharpening of WorldView-2 Data via Graph Regularized Sparse Coding and Adaptive Coupled Dictionary

Sensors ◽

10.3390/s21113586 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3586

Author(s):

Wenqing Wang ◽

Han Liu ◽

Guo Xie

Keyword(s):

High Resolution ◽

Sparse Representation ◽

Sparse Coding ◽

Spectral Response ◽

Objective Evaluation ◽

Coefficient Matrix ◽

Image Patch ◽

Subjective Analysis ◽

High Resolution Image ◽

Sparse Coefficient

The spectral mismatch between a multispectral (MS) image and its corresponding panchromatic (PAN) image degrades pansharpening quality, especially for WorldView-2 data. To handle this problem, this paper proposes a pansharpening method based on graph regularized sparse coding (GRSC) and an adaptive coupled dictionary. First, the pansharpening process is divided into three tasks according to the degree of correlation between the MS and PAN channels and the relative spectral response of the WorldView-2 sensor. Then, for each task, the image patch set from the MS channels is clustered into several subsets, and the sparse representation of each subset is estimated with the GRSC algorithm. In addition, an adaptive coupled dictionary pair is constructed for each task to represent the subsets effectively. Finally, the high-resolution image subsets for each task are obtained by multiplying the corresponding dictionary by the estimated sparse coefficient matrix. Experiments on WorldView-2 data demonstrate that the proposed method outperforms existing pansharpening algorithms in both subjective analysis and objective evaluation.
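
The final step, recovering high-resolution patches as a dictionary multiplied by sparse coefficients, can be sketched as follows. This is a minimal illustration under stated assumptions: it uses plain ISTA for the sparse coding step instead of GRSC (no graph regularizer and no clustering), and the names `ista`, `reconstruct_high_res`, `D_low` and `D_high` are hypothetical.

```python
import numpy as np

def ista(D, Y, lam=1e-3, n_iter=500):
    """Sparse-code the columns of Y over dictionary D with plain ISTA.

    Minimizes 0.5 * ||Y - D A||_F^2 + lam * ||A||_1 over A.
    """
    L = np.linalg.norm(D, 2) ** 2            # Lipschitz constant of the gradient
    A = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        V = A - D.T @ (D @ A - Y) / L        # gradient step on the data term
        A = np.sign(V) * np.maximum(np.abs(V) - lam / L, 0.0)  # soft-threshold
    return A

def reconstruct_high_res(D_low, D_high, Y_low, lam=1e-3):
    """Coupled-dictionary reconstruction.

    The coefficients are estimated on the low-resolution dictionary and then
    applied to the paired high-resolution dictionary (same atom ordering).
    """
    A = ista(D_low, Y_low, lam)              # (n_atoms, n_patches)
    return D_high @ A                        # (n_high_pixels, n_patches)
```

The sketch relies on the coupling assumption that a low-resolution patch and its high-resolution counterpart share the same coefficients over the paired dictionaries.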


An Efficient Module for Instance Segmentation Based on Multi-Level Features and Attention Mechanisms

Applied Sciences ◽

10.3390/app11030968 ◽

2021 ◽

Vol 11 (3) ◽

pp. 968

Author(s):

Yingchun Sun ◽

Wang Gao ◽

Shuguo Pan ◽

Tao Zhao ◽

Yahui Peng

Keyword(s):

Feature Extraction ◽

Spatial Structure ◽

Semantic Feature ◽

Semantic Features ◽

Segmentation Method ◽

Spatial Dimensions ◽

Feature Pyramid ◽

Multi Level ◽

High Level ◽

Instance Segmentation

Recently, multi-level feature networks have been used extensively in instance segmentation. However, because not all features benefit instance segmentation, indiscriminately combining multi-level convolutional features does not adequately improve network performance. To solve this problem, an attention-based feature pyramid module (AFPM) is proposed, which integrates attention mechanisms into a multi-level feature pyramid network to extract high-level semantic features and low-level spatial structure features efficiently and selectively for instance segmentation. First, we introduce a convolutional block attention module (CBAM) into feature extraction and sequentially generate attention maps that focus on instance-related features along the channel and spatial dimensions. Second, we build inter-dimensional dependencies through a convolutional triplet attention module (CTAM) in the lateral attention connections, which propagates helpful semantic feature maps and filters out redundant features irrelevant to instance objects. Finally, we construct feature enhancement branches that strengthen detailed information and boost the entire feature hierarchy of the network. Experimental results on the Cityscapes dataset show that the proposed module outperforms other strong methods under different evaluation metrics and effectively improves the performance of the instance segmentation method.
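
The sequential channel-then-spatial attention that CBAM applies can be sketched in NumPy as below. This is only a forward-pass illustration, not the AFPM module: the weights `W1`, `W2` and the kernel `K` would be learned in practice and are supplied by the caller here, and all function names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(F, W1, W2):
    """Per-channel weights for a feature map F of shape (C, H, W).

    W1: (C // r, C) and W2: (C, C // r) form a shared two-layer MLP that is
    applied to both the average-pooled and the max-pooled descriptors.
    """
    avg = F.mean(axis=(1, 2))
    mx = F.max(axis=(1, 2))
    mlp = lambda v: W2 @ np.maximum(W1 @ v, 0.0)   # ReLU hidden layer
    return sigmoid(mlp(avg) + mlp(mx))             # shape (C,)

def spatial_attention(F, K):
    """Per-pixel weights from channel-wise average and max maps.

    K: (2, k, k) kernel with odd k, applied with zero padding ("same" size).
    """
    S = np.stack([F.mean(axis=0), F.max(axis=0)])  # (2, H, W)
    k = K.shape[-1]
    p = k // 2
    Sp = np.pad(S, ((0, 0), (p, p), (p, p)))
    H, W = F.shape[1:]
    out = np.zeros((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(Sp[:, i:i + k, j:j + k] * K)
    return sigmoid(out)                            # shape (H, W)

def cbam(F, W1, W2, K):
    """Apply channel attention first, then spatial attention."""
    F1 = F * channel_attention(F, W1, W2)[:, None, None]
    return F1 * spatial_attention(F1, K)[None, :, :]
```

Because both attention maps lie in (0, 1), the output never exceeds the input in magnitude; the module can only re-weight features, not amplify them.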


Patch-based sparse reconstruction for electrical impedance tomography

Sensor Review ◽

10.1108/sr-07-2016-0126 ◽

2017 ◽

Vol 37 (3) ◽

pp. 257-269 ◽

Cited By ~ 4

Author(s):

Qi Wang ◽

Pengcheng Zhang ◽

Jianming Wang ◽

Qingliang Chen ◽

Zhijie Lian ◽

...

Keyword(s):

Image Reconstruction ◽

Sparse Representation ◽

Electrical Impedance Tomography ◽

Electrical Impedance ◽

Inverse Operator ◽

Reconstruction Algorithm ◽

Impedance Tomography ◽

Content Type ◽

Ill Posed ◽

High Level

Purpose: Electrical impedance tomography (EIT) is a technique for reconstructing the conductivity distribution by injecting currents at the boundary of a subject and measuring the resulting changes in voltage. Image reconstruction for EIT is a nonlinear problem. A generalized inverse operator is usually ill-posed and ill-conditioned. Therefore, the solutions for EIT are not unique and are highly sensitive to measurement noise.

Design/methodology/approach: This paper develops a novel image reconstruction algorithm for EIT based on patch-based sparse representation. The sparsifying dictionary optimization and image reconstruction are performed alternately. Two kinds of patch-based sparsity, namely square-patch sparsity and column-patch sparsity, are discussed and compared with global sparsity.

Findings: Both simulation and experimental results indicate that the patch-based sparsity method can improve the quality of image reconstruction and tolerate a relatively high level of noise in the measured voltages.

Originality/value: The EIT image is reconstructed using patch-based sparse representation. Square-patch sparsity and column-patch sparsity are proposed and compared. Sparse dictionary optimization and image reconstruction are performed alternately. The new method tolerates a relatively high level of noise in the measured voltages.
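
Patch-based methods of this kind usually start by splitting an image into overlapping patches and end by averaging the processed patches back into an image. The sketch below shows only that generic bookkeeping for square patches; it is not the paper's reconstruction algorithm, it does not cover column patches, and the names `extract_patches` and `assemble` are hypothetical.

```python
import numpy as np

def extract_patches(img, p, stride):
    """Collect overlapping p x p patches as columns of a (p*p, n) matrix."""
    H, W = img.shape
    patches, coords = [], []
    for i in range(0, H - p + 1, stride):
        for j in range(0, W - p + 1, stride):
            patches.append(img[i:i + p, j:j + p].ravel())
            coords.append((i, j))
    return np.array(patches).T, coords

def assemble(patches, coords, shape, p):
    """Average (possibly modified) patches back into an image of `shape`."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    for col, (i, j) in zip(patches.T, coords):
        acc[i:i + p, j:j + p] += col.reshape(p, p)
        cnt[i:i + p, j:j + p] += 1
    return acc / np.maximum(cnt, 1)   # pixels covered by no patch stay 0
```

A sparse coding step over a dictionary would sit between the two calls: each column of the patch matrix is approximated by a few dictionary atoms before the image is reassembled.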


Bimodal fusion of low-level visual features and high-level semantic features for near-duplicate video clip detection

Signal Processing Image Communication ◽

10.1016/j.image.2011.04.001 ◽

2011 ◽

Vol 26 (10) ◽

pp. 612-627 ◽

Cited By ~ 2

Author(s):

Hyun-seok Min ◽

Jae Young Choi ◽

Wesley De Neve ◽

Yong Man Ro

Keyword(s):

Video Clip ◽

Visual Features ◽

Semantic Features ◽

Low Level ◽

High Level ◽

Duplicate Video


Research on software credibility algorithm based on deep convolutional sparse coding

MATEC Web of Conferences ◽

10.1051/matecconf/202133608013 ◽

2021 ◽

Vol 336 ◽

pp. 08013

Author(s):

Zhaosheng Xu

Keyword(s):

Neural Network ◽

Sparse Representation ◽

Sparse Coding ◽

Classification System ◽

Convolution Neural Network ◽

Deep Convolution Neural Network

Based on the author's research, this paper studies a software credibility algorithm based on deep convolutional sparse coding. It first summarizes convolutional sparse coding and the trust classification system, and then constructs the algorithm from two aspects: factor processing based on a deep convolutional neural network, and trust classification based on sparse representation.


A feature fusion deep-projection convolution neural network for vehicle detection in aerial images

PLoS ONE ◽

10.1371/journal.pone.0250782 ◽

2021 ◽

Vol 16 (5) ◽

pp. e0250782

Author(s):

Bin Wang ◽

Bin Xu

Keyword(s):

Neural Network ◽

Feature Fusion ◽

Rapid Development ◽

Vehicle Detection ◽

Convolution Neural Network ◽

Aerial Images ◽

Semantic Features ◽

General Object ◽

High Level ◽

The Impact

With the rapid development of Unmanned Aerial Vehicles, vehicle detection in aerial images plays an important role in many applications. Compared with general object detection, vehicle detection in aerial images remains a challenging research topic because it is affected by several distinctive factors, e.g. varying camera angles, small vehicle size and complex backgrounds. In this paper, a Feature Fusion Deep-Projection Convolution Neural Network is proposed to improve the detection of small vehicles in aerial images. The backbone of the proposed framework uses a novel residual block, named the stepwise res-block, to extract high-level semantic features while preserving low-level detail features. A specially designed feature fusion module further balances the features obtained from different levels of the backbone. A deep-projection deconvolution module minimizes the impact of the information contamination introduced by the down-sampling and up-sampling processes. The proposed framework has been evaluated on the UCAS-AOD, VEDAI, and DOTA datasets. According to the evaluation results, it outperforms other state-of-the-art vehicle detection algorithms for aerial images.


Joint Transmit Resource Management and Waveform Selection Strategy for Target Tracking in Distributed Phased Array Radar Network

IEEE Transactions on Aerospace and Electronic Systems ◽

10.1109/taes.2021.3138869 ◽

2021 ◽

pp. 1-1

Author(s):

Chenguang Shi ◽

Yijie Wang ◽

Sana Salous ◽

Jianjiang Zhou ◽

Junkun Yan

Keyword(s):

Resource Management ◽

Target Tracking ◽

Phased Array ◽

Selection Strategy ◽

Radar Network ◽

Phased Array Radar ◽

Waveform Selection


Interpretable Aspect-Aware Capsule Network for Peer Review Based Citation Count Prediction

ACM Transactions on Information Systems ◽

10.1145/3466640 ◽

2022 ◽

Vol 40 (1) ◽

pp. 1-29

Author(s):

Siqing Li ◽

Yaliang Li ◽

Wayne Xin Zhao ◽

Bolin Ding ◽

Ji-Rong Wen

Keyword(s):

Peer Review ◽

Prediction Models ◽

Citation Count ◽

Specific Aspect ◽

Semantic Features ◽

Predictive Capacity ◽

Topic Distribution ◽

Real World Datasets ◽

Data Signal ◽

High Level

Citation count prediction is an important task for estimating the future impact of research papers. Most existing works use only information extracted from the paper itself. In this article, we focus on how to use another kind of useful data signal (i.e., peer review text) to improve both the performance and the interpretability of prediction models. Specifically, we propose a novel aspect-aware capsule network for citation count prediction based on review text. It contains two major capsule layers, namely the feature capsule layer and the aspect capsule layer, each with a different routing approach. Feature capsules encode the local semantics of review sentences as the input to the aspect capsule layer, whereas aspect capsules aim to capture high-level semantic features that serve as the final representations for prediction. Besides improving predictive capacity, we also enhance model interpretability with two strategies. First, we use the topic distribution of the review text to guide the learning of aspect capsules so that each aspect capsule represents a specific aspect of the review. Then, we use the learned aspect capsules to generate readable text that explains the predicted citation count. Extensive experiments on two real-world datasets demonstrate the effectiveness of the proposed model in both performance and interpretability.
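
The abstract does not say which routing approaches the two capsule layers use. As a point of reference only, the sketch below implements the widely used routing-by-agreement scheme between a lower and a higher capsule layer; treating it as a stand-in for the paper's routing is an assumption, and the names `dynamic_routing` and `u_hat` are hypothetical.

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def squash(s, eps=1e-9):
    """Shrink each vector to a length in [0, 1) while keeping its direction."""
    n2 = np.sum(s ** 2, axis=-1, keepdims=True)
    return (n2 / (1.0 + n2)) * s / np.sqrt(n2 + eps)

def dynamic_routing(u_hat, n_iter=3):
    """Routing by agreement.

    u_hat : (n_in, n_out, d) predictions made by each lower-level capsule
            for each higher-level capsule.
    Returns the higher-level capsule vectors v (n_out, d) and the coupling
    coefficients c (n_in, n_out) that produced them.
    """
    b = np.zeros(u_hat.shape[:2])                   # routing logits
    for _ in range(n_iter):
        c = softmax(b, axis=1)                      # each input splits its vote
        s = np.sum(c[:, :, None] * u_hat, axis=0)   # weighted sum per output
        v = squash(s)
        b = b + np.einsum('iod,od->io', u_hat, v)   # reward agreeing inputs
    return v, c
```

The coupling coefficients `c` show which lower-level capsules contribute to each higher-level capsule, which is one reason capsule models are often described as more interpretable than plain pooling.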
