GlobalTrack: A Simple and Strong Baseline for Long-Term Tracking

Lianghua Huang; Xin Zhao; Kaiqi Huang

doi:10.1609/aaai.v34i07.6758

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6758 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11037-11044

Author(s):

Lianghua Huang ◽

Xin Zhao ◽

Kaiqi Huang

Keyword(s):

Online Learning ◽

Success Rate ◽

Large Scale ◽

State Of The Art ◽

Temporal Consistency ◽

Post Processing ◽

Two Stage ◽

Multi Scale ◽

Instance Search

A key capability of a long-term tracker is to search for targets in very large areas (typically the entire image) to handle possible target absences or tracking failures. However, currently there is a lack of such a strong baseline for global instance search. In this work, we aim to bridge this gap. Specifically, we propose GlobalTrack, a pure global instance search based tracker that makes no assumption on the temporal consistency of the target's positions and scales. GlobalTrack is developed based on two-stage object detectors, and it is able to perform full-image and multi-scale search of arbitrary instances with only a single query as the guide. We further propose a cross-query loss to improve the robustness of our approach against distractors. With no online learning, no punishment on position or scale changes, no scale smoothing and no trajectory refinement, our pure global instance search based tracker achieves comparable, sometimes much better performance on four large-scale tracking benchmarks (i.e., 52.1% AUC on LaSOT, 63.8% success rate on TLP, 60.3% MaxGM on OxUvA and 75.4% normalized precision on TrackingNet), compared to state-of-the-art approaches that typically require complex post-processing. More importantly, our tracker runs without cumulative errors, i.e., any type of temporary tracking failures will not affect its performance on future frames, making it ideal for long-term tracking. We hope this work will be a strong baseline for long-term tracking and will stimulate future works in this area.
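
The claim that the tracker "runs without cumulative errors" can be illustrated with a minimal sketch. It assumes a hypothetical upstream detector has already produced scored candidate boxes for every frame; `track_globally`, the box layout and the `None` result for empty frames are illustrative assumptions, not the authors' implementation. Each frame's output depends only on that frame's candidates, so a wrong pick in one frame cannot propagate to later frames.

```python
from typing import List, Optional, Tuple

# Assumed layout: (x, y, width, height). The abstract does not specify one.
Box = Tuple[float, float, float, float]
# A candidate pairs a box with its similarity score to the single query.
Candidate = Tuple[Box, float]


def track_globally(per_frame_candidates: List[List[Candidate]]) -> List[Optional[Box]]:
    """Pick the top-scoring candidate in each frame independently.

    No state is carried between frames: there is no motion model, no
    position or scale penalty, and no smoothing of the trajectory.
    """
    results: List[Optional[Box]] = []
    for candidates in per_frame_candidates:
        if not candidates:
            # Assumption: report None when the detector yields nothing.
            results.append(None)
            continue
        best_box, _ = max(candidates, key=lambda c: c[1])
        results.append(best_box)
    return results
```

Because the loop body never reads `results`, reordering or dropping frames would not change the prediction for any remaining frame.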

Documentary data and the study of past droughts: a global state of the art

Climate of the Past ◽

10.5194/cp-14-1915-2018 ◽

2018 ◽

Vol 14 (12) ◽

pp. 1915-1960 ◽

Cited By ~ 34

Author(s):

Rudolf Brázdil ◽

Andrea Kiss ◽

Jürg Luterbacher ◽

David J. Nash ◽

Ladislava Řezníčková

Keyword(s):

Large Scale ◽

State Of The Art ◽

Drought Indices ◽

Documentary Evidence ◽

Climatic Trends ◽

Instrumental Observations ◽

Spatio Temporal ◽

Epigraphic Evidence ◽

Administrative Evidence

Abstract. The use of documentary evidence to investigate past climatic trends and events has become a recognised approach in recent decades. This contribution presents the state of the art in its application to droughts. The range of documentary evidence is very wide, including general annals, chronicles, memoirs and diaries kept by missionaries, travellers and those specifically interested in the weather; records kept by administrators tasked with keeping accounts and other financial and economic records; legal-administrative evidence; religious sources; letters; songs; newspapers and journals; pictographic evidence; chronograms; epigraphic evidence; early instrumental observations; society commentaries; and compilations and books. These are available from many parts of the world. This variety of documentary information is evaluated with respect to the reconstruction of hydroclimatic conditions (precipitation, drought frequency and drought indices). Documentary-based drought reconstructions are then addressed in terms of long-term spatio-temporal fluctuations, major drought events, relationships with external forcing and large-scale climate drivers, socio-economic impacts and human responses. Documentary-based drought series are also considered from the viewpoint of spatio-temporal variability for certain continents, and their employment together with hydroclimate reconstructions from other proxies (in particular tree rings) is discussed. Finally, conclusions are drawn, and challenges for the future use of documentary evidence in the study of droughts are presented.

Hybrid Graph Neural Networks for Crowd Counting

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6839 ◽

2020 ◽

Vol 34 (07) ◽

pp. 11693-11700 ◽

Cited By ~ 2

Author(s):

Ao Luo ◽

Fan Yang ◽

Xin Li ◽

Dong Nie ◽

Zhicheng Jiao ◽

...

Keyword(s):

Network Architecture ◽

Message Passing ◽

Large Scale ◽

State Of The Art ◽

Density Variation ◽

Feature Maps ◽

Crowd Counting ◽

Multi Scale ◽

Crowd Density ◽

Graph Neural Networks

Crowd counting is an important yet challenging task due to large scale and density variation. Recent investigations have shown that distilling rich relations among multi-scale features and exploiting useful information from the auxiliary task, i.e., localization, are vital for this task. Nevertheless, how to comprehensively leverage these relations within a unified network architecture is still a challenging problem. In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn), which aims to relieve the problem by interweaving the multi-scale features for crowd density as well as its auxiliary task (localization) and performing joint reasoning over a graph. Specifically, HyGnn integrates a hybrid graph to jointly represent the task-specific feature maps of different scales as nodes, and two types of relations as edges: (i) multi-scale relations capturing the feature dependencies across scales and (ii) mutually beneficial relations building bridges for the cooperation between counting and localization. Thus, through message passing, HyGnn can capture and distill richer relations between nodes to obtain more powerful representations, providing robust and accurate results. HyGnn performs well on four challenging datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF_QNRF, outperforming the state-of-the-art algorithms by a large margin.
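
As a rough illustration of message passing over such a graph, the sketch below runs one update step on a toy graph whose nodes stand for task-specific feature maps. Everything here is an assumption for illustration: the node names, the fixed neighbour averaging and the 0.5 mixing weight stand in for the learned message and update functions, which the abstract does not describe.

```python
from typing import Dict, List


def message_passing_step(
    features: Dict[str, List[float]],
    edges: Dict[str, List[str]],
) -> Dict[str, List[float]]:
    """Run one synchronous message-passing step.

    Each node receives the mean of its neighbours' features (the message)
    and mixes it equally with its own feature (the update). Nodes without
    neighbours keep their feature unchanged.
    """
    updated: Dict[str, List[float]] = {}
    for node, feat in features.items():
        neighbours = edges.get(node, [])
        if not neighbours:
            updated[node] = list(feat)
            continue
        message = [
            sum(features[n][i] for n in neighbours) / len(neighbours)
            for i in range(len(feat))
        ]
        updated[node] = [0.5 * (f + m) for f, m in zip(feat, message)]
    return updated
```

In this toy setting, an edge between two counting nodes of different scales plays the role of a multi-scale relation, and an edge between a counting node and a localization node plays the role of a cross-task relation.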

Bistability of somatic pattern memories: stochastic outcomes in bioelectric circuits underlying regeneration

Philosophical Transactions of the Royal Society B Biological Sciences ◽

10.1098/rstb.2019.0765 ◽

2021 ◽

Vol 376 (1821) ◽

pp. 20190765 ◽

Cited By ~ 2

Author(s):

Giovanni Pezzulo ◽

Joshua LaPalme ◽

Fallon Durant ◽

Michael Levin

Keyword(s):

Large Scale ◽

Computational Models ◽

State Of The Art ◽

Memory Representation ◽

Theme Issue ◽

Evolutionary Innovation ◽

New Interpretation ◽

The Brain

Nervous systems’ computational abilities are an evolutionary innovation, specializing and speed-optimizing ancient biophysical dynamics. Bioelectric signalling originated in cells' communication with the outside world and with each other, enabling cooperation towards adaptive construction and repair of multicellular bodies. Here, we review the emerging field of developmental bioelectricity, which links the field of basal cognition to state-of-the-art questions in regenerative medicine, synthetic bioengineering and even artificial intelligence. One of the predictions of this view is that regeneration and regulative development can restore correct large-scale anatomies from diverse starting states because, like the brain, they exploit bioelectric encoding of distributed goal states, in this case pattern memories. We propose a new interpretation of recent stochastic regenerative phenotypes in planaria, by appealing to computational models of memory representation and processing in the brain. Moreover, we discuss novel findings showing that bioelectric changes induced in planaria can be stored in tissue for over a week, thus revealing that somatic bioelectric circuits in vivo can implement a long-term, re-writable memory medium. A consideration of the mechanisms, evolution and functionality of basal cognition makes novel predictions and provides an integrative perspective on the evolution, physiology and biomedicine of information processing in vivo. This article is part of the theme issue ‘Basal cognition: multicellularity, neurons and the cognitive lens’.

SPSTracker: Sub-Peak Suppression of Response Map for Robust Object Tracking

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6733 ◽

2020 ◽

Vol 34 (07) ◽

pp. 10989-10996

Author(s):

Qintao Hu ◽

Lijun Zhou ◽

Xiaoxiao Wang ◽

Yao Mao ◽

Jianlin Zhang ◽

...

Keyword(s):

Online Learning ◽

Object Tracking ◽

Background Noise ◽

State Of The Art ◽

Response Suppression ◽

Learning Approach ◽

Response Distribution ◽

Peak Response ◽

Multi Scale ◽

Tracking Response

Modern visual trackers usually construct online learning models under the assumption that the feature response has a Gaussian distribution with a target-centered peak response. Nevertheless, such an assumption is implausible when there is progressive interference from other targets and/or background noise, which produces sub-peaks on the tracking response map and causes model drift. In this paper, we propose a rectified online learning approach for sub-peak response suppression and peak response enforcement, aiming to handle progressive interference in a systematic way. Our approach, referred to as SPSTracker, applies simple-yet-efficient Peak Response Pooling (PRP) to aggregate and align discriminative features, and leverages Boundary Response Truncation (BRT) to reduce the variance of the feature response. By fusing with multi-scale features, SPSTracker aggregates the response distribution of multiple sub-peaks to a single maximum peak, which enforces the discriminative capability of features for robust object tracking. Experiments on the OTB, NFS and VOT2018 benchmarks demonstrate that SPSTracker outperforms the state-of-the-art real-time trackers by significant margins.

Temporal Pyramid Recurrent Neural Network

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5947 ◽

2020 ◽

Vol 34 (04) ◽

pp. 5061-5068

Author(s):

Qianli Ma ◽

Zhenxi Lin ◽

Enhuan Chen ◽

Garrison Cottrell

Keyword(s):

Large Scale ◽

Speaker Identification ◽

Layer Structure ◽

Input Sequence ◽

Sequential Data ◽

Multi Scale ◽

Sequence Modeling ◽

Addition Problem ◽

Hidden States

Learning long-term and multi-scale dependencies in sequential data is a challenging task for recurrent neural networks (RNNs). In this paper, a novel RNN structure called temporal pyramid RNN (TP-RNN) is proposed to achieve these two goals. TP-RNN is a pyramid-like structure and generally has multiple layers. In each layer of the network, there are several sub-pyramids connected by a shortcut path to the output, which can efficiently aggregate historical information from hidden states and provide many gradient feedback short-paths. This avoids back-propagating through many hidden states as in usual RNNs. In particular, in the multi-layer structure of TP-RNN, the input sequence of the higher layer is a large-scale aggregated state sequence produced by the sub-pyramids in the previous layer, instead of the usual sequence of hidden states. In this way, TP-RNN can explicitly learn multi-scale dependencies with multi-scale input sequences of different layers, and shorten the input sequence and gradient feedback paths of each layer. This avoids the vanishing gradient problem in deep RNNs and allows the network to efficiently learn long-term dependencies. We evaluate TP-RNN on several sequence modeling tasks, including the masked addition problem, pixel-by-pixel image classification, signal recognition and speaker identification. Experimental results demonstrate that TP-RNN consistently outperforms existing RNNs for learning long-term and multi-scale dependencies in sequential data.
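
The idea of feeding a higher layer a shorter, aggregated state sequence can be sketched as follows. This is only an illustration under stated assumptions: `aggregate_states` is a hypothetical helper, and plain averaging over fixed windows stands in for the sub-pyramid aggregation, whose exact form the abstract does not give.

```python
from typing import List


def aggregate_states(states: List[List[float]], window: int) -> List[List[float]]:
    """Collapse each run of `window` hidden states into one averaged state.

    The output has ceil(len(states) / window) entries, so a layer that
    consumes it sees a shorter sequence, and gradients flowing back through
    that layer traverse fewer time steps.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    aggregated: List[List[float]] = []
    for start in range(0, len(states), window):
        block = states[start:start + window]
        dim = len(block[0])
        aggregated.append(
            [sum(state[i] for state in block) / len(block) for i in range(dim)]
        )
    return aggregated
```

Stacking such a step with window size `w` over `k` layers shrinks a length-`T` sequence to roughly `T / w**k`, which is one way to read the abstract's claim about shortened gradient feedback paths.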

Documentary data and the study of the past droughts: an overview of the state of the art worldwide

10.5194/cp-2018-118 ◽

2018 ◽

Cited By ~ 1

Author(s):

Rudolf Brázdil ◽

Andrea Kiss ◽

Jürg Luterbacher ◽

David J. Nash ◽

Ladislava Řezníčková

Keyword(s):

Large Scale ◽

State Of The Art ◽

The State ◽

Drought Indices ◽

Documentary Evidence ◽

Instrumental Observations ◽

Spatio Temporal ◽

Epigraphic Evidence ◽

Administrative Evidence

Abstract. The use of documentary evidence to investigate past climatic trends and events has become a recognised approach in recent decades. This contribution presents the state of the art in its application to droughts. The range of documentary evidence is very wide, including: general annals, chronicles, and memoirs, diaries kept by missionaries, travellers and those specifically interested in the weather, the records kept by administrators tasked with keeping accounts and other financial and economic records, legal-administrative evidence, religious sources, letters, marketplace and shopkeepers' songs, newspapers and journals, pictographic evidence, chronograms, epigraphic evidence, early instrumental observations, society commentaries, compilations and books, and historical-climatological databases. These come from many parts of the world. This variety of documentary information is evaluated with respect to the reconstruction of hydroclimatic conditions (precipitation, drought frequency and drought indices). Documentary-based drought reconstructions are then addressed in terms of long-term spatio-temporal fluctuations, major drought events, relationships with external forcing and large-scale climate drivers, socio-economic impacts and human responses. Documentary-based drought series are also considered from the viewpoint of spatio-temporal variability for certain continents, and their employment together with hydroclimate reconstructions from other proxies (in particular tree rings) is discussed. Finally, conclusions are drawn and challenges for the future use of documentary evidence in the study of droughts are presented.

Discriminative Deep Hashing for Scalable Face Image Retrieval

Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2017/315 ◽

2017 ◽

Cited By ~ 14

Author(s):

Jie Lin ◽

Zechao Li ◽

Jinhui Tang

Keyword(s):

Image Retrieval ◽

Large Scale ◽

State Of The Art ◽

Face Image ◽

Superior Performance ◽

Prediction Errors ◽

Unified Framework ◽

Multi Scale ◽

Deep Hashing ◽

Hash Codes

With the explosive growth of images containing faces, scalable face image retrieval has attracted increasing attention. Owing to its effectiveness, deep hashing has recently become a popular hashing method. In this work, we propose a new Discriminative Deep Hashing (DDH) network to learn discriminative and compact hash codes for large-scale face image retrieval. The proposed network incorporates end-to-end learning, a divide-and-encode module and the desired discrete code learning into a unified framework. Specifically, a network with a stack of convolution-pooling layers is proposed to extract multi-scale and robust features by merging the outputs of the third max pooling layer and the fourth convolutional layer. To reduce the redundancy among hash codes and the number of network parameters simultaneously, a divide-and-encode module is used to generate compact hash codes. Moreover, a loss function is introduced to minimize the prediction errors of the learned hash codes, which can lead to discriminative hash codes. Extensive experiments on two datasets demonstrate that the proposed method achieves superior performance compared with some state-of-the-art hashing methods.
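
A divide-and-encode step can be sketched in a few lines. The version below is an assumption-laden illustration, not the paper's module: `divide_and_encode` is a hypothetical name, and a thresholded linear projection per slice stands in for whatever learned mapping DDH actually uses. The point it shows is structural: each bit is computed from its own slice of the feature vector, so bits do not share inputs and each projection needs only `slice_len` weights rather than one weight per feature.

```python
from typing import List


def divide_and_encode(features: List[float], weights: List[List[float]]) -> List[int]:
    """Split `features` into len(weights) equal slices and emit one bit per slice.

    Bit k is 1 when the weighted sum of slice k is positive, else 0.
    """
    num_bits = len(weights)
    if num_bits == 0 or len(features) % num_bits != 0:
        raise ValueError("feature length must be a positive multiple of the bit count")
    slice_len = len(features) // num_bits
    bits: List[int] = []
    for k, slice_weights in enumerate(weights):
        if len(slice_weights) != slice_len:
            raise ValueError("each weight vector must match the slice length")
        chunk = features[k * slice_len:(k + 1) * slice_len]
        activation = sum(x * w for x, w in zip(chunk, slice_weights))
        bits.append(1 if activation > 0 else 0)
    return bits
```

With a feature vector of length `d` and `b` bits, this layout uses `d` weights in total, whereas a fully connected projection to `b` bits would use `d * b`.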

SafeNet: Scale-normalization and Anchor-based Feature Extraction Network for Person Re-identification

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/156 ◽

2018 ◽

Author(s):

Kun Yuan ◽

Qian Zhang ◽

Chang Huang ◽

Shiming Xiang ◽

Chunhong Pan

Keyword(s):

Large Scale ◽

State Of The Art ◽

Spatial Distributions ◽

Body Parts ◽

Retrieval Task ◽

Multi Scale ◽

Aspect Ratios ◽

Scale Normalization ◽

Full Body

Person Re-identification (ReID) is a challenging retrieval task that requires matching a person's image across non-overlapping camera views. How well this task is fulfilled is largely determined by the robustness of the features that are used to describe the person. In this paper, we show the advantage of jointly utilizing multi-scale abstract information to learn powerful features over full body and parts. A scale normalization module is proposed to balance different scales through residual-based integration. To exploit the information hidden in non-rigid body parts, we propose an anchor-based method to capture the local contents by stacking convolutions of kernels with various aspect ratios, which focus on different spatial distributions. Finally, a well-defined framework is constructed for simultaneously learning the representations of both full body and parts. Extensive experiments conducted on current challenging large-scale person ReID datasets, including Market1501, CUHK03 and DukeMTMC, demonstrate that our proposed method achieves state-of-the-art results.

GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features with Spatial and Channel Attention

Sensors ◽

10.3390/s21227504 ◽

2021 ◽

Vol 21 (22) ◽

pp. 7504

Author(s):

Udit Sharma ◽

Bruno Artacho ◽

Andreas Savakis

Keyword(s):

Feature Extraction ◽

State Of The Art ◽

Extraction Process ◽

Feature Representation ◽

Post Processing ◽

Multi Scale ◽

Spatial Pooling ◽

Current State ◽

Nutrition Monitoring ◽

Multiple Levels

We propose GourmetNet, a single-pass, end-to-end trainable network for food segmentation that achieves state-of-the-art performance. Food segmentation is an important problem as the first step for nutrition monitoring, food volume and calorie estimation. Our novel architecture incorporates both channel attention and spatial attention information in an expanded multi-scale feature representation using our advanced Waterfall Atrous Spatial Pooling module. GourmetNet refines the feature extraction process by merging features from multiple levels of the backbone through the two attention modules. The refined features are processed with the advanced multi-scale waterfall module that combines the benefits of cascade filtering and pyramid representations without requiring a separate decoder or post-processing. Our experiments on two food datasets show that GourmetNet significantly outperforms current state-of-the-art methods.

A two-stage heuristic approach for solving the long-term unit commitment problem with hydro-thermal coordination in large-scale electricity systems

2016 IEEE International Conference on Industrial Engineering and Engineering Management (IEEM) ◽

10.1109/ieem.2016.7797838 ◽

2016 ◽

Cited By ~ 2

Author(s):

A. Franz ◽

J. Zimmermann

Keyword(s):

Large Scale ◽

Unit Commitment ◽

Heuristic Approach ◽

Unit Commitment Problem ◽

Two Stage ◽

Commitment Problem ◽

Electricity Systems
