Consistency Constraint
Recently Published Documents

TOTAL DOCUMENTS: 68 (five years: 21)
H-INDEX: 11 (five years: 2)

2022 ◽  
Vol 183 ◽  
pp. 164-177
Author(s):  
Yongjun Zhang ◽  
Siyuan Zou ◽  
Xinyi Liu ◽  
Xu Huang ◽  
Yi Wan ◽  
...  

Author(s):  
Guoqiang Gong ◽  
Liangfeng Zheng ◽  
Wenhao Jiang ◽  
Yadong Mu

Weakly-supervised temporal action localization aims to locate intervals of action instances using only video-level action labels for training. However, the localization results produced by video classification networks are often inaccurate because temporal boundary annotations of actions are unavailable. Our motivating insight is that the temporal boundary of an action should be predicted stably under various temporal transforms. This inspires a self-supervised equivariant transform consistency constraint. We design a set of temporal transform operations, ranging from naive temporal down-sampling to learnable attention-piloted time warping. In our model, a localization network aims to perform well under all transforms, while a policy network is designed to choose, at each iteration, the temporal transform that adversarially makes the localization results inconsistent with the localization network's predictions. Additionally, we devise a self-refine module that enhances the completeness of action intervals by harnessing temporal and semantic contexts. Experimental results on THUMOS14 and ActivityNet demonstrate that our model consistently outperforms state-of-the-art weakly-supervised temporal action localization methods.
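The equivariance idea can be illustrated with a toy check: boundaries localized on a temporally down-sampled score sequence should match the down-sampled original boundaries. A minimal numpy sketch (illustrative only; `localize` is a hypothetical thresholding stand-in for the localization network):

```python
import numpy as np

def localize(scores, thr=0.5):
    # hypothetical stand-in for the localization network: threshold the
    # frame-level action scores and return the (start, end) frame indices
    idx = np.where(scores > thr)[0]
    return int(idx[0]), int(idx[-1])

T = 100
scores = np.zeros(T)
scores[30:60] = 0.9              # one action instance spanning frames 30..59

s, e = localize(scores)          # boundaries on the original sequence
s2, e2 = localize(scores[::2])   # boundaries after naive 2x temporal down-sampling

# equivariant transform consistency: transforming the input should transform
# the predicted boundaries in the same way
consistent = (s2, e2) == (s // 2, e // 2)
```

A real model enforces this as a training loss rather than a hard check, with the policy network picking the transform most likely to break it.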


Author(s):  
Lianli Gao ◽  
Yaya Cheng ◽  
Qilong Zhang ◽  
Xing Xu ◽  
Jingkuan Song

By adding human-imperceptible perturbations to images, DNNs can be easily fooled. As one of the mainstream methods, feature-space targeted attacks perturb images by modulating their intermediate feature maps so that the discrepancy between the intermediate source and target features is minimized. However, the current choice of pixel-wise Euclidean distance to measure this discrepancy is questionable, because it unreasonably imposes a spatial-consistency constraint on the source and target features. Intuitively, an image can be categorized as "cat" no matter whether the cat is on the left or the right of the image. To address this issue, we propose to measure the discrepancy using statistic alignment. Specifically, we design two novel approaches, Pair-wise Alignment Attack and Global-wise Alignment Attack, which measure similarities between feature maps using high-order statistics with translation invariance. Furthermore, we systematically analyze the layer-wise transferability with varied difficulties to obtain highly reliable attacks. Extensive experiments verify the effectiveness of the proposed method, which outperforms state-of-the-art algorithms by a large margin. Our code is publicly available at https://github.com/yaya-cheng/PAA-GAA.
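One translation-invariant high-order statistic is the channel-wise Gram matrix. The numpy sketch below (our illustration, not the paper's code) shows why such a statistic avoids the spatial-consistency problem: a horizontally shifted feature map is far from the original under pixel-wise Euclidean distance, yet identical under the Gram statistic.

```python
import numpy as np

def gram(feat):
    # feat: (C, H, W) -> channel-wise second-order statistic; summing over all
    # spatial positions makes it invariant to spatial permutations/translations
    C, H, W = feat.shape
    f = feat.reshape(C, H * W)
    return f @ f.T / (H * W)

rng = np.random.default_rng(0)
src = rng.normal(size=(4, 8, 8))
tgt = np.roll(src, shift=3, axis=2)   # same content, shifted horizontally

pixel_dist = np.linalg.norm(src - tgt)             # large: penalizes the shift
gram_dist = np.linalg.norm(gram(src) - gram(tgt))  # ~0: ignores spatial layout
```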


Author(s):  
Xiaobin Liu ◽  
Shiliang Zhang

Recent works show that mean-teaching is an effective framework for unsupervised domain-adaptive person re-identification. However, existing methods perform contrastive learning on selected samples between teacher and student networks, which is sensitive to noise in pseudo labels and neglects the relationships among most samples. Moreover, these methods do not make effective use of cooperation among different teacher networks. To handle these issues, this paper proposes a Graph Consistency based Mean-Teaching (GCMT) method that constructs a Graph Consistency Constraint (GCC) between teacher and student networks. Specifically, given unlabeled training images, we apply the teacher networks to extract the corresponding features and construct a teacher graph for each teacher network to describe the similarity relationships among training images. To boost representation learning, the different teacher graphs are fused to provide the supervision signal for optimizing the student networks. GCMT fuses the similarity relationships predicted by different teacher networks as supervision and effectively optimizes student networks with more sample relationships involved. Experiments on three datasets, i.e., Market-1501, DukeMTMC-reID, and MSMT17, show that the proposed GCMT outperforms state-of-the-art methods by a clear margin. Notably, GCMT even outperforms a previous method that uses a deeper backbone. Experimental results also show that GCMT can effectively boost performance with multiple teacher and student networks. Our code is available at https://github.com/liu-xb/GCMT.
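The graph-fusion step can be sketched as follows, assuming cosine-similarity graphs (a toy numpy illustration; the actual GCMT training loop and loss are more involved):

```python
import numpy as np

def similarity_graph(features):
    # pairwise cosine similarities among all training images
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    return f @ f.T

rng = np.random.default_rng(1)
n, d = 6, 16
t1 = rng.normal(size=(n, d))       # features from teacher network 1
t2 = rng.normal(size=(n, d))       # features from teacher network 2
student = rng.normal(size=(n, d))  # features from the student network

# fuse the teacher graphs into a single supervision signal, then penalize
# the student graph's deviation from it (the graph consistency constraint)
fused = (similarity_graph(t1) + similarity_graph(t2)) / 2
gcc_loss = np.mean((similarity_graph(student) - fused) ** 2)
```

Because the loss involves all pairwise similarities, every sample relationship contributes to the supervision, not just a selected subset.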


2020 ◽  
pp. 1-24
Author(s):  
Xue Deng ◽  
Chuangjie Chen

Considering that most studies have taken investors' preference for risk into account but ignored their preference for assets, in this paper we combine prospect theory and possibility theory to provide investors with a portfolio strategy that meets their preference for assets. Firstly, a novel reference point is proposed to give investors a comprehensive impression of assets. Secondly, the prospect return rate of assets is quantified as a trapezoidal fuzzy number, and its possibilistic mean value and variance are regarded as the prospect return and risk and then used to define the fuzzy prospect value. This new definition denotes the score of an asset in an investor's subjective cognition. Then, a prospect asset filtering framework is proposed to help investors select assets according to their preference. Once assets are selected, another new definition, the prospect consistency coefficient, is proposed to measure the deviation of a portfolio strategy from investors' preference. Some properties of the definition are established by rigorous mathematical proof. Based on the definition and its properties, a possibilistic model is constructed, which not only provides investors with optimal strategies that pursue profit and reduce risk as much as possible, but also ensures that the deviation between the strategies and investors' preference is tolerable. Finally, a numerical example is given to validate the proposed method, and a sensitivity analysis of the parameters in the prospect value function and the prospect consistency constraint is conducted to help investors choose appropriate values according to their preferences. The results show that, compared with the general mean-variance (M-V) model, our model not only better satisfies investors' preference for assets but also disperses risk more effectively.
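Assuming the standard Carlsson–Fullér definition of the possibilistic mean via γ-level cuts (the abstract does not state which definition is used), the quantity behind the prospect return can be checked numerically for a trapezoidal fuzzy number (p, q, r, s):

```python
import numpy as np

def possibilistic_mean(p, q, r, s, n=100001):
    # gamma-level cuts of the trapezoidal fuzzy number (p, q, r, s):
    # [lo(g), hi(g)] shrinks linearly from [p, s] at g=0 to [q, r] at g=1
    g = np.linspace(0.0, 1.0, n)
    lo = p + g * (q - p)
    hi = s - g * (s - r)
    # Carlsson-Fuller possibilistic mean: integral of g * (lo + hi) over [0, 1]
    return np.mean(g * (lo + hi))

m = possibilistic_mean(1.0, 2.0, 3.0, 4.0)
closed_form = (1.0 + 2 * 2.0 + 2 * 3.0 + 4.0) / 6   # (p + 2q + 2r + s) / 6 = 2.5
```

The numeric integral agrees with the closed form (p + 2q + 2r + s) / 6, which is what a possibilistic portfolio model would use as the return of the fuzzy asset.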


2020 ◽  
Vol 34 (04) ◽  
pp. 5842-5850
Author(s):  
Vignesh Srinivasan ◽  
Klaus-Robert Müller ◽  
Wojciech Samek ◽  
Shinichi Nakajima

Unpaired image-to-image domain translation is the task of transferring an image in one domain to another domain without pairs of data for supervision. Several methods address this task using Generative Adversarial Networks (GANs) and a cycle consistency constraint that enforces the translated image to be mapped back to the original domain. In this way, a Deep Neural Network (DNN) learns a mapping such that the input training distribution transferred to the target domain matches the target training distribution. However, not all test images are expected to fall inside the data manifold in the input space, where the DNN has learned to perform the mapping well; such images can map poorly to the target domain. In this paper, we propose to perform Langevin dynamics, which makes a subtle change in the input space that brings inputs close to the data manifold, producing benign examples. The effect is a significant improvement of the mapped image in the target domain. We also show that the score function estimated by a denoising autoencoder (DAE) can in practice be replaced with any autoencoding structure, which most image-to-image translation methods contain intrinsically due to the cycle consistency constraint; thus, no additional training is required. We demonstrate the advantages of our approach for several state-of-the-art image-to-image domain translation models. Quantitative evaluation shows that our proposed method leads to a substantial increase in accuracy with respect to the target label on multiple state-of-the-art image classifiers, while a qualitative user study shows that our method better represents the target domain, achieving better human preference scores.
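The Langevin-dynamics step can be sketched on a toy one-dimensional example where the score of the data distribution is known in closed form; in the paper it is approximated by an autoencoder residual, roughly (DAE(x) - x) / sigma**2. This is our illustration of the update rule, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
mu, var = 0.0, 1.0                 # toy "data manifold": samples from N(0, 1)

def score(x):
    # closed-form score of N(mu, var); a DAE residual would estimate this
    return -(x - mu) / var

x = 5.0                            # off-manifold test input
eps = 0.05
for _ in range(200):
    # Langevin dynamics: drift along the score plus small injected noise,
    # nudging the input toward the data manifold before translation
    x += 0.5 * eps * score(x) + np.sqrt(eps) * rng.normal() * 0.1
```

After a few hundred steps the off-manifold input has drifted close to the high-density region, which is the "benign example" that is then fed to the translation network.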


2020 ◽  
Vol 34 (07) ◽  
pp. 12031-12038
Author(s):  
Wonil Song ◽  
Sungil Choi ◽  
Somi Jeong ◽  
Kwanghoon Sohn

We present a first attempt at stereoscopic image super-resolution (SR) that recovers high-resolution details while preserving stereo-consistency between the images of a stereoscopic pair. The most challenging issue in stereoscopic SR is that texture details should be consistent for corresponding pixels in the stereoscopic SR image pair. Existing stereo SR methods cannot maintain this stereo-consistency, causing 3D fatigue for viewers. To address this issue, we propose a self and parallax attention mechanism (SPAM) that aggregates information from an image and its counterpart stereo image simultaneously, thus reconstructing high-quality stereoscopic SR image pairs. Moreover, we design an efficient network architecture and effective loss functions to enforce the stereo-consistency constraint. Finally, experimental results demonstrate the superiority of our method over state-of-the-art SR methods in terms of both quantitative metrics and qualitative visual quality, while maintaining stereo-consistency between the images of a stereo pair.
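Parallax attention can be sketched as attention along corresponding epipolar lines: each position in the left view attends over all positions in the same row of the right view, and the attended right-view features are aggregated back into the left view. The toy numpy example below is our illustration under simplified assumptions (one epipolar line, a pure horizontal shift), not the SPAM implementation:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(3)
W, C = 8, 4
left = rng.normal(size=(W, C))     # features along one epipolar line, left view
right = np.roll(left, 2, axis=0)   # right view: same content under a disparity of 2

# parallax attention: every left position attends over all right positions
# on the same epipolar line; the temperature sharpens the correspondence
attn = softmax(left @ right.T * 4.0, axis=1)
aggregated = attn @ right          # right-view features warped into the left view
```

In the full model this aggregation is combined with self-attention and fed into the SR reconstruction branch, so both views see consistent texture evidence.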

