Multi-scale 2D Representation Learning for weakly-supervised moment retrieval

Author(s):  
Ding Li ◽  
Rui Wu ◽  
Yongqiang Tang ◽  
Zhizhong Zhang ◽  
Wensheng Zhang
2021 ◽  
Vol 13 (2) ◽  
pp. 328
Author(s):  
Wenkai Liang ◽  
Yan Wu ◽  
Ming Li ◽  
Yice Cao ◽  
Xin Hu

The classification of high-resolution (HR) synthetic aperture radar (SAR) images is of great importance for SAR scene interpretation and application. However, intricate spatial structural patterns and complex statistical properties make SAR image classification challenging, especially when labeled SAR data are limited. This paper proposes a novel HR SAR image classification method that uses a multi-scale deep feature fusion network and a covariance pooling manifold network (MFFN-CPMN). MFFN-CPMN combines the advantages of local spatial features and global statistical properties and considers multi-feature information fusion of SAR images in representation learning. First, we propose a Gabor-filtering-based multi-scale feature fusion network (MFFN) to capture spatial patterns and obtain discriminative features of SAR images. The MFFN is a deep convolutional neural network (CNN). To make full use of the large amount of unlabeled data, the weights of each MFFN layer are optimized by an unsupervised denoising dual-sparse encoder. Moreover, the feature fusion strategy in MFFN effectively exploits the complementary information between different levels and different scales. Second, we use a covariance pooling manifold network to further extract the global second-order statistics of SAR images over the fused feature maps. The resulting covariance descriptor is more distinct for various land covers. Experimental results on four HR SAR images demonstrate the effectiveness of the proposed method, which achieves promising results compared with other related algorithms.
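The second-order statistic that covariance pooling summarizes can be illustrated with a minimal sketch. It computes only the plain sample covariance between channels of a feature map; the manifold layers of the paper's CPMN are not reproduced, and the function name and toy values are assumptions made for illustration.

```python
def covariance_pool(feature_maps):
    """Return the C x C sample covariance of C flattened feature maps.

    feature_maps: list of C lists, each holding the N spatial responses
    of one channel (N >= 2). Hypothetical helper, not the paper's code.
    """
    n = len(feature_maps[0])
    means = [sum(ch) / n for ch in feature_maps]
    # Subtract each channel's mean so products measure co-variation.
    centered = [[v - m for v in ch] for ch, m in zip(feature_maps, means)]
    c = len(feature_maps)
    return [
        [sum(a * b for a, b in zip(centered[i], centered[j])) / (n - 1)
         for j in range(c)]
        for i in range(c)
    ]


# Two made-up channels observed at four spatial positions.
maps = [[1.0, 2.0, 3.0, 4.0],
        [2.0, 4.0, 6.0, 8.0]]
cov = covariance_pool(maps)
```

The result is a symmetric C x C matrix whose size depends on the number of channels, not on the spatial resolution of the feature maps.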


PLoS ONE ◽  
2021 ◽  
Vol 16 (7) ◽  
pp. e0254054
Author(s):  
Gaihua Wang ◽  
Lei Cheng ◽  
Jinheng Lin ◽  
Yingying Dai ◽  
Tianlun Zhang

Large intra-class variance and small inter-class variance are the key factors affecting fine-grained image classification. Recent algorithms have become more accurate and efficient. However, these methods ignore the multi-scale information of the network, resulting in insufficient ability to capture subtle changes. To solve this problem, this paper proposes a weakly supervised fine-grained classification network based on a multi-scale pyramid. It replaces the ordinary convolution kernel in the residual network with a pyramid convolution kernel, which expands the receptive field and uses complementary information at different scales. Meanwhile, the weakly supervised data augmentation network (WS-DAN) is used to prevent overfitting and improve the performance of the model. In addition, a new attention module, which includes spatial attention and channel attention, is introduced to focus on the object parts in the image. Comprehensive experiments are carried out on three public benchmarks. The results show that the proposed method can extract subtle features and classify effectively.


Author(s):  
Wenfei Yang ◽  
Tianzhu Zhang ◽  
Zhendong Mao ◽  
Yongdong Zhang ◽  
Qi Tian ◽  
...  

2019 ◽  
Vol 11 (2) ◽  
pp. 142 ◽  
Author(s):  
Wenping Ma ◽  
Hui Yang ◽  
Yue Wu ◽  
Yunta Xiong ◽  
Tao Hu ◽  
...  

In this paper, a novel change detection approach based on multi-grained cascade forest (gcForest) and multi-scale fusion for synthetic aperture radar (SAR) images is proposed. It detects the changed and unchanged areas of the images by using the well-trained gcForest. Most existing change detection methods need to select the appropriate size of the image block. However, the single size image block only provides a part of the local information, and gcForest cannot achieve a good effect on the image representation learning ability. Therefore, the proposed approach chooses different sizes of image blocks as the input of gcForest, which can learn more image characteristics and reduce the influence of the local information of the image on the classification result as well. In addition, in order to improve the detection accuracy of those pixels whose gray value changes abruptly, the proposed approach combines gradient information of the difference image with the probability map obtained from the well-trained gcForest. Therefore, the image edge information can be enhanced and the accuracy of edge detection can be improved by extracting the image gradient information. Experiments on four data sets indicate that the proposed approach outperforms other state-of-the-art algorithms.
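The idea of feeding image blocks of several sizes to the classifier can be sketched as follows. The block sizes (3, 5, 7), the border-clamping rule, and the function names are assumptions made for illustration; the abstract does not specify them.

```python
def extract_patch(image, row, col, size):
    """Return a size x size patch centred on (row, col); size must be odd.

    Coordinates outside the image are clamped to the nearest border
    pixel (an assumed padding rule, not taken from the paper).
    """
    half = size // 2
    h, w = len(image), len(image[0])
    return [
        [image[min(max(r, 0), h - 1)][min(max(c, 0), w - 1)]
         for c in range(col - half, col + half + 1)]
        for r in range(row - half, row + half + 1)
    ]


def multi_scale_patches(image, row, col, sizes=(3, 5, 7)):
    """Collect one patch per block size for a single pixel."""
    return {s: extract_patch(image, row, col, s) for s in sizes}


# Made-up 6 x 6 difference image with values 0..35.
diff = [[r * 6 + c for c in range(6)] for r in range(6)]
patches = multi_scale_patches(diff, 2, 2)
```

Each pixel then contributes several views of its neighbourhood, so a classifier sees both fine local detail and wider context.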


Electronics ◽  
2020 ◽  
Vol 9 (6) ◽  
pp. 955
Author(s):  
Chang Sun ◽  
Yibo Ai ◽  
Sheng Wang ◽  
Weidong Zhang

Weakly supervised object localization (WSOL) has attracted intense interest in computer vision for instance-level annotations. As WSOL is a hot research topic, a number of existing works have concentrated on convolutional neural network (CNN)-based methods, which are powerful in extracting and representing features. The main challenge in CNN-based WSOL methods is to obtain features covering the entire target object, not only its most discriminative parts. To overcome this challenge and to improve the detection performance of feature-extraction-based WSOL methods, a CNN-based two-branch model was presented in this paper to locate objects using supervised learning. Our method contained two branches: a detection branch and a self-attention branch. During the training process, the two branches interacted, each treating the segmentation mask from the other branch as its own pseudo ground truth labels. Our model was able to focus on capturing the information of all the object parts due to the self-attention mechanism. Additionally, we embedded multi-scale detection into our two-branch method to output two-scale features. We evaluated our two-branch network on the CUB-200-2011 and VOC2007 datasets. The pointing localization, intersection over union (IoU) localization, and correct localization precision (CorLoc) results demonstrated competitive performance with other state-of-the-art methods in WSOL.
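Two of the metrics named above can be made concrete with a short sketch. It assumes axis-aligned boxes written as (x_min, y_min, x_max, y_max), one ground-truth box per image, and the commonly used IoU threshold of 0.5; none of these details are stated in the abstract, and the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    # Overlap along each axis, or 0 when the boxes do not intersect.
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(predicted, ground_truth, threshold=0.5):
    """Fraction of images whose predicted box reaches the IoU threshold."""
    hits = sum(iou(p, g) >= threshold for p, g in zip(predicted, ground_truth))
    return hits / len(predicted)


# Made-up predictions: one exact match and one poor overlap.
preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
truth = [(0, 0, 10, 10), (5, 5, 15, 15)]
score = corloc(preds, truth)
```

In this toy case the second box overlaps the ground truth with an IoU of 25/175, which is below the threshold, so only one of the two images counts as correctly localized.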


2020 ◽  
pp. 1-51
Author(s):  
Ivan Vulić ◽  
Simon Baker ◽  
Edoardo Maria Ponti ◽  
Ulla Petti ◽  
Ira Leviant ◽  
...  

We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili). Each language data set is annotated for the lexical relation of semantic similarity and contains 1,888 semantically aligned concept pairs, providing a representative coverage of word classes (nouns, verbs, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels. Additionally, owing to the alignment of concepts across languages, we provide a suite of 66 crosslingual semantic similarity data sets. Because of its extensive size and language coverage, Multi-SimLex provides entirely novel opportunities for experimental evaluation and analysis. On its monolingual and crosslingual benchmarks, we evaluate and analyze a wide array of recent state-of-the-art monolingual and crosslingual representation models, including static and contextualized word embeddings (such as fastText, monolingual and multilingual BERT, XLM), externally informed lexical representations, as well as fully unsupervised and (weakly) supervised crosslingual word embeddings. We also present a step-by-step data set creation protocol for creating consistent, Multi-SimLex-style resources for additional languages. We make these contributions (the public release of Multi-SimLex data sets, their creation protocol, strong baseline results, and in-depth analyses which can be helpful in guiding future developments in multilingual lexical semantics and representation learning) available via a Web site that will encourage community effort in further expansion of Multi-SimLex to many more languages. Such a large-scale semantic resource could inspire significant further advances in NLP across languages.
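The figure of 66 crosslingual data sets is consistent with one data set per unordered pair of the 12 languages. The sketch below checks that count and also shows Spearman's rank correlation, a common way to score a model against human similarity ratings. The abstract does not name its evaluation metric, so the metric, the function names, and the toy scores are all assumptions.

```python
from math import comb, sqrt


def average_ranks(values):
    """Rank values from 1, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Unordered pairs among 12 languages.
n_pairs = comb(12, 2)

# Made-up human ratings and model scores for four concept pairs.
human = [0.5, 2.0, 3.5, 5.0]
model = [0.1, 0.4, 0.3, 0.9]
rho = spearman(human, model)
```

Because the correlation uses ranks only, a model is rewarded for ordering concept pairs like the annotators do, regardless of the scale of its raw scores.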

