scholarly journals Few-Shot Learning with Part Discovery and Augmentation from Unlabeled Images

Author(s):  
Wentao Chen ◽  
Chenyang Si ◽  
Wei Wang ◽  
Liang Wang ◽  
Zilei Wang ◽  
...  

Few-shot learning is a challenging task since only few instances are given for recognizing an unseen class. One way to alleviate this problem is to acquire a strong inductive bias via meta-learning on similar tasks. In this paper, we show that such inductive bias can be learned from a flat collection of unlabeled images, and instantiated as transferable representations among seen and unseen classes. Specifically, we propose a novel part-based self-supervised representation learning scheme to learn transferable representations by maximizing the similarity of an image to its discriminative part. To mitigate the overfitting in few-shot classification caused by data scarcity, we further propose a part augmentation strategy by retrieving extra images from a base dataset. We conduct systematic studies on miniImageNet and tieredImageNet benchmarks. Remarkably, our method yields impressive results, outperforming the previous best unsupervised methods by 7.74% and 9.24% under 5-way 1-shot and 5-way 5-shot settings, which are comparable with state-of-the-art supervised methods.

Author(s):  
Yuexuan An ◽  
Hui Xue ◽  
Xingyu Zhao ◽  
Lu Zhang

How to learn a transferable feature representation from limited examples is a key challenge for few-shot classification. Self-supervision as an auxiliary task to the main supervised few-shot task is considered to be a conceivable way to solve the problem since self-supervision can provide additional structural information easily ignored by the main task. However, learning a good representation by traditional self-supervised methods is usually dependent on large training samples. In few-shot scenarios, due to the lack of sufficient samples, these self-supervised methods might learn a biased representation, which more likely leads to the wrong guidance for the main tasks and finally causes the performance degradation. In this paper, we propose conditional self-supervised learning (CSS) to use auxiliary information to guide the representation learning of self-supervised tasks. Specifically, CSS leverages supervised information as prior knowledge to shape and improve the learning feature manifold of self-supervision without auxiliary unlabeled data, so as to reduce representation bias and mine more effective semantic information. Moreover, CSS exploits more meaningful information through supervised and the improved self-supervised learning respectively and integrates the information into a unified distribution, which can further enrich and broaden the original representation. Extensive experiments demonstrate that our proposed method without any fine-tuning can achieve a significant accuracy improvement on the few-shot classification scenarios compared to the state-of-the-art few-shot learning methods.


Sensors ◽  
2021 ◽  
Vol 21 (5) ◽  
pp. 1566
Author(s):  
Pei Zhang ◽  
Ying Li ◽  
Dong Wang ◽  
Jiyue Wang

While growing instruments generate more and more airborne or satellite images, the bottleneck in remote sensing (RS) scene classification has shifted from data limits toward a lack of ground truth samples. There are still many challenges when we are facing unknown environments, especially those with insufficient training data. Few-shot classification offers a different picture under the umbrella of meta-learning: digging rich knowledge from a few data are possible. In this work, we propose a method named RS-SSKD for few-shot RS scene classification from a perspective of generating powerful representation for the downstream meta-learner. Firstly, we propose a novel two-branch network that takes three pairs of original-transformed images as inputs and incorporates Class Activation Maps (CAMs) to drive the network mining, the most relevant category-specific region. This strategy ensures that the network generates discriminative embeddings. Secondly, we set a round of self-knowledge distillation to prevent overfitting and boost the performance. Our experiments show that the proposed method surpasses current state-of-the-art approaches on two challenging RS scene datasets: NWPU-RESISC45 and RSD46-WHU. Finally, we conduct various ablation experiments to investigate the effect of each component of the proposed method and analyze the training time of state-of-the-art methods and ours.


Author(s):  
Haoqing Wang ◽  
Zhi-Hong Deng

Few-shot classification aims to recognize unseen classes with few labeled samples from each class. Many meta-learning models for few-shot classification elaborately design various task-shared inductive bias (meta-knowledge) to solve such tasks, and achieve impressive performance. However, when there exists the domain shift between the training tasks and the test tasks, the obtained inductive bias fails to generalize across domains, which degrades the performance of the meta-learning models. In this work, we aim to improve the robustness of the inductive bias through task augmentation. Concretely, we consider the worst-case problem around the source task distribution, and propose the adversarial task augmentation method which can generate the inductive bias-adaptive 'challenging' tasks. Our method can be used as a simple plug-and-play module for various meta-learning models, and improve their cross-domain generalization capability. We conduct extensive experiments under the cross-domain setting, using nine few-shot classification datasets: mini-ImageNet, CUB, Cars, Places, Plantae, CropDiseases, EuroSAT, ISIC and ChestX. Experimental results show that our method can effectively improve the few-shot classification performance of the meta-learning models under domain shift, and outperforms the existing works. Our code is available at https://github.com/Haoqing-Wang/CDFSL-ATA.


AI ◽  
2021 ◽  
Vol 2 (2) ◽  
pp. 195-208
Author(s):  
Gabriel Dahia ◽  
Maurício Pamplona Segundo

We propose a method that can perform one-class classification given only a small number of examples from the target class and none from the others. We formulate the learning of meaningful features for one-class classification as a meta-learning problem in which the meta-training stage repeatedly simulates one-class classification, using the classification loss of the chosen algorithm to learn a feature representation. To learn these representations, we require only multiclass data from similar tasks. We show how the Support Vector Data Description method can be used with our method, and also propose a simpler variant based on Prototypical Networks that obtains comparable performance, indicating that learning feature representations directly from data may be more important than which one-class algorithm we choose. We validate our approach by adapting few-shot classification datasets to the few-shot one-class classification scenario, obtaining similar results to the state-of-the-art of traditional one-class classification, and that improves upon that of one-class classification baselines employed in the few-shot setting.


Sensors ◽  
2021 ◽  
Vol 21 (13) ◽  
pp. 4486
Author(s):  
Niall O’Mahony ◽  
Sean Campbell ◽  
Lenka Krpalkova ◽  
Anderson Carvalho ◽  
Joseph Walsh ◽  
...  

Fine-grained change detection in sensor data is very challenging for artificial intelligence though it is critically important in practice. It is the process of identifying differences in the state of an object or phenomenon where the differences are class-specific and are difficult to generalise. As a result, many recent technologies that leverage big data and deep learning struggle with this task. This review focuses on the state-of-the-art methods, applications, and challenges of representation learning for fine-grained change detection. Our research focuses on methods of harnessing the latent metric space of representation learning techniques as an interim output for hybrid human-machine intelligence. We review methods for transforming and projecting embedding space such that significant changes can be communicated more effectively and a more comprehensive interpretation of underlying relationships in sensor data is facilitated. We conduct this research in our work towards developing a method for aligning the axes of latent embedding space with meaningful real-world metrics so that the reasoning behind the detection of change in relation to past observations may be revealed and adjusted. This is an important topic in many fields concerned with producing more meaningful and explainable outputs from deep learning and also for providing means for knowledge injection and model calibration in order to maintain user confidence.


2021 ◽  
Vol 7 (3) ◽  
pp. 49
Author(s):  
Daniel Carlos Guimarães Pedronette ◽  
Lucas Pascotti Valem ◽  
Longin Jan Latecki

Visual features and representation learning strategies experienced huge advances in the previous decade, mainly supported by deep learning approaches. However, retrieval tasks are still performed mainly based on traditional pairwise dissimilarity measures, while the learned representations lie on high dimensional manifolds. With the aim of going beyond pairwise analysis, post-processing methods have been proposed to replace pairwise measures by globally defined measures, capable of analyzing collections in terms of the underlying data manifold. The most representative approaches are diffusion and ranked-based methods. While the diffusion approaches can be computationally expensive, the rank-based methods lack theoretical background. In this paper, we propose an efficient Rank-based Diffusion Process which combines both approaches and avoids the drawbacks of each one. The obtained method is capable of efficiently approximating a diffusion process by exploiting rank-based information, while assuring its convergence. The algorithm exhibits very low asymptotic complexity and can be computed regionally, being suitable to outside of dataset queries. An experimental evaluation conducted for image retrieval and person re-ID tasks on diverse datasets demonstrates the effectiveness of the proposed approach with results comparable to the state-of-the-art.


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4666
Author(s):  
Zhiqiang Pan ◽  
Honghui Chen

Collaborative filtering (CF) aims to make recommendations for users by detecting user’s preference from the historical user–item interactions. Existing graph neural networks (GNN) based methods achieve satisfactory performance by exploiting the high-order connectivity between users and items, however they suffer from the poor training efficiency problem and easily introduce bias for information propagation. Moreover, the widely applied Bayesian personalized ranking (BPR) loss is insufficient to provide supervision signals for training due to the extremely sparse observed interactions. To deal with the above issues, we propose the Efficient Graph Collaborative Filtering (EGCF) method. Specifically, EGCF adopts merely one-layer graph convolution to model the collaborative signal for users and items from the first-order neighbors in the user–item interactions. Moreover, we introduce contrastive learning to enhance the representation learning of users and items by deriving the self-supervisions, which is jointly trained with the supervised learning. Extensive experiments are conducted on two benchmark datasets, i.e., Yelp2018 and Amazon-book, and the experimental results demonstrate that EGCF can achieve the state-of-the-art performance in terms of Recall and normalized discounted cumulative gain (NDCG), especially on ranking the target items at right positions. In addition, EGCF shows obvious advantages in the training efficiency compared with the competitive baselines, making it practicable for potential applications.


2021 ◽  
Author(s):  
Yuan-Chia Cheng ◽  
Ci-Siang Lin ◽  
Fu-En Yang ◽  
Yu-Chiang Frank Wang

2021 ◽  
Vol 15 (6) ◽  
pp. 1-21
Author(s):  
Huandong Wang ◽  
Yong Li ◽  
Mu Du ◽  
Zhenhui Li ◽  
Depeng Jin

Both app developers and service providers have strong motivations to understand when and where certain apps are used by users. However, it has been a challenging problem due to the highly skewed and noisy app usage data. Moreover, apps are regarded as independent items in existing studies, which fail to capture the hidden semantics in app usage traces. In this article, we propose App2Vec, a powerful representation learning model to learn the semantic embedding of apps with the consideration of spatio-temporal context. Based on the obtained semantic embeddings, we develop a probabilistic model based on the Bayesian mixture model and Dirichlet process to capture when , where , and what semantics of apps are used to predict the future usage. We evaluate our model using two different app usage datasets, which involve over 1.7 million users and 2,000+ apps. Evaluation results show that our proposed App2Vec algorithm outperforms the state-of-the-art algorithms in app usage prediction with a performance gap of over 17.0%.


Sign in / Sign up

Export Citation Format

Share Document