Two-stage local constrained sparse coding for fine-grained visual categorization

Fine-grained visual categorization (FGVC) is the discrimination of similar subcategories; its main challenge is localizing the subtle visual distinctions between them. There are two pivotal problems: discovering which regions are discriminative and representative, and determining how many discriminative regions are necessary to achieve the best performance. Existing methods generally solve these two problems by relying on prior knowledge or experimental validation, which severely restricts the usability and scalability of FGVC. To address the "which" and "how many" problems adaptively and intelligently, this paper proposes a stacked deep reinforcement learning approach (StackDRL). It adopts a two-stage learning architecture driven by a semantic reward function. Two-stage learning localizes the object and its parts in sequence ("which") and determines the number of discriminative regions adaptively ("how many"), which is appealing for FGVC. The semantic reward function drives StackDRL to fully learn the discriminative and conceptual visual information by jointly combining the attention-based reward and the category-based reward. Furthermore, unsupervised discriminative localization avoids the heavy labor cost of labeling and greatly strengthens the usability and scalability of our StackDRL approach. Compared with ten state-of-the-art methods on the CUB-200-2011 dataset, our StackDRL approach achieves the best categorization accuracy.

A deep sparse coding method for fine-grained visual categorization

2016 International Joint Conference on Neural Networks (IJCNN) ◽

10.1109/ijcnn.2016.7727259 ◽

2016 ◽

Cited By ~ 3

Author(s):

Lihua Guo ◽

Chenggang Guo

Keyword(s):

Sparse Coding ◽

Visual Categorization ◽

Fine Grained ◽

Coding Method

A Survey of Recent Advances in CNN-Based Fine-Grained Visual Categorization

2020 IEEE 20th International Conference on Communication Technology (ICCT) ◽

10.1109/icct50939.2020.9295723 ◽

2020 ◽

Author(s):

Chenyang Qiu ◽

Wei Zhou

Keyword(s):

Visual Categorization ◽

Fine Grained ◽

Recent Advances

Coarse2Fine: a two-stage training method for fine-grained visual classification

Machine Vision and Applications ◽

10.1007/s00138-021-01180-y ◽

2021 ◽

Vol 32 (2) ◽

Author(s):

Amir Erfan Eshratifar ◽

David Eigen ◽

Michael Gormish ◽

Massoud Pedram

Keyword(s):

Training Method ◽

Two Stage ◽

Visual Classification ◽

Fine Grained

Knowing What, How and Why: A Near Complete Solution for Aspect-Based Sentiment Analysis

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6383 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8600-8607

Author(s):

Haiyun Peng ◽

Lu Xu ◽

Lidong Bing ◽

Fei Huang ◽

Wei Lu ◽

...

Keyword(s):

Sentiment Analysis ◽

State Of The Art ◽

Complete Solution ◽

Unified Model ◽

Two Stage ◽

Fine Grained ◽

Aspect Extraction ◽

Second Stage ◽

Opinion Extraction ◽

Complete Story

Target-based sentiment analysis or aspect-based sentiment analysis (ABSA) refers to addressing various sentiment analysis tasks at a fine-grained level, which includes but is not limited to aspect extraction, aspect sentiment classification, and opinion extraction. There exist many solvers of the above individual subtasks or a combination of two subtasks, and they can work together to tell a complete story, i.e. the discussed aspect, the sentiment on it, and the cause of the sentiment. However, no previous ABSA research tried to provide a complete solution in one shot. In this paper, we introduce a new subtask under ABSA, named aspect sentiment triplet extraction (ASTE). Particularly, a solver of this task needs to extract triplets (What, How, Why) from the inputs, which show WHAT the targeted aspects are, HOW their sentiment polarities are and WHY they have such polarities (i.e. opinion reasons). For instance, one triplet from “Waiters are very friendly and the pasta is simply average” could be (‘Waiters’, positive, ‘friendly’). We propose a two-stage framework to address this task. The first stage predicts what, how and why in a unified model, and then the second stage pairs up the predicted what (how) and why from the first stage to output triplets. In the experiments, our framework has set a benchmark performance in this novel triplet extraction task. Meanwhile, it outperforms a few strong baselines adapted from state-of-the-art related methods.
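
The two-stage idea in the abstract above can be illustrated with a minimal sketch. The abstract does not say how the second stage pairs aspects with opinions, so the rule used here (attach to each aspect the first opinion term that follows it) is only a stand-in heuristic, not the authors' method. The `Aspect` and `Opinion` types, the `pair_triplets` function, the token positions, and the "neutral" polarity for "pasta" are all hypothetical and hand-written for illustration; only the ('Waiters', positive, 'friendly') triplet comes from the abstract.

```python
from typing import List, NamedTuple, Tuple


class Aspect(NamedTuple):
    """Hypothetical stage-one output: an aspect term with its polarity."""
    text: str
    position: int  # token index in the sentence
    polarity: str


class Opinion(NamedTuple):
    """Hypothetical stage-one output: an opinion term."""
    text: str
    position: int  # token index in the sentence


def pair_triplets(
    aspects: List[Aspect], opinions: List[Opinion]
) -> List[Tuple[str, str, str]]:
    """Stage two (stand-in heuristic): pair each aspect with the closest
    opinion term that appears after it, yielding (what, how, why)."""
    triplets = []
    for aspect in aspects:
        following = [o for o in opinions if o.position > aspect.position]
        if not following:
            continue  # no opinion term after this aspect; emit nothing
        nearest = min(following, key=lambda o: o.position - aspect.position)
        triplets.append((aspect.text, aspect.polarity, nearest.text))
    return triplets


# Sentence from the abstract, tokenized by whitespace:
# "Waiters are very friendly and the pasta is simply average"
# Stage-one outputs below are assumed, not produced by a real model.
aspects = [Aspect("Waiters", 0, "positive"), Aspect("pasta", 6, "neutral")]
opinions = [Opinion("friendly", 3), Opinion("average", 9)]

triplets = pair_triplets(aspects, opinions)
print(triplets)
```

A learned pairing model would replace the distance rule in `pair_triplets`; the data shapes are the part this sketch is meant to make concrete.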

A Public Dataset for Fine-Grained Ship Classification in Optical Remote Sensing Images

Remote Sensing ◽

10.3390/rs13040747 ◽

2021 ◽

Vol 13 (4) ◽

pp. 747

Author(s):

Yanghua Di ◽

Zhiguo Jiang ◽

Haopeng Zhang

Keyword(s):

Remote Sensing ◽

Image Data ◽

Remote Sensing Image ◽

Google Earth ◽

Optical Remote Sensing ◽

Remote Sensing Images ◽

Visual Categorization ◽

Class Differences ◽

Fine Grained ◽

Ship Classification

Fine-grained visual categorization (FGVC) is an important and challenging problem due to large intra-class differences and small inter-class differences caused by deformation, illumination, angles, etc. Although major advances have been achieved in natural images in the past few years due to the release of popular datasets such as the CUB-200-2011, Stanford Cars and Aircraft datasets, fine-grained ship classification in remote sensing images has rarely been studied because of the relative scarcity of publicly available datasets. In this paper, we investigate a large amount of remote sensing image data of sea ships and determine the 42 most common categories for fine-grained visual categorization. Based on our previous DSCR dataset, a dataset for ship classification in remote sensing images, we collect more remote sensing images containing warships and civilian ships of various scales from Google Earth and other popular remote sensing image datasets, including DOTA, HRSC2016, and NWPU VHR-10. We call our dataset FGSCR-42, meaning a dataset for Fine-Grained Ship Classification in Remote sensing images with 42 categories. The whole FGSCR-42 dataset contains 9320 images of the most common types of ships. We evaluate popular object classification algorithms and fine-grained visual categorization algorithms to build a benchmark. Our FGSCR-42 dataset is publicly available at our webpages.

Coarse Label Refined Knowledge Reasoning for Fine-Grained Visual Categorization

Lecture Notes in Computer Science - Intelligence Science and Big Data Engineering ◽

10.1007/978-3-030-02698-1_30 ◽

2018 ◽

pp. 349-359

Author(s):

Xiangyu Zhao ◽

Yuxin Peng

Keyword(s):

Visual Categorization ◽

Fine Grained ◽

Knowledge Reasoning

Label-Smooth Learning for Fine-Grained Visual Categorization

Lecture Notes in Computer Science - Pattern Recognition ◽

10.1007/978-3-030-41404-7_2 ◽

2020 ◽

pp. 17-31

Author(s):

Xianjie Mo ◽

Tingting Wei ◽

Hengmin Zhang ◽

Qiong Huang ◽

Wei Luo

Keyword(s):

Visual Categorization ◽

Fine Grained

Extracting fine-grained location with temporal awareness in tweets: A two-stage approach

Journal of the Association for Information Science and Technology ◽

10.1002/asi.23816 ◽

2017 ◽

Vol 68 (7) ◽

pp. 1652-1670 ◽

Cited By ~ 5

Author(s):

Chenliang Li ◽

Aixin Sun

Keyword(s):

Two Stage ◽

Fine Grained

P-CNN: Part-Based Convolutional Neural Networks for Fine-Grained Visual Categorization

IEEE Transactions on Pattern Analysis and Machine Intelligence ◽

10.1109/tpami.2019.2933510 ◽

2020 ◽

pp. 1-1 ◽

Cited By ~ 15

Author(s):

Junwei Han ◽

Xiwen Yao ◽

Gong Cheng ◽

Xiaoxu Feng ◽

Dong Xu

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Visual Categorization ◽

Fine Grained
