Reliable Part Guided Multiple Level Attention Learning for Person Re-Identification

Person re-identification (Re-ID) is challenged by background clutter, body misalignment, and missing parts. In this paper, we propose a reliable part-based multi-level attention deep network that learns a multi-scale salience representation. In particular, person alignment and key point detection are carried out sequentially to locate three relatively stable body components. A fused attention (FA) mode is then designed to capture fine-grained salient features from the effective spatial regions of the valuable channels of each part, and a regional attention mode follows to weight the importance of the different parts, highlighting the representative parts while suppressing the uninformative ones. Finally, a late-fusion-based multi-task loss is adopted to further optimize the feature representation. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on three challenging benchmarks: Market-1501, DukeMTMC-reID, and CUHK03.
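The regional attention step in this abstract, which weights the part features by importance, can be illustrated with a minimal sketch. It assumes a softmax over per-part scores; the abstract does not specify the weighting function, so `softmax`, `weight_parts`, and the example scores below are hypothetical names and values, not the authors' implementation.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weight_parts(part_features: List[List[float]],
                 part_scores: List[float]) -> List[float]:
    """Fuse part features as a weighted sum (assumed form of regional attention)."""
    weights = softmax(part_scores)
    dim = len(part_features[0])
    return [
        sum(w * feat[d] for w, feat in zip(weights, part_features))
        for d in range(dim)
    ]


# Three body parts, each with a toy 2-dimensional feature vector.
part_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
# Hypothetical scores; in a trained network these would be learned.
part_scores = [2.0, 0.0, 0.0]
fused = weight_parts(part_features, part_scores)
```

With a higher score, the first part contributes more to `fused` than the other two, which is the "highlight representative parts, suppress the rest" behaviour the abstract describes.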


GourmetNet: Food Segmentation Using Multi-Scale Waterfall Features with Spatial and Channel Attention

Sensors ◽

10.3390/s21227504 ◽

2021 ◽

Vol 21 (22) ◽

pp. 7504

Author(s):

Udit Sharma ◽

Bruno Artacho ◽

Andreas Savakis

Keyword(s):

Feature Extraction ◽

State Of The Art ◽

Extraction Process ◽

Feature Representation ◽

Post Processing ◽

Multi Scale ◽

Spatial Pooling ◽

Current State ◽

Nutrition Monitoring ◽

Multiple Levels

We propose GourmetNet, a single-pass, end-to-end trainable network for food segmentation that achieves state-of-the-art performance. Food segmentation is an important problem as the first step for nutrition monitoring, food volume estimation, and calorie estimation. Our novel architecture incorporates both channel attention and spatial attention information in an expanded multi-scale feature representation using our advanced Waterfall Atrous Spatial Pooling module. GourmetNet refines the feature extraction process by merging features from multiple levels of the backbone through the two attention modules. The refined features are processed with the advanced multi-scale waterfall module that combines the benefits of cascade filtering and pyramid representations without requiring a separate decoder or post-processing. Our experiments on two food datasets show that GourmetNet significantly outperforms existing state-of-the-art methods.


Patchy Image Structure Classification Using Multi-Orientation Region Transform

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i07.6968 ◽

2020 ◽

Vol 34 (07) ◽

pp. 12741-12748

Author(s):

Xiaohan Yu ◽

Yang Zhao ◽

Yongsheng Gao ◽

Shengwu Xiong ◽

Xiaohui Yuan

Keyword(s):

Multiple Scales ◽

State Of The Art ◽

Flexible Structures ◽

Recognition Task ◽

Interior Structure ◽

Insect Wing ◽

Image Structure ◽

Fine Grained ◽

Contour Feature ◽

Coarse To Fine

Exterior contour and interior structure are both vital features for classifying objects. However, most existing methods consider exterior contour features and interior structure features separately, and thus fail when classifying patchy image structures that have similar contours and flexible structures. To address these limitations, this paper proposes a novel Multi-Orientation Region Transform (MORT), which can effectively characterize both contour and structure features simultaneously, for patchy image structure classification. MORT is performed over multiple orientation regions at multiple scales to effectively integrate patchy features, and thus enables a better description of the shape in a coarse-to-fine manner. Moreover, the proposed MORT can be combined with deep convolutional neural network techniques to further enhance classification accuracy. Very encouraging experimental results are obtained on the challenging ultra-fine-grained cultivar recognition task, insect wing recognition task, and large-variation butterfly recognition task, demonstrating the effectiveness and superiority of the proposed MORT over state-of-the-art methods in classifying patchy image structures. Our code and three patchy image structure datasets are available at: https://github.com/XiaohanYu-GU/MReT2019.


Two-Branch Attention Learning for Fine-Grained Class Incremental Learning

Electronics ◽

10.3390/electronics10232987 ◽

2021 ◽

Vol 10 (23) ◽

pp. 2987

Author(s):

Jiaqi Guo ◽

Guanqiu Qi ◽

Shuiqing Xie ◽

Xiangyuan Li

Keyword(s):

Incremental Learning ◽

Network Architecture ◽

State Of The Art ◽

Research Area ◽

Feature Representation ◽

Learning Network ◽

Fine Grained ◽

Effective Training ◽

Critical Regions ◽

Number Of Classes

As a long-standing research area, class incremental learning (CIL) aims to effectively learn a unified classifier as the number of classes grows. Due to its small inter-class variances and large intra-class variances, fine-grained visual categorization (FGVC) is a challenging visual task that has not attracted enough attention in CIL. The localization of critical regions specialized for fine-grained object recognition plays a crucial role in FGVC. Additionally, it is important to learn fine-grained features from critical regions in fine-grained CIL for the recognition of new object classes. This paper designs a network architecture named two-branch attention learning network (TBAL-Net) for fine-grained CIL. TBAL-Net can localize critical regions and learn fine-grained feature representations with a lightweight attention module. An effective training framework is proposed for fine-grained CIL by integrating TBAL-Net into an effective CIL process. This framework is tested on three popular fine-grained object datasets: CUB-200-2011, FGVC-Aircraft, and Stanford-Car. The comparative experimental results demonstrate that the proposed framework achieves state-of-the-art performance on the three fine-grained object datasets.


Material feature representation and identification with composite surfacelets

Journal of Computational Design and Engineering ◽

10.1016/j.jcde.2016.06.005 ◽

2016 ◽

Vol 3 (4) ◽

pp. 370-384 ◽

Cited By ~ 3

Author(s):

Wei Huang ◽

Yan Wang ◽

David W. Rosen

Keyword(s):

Reverse Engineering ◽

Multiple Scales ◽

Three Dimensional ◽

Parametric Representation ◽

Feature Representation ◽

Fine Grained ◽

Modeling Material ◽

Material Feature ◽

Surfacelet Transform ◽

Edge Features

Computer-aided materials design requires new modeling approaches to characterize and represent fine-grained geometric structures and material compositions at multiple scales. Recently, a dual-Rep approach was developed to model materials microstructures based on a new basis function, called surfacelet. As a combination of implicit surface and wavelets, surfacelets can efficiently identify and represent planar, cylindrical, and ellipsoidal geometries in material microstructures and describe the distribution of compositions and properties. In this paper, these primitive surfacelets are extended and composite surfacelets are proposed to model more complex geometries. Composite surfacelets are constructed by Boolean operations on the primitives. The surfacelet transform is applied to match geometric features in three-dimensional images. The composition of the material near the identified features can then be modeled. A cubic surfacelet and a v-joint surfacelet are developed to demonstrate the reverse engineering process of retrieving material compositions from material images.

Highlights: Modeling material distribution and edge singularity with composition of implicit surfaces. Identifying edge features in images with surface integrals and surfacelet transform. Enabling reverse engineering of materials with parametric representation.


Behavioral Genetics: Concepts for Research and Practice in Language Development and Disorders

Journal of Speech Language and Hearing Research ◽

10.1044/jshr.3805.1126 ◽

1995 ◽

Vol 38 (5) ◽

pp. 1126-1142 ◽

Cited By ~ 14

Author(s):

Jeffrey W. Gilger

Keyword(s):

Language Development ◽

Behavioral Genetics ◽

State Of The Art ◽

Genetic Research ◽

Great Promise ◽

Behavioral Genetic ◽

Fine Grained ◽

Future Goals ◽

Current State ◽

Research Designs

This paper is an introduction to behavioral genetics for researchers and practitioners in language development and disorders. The specific aims are to illustrate some essential concepts and to show how behavioral genetic research can be applied to the language sciences. Past genetic research on language-related traits has tended to focus on simple etiology (i.e., the heritability or familiality of language skills). The current state of the art, however, suggests that great promise lies in addressing more complex questions through behavioral genetic paradigms. In terms of future goals it is suggested that: (a) more behavioral genetic work of all types should be done, including replications and expansions of preliminary studies already in print; (b) work should focus on fine-grained, theory-based phenotypes with research designs that can address complex questions in language development; and (c) work in this area should utilize a variety of samples and methods (e.g., twin and family samples, heritability and segregation analyses, linkage and association tests).


Representation Learning for Fine-Grained Change Detection

Sensors ◽

10.3390/s21134486 ◽

2021 ◽

Vol 21 (13) ◽

pp. 4486

Author(s):

Niall O’Mahony ◽

Sean Campbell ◽

Lenka Krpalkova ◽

Anderson Carvalho ◽

Joseph Walsh ◽

...

Keyword(s):

Deep Learning ◽

Change Detection ◽

Model Calibration ◽

State Of The Art ◽

Representation Learning ◽

Machine Intelligence ◽

The State ◽

Sensor Data ◽

Fine Grained ◽

Learning Techniques

Fine-grained change detection in sensor data is very challenging for artificial intelligence though it is critically important in practice. It is the process of identifying differences in the state of an object or phenomenon where the differences are class-specific and are difficult to generalise. As a result, many recent technologies that leverage big data and deep learning struggle with this task. This review focuses on the state-of-the-art methods, applications, and challenges of representation learning for fine-grained change detection. Our research focuses on methods of harnessing the latent metric space of representation learning techniques as an interim output for hybrid human-machine intelligence. We review methods for transforming and projecting embedding space such that significant changes can be communicated more effectively and a more comprehensive interpretation of underlying relationships in sensor data is facilitated. We conduct this research in our work towards developing a method for aligning the axes of latent embedding space with meaningful real-world metrics so that the reasoning behind the detection of change in relation to past observations may be revealed and adjusted. This is an important topic in many fields concerned with producing more meaningful and explainable outputs from deep learning and also for providing means for knowledge injection and model calibration in order to maintain user confidence.


ShadingNet: Image Intrinsics by Fine-Grained Shading Decomposition

International Journal of Computer Vision ◽

10.1007/s11263-021-01477-5 ◽

2021 ◽

Author(s):

Anil S. Baslamisli ◽

Partha Das ◽

Hoang-An Le ◽

Sezer Karaoglu ◽

Theo Gevers

Keyword(s):

Neural Network ◽

Large Scale ◽

State Of The Art ◽

Image Decomposition ◽

Natural Environments ◽

Decomposition Algorithms ◽

Ambient Light ◽

Fine Grained ◽

Large Scale Dataset ◽

Direct Illumination

In general, intrinsic image decomposition algorithms interpret shading as one unified component including all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail in distinguishing strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct (illumination) and indirect shading (ambient light and shadows) subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze the disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground-truths. Large-scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD datasets.


A Machine Vision Approach for Bioreactor Foam Sensing

SLAS TECHNOLOGY Translating Life Sciences Innovation ◽

10.1177/24726303211008861 ◽

2021 ◽

pp. 247263032110088

Author(s):

Jonas Austerjost ◽

Robert Söldner ◽

Christoffer Edlund ◽

Johan Trygg ◽

David Pollard ◽

...

Keyword(s):

Machine Learning ◽

Machine Vision ◽

State Of The Art ◽

Low Cost ◽

High Accuracy ◽

Consumer Electronics ◽

Learning System ◽

Automotive Applications ◽

Fine Grained

Machine vision is a powerful technology that has become increasingly popular and accurate during the last decade due to rapid advances in the field of machine learning. The majority of machine vision applications are currently found in consumer electronics, automotive applications, and quality control, yet the potential for bioprocessing applications is tremendous. For instance, detecting and controlling foam emergence is important for all upstream bioprocesses, but the lack of robust foam sensing often leads to batch failures from foam-outs or overaddition of antifoam agents. Here, we report a new low-cost, flexible, and reliable foam sensor concept for bioreactor applications. The concept applies convolutional neural networks (CNNs), a state-of-the-art machine learning system for image processing. The implemented method shows high accuracy for both binary foam detection (foam/no foam) and fine-grained classification of foam levels.


BeautyNet: Joint Multiscale CNN and Transfer Learning Method for Unconstrained Facial Beauty Prediction

Computational Intelligence and Neuroscience ◽

10.1155/2019/1910624 ◽

2019 ◽

Vol 2019 ◽

pp. 1-14 ◽

Cited By ~ 4

Author(s):

Yikui Zhai ◽

He Cao ◽

Wenbo Deng ◽

Junying Gan ◽

Vincenzo Piuri ◽

...

Keyword(s):

Transfer Learning ◽

Classification Accuracy ◽

Learning Strategy ◽

State Of The Art ◽

Activation Function ◽

Training Data ◽

Fine Grained ◽

Pattern Recognition Problem ◽

Face Features ◽

Facial Beauty

Because of the lack of discriminative face representations and the scarcity of labeled training data, facial beauty prediction (FBP), which aims at assessing facial attractiveness automatically, has become a challenging pattern recognition problem. Inspired by recent promising work on fine-grained image classification that uses a multiscale architecture to extend the diversity of deep features, this paper proposes BeautyNet for unconstrained facial beauty prediction. Firstly, a multiscale network is adopted to improve the discriminability of face features. Secondly, to alleviate the computational burden of the multiscale architecture, MFM (max-feature-map) is utilized as an activation function, which not only lightens the network and speeds up convergence but also benefits performance. Finally, a transfer learning strategy is introduced to mitigate the overfitting caused by the scarcity of labeled facial beauty samples and to improve the performance of the proposed BeautyNet. Extensive experiments performed on LSFBD demonstrate that the proposed scheme outperforms the state-of-the-art methods, achieving 67.48% classification accuracy.
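The max-feature-map (MFM) activation named in this abstract can be sketched as follows. The sketch assumes the common formulation in which the channels are split into two halves and combined by an element-wise maximum, which halves the channel count; the abstract does not give BeautyNet's exact variant, and `max_feature_map` is a hypothetical helper name.

```python
from typing import List


def max_feature_map(channels: List[List[float]]) -> List[List[float]]:
    """Pair channel i with channel i + C/2 and keep the element-wise maximum.

    Input: C channels (C must be even), each a flat list of activations.
    Output: C/2 channels, which is why the activation also lightens the network.
    """
    if len(channels) % 2 != 0:
        raise ValueError("MFM needs an even number of channels")
    half = len(channels) // 2
    first, second = channels[:half], channels[half:]
    return [
        [max(a, b) for a, b in zip(ch_a, ch_b)]
        for ch_a, ch_b in zip(first, second)
    ]


# Four toy channels with two activations each (illustrative values only).
features = [[0.2, -1.0], [3.0, 0.5], [0.7, 0.1], [-2.0, 4.0]]
reduced = max_feature_map(features)  # two channels remain
```

Unlike ReLU, which thresholds each activation against zero, this form selects between competing channels, so the output has half as many channels as the input.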


Knowing What, How and Why: A Near Complete Solution for Aspect-Based Sentiment Analysis

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i05.6383 ◽

2020 ◽

Vol 34 (05) ◽

pp. 8600-8607

Author(s):

Haiyun Peng ◽

Lu Xu ◽

Lidong Bing ◽

Fei Huang ◽

Wei Lu ◽

...

Keyword(s):

Sentiment Analysis ◽

State Of The Art ◽

Complete Solution ◽

Unified Model ◽

Two Stage ◽

Fine Grained ◽

Aspect Extraction ◽

Second Stage ◽

Opinion Extraction ◽

Complete Story

Target-based sentiment analysis or aspect-based sentiment analysis (ABSA) refers to addressing various sentiment analysis tasks at a fine-grained level, which includes but is not limited to aspect extraction, aspect sentiment classification, and opinion extraction. There exist many solvers of the above individual subtasks or a combination of two subtasks, and they can work together to tell a complete story, i.e. the discussed aspect, the sentiment on it, and the cause of the sentiment. However, no previous ABSA research has tried to provide a complete solution in one shot. In this paper, we introduce a new subtask under ABSA, named aspect sentiment triplet extraction (ASTE). In particular, a solver of this task needs to extract triplets (What, How, Why) from the inputs, which show WHAT the targeted aspects are, HOW their sentiment polarities are and WHY they have such polarities (i.e. opinion reasons). For instance, one triplet from “Waiters are very friendly and the pasta is simply average” could be (‘Waiters’, positive, ‘friendly’). We propose a two-stage framework to address this task. The first stage predicts what, how and why in a unified model, and then the second stage pairs up the predicted what (how) and why from the first stage to output triplets. In the experiments, our framework has set a benchmark performance in this novel triplet extraction task. Meanwhile, it outperforms a few strong baselines adapted from state-of-the-art related methods.
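The (What, How, Why) triplet and the second-stage pairing step can be illustrated with a minimal sketch. The pairing rule here, matching each aspect to the nearest opinion term by token distance, is an assumption made for illustration; the abstract does not say how the second stage pairs predictions, and `Triplet`, `pair_by_distance`, and the hand-written stage-one outputs are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Triplet:
    aspect: str    # WHAT is being discussed
    polarity: str  # HOW the sentiment is oriented
    opinion: str   # WHY: the opinion term behind the polarity


def pair_by_distance(aspects: List[Tuple[int, str, str]],
                     opinions: List[Tuple[int, str]]) -> List[Triplet]:
    """Pair each (index, aspect, polarity) with the closest (index, opinion).

    Nearest-token pairing is a stand-in heuristic, not the paper's method.
    """
    triplets = []
    for a_idx, a_text, polarity in aspects:
        _, o_text = min(opinions, key=lambda o: abs(o[0] - a_idx))
        triplets.append(Triplet(a_text, polarity, o_text))
    return triplets


tokens = "Waiters are very friendly and the pasta is simply average".split()
# Hand-written stand-ins for stage-one predictions (token index, text, ...).
aspects = [(tokens.index("Waiters"), "Waiters", "positive")]
opinions = [(tokens.index("friendly"), "friendly"),
            (tokens.index("average"), "average")]
triplets = pair_by_distance(aspects, opinions)
```

For this sentence the sketch recovers the triplet given in the abstract, (‘Waiters’, positive, ‘friendly’), because "friendly" is closer to "Waiters" than "average" is.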
