Semantic Locality-Aware Deformable Network for Clothing Segmentation

Author(s):  
Wei Ji ◽  
Xi Li ◽  
Yueting Zhuang ◽  
Omar El Farouk Bourahla ◽  
Yixin Ji ◽  
...  

Clothing segmentation is a challenging vision problem typically addressed within a fine-grained semantic segmentation framework. Unlike conventional segmentation, clothing segmentation has domain-specific properties such as texture richness, diverse appearance variations, non-rigid geometric deformations, and small-sample learning. To address these properties, we propose a semantic locality-aware segmentation model, which adaptively pairs an original clothing image with a semantically similar (e.g., in appearance or pose) auxiliary exemplar retrieved by search. By considering the interactions between the clothing image and its exemplar, more intrinsic knowledge about the locality manifold structure of clothing images is discovered, making learning under the small-sample problem more stable and tractable. Furthermore, we present a CNN model based on deformable convolutions to extract non-rigid geometry-aware features from clothing images. Experimental results demonstrate the effectiveness of the proposed model against state-of-the-art approaches.
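The deformable convolutions mentioned in the abstract can be illustrated with a minimal sketch: each kernel tap samples the input at a learned fractional offset, recovered via bilinear interpolation, so the sampling grid can bend to follow non-rigid garment deformations. The helpers below are a plain-Python illustration, not the paper's implementation; all names and parameters are hypothetical:

```python
import math

def bilinear_sample(img, y, x):
    """Bilinearly interpolate img (a list of rows) at fractional (y, x)."""
    h, w = len(img), len(img[0])
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = y0 + 1, x0 + 1

    def px(r, c):
        # Zero-pad outside the image.
        return img[r][c] if 0 <= r < h and 0 <= c < w else 0.0

    wy, wx = y - y0, x - x0
    return (px(y0, x0) * (1 - wy) * (1 - wx) +
            px(y0, x1) * (1 - wy) * wx +
            px(y1, x0) * wy * (1 - wx) +
            px(y1, x1) * wy * wx)

def deformable_conv_at(img, weights, offsets, cy, cx):
    """One output value of a 3x3 deformable convolution centred at (cy, cx).

    `offsets` holds a learned (dy, dx) for each of the 9 kernel taps; with
    all offsets zero this reduces to an ordinary 3x3 convolution.
    """
    taps = [(i, j) for i in (-1, 0, 1) for j in (-1, 0, 1)]
    out = 0.0
    for k, (ky, kx) in enumerate(taps):
        dy, dx = offsets[k]
        out += weights[k] * bilinear_sample(img, cy + ky + dy, cx + kx + dx)
    return out
```

With zero offsets the result matches a regular convolution, which is a useful sanity check when wiring the offset branch into a network.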

2020 ◽  
pp. 1-24
Author(s):  
Dequan Jin ◽  
Ziyan Qin ◽  
Murong Yang ◽  
Penghe Chen

We propose a novel neural model with lateral interaction for learning tasks. The model consists of two functional fields: an elementary field that extracts features and a high-level field that stores and recognizes patterns. Each field is composed of neurons with lateral interaction, and neurons in different fields are connected by rules of synaptic plasticity. The model is grounded in current research in cognition and neuroscience, making it more transparent and biologically explainable. We apply the proposed model to data classification and clustering. The corresponding algorithms share similar processes without requiring any parameter tuning or optimization. Numerical experiments validate that the proposed model is feasible across different learning tasks and superior to several state-of-the-art methods, especially in small-sample learning, one-shot learning, and clustering.
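The abstract does not give the field equations, but lateral interaction in neural fields is commonly modeled with a difference-of-Gaussians ("Mexican hat") kernel: short-range excitation, longer-range inhibition. The sketch below shows one Euler step of such a 1-D field; all parameter values are chosen for illustration only:

```python
import math

def lateral_kernel(d, sigma_e=1.0, sigma_i=3.0, a_e=1.0, a_i=0.5):
    """Difference-of-Gaussians lateral interaction as a function of
    neuron distance d: positive (excitatory) near zero, negative
    (inhibitory) farther away."""
    return (a_e * math.exp(-d * d / (2 * sigma_e ** 2)) -
            a_i * math.exp(-d * d / (2 * sigma_i ** 2)))

def lateral_step(activity, dt=0.1):
    """One Euler step of a 1-D neural field with lateral interaction:
    each neuron decays toward zero and receives weighted input from
    every other neuron through the kernel."""
    n = len(activity)
    new = []
    for i in range(n):
        lateral = sum(lateral_kernel(i - j) * activity[j] for j in range(n))
        new.append(activity[i] + dt * (-activity[i] + lateral))
    return new
```

The net effect is that an isolated activity bump reinforces itself while suppressing its surround, which is the mechanism such fields use to store and sharpen patterns.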


2020 ◽  
Author(s):  
Lin Cao ◽  
Xibao Huo ◽  
Yanan Guo ◽  
Yuying Shao ◽  
Kangning Du

Abstract: Face photo-sketch recognition refers to the process of matching sketches to photos. Recently, there has been growing interest in using convolutional neural networks to learn discriminative deep features. However, due to the large domain discrepancy and the high cost of acquiring sketches, the discriminative power of the deeply learned features is inevitably reduced. In this paper, we propose a discriminative center loss to learn domain-invariant features for face photo-sketch recognition. Specifically, two Mahalanobis distance matrices are introduced to enhance intra-class compactness while promoting inter-class separability. Moreover, a regularization technique is applied to the Mahalanobis matrices to alleviate the small-sample problem. Extensive experimental results on the e-PRIP dataset verify the effectiveness of the proposed discriminative center loss.
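The exact loss is not reproduced in the abstract. As a hedged sketch, a center loss under a learned Mahalanobis metric pulls each feature toward its class center via (x - c)^T M (x - c); the paired inter-class term (a second matrix pushing different centers apart) is omitted here for brevity, and all names are illustrative:

```python
def mahalanobis_sq(x, c, M):
    """Squared Mahalanobis distance (x - c)^T M (x - c) between a
    feature vector x and a class centre c under metric matrix M."""
    d = [xi - ci for xi, ci in zip(x, c)]
    n = len(d)
    return sum(d[i] * M[i][j] * d[j] for i in range(n) for j in range(n))

def discriminative_center_loss(features, labels, centers, M_intra):
    """Intra-class compactness term of a centre loss: the average learned
    distance from each feature to the centre of its own class.  With
    M_intra = identity this reduces to the classic (Euclidean) centre loss;
    regularizing M_intra (e.g. M + lambda * I) keeps it well-conditioned
    when samples are scarce."""
    return sum(mahalanobis_sq(f, centers[y], M_intra)
               for f, y in zip(features, labels)) / len(features)
```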


Author(s):  
Yunhui Guo ◽  
Yandong Li ◽  
Liqiang Wang ◽  
Tajana Rosing

There is growing interest in designing models that can deal with images from different visual domains. If there exists a universal structure across visual domains that can be captured via a common parameterization, then we can use a single model for all domains rather than one model per domain. A model aware of the relationships between domains can also be trained to work on new domains with fewer resources. However, identifying the reusable structure in a model is not easy. In this paper, we propose a multi-domain learning architecture based on depthwise separable convolution. The proposed approach rests on the assumption that images from different domains share cross-channel correlations but have domain-specific spatial correlations. The proposed model is compact and has minimal overhead when applied to new domains. Additionally, we introduce a gating mechanism to promote soft sharing between domains. We evaluate our approach on the Visual Decathlon Challenge, a benchmark for testing the ability of multi-domain models. The experiments show that our approach achieves the highest score while requiring only 50% of the parameters compared with state-of-the-art approaches.
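The parameter savings behind this design are easy to make concrete. A k x k standard convolution costs k*k*C_in*C_out parameters, while a depthwise separable one costs k*k*C_in (the per-channel spatial part) plus C_in*C_out (the pointwise cross-channel part). Under the abstract's assumption, the pointwise bank can be shared across domains and only the cheap depthwise filters duplicated. The bookkeeping below is an illustration of that accounting, not the paper's exact scheme:

```python
def standard_conv_params(k, c_in, c_out):
    """Parameters of a standard k x k convolution."""
    return k * k * c_in * c_out

def separable_conv_params(k, c_in, c_out):
    """Depthwise (k*k per input channel) + pointwise (1x1 channel mixing)."""
    return k * k * c_in + c_in * c_out

def multi_domain_params(k, c_in, c_out, n_domains):
    """Share the pointwise filters across all domains (cross-channel
    correlations assumed universal); keep only the cheap depthwise
    spatial filters per domain."""
    shared = c_in * c_out                  # one pointwise bank for everyone
    per_domain = k * k * c_in * n_domains  # domain-specific spatial filters
    return shared + per_domain
```

For a 3x3 layer with 64 input and 128 output channels, ten separate separable convolutions would cost 87,680 parameters, while sharing the pointwise bank brings ten domains down to 13,952.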


Electronics ◽  
2020 ◽  
Vol 9 (3) ◽  
pp. 503 ◽  
Author(s):  
Nicola Altini ◽  
Giacomo Donato Cascarano ◽  
Antonio Brunetti ◽  
Francescomaria Marino ◽  
Maria Teresa Rocchetti ◽  
...  

The evaluation of kidney biopsies by expert pathologists is a crucial process for assessing whether a kidney is eligible for transplantation. An important step in this evaluation is the quantification of global glomerulosclerosis, the ratio between sclerotic glomeruli and the overall number of glomeruli. Since there is a shortage of organs available for transplantation, a quick and accurate assessment of global glomerulosclerosis is essential for retaining the largest number of eligible kidneys. In the present paper, the authors introduce a Computer-Aided Diagnosis (CAD) system to assess global glomerulosclerosis. The proposed tool is based on Convolutional Neural Networks (CNNs); in particular, the authors considered approaches based on semantic segmentation networks such as SegNet and DeepLab v3+. The dataset, provided by the Department of Emergency and Organ Transplantations (DETO) of Bari University Hospital, is composed of 26 kidney biopsies from 19 donors and contains 2344 non-sclerotic glomeruli and 428 sclerotic glomeruli. The proposed model achieves promising results in automatically detecting and classifying glomeruli, thus easing the burden on pathologists. We achieve high performance both at the pixel level, with a mean F-score higher than 0.81 and a Weighted Intersection over Union (IoU) higher than 0.97 for both the SegNet and DeepLab v3+ approaches, and at the object-detection level, with best F-scores of 0.924 for non-sclerotic glomeruli and 0.730 for sclerotic glomeruli.
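The two quantities this evaluation rests on, the global glomerulosclerosis ratio and the object-level F-score, reduce to simple formulas. The sketch below uses the standard definitions, not code from the paper:

```python
def global_glomerulosclerosis(n_sclerotic, n_non_sclerotic):
    """Ratio of sclerotic glomeruli to all detected glomeruli."""
    total = n_sclerotic + n_non_sclerotic
    return n_sclerotic / total if total else 0.0

def f_score(tp, fp, fn):
    """F1 score: harmonic mean of precision and recall, as used for the
    object-level detection evaluation (tp/fp/fn counted per glomerulus)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

On the dataset counts quoted above (428 sclerotic, 2344 non-sclerotic), the dataset-level sclerosis ratio is about 15.4%.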


2021 ◽  
Vol 13 (12) ◽  
pp. 2268
Author(s):  
Hang Gong ◽  
Qiuxia Li ◽  
Chunlai Li ◽  
Haishan Dai ◽  
Zhiping He ◽  
...  

Hyperspectral images are widely used for classification due to their rich spectral information combined with spatial information. To handle the high dimensionality and high nonlinearity of hyperspectral images, deep learning methods based on convolutional neural networks (CNNs) are widely used in hyperspectral classification. However, most CNN structures are stacked vertically and use a single size of convolutional kernel or pooling layer, which cannot fully mine the multiscale information in hyperspectral images. When such networks meet the practical challenge of a limited labeled hyperspectral dataset (the "small sample problem"), classification accuracy and generalization ability are limited. In this paper, to tackle the small sample problem, we adapt semantic segmentation to pixel-level hyperspectral classification, given the comparability of the two tasks. A lightweight multiscale squeeze-and-excitation pyramid pooling network (MSPN) is proposed. It consists of a multiscale 3D CNN module, a squeeze-and-excitation module, and a pyramid pooling module with 2D CNNs. Such a hybrid 2D-3D-CNN MSPN framework can learn and fuse deeper hierarchical spatial–spectral features with fewer training samples. The proposed MSPN was tested on three publicly available hyperspectral classification datasets: Indian Pines, Salinas, and Pavia University. Using 5%, 0.5%, and 0.5% of the training samples of the three datasets, the classification accuracies of the MSPN were 96.09%, 97%, and 96.56%, respectively. In addition, we selected the latest dataset with higher spatial resolution, WHU-Hi-LongKou, as a further challenge. Using only 0.1% of the training samples, we achieved a 97.31% classification accuracy, far superior to state-of-the-art hyperspectral classification methods.
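The squeeze-and-excitation module in the MSPN can be sketched independently of the 3D CNN around it: global average pooling "squeezes" each channel to a scalar, a small two-layer bottleneck ending in a sigmoid produces per-channel gates, and the gates rescale the channels. The toy dense weights below are placeholders, not trained values:

```python
import math

def squeeze_excite(feature_maps, w1, w2):
    """Squeeze-and-excitation over a list of 2-D channel feature maps.

    w1 (hidden x C) and w2 (C x hidden) are the bottleneck's dense
    weights; ReLU after the first layer, sigmoid after the second.
    """
    # Squeeze: global average pool each channel to one scalar.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
         for ch in feature_maps]
    # Excite: bottleneck ReLU layer, then sigmoid gates, one per channel.
    hidden = [max(0.0, sum(wij * zj for wij, zj in zip(wi, z))) for wi in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(wij * hj
                                        for wij, hj in zip(wi, hidden))))
             for wi in w2]
    # Scale: reweight every pixel of each channel by its gate.
    return [[[g * v for v in row] for row in ch]
            for g, ch in zip(gates, feature_maps)]
```

The module adds only two small dense layers per block, which is what keeps the MSPN lightweight while letting it emphasize informative spectral channels.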


2021 ◽  
Author(s):  
Yin Guo ◽  
Limin Li

Two-sample independent tests are widely used in case-control studies to identify significant changes or differences, for example, to identify key pathogenic genes by comparing gene expression levels in normal and disease cells. However, due to the high cost of data collection or labelling, many studies face the small-sample problem, for which traditional two-sample tests often lose power. We propose a novel rank-based nonparametric test, WMW-A, for the small-sample problem by introducing a three-sample statistic through an auxiliary sample. Combining the case, control and auxiliary samples, we construct a three-sample WMW-A statistic based on the gap between the average ranks of the case and control samples within the combined sample. By assuming that the auxiliary sample follows a mixture of the case and control distributions, we analyze the theoretical properties of the WMW-A statistic and approximate its theoretical power. Extensive simulation experiments and real applications on microarray gene expression datasets show that the WMW-A test can significantly improve test power for two-sample problems with small sample sizes, using either available unlabelled auxiliary data or generated auxiliary data.
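The core of the statistic, as described in the abstract, is the gap between the average ranks of the case and control samples after pooling them with the auxiliary sample. A sketch using mid-ranks for ties (the paper's exact normalisation may differ):

```python
def ranks(values):
    """Mid-ranks (1-based, ties averaged) of each value in the list."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def wmw_a(case, control, auxiliary):
    """Unnormalised WMW-A-style statistic: the gap between the average
    ranks of the case and control samples inside the pooled
    case + control + auxiliary sample."""
    pooled = list(case) + list(control) + list(auxiliary)
    r = ranks(pooled)
    n1, n2 = len(case), len(control)
    mean_case = sum(r[:n1]) / n1
    mean_control = sum(r[n1:n1 + n2]) / n2
    return mean_case - mean_control
```

With an empty auxiliary sample this reduces to the rank gap of the classical Wilcoxon-Mann-Whitney test; the auxiliary observations refine the rank positions of both labelled groups.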


Author(s):  
Xue Lin ◽  
Qi Zou ◽  
Xixia Xu

Human-object interaction (HOI) detection is important for understanding human-centric scenes and is challenging due to subtle differences between fine-grained actions and multiple co-occurring interactions. Most approaches tackle these problems by considering multi-stream information, sometimes introducing extra knowledge, and suffer from a huge combination space and the non-interactive pair domination problem. In this paper, we propose an Action-Guided attention mining and Relation Reasoning (AGRR) network to solve these problems. Relation reasoning on human-object pairs is performed by exploiting contextual compatibility consistency among pairs to filter out non-interactive combinations. To better discriminate the subtle differences between fine-grained actions, an action-aware attention mechanism based on class activation maps is proposed to mine the features most relevant for recognizing HOIs. Extensive experiments on the V-COCO and HICO-DET datasets demonstrate the effectiveness of the proposed model compared with state-of-the-art approaches.
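The class activation map underlying such action-aware attention is a weighted sum of the final feature maps, using the classifier weights of the action class of interest; high values mark the regions most responsible for that prediction. A minimal illustration, not the AGRR implementation:

```python
def class_activation_map(feature_maps, class_weights):
    """CAM for one action class: sum the channel feature maps weighted by
    that class's classifier weights.  feature_maps is a list of 2-D
    channels; class_weights has one scalar per channel."""
    h, w = len(feature_maps[0]), len(feature_maps[0][0])
    cam = [[0.0] * w for _ in range(h)]
    for wc, ch in zip(class_weights, feature_maps):
        for i in range(h):
            for j in range(w):
                cam[i][j] += wc * ch[i][j]
    return cam
```

The resulting map can then serve as an attention mask: spatial locations with large CAM values are the ones worth mining features from when discriminating fine-grained actions.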


Author(s):  
G. Bellitto ◽  
F. Proietto Salanitri ◽  
S. Palazzo ◽  
F. Rundo ◽  
D. Giordano ◽  
...  

Abstract: In this work, we propose a 3D fully convolutional architecture for video saliency prediction that employs hierarchical supervision on intermediate maps (referred to as conspicuity maps) generated from features extracted at different abstraction levels. We augment the base hierarchical learning mechanism with two techniques, one for domain adaptation and one for domain-specific learning. For the former, we encourage the model to learn hierarchical general features in an unsupervised manner, using gradient reversal at multiple scales, to enhance generalization on datasets for which no annotations are provided during training. For domain specialization, we employ domain-specific operations (namely priors, smoothing and batch normalization), specializing the learned features on individual datasets to maximize performance. Our experiments show that the proposed model yields state-of-the-art accuracy on supervised saliency prediction. When the base hierarchical model is augmented with domain-specific modules, performance improves further, outperforming state-of-the-art models on three out of five metrics on the DHF1K benchmark and reaching the second-best results on the other two. When we instead test it in an unsupervised domain adaptation setting, enabling hierarchical gradient reversal layers, we obtain performance comparable to the supervised state of the art. Source code, trained models and example outputs are publicly available at https://github.com/perceivelab/hd2s.
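The gradient reversal layer used for unsupervised domain adaptation is an identity map on the forward pass whose backward pass negates (and scales) the incoming gradient, so the shared features are trained to confuse a domain discriminator. A framework-free sketch with a hand-written backward in place of autograd:

```python
class GradReverse:
    """Gradient reversal layer: identity forward, negated and scaled
    gradient backward.  Placing one between a feature extractor and a
    domain classifier makes the extractor maximize the domain loss the
    classifier is minimizing, pushing features toward domain invariance."""

    def __init__(self, lam=1.0):
        self.lam = lam  # trade-off between task loss and domain confusion

    def forward(self, x):
        return x  # pass features through unchanged

    def backward(self, upstream_grad):
        # Flip the sign (and scale by lambda) of the gradient flowing
        # back into the feature extractor.
        return [-self.lam * g for g in upstream_grad]
```

In an autograd framework the same behaviour is implemented as a custom function whose backward returns the negated gradient; the hierarchical variant in the paper inserts one such layer at each scale of the conspicuity hierarchy.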

