Label Embedding Based on Multi-Scale Locality Preservation

Label Distribution Learning (LDL) is well suited to situations that focus on the overall distribution over the whole set of labels. The numerical labels of LDL satisfy the integrity probability constraint (the label values of an instance sum to one). Because of LDL's special label domain, existing label embedding algorithms, which focus on embedding binary labels, are unfit for LDL. This paper proposes MSLP, an approach specially designed to achieve label embedding for LDL by Multi-Scale Locality Preserving. Specifically, MSLP takes the locality information of the data in both the label space and the feature space into account at different locality granularities. By assuming an explicit mapping from the features to the embedded labels, MSLP needs no additional learning process once the embedding is complete. Moreover, MSLP is insensitive to the existence of data points that violate the smoothness assumption, which is usually caused by noise. Experimental results demonstrate the effectiveness of MSLP in preserving the locality structure of label distributions in the embedding space and show its superiority over state-of-the-art baseline methods.

Label Enhancement for Label Distribution Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2018/406 ◽ 2018 ◽ Cited By ~ 1

Author(s): Ning Xu ◽ An Tao ◽ Xin Geng

Keyword(s): Learning Process ◽ Real World ◽ Feature Space ◽ Graph Laplacian ◽ Training Set ◽ Topological Information ◽ Label Distribution Learning ◽ Real World Datasets ◽ Label Distribution ◽ Training Sets

Label distribution is more general than both single-label annotation and multi-label annotation. It covers a certain number of labels and represents the degree to which each label describes the instance. The learning process on instances labeled by label distributions is called label distribution learning (LDL). Unfortunately, many training sets contain only simple logical labels rather than label distributions, because label distributions are difficult to obtain directly. One way to solve this problem is to recover the label distributions from the logical labels in the training set by leveraging the topological information of the feature space and the correlation among the labels. This process of recovering label distributions from logical labels is defined as label enhancement (LE), which reinforces the supervision information in the training sets. This paper proposes a novel LE algorithm called Graph Laplacian Label Enhancement (GLLE). Experimental results on one artificial dataset and fourteen real-world datasets show clear advantages of GLLE over several existing LE algorithms.
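
The abstract does not state GLLE's objective, so the following is only a minimal sketch of the general idea of label enhancement. It uses plain label propagation over a k-nearest-neighbour graph in the feature space, not GLLE itself. The function name `enhance_labels` and the parameters `k`, `alpha`, and `n_iter` are illustrative assumptions.

```python
import numpy as np

def enhance_labels(X, L, k=2, alpha=0.5, n_iter=50):
    """Turn 0/1 logical labels into label distributions by label propagation.

    NOT GLLE: a generic label-propagation sketch (assumption).
    X: (n, d) features. L: (n, c) logical labels, at least one 1 per row.
    Requires k <= n - 1.
    """
    X = np.asarray(X, dtype=float)
    L = np.asarray(L, dtype=float)
    n = X.shape[0]
    # Pairwise squared Euclidean distances in the feature space.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    np.fill_diagonal(d2, np.inf)  # exclude self-neighbours
    # Keep the k nearest neighbours of each instance, with Gaussian weights.
    idx = np.argsort(d2, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    sigma2 = np.mean(d2[rows, cols]) + 1e-12
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / sigma2)
    W = np.maximum(W, W.T)                # symmetrise the graph
    P = W / W.sum(axis=1, keepdims=True)  # row-stochastic transition matrix
    # Propagate label mass along the graph while staying close to L.
    F = L.copy()
    for _ in range(n_iter):
        F = alpha * (P @ F) + (1 - alpha) * L
    # Normalise each row so that it is a valid label distribution.
    return F / F.sum(axis=1, keepdims=True)
```

Each row of the result is non-negative and sums to one, so an instance whose neighbours carry a label receives a non-zero description degree for that label even if its own logical label was 0.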

Dynamic Hypergraph Structure Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽ 10.24963/ijcai.2018/439 ◽ 2018 ◽ Cited By ~ 2

Author(s): Zizhao Zhang ◽ Haojie Lin ◽ Yue Gao

Keyword(s): Learning Process ◽ Structure Learning ◽ State Of The Art ◽ Feature Space ◽ Projection Matrix ◽ Data Correlation ◽ Common Task ◽ Hypergraph Learning ◽ The Common ◽ Public Datasets

In recent years, hypergraph modeling has shown its superiority in formulating correlations among samples and has found wide application in classification, retrieval, and other tasks. In all these works, the performance of hypergraph learning depends heavily on the generated hypergraph structure: a good hypergraph structure represents the data correlation better, and a poor one represents it worse. Although hypergraph learning has attracted much attention recently, most existing works still rely on a static hypergraph structure, and little effort has gone into optimizing the hypergraph structure during the learning process. To tackle this problem, we propose a dynamic hypergraph structure learning method in this paper. Given the originally generated hypergraph structure, the objective is to simultaneously optimize the label projection matrix (the common task in hypergraph learning) and the hypergraph structure itself. More specifically, in this formulation the label projection matrix is related to the hypergraph structure, and the hypergraph structure is associated with the data correlation from both the label space and the feature space. We alternately learn the optimal label projection matrix and the hypergraph structure, which yields a dynamic hypergraph structure during the learning process. We have applied the proposed method to the tasks of 3D shape recognition and gesture recognition. Experimental results on 4 public datasets show better performance than the state-of-the-art methods. We note that the proposed method can be further applied to other tasks.
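
For orientation, here is a minimal sketch of the kind of static hypergraph structure that the abstract contrasts with its dynamic method. It assumes the common k-nearest-neighbour hyperedge construction and the standard normalised hypergraph Laplacian, I - Dv^(-1/2) H W De^(-1) H^T Dv^(-1/2). It does not implement the proposed structure-learning step, and the function names `knn_hypergraph` and `hypergraph_laplacian` are illustrative assumptions.

```python
import numpy as np

def knn_hypergraph(X, k=2):
    """Incidence matrix H (n vertices x n hyperedges).

    Hyperedge j connects sample j with its k nearest neighbours
    (an assumed construction; the abstract does not specify one).
    """
    X = np.asarray(X, dtype=float)
    n = len(X)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    H = np.zeros((n, n))
    for j in range(n):
        d = d2[j].copy()
        d[j] = -1.0                       # make sure j itself comes first
        members = np.argsort(d)[: k + 1]  # j plus its k nearest neighbours
        H[members, j] = 1.0
    return H

def hypergraph_laplacian(H, w=None):
    """Normalised hypergraph Laplacian for incidence matrix H and edge weights w."""
    H = np.asarray(H, dtype=float)
    n_vertices, n_edges = H.shape
    w = np.ones(n_edges) if w is None else np.asarray(w, dtype=float)
    dv = H @ w          # vertex degrees (weighted count of incident hyperedges)
    de = H.sum(axis=0)  # hyperedge degrees (number of vertices per hyperedge)
    dv_inv_sqrt = np.diag(1.0 / np.sqrt(dv))
    theta = dv_inv_sqrt @ H @ np.diag(w / de) @ H.T @ dv_inv_sqrt
    return np.eye(n_vertices) - theta
```

In a dynamic scheme such as the one described above, the incidence matrix would be updated during learning instead of being fixed once by `knn_hypergraph`.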

A multi-scale and multi-physics simulation methodology with the state-of-the-art tools for safety analysis in light water reactors applied to a turbine trip scenario (PART I)

Nuclear Engineering and Design ◽ 10.1016/j.nucengdes.2019.05.008 ◽ 2019 ◽ Vol 350 ◽ pp. 195-204 ◽ Cited By ~ 1

Author(s): Patricio Hidalga ◽ Agustín Abarca ◽ Rafael Miró ◽ Abdelkrim Sekrhi ◽ Gumersindo Verdú

Keyword(s): State Of The Art ◽ Safety Analysis ◽ The State ◽ Light Water Reactors ◽ Light Water ◽ Simulation Methodology ◽ Multi Scale ◽ Physics Simulation ◽ Water Reactors

Ensemble-Based Out-of-Distribution Detection

Electronics ◽ 10.3390/electronics10050567 ◽ 2021 ◽ Vol 10 (5) ◽ pp. 567

Author(s): Donghun Yang ◽ Kien Mai Mai Ngoc ◽ Iksoo Shin ◽ Kyong-Ha Lee ◽ Myunggwon Hwang

Keyword(s): Detection Method ◽ State Of The Art ◽ Metric Learning ◽ Feature Space ◽ Confidence Score ◽ Distance Metric Learning ◽ Current State ◽ Overall Performance ◽ Deep Learning Model

To design an efficient deep learning model that can be used in the real world, it is important to detect out-of-distribution (OOD) data well. Various studies have been conducted to solve the OOD problem. The current state-of-the-art approach uses a confidence score based on the Mahalanobis distance in a feature space. Although it outperformed the previous approaches, its results were sensitive to the quality of the trained model and the complexity of the dataset. Herein, we propose a novel OOD detection method that can train a more efficient feature space for OOD detection. The proposed method uses an ensemble of the features trained using a softmax-based classifier and a network based on distance metric learning (DML). Through the complementary interaction of these two networks, the trained feature space has a more clumped distribution and fits a class-wise Gaussian distribution well. Therefore, OOD data can be efficiently detected by setting a threshold in the trained feature space. To evaluate the proposed method, we applied it to various combinations of image datasets. The results show that the overall performance of the proposed approach is superior to that of other methods, including the state-of-the-art approach, on every combination of datasets.
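
The Mahalanobis-distance confidence score mentioned above can be illustrated with a short sketch. It assumes class-conditional Gaussians with a shared covariance fitted on feature vectors that have already been extracted; it does not implement the proposed softmax/DML ensemble. The names `fit_gaussians` and `mahalanobis_score` and the threshold in the usage note are illustrative assumptions.

```python
import numpy as np

def fit_gaussians(feats, labels):
    """Estimate per-class means and the inverse of a shared covariance.

    feats: (n, d) feature vectors. labels: (n,) class labels.
    Returns (means, precision) with shapes (num_classes, d) and (d, d).
    """
    feats = np.asarray(feats, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)  # sorted unique class labels
    means = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    # Centre every sample on its own class mean before pooling the covariance.
    centred = feats - means[np.searchsorted(classes, labels)]
    cov = centred.T @ centred / len(feats)
    precision = np.linalg.pinv(cov)  # pseudo-inverse tolerates singular cov
    return means, precision

def mahalanobis_score(x, means, precision):
    """Confidence score: negative squared Mahalanobis distance to the closest class mean."""
    diff = np.asarray(x, dtype=float) - means  # (num_classes, d)
    d2 = np.einsum("cd,de,ce->c", diff, precision, diff)
    return -d2.min()
```

A sample would then be flagged as OOD when `mahalanobis_score(x, means, precision)` falls below a threshold chosen on held-out data, for example the score at a fixed true-positive rate on in-distribution samples.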

PCAN—Part-Based Context Attention Network for Thermal Power Plant Detection in Remote Sensing Imagery

Remote Sensing ◽ 10.3390/rs13071243 ◽ 2021 ◽ Vol 13 (7) ◽ pp. 1243

Author(s): Wenxin Yin ◽ Wenhui Diao ◽ Peijin Wang ◽ Xin Gao ◽ Ya Li ◽ ...

Keyword(s): Remote Sensing ◽ Power Plants ◽ State Of The Art ◽ Thermal Power ◽ Image Interpretation ◽ Remote Sensing Image ◽ Thermal Power Plants ◽ Average Precision ◽ Deep Convolutional Neural Networks ◽ Multi Scale

The detection of Thermal Power Plants (TPPs) is a meaningful task for remote sensing image interpretation. It is also a challenging one because, as facility objects, TPPs are composed of various distinctive and irregular components. In this paper, we propose a novel end-to-end detection framework for TPPs based on deep convolutional neural networks. Specifically, building on the RetinaNet one-stage detector, we propose a context attention multi-scale feature extraction network that fuses global spatial attention to strengthen the ability to represent irregular objects. In addition, we design a part-based attention module to adapt to TPPs containing distinctive components. Experiments show that the proposed method outperforms the state-of-the-art methods and achieves 68.15% mean average precision.

EfficientPS: Efficient Panoptic Segmentation

International Journal of Computer Vision ◽ 10.1007/s11263-021-01445-z ◽ 2021

Author(s): Rohit Mohan ◽ Abhinav Valada

Keyword(s): State Of The Art ◽ Autonomous Robot ◽ Multi Scale ◽ Contextual Features ◽ New Variant ◽ Competent Functioning ◽ Segmentation Task

Understanding the scene in which an autonomous robot operates is critical for its competent functioning. Such scene comprehension necessitates recognizing instances of traffic participants along with general scene semantics which can be effectively addressed by the panoptic segmentation task. In this paper, we introduce the Efficient Panoptic Segmentation (EfficientPS) architecture that consists of a shared backbone which efficiently encodes and fuses semantically rich multi-scale features. We incorporate a new semantic head that aggregates fine and contextual features coherently and a new variant of Mask R-CNN as the instance head. We also propose a novel panoptic fusion module that congruously integrates the output logits from both the heads of our EfficientPS architecture to yield the final panoptic segmentation output. Additionally, we introduce the KITTI panoptic segmentation dataset that contains panoptic annotations for the popularly challenging KITTI benchmark. Extensive evaluations on Cityscapes, KITTI, Mapillary Vistas and Indian Driving Dataset demonstrate that our proposed architecture consistently sets the new state-of-the-art on all these four benchmarks while being the most efficient and fast panoptic segmentation architecture to date.

Representing Deep Neural Networks Latent Space Geometries with Graphs

Algorithms ◽ 10.3390/a14020039 ◽ 2021 ◽ Vol 14 (2) ◽ pp. 39

Author(s): Carlos Lassance ◽ Vincent Gripon ◽ Antonio Ortega

Keyword(s): Machine Learning ◽ Neural Networks ◽ Deep Learning ◽ Objective Function ◽ Learning Process ◽ Deep Neural Networks ◽ State Of The Art ◽ The Core ◽ Learning Tasks ◽ Latent Space

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when a batch of inputs is processed concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the following three problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
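
As a rough illustration of building a similarity graph from a batch of intermediate representations: the abstract does not specify the similarity measure or how the graph is sparsified, so cosine similarity and k-nearest-neighbour sparsification are assumptions here, and `latent_geometry_graph` is a hypothetical name.

```python
import numpy as np

def latent_geometry_graph(H, k=2):
    """Build a symmetric k-NN cosine-similarity graph over a batch.

    H: (batch, ...) intermediate representations taken from one layer.
    Returns a (batch, batch) weighted adjacency matrix with a zero diagonal.
    Requires k <= batch - 1. Edges with non-positive similarity are dropped.
    """
    H = np.asarray(H, dtype=float)
    H = H.reshape(H.shape[0], -1)  # flatten each sample to a vector
    norms = np.linalg.norm(H, axis=1, keepdims=True)
    Hn = H / np.maximum(norms, 1e-12)
    S = Hn @ Hn.T                  # pairwise cosine similarities
    np.fill_diagonal(S, -np.inf)   # never pick a sample as its own neighbour
    nn = np.argsort(-S, axis=1)[:, :k]
    rows = np.repeat(np.arange(H.shape[0]), k)
    cols = nn.ravel()
    A = np.zeros_like(S)
    A[rows, cols] = S[rows, cols]
    return np.maximum(A, A.T)      # symmetrise; zero entries mean "no edge"
```

Graphs built this way from the same batch at two different layers, or from a teacher and a student network, have the same shape, so they can be compared entry by entry, for instance with a Frobenius norm of their difference.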

Hybrid Graph Neural Networks for Crowd Counting

Proceedings of the AAAI Conference on Artificial Intelligence ◽ 10.1609/aaai.v34i07.6839 ◽ 2020 ◽ Vol 34 (07) ◽ pp. 11693-11700 ◽ Cited By ~ 2

Author(s): Ao Luo ◽ Fan Yang ◽ Xin Li ◽ Dong Nie ◽ Zhicheng Jiao ◽ ...

Keyword(s): Network Architecture ◽ Message Passing ◽ Large Scale ◽ State Of The Art ◽ Density Variation ◽ Feature Maps ◽ Crowd Counting ◽ Multi Scale ◽ Crowd Density ◽ Graph Neural Networks

Crowd counting is an important yet challenging task because of the large variation in scale and density. Recent investigations have shown that distilling rich relations among multi-scale features and exploiting useful information from the auxiliary task, i.e., localization, are vital for this task. Nevertheless, how to comprehensively leverage these relations within a unified network architecture is still a challenging problem. In this paper, we present a novel network structure called Hybrid Graph Neural Network (HyGnn), which aims to relieve the problem by interweaving the multi-scale features for crowd density and its auxiliary task (localization) and performing joint reasoning over a graph. Specifically, HyGnn builds a hybrid graph that represents the task-specific feature maps of different scales as nodes and two types of relations as edges: (i) multi-scale relations, which capture the feature dependencies across scales, and (ii) mutually beneficial relations, which build bridges for the cooperation between counting and localization. Thus, through message passing, HyGnn can capture and distill richer relations between nodes to obtain more powerful representations, providing robust and accurate results. HyGnn performs well on four challenging datasets (ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50 and UCF_QNRF), outperforming the state-of-the-art algorithms by a large margin.
