Label Distribution for Learning with Noisy Labels

The performance of deep neural networks (DNNs) relies crucially on label quality. In some situations labels are easily corrupted and become noisy. Designing algorithms that cope with noisy labels is therefore of great importance for learning robust DNNs. However, it is difficult to distinguish clean labels from noisy ones, and this is the bottleneck of many methods. To address the problem, this paper proposes a novel method named Label Distribution based Confidence Estimation (LDCE). LDCE estimates the confidence of the observed labels based on the label distribution, so that the confidence scores make the boundary between clean and noisy labels clear. To verify the effectiveness of the method, LDCE is combined with an existing learning algorithm to train robust DNNs. Experiments on both synthetic and real-world datasets substantiate the superiority of the proposed algorithm over state-of-the-art methods.


Structured Probabilistic End-to-End Learning from Crowds

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/210 ◽

2020 ◽

Author(s):

Zhijun Chen ◽

Huimin Wang ◽

Hailong Sun ◽

Pengpeng Chen ◽

Tao Han ◽

...

Keyword(s):

Neural Networks ◽

Real World ◽

Probabilistic Model ◽

Deep Neural Networks ◽

State Of The Art ◽

Probabilistic Approach ◽

Probabilistic Interpretation ◽

End To End ◽

Real World Datasets ◽

The Relationship

End-to-end learning from crowds has recently been introduced as an EM-free approach to training deep neural networks directly from noisy crowdsourced annotations. It models the relationship between true labels and annotations with a specific type of neural layer, termed the crowd layer, which can be trained using pure backpropagation. Parameters of the crowd layer, however, can hardly be interpreted as annotator reliability, in contrast to the more principled probabilistic approach. The lack of a probabilistic interpretation further prevents extensions of the approach to account for important factors of annotation processes, e.g., instance difficulty. This paper presents SpeeLFC, a structured probabilistic model that imposes the constraints of the probability axioms on the parameters of the crowd layer, which allows annotator reliability to be modeled explicitly while benefiting from the end-to-end training of neural networks. Moreover, we propose SpeeLFC-D, which further takes instance difficulty into account. Extensive validation on real-world datasets shows that our methods improve on the state of the art.
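
As a rough illustration of what it can mean for crowd-layer parameters to obey the probability axioms, the sketch below passes each row of an annotator's parameter matrix through a softmax, so that every row becomes a valid distribution P(annotation | true label). The function names, the softmax parameterization, and the toy numbers are assumptions made for illustration; the abstract does not specify how SpeeLFC enforces the constraints.

```python
import math

def softmax(row):
    """Map unconstrained scores to a probability vector."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def confusion_matrix(logits):
    """Row-stochastic matrix: entry [k][j] stands for P(annotation = j | true label = k)."""
    return [softmax(row) for row in logits]

def annotation_distribution(class_probs, logits):
    """P(annotation = j) = sum_k P(true = k) * P(annotation = j | true = k)."""
    cm = confusion_matrix(logits)
    n_labels = len(cm[0])
    return [sum(class_probs[k] * cm[k][j] for k in range(len(class_probs)))
            for j in range(n_labels)]

# Hypothetical annotator who is mostly reliable on a 3-class task.
annotator_logits = [[2.0, 0.0, 0.0],
                    [0.0, 2.0, 0.0],
                    [0.0, 0.0, 2.0]]
classifier_output = [0.7, 0.2, 0.1]  # assumed softmax output of the base network
predicted_annotation = annotation_distribution(classifier_output, annotator_logits)
```

Because each row sums to one, the diagonal of the matrix can be read directly as the annotator's per-class accuracy, which is the kind of interpretability an unconstrained weight matrix lacks.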


Direct Quantization for Training Highly Accurate Low Bit-width Deep Neural Networks

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/292 ◽

2020 ◽

Author(s):

Tuan Hoang ◽

Thanh-Toan Do ◽

Tam V. Nguyen ◽

Ngai-Man Cheung

Keyword(s):

Neural Networks ◽

Cost Function ◽

Image Classification ◽

Convolutional Neural Networks ◽

Gradient Descent ◽

Deep Neural Networks ◽

State Of The Art ◽

Deep Convolutional Neural Networks ◽

Novel Method ◽

The Cost

This paper proposes two novel techniques to train deep convolutional neural networks with low bit-width weights and activations. First, to obtain low bit-width weights, most existing methods quantize the full-precision network weights. However, this approach results in a mismatch: gradient descent updates the full-precision weights, but not the quantized weights. To address this issue, we propose a novel method that enables direct updating of quantized weights with learnable quantization levels to minimize the cost function using gradient descent. Second, to obtain low bit-width activations, existing works treat all channels equally. However, the activation quantizers could be biased toward a few channels with high variance. To address this issue, we propose a method that takes into account the quantization errors of individual channels. With this approach, we can learn activation quantizers that minimize the quantization errors in the majority of channels. Experimental results demonstrate that our proposed method achieves state-of-the-art performance on the image classification task, using AlexNet, ResNet and MobileNetV2 architectures on CIFAR-100 and ImageNet datasets.
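
The idea of updating quantized weights directly through learnable quantization levels can be sketched on a toy problem. In the example below each weight is tied to one of three levels, and gradient descent acts on the levels themselves rather than on a full-precision copy. The nearest-level assignment, the squared-error cost, the learning rate, and all names are assumptions for illustration only, not the paper's training procedure.

```python
def assign(weights, levels):
    """Index of the nearest quantization level for each weight."""
    return [min(range(len(levels)), key=lambda j: abs(w - levels[j])) for w in weights]

def cost(assignments, levels, targets):
    """Toy cost: squared error between quantized weights and target values."""
    return sum((levels[a] - t) ** 2 for a, t in zip(assignments, targets))

def level_gradients(assignments, levels, targets):
    """Gradient of the toy cost with respect to each quantization level."""
    grads = [0.0] * len(levels)
    for a, t in zip(assignments, targets):
        grads[a] += 2.0 * (levels[a] - t)
    return grads

weights = [-0.9, -0.4, 0.1, 0.6]   # hypothetical full-precision initialisation
levels = [-1.0, 0.0, 1.0]          # three learnable quantization levels
targets = [-0.8, -0.3, 0.2, 0.7]   # toy values the quantized weights should match

assignments = assign(weights, levels)
before = cost(assignments, levels, targets)
grads = level_gradients(assignments, levels, targets)
levels = [l - 0.1 * g for l, g in zip(levels, grads)]  # one gradient step on the levels
after = cost(assignments, levels, targets)
```

Every weight that shares a level moves together when that level is updated, so the quantity being optimized is the quantized network itself.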


Unsupervised Representation Learning by Predicting Random Distances

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2020/408 ◽

2020 ◽

Author(s):

Hu Wang ◽

Guansong Pang ◽

Chunhua Shen ◽

Congbo Ma

Keyword(s):

Neural Networks ◽

Anomaly Detection ◽

Unsupervised Learning ◽

Large Scale ◽

Deep Neural Networks ◽

State Of The Art ◽

Representation Learning ◽

Great Success ◽

Learning Tasks ◽

Real World Datasets

Deep neural networks have gained great success in a broad range of tasks due to their remarkable capability to learn semantically rich features from high-dimensional data. However, they often require large-scale labelled data to successfully learn such features, which significantly hinders their adoption in unsupervised learning tasks, such as anomaly detection and clustering, and limits their application in critical domains where obtaining massive labelled data is prohibitively expensive. To enable unsupervised learning in those domains, in this work we propose to learn features without using any labelled data by training neural networks to predict data distances in a randomly projected space. Random mapping is a theoretically proven approach to obtaining approximately preserved distances. To predict these distances well, the representation learner is optimised to learn genuine class structures that are implicitly embedded in the randomly projected space. Empirical results on 19 real-world datasets show that our learned representations substantially outperform a few state-of-the-art methods for both anomaly detection and clustering tasks. Code is available at: https://git.io/RDP
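
The supervisory signal described here, distances in a randomly projected space, can be sketched without any learning framework. The example below draws a Gaussian random projection and computes pairwise Euclidean distances between the projected points; a representation network would then be trained to reproduce these values. The Gaussian matrix, the Euclidean distance, and every name in the block are assumptions for illustration, since the abstract does not state which projection or distance the authors use.

```python
import math
import random

def random_projection_matrix(in_dim, out_dim, seed=0):
    """Gaussian random matrix scaled by 1/sqrt(out_dim)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]

def project(matrix, x):
    """Apply the fixed random projection to one data point."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]

def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def distance_targets(data, matrix):
    """Pairwise distances in the projected space, keyed by index pair (i, j) with i < j."""
    projected = [project(matrix, x) for x in data]
    return {(i, j): euclidean(projected[i], projected[j])
            for i in range(len(data)) for j in range(i + 1, len(data))}

# Toy data: points 0 and 2 are identical, so their target distance is zero.
data = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 0.0, 0.0], [0.0, 1.0, 2.0, 3.0]]
matrix = random_projection_matrix(in_dim=4, out_dim=2)
targets = distance_targets(data, matrix)
```

The projection is fixed once and never trained; only the network that tries to match the resulting distances has learnable parameters.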


ART-UP: A Novel Method for Generating Scanning-Robust Aesthetic QR Codes

ACM Transactions on Multimedia Computing Communications and Applications ◽

10.1145/3418214 ◽

2021 ◽

Vol 17 (1) ◽

pp. 1-23

Author(s):

Mingliang Xu ◽

Qingfeng Li ◽

Jianwei Niu ◽

Hao Su ◽

Xiting Liu ◽

...

Keyword(s):

State Of The Art ◽

Visual Quality ◽

Qr Code ◽

Quick Response ◽

Estimation Model ◽

Qr Codes ◽

Excellent Performance ◽

Novel Method ◽

Coarse To Fine

Quick response (QR) codes are usually scanned in different environments, so they must be robust to variations in illumination, scale, coverage, and camera angles. Aesthetic QR codes improve the visual quality, but subtle changes in their appearance may cause scanning failure. In this article, a new method to generate scanning-robust aesthetic QR codes is proposed. It is based on a module-based scanning probability estimation model that can effectively balance the tradeoff between visual quality and scanning robustness. Our method locally adjusts the luminance of each module by estimating the probability of successful sampling. The approach adopts a hierarchical, coarse-to-fine strategy to enhance the visual quality of aesthetic QR codes, sequentially generating the following three codes: a binary aesthetic QR code, a grayscale aesthetic QR code, and the final color aesthetic QR code. Our approach can also be used to create QR codes with different visual styles by adjusting some initialization parameters. User surveys and decoding experiments comparing our method with state-of-the-art algorithms indicate that the proposed approach has excellent performance in terms of both visual quality and scanning robustness.


Representing Deep Neural Networks Latent Space Geometries with Graphs

Algorithms ◽

10.3390/a14020039 ◽

2021 ◽

Vol 14 (2) ◽

pp. 39

Author(s):

Carlos Lassance ◽

Vincent Gripon ◽

Antonio Ortega

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Deep Learning ◽

Objective Function ◽

Learning Process ◽

Deep Neural Networks ◽

State Of The Art ◽

The Core ◽

Learning Tasks ◽

Latent Space

Deep Learning (DL) has attracted a lot of attention for its ability to reach state-of-the-art performance in many machine learning tasks. The core principle of DL methods consists of training composite architectures in an end-to-end fashion, where inputs are associated with outputs trained to optimize an objective function. Because of their compositional nature, DL architectures naturally exhibit several intermediate representations of the inputs, which belong to so-called latent spaces. When treated individually, these intermediate representations are most of the time unconstrained during the learning process, as it is unclear which properties should be favored. However, when processing a batch of inputs concurrently, the corresponding set of intermediate representations exhibits relations (what we call a geometry) on which desired properties can be sought. In this work, we show that it is possible to introduce constraints on these latent geometries to address various problems. In more detail, we propose to represent geometries by constructing similarity graphs from the intermediate representations obtained when processing a batch of inputs. By constraining these Latent Geometry Graphs (LGGs), we address the following three problems: (i) reproducing the behavior of a teacher architecture is achieved by mimicking its geometry, (ii) designing efficient embeddings for classification is achieved by targeting specific geometries, and (iii) robustness to deviations on inputs is achieved by enforcing smooth variation of geometry between consecutive latent spaces. Using standard vision benchmarks, we demonstrate the ability of the proposed geometry-based methods to solve the considered problems.
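
A similarity graph over a batch of intermediate representations can be sketched in a few lines. The example below builds a cosine-similarity adjacency matrix for a batch and compares the graphs of a teacher and a student by their summed squared difference. The cosine kernel, the zeroed diagonal, the comparison measure, and all names are assumptions for illustration; the abstract says only that similarity graphs are built and constrained.

```python
import math

def cosine(a, b):
    """Cosine similarity between two representation vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0

def similarity_graph(batch):
    """Adjacency matrix with one node per input in the batch (no self-loops)."""
    n = len(batch)
    return [[0.0 if i == j else cosine(batch[i], batch[j]) for j in range(n)]
            for i in range(n)]

def graph_distance(g1, g2):
    """Sum of squared differences between two adjacency matrices."""
    return sum((a - b) ** 2 for r1, r2 in zip(g1, g2) for a, b in zip(r1, r2))

# Hypothetical representations of the same 3 inputs in two networks.
teacher = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student = [[2.0, 0.0], [0.0, 3.0], [4.0, 4.0]]  # rescaled, but same angles
teacher_graph = similarity_graph(teacher)
student_graph = similarity_graph(student)
mismatch = graph_distance(teacher_graph, student_graph)
```

The student's vectors differ from the teacher's in scale yet produce the same graph, which shows why a constraint on geometry is weaker than a constraint on the representations themselves.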


Reconfigurable Binary Neural Network Accelerator with Adaptive Parallelism Scheme

Electronics ◽

10.3390/electronics10030230 ◽

2021 ◽

Vol 10 (3) ◽

pp. 230

Author(s):

Jaechan Cho ◽

Yongchul Jung ◽

Seongjoo Lee ◽

Yunho Jung

Keyword(s):

Neural Networks ◽

High Throughput ◽

Deep Neural Networks ◽

State Of The Art ◽

Throughput Performance ◽

Adaptive Parallelism ◽

Sensor Applications ◽

Binary Neural Network ◽

Target Layer ◽

Network Topologies

Binary neural networks (BNNs) have attracted significant interest for the implementation of deep neural networks (DNNs) on resource-constrained edge devices, and various BNN accelerator architectures have been proposed to achieve higher efficiency. BNN accelerators can be divided into two categories: streaming and layer accelerators. Although streaming accelerators designed for a specific BNN network topology provide high throughput, they are infeasible for various sensor applications in edge AI because of their complexity and inflexibility. In contrast, layer accelerators with reasonable resources can support various network topologies, but they operate with the same parallelism for all the layers of the BNN, which degrades throughput performance at certain layers. To overcome this problem, we propose a BNN accelerator with adaptive parallelism that offers high throughput performance in all layers. The proposed accelerator analyzes target layer parameters and operates with optimal parallelism using reasonable resources. In addition, this architecture is able to fully compute all types of BNN layers thanks to its reconfigurability, and it can achieve a higher area–speed efficiency than existing accelerators. In performance evaluation using state-of-the-art BNN topologies, the designed BNN accelerator achieved an area–speed efficiency 9.69 times higher than previous FPGA implementations and 24% higher than existing VLSI implementations for BNNs.


Gromov-Wasserstein optimal transport to align single-cell multi-omics data

10.1101/2020.04.28.066787 ◽

2020 ◽

Cited By ~ 2

Author(s):

Pinar Demetci ◽

Rebecca Santorella ◽

Björn Sandstede ◽

William Stafford Noble ◽

Ritambhara Singh

Keyword(s):

Single Cell ◽

Optimal Transport ◽

Learning Algorithm ◽

State Of The Art ◽

Single Cells ◽

Wasserstein Distance ◽

Cell Alignment ◽

Shared Space ◽

Real World Datasets ◽

Unsupervised Algorithms

Data integration of single-cell measurements is critical for understanding cell development and disease, but the lack of correspondence between different types of measurements makes such efforts challenging. Several unsupervised algorithms can align heterogeneous single-cell measurements in a shared space, enabling the creation of mappings between single cells in different data domains. However, these algorithms require hyperparameter tuning for high-quality alignments, which is difficult in an unsupervised setting without correspondence information for validation. We present Single-Cell alignment using Optimal Transport (SCOT), an unsupervised learning algorithm that uses Gromov-Wasserstein-based optimal transport to align single-cell multi-omics datasets. We compare the alignment performance of SCOT with state-of-the-art algorithms on four simulated and two real-world datasets. SCOT performs on par with state-of-the-art methods but is faster and requires tuning fewer hyperparameters. Furthermore, we provide an algorithm for SCOT to use the Gromov-Wasserstein distance to guide the parameter selection. Thus, unlike previous methods, SCOT aligns well without using any orthogonal correspondence information to pick the hyperparameters. Our source code and scripts for replicating the results are available at https://github.com/rsinghlab/SCOT.


Framework for TCAD augmented machine learning on multi-I–V characteristics using convolutional neural network and multiprocessing

Journal of Semiconductors ◽

10.1088/1674-4926/42/12/124101 ◽

2021 ◽

Vol 42 (12) ◽

pp. 124101

Author(s):

Thomas Hirtz ◽

Steyn Huurman ◽

He Tian ◽

Yi Yang ◽

Tian-Ling Ren

Keyword(s):

Machine Learning ◽

Neural Networks ◽

Information Technologies ◽

Deep Neural Networks ◽

State Of The Art ◽

Data Driven ◽

Sufficient Data ◽

Learning Models ◽

Simulation Tools ◽

New Information

In a world where data is increasingly important for making breakthroughs, microelectronics is a field where data is sparse and hard to acquire. Only a few entities have the infrastructure that is required to automate the fabrication and testing of semiconductor devices. This infrastructure is crucial for generating sufficient data for the use of new information technologies. This situation creates a divide between most researchers and the industry. To address this issue, this paper introduces a widely applicable approach for creating custom datasets using simulation tools and parallel computing. The multi-I–V curves that we obtained were processed simultaneously using convolutional neural networks, which gave us the ability to predict a full set of device characteristics with a single inference. We demonstrate the potential of this approach through two concrete examples of useful deep learning models that were trained using the generated data. We believe that this work can act as a bridge between state-of-the-art data-driven methods and more classical semiconductor research, such as device engineering, yield engineering or process monitoring. Moreover, this research gives anybody the opportunity to start experimenting with deep neural networks and machine learning in the field of microelectronics, without the need for expensive experimentation infrastructure.


Fundamentals of Higher Order Neural Networks for Modeling and Simulation

Artificial Higher Order Neural Networks for Modeling and Simulation ◽

10.4018/978-1-4666-2175-6.ch006 ◽

2013 ◽

pp. 103-133 ◽

Cited By ~ 13

Author(s):

Madan M. Gupta ◽

Ivo Bukovsky ◽

Noriyasu Homma ◽

Ashu M. G. Solo ◽

Zeng-Guang Hou

Keyword(s):

Neural Networks ◽

Modeling And Simulation ◽

Learning Algorithm ◽

Nonlinear Approximation ◽

Higher Order ◽

Higher Order Neural Networks ◽

High Dynamic ◽

Input Variables ◽

Continuous Dynamic

In this chapter, the authors provide fundamental principles of Higher Order Neural Units (HONUs) and Higher Order Neural Networks (HONNs) for modeling and simulation. An essential core of HONNs can be found in higher order weighted combinations or correlations between the input variables and HONU. In addition to the high-quality nonlinear approximation of static HONUs, the capability of dynamic HONUs for the modeling of dynamic systems is shown and compared to conventional recurrent neural networks when a practical learning algorithm is used. Furthermore, the potential of continuous dynamic HONUs to approximate high dynamic order systems is discussed, as adaptable time delays can be implemented. By using some typical examples, this chapter describes how and why higher order combinations or correlations can be effective for modeling of systems.
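
A second-order unit gives a concrete picture of "higher order weighted combinations" of the inputs. The sketch below computes a static quadratic unit whose output is a weighted sum of all pairwise products of the inputs, with a constant 1 prepended so that bias and linear terms are included. The upper-triangular weight layout, the function name, and the numbers are assumptions for illustration and are not taken from the chapter.

```python
def quadratic_unit(x, weights):
    """y = sum over i <= j of weights[i][j] * xa[i] * xa[j], where xa = [1] + x."""
    xa = [1.0] + list(x)
    n = len(xa)
    return sum(weights[i][j] * xa[i] * xa[j]
               for i in range(n) for j in range(i, n))

# Hypothetical weights for two inputs (only the upper triangle is used).
weights = [
    [0.5, 1.0, 0.0],   # bias, x1, x2
    [0.0, 0.0, 2.0],   # x1*x1, x1*x2
    [0.0, 0.0, -1.0],  # x2*x2
]
x = [2.0, 3.0]
y = quadratic_unit(x, weights)  # 0.5 + 1.0*2 + 2.0*(2*3) - 1.0*(3*3)
```

The output is nonlinear in the inputs but linear in the weights, so for a fixed input the weight gradient is simply the vector of pairwise products.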


Analyzing Deep Neural Networks with Noisy Labels

2020 IEEE International Conference on Big Data and Smart Computing (BigComp) ◽

10.1109/bigcomp48618.2020.00012 ◽

2020 ◽

Author(s):

Chan Lim ◽

Sangwoo Han ◽

Jongwuk Lee

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Noisy Labels
