Trimmed Robust Loss Function for Training Deep Neural Networks with Label Noise

Making Deep Neural Networks Robust to Label Noise: Cross-Training With a Novel Loss Function

IEEE Access ◽

10.1109/access.2019.2940653 ◽

2019 ◽

Vol 7 ◽

pp. 130893-130902 ◽

Cited By ~ 2

Author(s):

Zhen Qin ◽

Zhengwen Zhang ◽

Yan Li ◽

Jun Guo

Keyword(s):

Neural Networks ◽

Loss Function ◽

Deep Neural Networks ◽

Label Noise ◽

Cross Training

Stratified neural networks in a time-to-event setting

10.1101/2021.02.01.429169 ◽

2021 ◽

Author(s):

Fabrizio Kuruc ◽

Harald Binder ◽

Moritz Hess

Keyword(s):

Neural Networks ◽

Loss Function ◽

Deep Neural Networks ◽

Proportional Hazards ◽

Proportional Hazards Model ◽

Cox Proportional Hazards ◽

Cox Proportional Hazards Model ◽

Loss Functions ◽

Partial Likelihood ◽

Hazards Model

Deep neural networks are now frequently employed to predict survival conditional on omics-type biomarkers, e.g. by employing the partial likelihood of the Cox proportional hazards model as the loss function. Due to the generally limited number of observations in clinical studies, combining different datasets has been proposed to improve learning of network parameters. However, if baseline hazards differ between the studies, the assumptions of the Cox proportional hazards model are violated. Based on high-dimensional transcriptome profiles from different tumor entities, we demonstrate how using a stratified partial likelihood as the loss function accounts for the different baseline hazards in a deep learning framework. Additionally, we compare the partial likelihood with the ranking loss, which is frequently employed as a loss function in machine learning approaches due to its seeming simplicity. Using RNA-seq data from the Cancer Genome Atlas (TCGA), we show that the use of stratified loss functions leads to overall better discriminatory power and lower prediction error compared to their nonstratified counterparts. We investigate which genes are identified as having the greatest marginal impact on prediction of survival when using different loss functions. We find that while similar genes are identified, known prognostic genes in particular receive higher importance from stratified loss functions. Taken together, pooling data from different sources for improved parameter learning of deep neural networks benefits largely from employing stratified loss functions that consider potentially varying baseline hazards. For easy application, we provide PyTorch code for stratified loss functions and an explanatory Jupyter notebook in a GitHub repository.
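The difference between a pooled and a stratified partial likelihood can be illustrated with a minimal sketch. This is not the authors' PyTorch code; it is a plain-Python illustration under stated assumptions: ties are handled Breslow-style, `risk` holds the network's predicted log-risk scores, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def neg_log_partial_likelihood(risk, time, event):
    """Negative Cox log partial likelihood (Breslow-style handling of ties).

    risk  -- predicted log-risk score per observation
    time  -- observed time per observation
    event -- 1 if the event was observed, 0 if censored
    """
    total = 0.0
    n = len(risk)
    for i in range(n):
        if not event[i]:
            continue  # censored observations only appear inside risk sets
        # Risk set: everyone still under observation at time[i].
        denom = sum(math.exp(risk[j]) for j in range(n) if time[j] >= time[i])
        total -= risk[i] - math.log(denom)
    return total


def stratified_neg_log_partial_likelihood(risk, time, event, stratum):
    """Sum of per-stratum losses, so risk sets never mix strata.

    Each stratum (e.g. each study or tumor entity) keeps its own
    baseline hazard because it is only compared against itself.
    """
    groups = defaultdict(list)
    for idx, s in enumerate(stratum):
        groups[s].append(idx)
    return sum(
        neg_log_partial_likelihood(
            [risk[i] for i in idx],
            [time[i] for i in idx],
            [event[i] for i in idx],
        )
        for idx in groups.values()
    )


# Two observations with equal scores, pooled versus placed in separate strata.
pooled = neg_log_partial_likelihood([0.0, 0.0], [1.0, 2.0], [1, 1])
stratified = stratified_neg_log_partial_likelihood(
    [0.0, 0.0], [1.0, 2.0], [1, 1], ["study_a", "study_b"]
)
```

In the pooled case the earlier event is compared against both observations; in the stratified case each observation is alone in its risk set, so the two losses differ even though the scores are identical.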

Spoofing Speaker Verification System by Adversarial Examples Leveraging the Generalized Speaker Difference

Security and Communication Networks ◽

10.1155/2021/6664578 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Hongwei Luo ◽

Yijie Shen ◽

Feng Lin ◽

Guoai Xu

Keyword(s):

Neural Networks ◽

Loss Function ◽

Deep Neural Networks ◽

State Of The Art ◽

Speaker Verification ◽

Signal To Noise Ratio ◽

The State ◽

Verification System ◽

Adversarial Examples ◽

Human Hearing

Speaker verification systems have gained great popularity in recent years, especially with the development of deep neural networks and the Internet of Things. However, the security of speaker verification systems based on deep neural networks has not been well investigated. In this paper, we propose an attack to spoof a state-of-the-art speaker verification system based on the generalized end-to-end (GE2E) loss function, so that illegal users are misclassified as the authentic user. Specifically, we design a novel loss function to deploy a generator that produces effective adversarial examples with slight perturbation, and then spoof the system with these adversarial examples to achieve our goals. The success rate of our attack can reach 82% when cosine similarity is adopted to deploy the deep-learning-based speaker verification system. Beyond that, our experiments also report a signal-to-noise ratio of 76 dB, which proves that our attack has higher imperceptibility than previous works. In summary, the results show that our attack can not only spoof the state-of-the-art neural-network-based speaker verification system but, more importantly, also has the ability to hide from human hearing or machine discrimination.
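For readers unfamiliar with the imperceptibility metric, the following is a minimal sketch of how a signal-to-noise ratio in decibels is commonly computed for an adversarial perturbation. It assumes the standard power-ratio definition; the abstract does not state the exact formula used in the paper, and the function name `snr_db` is hypothetical.

```python
import math


def snr_db(signal, perturbation):
    """Signal-to-noise ratio in dB: 10 * log10(signal power / noise power).

    signal       -- samples of the clean audio
    perturbation -- samples of the adversarial noise added to it
    """
    signal_power = sum(s * s for s in signal)
    noise_power = sum(p * p for p in perturbation)
    return 10.0 * math.log10(signal_power / noise_power)


# A perturbation with one tenth of the signal amplitude has 1/100 of its power.
example = snr_db([1.0, 1.0], [0.1, 0.1])
```

A higher value means the perturbation carries less power relative to the original audio, which is why a large SNR is read as evidence of imperceptibility.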

Memorization in Deep Neural Networks: Does the Loss Function Matter?

Advances in Knowledge Discovery and Data Mining - Lecture Notes in Computer Science ◽

10.1007/978-3-030-75765-6_11 ◽

2021 ◽

pp. 131-142

Author(s):

Deep Patel ◽

P. S. Sastry

Keyword(s):

Neural Networks ◽

Loss Function ◽

Deep Neural Networks

An Entropic Optimal Transport loss for learning deep neural networks under label noise in remote sensing images

Computer Vision and Image Understanding ◽

10.1016/j.cviu.2019.102863 ◽

2020 ◽

Vol 191 ◽

pp. 102863 ◽

Cited By ~ 1

Author(s):

Bharath Bhushan Damodaran ◽

Rémi Flamary ◽

Vivien Seguy ◽

Nicolas Courty

Keyword(s):

Remote Sensing ◽

Neural Networks ◽

Deep Neural Networks ◽

Optimal Transport ◽

Remote Sensing Images ◽

Label Noise ◽

Transport Loss

f-Similarity Preservation Loss for Soft Labels: A Demonstration on Cross-Corpus Speech Emotion Recognition

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33015725 ◽

2019 ◽

Vol 33 ◽

pp. 5725-5732

Author(s):

Biqiao Zhang ◽

Yuqing Kong ◽

Georg Essl ◽

Emily Mower Provost

Keyword(s):

Neural Networks ◽

Emotion Recognition ◽

Loss Function ◽

Deep Neural Networks ◽

Metric Learning ◽

Loss Functions ◽

Speech Emotion Recognition ◽

Subjective Data ◽

Dual Form ◽

Deep Metric Learning

In this paper, we propose a Deep Metric Learning (DML) approach that supports soft labels. DML seeks to learn representations that encode the similarity between examples through deep neural networks. DML generally presupposes that data can be divided into discrete classes using hard labels. However, some tasks, such as our exemplary domain of speech emotion recognition (SER), work with inherently subjective data, data for which it may not be possible to identify a single hard label. We propose a family of loss functions, f-Similarity Preservation Loss (f-SPL), based on the dual form of f-divergence for DML with soft labels. We show that the minimizer of f-SPL preserves the pairwise label similarities in the learned feature embeddings. We demonstrate the efficacy of the proposed loss function on the task of cross-corpus SER with soft labels. Our approach, which combines f-SPL and classification loss, significantly outperforms a baseline SER system with the same structure but trained with only classification loss in most experiments. We show that the presented techniques are more robust to over-training and can learn an embedding space in which the similarity between examples is meaningful.

Making Deep Neural Networks Robust to Label Noise: A Reweighting Loss and Data Filtration

2019 IEEE 4th International Conference on Signal and Image Processing (ICSIP) ◽

10.1109/siprocess.2019.8868645 ◽

2019 ◽

Author(s):

Zhengwen Zhang ◽

Yan Li ◽

Yunjie Li ◽

Ying Qin

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Label Noise

Adversarial Erasing method based on graph neural network

Journal of Physics Conference Series ◽

10.1088/1742-6596/2083/4/042083 ◽

2021 ◽

Vol 2083 (4) ◽

pp. 042083

Author(s):

Shuhan Liu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Loss Function ◽

Deep Neural Networks ◽

Ground Truth ◽

Semantic Segmentation ◽

Data Sets ◽

Recent Developments ◽

Weakly Supervised ◽

Network Weight

Semantic segmentation is a traditional task that requires a large number of pixel-level ground truth label data sets, which are time-consuming and expensive to produce. Recent developments in weakly-supervised settings have shown that reasonable performance can be obtained using only image-level labels. Classification is often used as an agent task to train deep neural networks and extract attention maps from them. The classification task needs less supervision information to obtain the most discriminative part of the object. For this purpose, we propose a new end-to-end counter-wipe network. Compared with the baseline network, we propose a method to apply the graph neural network to obtain the first CAM. It is proposed to train the joint loss function to avoid the network weight sharing and cause the network to fall into a saddle point. Our experiments on the Pascal VOC2012 dataset show that 64.9% segmentation performance is obtained, which is an improvement of 2.1% compared to our baseline.

Stability for the training of deep neural networks and other classifiers

Mathematical Models and Methods in Applied Sciences ◽

10.1142/s0218202521500500 ◽

2021 ◽

pp. 1-46

Author(s):

Leonid Berlyand ◽

Pierre-Emmanuel Jabin ◽

C. Alex Safsten

Keyword(s):

Neural Networks ◽

Loss Function ◽

Deep Neural Networks ◽

Training Dataset ◽

Sufficient Condition ◽

Training Set ◽

Potential Sources ◽

The Stability

We examine the stability of loss-minimizing training processes that are used for deep neural networks (DNN) and other classifiers. While a classifier is optimized during training through a so-called loss function, the performance of classifiers is usually evaluated by some measure of accuracy, such as the overall accuracy which quantifies the proportion of objects that are well classified. This leads to the guiding question of stability: does decreasing loss through training always result in increased accuracy? We formalize the notion of stability, and provide examples of instability. Our main result consists of two novel conditions on the classifier which, if either is satisfied, ensure stability of training, that is we derive tight bounds on accuracy as loss decreases. We also derive a sufficient condition for stability on the training set alone, identifying flat portions of the data manifold as potential sources of instability. The latter condition is explicitly verifiable on the training dataset. Our results do not depend on the algorithm used for training, as long as loss decreases with training.
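The guiding question, whether decreasing loss always increases accuracy, can be made concrete with a small numeric sketch. The numbers below are invented for illustration and do not come from the paper; they assume binary classification with cross-entropy loss, where each value is the probability a classifier assigns to the correct class and a prediction counts as correct when that probability exceeds 0.5.

```python
import math


def mean_cross_entropy(p_true):
    """Average cross-entropy, given the probability assigned to the true class."""
    return -sum(math.log(p) for p in p_true) / len(p_true)


def accuracy(p_true, threshold=0.5):
    """Fraction of objects whose true class receives probability above threshold."""
    return sum(p > threshold for p in p_true) / len(p_true)


# Hypothetical classifier early in training: barely correct on every object.
before = [0.51, 0.51, 0.51]
# Hypothetical classifier later: very confident on two objects, wrong on the third.
after = [0.99, 0.99, 0.49]

loss_before, loss_after = mean_cross_entropy(before), mean_cross_entropy(after)
acc_before, acc_after = accuracy(before), accuracy(after)
```

Here the loss goes down while the accuracy also goes down, because the large confidence gains on two objects outweigh the third object slipping just below the decision threshold. This is the kind of instability the abstract refers to.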

Stochastic Loss Function

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i04.5925 ◽

2020 ◽

Vol 34 (04) ◽

pp. 4884-4891

Author(s):

Qingliang Liu ◽

Jinmei Lai

Keyword(s):

Neural Networks ◽

Loss Function ◽

Real World ◽

Optimization Problem ◽

Deep Neural Networks ◽

Back Propagation ◽

Loss Functions ◽

Joint Optimization ◽

Neural Machine Translation ◽

Deep Networks

Training deep neural networks is inherently subject to the predefined and fixed loss functions used during optimization. To improve learning efficiency, we develop the Stochastic Loss Function (SLF) to dynamically and automatically generate appropriate gradients to train deep networks in the same round of back-propagation, while maintaining the completeness and differentiability of the training pipeline. In SLF, a generic loss function is formulated as a joint optimization problem of network weights and loss parameters. In order to guarantee the requisite efficiency, gradients with respect to the generic differentiable loss are leveraged for selecting the loss function and optimizing network weights. Extensive experiments on a variety of popular datasets strongly demonstrate that SLF is capable of obtaining appropriate gradients at different stages during training, and can significantly improve the performance of various deep models on real-world tasks including classification, clustering, regression, neural machine translation, and object detection.
