Regression Neural Networks with a Highly Robust Loss Function

Neural networks and their application in communication systems are receiving growing attention from both academia and industry. The authors note that there is a disconnect between the typical objective functions of these neural networks with regards to the context in which the neural network will eventually be deployed and evaluated. To this end, a new loss function is proposed and shown to increase the performance of neural networks when implemented in a communication system compared to previous methods. It is further shown that a ‘split complex’ approach used by many implementations can be improved via formalisation of the ‘concatenated complex’ approach described herein. Experimental results using the orthogonal frequency division multiplexing (OFDM) and spectrally efficient frequency division multiplexing (SEFDM) modulation formats with varying bandwidth compression factors over a wireless visible light communication (VLC) link validate the efficacy of the proposed method in a real system, achieving the lowest error vector magnitude (EVM), and thus bit error rate (BER), across all experiments, with a 5 dB to 10 dB improvement in the received symbols EVM overall compared to the baseline implementation, with bandwidth compressions down to 40% compared to OFDM, resulting in a spectral efficiency gain of 67%.

Download Full-text

Integrating a softened multi-interval loss function into neural networks for wind power prediction

Applied Soft Computing ◽

10.1016/j.asoc.2021.108009 ◽

2021 ◽

pp. 108009

Author(s):

Jianming Hu ◽

Weigang Zhao ◽

Jingwei Tang ◽

Qingxi Luo

Keyword(s):

Neural Networks ◽

Wind Power ◽

Loss Function ◽

Power Prediction ◽

Wind Power Prediction

Download Full-text

The Goldilocks Zone: Towards Better Understanding of Neural Network Loss Landscapes

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013574 ◽

2019 ◽

Vol 33 ◽

pp. 3574-3581

Author(s):

Stanislav Fort ◽

Adam Scherlis

Keyword(s):

Neural Network ◽

Neural Networks ◽

Configuration Space ◽

Loss Function ◽

Positive Curvature ◽

Local Convexity ◽

Convolutional Networks ◽

Hollow Spherical Shell ◽

Low Dimensional ◽

Fully Connected

We explore the loss landscape of fully-connected and convolutional neural networks using random, low-dimensional hyperplanes and hyperspheres. Evaluating the Hessian, H, of the loss function on these hypersurfaces, we observe 1) an unusual excess of the number of positive eigenvalues of H, and 2) a large value of Tr(H)/||H|| at a well defined range of configuration space radii, corresponding to a thick, hollow, spherical shell we refer to as the Goldilocks zone. We observe this effect for fully-connected neural networks over a range of network widths and depths on MNIST and CIFAR-10 datasets with the ReLU and tanh non-linearities, and a similar effect for convolutional networks. Using our observations, we demonstrate a close connection between the Goldilocks zone, measures of local convexity/prevalence of positive curvature, and the suitability of a network initialization. We show that the high and stable accuracy reached when optimizing on random, low-dimensional hypersurfaces is directly related to the overlap between the hypersurface and the Goldilocks zone, and as a corollary demonstrate that the notion of intrinsic dimension is initialization-dependent. We note that common initialization techniques initialize neural networks in this particular region of unusually high convexity/prevalence of positive curvature, and offer a geometric intuition for their success. Furthermore, we demonstrate that initializing a neural network at a number of points and selecting for high measures of local convexity such as Tr(H)/||H||, number of positive eigenvalues of H, or low initial loss, leads to statistically significantly faster training on MNIST. Based on our observations, we hypothesize that the Goldilocks zone contains an unusually high density of suitable initialization configurations.

Download Full-text

Rectified Wing Loss for Efficient and Robust Facial Landmark Localisation with Convolutional Neural Networks

International Journal of Computer Vision ◽

10.1007/s11263-019-01275-0 ◽

2019 ◽

Vol 128 (8-9) ◽

pp. 2126-2145 ◽

Cited By ~ 3

Author(s):

Zhen-Hua Feng ◽

Josef Kittler ◽

Muhammad Awais ◽

Xiao-Jun Wu

Keyword(s):

Neural Networks ◽

Convolutional Neural Networks ◽

Loss Function ◽

Data Augmentation ◽

Small Sample Size ◽

Small Sample ◽

Imbalance Problem ◽

Facial Landmark ◽

The Impact ◽

Coarse To Fine

AbstractEfficient and robust facial landmark localisation is crucial for the deployment of real-time face analysis systems. This paper presents a new loss function, namely Rectified Wing (RWing) loss, for regression-based facial landmark localisation with Convolutional Neural Networks (CNNs). We first systemically analyse different loss functions, including L2, L1 and smooth L1. The analysis suggests that the training of a network should pay more attention to small-medium errors. Motivated by this finding, we design a piece-wise loss that amplifies the impact of the samples with small-medium errors. Besides, we rectify the loss function for very small errors to mitigate the impact of inaccuracy of manual annotation. The use of our RWing loss boosts the performance significantly for regression-based CNNs in facial landmarking, especially for lightweight network architectures. To address the problem of under-representation of samples with large pose variations, we propose a simple but effective boosting strategy, referred to as pose-based data balancing. In particular, we deal with the data imbalance problem by duplicating the minority training samples and perturbing them by injecting random image rotation, bounding box translation and other data augmentation strategies. Last, the proposed approach is extended to create a coarse-to-fine framework for robust and efficient landmark localisation. Moreover, the proposed coarse-to-fine framework is able to deal with the small sample size problem effectively. The experimental results obtained on several well-known benchmarking datasets demonstrate the merits of our RWing loss and prove the superiority of the proposed method over the state-of-the-art approaches.

Download Full-text

Tree-loss function for training neural networks on weakly-labelled datasets

2017 IEEE 14th International Symposium on Biomedical Imaging (ISBI 2017) ◽

10.1109/isbi.2017.7950521 ◽

2017 ◽

Cited By ~ 2

Author(s):

Sergey Demyanov ◽

Rajib Chakravorty ◽

Zongyuan Ge ◽

SeyedBehzad Bozorgtabar ◽

Michelle Pablo ◽

...

Keyword(s):

Neural Networks ◽

Loss Function

Download Full-text

Complexity-aware loss function for fast neural networks with early exits

Twelfth International Conference on Machine Vision (ICMV 2019) ◽

10.1117/12.2557077 ◽

2020 ◽

Author(s):

Lev Teplyakov ◽

Segrey Gladilin ◽

Evgeny Shvets

Keyword(s):

Neural Networks ◽

Loss Function

Download Full-text

Trimmed Robust Loss Function for Training Deep Neural Networks with Label Noise

Artificial Intelligence and Soft Computing - Lecture Notes in Computer Science ◽

10.1007/978-3-030-20912-4_21 ◽

2019 ◽

pp. 215-222 ◽

Cited By ~ 1

Author(s):

Andrzej Rusiecki

Keyword(s):

Neural Networks ◽

Loss Function ◽

Deep Neural Networks ◽

Label Noise

Download Full-text

Graph Self Supervised Learning: the BT, the HSIC, and the VICReg

10.31219/osf.io/tvmdu ◽

2021 ◽

Author(s):

Sayan Nag

Keyword(s):

Neural Networks ◽

Supervised Learning ◽

Loss Function ◽

Data Augmentation ◽

Learning Strategy ◽

Loss Functions ◽

Augmentation Strategies ◽

Batch Sizes ◽

Graph Neural Networks ◽

The Impact

Self-supervised learning and pre-training strategies have developed over the last few years especially for Convolutional Neural Networks (CNNs). Recently application of such methods can also be noticed for Graph Neural Networks (GNNs). In this paper, we have used a graph based self-supervised learning strategy with different loss functions (Barlow Twins[? ], HSIC[? ], VICReg[? ]) which have shown promising results when applied with CNNs previously. We have also proposed a hybrid loss function combining the advantages of VICReg and HSIC and called it as VICRegHSIC. The performance of these aforementioned methods have been compared when applied to two different datasets namely MUTAG and PROTEINS. Moreover, the impact of different batch sizes, projector dimensions and data augmentation strategies have also been explored. The results are preliminary and we will be continuing to explore with other datasets.

Download Full-text

Stratified neural networks in a time-to-event setting

10.1101/2021.02.01.429169 ◽

2021 ◽

Author(s):

Fabrizio Kuruc ◽

Harald Binder ◽

Moritz Hess

Keyword(s):

Neural Networks ◽

Loss Function ◽

Deep Neural Networks ◽

Proportional Hazards ◽

Proportional Hazards Model ◽

Cox Proportional Hazards ◽

Cox Proportional Hazards Model ◽

Loss Functions ◽

Partial Likelihood ◽

Hazards Model

AbstractDeep neural networks are now frequently employed to predict survival conditional on omics-type biomarkers, e.g. by employing the partial likelihood of Cox proportional hazards model as loss function. Due to the generally limited number of observations in clinical studies, combining different data-sets has been proposed to improve learning of network parameters. However, if baseline hazards differ between the studies, the assumptions of Cox proportional hazards model are violated. Based on high dimensional transcriptome profiles from different tumor entities, we demonstrate how using a stratified partial likelihood as loss function allows for accounting for the different baseline hazards in a deep learning framework. Additionally, we compare the partial likelihood with the ranking loss, which is frequently employed as loss function in machine learning approaches due to its seemingly simplicity. Using RNA-seq data from the Cancer Genome Atlas (TCGA) we show that use of stratified loss functions leads to an overall better discriminatory power and lower prediction error compared to their nonstratified counterparts. We investigate which genes are identified to have the greatest marginal impact on prediction of survival when using different loss functions. We find that while similar genes are identified, in particular known prognostic genes receive higher importance from stratified loss functions. Taken together, pooling data from different sources for improved parameter learning of deep neural networks benefits largely from employing stratified loss functions that consider potentially varying baseline hazards. For easy application, we provide PyTorch code for stratified loss functions and an explanatory Jupyter notebook in a GitHub repository.

Download Full-text

Spoofing Speaker Verification System by Adversarial Examples Leveraging the Generalized Speaker Difference

Security and Communication Networks ◽

10.1155/2021/6664578 ◽

2021 ◽

Vol 2021 ◽

pp. 1-10

Author(s):

Hongwei Luo ◽

Yijie Shen ◽

Feng Lin ◽

Guoai Xu

Keyword(s):

Neural Networks ◽

Loss Function ◽

Deep Neural Networks ◽

State Of The Art ◽

Speaker Verification ◽

Signal To Noise Ratio ◽

The State ◽

Verification System ◽

Adversarial Examples ◽

Human Hearing

Speaker verification system has gained great popularity in recent years, especially with the development of deep neural networks and Internet of Things. However, the security of speaker verification system based on deep neural networks has not been well investigated. In this paper, we propose an attack to spoof the state-of-the-art speaker verification system based on generalized end-to-end (GE2E) loss function for misclassifying illegal users into the authentic user. Specifically, we design a novel loss function to deploy a generator for generating effective adversarial examples with slight perturbation and then spoof the system with these adversarial examples to achieve our goals. The success rate of our attack can reach 82% when cosine similarity is adopted to deploy the deep-learning-based speaker verification system. Beyond that, our experiments also reported the signal-to-noise ratio at 76 dB, which proves that our attack has higher imperceptibility than previous works. In summary, the results show that our attack not only can spoof the state-of-the-art neural-network-based speaker verification system but also more importantly has the ability to hide from human hearing or machine discrimination.

Download Full-text