Ranking Loss: Maximizing the Success Rate in Deep Learning Side-Channel Analysis

The side-channel community recently investigated a new approach, based on deep learning, to significantly improve profiled attacks against embedded systems. Compared to template attacks, deep learning techniques can deal with protected implementations, such as masking or desynchronization, without substantial preprocessing. However, important issues are still open. One challenging problem is to adapt the methods classically used in the machine learning field (e.g. loss function, performance metrics) to the specific side-channel context in order to obtain optimal results. We propose a new loss function derived from the learning to rank approach that helps preventing approximation and estimation errors, induced by the classical cross-entropy loss. We theoretically demonstrate that this new function, called Ranking Loss (RkL), maximizes the success rate by minimizing the ranking error of the secret key in comparison with all other hypotheses. The resulting model converges towards the optimal distinguisher when considering the mutual information between the secret and the leakage. Consequently, the approximation error is prevented. Furthermore, the estimation error, induced by the cross-entropy, is reduced by up to 23%. When the ranking loss is used, the convergence towards the best solution is up to 23% faster than a model using the cross-entropy loss function. We validate our theoretical propositions on public datasets.

Download Full-text

Deep Learning Structure for Cross-Domain Sentiment Classification Based on Improved Cross Entropy and Weight

Scientific Programming ◽

10.1155/2020/3810261 ◽

2020 ◽

Vol 2020 ◽

pp. 1-20

Author(s):

Rong Fei ◽

Quanzhu Yao ◽

Yuanbo Zhu ◽

Qingzheng Xu ◽

Aimin Li ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Loss Function ◽

Recurrent Neural Network ◽

Sentiment Classification ◽

Cross Entropy ◽

Entropy Loss ◽

Cross Domain ◽

Hinge Loss ◽

Triplet Loss

Within the sentiment classification field, the convolutional neural network (CNN) and long short-term memory (LSTM) are praised for their classification and prediction performance, but their accuracy, loss rate, and time are not ideal. To this purpose, a deep learning structure combining the improved cross entropy and weight for word is proposed for solving cross-domain sentiment classification, which focuses on achieving better text sentiment classification by optimizing and improving recurrent neural network (RNN) and CNN. Firstly, we use the idea of hinge loss function (hinge loss) and the triplet loss function (triplet loss) to improve the cross entropy loss. The improved cross entropy loss function is combined with the CNN model and LSTM network which are tested in the two classification problems. Then, the LSTM binary-optimize (LSTM-BO) model and CNN binary-optimize (CNN-BO) model are proposed, which are more effective in fitting the predicted errors and preventing overfitting. Finally, considering the characteristics of the processing text of the recurrent neural network, the influence of input words for the final classification is analysed, which can obtain the importance of each word to the classification results. The experiment results show that within the same time, the proposed weight-recurrent neural network (W-RNN) model gives higher weight to words with stronger emotional tendency to reduce the loss of emotional information, which improves the accuracy of classification.

Download Full-text

Hypergraph Convolution on Nodes-Hyperedges Network for Semi-Supervised Node Classification

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3494567 ◽

2022 ◽

Vol 16 (4) ◽

pp. 1-19

Author(s):

Hanrui Wu ◽

Michael K. Ng

Keyword(s):

Deep Learning ◽

Classification Problem ◽

Cross Entropy ◽

Learning Approaches ◽

Entropy Loss ◽

Order Relations ◽

Data Representations ◽

Node Classification ◽

The Cross ◽

Full Consideration

Hypergraphs have shown great power in representing high-order relations among entities, and lots of hypergraph-based deep learning methods have been proposed to learn informative data representations for the node classification problem. However, most of these deep learning approaches do not take full consideration of either the hyperedge information or the original relationships among nodes and hyperedges. In this article, we present a simple yet effective semi-supervised node classification method named Hypergraph Convolution on Nodes-Hyperedges network, which performs filtering on both nodes and hyperedges as well as recovers the original hypergraph with the least information loss. Instead of only reducing the cross-entropy loss over the labeled samples as most previous approaches do, we additionally consider the hypergraph reconstruction loss as prior information to improve prediction accuracy. As a result, by taking both the cross-entropy loss on the labeled samples and the hypergraph reconstruction loss into consideration, we are able to achieve discriminative latent data representations for training a classifier. We perform extensive experiments on the semi-supervised node classification problem and compare the proposed method with state-of-the-art algorithms. The promising results demonstrate the effectiveness of the proposed method.

Download Full-text

A Novel Evaluation Metric for Deep Learning-Based Side Channel Analysis and Its Extended Application to Imbalanced Data

IACR Transactions on Cryptographic Hardware and Embedded Systems ◽

10.46586/tches.v2020.i3.73-96 ◽

2020 ◽

pp. 73-96

Author(s):

Jiajia Zhang ◽

Mengce Zheng ◽

Jiehui Nan ◽

Honggang Hu ◽

Nenghai Yu

Keyword(s):

Deep Learning ◽

Loss Function ◽

Imbalanced Data ◽

Cross Entropy ◽

Side Channel ◽

Worst Case ◽

Side Channel Analysis ◽

Learning Techniques ◽

Channel Analysis ◽

Learning Metrics

Since Kocher (CRYPTO’96) proposed timing attack, side channel analysis (SCA) has shown great potential to break cryptosystems via physical leakage. Recently, deep learning techniques are widely used in SCA and show equivalent and even better performance compared to traditional methods. However, it remains unknown why and when deep learning techniques are effective and efficient for SCA. Masure et al. (IACR TCHES 2020(1):348–375) illustrated that deep learning paradigm is suitable for evaluating implementations against SCA from a worst-case scenario point of view, yet their work is limited to balanced data and a specific loss function. Besides, deep learning metrics are not consistent with side channel metrics. In most cases, they are deceptive in foreseeing the feasibility and complexity of mounting a successful attack, especially for imbalanced data. To mitigate the gap between deep learning metrics and side channel metrics, we propose a novel Cross Entropy Ratio (CER) metric to evaluate the performance of deep learning models for SCA. CER is closely related to traditional side channel metrics Guessing Entropy (GE) and Success Rate (SR) and fits to deep learning scenario. Besides, we show that it works stably while deep learning metrics such as accuracy becomes rather unreliable when the training data tends to be imbalanced. However, estimating CER can be done as easy as natural metrics in deep learning algorithms with low computational complexity. Furthermore, we adapt CER metric to a new kind of loss function, namely CER loss function, designed specifically for deep learning in side channel scenario. In this way, we link directly the SCA objective to deep learning optimization. Our experiments on several datasets show that, for SCA with imbalanced data, CER loss function outperforms Cross Entropy loss function in various conditions.

Download Full-text

IMPROVING DEEP MATRIX FACTORIZATION WITH NORMALIZED CROSS ENTROPY LOSS FUNCTION FOR GRAPH-BASED MOOC RECOMMENDATION

14th International Conference on Computer Graphics, Visualization, Computer Vision and Image Processing ◽

10.33965/bigdaci2020_202011l017 ◽

2020 ◽

Keyword(s):

Loss Function ◽

Matrix Factorization ◽

Cross Entropy ◽

Entropy Loss

Download Full-text

Approximating the Gradient of Cross-Entropy Loss Function

IEEE Access ◽

10.1109/access.2020.3001531 ◽

2020 ◽

Vol 8 ◽

pp. 111626-111635

Author(s):

Li Li ◽

Milos Doroslovacki ◽

Murray H. Loew

Keyword(s):

Loss Function ◽

Cross Entropy ◽

Entropy Loss

Download Full-text

MR-NET: Exploiting Mutual Relation for Visual Relationship Detection

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33018110 ◽

2019 ◽

Vol 33 ◽

pp. 8110-8117 ◽

Cited By ~ 3

Author(s):

Yi Bin ◽

Yang Yang ◽

Chaofan Tao ◽

Zi Huang ◽

Jingjing Li ◽

...

Keyword(s):

Object Detection ◽

Loss Function ◽

Linear Constraint ◽

Cross Entropy ◽

Mutual Interaction ◽

Superior Performance ◽

Crucial Point ◽

Mutual Relation ◽

Relation Learning ◽

Ranking Loss

Inferring the interactions between objects, a.k.a visual relationship detection, is a crucial point for vision understanding, which captures more definite concepts than object detection. Most previous work that treats the interaction between a pair of objects as a one way fail to exploit the mutual relation between objects, which is essential to modern visual application. In this work, we propose a mutual relation net, dubbed MR-Net, to explore the mutual relation between paired objects for visual relationship detection. Specifically, we construct a mutual relation space to model the mutual interaction of paired objects, and employ linear constraint to optimize the mutual interaction, which is called mutual relation learning. Our mutual relation learning does not introduce any parameters, and can adapt to improve the performance of other methods. In addition, we devise a semantic ranking loss to discriminatively penalize predicates with semantic similarity, which is ignored by traditional loss function (e.g., cross entropy with softmax). Then, our MR-Net optimizes the mutual relation learning together with semantic ranking loss with a siamese network. The experimental results on two commonly used datasets (VG and VRD) demonstrate the superior performance of the proposed approach.

Download Full-text

MPCE: A Maximum Probability Based Cross Entropy Loss Function for Neural Network Classification

IEEE Access ◽

10.1109/access.2019.2946264 ◽

2019 ◽

Vol 7 ◽

pp. 146331-146341 ◽

Cited By ~ 4

Author(s):

Yangfan Zhou ◽

Xin Wang ◽

Mingchuan Zhang ◽

Junlong Zhu ◽

Ruijuan Zheng ◽

...

Keyword(s):

Neural Network ◽

Loss Function ◽

Cross Entropy ◽

Maximum Probability ◽

Entropy Loss ◽

Neural Network Classification

Download Full-text

Shaping the learning landscape in neural networks around wide flat minima

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1908636117 ◽

2019 ◽

Vol 117 (1) ◽

pp. 161-170 ◽

Cited By ~ 2

Author(s):

Carlo Baldassi ◽

Fabrizio Pittorino ◽

Riccardo Zecchina

Keyword(s):

Neural Networks ◽

Loss Function ◽

Critical Points ◽

Learning Process ◽

Numerical Study ◽

Cross Entropy ◽

Stochastic Gradient Descent ◽

Neural Network Models ◽

Entropy Loss ◽

Error Loss

Learning in deep neural networks takes place by minimizing a nonconvex high-dimensional loss function, typically by a stochastic gradient descent (SGD) strategy. The learning process is observed to be able to find good minimizers without getting stuck in local critical points and such minimizers are often satisfactory at avoiding overfitting. How these 2 features can be kept under control in nonlinear devices composed of millions of tunable connections is a profound and far-reaching open question. In this paper we study basic nonconvex 1- and 2-layer neural network models that learn random patterns and derive a number of basic geometrical and algorithmic features which suggest some answers. We first show that the error loss function presents few extremely wide flat minima (WFM) which coexist with narrower minima and critical points. We then show that the minimizers of the cross-entropy loss function overlap with the WFM of the error loss. We also show examples of learning devices for which WFM do not exist. From the algorithmic perspective we derive entropy-driven greedy and message-passing algorithms that focus their search on wide flat regions of minimizers. In the case of SGD and cross-entropy loss, we show that a slow reduction of the norm of the weights along the learning process also leads to WFM. We corroborate the results by a numerical study of the correlations between the volumes of the minimizers, their Hessian, and their generalization performance on real data.

Download Full-text

AttentionBased Deep Feature Fusion for the Scene Classification of HighResolution Remote Sensing Images

Remote Sensing ◽

10.3390/rs11171996 ◽

2019 ◽

Vol 11 (17) ◽

pp. 1996 ◽

Cited By ~ 7

Author(s):

Zhu ◽

Yan ◽

Mo ◽

Liu

Keyword(s):

Remote Sensing ◽

Loss Function ◽

Feature Fusion ◽

Cross Entropy ◽

Scene Classification ◽

Remote Sensing Images ◽

Graphic Processing Units ◽

Entropy Loss ◽

Deep Feature

Scene classification of highresolution remote sensing images (HRRSI) is one of the most important means of landcover classification. Deep learning techniques, especially the convolutional neural network (CNN) have been widely applied to the scene classification of HRRSI due to the advancement of graphic processing units (GPU). However, they tend to extract features from the whole images rather than discriminative regions. The visual attention mechanism can force the CNN to focus on discriminative regions, but it may suffer from the influence of intraclass diversity and repeated texture. Motivated by these problems, we propose an attention-based deep feature fusion (ADFF) framework that constitutes three parts, namely attention maps generated by Gradientweighted Class Activation Mapping (GradCAM), a multiplicative fusion of deep features and the centerbased cross-entropy loss function. First of all, we propose to make attention maps generated by GradCAM as an explicit input in order to force the network to concentrate on discriminative regions. Then, deep features derived from original images and attention maps are proposed to be fused by multiplicative fusion in order to consider both improved abilities to distinguish scenes of repeated texture and the salient regions. Finally, the centerbased cross-entropy loss function that utilizes both the cross-entropy loss and center loss function is proposed to backpropagate fused features so as to reduce the effect of intraclass diversity on feature representations. The proposed ADFF architecture is tested on three benchmark datasets to show its performance in scene classification. The experiments confirm that the proposed method outperforms most competitive scene classification methods with an average overall accuracy of 94% under different training ratios.

Download Full-text

Building Outline Extraction Directly Using the U2-Net Semantic Segmentation Model from High-Resolution Aerial Images and a Comparison Study

Remote Sensing ◽

10.3390/rs13163187 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3187

Author(s):

Xinchun Wei ◽

Xing Li ◽

Wei Liu ◽

Lianpeng Zhang ◽

Dayu Cheng ◽

...

Keyword(s):

Edge Detection ◽

Loss Function ◽

Semantic Segmentation ◽

Cross Entropy ◽

Aerial Images ◽

Building Extraction ◽

Precise Position ◽

Entropy Loss ◽

Imbalance Problem ◽

Outline Extraction

Deep learning techniques have greatly improved the efficiency and accuracy of building extraction using remote sensing images. However, high-quality building outline extraction results that can be applied to the field of surveying and mapping remain a significant challenge. In practice, most building extraction tasks are manually executed. Therefore, an automated procedure of a building outline with a precise position is required. In this study, we directly used the U2-net semantic segmentation model to extract the building outline. The extraction results showed that the U2-net model can provide the building outline with better accuracy and a more precise position than other models based on comparisons with semantic segmentation models (Segnet, U-Net, and FCN) and edge detection models (RCF, HED, and DexiNed) applied for two datasets (Nanjing and Wuhan University (WHU)). We also modified the binary cross-entropy loss function in the U2-net model into a multiclass cross-entropy loss function to directly generate the binary map with the building outline and background. We achieved a further refined outline of the building, thus showing that with the modified U2-net model, it is not necessary to use non-maximum suppression as a post-processing step, as in the other edge detection models, to refine the edge map. Moreover, the modified model is less affected by the sample imbalance problem. Finally, we created an image-to-image program to further validate the modified U2-net semantic segmentation model for building outline extraction.

Download Full-text