MPCE: A Maximum Probability Based Cross Entropy Loss Function for Neural Network Classification

In sentiment classification, the convolutional neural network (CNN) and the long short-term memory (LSTM) network are praised for their classification and prediction performance, but their accuracy, loss rate, and time cost are not ideal. To this end, a deep learning structure that combines an improved cross entropy with word weighting is proposed for cross-domain sentiment classification; it aims at better text sentiment classification by optimizing and improving the recurrent neural network (RNN) and the CNN. First, we use the ideas behind the hinge loss function (hinge loss) and the triplet loss function (triplet loss) to improve the cross entropy loss. The improved cross entropy loss function is combined with the CNN model and the LSTM network, and both are tested on binary (two-class) classification problems. Then, the LSTM binary-optimize (LSTM-BO) model and the CNN binary-optimize (CNN-BO) model are proposed, which are more effective at fitting the prediction errors and preventing overfitting. Finally, considering how the recurrent neural network processes text, the influence of each input word on the final classification is analysed, which yields the importance of each word to the classification result. The experimental results show that, within the same time, the proposed weight-recurrent neural network (W-RNN) model gives higher weight to words with a stronger emotional tendency, which reduces the loss of emotional information and improves classification accuracy.


Improving Deep Matrix Factorization with Normalized Cross Entropy Loss Function for Graph-Based MOOC Recommendation

14th International Conference on Computer Graphics, Visualization, Computer Vision and Image Processing ◽

10.33965/bigdaci2020_202011l017 ◽

2020 ◽

Keyword(s):

Loss Function ◽

Matrix Factorization ◽

Cross Entropy ◽

Entropy Loss


Approximating the Gradient of Cross-Entropy Loss Function

IEEE Access ◽

10.1109/access.2020.3001531 ◽

2020 ◽

Vol 8 ◽

pp. 111626-111635

Author(s):

Li Li ◽

Milos Doroslovacki ◽

Murray H. Loew

Keyword(s):

Loss Function ◽

Cross Entropy ◽

Entropy Loss


Shaping the learning landscape in neural networks around wide flat minima

Proceedings of the National Academy of Sciences ◽

10.1073/pnas.1908636117 ◽

2019 ◽

Vol 117 (1) ◽

pp. 161-170 ◽

Cited By ~ 2

Author(s):

Carlo Baldassi ◽

Fabrizio Pittorino ◽

Riccardo Zecchina

Keyword(s):

Neural Networks ◽

Loss Function ◽

Critical Points ◽

Learning Process ◽

Numerical Study ◽

Cross Entropy ◽

Stochastic Gradient Descent ◽

Neural Network Models ◽

Entropy Loss ◽

Error Loss

Learning in deep neural networks takes place by minimizing a nonconvex high-dimensional loss function, typically by a stochastic gradient descent (SGD) strategy. The learning process is observed to be able to find good minimizers without getting stuck in local critical points and such minimizers are often satisfactory at avoiding overfitting. How these 2 features can be kept under control in nonlinear devices composed of millions of tunable connections is a profound and far-reaching open question. In this paper we study basic nonconvex 1- and 2-layer neural network models that learn random patterns and derive a number of basic geometrical and algorithmic features which suggest some answers. We first show that the error loss function presents few extremely wide flat minima (WFM) which coexist with narrower minima and critical points. We then show that the minimizers of the cross-entropy loss function overlap with the WFM of the error loss. We also show examples of learning devices for which WFM do not exist. From the algorithmic perspective we derive entropy-driven greedy and message-passing algorithms that focus their search on wide flat regions of minimizers. In the case of SGD and cross-entropy loss, we show that a slow reduction of the norm of the weights along the learning process also leads to WFM. We corroborate the results by a numerical study of the correlations between the volumes of the minimizers, their Hessian, and their generalization performance on real data.


Attention-Based Deep Feature Fusion for the Scene Classification of High-Resolution Remote Sensing Images

Remote Sensing ◽

10.3390/rs11171996 ◽

2019 ◽

Vol 11 (17) ◽

pp. 1996 ◽

Cited By ~ 7

Author(s):

Zhu ◽

Yan ◽

Mo ◽

Liu

Keyword(s):

Remote Sensing ◽

Loss Function ◽

Feature Fusion ◽

Cross Entropy ◽

Scene Classification ◽

Remote Sensing Images ◽

Graphic Processing Units ◽

Entropy Loss ◽

Deep Feature

Scene classification of high-resolution remote sensing images (HRRSI) is one of the most important means of land-cover classification. Deep learning techniques, especially the convolutional neural network (CNN), have been widely applied to the scene classification of HRRSI due to the advancement of graphic processing units (GPU). However, they tend to extract features from whole images rather than from discriminative regions. The visual attention mechanism can force the CNN to focus on discriminative regions, but it may suffer from the influence of intraclass diversity and repeated texture. Motivated by these problems, we propose an attention-based deep feature fusion (ADFF) framework that consists of three parts, namely attention maps generated by Gradient-weighted Class Activation Mapping (Grad-CAM), a multiplicative fusion of deep features, and the center-based cross-entropy loss function. First, we propose to use the attention maps generated by Grad-CAM as an explicit input in order to force the network to concentrate on discriminative regions. Then, deep features derived from the original images and from the attention maps are fused by multiplicative fusion, in order to improve the ability to distinguish scenes with repeated texture while also accounting for the salient regions. Finally, the center-based cross-entropy loss function, which combines the cross-entropy loss and the center loss, is proposed to backpropagate fused features so as to reduce the effect of intraclass diversity on feature representations. The proposed ADFF architecture is tested on three benchmark datasets to show its performance in scene classification. The experiments confirm that the proposed method outperforms most competitive scene classification methods, with an average overall accuracy of 94% under different training ratios.
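The combination of cross-entropy loss and center loss described above can be sketched for a single sample as follows. This is a minimal illustration, not the ADFF implementation: the function names, the weighting factor `lam`, and the use of fixed (rather than learned) class centers are all assumptions made for the example.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, label):
    # Negative log-probability assigned to the true class.
    return -math.log(softmax(logits)[label])

def center_loss(feature, center):
    # Half the squared Euclidean distance between a feature and its class center.
    return 0.5 * sum((f - c) ** 2 for f, c in zip(feature, center))

def center_based_cross_entropy(logits, feature, label, centers, lam=0.5):
    # Hypothetical combined objective: CE plus a weighted center-loss term.
    # `lam` is an assumed trade-off weight; `centers[label]` is the class center.
    return cross_entropy(logits, label) + lam * center_loss(feature, centers[label])

# Example: two classes with assumed, fixed centers in a 2-D feature space.
centers = [[0.0, 0.0], [1.0, 1.0]]
loss = center_based_cross_entropy([0.0, 0.0], [1.0, 2.0], 0, centers, lam=0.5)
```

The center term pulls features of the same class toward a common point, which is one way to read the abstract's goal of reducing the effect of intraclass diversity.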


Building Outline Extraction Directly Using the U2-Net Semantic Segmentation Model from High-Resolution Aerial Images and a Comparison Study

Remote Sensing ◽

10.3390/rs13163187 ◽

2021 ◽

Vol 13 (16) ◽

pp. 3187

Author(s):

Xinchun Wei ◽

Xing Li ◽

Wei Liu ◽

Lianpeng Zhang ◽

Dayu Cheng ◽

...

Keyword(s):

Edge Detection ◽

Loss Function ◽

Semantic Segmentation ◽

Cross Entropy ◽

Aerial Images ◽

Building Extraction ◽

Precise Position ◽

Entropy Loss ◽

Imbalance Problem ◽

Outline Extraction

Deep learning techniques have greatly improved the efficiency and accuracy of building extraction from remote sensing images. However, producing high-quality building outlines that can be applied in surveying and mapping remains a significant challenge. In practice, most building extraction tasks are executed manually. Therefore, an automated procedure that extracts building outlines with precise positions is required. In this study, we directly used the U2-net semantic segmentation model to extract the building outline. Comparisons with semantic segmentation models (Segnet, U-Net, and FCN) and edge detection models (RCF, HED, and DexiNed) on two datasets (Nanjing and Wuhan University (WHU)) showed that the U2-net model provides building outlines with better accuracy and more precise positions than the other models. We also modified the binary cross-entropy loss function in the U2-net model into a multiclass cross-entropy loss function to directly generate the binary map with the building outline and background. We achieved a further refined outline of the building, thus showing that with the modified U2-net model, it is not necessary to use non-maximum suppression as a post-processing step, as in the other edge detection models, to refine the edge map. Moreover, the modified model is less affected by the sample imbalance problem. Finally, we created an image-to-image program to further validate the modified U2-net semantic segmentation model for building outline extraction.
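For orientation, the two loss functions named above can be written for a single pixel as in the sketch below. It is only an illustration of the standard definitions, not the modified U2-net code; the function names and the one-logit versus two-logit parameterisation are assumptions. For two classes the values coincide when the sigmoid logit equals the difference of the two softmax logits, so any practical difference comes from the network's output head and training setup, which the sketch does not model.

```python
import math

def binary_cross_entropy(logit, target):
    # One logit per pixel passed through a sigmoid; target is 0 or 1.
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1 - target) * math.log(1.0 - p))

def multiclass_cross_entropy(logits, target):
    # One logit per class passed through a softmax; target is a class index.
    # Computed as log-sum-exp minus the true-class logit for stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

# Two-class case: class 0 = background, class 1 = outline (assumed labels).
z = 1.5
bce = binary_cross_entropy(z, 1)
mce = multiclass_cross_entropy([0.0, z], 1)
```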


Personal Interest Attention Graph Neural Networks for Session-Based Recommendation

Entropy ◽

10.3390/e23111500 ◽

2021 ◽

Vol 23 (11) ◽

pp. 1500

Author(s):

Xiangde Zhang ◽

Yuan Zhou ◽

Jianping Wang ◽

Xiaojun Lu

Keyword(s):

Neural Network ◽

Neural Networks ◽

Objective Function ◽

Cross Entropy ◽

Personal Interest ◽

Entropy Loss ◽

Convolutional Networks ◽

The Cross ◽

Graph Neural Networks

Session-based recommendation aims to predict a user’s next click based on the user’s current and historical sessions, and can be applied to shopping websites and apps. Existing session-based recommendation methods cannot accurately capture the complex transitions between items. In addition, some approaches compress sessions into a fixed representation vector without taking into account the user’s interest preferences at the current moment, thus limiting the accuracy of recommendations. Considering the diversity of items and users’ interests, a personalized interest attention graph neural network (PIA-GNN) is proposed for session-based recommendation. This approach utilizes personalized graph convolutional networks (PGNN) to capture complex transitions between items, invoking an interest-aware mechanism to activate users’ interest in different items adaptively. In addition, a self-attention layer is used to capture long-term dependencies between items when capturing users’ long-term preferences. In this paper, the cross-entropy loss is used as the objective function to train our model. We conduct extensive experiments on two real datasets, and the results show that PIA-GNN outperforms existing personalized session-aware recommendation methods.


Prediction of Ship Heave Motion Using Regularized BP Neural Network with Cross Entropy Error Function

International Journal of Computational Intelligence Systems ◽

10.1007/s44196-021-00043-8 ◽

2021 ◽

Vol 14 (1) ◽

Author(s):

Hailun Wang ◽

Fei Wu ◽

Dongge Lei

Keyword(s):

Neural Network ◽

Loss Function ◽

Error Function ◽

Back Propagation ◽

Back Propagation Neural Network ◽

Cross Entropy ◽

Saturation Phenomenon ◽

Entropy Error ◽

The Mean ◽

Heave Motion

Accurate prediction of a ship’s heave motion can greatly enhance the safety of offshore operations. Due to its complexity and nonlinearity, however, predicting a ship’s heave motion is a difficult task. In this paper, a new method for predicting ship heave motion is proposed based on an improved back propagation neural network (IBPNN). To overcome the gradient saturation phenomenon of the traditional BPNN, the mean square error (MSE) loss function is replaced with a cross entropy (CE) loss function in the IBPNN. Meanwhile, the weights of the IBPNN are regularized by the $L_2$ norm to enhance the generalization ability of the traditional BPNN. Finally, the conjugate gradient method is adopted to train the IBPNN. The IBPNN is used to predict ship heave motion, and the prediction results prove its effectiveness.
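The gradient saturation argument can be illustrated with a single sigmoid output unit. That setup, the function names, and the penalty form `0.5 * lam * ||w||^2` are assumptions of this sketch rather than details given in the abstract. With MSE, the gradient with respect to the pre-activation carries a factor `a * (1 - a)` that vanishes when the unit saturates; with cross entropy that factor cancels.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def grad_mse_wrt_z(z, y):
    # d/dz of 0.5 * (a - y)^2 with a = sigmoid(z).
    a = sigmoid(z)
    return (a - y) * a * (1.0 - a)

def grad_ce_wrt_z(z, y):
    # d/dz of -[y * log(a) + (1 - y) * log(1 - a)] with a = sigmoid(z).
    return sigmoid(z) - y

def l2_penalty(weights, lam):
    # Assumed form of the L2 regularization term added to the loss.
    return 0.5 * lam * sum(w * w for w in weights)

# A badly wrong, saturated unit: target 1 but pre-activation strongly negative.
g_mse = grad_mse_wrt_z(-6.0, 1.0)
g_ce = grad_ce_wrt_z(-6.0, 1.0)
```

In this saturated case the MSE gradient is close to zero even though the prediction is far from the target, while the CE gradient stays close to -1, so the error signal is not suppressed.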


Ranking Loss: Maximizing the Success Rate in Deep Learning Side-Channel Analysis

IACR Transactions on Cryptographic Hardware and Embedded Systems ◽

10.46586/tches.v2021.i1.25-55 ◽

2020 ◽

pp. 25-55

Author(s):

Gabriel Zaid ◽

Lilian Bossuet ◽

François Dassance ◽

Amaury Habrard ◽

Alexandre Venelli

Keyword(s):

Deep Learning ◽

Success Rate ◽

Loss Function ◽

Estimation Error ◽

Approximation Error ◽

Cross Entropy ◽

Side Channel ◽

Entropy Loss ◽

The Cross ◽

Ranking Loss

The side-channel community recently investigated a new approach, based on deep learning, to significantly improve profiled attacks against embedded systems. Compared to template attacks, deep learning techniques can deal with protected implementations, such as masking or desynchronization, without substantial preprocessing. However, important issues are still open. One challenging problem is to adapt the methods classically used in the machine learning field (e.g. loss function, performance metrics) to the specific side-channel context in order to obtain optimal results. We propose a new loss function, derived from the learning to rank approach, that helps prevent the approximation and estimation errors induced by the classical cross-entropy loss. We theoretically demonstrate that this new function, called Ranking Loss (RkL), maximizes the success rate by minimizing the ranking error of the secret key in comparison with all other hypotheses. The resulting model converges towards the optimal distinguisher when considering the mutual information between the secret and the leakage. Consequently, the approximation error is prevented. Furthermore, the estimation error induced by the cross-entropy is reduced by up to 23%. When the ranking loss is used, the convergence towards the best solution is up to 23% faster than for a model using the cross-entropy loss function. We validate our theoretical propositions on public datasets.
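The idea of penalising the rank of the correct key rather than its probability can be sketched with a generic pairwise logistic ranking loss in the learning-to-rank style. This may differ from the paper's exact RkL definition; the function names, the scale parameter `alpha`, and the example scores are hypothetical.

```python
import math

def pairwise_ranking_loss(scores, correct, alpha=1.0):
    # Sum over all wrong hypotheses k of log2(1 + exp(-alpha * (s_correct - s_k))).
    # Each term is small when the correct hypothesis outscores hypothesis k.
    s_true = scores[correct]
    return sum(
        math.log2(1.0 + math.exp(-alpha * (s_true - s)))
        for k, s in enumerate(scores)
        if k != correct
    )

def rank_of(scores, correct):
    # Number of wrong hypotheses scoring at least as high as the correct one
    # (0 means the correct hypothesis is ranked first).
    return sum(
        1 for k, s in enumerate(scores)
        if k != correct and s >= scores[correct]
    )

# Three key hypotheses; index 0 is assumed to be the secret key.
tied = pairwise_ranking_loss([0.0, 0.0, 0.0], 0)
well_ranked = pairwise_ranking_loss([3.0, 0.0, 0.0], 0)
badly_ranked = pairwise_ranking_loss([0.0, 3.0, 0.0], 0)
```

Lowering this loss pushes the correct hypothesis above the others pairwise, which directly targets the rank that determines the success rate.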
