An algorithm to solve a quantile optimization problem with loss function having a separable structure, and its application to an aerospace problem

2019 ◽  
Vol 35 (5) ◽  
pp. 1269-1281
Author(s):  
S.V. Ivanov ◽  
A.I. Kibzun ◽  
A.S. Stepanova


2013 ◽  
Vol 427-429 ◽  
pp. 1121-1127 ◽  
Author(s):  
Man Fu Yan ◽  
Jiu Hai Wang

In this paper, a Gaussian loss function is applied in place of the ε-insensitive loss function of the standard support vector regression machine (SVRM) to devise a new model and a new type of support vector classification machine whose optimization problem is easier to solve. The new algorithm is validated on open data sets and applied to environmental monitoring in water-culture (hydroponic) plants, where its monitoring results are better than those of the other available methods.
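
The abstract does not give the loss formulas; the sketch below contrasts the standard ε-insensitive SVR loss with one common reading of a "Gaussian" loss (the quadratic loss implied by a Gaussian noise model), purely as an illustration.

```python
# Minimal sketch (not the authors' code): the eps-insensitive SVR loss versus a
# quadratic "Gaussian" loss, the latter being an assumed reading of the abstract;
# parameter values are illustrative.
import numpy as np

def eps_insensitive_loss(residual, eps=0.1):
    # Zero inside the eps-tube, linear outside: non-smooth at |r| = eps.
    return np.maximum(np.abs(residual) - eps, 0.0)

def gaussian_loss(residual):
    # Smooth everywhere, which is what makes the resulting optimization easier to handle.
    return 0.5 * residual ** 2

r = np.linspace(-2.0, 2.0, 9)
print(eps_insensitive_loss(r))
print(gaussian_loss(r))
```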


2020 ◽  
Vol 34 (04) ◽  
pp. 4884-4891
Author(s):  
Qingliang Liu ◽  
Jinmei Lai

Training deep neural networks is inherently subject to predefined and fixed loss functions during optimization. To improve learning efficiency, we develop Stochastic Loss Function (SLF) to dynamically and automatically generate appropriate gradients for training deep networks in the same round of back-propagation, while maintaining the completeness and differentiability of the training pipeline. In SLF, a generic loss function is formulated as a joint optimization problem over network weights and loss parameters. To guarantee the requisite efficiency, gradients with respect to the generic differentiable loss are leveraged for selecting the loss function and optimizing the network weights. Extensive experiments on a variety of popular datasets demonstrate that SLF is capable of obtaining appropriate gradients at different stages of training, and can significantly improve the performance of various deep models on real-world tasks including classification, clustering, regression, neural machine translation, and object detection.
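
As an illustration of the idea of optimizing loss parameters jointly with network weights in one backward pass, here is a minimal sketch; the loss parameterization and all names are assumptions, not the authors' SLF implementation.

```python
# Minimal sketch (assumption, not SLF itself): a loss with its own learnable
# parameter, optimized jointly with the network weights in a single backward pass.
import torch
import torch.nn as nn

class ParametricLoss(nn.Module):
    def __init__(self):
        super().__init__()
        # Interpolation weight between two base losses; learned alongside the network.
        self.alpha = nn.Parameter(torch.tensor(0.0))

    def forward(self, logits, target):
        w = torch.sigmoid(self.alpha)
        ce = nn.functional.cross_entropy(logits, target)
        mse = nn.functional.mse_loss(
            torch.softmax(logits, dim=-1),
            nn.functional.one_hot(target, logits.shape[-1]).float())
        return w * ce + (1.0 - w) * mse

model = nn.Linear(10, 3)
loss_fn = ParametricLoss()
# One optimizer updates both the network weights and the loss parameters.
opt = torch.optim.SGD(list(model.parameters()) + list(loss_fn.parameters()), lr=0.1)

x, y = torch.randn(8, 10), torch.randint(0, 3, (8,))
loss = loss_fn(model(x), y)   # weights and loss parameters share one computation graph
opt.zero_grad(); loss.backward(); opt.step()
```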


Author(s):  
Pieter Robyns ◽  
Peter Quax ◽  
Wim Lamotte

Sensitive cryptographic information, e.g. AES secret keys, can be extracted from the electromagnetic (EM) leakages unintentionally emitted by a device using techniques such as Correlation Electromagnetic Analysis (CEMA). In this paper, we introduce Correlation Optimization (CO), a novel approach that improves CEMA attacks by formulating the selection of useful EM leakage samples in a trace as a machine learning optimization problem. To this end, we propose the correlation loss function, which aims to maximize the Pearson correlation between a set of EM traces and the true AES key during training. We show that CO works with high-dimensional and noisy traces, regardless of time-domain trace alignment and without requiring prior knowledge of the power consumption characteristics of the cryptographic hardware. We evaluate our approach using the ASCAD benchmark dataset and a custom dataset of EM leakages from an Arduino Duemilanove, captured with a USRP B200 SDR. Our results indicate that the masked AES implementation used in all three ASCAD datasets can be broken with a shallow Multilayer Perceptron model, whilst requiring only 1,000 test traces on average. A similar methodology was employed to break the unprotected AES implementation from our custom dataset, using 22,000 unaligned and unfiltered test traces.
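
A minimal sketch of a Pearson-correlation-based loss in the spirit described above; the exact formulation, leakage model, and variable names are assumptions rather than the paper's code.

```python
# Minimal sketch (assumption, not the paper's implementation): a "correlation loss"
# that rewards high Pearson correlation between a network output derived from EM
# traces and a key-dependent hypothetical leakage (e.g. a Hamming-weight model).
import torch

def correlation_loss(pred, hypo):
    # Negative Pearson correlation: minimizing this maximizes the correlation
    # that a CEMA-style attack exploits.
    pred = pred - pred.mean()
    hypo = hypo - hypo.mean()
    corr = (pred * hypo).sum() / (pred.norm() * hypo.norm() + 1e-12)
    return -corr

pred = torch.randn(256, requires_grad=True)   # outputs for a batch of traces
hypo = torch.randn(256)                       # hypothetical leakage under a key guess
loss = correlation_loss(pred, hypo)
loss.backward()
print(loss.item())
```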


2020 ◽  
Vol 2 (1) ◽  
pp. 37-55 ◽  
Author(s):  
Carl Leake ◽  
Daniele Mortari

This article presents a new methodology called Deep Theory of Functional Connections (TFC) that estimates the solutions of partial differential equations (PDEs) by combining neural networks with the TFC. The TFC is used to transform PDEs into unconstrained optimization problems by analytically embedding the PDE’s constraints into a “constrained expression” containing a free function. In this research, the free function is chosen to be a neural network, which is used to solve the now unconstrained optimization problem. This optimization problem consists of minimizing a loss function that is chosen to be the square of the residuals of the PDE. The neural network is trained in an unsupervised manner to minimize this loss function. This methodology has two major differences when compared with popular methods used to estimate the solutions of PDEs. First, this methodology does not need to discretize the domain into a grid; rather, it can randomly sample points from the domain during the training phase. Second, after training, this methodology produces an accurate analytical approximation of the solution throughout the entire training domain. Because the methodology produces an analytical solution, it is straightforward to obtain the solution at any point within the domain and to perform further manipulation, such as differentiation, if needed. In contrast, other popular methods require extra numerical techniques if the estimated solution is desired at points that do not lie on the discretized grid, or if further manipulation of the estimated solution must be performed.
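
A minimal sketch of the idea, reduced to a one-dimensional boundary-value problem for brevity; the constrained expression below enforces assumed boundary conditions u(0) = u(1) = 0 exactly for any free function g, which is taken to be a small neural network trained on the squared residual at randomly sampled points.

```python
# Minimal sketch (illustrative 1-D case, not the article's code): a TFC-style
# constrained expression for u(0)=0, u(1)=0, trained on the residual of the assumed
# example problem u''(x) = -pi^2 * sin(pi*x), whose exact solution is sin(pi*x).
import torch
import torch.nn as nn

g = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))  # free function

def u(x):
    # Constrained expression: the boundary values are embedded analytically,
    # so the constraints hold for any parameters of g.
    return g(x) - (1 - x) * g(torch.zeros(1, 1)) - x * g(torch.ones(1, 1))

opt = torch.optim.Adam(g.parameters(), lr=1e-3)
for _ in range(2000):
    x = torch.rand(64, 1, requires_grad=True)      # random collocation points, no grid
    ux = torch.autograd.grad(u(x).sum(), x, create_graph=True)[0]
    uxx = torch.autograd.grad(ux.sum(), x, create_graph=True)[0]
    loss = ((uxx + torch.pi ** 2 * torch.sin(torch.pi * x)) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

# After training, u(x) is an analytical expression that can be evaluated or
# differentiated at any point of the domain, not only at grid nodes.
```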


2020 ◽  
Vol 10 (3) ◽  
pp. 69-84
Author(s):  
P.A. Ardabyevskiy ◽  
D.A. Gonchar ◽  
Yu.S. Kan

The article considers a plane quantile optimization problem with a bilinear loss function, which, using sufficient optimality conditions, is reduced to a linear programming problem. The reduction is based on the use of a polyhedral model of the kernel of the probability distribution of the vector of random parameters. To build this model, an algorithm based on the method of statistical modeling is proposed. A description of the software package for constructing the kernel model for a number of probability distributions of random parameters is given.
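
As a hedged illustration only (the article's actual algorithm is not reproduced here), a Monte-Carlo construction of a polyhedral model of a confidence-kernel-like set might look as follows; the distribution, confidence level, and direction sampling are assumptions.

```python
# Hedged sketch (not the article's algorithm): sample directions c, estimate the
# empirical alpha-quantile of c^T X, and intersect the resulting half-spaces
# c^T x <= q_alpha into a polyhedral model A x <= b usable in a linear program.
import numpy as np

rng = np.random.default_rng(0)
alpha = 0.9
X = rng.multivariate_normal([0, 0], [[1.0, 0.3], [0.3, 0.5]], size=10_000)  # toy 2-D sample

angles = np.linspace(0, 2 * np.pi, 60, endpoint=False)
A = np.stack([np.cos(angles), np.sin(angles)], axis=1)
b = np.array([np.quantile(X @ c, alpha) for c in A])

# A point x belongs to the polyhedral model iff (A @ x <= b).all().
print((A @ np.zeros(2) <= b).all())
```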


Author(s):  
Balaji Ramakrishnan ◽  
S. S. Rao

The application of the concept of robust design, based on Taguchi’s loss function, in formulating and solving nonlinear optimization problems is investigated. The effectiveness of the approach is illustrated with two examples. The first example is a machining parameter optimization problem wherein the production cost, tool life and production rate are optimized with limitations on machining characteristics such as cutting power, cutting tool temperature and surface finish. The second example is a welded beam design problem where the dimensions of the weldment and the beam are found without exceeding the limitations stated on the shear stress in the weld, normal stress in the beam, buckling load on the beam and tip deflection of the beam. The results are highlighted by comparing the solutions of the robust formulation with those obtained from the conventional formulation. The methodology presented in this work is expected to be useful in the design of products and processes that are least sensitive to noise and thus exhibit higher quality.
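
For context, the textbook form of Taguchi's quadratic loss and its expected value, which a robust-design objective typically penalizes; this is a generic sketch, not the authors' formulation.

```python
# Generic sketch of Taguchi's quadratic loss L(y) = k * (y - m)^2 and its expectation
# E[L] = k * (sigma^2 + (mu - m)^2); robustness means reducing both spread and bias.
import numpy as np

def taguchi_loss(y, target, k=1.0):
    return k * (y - target) ** 2

def expected_taguchi_loss(mu, sigma, target, k=1.0):
    # Variance term plus squared bias term.
    return k * (sigma ** 2 + (mu - target) ** 2)

samples = np.random.normal(10.2, 0.5, size=1000)   # simulated quality characteristic
print(taguchi_loss(samples, target=10.0).mean())
print(expected_taguchi_loss(10.2, 0.5, target=10.0))
```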


Author(s):  
Adam N. Elmachtoub ◽  
Paul Grigas

Many real-world analytics problems involve two significant challenges: prediction and optimization. Because of the typically complex nature of each challenge, the standard paradigm is predict-then-optimize. By and large, machine learning tools are intended to minimize prediction error and do not account for how the predictions will be used in the downstream optimization problem. In contrast, we propose a new and very general framework, called Smart “Predict, then Optimize” (SPO), which directly leverages the optimization problem structure—that is, its objective and constraints—for designing better prediction models. A key component of our framework is the SPO loss function, which measures the decision error induced by a prediction. Training a prediction model with respect to the SPO loss is computationally challenging, and, thus, we derive, using duality theory, a convex surrogate loss function, which we call the SPO+ loss. Most importantly, we prove that the SPO+ loss is statistically consistent with respect to the SPO loss under mild conditions. Our SPO+ loss function can tractably handle any polyhedral, convex, or even mixed-integer optimization problem with a linear objective. Numerical experiments on shortest-path and portfolio-optimization problems show that the SPO framework can lead to significant improvement under the predict-then-optimize paradigm, in particular, when the prediction model being trained is misspecified. We find that linear models trained using SPO+ loss tend to dominate random-forest algorithms, even when the ground truth is highly nonlinear. This paper was accepted by Yinyu Ye, optimization.
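
A minimal sketch of the SPO+ surrogate in one common statement, for a minimization problem min_{w ∈ S} cᵀw with a linear-optimization oracle over S; the toy simplex oracle below stands in for a shortest-path or portfolio solver.

```python
# Sketch (one common statement of the SPO+ surrogate, assumed rather than quoted):
#   l_SPO+(c_hat, c) = max_{w in S} (c - 2*c_hat)^T w + 2*c_hat^T w*(c) - z*(c),
# where w*(c) minimizes c^T w over S and z*(c) is the optimal value.
import numpy as np

def simplex_oracle(c):
    # argmin_{w in unit simplex} c^T w is the vertex with the smallest cost entry.
    w = np.zeros_like(c)
    w[np.argmin(c)] = 1.0
    return w

def spo_plus_loss(c_hat, c):
    w_star = simplex_oracle(c)                # w*(c)
    z_star = c @ w_star                       # z*(c)
    w_max = simplex_oracle(2 * c_hat - c)     # maximizer of (c - 2*c_hat)^T w
    return (c - 2 * c_hat) @ w_max + 2 * c_hat @ w_star - z_star

print(spo_plus_loss(np.array([1.0, 2.0, 3.0]), np.array([3.0, 1.0, 2.0])))
```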


2019 ◽  
Vol 6 ◽  
pp. 43-66
Author(s):  
Robin Kurtz ◽  
Marco Kuhlmann

Dependency parsing can be cast as a combinatorial optimization problem with the objective of finding the highest-scoring graph, where edge scores are learnt from data. Several of the decoding algorithms that have been applied to this task employ structural restrictions on candidate solutions, such as the restriction to projective dependency trees in syntactic parsing, or the restriction to noncrossing graphs in semantic parsing. In this paper we study the interplay between structural restrictions and a common loss function in neural dependency parsing, the structural hinge loss. We show how structural constraints can make networks trained under this loss function diverge and propose a modified loss function that solves this problem. Our experimental evaluation shows that the modified loss function can yield improved parsing accuracy compared to the unmodified baseline.
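
For reference, a generic structured (margin-rescaled) hinge loss; the paper's modified loss is not reproduced here, and the cost term and names below are illustrative.

```python
# Generic structured hinge loss sketch (not the modified loss proposed in the paper):
# penalizes the margin violation between the gold structure and the highest-scoring
# cost-augmented structure returned by the decoder.
import torch

def structured_hinge_loss(score_gold, score_pred, hamming_cost):
    # score_gold / score_pred are model scores of whole graphs; hamming_cost
    # counts the edges on which the predicted graph differs from the gold one.
    return torch.clamp(score_pred + hamming_cost - score_gold, min=0.0)

loss = structured_hinge_loss(torch.tensor(4.2), torch.tensor(3.9), torch.tensor(1.0))
print(loss.item())   # 0.7: the prediction is not separated from gold by the required margin
```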


Author(s):  
A. Howie ◽  
D.W. McComb

The bulk loss function Im(−1/ε(ω)), a well established tool for the interpretation of valence loss spectra, is being progressively adapted to the wide variety of inhomogeneous samples of interest to the electron microscopist. Proportionality between n, the local valence electron density, and ε − 1 (Sellmeyer's equation) has sometimes been assumed but may not be valid even in homogeneous samples. Figs. 1 and 2 show the experimentally measured bulk loss functions for three pure silicates of different specific gravity ρ: quartz (ρ = 2.66), coesite (ρ = 2.93) and a zeolite (ρ = 1.79). Clearly, despite the substantial differences in density, the shift of the prominent loss peak is very small and far less than that predicted by scaling ε for quartz with Sellmeyer's equation, or even the somewhat smaller shift given by the Clausius-Mossotti (CM) relation, which assumes proportionality between n (or ρ in this case) and (ε − 1)/(ε + 2). Both theories overestimate the rise in the peak height for coesite and underestimate the increase at high energies.
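
For reference, the two density-scaling assumptions discussed above, restated in LaTeX (notation assumed: n is the valence electron density, ε(ω) the dielectric function, and the bulk loss function is Im{−1/ε(ω)}):

```latex
% Restatement of the scaling assumptions mentioned in the text (assumed notation).
\text{Sellmeyer-type scaling:} \qquad \varepsilon(\omega) - 1 \;\propto\; n
\qquad\qquad
\text{Clausius--Mossotti relation:} \qquad
\frac{\varepsilon(\omega) - 1}{\varepsilon(\omega) + 2} \;\propto\; n
```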

