Sub-gradient based projection neural networks for non-differentiable optimization problems

In artificial neural networks, learning from data is a computationally demanding task in which a large number of connection weights are iteratively tuned through stochastic-gradient-based heuristic processes over a cost function. It is not well understood how learning occurs in these systems, in particular how they avoid getting trapped in configurations with poor computational performance. Here, we study the difficult case of networks with discrete weights, where the optimization landscape is very rough even for simple architectures, and provide theoretical and numerical evidence of the existence of rare—but extremely dense and accessible—regions of configurations in the network weight space. We define a measure, the robust ensemble (RE), which suppresses trapping by isolated configurations and amplifies the role of these dense regions. We analytically compute the RE in some exactly solvable models and also provide a general algorithmic scheme that is straightforward to implement: define a cost function given by a sum of a finite number of replicas of the original cost function, with a constraint centering the replicas around a driving assignment. To illustrate this, we derive several powerful algorithms, ranging from Markov Chains to message passing to gradient descent processes, where the algorithms target the robust dense states, resulting in substantial improvements in performance. The weak dependence on the number of precision bits of the weights leads us to conjecture that very similar reasoning applies to more conventional neural networks. Analogous algorithmic schemes can also be applied to other optimization problems.

Download Full-text

A Comprehensive Investigation of Ensembles of Gaussian-Based and Gradient-Based Transformed Reliability Analyses: When and How to Use Them

Volume 2A: 42nd Design Automation Conference ◽

10.1115/detc2016-59151 ◽

2016 ◽

Author(s):

Po Ting Lin ◽

Wei-Hao Lu ◽

Shu-Ping Lin

Keyword(s):

Design Optimization ◽

Design Space ◽

Optimization Problems ◽

Nonlinear Problems ◽

Kernel Functions ◽

Sensitivity Analyses ◽

Normal Space ◽

Highly Nonlinear ◽

Standard Normal ◽

Gradient Based

In the past few years, researchers have begun to investigate the existence of arbitrary uncertainties in the design optimization problems. Most traditional reliability-based design optimization (RBDO) methods transform the design space to the standard normal space for reliability analysis but may not work well when the random variables are arbitrarily distributed. It is because that the transformation to the standard normal space cannot be determined or the distribution type is unknown. The methods of Ensemble of Gaussian-based Reliability Analyses (EoGRA) and Ensemble of Gradient-based Transformed Reliability Analyses (EGTRA) have been developed to estimate the joint probability density function using the ensemble of kernel functions. EoGRA performs a series of Gaussian-based kernel reliability analyses and merged them together to compute the reliability of the design point. EGTRA transforms the design space to the single-variate design space toward the constraint gradient, where the kernel reliability analyses become much less costly. In this paper, a series of comprehensive investigations were performed to study the similarities and differences between EoGRA and EGTRA. The results showed that EGTRA performs accurate and effective reliability analyses for both linear and nonlinear problems. When the constraints are highly nonlinear, EGTRA may have little problem but still can be effective in terms of starting from deterministic optimal points. On the other hands, the sensitivity analyses of EoGRA may be ineffective when the random distribution is completely inside the feasible space or infeasible space. However, EoGRA can find acceptable design points when starting from deterministic optimal points. Moreover, EoGRA is capable of delivering estimated failure probability of each constraint during the optimization processes, which may be convenient for some applications.

Download Full-text

Improving Adversarial Attacks on Deep Neural Networks via Constricted Gradient-based Perturbations

Information Sciences ◽

10.1016/j.ins.2021.04.033 ◽

2021 ◽

Author(s):

Yatie Xiao ◽

Chi-Man Pun

Keyword(s):

Neural Networks ◽

Deep Neural Networks ◽

Gradient Based

Download Full-text

An improved dynamic structure-based neural networks determination approaches to simulation optimization problems

Neural Computing and Applications ◽

10.1007/s00521-010-0348-x ◽

2010 ◽

Vol 19 (6) ◽

pp. 883-901 ◽

Cited By ~ 2

Author(s):

Zheng Jun ◽

Tan Yu-An ◽

Zhang Xue-Lan ◽

Lu Jun

Keyword(s):

Neural Networks ◽

Optimization Problems ◽

Simulation Optimization ◽

Dynamic Structure

Download Full-text

Optimizing the Learning Process of Feedforward Neural Networks Using Lightning Search Algorithm

International Journal of Artificial Intelligence Tools ◽

10.1142/s0218213016500330 ◽

2016 ◽

Vol 25 (06) ◽

pp. 1650033 ◽

Cited By ~ 26

Author(s):

Hossam Faris ◽

Ibrahim Aljarah ◽

Nailah Al-Madi ◽

Seyedali Mirjalili

Keyword(s):

Neural Network ◽

Neural Networks ◽

Optimization Problems ◽

Search Algorithm ◽

Optimization Technique ◽

Back Propagation ◽

Feedforward Neural Networks ◽

Training Algorithms ◽

Local Optima ◽

Local Solutions

Evolutionary Neural Networks are proven to be beneficial in solving challenging datasets mainly due to the high local optima avoidance. Stochastic operators in such techniques reduce the probability of stagnation in local solutions and assist them to supersede conventional training algorithms such as Back Propagation (BP) and Levenberg-Marquardt (LM). According to the No-Free-Lunch (NFL), however, there is no optimization technique for solving all optimization problems. This means that a Neural Network trained by a new algorithm has the potential to solve a new set of problems or outperform the current techniques in solving existing problems. This motivates our attempts to investigate the efficiency of the recently proposed Evolutionary Algorithm called Lightning Search Algorithm (LSA) in training Neural Network for the first time in the literature. The LSA-based trainer is benchmarked on 16 popular medical diagnosis problems and compared to BP, LM, and 6 other evolutionary trainers. The quantitative and qualitative results show that the LSA algorithm is able to show not only better local solutions avoidance but also faster convergence speed compared to the other algorithms employed. In addition, the statistical test conducted proves that the LSA-based trainer is significantly superior in comparison with the current algorithms on the majority of datasets.

Download Full-text

Comparing gradient based learning methods for optimizing predictive neural networks

2014 Recent Advances in Engineering and Computational Sciences (RAECS) ◽

10.1109/raecs.2014.6799573 ◽

2014 ◽

Cited By ~ 1

Author(s):

Dharminder Kumar ◽

Sangeeta Gupta ◽

Parveen Sehgal

Keyword(s):

Neural Networks ◽

Learning Methods ◽

Gradient Based

Download Full-text

MOGBO: A new Multiobjective Gradient-Based Optimizer for real-world structural optimization problems

Knowledge-Based Systems ◽

10.1016/j.knosys.2021.106856 ◽

2021 ◽

Vol 218 ◽

pp. 106856

Author(s):

Manoharan Premkumar ◽

Pradeep Jangir ◽

Ravichandran Sowmya

Keyword(s):

Structural Optimization ◽

Real World ◽

Optimization Problems ◽

Gradient Based

Download Full-text

Building hydrological single-model ensembles using artificial neural networks and a combinatorial optimization approach

10.5194/egusphere-egu21-8256 ◽

2021 ◽

Author(s):

Juan F. Farfán-Durán ◽

Luis Cea

Keyword(s):

Neural Networks ◽

Artificial Neural Networks ◽

Goodness Of Fit ◽

Hydrological Model ◽

Hill Climbing ◽

Single Model ◽

Pearson Coefficient ◽

Gradient Based ◽

Artificial Neural ◽

Model Ensembles

In recent years, the application of model ensembles has received increasing attention in the hydrological modelling community due to the interesting results reported in several studies carried out in different parts of the world. The main idea of these approaches is to combine the results of the same hydrological model or a number of different hydrological models in order to obtain more robust, better-fitting models, reducing at the same time the uncertainty in the predictions. The techniques for combining models range from simple approaches such as averaging different simulations, to more complex techniques such as least squares, genetic algorithms and more recently artificial intelligence techniques such as Artificial Neural Networks (ANN).Despite the good results that model ensembles are able to provide, the models selected to build the ensemble have a direct influence on the results. Contrary to intuition, it has been reported that the best fitting single models do not necessarily produce the best ensemble. Instead, better results can be obtained with ensembles that incorporate models with moderate goodness of fit. This implies that the selection of the single models might have a random component in order to maximize the results that ensemble approaches can provide.The present study is carried out using hydrological data on an hourly scale between 2008 and 2015 corresponding to the Mandeo basin, located in the Northwest of Spain. In order to obtain 1000 single models, a hydrological model was run using 1000 sets of parameters sampled randomly in their feasible space. Then, we have classified the models in 3 groups with the following characteristics: 1) The 25 single models with highest Nash-Sutcliffe coefficient, 2) The 25 single models with the highest Pearson coefficient, and 3) The complete group of 1000 single models.The ensemble models are built with 5 models as the input of an ANN and the observed series as the output. Then, we applied the Random-Restart Hill-Climbing (RRHC) algorithm choosing 5 random models in each iteration to re-train the ANN in order to identify a better ensemble. The algorithm is applied to build 50 ensembles in each group of models. Finally, the results are compared to those obtained by optimizing the model using a gradient-based method by means of the following goodness-of-fit measures: Nash-Sutcliffe (NSE) coefficient, adapted for high flows Nash-Sutcliffe (HF&#8722;NSE), adapted for low flows Nash-Sutcliffe (LF&#8722;W NSE) and coefficient of determination (R2).The results show that the RRHC algorithm can identify adequate ensembles. The ensembles built using the group of models selected based on the NSE outperformed the model optimized by the gradient method in 64 % of the cases in at least 3 of 4 coefficients, both in the calibration and validation stages. Followed by the ensembles built with the group of models selected based on the Pearson coefficient with 56 %. In the case of the third group, no ensembles were identified that outperformed the gradient-based method. However, the most part of the ensembles outperformed the 1000 individual models.Keywords: Multi-model ensemble; Single-model ensemble; Artificial Neural Networks; Hydrological Model; Random-restart Hill-climbing&#160;

Download Full-text

Finding Better Local Optima in Topology Optimization via Tunneling

Volume 2B: 44th Design Automation Conference ◽

10.1115/detc2018-86116 ◽

2018 ◽

Cited By ~ 2

Author(s):

Shanglong Zhang ◽

Julián A. Norato

Keyword(s):

Topology Optimization ◽

Optimization Problems ◽

Structural Response ◽

Free Form ◽

Local Optima ◽

Design Representation ◽

Tunneling Method ◽

Optimization Parameters ◽

Gradient Based ◽

Geometric Primitives

Topology optimization problems are typically non-convex, and as such, multiple local minima exist. Depending on the initial design, the type of optimization algorithm and the optimization parameters, gradient-based optimizers converge to one of those minima. Unfortunately, these minima can be highly suboptimal, particularly when the structural response is very non-linear or when multiple constraints are present. This issue is more pronounced in the topology optimization of geometric primitives, because the design representation is more compact and restricted than in free-form topology optimization. In this paper, we investigate the use of tunneling in topology optimization to move from a poor local minimum to a better one. The tunneling method used in this work is a gradient-based deterministic method that finds a better minimum than the previous one in a sequential manner. We demonstrate this approach via numerical examples and show that the coupling of the tunneling method with topology optimization leads to better designs.

Download Full-text