First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method

1992 ◽  
Vol 4 (2) ◽  
pp. 141-166 ◽  
Author(s):  
Roberto Battiti

On-line first-order backpropagation is sufficiently fast and effective for many large-scale classification problems, but for very high-precision mappings, batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and of safety procedures that ensure convergence and avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
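The contrast the review draws can be seen in miniature. Below is a hedged sketch (not from the paper) of the two update rules on a toy quadratic loss; the matrix A, vector b, and learning rate eta are illustrative assumptions.

```python
# Minimal sketch contrasting a first-order (steepest descent) update with a
# second-order (Newton) update on the toy quadratic loss
#   f(w) = 0.5 * w^T A w - b^T w,  whose gradient is A w - b.
import numpy as np

A = np.array([[3.0, 0.5], [0.5, 1.0]])   # assumed SPD Hessian of the toy loss
b = np.array([1.0, 2.0])

def grad(w):
    return A @ w - b

w_sd = np.zeros(2)
eta = 0.1                                 # hypothetical learning rate
for _ in range(50):
    w_sd = w_sd - eta * grad(w_sd)        # steepest descent: many cheap steps

w_nt = np.zeros(2)
w_nt = w_nt - np.linalg.solve(A, grad(w_nt))  # Newton: one exact (costly) step

print("steepest descent:", w_sd, "Newton:", w_nt)
```

On a quadratic loss the Newton step lands on the minimizer in one iteration, while steepest descent approaches it geometrically; the review's subject is the spectrum of methods between these two extremes.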

2020 ◽  
Vol 21 (4) ◽  
pp. 1665-1690
Author(s):  
Maria Stefanova ◽  
Olga Minevich ◽  
Stanislav Baklanov ◽  
Margarita Petukhova ◽  
Sergey Lupuleac ◽  
...  

Abstract A special class of quadratic programming (QP) problems is considered in this paper. This class emerges in the simulation of the assembly of large-scale compliant parts, which involves the formulation and solution of contact problems. The considered QP problems can have up to 20,000 unknowns; the Hessian matrix is fully populated and ill-conditioned, while the constraint matrix is sparse. Variation analysis and optimization of the assembly process usually require massive computations of QP problems with slightly different input data. The following optimization methods are adapted to account for the particular features of the assembly problem: an interior-point method, an active-set method, a Newton projection method, and a pivotal algorithm for linear complementarity problems. Equivalent formulations of the QP problem are proposed with the intent of making them more amenable to the considered methods. The methods are tested and the results are compared on a number of aircraft assembly simulation problems.
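As a structural illustration (an assumption on our part, not the authors' solver), a projected gradient iteration for a contact-type QP with nonnegativity constraints looks as follows; the problem size and conditioning are toy stand-ins for the dense, ill-conditioned Hessians described above.

```python
# Hedged sketch of a projected gradient iteration for a bound-constrained QP:
#   minimize 0.5 x^T H x - c^T x   subject to  x >= 0,
# the canonical form of frictionless contact conditions.
import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # toy size; the paper's problems reach ~20,000
M = rng.standard_normal((n, n))
H = M @ M.T + 1e-3 * np.eye(n)            # dense, possibly ill-conditioned Hessian
c = rng.standard_normal(n)

x = np.zeros(n)
step = 1.0 / np.linalg.norm(H, 2)         # safe step size from the spectral norm
for _ in range(500):
    x = np.maximum(0.0, x - step * (H @ x - c))   # gradient step, then projection

print("objective:", 0.5 * x @ H @ x - c @ x)
```

The paper's Newton projection, active-set, interior-point, and pivotal methods all refine this basic feasible-iterate idea with second-order or combinatorial information.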


Author(s):  
Ali Adibi ◽  
Ehsan Salari

It has been recently shown that an additional therapeutic gain may be achieved if a radiotherapy plan is altered over the treatment course using a new treatment paradigm referred to in the literature as spatiotemporal fractionation. Because of the nonconvex and large-scale nature of the corresponding treatment-plan optimization problem, the extent of the potential therapeutic gain achievable from spatiotemporal fractionation has so far been investigated using stylized cancer cases to circumvent the arising computational challenges. This research aims to develop scalable optimization methods that obtain high-quality spatiotemporally fractionated plans with optimality bounds for clinical cancer cases. In particular, the treatment-planning problem is formulated as a quadratically constrained quadratic program and is solved to local optimality using a constraint-generation approach, in which each subproblem is solved using sequential linear/quadratic programming methods. To obtain optimality bounds, cutting-plane and column-generation methods are combined to solve the Lagrangian relaxation of the formulation. The performance of the developed methods is tested on deidentified clinical liver and prostate cancer cases. Results show that the proposed method is capable of achieving locally optimal spatiotemporally fractionated plans with an optimality gap of around 10%–12% for the cancer cases tested in this study. Summary of Contribution: The design of spatiotemporally fractionated radiotherapy plans for clinical cancer cases gives rise to a class of nonconvex and large-scale quadratically constrained quadratic programming (QCQP) problems, the solution of which requires the development of efficient models and solution methods. To address the computational challenges posed by the large-scale and nonconvex nature of the problem, we employ large-scale optimization techniques to develop scalable solution methods that find locally optimal solutions along with optimality bounds. We test the performance of the proposed methods on deidentified clinical cancer cases. The proposed methods can, in principle, be applied to other QCQP formulations, which commonly arise in several application domains, including graph theory, power systems, and signal processing.
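The constraint-generation idea at the core of the approach has a simple generic skeleton. The sketch below is an assumption-laden toy (a random LP solved with SciPy, not the paper's QCQP): solve a relaxation over a few constraints, find the most violated constraint from the full family, add it, and repeat.

```python
# Hedged skeleton of a constraint-generation loop on a toy LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n = 5
c = -np.ones(n)                           # maximize sum(x) as min of -sum(x)
A_pool = 0.1 + rng.random((200, n))       # large pool of potential constraints
b_pool = np.ones(200)                     # family: a_i . x <= 1 for all i

active = [0]                              # start the relaxation with one constraint
for _ in range(50):
    res = linprog(c, A_ub=A_pool[active], b_ub=b_pool[active],
                  bounds=[(0, None)] * n)
    viol = A_pool @ res.x - b_pool        # violation of every pooled constraint
    worst = int(np.argmax(viol))
    if viol[worst] <= 1e-9:               # relaxed solution feasible for all: done
        break
    active.append(worst)                  # "generate" the violated constraint

print("objective:", -res.fun, "constraints used:", len(active))
```

In the paper each "solve the relaxation" step is itself a sequential linear/quadratic programming solve, and the Lagrangian relaxation supplying the bound is handled by cutting planes and column generation.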


Author(s):  
L Lamberti ◽  
C Pappalettere

Design optimization of complex structures entails tasks that strain the usual limits on time and computational resources. Optimization techniques are nevertheless very useful because they allow engineers to obtain a large set of designs at low computational cost. Among the different optimization methods, sequential linear programming (SLP) is very popular because of its simplicity and because linear solvers (e.g. Simplex) are readily available. In spite of this inherent theoretical simplicity, well-coded SLP algorithms may outperform more sophisticated optimization methods. This paper describes the experience gained in the design optimization of large-scale truss structures and beams with SLP-based algorithms. Sizing and configuration problems of structures under multiple loading conditions with up to 1000 design variables and 3500 constraints are considered. The relative performance and merits of several SLP-based algorithms are compared, and the efficiency of an advanced SLP-based algorithm called ILEAML (improved linearization error amplitude move limits) is tested. ILEAML is also compared to the sequential quadratic programming (SQP) method, widely regarded by theoreticians as probably the best theoretically founded optimization technique.
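A minimal SLP iteration (a hypothetical sketch, not ILEAML itself) follows: linearize the objective and constraints at the current design, solve the resulting LP inside a move-limit box, and shrink the box as iterations proceed. The toy objective and constraint below are assumptions.

```python
# Hedged SLP sketch: toy "weight" objective with one stress-like constraint.
import numpy as np
from scipy.optimize import linprog

def f(x):      return x[0] ** 2 + x[1] ** 2          # toy weight to minimize
def g(x):      return 1.0 - x[0] - x[1]              # toy constraint g(x) <= 0
def grad_f(x): return np.array([2 * x[0], 2 * x[1]])
def grad_g(x): return np.array([-1.0, -1.0])

x = np.array([2.0, 2.0])                             # feasible starting design
move = 0.5                                           # initial move limit
for _ in range(30):
    # LP in the step d:  min grad_f . d   s.t.  g + grad_g . d <= 0,  |d_i| <= move
    res = linprog(grad_f(x), A_ub=[grad_g(x)], b_ub=[-g(x)],
                  bounds=[(-move, move)] * 2)
    x = x + res.x
    move *= 0.9                                      # shrink the move limits

print("design:", x, "weight:", f(x))
```

The move-limit update rule is exactly where SLP variants differ; ILEAML's contribution, per the abstract, is an improved strategy based on the linearization error amplitude.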


2020 ◽  
Vol 15 (7) ◽  
pp. 750-757
Author(s):  
Jihong Wang ◽  
Yue Shi ◽  
Xiaodan Wang ◽  
Huiyou Chang

Background: At present, using computational methods to predict drug-target interactions (DTIs) is a very important step in the discovery of new drugs and in drug repositioning. The potential DTIs identified by machine learning methods can provide guidance for biochemical or clinical experiments. Objective: The goal of this article is to combine the latest network representation learning methods for drug-target prediction, improve model prediction capability, and promote new drug development. Methods: We use the large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, etc., integrate the features obtained from heterogeneous networks, construct binary classification samples, and use the random forest (RF) method to predict DTIs. Results: The experiments in this paper compare the common classifiers RF, LR, and SVM, as well as the typical network representation learning methods LINE, Node2Vec, and DeepWalk. The combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016. Conclusion: The LINE-based learning method can effectively learn hidden features of drugs, targets, and diseases from the network topology, and combining features learned from multiple networks enhances their expressive power. RF is an effective supervised learning method. The LINE-RF combination is therefore a widely applicable method.
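The final stage of such a pipeline is straightforward to sketch. Below, random vectors stand in for LINE embeddings and the labels are synthetic (training real LINE embeddings is out of scope here, so the printed AUC is near 0.5 rather than the 0.93 reported above); drug/target pair features are concatenated and classified with a random forest.

```python
# Hedged sketch of the embedding-concatenation + RF classification stage.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n_drugs, n_targets, dim = 100, 80, 64
drug_emb = rng.standard_normal((n_drugs, dim))    # placeholder for LINE vectors
target_emb = rng.standard_normal((n_targets, dim))

# Hypothetical labelled pairs (1 = interaction, 0 = no interaction).
pairs = rng.integers(0, [n_drugs, n_targets], size=(2000, 2))
y = rng.integers(0, 2, size=2000)
X = np.hstack([drug_emb[pairs[:, 0]], target_emb[pairs[:, 1]]])

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X[:1500], y[:1500])
print("AUC:", roc_auc_score(y[1500:], clf.predict_proba(X[1500:])[:, 1]))
```

Swapping the placeholder embeddings for ones learned on the heterogeneous drug-target-disease network is what differentiates LINE-RF from the Node2Vec and DeepWalk baselines compared in the paper.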


Author(s):  
YongAn LI

Background: Symbolic nodal analysis plays a pivotal role in very large scale integration (VLSI) design. Methods: In this work, based on the terminal relations for the pathological elements and the voltage differencing inverting buffered amplifier (VDIBA), twelve alternative pathological models for the VDIBA are presented. Moreover, the proposed models are applied to a VDIBA-based second-order filter and oscillator so as to simplify the circuit analysis. Results: The results show that the behavioral models for the VDIBA are systematic, effective and powerful in symbolic nodal circuit analysis.
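For readers unfamiliar with the workflow, symbolic nodal analysis itself can be demonstrated in a few lines. The sketch below is a generic two-node RC example in SymPy (an illustration of the method only, not one of the twelve VDIBA models from the paper): write KCL at each node and solve for the node voltages symbolically.

```python
# Hedged sketch of symbolic nodal analysis on a toy RC ladder driven by vin.
import sympy as sp

v1, v2, vin, s = sp.symbols('v1 v2 vin s')
R1, R2, C1 = sp.symbols('R1 R2 C1', positive=True)

# KCL at node 1 and node 2 (currents leaving each node sum to zero).
eq1 = sp.Eq((v1 - vin) / R1 + (v1 - v2) / R2, 0)
eq2 = sp.Eq((v2 - v1) / R2 + v2 * s * C1, 0)

sol = sp.solve([eq1, eq2], [v1, v2])
print(sp.simplify(sol[v2] / vin))        # symbolic transfer function v2/vin
```

Pathological-element models serve the same purpose for active blocks like the VDIBA: they reduce the device to relations that slot directly into such nodal equations.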


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 345
Author(s):  
Pyung Kim ◽  
Younho Lee ◽  
Youn-Sik Hong ◽  
Taekyoung Kwon

To meet the password selection criteria of a server, a user occasionally needs to submit multiple password candidates to an on-line password meter, but such user-chosen candidates tend to be derived from the user's previous passwords, so the meter has a high chance of acquiring information about passwords the user employs for other purposes. A third-party password metering service may worsen this threat. In this paper, we first explore a new on-line password meter concept that does not necessitate the exposure of the user's passwords for evaluating user-chosen password candidates on the server side. Our basic idea is straightforward: adapt fully homomorphic encryption (FHE) schemes to build such a system. Achieving practical performance, however, is greatly challenging, and optimization techniques are necessary. We employ various performance enhancement techniques and implement the NIST (National Institute of Standards and Technology) metering method as seminal work in this field. Our experimental results demonstrate that the running time of the proposed meter is around 60 s on a conventional desktop server, with better performance expected on high-end hardware, using an FHE scheme from the HElib library whose parameters support at least 80-bit security. We believe the proposed method can be further explored and used for password metering in cases where password secrecy is very important: the user's password candidates should not be exposed to the meter, and the internal mechanism of password metering should not be disclosed to users or any other third parties.
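To make the metered quantity concrete, here is a plaintext sketch of the NIST SP 800-63 entropy heuristic (the paper evaluates such a rule homomorphically; all FHE machinery is omitted here, and the composition-bonus detail is our reading of the heuristic, stated as an assumption).

```python
# Hedged plaintext sketch of the NIST password-entropy heuristic:
# per-position bit credits plus a composition bonus.
def nist_entropy_bits(password: str) -> float:
    bits = 0.0
    for i in range(1, len(password) + 1):
        if i == 1:
            bits += 4.0        # first character
        elif i <= 8:
            bits += 2.0        # characters 2..8
        elif i <= 20:
            bits += 1.5        # characters 9..20
        else:
            bits += 1.0        # characters beyond 20
    # Composition bonus: both uppercase and non-alphabetic characters present.
    if any(c.isupper() for c in password) and any(not c.isalpha() for c in password):
        bits += 6.0
    return bits

print(nist_entropy_bits("Tr0ub4dor&3"))   # 4 + 7*2 + 3*1.5 + 6 = 28.5 bits
```

The challenge the paper addresses is evaluating branching logic like this over FHE ciphertexts, where comparisons and character-class tests are expensive.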


2021 ◽  
Vol 13 (3) ◽  
pp. 1274
Author(s):  
Loau Al-Bahrani ◽  
Mehdi Seyedmahmoudian ◽  
Ben Horan ◽  
Alex Stojcevski

Few non-traditional optimization techniques have been applied to the dynamic economic dispatch (DED) of large-scale thermal power units (TPUs), e.g., 1000 TPUs, considering the effects of valve-point loading with ramp-rate limitations. This is a complicated multimodal problem. In this investigation, a novel optimization technique, namely a multi-gradient particle swarm optimization (MG-PSO) algorithm with two stages for exploring and exploiting the search space, is employed as the optimization tool. The M particles (explorers) in the first stage explore new neighborhoods, whereas the M particles (exploiters) in the second stage exploit the best neighborhood. The negative gradient variation of the M particles in both stages balances the global and local search capabilities. The algorithm is validated on five medium-scale to very large-scale power systems. The MG-PSO algorithm effectively reduces the difficulty of handling the large-scale DED problem, and simulation results confirm its suitability for such a complicated multi-objective problem in terms of fitness, performance measures, and consistency. The algorithm is also applied to estimate the generation required over 24 h to meet load-demand changes. This investigation provides useful technical references for economic dispatch operators to update their power system programs in order to achieve economic benefits.
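The two-stage explore/exploit structure can be sketched generically. The code below is a structural illustration only (a plain PSO run twice at different spreads, not the authors' MG-PSO or its gradient-variation rule), on a toy one-dimensional valve-point-style cost curve; all coefficients are assumptions.

```python
# Hedged two-stage PSO skeleton: stage 1 explores widely, stage 2 exploits
# the best neighborhood found.
import numpy as np

def cost(p):                                   # toy nonsmooth fuel-cost curve
    return 0.01 * p**2 + 2 * p + 5 * np.abs(np.sin(0.3 * p))

rng = np.random.default_rng(7)
M, lo, hi = 30, 10.0, 100.0                    # M particles, unit limits (MW)

def pso_stage(center, spread, iters):
    x = rng.uniform(center - spread, center + spread, M).clip(lo, hi)
    v = np.zeros(M)
    pbest, pcost = x.copy(), cost(x)
    for _ in range(iters):
        gbest = pbest[np.argmin(pcost)]
        v = (0.7 * v + 1.5 * rng.random(M) * (pbest - x)
                     + 1.5 * rng.random(M) * (gbest - x))
        x = (x + v).clip(lo, hi)
        better = cost(x) < pcost
        pbest[better], pcost[better] = x[better], cost(x)[better]
    return pbest[np.argmin(pcost)]

center = pso_stage((lo + hi) / 2, (hi - lo) / 2, 100)   # stage 1: explorers
best = pso_stage(center, (hi - lo) / 20, 100)           # stage 2: exploiters
print("dispatch:", best, "cost:", cost(best))
```

In the full DED problem each particle encodes the output of every unit in every hour, with ramp-rate coupling between hours, which is what makes the 1000-unit case so demanding.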


2021 ◽  
Vol 11 (10) ◽  
pp. 4438
Author(s):  
Satyendra Singh ◽  
Manoj Fozdar ◽  
Hasmat Malik ◽  
Maria del Valle Fernández Moreno ◽  
Fausto Pedro García Márquez

It is expected that large-scale producers of wind energy will become dominant players in the future electricity market. However, wind power output is irregular in nature and subject to numerous fluctuations. Because of this effect on wind power production, constructing a detailed bidding strategy is becoming more complicated in the industry. Therefore, in view of these uncertainties, a competitive bidding approach for a pool-based day-ahead energy marketplace is formulated in this paper for traditional generation together with wind power utilities. The profit of the generating utility is optimized by the modified gravitational search algorithm, and the Weibull distribution function is employed to represent the stochastic properties of the wind speed profile. The proposed method is investigated on the IEEE-30 and IEEE-57 test systems, and the results are compared with those obtained using other optimization methods to validate the approach.
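The Weibull wind model feeding such a bidding problem is easy to illustrate. The sketch below samples wind speeds from a Weibull distribution and maps them through a standard piecewise turbine power curve; all parameter values are assumptions, not the paper's data.

```python
# Hedged sketch: Weibull-distributed wind speeds through a piecewise power curve.
import numpy as np

rng = np.random.default_rng(3)
k, c = 2.0, 8.0                              # assumed Weibull shape and scale (m/s)
v_ci, v_r, v_co = 3.0, 12.0, 25.0            # cut-in, rated, cut-out speeds (m/s)
P_r = 2.0                                    # assumed rated power (MW)

v = c * rng.weibull(k, size=100_000)         # Weibull-distributed wind speeds

P = np.where(v < v_ci, 0.0,                  # below cut-in: no output
    np.where(v < v_r, P_r * (v - v_ci) / (v_r - v_ci),   # linear ramp to rated
    np.where(v < v_co, P_r, 0.0)))           # rated band, then cut-out shutdown

print("expected wind power: %.3f MW" % P.mean())
```

Expected output and its spread computed this way enter the bidding model as the stochastic wind term whose profit impact the modified gravitational search algorithm then optimizes.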


2021 ◽  
Vol 502 (3) ◽  
pp. 3976-3992
Author(s):  
Mónica Hernández-Sánchez ◽  
Francisco-Shu Kitaura ◽  
Metin Ata ◽  
Claudio Dalla Vecchia

ABSTRACT We investigate higher order symplectic integration strategies within Bayesian cosmic density field reconstruction methods. In particular, we study the fourth-order discretization of Hamiltonian equations of motion (EoM). This is achieved by recursively applying the basic second-order leap-frog scheme (considering a single evaluation of the EoM) in a combination of an even number of forward time integration steps with a single intermediate backward step. This largely reduces the number of evaluations and random gradient computations required in the usual second-order case for high-dimensional problems. We restrict this study to the lognormal-Poisson model, applied to a full-volume halo catalogue in real space on a cubical mesh of 1250 h⁻¹ Mpc side and 256³ cells. Hence, we neglect selection effects, redshift-space distortions, and displacements. We note that those observational and cosmic-evolution effects can be accounted for in subsequent Gibbs-sampling steps within the COSMIC BIRTH algorithm. We find that going from the usual second order to fourth order in the leap-frog scheme shortens the burn-in phase by a factor of at least ∼30. This implies that 75–90 independent samples are obtained in the time the fastest second-order method takes to converge. After convergence, the correlation lengths indicate an improvement of about a factor of 3.0 fewer gradient computations for meshes of 256³ cells. In the considered cosmological scenario, the traditional leap-frog scheme turns out to outperform higher order integration schemes only for lower dimensional problems, e.g. meshes with 64³ cells. This gain in computational efficiency can help move towards a full Bayesian analysis of the cosmological large-scale structure for upcoming galaxy surveys.
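The forward-backward-forward composition described above is, in its simplest form, the classic Yoshida construction. Below is a minimal sketch (a textbook illustration with an assumed harmonic-oscillator potential, not the COSMIC BIRTH implementation): the basic second-order leap-frog is composed with two forward sub-steps around one backward sub-step to yield fourth-order accuracy.

```python
# Hedged sketch: fourth-order symplectic integration by Yoshida composition
# of the second-order leap-frog (kick-drift-kick) scheme.
import numpy as np

def leapfrog(q, p, dt, grad):                # basic second-order leap-frog
    p = p - 0.5 * dt * grad(q)
    q = q + dt * p
    p = p - 0.5 * dt * grad(q)
    return q, p

def leapfrog4(q, p, dt, grad):               # fourth-order composition
    w1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))    # weight of each forward sub-step
    w0 = 1.0 - 2.0 * w1                      # negative: the backward sub-step
    for w in (w1, w0, w1):                   # forward, backward, forward
        q, p = leapfrog(q, p, w * dt, grad)
    return q, p

grad = lambda q: q                           # assumed potential 0.5 * q^2
q, p = 1.0, 0.0
for _ in range(1000):
    q, p = leapfrog4(q, p, 0.1, grad)
print("energy drift:", 0.5 * (q**2 + p**2) - 0.5)
```

Each fourth-order step costs three leap-frog sub-steps (hence three gradient evaluations), and the paper's finding is that in high-dimensional sampling this extra per-step cost is repaid many times over by shorter burn-in and correlation lengths.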

