First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method

1992 ◽  
Vol 4 (2) ◽  
pp. 141-166 ◽  
Author(s):  
Roberto Battiti

On-line first-order backpropagation is sufficiently fast and effective for many large-scale classification problems, but for very high-precision mappings, batch processing may be the method of choice. This paper reviews first- and second-order optimization methods for learning in feedforward neural networks. The viewpoint is that of optimization: many methods can be cast in the language of optimization techniques, allowing the transfer to neural nets of detailed results about computational complexity and of safety procedures that ensure convergence and avoid numerical problems. The review is not intended to deliver detailed prescriptions for the most appropriate methods in specific applications, but to illustrate the main characteristics of the different methods and their mutual relations.
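The contrast the review draws can be seen in miniature. Below is a hedged sketch (not from the paper) of the two update rules on a toy quadratic loss; the matrix A, vector b, and learning rate eta are illustrative assumptions.

```python
# Minimal sketch contrasting a first-order (steepest descent) update with a
# second-order (Newton) update on the toy quadratic loss
#   f(w) = 0.5 * w^T A w - b^T w,  whose gradient is A w - b.
import numpy as np

A = np.array([[3.0, 0.5], [0.5, 1.0]])   # assumed SPD Hessian of the toy loss
b = np.array([1.0, 2.0])

def grad(w):
    return A @ w - b

w_sd = np.zeros(2)
eta = 0.1                                 # hypothetical learning rate
for _ in range(50):
    w_sd = w_sd - eta * grad(w_sd)        # steepest descent: many cheap steps

w_nt = np.zeros(2)
w_nt = w_nt - np.linalg.solve(A, grad(w_nt))  # Newton: one exact (costly) step

print("steepest descent:", w_sd, "Newton:", w_nt)
```

On a quadratic loss the Newton step lands on the minimizer in one iteration, while steepest descent approaches it geometrically; the review's subject is the spectrum of methods between these two extremes.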

2020 ◽  
Vol 21 (4) ◽  
pp. 1665-1690
Author(s):  
Maria Stefanova ◽  
Olga Minevich ◽  
Stanislav Baklanov ◽  
Margarita Petukhova ◽  
Sergey Lupuleac ◽  
...  

Abstract A special class of quadratic programming (QP) problems is considered in this paper. This class emerges in the simulation of the assembly of large-scale compliant parts, which involves the formulation and solution of contact problems. The considered QP problems can have up to 20,000 unknowns; the Hessian matrix is fully populated and ill-conditioned, while the constraint matrix is sparse. Variation analysis and optimization of the assembly process usually require massive computations of QP problems with slightly different input data. The following optimization methods are adapted to account for the particular features of the assembly problem: an interior-point method, an active-set method, a Newton projection method, and a pivotal algorithm for linear complementarity problems. Equivalent formulations of the QP problem are proposed with the intent of making them more amenable to the considered methods. The methods are tested and the results are compared on a number of aircraft assembly simulation problems.
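As a structural illustration (an assumption on our part, not the authors' solver), a projected gradient iteration for a contact-type QP with nonnegativity constraints looks as follows; the problem size and conditioning are toy stand-ins for the dense, ill-conditioned Hessians described above.

```python
# Hedged sketch of a projected gradient iteration for a bound-constrained QP:
#   minimize 0.5 x^T H x - c^T x   subject to  x >= 0,
# the canonical form of frictionless contact conditions.
import numpy as np

rng = np.random.default_rng(0)
n = 200                                   # toy size; the paper's problems reach ~20,000
M = rng.standard_normal((n, n))
H = M @ M.T + 1e-3 * np.eye(n)            # dense, possibly ill-conditioned Hessian
c = rng.standard_normal(n)

x = np.zeros(n)
step = 1.0 / np.linalg.norm(H, 2)         # safe step size from the spectral norm
for _ in range(500):
    x = np.maximum(0.0, x - step * (H @ x - c))   # gradient step, then projection

print("objective:", 0.5 * x @ H @ x - c @ x)
```

The paper's Newton projection, active-set, interior-point, and pivotal methods all refine this basic feasible-iterate idea with second-order or combinatorial information.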


Author(s):  
Ali Adibi ◽  
Ehsan Salari

It has been recently shown that an additional therapeutic gain may be achieved if a radiotherapy plan is altered over the treatment course using a new treatment paradigm referred to in the literature as spatiotemporal fractionation. Because of the nonconvex and large-scale nature of the corresponding treatment-plan optimization problem, the extent of the potential therapeutic gain achievable from spatiotemporal fractionation has so far been investigated using stylized cancer cases to circumvent the arising computational challenges. This research aims to develop scalable optimization methods that obtain high-quality spatiotemporally fractionated plans with optimality bounds for clinical cancer cases. In particular, the treatment-planning problem is formulated as a quadratically constrained quadratic program and is solved to local optimality using a constraint-generation approach, in which each subproblem is solved using sequential linear/quadratic programming methods. To obtain optimality bounds, cutting-plane and column-generation methods are combined to solve the Lagrangian relaxation of the formulation. The performance of the developed methods is tested on deidentified clinical liver and prostate cancer cases. Results show that the proposed method is capable of achieving locally optimal spatiotemporally fractionated plans with an optimality gap of around 10%–12% for the cancer cases tested in this study. Summary of Contribution: The design of spatiotemporally fractionated radiotherapy plans for clinical cancer cases gives rise to a class of nonconvex and large-scale quadratically constrained quadratic programming (QCQP) problems, the solution of which requires the development of efficient models and solution methods. To address the computational challenges posed by the large-scale and nonconvex nature of the problem, we employ large-scale optimization techniques to develop scalable solution methods that find locally optimal solutions along with optimality bounds. We test the performance of the proposed methods on deidentified clinical cancer cases. The proposed methods can, in principle, be applied to other QCQP formulations, which commonly arise in several application domains, including graph theory, power systems, and signal processing.
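The constraint-generation idea at the core of the approach has a simple generic skeleton. The sketch below is an assumption-laden toy (a random LP solved with SciPy, not the paper's QCQP): solve a relaxation over a few constraints, find the most violated constraint from the full family, add it, and repeat.

```python
# Hedged skeleton of a constraint-generation loop on a toy LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n = 5
c = -np.ones(n)                           # maximize sum(x) as min of -sum(x)
A_pool = 0.1 + rng.random((200, n))       # large pool of potential constraints
b_pool = np.ones(200)                     # family: a_i . x <= 1 for all i

active = [0]                              # start the relaxation with one constraint
for _ in range(50):
    res = linprog(c, A_ub=A_pool[active], b_ub=b_pool[active],
                  bounds=[(0, None)] * n)
    viol = A_pool @ res.x - b_pool        # violation of every pooled constraint
    worst = int(np.argmax(viol))
    if viol[worst] <= 1e-9:               # relaxed solution feasible for all: done
        break
    active.append(worst)                  # "generate" the violated constraint

print("objective:", -res.fun, "constraints used:", len(active))
```

In the paper each "solve the relaxation" step is itself a sequential linear/quadratic programming solve, and the Lagrangian relaxation supplying the bound is handled by cutting planes and column generation.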


Author(s):  
L Lamberti ◽  
C Pappalettere

Design optimization of complex structures entails tasks that strain the usual limits on time and computational resources. Optimization techniques are nevertheless very useful because they allow engineers to obtain a large set of designs at low computational cost. Among the different optimization methods, sequential linear programming (SLP) is very popular because of its simplicity and because linear solvers (e.g. Simplex) are readily available. In spite of this inherent theoretical simplicity, well-coded SLP algorithms may outperform more sophisticated optimization methods. This paper describes the experience gained in the design optimization of large-scale truss structures and beams with SLP-based algorithms. Sizing and configuration problems of structures under multiple loading conditions with up to 1000 design variables and 3500 constraints are considered. The relative performance and merits of several SLP-based algorithms are compared, and the efficiency of an advanced SLP-based algorithm called ILEAML (improved linearization error amplitude move limits) is tested. ILEAML is also compared to the sequential quadratic programming (SQP) method, widely regarded by theoreticians as probably the best theoretically founded optimization technique.
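A minimal SLP iteration (a hypothetical sketch, not ILEAML itself) follows: linearize the objective and constraints at the current design, solve the resulting LP inside a move-limit box, and shrink the box as iterations proceed. The toy objective and constraint below are assumptions.

```python
# Hedged SLP sketch: toy "weight" objective with one stress-like constraint.
import numpy as np
from scipy.optimize import linprog

def f(x):      return x[0] ** 2 + x[1] ** 2          # toy weight to minimize
def g(x):      return 1.0 - x[0] - x[1]              # toy constraint g(x) <= 0
def grad_f(x): return np.array([2 * x[0], 2 * x[1]])
def grad_g(x): return np.array([-1.0, -1.0])

x = np.array([2.0, 2.0])                             # feasible starting design
move = 0.5                                           # initial move limit
for _ in range(30):
    # LP in the step d:  min grad_f . d   s.t.  g + grad_g . d <= 0,  |d_i| <= move
    res = linprog(grad_f(x), A_ub=[grad_g(x)], b_ub=[-g(x)],
                  bounds=[(-move, move)] * 2)
    x = x + res.x
    move *= 0.9                                      # shrink the move limits

print("design:", x, "weight:", f(x))
```

The move-limit update rule is exactly where SLP variants differ; ILEAML's contribution, per the abstract, is an improved strategy based on the linearization error amplitude.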


2020 ◽  
Vol 15 (7) ◽  
pp. 750-757
Author(s):  
Jihong Wang ◽  
Yue Shi ◽  
Xiaodan Wang ◽  
Huiyou Chang

Background: At present, using computational methods to predict drug-target interactions (DTIs) is a very important step in the discovery of new drugs and in drug repositioning. The potential DTIs identified by machine learning methods can provide guidance for biochemical or clinical experiments. Objective: The goal of this article is to combine the latest network representation learning methods for drug-target prediction, improve model prediction capability, and promote new drug development. Methods: We use the large-scale information network embedding (LINE) method to extract network topology features of drugs, targets, diseases, etc., integrate the features obtained from heterogeneous networks, construct binary classification samples, and use the random forest (RF) method to predict DTIs. Results: The experiments in this paper compare the common classifiers RF, LR, and SVM, as well as the typical network representation learning methods LINE, Node2Vec, and DeepWalk. The combined method LINE-RF achieves the best results, reaching an AUC of 0.9349 and an AUPR of 0.9016. Conclusion: The LINE-based learning method can effectively learn hidden features of drugs, targets, and diseases from the network topology, and combining features learned from multiple networks enhances their expressive power. RF is an effective supervised learning method. The LINE-RF combination is therefore a widely applicable method.
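The final stage of such a pipeline is straightforward to sketch. Below, random vectors stand in for LINE embeddings and the labels are synthetic (training real LINE embeddings is out of scope here, so the printed AUC is near 0.5 rather than the 0.93 reported above); drug/target pair features are concatenated and classified with a random forest.

```python
# Hedged sketch of the embedding-concatenation + RF classification stage.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)
n_drugs, n_targets, dim = 100, 80, 64
drug_emb = rng.standard_normal((n_drugs, dim))    # placeholder for LINE vectors
target_emb = rng.standard_normal((n_targets, dim))

# Hypothetical labelled pairs (1 = interaction, 0 = no interaction).
pairs = rng.integers(0, [n_drugs, n_targets], size=(2000, 2))
y = rng.integers(0, 2, size=2000)
X = np.hstack([drug_emb[pairs[:, 0]], target_emb[pairs[:, 1]]])

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X[:1500], y[:1500])
print("AUC:", roc_auc_score(y[1500:], clf.predict_proba(X[1500:])[:, 1]))
```

Swapping the placeholder embeddings for ones learned on the heterogeneous drug-target-disease network is what differentiates LINE-RF from the Node2Vec and DeepWalk baselines compared in the paper.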


Author(s):  
YongAn LI

Background: Symbolic nodal analysis plays a pivotal role in very large scale integration (VLSI) design. Methods: In this work, based on the terminal relations for the pathological elements and the voltage differencing inverting buffered amplifier (VDIBA), twelve alternative pathological models for the VDIBA are presented. Moreover, the proposed models are applied to a VDIBA-based second-order filter and oscillator so as to simplify the circuit analysis. Results: The results show that the behavioral models for the VDIBA are systematic, effective and powerful in symbolic nodal circuit analysis.
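For readers unfamiliar with the workflow, symbolic nodal analysis itself can be demonstrated in a few lines. The sketch below is a generic two-node RC example in SymPy (an illustration of the method only, not one of the twelve VDIBA models from the paper): write KCL at each node and solve for the node voltages symbolically.

```python
# Hedged sketch of symbolic nodal analysis on a toy RC ladder driven by vin.
import sympy as sp

v1, v2, vin, s = sp.symbols('v1 v2 vin s')
R1, R2, C1 = sp.symbols('R1 R2 C1', positive=True)

# KCL at node 1 and node 2 (currents leaving each node sum to zero).
eq1 = sp.Eq((v1 - vin) / R1 + (v1 - v2) / R2, 0)
eq2 = sp.Eq((v2 - v1) / R2 + v2 * s * C1, 0)

sol = sp.solve([eq1, eq2], [v1, v2])
print(sp.simplify(sol[v2] / vin))        # symbolic transfer function v2/vin
```

Pathological-element models serve the same purpose for active blocks like the VDIBA: they reduce the device to relations that slot directly into such nodal equations.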


Sensors ◽  
2021 ◽  
Vol 21 (2) ◽  
pp. 345
Author(s):  
Pyung Kim ◽  
Younho Lee ◽  
Youn-Sik Hong ◽  
Taekyoung Kwon

To meet the password selection criteria of a server, a user occasionally needs to submit multiple password candidates to an on-line password meter, but such user-chosen candidates tend to be derived from the user's previous passwords, so the meter has a high chance of acquiring information about passwords the user employs for other purposes. A third-party password metering service may worsen this threat. In this paper, we first explore a new on-line password meter concept that does not necessitate the exposure of the user's passwords for evaluating user-chosen password candidates on the server side. Our basic idea is straightforward: adapt fully homomorphic encryption (FHE) schemes to build such a system. Achieving practical performance, however, is greatly challenging, and optimization techniques are necessary. We employ various performance enhancement techniques and implement the NIST (National Institute of Standards and Technology) metering method as seminal work in this field. Our experimental results demonstrate that the running time of the proposed meter is around 60 s on a conventional desktop server, with better performance expected on high-end hardware, using an FHE scheme from the HElib library whose parameters support at least 80-bit security. We believe the proposed method can be further explored and used for password metering in cases where password secrecy is very important: the user's password candidates should not be exposed to the meter, and the internal mechanism of password metering should not be disclosed to users or any other third parties.
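To make the metered quantity concrete, here is a plaintext sketch of the NIST SP 800-63 entropy heuristic (the paper evaluates such a rule homomorphically; all FHE machinery is omitted here, and the composition-bonus detail is our reading of the heuristic, stated as an assumption).

```python
# Hedged plaintext sketch of the NIST password-entropy heuristic:
# per-position bit credits plus a composition bonus.
def nist_entropy_bits(password: str) -> float:
    bits = 0.0
    for i in range(1, len(password) + 1):
        if i == 1:
            bits += 4.0        # first character
        elif i <= 8:
            bits += 2.0        # characters 2..8
        elif i <= 20:
            bits += 1.5        # characters 9..20
        else:
            bits += 1.0        # characters beyond 20
    # Composition bonus: both uppercase and non-alphabetic characters present.
    if any(c.isupper() for c in password) and any(not c.isalpha() for c in password):
        bits += 6.0
    return bits

print(nist_entropy_bits("Tr0ub4dor&3"))   # 4 + 7*2 + 3*1.5 + 6 = 28.5 bits
```

The challenge the paper addresses is evaluating branching logic like this over FHE ciphertexts, where comparisons and character-class tests are expensive.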


2021 ◽  
Vol 13 (3) ◽  
pp. 1274
Author(s):  
Loau Al-Bahrani ◽  
Mehdi Seyedmahmoudian ◽  
Ben Horan ◽  
Alex Stojcevski

Few non-traditional optimization techniques have been applied to the dynamic economic dispatch (DED) of large-scale thermal power units (TPUs), e.g., 1000 TPUs, considering the effects of valve-point loading with ramp-rate limitations. This is a complicated multimodal problem. In this investigation, a novel optimization technique, namely a multi-gradient particle swarm optimization (MG-PSO) algorithm with two stages for exploring and exploiting the search space, is employed as the optimization tool. The M particles (explorers) in the first stage explore new neighborhoods, whereas the M particles (exploiters) in the second stage exploit the best neighborhood. The negative gradient variation of the M particles in both stages balances the global and local search capabilities. The algorithm is validated on five medium-scale to very large-scale power systems. The MG-PSO algorithm effectively reduces the difficulty of handling the large-scale DED problem, and simulation results confirm its suitability for such a complicated multi-objective problem in terms of fitness, performance measures, and consistency. The algorithm is also applied to estimate the generation required over 24 h to meet load-demand changes. This investigation provides useful technical references for economic dispatch operators to update their power system programs in order to achieve economic benefits.
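The two-stage explore/exploit structure can be sketched generically. The code below is a structural illustration only (a plain PSO run twice at different spreads, not the authors' MG-PSO or its gradient-variation rule), on a toy one-dimensional valve-point-style cost curve; all coefficients are assumptions.

```python
# Hedged two-stage PSO skeleton: stage 1 explores widely, stage 2 exploits
# the best neighborhood found.
import numpy as np

def cost(p):                                   # toy nonsmooth fuel-cost curve
    return 0.01 * p**2 + 2 * p + 5 * np.abs(np.sin(0.3 * p))

rng = np.random.default_rng(7)
M, lo, hi = 30, 10.0, 100.0                    # M particles, unit limits (MW)

def pso_stage(center, spread, iters):
    x = rng.uniform(center - spread, center + spread, M).clip(lo, hi)
    v = np.zeros(M)
    pbest, pcost = x.copy(), cost(x)
    for _ in range(iters):
        gbest = pbest[np.argmin(pcost)]
        v = (0.7 * v + 1.5 * rng.random(M) * (pbest - x)
                     + 1.5 * rng.random(M) * (gbest - x))
        x = (x + v).clip(lo, hi)
        better = cost(x) < pcost
        pbest[better], pcost[better] = x[better], cost(x)[better]
    return pbest[np.argmin(pcost)]

center = pso_stage((lo + hi) / 2, (hi - lo) / 2, 100)   # stage 1: explorers
best = pso_stage(center, (hi - lo) / 20, 100)           # stage 2: exploiters
print("dispatch:", best, "cost:", cost(best))
```

In the full DED problem each particle encodes the output of every unit in every hour, with ramp-rate coupling between hours, which is what makes the 1000-unit case so demanding.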


2021 ◽  
Vol 11 (10) ◽  
pp. 4438
Author(s):  
Satyendra Singh ◽  
Manoj Fozdar ◽  
Hasmat Malik ◽  
Maria del Valle Fernández Moreno ◽  
Fausto Pedro García Márquez

It is expected that large-scale producers of wind energy will become dominant players in the future electricity market. However, wind power output is irregular in nature and subject to numerous fluctuations. Because of this effect on wind power production, constructing a detailed bidding strategy is becoming more complicated in the industry. Therefore, in view of these uncertainties, a competitive bidding approach for a pool-based day-ahead energy marketplace is formulated in this paper for traditional generation together with wind power utilities. The profit of the generating utility is optimized by the modified gravitational search algorithm, and the Weibull distribution function is employed to represent the stochastic properties of the wind speed profile. The proposed method is investigated on the IEEE-30 and IEEE-57 test systems, and the results are compared with those obtained using other optimization methods to validate the approach.
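The Weibull wind model feeding such a bidding problem is easy to illustrate. The sketch below samples wind speeds from a Weibull distribution and maps them through a standard piecewise turbine power curve; all parameter values are assumptions, not the paper's data.

```python
# Hedged sketch: Weibull-distributed wind speeds through a piecewise power curve.
import numpy as np

rng = np.random.default_rng(3)
k, c = 2.0, 8.0                              # assumed Weibull shape and scale (m/s)
v_ci, v_r, v_co = 3.0, 12.0, 25.0            # cut-in, rated, cut-out speeds (m/s)
P_r = 2.0                                    # assumed rated power (MW)

v = c * rng.weibull(k, size=100_000)         # Weibull-distributed wind speeds

P = np.where(v < v_ci, 0.0,                  # below cut-in: no output
    np.where(v < v_r, P_r * (v - v_ci) / (v_r - v_ci),   # linear ramp to rated
    np.where(v < v_co, P_r, 0.0)))           # rated band, then cut-out shutdown

print("expected wind power: %.3f MW" % P.mean())
```

Expected output and its spread computed this way enter the bidding model as the stochastic wind term whose profit impact the modified gravitational search algorithm then optimizes.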


2021 ◽  
Vol 502 (3) ◽  
pp. 3976-3992
Author(s):  
Mónica Hernández-Sánchez ◽  
Francisco-Shu Kitaura ◽  
Metin Ata ◽  
Claudio Dalla Vecchia

ABSTRACT We investigate higher order symplectic integration strategies within Bayesian cosmic density field reconstruction methods. In particular, we study the fourth-order discretization of Hamiltonian equations of motion (EoM). This is achieved by recursively applying the basic second-order leap-frog scheme (considering a single evaluation of the EoM) in a combination of an even number of forward time integration steps with a single intermediate backward step. This largely reduces the number of evaluations and random gradient computations required in the usual second-order case for high-dimensional problems. We restrict this study to the lognormal-Poisson model, applied to a full-volume halo catalogue in real space on a cubical mesh of 1250 h⁻¹ Mpc side and 256³ cells. Hence, we neglect selection effects, redshift-space distortions, and displacements. We note that those observational and cosmic-evolution effects can be accounted for in subsequent Gibbs-sampling steps within the COSMIC BIRTH algorithm. We find that going from the usual second order to fourth order in the leap-frog scheme shortens the burn-in phase by a factor of at least ∼30. This implies that 75–90 independent samples are obtained in the time the fastest second-order method takes to converge. After convergence, the correlation lengths indicate an improvement of about a factor of 3.0 fewer gradient computations for meshes of 256³ cells. In the considered cosmological scenario, the traditional leap-frog scheme turns out to outperform higher order integration schemes only for lower dimensional problems, e.g. meshes with 64³ cells. This gain in computational efficiency can help move towards a full Bayesian analysis of the cosmological large-scale structure for upcoming galaxy surveys.
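The forward-backward-forward composition described above is, in its simplest form, the classic Yoshida construction. Below is a minimal sketch (a textbook illustration with an assumed harmonic-oscillator potential, not the COSMIC BIRTH implementation): the basic second-order leap-frog is composed with two forward sub-steps around one backward sub-step to yield fourth-order accuracy.

```python
# Hedged sketch: fourth-order symplectic integration by Yoshida composition
# of the second-order leap-frog (kick-drift-kick) scheme.
import numpy as np

def leapfrog(q, p, dt, grad):                # basic second-order leap-frog
    p = p - 0.5 * dt * grad(q)
    q = q + dt * p
    p = p - 0.5 * dt * grad(q)
    return q, p

def leapfrog4(q, p, dt, grad):               # fourth-order composition
    w1 = 1.0 / (2.0 - 2.0 ** (1.0 / 3.0))    # weight of each forward sub-step
    w0 = 1.0 - 2.0 * w1                      # negative: the backward sub-step
    for w in (w1, w0, w1):                   # forward, backward, forward
        q, p = leapfrog(q, p, w * dt, grad)
    return q, p

grad = lambda q: q                           # assumed potential 0.5 * q^2
q, p = 1.0, 0.0
for _ in range(1000):
    q, p = leapfrog4(q, p, 0.1, grad)
print("energy drift:", 0.5 * (q**2 + p**2) - 0.5)
```

Each fourth-order step costs three leap-frog sub-steps (hence three gradient evaluations), and the paper's finding is that in high-dimensional sampling this extra per-step cost is repaid many times over by shorter burn-in and correlation lengths.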

