Thallo – Scheduling for High-Performance Large-Scale Non-Linear Least-Squares Solvers

2021 ◽  
Vol 40 (5) ◽  
pp. 1-14
Author(s):  
Michael Mara ◽  
Felix Heide ◽  
Michael Zollhöfer ◽  
Matthias Nießner ◽  
Pat Hanrahan

Large-scale optimization problems at the core of many graphics, vision, and imaging applications are often implemented by hand in tedious and error-prone processes in order to achieve high performance (in particular on GPUs), despite recent developments in libraries and DSLs. At the same time, these hand-crafted solver implementations reveal that the key to high performance is a problem-specific schedule that enables efficient usage of the underlying hardware. In this work, we incorporate this insight into Thallo, a domain-specific language for large-scale non-linear least squares optimization problems. We observe various code reorganizations performed by implementers of high-performance solvers in the literature, and then define a set of basic operations that span these scheduling choices, thereby defining a large scheduling space. Users can either specify code transformations in a scheduling language or use an autoscheduler. Thallo takes as input a compact, shader-like representation of an energy function and a (potentially auto-generated) schedule, translating the combination into high-performance GPU solvers. Since Thallo can generate solvers from a large scheduling space, it can handle a large set of large-scale non-linear and non-smooth problems with various degrees of non-locality and compute-to-memory ratios, including diverse applications such as bundle adjustment, face blendshape fitting, and spatially-varying Poisson deconvolution, as seen in Figure 1. By abstracting the schedule from the optimization, we outperform state-of-the-art GPU-based optimization DSLs by an average of 16× across all applications introduced in this work, and even some published hand-written GPU solvers by more than 30%.
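Thallo's own energy and scheduling languages are not reproduced in the abstract, but the class of problems it targets is ordinary non-linear least squares. As a minimal, hedged illustration of the inner damped Gauss-Newton iteration that such solvers build on (a dense CPU sketch in NumPy with an illustrative curve-fitting energy; it captures none of Thallo's scheduling or GPU code generation), one might write:

```python
import numpy as np

def gauss_newton(residual, jacobian, x0, iters=20, damping=1e-6):
    """Minimize 0.5 * ||r(x)||^2 with damped Gauss-Newton steps."""
    x = x0.copy()
    for _ in range(iters):
        r = residual(x)              # residual vector, shape (m,)
        J = jacobian(x)              # Jacobian, shape (m, n)
        # Normal equations: (J^T J + damping * I) dx = -J^T r
        A = J.T @ J + damping * np.eye(x.size)
        dx = np.linalg.solve(A, -J.T @ r)
        x = x + dx
    return x

# Toy energy: fit y = a * exp(b * t) to noisy samples.
t = np.linspace(0.0, 1.0, 50)
y = 2.0 * np.exp(1.5 * t) + 0.01 * np.random.randn(t.size)

def residual(p):
    a, b = p
    return a * np.exp(b * t) - y

def jacobian(p):
    a, b = p
    e = np.exp(b * t)
    return np.stack([e, a * t * e], axis=1)

p_hat = gauss_newton(residual, jacobian, np.array([1.0, 1.0]))
```

The point of the paper is precisely that a naive dense normal-equations solve like the one above does not scale; the schedule determines how quantities such as J^T J are materialized, fused, or recomputed on the GPU.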

2014 ◽  
pp. 77-81
Author(s):  
Chefi Triki ◽  
Lucio Grandinetti

In this paper we discuss the use of computational grids to solve stochastic optimization problems. These problems are generally difficult to solve and are often characterized by a large number of variables and constraints. Furthermore, some applications require a real-time solution. Obtaining reasonable results is difficult without the use of high-performance computing. Here we present a grid-enabled path-following algorithm and discuss some experimental results.
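The grid-enabled path-following algorithm itself is not detailed in the abstract. As a rough sketch of the scenario-level parallelism that makes grids attractive for stochastic programs, the following uses Python's multiprocessing as a stand-in for a computational grid; the recourse objective and scenario data are purely illustrative assumptions:

```python
from multiprocessing import Pool
import numpy as np

def scenario_cost(args):
    """Second-stage cost for one scenario; a stand-in for the real recourse subproblem."""
    x, (demand, prob) = args
    shortfall = max(demand - x.sum(), 0.0)
    return prob * shortfall ** 2

def expected_recourse(x, scenarios, pool):
    """Expected second-stage cost, with scenarios farmed out to independent workers."""
    return sum(pool.map(scenario_cost, [(x, s) for s in scenarios]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    scenarios = [(rng.uniform(5.0, 15.0), 1.0 / 100) for _ in range(100)]
    x = np.array([3.0, 4.0])        # first-stage decision
    with Pool() as pool:
        print(expected_recourse(x, scenarios, pool))
```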


Author(s):  
Nived Chebrolu ◽  
Thomas Labe ◽  
Olga Vysotska ◽  
Jens Behley ◽  
Cyrill Stachniss

2000 ◽  
Vol 77 (5) ◽  
pp. 669 ◽  
Author(s):  
Sidney Young ◽  
Andrzej Wierzbicki

1995 ◽  
Vol 117 (1) ◽  
pp. 155-157 ◽  
Author(s):  
F. C. Anderson ◽  
J. M. Ziegler ◽  
M. G. Pandy ◽  
R. T. Whalen

We have examined the feasibility of using massively-parallel and vector-processing supercomputers to solve large-scale optimization problems for human movement. Specifically, we compared the computational expense of determining the optimal controls for the single support phase of gait using a conventional serial machine (SGI Iris 4D25), a MIMD parallel machine (Intel iPSC/860), and a parallel-vector-processing machine (Cray Y-MP 8/864). With the human body modeled as a 14 degree-of-freedom linkage actuated by 46 musculotendinous units, computation of the optimal controls for gait could take up to 3 months of CPU time on the Iris. Both the Cray and the Intel are able to reduce this time to practical levels. The optimal solution for gait can be found with about 77 hours of CPU on the Cray and with about 88 hours of CPU on the Intel. Although the overall speeds of the Cray and the Intel were found to be similar, the unique capabilities of each machine are better suited to different portions of the computational algorithm used. The Intel was best suited to computing the derivatives of the performance criterion and the constraints whereas the Cray was best suited to parameter optimization of the controls. These results suggest that the ideal computer architecture for solving very large-scale optimal control problems is a hybrid system in which a vector-processing machine is integrated into the communication network of a MIMD parallel machine.
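The division of labour described above (derivatives on the MIMD machine, parameter optimization on the vector machine) hinges on the fact that each partial derivative of the performance criterion is an independent simulation. A hedged sketch of that embarrassingly parallel finite-difference gradient, using Python worker processes in place of iPSC/860 nodes and a placeholder performance criterion, might look like:

```python
from multiprocessing import Pool
import numpy as np

def performance(u):
    """Placeholder performance criterion over the control vector u."""
    return np.sum(np.sin(u) ** 2) + 0.5 * np.dot(u, u)

def _partial(args):
    """One forward-difference partial derivative; independent of all the others."""
    u, i, h = args
    u_step = u.copy()
    u_step[i] += h
    return (performance(u_step) - performance(u)) / h

def parallel_gradient(u, h=1e-6):
    """Each partial derivative is its own evaluation, so the loop maps cleanly onto MIMD workers."""
    with Pool() as pool:
        return np.array(pool.map(_partial, [(u, i, h) for i in range(u.size)]))

if __name__ == "__main__":
    u = np.linspace(0.0, 1.0, 46)   # e.g. one control value per musculotendinous unit
    print(parallel_gradient(u)[:5])
```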


Author(s):  
Jie Guo ◽  
Zhong Wan

A new spectral three-term conjugate gradient algorithm, based on the quasi-Newton equation, is developed for solving large-scale unconstrained optimization problems. It is proved that the search directions generated by this algorithm always satisfy a sufficient descent condition independent of any line search. Global convergence is established for general objective functions when the strong Wolfe line search is used. Numerical experiments demonstrate its high performance in solving large-scale optimization problems. In particular, the developed algorithm is applied to 100 benchmark test problems from CUTE with sizes ranging from 1,000 to 10,000, in comparison with similar methods from the literature. The numerical results show that our algorithm outperforms the state-of-the-art methods, requiring less CPU time, fewer iterations, and fewer function evaluations.
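The paper's specific spectral and three-term parameter formulas are not given in the abstract. The sketch below shows only the generic three-term conjugate gradient skeleton d_{k+1} = -g_{k+1} + β_k d_k - θ_k y_k, with illustrative (assumed) β and θ choices, a SciPy Wolfe line search, and a steepest-descent safeguard; it is not the authors' method:

```python
import numpy as np
from scipy.optimize import line_search

def three_term_cg(f, grad, x0, iters=200, tol=1e-6):
    """Generic three-term CG: d = -g + beta*d_prev - theta*y, with placeholder beta/theta."""
    x, g = x0.copy(), grad(x0)
    d = -g
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        alpha = line_search(f, grad, x, d, gfk=g)[0]
        if alpha is None:                                   # line search failed: restart
            d, alpha = -g, 1e-4
        x_new = x + alpha * d
        g_new = grad(x_new)
        y = g_new - g
        # Illustrative parameter choices (NOT the paper's formulas):
        beta = max(g_new @ y / (d @ y + 1e-12), 0.0)        # HS-like beta, clipped at zero
        theta = g_new @ d / (d @ y + 1e-12)                 # weight on the correction term
        d = -g_new + beta * d - theta * y
        if g_new @ d >= 0:                                  # safeguard: keep a descent direction
            d = -g_new
        x, g = x_new, g_new
    return x

# Usage on an extended Rosenbrock-style test function with 1000 variables.
f = lambda x: np.sum(100 * (x[1:] - x[:-1] ** 2) ** 2 + (1 - x[:-1]) ** 2)
def grad(x):
    g = np.zeros_like(x)
    g[:-1] = -400 * x[:-1] * (x[1:] - x[:-1] ** 2) - 2 * (1 - x[:-1])
    g[1:] += 200 * (x[1:] - x[:-1] ** 2)
    return g

print(three_term_cg(f, grad, np.full(1000, -1.0))[:3])
```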

