Small-Data, Large-Scale Linear Optimization with Uncertain Objectives

Author(s):  
Vishal Gupta ◽  
Paat Rusmevichientong

Optimization applications often depend on a huge number of uncertain parameters. In many contexts, however, the amount of relevant data per parameter is small, and hence, we may only have imprecise estimates. We term this setting—in which the number of uncertainties is large but all estimates have low precision—the small-data, large-scale regime. We formalize a model for this new regime, focusing on optimization problems with uncertain linear objectives. We show that common data-driven methods, such as sample average approximation, data-driven robust optimization, and certain regularized policies, may perform poorly in this new setting. We then propose a novel framework for selecting a data-driven policy from a given policy class. As with the aforementioned data-driven methods, our new policy enjoys provably good performance in the large-sample regime. Unlike these methods, we show that in the small-data, large-scale regime, our data-driven policy performs comparably to an oracle best-in-class policy under some mild conditions. We strengthen this result for linear optimization problems and two natural policy classes, the first inspired by the empirical Bayes literature and the second by regularization techniques. For both classes, the suboptimality gap between our proposed policy and the oracle policy decays exponentially fast in the number of uncertain parameters even for a fixed amount of data. Thus, these policies retain the strong large-sample performance of traditional methods and additionally enjoy provably strong performance in the small-data, large-scale regime. Numerical experiments confirm the significant benefits of our methods. This paper was accepted by Yinyu Ye, optimization.
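
As a toy illustration of the regime (a minimal sketch with invented parameters, not the paper's estimator or policy class), consider maximizing an uncertain linear objective over the box [0,1]^n when each of n = 1000 coefficients is estimated from only four noisy samples. The sketch compares a plug-in SAA decision with a James-Stein-style shrinkage decision of the kind the empirical Bayes literature suggests.

```python
import numpy as np

rng = np.random.default_rng(0)

n, S, sigma = 1000, 4, 2.0            # many parameters, few noisy samples each (hypothetical)
mu = rng.normal(-0.5, 1.0, size=n)    # unknown true objective coefficients
muhat = (mu[:, None] + sigma * rng.normal(size=(n, S))).mean(axis=1)

def value(scores):
    """True value of max c'x over the box [0,1]^n when c is replaced by `scores`:
    the induced decision is x_i = 1{scores_i > 0}, evaluated against the true mu."""
    return mu[scores > 0].sum()

# SAA-style plug-in policy: act as if the sample means were the truth.
saa_val = value(muhat)

# Empirical-Bayes-style policy: shrink the noisy estimates toward their grand mean
# (a standard moment-based James-Stein weight; not the paper's proposed policy).
noise_var = sigma**2 / S
signal_var = max(muhat.var() - noise_var, 1e-12)
w = signal_var / (signal_var + noise_var)
shrunk = muhat.mean() + w * (muhat - muhat.mean())
eb_val = value(shrunk)

oracle_val = value(mu)                # full-information benchmark
print(f"SAA {saa_val:.1f} | shrinkage {eb_val:.1f} | oracle {oracle_val:.1f}")
```

Because each estimate has low precision, the plug-in rule includes many coefficients that only look positive by chance; shrinking toward the grand mean raises the effective inclusion threshold and typically recovers part of that lost value.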

2020 ◽  
Author(s):  
Dragos Florin Ciocan ◽  
Krishnamurthy Iyer

Given the scale of the sponsored search market, it is practically important yet technically difficult to understand the interplay between bidders and the ad network and its effect on the long-run state of the market. Although typical equilibrium models account for bidders strategizing over the individual bids they submit to the auctions, they ignore that bidders also strategically set their campaign budgets. In “Tractable Equilibria in Sponsored Search with Endogenous Budgets,” F. Ciocan and K. Iyer ask how this additional strategic layer affects market operation and prove that endogenizing budgets surprisingly yields simple and interpretable equilibria. Namely, these equilibria generate quasi-truthful bidding strategies guaranteeing bidders an ROI exceeding their cost per dollar of committed budget. Additionally, the ad network’s optimal allocation policy becomes greedy with high probability. Thus, in this equilibrium, the ad network need not solve computationally challenging, large-scale linear optimization problems typically required under exogenous budgets.
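
As a rough illustration of why a greedy allocation is attractive in this setting, the sketch below (a simplified stand-in, not the authors' model; bids, budgets, and the query stream are invented, and per-click quality weighting is omitted) assigns each arriving query to the highest-bidding advertiser with budget remaining, instead of solving a global linear program.

```python
from collections import defaultdict

# Hypothetical advertisers: bid per click and committed campaign budget.
advertisers = {
    "a1": {"bid": 2.0, "budget": 5.0},
    "a2": {"bid": 1.5, "budget": 8.0},
    "a3": {"bid": 1.0, "budget": 4.0},
}

queries = ["q"] * 12     # a stream of identical queries, one slot each

spend = defaultdict(float)
allocation = []

# Greedy policy: give each arriving query to the feasible advertiser with the highest bid.
for _ in queries:
    feasible = [a for a, info in advertisers.items()
                if spend[a] + info["bid"] <= info["budget"]]
    if not feasible:
        allocation.append(None)          # every budget is exhausted
        continue
    winner = max(feasible, key=lambda a: advertisers[a]["bid"])
    spend[winner] += advertisers[winner]["bid"]
    allocation.append(winner)

print(allocation)
print(dict(spend))
```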


Author(s):  
Martin Buhmann ◽  
Dirk Siegel

Abstract We consider Broyden class updates for large-scale optimization problems in n dimensions, restricting attention to the case when the initial second derivative approximation is the identity matrix. Under this assumption we present an implementation of the Broyden class based on a coordinate transformation on each iteration. It requires only $$2nk + O(k^{2}) + O(n)$$ multiplications on the kth iteration and stores $$nK + O(K^{2}) + O(n)$$ numbers, where K is the total number of iterations. We investigate a modification of this algorithm by a scaling approach and show a substantial improvement in performance over the BFGS method. We also study several adaptations of the new implementation to the limited-memory situation, presenting algorithms that work with a fixed amount of storage independent of the number of iterations. We show that one such algorithm retains the property of quadratic termination. The practical performance of the new methods is compared with that of Nocedal’s (Math Comput 35:773–782, 1980) method, which is considered the benchmark in limited-memory algorithms. The tests show that the new algorithms can be significantly more efficient than Nocedal’s method. Finally, we show how a scaling technique can significantly improve both Nocedal’s method and the new generalized conjugate gradient algorithm.
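
For context on the limited-memory baseline the authors compare against, here is a compact sketch of Nocedal's two-loop recursion as used in L-BFGS (a textbook formulation, not the new coordinate-transformation implementation), which applies the implicit inverse-Hessian approximation built from the last m curvature pairs in O(nm) work and storage.

```python
import numpy as np

def lbfgs_direction(grad, s_list, y_list):
    """Two-loop recursion: apply the implicit inverse-Hessian approximation to grad.

    s_list[i] = x_{k-i} - x_{k-i-1},  y_list[i] = g_{k-i} - g_{k-i-1}
    (most recent pair first); work and storage are O(n * m) for m stored pairs.
    """
    q = grad.copy()
    alphas = []
    rhos = [1.0 / np.dot(y, s) for s, y in zip(s_list, y_list)]
    for s, y, rho in zip(s_list, y_list, rhos):
        a = rho * np.dot(s, q)
        q -= a * y
        alphas.append(a)
    # Scaled initial matrix H0 = gamma * I, a standard choice when starting from the identity.
    if s_list:
        s0, y0 = s_list[0], y_list[0]
        q *= np.dot(s0, y0) / np.dot(y0, y0)
    for s, y, rho, a in reversed(list(zip(s_list, y_list, rhos, alphas))):
        b = rho * np.dot(y, q)
        q += (a - b) * s
    return -q   # quasi-Newton descent direction
```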


2012 ◽  
Vol 218 (12) ◽  
pp. 6851-6859 ◽  
Author(s):  
Marta I. Velazco Fontova ◽  
Aurelio R.L. Oliveira ◽  
Christiano Lyra

2021 ◽  
Author(s):  
Dimitris Bertsimas ◽  
Shimrit Shtern ◽  
Bradley Sturt

In “Two-Stage Sample Robust Optimization,” Bertsimas, Shtern, and Sturt investigate a simple approximation scheme, based on overlapping linear decision rules, for solving data-driven two-stage distributionally robust optimization problems with the type-infinity Wasserstein ambiguity set. Their main result establishes that this approximation scheme is asymptotically optimal for two-stage stochastic linear optimization problems; that is, under mild assumptions, the optimal cost and optimal first-stage decisions obtained by approximating the robust optimization problem converge to those of the underlying stochastic problem as the number of data points grows to infinity. These guarantees notably apply to two-stage stochastic problems that do not have relatively complete recourse, which arise frequently in applications. In this context, the authors show through numerical experiments that the approximation scheme is practically tractable and produces decisions that significantly outperform those obtained from state-of-the-art data-driven alternatives.
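
Before any decision-rule approximation, the worst-case expectation over the type-infinity Wasserstein ball decomposes sample by sample. In notation introduced here for illustration (data points $$\hat{\xi}_1,\dots,\hat{\xi}_N$$, radius $$\varepsilon$$, second-stage cost $$Q(x,\xi)$$), the sample robust objective reads

$$\sup_{\mathbb{Q}\in\mathcal{B}^{\infty}_{\varepsilon}(\hat{\mathbb{P}}_N)} \mathbb{E}_{\mathbb{Q}}\big[Q(x,\xi)\big] \;=\; \frac{1}{N}\sum_{i=1}^{N}\ \sup_{\|\xi-\hat{\xi}_i\|\le\varepsilon} Q(x,\xi),$$

so each data point contributes the worst case over an $$\varepsilon$$-neighbourhood around it; the overlapping linear decision rules then approximate the second-stage response within each neighbourhood.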


Author(s):  
Minglong Zhou ◽  
Gar Goei Loke ◽  
Chaithanya Bandi ◽  
Zi Qiang Glen Liau ◽  
Wilson Wang

Problem definition: We consider the intraday scheduling problem in a group of orthopaedic clinics where the planner schedules appointment times, given a sequence of appointments. We consider patient re-entry (patients may be required to go for an x-ray examination and then return to the same doctor they have seen) and variability in patient behaviours such as walk-ins, earliness, and no-shows, which lead to inefficiencies such as long patient waiting times and physician overtime. Academic/practical relevance: In our data set, 25% of the patients are required to go for an x-ray examination. We also found significant variability in patient behaviours. Hence, patient re-entry and variability in behaviours are common, yet we found little in the literature that can handle them. Methodology: We formulate the problem as a two-stage optimization problem in which scheduling decisions are made in the first stage. Queue dynamics in the second stage are modeled under a P-Queue paradigm, which minimizes a risk index representing the chance of violating performance targets, such as patient waiting times. The model reduces to a sequence of mixed-integer linear-optimization problems. Results: In comparative studies against a sample average approximation (SAA) model, our model achieves significant reductions in patient waiting times while keeping server overtime constant. Our simulations further characterize the types of uncertainties under which SAA performs poorly. Managerial insights: We present an optimization model that is easy to implement in practice and tractable to compute. Our simulations indicate that not accounting for patient re-entry or variability in patient behaviours leads to suboptimal policies, especially when these factors have specific structure that should be accounted for.
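
To see why re-entry and behavioural variability matter, even a crude discrete-event simulation of one doctor's session under a fixed schedule exposes long waits and overtime. The sketch below is a heavily simplified, hypothetical stand-in for the paper's P-Queue model; all probabilities, durations, and slot lengths are invented for illustration, and walk-ins could be added as extra arrival events in the same way.

```python
import heapq, random

random.seed(1)

SLOT = 15            # minutes between scheduled appointments (hypothetical)
N_PATIENTS = 20
CONSULT = (8, 16)    # uniform range for first-consult duration
RECHECK = (3, 6)     # duration of the post-x-ray review
XRAY_DELAY = 30      # time spent at x-ray before re-entering the queue
P_XRAY, P_NOSHOW = 0.25, 0.10
EARLINESS = (-10, 5) # arrival offset relative to the appointment time
SESSION_END = N_PATIENTS * SLOT

events = []          # (arrival_time, seq, stage) min-heap; stage 0 = consult, 1 = recheck
seq = 0
for i in range(N_PATIENTS):
    if random.random() < P_NOSHOW:
        continue
    arrival = max(0.0, i * SLOT + random.uniform(*EARLINESS))
    heapq.heappush(events, (arrival, seq, 0)); seq += 1

server_free, waits = 0.0, []
while events:
    arrival, _, stage = heapq.heappop(events)       # serve in first-come, first-served order
    start = max(arrival, server_free)
    waits.append(start - arrival)
    duration = random.uniform(*(CONSULT if stage == 0 else RECHECK))
    server_free = start + duration
    if stage == 0 and random.random() < P_XRAY:     # patient re-enters after the x-ray
        heapq.heappush(events, (server_free + XRAY_DELAY, seq, 1)); seq += 1

overtime = max(0.0, server_free - SESSION_END)
print(f"mean wait {sum(waits)/len(waits):.1f} min, overtime {overtime:.1f} min")
```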


2021 ◽  
Author(s):  
Michael F. Adamer ◽  
Sarah C. Brueningk ◽  
Alejandro Tejada-Arranz ◽  
Fabienne Estermann ◽  
Marek Basler ◽  
...  

With the steadily increasing abundance of omics data produced all over the world, sometimes decades apart and under vastly different experimental conditions, and residing in public databases, a crucial step in many data-driven bioinformatics applications is data integration. The challenge of batch effect removal for entire databases lies in the large number of batches and their coincidence with the desired biological variation, which results in design matrix singularity. This problem currently cannot be solved by any common batch correction algorithm. In this study, we present reComBat, a regularised version of the empirical Bayes method, to overcome this limitation. We demonstrate our approach for the harmonisation of public gene expression data of the human opportunistic pathogen Pseudomonas aeruginosa and study several metrics to empirically demonstrate that batch effects are successfully mitigated while biologically meaningful gene expression variation is retained. reComBat fills the gap in batch correction approaches applicable to large-scale public omics databases and opens up new avenues for data-driven analysis of complex biological processes beyond the scope of a single study.
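
The core idea, stated loosely, is to regularise the regression that estimates batch effects so the fit stays identifiable even when batch membership and biology are nearly confounded. The sketch below is a generic ridge-regularised batch adjustment, not the reComBat implementation or its API; the design matrices, penalty, and synthetic data are chosen purely for illustration.

```python
import numpy as np

def ridge_batch_correct(Y, X_bio, X_batch, lam=1.0):
    """Remove estimated batch contributions from a (samples x genes) matrix Y.

    Y        : expression matrix, samples in rows, genes in columns
    X_bio    : covariates encoding the biological variation to keep
    X_batch  : one-hot (dummy) encoding of batch membership
    lam      : ridge penalty; regularisation keeps the joint fit well-posed even
               when batch and biology are nearly collinear in the design.
    """
    X = np.hstack([X_bio, X_batch])
    p_bio = X_bio.shape[1]
    # Ridge solution B = (X'X + lam*I)^{-1} X'Y, one coefficient column per gene.
    B = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)
    B_batch = B[p_bio:]
    return Y - X_batch @ B_batch   # subtract the estimated batch component only

# Tiny synthetic example: 6 samples, 4 genes, 2 batches, 1 biological covariate.
rng = np.random.default_rng(0)
X_bio = rng.normal(size=(6, 1))
X_batch = np.repeat(np.eye(2), 3, axis=0)          # first 3 samples batch A, rest batch B
Y = X_bio @ rng.normal(size=(1, 4)) + X_batch @ rng.normal(size=(2, 4)) \
    + 0.1 * rng.normal(size=(6, 4))
Y_corrected = ridge_batch_correct(Y, X_bio, X_batch)
```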


MACRo 2015 ◽  
2015 ◽  
Vol 1 (1) ◽  
pp. 283-292
Author(s):  
Péter Böröcz ◽  
Péter Tar ◽  
István Maros

Abstract Sparse linear algebraic data structures are widely used in the solution of large-scale linear optimization problems, and the efficiency of the solver is significantly influenced by the data structures used. The implementation of such data structures is not trivial, and a performance analysis of the available options can provide valuable information for improving efficiency. In this talk we present our software that supports this task, as well as our new, special vector representation. We also report results on handling numerical issues that affect the performance of sparse linear algebraic operations.
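
As a point of reference for what such a vector representation involves, here is a minimal sparse vector in coordinate (index/value) form with a dot product against a dense array; it is a generic illustration, not the representation proposed in the talk.

```python
import numpy as np

class SparseVector:
    """Minimal coordinate-format sparse vector: parallel index and value arrays."""

    def __init__(self, n, indices, values):
        self.n = n
        self.indices = np.asarray(indices, dtype=np.int64)
        self.values = np.asarray(values, dtype=np.float64)

    def dot_dense(self, dense):
        # Only the stored nonzeros contribute, so this costs O(nnz), not O(n).
        return float(np.dot(self.values, dense[self.indices]))

    def to_dense(self):
        out = np.zeros(self.n)
        out[self.indices] = self.values
        return out

v = SparseVector(1_000_000, indices=[3, 17, 999_999], values=[1.5, -2.0, 4.0])
d = np.ones(1_000_000)
print(v.dot_dense(d))   # 3.5
```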


2013 ◽  
Vol 221 (3) ◽  
pp. 190-200 ◽  
Author(s):  
Jörg-Tobias Kuhn ◽  
Thomas Kiefer

Several techniques have been developed in recent years to generate optimal large-scale assessments (LSAs) of student achievement. These techniques often represent a blend of procedures from such diverse fields as experimental design, combinatorial optimization, particle physics, or neural networks. However, despite the theoretical advances in the field, there still exists a surprising scarcity of well-documented test designs in which all factors that have guided design decisions are explicitly and clearly communicated. This paper therefore has two goals. First, a brief summary of relevant key terms, as well as experimental designs and automated test assembly routines in LSA, is given. Second, conceptual and methodological steps in designing the assessment of the Austrian educational standards in mathematics are described in detail. The test design was generated using a two-step procedure, starting at the item block level and continuing at the item level. Initially, a partially balanced incomplete item block design was generated using simulated annealing, whereas in a second step, items were assigned to the item blocks using mixed-integer linear optimization in combination with a shadow-test approach.
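
To make the first step concrete, the following sketch shows a generic simulated-annealing loop that assigns item blocks to test booklets while balancing how often each pair of blocks co-occurs. It is a hypothetical toy, not the Austrian standards design; the sizes, energy function, and cooling schedule are invented for illustration.

```python
import itertools, math, random

random.seed(0)

N_BLOCKS, N_BOOKLETS, BLOCKS_PER_BOOKLET = 12, 18, 4   # hypothetical design sizes

def pair_counts(booklets):
    counts = {p: 0 for p in itertools.combinations(range(N_BLOCKS), 2)}
    for b in booklets:
        for p in itertools.combinations(sorted(b), 2):
            counts[p] += 1
    return counts

def energy(booklets):
    # Imbalance of pairwise block co-occurrence: variance of the pair counts.
    c = list(pair_counts(booklets).values())
    mean = sum(c) / len(c)
    return sum((x - mean) ** 2 for x in c)

# Start from random booklets and improve by annealed single-block swaps.
booklets = [random.sample(range(N_BLOCKS), BLOCKS_PER_BOOKLET) for _ in range(N_BOOKLETS)]
current = energy(booklets)
T = 5.0
for step in range(20000):
    i = random.randrange(N_BOOKLETS)
    out_block = random.choice(booklets[i])
    in_block = random.choice([b for b in range(N_BLOCKS) if b not in booklets[i]])
    candidate = [list(b) for b in booklets]
    candidate[i].remove(out_block)
    candidate[i].append(in_block)
    new = energy(candidate)
    # Metropolis rule: always accept improvements, sometimes accept worse moves.
    if new < current or random.random() < math.exp((current - new) / T):
        booklets, current = candidate, new
    T *= 0.9995                      # geometric cooling

print("final imbalance:", current)
```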

