Improved Guarantees and a Multiple-descent Curve for Column Subset Selection and the Nyström Method (Extended Abstract)

Author(s):  
Michał Dereziński ◽  
Rajiv Khanna ◽  
Michael W. Mahoney

The Column Subset Selection Problem (CSSP) and the Nyström method are among the leading tools for constructing interpretable low-rank approximations of large datasets by selecting a small but representative set of features or instances. A fundamental question in this area is: what is the cost of this interpretability, i.e., how well can a data subset of size k compete with the best rank-k approximation? We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees which go beyond the standard worst-case analysis. Our approach leads to significantly better bounds for datasets with known rates of singular value decay, e.g., polynomial or exponential decay. Our analysis also reveals an intriguing phenomenon: the cost of interpretability as a function of k may exhibit multiple peaks and valleys, which we call a multiple-descent curve. A lower bound we establish shows that this behavior is not an artifact of our analysis, but rather it is an inherent property of the CSSP and Nyström tasks. Finally, using the example of a radial basis function (RBF) kernel, we show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
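
As a rough illustration of the trade-off the abstract describes, the sketch below (an assumed, simplified setup, not the authors' sampling scheme) compares a rank-k column-subset approximation against the best rank-k SVD approximation on a synthetic matrix with polynomially decaying singular values; squared-norm column sampling stands in here for the more sophisticated selection schemes analyzed in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data matrix with polynomially decaying singular values.
n, d, k = 200, 100, 10
U, _ = np.linalg.qr(rng.standard_normal((n, d)))
V, _ = np.linalg.qr(rng.standard_normal((d, d)))
s = 1.0 / np.arange(1, d + 1)            # polynomial spectral decay
A = U @ np.diag(s) @ V.T

# Best rank-k approximation (SVD baseline).
Uk, sk, Vkt = np.linalg.svd(A, full_matrices=False)
A_svd = Uk[:, :k] @ np.diag(sk[:k]) @ Vkt[:k, :]

# CSSP-style approximation: pick k columns (here by squared-norm sampling,
# a simple stand-in for the schemes analyzed in the paper), then project A
# onto their span.
probs = np.sum(A**2, axis=0) / np.sum(A**2)
cols = rng.choice(d, size=k, replace=False, p=probs)
C = A[:, cols]
A_cssp = C @ np.linalg.pinv(C) @ A       # projection onto span of chosen columns

err_svd = np.linalg.norm(A - A_svd, "fro")
err_cssp = np.linalg.norm(A - A_cssp, "fro")
print(f"best rank-{k} error : {err_svd:.4f}")
print(f"column-subset error : {err_cssp:.4f}  (ratio {err_cssp / err_svd:.2f})")
```

The printed ratio is one way to read the "cost of interpretability": how much worse a k-column approximation is than the unrestricted best rank-k approximation.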

Author(s):  
P. M. Martino ◽  
G. A. Gabriele

Abstract The proper selection of tolerances is an important part of mechanical design that can have a significant impact on the cost and quality of the final product. Yet, despite their importance, current techniques for tolerance design are rather primitive and often based on experience and trial and error. Better tolerance design methods have been proposed but are seldom used because of the difficulty of formulating the necessary design equations for practical problems. In this paper we propose a technique for the automatic formulation of the design equations, or design functions, based on the use of solid models and variational geometry. A prototype system has been developed which can model conventional and statistical tolerances, and a limited set of geometric tolerances. The prototype system is limited to the modeling of single parts, but can perform both a worst-case analysis and a statistical analysis. Results on several simple parts with known characteristics are presented which demonstrate the accuracy of the system and the types of analysis it can perform. The paper concludes with a discussion of extending the prototype system to a broader range of geometry and to the handling of assemblies.
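
The prototype derives its design functions from solid models and variational geometry; the sketch below only illustrates, on a hypothetical one-dimensional tolerance stack with assumed values, the two analysis modes the abstract mentions (worst-case and statistical root-sum-square).

```python
import numpy as np

# Hypothetical 1-D tolerance stack: gap = A - (B + C + D).
# Nominal dimensions and symmetric tolerances are assumed for illustration.
nominal = {"A": 50.0, "B": 20.0, "C": 15.0, "D": 14.5}
tol     = {"A": 0.10, "B": 0.05, "C": 0.05, "D": 0.05}
sens    = {"A": +1.0, "B": -1.0, "C": -1.0, "D": -1.0}   # d(gap)/d(dimension)

gap_nominal = sum(sens[k] * nominal[k] for k in nominal)

# Worst-case analysis: every dimension simultaneously at its extreme.
wc = sum(abs(sens[k]) * tol[k] for k in tol)

# Statistical (root-sum-square) analysis: tolerances treated as independent
# limits, so their variances add.
rss = np.sqrt(sum((sens[k] * tol[k]) ** 2 for k in tol))

print(f"nominal gap          : {gap_nominal:+.3f}")
print(f"worst-case variation : +/- {wc:.3f}")
print(f"statistical (RSS)    : +/- {rss:.3f}")
```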


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Ling Wang ◽  
Hongqiao Wang ◽  
Guangyuan Fu

Extensions of kernel methods to class imbalance problems have been extensively studied. Although they work well in coping with nonlinear problems, their high computation and memory costs severely limit their application to real-world imbalanced tasks. The Nyström method is an effective technique for scaling kernel methods. However, the standard Nyström method needs to sample a sufficiently large number of landmark points to ensure an accurate approximation, which seriously affects its efficiency. In this study, we propose a multi-Nyström method based on mixtures of Nyström approximations to avoid the explosion of the subkernel matrix, while the optimization of the mixture weights is embedded into the model training process via multiple kernel learning (MKL) algorithms to yield a more accurate low-rank approximation. Moreover, we select subsets of landmark points according to the imbalance distribution to reduce the model’s sensitivity to skewness. We also provide a kernel stability analysis of our method and show that the model solution error is bounded by the weighted approximation errors, which can help us improve the learning process. Extensive experiments on several large-scale datasets show that our method achieves higher classification accuracy and a dramatic speedup of MKL algorithms.
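
For context, here is a minimal sketch of the standard (single) Nyström approximation that the multi-Nyström method builds on; the RBF kernel, landmark count, and uniform landmark sampling are illustrative assumptions, not the paper's imbalance-aware selection or MKL weighting.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    """Gaussian RBF kernel matrix between rows of X and rows of Y."""
    sq = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * sq)

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))      # synthetic data
m = 50                                  # number of landmark points (assumed)

# Standard Nystrom approximation: K ~ C W^+ C^T, where C = K(X, landmarks)
# and W = K(landmarks, landmarks).
landmarks = X[rng.choice(len(X), size=m, replace=False)]
C = rbf_kernel(X, landmarks)
W = rbf_kernel(landmarks, landmarks)
K_nystrom = C @ np.linalg.pinv(W) @ C.T

K_exact = rbf_kernel(X, X)
rel_err = np.linalg.norm(K_exact - K_nystrom, "fro") / np.linalg.norm(K_exact, "fro")
print(f"relative Frobenius error with {m} landmarks: {rel_err:.4f}")
```

The multi-Nyström idea replaces this single approximation with a weighted mixture of several such approximations, each built from a smaller landmark set, so no single subkernel matrix has to grow large.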


Quantum ◽  
2020 ◽  
Vol 4 ◽  
pp. 234 ◽  
Author(s):  
Alessandro Rudi ◽  
Leonard Wossnig ◽  
Carlo Ciliberto ◽  
Andrea Rocchetto ◽  
Massimiliano Pontil ◽  
...  

Simulating the time-evolution of quantum mechanical systems is BQP-hard and expected to be one of the foremost applications of quantum computers. We consider classical algorithms for the approximation of Hamiltonian dynamics using subsampling methods from randomized numerical linear algebra. We derive a simulation technique whose runtime scales polynomially in the number of qubits and the Frobenius norm of the Hamiltonian. As an immediate application, we show that sample-based quantum simulation, a type of evolution where the Hamiltonian is a density matrix, can be efficiently classically simulated under specific structural conditions. Our main technical contribution is a randomized algorithm for approximating Hermitian matrix exponentials. The proof leverages a low-rank, symmetric approximation via the Nyström method. Our results suggest that under strong sampling assumptions there exist classical poly-logarithmic time simulations of quantum computations.
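
A toy numerical sketch of the underlying idea, under assumptions: build a Nyström-style symmetric low-rank approximation of a Hermitian matrix from a random column subset, then exponentiate it through its small eigendecomposition. The matrix, subset size, and sampling below are hypothetical choices for illustration, not the paper's randomized algorithm.

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)

# Hermitian (real symmetric) matrix with fast spectral decay, so a
# low-rank Nystrom approximation is meaningful.
n, r = 200, 20
Q, _ = np.linalg.qr(rng.standard_normal((n, r)))
lam = np.exp(-np.arange(r))                 # rapidly decaying eigenvalues
H = Q @ np.diag(lam) @ Q.T

# Nystrom-style symmetric approximation from a random row/column subset S:
# H ~ C W^+ C^T with C = H[:, S], W = H[S, S].
S = rng.choice(n, size=2 * r, replace=False)
C, W = H[:, S], H[np.ix_(S, S)]
H_approx = C @ np.linalg.pinv(W) @ C.T

# Exponentiate through the small-rank eigendecomposition of the approximation:
# exp(-iHt) ~ I + V (exp(-i Lambda t) - I) V^T, acting as the identity
# outside the span of the approximation.
t = 1.0
w, V = np.linalg.eigh(H_approx)
keep = np.abs(w) > 1e-10
w, V = w[keep], V[:, keep]
U_approx = np.eye(n, dtype=complex) + V @ np.diag(np.exp(-1j * w * t) - 1) @ V.T

U_exact = expm(-1j * H * t)
print("operator-norm error:", np.linalg.norm(U_exact - U_approx, 2))
```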


2017 ◽  
Vol 250 ◽  
pp. 1-15 ◽  
Author(s):  
Liang Lan ◽  
Kai Zhang ◽  
Hancheng Ge ◽  
Wei Cheng ◽  
Jun Liu ◽  
...  

2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Zhikuan Zhao ◽  
Jack K. Fitzsimons ◽  
Patrick Rebentrost ◽  
Vedran Dunjko ◽  
Joseph F. Fitzsimons

Abstract Machine learning has recently emerged as a fruitful area for finding potential quantum computational advantage. Many of the quantum-enhanced machine learning algorithms critically hinge upon the ability to efficiently produce states proportional to high-dimensional data points stored in a quantum accessible memory. Even given query access to exponentially many entries stored in a database, the construction of which is considered a one-off overhead, it has been argued that the cost of preparing such amplitude-encoded states may offset any exponential quantum advantage. Here we prove, using smoothed analysis, that if the data analysis algorithm is robust against small entry-wise input perturbations, state preparation can always be achieved with a constant number of queries. This criterion is typically satisfied in realistic machine learning applications, where input data is subject to moderate noise. Our results are equally applicable to the recent seminal progress in quantum-inspired algorithms, where specially constructed databases suffice for polylogarithmic classical algorithms in low-rank cases. The consequence of our finding is that, for the purposes of practical machine learning, polylogarithmic processing time is possible under a general and flexible input model, with either quantum algorithms or quantum-inspired classical algorithms in the low-rank case.
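
A toy sketch of amplitude encoding and of the kind of entry-wise robustness the criterion refers to; the data vector and noise model below are assumptions for illustration, not the paper's smoothed-analysis construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Amplitude encoding: a data vector x is mapped to the quantum state
# |x> = x / ||x||, so its entries become amplitudes over log2(len(x)) qubits.
x = rng.standard_normal(2**10)           # hypothetical 10-qubit data point
state = x / np.linalg.norm(x)

# Small entry-wise perturbation of relative size eps: a robust data-analysis
# algorithm should barely notice the difference between the two encodings.
eps = 1e-3
x_noisy = x + eps * np.abs(x) * rng.standard_normal(x.shape)
state_noisy = x_noisy / np.linalg.norm(x_noisy)

fidelity = abs(np.dot(state, state_noisy)) ** 2
print(f"fidelity between clean and perturbed encodings: {fidelity:.8f}")
```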

