Data-driven Random Fourier Features using Stein Effect

Author(s):  
Wei-Cheng Chang ◽  
Chun-Liang Li ◽  
Yiming Yang ◽  
Barnabás Póczos

Large-scale kernel approximation is an important problem in machine learning research. Approaches using random Fourier features have become increasingly popular \cite{Rahimi_NIPS_07}, where kernel approximation is treated as empirical mean estimation via Monte Carlo (MC) or Quasi-Monte Carlo (QMC) integration \cite{Yang_ICML_14}. A limitation of the current approaches is that all the features receive equal weights summing to 1. In this paper, we propose a novel shrinkage estimator based on the "Stein effect", which provides a data-driven weighting strategy for random features and enjoys theoretical justification in terms of lowering the empirical risk. We further present an efficient randomized algorithm for large-scale applications of the proposed method. Our empirical results on six benchmark data sets demonstrate the advantage of this approach over representative baselines in both kernel approximation and supervised learning tasks.
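The equal-weight baseline this paper improves on can be made concrete. Below is a minimal sketch of standard random Fourier features for the Gaussian kernel in the spirit of Rahimi and Recht; the function name, data, and parameter values are illustrative, and the paper's Stein-effect weighting is not reproduced here.

```python
import numpy as np

def rff_features(X, n_features, gamma, rng):
    """Map X to random Fourier features approximating the Gaussian
    kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    d = X.shape[1]
    # Frequencies drawn from the kernel's spectral density N(0, 2*gamma*I).
    W = rng.normal(scale=np.sqrt(2 * gamma), size=(d, n_features))
    b = rng.uniform(0, 2 * np.pi, size=n_features)
    # Each feature gets the same weight 1/n_features -- the scheme
    # the paper's data-driven weighting replaces.
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
Z = rff_features(X, n_features=2000, gamma=0.5, rng=rng)
K_approx = Z @ Z.T                                  # approximate kernel matrix
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K_exact = np.exp(-0.5 * sq)                         # exact Gaussian kernel
print(np.abs(K_approx - K_exact).max())             # small approximation error
```

The Monte Carlo view is visible here: each kernel entry is an empirical mean over the random frequencies, so its error shrinks as the number of features grows.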

Energies ◽  
2021 ◽  
Vol 14 (8) ◽  
pp. 2328
Author(s):  
Mohammed Alzubaidi ◽  
Kazi N. Hasan ◽  
Lasantha Meegahapola ◽  
Mir Toufikur Rahman

This paper presents a comparative analysis of six sampling techniques to identify an efficient and accurate technique for probabilistic voltage stability assessment in large-scale power systems. Six sampling techniques are investigated and compared in terms of accuracy and efficiency: Monte Carlo (MC), three versions of Quasi-Monte Carlo (QMC), i.e., Sobol, Halton, and Latin Hypercube, Markov Chain MC (MCMC), and importance sampling (IS), to evaluate their suitability for probabilistic voltage stability analysis in large-scale uncertain power systems. The coefficient of determination (R2) and root mean square error (RMSE) are calculated to measure the accuracy and efficiency of the sampling techniques relative to each other. All six sampling techniques provide more than 99% accuracy when producing a large number of wind speed random samples (8760 samples). In terms of efficiency, on the other hand, the three versions of QMC are the most efficient, providing more than 96% accuracy with only a small number of generated samples (150 samples) compared to the other techniques.
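As a toy illustration of the kind of MC-versus-QMC comparison described above (not the paper's power-system study), the sketch below estimates a simple d-dimensional integral with MC and the three QMC samplers from scipy.stats.qmc; the test integrand, dimension, and sample sizes are assumptions.

```python
import numpy as np
from scipy.stats import qmc

def f(x):
    # Smooth test integrand on [0,1]^d with known exact integral 1.
    return np.prod(1.0 + 0.5 * (x - 0.5), axis=1)

d, n, seed = 5, 1024, 0
rng = np.random.default_rng(seed)
samples = {
    "MC": rng.random((n, d)),
    "Sobol": qmc.Sobol(d, seed=seed).random(n),
    "Halton": qmc.Halton(d, seed=seed).random(n),
    "LHS": qmc.LatinHypercube(d, seed=seed).random(n),
}
# Absolute error of each equal-weight sample-mean estimate.
errors = {name: abs(f(pts).mean() - 1.0) for name, pts in samples.items()}
for name, err in errors.items():
    print(f"{name:7s} abs. error = {err:.2e}")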


Mathematics ◽  
2020 ◽  
Vol 8 (5) ◽  
pp. 796
Author(s):  
Pavel Praks ◽  
Dejan Brkić

In this reply, we present updated approximations to the Colebrook equation for flow friction. The equations are equally computationally simple, but with increased accuracy thanks to the optimization procedure proposed by the discusser, Dr. Majid Niazkar. Our large-scale quasi-Monte Carlo verifications confirm that the novel optimized numerical parameters presented here further significantly increase the accuracy of the estimated flow friction factor.
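For context on why explicit approximations matter: the Colebrook equation is implicit in the friction factor, so an exact evaluation needs iteration. A minimal sketch of the standard fixed-point solution (a reference approach, not the optimized approximations of this reply) follows; the flow conditions are illustrative.

```python
import math

def colebrook(Re, rel_rough, tol=1e-12, max_iter=100):
    """Darcy friction factor f from the implicit Colebrook equation
    1/sqrt(f) = -2 log10(rel_rough/3.7 + 2.51/(Re*sqrt(f))),
    solved by fixed-point iteration on x = 1/sqrt(f)."""
    x = 6.0  # initial guess, typical magnitude for turbulent flow
    for _ in range(max_iter):
        x_new = -2.0 * math.log10(rel_rough / 3.7 + 2.51 * x / Re)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return 1.0 / (x * x)

f = colebrook(Re=1e5, rel_rough=1e-4)
print(f)  # roughly 0.018 for these illustrative conditions
```

Explicit approximations replace this loop with a single closed-form evaluation, which is what the optimized parameters in the reply refine; QMC sampling of the (Re, roughness) domain is a natural way to verify accuracy over many conditions at once.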


Acta Numerica ◽  
2013 ◽  
Vol 22 ◽  
pp. 133-288 ◽  
Author(s):  
Josef Dick ◽  
Frances Y. Kuo ◽  
Ian H. Sloan

This paper is a contemporary review of QMC ('quasi-Monte Carlo') methods, that is, equal-weight rules for the approximate evaluation of high-dimensional integrals over the unit cube [0,1]^s, where s may be large, or even infinite. After a general introduction, the paper surveys recent developments in lattice methods, digital nets, and related themes. Among those recent developments are methods of construction of both lattices and digital nets, to yield QMC rules that have a prescribed rate of convergence for sufficiently smooth functions, and ideally also guaranteed slow growth (or no growth) of the worst-case error as s increases. A crucial role is played by parameters called 'weights', since a careful use of the weight parameters is needed to ensure that the worst-case errors in an appropriately weighted function space are bounded, or grow only slowly, as the dimension s increases. Important tools for the analysis are weighted function spaces, reproducing kernel Hilbert spaces, and discrepancy, all of which are discussed with an appropriate level of detail.
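The "equal-weight rules" above are easy to exhibit with the simplest lattice method, a rank-1 lattice rule: the cubature points are {i*z/N mod 1} and every point gets weight 1/N. The generating vector and integrand below are illustrative assumptions, not a construction from the paper.

```python
import numpy as np

def lattice_rule(f, z, N):
    """Equal-weight rank-1 lattice rule: average f over the points
    {i*z/N mod 1}, i = 0..N-1, each with weight 1/N."""
    i = np.arange(N)[:, None]
    points = (i * np.asarray(z)[None, :] / N) % 1.0
    return f(points).mean()

# Smooth periodic integrand on [0,1]^3 with exact integral 1.
f = lambda x: np.prod(1.0 + np.sin(2 * np.pi * x) / 3.0, axis=1)
z = (1, 433, 229)   # illustrative generating vector, not an optimized one
est = lattice_rule(f, z, 1009)
print(est)
```

Because this integrand is a trigonometric polynomial of degree 1 in each variable and no low-order frequency aliases to zero modulo N = 1009 for this z, the rule integrates it essentially exactly; for general smooth periodic integrands, choosing z well is precisely the construction problem the review surveys.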


2011 ◽  
Vol 53 (1) ◽  
pp. 1-37 ◽  
Author(s):  
F. Y. KUO ◽  
CH. SCHWAB ◽  
I. H. SLOAN

This paper is a contemporary review of quasi-Monte Carlo (QMC) methods, that is, equal-weight rules for the approximate evaluation of high-dimensional integrals over the unit cube [0,1]^s. It first introduces the by-now standard setting of weighted Hilbert spaces of functions with square-integrable mixed first derivatives, and then indicates alternative settings, such as non-Hilbert spaces, that can sometimes be more suitable. Original contributions include the extension of the fast component-by-component (CBC) construction of lattice rules that achieve the optimal convergence order (a rate of almost 1/N, where N is the number of points, independently of dimension) to so-called "product and order dependent" (POD) weights, as seen in some recent applications. Although the paper has a strong focus on lattice rules, the function space settings are applicable to all QMC methods. Furthermore, the error analysis and construction of lattice rules can be adapted to polynomial lattice rules from the family of digital nets.
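The CBC construction mentioned above can be sketched in its naive O(d N^2) form: choose the generating vector one component at a time, each time minimizing a computable worst-case error. The error formula below is the standard shift-averaged criterion in a weighted Korobov-type space of smoothness 2, with kernel term 1 + gamma_j * 2*pi^2 * B2({k z_j / N}) and B2(x) = x^2 - x + 1/6; the weights gamma_j and N = 127 are illustrative assumptions, and the fast CBC algorithm of the paper reduces the cost with FFTs.

```python
import numpy as np

def cbc(N, d, gamma):
    """Naive component-by-component search for a rank-1 lattice
    generating vector, minimising the shift-averaged worst-case
    error e^2 = -1 + (1/N) * sum_k prod_j [1 + gamma_j * w(k*z_j/N)],
    with w(x) = 2*pi^2 * ({x}^2 - {x} + 1/6)."""
    k = np.arange(N)
    prod = np.ones(N)          # running product over chosen components
    z = []
    for j in range(d):
        best_zj, best_err = None, np.inf
        for zj in range(1, N):  # naive O(N) search per component
            x = (k * zj % N) / N
            term = 1.0 + gamma[j] * 2 * np.pi**2 * (x * x - x + 1.0 / 6.0)
            err2 = -1.0 + (prod * term).mean()
            if err2 < best_err:
                best_err, best_zj = err2, zj
        z.append(best_zj)
        x = (k * best_zj % N) / N
        prod *= 1.0 + gamma[j] * 2 * np.pi**2 * (x * x - x + 1.0 / 6.0)
    return z

z = cbc(N=127, d=4, gamma=[0.9 ** j for j in range(4)])
print(z)
```

The greedy, per-component structure is the essence of CBC; POD weights change only how gamma enters the product, not this outer loop.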


2021 ◽  
Vol 502 (3) ◽  
pp. 3942-3954
Author(s):  
D Hung ◽  
B C Lemaux ◽  
R R Gal ◽  
A R Tomczak ◽  
L M Lubin ◽  
...  

We present a new mass function of galaxy clusters and groups using optical/near-infrared (NIR) wavelength spectroscopic and photometric data from the Observations of Redshift Evolution in Large-Scale Environments (ORELSE) survey. At z ∼ 1, cluster mass function studies are rare regardless of wavelength and have never been attempted from an optical/NIR perspective. This work serves as a proof of concept that z ∼ 1 cluster mass functions are achievable without supplemental X-ray or Sunyaev-Zel’dovich data. Measurements of the cluster mass function provide important constraints on cosmological parameters and are complementary to other probes. With ORELSE, a new cluster finding technique based on Voronoi tessellation Monte Carlo (VMC) mapping, and rigorous purity and completeness testing, we have obtained ∼240 galaxy overdensity candidates in the redshift range 0.55 < z < 1.37 at a mass range of 13.6 < log (M/M⊙) < 14.8. This mass range is comparable to existing optical cluster mass function studies for the local universe. Our candidate numbers vary based on the choice of multiple input parameters related to detection and characterization in our cluster finding algorithm, which we incorporated into the mass function analysis through a Monte Carlo scheme. We find cosmological constraints on the matter density, Ωm, and the amplitude of fluctuations, σ8, of $\Omega _{m} = 0.250^{+0.104}_{-0.099}$ and $\sigma _{8} = 1.150^{+0.260}_{-0.163}$. While our Ωm value is close to concordance, our σ8 value is ∼2σ higher because of the inflated observed number densities compared to theoretical mass function models owing to how our survey targeted overdense regions. With Euclid and several other large, unbiased optical surveys on the horizon, VMC mapping will enable optical/NIR cluster cosmology at redshifts much higher than what has been possible before.


2020 ◽  
Vol 26 (3) ◽  
pp. 171-176
Author(s):  
Ilya M. Sobol ◽  
Boris V. Shukhman

A crude Monte Carlo (MC) method allows one to calculate integrals over a d-dimensional cube. As the number N of integration nodes becomes large, the probable error of the MC method decreases as O(1/√N). The use of quasi-random points instead of random points in the MC algorithm converts it to the quasi-Monte Carlo (QMC) method. The asymptotic error estimate of QMC integration of d-dimensional functions contains a multiplier 1/N. However, the multiplier (ln N)^d is also part of the error estimate, which makes the estimate virtually useless in practice. We have proved that, in the general case, the QMC error estimate is not limited to the factor 1/N. However, our numerical experiments show that using quasi-random points of Sobol sequences with N = 2^m for natural m makes the integration error approximately proportional to 1/N. In our numerical experiments, d ≤ 15, and we used N ≤ 2^40 points generated by the SOBOLSEQ16384 code published in 2011. In this code, d ≤ 2^14 and N ≤ 2^63.
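The near-1/N behaviour at N = 2^m can be checked directly with the Sobol generator in scipy.stats.qmc (a different implementation from the SOBOLSEQ16384 code the authors used); the smooth integrand with known integral 1 and the choices d = 8, m ∈ {6, 10, 14} are illustrative assumptions.

```python
import numpy as np
from scipy.stats import qmc

d = 8
# Smooth product integrand on [0,1]^d with exact integral 1,
# since the integral of 1.2*(1 - x^2/2) over [0,1] equals 1.
f = lambda x: np.prod(1.2 * (1.0 - 0.5 * x * x), axis=1)

errors = {}
for m in (6, 10, 14):
    pts = qmc.Sobol(d, scramble=False).random(2 ** m)  # N = 2^m Sobol points
    errors[m] = abs(f(pts).mean() - 1.0)
    print(f"N=2^{m:<2d}  abs. error = {errors[m]:.3e}")
```

Each fourfold step in m multiplies N by 16, and for a smooth integrand the error at these power-of-two sample sizes should shrink roughly in proportion to 1/N, consistent with the authors' observation.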

