A Low-Sample-Count, High-Precision Pareto Front Adaptive Sampling Algorithm Based on Multi-Criteria and Voronoi

Author(s):  
Changkun Wu ◽  
Ke Liang ◽  
Hailang Sang ◽  
Yu Ye ◽  
Mingzhang Pan

Abstract In this paper, a Pareto front (PF)-based sampling algorithm, the PF-Voronoi sampling method, is proposed to solve medium-sized, computationally intensive multi-objective problems. A Voronoi diagram is introduced to classify the region containing PF prediction points into Pareto front cells (PFCs). Valid PFCs are screened according to the maximum crowding criterion (MCC), the maximum LOO error criterion (MLEC), and the maximum mean MSE criterion (MMMSEC). Sampling points are selected among the valid PFCs based on Euclidean distance. The PF-Voronoi sampling method is applied to coupled Kriging and NSGA-II models, and its validity is verified on the ZDT mathematical cases. The results show that the MCC criterion helps to improve the distribution diversity of the PF. The MLEC and MMMSEC criteria reduce the number of training samples by 38.9% and 21.7%, respectively. The computational cost of the algorithm is reduced by more than 44.2% compared to the EHVIMOPSO and SAO-MOEA algorithms. The algorithm can be applied to multidisciplinary, multi-objective, and computationally intensive complex systems.
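The screening-and-selection step can be pictured with a small sketch. The Python fragment below is a minimal illustration and not the authors' implementation: it approximates the Voronoi-cell assignment with nearest-center lookups via scipy's cKDTree, ranks predicted PF points by a simple crowding measure, and picks the next sample as the candidate farthest (in Euclidean distance) from the designs already evaluated. The thresholds and the single crowding score are simplified stand-ins for the MCC/MLEC/MMMSEC screening described above.

```python
# Minimal sketch (not the paper's code): screen "cells" around predicted
# Pareto-front points by a crowding measure, then pick the next sample as
# the candidate farthest from the points already evaluated.
import numpy as np
from scipy.spatial import cKDTree

def crowding(pf_points):
    """Rough crowding score: distance to the nearest other PF prediction.
    Larger value = sparser region, i.e. more valuable to refine."""
    tree = cKDTree(pf_points)
    d, _ = tree.query(pf_points, k=2)        # k=2: first neighbour is the point itself
    return d[:, 1]

def select_next_sample(pf_points, candidates, evaluated, n_cells=5):
    """Keep the n_cells sparsest PF predictions ('valid cells'), then choose
    the candidate design farthest from the already-evaluated designs."""
    sparse_idx = np.argsort(-crowding(pf_points))[:n_cells]
    valid_centers = pf_points[sparse_idx]
    # candidates associated with a valid cell: nearest-valid-center test
    d_to_valid, _ = cKDTree(valid_centers).query(candidates)
    in_valid = candidates[d_to_valid < np.median(d_to_valid)]
    # maximize Euclidean distance to existing samples
    d_to_eval, _ = cKDTree(evaluated).query(in_valid)
    return in_valid[np.argmax(d_to_eval)]

rng = np.random.default_rng(0)
pf = rng.random((50, 2))          # predicted PF points (objective space, toy data)
cands = rng.random((200, 2))      # candidate designs mapped to objective space
done = rng.random((10, 2))        # already-evaluated samples
print(select_next_sample(pf, cands, done))
```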

2021 ◽  
Author(s):  
Carlo Cristiano Stabile ◽  
Marco Barbiero ◽  
Giorgio Fighera ◽  
Laura Dovera

Abstract Optimizing well locations for a green field is critical to mitigate development risks. Performing such workflows with reservoir simulations is very challenging due to the huge computational cost. Proxy models can instead provide accurate estimates at a fraction of the computing time. This study presents an application of new-generation functional proxies to optimize the well locations in a real oil field with respect to the actualized oil production across all the different geological realizations. Proxies are built with Universal Trace Kriging and are functional in time, allowing oil flows to be actualized over the asset lifetime. Proxies are trained on reservoir simulations using randomly sampled well locations. Two proxies are created, for a pessimistic model (P10) and a mid-case model (P50), to capture the geological uncertainties. The optimization step uses the Non-dominated Sorting Genetic Algorithm, with the discounted oil productions of the two proxies as objective functions. An adaptive approach was employed: optimized points found from a first optimization were used to re-train the proxy models, and a second run of optimization was performed. The methodology was applied to a real oil reservoir to optimize the location of four vertical production wells and compared against reference locations. 111 geological realizations were available, in which one relevant uncertainty is the possible presence of compartments. The decision space, represented by the horizontal translation vectors for each well, was sampled using Plackett-Burman and Latin-Hypercube designs. A first application produced a proxy with poor predictive quality. Redrawing the areas to avoid overlaps and to confine the decision space of each well to one compartment improved the quality. This suggests that the proxy's predictive ability deteriorates in the presence of highly non-linear responses caused by sealing faults or by wells interchanging positions. We then followed a 2-step adaptive approach: a first optimization was performed and the resulting Pareto front was validated with reservoir simulations; to further improve the proxy quality in this region of the decision space, the validated Pareto-front points were added to the initial dataset to retrain the proxy and rerun the optimization. The final well locations were validated on all 111 realizations with reservoir simulations and resulted in an overall increase of the discounted production of about 5% compared to the reference development strategy. The adaptive approach, combined with functional proxies, proved successful in improving the workflow by purposefully enriching the training set with data points able to enhance the effectiveness of the optimization step. Each optimization run relied on about 1 million proxy evaluations, which required negligible computational time. The same workflow carried out with standard reservoir simulations would have been practically unfeasible.
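The 2-step adaptive loop can be sketched in a few lines. The example below is a hypothetical illustration using scikit-learn's Gaussian-process regressor as a stand-in kriging proxy and a toy "simulator"; names such as simulate_npv are invented for the sketch and are not from the study, and a dense candidate scan stands in for the NSGA-II step.

```python
# Hedged sketch of the adaptive proxy loop (toy stand-ins, not the field study):
# train a kriging-like proxy, optimize on it, validate the best points with the
# "simulator", append them to the training set, and retrain.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def simulate_npv(x):                        # hypothetical expensive simulator
    return -np.sum((x - 0.3) ** 2, axis=1)  # toy response standing in for discounted oil

rng = np.random.default_rng(1)
X = rng.random((30, 2))                     # sampled well-location shifts (2D toy space)
y = simulate_npv(X)

for step in range(2):                       # the 2-step adaptive approach
    proxy = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True).fit(X, y)
    cands = rng.random((100_000, 2))        # cheap proxy evaluations replace simulations
    best = cands[np.argsort(-proxy.predict(cands))[:5]]
    y_true = simulate_npv(best)             # validate the optimized points with the simulator
    X, y = np.vstack([X, best]), np.concatenate([y, y_true])

print("best validated point:", X[np.argmax(y)])
```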


2021 ◽  
Author(s):  
Alfonso Rojas-Domínguez ◽  
Ivvan Valdez ◽  
Manuel Ornelas-Rodríguez ◽  
Martín Carpio

Abstract Fostered by technological and theoretical developments, deep neural networks have achieved great success in many applications, but their training by means of mini-batch stochastic gradient descent (SGD) can be very costly due to the possibly tens of millions of parameters to be optimized and the large number of training examples that must be processed. This computational cost is exacerbated by the inefficiency of the uniform sampling typically used by SGD to form the training mini-batches: since not all training examples are equally relevant for training, sampling them under a uniform distribution is far from optimal. A better strategy is to form the mini-batches by sampling the training examples under a distribution where the probability of being selected is proportional to the relevance of each individual example. This can be achieved through Importance Sampling (IS), which also minimizes the variance of the gradients w.r.t. the network parameters, further improving convergence. In this paper, an IS-based adaptive sampling method is studied that exploits side information to construct the required probability distribution. This method is modified to enable its application to deep neural networks, and the improved method is dubbed Regularized Adaptive Sampling (RAS). Experimental comparison (using deep convolutional networks for classification of the MNIST and CIFAR-10 datasets) of RAS against SGD and against another state-of-the-art sampling method shows that RAS achieves relative improvements in the training process without incurring significant overhead or affecting the accuracy of the networks.
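The core mechanism, sampling proportional to relevance with an unbiasedness correction, is easy to sketch. The snippet below is a generic importance-sampling mini-batch routine, not the RAS algorithm itself; the per-example relevance scores (e.g. recent losses or gradient-norm estimates) are an assumption of the sketch.

```python
# Generic importance-sampling mini-batch sketch (not the RAS algorithm itself):
# sample examples with probability proportional to a per-example "relevance"
# score and re-weight by 1/(N * p_i) so the gradient estimate stays unbiased.
import numpy as np

def importance_batch(scores, batch_size, rng):
    p = scores / scores.sum()                  # sampling distribution over examples
    idx = rng.choice(len(scores), size=batch_size, replace=True, p=p)
    weights = 1.0 / (len(scores) * p[idx])     # unbiasedness correction
    return idx, weights

rng = np.random.default_rng(0)
losses = rng.gamma(2.0, 1.0, size=10_000)      # stand-in relevance scores (e.g. recent losses)
idx, w = importance_batch(losses, batch_size=128, rng=rng)
# in a training loop the SGD update would use the weighted average of per-example
# gradients: g = (1 / batch_size) * sum_i w_i * grad_i
print(idx[:5], w[:5])
```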


2020 ◽  
Author(s):  
Jingbai Li ◽  
Patrick Reiser ◽  
André Eberhard ◽  
Pascal Friederich ◽  
Steven Lopez

Photochemical reactions are being increasingly used to construct complex molecular architectures under mild and straightforward reaction conditions. Computational techniques are increasingly important for understanding the reactivities and chemoselectivities of photochemical isomerization reactions because they offer molecular bonding information along the excited states of the photodynamics. These photodynamics simulations are resource-intensive and are typically limited to 1–10 picoseconds and 1,000 trajectories due to their high computational cost. Most organic photochemical reactions have excited-state lifetimes exceeding 1 picosecond, which places them beyond the reach of such studies. Westermayr et al. demonstrated that a machine learning approach could significantly lengthen photodynamics simulation times for a model system, the methylenimmonium cation (CH₂NH₂⁺).

We have developed a Python-based code, Python Rapid Artificial Intelligence Ab Initio Molecular Dynamics (PyRAI²MD), to accomplish the unprecedented 10 ns cis-trans photodynamics of trans-hexafluoro-2-butene (CF₃–CH=CH–CF₃) in 3.5 days. The same simulation would take approximately 58 years with ground-truth multiconfigurational dynamics. We proposed an innovative scheme combining Wigner sampling, geometrical interpolations, and short-time quantum chemical trajectories to effectively sample the initial data, facilitating the adaptive sampling that generates an informative and data-efficient training set of 6,232 data points. Our neural networks achieved chemical accuracy (mean absolute error of 0.032 eV). Our 4,814 trajectories reproduced the S₁ half-life (60.5 fs) and the photochemical product ratio (trans:cis = 2.3:1), and autonomously discovered a pathway towards a carbene. The neural networks have also shown the capability of generalizing the full potential energy surface from chemically incomplete data (trans → cis but not cis → trans pathways), which may enable future automated photochemical reaction discoveries.
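The adaptive-sampling idea behind such workflows can be sketched abstractly. The loop below is a hedged illustration only: the helpers run_short_trajectory, reference_energy, and retrain are hypothetical placeholders, not the PyRAI²MD API, and the ensemble-disagreement criterion is one common way to flag geometries worth adding to the training set.

```python
# Hedged sketch of an adaptive-sampling loop for ML-driven photodynamics.
# The helpers passed in are hypothetical placeholders (not the PyRAI2MD API);
# the loop only illustrates feeding high-uncertainty geometries back into training.
import numpy as np

def ensemble_uncertainty(models, geometry):
    """Standard deviation of predicted energies across an ensemble of networks."""
    return float(np.std([m(geometry) for m in models]))

def adaptive_sampling(models, initial_geometries, run_short_trajectory,
                      reference_energy, retrain, threshold_ev=0.03, n_rounds=5):
    train_set = []
    for _ in range(n_rounds):
        new_points = []
        for geom in initial_geometries:
            for frame in run_short_trajectory(models, geom):   # cheap NN dynamics
                if ensemble_uncertainty(models, frame) > threshold_ev:
                    new_points.append((frame, reference_energy(frame)))  # expensive QM label
        if not new_points:
            break                                              # surface is well covered
        train_set.extend(new_points)
        models = retrain(models, train_set)                    # refit the ensemble
    return models, train_set
```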


Author(s):  
Tu Huynh-Kha ◽  
Thuong Le-Tien ◽  
Synh Ha ◽  
Khoa Huynh-Van

This research develops a new method to detect forgery in images by combining the wavelet transform and modified Zernike moments (MZMs), in which the features are defined from more pixels than in traditional Zernike moments. The tested image is first converted to grayscale, and a one-level Discrete Wavelet Transform (DWT) is applied to halve the image size in both dimensions. The approximation sub-band (LL), which is used for processing, is then divided into overlapping blocks, and modified Zernike moments are calculated in each block as feature vectors. The more pixels are considered, the more informative the extracted features. Lexicographical sorting and computation of correlation coefficients on the feature vectors are the next steps to find similar blocks. The purpose of applying the DWT to reduce the dimension of the image before using Zernike moments with updated coefficients is to improve the computational time and increase the exactness of detection. Copied or duplicated parts are detected as traces of copy-move forgery based on a threshold on the correlation coefficients and confirmed by a Euclidean-distance constraint. Comparison results between the proposed method and related ones demonstrate the feasibility and efficiency of the proposed algorithm.
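The pipeline (DWT, overlapping blocks, block features, lexicographical sorting, correlation plus distance test) can be sketched as follows. This is a hedged illustration, not the authors' code: a plain raveled-block feature stands in for the modified Zernike moments, and the thresholds are arbitrary.

```python
# Hedged sketch of the copy-move detection pipeline (not the authors' code).
# A plain block feature stands in for the modified Zernike moments; pywt is
# used for the one-level DWT that halves the image in both dimensions.
import numpy as np
import pywt

def detect_copy_move(gray, block=8, step=2, corr_thresh=0.98, min_dist=16):
    LL, _ = pywt.dwt2(gray.astype(float), "haar")          # keep approximation sub-band
    feats, coords = [], []
    for i in range(0, LL.shape[0] - block + 1, step):       # overlapping blocks
        for j in range(0, LL.shape[1] - block + 1, step):
            feats.append(LL[i:i + block, j:j + block].ravel())  # stand-in for MZM vector
            coords.append((i, j))
    feats, coords = np.array(feats), np.array(coords)
    order = np.lexsort(feats.T[::-1])                       # lexicographical sorting
    matches = []
    for a, b in zip(order[:-1], order[1:]):                 # compare sorted neighbours
        corr = np.corrcoef(feats[a], feats[b])[0, 1]
        dist = np.linalg.norm(coords[a] - coords[b])        # Euclidean-distance constraint
        if corr > corr_thresh and dist > min_dist:
            matches.append((tuple(coords[a]), tuple(coords[b])))
    return matches
```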


2021 ◽  
pp. 1-7
Author(s):  
Julian Wucherpfennig ◽  
Aya Kachi ◽  
Nils-Christian Bormann ◽  
Philipp Hunziker

Abstract Binary outcome models are frequently used in the social sciences and economics. However, such models are difficult to estimate with interdependent data structures, including spatial, temporal, and spatio-temporal autocorrelation, because the jointly determined error terms in the reduced-form specification are generally analytically intractable. To deal with this problem, simulation-based approaches have been proposed. However, these approaches (i) are computationally intensive and impractical for the sizable datasets commonly used in contemporary research, and (ii) rarely address temporal interdependence. As a way forward, we demonstrate how to reduce the computational burden significantly by (i) introducing analytically tractable pseudo maximum likelihood estimators for latent binary choice models that exhibit interdependence across space and time and (ii) proposing an implementation strategy that increases computational efficiency considerably. Monte Carlo experiments show that our estimators recover the parameter values as well as commonly used estimation alternatives and require only a fraction of the computational cost.
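The kind of data-generating process such Monte Carlo experiments rely on is easy to simulate. The sketch below generates interdependent binary outcomes from a spatial autoregressive latent-variable model; it illustrates why the reduced form is hard to handle, and it is not the authors' pseudo maximum likelihood estimator.

```python
# Hedged sketch: simulate interdependent binary outcomes from a spatial
# autoregressive latent-variable model (an illustrative DGP, not the
# authors' pseudo-ML estimator).
import numpy as np

def simulate_spatial_probit(n=200, rho=0.4, beta=(1.0, -0.5), seed=0):
    rng = np.random.default_rng(seed)
    # row-standardized weights between a few random "neighbours"
    W = rng.random((n, n)) * (rng.random((n, n)) < 5.0 / n)
    np.fill_diagonal(W, 0.0)
    W /= np.maximum(W.sum(axis=1, keepdims=True), 1e-12)
    X = np.column_stack([np.ones(n), rng.normal(size=n)])
    eps = rng.normal(size=n)
    # reduced form: y* = (I - rho W)^(-1) (X beta + eps); outcomes are jointly determined
    y_star = np.linalg.solve(np.eye(n) - rho * W, X @ np.asarray(beta) + eps)
    return X, W, (y_star > 0).astype(int)

X, W, y = simulate_spatial_probit()
print("share of positive outcomes:", y.mean())
```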


Geophysics ◽  
2014 ◽  
Vol 79 (1) ◽  
pp. IM1-IM9 ◽  
Author(s):  
Nathan Leon Foks ◽  
Richard Krahenbuhl ◽  
Yaoguo Li

Compressive inversion uses computational algorithms that decrease the time and storage needs of a traditional inverse problem. Most compression approaches focus on the model domain, and very few, other than traditional downsampling, focus on the data domain for potential-field applications. To further the compression in the data domain, a direct and practical approach to the adaptive downsampling of potential-field data for large inversion problems has been developed. The approach is formulated to significantly reduce the quantity of data in relatively smooth or quiet regions of the data set, while preserving the signal anomalies that contain the relevant target information. Two major benefits arise from this form of compressive inversion. First, because the approach compresses the problem in the data domain, it can be applied immediately without the addition of, or modification to, existing inversion software. Second, as most industry software uses some form of model or sensitivity compression, the addition of this adaptive data sampling creates a complete compressive inversion methodology whereby the reduction of computational cost is achieved simultaneously in the model and data domains. We applied the method to a synthetic magnetic data set and two large field magnetic data sets; however, the method is also applicable to other data types. Our results showed that the relevant model information is maintained after inversion despite using only 1%–5% of the data.
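A simple way to picture data-domain adaptive downsampling is to keep full coverage where the measured field is "busy" and only a coarse grid elsewhere. The sketch below is an illustrative stand-in driven by the local gradient magnitude, not the published algorithm.

```python
# Hedged sketch of data-domain adaptive downsampling (illustrative, not the
# published algorithm): keep dense coverage near anomalies and decimate
# smooth, quiet regions of the grid.
import numpy as np

def adaptive_downsample(grid, quiet_stride=8, activity_percentile=98):
    gy, gx = np.gradient(grid)
    activity = np.hypot(gx, gy)                        # local signal activity
    keep = activity >= np.percentile(activity, activity_percentile)
    coarse = np.zeros_like(keep)
    coarse[::quiet_stride, ::quiet_stride] = True      # sparse coverage elsewhere
    mask = keep | coarse
    rows, cols = np.nonzero(mask)
    return rows, cols, grid[mask]                      # retained data for inversion

rng = np.random.default_rng(0)
field = rng.normal(scale=0.01, size=(256, 256))
field[100:130, 100:130] += 1.0                         # synthetic anomaly
r, c, d = adaptive_downsample(field)
print(f"kept {d.size} of {field.size} data points ({100 * d.size / field.size:.1f}%)")
```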


2004 ◽  
Vol 127 (4) ◽  
pp. 558-571 ◽  
Author(s):  
A. Mawardi ◽  
R. Pitchumani

Design of processes and devices under uncertainty calls for stochastic analysis of the effects of uncertain input parameters on the system performance and process outcomes. The stochastic analysis is often carried out by sampling from the uncertain input parameter space and using a physical model of the system to generate distributions of the outcomes. In many engineering applications, a large number of samples, on the order of thousands or more, is needed for accurate convergence of the output distributions, which renders a stochastic analysis computationally intensive. Toward addressing this computational challenge, this article presents a methodology of Stochastic Analysis with Minimal Sampling (SAMS). The SAMS approach is based on approximating an output distribution by an analytical function whose parameters are estimated using a few samples from the input distributions, arranged in an orthogonal Taguchi array. The analytical output distributions are, in turn, used to extract the reliability and robustness measures of the system. The methodology is applied to the stochastic analysis of a composite-materials manufacturing process under uncertainty, and the results are shown to compare closely with those from a Latin hypercube sampling method. The SAMS technique is also demonstrated to yield computational savings of up to 90% relative to the sampling-based method.
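The underlying idea, fitting an analytical output distribution from a handful of model runs and reading reliability off the fit, can be sketched briefly. In the snippet below a small random design and a normal fit stand in for the Taguchi array and the paper's analytical distribution; process_model and spec_limit are invented for the illustration.

```python
# Hedged sketch of the SAMS idea (not the paper's implementation): estimate an
# analytical output distribution from a handful of model runs, then extract a
# reliability measure from the fitted distribution instead of brute-force sampling.
import numpy as np
from scipy import stats

def process_model(x):                        # stand-in physical model
    return 3.0 * x[0] + np.sin(x[1]) + 0.5 * x[0] * x[1]

rng = np.random.default_rng(0)
# a few input samples (a small design of experiments stands in for the Taguchi array)
inputs = rng.normal(loc=[1.0, 0.5], scale=[0.1, 0.2], size=(9, 2))
outputs = np.array([process_model(x) for x in inputs])

mu, sigma = stats.norm.fit(outputs)          # analytical approximation of the output distribution
spec_limit = 3.2
reliability = stats.norm.sf(spec_limit, mu, sigma)   # P(output > spec_limit)
print(f"fitted N({mu:.3f}, {sigma:.3f}); P(output > {spec_limit}) = {reliability:.3f}")
```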


IEEE Access ◽  
2018 ◽  
Vol 6 ◽  
pp. 68324-68336 ◽  
Author(s):  
Yanlong Sun ◽  
Yazhou Yuan ◽  
Xiaolei Li ◽  
Qimin Xu ◽  
Xinping Guan
