Optimal 1-NN prototypes for pathological geometries

2021 ◽  
Vol 7 ◽  
pp. e464
Author(s):  
Ilia Sucholutsky ◽  
Matthias Schonlau

Using prototype methods to reduce the size of training datasets can drastically reduce the computational cost of classification with instance-based learning algorithms like the k-Nearest Neighbour classifier. The number and distribution of prototypes required for the classifier to match its original performance is intimately related to the geometry of the training data. As a result, it is often difficult to find the optimal prototypes for a given dataset, and heuristic algorithms are used instead. However, we consider a particularly challenging setting where commonly used heuristic algorithms fail to find suitable prototypes and show that the optimal number of prototypes can instead be found analytically. We also propose an algorithm for finding nearly-optimal prototypes in this setting, and use it to empirically validate the theoretical results. Finally, we show that a parametric prototype generation method that normally cannot solve this pathological setting can actually find optimal prototypes when combined with the results of our theoretical analysis.
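As a rough illustration of why prototype quality depends on the geometry of the training data, the sketch below classifies points with 1-NN against a prototype set. The two-ring toy dataset, the class-centroid heuristic, and all function names are assumptions chosen for illustration; they are not taken from the paper and do not reproduce its algorithm.

```python
import math

def nearest_prototype(x, prototypes):
    """Return the label of the prototype closest to x (Euclidean distance)."""
    best_label, best_dist = None, math.inf
    for point, label in prototypes:
        d = math.dist(x, point)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label

def class_centroids(data):
    """Heuristic prototype set: one centroid per class."""
    sums, counts = {}, {}
    for point, label in data:
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, v in enumerate(point):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return [(tuple(s / counts[lab] for s in acc), lab) for lab, acc in sums.items()]

def ring(radius, n, label):
    """n labelled points evenly spaced on a circle (hypothetical toy data)."""
    return [((radius * math.cos(2 * math.pi * k / n),
              radius * math.sin(2 * math.pi * k / n)), label) for k in range(n)]

data = ring(1.0, 8, "inner") + ring(2.0, 8, "outer")

# Using every training point as a prototype separates the rings.
full_inner = nearest_prototype((0.9, 0.0), data)
full_outer = nearest_prototype((2.1, 0.0), data)

# The centroid heuristic collapses both classes onto (almost) the same point,
# so the resulting prototypes carry no usable class information.
centroids = class_centroids(data)
```

On this toy data both centroids land at the origin, which is one simple way a heuristic can fail when classes are nested rather than side by side.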

2010 ◽  
Vol 20 (11) ◽  
pp. 2075-2107 ◽  
Author(s):  
F. AURICCHIO ◽  
L. BEIRÃO DA VEIGA ◽  
T. J. R. HUGHES ◽  
A. REALI ◽  
G. SANGALLI

We initiate the study of collocation methods for NURBS-based isogeometric analysis. The idea is to connect the superior accuracy and smoothness of NURBS basis functions with the low computational cost of collocation. We develop a one-dimensional theoretical analysis, and perform numerical tests in one, two and three dimensions. The numerical results obtained confirm theoretical results and illustrate the potential of the methodology.


Author(s):  
Sharifah Sakinah Syed Ahmad ◽  
Ezzatul Farhain Azmi ◽  
Fauziah Kasmin ◽  
Zuraini Othman

Although many more complex classification algorithms exist, k-Nearest Neighbour (k-NN) is regarded as one of the most successful approaches to real-world problems. The effectiveness of the classification depends on the data in the training set. However, applying the k-NN classifier to real-world problems raises several issues. It is computationally expensive, because the complete training set must be stored in order to classify unseen data. It is also intolerant of irrelevant features. In addition, the training data may be imbalanced, with considerably more instances in some classes than in others. Selected training data are therefore used to improve the effectiveness of the k-NN classifier on large datasets. In this work, an alternative method is presented that improves data selection by simultaneously combining feature selection and instance selection for the k-NN classifier using Cooperative Binary Particle Swarm Optimisation (CBPSO). The method also addresses the limitations of the k-NN classifier when handling high-dimensional and imbalanced data. A comparison study on 20 real-world datasets from the UCI Machine Learning Repository was performed to demonstrate the performance of our approach. The corresponding table of classification rates demonstrates the algorithm's performance, and the experimental outcomes show the efficacy of the proposed approach.


Author(s):  
SALVADOR GARCÍA ◽  
JOSÉ-RAMÓN CANO ◽  
ESTER BERNADÓ-MANSILLA ◽  
FRANCISCO HERRERA

Evolutionary prototype selection has proven effective in the prototype selection domain. In most cases it improves on the results of classical prototype selection algorithms, but it is computationally expensive. In this paper, we analyze the behavior of the evolutionary prototype selection strategy using a complexity measure for classification problems based on overlapping. In addition, we analyze different values of k for the nearest neighbour classifier to see their influence on the results of prototype selection (PS) methods. The objective is to predict, based on this overlapping measure, when evolutionary prototype selection is effective for a particular problem.


2018 ◽  
Author(s):  
Roman Zubatyuk ◽  
Justin S. Smith ◽  
Jerzy Leszczynski ◽  
Olexandr Isayev

Atomic and molecular properties can be evaluated from the fundamental Schrödinger equation and therefore represent different modalities of the same quantum phenomena. Here we present AIMNet, a modular and chemically inspired deep neural network potential. We used AIMNet with multitarget training to learn multiple modalities of the state of the atom in a molecular system. The resulting model achieves state-of-the-art accuracy on several benchmark datasets, comparable to the results of DFT methods that are orders of magnitude more expensive. It can simultaneously predict several atomic and molecular properties without an increase in computational cost. With AIMNet we show a new dimension of transferability: the ability to learn new targets utilizing multimodal information from previous training. The model can learn implicit solvation energy (like SMD) utilizing only a fraction of the original training data, and achieve a MAD error of 1.1 kcal/mol compared to experimental solvation free energies in the MNSol database.


Optics ◽  
2020 ◽  
Vol 2 (1) ◽  
pp. 25-42
Author(s):  
Ioseph Gurwich ◽  
Yakov Greenberg ◽  
Kobi Harush ◽  
Yarden Tzabari

The present study is aimed at designing anti-reflective (AR) engraving on the input–output surfaces of a rectangular light-guide. We estimate AR efficiency by the transmittance level over the angular range determined by the light-guide. Using nano-engraving, we achieve a uniformly high transmission over a wide range of wavelengths. In the past, we used smoothed conical pins or indentations on the faces of the light-guide crystal as the engraved structure. Here, we widen the class of pins under consideration, following the physical model developed in the previous paper. We analyze smoothed pyramidal pins with different base shapes. The possible effect of randomizing the pin parameters is also examined. The results demonstrate an optimized engraved structure with parameters depending on the required spectral range and facet format. The predicted level of transmittance is close to 99%, and its flatness (estimated by the standard deviation) over the required wavelength range is 0.2%. The theoretical analysis and numerical calculations indicate that the obtained results represent the best transmission (reflection) we can expect for a facet with the given shape and size in the required spectral band. The approach is equally useful for any other form of the facet. We also discuss a simple way of comparing experimental and theoretical results for a light-guide with the designed input and output features. In this study, as in our previous work, we restrict ourselves to rectangular facets. We also consider the limitations on maximal transmission produced by the size and shape of the light-guide facets. The theoretical analysis is performed for an infinite structure and serves as an upper bound on the transmittance for smaller-size apertures.


2013 ◽  
Vol 2013 ◽  
pp. 1-10
Author(s):  
Lei Luo ◽  
Chao Zhang ◽  
Yongrui Qin ◽  
Chunyuan Zhang

With the explosive growth of the data volume in modern applications such as web search and multimedia retrieval, hashing is becoming increasingly important for efficient nearest neighbor (similar item) search. Recently, a number of data-dependent methods have been developed, reflecting the great potential of learning for hashing. Inspired by the classic nonlinear dimensionality reduction algorithm—maximum variance unfolding, we propose a novel unsupervised hashing method, named maximum variance hashing, in this work. The idea is to maximize the total variance of the hash codes while preserving the local structure of the training data. To solve the derived optimization problem, we propose a column generation algorithm, which directly learns the binary-valued hash functions. We then extend it using anchor graphs to reduce the computational cost. Experiments on large-scale image datasets demonstrate that the proposed method outperforms state-of-the-art hashing methods in many cases.
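The general idea of hashing-based nearest neighbor search (binary codes compared by Hamming distance) can be sketched as follows. This is a generic sign-of-projection hash with hand-picked hyperplanes, used here only as a stand-in; it is not the maximum variance hashing method described in the abstract, and the function names and toy vectors are assumptions.

```python
def hash_code(x, hyperplanes):
    """Pack one bit per hyperplane into an int: bit is 1 if w.x >= 0, else 0."""
    code = 0
    for w in hyperplanes:
        bit = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else 0
        code = (code << 1) | bit
    return code

def hamming(a, b):
    """Number of differing bits between two integer-packed codes."""
    return bin(a ^ b).count("1")

def search(query, database, hyperplanes, k=1):
    """Return indices of the k database items closest to the query in Hamming space."""
    q = hash_code(query, hyperplanes)
    ranked = sorted((hamming(q, hash_code(x, hyperplanes)), i)
                    for i, x in enumerate(database))
    return [i for _, i in ranked[:k]]

# Hypothetical 2-bit hash: one hyperplane per coordinate axis.
planes = [(1.0, 0.0), (0.0, 1.0)]
database = [(-1.0, -1.0), (1.0, 2.0), (-3.0, 4.0)]
nearest = search((2.0, 3.0), database, planes, k=1)
```

In a learned method the hyperplanes (or more general hash functions) would be fitted to the training data rather than fixed by hand; the search step stays the same.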


2021 ◽  
Author(s):  
Mark Zhao ◽  
Ryosuke Okuno

Equation-of-state (EOS) compositional simulation is commonly used to model the interplay between phase behavior and fluid flow for various reservoir and surface processes. Because of its computational cost, however, there is a critical need for efficient phase-behavior calculations using an EOS. The objective of this research was to develop a proxy model for the fugacity coefficient based on the Peng-Robinson EOS for rapid multiphase flash in compositional flow simulation. The proxy model as implemented in this research bypasses the calculation of fugacity coefficients when the Peng-Robinson EOS has only one root, which is often the case at reservoir conditions. The proxy fugacity model was trained using artificial neural networks (ANN) on over 30 million fugacity coefficients based on the Peng-Robinson EOS. It accurately predicts the Peng-Robinson fugacity coefficient by using four parameters: Am, Bm, Bi, and ΣxiAij. Since these scalar parameters are general, not specific to particular compositions, pressures, and temperatures, the proxy model is as widely applicable to petroleum engineering problems as the original Peng-Robinson EOS. The proxy model is applied to multiphase flash calculations (phase-split and stability), where the cubic equation solutions and fugacity coefficient calculations are bypassed when the Peng-Robinson EOS has one root. The original fugacity coefficient is analytically calculated when the EOS has more than one root, but this occurs only occasionally at reservoir conditions. A case study shows the proxy fugacity model gave a speed-up factor of 3.4% in comparison to the conventional EOS calculation. Case studies also demonstrate accurate multiphase flash results (stability and phase split) and interchangeable proxy models for different fluid cases with different (numbers of) components. This is possible because it predicts the Peng-Robinson fugacity in a variable space that is not specific to composition, temperature, and pressure. For the same reason, non-zero binary interaction parameters do not impair the applicability, accuracy, robustness, and efficiency of the model. As the proxy models are specific to individual components, a combination of proxy models can be used to model any mixture of components. Tuning of training hyperparameters and of the training data sampling method helped reduce the mean absolute percent error to less than 0.1% in the ANN modeling. To the best of our knowledge, this is the first generalized proxy model of the Peng-Robinson fugacity that is applicable to any mixture. The proposed model retains the conventional flash iteration, the convergence robustness, and the option of manual parameter tuning for fluid characterization.
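The "one root" condition can be checked without solving the cubic. The sketch below uses the standard Peng-Robinson compressibility cubic, Z^3 - (1 - B)Z^2 + (A - 3B^2 - 2B)Z - (AB - B^2 - B^3) = 0, and counts real roots with Z > B from sign changes at the turning points. Treating "one root" as "one real root above B" is an assumption about the abstract's meaning, and the function names and test values are illustrative only.

```python
import math

def pr_cubic_coeffs(A, B):
    """Coefficients (c2, c1, c0) of the PR cubic Z^3 + c2*Z^2 + c1*Z + c0 = 0."""
    c2 = -(1.0 - B)
    c1 = A - 3.0 * B * B - 2.0 * B
    c0 = -(A * B - B * B - B ** 3)
    return c2, c1, c0

def count_roots_above_B(A, B):
    """Count real roots with Z > B via sign changes between turning points.

    Tangent (repeated) roots are not treated specially in this sketch.
    """
    c2, c1, c0 = pr_cubic_coeffs(A, B)

    def f(z):
        return ((z + c2) * z + c1) * z + c0  # Horner form of the cubic

    # Turning points solve f'(z) = 3z^2 + 2*c2*z + c1 = 0.
    disc = c2 * c2 - 3.0 * c1
    points = [B]
    if disc > 0.0:
        r = math.sqrt(disc)
        points += sorted(z for z in ((-c2 - r) / 3.0, (-c2 + r) / 3.0) if z > B)
    # f(B) = -2*B^2 < 0 and f -> +infinity as Z grows, so at least one root exists.
    signs = [f(z) > 0.0 for z in points] + [True]
    return sum(s != t for s, t in zip(signs, signs[1:]))

single = count_roots_above_B(0.1, 0.1)     # hypothetical single-root case
triple = count_roots_above_B(0.01, 0.001)  # hypothetical three-root case
```

A flash routine built along the lines of the abstract could branch on this count, using a proxy when it equals one and the analytical fugacity expression otherwise.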


SPE Journal ◽  
2014 ◽  
Vol 19 (05) ◽  
pp. 891-908 ◽  
Author(s):  
Obiajulu J. Isebor ◽  
David Echeverría Ciaurri ◽  
Louis J. Durlofsky

The optimization of general oilfield development problems is considered. Techniques are presented to simultaneously determine the optimal number and type of new wells, the sequence in which they should be drilled, and their corresponding locations and (time-varying) controls. The optimization is posed as a mixed-integer nonlinear programming (MINLP) problem and involves categorical, integer-valued, and real-valued variables. The formulation handles bound, linear, and nonlinear constraints, with the latter treated with filter-based techniques. Noninvasive derivative-free approaches are applied for the optimizations. Methods considered include branch and bound (B&B), a rigorous global-search procedure that requires the relaxation of the categorical variables; mesh adaptive direct search (MADS), a local pattern-search method; particle swarm optimization (PSO), a heuristic global-search method; and a PSO-MADS hybrid. Four example cases involving channelized-reservoir models are presented. The recently developed PSO-MADS hybrid is shown to consistently outperform the standalone MADS and PSO procedures. In the two cases in which B&B is applied, the heuristic PSO-MADS approach is shown to give comparable solutions but at a much lower computational cost. This is significant because B&B provides a systematic search in the categorical variables. We conclude that, although it is demanding in terms of computation, the methodology presented here, with PSO-MADS as the core optimization method, appears to be applicable for realistic reservoir development and management.
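For readers unfamiliar with PSO, a minimal continuous-variable version is sketched below. It covers only the basic global-best swarm update on a toy objective; the MADS hybridization, the categorical and integer variables, and the filter-based constraint handling described in the abstract are not included. The parameter values (w, c1, c2), the function names, and the sphere test function are assumptions.

```python
import random

def pso_minimize(f, bounds, n_particles=30, n_iters=200,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize f over box bounds with a basic global-best particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # best position seen by each particle
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # best position seen by the swarm

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))  # clamp to bounds
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(x):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_val = pso_minimize(sphere, [(-5.0, 5.0)] * 2)
```

A hybrid of the kind the abstract describes would typically hand promising swarm positions to a local search for refinement; that step is omitted here.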

