Flow-Loss

2021 ◽  
Vol 14 (11) ◽  
pp. 2019-2032
Author(s):  
Parimarjan Negi ◽  
Ryan Marcus ◽  
Andreas Kipf ◽  
Hongzi Mao ◽  
Nesime Tatbul ◽  
...  

Recently there has been significant interest in using machine learning to improve the accuracy of cardinality estimation. This work has focused on improving average estimation error, but not all estimates matter equally for downstream tasks like query optimization. Since learned models inevitably make mistakes, the goal should be to improve the estimates that make the biggest difference to an optimizer. We introduce a new loss function, Flow-Loss, for learning cardinality estimation models. Flow-Loss approximates the optimizer's cost model and search algorithm with analytical functions, which it uses to optimize explicitly for better query plans. At the heart of Flow-Loss is a reduction of query optimization to a flow routing problem on a certain "plan graph", in which different paths correspond to different query plans. To evaluate our approach, we introduce the Cardinality Estimation Benchmark (CEB), which contains ground-truth cardinalities for the sub-plans of over 16K queries from 21 templates with up to 15 joins. We show that across different architectures and databases, a model trained with Flow-Loss improves plan costs and query runtimes despite having worse estimation accuracy than a model trained with Q-Error. When the test set queries closely match the training queries, models trained with both loss functions perform well. However, the Q-Error-trained model degrades significantly when evaluated on slightly different queries (e.g., similar but unseen query templates), while the Flow-Loss-trained model generalizes better to such situations, achieving 4–8× better 99th-percentile runtimes on unseen templates with the same model architecture and training data.
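
For context, a minimal sketch of the Q-Error metric that Flow-Loss is compared against (the standard definition; the array inputs are illustrative):

```python
import numpy as np

def q_error(est_card, true_card, eps=1.0):
    """Q-Error: the multiplicative factor by which an estimate is off.
    Symmetric in over- and under-estimation; 1.0 means a perfect estimate."""
    est = np.maximum(np.asarray(est_card, dtype=float), eps)
    true = np.maximum(np.asarray(true_card, dtype=float), eps)
    return np.maximum(est / true, true / est)

# A 100x underestimate and a 100x overestimate both score 100, even though
# they may steer the optimizer toward very different plans; that gap is
# what a plan-aware loss like Flow-Loss is designed to address.
print(q_error([10, 100_000], [1_000, 1_000]))  # [100. 100.]
```

Q-Error scores every sub-plan estimate by its multiplicative error alone, whereas Flow-Loss weights errors by their effect on the plans the optimizer would actually choose.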

2021 ◽  
Vol 18 (6) ◽  
pp. 1022-1034
Author(s):  
Jia Wang ◽  
Fabian Nitschke ◽  
Emmanuel Gaucher ◽  
Thomas Kohl

Conventional methods to estimate the static formation temperature (SFT) require borehole temperature data measured during thermal recovery periods. This can be both economically and technically prohibitive under real operational conditions, especially for high-temperature boreholes. This study investigates the use of temperature logs obtained under injection conditions to determine SFT through inverse modelling. An adaptive sampling approach based on machine-learning techniques is applied to explore the model space efficiently by iteratively proposing samples based on the results of previous runs. Synthetic case studies are conducted with rigorous evaluation of factors affecting the quality of SFT estimates for deep hot wells. The results show that using temperature data measured at higher flow rates or after longer injection times could lead to less reliable results. Furthermore, the estimation error exhibits an almost linear dependency on the standard error of the measured borehole temperatures. In addition, potential flow loss zones in the borehole would lead to increased uncertainties in the SFT estimates. Consequently, any prior knowledge about the amount of flow loss could improve the estimation accuracy considerably. For formations with thermal gradients varying with depth, prior information on the depth of the gradient change is necessary to avoid spurious results. The inversion scheme presented is demonstrated as an efficient tool for quantifying uncertainty in the interpretation of borehole data. Although only temperature data are considered in this work, other types of data such as flow and transport measurements can also be included in this method for geophysical and rock physics studies.
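
As a hedged illustration of adaptive sampling for inverse modelling, here is a generic surrogate-assisted loop (not the authors' exact scheme; the toy forward model and parameter bounds are assumptions):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def adaptive_inversion(forward_model, observed, bounds, n_init=20, n_iter=30):
    """Fit a surrogate to the misfit surface and concentrate new samples
    where the surrogate predicts low misfit. Illustrative scheme only."""
    rng = np.random.default_rng(0)
    lo, hi = bounds
    X = rng.uniform(lo, hi, size=(n_init, len(lo)))
    y = np.array([np.mean((forward_model(x) - observed) ** 2) for x in X])
    for _ in range(n_iter):
        surrogate = RandomForestRegressor(n_estimators=200).fit(X, y)
        cand = rng.uniform(lo, hi, size=(1000, len(lo)))
        x_new = cand[np.argmin(surrogate.predict(cand))]  # exploit best region
        y_new = np.mean((forward_model(x_new) - observed) ** 2)
        X, y = np.vstack([X, x_new]), np.append(y, y_new)
    return X[np.argmin(y)]  # best-fitting parameters

# Toy usage: recover two parameters of a synthetic temperature log.
depths = np.linspace(0, 3000, 50)
def forward_model(theta):  # theta = (surface T, gradient); stand-in physics
    return theta[0] + theta[1] * depths
observed = forward_model(np.array([15.0, 0.03]))
best = adaptive_inversion(forward_model, observed,
                          (np.array([0.0, 0.0]), np.array([50.0, 0.1])))
print(best)
```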


2021 ◽  
Vol 9 (2) ◽  
pp. 139
Author(s):  
Alifia Puspaningrum ◽  
Fachrul Pralienka Bani Muhammad ◽  
Esti Mulyani

Software effort estimation is an important area of project management, used to predict the effort required to develop an application. The Constructive Cost Model (COCOMO) II is a common model for effort estimation. Two coefficients in the COCOMO II effort equation strongly affect estimation accuracy, and several methods have been proposed to estimate these coefficients so that predicted effort comes closer to actual effort. In this paper, the Flower Pollination Algorithm (FPA), a recent metaheuristic, is applied under several iteration scenarios and compared with other metaheuristics, namely the Cuckoo Search Algorithm and Particle Swarm Optimization. Evaluated using the Mean Magnitude of Relative Error (MMRE), experimental results show that FPA obtains the best result among the compared algorithms, reaching an MMRE of 52.48% at 500 iterations.
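
For reference, a minimal sketch of the quantities involved, assuming the basic two-coefficient COCOMO form Effort = a · Size^b that such metaheuristics typically tune (names and numbers are illustrative):

```python
import numpy as np

def cocomo_effort(size_kloc, a, b):
    """Basic COCOMO-style effort model: effort = a * size^b (person-months).
    a and b are the two coefficients the metaheuristic searches for."""
    return a * np.power(size_kloc, b)

def mmre(actual, predicted):
    """Mean Magnitude of Relative Error, the fitness function minimised."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.mean(np.abs(actual - predicted) / actual)

# Toy example: score one candidate (a, b) pair against historical projects.
sizes = np.array([10.0, 46.0, 82.0])           # KLOC
actual_effort = np.array([24.0, 96.0, 210.0])  # person-months
print(mmre(actual_effort, cocomo_effort(sizes, a=2.8, b=1.05)))
```

FPA, Cuckoo Search, and PSO all fit this template: each proposes candidate (a, b) pairs and keeps those with lower MMRE.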


2019 ◽  
Vol 11 (3) ◽  
pp. 284 ◽  
Author(s):  
Linglin Zeng ◽  
Shun Hu ◽  
Daxiang Xiang ◽  
Xiang Zhang ◽  
Deren Li ◽  
...  

Soil moisture mapping at a regional scale is commonplace since these data are required in many applications, such as hydrological and agricultural analyses. The use of remotely sensed data for the estimation of deep soil moisture at a regional scale has received far less emphasis. The objective of this study was to map the 500-m, 8-day average and daily soil moisture at different soil depths in Oklahoma from remotely sensed and ground-measured data using the random forest (RF) method, a machine-learning approach. In order to investigate the estimation accuracy of the RF method at both a spatial and a temporal scale, two independent soil moisture estimation experiments were conducted using data from 2010 to 2014: a year-to-year experiment (with a root mean square error (RMSE) ranging from 0.038 to 0.050 m3/m3) and a station-to-station experiment (with an RMSE ranging from 0.044 to 0.057 m3/m3). Then, the data requirements, importance factors, and spatial and temporal variations in estimation accuracy were discussed based on the results, using training data selected by iterated random sampling. The highly accurate estimations of both the surface and the deep soil moisture for the study area reveal the potential of RF methods when mapping soil moisture at a regional scale, especially when considering the high heterogeneity of land-cover types and topography in the study area.
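
A minimal sketch of this kind of experiment with scikit-learn's RandomForestRegressor; the synthetic features stand in for the remotely sensed and ground-measured predictors, and the random split stands in for the year-to-year or station-to-station splits:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

# Illustrative features (e.g., reflectance bands, LST, NDVI, precipitation),
# one row per 500-m pixel and period; y is measured soil moisture (m3/m3).
rng = np.random.default_rng(0)
X_train, y_train = rng.random((1000, 6)), rng.random(1000) * 0.5
X_test, y_test = rng.random((200, 6)), rng.random(200) * 0.5

rf = RandomForestRegressor(n_estimators=500, random_state=0)
rf.fit(X_train, y_train)  # e.g., train on earlier years, test on a held-out year

rmse = np.sqrt(mean_squared_error(y_test, rf.predict(X_test)))
print(f"RMSE: {rmse:.3f} m3/m3")
print(rf.feature_importances_)  # the "importance factors" the study discusses
```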


2020 ◽  
pp. 1-15
Author(s):  
Tristan Cazenave ◽  
Jean-Yves Lucas ◽  
Thomas Triboulet ◽  
Hyoseok Kim

Nested Rollout Policy Adaptation (NRPA) is a Monte Carlo search algorithm that learns a playout policy in order to solve a single-player game. In this paper we apply NRPA to the vehicle routing problem. This problem is important for large companies that have to manage a fleet of vehicles on a daily basis. Real problems are often too large to be solved exactly. The algorithm is applied to standard problems from the literature and to the specific problems of EDF (Électricité de France, the main French electric utility company). These specific problems have peculiar constraints. NRPA gives better results than the algorithm previously used by EDF.
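
For reference, a compact, domain-agnostic sketch of NRPA (Rosin, 2011). The state interface (`terminal`, `legal_moves`, `play`, `code`, `score`) is an assumption standing in for the routing domain, and `play` is assumed to return a new state:

```python
import copy
import math
import random

ALPHA = 1.0  # policy learning rate

def playout(state, policy):
    """Play one episode, sampling moves via softmax over policy weights."""
    sequence = []
    while not state.terminal():
        moves = state.legal_moves()
        weights = [math.exp(policy.get(state.code(m), 0.0)) for m in moves]
        move = random.choices(moves, weights=weights)[0]
        sequence.append((copy.deepcopy(state), move))  # remember the context
        state = state.play(move)
    return state.score(), sequence

def adapt(policy, sequence):
    """Shift weights toward the best sequence; probabilities use the old policy."""
    new_policy = dict(policy)
    for state, chosen in sequence:
        moves = state.legal_moves()
        z = sum(math.exp(policy.get(state.code(m), 0.0)) for m in moves)
        code_chosen = state.code(chosen)
        new_policy[code_chosen] = new_policy.get(code_chosen, 0.0) + ALPHA
        for m in moves:
            code = state.code(m)
            p = math.exp(policy.get(code, 0.0)) / z
            new_policy[code] = new_policy.get(code, 0.0) - ALPHA * p
    return new_policy

def nrpa(level, policy, root_state, iterations=100):
    """Nested search: each level adapts its policy toward its best sequence."""
    if level == 0:
        return playout(copy.deepcopy(root_state), policy)
    best_score, best_seq = -math.inf, []
    for _ in range(iterations):
        score, seq = nrpa(level - 1, dict(policy), root_state, iterations)
        if score >= best_score:
            best_score, best_seq = score, seq
        policy = adapt(policy, best_seq)
    return best_score, best_seq

# Usage sketch: score, seq = nrpa(level=2, policy={}, root_state=my_state)
```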


Author(s):  
Christian Knödler ◽  
Tobias Vinçon ◽  
Arthur Bernhardt ◽  
Ilia Petrov ◽  
Leonardo Solis-Vasquez ◽  
...  

Author(s):  
Donald L. Simon ◽  
Sanjay Garg

A linear point design methodology for minimizing the error in on-line Kalman filter-based aircraft engine performance estimation applications is presented. This technique specifically addresses the underdetermined estimation problem, where there are more unknown parameters than available sensor measurements. A systematic approach is applied to produce a model tuning parameter vector of appropriate dimension to enable estimation by a Kalman filter, while minimizing the estimation error in the parameters of interest. Tuning parameter selection is performed using a multivariable iterative search routine that seeks to minimize the theoretical mean-squared estimation error. This paper derives theoretical Kalman filter estimation error bias and variance values at steady-state operating conditions, and presents the tuner selection routine applied to minimize these values. Results from the application of the technique to an aircraft engine simulation are presented and compared with the conventional approach of tuner selection. Experimental simulation results are found to be in agreement with theoretical predictions. The new methodology is shown to yield a significant improvement in on-line engine performance estimation accuracy.
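
As a hedged illustration of the quantity such a tuner search scores, here is a sketch computing the theoretical steady-state Kalman estimation error covariance via the dual discrete algebraic Riccati equation. The model matrices are illustrative, and this is not the paper's tuner-selection routine:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

def steady_state_estimation_cov(A, C, Q, R):
    """Steady-state a priori error covariance P of a Kalman filter for
    x' = A x + w, y = C x + v with cov(w)=Q, cov(v)=R, plus the gain K."""
    P = solve_discrete_are(A.T, C.T, Q, R)              # dual Riccati equation
    K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)        # steady-state gain
    return P, K

# Illustrative 2-state model with one sensor: underdetermined, as in the paper.
A = np.array([[1.0, 0.1], [0.0, 0.95]])
C = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.1]])
P, K = steady_state_estimation_cov(A, C, Q, R)
print(np.trace(P))  # the kind of scalar a tuner-selection search would minimise
```

The paper's iterative search would compare such theoretical error values across candidate tuning-parameter vectors and keep the one with the smallest mean-squared estimation error.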


2020 ◽  
Vol 499 (4) ◽  
pp. 5641-5652
Author(s):  
Georgios Vernardos ◽  
Grigorios Tsagkatakis ◽  
Yannis Pantazis

Gravitational lensing is a powerful tool for constraining substructure in the mass distribution of galaxies, be it from the presence of dark matter sub-haloes or due to physical mechanisms affecting the baryons throughout galaxy evolution. Such substructure is hard to model and is either ignored by traditional smooth-modelling approaches, or treated as well-localized massive perturbers. In this work, we propose a deep learning approach to quantify the statistical properties of such perturbations directly from images, where only the extended lensed source features within a mask are considered, without the need of any lens modelling. Our training data consist of mock lensed images assuming perturbing Gaussian Random Fields permeating the smooth overall lens potential, and, for the first time, using images of real galaxies as the lensed source. We employ a novel deep neural network that can handle arbitrary uncertainty intervals associated with the training data set labels as input, provides probability distributions as output, and adopts a composite loss function. The method succeeds not only in accurately estimating the actual parameter values, but also reduces the predicted confidence intervals by 10 per cent in an unsupervised manner, i.e. without having access to the actual ground truth values. Our results are invariant to the inherent degeneracy between mass perturbations in the lens and complex brightness profiles for the source. Hence, we can quantitatively and robustly quantify the smoothness of the mass density of thousands of lenses, including confidence intervals, and provide a consistent ranking for follow-up science.
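
As a loosely related illustration of a network head that outputs distributions rather than point estimates, here is a minimal Gaussian negative log-likelihood loss (up to an additive constant). The paper's composite loss additionally handles uncertainty intervals on the training labels, which this sketch omits:

```python
import numpy as np

def gaussian_nll(mu, log_var, target):
    """Negative log-likelihood of targets under predicted Gaussians,
    dropping the constant 0.5*log(2*pi) term. The network predicts a mean
    and a log-variance per parameter, so its output is a distribution."""
    return 0.5 * (log_var + (target - mu) ** 2 / np.exp(log_var)).mean()

# Toy check: a prediction of 0.5 +/- 0.2 against a true value of 0.45.
print(gaussian_nll(np.array([0.5]), np.array([np.log(0.04)]), np.array([0.45])))
```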


2019 ◽  
Vol 61 (2) ◽  
pp. 253-259
Author(s):  
Iroshani Kodikara ◽  
Iroshini Abeysekara ◽  
Dhanusha Gamage ◽  
Isurani Ilayperuma

Background: Volume estimation of organs using two-dimensional (2D) ultrasonography is frequently warranted. Considering the influence of the estimated volume on patient management, maintaining high accuracy is imperative. However, data are scarce regarding the accuracy of the estimated volume of non-globular objects of different volumes.

Purpose: To evaluate the volume estimation accuracy of different shaped and sized objects using high-end 2D ultrasound scanners.

Material and Methods: Globular (n=5), non-globular elongated (n=5), and non-globular near-spherical (n=4) hollow plastic objects were scanned to estimate their volumes; actual volumes were compared with estimated volumes. The t-test and one-way ANOVA were used to compare means; P<0.05 was considered significant.

Results: The actual volumes of the objects were in the range of 10–445 mL; estimated volumes ranged from 6.4–425 mL (P=0.067). The estimated volume was lower than the actual volume; this underestimation was marked for non-globular elongated objects. Regardless of the scanner, the highest volume estimation error was for non-globular elongated objects (<40%), followed by non-globular near-spherical objects (<23.88%); the lowest was for globular objects (<3.6%). Irrespective of the shape or the volume of the object, the volume estimation difference among the scanners was not significant: globular (F=0.430, P=0.66); non-globular elongated (F=3.69, P=0.064); non-globular near-spherical (F=4.00, P=0.06). A good inter-rater agreement (R=0.99, P<0.001) and a good correlation between actual and estimated volumes (R=0.98, P<0.001) were noted.

Conclusion: 2D ultrasonography can be recommended for volume estimation of objects of different shapes and sizes, regardless of the type of high-end scanner used.
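
For illustration, 2D volume estimation commonly relies on the prolate-ellipsoid approximation V = π/6 · L · W · H, which assumes a roughly ellipsoidal shape; whether the scanners studied use exactly this formula is an assumption here. A small sketch:

```python
import math

def ellipsoid_volume_ml(length_cm, width_cm, height_cm):
    """Prolate-ellipsoid approximation: V = pi/6 * L * W * H (1 cm^3 = 1 mL)."""
    return math.pi / 6.0 * length_cm * width_cm * height_cm

def estimation_error_pct(estimated_ml, actual_ml):
    return 100.0 * (estimated_ml - actual_ml) / actual_ml

# Elongated objects violate the ellipsoid assumption the most, which is
# consistent with the larger underestimation reported for them.
est = ellipsoid_volume_ml(9.0, 4.0, 4.0)      # about 75.4 mL
print(est, estimation_error_pct(est, 100.0))  # underestimate vs 100 mL actual
```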


2021 ◽  
Vol 22 (Supplement_1) ◽  
Author(s):  
D Zhao ◽  
E Ferdian ◽  
GD Maso Talou ◽  
GM Quill ◽  
K Gilbert ◽  
...  

Funding Acknowledgements: Type of funding sources: Public grant(s) – National budget only. Main funding source(s): National Heart Foundation (NHF) of New Zealand; Health Research Council (HRC) of New Zealand.

Artificial intelligence shows considerable promise for automated analysis and interpretation of medical images, particularly in the domain of cardiovascular imaging. While application to cardiac magnetic resonance (CMR) has demonstrated excellent results, automated analysis of 3D echocardiography (3D-echo) remains challenging, due to the lower signal-to-noise ratio (SNR), signal dropout, and greater interobserver variability in manual annotations. As 3D-echo is becoming increasingly widespread, robust analysis methods will substantially benefit patient evaluation. We sought to leverage the high SNR of CMR to provide training data for a convolutional neural network (CNN) capable of analysing 3D-echo. We imaged 73 participants (53 healthy volunteers, 20 patients with non-ischaemic cardiac disease) under both CMR and 3D-echo (<1 hour between scans). 3D models of the left ventricle (LV) were independently constructed from CMR and 3D-echo, and used to spatially align the image volumes using least squares fitting to a cardiac template. The resultant transformation was used to map the CMR mesh to the 3D-echo image. Alignment of mesh and image was verified through volume slicing and visual inspection (Fig. 1) for 120 paired datasets (including 47 rescans), each at end-diastole and end-systole. 100 datasets (80 for training, 20 for validation) were used to train a shallow CNN for mesh extraction from 3D-echo, optimised with a composite loss function consisting of normalised Euclidean distance (for 290 mesh points) and volume. Data augmentation was applied in the form of rotations and tilts (<15 degrees) about the long axis. The network was tested on the remaining 20 datasets (different participants) of varying image quality (Tab. 1). For comparison, corresponding LV measurements from conventional manual analysis of 3D-echo and associated interobserver variability (for two observers) were also estimated. Initial results indicate that the use of embedded CMR meshes as training data for 3D-echo analysis is a promising alternative to manual analysis, with improved accuracy and precision compared with conventional methods. Further optimisations and a larger dataset are expected to improve network performance.

Tab. 1. LV mass and volume differences (means ± standard deviations) for the 20 test cases; algorithm error is CNN minus CMR (ground truth).

(n = 20)              LV EDV (mL)     LV ESV (mL)     LV EF (%)     LV mass (g)
Ground truth (CMR)    150.5 ± 29.5    57.9 ± 12.7     61.5 ± 3.4    128.1 ± 29.8
Algorithm error       -13.3 ± 15.7    -1.4 ± 7.6      -2.8 ± 5.5    0.1 ± 20.9
Manual error          -30.1 ± 21.0    -15.1 ± 12.4    3.0 ± 5.0     Not available
Interobserver error   19.1 ± 14.3     14.4 ± 7.6      -6.4 ± 4.8    Not available

Fig. 1. CMR mesh registered to 3D-echo.
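
A hedged sketch of what a composite mesh loss of this form could look like; the normalisation and the convex-hull volume stand-in are assumptions, since the abstract specifies only the two terms (point distance and volume):

```python
import numpy as np
from scipy.spatial import ConvexHull

def composite_loss(pred_pts, true_pts, w_vol=0.5):
    """Normalised mean point-to-point distance plus a relative volume term.
    Convex-hull volume stands in for the true LV mesh volume here."""
    dist = np.linalg.norm(pred_pts - true_pts, axis=1).mean()
    scale = np.linalg.norm(true_pts - true_pts.mean(axis=0), axis=1).mean()
    v_pred, v_true = ConvexHull(pred_pts).volume, ConvexHull(true_pts).volume
    return dist / scale + w_vol * abs(v_pred - v_true) / v_true

# Toy check with 290 mesh points, as in the abstract.
rng = np.random.default_rng(0)
true_mesh = rng.normal(size=(290, 3))
pred_mesh = true_mesh + 0.05 * rng.normal(size=(290, 3))
print(composite_loss(pred_mesh, true_mesh))
```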


2017 ◽  
Vol 2017 ◽  
pp. 1-13 ◽  
Author(s):  
Shifeng Chen ◽  
Rong Chen ◽  
Jian Gao

The Vehicle Routing Problem (VRP) is a classical combinatorial optimization problem. It is usually modelled in a static fashion; in practice, however, new customer requests arrive after the initial workday plan is already in progress, and routes must then be replanned dynamically. This paper investigates the Dynamic Vehicle Routing Problem with Time Windows (DVRPTW), in which customers' requests can either be known at the beginning of the working day or occur dynamically over time. We propose a hybrid heuristic algorithm that combines the harmony search (HS) algorithm and the Variable Neighbourhood Descent (VND) algorithm, using HS for global exploration and VND for local search. To prevent premature convergence, we evaluate population diversity using entropy. Computational results on the Lackner benchmark problems show that the proposed algorithm is competitive with the best existing algorithms from the literature.
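
A minimal sketch of an entropy-based population diversity measure of the kind described; the solution encoding (a sequence of customer assignments per slot) is an illustrative assumption:

```python
import math
from collections import Counter

def population_entropy(population):
    """Shannon entropy of solution components across the harmony memory.
    Low entropy signals premature convergence, triggering diversification."""
    total, counts = 0, Counter()
    for solution in population:
        for position, gene in enumerate(solution):
            counts[(position, gene)] += 1
            total += 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Three near-identical solutions yield low entropy, flagging convergence.
pop = [(1, 2, 3, 4), (1, 2, 4, 3), (1, 2, 3, 4)]
print(population_entropy(pop))
```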

