Reduced Degrees of Freedom Gaussian Mixture Model Fitting for Large Scale History Matching Problems

Author(s):  
Guohua Gao ◽  
Hao Jiang ◽  
Chaohui Chen ◽  
Jeroen C. Vink ◽  
Yaakoub El Khamra ◽  
...  
SPE Journal ◽  
2019 ◽  
Vol 25 (01) ◽  
pp. 037-055

Summary It has been demonstrated that the Gaussian-mixture-model (GMM) fitting method can construct a GMM that more accurately approximates the posterior probability density function (PDF) obtained by conditioning reservoir models to production data. However, the number of degrees of freedom (DOFs) for all unknown GMM parameters can become very large for large-scale history-matching problems. A new formulation of GMM fitting with a reduced number of DOFs is proposed in this paper to save memory and reduce computational cost. The performance of the new method is benchmarked against other methods using test problems with different numbers of uncertain parameters. The new method performs more efficiently than the full-rank GMM fitting formulation, reducing memory use and computational cost by a factor of 5 to 10. Although it is less efficient than the simple GMM approximation based on local linearization (L-GMM), it achieves much higher accuracy, reducing the error by a factor of 20 to 600. Finally, the new method, together with the parallelized acceptance/rejection (A/R) algorithm, is applied to a synthetic history-matching problem for demonstration.
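As an illustration of the general workflow described in this summary, the following minimal sketch (Python with NumPy, SciPy, and scikit-learn) fits a GMM to samples from a toy bimodal posterior and then uses it as a proposal density for acceptance/rejection sampling. The toy posterior, the fitting procedure, and all names are illustrative assumptions; this is not the reduced-DOF formulation developed in the paper.

```python
# Minimal, hypothetical sketch: fit a GMM to rough posterior samples and use it
# as a proposal for acceptance/rejection (A/R) sampling. This is NOT the
# reduced-DOF formulation of the paper; the toy posterior and all settings are
# illustrative only.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

def log_posterior(x):
    # Toy, unnormalized bimodal posterior standing in for p(m | d_obs).
    return np.logaddexp(
        multivariate_normal.logpdf(x, mean=[-2.0, 0.0]),
        multivariate_normal.logpdf(x, mean=[2.0, 1.0]),
    )

# 1) Fit a GMM proposal q(x) to crude prior samples re-weighted by the posterior.
draws = rng.normal(scale=3.0, size=(20000, 2))
logw = log_posterior(draws)
w = np.exp(logw - logw.max())
gmm = GaussianMixture(n_components=2, random_state=0).fit(draws[rng.random(len(w)) < w])

# 2) A/R sampling: accept a candidate x ~ q with probability p(x) / (M q(x)).
cand, _ = gmm.sample(5000)
log_ratio = log_posterior(cand) - gmm.score_samples(cand)
log_M = log_ratio.max()                      # crude empirical envelope constant
samples = cand[np.log(rng.random(len(cand))) < log_ratio - log_M]
print(f"accepted {len(samples)} of {len(cand)} candidates")
```

The better the GMM approximates the posterior, the closer the log-ratio is to a constant and the higher the A/R acceptance rate, which is what makes an accurate GMM fit attractive as a proposal.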


2021 ◽  
Author(s):  
Guohua Gao ◽  
Jeroen Vink ◽  
Fredrik Saaf ◽  
Terence Wells

Abstract When formulating history matching within the Bayesian framework, we may quantify the uncertainty of model parameters and production forecasts using conditional realizations sampled from the posterior probability density function (PDF). It is quite challenging to sample such a posterior PDF. Some methods, e.g., Markov chain Monte Carlo (MCMC), are very expensive, whereas others are cheaper but may generate biased samples. In this paper, we propose an unconstrained Gaussian mixture model (GMM) fitting method to approximate the posterior PDF and investigate new strategies to further enhance its performance. To reduce the CPU time of handling bound constraints, we reformulate the GMM fitting formulation such that an unconstrained optimization algorithm can be applied to find the optimal solution of unknown GMM parameters. To obtain a sufficiently accurate GMM approximation with the lowest number of Gaussian components, we generate random initial guesses, remove components with very small or very large mixture weights after each GMM fitting iteration, and prevent their reappearance using a dedicated filter. To prevent overfitting, we add a new Gaussian component only if the quality of the GMM approximation on a (large) set of blind-test data sufficiently improves. The unconstrained GMM fitting method with the new strategies proposed in this paper is validated using nonlinear toy problems and then applied to a synthetic history matching example. It can construct a GMM approximation of the posterior PDF that is comparable to the MCMC method, and it is significantly more efficient than the constrained GMM fitting formulation, e.g., reducing the CPU time by a factor of 800 to 7300 for the problems we tested, which makes it quite attractive for large-scale history-matching problems.
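Two of the strategies mentioned above lend themselves to a compact illustration: removing bound constraints through a change of variables so that a standard unconstrained optimizer can be used, and pruning Gaussian components whose mixture weights become very small or very large. The sketch below (Python/NumPy) is a hypothetical rendering of these ideas, not the paper's implementation; the logit-style transform, the thresholds, and the function names are assumptions.

```python
# Hypothetical sketch of two strategies described above, not the paper's code:
# (a) map bound-constrained parameters to an unconstrained space so that an
#     unconstrained optimization algorithm can be applied, and
# (b) drop Gaussian components whose mixture weights fall outside a plausible
#     range after a fitting iteration.
import numpy as np

def to_unconstrained(x, lb, ub):
    # Logit-style transform: x in (lb, ub)  ->  z in (-inf, +inf).
    t = (x - lb) / (ub - lb)
    return np.log(t / (1.0 - t))

def to_constrained(z, lb, ub):
    # Inverse transform: any real z maps back strictly inside the bounds.
    return lb + (ub - lb) / (1.0 + np.exp(-z))

def prune_components(weights, means, covs, w_min=1e-3, w_max=0.95):
    # Remove components with very small or very large weights and renormalize.
    keep = (weights > w_min) & (weights < w_max)
    if keep.sum() == 0:                       # never drop every component
        keep[np.argmax(weights)] = True
    w = weights[keep] / weights[keep].sum()
    return w, means[keep], covs[keep]
```

In the paper a dedicated filter additionally prevents removed components from reappearing in later iterations; only the removal step is sketched here.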


2021 ◽  
Author(s):  
Milana Gataric ◽  
Jun Sung Park ◽  
Tong Li ◽  
Vasy Vaskivskyi ◽  
Jessica Svedlund ◽  
...  

Realising the full potential of novel image-based spatial transcriptomic (IST) technologies requires robust and accurate algorithms for decoding the hundreds of thousands of fluorescent signals, each derived from a single molecule of mRNA. In this paper, we introduce PoSTcode, a probabilistic method for transcript decoding from cyclic multi-channel images, whose effectiveness is demonstrated on multiple large-scale datasets generated using different versions of the in situ sequencing protocols. PoSTcode is based on a re-parametrised matrix-variate Gaussian mixture model designed to account for correlated noise across fluorescence channels and imaging cycles. PoSTcode is shown to recover up to 50% more confidently decoded molecules while simultaneously decreasing transcript mislabeling when compared to existing decoding techniques. In addition, we demonstrate its increased stability to various types of noise and tuning parameters, which makes this new approach reliable and easy to use in practice. Lastly, we show that PoSTcode produces fewer doublet signals compared to a pixel-based decoding algorithm.
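The key modeling idea, a matrix-variate Gaussian whose Kronecker-structured covariance couples imaging cycles and fluorescence channels, can be illustrated with a toy decoder. The sketch below (Python with NumPy/SciPy) scores a spot's (cycles × channels) intensity matrix against each barcode's expected on/off pattern; it is a simplified stand-in, not PoSTcode's re-parametrised model or its inference procedure, and all names and values are illustrative.

```python
# Toy illustration of matrix-variate Gaussian scoring for barcode decoding.
# Each spot yields a (cycles x channels) intensity matrix; under a matrix
# normal with cycle covariance U and channel covariance V, the row-major
# ravel of that matrix has covariance kron(U, V). Not PoSTcode's model.
import numpy as np
from scipy.stats import multivariate_normal

def decode_spot(intensity, barcodes, cycle_cov, channel_cov):
    """intensity: (R, C) matrix; barcodes: dict name -> (R, C) 0/1 pattern."""
    cov = np.kron(cycle_cov, channel_cov)    # Kronecker-structured covariance
    x = intensity.ravel()
    scores = {
        name: multivariate_normal.logpdf(x, mean=pattern.ravel(), cov=cov)
        for name, pattern in barcodes.items()
    }
    return max(scores, key=scores.get)       # most likely barcode

# Toy example: 2 imaging cycles, 4 channels, two barcodes.
R, C = 2, 4
barcodes = {
    "GeneA": np.array([[1, 0, 0, 0], [0, 1, 0, 0]], float),
    "GeneB": np.array([[0, 0, 1, 0], [0, 0, 0, 1]], float),
}
noisy = barcodes["GeneA"] + 0.2 * np.random.default_rng(1).normal(size=(R, C))
print(decode_spot(noisy, barcodes, 0.1 * np.eye(R), 0.1 * np.eye(C)))
```

Because the covariance factorizes over cycles and channels, correlated noise along either axis can be modeled without estimating a full dense covariance, which is the structural advantage the abstract points to.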


SPE Journal ◽  
2021 ◽  
pp. 1-20
Author(s):  
Guohua Gao ◽  
Jeroen Vink ◽  
Fredrik Saaf ◽  
Terence Wells

Summary When formulating history matching within the Bayesian framework, we may quantify the uncertainty of model parameters and production forecasts using conditional realizations sampled from the posterior probability density function (PDF). It is quite challenging to sample such a posterior PDF. Some methods [e.g., Markov chain Monte Carlo (MCMC)] are very expensive, whereas other methods are cheaper but may generate biased samples. In this paper, we propose an unconstrained Gaussian mixture model (GMM) fitting method to approximate the posterior PDF and investigate new strategies to further enhance its performance. To reduce the central processing unit (CPU) time of handling bound constraints, we reformulate the GMM fitting formulation such that an unconstrained optimization algorithm can be applied to find the optimal solution of unknown GMM parameters. To obtain a sufficiently accurate GMM approximation with the lowest number of Gaussian components, we generate random initial guesses, remove components with very small or very large mixture weights after each GMM fitting iteration, and prevent their reappearance using a dedicated filter. To prevent overfitting, we add a new Gaussian component only if the quality of the GMM approximation on a (large) set of blind-test data sufficiently improves. The unconstrained GMM fitting method with the new strategies proposed in this paper is validated using nonlinear toy problems and then applied to a synthetic history-matching example. It can construct a GMM approximation of the posterior PDF that is comparable to the MCMC method, and it is significantly more efficient than the constrained GMM fitting formulation (e.g., reducing the CPU time by a factor of 800 to 7,300 for problems we tested), which makes it quite attractive for large-scale history-matching problems. NOTE: This paper is published as part of the 2021 SPE Reservoir Simulation Special Issue.
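One strategy highlighted here, adding a Gaussian component only when the approximation quality on blind-test data improves, is sketched below in a hypothetical form using scikit-learn's GaussianMixture with mean log-likelihood as the quality measure. The paper's actual fitting formulation and quality criterion differ, so treat this purely as an illustration of the stopping rule.

```python
# Hypothetical sketch of the overfitting guard described above: accept a new
# Gaussian component only if it improves the fit quality measured on a
# held-out "blind-test" set by more than a small tolerance. Function names
# and the quality metric are illustrative, not the paper's.
import numpy as np
from sklearn.mixture import GaussianMixture

def grow_gmm(train, blind_test, max_components=10, tol=1e-3, seed=0):
    best = GaussianMixture(n_components=1, random_state=seed).fit(train)
    best_score = best.score(blind_test)          # mean log-likelihood
    for k in range(2, max_components + 1):
        cand = GaussianMixture(n_components=k, random_state=seed).fit(train)
        score = cand.score(blind_test)
        if score <= best_score + tol:            # no real improvement: stop
            break
        best, best_score = cand, score
    return best

rng = np.random.default_rng(0)
data = np.vstack([rng.normal(-3, 1, (500, 2)), rng.normal(3, 1, (500, 2))])
rng.shuffle(data)
model = grow_gmm(data[:800], data[800:])
print("selected components:", model.n_components)
```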


2019 ◽  
Vol 1 (2) ◽  
pp. 145-153
Author(s):  
Jin-jun Tang ◽  
Jin Hu ◽  
Yi-wei Wang ◽  
He-lai Huang ◽  
Yin-hai Wang

Abstract The global positioning system (GPS) traces collected from taxi vehicles provide abundant temporal-spatial information, as well as information on the activity of drivers. Using taxi vehicles as mobile sensors in road networks to collect traffic information is an important emerging approach in efforts to relieve congestion. In this paper, we present a hybrid model for estimating driving paths using a density-based spatial clustering of applications with noise (DBSCAN) algorithm and a Gaussian mixture model (GMM). The first step in our approach is to extract the locations from pick-up and drop-off records (PDR) in taxi GPS equipment. Second, the locations are classified into different clusters using DBSCAN; its two parameters (density threshold and radius) are optimized using real trace data recorded from 1100 drivers. A GMM is then utilized to estimate the distribution of these significant locations, with the GMM parameters optimized using the expectation-maximization (EM) algorithm. Finally, applications are used to test the effectiveness of the proposed model. In these applications, locations distributed in two regions (a residential district and a railway station) are clustered and estimated automatically.
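A minimal two-stage sketch of the approach described above, DBSCAN to group pick-up/drop-off locations into hotspots followed by a GMM fitted with EM, is given below using scikit-learn. The coordinates, eps, min_samples, and number of components are placeholders, not the values tuned on the 1100-driver dataset.

```python
# Illustrative two-stage sketch: DBSCAN groups raw pick-up/drop-off (PDR)
# coordinates into hotspots, then a GMM fitted by EM gives a smooth density
# over those locations. All parameter values are placeholders.
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Synthetic PDR points around two hotspots (e.g., a residential district and
# a railway station), plus scattered noise points.
pdr = np.vstack([
    rng.normal([113.00, 28.20], 0.002, (300, 2)),
    rng.normal([113.05, 28.15], 0.002, (300, 2)),
    rng.uniform([112.95, 28.10], [113.10, 28.25], (60, 2)),
])

# Stage 1: DBSCAN with a radius (eps) and density threshold (min_samples);
# the label -1 marks noise points that belong to no cluster.
labels = DBSCAN(eps=0.003, min_samples=20).fit_predict(pdr)
clustered = pdr[labels != -1]
print("clusters found:", len(set(labels)) - (1 if -1 in labels else 0))

# Stage 2: GMM fitted with the EM algorithm on the clustered points.
gmm = GaussianMixture(n_components=2, covariance_type="full").fit(clustered)
print("estimated hotspot centres:\n", gmm.means_)
```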

