Application of discovery process models in estimating petroleum resources at the play level in China

1997 ◽  
Vol 6 (4) ◽  
pp. 317-328 ◽  
Author(s):  
Zhuoheng Chen ◽  
Richard Sinding-Larsen ◽  
Xinhua Ma


Author(s):  
P.J. Lee

The procedure and steps of petroleum resource assessment involve a learning process characterized by an interactive loop between geological and statistical models and their feedback mechanisms. Geological models represent natural populations and are the basic units for petroleum resource evaluation. Statistical models include superpopulation, finite population, and discovery process models, which can be used to estimate pool-size and number-of-pools distributions from somewhat biased exploration data. Methods for assessing petroleum resources have been developed from different geological perspectives, and each can be applied to a specific case. When considering a particular method, the following aspects should be examined:
• Types of data required: Some methods can only incorporate certain types of data; others can incorporate all data that are available.
• Assumptions required: We must study what specific assumptions should be made and what role they play in the estimation process.
• Types of estimates: What types of estimates does the method provide (aggregate estimates vs. pool-size estimates)? Do they fulfill our needs for economic analysis?
• Feedback mechanisms: What types of feedback mechanism does the method offer?
PETRIMES is based on a probabilistic framework that uses superpopulation and finite population concepts, discovery process models, and the optional use of lognormal distributions. The reasoning behind the application of discovery process models is that they offer the only known way to incorporate petroleum assessment fundamentals (i.e., realism) into the estimates. PETRIMES requires an exploration time series as basic input and can be applied to both mature and frontier petroleum resource evaluations.
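
For readers unfamiliar with discovery process models, the sketch below illustrates the size-biased sampling idea they rest on: pools in a finite population are "discovered" without replacement with probability proportional to a power of their size, so the early part of the exploration time series over-represents large pools. This is a minimal illustration only, not the PETRIMES implementation; the exponent beta and the lognormal example population are assumptions made here for demonstration.

```python
import numpy as np

def simulate_discovery_sequence(pool_sizes, beta=1.0, seed=None):
    """Draw pools without replacement with probability proportional to size**beta.

    This mimics the size-biased ("successive sampling") assumption behind
    discovery process models: bigger pools tend to be found earlier, which is
    why raw exploration data are a biased sample of the underlying population.
    """
    rng = np.random.default_rng(seed)
    pool_sizes = np.asarray(pool_sizes, dtype=float)
    remaining = list(range(len(pool_sizes)))
    discovered = []
    while remaining:
        weights = pool_sizes[remaining] ** beta
        pick = rng.choice(len(remaining), p=weights / weights.sum())
        discovered.append(pool_sizes[remaining.pop(pick)])
    return discovered

# Example: a hypothetical lognormal population of 50 pools.
population = np.random.default_rng(42).lognormal(mean=0.0, sigma=2.0, size=50)
sequence = simulate_discovery_sequence(population, beta=1.0, seed=1)
print("first ten discoveries:", np.round(sequence[:10], 2))
```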


Author(s):  
Mouhib Alnoukari ◽  
Asim El Sheikh

The Knowledge Discovery (KD) process model was first discussed in 1989, and different models have since been suggested, starting with the process model of Fayyad et al. (1996). The common factor of all data-driven discovery processes is that knowledge is the final outcome. In this chapter, the authors analyze most of the KD process models suggested in the literature, with a detailed discussion of those KD process models that have innovative life-cycle steps, and propose a categorization of the existing KD models. The chapter analyzes in depth the strengths and weaknesses of the leading KD process models, together with their supporting commercial systems, reported applications, and characteristic matrices.
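
As background for that discussion, the sketch below lays out the classic KDD life cycle of Fayyad et al. (1996) as a simple pipeline of placeholder steps, with knowledge as the final outcome. The function bodies are illustrative stand-ins, not material from the chapter.

```python
# Schematic of the Fayyad et al. (1996) KDD process as a pipeline of
# placeholder steps; the bodies are illustrative stand-ins only.

def select(raw_data):
    """Select the target data relevant to the discovery goal."""
    return raw_data

def preprocess(data):
    """Clean the data: noise handling, missing values, and so on."""
    return data

def transform(data):
    """Reduce or project the data into features useful for mining."""
    return data

def mine(data):
    """Apply a data mining algorithm to extract patterns."""
    return {"patterns": data}

def interpret(patterns):
    """Evaluate and interpret the patterns; the outcome is knowledge."""
    return {"knowledge": patterns}

def kdd_process(raw_data):
    """Run the five steps in order: knowledge is the final outcome."""
    return interpret(mine(transform(preprocess(select(raw_data)))))

print(kdd_process(["record 1", "record 2"]))
```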


2010 ◽  
Vol 25 (2) ◽  
pp. 137-166 ◽  
Author(s):  
Gonzalo Mariscal ◽  
Óscar Marbán ◽  
Covadonga Fernández

Up to now, many data mining and knowledge discovery methodologies and process models have been developed, with varying degrees of success. In this paper, we describe the data mining and knowledge discovery methodologies and process models that are most used (in industrial and academic projects) and most cited (in the scientific literature), providing an overview of their evolution over the history of data mining and knowledge discovery and setting down the state of the art in this topic. For every approach, we provide a brief description of the proposed knowledge discovery in databases (KDD) process, discussing its special features and the outstanding advantages and disadvantages of each approach. Apart from that, a global comparison of all presented data mining approaches is provided, focusing on the different steps and tasks into which each approach divides the whole KDD process. As a result of the comparison, we propose a new data mining and knowledge discovery process, named the refined data mining process, for developing any kind of data mining and knowledge discovery project. The refined data mining process is built on specific steps taken from the analyzed approaches.
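
For orientation, the sketch below lines up the phases of three of the most widely cited approaches covered by such surveys (Fayyad et al.'s KDD process, CRISP-DM, and SEMMA). The phase names are standard, but the side-by-side listing is only an illustrative approximation, not the comparison matrix or the refined data mining process proposed in the paper.

```python
# Phases of three widely cited KDD/data mining approaches, listed side by
# side for orientation. This is an illustrative summary, not the comparison
# or the refined data mining process proposed in the paper.
approaches = {
    "KDD (Fayyad et al., 1996)": [
        "Selection", "Preprocessing", "Transformation",
        "Data mining", "Interpretation/Evaluation",
    ],
    "CRISP-DM": [
        "Business understanding", "Data understanding", "Data preparation",
        "Modeling", "Evaluation", "Deployment",
    ],
    "SEMMA": ["Sample", "Explore", "Modify", "Model", "Assess"],
}

for name, phases in approaches.items():
    print(f"{name}: " + " -> ".join(phases))
```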


Author(s):  
P.J. Lee

In Chapter 3 we discussed the concepts, functions, and applications of the two discovery process models LDSCV and NDSCV. In this chapter we will use various simulated populations to validate these two models and examine whether their performance meets our expectations. In addition, lognormal assumptions are applied to Weibull and Pareto populations to assess the impact of incorrectly specified probability distributions on petroleum evaluation. A mixed population of two lognormal populations and a mixed population of lognormal, Weibull, and Pareto populations were generated to test the impact of mixed populations on assessment quality. NDSCV was then applied to all these data sets to validate the performance of the models. Finally, justifications for choosing a lognormal distribution in petroleum assessments are discussed in detail. Known populations were created as follows: a finite population was generated as a random sample of size 300 (N = 300) drawn from each of the lognormal, Pareto, and Weibull superpopulations. For the lognormal case, a population with μ = 0 and σ² = 5 was assumed. The truncated and shifted Pareto population was created with shape factor θ = 0.4, maximum pool size = 4000, and minimum pool size = 1. The Weibull population with λ = 20 and θ = 1.0 was generated for the current study. The first mixed population was created by mixing two lognormal populations: parameters for population I are μ = 0, σ² = 3, and N1 = 150; for population II, μ = 3.0, σ² = 3.2, and N2 = 150. The second mixed population was generated by mixing lognormal (N1 = 100), Pareto (N2 = 100), and Weibull (N3 = 100) populations, giving a total of 300 pools. In addition, a gamma distribution was also used for reference. The lognormal distribution is J-shaped when an arithmetic scale is used for the horizontal axis, but shows an almost symmetrical pattern when a logarithmic scale is applied.
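
The following sketch shows one way such simulated finite populations could be generated with NumPy. The random seed and the inverse-CDF construction of the truncated Pareto are assumptions made here for illustration; they are not necessarily the exact "truncated and shifted" parameterization used in the book.

```python
import numpy as np

rng = np.random.default_rng(0)  # illustrative seed
N = 300

# Lognormal superpopulation with mu = 0 and sigma^2 = 5.
lognormal_pop = rng.lognormal(mean=0.0, sigma=np.sqrt(5.0), size=N)

# Pareto with shape theta = 0.4 truncated to [1, 4000], drawn by inverting the
# truncated CDF. (One plausible construction; the book's "truncated and
# shifted" form may differ.)
def truncated_pareto(n, theta=0.4, lo=1.0, hi=4000.0):
    u = rng.uniform(size=n)
    return lo / (1.0 - u * (1.0 - (lo / hi) ** theta)) ** (1.0 / theta)

pareto_pop = truncated_pareto(N)

# Weibull superpopulation with scale lambda = 20 and shape theta = 1.0.
weibull_pop = 20.0 * rng.weibull(1.0, size=N)

# Mixed population I: two lognormal components of 150 pools each.
mixed_I = np.concatenate([
    rng.lognormal(mean=0.0, sigma=np.sqrt(3.0), size=150),
    rng.lognormal(mean=3.0, sigma=np.sqrt(3.2), size=150),
])

# Mixed population II: 100 lognormal + 100 Pareto + 100 Weibull pools.
mixed_II = np.concatenate([
    rng.lognormal(mean=0.0, sigma=np.sqrt(5.0), size=100),
    truncated_pareto(100),
    20.0 * rng.weibull(1.0, size=100),
])

print(lognormal_pop.mean(), pareto_pop.max(), mixed_II.size)
```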


2014 ◽  
Vol 23 (01) ◽  
pp. 1440001 ◽  
Author(s):  
J. C. A. M. Buijs ◽  
B. F. van Dongen ◽  
W. M. P. van der Aalst

Process discovery algorithms typically aim at discovering process models from event logs that best describe the recorded behavior. Often, the quality of a process discovery algorithm is measured by quantifying to what extent the resulting model can reproduce the behavior in the log, i.e., replay fitness. At the same time, there are other measures that compare a model with recorded behavior in terms of the precision of the model and the extent to which the model generalizes the behavior in the log. Furthermore, many measures exist to express the complexity of a model irrespective of the log. In this paper, we first discuss several quality dimensions related to process discovery. We further show that existing process discovery algorithms typically consider at most two out of the four main quality dimensions: replay fitness, precision, generalization and simplicity. Moreover, existing approaches cannot steer the discovery process based on user-defined weights for the four quality dimensions. This paper presents the ETM algorithm which allows the user to seamlessly steer the discovery process based on preferences with respect to the four quality dimensions. We show that all dimensions are important for process discovery. However, it only makes sense to consider precision, generalization and simplicity if the replay fitness is acceptable.
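
To make the steering idea concrete, the sketch below combines per-dimension scores into a single weighted value. The scores and weights are placeholders chosen here for illustration; they stand in for the conformance measures and the fitness computation actually used by the ETM algorithm.

```python
# Minimal sketch of steering discovery with user-defined weights for the four
# quality dimensions. The per-dimension scores would come from conformance
# checking against the event log; here they are placeholder inputs, and the
# weighting scheme is illustrative rather than the actual ETM fitness function.

def weighted_quality(scores, weights):
    """Combine per-dimension scores (each in [0, 1]) into one fitness value."""
    dims = ("replay_fitness", "precision", "generalization", "simplicity")
    total_weight = sum(weights[d] for d in dims)
    return sum(weights[d] * scores[d] for d in dims) / total_weight

# Emphasize replay fitness, since the other dimensions only matter once
# replay fitness is acceptable.
weights = {"replay_fitness": 10, "precision": 1, "generalization": 1, "simplicity": 1}
scores = {"replay_fitness": 0.95, "precision": 0.70, "generalization": 0.80, "simplicity": 0.60}
print(round(weighted_quality(scores, weights), 3))
```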

