Data-based support for petroleum prospect evaluation

2020 ◽  
Vol 13 (4) ◽  
pp. 1305-1324
Author(s):  
Summaya Mumtaz ◽  
Irina Pene ◽  
Adnan Latif ◽  
Martin Giese

Abstract: We consider the challenging task of evaluating the commercial viability of hydrocarbon prospects based on limited information and in limited time. We investigate purely data-driven approaches to predicting key reservoir parameters and obtain a negative result: the information that is typically available for prospect evaluation and is suitable for data-based methods cannot be used for the required predictions. We can show, however, that the same information is sufficient to produce a limited list of potentially similar well-explored reservoirs (known as analogues) that can support the prospect evaluation work of human geoscientists. We base the proposal of analogues on similarity measures defined on the data available about prospects. Technically, the challenge is to define suitable similarity measures on categorical data such as depositional environment or rock type. Existing data-based similarity measures for categorical data do not perform well, since they do not take geological domain knowledge into account. We propose two novel similarity measures that use domain knowledge in the form of hierarchies on categorical values. Comparative evaluation shows that the semantic-based similarity measures outperform the existing data-driven approaches and are effective in comparison to human analogue selection.
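To make the idea of a similarity measure over a hierarchy of categorical values more concrete, the following is a minimal sketch in Python. It is not one of the two measures proposed in the paper, whose definitions the abstract does not give. It uses a generic Wu-Palmer-style score instead, and the toy hierarchy of depositional environments (`PARENT`) is a hypothetical illustration. The sketch also assumes every compared value appears in the hierarchy.

```python
# Hypothetical toy hierarchy (child -> parent); labels are illustrative only
# and are not taken from the paper.
PARENT = {
    "fluvial": "continental",
    "aeolian": "continental",
    "deltaic": "transitional",
    "shallow marine": "marine",
    "deep marine": "marine",
    "continental": "environment",
    "transitional": "environment",
    "marine": "environment",
}


def path_to_root(value):
    """Return the nodes from `value` up to the root, inclusive."""
    path = [value]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def depth(value):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(value))


def hierarchy_similarity(a, b):
    """Wu-Palmer-style score: 2 * depth(lcs) / (depth(a) + depth(b)).

    `lcs` is the deepest ancestor shared by both values. Assumes both
    values belong to the same hierarchy (an assumption of this sketch).
    """
    ancestors_a = set(path_to_root(a))
    # Walking upward from b, the first node also above a is the deepest shared one.
    lcs = next(node for node in path_to_root(b) if node in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


if __name__ == "__main__":
    for pair in [("fluvial", "fluvial"), ("fluvial", "aeolian"), ("fluvial", "deep marine")]:
        print(pair, round(hierarchy_similarity(*pair), 3))
```

Under this scheme two values that share a close parent score higher than two values that only meet at the root, which is the kind of domain knowledge a purely frequency-based categorical measure cannot express.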

2021 ◽  
Author(s):  
MUTHU RAM ELENCHEZHIAN ◽  
VAMSEE VADLAMUDI ◽  
RASSEL RAIHAN ◽  
KENNETH REIFSNIDER

Over the past few decades, our community has built up extensive knowledge of the damage tolerance and durability of composites through various experimental and computational efforts. Several methods have been used to understand damage behavior and hence predict material states such as residual strength (damage tolerance) and life (durability) of these material systems. Electrochemical Impedance Spectroscopy (EIS) and Broadband Dielectric Spectroscopy (BbDS) are two such methods that have been proven to identify damage states in composites. Our previous work has shown that the BbDS method can serve as a precursor for identifying damage levels, indicating the beginning of the end of life of the material. As a change in the material state variable is triggered by damage development, the rate of change of these states indicates the rate of damage interaction and can effectively predict impending failure. The Data-Driven Discovery of Models (D3M) [1] aims to develop model discovery systems, enabling users with domain knowledge but no data science background to create empirical models of real, complex processes. These D3M methods have been developed extensively over the years in various applications, and their use for real-time prediction of complex parameters such as material states in composites needs to be trusted on the basis of physics and domain knowledge. In this research work, we propose the use of data-driven methods combined with BbDS and progressive damage analysis to identify and hence predict material states in composites subjected to fatigue loads.


In data mining, many techniques use distance-based measures for data clustering. Improving clustering performance is the fundamental goal in clustering-related tasks. Many techniques are available for clustering numerical data as well as categorical data. Clustering is an unsupervised learning technique in which objects are grouped based on the similarity among them. This paper proposes a new cluster similarity measure, the cosine-like cluster similarity measure (CLCSM), and uses it for data classification. Extensive experiments are conducted on UCI machine learning datasets. The experimental results show that the proposed cosine-like cluster similarity measure is superior to many existing cluster similarity measures for data classification.
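The abstract does not define CLCSM, so the sketch below does not attempt to reproduce it. It only illustrates the general pattern the abstract describes, a cosine-type similarity between an object and a cluster used for classification, with the standard cosine similarity and cluster centroids standing in for the actual measure. The function names and the toy data are assumptions made for illustration.

```python
import math


def cosine_similarity(u, v):
    """Standard cosine similarity between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen for this sketch: a zero vector matches nothing
    return dot / (norm_u * norm_v)


def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def classify(x, clusters):
    """Assign x to the cluster label whose centroid is most cosine-similar to x.

    `clusters` maps a label to the list of points in that cluster.
    """
    return max(clusters, key=lambda label: cosine_similarity(x, centroid(clusters[label])))


if __name__ == "__main__":
    toy_clusters = {"a": [[1, 0], [2, 0.1]], "b": [[0, 1], [0.1, 3]]}
    print(classify([5, 0.2], toy_clusters))
```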


Author(s):  
Yunpeng Li ◽  
Utpal Roy ◽  
Y. Tina Lee ◽  
Sudarsan Rachuri

Rule-based expert systems such as CLIPS (C Language Integrated Production System) are 1) based on inductive (if-then) rules to elicit domain knowledge and 2) designed to infer new knowledge from existing knowledge and given inputs. Recently, data mining techniques have been advocated for discovering knowledge from massive historical or real-time sensor data. Combining top-down expert-driven rule models with bottom-up data-driven prediction models makes it possible to enrich and improve the predefined knowledge in an expert system with data-driven insights. However, such a combination is possible only if there is a common and formal representation of these models, so that they can be exchanged, reused, and orchestrated among different authoring tools. This paper investigates the open standard PMML (Predictive Model Markup Language) for integrating rule-based expert systems with data analytics tools, so that a decision maker has access to powerful tools for both reasoning-intensive and data-intensive tasks. We present a process planning use case in the manufacturing domain, originally implemented as a CLIPS-based expert system. We explore different paradigms for interpreting expert system facts and rules as PMML models (and vice versa), as well as the challenges in representing and composing these models, and discuss them in detail.


Author(s):  
Longbing Cao ◽  
Chengqi Zhang

Traditional data mining based on quantitative intelligence faces grand challenges from real-world enterprise and cross-organization applications. For instance, the usual demonstration of specific algorithms does not help business users take actions that serve their advantage and needs. We think this is due to a data-driven philosophy focused on quantitative intelligence, which either views data mining as an autonomous, data-driven, trial-and-error process or analyzes business issues only in an isolated, case-by-case manner. Based on experience and lessons learnt from real-world data mining and complex systems, this article proposes a practical data mining methodology referred to as Domain-Driven Data Mining. On top of quantitative intelligence and hidden knowledge in data, domain-driven data mining aims to meta-synthesize quantitative intelligence and qualitative intelligence in mining complex applications in which a human is in the loop. It targets actionable knowledge discovery in a constrained environment to satisfy user preferences. The domain-driven methodology consists of key components including understanding the constrained environment, a business-technical questionnaire, representing and involving domain knowledge, human-mining cooperation and interaction, constructing next-generation mining infrastructure, in-depth pattern mining and postprocessing, business interestingness and actionability enhancement, and closed-loop, human-cooperated iterative refinement. Domain-driven data mining complements the data-driven methodology: the metasynthesis of qualitative intelligence and quantitative intelligence has the potential to discover knowledge from complex systems and to enhance knowledge actionability for practical use by industry and business.


2018 ◽  
Vol 22 (11) ◽  
pp. 3603-3619 ◽  
Author(s):  
Wentao Zhao ◽  
Qian Li ◽  
Chengzhang Zhu ◽  
Jianglong Song ◽  
Xinwang Liu ◽  
...  

Perception ◽  
1997 ◽  
Vol 26 (1_suppl) ◽  
pp. 296-296
Author(s):  
L A Olzak ◽  
T D Wickens

Many important psychophysical questions concern the interaction or combination of different components of a stimulus. Classical psychophysical methods for assessing whether two stimulus aspects are coded independently (eg, masking and summation) provide limited information about the nature of whatever interactions are discovered. In both older work in detection and recent work in complex pattern discrimination, we have used a double-judgment paradigm in which the observer rates two aspects of a stimulus simultaneously. The paradigm provides a rich source of information about the codes underlying each psychophysical decision. It is unique in permitting us to investigate effects resulting from correlations in noise. We review the theoretical, technological, and methodological results that led us to develop this approach. Procedural antecedents lie in theories of dimensional interaction, in signal detection theory, and in information theory. Analytically, we draw on methods from several branches of statistics, including categorical data analysis and structural equation modeling. Also key to our work are advances in computational power: both our experimental procedures and our data analysis would have been difficult or impossible two decades ago.


2020 ◽  
Vol 10 (2) ◽  
pp. 255-259 ◽  
Author(s):  
Philip James ◽  
Ronnie Das ◽  
Agata Jalosinska ◽  
Luke Smith

This commentary describes the rapid development of a COVID-19 data dashboard utilising existing Urban Observatory Internet of Things (IoT) data and analytics infrastructure. Existing data capture systems were rapidly repurposed to provide real-time insights into the impacts of lockdown policy on urban governance.


2014 ◽  
Vol 2014 ◽  
pp. 1-16 ◽  
Author(s):  
Rubing Huang ◽  
Jinfu Chen ◽  
Yansheng Lu

Random testing (RT) is a fundamental testing technique for assessing software reliability, in which test cases are simply selected at random from the whole input domain. As an enhancement of RT, adaptive random testing (ART) has better failure‐detection capability and has been widely applied in different scenarios, such as numerical programs, some object‐oriented programs, and mobile applications. However, not much work has been done on the effectiveness of ART for programs with a combinatorial input domain (i.e., a set of categorical data). To extend these ideas to testing over combinatorial input domains, we have adopted different similarity measures that are widely used for categorical data in data mining and have proposed two similarity measures based on interaction coverage. We then propose a new version named ART‐CID as an extension of ART to combinatorial input domains, which selects an element from the categorical data as the next test case such that it has the lowest similarity to the already generated test cases. Experimental results show that ART‐CID generally performs better than RT with respect to different evaluation metrics.
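The selection step described above can be sketched roughly as follows. This is not ART‐CID itself. The similarity used here is a plain overlap measure rather than the interaction-coverage measures proposed in the paper. The candidate-set size, the rule of comparing each candidate by its highest similarity to any earlier test case, and all function names are assumptions of this sketch.

```python
import random


def overlap_similarity(a, b):
    """Fraction of parameters on which two test cases take the same value."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def next_test_case(executed, domains, rng, num_candidates=10):
    """Draw random candidates and keep the one least similar to earlier tests.

    Each candidate is scored by its highest similarity to any already
    generated test case; the candidate with the lowest score is returned.
    """
    candidates = [tuple(rng.choice(values) for values in domains)
                  for _ in range(num_candidates)]
    if not executed:
        return candidates[0]  # nothing to compare against yet
    return min(candidates,
               key=lambda c: max(overlap_similarity(c, e) for e in executed))


def generate_suite(domains, size, seed=0):
    """Build a test suite of `size` cases over categorical parameter domains."""
    rng = random.Random(seed)
    suite = []
    for _ in range(size):
        suite.append(next_test_case(suite, domains, rng))
    return suite


if __name__ == "__main__":
    # Hypothetical categorical input domain with three parameters.
    toy_domains = [["a", "b"], ["x", "y", "z"], [0, 1]]
    for case in generate_suite(toy_domains, size=5):
        print(case)
```

Plain random testing corresponds to `num_candidates=1`, where the similarity comparison has no effect.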


Dela ◽  
2021 ◽  
pp. 149-167
Author(s):  
Špela Vintar ◽  
Uroš Stepišnik

We describe a systematic and data-driven approach to karst terminology where knowledge from different textual sources is structured into a comprehensive multilingual knowledge representation. The approach is based on a domain model which is constructed in line with the frame-based approach to terminology and the analytical geomorphological method of describing karst phenomena. The domain model serves as a basis for annotating definitions and aggregating the information obtained from different definitions into a knowledge network. We provide examples of visual knowledge representations and demonstrate the advantages of a systematic and interdisciplinary approach to domain knowledge.


2021 ◽  
Author(s):  
Bryan Fuentes ◽  
Minerva Dorantes ◽  
John Tipton

Spatial stratification of landscapes allows for the development of efficient sampling surveys, the inclusion of domain knowledge in data-driven modeling frameworks, and the production of information relating the spatial variability of response phenomena to that of landscape processes. This work presents the rassta package as a collection of algorithms dedicated to the spatial stratification of landscapes, the calculation of landscape correspondence metrics across geographic space, and the application of these metrics for spatial sampling and modeling of environmental phenomena. The theoretical background of rassta is presented through references to several studies that have benefited from landscape stratification routines. The functionality of rassta is presented through code examples, which are complemented with geographic visualizations of their outputs.

