Application of Inductive Modeling Principles to Solve the Double Clustering Problems

Author(s):  
Volodymyr Osypenko ◽  
Valentyna Osypenko

2018 ◽  
Author(s):  
Jordan Stevens ◽  
Douglas Steinley ◽  
Cassandra L. Boness ◽  
Timothy J. Trull ◽  
...  

Using complete enumeration (i.e., generating all possible subsets of item combinations) to evaluate clustering problems has the benefit of locating globally optimal solutions automatically, without concern for sampling variability. The proposed method combines clustering variables in such a way as to create groups that are maximally different on one or more theoretically sound derivation variables. Once the population of all unique item sets has been enumerated, optimization over some predefined, user-specified function can occur. We apply this technique to optimizing the diagnosis of Alcohol Use Disorder. From a clustering point of view, this is a unique application in that the decision rule for clustering observations into the diagnosis group relies on both the set of items being considered and a predefined threshold on the number of items that must be endorsed for the diagnosis to occur. In optimizing diagnostic rules, criteria set sizes can be reduced without significant loss of information when compared with current and proposed alternative diagnostic schemes.
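As a rough illustration of how such a complete-enumeration search might be organized (the function names, data layout, and objective below are our own assumptions, not the authors' implementation): every subset of candidate criteria is paired with every endorsement threshold, and each resulting diagnostic rule is scored by how well it separates the diagnosed and undiagnosed groups on a derivation variable.

```python
# Illustrative sketch (not the authors' code): exhaustively enumerate
# candidate criterion subsets and endorsement thresholds, scoring each
# rule by how well it separates groups on a derivation variable.
from itertools import combinations

def enumerate_rules(items, responses, severity, objective):
    """items: list of criterion names.
    responses: dict mapping person -> set of endorsed items.
    severity: dict mapping person -> derivation-variable score.
    objective: callable scoring (diagnosed, undiagnosed) severity lists."""
    best_rule, best_score = None, float("-inf")
    for k in range(1, len(items) + 1):            # subset size
        for subset in combinations(items, k):     # all unique item sets
            for threshold in range(1, k + 1):     # endorsement cutoffs
                dx = [severity[p] for p, e in responses.items()
                      if len(e & set(subset)) >= threshold]
                no_dx = [severity[p] for p, e in responses.items()
                         if len(e & set(subset)) < threshold]
                score = objective(dx, no_dx)
                if score > best_score:
                    best_rule, best_score = (subset, threshold), score
    return best_rule, best_score

# Example objective (an assumption, for illustration only): absolute
# difference in mean severity between the two groups.
def mean_gap(dx, no_dx):
    if not dx or not no_dx:
        return float("-inf")
    return abs(sum(dx) / len(dx) - sum(no_dx) / len(no_dx))
```

Note that the search space grows exponentially in the number of items, which is why the approach suits modest criteria sets such as diagnostic checklists.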


2006 ◽  
Vol 31 (5) ◽  
pp. 5 ◽  
Author(s):  
Marco Sinnema ◽  
Jan Salvador van der Ven ◽  
Sybren Deelstra

2010 ◽  
Vol 57 (2) ◽  
pp. 1-32 ◽  
Author(s):  
Amit Kumar ◽  
Yogish Sabharwal ◽  
Sandeep Sen

1996 ◽  
Vol 9 (3) ◽  
pp. 229-239 ◽  
Author(s):  
Santosh Kabadi ◽  
Katta G. Murty ◽  
Cosimo Spera

2009 ◽  
Vol 20 (02) ◽  
pp. 361-377 ◽  
Author(s):  
Danny Z. Chen ◽  
Mark A. Healy ◽  
Chao Wang ◽  
Bin Xu

In this paper, we present efficient geometric algorithms for the discrete constrained 1-D K-means clustering problem and extend our solutions to the continuous version of the problem. One key clustering constraint we consider is that the maximum difference between points within each cluster cannot exceed a given threshold. These constrained 1-D K-means clustering problems arise in various applications, especially in intensity-modulated radiation therapy (IMRT). Our algorithms improve the efficiency and accuracy of the heuristic approaches used in clinical IMRT treatment planning.
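To make the problem structure concrete: in 1-D, optimal K-means clusters are contiguous once the points are sorted, so the constrained problem admits a straightforward dynamic program. The sketch below is our own baseline illustration (the paper's geometric algorithms are considerably faster), with a hypothetical max_diff parameter standing in for the threshold constraint.

```python
# A plain dynamic-programming sketch of constrained 1-D K-means
# (O(K * n^2); shown only to illustrate the problem structure, not the
# paper's more efficient geometric algorithms). A cluster covering the
# sorted points x[i..j] is feasible only when x[j] - x[i] <= max_diff.
def constrained_1d_kmeans(points, K, max_diff):
    x = sorted(points)
    n = len(x)
    # Prefix sums give O(1) within-cluster sum-of-squared-error cost.
    s = [0.0] * (n + 1)
    s2 = [0.0] * (n + 1)
    for i, v in enumerate(x):
        s[i + 1] = s[i] + v
        s2[i + 1] = s2[i] + v * v

    def cost(i, j):  # SSE of a cluster covering x[i..j], inclusive
        m = j - i + 1
        seg = s[j + 1] - s[i]
        return s2[j + 1] - s2[i] - seg * seg / m

    INF = float("inf")
    # dp[k][j]: best cost of partitioning x[0..j] into k feasible clusters
    dp = [[INF] * n for _ in range(K + 1)]
    for j in range(n):
        if x[j] - x[0] <= max_diff:
            dp[1][j] = cost(0, j)
    for k in range(2, K + 1):
        for j in range(n):
            for i in range(1, j + 1):        # last cluster = x[i..j]
                if x[j] - x[i] > max_diff:
                    continue
                cand = dp[k - 1][i - 1] + cost(i, j)
                if cand < dp[k][j]:
                    dp[k][j] = cand
    return dp[K][n - 1]  # optimal SSE; INF if no feasible K-clustering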


2021 ◽  
Author(s):  
Xian Wu ◽  
Tianfang Zhou ◽  
Kaixiang Yi ◽  
Minrui Fei ◽  
Yayu Chen ◽  
...  

2021 ◽  
Author(s):  
Boxiao Li ◽  
Hemant Phale ◽  
Yanfen Zhang ◽  
Timothy Tokar ◽  
Xian-Huan Wen

Abstract: Design of Experiments (DoE) is one of the most commonly employed techniques in the petroleum industry for Assisted History Matching (AHM) and uncertainty analysis of reservoir production forecasts. Although conceptually straightforward, DoE is often misused by practitioners because many of its statistical and modeling principles are not carefully followed. Our earlier paper (Li et al. 2019) detailed the best practices in DoE-based AHM for brownfields. To the best of our knowledge, however, no study has summarized the common caveats and pitfalls in DoE-based production-forecast uncertainty analysis for greenfields and history-matched brownfields. Our objective here is to summarize these caveats and pitfalls to help practitioners apply the correct principles for DoE-based production-forecast uncertainty analysis.

Over 60 common pitfalls across all stages of a DoE workflow are summarized. Special attention is paid to three critical project transitions: (1) from static earth modeling to dynamic reservoir simulation; (2) from AHM to production forecast; and (3) from analyzing subsurface uncertainties to analyzing field-development alternatives. Most pitfalls can be avoided by consistently following the statistical and modeling principles. Some pitfalls, however, can trap even experienced engineers. For example, mistakes made in handling the three transitions above can yield highly unreliable proxy models and sensitivity analyses: in the representative examples we study, they reduce the proxy R² from above 0.9 (when handled correctly) to below 0.2. Two improved experimental designs are created to resolve this challenge.

Beyond the technical pitfalls that are avoidable via robust statistical workflows, we also highlight the often more severe non-technical pitfalls that cannot be evaluated by measures such as R². We share thoughts on how these can be avoided, especially during project framing and the three critical transitions.
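One concrete safeguard implied by the proxy-R² discussion is to validate a DoE-based proxy against blind simulation runs rather than trusting its fit to the training design alone. The sketch below is a minimal illustration of that check under our own assumptions: a space-filling Latin hypercube design, a quadratic polynomial proxy, and a toy response function standing in for the reservoir simulator. None of this is the paper's workflow; it only shows the mechanics of an out-of-sample R² check.

```python
# Hypothetical sketch: fit a quadratic proxy on a Latin hypercube design
# and score it with R^2 on held-out (blind) runs. The "simulator" is a
# toy stand-in; in practice it would be the reservoir-simulation response.
import numpy as np
from scipy.stats import qmc

def simulator(X):  # placeholder response, not a real reservoir model
    return 3.0 * X[:, 0] - 2.0 * X[:, 1] ** 2 + 0.5 * X[:, 0] * X[:, 1]

def quadratic_features(X):
    # Columns: intercept, linear terms, then all second-order terms.
    d = X.shape[1]
    cols = [np.ones(len(X))]
    cols += [X[:, i] for i in range(d)]
    cols += [X[:, i] * X[:, j] for i in range(d) for j in range(i, d)]
    return np.column_stack(cols)

sampler = qmc.LatinHypercube(d=2, seed=0)
X_train, X_test = sampler.random(40), sampler.random(20)  # space-filling
y_train, y_test = simulator(X_train), simulator(X_test)

beta, *_ = np.linalg.lstsq(quadratic_features(X_train), y_train, rcond=None)
pred = quadratic_features(X_test) @ beta
ss_res = np.sum((y_test - pred) ** 2)
ss_tot = np.sum((y_test - y_test.mean()) ** 2)
print(f"blind-test R^2 = {1 - ss_res / ss_tot:.3f}")  # low values flag a bad proxy
```

A proxy that scores well on its own design points but poorly on blind runs is exactly the failure mode the abstract describes at the project transitions.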

