A method for model selection using reinforcement learning when viewing design as a sequential decision process

2018 ◽  
Vol 59 (5) ◽  
pp. 1521-1542 ◽  
Author(s):  
Jaskanwal P. S. Chhabra ◽  
Gordon P. Warn
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Sung Ho Kang ◽  
Kiwan Jeon ◽  
Sang-Hoon Kang ◽  
Sang-Hwy Lee

Abstract The lengthy time needed for manual landmarking has delayed the widespread adoption of three-dimensional (3D) cephalometry. We here propose an automatic 3D cephalometric annotation system based on multi-stage deep reinforcement learning (DRL) and volume-rendered imaging. This system considers geometrical characteristics of landmarks and simulates the sequential decision process underlying human professional landmarking patterns. It consists mainly of constructing an appropriate two-dimensional cutaway or 3D model view, then implementing single-stage DRL with gradient-based boundary estimation or multi-stage DRL to dictate the 3D coordinates of target landmarks. This system clearly shows sufficient detection accuracy and stability for direct clinical applications, with a low level of detection error and low inter-individual variation (1.96 ± 0.78 mm). Our system, moreover, requires no additional steps of segmentation and 3D mesh-object construction for landmark detection. We believe these system features will enable fast-track cephalometric analysis and planning and expect it to achieve greater accuracy as larger CT datasets become available for training and testing.


2007 ◽  
Vol 24 (02) ◽  
pp. 181-202
Author(s):  
YUKIHIRO MARUYAMA

In this paper, we introduce a new subclass of bitone sequential decision processes (bsdp), called positively/negatively bsdp (p/n bsdp for short), and give a representation theorem for it, that is, a necessary and sufficient condition for a p/n bsdp to strongly represent a given discrete decision process (ddp).


2004 ◽  
Vol 98 (3) ◽  
pp. 495-513 ◽  
Author(s):  
CHRISTIAN LIST

I model sequential decisions over multiple interconnected propositions and investigate path-dependence in such decisions. The propositions and their interconnections are represented in propositional logic. A sequential decision process is path-dependent if its outcome depends on the order in which the propositions are considered. Assuming that earlier decisions constrain later ones, I prove three main results: First, certain rationality violations by the decision-making agent—individual or group—are necessary and sufficient for path-dependence. Second, under some conditions, path-dependence is unavoidable in decisions made by groups. Third, path-dependence makes decisions vulnerable to strategic agenda setting and strategic voting. I also discuss escape routes from path-dependence. My results are relevant to discussions on collective consistency and reason-based decision-making, focusing not only on outcomes, but also on underlying reasons, beliefs, and constraints.
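The path-dependence described above can be illustrated with a small sketch. It is not taken from the article: the three propositions (p, q, and r defined as "p and q"), the three voters, and the rule "accept what earlier decisions entail, otherwise take a majority vote" are all assumptions chosen for illustration.

```python
from itertools import product


def worlds():
    """All consistent truth assignments, with r defined as (p and q)."""
    for p, q in product([True, False], repeat=2):
        yield {"p": p, "q": q, "r": p and q}


def decide(order, votes):
    """Decide propositions one at a time; earlier decisions constrain later ones."""
    accepted = {}
    for prop in order:
        # Assignments still compatible with everything decided so far.
        live = [w for w in worlds()
                if all(w[k] == v for k, v in accepted.items())]
        values = {w[prop] for w in live}
        if len(values) == 1:
            # The value is forced by earlier decisions.
            accepted[prop] = values.pop()
        else:
            # Otherwise fall back on a simple majority vote.
            yes = sum(v[prop] for v in votes)
            accepted[prop] = yes > len(votes) / 2
    return accepted


# Hypothetical voters: each is individually consistent with r = (p and q).
votes = [
    {"p": True, "q": True, "r": True},
    {"p": True, "q": False, "r": False},
    {"p": False, "q": True, "r": False},
]

premises_first = decide(("p", "q", "r"), votes)
conclusion_first = decide(("r", "p", "q"), votes)
```

Under these assumptions the group accepts r when p and q are considered first, and rejects r (and then q) when r is considered first, so the outcome depends on the agenda.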


Author(s):  
Simon W. Miller ◽  
Timothy W. Simpson ◽  
Michael A. Yukish

Design is a sequential decision process that increases the detail of modeling and analysis while simultaneously decreasing the space of alternatives considered. In a decision theoretic framework, low-fidelity models help decision-makers identify regions of interest in the tradespace and cull others prior to constructing more computationally expensive models of higher fidelity. The method presented herein demonstrates design as a sequence of finite decision epochs through a search space defined by the extent of the set of designs under consideration, and the level of analytic fidelity subjected to each design. Previous work has shown that multi-fidelity modeling can aid in rapid optimization of the design space when high-fidelity models are coupled with low-fidelity models. This paper offers two contributions to the design community: (1) a model of design as a sequential decision process of refinement using progressively more accurate and expensive models, and (2) a connected approach for how conceptual models couple with detailed models. Formal definitions of the process are provided, and a simple one-dimensional example is presented to demonstrate the use of sequential multi-fidelity modeling in determining an optimal modeling selection policy.
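A minimal sketch of the refinement idea follows. It assumes a single objective to be minimized and assumes each model returns guaranteed lower and upper bounds on that objective; the design names, bound values, and the `cull` helper are hypothetical and are not the authors' formulation.

```python
def cull(designs, bounds):
    """Keep designs that could still be optimal (minimization).

    A design is removed only if its lower bound exceeds the smallest
    upper bound among the designs under consideration.
    """
    best_upper = min(bounds[d][1] for d in designs)
    return [d for d in designs if bounds[d][0] <= best_upper]


# Hypothetical (lower, upper) bounds: wide from a cheap model,
# tighter from an expensive one run only on the survivors.
low_fidelity = {"A": (1.0, 4.0), "B": (3.0, 6.0), "C": (5.0, 9.0), "D": (2.0, 5.0)}
high_fidelity = {"A": (2.0, 2.5), "B": (3.5, 4.0), "D": (2.4, 3.0)}

after_low = cull(["A", "B", "C", "D"], low_fidelity)
after_high = cull(after_low, high_fidelity)
```

In this toy case the expensive model is evaluated on three designs rather than four, because the cheap model has already removed one.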


Author(s):  
Jaskanwal P. S. Chhabra ◽  
Gordon P. Warn

Engineers often employ, formally or informally, multi-fidelity computational models to aid design decision making. For example, the recent idea of viewing design as a Sequential Decision Process (SDP) provides a formal framework for sequencing multi-fidelity models to realize computational gains in the design process. Efficiency is achieved in the SDP because dominated designs are removed using less expensive (low-fidelity) models before higher-fidelity models are used, with the guarantee that the antecedent model only removes design solutions that are dominated when analyzed using more detailed, higher-fidelity models. The set of multi-fidelity models and discrete decision states results in a combinatorial number of modeling sequences, some of which require significantly fewer model evaluations than others. It is desirable to optimally sequence models; however, the optimal modeling policy cannot be determined at the outset of the SDP because the computational cost and discriminatory power of executing all models on all designs are unknown. In this study, the model selection problem is formulated as a Markov Decision Process, and a classical reinforcement learning algorithm, namely Q-learning, is investigated to obtain and follow an approximately optimal modeling policy. The outcome is a methodology able to learn efficient sequencing of models by estimating their computational cost and discriminatory power while analyzing designs in the tradespace throughout the design process. Through application to a design example, the methodology is shown to: 1) effectively identify the approximate optimal modeling policy, and 2) efficiently converge upon a choice set.
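For readers unfamiliar with Q-learning, the sketch below shows the tabular update on a toy model-selection problem. The states, actions, and costs are hypothetical and do not come from the study: state 0 stands for the full design set, state 1 for a culled set, and state 2 for the final choice set, with rewards equal to negative computational cost.

```python
import random

# Hypothetical deterministic environment: (state, action) -> (reward, next_state).
TRANSITIONS = {
    (0, "low"): (-1.0, 1),    # cheap model culls the full set
    (0, "high"): (-10.0, 2),  # expensive model on the full set
    (1, "low"): (-1.0, 1),    # cheap model again, no further culling
    (1, "high"): (-3.0, 2),   # expensive model on the culled set
}
ACTIONS = ("low", "high")
TERMINAL = 2


def train(episodes=500, alpha=0.5, gamma=1.0, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in ACTIONS} for s in (0, 1)}
    for _ in range(episodes):
        state = 0
        while state != TERMINAL:
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)              # explore
            else:
                action = max(q[state], key=q[state].get)  # exploit
            reward, nxt = TRANSITIONS[(state, action)]
            best_next = 0.0 if nxt == TERMINAL else max(q[nxt].values())
            # Q(s, a) += alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
            q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
            state = nxt
    return q


def greedy_policy(q):
    """Pick the highest-valued action in each non-terminal state."""
    return {s: max(q[s], key=q[s].get) for s in q}


q_table = train()
policy = greedy_policy(q_table)
```

With these assumed costs, the learned greedy policy should run the low-fidelity model first and the high-fidelity model on the culled set, since that sequence costs 4 units against 10 for going straight to high fidelity.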


Author(s):  
Maximilian E. Ororbia ◽  
Gordon P. Warn

Abstract This article illustrates that structural design synthesis can be achieved through a sequential decision process, whereby a sparsely connected seed configuration is sequentially altered through discrete actions to generate the best design solution, with respect to a specified objective and constraints. Specifically, the generative design synthesis is mathematically formulated as a finite Markov Decision Process. In this context, the states correspond to a specific structural configuration, the actions correspond to the available alterations that can be made to a given configuration, and the immediate rewards are constructed to be proportional to the improvement in the altered configuration’s performance. In the context of generative structural design synthesis, since the immediate rewards are not known at the onset of the process, reinforcement learning is employed to obtain an approximately optimal policy by which to alter the seed configuration to synthesize the best design solution. The approach is applied for the optimization of planar truss structures and its utility is investigated with three numerical examples, each with unique domains and constraints.


2020 ◽  
Vol 62 (2) ◽  
pp. 709-728
Author(s):  
Maximilian E. Ororbia ◽  
Jaskanwal P. S. Chhabra ◽  
Gordon P. Warn ◽  
Simon W. Miller ◽  
Michael A. Yukish ◽  
...  
