A hybrid landmark Aalen-Johansen estimator for transition probabilities in partially non-Markov multi-state models

Author(s): Niklas Maltzahn, Rune Hoff, Odd O. Aalen, Ingrid S. Mehlum, Hein Putter, et al.

Abstract: Multi-state models are increasingly used to model complex epidemiological and clinical outcomes over time. It is common to assume that such models are Markov, but this assumption is often unrealistic. The Markov assumption is seldom checked, and violations can lead to biased estimation of many parameters of interest. This is a well-known problem for the standard Aalen-Johansen estimator of transition probabilities, and several alternative estimators that do not rely on the Markov assumption have been suggested. A particularly simple approach, known as landmarking, has produced the landmark Aalen-Johansen estimator. Because landmarking is a stratification method, it reduces the data available for estimation, leading to a loss of power. This is problematic for "less traveled" transitions, and it is undesirable when such transitions do in fact exhibit Markov behaviour. Introducing the concept of partially non-Markov multi-state models, we suggest a hybrid landmark Aalen-Johansen estimator for transition probabilities. We also show how non-Markov transitions can be identified using a testing procedure. The proposed estimator is a compromise between regular Aalen-Johansen and landmark estimation, using transition-specific landmarking, and can drastically improve statistical power. We show that the proposed estimator is consistent, but that the traditional variance estimator can underestimate the variance of both the hybrid and the landmark estimator; bootstrapping is therefore recommended. The methods are compared in a simulation study and in a real data application using registry data to model individual transitions for a birth cohort of 184 951 Norwegian men between states of sick leave, disability, education, work and unemployment.
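To make the construction concrete, here is a minimal numpy sketch (not the authors' implementation) of the Aalen-Johansen product integral, assuming precomputed increment matrices dA(u); the closing comment indicates where the landmark and hybrid variants would differ.

```python
import numpy as np

def aalen_johansen(jump_times, dA, s, t):
    """Product-integral estimator P(s, t) = prod_{s < u <= t} (I + dA(u)).

    jump_times : sorted array of observed transition times
    dA         : dict mapping each jump time u to a k x k increment matrix;
                 off-diagonal entry (h, j) is (# h -> j transitions at u) /
                 (# at risk in state h just before u), rows summing to zero
    """
    k = dA[jump_times[0]].shape[0]
    P = np.eye(k)
    for u in jump_times:
        if s < u <= t:
            P = P @ (np.eye(k) + dA[u])
    return P

# Landmark variant: estimate every dA(u) from the subsample occupying the
# conditioning state at the landmark time s. Hybrid variant (transition-
# specific landmarking): use the full risk set for transitions that pass a
# Markov test, and the landmark subsample only for flagged transitions.
```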

2020, Vol 19

Multi-state models can successfully describe complicated event history data, for example the stages of a patient's disease progression. In these models, one important goal is the estimation of transition probabilities, since they allow long-term prediction of the process. Traditionally these quantities have been estimated by the Aalen-Johansen estimator, which is consistent if the process is Markovian. Recently, estimators have been proposed that outperform the Aalen-Johansen estimator in non-Markov situations. This paper considers a new proposal for estimating the transition probabilities in a multi-state system that is not necessarily Markovian. The proposed nonparametric product-limit estimator is defined in terms of counting processes, with the counts of transitions between states and the risk sets for leaving each state given in inverse-probability-of-censoring weighted form. Advantages and limitations of the different methods, together with some practical recommendations, are presented. We also introduce a graphical local test for the Markov assumption. Several simulation studies were conducted under different data scenarios. The proposed methods are illustrated with a real data set on colon cancer.
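The weighting idea can be sketched as follows; this is a minimal illustration of inverse-probability-of-censoring weighting (IPCW) under right censoring, not the paper's exact counting-process estimator, and it assumes numpy arrays with the state occupied at times s and t recorded for uncensored subjects.

```python
import numpy as np

def km_censoring(times, event):
    """Kaplan-Meier estimate of the censoring survival function G, treating
    censorings (event == 0) as the 'events'."""
    order = np.argsort(times)
    t_sorted, censored = times[order], (event[order] == 0)
    n, G, G_at = len(times), 1.0, {}
    for i, (u, c) in enumerate(zip(t_sorted, censored)):
        if c:
            G *= 1.0 - 1.0 / (n - i)   # n - i subjects still at risk at u
        G_at[u] = G
    return lambda t: min((g for u, g in G_at.items() if u <= t), default=1.0)

def ipcw_transition_prob(times, event, state_s, state_t, h, j, G):
    """P(Z_t = j | Z_s = h): among subjects in state h at time s, average the
    indicator of being in state j at time t, weighting uncensored subjects
    (event == 1) by 1 / G(T_i)."""
    w = event / np.maximum([G(u) for u in times], 1e-12)
    in_h = (state_s == h)
    return np.sum(w * in_h * (state_t == j)) / np.sum(w * in_h)
```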


Symmetry, 2021, Vol 13 (11), pp. 2164
Author(s): Héctor J. Gómez, Diego I. Gallardo, Karol I. Santoro

In this paper, we present an extension of the truncated positive normal (TPN) distribution for modelling positive data with high kurtosis. The new model is defined as the quotient of two random variables: a TPN-distributed numerator and a power of a standard uniform denominator. The resulting model has greater kurtosis than the TPN distribution. We study properties of the distribution such as moments, asymmetry and kurtosis. Parameters are estimated by the method of moments and by maximum likelihood, the latter via the expectation-maximization algorithm. We performed simulation studies to assess parameter recovery and illustrate the model with a real data application related to body weight. The computational implementation of this work is included in the tpn package for R.
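The quotient construction is easy to prototype. The sketch below assumes the denominator enters as U**(1/q) for a shape parameter q (a slash-type construction) and parameterizes the TPN by lam = mu/sigma; both are illustrative assumptions, not necessarily the paper's exact definitions.

```python
import numpy as np
from scipy.stats import truncnorm, kurtosis

def rtpn(n, sigma, lam, rng):
    """Truncated positive normal: N(mu, sigma^2) truncated to (0, inf),
    with lam = mu / sigma, so the standardized lower bound is -lam."""
    return truncnorm.rvs(-lam, np.inf, loc=lam * sigma, scale=sigma,
                         size=n, random_state=rng)

def rslash_tpn(n, sigma, lam, q, rng):
    """Quotient Z / U**(1/q), Z ~ TPN, U ~ Uniform(0, 1): smaller q gives
    heavier tails; the 4th moment needs q > 4 for the kurtosis to exist."""
    return rtpn(n, sigma, lam, rng) / rng.uniform(size=n) ** (1.0 / q)

rng = np.random.default_rng(1)
print(kurtosis(rtpn(200_000, 1.0, 1.0, rng)),             # baseline TPN
      kurtosis(rslash_tpn(200_000, 1.0, 1.0, 5.0, rng)))  # heavier-tailed quotient
```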


2019, Vol 29 (7), pp. 1972-1986
Author(s): Bo Chen, Keith A Lawson, Antonio Finelli, Olli Saarela

There is increasing interest in comparing institutions delivering healthcare in terms of disease-specific quality indicators (QIs) that capture processes or outcomes showing variations in the care provided. Such comparisons can be framed in terms of causal models, where adjusting for patient case-mix is analogous to controlling for confounding, and the exposure is, for instance, being treated in a given hospital. Our goal here is to help identify good QIs, rather than to compare hospitals in terms of an already chosen QI, so we focus on the presence and magnitude of overall variation in care between hospitals rather than on pairwise differences between any two hospitals. We consider how the observed variation in care received at the patient level can be decomposed into the part causally explained by hospital performance after adjusting for the case-mix, the part explained by the case-mix itself, and residual variation. For this purpose, we derive a three-way variance decomposition, with particular attention to its causal interpretation in terms of potential outcome variables. We propose model-based estimators for the decomposition, accommodating different link functions and either fixed or random effect models. We evaluate their performance in a simulation study and demonstrate their use in a real data application.
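As a toy illustration of the three-way split, consider a linear model with identity link in which the case-mix covariate is independent of hospital assignment; the paper's estimators go further, handling other link functions, fixed or random hospital effects, and the causal potential-outcome interpretation.

```python
import numpy as np

rng = np.random.default_rng(7)
n_hosp, n_per = 50, 200
a = rng.normal(0.0, 0.5, n_hosp)                # hospital effects ("performance")
hosp = np.repeat(np.arange(n_hosp), n_per)
x = rng.normal(size=n_hosp * n_per)             # patient case-mix covariate
y = a[hosp] + 0.8 * x + rng.normal(0.0, 1.0, len(x))

# With hospital independent of case-mix, Var(Y) splits three ways:
v_hospital = np.var(a[hosp])                     # causally attributable to hospitals
v_casemix = np.var(0.8 * x)                      # attributable to patient case-mix
v_residual = np.var(y) - v_hospital - v_casemix  # everything else
print(v_hospital, v_casemix, v_residual)         # approx 0.25, 0.64, 1.0
```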


Agronomy, 2020, Vol 10 (11), pp. 1702
Author(s): Sojung Kim, Evi Ofekeze, James R. Kiniry, Sumin Kim

Reducing the operational cost of a biofuel refinery is vitally important to making biofuel competitive with fossil fuels. The aim of this paper is to find a cost-efficient and sustainable refinery capacity for grain-based (i.e., corn-based) ethanol production, which will play an important role in promoting the widespread adoption and sustainable use of ethanol by improving the productivity of the overall refining process. Continuous-event simulation was used to model complex refinery operations such as the loading, unloading and treatment of feedstock over nine major phases (e.g., feedstock storage and handling, pretreatment and conditioning, fermentation and hydrolysis, and enzyme production) to produce ethanol. To improve the model's predictions, real data on corn yield in Tazewell County, Illinois, USA were used. The proposed simulation model is implemented in AnyLogic® University 8.6.0 simulation software (Chicago, IL, USA), and the (near-)optimal number of reactors for hydrolysis and fermentation is found via the optimization software OptQuest® (Boulder, CO, USA). The proposed approach found that six reactors gave the best daily profit, ranging from USD 67,500 to USD 82,217. This information will help engineers and policy makers adjust the capacity of a biofuel refinery to enhance system efficiency and ethanol production.
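The simulation-optimization loop can be caricatured in a few lines. The sketch below is a deliberately crude stand-in for the AnyLogic/OptQuest workflow: every number (throughput curve, prices, costs) is invented for illustration and is not the paper's calibration.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_daily_profit(n_reactors, n_days=2000):
    """Toy refinery model: extra reactors raise throughput with diminishing
    returns but add fixed operating cost; feedstock supply is stochastic."""
    capacity = 1.0 - np.exp(-0.5 * n_reactors)     # fraction of feedstock processed
    revenue = 120_000 * capacity * rng.uniform(0.8, 1.2, n_days)
    cost = 15_000 + 6_000 * n_reactors
    return float((revenue - cost).mean())

best = max(range(1, 13), key=mean_daily_profit)    # brute-force search over capacity
print(best, mean_daily_profit(best))
```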


2006, Vol 31 (1), pp. 1-33
Author(s): Sandip Sinharay

Bayesian networks are frequently used in educational assessments, primarily for learning about students' knowledge and skills, yet there is little work on assessing their fit. This article employs posterior predictive model checking, a popular Bayesian model-checking tool, to assess the fit of simple Bayesian networks. Several aspects of model fit of usual interest to practitioners are assessed using various diagnostic tools: the article suggests a direct data display for assessing overall fit, several diagnostics for assessing item fit, a graphical approach to examining whether the model can explain the association among the items, and a version of the Mantel-Haenszel statistic for assessing differential item functioning. Limited simulation studies and a real data application demonstrate the effectiveness of the suggested model diagnostics.
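Posterior predictive model checking itself is simple to sketch: draw parameters from the posterior, simulate replicated data, and compare a discrepancy statistic on replicated versus observed data. The toy below checks a conjugate Bernoulli model rather than a Bayesian network, purely to show the mechanics.

```python
import numpy as np

def ppp_value(y, posterior_draws, simulate, statistic, rng):
    """Posterior predictive p-value: P(T(y_rep) >= T(y)) over posterior draws."""
    t_obs = statistic(y)
    t_rep = np.array([statistic(simulate(theta, rng)) for theta in posterior_draws])
    return float(np.mean(t_rep >= t_obs))

rng = np.random.default_rng(3)
y = rng.binomial(1, 0.7, size=100)                        # observed 0/1 responses
draws = rng.beta(1 + y.sum(), 1 + len(y) - y.sum(), 500)  # Beta(1,1) posterior for p
print(ppp_value(y, draws,
                simulate=lambda p, r: r.binomial(1, p, size=100),
                statistic=np.mean, rng=rng))              # near 0.5 => no misfit
```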


2019, Vol 3 (Supplement_1), pp. S911-S911
Author(s): Tomiko Yoneda, Jonathan Rush, Nathan A Lewis, Jamie E Knight, Jinshil Hyun, et al.

Abstract: Although existing research shows that physical activity (PA) protects against cognitive decline, it is unclear whether maintaining PA throughout older adulthood influences the timing of onset of, or transitions through, cognitive states. Further understanding of modifiable lifestyle factors that protect against cognitive changes characteristic of both normal and pathological aging, such as Alzheimer's disease and other dementias, is imperative. Data were drawn from fourteen longitudinal studies of aging from Europe and America (total N=53,069). Controlling for demographics and chronic conditions, multi-state models were fit independently to each dataset to investigate the impact of PA (computed using the metabolic equivalent of task method) on the likelihood of transitioning through three cognitive states, while accounting for death as a competing risk. Random-effects meta-analysis of transition probabilities indicated that more PA was associated with a reduced risk of transitioning from normal cognition to mildly impaired cognition (HR=0.90, CI 0.84-0.97, p=0.007) and to death (HR=0.24, CI 0.06-0.92, p=0.04), as well as an increased likelihood of transitioning from severe impairment back to mild impairment (HR=1.09, CI 1.01-1.17, p=0.03). Meeting national minimum recommendations for PA (~150 minutes/week) increased total life expectancy for 70-year-old males and females by 4.08 and 5.47 years, respectively. These results suggest that engaging in at least 150 minutes of physical activity per week in older adulthood contributes to delays in the onset of mild cognitive impairment, substantially increases life expectancy, and may also diminish the symptoms that contribute to poor cognitive performance at the severely impaired stage.
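The pooling step described here is a standard random-effects meta-analysis of study-specific log hazard ratios. A minimal DerSimonian-Laird sketch with invented inputs (not the study's fourteen cohorts):

```python
import numpy as np

def dersimonian_laird(log_hr, se):
    """Random-effects meta-analysis of log hazard ratios; returns the pooled
    HR with a 95% confidence interval."""
    w = 1.0 / se**2
    fixed = np.sum(w * log_hr) / np.sum(w)
    q = np.sum(w * (log_hr - fixed) ** 2)                 # Cochran's Q
    tau2 = max(0.0, (q - (len(log_hr) - 1)) /
               (np.sum(w) - np.sum(w**2) / np.sum(w)))    # between-study variance
    w_star = 1.0 / (se**2 + tau2)
    est = np.sum(w_star * log_hr) / np.sum(w_star)
    half = 1.96 * np.sqrt(1.0 / np.sum(w_star))
    return np.exp(est), np.exp(est - half), np.exp(est + half)

print(dersimonian_laird(np.log([0.88, 0.93, 0.85, 0.95]),
                        np.array([0.05, 0.07, 0.06, 0.08])))  # illustrative only
```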


Biometrika, 2020
Author(s): Rong Ma, Ian Barnett

Summary: Modularity is a popular metric for quantifying the degree of community structure within a network. The distribution of the largest eigenvalue of a network's edge weight or adjacency matrix is well studied and is frequently used as a substitute for modularity when performing statistical inference. However, we show that the largest eigenvalue and modularity are asymptotically uncorrelated, which suggests the need for inference directly on modularity itself when the network is large. To this end, we derive the asymptotic distribution of modularity in the case where the network's edge weight matrix belongs to the Gaussian orthogonal ensemble, and we study the statistical power of the corresponding test for community structure under some alternative models. We empirically explore universality extensions of the limiting distribution and demonstrate the accuracy of these asymptotic distributions through type I error simulations. We also compare the empirical power of the modularity-based tests with that of some existing methods. Our method is then used to test for the presence of community structure in two real data applications.
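The two statistics being contrasted are easy to compute side by side. The sketch below evaluates Newman's modularity at a spectral two-way split of an Erdős-Rényi graph and prints it next to the largest adjacency eigenvalue; the paper's theory concerns Gaussian orthogonal ensemble edge weights and modularity maximized over all partitions, which this illustration does not attempt.

```python
import numpy as np

def modularity(A, labels):
    """Newman modularity: Q = (1/2m) * sum_ij (A_ij - k_i k_j / 2m) 1{c_i = c_j}."""
    k = A.sum(axis=1)
    two_m = k.sum()
    same = labels[:, None] == labels[None, :]
    return float(((A - np.outer(k, k) / two_m) * same).sum() / two_m)

rng = np.random.default_rng(2)
n = 200
A = (rng.random((n, n)) < 0.1).astype(float)
A = np.triu(A, 1); A = A + A.T                            # Erdos-Renyi adjacency
B = A - np.outer(A.sum(1), A.sum(1)) / A.sum()            # modularity matrix
labels = (np.linalg.eigh(B)[1][:, -1] >= 0).astype(int)   # spectral 2-way split
print(modularity(A, labels), np.linalg.eigvalsh(A)[-1])   # Q vs. largest eigenvalue
```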


2017, Vol 28 (1), pp. 151-169
Author(s): Abderrahim Oulhaj, Anouar El Ghouch, Rury R Holman

Composite endpoints are frequently used in clinical outcome trials because they capture more events, thereby increasing statistical power. A key requirement for a composite endpoint to be meaningful is the absence of so-called qualitative heterogeneity, which ensures a valid overall interpretation of any treatment effect identified. Qualitative heterogeneity occurs when individual components of a composite endpoint exhibit differences in the direction of a treatment effect. In this paper, we develop a general statistical method to test for qualitative heterogeneity, that is, to test whether a given set of parameters share the same sign. The method is based on the intersection-union principle and, provided the sample size is large, is valid whatever model is used for parameter estimation. We propose two versions of the testing procedure, one based on random sampling from a Gaussian distribution and the other based on bootstrapping. Our work covers both completely observed data and the case where some observations are censored, an important issue in many clinical trials. We evaluated the size and power of the proposed tests in extensive Monte Carlo simulations for multivariate time-to-event data, designed under a variety of conditions on dimensionality, censoring rate, sample size and correlation structure. The testing procedure performed very well in terms of statistical power and type I error. The proposed test was applied to a data set from a single-center, randomized, double-blind controlled trial in Alzheimer's disease.
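The intersection-union logic can be sketched directly: each component is tested one-sided at level alpha and, because the null is a union, no multiplicity correction is needed. The numbers below are invented for illustration; the paper's versions obtain critical values by Gaussian sampling or bootstrapping rather than this plain normal approximation.

```python
import numpy as np
from scipy.stats import norm

def iut_same_sign(est, se, alpha=0.05):
    """Intersection-union test that all effect estimates share one sign."""
    z = np.asarray(est) / np.asarray(se)
    crit = norm.ppf(1 - alpha)          # one-sided critical value per component
    if np.all(z > crit):
        return "evidence all effects positive"
    if np.all(z < -crit):
        return "evidence all effects negative"
    return "same sign not established (possible qualitative heterogeneity)"

# Treatment effects on three components of a composite endpoint (illustrative)
print(iut_same_sign([0.42, 0.31, 0.55], [0.12, 0.10, 0.16]))
```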


2018, Vol 20 (6), pp. 2055-2065
Author(s): Johannes Brägelmann, Justo Lorenzo Bermejo

Abstract: Technological advances and the reduced cost of high-density methylation arrays have led to an increasing number of association studies on the possible relationship between human disease and epigenetic variability. DNA samples from peripheral blood or other tissue types are analyzed in epigenome-wide association studies (EWAS) to detect methylation differences related to a particular phenotype. Since information on the cell-type composition of the sample is generally not available and methylation profiles are cell-type specific, statistical methods have been developed to adjust for cell-type heterogeneity in EWAS. In this study we systematically compared five popular adjustment methods: the factored spectrally transformed linear mixed model (FaST-LMM-EWASher), the sparse principal component analysis algorithm ReFACTor, surrogate variable analysis (SVA), independent SVA (ISVA) and an optimized version of SVA (SmartSVA). We used real data and applied a multilayered simulation framework to assess the type I error rate, the statistical power and the quality of estimated methylation differences according to major study characteristics. While all five adjustment methods improved false-positive rates compared with unadjusted analyses, FaST-LMM-EWASher had the lowest type I error rate at the expense of low statistical power. SVA efficiently corrected for cell-type heterogeneity in EWAS of up to 200 cases and 200 controls, but did not control type I error rates in larger studies. Results based on real data sets confirmed the simulation findings, with the strongest control of type I error rates by FaST-LMM-EWASher and SmartSVA. Overall, ReFACTor, ISVA and SmartSVA showed comparably good statistical power, quality of estimated methylation differences and runtime.
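The common thread among these adjustment methods is to estimate latent cell-type structure and include it as covariates. A crude SVA-flavoured sketch (principal components of phenotype-adjusted residuals; the real SVA/SmartSVA algorithms weight features and iterate, so this shows the idea only):

```python
import numpy as np

def sva_like_design(meth, pheno, n_sv=5):
    """Regress methylation on the phenotype, take top left singular vectors
    of the residual matrix as surrogate variables, and return a design
    matrix that includes them as covariates for the adjusted EWAS."""
    X = np.column_stack([np.ones(len(pheno)), pheno])
    beta, *_ = np.linalg.lstsq(X, meth, rcond=None)
    resid = meth - X @ beta
    U, _, _ = np.linalg.svd(resid, full_matrices=False)
    return np.column_stack([X, U[:, :n_sv]])

rng = np.random.default_rng(5)
pheno = rng.binomial(1, 0.5, 200)          # case/control status
meth = rng.normal(size=(200, 1000))        # samples x CpG sites (toy data)
print(sva_like_design(meth, pheno).shape)  # (200, 2 + 5)
```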

