Bi-level variable selection in semiparametric transformation mixture cure models for right-censored data

Author(s):  
Jingjing Wu ◽  
Xuewen Lu ◽  
Wenyan Zhong
Biometrika ◽  
2020 ◽  
Author(s):  
Sunyoung Shin ◽  
Yufeng Liu ◽  
Stephen R Cole ◽  
Jason P Fine

Summary: We consider scenarios in which the likelihood function for a semiparametric regression model factors into separate components, with an efficient estimator of the regression parameter available for each component. An optimal weighted combination of the component estimators, named an ensemble estimator, may be employed as an overall estimate of the regression parameter, and may be fully efficient under uncorrelatedness conditions. This approach is useful when the full likelihood function may be difficult to maximize, but the components are easy to maximize. It covers settings where the nuisance parameter may be estimated at different rates in the component likelihoods. As a motivating example we consider proportional hazards regression with prospective doubly censored data, in which the likelihood factors into a current status data likelihood and a left-truncated right-censored data likelihood. Variable selection is important in such regression modelling, but the applicability of existing techniques is unclear in the ensemble approach. We propose ensemble variable selection using the least squares approximation technique on the unpenalized ensemble estimator, followed by ensemble re-estimation under the selected model. The resulting estimator has the oracle property such that the set of nonzero parameters is successfully recovered and the semiparametric efficiency bound is achieved for this parameter set. Simulations show that the proposed method performs well relative to alternative approaches. Analysis of an AIDS cohort study illustrates the practical utility of the method.


2019 ◽  
Vol 29 (8) ◽  
pp. 2151-2166 ◽  
Author(s):  
Shuwei Li ◽  
Qiwei Wu ◽  
Jianguo Sun

Variable selection or feature extraction is fundamental to identifying important risk factors from a large number of covariates and has applications in many fields. In particular, its applications in failure time data analysis have been recognized and many methods have been proposed for right-censored data. However, developing relevant methods for variable selection becomes more challenging when one confronts interval censoring, which often occurs in practice. In this article, motivated by an Alzheimer’s disease study, we develop a variable selection method for interval-censored data with a general class of semiparametric transformation models. Specifically, a novel penalized expectation–maximization algorithm is developed to maximize the complex penalized likelihood function, which is shown to perform well in the finite-sample situation through a simulation study. The proposed methodology is then applied to the interval-censored data arising from the Alzheimer’s disease study mentioned above.


2021 ◽  
Author(s):  
Alexander Seipp ◽  
Verena Uslar ◽  
Dirk Weyhe ◽  
Antje Timmer ◽  
Fabian Otto‐Sobotka
