Variable selection for random effects two-part models

Random effects two-part models have been applied to longitudinal studies for zero-inflated (or semi-continuous) data, characterized by a large portion of zero values and continuous non-zero (positive) values. Examples include monthly medical costs, daily alcohol drinks, relative abundance of microbiome, etc. With the advance of information technology for data collection and storage, the number of variables available to researchers can be rather large in such studies. To avoid curse of dimensionality and facilitate decision making, it is critically important to select covariates that are truly related to the outcome. However, owing to its intricate nature, there is not yet a satisfactory variable selection method available for such sophisticated models. In this paper, we seek a feasible way of conducting variable selection for random effects two-part models on the basis of the recently proposed “minimum information criterion” (MIC) method. We demonstrate that the MIC formulation leads to a reasonable formulation of sparse estimation, which can be conveniently solved with SAS Proc NLMIXED. The performance of our approach is evaluated through simulation, and an application to a longitudinal alcohol dependence study is provided.

Download Full-text

Genetic Biomarker Selection for Obsessive-Compulsive Disorder by Sparse Representation Based Variable Selection Method

Journal of Psychiatry and Brain Science ◽

10.20900/jpbs.20160015 ◽

2016 ◽

Keyword(s):

Variable Selection ◽

Sparse Representation ◽

Obsessive Compulsive Disorder ◽

Selection Method ◽

Obsessive Compulsive ◽

Compulsive Disorder ◽

Variable Selection Method ◽

Selection For ◽

Genetic Biomarker ◽

Biomarker Selection

Download Full-text

A Variable Selection Method for Fault Isolation through Bayesian Information Criterion

2019 IEEE 8th Data Driven Control and Learning Systems Conference (DDCLS) ◽

10.1109/ddcls.2019.8909044 ◽

2019 ◽

Author(s):

Lang Liu ◽

Weidong Yang ◽

Hong Zhang ◽

Huijin Fan ◽

Bo Tao

Keyword(s):

Variable Selection ◽

Bayesian Information Criterion ◽

Information Criterion ◽

Selection Method ◽

Fault Isolation ◽

Variable Selection Method

Download Full-text

A Novel Variable Selection Method Based on a Partial KL Information Measure and Its Application to Channel Selection for Bioelectric Signal Classification

Transactions of the Society of Instrument and Control Engineers ◽

10.9746/sicetr.45.724 ◽

2009 ◽

Vol 45 (12) ◽

pp. 724-730 ◽

Cited By ~ 4

Author(s):

Taro SHIBANOKI ◽

Keisuke SHIMA ◽

Toshio TSUJI ◽

Takeshi TAKAKI ◽

Akira OTSUKA ◽

...

Keyword(s):

Variable Selection ◽

Channel Selection ◽

Selection Method ◽

Information Measure ◽

Signal Classification ◽

Variable Selection Method ◽

Bioelectric Signal ◽

Selection For

Download Full-text

Variable Selection Method Based on Partial Mutual Information and Its Application to NOx Emission Prediction

2020 39th Chinese Control Conference (CCC) ◽

10.23919/ccc50068.2020.9189070 ◽

2020 ◽

Author(s):

QIN Tianmu ◽

ZHANG Jinzhe ◽

YOU Mo ◽

YANG Tingting

Keyword(s):

Mutual Information ◽

Variable Selection ◽

Selection Method ◽

Nox Emission ◽

Variable Selection Method

Download Full-text

Predictive and Descriptive CoMFA Models: The Effect of Variable Selection

Combinatorial Chemistry & High Throughput Screening ◽

10.2174/1386207321666180212162028 ◽

2018 ◽

Vol 21 (2) ◽

pp. 117-124 ◽

Cited By ~ 4

Author(s):

Bakhtyar Sepehri ◽

Nematollah Omidikia ◽

Mohsen Kompany-Zareh ◽

Raouf Ghavami

Keyword(s):

Variable Selection ◽

Predictive Power ◽

Selection Method ◽

Data Sets ◽

Data Set ◽

Comfa Model ◽

Variable Selection Method

Aims & Scope: In this research, 8 variable selection approaches were used to investigate the effect of variable selection on the predictive power and stability of CoMFA models. Materials & Methods: Three data sets including 36 EPAC antagonists, 79 CD38 inhibitors and 57 ATAD2 bromodomain inhibitors were modelled by CoMFA. First of all, for all three data sets, CoMFA models with all CoMFA descriptors were created then by applying each variable selection method a new CoMFA model was developed so for each data set, 9 CoMFA models were built. Obtained results show noisy and uninformative variables affect CoMFA results. Based on created models, applying 5 variable selection approaches including FFD, SRD-FFD, IVE-PLS, SRD-UVEPLS and SPA-jackknife increases the predictive power and stability of CoMFA models significantly. Result & Conclusion: Among them, SPA-jackknife removes most of the variables while FFD retains most of them. FFD and IVE-PLS are time consuming process while SRD-FFD and SRD-UVE-PLS run need to few seconds. Also applying FFD, SRD-FFD, IVE-PLS, SRD-UVE-PLS protect CoMFA countor maps information for both fields.

Download Full-text

Variable selection for semiparametric random-effects conditional density models with longitudinal data

Communication in Statistics- Theory and Methods ◽

10.1080/03610926.2018.1554130 ◽

2018 ◽

Vol 49 (4) ◽

pp. 977-996

Author(s):

Xiaohui Yuan ◽

Yue Wang ◽

Tianqing Liu

Keyword(s):

Variable Selection ◽

Longitudinal Data ◽

Random Effects ◽

Conditional Density ◽

Selection For

Download Full-text

Performance of smoothly clipped absolute deviation as a variable selection method in the artificial neural network‐based QSAR studies

Journal of Chemometrics ◽

10.1002/cem.3338 ◽

2021 ◽

Author(s):

Zeinab Mozafari ◽

Mansour Arab Chamjangali ◽

Mohammad Arashi ◽

Nasser Goudarzi

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Variable Selection ◽

Selection Method ◽

Absolute Deviation ◽

Qsar Studies ◽

Variable Selection Method ◽

Smoothly Clipped Absolute Deviation ◽

Artificial Neural

Download Full-text

Variable Selection in the Regularized Simultaneous Component Analysis Method for Multi-Source Data Integration

Scientific Reports ◽

10.1038/s41598-019-54673-2 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 1

Author(s):

Zhengguo Gu ◽

Niek C. de Schipper ◽

Katrijn Van Deun

Keyword(s):

Variable Selection ◽

Data Integration ◽

Component Analysis ◽

Selection Method ◽

Gps Data ◽

Positioning Systems ◽

Variable Selection Method ◽

Diary Data ◽

Simultaneous Component Analysis ◽

Travel Diary

AbstractInterdisciplinary research often involves analyzing data obtained from different data sources with respect to the same subjects, objects, or experimental units. For example, global positioning systems (GPS) data have been coupled with travel diary data, resulting in a better understanding of traveling behavior. The GPS data and the travel diary data are very different in nature, and, to analyze the two types of data jointly, one often uses data integration techniques, such as the regularized simultaneous component analysis (regularized SCA) method. Regularized SCA is an extension of the (sparse) principle component analysis model to the cases where at least two data blocks are jointly analyzed, which - in order to reveal the joint and unique sources of variation - heavily relies on proper selection of the set of variables (i.e., component loadings) in the components. Regularized SCA requires a proper variable selection method to either identify the optimal values for tuning parameters or stably select variables. By means of two simulation studies with various noise and sparseness levels in simulated data, we compare six variable selection methods, which are cross-validation (CV) with the “one-standard-error” rule, repeated double CV (rdCV), BIC, Bolasso with CV, stability selection, and index of sparseness (IS) - a lesser known (compared to the first five methods) but computationally efficient method. Results show that IS is the best-performing variable selection method.

Download Full-text