Score-Guided Structural Equation Model Trees
Structural equation model (SEM) trees are data-driven tools for finding covariates that predict group differences in the parameters of an SEM. SEM trees build upon the decision tree paradigm by growing tree structures that recursively divide a data set into homogeneous subsets. Currently, selecting a split variable among the covariates requires calculating a likelihood ratio for each possible split of each covariate. Obtaining these likelihood ratios is computationally intensive because the model must be refitted for every candidate split. Moreover, comparing the maximum likelihood ratios across covariates biases the selection process by favoring covariates with many different values. Several correction procedures for this selection bias have been proposed. Unfortunately, these procedures either reduce statistical power to detect group differences or impose an additional computational burden. As a remedy, we propose guiding the construction of SEM trees with a family of score-based tests instead of likelihood ratios. These score-based tests monitor fluctuations in the case-wise derivatives of the log-likelihood function, also called scores, to detect parameter differences between groups. In contrast to the likelihood-ratio approach, score-based tests are computationally efficient because they do not require refitting the model for every possible split, offer an unbiased selection of covariates, and have high statistical power. In this paper, we introduce score-guided SEM trees and their implementation in the R package semtree, and we evaluate their performance by means of a Monte Carlo simulation.
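To make the idea of monitoring score fluctuations concrete, the following is a minimal sketch in Python, not the semtree implementation. It makes several simplifying assumptions that are not from the paper: a univariate Gaussian model (mean and variance) stands in for an SEM, the information matrix is estimated by the outer product of the scores, and a double-maximum statistic is used as one plausible member of the family of score-based tests. The function name `score_process_statistic` and the simulated data are hypothetical. The model is fitted once; the case-wise scores are then ordered by the covariate and accumulated, and a large excursion of the cumulative process indicates parameter differences along that covariate.

```python
import numpy as np


def score_process_statistic(x, covariate):
    """Double-maximum statistic of the cumulative score process.

    Illustrative stand-in: a Gaussian model with parameters (mu, sigma2)
    replaces the SEM. The model is fitted only once on the full sample.
    """
    x = np.asarray(x, dtype=float)
    n = x.size

    # Maximum likelihood estimates from a single fit (no refitting per split).
    mu = x.mean()
    sigma2 = x.var()  # ddof=0, i.e. the ML estimate

    # Case-wise derivatives of the log-likelihood with respect to mu and sigma2.
    scores = np.column_stack([
        (x - mu) / sigma2,
        -0.5 / sigma2 + (x - mu) ** 2 / (2.0 * sigma2 ** 2),
    ])

    # Order the cases by the candidate split covariate.
    order = np.argsort(covariate, kind="stable")
    scores = scores[order]

    # Outer-product estimate of the information matrix and its inverse square root,
    # used to decorrelate and standardize the score components.
    info = scores.T @ scores / n
    eigval, eigvec = np.linalg.eigh(info)
    inv_sqrt = eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

    # Scaled cumulative score process; it returns to zero at the last case
    # because the scores sum to zero at the ML estimates.
    process = np.cumsum(scores, axis=0) @ inv_sqrt / np.sqrt(n)

    # Largest absolute excursion over all cases and all parameters.
    return np.abs(process).max(), process


# Simulated example (hypothetical data): a mean shift for cases with z > 0.5.
rng = np.random.default_rng(1)
n = 400
z = rng.uniform(size=n)
x_null = rng.normal(size=n)
x_shift = x_null + (z > 0.5) * 1.0

stat_null, process_null = score_process_statistic(x_null, z)
stat_shift, process_shift = score_process_statistic(x_shift, z)
print("no group difference:", round(stat_null, 2))
print("mean shift along z: ", round(stat_shift, 2))
```

In this sketch the statistic for the shifted data should come out clearly larger than for the data without a group difference, and neither computation refits the model for individual split points. Turning the statistic into a p-value would require the critical values of the corresponding limiting process, which are omitted here.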