Unique variable analysis: A novel approach for detecting redundant variables in multivariate data
One common approach for constructing tests that measure a single attribute is the semantic similarity approach, in which items vary only slightly in their wording and content. Although this strategy is effective for ensuring high internal consistency, the information in such tests may become redundant or, worse, confound the interpretation of the test scores. With the advent of network models, where tests represent a complex system and components (usually items) represent causally autonomous features, redundant variables may have unintended effects on the interpretation of network metrics. These issues motivated the development of a novel approach called Unique Variable Analysis (UVA), which detects redundant variables in multivariate data. The goal of UVA is to statistically identify potential redundancies in multivariate data so that researchers can make informed decisions about how best to handle them. Using a Monte Carlo simulation approach, we generated multivariate data with redundancies modeled on known real-world examples. We then demonstrated the effects that redundancy can have on the accurate estimation of dimensions. Next, we evaluated UVA's ability to detect redundant variables in the simulated data. Based on these results, we provide a tutorial for applying UVA to real-world data. Our example data demonstrate that redundant variables produce inaccurate estimates of dimensional structure but that, after applying UVA, the expected structure can be recovered. In sum, our study suggests that redundancy, if left unchecked, can have substantial effects on validity, and that redundancy assessment should be integrated into standard validation practices.
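To make the idea of statistically identifying redundancy concrete: one measure commonly used for this purpose in network psychometrics is weighted topological overlap (wTO), which flags pairs of variables that share both a strong direct connection and similar connection profiles with all other variables. The sketch below is an illustration only, not the authors' implementation (UVA is distributed in R); it assumes the absolute correlation matrix as the network, and the `cutoff` value is illustrative rather than a published default.

```python
import numpy as np

def weighted_topological_overlap(W):
    """Pairwise weighted topological overlap (wTO) for a weighted
    network matrix W (here, absolute correlations with a zeroed
    diagonal). Higher values indicate variable pairs with a strong
    direct link and similar connections to all other variables --
    a statistical signal of potential redundancy."""
    A = np.abs(np.asarray(W, dtype=float))
    np.fill_diagonal(A, 0.0)
    k = A.sum(axis=1)                        # node strengths
    numer = A @ A + A                        # shared neighbors + direct link
    denom = np.minimum.outer(k, k) + 1.0 - A
    wto = numer / denom
    np.fill_diagonal(wto, 0.0)
    return wto

def flag_redundant_pairs(data, cutoff=0.25):
    """Return (i, j, wto) triples for variable pairs whose wTO
    exceeds the (illustrative) cutoff."""
    R = np.corrcoef(data, rowvar=False)
    wto = weighted_topological_overlap(R)
    i, j = np.triu_indices_from(wto, k=1)
    return [(a, b, wto[a, b]) for a, b in zip(i, j) if wto[a, b] > cutoff]

# Toy usage: variable 1 is a noisy near-duplicate of variable 0,
# so the pair (0, 1) should stand out.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 4))
x[:, 1] = x[:, 0] + 0.1 * rng.normal(size=500)
flagged = flag_redundant_pairs(x)
```

In the real UVA procedure, the network is estimated with a regularized method rather than raw correlations, and flagged pairs are presented to the researcher for a decision (e.g., removing or combining items) rather than being dropped automatically.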