Validating standard scales in the particular sample under study is essential for drawing accurate conclusions. We investigated differences in the response patterns on the Brief Symptom Inventory (BSI), the Transpersonal Trust Scale (TPV), the Sense of Coherence Questionnaire (SOC), and a Social Support Scale (F-SoZu) in a matched sample of spiritually practicing (SP) and nonpracticing (NSP) individuals at two measurement points (t1, t2). Using a sample matching procedure based on propensity scores, we selected two sociodemographically balanced subsamples of N = 120 from a total sample of N = 431. Repeated measures ANOVAs showed a difference in means between the samples only for the TPV and a difference within the samples over time only for the F-SoZu. Additionally, a group × time interaction effect was found for the TPV. While Cronbach’s α was acceptable and comparable in both samples, the test-retest reliability of the BSI was significantly lower in the SP sample (rSP = .62; rNSP = .78). Thus, research on the effects of spiritual practice should consider not only differences in means but also stability over time. We recommend propensity score matching as an alternative to randomization for variables that defy experimental manipulation, such as spirituality.
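The abstract does not state which test was used to compare the two test-retest coefficients. One common option for correlations from independent samples is Fisher's r-to-z transformation, sketched below in Python. The function name `compare_independent_correlations` is hypothetical, and the group size of 120 per sample is an assumption about how "two subsamples of N = 120" should be read, not a detail confirmed by the text.

```python
import math


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for two correlations from independent samples.

    Returns the z statistic and the two-sided p-value.
    Illustrative sketch only; not necessarily the authors' procedure.
    """
    # Fisher transformation: atanh(r) is approximately normal
    # with variance 1 / (n - 3).
    z1 = math.atanh(r1)
    z2 = math.atanh(r2)

    # Standard error of the difference between the transformed values.
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se

    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) equals erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p


# Reported BSI test-retest coefficients; n = 120 per group is ASSUMED.
z_stat, p_value = compare_independent_correlations(0.62, 120, 0.78, 120)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.3f}")
```

The result depends on the group sizes, so the assumed value of 120 per sample should be replaced with the actual sizes before drawing any conclusion from the output.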