Abstract
Background:We propose and evaluate the approximation formulae for the 95% confidence intervals (CIs) of the sensitivity and specificity and a formula to estimate sample size in a validation study with stratified sampling where positive samples satisfying the outcome definition and negative samples that do not are selected with different extraction fractions. Methods:We used the delta method to derive the approximation formulae for estimating the sensitivity and specificity and their CIs. From those formulae, we derived the formula to estimate the size of negative samples required to achieve the intended precision and the formula to estimate the precision for a negative sample size arbitrarily selected by the investigator. We conducted simulation studies in a population where 4% were outcome definition positive, the positive predictive value (PPV)=0.8, and the negative predictive value (NPV)=0.96, 0.98 and 0.99. The size of negative samples, n0, was either selected to make the 95% CI fall within ± 0.1, 0.15 and 0.2 or set arbitrarily as 150, 300 and 600. We assumed a binomial distribution for the positive and negative samples. The coverage of the 95% CIs of the sensitivity and specificity was calculated as the proportion of CIs including the sensitivity and specificity in the population, respectively. For selected studies, the coverage was also estimated by the bootstrap method. The sample size was evaluated by examining whether the observed precision was within the pre-specified value.Results:For the sensitivity, the coverage of the approximated 95% CIs was larger than 0.95 in most studies but in 9 of 18 selected studies derived by the bootstrap method. For the specificity, the coverage of the approximated 95% CIs was approximately 0.93 in most studies, but the coverage was more than 0.95 in all 18 studies derived by the bootstrap method. The calculated size of negative samples yielded precisions within the pre-specified values in most of the studies.Conclusion:The approximation formulae for the 95% CIs of the sensitivity and specificity for stratified validation studies are presented. These formulae will help in conducting and analysing validation studies with stratified sampling.