Power and Sample Size Calculation of Comparative Diagnostic Accuracy Studies with Multiple Correlated Test Results

2005 ◽  
Vol 47 (2) ◽  
pp. 140-150 ◽  
Author(s):  
Aiyi Liu ◽  
Enrique F. Schisterman ◽  
Madhu Mazumdar ◽  
Jiang Hu

2020 ◽  
Vol 29 (10) ◽  
pp. 2958-2971 ◽  
Author(s):  
Maria Stark ◽  
Antonia Zapf

Introduction: In a confirmatory diagnostic accuracy study, sensitivity and specificity are considered as co-primary endpoints. For the sample size calculation, the prevalence of the target population must be taken into account to obtain a representative sample. In this context, a general problem arises. With a low or high prevalence, the study may be overpowered in one subpopulation. One further issue is the correct pre-specification of the true prevalence. With an incorrect assumption about the prevalence, an over- or underestimated sample size will result. Methods: To obtain the desired power independent of the prevalence, a method for an optimal sample size calculation for the comparison of a diagnostic experimental test with a prespecified minimum sensitivity and specificity is proposed. To face the problem of an incorrectly pre-specified prevalence, a blinded one-time re-estimation design of the sample size based on the prevalence and a blinded repeated re-estimation design of the sample size based on the prevalence are evaluated by a simulation study. Both designs are compared to a fixed design and additionally among each other. Results: The type I error rates of both blinded re-estimation designs are not inflated. Their empirical overall power equals the desired theoretical power and both designs offer unbiased estimates of the prevalence. The repeated re-estimation design reveals no advantages concerning the mean squared error of the re-estimated prevalence or sample size compared to the one-time re-estimation design. The appropriate size of the internal pilot study in the one-time re-estimation design is 50% of the initially calculated sample size. Conclusions: A one-time re-estimation design of the prevalence based on the optimal sample size calculation is recommended in single-arm diagnostic accuracy studies.
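How prevalence drives the total sample size in the abstract above can be illustrated with a rough sketch. It is not the optimal method proposed by the authors. It assumes a one-sided normal-approximation test of each endpoint against a minimum value, applies the power to each endpoint separately, and scales the required numbers of diseased and non-diseased subjects by the assumed prevalence. All function names and example values are hypothetical.

```python
import math
from statistics import NormalDist


def n_one_sample_proportion(p0, p1, alpha=0.025, power=0.9):
    """Subjects needed to show a proportion exceeds p0 when its true value is p1.

    One-sided test, normal approximation (an assumption of this sketch).
    """
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)
    numerator = z_a * math.sqrt(p0 * (1 - p0)) + z_b * math.sqrt(p1 * (1 - p1))
    return math.ceil((numerator / (p1 - p0)) ** 2)


def total_sample_size(se0, se1, sp0, sp1, prevalence, alpha=0.025, power=0.9):
    """Return (n_diseased, n_non_diseased, total) for an assumed prevalence."""
    n_dis = n_one_sample_proportion(se0, se1, alpha, power)  # sensitivity endpoint
    n_non = n_one_sample_proportion(sp0, sp1, alpha, power)  # specificity endpoint
    # The total must be large enough that both subpopulations reach their target.
    total = max(math.ceil(n_dis / prevalence), math.ceil(n_non / (1 - prevalence)))
    return n_dis, n_non, total


# Hypothetical inputs: minimum 0.70 and expected 0.85 for both endpoints.
for prev in (0.1, 0.3, 0.5):
    print(prev, total_sample_size(0.70, 0.85, 0.70, 0.85, prev))
```

With a low assumed prevalence, the total is dictated by the diseased subgroup, so far more non-diseased subjects are recruited than the specificity endpoint needs. This is the overpowering in one subpopulation that the abstract describes.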


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Amra Hot ◽  
Patrick M. Bossuyt ◽  
Oke Gerke ◽  
Simone Wahl ◽  
Werner Vach ◽  
...  

Abstract Background: Diagnostic accuracy studies aim to examine the diagnostic accuracy of a new experimental test, but do not address the actual merit of the resulting diagnostic information to a patient in clinical practice. In order to assess the impact of diagnostic information on subsequent treatment strategies regarding patient-relevant outcomes, randomized test-treatment studies were introduced. Various designs for randomized test-treatment studies, including an evaluation of biomarkers as part of randomized biomarker-guided treatment studies, are suggested in the literature, but the nomenclature is not consistent. Methods: The aim was to provide a clear description of the different study designs within a pre-specified framework, considering their underlying assumptions, advantages as well as limitations and derivation of effect sizes required for sample size calculations. Furthermore, an outlook on adaptive designs within randomized test-treatment studies is given. Results: The need to integrate adaptive design procedures in randomized test-treatment studies is apparent. The derivation of effect sizes implies that sample size calculation will always be based on rather vague assumptions, resulting in over- or underpowered studies. Therefore, it might be advantageous to conduct a sample size re-estimation based on a nuisance parameter during the ongoing trial. Conclusions: Due to their increased complexity compared to common treatment trials, the implementation of randomized test-treatment studies poses practical challenges, including considerable uncertainty regarding study parameters such as the expected outcome in specific subgroups or disease prevalence, which might affect the sample size calculation. Since research on adaptive designs within randomized test-treatment studies is limited so far, further research is recommended.


Author(s):  
Daniël A. Korevaar ◽  
Gowri Gopalakrishna ◽  
Jérémie F. Cohen ◽  
Patrick M. Bossuyt

Abstract Most randomized controlled trials evaluating medical interventions have a pre-specified hypothesis, which is statistically tested against the null hypothesis of no effect. In diagnostic accuracy studies, study hypotheses are rarely pre-defined and sample size calculations are usually not performed, which may jeopardize scientific rigor and can lead to over-interpretation or “spin” of study findings. In this paper, we propose a strategy for defining meaningful hypotheses in diagnostic accuracy studies. Based on the role of the index test in the clinical pathway and the downstream consequences of test results, the consequences of test misclassifications can be weighed, to arrive at minimally acceptable criteria for pre-defined test performance: levels of sensitivity and specificity that would justify the test’s intended use. Minimally acceptable criteria for test performance should form the basis for hypothesis formulation and sample size calculations in diagnostic accuracy studies.


2021 ◽  
Vol 31 (2) ◽  
pp. 44
Author(s):  
Naura Ega Kahayani ◽  
Hermanto Tri Joewono ◽  
Rio Wironegoro

Background: Obesity in pregnant women has been linked to negative outcomes for both the mother and the fetus. Gestational diabetes, preeclampsia, and postpartum haemorrhage are all threats for mothers during pregnancy. These may also affect labor outcomes because fat accumulation in the pelvis makes obese pregnant women more likely to have a caesarean section. Objective: To analyze the outcome of maternal obesity in pregnancy in 2017. Materials and Methods: This was an analytic cross-sectional, observational study in pregnant women with obesity in Dr. Soetomo General Academic Hospital, Surabaya, Indonesia, within the period of January 1st – December 31st 2017 using medical records. Sample size was determined using a cross-sectional sample size calculation, which gave n=162. After ethical clearance was approved, medical records were collected and analyzed. The statistical analysis was done using the chi-square test. Results: Of the 162 subjects, most were 21-34 years old (56.8%) and under graduate (90.7%), and 58% of the subjects worked. The proportions of obesity grades I, II and III were 60.5%, 22.2%, and 17.3%, and 67% suffered from complications: PE (56.2%), DM (4.9%), and PE with DM (6.2%). Neonatal complications were preterm birth 42.6%, macrosomia 4.3%, stillbirth 4.9%, and low Apgar score 48.1%. No association was identified between maternal age and complications in grade I (p=0.764) and grade III (p=0.716). Obesity grades I and III were correlated with complications (p=0.035). Conclusion: No significant association was found between age and complications in grade III obesity, while there was a significant correlation between obesity levels (grade I and grade III) and complications.


2020 ◽  
Vol 6 (1) ◽  
pp. e000772 ◽  
Author(s):  
Rahel Caliesch ◽  
Martin Sattelmayer ◽  
Stephan Reichenbach ◽  
Marcel Zwahlen ◽  
Roger Hilfiker

Objectives: To determine the diagnostic accuracy of clinical tests for cam or pincer morphology in individuals with suspected femoroacetabular impingement (FAI) syndrome and to evaluate their clinical utility. Design: A systematic review of studies investigating the diagnostic accuracy of clinical tests for cam and pincer morphology. Data sources: PubMed, Embase, CINAHL and SPORTDiscus. Eligibility criteria for selecting studies: Studies investigating the diagnostic accuracy of clinical tests for cam, pincer or mixed morphology in symptomatic patients. Patients had to undergo an index test and a reference test able to identify cam or pincer morphology. Study results had to allow the calculation of true or false positives and/or negatives to calculate sensitivity, specificity, likelihood ratios (LR) and post-test probabilities. Results: Eight studies were included, investigating 17 tests and two test combinations. The studies reported a low specificity for all tests, ranging from 0.11 to 0.56. Sensitivity ranged from 0.11 to 1.00, with high sensitivities for the flexion-adduction-internal rotation (FADIR), foot progression angle walking (FPAW) and maximal squat tests. We estimated that negative test results on all of these three tests would result in a negative LR of 0.15. However, we judged the studies to provide low-quality evidence. Conclusion: There is low-quality evidence that negative test results reduce the post-test probability of cam or mixed morphologies and that consecutive testing with the FADIR, FPAW and maximal squat tests might be used as a clinical test combination. We would not recommend their use to confirm the diagnosis of FAI syndrome. PROSPERO registration number: CRD42018079116.
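The link between a negative likelihood ratio and the post-test probability mentioned in the abstract above can be sketched with the standard odds form of Bayes' theorem. The negative LR of 0.15 is taken from the abstract. The pre-test probability of 0.5 is a purely illustrative assumption, and the function names are hypothetical.

```python
def negative_lr(sensitivity, specificity):
    """Negative likelihood ratio: (1 - sensitivity) / specificity."""
    return (1 - sensitivity) / specificity


def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Assumed pre-test probability of 0.5 combined with the reported negative LR of 0.15.
print(round(post_test_probability(0.5, 0.15), 3))  # 0.13
```

Under that assumed pre-test probability, three negative results would lower the probability of cam or mixed morphology from 50% to about 13%. A different pre-test probability gives a different result.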


2017 ◽  
Vol 23 (5) ◽  
pp. 644-646 ◽  
Author(s):  
Maria Pia Sormani

The calculation of the sample size needed for a clinical study is the challenge most frequently put to statisticians, and it is one of the most relevant issues in the study design. The correct size of the study sample optimizes the number of patients needed to get the result, that is, to detect the minimum treatment effect that is clinically relevant. Minimizing the sample size of a study has the advantage of reducing costs, enhancing feasibility, and also has ethical implications. In this brief report, I will explore the main concepts on which the sample size calculation is based.
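As a minimal illustration of the concepts the report above refers to, the sketch below uses the textbook normal-approximation formula for comparing two means, n per group = 2 (z_{1-α/2} + z_{power})² σ² / δ². The formula is a standard one and is not taken from the report itself. The function name and input values are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(delta, sigma, alpha=0.05, power=0.8):
    """Patients per arm to detect a mean difference delta with a two-sided test.

    Normal approximation with equal group sizes and common standard deviation sigma.
    """
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_a + z_b) * sigma / delta) ** 2)


# Minimum clinically relevant effect of half a standard deviation.
print(n_per_group(delta=0.5, sigma=1.0))   # 63
# Halving the effect to be detected roughly quadruples the sample size.
print(n_per_group(delta=0.25, sigma=1.0))  # 252
```

The second call shows why the choice of the minimum clinically relevant effect dominates the calculation: the required number of patients grows with the inverse square of that effect.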


1994 ◽  
Vol 13 (8) ◽  
pp. 859-870 ◽  
Author(s):  
Robert P. McMahon ◽  
Michael Proschan ◽  
Nancy L. Geller ◽  
Peter H. Stone ◽  
George Sopko

2007 ◽  
Vol 77 (5) ◽  
pp. 773-778 ◽  
Author(s):  
Lars Bondemark ◽  
Jola Tsiopa

Abstract Objective: To elucidate the prevalence of ectopic eruption, impaction, and primary and secondary retention as well as agenesis of the permanent second molar (M2) among adolescents. Materials and Methods: After a sample size calculation, dental records, including radiographs, of 1543 patients (722 girls and 821 boys), from three clinics in the city of Malmoe, Sweden, were retrospectively analyzed. Series of annual records and radiographs, covering the ages of 10 to 16 years, were examined for all patients; the examinations were carried out during 2004–2006. The prevalence of ectopic eruption, impaction, and primary and secondary retention as well as agenesis of M2s was registered in a standardized manner and according to preset definitions. In addition, the times of emergence of the M2s were recorded. Results: The prevalence of ectopic eruption of M2 was 1.5%, the prevalence of primary retention was 0.6%, and the prevalence of impaction was 0.2%. This means that the overall prevalence of eruption disturbances was 2.3%. In addition, the prevalence of agenesis was 0.8%. The prevalence of ectopic eruption was significantly higher in the mandible. Those patients with eruption disturbances and agenesis of M2 showed significantly delayed eruption of their other M2s compared to the individuals without any eruption disturbances. Conclusions: The prevalence of eruption disturbances was higher than reported earlier, and, even if the disturbances do not occur frequently, it is important to establish an early diagnosis in order to start the treatment at the optimal time.

