Systematic reviews do not (yet) represent the ‘gold standard’ of evidence: A position paper

2022 ◽  
Author(s):  
Robert Andrew Moore ◽  
Emma Fisher ◽  
Christopher Eccleston
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Katie O’Hearn ◽  
Cameron MacDonald ◽  
Anne Tsampalieros ◽  
Leo Kadota ◽  
Ryan Sandarage ◽  
...  

Abstract Background Standard practice for conducting systematic reviews (SRs) is time-consuming and involves the study team screening hundreds or thousands of citations. As the volume of medical literature grows, the citation set sizes and corresponding screening efforts increase. While larger team size and alternate screening methods have the potential to reduce workload and decrease SR completion times, it is unknown whether investigators adapt team size or methods in response to citation set sizes. Using a cross-sectional design, we sought to understand how citation set size impacts (1) the total number of authors or individuals contributing to screening and (2) screening methods. Methods MEDLINE was searched in April 2019 for SRs on any health topic. A total of 1880 unique publications were identified and sorted into five citation set size categories (after deduplication): < 1,000, 1,001–2,500, 2,501–5,000, 5,001–10,000, and > 10,000. A random sample of 259 SRs was selected (~ 50 per category) for data extraction and analysis. Results With the exception of the pairwise t test comparing the under 1,000 and over 10,000 categories (median 5 vs. 6, p = 0.049), no statistically significant relationship was evident between author number and citation set size. While visual inspection was suggestive, statistical testing did not consistently identify a relationship between citation set size and number of screeners (title-abstract, full text) or data extractors. However, logistic regression identified that investigators were significantly more likely to deviate from gold-standard screening methods (i.e. independent duplicate screening) with larger citation sets. For every doubling of citation set size, the odds of using gold-standard screening decreased by 15% and 20% at title-abstract and full-text review, respectively. Finally, few SRs reported using crowdsourcing (n = 2) or computer-assisted screening (n = 1). 
Conclusions Large citation set sizes present a challenge to SR teams, especially when faced with time-sensitive health policy questions. Our study suggests that with increasing citation set size, authors are less likely to adhere to gold-standard screening methods. It is possible that adjunct screening methods, such as crowdsourcing (large team) and computer-assisted technologies, may provide a viable solution for authors to complete their SRs in a timely manner.
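
The "per doubling" result in this abstract can be read as an odds ratio of roughly 0.85 (title-abstract) and 0.80 (full text) for each doubling of citation set size. The following is a minimal sketch of how that compounds over larger increases. It assumes the predictor entered the logistic regression on a log2 scale, which the abstract implies but does not state; the function name and the example fold increases are illustrative only.

```python
import math


def odds_multiplier(or_per_doubling: float, fold_increase: float) -> float:
    """Multiplicative change in odds for a given fold increase in citation set size.

    Assumption (not stated in the abstract): citation set size was modelled on a
    log2 scale, so every doubling multiplies the odds by the same factor.
    """
    if fold_increase <= 0:
        raise ValueError("fold_increase must be positive")
    doublings = math.log2(fold_increase)
    return or_per_doubling ** doublings


# Odds ratios per doubling implied by the reported 15% and 20% decreases.
OR_TITLE_ABSTRACT = 0.85
OR_FULL_TEXT = 0.80

if __name__ == "__main__":
    for fold in (2, 4, 8):
        ta = odds_multiplier(OR_TITLE_ABSTRACT, fold)
        ft = odds_multiplier(OR_FULL_TEXT, fold)
        print(f"{fold}x citations: title-abstract odds x{ta:.3f}, full-text odds x{ft:.3f}")
```

Note that these are changes in odds, not in probability; converting to a probability would require a baseline rate that the abstract does not report.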


2019 ◽  
Vol 36 ◽  
Author(s):  
Lucas de Francisco CARVALHO ◽  
Giselle PIANOWSKI ◽  
Manoel Antônio dos SANTOS

Abstract A systematic review is a literature review guided by scientific methods explicitly intended to reduce bias, resulting in a synthesis of all relevant evidence on a given issue. In Brazil, specifically in Psychology, systematic reviews are found in the literature; however, the available studies do not always reflect the gold standard or what is expected in terms of typical systematic review procedures. The present study is structured as a didactic guide, organized into the topics that should typically be addressed in a systematic review in Psychology. The information that each topic must contain is indicated, including which procedures should be performed in the typical steps of developing a systematic review. The present publication intends to increase the interest and investment of researchers in systematic reviews, providing them with information to improve the quality of systematic reviews in the area of Psychology in Brazil.


Author(s):  
Jacob Stegenga

An astonishing volume and diversity of evidence is available for many hypotheses in medicine. Some of this evidence—usually from randomized trials—is amalgamated by meta-analysis. Despite the ongoing debate regarding whether or not randomized trials are the gold standard of evidence, the most reliable source of evidence in medical science is usually thought to come from systematic reviews and meta-analyses. This chapter argues that meta-analyses are malleable. Different meta-analyses of the same evidence can reach contradictory conclusions. Meta-analysis fails to provide objective grounds for assessing the effectiveness and harms of medical interventions because numerous decisions must be made when performing a meta-analysis, which allow wide latitude for subjective idiosyncrasies to influence its outcome.


2018 ◽  
Vol 34 (6) ◽  
pp. 547-554 ◽  
Author(s):  
Mick Arber ◽  
Julie Glanville ◽  
Jaana Isojarvi ◽  
Erin Baragula ◽  
Mary Edwards ◽  
...  

Objectives: This study investigated which databases and which combinations of databases should be used to identify economic evaluations (EEs) to inform systematic reviews. It also investigated the characteristics of studies not identified in database searches and evaluated the success of MEDLINE search strategies used within typical reviews in retrieving EEs in MEDLINE. Methods: A quasi-gold standard (QGS) set of EEs was collected from reviews of EEs. The number of QGS records found in nine databases was calculated and the most efficient combination of databases was determined. The number and characteristics of QGS records not retrieved from the databases were collected. Reproducible MEDLINE strategies from the reviews were rerun to calculate the sensitivity and precision for each strategy in finding QGS records. Results: The QGS comprised 351 records. Across all databases, 337/351 (96 percent) QGS records were identified. Embase yielded the most records (314; 89 percent). Four databases were needed to retrieve all 337 references: Embase + Health Technology Assessment database + (MEDLINE or PubMed) + Scopus. Four percent (14/351) of records could not be found in any database. Twenty-nine of forty-one (71 percent) reviews reported a reproducible MEDLINE strategy. Ten of twenty-nine (34.5 percent) of the strategies missed at least one QGS record in MEDLINE. Across all twenty-nine MEDLINE searches, 25/143 records were missed (17.5 percent). Mean sensitivity was 89 percent and mean precision was 1.6 percent. Conclusions: Searching beyond key databases for published EEs may be inefficient, providing the search strategies in those key databases are adequately sensitive. Additional search approaches should be used to identify unpublished evidence (grey literature).
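
The proportions reported in this abstract can be reproduced from the stated counts. The sketch below uses the standard definitions of search sensitivity (relevant records retrieved divided by all relevant records) and precision (relevant records retrieved divided by all records retrieved). The helper names are illustrative, and the counts in the precision example are hypothetical, since the abstract does not report how many records each strategy retrieved.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return 100.0 * numerator / denominator


def sensitivity(relevant_retrieved: int, relevant_total: int) -> float:
    """Share of all relevant (QGS) records that a search retrieved, in percent."""
    return percent(relevant_retrieved, relevant_total)


def precision(relevant_retrieved: int, total_retrieved: int) -> float:
    """Share of retrieved records that are relevant (QGS) records, in percent."""
    return percent(relevant_retrieved, total_retrieved)


if __name__ == "__main__":
    # Counts taken from the abstract.
    print(f"Found in any database: {sensitivity(337, 351):.1f}%")
    print(f"Found in Embase:       {sensitivity(314, 351):.1f}%")
    print(f"Missed across the 29 MEDLINE searches: {percent(25, 143):.1f}%")
    # Pooled share found across those searches; this need not equal the
    # reported mean of per-strategy sensitivities (89 percent).
    print(f"Pooled share found:    {sensitivity(143 - 25, 143):.1f}%")
    # Hypothetical counts, for illustration only: 16 QGS records among
    # 1,000 retrieved records corresponds to 1.6 percent precision.
    print(f"Example precision:     {precision(16, 1000):.1f}%")
```

A precision near 1.6 percent means that, on average, a reviewer screens dozens of irrelevant records for each relevant one, which is the usual cost of a highly sensitive search.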


2021 ◽  
Vol 23 (2) ◽  
pp. 248-250

This editorial discusses the use of erythropoietins for anemia in cancer patients in light of the results of Cochrane systematic reviews, the gold standard for the quality of evidence-based medical information.


10.2196/26167 ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. e26167
Author(s):  
Tien Yun Yang ◽  
Li Huang ◽  
Shwetambara Malwade ◽  
Chien-Yi Hsu ◽  
Yang Ching Chen

Background Atrial fibrillation (AF) is the most common cardiac arrhythmia worldwide. Early diagnosis of AF is crucial for preventing AF-related morbidity, mortality, and economic burden, yet the detection of the disease remains challenging. The 12-lead electrocardiogram (ECG) is the gold standard for the diagnosis of AF. Because of technological advances, ambulatory devices may serve as convenient screening tools for AF. Objective The objective of this review was to investigate the diagnostic accuracy of 2 relatively new technologies used in ambulatory devices, non-12-lead ECG and photoplethysmography (PPG), in detecting AF. We performed a meta-analysis to evaluate the diagnostic accuracy of non-12-lead ECG and PPG compared to the reference standard, 12-lead ECG. We also conducted a subgroup analysis to assess the impact of study design and participant recruitment on diagnostic accuracy. Methods This systematic review and meta-analysis was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines. MEDLINE and EMBASE were systematically searched for articles published from January 1, 2015 to January 23, 2021. A bivariate model was used to pool estimates of sensitivity, specificity, positive likelihood ratio (PLR), negative likelihood ratio (NLR), and area under the summary receiver operating curve (SROC) as the main diagnostic measures. Study quality was evaluated using the quality assessment of diagnostic accuracy studies (QUADAS-2) tool. Results Our search resulted in 16 studies using either non-12-lead ECG or PPG for detecting AF, comprising 3217 participants and 7623 assessments. 
The pooled estimates of sensitivity, specificity, PLR, NLR, and diagnostic odds ratio for the detection of AF were 89.7% (95% CI 83.2%-93.9%), 95.7% (95% CI 92.0%-97.7%), 20.64 (95% CI 10.10-42.15), 0.11 (95% CI 0.06-0.19), and 224.75 (95% CI 70.10-720.56), respectively, for the automatic interpretation of non-12-lead ECG measurements and 94.7% (95% CI 93.3%-95.8%), 97.6% (95% CI 94.5%-99.0%), 35.51 (95% CI 18.19-69.31), 0.05 (95% CI 0.04-0.07), and 730.79 (95% CI 309.33-1726.49), respectively, for the automatic interpretation of PPG measurements. Conclusions Both non-12-lead ECG and PPG offered high diagnostic accuracies for AF. Detection employing automatic analysis techniques may serve as a useful preliminary screening tool before administering a gold standard test, which generally requires competent physician analyses. Subgroup analysis indicated variations of sensitivity and specificity between studies that recruited low-risk and high-risk populations, warranting future validity tests in the general population. Trial Registration PROSPERO International Prospective Register of Systematic Reviews CRD42020179937; https://www.crd.york.ac.uk/prospero/display_record.php?RecordID=179937
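
The likelihood ratios and diagnostic odds ratio named in this abstract are related to sensitivity and specificity by standard formulas: PLR = sensitivity / (1 - specificity), NLR = (1 - sensitivity) / specificity, and DOR = PLR / NLR. The sketch below plugs the pooled non-12-lead ECG sensitivity and specificity into those formulas. It is illustrative only: the review pooled its estimates with a bivariate model, so these plug-in values are not expected to match the reported PLR, NLR, and DOR exactly. The function name is an assumption.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float, float]:
    """Return (PLR, NLR, DOR) from sensitivity and specificity given as proportions."""
    if not (0.0 < sensitivity < 1.0 and 0.0 < specificity < 1.0):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    plr = sensitivity / (1.0 - specificity)   # positive likelihood ratio
    nlr = (1.0 - sensitivity) / specificity   # negative likelihood ratio
    dor = plr / nlr                           # diagnostic odds ratio
    return plr, nlr, dor


if __name__ == "__main__":
    # Pooled sensitivity and specificity for non-12-lead ECG, from the abstract.
    plr, nlr, dor = likelihood_ratios(0.897, 0.957)
    print(f"PLR = {plr:.2f}, NLR = {nlr:.2f}, DOR = {dor:.1f}")
```

The plug-in PLR and NLR land close to the reported 20.64 and 0.11, while the DOR differs more from the reported 224.75 because it compounds the differences in both ratios.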


2021 ◽  
Vol 3 ◽  
Author(s):  
Lubna Daraz ◽  
Sheila Bouseh

Background: The current COVID-19 pandemic has changed the way health information is distributed through online platforms. These platforms have played a significant role in informing patients and the public, and the knowledge they spread has changed the virtual world forever. Simultaneously, there are growing concerns that much of the information is not credible, impacting patient health outcomes, costing human lives, and wasting tremendous resources. With the increasing use of online platforms, patients and the public require new models for learning and sharing medical knowledge. They need to be empowered with strategies to navigate disinformation on online platforms. Methods and Design: To meet the urgent need to combat health “misinformation,” the research team proposes a structured approach to develop a quality benchmark, an evidence-based tool that identifies and addresses the determinants of online health information reliability. The specific methods to develop the intervention are the following: (1) systematic reviews: two comprehensive systematic reviews to understand the current state of the quality of online health information and to identify research gaps, (2) content analysis: develop a conceptual framework based on established and complementary knowledge translation approaches for analyzing the existing quality assessment tools and draft a unique set of quality domains, (3) focus groups: multiple focus groups with diverse patients/the public and health information providers to test the acceptability and usability of the quality domains, (4) development and evaluation: a unique set of determinants of reliability will be finalized along with a preferred scoring classification. These items will be used to develop and validate a quality benchmark to assess the quality of online health information. Expected Outcomes: This multi-phase project informed by theory will lead to new knowledge that is intended to inform the development of a patient-friendly quality benchmark. 
This benchmark will inform best practices and policies in disseminating reliable web health information, thus reducing disparities in access to health knowledge and combating misinformation online. In addition, we envision that the final product can be used as a gold standard for developing similar interventions for specific groups of patients or populations.

