Novel tools and methods for designing and wrangling multifunctional, machine-readable evidence synthesis databases

2021 ◽  
Vol 10 (1) ◽  
Author(s):  
Neal R. Haddaway ◽  
Charles T. Gray ◽  
Matthew Grainger

One of the most important steps in the process of conducting a systematic review or map is data extraction and the production of a database of coding, metadata and study data. There are many ways to structure these data, but to date, no guidelines or standards have been produced for the evidence synthesis community to support their production. Furthermore, there is little adoption of easily machine-readable, readily reusable and adaptable databases: these databases would be easier to translate into different formats by review authors, for example for tabulation, visualisation and analysis, and also by readers of the review/map. As a result, it is common for systematic review and map authors to produce bespoke, complex data structures that, although typically provided digitally, require considerable efforts to understand, verify and reuse. Here, we report on an analysis of systematic reviews and maps published by the Collaboration for Environmental Evidence, and discuss major issues that hamper machine readability and data reuse or verification. We highlight different justifications for the alternative data formats found: condensed databases; long databases; and wide databases. We describe these challenges in the context of data science principles that can support curation and publication of machine-readable, Open Data. We then go on to make recommendations to review and map authors on how to plan and structure their data, and we provide a suite of novel R-based functions to support efficient and reliable translation of databases between formats that are useful for presentation (condensed, human readable tables), filtering and visualisation (wide databases), and analysis (long databases). We hope that our recommendations for adoption of standard practices in database formatting, and the tools necessary to rapidly move between formats will provide a step-change in transparency and replicability of Open Data in evidence synthesis.
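The authors' translation functions are written in R; as a minimal sketch of the idea they describe, the same condensed/wide/long round trip can be expressed in Python with pandas. All column names here are hypothetical, not the authors' schema.

```python
import pandas as pd

# Hypothetical wide-format coding database: one row per study,
# one column per extracted variable (good for filtering/visualisation).
wide = pd.DataFrame({
    "study_id": ["S1", "S2", "S3"],
    "country": ["Kenya", "Brazil", "Nepal"],
    "intervention": ["agroforestry", "tillage", "irrigation"],
    "outcome": ["yield", "soil carbon", "yield"],
})

# Long format (one row per study-variable pair) suits analysis.
long = wide.melt(id_vars="study_id", var_name="variable", value_name="value")

# Translate back to wide for presentation.
wide_again = long.pivot(index="study_id", columns="variable",
                        values="value").reset_index()

print(long.head())
```

The round trip is lossless as long as study_id plus variable uniquely identify each value, which is the property that makes a machine-readable long table a safe archival form.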



BMJ Open ◽  
2021 ◽  
Vol 11 (11) ◽  
pp. e053084
Author(s):  
Travis Haber ◽  
Rana S Hinman ◽  
Fiona Dobson ◽  
Samantha Bunzli ◽  
Michelle Hall

Introduction: Chronic hip pain in middle-aged and older adults is common and disabling. Patient-centred care of chronic hip pain requires a comprehensive understanding of how people with chronic hip pain view their health problem and its care. This paper outlines a protocol to synthesise qualitative evidence of middle-aged and older adults' views, beliefs, expectations and preferences about their chronic hip pain and its care. Methods and analysis: We will perform a qualitative evidence synthesis using a framework approach. We will conduct this study in accord with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) Statement and the Enhancing Transparency in Reporting the synthesis of Qualitative research (ENTREQ) checklist. We will search MEDLINE, CINAHL, the Cochrane Central Register of Controlled Trials, EMBASE and PsycINFO using a comprehensive search strategy. A priori selection criteria include qualitative studies involving samples with a mean age over 45 years in which 80% or more of participants have chronic hip pain. Two or more reviewers will independently screen studies for eligibility, assess methodological strengths and limitations using the Critical Appraisal Skills Programme (CASP) qualitative studies checklist, perform data extraction and synthesis, and determine ratings of confidence in each review finding using the Grading of Recommendations Assessment, Development and Evaluation - Confidence in the Evidence from Reviews of Qualitative research (GRADE-CERQual) approach. Data extraction and synthesis will be guided by the Common-Sense Model of Self-Regulation. All authors will contribute to interpreting, refining and finalising the review findings. This protocol is registered on PROSPERO and reported according to the PRISMA Statement for Protocols (PRISMA-P) checklist. Ethics and dissemination: Ethics approval is not required for this systematic review, as primary data will not be collected. The findings of the review will be disseminated through publication in an academic journal and at scientific conferences. PROSPERO registration number: CRD42021246305.


The 2017 SIS Conference aims to highlight the crucial role of Statistics in Data Science. In this new domain of 'meaning' extracted from data, the ever-increasing amount of data produced and held in databases has brought new challenges, involving several fields: statistics, machine learning, information and computer science, optimization, and pattern recognition. Together, these make a considerable contribution to the analysis of 'big data', open data, and relational and complex data, both structured and unstructured. The aim is to collect contributions from the different domains of Statistics on high-dimensional data quality validation, sampling extraction, dimensionality reduction, pattern selection, data modelling, hypothesis testing, and confirming conclusions drawn from the data.


2019 ◽  
Vol 3 ◽  
pp. 157
Author(s):  
Fala Cramond ◽  
Alison O'Mara-Eves ◽  
Lee Doran-Constant ◽  
Andrew SC Rice ◽  
Malcolm Macleod ◽  
...  

Background: The extraction of data from the reports of primary studies, on which the results of systematic reviews depend, needs to be carried out accurately. To aid reliability, it is recommended that two researchers carry out data extraction independently. The extraction of statistical data from graphs in PDF files is particularly challenging, as the process is usually completely manual, and reviewers sometimes need to resort to holding a ruler against the page to read off values: an inherently time-consuming and error-prone process. Methods: To mitigate some of these problems, we integrated and customised two existing JavaScript libraries to create a new web-based graphical data extraction tool that assists reviewers in extracting data from graphs. The tool aims to facilitate more accurate and timely data extraction through a user interface in which data are extracted with mouse clicks. We carried out a non-inferiority evaluation to examine its performance against standard practice. Results: We found that the customised graphical data extraction tool is not inferior to users' preferred existing approaches. Our study was not designed to show superiority, but it suggests a time saving of around 6 minutes per graph, accompanied by a substantial increase in accuracy. Conclusions: Our study suggests that incorporating this type of tool into online systematic review software would help produce accurate and timely evidence syntheses to improve decision-making.
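The abstract does not reproduce the tool's internals; the sketch below illustrates the core calibration step any click-based graph extractor relies on: two clicks on known axis values define a linear map from pixel coordinates to data values. All names and numbers are hypothetical.

```python
def make_axis_calibration(pixel_a: float, value_a: float,
                          pixel_b: float, value_b: float):
    """Return a function mapping a pixel coordinate to a data value,
    given two calibration clicks on known axis labels."""
    scale = (value_b - value_a) / (pixel_b - pixel_a)
    return lambda pixel: value_a + (pixel - pixel_a) * scale

# Example: the user clicks y-axis gridlines at 0 and 50 units, which
# sit at pixel rows 400 and 100 (pixel y grows downwards on screen).
to_value = make_axis_calibration(pixel_a=400, value_a=0.0,
                                 pixel_b=100, value_b=50.0)

print(to_value(250))  # a bar top clicked at pixel row 250 -> 25.0 units
```

A logarithmic axis would need the same two clicks applied to log-transformed values, but the linear interpolation itself is unchanged.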


2022 ◽  
Vol 11 (1) ◽  
Author(s):  
Yuelun Zhang ◽  
Siyu Liang ◽  
Yunying Feng ◽  
Qing Wang ◽  
Feng Sun ◽  
...  

Background: Systematic review is an indispensable tool for optimal evidence collection and evaluation in evidence-based medicine. However, the explosive growth of the primary literature makes critical appraisal and regular updating difficult to accomplish. Artificial intelligence (AI) algorithms have been applied to automate the literature-screening step of medical systematic reviews, but these studies used different algorithms and reported results with great variance. It is therefore imperative to systematically review and analyse the automatic methods developed for literature screening and their reported effectiveness. Methods: An electronic search for automatic literature-screening methods in systematic reviews will be conducted in PubMed, Embase, the ACM Digital Library and the IEEE Xplore Digital Library, supplemented by searches in Google Scholar. Two reviewers will independently conduct the primary screening of articles and data extraction; disagreements will be resolved through discussion with a methodologist. Data extracted from eligible studies, including basic study characteristics, information on the training and validation sets, and the function and performance of the AI algorithms, will be summarised in a table. The risk of bias and applicability of the eligible studies will be assessed independently by the two reviewers using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. Quantitative analyses will also be performed where appropriate. Discussion: Automating the systematic review process greatly reduces the workload of evidence-based practice. The results of this systematic review will summarise the current development of AI algorithms for automatic literature screening in medical evidence synthesis and help inspire further studies in this field. Systematic review registration: PROSPERO CRD42020170815 (28 April 2020).
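For readers unfamiliar with the class of algorithms this protocol will review, the sketch below shows a common baseline for automated screening: a text classifier trained on human include/exclude decisions and used to rank unscreened records. It uses scikit-learn with made-up example records and is illustrative only, not any specific method from the eligible studies.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny hypothetical training set: titles already screened by human
# reviewers (1 = include, 0 = exclude).
texts = [
    "randomised trial of drug X for hypertension",
    "cohort study of statin use and stroke risk",
    "editorial on peer review practices",
    "narrative commentary on hospital funding",
]
labels = [1, 1, 0, 0]

# TF-IDF features plus a linear classifier is a common baseline for
# prioritising records during screening.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression())
model.fit(texts, labels)

# Rank unscreened records by predicted inclusion probability.
new = ["trial of drug Y in diabetic patients"]
print(model.predict_proba(new)[:, 1])
```

Reported performance of such tools is usually given as recall at a fixed workload saving, which is why the protocol extracts training- and validation-set details alongside the algorithms themselves.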


2021 ◽  
Author(s):  
Anna Mae Scott ◽  
Connor Forbes ◽  
Justin Clark ◽  
Matt Carter ◽  
Paul Glasziou ◽  
...  

Objective: We investigated the use of systematic review automation tools by systematic reviewers, health technology assessors and clinical guideline developers. Study design and setting: An online, 16-question survey was distributed across several evidence synthesis, health technology assessment and guideline development organisations internationally. We asked respondents which tools they use and which they have abandoned, how often and when they use the tools, their perceived time savings and accuracy, and which new tools they would like. Descriptive statistics were used to report the results. Results: 253 respondents completed the survey; 89% have used systematic review automation tools, most frequently while screening (79%). Respondents' top three tools were Covidence (45%), RevMan (35%), and Rayyan and GRADEpro (both 22%); the most commonly abandoned were Rayyan (19%), Covidence (15%), DistillerSR (14%) and RevMan (13%). The majority thought the tools saved time (80%) and increased accuracy (54%). Most respondents taught themselves how to use the tools (72%), and lack of knowledge was the most common barrier to adoption (51%). New tool development was most often suggested for the searching and data extraction stages. Conclusion: Automation tools are likely to take on an increasingly important role in producing high-quality and timely reviews. Further work is required in the training and dissemination of automation tools, and in ensuring they meet the needs of those conducting systematic reviews.


Ultrasound ◽  
2019 ◽  
Vol 28 (2) ◽  
pp. 70-81 ◽  
Author(s):  
Jonathan Ince ◽  
Meshal Alharbi ◽  
Jatinder S Minhas ◽  
Emma ML Chung

Introduction: It has long been suggested that ultrasound could be used to measure brain tissue pulsations in humans, but potential clinical applications are relatively unexplored. The aim of this systematic review was to explore and synthesise the available literature on ultrasound measurement of brain tissue motion in humans. Methods: Our systematic review was designed to include predefined study selection criteria, quality evaluation, and a data extraction pro forma, registered prospectively on PROSPERO (CRD42018114117). The systematic review was conducted by two independent reviewers. Results: Ten studies were eligible for the evidence synthesis and qualitative evaluation. All eligible studies confirmed that brain tissue motion over the cardiac cycle can be measured using ultrasound; however, data acquisition, analysis, and outcomes varied. The majority of studies used tissue pulsatility imaging, with the right temporal window as the acquisition point. The currently available literature is largely exploratory, with measurements of brain tissue displacement covering a narrow range of health conditions and ages; the conditions explored include orthostatic hypotension and depression. Conclusion: Further studies are needed to assess variability in brain tissue motion estimates across larger cohorts of healthy subjects and in patients with various medical conditions. This would be important for informing sample size estimates to ensure future studies are appropriately powered. Future research would also benefit from a consistent framework for data analysis and reporting, to facilitate comparative research and meta-analysis. Following standardisation and further healthy-participant studies, future work should focus on assessing the clinical utility of brain tissue pulsation measurements in cerebrovascular disease states.


2021 ◽  
Author(s):  
Alice Fremand

Open data is not a new concept. Over sixty years ago, in 1959, knowledge sharing was at the heart of the Antarctic Treaty, which included in Article III 1c the statement: "scientific observations and results from Antarctica shall be exchanged and made freely available". At a similar time, the World Data Centre (WDC) system was created to manage and distribute the data collected from the International Geophysical Year (1957-1958), led by the International Council of Science (ICSU), building the foundations of today's research data management practices.

What about now? The WDC system still exists through the World Data System (WDS). Open data has been endorsed by a majority of funders and stakeholders. Technology has dramatically evolved. And the profession of data manager/curator has emerged; utilising their professional expertise means that their role is far wider than the long-term curation and publication of data sets.

Data managers are involved in all stages of the data life cycle, from data management planning and data accessioning to data publication and re-use. They implement open data policies, help write data management plans and provide advice on how to manage data during, and beyond the life of, a science project. In liaison with software developers as well as scientists, they are developing new strategies to publish data via data catalogues, via more sophisticated map-based viewer services, or in machine-readable form via APIs. Often, they bring the expertise of the field they are working in to better assist scientists in satisfying the Findable, Accessible, Interoperable and Re-usable (FAIR) principles. Recent years have seen the development of a large community of experts who are essential to share, discuss and set new standards and procedures. Data are published to be re-used, and data managers are key to promoting high-quality datasets and participation in large data compilations.

To date, there is no magical formula for FAIR data. The Research Data Alliance is a great platform allowing data managers and researchers to work together to develop and adopt infrastructure that promotes data sharing and data-driven research. However, the challenge of properly describing each data set remains. Today, scientists expect more and more from their data publications and data requests: they want interactive maps, more complex data systems, and the ability to query data, combine data from different sources and publish them rapidly. By developing new procedures and standards, and by looking at new technologies, data managers help set the foundations of data science.
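As one illustration of "machine-readable form via APIs", the sketch below fetches a dataset record from a hypothetical catalogue endpoint. The URL and field names are assumptions, though real services (DataCite, CKAN-based catalogues) expose broadly similar JSON records.

```python
import json
import urllib.request

# Hypothetical catalogue endpoint returning a JSON metadata record.
URL = "https://data.example.org/api/datasets/antarctic-bathymetry"

with urllib.request.urlopen(URL) as response:
    record = json.load(response)

# A machine-readable record lets scripts discover, cite and combine
# data without scraping a human-oriented landing page.
print(record["title"], record["doi"], record["license"])
```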


2021 ◽  
Vol 12 ◽  
Author(s):  
Alexander Aguirre Montero ◽  
José Antonio López-Sánchez

This systematic review adopts a formal and structured approach to reviewing the intersection of data science and smart tourism destinations in terms of the components found in previous research. The study period is 1995-2021, with the analysis focusing mainly on recent years (2015-2021), identifying and characterizing current trends in this research topic. The review comprises documentary research based on bibliometric and conceptual analysis, using the VOSviewer and SciMAT software to analyze articles from the Web of Science database. There is growing interest in this research topic, with more than 300 articles published annually. The data science technologies on which current smart destinations research is based include big data, smart data, data analytics, social media, cloud computing, the Internet of Things (IoT), smart card data, geographic information system (GIS) technologies, open data, artificial intelligence, and machine learning. Critical research areas for data science techniques and technologies in smart destinations are public tourism marketing, mobility-accessibility, and sustainability. Post-coronavirus disease 2019 (COVID-19), data analysis techniques and technologies face unprecedented challenges and opportunities to build on the huge amount of available data and a new tourism model that is more sustainable, smarter, and safer than those previously implemented.


2021 ◽  
Author(s):  
Katherine E. O. Todd-Brown ◽  
Rose Z. Abramoff ◽  
Jeffrey Beem-Miller ◽  
Hava K. Blair ◽  
Stevan Earl ◽  
...  

Abstract. In the age of big data, soil data are more available than ever but, outside of a few large soil survey resources, remain largely unusable for informing soil management and for understanding Earth system processes beyond the original study. Data science has promised a fully reusable research pipeline in which data from past studies are used to contextualize new findings and are reanalyzed for global relevance. Yet synthesis projects encounter challenges at every step of the data reuse pipeline, including unavailable data, labor-intensive transcription of datasets, incomplete metadata, and a lack of communication between collaborators. Here, drawing on insights from a diversity of soil, data, and climate scientists, we summarize current practices in soil data synthesis across all stages of database creation: data discovery, input, harmonization, curation, and publication. We then suggest new soil-focused semantic tools to improve existing data pipelines, such as ontologies, vocabulary lists, and community practices. Our goal is to give the soil data community an overview of current practices in soil data and of where we need to go to fully leverage big data to solve soil problems in the next century.
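As one small example of the vocabulary-list tooling the authors suggest, the following Python sketch (all column names and mappings are hypothetical) renames contributors' columns onto a shared schema and flags anything the vocabulary does not yet cover:

```python
import pandas as pd

# Hypothetical controlled vocabulary: maps contributors' column names
# onto one shared schema (unit conversion would be handled separately).
VOCAB = {
    "SOC_pct": "soil_organic_carbon_percent",
    "soc": "soil_organic_carbon_percent",
    "pH_H2O": "ph_water",
    "ph": "ph_water",
    "depth_cm": "depth_cm",
}

def harmonize(df: pd.DataFrame) -> pd.DataFrame:
    """Rename recognised columns to the shared vocabulary and flag
    anything the vocabulary does not cover for manual curation."""
    unknown = [c for c in df.columns if c not in VOCAB]
    if unknown:
        print(f"Unmapped columns (need curation): {unknown}")
    return df.rename(columns=VOCAB)

raw = pd.DataFrame({"soc": [1.2, 0.8], "ph": [6.5, 7.1], "clay": [22, 30]})
print(harmonize(raw).columns.tolist())
```

Keeping the mapping in a shared, versioned table rather than in each synthesis script is what turns ad hoc renaming into a community vocabulary.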

