The ground truth of the Data-Iceberg: Correct Meta-data

2021 ◽  
Author(s):  
Aylin Caliskan ◽  
Seema Dangwal ◽  
Thomas Dandekar

Biological molecular data such as sequence information increase so rapidly that detailed metadata, describing the process and conditions of data collection, as well as proper labelling and typing of the data, become ever more important to avoid mistakes and erroneous labelling. Starting from a striking example of wrong labelling of patient data recently published in Nature, we advocate measures to improve software metadata and controls in a timely manner so as not to rapidly lose quality in the ever-growing data flood.

Author(s):  
Dimitra Flouri ◽  
Daniel Lesnic ◽  
Constantina Chrysochou ◽  
Jehill Parikh ◽  
Peter Thelwall ◽  
...  

Abstract. Introduction: Model-driven registration (MDR) is a general approach to remove patient motion in quantitative imaging. In this study, we investigate whether MDR can effectively correct the motion in free-breathing MR renography (MRR). Materials and methods: MDR was generalised to linear tracer-kinetic models and implemented using 2D or 3D free-form deformations (FFD) with multi-resolution and gradient descent optimization. MDR was evaluated using a kidney-mimicking digital reference object (DRO) and free-breathing patient data acquired at high temporal resolution in multi-slice 2D (5 patients) and 3D acquisitions (8 patients). Registration accuracy was assessed by comparison to the ground truth DRO, by calculating the Hausdorff distance (HD) between ground truth masks and segmentations, and by visual evaluation of dynamic images, signal-time courses and parametric maps (all data). Results: DRO data showed that the bias and precision of parameter maps after MDR are indistinguishable from motion-free data. MDR led to a reduction in HD (unregistered: HD = 9.98 ± 9.76; registered: HD = 1.63 ± 0.49). Visual inspection showed that MDR effectively removed motion effects in the dynamic data, leading to a clear improvement in anatomical delineation on parametric maps and a reduction in motion-induced oscillations on signal-time courses. Discussion: MDR provides effective motion correction of MRR in synthetic and patient data. Future work is needed to compare the performance against other more established methods.
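
For readers unfamiliar with the Hausdorff distance used above to score registration accuracy, the following is a minimal sketch of the symmetric Hausdorff distance between two binary masks. It is not the authors' implementation. The abstract does not say whether the distance was computed on mask boundaries or on all foreground pixels, nor in which units, so this sketch assumes all foreground pixels and pixel units. The function names `mask_to_points`, `directed_hausdorff` and `hausdorff` are hypothetical, and the tiny masks are made-up illustration data.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def mask_to_points(mask: Sequence[Sequence[int]]) -> List[Point]:
    """Return the (row, col) coordinates of all non-zero pixels in a 2D mask."""
    return [
        (float(r), float(c))
        for r, row in enumerate(mask)
        for c, value in enumerate(row)
        if value
    ]


def directed_hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Largest distance from any point in `a` to its nearest point in `b`."""
    return max(min(math.dist(p, q) for q in b) for p in a)


def hausdorff(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Hausdorff distance: the larger of the two directed distances."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))


# Illustration only: a made-up ground truth mask and a segmentation
# shifted one pixel to the right (assumption: distances in pixel units).
ground_truth = [[1, 1, 0],
                [0, 0, 0]]
segmentation = [[0, 1, 1],
                [0, 0, 0]]

hd = hausdorff(mask_to_points(ground_truth), mask_to_points(segmentation))
print(hd)  # 1.0
```

This brute-force version compares every pair of points, which is adequate for small masks. For large 3D volumes a library routine or a boundary-only variant would normally be preferred.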


PhytoKeys ◽  
2020 ◽  
Vol 140 ◽  
pp. 33-45
Author(s):  
Chien-Ti Chao ◽  
Bing-Hong Huang ◽  
Jui-Tse Chang ◽  
Pei-Chun Liao

The genus Scutellaria comprises eight species distributed from 50 to 2000 m in Taiwan. Amongst them, S. barbata and S. taipeiensis are very similar on the basis of morphological and plastid DNA sequence information. Therefore, a comprehensive study of the taxonomic status of S. taipeiensis is necessary. We reviewed the herbarium sheets, related literature and protologues and compared the morphologies of these two species, as well as their phylogenetic relationships. All evidence, including the diagnostic characters between S. taipeiensis and S. barbata, suggests that they belong to a single species rather than two. As a result, S. taipeiensis is treated as a synonym of S. barbata.


2019 ◽  
pp. 1098-1128
Author(s):  
Gennady Gienko ◽  
Michael Govorov

Researchers worldwide use remotely sensed imagery in their projects, in both the social and natural sciences. However, users often encounter difficulties working with satellite images and aerial photographs, as image interpretation requires specific experience and skills. The best way to acquire these skills is to go into the field, identify your location in an overhead image, observe the landscape, and find corresponding features in the overhead image. In many cases, personal observations could be substituted by using terrestrial photographs taken from the ground with conventional cameras. This chapter discusses the value of terrestrial photographs as a substitute for field observations, elaborates on issues of data collection, and presents results of experimental estimation of the effectiveness of the use of terrestrial ground truth photographs for interpretation of remotely sensed imagery. The chapter introduces the concept of GeoTruth – a web-based collaborative framework for collection, storing and distribution of ground truth terrestrial photographs and corresponding metadata.


1995 ◽  
Vol 10 (3) ◽  
pp. 178-183 ◽  
Author(s):  
Ralph B. (Monty) Leonard ◽  
Lew W. Stringer ◽  
Roy Alson

Abstract. Introduction: In large disasters, such as earthquakes and hurricanes, rapid, adequate and documented medical care and distribution of patients are essential. Methods: After a major (magnitude 6.7 Richter scale) earthquake occurred in Southern California, nine disaster medical assistance teams and two Veterans Administration (VA) buses with VA personnel responded to staff four medical stations, 19 disaster-assistance centers, and two mobile vans. All were under the supervision of the medical support unit (MSU) and its supervising officer. This article describes the patient-data collection system used. All facilities used the same patient encounter forms, log sheets, and medical treatment forms. Copies of these records accompanied the patients during every transfer. Centers for Disease Control and Prevention data classifications were used routinely. The MSU collected these forms twice each day so that all facilities had access to updated patient flow information. Results: Through the use of these methods, more than 11,000 victims were treated, transferred, and their cases tracked during a 12-day period. Conclusions: Use of this system by all federal responders to a major disaster area led to organized care for a large number of victims. Factors enhancing this care were the simplicity of the forms, the use of the forms by all federal responders, a central data collection point, and accessibility of the data at a known site available to all agencies every 12 hours.


Sensors ◽  
2020 ◽  
Vol 20 (3) ◽  
pp. 879 ◽  
Author(s):  
Uwe Köckemann ◽  
Marjan Alirezaie ◽  
Jennifer Renoux ◽  
Nicolas Tsiftes ◽  
Mobyen Uddin Ahmed ◽  
...  

As research in smart homes and activity recognition increases, it is increasingly important to have benchmark systems and data against which researchers can compare methods. While synthetic data can be useful for certain method developments, real data sets that are open and shared are equally important. This paper presents the E-care@home system, its installation in a real home setting, and a series of data sets that were collected using the E-care@home system. Our first contribution, the E-care@home system, is a collection of software modules for data collection, labeling, and various reasoning tasks such as activity recognition, person counting, and configuration planning. It supports a heterogeneous set of sensors that can be extended easily and connects collected sensor data to higher-level Artificial Intelligence (AI) reasoning modules. Our second contribution is a series of open data sets which can be used to recognize activities of daily living. In addition to these data sets, we describe the technical infrastructure that we have developed to collect the data and the physical environment. Each data set is annotated with ground-truth information, making it relevant for researchers interested in benchmarking different algorithms for activity recognition.


2000 ◽  
Vol 21 ◽  
pp. S247
Author(s):  
S. Lal ◽  
D. Chinkes ◽  
F. Smith ◽  
R. E. Barrow ◽  
D. N. Herndon

2017 ◽  
Vol 24 (5) ◽  
pp. 579-586 ◽  
Author(s):  
Bruce F Bebo ◽  
Robert J Fox ◽  
Karen Lee ◽  
Ursula Utz ◽  
Alan J Thompson

Background: There is a growing number of cohorts and registries collecting phenotypic and genotypic data from groups of multiple sclerosis (MS) patients. Improved awareness and better coordination of these efforts are needed. Objective: The purpose of this report is to provide a global landscape of the major longitudinal MS patient data collection efforts and share recommendations for increasing their impact. Methods: A workshop that included over 50 MS research and clinical experts from both academia and industry was convened to evaluate how current and future MS cohorts could be better used to provide answers to urgent questions about progressive MS. Results: The landscape analysis revealed a significant number of largely uncoordinated parallel studies. Strategic oversight and direction are needed to streamline and leverage existing and future efforts. A number of recommendations for enhancing these efforts were developed. Conclusions: Better coordination, increased leverage of evolving technology, cohort designs that focus on the most important unanswered questions, improved access, and more sustained funding will be needed to close the gaps in our understanding of progressive MS and accelerate the development of effective therapies.


2019 ◽  
Vol 37 (15_suppl) ◽  
pp. e18063-e18063
Author(s):  
Donna M. Graham ◽  
Joanna Clarke ◽  
Gemma Wickert ◽  
Leanna Goodwin ◽  
Carla Timmins ◽  
...  

Background: Data capture in early phase cancer clinical trials (EPCCT) is usually via paper records with manual transcription to the sponsor’s case report form. Capturing real-time trial data directly to computer (eSource) may reduce errors and increase completeness and timeliness of data entry. A simulated system pilot took place between Oct 2018 and Jan 2019 at an EPCCT facility to appraise Foundry Health’s eSource system “ClinSpark”. Aims were to assess consistency and effectiveness of creating electronic templates for source data capture and live data collection compliance. Methods: A multidisciplinary focus group (2 research nurses, 1 doctor, 3 data managers) was created to collaborate with Foundry Health staff. The focus group agreed on a 52-item user acceptance test listing ideal features for a data collection tool, classifying items as high, medium or low priority. Specialised features of the eSource system were adapted to handle the complex needs of EPCCT. The pilot incorporated a 5-day boot camp for familiarisation with the digital platform; a conference room test using simulated patient data; construction of a trial template including contingency planning; and a clinic floor test with live simulated patient data collection using digital tablets. Results: During the 3-month pilot, templates for 2 EPCCT were planned and created. Using eSource, 43 items (83%) of the acceptance test were passed compared with 27 items (52%) for the current (paper-based) system. The paper system did not pass any of the 9 items for which eSource failed. For the 30 high priority items, eSource passed 30 (100%) compared with 22 for the paper system (73%). Time saving and potential error reduction were noted as additional benefits. Conclusions: This process demonstrates that a multidisciplinary approach can be used to successfully integrate a customised eSource system working with previously untrained staff. Improved performance across pre-specified domains and potential additional benefits were noted. As FDA encourages the use of digital solutions in clinical trials, using eSource provides a potential solution for compliant and efficient capture of data from protocol assessments at investigator sites and rapid data transfer to sponsors.

