ELaPro, a LOINC-Mapped Core Dataset for Top Laboratory Procedures of Eligibility Screening for Clinical Trials
Abstract
Background: Screening for eligible patients remains a major challenge for many clinical trials. This has led to rapidly growing interest in standardizing computable representations of eligibility criteria (EC) in order to develop tools that leverage data from electronic health record (EHR) systems. Although laboratory procedures (LP) are a common entity of EC that is readily available and retrievable from EHR systems, interoperable data models for this entity are lacking. A public, specialized data model that uses an international, widely adopted terminology for LP, e.g. LOINC, is much needed to support automated screening tools.
Objective: The aim of this study is to establish a core dataset, using LOINC terminology, for the LP most frequently requested to recruit patients for clinical trials. Employing such a core dataset could enhance the interface between study feasibility platforms and EHR systems and significantly improve automatic patient recruitment.
Methods: We used a semi-automated approach to analyze 10,516 UMLS-annotated screening forms from the data repository of the Medical Data Models (MDM) portal. An automated semantic analysis based on concept frequency was followed by a manual expert review, performed by physicians, of complex recruitment-relevant concepts not amenable to an automated approach.
Results: In an analysis of 138,225 EC from 10,516 screening forms, 55 laboratory procedures accounted for 77.87% of all UMLS laboratory concept occurrences identified in the selected EC forms. We identified 26,413 unique UMLS concepts from 118 UMLS semantic types, covering the vast majority of MeSH disease domains.
Conclusions: Only a small set of LP covers the majority of laboratory concepts in screening EC. The results demonstrate the feasibility of establishing a core dataset for a group of LP common to most EC forms.
We present ELaPro (Eligibility Laboratory Procedures), a novel, LOINC-mapped core dataset of the 55 LP most frequently requested in screening for clinical trials, available in multiple machine-readable data formats.
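The frequency-based step of the analysis can be illustrated with a minimal sketch. It is not the pipeline used in this study. It assumes each screening form has already been reduced to a list of concept identifiers, and it shows only how concepts can be ranked by occurrence and how the share of occurrences covered by the top k concepts can be computed. The function names (`rank_concepts`, `coverage_of_top_k`) and the identifiers (`LAB_A` and so on) are hypothetical placeholders, not real UMLS or LOINC codes.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def rank_concepts(forms: Iterable[Iterable[str]]) -> List[Tuple[str, int]]:
    """Count concept occurrences over all forms, most frequent first."""
    counts: Counter = Counter()
    for concepts in forms:
        counts.update(concepts)
    return counts.most_common()


def coverage_of_top_k(ranked: List[Tuple[str, int]], k: int) -> float:
    """Fraction of all concept occurrences covered by the k most frequent concepts."""
    total = sum(n for _, n in ranked)
    if total == 0:
        return 0.0
    return sum(n for _, n in ranked[:k]) / total


# Toy input: three forms annotated with placeholder concept identifiers
# (assumed for illustration only; these are not real terminology codes).
toy_forms = [
    ["LAB_A", "LAB_B", "LAB_A"],
    ["LAB_A", "LAB_C"],
    ["LAB_B", "LAB_A", "LAB_D"],
]

ranked = rank_concepts(toy_forms)
top2_share = coverage_of_top_k(ranked, 2)
print(ranked[:2], top2_share)
```

Applied to annotated laboratory concepts, the same calculation yields the kind of coverage figure reported in the Results, where a small number of top-ranked procedures accounts for most occurrences.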