Common Data Elements for Meaningful Stroke Documentation in Routine Care and Clinical Research (Preprint)
BACKGROUND Managing medical information on stroke patients is currently a very time-consuming endeavour. There are clear guidelines and procedures for treating patients suffering from an acute stroke, but how well are these established practices reflected in patient documentation? This paper compares a variety of documentation processes for stroke. The main objective of this work is to provide an overview of the most commonly occurring medical concepts in stroke documentation and to identify overlaps between different documentation contexts, allowing for the definition of a core dataset that could be used in potential data interfaces.
OBJECTIVE This work aims to identify a list of the most common data elements to pave the way for a core dataset in stroke care and research.
METHODS Medical source documentation forms from different documentation contexts, including hospitals, clinical trials, registries and international standards for stroke treatment and subsequent rehabilitation, were digitized in the Operational Data Model (ODM). Each source data element was semantically annotated using the Unified Medical Language System (UMLS). Concept codes were analysed for semantic overlaps. A concept was considered common if it appeared in at least two documentation contexts. The resulting common concepts were extended with implementation details, including data types and permissible values, based on frequent patterns of source data elements using an established expert-based and semi-automatic approach.
RESULTS In total, 3287 data elements were identified, of which 1051 emerged as unique medical concepts. The 100 most frequent medical concepts cover 50% of all concept occurrences in the stroke documentation, and the 50 most frequent concepts cover 34%. A list of common data elements was implemented in different standardized machine-readable formats on a public metadata repository for interoperable re-use.
CONCLUSIONS Standardization of medical documentation is a prerequisite for data exchange as well as for the transferability and reuse of data. In the long run, standardization would save time and money and broaden the purposes for which such data could be used. This work observed a lack of standardization in current information management. Free-text fields and intricate questions complicate automated data access and transfer between institutions. This work also revealed the potential of a unified documentation process, as a core dataset of the 50 most frequent common data elements (CDEs) already accounts for 34% of all concept occurrences in the analysed stroke documentation. Such a dataset offers a starting point for standardized and interoperable data collection in routine care, quality management and clinical research.
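
The commonality criterion described in METHODS (a concept counts as common if it appears in at least two documentation contexts) and the coverage figures reported in RESULTS can be illustrated with a minimal Python sketch. This is not the expert-based, semi-automatic approach used in the study. The function names `common_concepts` and `coverage_of_top`, the context labels and the placeholder concept codes are all assumptions made for illustration, and the toy data are not taken from the analysed forms.

```python
from collections import Counter, defaultdict

# Hypothetical annotated data elements as (documentation context, concept code).
# The codes are placeholders, not real UMLS codes or study data.
elements = [
    ("hospital", "CUI-A"),
    ("hospital", "CUI-B"),
    ("registry", "CUI-A"),
    ("clinical_trial", "CUI-A"),
    ("clinical_trial", "CUI-C"),
    ("registry", "CUI-B"),
    ("hospital", "CUI-A"),
]


def common_concepts(elements, min_contexts=2):
    """Return the concepts that occur in at least `min_contexts` distinct contexts."""
    contexts_by_concept = defaultdict(set)
    for context, concept in elements:
        contexts_by_concept[concept].add(context)
    return {
        concept
        for concept, contexts in contexts_by_concept.items()
        if len(contexts) >= min_contexts
    }


def coverage_of_top(elements, n):
    """Return the share of all concept occurrences covered by the n most frequent concepts."""
    counts = Counter(concept for _, concept in elements)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    covered = sum(count for _, count in counts.most_common(n))
    return covered / total


if __name__ == "__main__":
    print(sorted(common_concepts(elements)))
    print(coverage_of_top(elements, 1))
```

Applied to the real annotations, `coverage_of_top(elements, 50)` would correspond to the reported 34% figure only if `elements` held one entry per annotated concept occurrence. How occurrences were counted in the study is not specified here, so that mapping is an assumption.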