Improving Prehospital Stroke Diagnosis Using Natural Language Processing of Paramedic Reports
Background and Purpose: Accurate prehospital diagnosis of stroke by emergency medical services (EMS) can increase treatment rates, mitigate disability, and reduce stroke deaths. We aimed to develop a model that utilizes natural language processing of EMS reports and machine learning to improve prehospital stroke identification. Methods: We conducted a retrospective study of patients transported by the Chicago EMS to 17 regional primary and comprehensive stroke centers. Patients who were suspected of stroke by the EMS or had hospital-diagnosed stroke were included in our cohort. Text within EMS reports was converted to unigram features, which were given as input to a support-vector machine classifier that was trained on 70% of the cohort and tested on the remaining 30%. Outcomes included final diagnosis of stroke versus nonstroke, large vessel occlusion, severe stroke (National Institutes of Health Stroke Scale score >5), and comprehensive stroke center-eligible stroke (large vessel occlusion or hemorrhagic stroke). Results: Of 965 patients, 580 (60%) had confirmed acute stroke. In a test set of 289 patients, the text-based model predicted stroke nominally better than models based on Cincinnati Prehospital Stroke Scale scores (c-statistic: 0.73 versus 0.67, P=0.165) and was superior to models based on 3-Item Stroke Scale scores (c-statistic: 0.73 versus 0.53, P<0.001). Improvements in discrimination were also observed for the other outcomes. Conclusions: We derived a model that utilizes clinical text from paramedic reports to identify stroke. Our results require validation but have the potential to improve prehospital routing protocols.
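The pipeline described in Methods (unigram features, a support-vector machine, a 70%/30% split, and evaluation by c-statistic) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify the tokenizer, kernel, or training procedure, so the sketch assumes whitespace tokenization and a linear SVM fitted by plain subgradient descent on the regularized hinge loss. The function names (`unigrams`, `train_linear_svm`, `split_70_30`, `c_statistic`) and the toy notes are hypothetical and are not data from the study.

```python
import random
from collections import Counter

def unigrams(text):
    """Bag of unigram counts (assumed: lowercase, whitespace tokenization)."""
    return Counter(text.lower().split())

def score(w, x):
    """Linear decision value: dot product of weights and unigram counts."""
    return sum(w.get(k, 0.0) * v for k, v in x.items())

def train_linear_svm(rows, labels, epochs=20, eta=0.1, lam=0.01, seed=0):
    """Subgradient descent on the L2-regularized hinge loss; labels are +1/-1."""
    rng = random.Random(seed)
    w = {}
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = rows[i], labels[i]
            margin = y * score(w, x)
            for k in w:                      # shrinkage from the L2 penalty
                w[k] *= 1.0 - eta * lam
            if margin < 1:                   # hinge loss is active for this example
                for k, v in x.items():
                    w[k] = w.get(k, 0.0) + eta * y * v
    return w

def split_70_30(items, seed=0):
    """Shuffle and split into 70% training and 30% test portions."""
    items = list(items)
    random.Random(seed).shuffle(items)
    cut = int(0.7 * len(items))
    return items[:cut], items[cut:]

def c_statistic(scores, labels):
    """Share of (stroke, nonstroke) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    pairs = [(p > n) + 0.5 * (p == n) for p in pos for n in neg]
    return sum(pairs) / len(pairs)

# Invented toy notes, NOT data from the study; 1 = stroke, -1 = nonstroke.
toy_train = [
    ("left facial droop slurred speech", 1),
    ("right arm weakness aphasia", 1),
    ("chest pain diaphoresis", -1),
    ("ankle injury after fall", -1),
]
toy_test = [("facial droop arm weakness", 1), ("chest pain after fall", -1)]

w = train_linear_svm([unigrams(t) for t, _ in toy_train],
                     [y for _, y in toy_train])
scores = [score(w, unigrams(t)) for t, _ in toy_test]
print(c_statistic(scores, [y for _, y in toy_test]))
```

With a real cohort, `split_70_30` would be applied to the labeled reports before training, and the c-statistic would be computed only on the held-out 30%. A c-statistic of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.73, 0.67, and 0.53 should be read.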