Optimising the use of electronic health records to estimate the incidence of rheumatoid arthritis in primary care: what information is hidden in free text?

2013 ◽  
Vol 13 (1) ◽  
Author(s):  
Elizabeth Ford ◽  
Amanda Nicholson ◽  
Rob Koeling ◽  
A Rosemary Tate ◽  
John Carroll ◽  
...  
2019 ◽  
Vol 10 (S1) ◽  
Author(s):  
Anoop D. Shah ◽  
Emily Bailey ◽  
Tim Williams ◽  
Spiros Denaxas ◽  
Richard Dobson ◽  
...  

Abstract Background Free text in electronic health records (EHR) may contain additional phenotypic information beyond structured (coded) information. For major health events – heart attack and death – there is a lack of studies evaluating the extent to which free text in the primary care record might add information. Our objectives were to describe the contribution of free text in primary care to the recording of information about myocardial infarction (MI), including subtype, left ventricular function, laboratory results and symptoms; and recording of cause of death. We used the CALIBER EHR research platform which contains primary care data from the Clinical Practice Research Datalink (CPRD) linked to hospital admission data, the MINAP registry of acute coronary syndromes and the death registry. In CALIBER we randomly selected 2000 patients with MI and 1800 deaths. We implemented a rule-based natural language engine, the Freetext Matching Algorithm, on site at CPRD to analyse free text in the primary care record without raw data being released to researchers. We analysed text recorded within 90 days before or 90 days after the MI, and on or after the date of death. Results We extracted 10,927 diagnoses, 3658 test results, 3313 statements of negation, and 850 suspected diagnoses from the myocardial infarction patients. Inclusion of free text increased the recorded proportion of patients with chest pain in the week prior to MI from 19 to 27%, and differentiated between MI subtypes in a quarter more patients than structured data alone. Cause of death was incompletely recorded in primary care; in 36% the cause was in coded data and in 21% it was in free text. Only 47% of patients had exactly the same cause of death in primary care and the death registry, but this did not differ between coded and free text causes of death. 
Conclusions Among patients who suffer MI or die, unstructured free text in primary care records contains much information that is potentially useful for research such as symptoms, investigation results and specific diagnoses. Access to large scale unstructured data in electronic health records (millions of patients) might yield important insights.
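The time-window selection described in the Methods (text recorded within 90 days before or after the MI) can be illustrated with a minimal sketch. This is not the Freetext Matching Algorithm; the records, the helper names `within_window` and `is_negated`, and the naive negation check are all hypothetical and chosen only to show the idea.

```python
from datetime import date

def within_window(entry_date, index_date, days=90):
    """Return True if entry_date is within `days` before or after index_date."""
    return abs((entry_date - index_date).days) <= days

def is_negated(text):
    """Toy negation check (assumption): flags notes that start with 'no '."""
    return text.lower().startswith("no ")

# Hypothetical free-text entries: (date recorded, note text)
notes = [
    (date(2010, 1, 5), "chest pain on exertion"),
    (date(2010, 3, 1), "no chest pain"),
    (date(2011, 6, 1), "routine review"),
]
mi_date = date(2010, 2, 1)  # hypothetical index date of the MI

# Keep only notes recorded inside the 90-day window around the MI
selected = [text for d, text in notes if within_window(d, mi_date)]

# Split the selected notes into affirmed and negated statements
affirmed = [t for t in selected if not is_negated(t)]
negated = [t for t in selected if is_negated(t)]
```

A real rule-based engine would need far richer handling of negation, uncertainty and abbreviations than this prefix check.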


Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1908
Author(s):  
Fabiola Fernández-Gutiérrez ◽  
Jonathan I. Kennedy ◽  
Roxanne Cooksey ◽  
Mark Atkinson ◽  
Ernest Choy ◽  
...  

(1) Background: We aimed to develop a transparent machine-learning (ML) framework to automatically identify patients with a condition from electronic health records (EHRs) via a parsimonious set of features. (2) Methods: We linked multiple sources of EHRs, including 917,496,869 primary care records, 40,656,805 secondary care records and 694,954 records from specialist surgeries between 2002 and 2012, to generate a unique dataset. Then, we treated patient identification as a problem of text classification and proposed a transparent disease-phenotyping framework. This framework comprises generation of a patient representation, feature selection, and development of an optimal phenotyping algorithm to tackle the imbalanced nature of the data. The framework was extensively evaluated by identifying rheumatoid arthritis (RA) and ankylosing spondylitis (AS). (3) Results: Applied to the linked dataset of 9657 patients with 1484 cases of RA and 204 cases of AS, this framework achieved accuracy and positive predictive values of 86.19% and 88.46%, respectively, for RA and 99.23% and 97.75% for AS, comparable with expert knowledge-driven methods. (4) Conclusions: This framework could potentially be used as an efficient tool for identifying patients with a condition of interest from EHRs, supporting clinicians in the clinical decision-support process.
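For readers unfamiliar with the two metrics reported in the Results, the following sketch shows how accuracy and positive predictive value (PPV) are computed from a confusion matrix. The counts are hypothetical and are not taken from the study; they only illustrate why both metrics are worth reporting when cases are rare.

```python
def accuracy(tp, fp, tn, fn):
    """Fraction of all patients classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def ppv(tp, fp):
    """Positive predictive value: fraction of predicted cases that are true cases."""
    return tp / (tp + fp)

# Hypothetical confusion-matrix counts for an imbalanced dataset
tp, fp, tn, fn = 80, 20, 880, 20

acc = accuracy(tp, fp, tn, fn)  # (80 + 880) / 1000
precision = ppv(tp, fp)         # 80 / (80 + 20)
```

With these made-up counts, accuracy is high largely because true negatives dominate, while PPV reflects only the quality of the positive predictions.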


PLoS ONE ◽  
2013 ◽  
Vol 8 (2) ◽  
pp. e54878 ◽  
Author(s):  
Amanda Nicholson ◽  
Elizabeth Ford ◽  
Kevin A. Davies ◽  
Helen E. Smith ◽  
Greta Rait ◽  
...  

PLoS ONE ◽  
2016 ◽  
Vol 11 (5) ◽  
pp. e0154515 ◽  
Author(s):  
Shang-Ming Zhou ◽  
Fabiola Fernandez-Gutierrez ◽  
Jonathan Kennedy ◽  
Roxanne Cooksey ◽  
Mark Atkinson ◽  
...  

2018 ◽  
Vol 68 (suppl 1) ◽  
pp. bjgp18X696749 ◽  
Author(s):  
Maimoona Hashmi ◽  
Mark Wright ◽  
Kirin Sultana ◽  
Benjamin Barratt ◽  
Lia Chatzidiakou ◽  
...  

Background Chronic Obstructive Airway Disease (COPD) is marked by often severely debilitating exacerbations. Efficient patient-centric research approaches are needed to better inform health management in primary care. Aim The ‘COPE study’ aims to develop a method of predicting COPD exacerbations utilising personal air quality sensors, environmental exposure modelling and electronic health records through the recruitment of patients from consenting GPs contributing to the Clinical Practice Research Datalink (CPRD). Method The study made use of Electronic Healthcare Records (EHR) from CPRD, an anonymised GP records database, to screen and locate patients within GP practices in Central London. Personal air monitors were used to capture data on individual activities and environmental exposures. Output from the monitors was then linked with the EHR data to obtain information on COPD management, severity, comorbidities and exacerbations. Symptom changes not equating to full exacerbations were captured on diary cards. Linear regression was used to investigate the relationship between subject peak flow, symptoms, exacerbation events and exposure data. Results Preliminary results on the first 80 patients who have completed the study indicate variable susceptibility to environmental stressors in COPD patients. Some individuals appear highly susceptible to environmental stress and others appear to have unrelated triggers. Conclusion Recruiting patients through EHR for a study is feasible and allows easy collection of data for long-term follow-up. Portable environmental sensors could now be used to develop personalised models to predict risk of COPD exacerbations in susceptible individuals. Identification of direct links between participant health and activities would allow improved health management and thus cost savings.


Rheumatology ◽  
2021 ◽  
Author(s):  
Dahai Yu ◽  
George Peat ◽  
Kelvin P Jordan ◽  
James Bailey ◽  
Daniel Prieto-Alhambra ◽  
...  

Abstract Objectives Better indicators from affordable, sustainable data sources are needed to monitor population burden of musculoskeletal conditions. We proposed five indicators of musculoskeletal health and assessed whether routinely available primary care electronic health records (EHR) can estimate population levels in musculoskeletal consulters. Methods We collected validated patient-reported measures of pain experience, function and health status through a local survey of adults (≥35 years) presenting to English general practices over 12 months for low back pain, shoulder pain, osteoarthritis and other regional musculoskeletal disorders. Using EHR data we derived and validated models for estimating population levels of five self-reported indicators: prevalence of high impact chronic pain, overall musculoskeletal health (based on Musculoskeletal Health Questionnaire), quality of life (based on EuroQoL health utility measure), and prevalence of moderate-to-severe low back pain and moderate-to-severe shoulder pain. We applied models to a national EHR database (Clinical Practice Research Datalink) to obtain national estimates of each indicator for three successive years. Results The optimal models included recorded demographics, deprivation, consultation frequency, analgesic and antidepressant prescriptions, and multimorbidity. Applying models to national EHR, we estimated that 31.9% of adults (≥35 years) presenting with non-inflammatory musculoskeletal disorders in England in 2016/17 experienced high impact chronic pain. Estimated population health levels were worse in women, older adults and those in the most deprived neighbourhoods, and changed little over 3 years. Conclusion National and subnational estimates for a range of subjective indicators of non-inflammatory musculoskeletal health conditions can be obtained using information from routine electronic health records.


2013 ◽  
Vol 112 (3) ◽  
pp. 731-737 ◽  
Author(s):  
Usman Iqbal ◽  
Cheng-Hsun Ho ◽  
Yu-Chuan(Jack) Li ◽  
Phung-Anh Nguyen ◽  
Wen-Shan Jian ◽  
...  

2021 ◽  
Vol 12 (04) ◽  
pp. 816-825
Author(s):  
Yingcheng Sun ◽  
Alex Butler ◽  
Ibrahim Diallo ◽  
Jae Hyun Kim ◽  
Casey Ta ◽  
...  

Abstract Background Clinical trials are the gold standard for generating robust medical evidence, but clinical trial results often raise generalizability concerns, which can be attributed to a lack of population representativeness. Electronic health record (EHR) data are useful for estimating the population representativeness of a clinical trial's study population. Objectives This research aims to estimate the population representativeness of clinical trials systematically using EHR data during the early design stage. Methods We present an end-to-end analytical framework for transforming free-text clinical trial eligibility criteria into executable database queries conformant with the Observational Medical Outcomes Partnership Common Data Model and for systematically quantifying the population representativeness for each clinical trial. Results We calculated the population representativeness of 782 novel coronavirus disease 2019 (COVID-19) trials and 3,827 type 2 diabetes mellitus (T2DM) trials in the United States, respectively, using this framework. With the use of overly restrictive eligibility criteria, 85.7% of the COVID-19 trials and 30.1% of T2DM trials had poor population representativeness. Conclusion This research demonstrates the potential of using EHR data to assess the population representativeness of clinical trials, providing data-driven metrics to inform the selection and optimization of eligibility criteria.
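The idea of turning eligibility criteria into executable filters over patient data can be sketched as follows. The abstract does not define its representativeness metric, so this example assumes a simple stand-in: the fraction of a target patient population that satisfies every criterion. The patient records, field names and thresholds are hypothetical, and plain Python predicates stand in for the OMOP-conformant database queries the abstract describes.

```python
# Hypothetical target population drawn from an EHR (illustrative values only)
patients = [
    {"age": 45, "hba1c": 7.5, "pregnant": False},
    {"age": 80, "hba1c": 8.0, "pregnant": False},
    {"age": 30, "hba1c": 6.5, "pregnant": False},
    {"age": 60, "hba1c": 9.0, "pregnant": False},
]

# Hypothetical eligibility criteria, each expressed as an executable predicate
criteria = [
    lambda p: 18 <= p["age"] <= 75,   # inclusion: age range
    lambda p: p["hba1c"] >= 7.0,      # inclusion: HbA1c threshold
    lambda p: not p["pregnant"],      # exclusion: pregnancy
]

def eligible_fraction(patients, criteria):
    """Fraction of patients meeting all criteria (assumed representativeness proxy)."""
    if not patients:
        raise ValueError("patient population is empty")
    eligible = [p for p in patients if all(check(p) for check in criteria)]
    return len(eligible) / len(patients)

fraction = eligible_fraction(patients, criteria)
```

Under this assumed definition, adding or tightening criteria can only keep the fraction the same or lower it, which is one way to see how overly restrictive criteria reduce representativeness.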

