Defining Disease Phenotypes in Primary Care Electronic Health Records by a Machine Learning Approach: A Case Study in Identifying Rheumatoid Arthritis

PLoS ONE ◽  
2016 ◽  
Vol 11 (5) ◽  
pp. e0154515 ◽  
Author(s):  
Shang-Ming Zhou ◽  
Fabiola Fernandez-Gutierrez ◽  
Jonathan Kennedy ◽  
Roxanne Cooksey ◽  
Mark Atkinson ◽  
...  
Diagnostics ◽  
2021 ◽  
Vol 11 (10) ◽  
pp. 1908
Author(s):  
Fabiola Fernández-Gutiérrez ◽  
Jonathan I. Kennedy ◽  
Roxanne Cooksey ◽  
Mark Atkinson ◽  
Ernest Choy ◽  
...  

(1) Background: We aimed to develop a transparent machine-learning (ML) framework to automatically identify patients with a condition from electronic health records (EHRs) via a parsimonious set of features. (2) Methods: We linked multiple sources of EHRs, including 917,496,869 primary care records, 40,656,805 secondary care records, and 694,954 records from specialist surgeries between 2002 and 2012, to generate a unique dataset. We then treated patient identification as a text classification problem and proposed a transparent disease-phenotyping framework. The framework comprises generation of a patient representation, feature selection, and development of an optimal phenotyping algorithm that tackles the imbalanced nature of the data. It was extensively evaluated by identifying rheumatoid arthritis (RA) and ankylosing spondylitis (AS). (3) Results: Applied to the linked dataset of 9657 patients with 1484 cases of RA and 204 cases of AS, this framework achieved accuracy and positive predictive values of 86.19% and 88.46%, respectively, for RA and 99.23% and 97.75% for AS, comparable with expert knowledge-driven methods. (4) Conclusions: This framework could potentially be used as an efficient tool for identifying patients with a condition of interest from EHRs, helping clinicians in the clinical decision-support process.
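The two metrics reported in this abstract, accuracy and positive predictive value (PPV), can be illustrated with a minimal sketch. The function name `accuracy_and_ppv` and the toy labels are assumptions made for illustration; they are not the study's code or data. PPV matters with imbalanced data because a classifier can reach high accuracy while still flagging many non-cases.

```python
def accuracy_and_ppv(y_true, y_pred):
    """Return (accuracy, PPV) for binary labels, where 1 marks a case."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    total = len(pairs)
    accuracy = (tp + tn) / total if total else float("nan")
    # PPV is undefined when the classifier predicts no positives.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return accuracy, ppv


# Toy labels invented for illustration only (not data from the study).
y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]
acc, ppv = accuracy_and_ppv(y_true, y_pred)
print(f"accuracy={acc:.2f}, PPV={ppv:.2f}")
```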


PLoS ONE ◽  
2014 ◽  
Vol 9 (11) ◽  
pp. e110900 ◽  
Author(s):  
Katherine I. Morley ◽  
Joshua Wallace ◽  
Spiros C. Denaxas ◽  
Ross J. Hunter ◽  
Riyaz S. Patel ◽  
...  

2018 ◽  
Vol 25 (4) ◽  
pp. 1538-1548 ◽  
Author(s):  
Sofie Wass ◽  
Vivian Vimarlund

In this study, we explore how healthcare professionals in primary care and outpatient clinics perceive the outcomes of giving patients online access to their electronic health records. The study was carried out as a case study and included a workshop, six interviews and a survey that was answered by 146 healthcare professionals. The results indicate that professionals working in primary care perceive that an increase in information-sharing with patients can increase adherence, clarify important information to the patient and allow the patient to quality-control documented information. Professionals at outpatient clinics seem less convinced about the benefits of patient-accessible electronic health records and have concerns about how patients manage the information that they are given access to. However, the patient-accessible electronic health record has not led to a change in documentation procedures among the majority of the professionals. While the findings can be connected to the context of outpatient clinics and primary care units, other contextual factors might influence the results and more in-depth studies are therefore needed to clarify the concerns.


2020 ◽  
Author(s):  
Nicholas B. Link ◽  
Selena Huang ◽  
Tianrun Cai ◽  
Zeling He ◽  
Jiehuan Sun ◽  
...  

ABSTRACT. Objective: The use of electronic health records (EHR) systems has grown over the past decade, and with it, the need to extract information from unstructured clinical narratives. Clinical notes, however, frequently contain acronyms with several potential senses (meanings) and traditional natural language processing (NLP) techniques cannot differentiate between these senses. In this study we introduce an unsupervised method for acronym disambiguation, the task of classifying the correct sense of acronyms in the clinical EHR notes. Methods: We developed an unsupervised ensemble machine learning (CASEml) algorithm to automatically classify acronyms by leveraging semantic embeddings, visit-level text and billing information. The algorithm was validated using note data from the Veterans Affairs hospital system to classify the meaning of three acronyms: RA, MS, and MI. We compared the performance of CASEml against another standard unsupervised method and a baseline metric selecting the most frequent acronym sense. We additionally evaluated the effects of RA disambiguation on NLP-driven phenotyping of rheumatoid arthritis. Results: CASEml achieved accuracies of 0.947, 0.911, and 0.706 for RA, MS, and MI, respectively, higher than a standard baseline metric and (on average) higher than a state-of-the-art unsupervised method. As well, we demonstrated that applying CASEml to medical notes improves the AUC of a phenotype algorithm for rheumatoid arthritis. Conclusion: CASEml is a novel method that accurately disambiguates acronyms in clinical notes and has advantages over commonly used supervised and unsupervised machine learning approaches. In addition, CASEml improves the performance of NLP tasks that rely on ambiguous acronyms, such as phenotyping.
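The abstract compares CASEml against a baseline that always selects the most frequent acronym sense. A minimal sketch of such a baseline follows. This is not CASEml itself, and the function names, the sense labels for "RA", and the way frequencies are estimated from a small labelled sample are all assumptions made for illustration.

```python
from collections import Counter


def most_frequent_sense(observed_senses):
    """Return the sense that occurs most often in a sample of labelled mentions."""
    if not observed_senses:
        raise ValueError("need at least one labelled mention")
    return Counter(observed_senses).most_common(1)[0][0]


def accuracy(predicted, gold):
    """Fraction of mentions whose predicted sense matches the gold sense."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Toy sense labels for the acronym "RA", invented for illustration only.
sample = ["rheumatoid arthritis", "right atrium", "rheumatoid arthritis",
          "room air", "rheumatoid arthritis"]
held_out = ["rheumatoid arthritis", "room air",
            "rheumatoid arthritis", "right atrium"]

majority = most_frequent_sense(sample)
predictions = [majority] * len(held_out)  # same answer for every mention
baseline_acc = accuracy(predictions, held_out)
print(majority, baseline_acc)
```

Because this baseline ignores context entirely, its accuracy equals the share of mentions that carry the dominant sense, which is why it serves as a floor for context-aware methods.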


2017 ◽  
Vol 31 (19-21) ◽  
pp. 1740055 ◽  
Author(s):  
Jiang Xie ◽  
Yan Liu ◽  
Xu Zeng ◽  
Wu Zhang ◽  
Zhen Mei

An extensive, in-depth study of diabetes risk factors (DBRF) is of crucial importance to prevent (or reduce) the chance of suffering from type 2 diabetes (T2D). Accumulation of electronic health records (EHRs) makes it possible to build nonlinear relationships between risk factors and diabetes. However, current DBRF research mainly focuses on qualitative analyses, and the inconsistency of physical examination items makes risk factors likely to be lost, which drives us to study a novel machine learning approach for risk model development. In this paper, we use Bayesian networks (BNs) to analyze the relationship between physical examination information and T2D, and to quantify the link between risk factors and T2D. Furthermore, with the quantitative analyses of DBRF, we adopt EHR and propose a machine learning approach based on BNs to predict the risk of T2D. The experiments demonstrate that our approach can lead to better predictive performance than the classical risk model.
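How a Bayesian network quantifies the link between risk factors and T2D can be sketched with a three-node toy network. The structure (two independent risk factors, `B` for high BMI and `F` for family history, both parents of `D` for T2D) and every probability below are hypothetical values chosen for illustration; they are not taken from the paper.

```python
from itertools import product

# Hypothetical priors for the two risk factors (1 = present, 0 = absent).
P_B = {1: 0.3, 0: 0.7}  # high BMI
P_F = {1: 0.2, 0: 0.8}  # family history of diabetes

# Hypothetical conditional table: P(D = 1 | B = b, F = f).
P_D1 = {(1, 1): 0.40, (1, 0): 0.20, (0, 1): 0.15, (0, 0): 0.05}


def joint(b, f, d):
    """Joint probability P(B=b, F=f, D=d) from the network factorisation."""
    p_d1 = P_D1[(b, f)]
    return P_B[b] * P_F[f] * (p_d1 if d == 1 else 1.0 - p_d1)


def posterior_t2d(evidence):
    """P(D = 1 | evidence) by enumeration; evidence maps 'B' and/or 'F' to 0 or 1."""
    numerator = denominator = 0.0
    for b, f, d in product((0, 1), repeat=3):
        # Skip assignments that contradict the observed risk factors.
        if evidence.get("B", b) != b or evidence.get("F", f) != f:
            continue
        p = joint(b, f, d)
        denominator += p
        if d == 1:
            numerator += p
    return numerator / denominator


risk_unknown = posterior_t2d({})           # no risk factor observed
risk_high_bmi = posterior_t2d({"B": 1})    # family history marginalised out
print(round(risk_unknown, 3), round(risk_high_bmi, 3))
```

Enumeration is only practical for a handful of variables; a network over many physical examination items would normally rely on a dedicated inference library.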

