Data Mining Approach to Identify Disease Cohorts from Primary Care Electronic Medical Records: A Case of Diabetes Mellitus

2017 ◽  
Vol 10 (1) ◽  
pp. 16-27 ◽  
Author(s):  
Ebenezer S. Owusu Adjah ◽  
Olga Montvida ◽  
Julius Agbeve ◽  
Sanjoy K. Paul

Background: Identification of diseased patients from primary care based electronic medical records (EMRs) has methodological challenges that may impact epidemiologic inferences. Objective: To compare deterministic, clinically guided selection algorithms with probabilistic machine learning (ML) methodologies for their ability to identify patients with type 2 diabetes mellitus (T2DM) from large population-based EMRs from a nationally representative primary care database. Methods: Four cohorts of patients with T2DM were defined by a deterministic approach based on disease codes. The database was mined for a set of best predictors of T2DM, and the performance of six ML algorithms was compared based on cross-validated true positive rate, true negative rate, and area under the receiver operating characteristic curve. Results: In the database of 11,018,025 research-suitable individuals, 379,657 (3.4%) were coded to have T2DM. The logistic regression classifier was selected as the best ML algorithm and resulted in a cohort of 383,330 patients with potential T2DM. Eighty-three percent (83%) of this cohort had a T2DM code, and 16% of the patients with a T2DM code were not included in this ML cohort. Of those in the ML cohort without a disease code, 52% had at least one measure of elevated glucose level and 22% had received at least one prescription for antidiabetic medication. Conclusion: Deterministic cohort selection based on disease coding potentially introduces a significant misclassification problem. ML techniques allow testing for potential disease predictors and, given meaningful data input, are able to identify diseased cohorts in a holistic way.
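The three evaluation metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 0.5 decision threshold, and the toy labels and scores are all assumptions made for illustration, and the sketch scores a single held-out fold rather than a full cross-validation.

```python
def confusion_rates(y_true, y_pred):
    """Return (true positive rate, true negative rate) for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case outscores
    a random negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy held-out fold (hypothetical data, not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

tpr, tnr = confusion_rates(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(tpr, tnr, auc)
```

In a cross-validated comparison, these values would be computed on each held-out fold and then averaged per algorithm.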

Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 908-P
Author(s):  
SOSTENES MISTRO ◽  
THALITA V.O. AGUIAR ◽  
VANESSA V. CERQUEIRA ◽  
KELLE O. SILVA ◽  
JOSÉ A. LOUZADO ◽  
...  

2019 ◽  
Vol 15 (5) ◽  
pp. e1-e4 ◽  
Author(s):  
Daniel Martinez-Laguna ◽  
Alberto Soria-Castro ◽  
Cristina Carbonell-Abella ◽  
Pilar Orozco-López ◽  
Pilar Estrada-Laza ◽  
...  

2021 ◽  
Vol 9 ◽  
Author(s):  
Marti Catala ◽  
Ermengol Coma ◽  
Sergio Alonso ◽  
Enrique Álvarez-Lacalle ◽  
Silvia Cordomi ◽  
...  

Monitoring transmission is a prerequisite for containing COVID-19. We report on effective potential growth (EPG) as a novel measure for the early identification of local outbreaks based on primary care electronic medical records (EMRs) and PCR-confirmed cases. We also studied whether increasing EPG precedes local hospital and intensive care unit (ICU) admissions and mortality. This population-based cohort included all Catalan citizens' PCR test, hospitalization, ICU, and mortality records between 1 July 2020 and 13 September 2020; linked EMRs covering 88.6% of the Catalan population were obtained. Nursing home residents were excluded. COVID-19 counts were ascertained from EMRs and PCRs separately. Weekly empirical propagation (ρ7) and 14-day cumulative incidence (A14), with 95% confidence intervals, were estimated at the care management area (CMA) level and combined as EPG = ρ7 × A14. Overall, 7,607,201 and 6,798,994 people in 43 CMAs were included for PCR and EMR measures, respectively. A14, ρ7, and EPG increased in numerous CMAs during summer 2020. EMRs identified 2.70-fold more cases than PCRs, with similar trends, a median (interquartile range) of 2 (1) days earlier, and with better precision. Upticks in EPG preceded increases in local hospital admissions, ICU occupancy, and mortality. Increasing EPG identified localized outbreaks in Catalonia and preceded local hospital and ICU admissions and subsequent mortality. EMRs provided similar estimates to PCR, but some days earlier and with better precision. EPG is a useful tool for monitoring community transmission and for the early identification of local COVID-19 outbreaks.
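The combination EPG = ρ7 × A14 can be sketched in a few lines of Python. The abstract does not define how ρ7 is estimated, so the sketch takes it as a given input; the per-100,000 scaling of A14, the function names, and the example numbers are assumptions made for illustration only.

```python
def cumulative_incidence_14d(daily_cases, population, per=100_000):
    """A14: cases reported over the most recent 14 days per `per` inhabitants.

    The per-100,000 scaling is an assumption; the abstract does not state it.
    """
    if len(daily_cases) < 14:
        raise ValueError("need at least 14 days of counts")
    if population <= 0:
        raise ValueError("population must be positive")
    return sum(daily_cases[-14:]) * per / population


def effective_potential_growth(rho7, a14):
    """EPG = rho7 * A14, as stated in the abstract."""
    return rho7 * a14


# Hypothetical care management area: 10 cases per day, 200,000 inhabitants.
daily_cases = [10] * 14
a14 = cumulative_incidence_14d(daily_cases, population=200_000)
epg = effective_potential_growth(rho7=1.5, a14=a14)  # rho7 supplied externally
print(a14, epg)
```

Because EPG is a product, an area with low incidence but fast propagation can score as high as an area with high incidence and slow propagation.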


2021 ◽  
Vol 30 (5) ◽  
pp. 1124-1138
Author(s):  
Elisabet Rodriguez Llorian ◽  
Gregory Mason
