908-P: Predictive Model for Glycemic Level of People with Diabetes Mellitus Using Data Extracted from Electronic Medical Records in Primary Care

Diabetes ◽  
2020 ◽  
Vol 69 (Supplement 1) ◽  
pp. 908-P
Author(s):  
SOSTENES MISTRO ◽  
THALITA V.O. AGUIAR ◽  
VANESSA V. CERQUEIRA ◽  
KELLE O. SILVA ◽  
JOSÉ A. LOUZADO ◽  
...  
2017 ◽  
Vol 10 (1) ◽  
pp. 16-27 ◽  
Author(s):  
Ebenezer S. Owusu Adjah ◽  
Olga Montvida ◽  
Julius Agbeve ◽  
Sanjoy K. Paul

Background: Identification of diseased patients from primary care based electronic medical records (EMRs) has methodological challenges that may impact epidemiologic inferences. Objective: To compare deterministic, clinically guided selection algorithms with probabilistic machine learning (ML) methodologies for their ability to identify patients with type 2 diabetes mellitus (T2DM) from large population-based EMRs in a nationally representative primary care database. Methods: Four cohorts of patients with T2DM were defined by a deterministic approach based on disease codes. The database was mined for a set of best predictors of T2DM, and the performance of six ML algorithms was compared based on cross-validated true positive rate, true negative rate, and area under the receiver operating characteristic curve. Results: In the database of 11,018,025 research-suitable individuals, 379,657 (3.4%) were coded as having T2DM. The logistic regression classifier was selected as the best ML algorithm and resulted in a cohort of 383,330 patients with potential T2DM. Eighty-three percent (83%) of this cohort had a T2DM code, and 16% of the patients with a T2DM code were not included in this ML cohort. Of those in the ML cohort without a disease code, 52% had at least one measure of elevated glucose level and 22% had received at least one prescription for antidiabetic medication. Conclusion: Deterministic cohort selection based on disease coding may introduce a significant misclassification problem. ML techniques allow testing for potential disease predictors and, given meaningful data input, are able to identify diseased cohorts in a holistic way.
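
The three comparison metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 0.5 decision threshold, and the two toy folds are assumptions made for illustration, and the held-out scores are taken as already computed by some classifier.

```python
from statistics import mean
from typing import List, Sequence, Tuple

def tpr_tnr(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """True positive rate (sensitivity) and true negative rate (specificity)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)

def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random
    negative, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def cross_validated_metrics(
    folds: List[Tuple[Sequence[int], Sequence[float]]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the study
) -> Tuple[float, float, float]:
    """Average TPR, TNR and AUC over held-out folds of (labels, scores)."""
    tprs, tnrs, aucs = [], [], []
    for y_true, scores in folds:
        y_pred = [1 if s >= threshold else 0 for s in scores]
        tpr, tnr = tpr_tnr(y_true, y_pred)
        tprs.append(tpr)
        tnrs.append(tnr)
        aucs.append(roc_auc(y_true, scores))
    return mean(tprs), mean(tnrs), mean(aucs)

# Toy held-out predictions for two folds (hypothetical data).
toy_folds = [
    ([1, 1, 0, 0], [0.9, 0.4, 0.3, 0.6]),
    ([1, 0, 1, 0], [0.8, 0.2, 0.7, 0.1]),
]
mean_tpr, mean_tnr, mean_auc = cross_validated_metrics(toy_folds)
print(mean_tpr, mean_tnr, mean_auc)
```

Running the same function on each candidate algorithm's held-out scores gives one triple per algorithm, which is the kind of side-by-side comparison the abstract describes.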


2021 ◽  
Vol 30 (5) ◽  
pp. 1124-1138
Author(s):  
Elisabet Rodriguez Llorian ◽  
Gregory Mason

2021 ◽  
Author(s):  
Jiaming Zeng ◽  
Michael F. Gensheimer ◽  
Daniel L. Rubin ◽  
Susan Athey ◽  
Ross D. Shachter

Abstract: In medicine, randomized clinical trials (RCTs) are the gold standard for informing treatment decisions. Observational comparative effectiveness research (CER) is often plagued by selection bias, and expert-selected covariates may not be sufficient to adjust for confounding. We explore how the unstructured clinical text in electronic medical records (EMRs) can be used to reduce selection bias and improve medical practice. We develop a method based on natural language processing to uncover interpretable potential confounders from the clinical text. We validate our method by comparing the hazard ratio (HR) from survival analysis with and without the confounders against the results from established RCTs. We apply our method to four study cohorts built from localized prostate and lung cancer datasets from the Stanford Cancer Institute Research Database and show that our method adjusts the HR estimate towards the RCT results. We further confirm that the uncovered terms can be interpreted by an oncologist as potential confounders. This research helps enable more credible causal inference using data from EMRs, offers a transparent way to improve the design of observational CER, and could inform high-stakes medical decisions. Our method can also be applied to studies within and beyond medicine to extract important information from observational data to support decisions.


Author(s):  
Francesc X. Marin-Gomez ◽  
Jacobo Mendioroz-Peña ◽  
Miguel-Angel Mayer ◽  
Leonardo Méndez-Boo ◽  
Núria Mora ◽  
...  

Nursing homes have accounted for a significant part of SARS-CoV-2 mortality, causing great social alarm. Using data collected from electronic medical records of 1,319,839 institutionalised and non-institutionalised persons ≥ 65 years, the present study investigated the epidemiology and differential characteristics of these two population groups. Our results showed that the form of presentation of the epidemic outbreak, as well as some risk factors, differ between the institutionalised elderly population and those who are not institutionalised. In addition to a twenty-fold increase in the rate of adjusted mortality among institutionalised individuals, the peak incidence was delayed by approximately three weeks. Having dementia was shown to be a risk factor for death, and, unlike in the non-institutionalised group, neither obesity nor age was shown to be significantly associated with the risk of death among the institutionalised. These differential characteristics should help guide the actions to be taken by the health administration in the event of a similar infectious situation among institutionalised elderly people.


2010 ◽  
Vol 47 (8) ◽  
pp. 895-912 ◽  
Author(s):  
Janice P. Minard ◽  
Scott E. Turcotte ◽  
M. Diane Lougheed
