Automated mapping of laboratory tests to LOINC codes using noisy labels in a national electronic health record system database

2018 ◽  
Vol 25 (10) ◽  
pp. 1292-1300 ◽  
Author(s):  
Sharidan K Parr ◽  
Matthew S Shotwell ◽  
Alvin D Jeffery ◽  
Thomas A Lasko ◽  
Michael E Matheny

Abstract Objective: Standards such as the Logical Observation Identifiers Names and Codes (LOINC®) are critical for interoperability and integrating data into common data models, but are inconsistently used. Without consistent mapping to standards, clinical data cannot be harmonized, shared, or interpreted in a meaningful context. We sought to develop an automated machine learning pipeline that leverages noisy labels to map laboratory data to LOINC codes. Materials and Methods: Across 130 sites in the Department of Veterans Affairs Corporate Data Warehouse, we selected the 150 most commonly used laboratory tests with numeric results per site from 2000 through 2016. Using source data text and numeric fields, we developed a machine learning model and manually validated random samples from both labeled and unlabeled datasets. Results: The raw laboratory data consisted of >6.5 billion test results, with 2215 distinct LOINC codes. The model predicted the correct LOINC code in 85% of the unlabeled data and 96% of the labeled data by test frequency. In the subset of labeled data where the original and model-predicted LOINC codes disagreed, the model-predicted LOINC code was correct in 83% of the data by test frequency. Conclusion: Using a completely automated process, we are able to assign LOINC codes to unlabeled data with high accuracy. When the model-predicted LOINC code differed from the original LOINC code, the model prediction was correct in the vast majority of cases. This scalable, automated algorithm may improve data quality and interoperability, while substantially reducing the manual effort currently needed to accurately map laboratory data.
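The accuracy figures above are reported "by test frequency". One plausible reading, sketched here as an assumption rather than the authors' published procedure, is that each distinct laboratory test is weighted by the number of results it contributes, so a correct mapping for a high-volume test counts for more than one for a rare test. The function name and the sample numbers are hypothetical.

```python
from typing import Iterable, Tuple


def frequency_weighted_accuracy(tests: Iterable[Tuple[int, bool]]) -> float:
    """Share of individual results whose test received the correct code.

    Each item is (result_count, is_correct) for one distinct laboratory test.
    This weighting scheme is an assumed interpretation of "by test frequency".
    """
    total = 0
    correct = 0
    for result_count, is_correct in tests:
        total += result_count
        if is_correct:
            correct += result_count
    if total == 0:
        raise ValueError("no test results supplied")
    return correct / total


# Hypothetical data: (number of results, was the predicted code correct?)
sample = [(9000, True), (800, True), (200, False)]
print(frequency_weighted_accuracy(sample))  # 0.98
```

In this toy sample two of three distinct tests are mapped correctly (67% by test), yet 98% of results are covered by a correct mapping, which is why the weighting choice matters when reading the reported percentages.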

2019 ◽  
Vol 28 (01) ◽  
pp. 101-101

Eichstaedt JC, Smith RJ, Merchant RM, Ungar LH, Crutchley P, Preotiuc-Pietro D, Asch DA, Schwartz HA. Facebook language predicts depression in medical records. Proc Natl Acad Sci U S A 2018;115(44):11203-8 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6217418/
Parr SK, Shotwell MS, Jeffery AD, Lasko TA, Matheny ME. Automated mapping of laboratory tests to LOINC codes using noisy labels in a national electronic health record system database. J Am Med Inform Assoc 2018;25(10):1292-300 https://academic.oup.com/jamia/article-abstract/25/10/1292/5075874?redirectedFrom=fulltext
Xiao C, Ma T, Dieng AB, Blei DM, Wang F. Readmission prediction via deep contextual embedding of clinical concepts. PLoS One 2018;13(4):e0195024 https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5890980/


2022 ◽  
Vol 12 ◽  
Author(s):  
Wei Chen ◽  
Xiangkui Li ◽  
Lu Ma ◽  
Dong Li

Objective: The accurate evaluation of outcomes at a personalized level in patients with intracerebral hemorrhage (ICH) has critical clinical implications. This study aims to evaluate how machine learning integrates with routine laboratory tests and electronic health record (EHR) data to predict inpatient mortality after ICH. Methods: In this machine learning-based prognostic study, we included 1,835 consecutive patients with acute ICH between October 2010 and December 2018. The model building process incorporated five pre-implant ICH score variables (clinical features) and 13 out of 59 available routine laboratory parameters. We assessed model performance according to a range of learning metrics, such as the mean area under the receiver operating characteristic curve (AUROC). We also used the Shapley additive explanation algorithm to explain the prediction model. Results: Machine learning models using laboratory data achieved AUROCs of 0.71–0.82 in a split-by-year development/testing scheme. The non-linear eXtreme Gradient Boosting model yielded the highest prediction accuracy. In the held-out validation set of the development cohort, the predictive model using comprehensive clinical and laboratory parameters outperformed the model using clinical parameters alone in predicting in-hospital mortality (AUROC [95% bootstrap confidence interval], 0.899 [0.897–0.901] vs. 0.875 [0.872–0.877]; P <0.001), with over 81% accuracy, sensitivity, and specificity. We observed similar performance in the testing set. Conclusions: Machine learning integrated with routine laboratory tests and EHRs could significantly improve the accuracy of inpatient ICH mortality prediction. This multidimensional composite prediction strategy might become an intelligent assistive prediction tool for ICH risk reclassification and offer an example for precision medicine.
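The abstract reports AUROC values with 95% bootstrap confidence intervals. The sketch below shows one common way such an interval can be obtained, a percentile bootstrap over resampled patients, using only the Python standard library. The abstract does not say which bootstrap variant the authors used, so the percentile method, the function names, and the default of 1000 resamples are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a positive case outranks a negative one.

    Labels are 0/1; tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(
    labels: Sequence[int],
    scores: Sequence[float],
    n_boot: int = 1000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap interval for the AUROC (assumed variant)."""
    n = len(labels)
    if not 0 < sum(labels) < n:
        raise ValueError("labels must contain both classes")
    rng = random.Random(seed)
    stats: List[float] = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

The pairwise loop is quadratic in the number of cases, which is fine for a sketch; a rank-based formulation would be the usual choice for a cohort of this size.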


2021 ◽  
Author(s):  
Camilo E. Valderrama ◽  
Daniel J. Niven ◽  
Henry T. Stelfox ◽  
Joon Lee

BACKGROUND Redundancy in laboratory blood tests is common in intensive care units (ICU), affecting patients' health and increasing healthcare expenses. Medical communities have made recommendations to order laboratory tests more judiciously. Wise selection can rely on modern data-driven approaches that have been shown to help identify redundant laboratory blood tests in ICUs. However, most of these works have been developed for highly selected clinical conditions such as gastrointestinal bleeding. Moreover, features based on conditional entropy and conditional probability distribution have not been used to inform the need for performing a new test. OBJECTIVE We aimed to address the limitations of previous works by adapting conditional entropy and conditional probability to extract features to predict abnormal laboratory blood test results. METHODS We used an ICU dataset collected across Alberta, Canada which included 55,689 ICU admissions from 48,672 patients with different diagnoses. We investigated conditional entropy and conditional probability-based features by comparing the performances of two machine learning approaches to predict normal and abnormal results for 18 blood laboratory tests. Approach 1 used patients' vitals, age, sex, admission diagnosis, and other laboratory blood test results as features. Approach 2 used the same features plus the new conditional entropy and conditional probability-based features. RESULTS Across the 18 blood laboratory tests, both Approach 1 and Approach 2 achieved a median F1-score, AUC, precision-recall AUC, and Gmean above 80%. We found that the inclusion of the new features statistically significantly improved the capacity to predict abnormal laboratory blood test results in between ten and fifteen laboratory blood tests depending on the machine learning model. CONCLUSIONS Our novel approach with promising prediction results can help reduce over-testing in ICUs, as well as risks for patients and healthcare systems. 
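To make the conditional entropy idea concrete, the following minimal sketch estimates H(Y | X) in bits from paired categorical observations, for example a previous result category X and the next result category Y for the same test. A value near zero suggests the next result is largely predictable from the previous one. The abstract does not describe how the study constructed its features, so the pairing and the function name are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def conditional_entropy(x: Sequence[Hashable], y: Sequence[Hashable]) -> float:
    """Estimate H(Y | X) in bits from paired observations (x[i], y[i])."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and equally long")
    n = len(x)
    joint = Counter(zip(x, y))   # counts of each (x, y) pair
    marginal_x = Counter(x)      # counts of each x value
    h = 0.0
    for (xi, _), count in joint.items():
        p_xy = count / n                      # P(X = xi, Y = yi)
        p_y_given_x = count / marginal_x[xi]  # P(Y = yi | X = xi)
        h -= p_xy * math.log2(p_y_given_x)
    return h


# Hypothetical sequence: previous vs. next result category for one test
previous = ["normal", "normal", "abnormal", "abnormal"]
following = ["normal", "abnormal", "abnormal", "abnormal"]
print(conditional_entropy(previous, following))  # 0.5
```

In the toy data an abnormal previous result is always followed by an abnormal one (no uncertainty), while a normal previous result is followed by either outcome equally often (one bit), giving an average of 0.5 bits.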


2006 ◽  
Vol 52 (2) ◽  
pp. 325-328 ◽  
Author(s):  
Paul Froom ◽  
Zvi Shimoni

Abstract Background: The aim of this study was to explore whether electronically retrieved laboratory data can predict mortality in internal medicine departments in a regional hospital. Methods: All 10 308 patients hospitalized in internal medicine departments over a 1-year period were included in the cohort. Nearly all patients had a complete blood count and basic clinical chemistries on admission. We used logistic regression analysis to predict the 573 deaths (5.6%), including all variables that added significantly to the model. Results: Eight laboratory variables and age significantly and independently contributed to a logistic regression model (area under the ROC curve, 88.7%). The odds ratio for the final model per quartile of risk was 6.44 (95% confidence interval, 5.42–7.64), whereas for age alone, the odds ratio per quartile was 2.01 (95% confidence interval, 1.84–2.19). Conclusions: A logistic regression model including only age and electronically retrieved laboratory data highly predicted mortality in internal medicine departments in a regional hospital, suggesting that age and routine admission laboratory tests might be used to ensure a fair comparison when using mortality monitoring for hospital quality control.


2021 ◽  
Vol 15 (5) ◽  
pp. 76-79
Author(s):  
E. S. Aronova ◽  
B. S. Belov

The article describes a clinical observation of the onset of polyarthritis after COVID-19. Clinical data, the results of laboratory tests and instrumental methods over time, and approaches to therapy are presented. The discussion reflects modern views on the causes of articular syndrome after SARS-CoV-2 infection, with special attention to the need for a careful study of the history and the clinical and laboratory data of patients with COVID-19.


2022 ◽  
Vol 12 (1) ◽  
pp. 112
Author(s):  
Rui Guo ◽  
Renjie Zhang ◽  
Ran Liu ◽  
Yi Liu ◽  
Hao Li ◽  
...  

Spontaneous intracerebral hemorrhage (SICH) has been common in China with high morbidity and mortality rates. This study aims to develop a machine learning (ML)-based predictive model for the 90-day evaluation after SICH. We retrospectively reviewed 751 patients with SICH diagnosis and analyzed clinical, radiographic, and laboratory data. A modified Rankin scale (mRS) of 0–2 was defined as a favorable functional outcome, while an mRS of 3–6 was defined as an unfavorable functional outcome. We evaluated 90-day functional outcome and mortality to develop six ML-based predictive models and compared their efficacy with a traditional risk stratification scale, the intracerebral hemorrhage (ICH) score. The predictive performance was evaluated by the areas under the receiver operating characteristic curves (AUC). A total of 553 patients (73.6%) reached the functional outcome at the 3rd month, with a 90-day mortality rate of 10.2%. Logistic regression (LR) and logistic regression CV (LRCV) showed the best predictive performance for functional outcome (AUC = 0.890 and 0.887, respectively), and category boosting presented the best predictive performance for mortality (AUC = 0.841). Therefore, ML might be of potential assistance in the prediction of the prognosis of SICH.


2019 ◽  
Author(s):  
Clara Fannjiang ◽  
T. Aran Mooney ◽  
Seth Cones ◽  
David Mann ◽  
K. Alex Shorter ◽  
...  

Abstract: Zooplankton occupy critical roles in marine ecosystems, yet their fine-scale behavior remains poorly understood due to the difficulty of studying individuals in situ. Here we combine biologging with supervised machine learning (ML) to demonstrate a pipeline for studying in situ behavior of larger zooplankton such as jellyfish. We deployed the ITAG, a biologging package with high-resolution motion sensors designed for soft-bodied invertebrates, on 8 Chrysaora fuscescens in Monterey Bay, using the tether method for retrieval. Using simultaneous video footage of the tagged jellyfish, we develop ML methods to 1) identify periods of tag data corrupted by the tether method, which may have compromised prior research findings, and 2) classify jellyfish behaviors. Our tools yield characterizations of fine-scale jellyfish activity and orientation over long durations, and provide evidence that developing behavioral classifiers on in situ rather than laboratory data is essential. Summary Statement: High-resolution motion sensors paired with supervised machine learning can be used to infer fine-scale in situ behavior of zooplankton for long durations.


2021 ◽  
Author(s):  
Zhenhao Li

Tuberculosis (TB) is a precipitating cause of lung cancer. Lung cancer coexisting with TB is difficult to differentiate from isolated TB. The aim of this study is to develop a prediction model that distinguishes patients with both diseases from patients with TB alone. In this work, based on the laboratory data from 389 patients, 81 features, including the main laboratory examinations (blood tests, biochemical tests, coagulation assays, and tumor markers) and baseline information, were initially used as integrated markers and then reduced to form a discrimination system consisting of 31 top-ranked indices. Patients with TB PCR >1mtb/ml served as negative samples; lung cancer patients with TB, confirmed by pathological examination and TB PCR >1mtb/ml, served as positive samples. We used the Spatially Uniform ReliefF (SURF) algorithm to determine feature importance, and the predictive model was built using the Random Forest machine learning algorithm. For cross-validation, the samples were randomly split into four training sets and one test set. The selected features are composed of five tumor markers (Scc, Cyfra21-1, CEA, ProGRP and NSE), fifteen blood biochemical indices (GLU, IBIL, K, CL, Ur, NA, TBA, CHOL, SA, TG, A/G, AST, CA, CREA and CRP), six routine blood indices (EO#, EO%, MCV, RDW-S, LY# and MPV) and four coagulation indices (APTT ratio, APTT, PTA, TT ratio). This model presented a robust and stable classification performance, differentiating the comorbidity group from the isolated TB group with AUC, ACC, sensitivity and specificity of 0.8817, 0.8654, 0.8594 and 0.8656 for the training set, respectively. Overall, this work may provide a novel strategy for identifying TB patients with lung cancer from routine admission laboratory examinations, with the advantages of being timely and economical. It also indicates that a model with enough indices may further increase the effectiveness and efficiency of diagnosis.
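The accuracy (ACC), sensitivity, and specificity quoted above are all derived from the same confusion matrix. As a reminder of how they relate, here is a minimal sketch that computes them from binary labels, where 1 stands for the positive class (lung cancer with TB) and 0 for the negative class (isolated TB). The function name and the example labels are hypothetical and are not taken from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary 0/1 labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and equally long")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # recall on the positive class
        "specificity": tn / (tn + fp),  # recall on the negative class
    }


# Hypothetical predictions for eight patients
truth = [1, 1, 1, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 0, 0, 0, 0, 1]
print(classification_metrics(truth, preds))
```

Because accuracy mixes both classes, it can look strong on imbalanced data even when sensitivity is poor, which is one reason abstracts such as this one report all three alongside the AUC.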


2020 ◽  
Vol 21 (4) ◽  
pp. 147032032098132
Author(s):  
Yang Xue ◽  
Shaoqing Sun ◽  
Jianing Cai ◽  
Linwen Zeng ◽  
Shihui Wang ◽  
...  

Background: The clinical use of angiotensin-converting enzyme inhibitors (ACEI) and angiotensin-receptor blockers (ARB) in patients with COVID-19 infection remains controversial. Therefore, we performed a meta-analysis comparing disease symptoms and laboratory tests between hypertensive patients with COVID-19 who used ACEI/ARB and those who did not. Methods: We systematically searched the relevant literature in Pubmed, Embase, EuropePMC, CNKI, and other databases for the study period from 31 December 2019 to 15 March 2020, and analyzed the differences in symptoms and laboratory tests between patients with COVID-19 and hypertension who used ACEI/ARB drugs and those who did not. All statistical analyses were performed with REVMAN5.3. Results: We included a total of 1808 patients with hypertension diagnosed with COVID-19 in six studies. D-dimer was lower in the ACEI/ARB group (SMD = −0.22, 95% CI: −0.36 to −0.06), and the odds of fever were lower (OR = 0.74, 95% CI: 0.55 to 0.98). Other laboratory data and symptoms showed no statistically significant difference, but creatinine tended to rise (SMD = 0.22, 95% CI: 0.04 to 0.41). Conclusion: We found that the administration of ACEI/ARB drugs had a positive effect on reducing D-dimer and the number of people with fever. It had no significant effect on other laboratory tests (creatinine excepted) or symptoms in patients with COVID-19, while special attention is still needed in patients with renal insufficiency.
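For readers unfamiliar with how an odds ratio and its 95% confidence interval arise, the sketch below computes both for a single 2×2 table using the standard log-odds (Woolf) method. It is only an illustration of the quantity being reported: the meta-analysis pooled estimates across six studies in RevMan, and that pooling step is not reproduced here. The function name and the example counts are hypothetical.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and Wald interval on the log scale for one 2x2 table.

    a: treated with event      b: treated without event
    c: untreated with event    d: untreated without event
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all cells must be positive for the log-odds method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: fever in 20 of 100 treated and 30 of 100 untreated patients
estimate, low, high = odds_ratio_ci(20, 80, 30, 70)
print(f"OR = {estimate:.2f}, 95% CI: {low:.2f} to {high:.2f}")
```

An interval whose upper bound stays below 1, as in the reported fever result (0.55 to 0.98), indicates lower odds in the treated group at the conventional 5% significance level.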

