Improving the Coding Completeness of Hypertension in Inpatient Administrative Health Data Using Machine Learning Methods

Author(s):  
Adam D'Souza ◽  
Zhiyang Liang ◽  
Tyler Williamson ◽  
Tony Smith ◽  
Hude Quan ◽  
...  

Introduction: The Discharge Abstract Database (DAD) associates ICD-10-CA diagnosis codes with inpatient care episodes at acute-care facilities. Codes are assigned by human coders, based on chart review. Coding guidelines stipulate mandatory coding of major and fatal conditions but only optional coding of secondary conditions, which results in undercoding for many conditions. Objectives and Approach: This research evaluates machine learning approaches for identifying and completing records with missing codes, to improve data quality. The Alberta Hospital DAD for 2013-14 was used in this study. We assumed that the existing ICD-10-CA codes in the DAD are correct, and used them as training examples. Several ML classifiers, including logistic regression and random forest, were used to develop models to assess the coding probability, using existing codes and demographic information. A total of 3300 chart-review records were used as the reference standard. We focused on hypertension-related codes. Validity of raw diagnosis codes in the DAD was used as the baseline. Results: A record is deemed to have a missing hypertension diagnosis code if the predicted probability is high but the coders did not assign the diagnosis code. In the baseline, the original hypertension codes have high PPV (ranging from 0.902 for the age group 35-54 to 1.000 for the age group 18-34) but low sensitivity (ranging from 0.200 for the age group 18-34 to 0.565 for the age group 75+). The most successful models that we have tested so far have provided improvements of 2-6% in the sensitivity, while maintaining the PPV. More improvement is generally seen for the younger age groups. Initial experiments indicate greater improvements in sensitivity may be possible for other conditions, such as peptic ulcer disease and cerebrovascular disease. Conclusion/Implications: Machine learning approaches can be useful and cost-effective for improving data quality in DAD. 
While the improvements in sensitivity relative to the baseline are modest at present, further experiments with different models and feature sets are warranted. Experiments with other conditions may also be fruitful.
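The flagging rule and the two validity measures described in this abstract can be illustrated with a minimal Python sketch. The `Record` fields, the 0.9 threshold, and the six toy records are assumptions made for illustration only; the abstract does not give the authors' threshold, features, or data.

```python
from dataclasses import dataclass


@dataclass
class Record:
    coded: bool          # hypertension code already present in the DAD
    probability: float   # model's predicted probability (hypothetical values)
    chart_review: bool   # reference standard from chart review


def sensitivity_and_ppv(predicted, actual):
    """Return (sensitivity, PPV) of predicted labels against the reference."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    sensitivity = tp / (tp + fn) if tp + fn else float("nan")
    ppv = tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, ppv


def complete_codes(records, threshold=0.9):
    """Treat a record as positive if it was coded, or if the model's
    probability is at least `threshold` (an assumed cut-off)."""
    return [r.coded or r.probability >= threshold for r in records]


# Toy data, not taken from the study.
records = [
    Record(True, 0.95, True),
    Record(False, 0.92, True),   # uncoded but flagged by the model
    Record(False, 0.40, True),   # uncoded and still missed
    Record(False, 0.10, False),
    Record(False, 0.30, False),
    Record(True, 0.80, True),
]
actual = [r.chart_review for r in records]
baseline = sensitivity_and_ppv([r.coded for r in records], actual)
completed = sensitivity_and_ppv(complete_codes(records), actual)
```

On this toy data the completed codes raise sensitivity while PPV is unchanged, which is the pattern the abstract reports at a much smaller magnitude.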

Author(s):  
Jeffrey G Klann ◽  
Griffin M Weber ◽  
Hossein Estiri ◽  
Bertrand Moal ◽  
Paul Avillach ◽  
...  

Abstract Introduction The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing COVID-19 with federated analyses of electronic health record (EHR) data. Objective We sought to develop and validate a computable phenotype for COVID-19 severity. Methods Twelve 4CE sites participated. First we developed an EHR-based severity phenotype consisting of six code classes, and we validated it on patient hospitalization data from the 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also piloted an alternative machine-learning approach and compared selected predictors of severity to the 4CE phenotype at one site. Results The full 4CE severity phenotype had pooled sensitivity of 0.73 and specificity 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of individual code categories for acuity had high variability - up to 0.65 across sites. At one pilot site, the expert-derived phenotype had mean AUC 0.903 (95% CI: 0.886, 0.921), compared to AUC 0.956 (95% CI: 0.952, 0.959) for the machine-learning approach. Billing codes were poor proxies of ICU admission, with as low as 49% precision and recall compared to chart review. Discussion We developed a severity phenotype using 6 code classes that proved resilient to coding variability across international institutions. In contrast, machine-learning approaches may overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold-standard outcomes, possibly due to heterogeneous pandemic conditions. Conclusion We developed an EHR-based severity phenotype for COVID-19 in hospitalized patients and validated it at 12 international sites.
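The AUC figures compared in this abstract can be read as the probability that a randomly chosen patient with the outcome receives a higher score than a randomly chosen patient without it. The sketch below computes AUC that way in plain Python; the scores and labels are invented for illustration and the function is not the 4CE implementation.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    Counts, over all (positive, negative) pairs, how often the positive
    case scores higher; ties count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical severity scores and ICU-or-death outcomes.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0]
example_auc = auc(example_scores, example_labels)
```

The pairwise form is quadratic in the number of patients, so it suits small illustrations; a rank-based computation would be the usual choice at scale.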


2018 ◽  
Vol 23 (6) ◽  
pp. 460-465
Author(s):  
Jordan Anderson ◽  
Sevilay Dalabih ◽  
Esma Birisi ◽  
Abdallah Dalabih

OBJECTIVES Chloral hydrate had been extensively used for children undergoing sedation for imaging studies, but after the manufacturer discontinued production, pediatric sedation providers explored alternative sedation medications. Those medications needed to be at least as safe and as effective as chloral hydrate. In this study, we examined whether pentobarbital is a suitable replacement for chloral hydrate. METHODS Subjects who received pentobarbital were recruited from a prospectively collected database, whereas we used a retrospective chart review to study subjects who received chloral hydrate. Sedation success was defined as the ability to provide adequate sedation using a single medication. We included electively performed sedations for subjects aged 2 months to 3 years who received either pentobarbital or chloral hydrate orally. We excluded subjects stratified as American Society of Anesthesiologists category III or higher and those who received sedation for electroencephalogram. The data collected captured subject demographics and complications. RESULTS Five hundred thirty-four subjects were included in the final analysis, 368 in the chloral hydrate group and 166 in the pentobarbital group. Subjects who received pentobarbital had a statistically significant higher success rate [136 (82%) vs 238 (65%), p < 0.001], but longer sleeping time (18.1% vs 0%, p < 0.001) in all age groups. Subjects who received chloral hydrate had a higher risk of airway complications in the <1 year of age group (6.5% vs 1.8%, p = 0.03). CONCLUSIONS For pediatric patients younger than 3 years of age undergoing sedation for imaging studies, oral pentobarbital may be at least as effective and as safe as chloral hydrate, making it an acceptable and practical alternative.


2020 ◽  
Vol 54 (2) ◽  
pp. 215-234
Author(s):  
M.N. Doja ◽  
Ishleen Kaur ◽  
Tanvir Ahmad

Purpose: The incidence of prostate cancer has been increasing over the past few decades. Various studies have tried to determine the survival of patients, but metastatic prostate cancer is still not extensively explored. The survival rate of metastatic prostate cancer is much lower than that of the earlier stages. The study aims to investigate the survivability of metastatic prostate cancer based on the age group to which a patient belongs, and the difference between the significance of the attributes for different age groups. Design/methodology/approach: Data on metastatic prostate cancer patients were collected from a cancer hospital in India. Two predictive models were built for the analysis: one for the complete dataset, and the other for separate age groups. Machine learning was applied to both models and their accuracies were compared. Information gain for each model was also evaluated to determine the significant predictors for each age group. Findings: The ensemble approach gave the best result, an accuracy of 81.4% for the complete dataset, and thus was used for the age-specific models. The age-specific model had a direct average accuracy of 83.74% and a weighted average accuracy of 79.9%, with the highest accuracy for patients younger than 60. Originality/value: The study developed a model that predicts the survival of metastatic prostate cancer based on age. The study will be able to assist clinicians in determining the best course of treatment for each patient based on ECOG, age and comorbidities.
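The abstract does not define its "direct" and "weighted" averages. A plausible reading, assumed here, is that the direct average is the unweighted mean of the per-age-group accuracies and the weighted average weights each group by its number of patients. The group sizes and accuracies in this Python sketch are invented for illustration.

```python
def direct_and_weighted_accuracy(groups):
    """groups: list of (n_patients, accuracy) pairs, one per age group.

    Returns (direct, weighted), where `direct` is the plain mean of the
    group accuracies and `weighted` weights each by its group size.
    """
    if not groups:
        raise ValueError("at least one group is required")
    direct = sum(acc for _, acc in groups) / len(groups)
    total = sum(n for n, _ in groups)
    weighted = sum(n * acc for n, acc in groups) / total
    return direct, weighted


# Hypothetical: a small group with high accuracy, a large group with lower.
direct, weighted = direct_and_weighted_accuracy([(20, 0.90), (80, 0.80)])
```

Under this reading, a direct average above the weighted average (as in 83.74% versus 79.9%) would mean the smaller age groups were predicted more accurately than the larger ones.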


Antibiotics ◽  
2020 ◽  
Vol 9 (9) ◽  
pp. 536
Author(s):  
George Germanos ◽  
Patrick Light ◽  
Roger Zoorob ◽  
Jason Salemi ◽  
Fareed Khan ◽  
...  

Objective: To validate the use of electronic algorithms based on International Classification of Diseases (ICD)-10 codes to identify outpatient visits for urinary tract infections (UTI), one of the most common reasons for antibiotic prescriptions. Methods: ICD-10 symptom codes (e.g., dysuria) alone or in addition to UTI diagnosis codes plus prescription of a UTI-relevant antibiotic were used to identify outpatient UTI visits. Chart review (gold standard) was performed by two reviewers to confirm diagnosis of UTI. The positive predictive value (PPV) that the visit was for UTI (based on chart review) was calculated for three different ICD-10 code algorithms using (1) symptoms only, (2) diagnosis only, or (3) both. Results: Of the 1087 visits analyzed, symptom codes only had the lowest PPV for UTI (PPV = 55.4%; 95%CI: 49.3–61.5%). Diagnosis codes alone resulted in a PPV of 85% (PPV = 84.9%; 95%CI: 81.1–88.2%). The highest PPV was obtained by using both symptom and diagnosis codes together to identify visits with UTI (PPV = 96.3%; 95%CI: 94.5–97.9%). Conclusions: ICD-10 diagnosis codes with or without symptom codes reliably identify UTI visits; symptom codes alone are not reliable. ICD-10 based algorithms are a valid method to study UTIs in primary care settings.
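A PPV with a 95% confidence interval of the kind reported here can be computed from two counts: visits flagged by the algorithm and flagged visits confirmed by chart review. The sketch below uses the Wilson score interval, which is an assumption; the abstract does not say which interval method was used, and the counts are hypothetical.

```python
import math


def ppv_with_wilson_ci(confirmed, flagged, z=1.96):
    """Return (ppv, lower, upper) using the Wilson score interval.

    confirmed: flagged visits confirmed as UTI by chart review
    flagged:   all visits the code algorithm identified
    z:         normal quantile (1.96 for an approximate 95% interval)
    """
    if flagged <= 0 or not 0 <= confirmed <= flagged:
        raise ValueError("need 0 <= confirmed <= flagged and flagged > 0")
    p = confirmed / flagged
    denom = 1 + z * z / flagged
    centre = (p + z * z / (2 * flagged)) / denom
    half = z * math.sqrt(
        p * (1 - p) / flagged + z * z / (4 * flagged * flagged)
    ) / denom
    return p, centre - half, centre + half


# Hypothetical counts, not taken from the study.
ppv, lower, upper = ppv_with_wilson_ci(85, 100)
```

The Wilson interval stays inside [0, 1] even when the PPV is close to 100%, which matters for the high values reported for the combined algorithm.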


2020 ◽  
Vol 8 ◽  
Author(s):  
Qun Miao ◽  
Aideen M. Moore ◽  
Shelley D. Dougan

Background: Congenital anomalies (CAs) are a major cause of infant morbidity and mortality in Canada. Reliably identifying CAs is essential for CA surveillance and research. The main objective of this study was to assess the agreement of eight sentinel anomalies including: neural tube defects (NTD), orofacial clefts, limb deficiency defects (LDD), Down syndrome (DS), tetralogy of Fallot (TOF), gastroschisis (GS), hypoplastic left heart syndrome (HLHS) and transposition of great vessels (TGA) captured in the BORN Information System (BIS) database and the Canadian Institute for Health Information (CIHI) Discharge Abstract Database (DAD). Methods: Live birth and stillbirth records between the BIS and CIHI-DAD in the fiscal years of 2012–2013 to 2015–2016 were linked using 10 digit infant Ontario Health Insurance Plan (OHIP) numbers. Percent agreement and Kappa statistics were performed to assess the reliability (agreement) of CAs identified in the linked BIS and CIHI-DAD birth records. Then, further investigations were conducted on those CA cases identified in the CIHI-DAD only. Results: Kappa coefficients of the eight selected CAs between BIS (“Confirmed” or “Suspected” cases) and CIHI-DAD were 0.96 (95% CI: 0.93–0.98) for GS; 0.81 (95% CI: 0.78–0.83) for Orofacial clefts; 0.75 (95% CI: 0.72–0.77) for DS; 0.71 (95% CI: 0.65–0.77) for TOF; 0.62 (95% CI: 0.55–0.68) for TGA; 0.59 (95% CI: 0.49–0.68) for HLHS, 0.53 (95% CI: 0.46–0.60) for NTD-all; and 0.30 (95% CI: 0.23–0.37) for LDD. Conclusions: The degree of agreement varied among sentinel CAs identified between the BIS and CIHI. The potential reasons for discrepancies include incompleteness of capturing CAs using existing picklist values, especially for certain sub-types, incomplete neonatal special care data in the BIS, and differences between clinical diagnosis in the BIS and ICD-10-CA classification in the DAD. 
A future data abstraction study will be conducted to investigate the potential reasons for discrepancies of CA capture between two databases. This project helps quantify the quality of CA data collection in the BIS, enhances understanding of CA prevalence in Ontario and provides direction for future data quality improvement activities.
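Percent agreement and Cohen's kappa for one anomaly recorded in two linked databases can be sketched as follows. The function takes two equal-length lists of 0/1 flags per linked birth record; the ten example records are invented and are not BIS or CIHI-DAD data.

```python
def percent_agreement_and_kappa(source_a, source_b):
    """Return (percent agreement, Cohen's kappa) for two binary ratings.

    source_a, source_b: equal-length sequences of 0/1 flags, one per
    linked record, indicating whether each database records the anomaly.
    """
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(source_a)
    observed = sum(a == b for a, b in zip(source_a, source_b)) / n
    pa = sum(source_a) / n   # share flagged in the first database
    pb = sum(source_b) / n   # share flagged in the second database
    # Agreement expected by chance if the two sources were independent.
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1 - expected)
    return 100 * observed, kappa


# Hypothetical linked records (1 = anomaly recorded).
db_one = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
db_two = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]
agreement_pct, kappa = percent_agreement_and_kappa(db_one, db_two)
```

Because congenital anomalies are rare, percent agreement is dominated by records where neither database flags the anomaly, which is why kappa can be moderate even when raw agreement looks high.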


BMC Nutrition ◽  
2020 ◽  
Vol 6 (1) ◽  
Author(s):  
Oleg Bilukha ◽  
Alexia Couture ◽  
Kelly McCain ◽  
Eva Leidman

Abstract Background Ensuring the quality of anthropometry data is paramount for getting accurate estimates of malnutrition prevalence among children aged 6–59 months in humanitarian and refugee settings. Previous reports based on data from Demographic and Health Surveys suggested systematic differences in anthropometric data quality between the younger and older groups of preschool children. Methods We analyzed 712 anthropometric population-representative field surveys from humanitarian and refugee settings conducted during 2011–2018. We examined and compared the quality of five anthropometric indicators in children aged 6–23 months and children aged 24–59 months: weight for height, weight for age, height for age, body mass index for age and mid-upper arm circumference (MUAC) for age. Using the z-score distribution of each indicator, we calculated the following parameters: standard deviation (SD), percentage of outliers, and measures of distribution normality. We also examined and compared the quality of height, weight, MUAC and age measurements using missing data and rounding criteria. Results Both SD and percentage of flags were significantly smaller on average in the older than in the younger age group for all five anthropometric indicators. Differences in SD between age groups did not change meaningfully depending on overall survey quality or on the quality of age ascertainment. Over 50% of surveys overall did not deviate significantly from normality. The percentage of non-normal surveys was higher in the older than in the younger age group. Digit preference score for weight, height and MUAC was slightly higher in the younger age group, and for age slightly higher in the older age group. Children with reported exact date of birth (DOB) had much lower digit preference for age than those without exact DOB. 
SD, percentage of flags and digit preference scores were positively correlated between the two age groups at the survey level, such that surveys showing higher anthropometry data quality in the younger age group also tended to show higher quality in the older age group. Conclusions There should be an emphasis on increased rigor of training survey measurers in taking anthropometric measurements in the youngest children. The standardization test of 10 children, a mandatory component of the pre-survey measurer training and evaluation, should include at least 4–5 children below 2 years of age.
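Two of the quality parameters named in this abstract, the SD of a z-score distribution with its percentage of flagged outliers, and the rounding of measurements to preferred terminal digits, can be sketched in Python. The flag cut-off of 5 z-score units, the choice to exclude flagged values before taking the SD, and all data values are assumptions for illustration; the abstract does not state the criteria the authors applied.

```python
import statistics
from collections import Counter


def zscore_quality(zscores, flag_cutoff=5.0):
    """Return (SD of unflagged z-scores, percentage of flagged values).

    A value is flagged when its absolute z-score exceeds `flag_cutoff`
    (an assumed cut-off, passed as a parameter so it can be changed).
    """
    kept = [z for z in zscores if abs(z) <= flag_cutoff]
    pct_flagged = 100 * (len(zscores) - len(kept)) / len(zscores)
    return statistics.stdev(kept), pct_flagged


def terminal_digit_shares(measurements):
    """Share of measurements ending in each first-decimal digit 0-9.

    With no digit preference each share would be near 0.1; heaping on
    0 and 5 suggests rounding by the measurer.
    """
    digits = Counter(round(m * 10) % 10 for m in measurements)
    n = len(measurements)
    return {d: digits.get(d, 0) / n for d in range(10)}


# Hypothetical weight-for-height z-scores and heights in cm.
sd, pct_flagged = zscore_quality([-1.0, 0.0, 1.0, 6.0])
shares = terminal_digit_shares([85.0, 90.0, 77.5, 80.0])
```

A wider SD or a larger flagged share in one age group than the other is the kind of difference the abstract reports between younger and older children.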


2020 ◽  
Author(s):  
Robert Chew ◽  
Caroline Kery ◽  
Laura Baum ◽  
Thomas Bukowski ◽  
Annice Kim ◽  
...  

BACKGROUND Social media are important for monitoring perceptions of public health issues and for educating target audiences about health; however, limited information about the demographics of social media users makes it challenging to identify conversations among target audiences and limits how well social media can be used for public health surveillance and education outreach efforts. Certain social media platforms provide demographic information on followers of a user account, if given, but they are not always disclosed, and researchers have developed machine learning algorithms to predict social media users’ demographic characteristics, mainly for Twitter. To date, there has been limited research on predicting the demographic characteristics of Reddit users. OBJECTIVE We aimed to develop a machine learning algorithm that predicts the age segment of Reddit users, as either adolescents or adults, based on publicly available data. METHODS This study was conducted between January and September 2020 using publicly available Reddit posts as input data. We manually labeled Reddit users’ age by identifying and reviewing public posts in which Reddit users self-reported their age. We then collected sample posts, comments, and metadata for the labeled user accounts and created variables to capture linguistic patterns, posting behavior, and account details that would distinguish the adolescent age group (aged 13 to 20 years) from the adult age group (aged 21 to 54 years). We split the data into training (n=1660) and test sets (n=415) and performed 5-fold cross validation on the training set to select hyperparameters and perform feature selection. We ran multiple classification algorithms and tested the performance of the models (precision, recall, F1 score) in predicting the age segments of the users in the labeled data. 
To evaluate associations between each feature and the outcome, we calculated means and confidence intervals and compared the two age groups, with 2-sample t tests, for each transformed model feature. RESULTS The gradient boosted trees classifier performed the best, with an F1 score of 0.78. The test set precision and recall scores were 0.79 and 0.89, respectively, for the adolescent group (n=254) and 0.78 and 0.63, respectively, for the adult group (n=161). The most important feature in the model was the number of sentences per comment (permutation score: mean 0.100, SD 0.004). Members of the adolescent age group tended to have created accounts more recently, have higher proportions of submissions and comments in the r/teenagers subreddit, and post more in subreddits with higher subscriber counts than those in the adult group. CONCLUSIONS We created a Reddit age prediction algorithm with competitive accuracy using publicly available data, suggesting machine learning methods can help public health agencies identify age-related target audiences on Reddit. Our results also suggest that there are characteristics of Reddit users’ posting behavior, linguistic patterns, and account features that distinguish adolescents from adults.
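The per-group precision and recall reported above can be combined into F1 scores with a few lines of Python. The abstract does not say how the overall F1 of 0.78 was averaged; the sketch assumes a mean weighted by group size (support), and it works from the rounded figures in the abstract rather than the underlying predictions.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def support_weighted_f1(per_class):
    """per_class: list of (precision, recall, n) tuples, one per class.

    Returns the mean of the per-class F1 scores weighted by class size.
    """
    total = sum(n for _, _, n in per_class)
    return sum(n * f1_score(p, r) for p, r, n in per_class) / total


# Figures from the abstract: adolescents (n=254) and adults (n=161).
reported = [(0.79, 0.89, 254), (0.78, 0.63, 161)]
overall = support_weighted_f1(reported)
```

Under this weighting the rounded inputs give a value that rounds to 0.78, which is consistent with the reported overall F1, although other averaging schemes cannot be ruled out from the abstract alone.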


2021 ◽  
Vol 27 ◽  
Author(s):  
Lilla Tamási ◽  
Krisztián Horváth ◽  
Zoltán Kiss ◽  
Krisztina Bogos ◽  
Gyula Ostoros ◽  
...  

Objective: No assessment was conducted describing the age and gender specific epidemiology of lung cancer (LC) prior to 2018 in Hungary, thus the objective of this study was to appraise the detailed epidemiology of lung cancer (ICD-10 C34) in Hungary based on a retrospective analysis of the National Health Insurance Fund database. Methods: This longitudinal study included patients aged ≥20 years with LC diagnosis (ICD-10 C34) between January 1, 2011 and December 31, 2016. Patients with different cancer-related codes 6 months before or 12 months after LC diagnosis or having any cancer treatment other than lung cancer protocols were excluded. Results: Lung cancer incidence and mortality increased with age, peaking in the 70–79 age group among males (375.0/100,000 person-years) and in the 60–69 age group among females (148.1/100,000 person-years). The male-to-female incidence rate ratio reached 2.46–3.01 (p < 0.0001) among the 70–79 age group. We found a 2–11% decrease in the male incidence rate in most age groups, while a significant 1–3% annual increase was observed in older females (>60) during the study period. Conclusion: This nationwide epidemiology study demonstrated that LC incidence and mortality in Hungary decreased in the younger male and female population; however, we found a significant increase in incidence in the older female population, similar to international trends. Incidence rates peaked in younger age groups compared to Western countries, most likely due to higher smoking prevalence in these cohorts, while lower age LC incidence could be attributed to higher competing cardiovascular risk resulting in earlier mortality in the smoking population.
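The incidence figures in this abstract are rates per 100,000 person-years, and the male-to-female comparison is a ratio of two such rates. A minimal Python sketch of both calculations follows; the case counts and person-years are hypothetical, since the abstract reports only the resulting rates.

```python
def incidence_rate_per_100k(cases, person_years):
    """New cases per 100,000 person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return 100_000 * cases / person_years


def rate_ratio(rate_a, rate_b):
    """Ratio of two incidence rates (for example male to female)."""
    if rate_b <= 0:
        raise ValueError("the comparison rate must be positive")
    return rate_a / rate_b


# Hypothetical counts for one age group.
male_rate = incidence_rate_per_100k(cases=750, person_years=200_000)
female_rate = incidence_rate_per_100k(cases=300, person_years=200_000)
ratio = rate_ratio(male_rate, female_rate)
```

A ratio above 1 means the first group's rate is higher; the 2.46–3.01 range in the abstract indicates male incidence roughly two and a half to three times the female incidence in the 70–79 age group.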


Author(s):  
Jeffrey G Klann ◽  
Griffin M Weber ◽  
Hossein Estiri ◽  
Bertrand Moal ◽  
Paul Avillach ◽  
...  

Abstract Introduction The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) includes hundreds of hospitals internationally using a federated computational approach to COVID-19 research using the EHR. Objective We sought to develop and validate a standard definition of COVID-19 severity from readily accessible EHR data across the Consortium. Methods We developed an EHR-based severity algorithm and validated it on patient hospitalization data from 12 4CE clinical sites against the outcomes of ICU admission and/or death. We also used a machine learning approach to compare selected predictors of severity to the 4CE algorithm at one site. Results The 4CE severity algorithm performed with pooled sensitivity of 0.73 and specificity 0.83 for the combined outcome of ICU admission and/or death. The sensitivity of single code categories for acuity was unacceptably inaccurate, varying by up to 0.65 across sites. A multivariate machine learning approach identified codes resulting in mean AUC 0.956 (95% CI: 0.952, 0.959) compared to 0.903 (95% CI: 0.886, 0.921) using expert-derived codes. Billing codes were poor proxies of ICU admission, with 49% precision and recall compared against chart review at one partner institution. Discussion We developed a proxy measure of severity that proved resilient to coding variability internationally by using a set of 6 code classes. In contrast, machine-learning approaches may tend to overfit hospital-specific orders. Manual chart review revealed discrepancies even in the gold standard outcomes, possibly due to pandemic conditions. Conclusion We developed an EHR-based algorithm for COVID-19 severity and validated it at 12 international sites.


2017 ◽  
Author(s):  
Chin Lin ◽  
Chia-Jung Hsu ◽  
Yu-Sheng Lou ◽  
Shih-Jen Yeh ◽  
Chia-Cheng Lee ◽  
...  

BACKGROUND Automated disease code classification using free-text medical information is important for public health surveillance. However, traditional natural language processing (NLP) pipelines are limited, so we propose a method combining word embedding with a convolutional neural network (CNN). OBJECTIVE Our objective was to compare the performance of traditional pipelines (NLP plus supervised machine learning models) with that of word embedding combined with a CNN in conducting a classification task identifying International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) diagnosis codes in discharge notes. METHODS We used 2 classification methods: (1) extracting from discharge notes some features (terms, n-gram phrases, and SNOMED CT categories) that we used to train a set of supervised machine learning models (support vector machine, random forests, and gradient boosting machine), and (2) building a feature matrix, by a pretrained word embedding model, that we used to train a CNN. We used these methods to identify the chapter-level ICD-10-CM diagnosis codes in a set of discharge notes. We conducted the evaluation using 103,390 discharge notes covering patients hospitalized from June 1, 2015 to January 31, 2017 in the Tri-Service General Hospital in Taipei, Taiwan. We used the receiver operating characteristic curve as an evaluation measure, and calculated the area under the curve (AUC) and F-measure as the global measure of effectiveness. RESULTS In 5-fold cross-validation tests, our method had a higher testing accuracy (mean AUC 0.9696; mean F-measure 0.9086) than traditional NLP-based approaches (mean AUC range 0.8183-0.9571; mean F-measure range 0.5050-0.8739). A real-world simulation that split the training sample and the testing sample by date verified this result (mean AUC 0.9645; mean F-measure 0.9003 using the proposed method). 
Further analysis showed that the convolutional layers of the CNN effectively identified a large number of keywords and automatically extracted enough concepts to predict the diagnosis codes. CONCLUSIONS Word embedding combined with a CNN showed outstanding performance compared with traditional methods, needing very little data preprocessing. This shows that future studies will not be limited by incomplete dictionaries. A large amount of unstructured information from free-text medical writing will be extracted by automated approaches in the future, and we believe that the health care field is about to enter the age of big data.

