Underserved populations with missing race ethnicity data differ significantly from those with structured race/ethnicity documentation

2019 ◽  
Vol 26 (8-9) ◽  
pp. 722-729 ◽  
Author(s):  
Evan T Sholle ◽  
Laura C Pinheiro ◽  
Prakash Adekkanattu ◽  
Marcos A Davila ◽  
Stephen B Johnson ◽  
...  

Abstract Objective We aimed to address deficiencies in structured electronic health record (EHR) data for race and ethnicity by identifying black and Hispanic patients from unstructured clinical notes and assessing differences between patients with or without structured race/ethnicity data. Materials and Methods Using EHR notes for 16 665 patients with encounters at a primary care practice, we developed rule-based natural language processing (NLP) algorithms to classify patients as black/Hispanic. We evaluated performance of the method against an annotated gold standard, compared race and ethnicity between NLP-derived and structured EHR data, and compared characteristics of patients identified as black or Hispanic using only NLP vs patients identified as such only in structured EHR data. Results For the sample of 16 665 patients, NLP identified 948 additional patients as black, a 26% increase, and 665 additional patients as Hispanic, a 20% increase. Compared with the patients identified as black or Hispanic in structured EHR data, patients identified as black or Hispanic via NLP only were older, more likely to be male, less likely to have commercial insurance, and more likely to have higher comorbidity. Discussion Structured EHR data for race and ethnicity are subject to data quality issues. Supplementing structured EHR race data with NLP-derived race and ethnicity may allow researchers to better assess the demographic makeup of populations and draw more accurate conclusions about intergroup differences in health outcomes. Conclusions Black or Hispanic patients who are not documented as such in structured EHR race/ethnicity fields differ significantly from those who are. Relatively simple NLP can help address this limitation.
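As a rough illustration of the kind of rule-based note classification the abstract above describes, the Python sketch below flags a patient when any note matches a small set of regular expressions. The patterns, the function names `classify_note` and `classify_patient`, and the sample notes are all hypothetical; the abstract does not state the study's actual rules.

```python
import re

# Hypothetical patterns for illustration only; the study's real rule set
# is not given in the abstract.
PATTERNS = {
    # Require a person noun after the race term to avoid hits such as "black stool".
    "black": re.compile(
        r"\b(african[- ]american|black)\s+"
        r"(male|female|man|woman|patient|gentleman|lady)\b",
        re.IGNORECASE,
    ),
    "hispanic": re.compile(r"\b(hispanic|latino|latina)\b", re.IGNORECASE),
}


def classify_note(text):
    """Return the set of labels whose pattern matches a single note."""
    return {label for label, pattern in PATTERNS.items() if pattern.search(text)}


def classify_patient(notes):
    """Union of the labels found across all of a patient's notes."""
    labels = set()
    for note in notes:
        labels |= classify_note(note)
    return labels


if __name__ == "__main__":
    print(classify_patient(["62 yo African-American male with HTN", "Follow-up visit"]))
```

Requiring a person noun after the race term is one simple way to trade recall for precision; a production rule set would also need negation and family-history handling, which this sketch omits.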

Author(s):  
Andrew Hantel ◽  
Marlise R. Luskin ◽  
Jacqueline S Garcia ◽  
Wendy Stock ◽  
Daniel J DeAngelo ◽  
...  

Data regarding racial and ethnic enrollment diversity for acute myeloid (AML) and lymphoid leukemia (ALL) clinical trials in the United States (US) are limited, and little is known about the effect of federal reporting requirements instituted in the late 2000s. We examined demographic data reporting and enrollment diversity for US ALL and AML trials from 2002-2017 as well as changes in reporting and diversity after reporting requirements were instituted. Of 223 AML and 97 ALL trials with results, 68 (30.5%) and 51 (52.6%) reported enrollment by both race and ethnicity. Among trials that reported race and ethnicity (AML N=6,554; ALL N=4,149), non-Hispanic (NH)-Black, NH-Native American, NH-Asian, and Hispanic patients had significantly lower enrollment compared to NH-white patients after adjusting for race-ethnic disease incidence (AML odds: 0.68, 0.31, 0.75, and 0.83; ALL: 0.74, 0.27, 0.67, and 0.64; all p≤0.01). The proportion of trials reporting race increased significantly after the reporting requirements (44.2 to 60.2%; p=0.02), but race-ethnicity reporting did not (34.8 to 38.6%; p=0.57). Reporting proportions by number of patients enrolled increased significantly after the reporting requirements (race: 51.7 to 72.7%, race-ethnicity: 39.5 to 45.4%; both p<0.001), and relative enrollment of NH-Black and Hispanic patients decreased (AML odds: 0.79 and 0.77; ALL: 0.35 and 0.25; both p≤0.01). These data suggest that demographic enrollment reporting for acute leukemia trials is suboptimal, changes in diversity after the reporting requirements may be due to additional enrollment disparities that were previously unreported, and enrollment diversification strategies specific to acute leukemia care delivery are needed.


Blood ◽  
2019 ◽  
Vol 134 (Supplement_1) ◽  
pp. 387-387
Author(s):  
Lauryn S. Walker ◽  
Taylor Olmsted Kim ◽  
Amanda Bell Grimes ◽  
Susan Kirk ◽  
Audrey S. Cohen ◽  
...  

Background: Immune thrombocytopenia (ITP) is the most common cause of acquired immune platelet destruction in children. Clinical symptoms range from asymptomatic to significant and even life-threatening bleeding, fatigue, and reduced health-related quality of life. About 75% of affected children experience spontaneous remission, with the remainder developing chronic ITP. Our clinical observations suggest a decreased prevalence of ITP among Black children, although no available studies have evaluated racial or ethnic predisposition to ITP or to chronic disease. We hypothesized that biological differences in Black children alter the prevalence of ITP, relative to the general population, and may affect disease course. Methods: A retrospective analysis evaluating race and ethnicity of all children with ITP treated at Texas Children's Hospital (TCH, Houston, TX) from January 2015-July 2019 was performed, and compared to both the Houston metropolitan area and the TCH Cancer Center 2018 race and ethnicity data. Of the 699 unique patients, race and ethnicity data were unavailable for 24, and 2 patients were excluded (1 with leukemia, 1 with bone marrow failure). The remaining patients were categorized as (1) White, non-Hispanic; (2) Black, non-Hispanic; (3) Hispanic; and (4) Other. Hispanic patients included those who self-identified as (1) White, Hispanic; (2) Black, Hispanic; (3) Asian, Hispanic; (4) Multi-race, Hispanic; and (5) Unknown race, Hispanic. Demographic data was then collected in a second ITP population derived from the Children's Hospital of Philadelphia (CHOP, Philadelphia, PA) and the surrounding metropolitan area. To match the distribution reported in the Houston metro area data, a comparison was conducted focusing on the proportion of Black non-Hispanic patients in each cohort. 
A chi-squared test with Yates correction was utilized to compare nonparametric categorical data using GraphPad Prism version 8.0.1 for Windows, GraphPad Software, San Diego, California, USA. A p-value of <0.05 was statistically significant. Results: At TCH, there were 673 evaluable ITP patients. Classifying by race only, 42 were Black (6.2%), 564 White (83.8%), and 67 Other (Asian, Mixed Race, Native American, 9.9%). There was a significantly smaller percentage of ITP patients identified as Black, non-Hispanic (n= 40, 5.9%) relative to both the Texas Children's Cancer Center (16% Black, non-Hispanic) and the Houston metropolitan area populations (16% Black, non-Hispanic, p<0.0001). These data are significant given that Texas Children's Hospital sees the majority of children in the Houston metropolitan area, and the demographic data for the Cancer Center accurately reflect the Houston population (see Table). Black, non-Hispanic patients with ITP were more likely to have chronic disease (40% chronic, n=16) than the expected 20-25% in the overall pediatric ITP population. Many of the patients described as Black, non-Hispanic had secondary ITP (20.0%, n=8), most commonly due to systemic lupus erythematosus (SLE), a condition more common amongst Black patients. These findings were consistent at CHOP, with 7% of the 311 included patients classified as Black (n=22), 76% White (n=236) and 17% Other (n=54 including Asian, Middle Eastern, Mixed Race and Native American/Pacific Islander). Again, these proportions are in contrast to the general CHOP patient population (22% Black, 55.5% White and 22.5% other) and the Philadelphia metro population (42.5% Black, 41.5% White, 16% Other). The frequency of chronic ITP was also increased in Black patients in the CHOP cohort (39%, n=7). Conclusions: This analysis suggests a significant difference in the prevalence of ITP by race in children at two large tertiary care centers in the US. 
We found that Black children were less likely to develop ITP but those who did were more likely to develop chronic disease relative to other races. Race and ethnicity data at both institutions more closely matched their surrounding metropolitan areas, suggesting the observed differences are not due to care or access barriers. These findings support the possibility that damaging genetic variants, altered gene expression and/or other biological differences in immune response may lead to increased risk of ITP in White children, or protective factors result in reduced prevalence of ITP in Black children. Further research to explore these differences is ongoing. Disclosures Lambert: CSL Behring: Consultancy; Amgen: Consultancy, Other; Bayer: Other: Ad boards; Novartis: Other: Ad boards, Research Funding; Shionogi: Consultancy; Kedrion: Consultancy; Sysmex: Consultancy; AstraZeneca: Research Funding; PDSA: Research Funding. Despotovic: Dova: Honoraria; Novartis: Research Funding; Amgen: Research Funding.
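The chi-squared comparison with Yates correction mentioned in the Methods above can be sketched in plain Python. This is a minimal sketch under a stated assumption: it treats the comparison as a one-sample goodness-of-fit test of the 40 Black, non-Hispanic patients among 673 evaluable TCH patients against the reported 16% reference proportion. The abstract does not say whether the authors used this form or a contingency table, and the function name `yates_chi2_one_sample` is hypothetical.

```python
import math


def yates_chi2_one_sample(observed_in_group, total, expected_proportion):
    """Chi-squared goodness-of-fit test (1 df) with Yates continuity correction."""
    observed = [observed_in_group, total - observed_in_group]
    expected = [total * expected_proportion, total * (1 - expected_proportion)]
    # Yates correction: shrink each |observed - expected| by 0.5 before squaring.
    chi2 = sum((abs(o - e) - 0.5) ** 2 / e for o, e in zip(observed, expected))
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Figures from the abstract: 40 of 673 patients vs. a 16% reference proportion.
    chi2, p = yates_chi2_one_sample(40, 673, 0.16)
    print(f"chi2 = {chi2:.1f}, p = {p:.2e}")
```

Under this assumption the p-value falls well below 0.0001, which agrees with the significance level reported in the abstract.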


2019 ◽  
Author(s):  
Olga F. Jarrín ◽  
Abner N. Nyandege ◽  
Irina B. Grafova ◽  
XinQi Dong

Background: Errors in racial and ethnic classification of Medicare beneficiaries limit health services research on minority health and health disparities among priority populations, including American Indians and Alaskan Natives. Objective: To compare the agreement and accuracy of three sources of race and ethnicity information contained in the Medicare data warehouse: 1) the Enrollment Database (EDB), which originates from Social Security data; 2) the Research Triangle Institute (RTI) imputed data based on name and geography; and 3) self-reported race and ethnicity data collected during routine home health care assessments as part of the Outcome and Assessment Information Set (OASIS). Subjects: Medicare beneficiaries over the age of 18 who received home health care in 2015 (N = 4,243,090). Measures: Percent agreement, sensitivity, specificity, positive predictive value, and Cohen’s kappa coefficient. Results: Compared to self-reported race/ethnicity data from OASIS, the RTI race code is more accurate than the EDB race code. Non-Hispanic whites and blacks were correctly classified by the RTI race code with 97% accuracy. However, more than half of American Indians/Alaskan Natives, one-fourth of Asian American/Pacific Islanders, and nearly one-tenth of Hispanics were misclassified by the RTI race code. Misclassification of race/ethnicity occurred less often for men, compared to women. Discussion: These findings highlight the strengths and limitations of using race/ethnicity classifications contained in Medicare administrative data. Health services and policy researchers should consider using self-identified race/ethnicity information to augment administrative data sources. This is especially important for research that aims to include Asian Americans/Pacific Islanders and American Indians/Alaskan Natives.
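The measures listed above (percent agreement, sensitivity, specificity, positive predictive value, and Cohen's kappa) can all be derived from a 2x2 table that compares an administrative race code with self-report for one category. The sketch below shows the standard formulas; the function name `agreement_metrics` and the counts in the example are hypothetical and are not taken from the study.

```python
def agreement_metrics(tp, fp, fn, tn):
    """Agreement statistics for one category, treating self-report as the reference.

    tp: administrative code and self-report both say "in category"
    fp: administrative code says "in category", self-report does not
    fn: self-report says "in category", administrative code does not
    tn: both say "not in category"
    """
    total = tp + fp + fn + tn
    observed = (tp + tn) / total
    # Agreement expected by chance, from the marginal proportions.
    chance_yes = ((tp + fp) / total) * ((tp + fn) / total)
    chance_no = ((fn + tn) / total) * ((fp + tn) / total)
    expected = chance_yes + chance_no
    return {
        "percent_agreement": observed,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "kappa": (observed - expected) / (1 - expected),
    }


if __name__ == "__main__":
    # Hypothetical counts for a small category in a sample of 1,000 beneficiaries.
    for name, value in agreement_metrics(tp=40, fp=10, fn=20, tn=930).items():
        print(f"{name}: {value:.3f}")
```

With a rare category, percent agreement can stay high while sensitivity is low, which is why kappa and sensitivity are reported alongside it.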


Circulation ◽  
2020 ◽  
Vol 142 (Suppl_3) ◽  
Author(s):  
Michael S Miller ◽  
Gennaro Giustino ◽  
Annapoorna Kini ◽  
Giulio Stefanini ◽  
Renato Bragato ◽  
...  

Introduction: Myocardial injury is common amongst patients hospitalized with Covid-19 and is associated with a poor prognosis. It is unknown whether its incidence and its mechanisms differ by race and ethnicity. Methods: We conducted a multicenter, international cohort study at 7 hospitals in New York (United States) and Milan (Italy) between March and May 2020. All patients were hospitalized, had laboratory-confirmed Covid-19, and received a transthoracic echocardiogram (TTE) during their hospitalization. We evaluated the association between race/ethnicity and myocardial injury in multivariable logistic regression models. Myocardial injury was defined as any cardiac troponin elevation above the upper limit of normal at each enrolling site. Results: A total of 305 consecutive patients were included, of whom 280 had self-reported race/ethnicity. Key demographic, laboratory and echocardiographic characteristics are presented in the Table. All minority groups had higher incidence of a composite of major echocardiographic abnormalities compared to whites, and Asian and Hispanic patients had increased incidence of RV dysfunction. In multivariable models, compared with Whites, Black (adjOR 2.7 [1.1-6.4]), Asian (adjOR 3.3 [1.1-10.2]), and Hispanic (adjOR 2.8 [1.4-5.8]) patients had increased odds of myocardial injury. After adjusting for baseline demographic and clinical variables, both Asian (adjOR 9.9 [2.6-38.6]) and Hispanic (adjOR 5.7 [2.1-15.6]) patients had increased odds of in-hospital mortality compared with White, but not Black (adjOR 2.0 [0.6-7.0]) patients. Conclusions: Among hospitalized patients with Covid-19 who received a TTE, minority groups had higher incidence of echocardiographic abnormalities and increased risk of myocardial injury. After adjustment for baseline confounders, only Asian and Hispanic patients remained at increased risk for in-hospital mortality.


2019 ◽  
Vol 34 (14) ◽  
pp. 928-936 ◽  
Author(s):  
Celestine H. Yeung Gregerson ◽  
Amanda V. Bakian ◽  
Jacob Wilkes ◽  
Andrew J. Knighton ◽  
Flory Nkoy ◽  
...  

Objective: The purpose of our study was to assess whether race/ethnicity was associated with seizure remission in pediatric epilepsy. Methods: This was a retrospective population-based cohort study of children who were evaluated for new-onset epilepsy in the clinic, emergency department, and/or hospital by a pediatric neurologist in an integrated health care delivery system. Children were between ages 6 months and 15 years at their initial presentation of epilepsy. The cohort, identified through an electronic database, was assembled over 6 years, with no less than 5 years of follow-up. All children were evaluated for race, ethnicity, insurance type, and socioeconomic background. Patient outcome was determined at the conclusion of the study period and categorized according to their epilepsy control as either drug resistant (pharmacoresistant and intractable) or drug responsive (controlled, probable remission, and terminal remission). Results: In the final cohort of 776 patients, 63% were drug responsive (control or seizure remission). After controlling for confounding socioeconomic and demographic factors, children of Hispanic ethnicity experienced reduced likelihood (hazard) of drug-responsive epilepsy (hazard ratio 0.6, P < .001), and had longer median time to remission (8 years; 95% CI 5.9-9.6 years) compared to white non-Hispanic patients (5.6 years; 95% CI 4.9-6.1 years). Among Hispanic patients, higher health care costs were associated with reduced likelihood of drug responsiveness. Significance: We found that Hispanic ethnicity is associated with a reduced likelihood of achieving seizure control and remission. This study suggests that factors associated with the race/ethnicity of patients contribute to their likelihood of achieving seizure freedom.


2016 ◽  
Vol 3 (3) ◽  
pp. 226 ◽  
Author(s):  
Jane R Grafton ◽  
Onchee Yu ◽  
David Carrell ◽  
Susan Reed ◽  
Renate Shulze-Rath ◽  
...  

PLoS ONE ◽  
2020 ◽  
Vol 15 (12) ◽  
pp. e0244270
Author(s):  
Ingrid V. Bassett ◽  
Virginia A. Triant ◽  
Bridget A. Bunda ◽  
Caitlin A. Selvaggi ◽  
Daniel J. Shinnick ◽  
...  

Objective To evaluate differences by race/ethnicity in clinical characteristics and outcomes among hospitalized patients with Covid-19 at Massachusetts General Hospital (MGH). Methods The MGH Covid-19 Registry includes confirmed SARS-CoV-2-infected patients hospitalized at MGH and is based on manual chart reviews and data extraction from electronic health records (EHRs). We evaluated differences between White/Non-Hispanic and Hispanic patients in demographics, complications and 14-day outcomes among the N = 866 patients hospitalized with Covid-19 from March 11, 2020—May 4, 2020. Results Overall, 43% of patients hospitalized with Covid-19 were women, median age was 60.4 [IQR = (48.2, 75)], 11.3% were Black/non-Hispanic and 35.2% were Hispanic. Hispanic patients, representing 35.2% of patients, were younger than White/non-Hispanic patients [median age 51y; IQR = (40.6, 61.6) versus 72y; (58.0, 81.7) (p<0.001)]. Hispanic patients were symptomatic longer before presenting to care (median 5 vs 3d, p = 0.039) but were more likely to be sent home with self-quarantine than be admitted to hospital (29% vs 16%, p<0.001). Hispanic patients had fewer comorbidities yet comparable rates of ICU or death (34% vs 36%). Nonetheless, a greater proportion of Hispanic patients recovered by 14 days after presentation (62% vs 45%, p<0.001; OR = 1.99, p = 0.011 in multivariable adjusted model) and fewer died (2% versus 18%, p<0.001). Conclusions Hospitalized Hispanic patients were younger and had fewer comorbidities compared to White/non-Hispanic patients; despite comparable rates of ICU care or death, a greater proportion recovered. These results have implications for public health policy and the design and conduct of clinical trials.


2020 ◽  
Author(s):  
Ingrid V Bassett ◽  
Virginia A Triant ◽  
Bridget A Bunda ◽  
Caitlin A Selvaggi ◽  
Daniel J Shinnick ◽  
...  

Objective: To evaluate differences by race/ethnicity in clinical characteristics and outcomes among hospitalized patients with Covid-19 at Massachusetts General Hospital (MGH). Methods: The MGH Covid-19 Registry includes confirmed SARS-CoV-2-infected patients hospitalized at MGH and is based on manual chart reviews and data extraction from electronic health records (EHRs). We evaluated differences between White/Non-Hispanic and Hispanic patients in demographics, complications and 14-day outcomes among the N=866 patients hospitalized with Covid-19 from March 11, 2020 - May 4, 2020. Results: Overall, 43% of patients hospitalized with Covid-19 were women, median age was 60.4 [IQR = (48.2, 75)], 11.3% were Black/non-Hispanic and 35.2% were Hispanic. Hispanic patients, representing 35.2% of patients, were younger than White/non-Hispanic patients [median age 51y; IQR = (40.6, 61.6) versus 72y; (58.0, 81.7) (p<0.001)]. Hispanic patients were symptomatic longer before presenting to care (median 5 vs 3d, p=0.039) but were more likely to be sent home with self-quarantine than be admitted to hospital (29% vs 16%, p<0.001). Hispanic patients had fewer comorbidities yet comparable rates of ICU or death (34% vs 36%). Nonetheless, a greater proportion of Hispanic patients recovered by 14 days after presentation (62% vs 45%, p<0.001; OR = 1.99, p = 0.011 in multivariable adjusted model) and fewer died (2% versus 18%, p<0.001). Conclusions: Hospitalized Hispanic patients were younger and had fewer comorbidities compared to White/non-Hispanic patients; despite comparable rates of ICU care or death, a greater proportion recovered. These results have implications for public health policy and the design and conduct of clinical trials.


Author(s):  
Jordan Jouffroy ◽  
Sarah F Feldman ◽  
Ivan Lerner ◽  
Bastien Rance ◽  
Anita Burgun ◽  
...  

BACKGROUND Information related to patient medication is crucial for health care. However, up to 80% of the information resides solely in unstructured text. Manual extraction may be difficult and time-consuming. Many studies have shown the value of natural language processing for this task, but only a few have used French corpora. OBJECTIVE We aimed to develop a system to extract medication-related information from French clinical text. METHODS We developed a hybrid system combining an expert rule-based system (RBS), contextual word embedding (ELMo) trained on clinical notes and a deep recurrent neural network (BiLSTM-CRF). The task consists of extracting drug mentions and their related information (e.g. dosage, frequency, duration, route, condition). We manually annotated 320 clinical notes extracted from a French clinical data warehouse, to train and evaluate the model. We compared the performance of our approach with standard approaches: rule-based or machine learning only, and classic word embeddings. We evaluated the models using token-level recall, precision and F-measure. RESULTS Models including RBS, ELMo and BiLSTM reached the best results: overall F-measure of 89.9%. F-measures per category were 95.3% for the medication name, 64.4% for the drug class mentions, 95.3% for the dosage, 92.2% for the frequency, 78.8% for the duration, and 62.2% for the condition of the intake. CONCLUSIONS Associating expert rules, deep contextualized embedding (ELMo) and deep neural networks improves medication information extraction. Our results reveal a synergy when associating expert knowledge and latent knowledge.
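The token-level precision, recall, and F-measure used for evaluation above can be computed per label by comparing gold and predicted tag sequences position by position. The sketch below assumes one tag per token and equal-length sequences; the function name `token_prf`, the tag names, and the example sequences are hypothetical.

```python
def token_prf(gold, predicted, label):
    """Token-level precision, recall, and F-measure for a single label."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure is the harmonic mean of precision and recall.
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical tags for a five-token sentence ("O" marks tokens outside any entity).
    gold = ["DRUG", "DOSE", "O", "FREQ", "O"]
    pred = ["DRUG", "O", "O", "FREQ", "FREQ"]
    for label in ("DRUG", "DOSE", "FREQ"):
        print(label, token_prf(gold, pred, label))
```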


JAMIA Open ◽  
2019 ◽  
Vol 2 (1) ◽  
pp. 150-159 ◽  
Author(s):  
Imon Banerjee ◽  
Kevin Li ◽  
Martin Seneviratne ◽  
Michelle Ferrari ◽  
Tina Seto ◽  
...  

Abstract Background The population-based assessment of patient-centered outcomes (PCOs) has been limited by the efficient and accurate collection of these data. Natural language processing (NLP) pipelines can determine whether a clinical note within an electronic medical record contains evidence on these data. We present and demonstrate the accuracy of an NLP pipeline that aims to assess the presence, absence, or risk discussion of two important PCOs following prostate cancer treatment: urinary incontinence (UI) and bowel dysfunction (BD). Methods We propose a weakly supervised NLP approach which annotates electronic medical record clinical notes without requiring manual chart review. A weighted function of neural word embedding was used to create a sentence-level vector representation of relevant expressions extracted from the clinical notes. Sentence vectors were used as input for a multinomial logistic model, with output being either presence, absence or risk discussion of UI/BD. The classifier was trained based on automated sentence annotation depending only on domain-specific dictionaries (weak supervision). Results The model achieved an average F1 score of 0.86 for the sentence-level, three-tier classification task (presence/absence/risk) in both UI and BD. The model also outperformed a pre-existing rule-based model for note-level annotation of UI by a significant margin. Conclusions We demonstrate a machine learning method to categorize clinical notes based on important PCOs that trains a classifier on sentence vector representations labeled with a domain-specific dictionary, which eliminates the need for manual engineering of linguistic rules or manual chart review for extracting the PCOs. The weakly supervised NLP pipeline showed promising sensitivity and specificity for identifying important PCOs in unstructured clinical text notes compared to rule-based algorithms. 
Trial registration This is a chart review study and approved by Institutional Review Board (IRB).
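One simple reading of the "weighted function of neural word embedding" described in the Methods above is a weighted average of word vectors. The sketch below illustrates that idea only; the abstract does not give the actual weighting scheme, so the function name `sentence_vector`, the toy two-dimensional embeddings, and the per-word weights are all assumptions.

```python
def sentence_vector(tokens, embeddings, weights):
    """Weighted average of the word vectors for the tokens found in `embeddings`.

    Tokens without an embedding are skipped; tokens without an explicit
    weight default to 1.0.
    """
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        if token not in embeddings:
            continue
        w = weights.get(token, 1.0)
        for i, value in enumerate(embeddings[token]):
            total[i] += w * value
        weight_sum += w
    if weight_sum == 0:
        return total  # no known tokens: return the zero vector
    return [x / weight_sum for x in total]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings and weights, for illustration only.
    embeddings = {"urinary": [1.0, 0.0], "incontinence": [0.0, 1.0]}
    weights = {"urinary": 1.0, "incontinence": 3.0}
    print(sentence_vector(["urinary", "incontinence", "today"], embeddings, weights))
```

The resulting fixed-length vector is the kind of input a multinomial logistic model can consume, with one vector per sentence.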

