Enrichment sampling for a multi-site patient survey using electronic health records and census data

2018 ◽  
Vol 26 (3) ◽  
pp. 219-227 ◽  
Author(s):  
Nathaniel D Mercaldo ◽  
Kyle B Brothers ◽  
David S Carrell ◽  
Ellen W Clayton ◽  
John J Connolly ◽  
...  

Objective: We describe a stratified sampling design that combines electronic health records (EHRs) and United States Census (USC) data to construct the sampling frame and an algorithm to enrich the sample with individuals belonging to rarer strata. Materials and Methods: This design was developed for a multi-site survey that sought to examine patient concerns about and barriers to participating in research studies, especially among under-studied populations (eg, minorities, low educational attainment). We defined sampling strata by cross-tabulating several socio-demographic variables obtained from EHR and augmented with census-block-level USC data. We oversampled rarer and historically underrepresented subpopulations. Results: The sampling strategy, which included USC-supplemented EHR data, led to a far more diverse sample than would have been expected under random sampling (eg, 3-, 8-, 7-, and 12-fold increase in African Americans, Asians, Hispanics and those with less than a high school degree, respectively). We observed that our EHR data tended to misclassify minority races more often than majority races, and that non-majority races, Latino ethnicity, younger adult age, lower education, and urban/suburban living were each associated with lower response rates to the mailed surveys. Discussion: We observed substantial enrichment from rarer subpopulations. The magnitude of the enrichment depends on the accuracy of the variables that define the sampling strata and the overall response rate. Conclusion: EHR and USC data may be used to define sampling strata that in turn may be used to enrich the final study sample. This design may be of particular interest for studies of rarer and understudied populations.
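The abstract does not spell out the enrichment algorithm, so the following is only a minimal sketch of the general idea of oversampling rarer strata. It assumes an equal allocation per stratum (capped at stratum size), which may differ from the authors' actual allocation rule; the function name `enriched_sample`, the toy frame, and its field names are hypothetical.

```python
import random
from collections import defaultdict

def enriched_sample(frame, stratum_of, total_n, seed=0):
    """Draw about the same number of records from every stratum.

    Equal allocation is an assumption made for illustration; it
    oversamples rare strata relative to simple random sampling.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for record in frame:
        strata[stratum_of(record)].append(record)
    per_stratum = total_n // len(strata)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        # A stratum cannot contribute more records than it contains.
        k = min(per_stratum, len(members))
        sample.extend(rng.sample(members, k))
    return sample

# Hypothetical sampling frame: 90% common stratum, 10% rare stratum.
frame = [{"id": i, "race": "A", "edu": "HS+"} for i in range(900)]
frame += [{"id": 900 + i, "race": "B", "edu": "<HS"} for i in range(100)]

sample = enriched_sample(frame, lambda r: (r["race"], r["edu"]), total_n=100)
rare_in_sample = sum(1 for r in sample if r["race"] == "B")
rare_share_frame = 100 / len(frame)
rare_share_sample = rare_in_sample / len(sample)
fold_enrichment = rare_share_sample / rare_share_frame
```

In this toy frame the rare stratum makes up a tenth of the frame but half of the sample. The realized enrichment in a real survey would also depend on misclassification of the stratum variables and on differential response rates, as the abstract notes.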

2018 ◽  
Vol 20 (11) ◽  
pp. e278 ◽  
Author(s):  
Jonas Moll ◽  
Hanife Rexhepi ◽  
Åsa Cajander ◽  
Christiane Grünloh ◽  
Isto Huvila ◽  
...  

2017 ◽  
Vol 132 (4) ◽  
pp. 463-470 ◽  
Author(s):  
Maxwell J. Richardson ◽  
Stephen K. Van Den Eeden ◽  
Eric Roberts ◽  
Assiamira Ferrara ◽  
Susan Paulukonis ◽  
...  

Objectives: Electronic health records (EHRs) and electronic laboratory records (ELRs) are increasingly seen as a rich source of data for performing public health surveillance activities and monitoring community health status. Their potential for surveillance of chronic illness, however, may be underused. Our objectives were to (1) evaluate the use of EHRs and ELRs for diabetes surveillance in 2 California counties and (2) examine disparities in diabetes prevalence by geography, income, and race/ethnicity. Methods: We obtained data on a clinical diagnosis of diabetes and hemoglobin A1c (HbA1c) test results for adult members of Kaiser Permanente Northern California living in Contra Costa County or Solano County at any time during 2010-2014. We evaluated the validity of using HbA1c test results to determine diabetes prevalence, using clinical diagnoses as a gold standard. We estimated disparities in diabetes prevalence by combining HbA1c test results with US Census data on income, race, and ethnicity. Results: When compared with a clinical diagnosis of diabetes, data on a patient’s 5-year maximum HbA1c value ≥6.5% yielded the best combination of sensitivity (87.4%) and specificity (99.2%). The prevalence of 5-year maximum HbA1c ≥6.5% decreased with increasing median family income and increased with greater proportions of residents who were either non-Hispanic black or Hispanic. Conclusions: Timely diabetes surveillance data from ELRs can be used to document disparities, target interventions, and evaluate changes in population health. ELR data may be easier to access than a patient’s entire EHR, but outcome metric validation with diabetes diagnoses would need to be ongoing. Future research should validate ELR and EHR data across multiple providers.
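As a rough illustration of the validation step described above, the sketch below computes sensitivity and specificity of an HbA1c threshold against clinical diagnosis as the gold standard. The six patient values are invented for the example and are not data from the study; the function name is hypothetical.

```python
def sensitivity_specificity(test_positive, gold_positive):
    """Return (sensitivity, specificity) of a binary test vs. a gold standard."""
    tp = fp = tn = fn = 0
    for test, gold in zip(test_positive, gold_positive):
        if test and gold:
            tp += 1
        elif test and not gold:
            fp += 1
        elif not test and gold:
            fn += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical 5-year maximum HbA1c values (%) and clinical diagnoses.
max_hba1c = [7.1, 6.4, 5.6, 8.0, 6.6, 5.9]
diagnosed = [True, True, False, True, False, False]

THRESHOLD = 6.5  # cut-off reported in the abstract
flagged = [value >= THRESHOLD for value in max_hba1c]
sens, spec = sensitivity_specificity(flagged, diagnosed)
```

Repeating the calculation over a range of candidate thresholds is one way to look for the cut-off with the best combination of the two measures.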


Author(s):  
Lauren Beesley ◽  
Maxwell Salvatore ◽  
Lars Fritsche ◽  
Anita Pandit ◽  
Arvind Rao ◽  
...  

Biobanks linked to electronic health records provide a rich data resource for health-related research. With the establishment of large-scale infrastructure, the availability and utility of data from biobanks have dramatically increased over time. As more researchers become interested in using biobank data to explore a diverse spectrum of scientific questions, resources guiding the data access, design, and analysis of biobank-based studies will be crucial. The first aim of this review is to characterize the types of biobanks that are discussed in the recent literature and provide detailed descriptions of specific biobanks including their location, size, data access, data linkages and more. The development and accessibility of large-scale biorepositories provide the opportunity to accelerate agnostic searches, new discoveries, and hypothesis-generating studies of disease-treatment, disease-exposure and disease-gene associations. Rather than spending time and money designing and implementing a single study with pre-defined objectives, researchers can use biobanks’ existing data-rich resources to answer scientific questions as quickly as they can analyze them. While the data are becoming increasingly available, additional thought is needed to address issues related to the design of such studies and analysis of these data. In the second aim of this review, we discuss statistical issues related to biobank research in general including study design, sampling strategy, phenotype identification, and missing data. These issues are illustrated using data from the Michigan Genomics Initiative, UK Biobank, and Genes for Good. We summarize the current body of statistical literature aimed at addressing some of these challenges and discuss some of the standing open problems in this area.
This work serves to complement and extend recent reviews about biobank-based research and aims to provide a resource catalog with statistical and practical guidance to researchers pursuing biobank-based research.


2020 ◽  
Vol 26 (4) ◽  
pp. 2915-2929
Author(s):  
Hanife Rexhepi ◽  
Jonas Moll ◽  
Isto Huvila

This study investigates differences in attitudes towards, and experiences with, online electronic health records between cancer patients and patients with other conditions, highlighting what is characteristic of cancer patients. A national patient survey on online access to electronic health records was conducted, in which cancer patients were compared with all other respondents. Overall, 2587 patients completed the survey (response rate 0.61%). A total of 347 respondents (13.4%) indicated that they suffered from cancer. Results showed that cancer patients are less likely than other patients to use online electronic health records out of general interest (p < 0.001), but more likely to use them to get an overview of their health history (p = 0.001) and to prepare for visits (p < 0.001). Moreover, cancer patients rate the benefits of accessing their electronic health records online higher than other patients do and see larger positive effects regarding improved communication with and involvement in healthcare.


2019 ◽  
Vol 26 (14) ◽  
pp. 1948-1952 ◽  
Author(s):  
Farren BS Briggs ◽  
Eddie Hill

Background/objective: In 2019, the 2010 U.S. multiple sclerosis (MS) prevalence was robustly estimated (265.1–309.2/100,000) based on large administrative health-claims datasets. Using 56.6 million electronic health records (EHRs), we sought to generate complementary age, sex, and race standardized estimates. Methods/results: Using de-identified EHRs and 2018 U.S. Census data, we estimated an age- and sex-standardized MS prevalence of 219.5/100,000 which increased to 274.5/100,000 when accounting for White and Black race alone. Women aged 50 to 69 years had the highest prevalence (>600/100,000). Among White and Black Americans, the age- and sex-standardized prevalence was 283.7 and 226.1 per 100,000, respectively. Conclusion: Using 56.6 million EHRs and standardizing for age, sex, and race (White and Black Americans only), we estimated at least 810,504 Americans were living with MS in 2018.
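The standardization mentioned above can be sketched as a direct standardization: stratum-specific prevalence from the EHR data is weighted by each stratum's share of the census population. The abstract does not give the exact procedure, so this is an assumed form, and all counts below are invented for illustration.

```python
def standardized_prevalence(cases, ehr_counts, census_counts):
    """Directly standardized prevalence (as a proportion).

    cases[s] / ehr_counts[s] is the stratum prevalence in the EHR data;
    census_counts[s] / total is the stratum's weight in the reference population.
    """
    total = sum(census_counts.values())
    return sum(
        (cases[s] / ehr_counts[s]) * (census_counts[s] / total)
        for s in census_counts
    )

# Hypothetical strata (sex only, for brevity) and counts.
cases = {"F": 30, "M": 10}             # MS cases in the EHR data
ehr_counts = {"F": 10_000, "M": 10_000}  # EHR patients per stratum
census_counts = {"F": 510, "M": 490}     # reference population per stratum

per_100k = standardized_prevalence(cases, ehr_counts, census_counts) * 100_000
```

With real data the strata would be age-by-sex (and race) cells, and multiplying the standardized proportion by the census total would give an estimated number of people living with the condition.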


Author(s):  
Jonathan M Tan ◽  
Vicky Tam ◽  
Jorge A Galvez ◽  
Grace Hsu ◽  
William Quarshie ◽  
...  

Introduction: Social determinants of health (SDOH) have a significant impact on access to health. Risk stratification of patients who have difficulty accessing care could allow for triaging of peri-operative resources. Unfortunately, only a limited number of SES factors are available to study in electronic health records (EHR). Objectives and Approach: Our objective was to understand the SES and location risk factors that are associated with paediatric patients arriving late to the hospital for elective surgery. We conducted a retrospective study of paediatric patients requiring elective surgery from 2015-2019. Spatial linkage of EHR data with US Census–ACS 2017 data was conducted. Analysis was at patient and neighbourhood block group levels. Statistical analysis was conducted utilizing SAS, Python and ArcGIS. Results: Our study had 40,943 patients, of whom 7,453 (18.2%) arrived ≥15 minutes after their scheduled arrival time. Patient-level risk factors for arriving late included younger age, Black and Indian patients, non-English speaking, government insurance, increased co-morbidities and earlier appointments. Patients arriving late arrived a median of 23.0 minutes late (IQR 18.0-33.0 minutes), whereas the on-time group arrived a median of 7 minutes early (IQR 4-22 minutes). Median drive time and distance, estimated using network analysis, were not significant factors. Statistically significant neighbourhood risk factors for arriving late included block groups with high unemployment, households receiving public assistance, low-income households, a higher number of high-school dropouts, female-headed households, more renter-occupied houses, and areas with high turnover. Logistic regression demonstrated that neighbourhoods in the lowest SES quintile were 30% more likely to be late than the areas with the highest SES (p<0.001). Conclusion / Implications: We successfully identified patient and neighbourhood socioeconomic risk factors for arriving late to the hospital by leveraging geospatial methods and EHR data. Leveraging EHR data with geospatial analytics can augment our understanding of the SDOH that may impact the delivery of care.


2017 ◽  
Author(s):  
Jonas Moll ◽  
Hanife Rexhepi ◽  
Åsa Cajander ◽  
Christiane Grünloh ◽  
Isto Huvila ◽  
...  

BACKGROUND: Internationally, there is a movement toward providing patients with Web-based access to their electronic health records (EHRs). In Sweden, Region Uppsala was the first to introduce patient-accessible EHRs (PAEHRs) in 2012. By the summer of 2016, 17 of 21 county councils had given citizens Web-based access to their medical information. Studies on the effect of PAEHRs on the work environment of health care professionals have been conducted, but up until now, few extensive studies have been conducted regarding patients’ experiences of using PAEHRs in Sweden or Europe, more generally. OBJECTIVE: The objective of our study was to investigate patients’ experiences of accessing their EHRs through the Swedish national patient portal. In this study, we have focused on describing user characteristics, usage, and attitudes toward the system. METHODS: A national patient survey was designed, based on previous interview and survey studies with patients and health care professionals. Data were collected during a 5-month period in 2016. The survey was made available through the PAEHR system, called Journalen, in Sweden. The total number of patients that logged in and could access the survey during the study period was 423,141. In addition to descriptive statistics reporting response frequencies on Likert scale questions, Mann-Whitney tests, Kruskal-Wallis tests, and chi-square tests were used to compare answers between different county councils as well as between respondents working in health care and all other respondents. RESULTS: Overall, 2587 users completed the survey with a response rate of 0.61% (2587/423,141). Two participants were excluded from the analysis because they had only received care in a county council that did not yet show any information in Journalen. The results showed that 62.97% (1629/2587) of respondents were women and 39.81% (1030/2587) were working or had been working in health care. In addition, 72.08% (1794/2489) of respondents used Journalen about once a month, and the main reason for use was to gain an overview of one’s health status. Furthermore, respondents reported that lab results were the most important information for them to access; 68.41% (1737/2539) of respondents wanted access to new information within a day, and 96.58% (2454/2541) of users reported that they are positive toward Journalen. CONCLUSIONS: In this study, respondents provided several important reasons for why they use Journalen and why it is important for them to be able to access information in this way, several of them related to patient empowerment, involvement, and security. Considering the overall positive attitude, PAEHRs seem to fill important needs for patients.

