The impact of low input DNA on the reliability of DNA methylation as measured by the Illumina Infinium MethylationEPIC BeadChip

2021 ◽  
Author(s):  
Sarah Holmes Watkins ◽  
Karen Ho ◽  
Christian Testa ◽  
Louise Falk ◽  
Patrice Soule ◽  
...  

Background: DNA methylation (DNAm) is commonly assayed using the Illumina Infinium MethylationEPIC BeadChip, but there is currently little published evidence to define the lower limits of the amount of DNA that can be used whilst preserving data quality. Such evidence is valuable for analyses utilising precious or limited DNA sources. Materials and methods: We use a single pooled sample of DNA in quadruplicate at three dilutions to define replicability and noise, and an independent population dataset of 328 individuals (from a community-based study including US-born non-Hispanic Black and white persons) to assess the impact of total DNA input on the quality of data generated using the Illumina Infinium MethylationEPIC BeadChip. Results: Data are less reliable and more noisy as DNA input decreases to 40ng, with clear reductions in data quality; however samples with a total input as low as 40ng pass standard quality control tests, and we observe little evidence that low input DNA obscures the associations between DNAm and two phenotypes, age and smoking status. Conclusions: DNA input as low as 40ng can be used with the Illumina Infinium MethylationEPIC BeadChip, provided quality checks and sensitivity analyses are undertaken.

2019 ◽  
Author(s):  
Pavankumar Mulgund ◽  
Raj Sharman ◽  
Priya Anand ◽  
Shashank Shekhar ◽  
Priya Karadi

BACKGROUND In recent years, online physician-rating websites have become prominent and exert considerable influence on patients’ decisions. However, the quality of these decisions depends on the quality of data that these systems collect. Thus, there is a need to examine the various data quality issues with physician-rating websites. OBJECTIVE This study’s objective was to identify and categorize the data quality issues afflicting physician-rating websites by reviewing the literature on online patient-reported physician ratings and reviews. METHODS We performed a systematic literature search in ACM Digital Library, EBSCO, Springer, PubMed, and Google Scholar. The search was limited to quantitative, qualitative, and mixed-method papers published in the English language from 2001 to 2020. RESULTS A total of 423 articles were screened. From these, 49 papers describing 18 unique data quality issues afflicting physician-rating websites were included. Using a data quality framework, we classified these issues into the following four categories: intrinsic, contextual, representational, and accessible. Among the papers, 53% (26/49) reported intrinsic data quality errors, 61% (30/49) highlighted contextual data quality issues, 8% (4/49) discussed representational data quality issues, and 27% (13/49) emphasized accessibility data quality. More than half the papers discussed multiple categories of data quality issues. CONCLUSIONS The results from this review demonstrate the presence of a range of data quality issues. While intrinsic and contextual factors have been well-researched, accessibility and representational issues warrant more attention from researchers, as well as practitioners. In particular, representational factors, such as the impact of inline advertisements and the positioning of positive reviews on the first few pages, are usually deliberate and result from the business model of physician-rating websites. 
The impact of these factors on data quality has not been addressed adequately and requires further investigation.


BMJ Open ◽  
2018 ◽  
Vol 8 (2) ◽  
pp. e016589 ◽  
Author(s):  
Annette Veile ◽  
Heiko Zimmermann ◽  
Eva Lorenz ◽  
Heiko Becher

Objective: To assess the epidemiological association of smoking status and tinnitus with a systematic review and meta-analysis and to estimate the population attributable risk in Germany. Data sources: A systematic literature search in PubMed and ISI-Web of Science Core Collection resulted in 1026 articles that were indexed until 15 September 2015. Additionally, proceedings of the international tinnitus seminars and reference lists of relevant articles were screened. Study selection: Two reviewers searched independently for epidemiological studies. Tinnitus as a manifestation of tumours, vascular malformations, specific syndromes or as a consequence of surgical and medical treatment was not considered. Moreover, studies conducted among patients of ear, nose and throat clinics were excluded. Data extraction: If only raw data were provided, effect sizes were calculated. Further unpublished data were received by corresponding authors. Data synthesis: Data of 20 studies were pooled. Current smoking (OR 1.21, 95% CI 1.09 to 1.35), former smoking (OR 1.13, 95% CI 1.01 to 1.26) and ever smoking (OR 1.20, 95% CI 1.11 to 1.30) were significantly associated with tinnitus. Moreover, sensitivity analyses for severe tinnitus (OR 1.32, 95% CI 1.10 to 1.58) and for studies of superior quality (OR 1.15, 95% CI 1.03 to 1.29) showed increased risks. According to this, the population attributable risk estimate in Germany is 3.5%. Conclusion: There is sufficient evidence that smoking is associated with tinnitus. As the review mainly consists of cross-sectional studies, the observed correlation does not give evidence of a causal relationship. Due to the impact of various confounders, further research is needed to provide more evidence on the strength of association and causal relationships.
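
As a rough illustration of how a population attributable risk can be derived from a pooled odds ratio, the sketch below applies Levin's formula. The abstract does not state which formula or which exposure prevalence the authors used, so the function name `levin_paf` and the 25% prevalence are assumptions for illustration only, and the odds ratio is treated as an approximation of the relative risk. The result is not expected to reproduce the reported 3.5%.

```python
def levin_paf(prevalence: float, relative_risk: float) -> float:
    """Levin's population attributable fraction: p(RR - 1) / (1 + p(RR - 1))."""
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must lie in [0, 1]")
    if relative_risk <= 0.0:
        raise ValueError("relative_risk must be positive")
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


# Pooled OR for ever smoking, taken from the abstract and used here
# as an approximation of the relative risk.
ever_smoking_or = 1.20

# Hypothetical exposure prevalence; the abstract does not report one.
assumed_prevalence = 0.25

paf = levin_paf(assumed_prevalence, ever_smoking_or)
print(f"Attributable fraction under these assumptions: {paf:.1%}")
```

Because the formula is monotone in both inputs, a higher assumed prevalence or a larger odds ratio yields a larger attributable fraction.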


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Stephanie Garies ◽  
Kerry McBrien ◽  
Hude Quan ◽  
Donna Manca ◽  
Neil Drummond ◽  
...  

Abstract Background Hypertension is a common chronic condition affecting nearly a quarter of Canadians. Hypertension surveillance in Canada typically relies on administrative data and/or national surveys. Routinely-captured data from primary care electronic medical records (EMRs) are a complementary source for chronic disease surveillance, with longitudinal patient-level details such as sociodemographics, blood pressure, weight, prescribed medications, and behavioural risk factors. As EMR data are generated from patient care and administrative tasks, assessing data quality is essential before using them for secondary purposes. This study evaluated the quality of primary care EMR data from one province in Canada within the context of hypertension surveillance. Methods We conducted a cross-sectional, descriptive study using primary care EMR data collected by two practice-based research networks in Alberta, Canada. There were 48,377 adults identified with hypertension from 53 clinics as of June 2018. Summary statistics were used to examine the quality of data elements considered relevant for hypertension surveillance. Results Patient year of birth and sex were complete, but other sociodemographic information (ethnicity, occupation, education) was largely incomplete and highly variable. Height, weight, body mass index and blood pressure were complete for most patients (over 90%), but a small proportion of outlying values indicated that data inaccuracies were present. Most patients had a relevant laboratory test present (e.g. blood glucose/glycated hemoglobin, lipid profile), though a very small proportion of values were outside a biologically plausible range. Details of prescribed antihypertensive medication, such as start date, strength, dose, and frequency, were mostly complete. Nearly 80% of patients had a smoking status recorded, though only 66% had useful information (i.e. categorized as current, past, or never), and less than half had their alcohol use described; information related to amount, frequency or duration was not available. Conclusions Blood pressure and prescribed medications in primary care EMR data demonstrated good completeness and plausibility, and contribute valuable information for hypertension epidemiology and surveillance. Other clinical, laboratory, and sociodemographic variables should be used carefully due to variable completeness and suspected data errors. Additional strategies to improve these data at the point of entry and after data extraction (e.g. statistical methods) are required.
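
The completeness and plausibility summaries described above can be sketched as follows. This is a minimal illustration, not the study's method: the field names, the plausibility limits in `PLAUSIBLE_RANGES`, and the toy records are all hypothetical.

```python
# Hypothetical plausibility limits; the abstract does not report the
# thresholds used in the study.
PLAUSIBLE_RANGES = {
    "systolic_bp": (50.0, 300.0),  # mmHg
    "weight_kg": (20.0, 400.0),
}


def summarize_field(records, field, low, high):
    """Return the share of records with a value and the share of those values out of range."""
    present = [r[field] for r in records if r.get(field) is not None]
    implausible = [v for v in present if not (low <= v <= high)]
    total = len(records)
    return {
        "completeness": len(present) / total if total else 0.0,
        "implausible_share": len(implausible) / len(present) if present else 0.0,
    }


# Toy records for illustration only; None marks a missing value.
records = [
    {"systolic_bp": 132.0, "weight_kg": 81.5},
    {"systolic_bp": 1320.0, "weight_kg": None},  # likely a data-entry error
    {"systolic_bp": None, "weight_kg": 70.0},
    {"systolic_bp": 118.0, "weight_kg": 64.2},
]

for field, (low, high) in PLAUSIBLE_RANGES.items():
    print(field, summarize_field(records, field, low, high))
```

Reporting the implausible share relative to recorded values, rather than to all patients, keeps missingness and inaccuracy as separate quality dimensions.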


2021 ◽  
Vol 2 (2) ◽  
pp. 68-74
Author(s):  
Shahidul Islam

Incentives of different forms and at different stages are used to motivate people to participate in human subject research. Although incentives are widely used and generally accepted to increase participation rates, there are exceptions: they may fail to increase the response rate and may even contaminate the quality of data, resulting in poor research findings. This study examines the impact of pre- and post-disclosed committed lottery incentives on response rate and data quality in a face-to-face survey of conventional consumers about organic food consumption. A survey was conducted at the premises of four conventional grocery stores in Edmonton, Alberta, Canada. Half of the randomly approached respondents who agreed to participate were told about the lottery incentive at the beginning, and the other half were told at the end. Data quality was measured using three indicators: edit occurrences, imputation occurrences, and proportion of incomplete answers. Our study finds little difference in response rate between pre- and post-disclosed committed lottery payments. However, the usability of incomplete questionnaires was significantly higher in the post-disclosed lottery group than in the pre-disclosed group. Our study also shows that people who like organic food and buy it more frequently are likely to provide better-quality information.


10.2196/15916 ◽  
2020 ◽  
Vol 22 (9) ◽  
pp. e15916
Author(s):  
Pavankumar Mulgund ◽  
Raj Sharman ◽  
Priya Anand ◽  
Shashank Shekhar ◽  
Priya Karadi

Background In recent years, online physician-rating websites have become prominent and exert considerable influence on patients’ decisions. However, the quality of these decisions depends on the quality of data that these systems collect. Thus, there is a need to examine the various data quality issues with physician-rating websites. Objective This study’s objective was to identify and categorize the data quality issues afflicting physician-rating websites by reviewing the literature on online patient-reported physician ratings and reviews. Methods We performed a systematic literature search in ACM Digital Library, EBSCO, Springer, PubMed, and Google Scholar. The search was limited to quantitative, qualitative, and mixed-method papers published in the English language from 2001 to 2020. Results A total of 423 articles were screened. From these, 49 papers describing 18 unique data quality issues afflicting physician-rating websites were included. Using a data quality framework, we classified these issues into the following four categories: intrinsic, contextual, representational, and accessible. Among the papers, 53% (26/49) reported intrinsic data quality errors, 61% (30/49) highlighted contextual data quality issues, 8% (4/49) discussed representational data quality issues, and 27% (13/49) emphasized accessibility data quality. More than half the papers discussed multiple categories of data quality issues. Conclusions The results from this review demonstrate the presence of a range of data quality issues. While intrinsic and contextual factors have been well-researched, accessibility and representational issues warrant more attention from researchers, as well as practitioners. In particular, representational factors, such as the impact of inline advertisements and the positioning of positive reviews on the first few pages, are usually deliberate and result from the business model of physician-rating websites. 
The impact of these factors on data quality has not been addressed adequately and requires further investigation.


Circulation ◽  
2015 ◽  
Vol 132 (suppl_3) ◽  
Author(s):  
Jaime E Hart ◽  
Jarvis T Chen ◽  
Robin C Puett ◽  
Jeff D Yanosky ◽  
Eric B Rimm ◽  
...  

Introduction: Chronic exposures to particulate matter (PM) have been associated with cardiovascular disease (CVD) morbidity and mortality. We examined the impact of long-term exposures to PM on the risk of incident coronary heart disease (CHD) and stroke among members of the nationwide all-male Health Professionals Follow-Up Study (HPFS) prospective cohort. Methods: HPFS members were followed biennially between 1986 and 2006 to obtain information on incident disease and to update information on CVD risk factors. Time-varying ambient PM10, PM2.5-10, and PM2.5 for the previous 12 months were calculated from monthly predictions at the address level. Multivariate adjusted Cox proportional hazards models were used to estimate hazard ratios [HR (95% CI)] for the association between each fraction of PM and each outcome among 43,371 CVD-free members of the HPFS, adjusting for risk factors and other potential confounders. We also assessed effect modification by region of the country, BMI, smoking status, and comorbidities (hypercholesterolemia, high blood pressure, and diabetes). Sensitivity analyses were conducted restricting the population to men who provided residential (N=15,395), as opposed to work, addresses. Results: The mean (SD) levels of 12-month average PM10, PM2.5-10, and PM2.5 were 20.7 (6.2), 8.4 (4.7) and 12.3 (3.4) μg/m³. In the full population, there was only modest evidence of increased risks of incident CHD or stroke with increasing PM exposures. Associations with stroke were modified by region, hypercholesterolemia, high blood pressure, and diabetes, with larger effects among those with comorbid conditions and in the Northeast and South. CHD, but not stroke, dose-responses were stronger among those who provided residential as opposed to work addresses; each 10 μg/m³ increase was associated with increases in overall CHD [1.10 (95% CI: 1.01-1.20), 1.09 (0.97-1.23), and 1.14 (0.98-1.32) for PM10, PM2.5-10, and PM2.5, respectively]. Conclusions: In this cohort of US men, PM exposures were only modestly associated with elevated risks of CHD and stroke. Comorbidities and region modified the associations with stroke, and residential ambient exposures were more strongly associated with CHD than work ambient exposures.


Genes ◽  
2020 ◽  
Vol 11 (3) ◽  
pp. 311 ◽  
Author(s):  
Man-Kit Lei ◽  
Frederick X. Gibbons ◽  
Ronald L. Simons ◽  
Robert A. Philibert ◽  
Steven R. H. Beach

Smoking is one of the leading preventable causes of morbidity and mortality worldwide, prompting interest in its association with DNA methylation-based measures of biological aging. Considerable progress has been made in developing DNA methylation-based measures that correspond to self-reported smoking status. In addition, assessment of DNA methylation-based aging has been expanded to better capture individual differences in risk for morbidity and mortality. Untested to date, however, is whether smoking is similarly related to older and newer indices of DNA methylation-based aging, and whether DNA methylation-based indices of smoking can be used in lieu of self-reported smoking to examine effects on DNA methylation-based aging measures. In the current investigation we examine mediation of the impact of self-reported cigarette consumption on accelerated, intrinsic DNA methylation-based aging using indices designed to predict chronological aging, phenotypic aging, and mortality risk, as well as a newly developed DNA methylation-based measure of telomere length. Using a sample of 500 African American middle aged smokers and non-smokers, we found that a) self-reported cigarette consumption was associated with accelerated intrinsic DNA methylation-based aging on some but not all DNA methylation-based aging indices, b) for those aging outcomes associated with self-reported cigarette consumption, DNA methylation-based indicators of smoking typically accounted for greater variance than did self-reported cigarette consumption, and c) self-reported cigarette consumption effects on DNA methylation-based aging indices typically were fully mediated by DNA methylation-based indicators of smoking (e.g., PACKYRS from GrimAge; or cg05575921 CpG site). 
Results suggest that when DNA methylation-based indices of smoking are substituted for self-reported assessments of smoking, they will typically fully reflect the varied impact of cigarette smoking on intrinsic, accelerated DNA methylation-based aging.


Author(s):  
Linda Cook ◽  
Laurie Benton ◽  
Melanie Edwards

ABSTRACT Field sampling investigations in response to oil spill incidents are growing increasingly complex, with analytical data collected by a variety of interested parties over many years and with different investigative purposes. For the Deepwater Horizon (DWH) Oil Spill, the analytical chemistry data and toxicity study data were required to be validated in accordance with U.S. Environmental Protection Agency's (EPA's) data validation for Superfund program methods. Validating data according to EPA guidelines is a manual and time-consuming process focused on chemistry results for individual samples within a single data package to assess if data meet quality control criteria. In hindsight, the burden of validating all of the chemistry data appears to have been excessive, and for some parameters unnecessary, which was costly and slowed the process of disseminating data. Depending on the data use (e.g., assessing human and ecological risk, qualitative oil tracking, or forensic fingerprinting), data validation may not be needed in every circumstance or for every data type. Publicly available water column, sediment, and oil chemistry analytical data associated with the DWH Oil Spill, obtained from the Gulf of Mexico Research Initiative Information and Data Cooperative data portal, were evaluated to understand the impact, effort, accuracy, and benefit of the data validation process. Questions explored include: What data changed based on data validation reviews? How would these changes affect the associated data evaluation findings? Did data validation introduce additional errors? What data quality issues did the data validation process miss? What statistical and data analytical approaches would more efficiently identify potential data quality issues? Based on our evaluation of the chemical data associated with the DWH Oil Spill, new strategies to assess the quality of data associated with oil spill investigations will be presented.


2020 ◽  
Author(s):  
Dana M. Lapato ◽  
Roxann Roberson-Nay ◽  
Patricia A. Kinser ◽  
Timothy P. York

Abstract Prenatal maternal depression increases the risk of negative maternal-infant health outcomes but often goes unrecognized. As a result, biomarker screening tests capable of identifying women at risk for depression are highly desirable. This study tested how demographic and clinical factors affect the predictive validity of a DNA methylation-based screening test for postpartum major depression (MD) using data from a longitudinal study of birth outcomes. Lifetime history of MD and current levels of postpartum depressive symptoms were assessed using an extended self-report version of the Composite International Diagnostic Interview Short Form and the Edinburgh Postnatal Depression Scale (EPDS), respectively. Predictive validity of the test was estimated in the PREG cohort using the area under the receiver operator characteristic curve (AUC), and sensitivity analyses were performed to assess the impact of self-reported race, age, and pre-pregnancy history of MD. Data for N=103 pregnant participants (African-American=49; European-American=54) were available. The prediction model identified women who would develop high levels of postpartum depressive symptoms better within the subset of women with previous histories of MD (AUC = 0.94, 95% CI 0.79-1.00) compared to the full pregnant cohort (AUC = 0.62, 95% CI 0.46-0.79). This observation prompted secondary analyses to test the model specificity for postpartum depression. The model predicted lifetime history of MD moderately well in a never-pregnant, mixed-sex cohort of adolescents (N=150; ages 15-20; AUC = 0.75, 95% CI 0.57-0.92) and performed slightly better in males versus females. Additional sensitivity analyses are needed to determine the extent of the model’s specificity for MD subtypes and whether demographic or clinical factors influence the predictive validity of this model.
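
For readers unfamiliar with the AUC statistic reported above, the sketch below computes it through its pairwise (Mann-Whitney) interpretation: the probability that a randomly chosen case scores higher than a randomly chosen non-case, with ties counted as one half. The function name `roc_auc` and the toy scores are assumptions for illustration; this is not the study's model or data, and it does not produce confidence intervals.

```python
def roc_auc(scores, labels):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Toy predicted risks and outcomes (1 = high symptoms), for illustration only.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
print(roc_auc(toy_scores, toy_labels))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the subgroup estimate of 0.94 and the full-cohort estimate of 0.62 describe quite different levels of discrimination.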


2017 ◽  
Vol 4 (1) ◽  
pp. 25-31 ◽  
Author(s):  
Diana Effendi

The Information Product Approach (IP Approach) is an information management approach that can be used to manage product information and to analyze data quality. Organizations can use an IP-Map to facilitate knowledge management when collecting, storing, maintaining, and using data in an organized way. The data management process for academic activities at X University does not yet use the IP Approach, and X University has not paid attention to managing the quality of its information. So far, X University has been concerned only with the system applications used to support automation of data management in academic activities. The IP-Map developed in this paper can be used as a basis for analyzing the quality of data and information. With the IP-Map, X University is expected to identify which parts of the process need improvement in data and information quality management. Index terms: IP Approach, IP-Map, information quality, data quality.

