Statistical Methods for the Validation of Questionnaires

Summary. Objectives: Questionnaires used in epidemiological studies should be validated. However, it remains unclear which statistical methods are appropriate and how validation studies should be interpreted. Thus, we investigated the theory and practice of statistical evaluation approaches. Methods: Using three platforms (a literature review, our own simulations, and a validation study that we performed ourselves), we worked out relevant limitations, advantages, and important new aspects of evaluation methods. Results: Our systematic literature review, based on physical activity questionnaires, revealed that correlation coefficients are still the common approach in validation studies, found in 41 of 46 reviewed publications (89.1%). This practice has been criticized in the theoretically oriented literature for more than 20 years. Appropriate evaluation methods as recommended by Bland and Altman were found in only ten publications (21.7%). We showed that serious bias in questionnaires can be revealed by Bland-Altman plots but may remain undetected by correlation coefficients. With our simulations we refuted the argument that correlation coefficients properly investigate whether a questionnaire ranks the subjects sufficiently well. Further, with Bland-Altman analyses we could evaluate differential errors with respect to case-control status in our validation study. This was not possible with correlation coefficients, because they generally do not identify systematic bias. In addition, we show a potential pitfall in the interpretation of Bland-Altman plots that might occur in specific rare instances. Conclusions: The commonly used correlation approach can yield misleading conclusions in validation studies. A more frequent and proper use of the Bland-Altman methods would be desirable to improve epidemiological data quality.
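The contrast drawn in this abstract between correlation coefficients and Bland-Altman analysis can be illustrated with a minimal Python sketch. The data are hypothetical and not taken from the study: a questionnaire that overestimates every subject by a fixed 30 units still correlates almost perfectly with the reference, while the Bland-Altman bias and 95% limits of agreement expose the offset. The function names `pearson_r` and `bland_altman` are illustrative only.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) for the differences x - y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Hypothetical data: the questionnaire overestimates each reference value
# by 30 units, plus a small amount of noise.
reference = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0]
noise = [1.0, -1.0, 2.0, -2.0, 1.0, -1.0]
questionnaire = [r + 30.0 + e for r, e in zip(reference, noise)]

r = pearson_r(questionnaire, reference)
bias, lower, upper = bland_altman(questionnaire, reference)
print(f"r = {r:.3f}, bias = {bias:.1f}, limits of agreement = [{lower:.1f}, {upper:.1f}]")
```

In this toy case the correlation is close to 1 even though the limits of agreement lie entirely above zero, which is the kind of systematic bias the abstract says correlation coefficients do not identify.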

Relative validity of a food frequency questionnaire among tin miners in China: 1992/93 and 1995/96 diet validation studies

Public Health Nutrition ◽

10.1017/s1368980099000403 ◽

1999 ◽

Vol 2 (3) ◽

pp. 301-315 ◽

Cited By ~ 10

Author(s):

MR Forman ◽

J Zhang ◽

L Nebeling ◽

S-X Yao ◽

MJ Slesinski ◽

...

Keyword(s):

Lung Cancer ◽

Food Intake ◽

Validation Study ◽

Food Frequency Questionnaire ◽

Validation Studies ◽

Pearson Correlation ◽

Correlation Coefficients ◽

Average Frequency ◽

Food Frequency ◽

Food Recalls

Abstract. Objective: Diet validation research was conducted to compare the respondents' reporting of dietary intake in a food frequency questionnaire (FFQ) with intake reported in food recalls. Because the population received annual salary increments that could modify food intake, diet validation studies (DVSs) were conducted during two time intervals. Design: A 99-item FFQ was administered by an interviewer twice in a 1-year interval, and responses to each FFQ item were compared with 28 days of interviewer-administered food recalls that were collected in four 1-week intervals during each season of 1992/93. The second validation study in 1995/96 had a similar design to the earlier one. Setting: A prospective cohort study of lung cancer among tin miners in China was initiated in 1992, with dietary and other risk factors updated annually. Subjects: From a cohort of tin miners at high risk of lung cancer, a different sample was randomly selected for each diet validation study (n = 141 in 1992/93 and n = 113 in 1995/96) from four mine units that were representative of all worker units. Results: Miners reported a significantly higher average frequency of intake of foods in the food recalls than in the FFQ, with few exceptions. Deattenuated Pearson correlation coefficients of the frequency of food intake between the FFQ and food recalls were in the range of –0.40 to 0.72 in both studies, with higher positive correlations for beverages and cereal staples than for animal protein sources, vegetables, fruits and legumes. The percentage of individuals with exact agreement in the extreme quartiles of intake in the food recalls and FFQ ranged from 0 to 100% in both studies. Conclusions: Among Chinese miners, the range in correlations between the food recalls and the FFQ was due to: (i) market availability of foods during the food recall weeks compared to their annual reported intake in the FFQ; (ii) cultural perception of time; and (iii) differences in how the intake of mixed dishes and their multi-ingredient foods was reported in the recalls vs. the FFQ. The range in the percentage of agreement in the same quartiles and the changes in food intake over time may have implications for the analysis of the diet-disease relationship in this cohort.
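The abstract reports deattenuated correlations without stating the formula. A minimal sketch of one widely used correction follows, which scales the observed correlation by sqrt(1 + λ/n), where λ is the ratio of within-person to between-person variance in the reference method and n is the number of replicate reference measurements per person. Whether the authors used exactly this form is an assumption, and the function name and input numbers are hypothetical.

```python
import math


def deattenuate(r_observed, within_var, between_var, n_replicates):
    """Correct an observed correlation for within-person variation in the
    reference method (assumed form: r_obs * sqrt(1 + lambda / n))."""
    variance_ratio = within_var / between_var  # lambda
    return r_observed * math.sqrt(1.0 + variance_ratio / n_replicates)


# Hypothetical inputs, not taken from the study.
r_corrected = deattenuate(r_observed=0.40, within_var=3.0, between_var=1.0, n_replicates=4)
print(round(r_corrected, 3))
```

With no within-person variation the correction factor is 1 and the observed correlation is returned unchanged; more replicates per person shrink the correction toward 1.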

Review of nursing diagnosis validation studies: caregiver role strain

Revista Gaúcha de Enfermagem ◽

10.1590/1983-1447.2020.20190370 ◽

2020 ◽

Vol 41 ◽

Author(s):

Tânia Marlene Gonçalves Lourenço ◽

Rita Maria Sousa Abreu-Figueiredo ◽

Luís Octávio de Sá

Keyword(s):

Chronic Illness ◽

Literature Review ◽

Validation Study ◽

Validation Studies ◽

Web of Science ◽

Role Strain ◽

The Elderly ◽

Nursing Diagnosis ◽

Different Populations ◽

Caregiver Role

ABSTRACT Objective: To analyze the validation studies of the NANDA-I nursing diagnosis Caregiver Role Strain. Methods: Integrative literature review. Studies carried out between 2000 and 2018 were searched using the descriptors caregivers, nursing diagnosis and validation study in the following databases: Web of Science, EBSCOhost, SciELO Brasil and Portugal, LILACS, RCAAP, CAPES, the NANDA-I website, and the bibliographic references of the articles. Articles in Portuguese, English or Spanish were included. Results: The sample consisted of seven validation studies, with heterogeneity in the methodologies used. The populations in which the diagnosis was clinically validated were caregivers of the elderly and of people with chronic illness. The most prevalent defining characteristics were Stress and Apprehension related to the future. Conclusions: This diagnosis requires further validation studies among different populations in search of greater accuracy and a reduction in the number of defining characteristics, facilitating the use of the taxonomy.

Identification and Reproducibility of Plasma Metabolomic Biomarkers of Habitual Food Intake in a US Diet Validation Study

Metabolites ◽

10.3390/metabo10100382 ◽

2020 ◽

Vol 10 (10) ◽

pp. 382

Author(s):

Ying Wang ◽

Rebecca A. Hodge ◽

Victoria L. Stevens ◽

Terryl J. Hartman ◽

Marjorie L. McCullough

Keyword(s):

Validation Study ◽

Intraclass Correlation ◽

Correlation Coefficients ◽

Ethnically Diverse ◽

Epidemiological Studies ◽

Food Groups ◽

Intraclass Correlation Coefficients ◽

Bonferroni Adjustment ◽

Absolute Correlation ◽

Potential Use

Previous metabolomic studies have identified putative blood biomarkers of dietary intake. These biomarkers need to be replicated in other populations and tested for reproducibility over time for potential use in future epidemiological studies. We conducted a metabolomics analysis among 671 racially/ethnically diverse men and women included in a diet validation study to examine the correlation between >100 food groups/items (101 by a food frequency questionnaire (FFQ), 105 by 24-h diet recalls (24HRs)) and 1141 metabolites measured in replicate fasting plasma samples collected six months apart. Diet–metabolite associations were examined by Pearson’s partial correlation analysis. Biomarker reproducibility was assessed using intraclass correlation coefficients (ICCs). A total of 677 diet–metabolite associations were identified after Bonferroni adjustment for multiple comparisons and restricting absolute correlation coefficients to greater than 0.2 (601 associations using the FFQ and 395 using 24HRs). The median ICC of the 238 putative biomarkers was 0.56 (interquartile range 0.46–0.68). In this study, with repeated FFQs, 24HRs and plasma metabolic profiles, we identified several potentially novel food biomarkers and replicated others found in our previous study. Our findings contribute to the growing literature on food-based biomarkers and provide important information on biomarker reproducibility which could facilitate their utilization in future nutritional epidemiological studies.
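The reproducibility measure used in this abstract, the intraclass correlation coefficient, can be sketched for the simple case of two replicate measurements per person. The abstract does not say which ICC form was used, so the one-way random-effects ICC(1,1) below is an assumption, and the function name and data are hypothetical.

```python
import statistics


def icc_oneway(measurements):
    """One-way random-effects ICC(1,1) for rows of k replicate measurements,
    one row per subject (balanced design assumed)."""
    n = len(measurements)
    k = len(measurements[0])
    subject_means = [statistics.fmean(row) for row in measurements]
    grand_mean = statistics.fmean(subject_means)
    # Between-subject and within-subject mean squares from one-way ANOVA.
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (value - m) ** 2
        for row, m in zip(measurements, subject_means)
        for value in row
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical metabolite levels for three people, measured twice.
icc = icc_oneway([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)])
print(round(icc, 3))
```

An ICC near 1 means most of the variance lies between people rather than between a person's own replicates, which is the property that makes a single measurement usable as a long-term biomarker.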

Injuries Resulted From Breastfeeding: A New Approach To A Known Problem

Revista da Escola de Enfermagem da USP ◽

10.1590/s0080-6234201400002000021 ◽

2014 ◽

Vol 48 (2) ◽

pp. 346-356 ◽

Cited By ~ 8

Author(s):

Marina Possato Cervellini ◽

Mônica Antar Gamba ◽

Kelly Pereira Coca ◽

Ana Cristina Freitas de Vilhena Abrão

Keyword(s):

Validation Study ◽

Validation Studies ◽

Evaluation Methods ◽

Assessment Methods ◽

Integrative Review ◽

New Approach

This study aimed to analyze nipple trauma resulting from breastfeeding using a dermatological approach. Two integrative literature reviews were conducted: the first on definitions, classification and evaluation methods of nipple trauma, and the second on validation studies related to this theme. The first review included 20 studies; only one third defined nipple trauma, more than half did not define the nipple injuries reported, and each author assessed the injuries in a particular way, without consensus. In the second integrative review, no validation study or algorithm related to nipple trauma resulting from breastfeeding was found. This showed that the nipple injuries mentioned in the first review had not undergone validation studies, which explains the lack of consensus on the definition, classification and assessment methods of nipple trauma.

Correction of Risk Estimates for Measurement Error in Epidemiology

Methods of Information in Medicine ◽

10.1055/s-0038-1634621 ◽

1995 ◽

Vol 34 (05) ◽

pp. 503-510 ◽

Cited By ~ 12

Author(s):

S. A. Bashir ◽

S. W. Duffy

Keyword(s):

Risk Factors ◽

Measurement Error ◽

Validation Study ◽

Validation Studies ◽

Epidemiological Studies ◽

Study Data ◽

Risk Estimates ◽

User Friendly ◽

Considerable Pressure

Abstract: Epidemiologists are under considerable pressure to acknowledge the presence of measurement error in the determination of risk factors. Repeatability and validation studies are often prescribed in conjunction with epidemiological studies. We describe some practical uses for repeatability and validation study data, in terms of correcting risk estimates for measurement error. Commonly available methods are described, with their advantages and shortcomings. A user-friendly computer program to carry out the analyses described accompanies the paper.

Methods for epidemiological studies in competitive cycling: an extension of the IOC consensus statement on methods for recording and reporting of epidemiological data on injury and illness in sport 2020

British Journal of Sports Medicine ◽

10.1136/bjsports-2020-103906 ◽

2021 ◽

pp. bjsports-2020-103906

Author(s):

Benjamin Clarsen ◽

Babette M Pluim ◽

Víctor Moreno-Pérez ◽

Xavier Bigard ◽

Cheri Blauwet ◽

...

Keyword(s):

Consensus Statement ◽

Expert Panel ◽

Epidemiological Data ◽

Epidemiological Studies ◽

Population Characteristics ◽

Exposure Study ◽

Injury Mechanisms ◽

Study Population ◽

Final Manuscript ◽

Collection Methods

In 2020, the IOC released a consensus statement that provides overall guidelines for the recording and reporting of epidemiological data on injury and illness in sport. Some aspects of this statement need to be further specified on a sport-by-sport basis. The objective was to extend the IOC consensus statement on methods for recording and reporting of epidemiological data on injury and illness in sports and to meet the sport-specific requirements of all cycling disciplines regulated by the Union Cycliste Internationale (UCI). A panel of 20 experts, all with experience in cycling or cycling medicine, participated in the drafting of this cycling-specific extension of the IOC consensus statement. In preparation, panel members were sent the IOC consensus statement, the first draft of this manuscript and a list of topics to be discussed. The expert panel met in July 2020 for a 1-day video conference to discuss the manuscript and specific topics. The final manuscript was developed in an iterative process involving all panel members. This paper extends the IOC consensus statement to provide cycling-specific recommendations on health problem definitions, mode of onset, injury mechanisms and circumstances, diagnosis classifications, exposure, study population characteristics and data collection methods. Recommendations apply to all UCI cycling disciplines, for both able-bodied cyclists and para-cyclists. The recommendations presented in this consensus statement will improve the consistency and accuracy of future epidemiological studies of injury and illness in cycling.

Genetic Contribution of Endometriosis to the Risk of Developing Hormone-Related Cancers

International Journal of Molecular Sciences ◽

10.3390/ijms22116083 ◽

2021 ◽

Vol 22 (11) ◽

pp. 6083

Author(s):

Aintzane Rueda-Martínez ◽

Aiara Garitazelaia ◽

Ariadna Cilleros-Portet ◽

Sergi Marí ◽

Rebeca Arauzo ◽

...

Keyword(s):

Mendelian Randomization ◽

Clear Cell ◽

Association Studies ◽

Epidemiological Data ◽

Epidemiological Studies ◽

Genome Wide Association Studies ◽

Genome Wide ◽

Genomic Analyses ◽

Gynecological Disorder ◽

Robust Evidence

Endometriosis is a common gynecological disorder that has been associated with endometrial, breast and epithelial ovarian cancers in epidemiological studies. Since complex diseases are a result of multiple environmental and genetic factors, we hypothesized that the biological mechanism underlying their comorbidity might be explained, at least in part, by shared genetics. To assess their potential genetic relationship, we performed a two-sample mendelian randomization (2SMR) analysis on results from public genome-wide association studies (GWAS). This analysis confirmed previously reported genetic pleiotropy between endometriosis and endometrial cancer. We present robust evidence supporting a causal genetic association between endometriosis and ovarian cancer, particularly with the clear cell and endometrioid subtypes. Our study also identified genetic variants that could explain those associations, opening the door to further functional experiments. Overall, this work demonstrates the value of genomic analyses to support epidemiological data, and to identify targets of relevance in multiple disorders.

Influential Factors and Evaluation Methods of the Performance of Grouted Semi-Flexible Pavement (GSP)—A Review

Applied Sciences ◽

10.3390/app11156700 ◽

2021 ◽

Vol 11 (15) ◽

pp. 6700

Author(s):

Xiaogang Guo ◽

Peiwen Hao

Keyword(s):

Literature Review ◽

Systematic Literature Review ◽

Asphalt Concrete ◽

Cement Mortar ◽

Road Construction ◽

Flexible Pavement ◽

Evaluation Methods ◽

Influential Factors ◽

Future Research ◽

Relative Evaluation

Grouted Semi-flexible Pavement (GSP) is a novel pavement composed of open-graded asphalt concrete grouted with high-fluidity cement mortar. Due to its excellent load-bearing and anti-rutting performance, it has great potential as anti-rutting overlay and surface in road construction. However, the understanding of GSP performance remains limited and pertinent findings are inconsistent. This article aims to provide a systematic literature review for the articles which were published between 2000 and 2020 on GSP, explore the problems in the recent research, identify knowledge gaps, and deliver recommendations for future research. The influential factors and the relative evaluation methods of GSP performance are summarized and discussed in this article.

A deep cascaded segmentation of obstructive sleep apnea-relevant organs from sagittal spine MRI

International Journal of Computer Assisted Radiology and Surgery ◽

10.1007/s11548-021-02333-0 ◽

2021 ◽

Vol 16 (4) ◽

pp. 579-588

Author(s):

Tatyana Ivanovska ◽

Amro Daboul ◽

Oleksandr Kalentev ◽

Norbert Hosten ◽

Reiner Biffar ◽

...

Keyword(s):

Obstructive Sleep Apnea ◽

Sleep Apnea ◽

Soft Palate ◽

High Speed ◽

Sleep Apnea Syndrome ◽

Epidemiological Data ◽

Epidemiological Studies ◽

Obstructive Sleep ◽

Validation Set ◽

Spine MRI

Abstract. Purpose: The main purpose of this work was to develop an efficient approach for segmentation of structures that are relevant for diagnosis and treatment of obstructive sleep apnea syndrome (OSAS), namely pharynx, tongue, and soft palate, from mid-sagittal magnetic resonance imaging (MR) data. This framework will be applied to big data acquired within an on-going epidemiological study from a general population. Methods: A deep cascaded framework for subsequent segmentation of pharynx, tongue, and soft palate is presented. The pharyngeal structure was segmented first, since the airway was clearly visible in the T1-weighted sequence. Thereafter, it was used as an anatomical landmark for tongue location. Finally, the soft palate region was extracted using segmented tongue and pharynx structures and used as input for a deep network. In each segmentation step, a UNet-like architecture was applied. Results: The result assessment was performed qualitatively by comparing the region boundaries obtained from the expert to the framework results and quantitatively using the standard Dice coefficient metric. Additionally, cross-validation was applied to ensure that the framework performance did not depend on the specific selection of the validation set. The average Dice coefficients on the test set were 0.89 ± 0.03, 0.87 ± 0.02, and 0.79 ± 0.08 for tongue, pharynx, and soft palate tissues, respectively. The results were similar to other approaches and consistent with expert readings. Conclusion: Due to high speed and efficiency, the framework will be applied for big epidemiological data with thousands of participants acquired within the Study of Health in Pomerania as well as other epidemiological studies to provide information on the anatomical structures and aspects that constitute important risk factors to the OSAS development.
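The Dice coefficient named in this abstract measures the overlap between an expert segmentation and a predicted one as 2|A ∩ B| / (|A| + |B|). A minimal sketch on tiny hypothetical binary masks follows; the masks and the function name are illustrative and unrelated to the study's data or code.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two equally shaped binary masks (nested lists of 0/1)."""
    flat_a = [v for row in mask_a for v in row]
    flat_b = [v for row in mask_b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("masks must have the same shape")
    intersection = sum(1 for a, b in zip(flat_a, flat_b) if a and b)
    total = sum(1 for a in flat_a if a) + sum(1 for b in flat_b if b)
    if total == 0:
        return 1.0  # convention chosen here: two empty masks agree perfectly
    return 2.0 * intersection / total


# Hypothetical 3x3 masks: the prediction misses one of four expert pixels.
expert = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
predicted = [[0, 1, 1], [0, 1, 0], [0, 0, 0]]
score = dice_coefficient(expert, predicted)
print(round(score, 3))
```

A score of 1 indicates identical masks and 0 indicates no overlap, so the reported averages between 0.79 and 0.89 correspond to substantial but imperfect overlap.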

Evaluation of automated microvascular flow analysis software AVA 4: a validation study

Intensive Care Medicine Experimental ◽

10.1186/s40635-021-00380-0 ◽

2021 ◽

Vol 9 (1) ◽

Author(s):

Christian S. Guay ◽

Mariam Khebir ◽

T. Shiva Shahiri ◽

Ariana Szilagyi ◽

Erin Elizabeth Cole ◽

...

Keyword(s):

General Anesthesia ◽

Validation Study ◽

Validation Studies ◽

Gold Standard ◽

Flow Analysis ◽

Automated Analysis ◽

Altman Analysis ◽

Analysis Software ◽

Bland Altman Analysis ◽

Microvascular Flow

Abstract. Background: Real-time automated analysis of videos of the microvasculature is an essential step in the development of research protocols and clinical algorithms that incorporate point-of-care microvascular analysis. In response to the call for validation studies of available automated analysis software by the European Society of Intensive Care Medicine, and building on a previous validation study in sheep, we report the first human validation study of AVA 4. Methods: Two retrospective perioperative datasets of human microcirculation videos (P1 and P2) and one prospective healthy volunteer dataset (V1) were used in this validation study. Video quality was assessed using the modified Microcirculation Image Quality Selection (MIQS) score. Videos were initially analyzed with (1) AVA software 3.2 by two experienced investigators using the gold standard semi-automated method, followed by an analysis with (2) AVA automated software 4.1. Microvascular variables measured were perfused vessel density (PVD), total vessel density (TVD), and proportion of perfused vessels (PPV). Bland–Altman analysis and intraclass correlation coefficients (ICC) were used to measure agreement between the two methods. Each method’s ability to discriminate between microcirculatory states before and after induction of general anesthesia was assessed using paired t-tests. Results: Fifty-two videos from P1, 128 videos from P2 and 26 videos from V1 met inclusion criteria for analysis. Correlational analysis and Bland–Altman analysis revealed poor agreement and no correlation between AVA 4.1 and AVA 3.2. Following the induction of general anesthesia, TVD and PVD measured using AVA 3.2 increased significantly for P1 (p < 0.05) and P2 (p < 0.05). However, these changes could not be replicated with the data generated by AVA 4.1. Conclusions: AVA 4.1 is not a suitable tool for research or clinical purposes at this time. Future validation studies of automated microvascular flow analysis software should aim to measure the new software’s agreement with the gold standard, its ability to discriminate between clinical states and the quality thresholds at which its performance becomes unacceptable.
