Simulation of Random Differential Periodontitis Outcome Misclassification with Perfect Specificity

2021 ◽  
pp. 238008442110071
Author(s):  
T.S. Alshihayb ◽  
B. Heaton

Introduction: Misclassification of clinical periodontitis can occur with partial-mouth protocols, particularly when tooth-based case definitions are applied. In these cases, the true prevalence of periodontal disease is underestimated, but specificity is perfect. In association studies of periodontal disease etiology, misclassification by this mechanism is independent of exposure status (i.e., nondifferential). Despite nondifferential mechanisms, differential misclassification may be realized by virtue of random error. Objectives: To gauge the amount of uncertainty around the expectation of differential periodontitis outcome misclassification due to random error only, we estimated the probability of differential outcome misclassification, its magnitude, and its expected impacts via simulation methods using values from the periodontitis literature. Methods: We simulated data sets with a binary exposure and outcome that varied according to sample size (200; 1,000; 5,000; 10,000), exposure effect (risk ratio: 1.5, 2), exposure prevalence (0.1, 0.3), outcome incidence (0.1, 0.4), and outcome sensitivity (0.6, 0.8). Using a Bernoulli trial, we introduced misclassification by randomly sampling individuals with the outcome in each exposure group and repeated each scenario 10,000 times. Results: The probability of differential misclassification decreased as the simulation parameter values increased and occurred at least 37% of the time across the 10,000 repetitions. Across all scenarios, the risk ratio was biased, on average, toward the null when sensitivity was higher among the unexposed and away from the null when it was higher among the exposed. The extent of bias for absolute sensitivity differences ≥0.04 ranged from 0.05 to 0.19 regardless of the simulation parameters. However, similar trends were not observed for the odds ratio, where the extent and direction of bias depended on the outcome incidence, the sensitivity of classification, and the effect size. Conclusions: The results of this simulation provide helpful quantitative information to guide interpretation of findings in which nondifferential outcome misclassification mechanisms are known to be operational with perfect specificity. Knowledge Transfer Statement: Measurement of periodontitis can suffer from classification errors, such as when partial-mouth protocols are applied. In this case, specificity is perfect and sensitivity is expected to be nondifferential, leading to an expectation of no bias when studying periodontitis etiologies. Despite this expectation, differential misclassification could occur from sources of random error, the effects of which are unknown. Proper scrutiny of research findings can occur when the probability and impact of random classification errors are known.
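The misclassification mechanism described in the Methods is simple enough to reproduce directly. Below is a minimal Python sketch of one simulation scenario (not the authors' code), using illustrative parameter values from the ranges above: a true outcome is generated from the exposure-specific risk, each true case is then retained with probability equal to the sensitivity (a Bernoulli trial), and no false positives are introduced, so specificity stays perfect.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_once(n=1000, exposure_prev=0.3, incidence=0.1,
                  risk_ratio=2.0, sensitivity=0.8):
    """One repetition: generate a binary exposure and true outcome, then
    misclassify the outcome with perfect specificity and imperfect sensitivity."""
    exposed = rng.random(n) < exposure_prev
    risk = np.where(exposed, risk_ratio * incidence, incidence)
    outcome = rng.random(n) < risk

    # Bernoulli trial: each true case is detected with probability = sensitivity;
    # non-cases are never misclassified, so specificity stays perfect.
    observed = outcome & (rng.random(n) < sensitivity)

    # Realized sensitivity in each exposure group; these differ between groups
    # by chance alone, i.e. random differential misclassification.
    sens_exposed = observed[exposed & outcome].mean()
    sens_unexposed = observed[~exposed & outcome].mean()

    # Risk ratio estimated from the misclassified outcome
    rr_observed = observed[exposed].mean() / observed[~exposed].mean()
    return sens_exposed, sens_unexposed, rr_observed

results = np.array([simulate_once() for _ in range(10_000)])
sens_diff = results[:, 0] - results[:, 1]
print("Proportion of repetitions with differential sensitivity:",
      np.mean(sens_diff != 0))
print("Mean observed risk ratio:", results[:, 2].mean())
```

Because the expected sensitivity is the same in both exposure groups, the observed risk ratio is roughly unbiased on average over many repetitions; any single study, however, can realize differential sensitivity and a biased estimate, which is the uncertainty the paper quantifies.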

2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Danqing Xu ◽  
Chen Wang ◽  
Atlas Khan ◽  
Ning Shang ◽  
Zihuai He ◽  
...  

Labeling clinical data from electronic health records (EHR) in health systems requires extensive expert knowledge and painstaking review by clinicians. Furthermore, existing phenotyping algorithms are not uniformly applied across large datasets and can suffer from inconsistencies in case definitions across different algorithms. We describe here quantitative disease risk scores based on almost unsupervised methods that require minimal input from clinicians, can be applied to large datasets, and alleviate some of the main weaknesses of existing phenotyping algorithms. We show applications to phenotypic data on approximately 100,000 individuals in eMERGE, and focus on several complex diseases, including Chronic Kidney Disease, Coronary Artery Disease, Type 2 Diabetes, Heart Failure, and a few others. We demonstrate that, relative to existing approaches, the proposed methods have higher prediction accuracy, can better identify phenotypic features relevant to the disease under consideration, can perform better at clinical risk stratification, and can identify undiagnosed cases based on phenotypic features available in the EHR. Using genetic data from the eMERGE-seq panel, which includes sequencing data for 109 genes on 21,363 individuals from multiple ethnicities, we also show how the new quantitative disease risk scores help improve the power of genetic association studies relative to the standard use of disease phenotypes. The results demonstrate the effectiveness of quantitative disease risk scores derived from rich phenotypic EHR databases in providing a more meaningful characterization of clinical risk for diseases of interest beyond the prevalent binary (case-control) classification.


1937 ◽  
Vol 33 (4) ◽  
pp. 444-450 ◽  
Author(s):  
Harold Jeffreys

1. It often happens that we have a series of observed data for different values of the argument and with known standard errors, and wish to remove the random errors as far as possible before interpolation. In many cases previous considerations suggest a form for the true value of the function; then the best method is to determine the adjustable parameters in this function by least squares. If the number required is not initially known, as for a polynomial where we do not know how many terms to retain, the number can be determined by finding out at what stage the introduction of a new parameter is not supported by the observations*. In many other cases, again, existing theory does not suggest a form for the solution, but the observations themselves suggest one when the departures from some simple function are found to be much less than the whole range of variation and to be consistent with the standard errors. The same method can then be used. There are, however, further cases where no simple function is suggested either by previous theory or by the data themselves. Even in these the presence of errors in the data is expected. If ε is the actual error of any observed value and σ the standard error, the expectation of Σε²/σ² is equal to the number of observed values. Part, at least, of any irregularity in the data, such as is revealed by the divided differences, can therefore be attributed to random error, and we are entitled to try to reduce it.
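Where a form for the true function is assumed (say a polynomial of uncertain degree), the procedure of fitting by least squares with known standard errors and checking whether an extra parameter is supported can be sketched as follows. This is a minimal Python illustration under assumed data; the stopping rule is a crude stand-in for Jeffreys' significance test, not his actual criterion.

```python
import numpy as np

def weighted_poly_fit(x, y, sigma, degree):
    """Least-squares polynomial fit with known standard errors sigma."""
    coeffs = np.polyfit(x, y, degree, w=1.0 / sigma)
    resid = y - np.polyval(coeffs, x)
    # For a correct model, the expectation of sum(eps^2 / sigma^2) is roughly
    # the number of observations minus the number of fitted parameters.
    chi2 = np.sum((resid / sigma) ** 2)
    return coeffs, chi2

def choose_degree(x, y, sigma, max_degree=6):
    """Raise the degree until the drop in chi-square no longer justifies the
    extra parameter (the threshold of 2 is an assumption, not Jeffreys' rule)."""
    best_degree = 0
    _, best_chi2 = weighted_poly_fit(x, y, sigma, 0)
    for degree in range(1, max_degree + 1):
        _, chi2 = weighted_poly_fit(x, y, sigma, degree)
        if best_chi2 - chi2 < 2.0:
            break
        best_degree, best_chi2 = degree, chi2
    return best_degree
```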


2021 ◽  
Author(s):  
Milton Pividori ◽  
Sumei Lu ◽  
Binglan Li ◽  
Chun Su ◽  
Matthew E. Johnson ◽  
...  

Understanding how dysregulated transcriptional processes result in tissue-specific pathology requires a mechanistic interpretation of expression regulation across different cell types. It has been shown that this insight is key for the development of new therapies. These mechanisms can be identified with transcriptome-wide association studies (TWAS), which have represented an important step forward to test the mediating role of gene expression in GWAS associations. However, due to pervasive eQTL sharing across tissues, TWAS has not been successful in identifying causal tissues, and other methods generally do not take advantage of the large amounts of RNA-seq data publicly available. Here we introduce a polygenic approach that leverages gene modules (genes with similar co-expression patterns) to project both gene-trait associations and pharmacological perturbation data into a common latent representation for a joint analysis. We observed that diseases were significantly associated with gene modules expressed in relevant cell types, such as hypothyroidism with T cells and thyroid, hypertension and lipids with adipose tissue, and coronary artery disease with cardiomyocytes. Our approach was more accurate in predicting known drug-disease pairs and revealed stable trait clusters, including a complex branch involving lipids with cardiovascular, autoimmune, and neuropsychiatric disorders. Furthermore, using a CRISPR screen, we show that genes involved in lipid regulation exhibit more consistent trait associations through gene modules than individual genes. Our results suggest that a gene module perspective can contextualize genetic associations and prioritize alternative treatment targets when GWAS hits are not druggable.
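The central step, re-expressing gene-level trait associations in terms of gene modules so that traits and perturbations share one latent space, can be illustrated with a simple regularized projection. This is only a schematic sketch: the module loading matrix, the ridge penalty, and the input matrices are placeholder assumptions, and the published approach involves additional modeling not shown here.

```python
import numpy as np

def project_to_modules(gene_assoc, module_loadings, ridge=0.1):
    """Project gene-level association scores (genes x traits) onto gene-module
    loadings (genes x modules), giving module-level scores (modules x traits).
    Ridge-regularized least squares; the penalty value is an arbitrary choice."""
    Z = module_loadings
    gram = Z.T @ Z + ridge * np.eye(Z.shape[1])
    return np.linalg.solve(gram, Z.T @ gene_assoc)

# Illustrative shapes only: 5,000 genes, 200 co-expression modules, 10 traits
rng = np.random.default_rng(0)
module_scores = project_to_modules(rng.normal(size=(5000, 10)),
                                   rng.normal(size=(5000, 200)))
```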


2002 ◽  
Vol 5 (6a) ◽  
pp. 969-976 ◽  
Author(s):  
Rudolf Kaaks ◽  
Pietro Ferrari ◽  
Antonio Ciampi ◽  
Martyn Plummer ◽  
Elio Riboli

Objective: To examine statistical models that account for correlation between the random errors of different dietary assessment methods in dietary validation studies. Setting: In nutritional epidemiology, sub-studies on the accuracy of dietary questionnaire measurements are used to correct for biases in relative risk estimates induced by dietary assessment errors. Generally, such validation studies are based on the comparison of questionnaire measurements (Q) with food consumption records or 24-hour diet recalls (R). In recent years, the statistical analysis of such studies has been formalised more in terms of statistical models, which has made the crucial model assumptions more explicit. One key assumption is that random errors must be uncorrelated between measurements Q and R, as well as between replicate measurements R1 and R2 within the same individual. These assumptions may not hold in practice, however. Therefore, more complex statistical models have been proposed that validate measurements Q by simultaneous comparison with measurements R plus a biomarker M, accounting for correlations between the random errors of Q and R. Conclusions: The more complex models accounting for random error correlations may work only for validation studies that include markers of diet based on physiological knowledge about the quantitative recovery, e.g. in urine, of specific elements such as nitrogen or potassium, or of stable isotopes administered to the study subjects (e.g. the doubly labelled water method for assessment of energy expenditure). This type of marker, however, eliminates the problem of correlation of random errors between Q and R by simply taking the place of R, thus rendering the complex statistical models unnecessary.
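When a recovery biomarker M is available alongside Q and R, the simpler analysis that assumes mutually uncorrelated errors across the three instruments is often carried out with the "method of triads". The snippet below is a minimal Python sketch of that calculation; the variable names are illustrative, and the formula breaks down precisely when the error-correlation assumptions questioned in this abstract fail (the coefficient can then exceed 1 or become undefined).

```python
import numpy as np

def triad_validity(q, r, m):
    """Validity coefficient of questionnaire Q against true intake, estimated
    from the pairwise correlations of Q, a reference instrument R, and a
    biomarker M ('method of triads'). Assumes the random errors of Q, R and M
    are mutually uncorrelated."""
    r_qr = np.corrcoef(q, r)[0, 1]
    r_qm = np.corrcoef(q, m)[0, 1]
    r_rm = np.corrcoef(r, m)[0, 1]
    return np.sqrt(r_qr * r_qm / r_rm)
```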


1995 ◽  
Vol 28 (5) ◽  
pp. 590-593 ◽  
Author(s):  
R. A. Winholtz

Two corrections are made to the equations for estimating the counting statistical errors in diffraction stress measurements. It is shown that the previous equations provide a conservative estimate of the counting-statistical component of the random errors in stress measurements. The results from the corrected equations are compared to a Monte Carlo model and to replicated measurements. A procedure to handle other sources of random error is also suggested.
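The Monte Carlo comparison mentioned above can be mimicked with a simple resampling sketch for the sin²ψ technique, in which the stress is proportional to the slope of the measured lattice spacing d against sin²ψ. This is an illustrative Python sketch, not the paper's corrected equations; the elastic constants and the data layout are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def stress_from_slope(slope, d0, E=210e9, nu=0.30):
    """sin^2(psi) relation: stress proportional to the slope of d vs sin^2(psi).
    E and nu are illustrative elastic constants, not values from the paper."""
    return slope * E / (d0 * (1.0 + nu))

def monte_carlo_stress_error(sin2psi, d_meas, d_err, d0, n_draws=10_000):
    """Propagate counting-statistics errors in the peak positions into the stress
    by repeatedly perturbing the d-spacings and refitting the straight line."""
    stresses = np.empty(n_draws)
    for i in range(n_draws):
        d_sim = d_meas + rng.normal(0.0, d_err)
        slope, _ = np.polyfit(sin2psi, d_sim, 1)
        stresses[i] = stress_from_slope(slope, d0)
    return stresses.std()
```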


2011 ◽  
Vol 57 (2) ◽  
pp. 241-254 ◽  
Author(s):  
Emma Ahlqvist ◽  
Tarunveer Singh Ahluwalia ◽  
Leif Groop

BACKGROUND Type 2 diabetes (T2D) is a complex disorder that is affected by multiple genetic and environmental factors. Extensive efforts have been made to identify the disease-affecting genes to better understand the disease pathogenesis, find new targets for clinical therapy, and allow prediction of disease. CONTENT Our knowledge about the genes involved in disease pathogenesis has increased substantially in recent years, thanks to genome-wide association studies and international collaborations joining efforts to collect the huge numbers of individuals needed to study complex diseases on a population level. We have summarized what we have learned so far about the genes that affect T2D risk and their functions. Although more than 40 loci associated with T2D or glycemic traits have been reported and reproduced, only a minor part of the genetic component of the disease has been explained, and the causative variants and affected genes are unknown for many of the loci. SUMMARY Great advances have recently occurred in our understanding of the genetics of T2D, but much remains to be learned about the disease etiology. The genetics of T2D has so far been driven by technology, and we now hope that next-generation sequencing will provide important information on rare variants with stronger effects. Even when variants are known, however, great effort will be required to discover how they affect disease risk.


2011 ◽  
Vol 11 (4) ◽  
pp. 239-248 ◽  
Author(s):  
Saikat Das ◽  
Subhashini John ◽  
Paul Ravindran ◽  
Rajesh Isiah ◽  
Rajesh B ◽  
...  

Context: Setup error significantly affects the accuracy of treatment and outcome in high-precision radiotherapy. Aims: To determine the total, systematic and random errors and the clinical target volume (CTV) to planning target volume (PTV) margin with alpha cradle (VL) and ray cast (RC) immobilisation in the abdominopelvic region. Methods and material: Setup error was assessed by comparing the digitally reconstructed radiograph (DRR), used as the reference image, with electronic portal images (EPI) taken during treatment. Statistical analysis used: The total errors in the mediolateral (ML), craniocaudal (CC) and anteroposterior (AP) directions were compared by t-test. For systematic and random errors, the variance ratio test (F-statistic) was used. Margins were calculated using the International Commission on Radiation Units (ICRU), Stroom's and van Herk's formulae. Results: A total of 306 portal images were analysed, with 144 images in the RC group and 162 images in the VL group. For VL, the systematic errors in the ML, CC and AP directions were (0.45, 0.29, 0.41) cm, the random errors (0.48, 0.32, 0.58) cm, and the CTV-to-PTV margins (1.24, 0.80, 1.25) cm, respectively. For RC, the systematic errors were (0.25, 0.37, 0.80) cm, the random errors (0.46, 0.80, 0.33) cm, and the CTV-to-PTV margins (0.82, 1.30, 1.08) cm, respectively. The differences in random error in the CC and AP directions were statistically significant. Conclusions: Geometric errors and CTV-to-PTV margins differ across directions. For the abdomen and pelvis, the margin ranged from 8 mm to 12.4 mm with VL immobilisation and from 8.2 mm to 13 mm with RC. Therefore, a margin of 10 mm with online correction would be adequate.
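The margin recipes named in the abstract can be written out directly. Below is a minimal Python sketch, assuming per-patient lists of setup shifts in a single direction; it computes the systematic error Σ (SD of the per-patient mean shifts) and the random error σ (RMS of the per-patient SDs) and applies the Stroom (2Σ + 0.7σ) and van Herk (2.5Σ + 0.7σ) recipes. The ICRU-based margin used in the paper is not reproduced here.

```python
import numpy as np

def setup_error_components(shifts_by_patient):
    """shifts_by_patient: one 1-D array per patient holding the setup shifts (cm)
    measured on that patient's portal images in a single direction."""
    means = np.array([s.mean() for s in shifts_by_patient])
    sds = np.array([s.std(ddof=1) for s in shifts_by_patient])
    systematic = means.std(ddof=1)            # Sigma: SD of per-patient mean shifts
    random_err = np.sqrt(np.mean(sds ** 2))   # sigma: RMS of per-patient SDs
    return systematic, random_err

def ctv_to_ptv_margins(systematic, random_err):
    """Two published margin recipes applied to one direction."""
    return {
        "stroom": 2.0 * systematic + 0.7 * random_err,
        "van_herk": 2.5 * systematic + 0.7 * random_err,
    }
```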

