Personalized beyond Precision: Designing Unbiased Gold Standards to Improve Single-Subject Studies of Personal Genome Dynamics from Gene Products

2020 ◽  
Vol 11 (1) ◽  
pp. 24
Author(s):  
Samir Rachid Zaim ◽  
Colleen Kenost ◽  
Hao Helen Zhang ◽  
Yves A. Lussier

Background: Developing patient-centric baseline standards that enable the detection of clinically significant outlier gene products on a genome scale remains an unaddressed challenge required for advancing personalized medicine beyond the small pools of subjects implied by “precision medicine”. This manuscript proposes a novel approach to reference standard development for evaluating the accuracy of single-subject analyses of transcriptomes, with extensions to proteomes and metabolomes. In evaluation frameworks for which the distributional assumptions of statistical testing imperfectly model the genome dynamics of gene products, artefacts and biases are confounded with authentic signals. Model confirmation biases escalate when studies use the same analytical methods in the discovery sets and the reference standards; in such studies, replicated biases are confounded with measures of accuracy. We hypothesized that developing method-agnostic reference standards would reduce such replication biases. We propose to evaluate a discovery method with a reference standard derived from a consensus of analytical methods distinct from the discovery one, in order to minimize statistical artefact biases. Our methods involve effect-size thresholding and expression-level filtering of results to improve the consensus between analytical methods. We developed and released an R package, “referenceNof1”, to facilitate the construction of robust reference standards. Results: Because RNA-Seq data analysis methods range from those relying on binomial and negative binomial assumptions to non-parametric analyses, the differences between them create statistical noise and make the resulting reference standards method-dependent. In our experimental design, the accuracy of 30 distinct combinations of fold-change (FC) and expression-count (hereinafter “expression”) thresholds was determined for five types of RNA analyses in two distinct datasets: breast cancer cell lines and a yeast study with isogenic biological replicates in two experimental conditions. Furthermore, the reference standard (RS) comprised all RNA analytical methods with the exception of the method being tested for accuracy. To mitigate biases towards a specific analytical method, the pairwise Jaccard Concordance Index between the observed results of distinct analytical methods was calculated for optimization. Optimization through effect-size thresholding and expression-level filtering removed the greatest discordances between the distinct methods’ results and yielded a 65% increase in concordance. Conclusions: We have demonstrated that comparing the accuracies of different single-subject analysis methods for clinical optimization in transcriptomics requires a new evaluation framework. Reliable and robust reference standards, independent of the evaluated method, can be obtained under a limited number of parameter combinations: fold-change (FC) range thresholds, expression-level cutoffs, and exclusion of the tested method from the RS development process. When applying anticonservative reference standard frameworks (e.g., using the same method for RS development and prediction), most of the concordant signal between the prediction and the Gold Standard (GS) cannot be confirmed by other methods, which we conclude reflects biased results. Statistical tests to determine differentially expressed genes (DEGs) from a single-subject study generate many biased results requiring subsequent filtering to increase reliability.
Conventional single-subject studies pertain to one or a few of a patient's measures over time and require a substantial extension of the conceptual framework to address the numerous measures in genome-wide analyses of gene products. The proposed referenceNof1 framework addresses some of the inherent challenges of improving transcriptome-scale single-subject analyses by providing a robust approach to constructing reference standards.
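A minimal sketch in R of the concordance computation described above, assuming per-method DEG calls are already available as character vectors of gene IDs; the function and object names are illustrative and are not the released referenceNof1 API.

```r
# Pairwise Jaccard Concordance Index between DEG calls from several
# analytical methods, after an illustrative fold-change/expression filter.

jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

pairwise_jaccard <- function(deg_calls) {   # deg_calls: named list of gene-ID vectors
  sapply(deg_calls, function(a) sapply(deg_calls, function(b) jaccard(a, b)))
}

# Keep genes passing the thresholds discussed in the abstract; `res` is assumed
# to be a data.frame with columns gene, fc (absolute fold change >= 1), mean_expr.
filter_genes <- function(res, fc_cut = 1.2, expr_cut = 30) {
  res$gene[res$fc >= fc_cut & res$mean_expr >= expr_cut]
}
```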

Author(s):  
Samir Rachid Zaim ◽  
Colleen Kenost ◽  
Hao Helen Zhang ◽  
Yves A. Lussier

Background: Developing patient-centric baseline standards that enable the detection of clinically significant outlier gene products on a genome scale remains an unaddressed challenge required for advancing personalized medicine beyond the small pools of subjects implied by “precision medicine”. This manuscript proposes a novel approach to reference standard development for evaluating the accuracy of single-subject analyses of metabolomes, proteomes, or transcriptomes. Since the distributional assumptions of statistical testing may inadequately model the genome dynamics of gene products, the so-called significant results of previous studies may artefactually conflate with real signals. Model confirmation biases escalate when studies use the same analytical methods in the discovery sets and the reference standards, as corroboration of results leads to an evaluation of reproducibility that is confounded with replicated biases rather than a measure of accuracy. We hypothesized that developing method-agnostic reference standards, using effect-size and expression-level filtering of results obtained from multiple discovery methods distinct from the one being evaluated, would maximize the evaluation of clinical-transcriptomic signals and minimize statistical artefactual biases. We developed and released an R package, “referenceNof1”, to facilitate the construction of robust reference standards. Results: Because RNA-Seq data analysis methods range from those relying on binomial and negative binomial assumptions to non-parametric analyses, the differences between them create statistical noise and make the resulting reference standards method-dependent. In our experimental design, the accuracy of 30 distinct combinations of fold changes (FC) and expression levels (EL) was determined for five types of RNA analyses in two distinct datasets: breast cancer cell lines and a yeast study with isogenic biological replicates in two experimental conditions. In addition, the reference standard (RS) comprised all RNA analytical methods with the exception of the method being tested for accuracy. To mitigate biased optimization of the RS parameters towards a specific analytical method, the similarity between the observed results of distinct analytical methods was calculated across all methods (Jaccard Concordance Index). The greatest differences were observed across diametric extremes. For example, filtering out differentially expressed genes (DEGs) with a fold change < 1.2 leads to a 50% increase in concordance between techniques when compared to results with FC < 1.2. Combining this FC cutoff with genes with mean expression > 30 counts leads to a 65% increase in concordance in comparison to genes with expression levels < 30 counts and FC < 1.2. Conclusions: We have demonstrated that comparing the accuracies of different single-subject analysis methods for clinical optimization requires a new evaluation framework. Reliable and robust reference standards, independent of the evaluated method, can be obtained under a limited number of parameter combinations: fold-change (FC) range thresholds, expression-level cutoffs, and exclusion of the tested method from the RS development process. When applying anticonservative reference standard frameworks (e.g., using the same method for RS development and for prediction), a majority of the concordant signal between the prediction and the Gold Standard (GS) cannot be confirmed by other methods, which we conclude reflects biased results.
Statistical tests to determine DEGs from a single-subject study generate many biased results that require subsequent filtering to increase their reliability. Conventional single-subject studies pertain to one or a few measures in one patient over time [1] and need a substantial extension of the conceptual framework in order to address the tens of thousands of measures in genome-wide analyses of gene products. The proposed referenceNof1 framework addresses some of the inherent challenges in improving transcriptome-scale single-subject analyses by providing a robust approach to constructing reference standards. Github: https://github.com/SamirRachidZaim/referenceNof1
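The leave-one-method-out reference standard described in both versions of this abstract can be sketched in a few lines of R; the function names, the consensus rule, and the input format are assumptions for illustration, not the referenceNof1 implementation.

```r
# Build a reference standard (RS) from the consensus of all analytical methods
# except the one being evaluated; `deg_calls` is a named list of gene-ID vectors
# that have already passed the FC and expression filters.
build_reference <- function(deg_calls, tested_method, min_votes = NULL) {
  others <- deg_calls[setdiff(names(deg_calls), tested_method)]
  if (is.null(min_votes)) min_votes <- length(others)  # default: full consensus
  votes <- table(unlist(others))                       # how many methods call each gene
  names(votes)[votes >= min_votes]
}

# e.g. rs <- build_reference(filtered_calls, tested_method = "DESeq2")
```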


2018 ◽  
Author(s):  
Samir Rachid Zaim ◽  
Colleen Kenost ◽  
Joanne Berghout ◽  
Hao Helen Zhang ◽  
Yves A. Lussier

Background: Gene expression profiling has benefited medicine by providing clinically relevant insights at the molecular-candidate and systems levels. However, to adopt a more ‘precision’ approach that integrates individual variability, including ‘omics data, into risk assessments, diagnoses, and therapeutic decision making, whole-transcriptome expression analysis requires methodological advancements. One need is for users to be able to confidently make individual-level inferences from whole-transcriptome data. We propose that biological replicates in isogenic conditions can provide a framework for testing differentially expressed genes (DEGs) in a single subject (ss) in the absence of an appropriate external reference standard or replicates. Methods: Eight ss methods for identifying genes with differential expression (NOISeq, DEGseq, edgeR, mixture model, DESeq, DESeq2, iDEG, and ensemble) were compared in Yeast (parental line versus snf2 deletion mutant; n=42/condition) and MCF7 breast cancer cell (baseline and stimulated with estradiol; n=7/condition) RNA-Seq datasets, where replicate analysis was used to build reference standards from NOISeq, DEGseq, edgeR, DESeq, and DESeq2. Each dataset was randomly partitioned so that approximately two-thirds of the paired samples were used to construct reference standards, while the remainder were treated separately as single-subject sample pairs in which DEGs were assayed using ss methods. Receiver operating characteristic (ROC) and precision-recall plots were determined for all ss methods against each RS in both datasets (525 combinations). Results: Consistent with prior analyses of these data, ~50% and ~15% of genes were identified as DEGs in the Yeast and MCF7 reference standard datasets, respectively, regardless of the analytical method. NOISeq, edgeR, and DESeq were the most concordant and robust methods for creating a reference standard. Single-subject versions of NOISeq, DEGseq, and an ensemble learner achieved the best median ROC area under the curve for comparing two transcriptomes without replicates, regardless of the type of reference standard (>0.90 in Yeast, >0.75 in MCF7). Conclusion: Better and more consistent accuracies are obtained by an ensemble method applied to single-subject studies across different conditions. In addition, distinct single-subject methods perform better at different proportions of DEGs. Single-subject methods for identifying DEGs from paired samples need improvement, as no method performs with both precision > 90% and recall > 90%. http://www.lussiergroup.org/publications/EnsembleBiomarker
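A compact R sketch of the kind of evaluation described here, assuming per-gene scores from each single-subject method and a reference-standard gene set; the Mann-Whitney formulation of ROC AUC and the rank-averaging ensemble are illustrative simplifications, not the paper's exact procedure.

```r
# ROC AUC of one single-subject method against a reference standard (RS),
# using the Mann-Whitney identity; `score` ranks genes (higher = more likely DEG)
# and `rs_genes` is the set of RS-positive gene IDs.
auc_vs_reference <- function(gene, score, rs_genes) {
  is_pos <- gene %in% rs_genes
  r <- rank(score)
  n_pos <- sum(is_pos); n_neg <- sum(!is_pos)
  (sum(r[is_pos]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# A naive ensemble for illustration: average each gene's rank across methods.
ensemble_score <- function(score_matrix) {   # rows = genes, cols = ss methods
  rowMeans(apply(score_matrix, 2, rank))
}
```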


2013 ◽  
Vol 82 (3) ◽  
pp. 358-374 ◽  
Author(s):  
Maaike Ugille ◽  
Mariola Moeyaert ◽  
S. Natasha Beretvas ◽  
John M. Ferron ◽  
Wim Van den Noortgate

2021 ◽  
pp. 014544552110540
Author(s):  
Nihal Sen

The purpose of this study is to provide a brief introduction to effect size calculation in single-subject design studies, including a description of nonparametric and regression-based effect sizes. The rest of the tutorial focuses on common regression-based methods used to calculate effect size in single-subject experimental studies. We start by describing the differences between five regression-based methods (Gorsuch; White et al.; Center et al.; Allison and Gorman; Huitema and McKean). This is followed by an example applying the five regression-based effect size methods and a demonstration of how they can be used on a sample data set, which answers the question of how the values obtained from different effect size methods differ. The specific regression models used in these five methods, and how they can be obtained from the SPSS program, are shown. The R² values obtained from the five methods were converted to Cohen's d and compared: the d values estimated from the same data set were 0.003, 0.357, 2.180, 3.470, and 2.108 for the Allison and Gorman, Gorsuch, White et al., Center et al., and Huitema and McKean methods, respectively. A brief description of selected statistical programs available for conducting the regression-based methods is also given.
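As a hedged illustration of one of the five approaches (the Huitema and McKean four-parameter model), the sketch below fits the model with base R rather than SPSS on a toy series; the data, the baseline length, and the R²-to-d conversion shown (d = 2·sqrt(R²/(1−R²))) are assumptions and may differ from the tutorial's exact procedure.

```r
# Huitema & McKean regression for a two-phase (A-B) single-subject series:
# y = b0 + b1*time + b2*phase + b3*(time - (n1 + 1))*phase + error
n1 <- 5                                               # baseline phase length (assumed)
dat <- data.frame(
  y     = c(3, 4, 4, 5, 4, 8, 9, 10, 9, 11),          # toy outcome series
  time  = 1:10,
  phase = rep(c(0, 1), each = 5)                      # 0 = baseline, 1 = treatment
)
dat$slope_change <- (dat$time - (n1 + 1)) * dat$phase

fit <- lm(y ~ time + phase + slope_change, data = dat)
r2  <- summary(fit)$r.squared

# One common R^2 -> Cohen's d conversion (d = 2f, with f^2 = R^2 / (1 - R^2))
d <- 2 * sqrt(r2 / (1 - r2))
```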


Work ◽  
2020 ◽  
Vol 67 (1) ◽  
pp. 173-183
Author(s):  
Leonardo Correa Segedi ◽  
Daniel Rodrigues Ferreira Saint-Martin ◽  
Carlos Janssen Gomes da Cruz ◽  
Edgard M. K. Von Koenig Soares ◽  
Nayara Lima do Nascimento ◽  
...  

BACKGROUND: A minimum level of cardiorespiratory fitness (CRF) has been recommended for firefighters because of their job requirements, so it is important to identify accurate and readily available methods to assess CRF in this population. Non-exercise CRF estimates (NEx-CRF) have been proposed, but this approach requires validation in firefighters. OBJECTIVE: To evaluate the accuracy of a NEx-CRF estimate, compared with a field maximal exercise test, among career military firefighters of both genders using a comprehensive agreement analysis. METHODS: We evaluated the accuracy of a NEx-CRF estimate compared with the Cooper 12-min running test among 702 male and 106 female firefighters. RESULTS: The Cooper and NEx-CRF tests yielded similar CRF in both genders (differences < 1.8 ± 4.7 mL/kg/min; effect size < 0.34). However, NEx-CRF underestimated Cooper-derived CRF among the fittest firefighters. NEx-CRF showed moderate to high sensitivity/specificity for detecting fit or unfit firefighters (71.9% among men and 100% among women). Among men, the NEx-CRF method correctly identified most firefighters with less than 11 METs or greater than 13 METs, but showed lower precision in discriminating those with CRF between 11 and 13 METs. CONCLUSIONS: The NEx-CRF method for estimating firefighters' CRF may be considered an alternative when an exercise-based method is not available, or it may be used to identify those who require more traditional testing (CRF 11-13 METs).
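A minimal R sketch of the agreement analysis this abstract describes, assuming vectors of Cooper-derived and non-exercise CRF estimates in mL/kg/min; the Bland-Altman limits and the illustrative 12-MET cutoff (12 × 3.5 = 42 mL/kg/min) are assumptions standing in for the study's exact criteria.

```r
# Agreement between Cooper-test CRF and the non-exercise estimate (NEx-CRF),
# plus sensitivity/specificity for flagging "unfit" firefighters below a cutoff.
crf_agreement <- function(cooper, nex, cutoff_mets = 12) {
  cutoff <- cutoff_mets * 3.5                          # METs -> mL/kg/min
  d <- nex - cooper
  unfit_true <- cooper < cutoff
  unfit_pred <- nex    < cutoff
  list(
    bias        = mean(d),                             # mean difference
    loa         = mean(d) + c(-1.96, 1.96) * sd(d),    # Bland-Altman limits of agreement
    sensitivity = sum(unfit_pred & unfit_true) / sum(unfit_true),
    specificity = sum(!unfit_pred & !unfit_true) / sum(!unfit_true)
  )
}
```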


1985 ◽  
Vol 68 (4) ◽  
pp. 680-683
Author(s):  
Geraldine Vaughan Mitchell ◽  
Mamie Young Jenkins

Abstract Rat bioassay was used to assess the protein quality of powdered infant formulas and to evaluate the feasibility of using modified casein diets (containing the same source and level of fat and carbohydrate contributed by the infant formulas) as reference standards. Modification of the casein diet to match the milk-based formulas caused a significant reduction in weekly protein efficiency ratios (PER) and net protein ratios (NPR) at the third and fourth weeks. Modification of the casein diet to simulate the soy-based formulas had no significant effect on NPR values; PER values were more varied. When PER and NPR values of the powdered milk-based formulas were expressed relative to the unmodified reference standard, the relative values were lower than when each matched reference was used. With few exceptions, the relative weekly PER values of the soy-based formulas were similar regardless of the standard used. The relative NPR values of the formulas had a pattern similar to the relative PER values. The data indicate that protein quality evaluation of infant formulas using rat bioassay warrants the use of matched casein reference diets for each type of formula.
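For readers unfamiliar with the two indices compared above, a small R sketch of their standard definitions follows; expressing results relative to the casein reference diet is shown only for illustration and does not reproduce the study's data.

```r
# Protein efficiency ratio (PER) and net protein ratio (NPR), the two indices
# evaluated against the casein reference standards in the study.
per <- function(weight_gain_g, protein_intake_g) weight_gain_g / protein_intake_g

npr <- function(weight_gain_g, protein_free_loss_g, protein_intake_g) {
  (weight_gain_g + protein_free_loss_g) / protein_intake_g
}

# Relative values express each formula against the chosen casein reference diet,
# e.g. relative PER = 100 * PER(formula) / PER(reference).
relative_index <- function(test, reference) 100 * test / reference
```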


Author(s):  
Sanhua Zhang ◽  
Chongmin Jiang ◽  
Chunjing Tu

Background: The current national growth and development standard for preschool children in China was formulated in 2003 and has many deficiencies. It is necessary to construct more scientific percentile curves and growth reference standards in order to evaluate the growth, development, and health status of Chinese children more effectively. Methods: Based on physical fitness and health data measured in 31 Chinese provinces in 2010 and 2014, the GAMLSS model was used to construct growth reference standards and the corresponding curves. Results: We obtained percentile-curve and Z-score-curve growth reference standards for height-for-age, sitting height-for-age, weight-for-age, and chest circumference-for-age in Chinese preschool children. The C50 (median) percentile of all indicators showed a clear increasing trend from ages 3.0 to 6.5 years: height increased by 21.1 cm in boys and 20.3 cm in girls, sitting height by 10.3 cm and 10.1 cm, weight by 7.1 kg and 6.3 kg, and chest circumference by 6.0 cm and 5.2 cm, respectively. Conclusion: The growth and development charts provided in this study offer effective monitoring and personalized evaluation tools for assessing the growth and development of preschool children, as well as for reducing malnutrition and preventing and controlling childhood obesity. They are recommended for use in areas such as child health care, clinical medicine, and public health.
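A hedged sketch of fitting age-conditional reference centiles with the gamlss R package (the GAMLSS approach named in the abstract); the data frame `d`, its column names, and the family/smoother choices are illustrative assumptions, not the authors' exact model.

```r
library(gamlss)

# d: one row per child, with columns age (years) and height (cm)
m <- gamlss(height ~ pb(age),
            sigma.formula = ~ pb(age),
            nu.formula    = ~ pb(age),
            family = BCCG,      # LMS-type (Box-Cox Cole-Green) distribution
            data   = d)

# Draw the reference centiles, including the C50 (median) curve
centiles(m, xvar = d$age, cent = c(3, 10, 25, 50, 75, 90, 97))
```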


2021 ◽  
Author(s):  
Daniela F de Albuquerque ◽  
Jack Goffinet ◽  
Rachael Wright ◽  
John M Pearson

While functional magnetic resonance imaging (fMRI) remains one of the most widespread and important methods in basic and clinical neuroscience, the data it produces (time series of brain volumes) continue to pose daunting analysis challenges. The current standard ("mass univariate") approach involves constructing a matrix of task regressors, fitting a separate general linear model at each volume pixel ("voxel"), computing test statistics for each model, and correcting for false positives post hoc using bootstrap or other resampling methods. Despite its simplicity, this approach has enjoyed great success over the last two decades due to: 1) its ability to produce effect maps highlighting brain regions whose activity significantly correlates with a given variable of interest; and 2) its modeling of experimental effects as separable and thus easily interpretable. However, this approach suffers from several well-known drawbacks, namely: inaccurate assumptions of linearity and noise Gaussianity; a limited ability to capture individual effects and variability; and difficulties in performing proper statistical testing because voxels are fit independently. In this work, we adopt a different approach, modeling entire volumes directly in a manner that increases model flexibility while preserving interpretability. Specifically, we use a generalized additive model (GAM) in which the effect of each regressor remains separable and is modeled as the product of a spatial map produced by a variational autoencoder and a (potentially nonlinear) gain modeled by a covariate-specific Gaussian process. The result is a model that yields group-level effect maps comparable or superior to those obtained with standard fMRI analysis software while also producing single-subject effect maps that capture individual differences. This suggests that generative models with a decomposable structure might offer a more flexible alternative for the analysis of task-based fMRI data.
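To make the "mass univariate" baseline referenced above concrete, here is a minimal R sketch that fits one linear model per voxel against an assumed task design matrix and returns a map of t-statistics; the inputs and chosen regressor index are illustrative, and real pipelines add HRF convolution, nuisance regressors, and multiple-comparison correction.

```r
# Mass-univariate analysis: one GLM per voxel, returning the t-statistic of a
# chosen task regressor. Y is a time x voxels matrix of BOLD signal; X is a
# time x regressors design matrix that already includes an intercept column.
mass_univariate_t <- function(Y, X, regressor = 2) {
  apply(Y, 2, function(y) {
    fit <- lm(y ~ X - 1)                    # X supplies its own intercept
    summary(fit)$coefficients[regressor, "t value"]
  })
}
# The resulting per-voxel t-map would then be thresholded after a
# multiple-comparison correction, as the abstract notes.
```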

