Comprehensive Comparative Analysis of Local False Discovery Rate Control Methods

Metabolites, 2021, Vol 11 (1), pp. 53
Author(s): Shin June Kim, Youngjae Oh, Jaesik Jeong

As technology advances, data are becoming more complex and larger in scale, and analyzing them requires more advanced techniques. For omics data from two different groups, a common goal is to find biomarkers that differ significantly between the groups while controlling an error rate such as the false discovery rate (FDR). Over the last few decades, many methods that control the local false discovery rate have been developed, ranging from one-dimensional to k-dimensional FDR procedures. For this comparison study, we select three of them, each with unique and significant properties: in chronological order, Efron’s approach, Ploner’s approach, and Kim’s approach. The first is a one-dimensional approach, while the other two are two-dimensional. We also consider two variants of Ploner’s approach. We compare the performance of these methods on both simulated and real data.
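To make the one-dimensional case concrete, the sketch below estimates a local false discovery rate, fdr(z) = pi0 * f0(z) / f(z), for a set of z-scores. It is only an illustration under stated assumptions, not the procedure evaluated in the paper: the null density f0 is taken to be the theoretical N(0, 1), the mixture density f is estimated with a plain Gaussian kernel estimate (a simplification swapped in here for brevity), and pi0 defaults to 1, which gives a conservative estimate. The function names and the simulated data are hypothetical.

```python
import math
import random


def normal_pdf(z, mean=0.0, sd=1.0):
    """Density of a normal distribution at z."""
    u = (z - mean) / sd
    return math.exp(-0.5 * u * u) / (sd * math.sqrt(2.0 * math.pi))


def rule_of_thumb_bandwidth(sample):
    """Normal-reference bandwidth: 1.06 * sd * n^(-1/5)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def kernel_density(z, sample, bandwidth):
    """Gaussian kernel estimate of the mixture density f at z."""
    return sum(normal_pdf(z, x, bandwidth) for x in sample) / len(sample)


def local_fdr(points, sample, pi0=1.0):
    """Estimate fdr(z) = pi0 * f0(z) / f(z) at each z in `points`.

    Assumptions of this sketch: f0 is the theoretical N(0, 1) null,
    f is estimated from `sample` by a kernel density estimate, and
    pi0 = 1.0 is a conservative choice for the null proportion.
    """
    bandwidth = rule_of_thumb_bandwidth(sample)
    result = []
    for z in points:
        density = kernel_density(z, sample, bandwidth)
        # If the estimated density underflows to zero, report fdr = 1.
        ratio = pi0 * normal_pdf(z) / density if density > 0.0 else 1.0
        result.append(min(1.0, ratio))
    return result


if __name__ == "__main__":
    rng = random.Random(0)
    # Simulated z-scores: 450 nulls from N(0, 1), 50 non-nulls from N(4, 1).
    z_scores = ([rng.gauss(0.0, 1.0) for _ in range(450)]
                + [rng.gauss(4.0, 1.0) for _ in range(50)])
    fdr = local_fdr(z_scores, z_scores)
    flagged = sum(1 for value in fdr if value < 0.2)
    print(f"{flagged} of {len(z_scores)} features have estimated fdr < 0.2")
```

Two-dimensional procedures apply the same ratio to a pair of statistics per feature, which requires estimating bivariate densities instead of the univariate ones used here.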

Genes, 2020, Vol 11 (2), pp. 167
Author(s): Qingyang Zhang

The nonparanormal graphical model has emerged as an important tool for modeling dependency structure between variables because it accommodates non-Gaussian data while retaining the interpretability and computational convenience of Gaussian graphical models. In this paper, we consider the problem of detecting differential substructure between two nonparanormal graphical models with false discovery rate control. We construct a new statistic based on a truncated estimator of the unknown transformation functions, together with a bias-corrected sample covariance. Furthermore, we show that the new test statistic converges to the same distribution as its oracle counterpart. Both synthetic data and real cancer genomic data are used to illustrate the promise of the new method. Our proposed testing framework is simple and scalable, which facilitates its application to large-scale data. The computational pipeline has been implemented in the R package DNetFinder, which is freely available through the Comprehensive R Archive Network.
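As a rough illustration of what a truncated estimator of a transformation function can look like, the sketch below applies a Winsorized empirical-CDF transform to one variable and maps it to normal scores. The abstract does not spell out the estimator, so this is an assumption: the truncation level delta_n = 1 / (4 n^(1/4) sqrt(pi log n)) is the one commonly used in the nonparanormal literature, and the function names are hypothetical. It is written in Python for illustration and is not the DNetFinder implementation.

```python
import bisect
import math
from statistics import NormalDist


def truncation_level(n):
    """Assumed truncation level: 1 / (4 * n^(1/4) * sqrt(pi * log n)), n >= 2."""
    return 1.0 / (4.0 * n ** 0.25 * math.sqrt(math.pi * math.log(n)))


def nonparanormal_transform(values):
    """Map one variable to normal scores via a truncated empirical CDF."""
    n = len(values)
    delta = truncation_level(n)
    std_normal = NormalDist()
    sorted_values = sorted(values)
    scores = []
    for v in values:
        # Empirical CDF: fraction of observations less than or equal to v.
        ecdf = bisect.bisect_right(sorted_values, v) / n
        # Truncate to [delta, 1 - delta] so the normal quantile stays finite.
        clipped = min(max(ecdf, delta), 1.0 - delta)
        scores.append(std_normal.inv_cdf(clipped))
    return scores


if __name__ == "__main__":
    skewed = [1.0, 10.0, 100.0, 1000.0, 5.0]
    for raw, score in zip(skewed, nonparanormal_transform(skewed)):
        print(f"{raw:8.1f} -> {score:+.3f}")
```

The transform is monotone, so it preserves the ordering of the observations while removing the skew; a covariance or correlation matrix computed on the transformed scores is what a Gaussian graphical model would then be fitted to.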


2018, Vol 113 (523), pp. 1172-1183
Author(s): Pallavi Basu, T. Tony Cai, Kiranmoy Das, Wenguang Sun

2018
Author(s): Qike Li, Samir Rachid Zaim, Dillon Aberasturi, Joanne Berghout, Haiquan Li, ...

Calculating Differentially Expressed Genes (DEGs) from RNA-sequencing requires replicates to estimate gene-wise variability, which is infeasible in clinics. Comparing two conditions without replicates (TCWR) has been proposed by imposing restrictive transcriptome-wide assumptions that limit the inferential opportunities of conventional methods (edgeR, NOISeq-sim, DESeq, DEGseq), but it has not been evaluated. Under TCWR conditions (e.g., unaffected tissue vs. tumor), differences of transformed expression in the proposed individualized DEG (iDEG) method follow a distribution calculated across a local partition of related transcripts at baseline expression; the probability of each DEG is then estimated by empirical Bayes with local false discovery rate control using a two-group mixture model. In extensive simulation studies of TCWR methods, iDEG and NOISeq are more accurate at 5%<DEGs<20% (precision>90%, recall>75%, false_positive_rate<1%) and 30%<DEGs<40% (precision=recall∼90%), respectively. The proposed iDEG method borrows localized distribution information from the same individual, a strategy that improves accuracy when comparing transcriptomes in the absence of replicates when the proportion of DEGs is low. http://www.lussiergroup.org/publications/iDEG
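For readers unfamiliar with the two-group mixture model mentioned above, its standard textbook form is shown below. The notation is an assumption made for illustration and is not taken from the iDEG paper: z is the per-gene statistic, \pi_0 is the prior proportion of non-DEGs, and f_0 and f_1 are the densities of z for non-DEGs and DEGs.

```latex
f(z) = \pi_0 f_0(z) + (1 - \pi_0) f_1(z), \qquad
\mathrm{fdr}(z) = \Pr(\text{non-DEG} \mid z) = \frac{\pi_0 f_0(z)}{f(z)}
```

Under this formulation, a gene is reported as a DEG when its estimated fdr(z) falls below a chosen threshold.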


2019
Author(s): Johannes Köster, Louis J. Dijkstra, Tobias Marschall, Alexander Schönhuth

As witnessed by various population-scale cancer genome sequencing projects, accurate discovery of somatic variants has become of central importance in modern cancer research. However, count statistics on the somatic insertions and deletions (indels) discovered so far indicate that a large number of variants must have been missed. The reason is that the combination of uncertainties relating to, for example, gap and alignment ambiguities, twilight zone indels, cancer heterogeneity, sample purity, sampling and strand bias is hard to quantify accurately. Here, a unifying statistical model is provided whose dependency structures make it possible to quantify all inherent uncertainties accurately and quickly. As a major consequence, the false discovery rate (FDR) in somatic indel discovery can now be controlled with utmost accuracy. As demonstrated on simulated and real data, this makes it possible to dramatically increase the number of true discoveries while safely suppressing the FDR. Specifically supported by workflow design, our approach can be integrated as a post-processing step in large-scale projects. The software is publicly available at https://varlociraptor.github.io and can be easily installed via Bioconda [Grüning et al., 2018].

