The average likelihood ratio for large-scale multiple testing and detecting sparse mixtures

Author(s):  
Guenther Walther
2017 ◽  
Vol 46 (6) ◽  
pp. 284-292 ◽  
Author(s):  
Denis G. Dumas ◽  
Daniel M. McNeish

Single-timepoint educational measurement practices are capable of assessing student ability at the time of testing but are not designed to be informative of student capacity for developing in any particular academic domain, despite commonly being used in such a manner. For this reason, such measurement practice systematically underestimates the potential of students from nondominant socioeconomic or ethnic groups, who may not have had adequate opportunity to develop various academic skills but can nonetheless do so in the future. One long-standing approach to the partial rectification of this issue is dynamic assessment (DA), a technique that features multiple testing occasions integrated with learning opportunities. However, DA is extremely resource intensive to incorporate into educational assessment practice and cannot be applied to extant large-scale data sets. In this article, the authors describe a recently developed statistical technique, dynamic measurement modeling (DMM), which is capable of estimating quantities associated with DA—including student capacity for learning a particular skill—from existing large-scale longitudinal assessment data, allowing the core concepts of DA to be scaled up for use with secondary data sets such as those collected by Statewide Longitudinal Data Systems in the United States. The authors show that by considering several assessments over time, student capacity can be reliably estimated, and these capacity estimates are much less affected by student race/ethnicity, gender, and socioeconomic status than are single-timepoint assessment scores, thereby improving the consequential validity of measurement.


2015 ◽  
Vol 2015 ◽  
pp. 1-11 ◽  
Author(s):  
Yu-Bing Li ◽  
Xue-Zhong Zhou ◽  
Run-Shun Zhang ◽  
Ying-Hui Wang ◽  
Yonghong Peng ◽  
...  

Background. Traditional Chinese medicine (TCM) is an individualized medicine based on observing the symptoms and signs (symptoms in brief) of patients. We aim to extract the meaningful herb-symptom relationships from large-scale TCM clinical data. Methods. To investigate the correlations between symptoms and herbs held for patients, we use four clinical data sets collected from TCM outpatient clinical settings and calculate the similarities between patient pairs in terms of the herb constituents of their prescriptions and their manifesting symptoms by cosine measure. To address the large-scale multiple testing problems for the detection of herb-symptom associations and the dependence between herbs involving similar efficacies, we propose a network-based correlation analysis (NetCorrA) method to detect the herb-symptom associations. Results. The results show that there are strong positive correlations between symptom similarity and herb similarity, which indicates that herb-symptom correspondence is a clinical principle adhered to by most TCM physicians. Furthermore, the NetCorrA method obtains meaningful herb-symptom associations and performs better than the chi-square correlation method by filtering out false positive associations. Conclusions. Symptoms play significant roles in the prescription of herb treatments. The herb-symptom correspondence principle indicates that clinical phenotypic targets (i.e., symptoms) of herbs exist and would be valuable for further investigations.
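The cosine measure mentioned in the Methods can be illustrated with a minimal sketch. The binary indicator encoding of prescriptions and the example vectors below are assumptions made for illustration; the abstract does not specify how patient records are encoded.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Convention chosen for this sketch: an empty record is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical encoding: one slot per herb, 1 if the herb is in the prescription.
patient_a_herbs = [1, 1, 0, 1]
patient_b_herbs = [1, 0, 0, 1]
herb_sim = cosine_similarity(patient_a_herbs, patient_b_herbs)
print(round(herb_sim, 4))
```

The same function would apply to symptom vectors, so that herb similarity and symptom similarity can be compared across patient pairs.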


Author(s):  
Wenguang Sun ◽  
Brian J. Reich ◽  
T. Tony Cai ◽  
Michele Guindani ◽  
Armin Schwartzman

Genes ◽  
2021 ◽  
Vol 13 (1) ◽  
pp. 87
Author(s):  
Sean M. Burnard ◽  
Rodney A. Lea ◽  
Miles Benton ◽  
David Eccles ◽  
Daniel W. Kennedy ◽  
...  

Conventional genome-wide association studies (GWASs) of complex traits, such as Multiple Sclerosis (MS), are reliant on per-SNP p-values and are therefore heavily burdened by multiple testing correction. Thus, in order to detect more subtle alterations, ever increasing sample sizes are required, while ignoring potentially valuable information that is readily available in existing datasets. To overcome this, we used penalised regression incorporating elastic net with a stability selection method by iterative subsampling to detect the potential interaction of loci with MS risk. Through re-analysis of the ANZgene dataset (1617 cases and 1988 controls) and an IMSGC dataset as a replication cohort (1313 cases and 1458 controls), we identified new association signals for MS predisposition, including SNPs above and below conventional significance thresholds while targeting two natural killer receptor loci and the well-established HLA loci. For example, rs2844482 (98.1% iterations), otherwise ignored by conventional statistics (p = 0.673) in the same dataset, was independently strongly associated with MS in another GWAS that required more than 40 times the number of cases (~45 K). Further comparison of our hits to those present in a large-scale meta-analysis confirmed that the majority of SNPs identified by the elastic net model reached conventional statistical GWAS thresholds (p < 5 × 10⁻⁸) in this much larger dataset. Moreover, we found that gene variants involved in oxidative stress, in addition to innate immunity, were associated with MS. Overall, this study highlights the benefit of using more advanced statistical methods to (re-)analyse subtle genetic variation among loci that have a biological basis for their contribution to disease risk.


2021 ◽  
Author(s):  
Runqing Yang ◽  
Yuxin Song ◽  
Li Jiang ◽  
Zhiyu Hao ◽  
Runqing Yang

Abstract. Complex computation and approximate solutions hinder the application of generalized linear mixed models (GLMMs) to genome-wide association studies. We extended GRAMMAR to handle binary diseases by treating genomic breeding values (GBVs) estimated in advance as a known predictor in genomic logit regression, and then controlled polygenic effects by regulating genomic heritability downward. Using simulations and case analyses, we showed that, in optimizing GRAMMAR, polygenic effects and genomic controls could be evaluated using fewer sampled markers, which greatly simplified GLMM-based association analysis in large-scale data. In addition, joint analysis of quantitative trait nucleotide (QTN) candidates chosen by multiple testing offered significantly improved statistical power to detect QTNs over existing methods.


2019 ◽  
Author(s):  
Yulan Gu ◽  
Chuandan Wan ◽  
Jiaming Qiu ◽  
Yanhong Cui ◽  
Tingwang Jiang

Abstract. The applications of liquid biopsy have attracted much attention in biomedical research in recent years. Circulating cell-free DNA (cfDNA) in the serum may serve as a unique tumor marker in various types of cancer. Circulating tumor DNA (ctDNA) is a type of serum cfDNA found in patients with cancer and contains abundant information regarding tumor characteristics, highlighting its potential diagnostic value in the clinical setting. However, the diagnostic value of cfDNA as a biomarker in cervical cancer remains unclear. Here, we performed a meta-analysis to evaluate the applications of ctDNA as a biomarker in cervical cancer. A systematic literature search was performed using PubMed, Embase, and WANFANG MED ONLINE databases up to March 18, 2019. All literature was analyzed using Meta Disc 1.4 and STATA 14.0 software. Diagnostic measures of accuracy of ctDNA in cervical cancer were pooled and investigated. Fifteen studies comprising 1109 patients with cervical cancer met our inclusion criteria and were subjected to analysis. The pooled sensitivity and specificity were 0.52 (95% confidence interval [CI], 0.33–0.71) and 0.97 (95% CI, 0.91–0.99), respectively. The pooled positive likelihood ratio and negative likelihood ratio were 16.0 (95% CI, 5.5–46.4) and 0.50 (95% CI, 0.33–0.75), respectively. The diagnostic odds ratio was 32 (95% CI, 10–108), and the area under the summary receiver operating characteristic curve was 0.92 (95% CI, 0.90–0.94). There was no significant publication bias observed. In the included studies, ctDNA showed clear diagnostic value for diagnosing and monitoring cervical cancer. Our meta-analysis suggested that detection of human papilloma virus ctDNA in patients with cervical cancer could be used as a noninvasive early dynamic biomarker of tumors, with high specificity and moderate sensitivity. Further large-scale prospective studies are required to validate the factors that may influence the accuracy of cervical cancer diagnosis and monitoring.
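As a reading aid, the standard definitions linking sensitivity and specificity to likelihood ratios and the diagnostic odds ratio can be sketched as follows. This is only an illustration of the textbook formulas: plugging in the pooled point estimates need not reproduce the pooled likelihood ratios reported above, because a meta-analysis pools each measure across studies separately.

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = P(test positive | disease) / P(test positive | no disease)."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = P(test negative | disease) / P(test negative | no disease)."""
    return (1.0 - sensitivity) / specificity


def diagnostic_odds_ratio(sensitivity, specificity):
    """DOR = LR+ / LR-."""
    return (positive_likelihood_ratio(sensitivity, specificity)
            / negative_likelihood_ratio(sensitivity, specificity))


# Pooled point estimates quoted in the abstract, used purely as example inputs.
sens, spec = 0.52, 0.97
lr_pos = positive_likelihood_ratio(sens, spec)
lr_neg = negative_likelihood_ratio(sens, spec)
dor = diagnostic_odds_ratio(sens, spec)
print(round(lr_pos, 2), round(lr_neg, 2), round(dor, 1))
```

The values computed this way (roughly 17.3, 0.49, and 35) land close to, but not exactly on, the pooled figures of 16.0, 0.50, and 32, which is expected for the reason given above.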


2017 ◽  
Author(s):  
Jie Zheng ◽  
Tom G. Richardson ◽  
Louise A. C. Millard ◽  
Gibran Hemani ◽  
Christopher Raistrick ◽  
...  

Abstract. Background. Identifying phenotypic correlations between complex traits and diseases can provide useful etiological insights. Restricted access to individual-level phenotype data makes it difficult to estimate large-scale phenotypic correlation across the human phenome. State-of-the-art methods, metaCCA and LD score regression, provide an alternative approach to estimate phenotypic correlation using genome-wide association study (GWAS) summary statistics. Results. Here, we present an integrated R toolkit, PhenoSpD, to 1) apply metaCCA (or LD score regression) to estimate phenotypic correlations using GWAS summary statistics; and 2) utilize the estimated phenotypic correlations to inform correction of multiple testing for complex human traits using the spectral decomposition of matrices (SpD). The simulations suggest it is possible to estimate phenotypic correlation using samples with only a partial overlap, but as overlap decreases correlations will attenuate towards zero and multiple testing correction will be more stringent than in perfectly overlapping samples. In a case study, PhenoSpD using GWAS results suggested 324.4 independent tests among 452 metabolites, which is close to the 296 independent tests estimated using true phenotypic correlation. We further applied PhenoSpD to estimate 7,503 pair-wise phenotypic correlations among 123 metabolites using GWAS summary statistics from Kettunen et al., and PhenoSpD suggested 44.9 independent tests for these metabolites. Conclusion. PhenoSpD integrates existing methods and provides a simple and conservative way to reduce dimensionality for complex human traits using GWAS summary statistics, which is particularly valuable for post-GWAS analysis of complex molecular traits. Availability. R code and documentation for PhenoSpD V1.0.0 are available online (https://github.com/MRCIEU/PhenoSpD).
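Two quantities in this abstract can be checked or illustrated with a short sketch. The pair count follows directly from the number of metabolites. The effective-number-of-tests function uses one commonly cited spectral-decomposition estimator (Nyholt's formula); the abstract does not say which variant PhenoSpD applies, so that choice is an assumption, as are the function names and the toy two-trait example.

```python
import math
import statistics


def n_pairs(n_traits):
    """Number of distinct trait pairs, n * (n - 1) / 2."""
    return math.comb(n_traits, 2)


def effective_tests_nyholt(eigenvalues):
    """Nyholt-style estimate: Meff = 1 + (M - 1) * (1 - Var(eigenvalues) / M).

    `eigenvalues` are those of the M x M trait correlation matrix.
    Assumption: this estimator stands in for whatever SpD variant a tool uses.
    """
    m = len(eigenvalues)
    if m < 2:
        return float(m)
    var = statistics.variance(eigenvalues)  # sample variance (n - 1 denominator)
    return 1.0 + (m - 1) * (1.0 - var / m)


def eigenvalues_2x2(r):
    """Eigenvalues of the correlation matrix [[1, r], [r, 1]]."""
    return [1.0 + r, 1.0 - r]


print(n_pairs(123))  # 7503, matching the count of pair-wise correlations
print(effective_tests_nyholt(eigenvalues_2x2(0.5)))  # 1.75
```

In the two-trait case the estimate reduces to 2 - r², so uncorrelated traits count as two tests and perfectly correlated traits as one, which is the behaviour a multiple-testing correction based on independent tests relies on.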


2017 ◽  
Author(s):  
Neda Jahanshad ◽  
Habib Ganjgahi ◽  
Janita Bralten ◽  
Anouk den Braber ◽  
Joshua Faskowitz ◽  
...  

Abstract: Susceptibility genes for psychiatric and neurological disorders - including APOE, BDNF, CLU, CNTNAP2, COMT, DISC1, DTNBP1, ErbB4, HFE, NRG1, NTKR3, and ZNF804A - have been reported to affect white matter (WM) microstructure in the healthy human brain, as assessed through diffusion tensor imaging (DTI). However, effects of single nucleotide polymorphisms (SNPs) in these genes explain only a small fraction of the overall variance and are challenging to detect reliably in single cohort studies. To date, few studies have evaluated the reproducibility of these results. As part of the ENIGMA-DTI consortium, we pooled regional fractional anisotropy (FA) measures for 6,165 subjects (CEU ancestry N=4,458) from 11 cohorts worldwide to evaluate effects of 15 candidate SNPs by examining their associations with WM microstructure. Additive association tests were conducted for each SNP. We used several meta-analytic and mega-analytic designs, and we evaluated regions of interest at multiple granularity levels. The ENIGMA-DTI protocol was able to detect single-cohort findings as originally reported. Even so, in this very large sample, no significant associations remained after multiple-testing correction for the 15 SNPs investigated. Suggestive associations (1.3×10⁻⁴ < p < 0.05, uncorrected) were found for BDNF, COMT, and ZNF804A in specific tracts. Meta- and mega-analyses revealed similar findings. Regardless of the approach, the previously reported candidate SNPs did not show significant associations with WM microstructure in this largest genetic study of DTI to date; the negative findings are likely not due to insufficient power. Genome-wide studies, involving large-scale meta-analyses, may help to discover SNPs robustly influencing WM microstructure.

