association testing
Recently Published Documents


Total documents: 229 (five years: 69)

H-index: 32 (five years: 2)

2021 ◽  
Author(s):  
Andrew R Ghazi ◽  
Kathleen Sucipto ◽  
Gholamali Rahnavard ◽  
Eric A Franzosa ◽  
Lauren J McIver ◽  
...  

Modern biological screens yield enormous numbers of measurements, and identifying and interpreting statistically significant associations among features is essential. Here, we present a novel hierarchical framework, HAllA (Hierarchical All-against-All association testing), for structured association discovery between paired high-dimensional datasets. HAllA efficiently integrates hierarchical hypothesis testing with false discovery rate correction to reveal significant linear and non-linear block-wise relationships among continuous and/or categorical data. We optimized and evaluated HAllA using heterogeneous synthetic datasets of known association structure, where HAllA outperformed all-against-all and other block testing approaches across a range of common similarity measures. We then applied HAllA to a series of real-world multi-omics datasets, revealing new associations between gene expression and host immune activity, the microbiome and host transcriptome, metabolomic profiling, and human health phenotypes. An open-source implementation of HAllA is freely available at http://huttenhower.sph.harvard.edu/halla along with documentation, demo datasets, and a user group.
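The false discovery rate correction mentioned in this abstract can be illustrated with the standard Benjamini-Hochberg procedure. The sketch below is a generic, flat version applied to a list of p-values. It is not HAllA's hierarchical implementation, and the function name `benjamini_hochberg` is chosen here purely for illustration.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical p-values from four pairwise association tests.
    raw = [0.01, 0.04, 0.03, 0.5]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p = {p:.3f}  adjusted = {q:.4f}")
```

An association would then be reported when its adjusted value falls below the chosen FDR threshold.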


2021 ◽  
Vol 12 ◽  
Author(s):  
Irving Simonin-Wilmer ◽  
Pedro Orozco-del-Pino ◽  
D. Timothy Bishop ◽  
Mark M. Iles ◽  
Carla Daniela Robles-Espinoza

Genome-wide association studies (GWAS) have been very successful at identifying genetic variants influencing a large number of traits. Although the great majority of these studies have been performed in European-descent individuals, it has been recognised that including populations with differing ancestries enhances the potential for identifying causal SNPs due to their differing patterns of linkage disequilibrium. However, when individuals from distinct ethnicities are included in a GWAS, it is necessary to implement a number of control steps to ensure that the identified associations are real genotype-phenotype relationships. In this Review, we discuss the analyses that are required when performing multi-ethnic studies, including methods for determining ancestry at the global and local level for sample exclusion, controlling for ancestry in association testing, and post-GWAS interrogation methods such as genomic control and meta-analysis. We hope that this overview provides a primer for those researchers interested in including distinct populations in their studies.
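Genomic control, one of the post-GWAS methods named above, is commonly described as estimating an inflation factor from the median of the 1-degree-of-freedom chi-square test statistics and dividing the statistics by it. The following is a minimal sketch under that textbook definition. The function names and example values are illustrative assumptions and are not taken from the Review.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(chi2_stats):
    """Estimate the inflation factor lambda from 1-df chi-square statistics."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN


def genomic_control(chi2_stats):
    """Divide each statistic by lambda; deflated data (lambda < 1) is left unchanged."""
    lam = max(1.0, genomic_inflation(chi2_stats))
    return [s / lam for s in chi2_stats]


if __name__ == "__main__":
    # Hypothetical association statistics for three SNPs.
    stats = [0.2, 2 * CHI2_1DF_MEDIAN, 3.0]
    print("lambda:", genomic_inflation(stats))
    print("corrected:", genomic_control(stats))
```

A lambda close to 1 suggests little inflation from population structure, while clearly larger values suggest that ancestry has not been fully controlled.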


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Lingfei Wang

Single-cell RNA sequencing (scRNA-seq) provides unprecedented technical and statistical potential to study gene regulation but is subject to technical variations and sparsity. Furthermore, statistical association testing remains difficult for scRNA-seq. Here we present Normalisr, a normalization and statistical association testing framework that unifies single-cell differential expression, co-expression, and CRISPR screen analyses with linear models. By systematically detecting and removing nonlinear confounders arising from library size at mean and variance levels, Normalisr achieves high sensitivity, specificity, speed, and generalizability across multiple scRNA-seq protocols and experimental conditions with unbiased p-value estimation. The superior scalability allows us to reconstruct robust gene regulatory networks from trans-effects of guide RNAs in large-scale single cell CRISPRi screens. On conventional scRNA-seq, Normalisr recovers gene-level co-expression networks that recapitulate known gene functions.


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Mikhail Vysotskiy ◽  
Xue Zhong ◽  
Tyne W. Miller-Fleming ◽  
Dan Zhou ◽  
Nancy J. Cox ◽  
...  

Background: Deletions and duplications of the multigenic 16p11.2 and 22q11.2 copy number variant (CNV) regions are associated with brain-related disorders including schizophrenia, intellectual disability, obesity, bipolar disorder, and autism spectrum disorder (ASD). The contribution of individual CNV genes to each of these identified phenotypes is unknown, as well as the contribution of these CNV genes to other potentially subtler health implications for carriers. Hypothesizing that DNA copy number exerts most effects via impacts on RNA expression, we attempted a novel in silico fine-mapping approach in non-CNV carriers using both GWAS and biobank data. Methods: We first asked whether gene expression level in any individual gene in the CNV region alters risk for a known CNV-associated behavioral phenotype(s). Using transcriptomic imputation, we performed association testing for CNV genes within large genotyped cohorts for schizophrenia, IQ, BMI, bipolar disorder, and ASD. Second, we used a biobank containing electronic health data to compare the medical phenome of CNV carriers to controls within 700,000 individuals in order to investigate the full spectrum of health effects of the CNVs. Third, we used genotypes for over 48,000 individuals within the biobank to perform phenome-wide association studies between imputed expressions of individual 16p11.2 and 22q11.2 genes and over 1500 health traits. Results: Using large genotyped cohorts, we found individual genes within 16p11.2 associated with schizophrenia (TMEM219, INO80E, YPEL3), BMI (TMEM219, SPN, TAOK2, INO80E), and IQ (SPN), using conditional analysis to identify upregulation of INO80E as the driver of schizophrenia, and downregulation of SPN and INO80E as increasing BMI. We identified both novel and previously observed over-represented traits within the electronic health records of 16p11.2 and 22q11.2 CNV carriers. In the phenome-wide association study, we found seventeen significant gene-trait pairs, including psychosis (NPIPB11, SLX1B) and mood disorders (SCARF2), and overall enrichment of mental traits. Conclusions: Our results demonstrate how integration of genetic and clinical data aids in understanding CNV gene function and implicates pleiotropy and multigenicity in CNV biology.


2021 ◽  
Vol 99 (Supplement_3) ◽  
pp. 243-244
Author(s):  
Brittany N Diehl ◽  
Andres A Pech-Cervantes ◽  
Thomas H Terrill ◽  
Ibukun M Ogunade ◽  
Owen Rae ◽  
...  

Florida Native sheep is an indigenous breed from Florida that expresses superior parasite resistance. Previous candidate-gene and genome-wide association studies with Florida Native sheep have identified single nucleotide polymorphisms with additive and non-additive effects associated with parasite resistance. However, the role of other potential DNA variants, such as copy number variants (CNVs), in controlling this complex trait has not been evaluated. The objective of the present study was to investigate the importance of CNVs for resistance to natural Haemonchus contortus infections in Florida Native sheep. A total of 200 sheep were evaluated. Phenotypic records included fecal egg count (FEC, eggs/gram), FAMACHA score, and packed cell volume (PCV, %). Sheep were genotyped using the GGP Ovine 50K SNP chip. CNVs were identified by copy number analysis using the univariate method. A total of 170 animals with CNVs and phenotypic data were used for the association testing. Association tests were carried out using single linear regression with Principal Component Analysis (PCA) correction to identify CNVs associated with FEC, FAMACHA, and PCV. To confirm our results, a second association test using the correlation-trend test with PCA correction was performed. CNVs were considered significant when their adjusted p-value was < 0.05 after FDR correction. A deletion CNV on chromosome 21 was associated with FEC. This DNA variant was located in intron 2 of the RAB3IL gene and overlapped a QTL associated with changes in eosinophil number. Our study demonstrated for the first time that CNVs could potentially be involved in parasite resistance in this heritage sheep breed.
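The single linear regression step described in this abstract can be sketched as an ordinary least-squares fit of a phenotype on a copy-number value, returning the slope, its standard error, and the t-statistic. This is a simplified illustration only: the PCA correction and the conversion of the t-statistic to a p-value are omitted, and the function name and example data are hypothetical rather than taken from the study.

```python
from math import sqrt


def simple_regression(x, y):
    """Fit y = a + b*x by least squares; return (slope, standard error, t-statistic)."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length of at least 3")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares around the fitted line.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    # Standard error of the slope with n - 2 residual degrees of freedom.
    std_err = sqrt(rss / (n - 2) / sxx)
    return slope, std_err, slope / std_err


if __name__ == "__main__":
    # Hypothetical copy-number values and a hypothetical transformed phenotype.
    copies = [0, 1, 2, 3]
    phenotype = [0.0, 1.0, 1.0, 2.0]
    slope, std_err, t_stat = simple_regression(copies, phenotype)
    print(f"slope = {slope:.3f}, se = {std_err:.3f}, t = {t_stat:.3f}")
```

In a full analysis the t-statistic would be compared against a t distribution with n - 2 degrees of freedom, and the resulting p-values would then be FDR-adjusted across CNVs.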


2021 ◽  
Author(s):  
Wodan Ling ◽  
Ni Zhao ◽  
Anju Lulla ◽  
Anna M. Plantinga ◽  
Weijia Fu ◽  
...  

Batch effects in microbiome data arise from differential processing of specimens and can lead to spurious findings and obscure true signals. Most existing strategies for mitigating batch effects rely on approaches designed for genomic analysis and fail to address the zero-inflated and over-dispersed nature of microbiome data. Strategies tailored for microbiome data are restricted to association testing and do not support other analytic goals such as visualization. We develop the Conditional Quantile Regression (ConQuR) approach to remove microbiome batch effects using a two-part quantile regression model. It is a fundamental advancement in the field because it is the first comprehensive method that accommodates the complex distributions of microbial read counts, and it generates batch-removed zero-inflated read counts that can be used in and benefit all usual subsequent analyses. We apply ConQuR to real microbiome data sets and demonstrate its state-of-the-art performance in removing batch effects while preserving or even amplifying the signals of interest.


Microbiome ◽  
2021 ◽  
Vol 9 (1) ◽  
Author(s):  
Wodan Ling ◽  
Ni Zhao ◽  
Anna M. Plantinga ◽  
Lenore J. Launer ◽  
Anthony A. Fodor ◽  
...  

Background: Identification of bacterial taxa associated with diseases, exposures, and other variables of interest offers a more comprehensive understanding of the role of microbes in many conditions. However, despite considerable research in statistical methods for association testing with microbiome data, approaches that are generally applicable remain elusive. Classical tests often do not accommodate the realities of microbiome data, leading to power loss. Approaches tailored for microbiome data depend highly upon the normalization strategies used to handle differential read depth and other data characteristics, and they often have unacceptably high false positive rates, generally due to unsatisfied distributional assumptions. On the other hand, many non-parametric tests suffer from loss of power and may also present difficulties in adjusting for potential covariates. Most extant approaches also fail in the presence of heterogeneous effects. The field needs new non-parametric approaches that are tailored to microbiome data, robust to distributional assumptions, and powerful under heterogeneous effects, while permitting adjustment for covariates. Methods: As an alternative to existing approaches, we propose a zero-inflated quantile approach (ZINQ), which uses a two-part quantile regression model to accommodate the zero inflation in microbiome data. For a given taxon, ZINQ consists of a valid test in logistic regression to model the zero counts, followed by a series of quantile rank-score based tests on multiple quantiles of the non-zero part with adjustment for the zero inflation. As a regression and quantile-based approach, the method is non-parametric and robust to irregular distributions, while providing an allowance for covariate adjustment. Since no distributional assumptions are made, ZINQ can be applied to data that has been processed under any normalization strategy. Results: Thorough simulations based on real data across a range of scenarios and application to real data sets show that ZINQ often has equivalent or higher power compared to existing tests even as it offers better control of false positives. Conclusions: We present ZINQ, a quantile-based association test between microbiota and dichotomous or quantitative clinical variables, providing a powerful and robust alternative for the current microbiome differential abundance analysis.


2021 ◽  
Author(s):  
Cory Teuscher ◽  
Abbas Raza ◽  
Sean A Diehl ◽  
Laure K Case ◽  
Dimitry N Krementsov ◽  
...  

Histamine is a bioactive amine associated with a plethora of normal and pathophysiological processes, with the latter being dependent on both genetic and environmental factors including infectious agents. Previously, we showed in mice that susceptibility to Bordetella pertussis and pertussis toxin (PTX) induced histamine sensitization (Bphs) is controlled by histamine receptor H1 (Hrh1/HRH1) alleles. Bphs susceptible and resistant alleles (Bphss/Bphsr) encode two conserved protein haplotypes. Given the importance of HRH1 signaling in health and disease, we sequenced Hrh1 across an extended panel of laboratory and wild-derived inbred strains and phenotyped them for Bphs. Unexpectedly, eight strains homozygous for the Bphsr allele phenotyped as Bphss, suggesting the existence of a modifying locus segregating among the strains capable of complementing Bphsr. Genetic analyses mapped this modifier locus, designated Bphs-enhancer (Bphse), to mouse chromosome 6, within a functional linkage disequilibrium domain encoding multiple loci controlling responsiveness to histamine (Bphs/Hrh1 and Histh1-4). Interval-specific single-nucleotide polymorphism (SNP) based association testing across 50 laboratory and wild-derived inbred mouse strains and functional prioritization analyses resulted in the identification of candidate genes for Bphse within a ~5.5 Mb interval (Chr6:111.0-116.4 Mb), including Atg7, Plxnd1, Tmcc1, Mkrn2, Il17re, Pparg, Lhfpl4, Vgll4, Rho and Syn2. Taken together, these results demonstrate the power of combining network-based computational methods with the evolutionarily significant diversity of wild-derived inbred mice to identify novel genetic mechanisms controlling susceptibility and resistance to histamine shock.

