association discovery

2021 ◽  
Author(s):  
Andrew R Ghazi ◽  
Kathleen Sucipto ◽  
Gholamali Rahnavard ◽  
Eric A Franzosa ◽  
Lauren J McIver ◽  
...  

Modern biological screens yield enormous numbers of measurements, and identifying and interpreting statistically significant associations among features is essential. Here, we present a novel hierarchical framework, HAllA (Hierarchical All-against-All association testing), for structured association discovery between paired high-dimensional datasets. HAllA efficiently integrates hierarchical hypothesis testing with false discovery rate correction to reveal significant linear and non-linear block-wise relationships among continuous and/or categorical data. We optimized and evaluated HAllA using heterogeneous synthetic datasets of known association structure, where HAllA outperformed all-against-all and other block testing approaches across a range of common similarity measures. We then applied HAllA to a series of real-world multi-omics datasets, revealing new associations between gene expression and host immune activity, the microbiome and host transcriptome, metabolomic profiling, and human health phenotypes. An open-source implementation of HAllA is freely available at http://huttenhower.sph.harvard.edu/halla along with documentation, demo datasets, and a user group.
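As a rough illustration of the false discovery rate correction mentioned above, the following sketch applies the standard Benjamini-Hochberg step-up procedure to a flat list of p-values, as a naive all-against-all baseline would. It is not HAllA's hierarchical procedure; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected
    under Benjamini-Hochberg FDR control at level alpha."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Reject every hypothesis whose rank is at or below the cutoff.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, one per feature pair in an all-against-all test.
pair_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
significant = benjamini_hochberg(pair_p_values, alpha=0.05)
```

With many feature pairs, the number of tests grows as the product of the two feature counts, which is the cost that block-wise hierarchical testing aims to reduce.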


2021 ◽  
Author(s):  
Sean J. Jurgens ◽  
James P. Pirruccello ◽  
Seung Hoan Choi ◽  
Valerie N. Morrill ◽  
Mark D. Chaffin ◽  
...  

With the emergence of large-scale sequencing data, methods for improving power in rare variant association tests (RVAT) are needed. Here, we show that adjusting for common variant polygenic scores improves the yield in gene-based RVAT across 65 quantitative traits in the UK Biobank (up to 20% increase at α = 2.6 × 10⁻⁶), without a marked increase in false-positive rates or genomic inflation. Our results illustrate how adjusting for common variant effects can aid in rare variant association discovery.
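The abstract does not say how the adjustment is implemented. One simple way to picture it, assuming a single quantitative trait and a single polygenic score per individual, is to regress the trait on the score and carry the residuals forward into the rare variant test. The function name and data below are hypothetical, and in practice the score would more likely enter the test model as a covariate alongside others.

```python
def residualize(trait, score):
    """Remove the linear effect of `score` from `trait` using
    ordinary least squares with an intercept; return the residuals."""
    n = len(trait)
    mean_trait = sum(trait) / n
    mean_score = sum(score) / n

    # Sum of squares of the score and its cross-product with the trait.
    sxx = sum((s - mean_score) ** 2 for s in score)
    if sxx == 0:
        raise ValueError("score has no variance; cannot fit a slope")
    sxy = sum((s - mean_score) * (t - mean_trait) for s, t in zip(score, trait))

    slope = sxy / sxx
    intercept = mean_trait - slope * mean_score
    return [t - (intercept + slope * s) for s, t in zip(score, trait)]


# Hypothetical data: a trait partly explained by a polygenic score.
polygenic_score = [0.0, 1.0, 2.0, 3.0]
trait_values = [1.0, 3.5, 4.5, 7.0]
adjusted_trait = residualize(trait_values, polygenic_score)
```

The intuition is that removing variance explained by common variants leaves less residual noise, so a rare variant effect of the same size is easier to detect.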


2021 ◽  
Vol 11 ◽  
Author(s):  
Zhenwu Luo ◽  
Alexander V. Alekseyenko ◽  
Elizabeth Ogunrinde ◽  
Min Li ◽  
Quan-Zhen Li ◽  
...  

The blood microbiome is important for investigating microbial-host interactions and their effects on systemic immune perturbations. However, this effort has met with major challenges due to low microbial biomass and background artifacts. In the current study, microbial 16S DNA sequencing was applied to analyze the plasma microbiome. We developed a quality-filtering strategy to evaluate and exclude low-level microbial sequences, potential contaminants, and artifacts from plasma microbial 16S DNA sequencing analyses. We then applied the technique in three cohorts: tobacco smokers, HIV-infected individuals, and individuals with systemic lupus erythematosus (SLE), each with corresponding controls. More than 97% of the total sequence data was removed by the stringent quality-filtering strategy; the removed amplicon sequence variants (ASVs) were low-level microbial sequences, contaminants, and artifacts. Specifically enriched pathobiont bacterial ASVs were identified in plasma from tobacco smokers, HIV-infected individuals, and individuals with SLE, but not in plasma from control subjects. Associations between these ASVs and disease pathogenesis were demonstrated, and the pathologic activities of some identified bacteria were further verified in vitro. We present a quality-filtering strategy to identify the pathogenesis-associated plasma microbiome. Our approach provides a method for studying the diagnosis of subclinical microbial infection and for understanding the roles of microbiome-host interaction in disease pathogenesis.


2020 ◽  
Vol 21 (11) ◽  
pp. 1078-1084
Author(s):  
Ruizhi Fan ◽  
Chenhua Dong ◽  
Hu Song ◽  
Yixin Xu ◽  
Linsen Shi ◽  
...  

Recently, an increasing number of biological and clinical reports have demonstrated that imbalance of the microbial community plays an important role in several complex diseases affecting human health. Discovering potential microbe-disease relationships provides a better understanding of disease pathology and, in turn, supports disease diagnostics and prognostics. Nevertheless, few computational approaches can meet the need for large-scale microbe-disease association discovery. In this work, we proposed the EHAI model (Enhanced Human microbe-disease Association Identification). EHAI takes the known microbe-disease associations and uses Gaussian interaction profile kernel similarity to enhance the basic microbe-disease association. In practice, only some microbe-disease associations are known, and a large number of associations are still unavailable in the datasets. The ‘super-microbe’ and ‘super-disease’ were employed to enhance the model. Computational results demonstrated that such super-classes improve the performance of EHAI. Therefore, it is anticipated that EHAI can serve as an important biological tool in this field.
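The abstract does not spell out the kernel, so the sketch below uses the commonly cited definition of Gaussian interaction profile kernel similarity: each entity's interaction profile is its row of the binary association matrix, and similarity decays exponentially with the squared distance between profiles, with the bandwidth normalized by the mean squared profile norm. The function name, the default `gamma_prime=1.0`, and the toy matrix are assumptions, not details taken from EHAI.

```python
import math


def gip_kernel(adjacency, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of a
    binary association matrix (list of equal-length lists)."""
    n = len(adjacency)
    # Mean squared norm of the interaction profiles, used to scale the bandwidth.
    mean_sq_norm = sum(sum(v * v for v in row) for row in adjacency) / n
    if mean_sq_norm == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_sq_norm

    # K[i][j] = exp(-gamma * ||profile_i - profile_j||^2)
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(row_i, row_j)))
            for row_j in adjacency
        ]
        for row_i in adjacency
    ]


# Hypothetical example: 3 microbes (rows) by 3 diseases (columns).
microbe_disease = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
microbe_similarity = gip_kernel(microbe_disease)
```

Applying the same function to the transposed matrix would give disease-disease similarity. Microbes with identical known associations get similarity 1, and microbes with disjoint associations get a value closer to 0.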


Author(s):  
Jiannan Liu ◽  
Chuanpeng Dong ◽  
Yunlong Liu ◽  
Huanmei Wu

Summary: Cancer Gene and Pathway Explorer (CGPE) was developed to guide biological and clinical researchers, especially those with limited informatics and programming skills, in performing preliminary cancer-related biomedical research using transcriptional data and publications. CGPE provides three user-friendly online analytical and visualization modules that require no local deployment. The GenePub HotIndex applies natural language processing, statistics and association discovery to provide analytical results on gene-specific PubMed publications, including gene-specific research trends, cancer-type correlations, top-related genes and the WordCloud of publication profiles. The OnlineGSEA enables Gene Set Enrichment Analysis (GSEA) and visualization of results through an easy-to-follow interface for public or in-house transcriptional datasets, integrating the GSEA algorithm and preprocessed public TCGA and GEO datasets. The preprocessed datasets ensure gene sets analysis with appropriate pathway alternation and gene signatures. The CellLine Search presents evidence-based guidance for cell line selection with combined information on cell line dependency, gene expression and pathway activity maps, which is valuable knowledge to have before conducting gene-related experiments. In a nutshell, the CGPE webserver provides a user-friendly, visual, intuitive and informative bioinformatics tool that allows biomedical researchers to perform efficient analyses and preliminary studies on in-house and publicly available bioinformatics data. Availability and implementation: The webserver is freely available online at https://cgpe.soic.iupui.edu. Supplementary information: Supplementary data are available at Bioinformatics online.


2020 ◽  
Vol 9 (3) ◽  
pp. 155
Author(s):  
Weihua Liao ◽  
Zhiheng Zhang ◽  
Weiguo Jiang

A relative lag in research methods, technical means and research paradigms has restricted the rapid development of geography and urban computing. Hence, there is a certain gap between urban data and industry applications. In this paper, a spatial association discovery framework for the urban service industry based on a concept lattice is proposed. First, location data are used to form the formal context expressed by 0 and 1. Frequent closed itemsets and a concept lattice are computed on the basis of the formal context of the urban service industry. Frequent closed itemsets can filter out redundant information in frequent itemsets, uniquely determine the complete set of all frequent itemsets, and be orders of magnitude smaller than the latter. Second, spatial frequent closed itemsets and association rules discovery algorithms are designed and built based on the formal context. The inputs of the frequent closed itemsets discovery algorithms include the given formal context and frequent threshold value, while the outputs are all frequent closed itemsets and the partial order relationship between them. Newly added attributes create new concepts to guarantee the uniqueness of the new spatial association concept. The inputs of spatial association rules discovery algorithms include frequent closed itemsets and confidence threshold values, and a rule is confident if and only if its confidence degree is not less than the confidence threshold value. Third, the spatial association of the urban service industry in Nanning, China is taken as a case study to verify the method. The results are basically consistent with the spatial distribution of the urban service industry in Nanning City. This study enriches the theories and methods of geography as well as urban computing, and these findings can provide guidance for location-based service planning and management of urban services.
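To make the terms above concrete, here is a minimal brute-force sketch of frequent closed itemsets and rule confidence on a tiny formal context. It is not the paper's concept-lattice algorithm: it enumerates every itemset, which only works for a handful of attributes. The function names, the service categories, and the thresholds are hypothetical.

```python
from itertools import combinations


def support(itemset, context):
    """Number of objects (rows) in the context that contain every item."""
    return sum(1 for row in context if itemset <= row)


def frequent_closed_itemsets(context, min_support):
    """Map each frequent closed itemset to its support count.
    An itemset is closed if no proper superset has the same support."""
    items = sorted(set().union(*context))
    frequent = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            count = support(candidate, context)
            if count >= min_support:
                frequent[candidate] = count
    # A superset with equal support is itself frequent, so checking
    # against the frequent itemsets is sufficient.
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < t and c == frequent[t] for t in frequent)
    }


def confidence(antecedent, consequent, context):
    """Confidence of the rule antecedent -> consequent."""
    return support(antecedent | consequent, context) / support(antecedent, context)


# Hypothetical formal context: each row is the set of service types at one location.
locations = [
    {"bank", "cafe"},
    {"bank", "cafe", "gym"},
    {"bank", "cafe"},
    {"gym"},
]
closed = frequent_closed_itemsets(locations, min_support=2)
rule_confidence = confidence(frozenset({"bank"}), frozenset({"cafe"}), locations)
# Keep the rule only if its confidence is not less than the threshold.
rule_is_confident = rule_confidence >= 0.8
```

In this toy context, "bank" and "cafe" always occur together, so neither is closed on its own; only their union is. This is the redundancy that closed itemsets remove.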


2019 ◽  
Vol 94 ◽  
pp. 103180 ◽  
Author(s):  
Sepideh Shamsizadeh ◽  
Sama Goliaei ◽  
Zahra Razaghi Moghadam
