module discovery Latest Research Papers

DISA tool: discriminative and informative subspace assessment with categorical and numerical outcomes

10.1101/2021.12.08.471785 ◽

2021 ◽

Author(s):

Leonardo Duarte Rodrigues Alexandre ◽

Rafael S. Costa ◽

Rui Henriques

Keyword(s):

Software Package ◽

Pattern Discovery ◽

Subspace Clustering ◽

Discriminative Power ◽

Risk Profiles ◽

Regulatory Module ◽

Module Discovery ◽

Biological Domain ◽

Omic Data

Motivation: Pattern discovery and subspace clustering play a central role in the biological domain, supporting, for instance, putative regulatory module discovery from omic data for both descriptive and predictive ends. In the presence of target variables (e.g. phenotypes), regulatory patterns should further satisfy delineated discriminative power properties, which are well established for categorical outcomes yet largely disregarded for numerical outcomes, such as risk profiles and quantitative phenotypes.

Results: DISA (Discriminative and Informative Subspace Assessment), a Python software package, is proposed to assess patterns in the presence of numerical outcomes using well-established measures together with a novel principle able to statistically assess the correlation gain of the subspace against the overall space. Results confirm that discriminative criteria can be soundly extended to numerical outcomes without the drawbacks commonly associated with discretization procedures. A case study is provided to show the properties of the proposed method.

Availability: DISA is freely available at https://github.com/JupitersMight/DISA under the MIT license.

A holistic miRNA-mRNA module discovery

Non-coding RNA Research ◽

10.1016/j.ncrna.2021.09.001 ◽

2021 ◽

Author(s):

Ghada Shommo ◽

Bruno Apolloni

Keyword(s):

Module Discovery

TPSC: a module detection method based on topology potential and spectral clustering in weighted networks and its application in gene co-expression module discovery

BMC Bioinformatics ◽

10.1186/s12859-021-03964-5 ◽

2021 ◽

Vol 22 (S4) ◽

Author(s):

Yusong Liu ◽

Xiufen Ye ◽

Christina Y. Yu ◽

Wei Shao ◽

Jie Hou ◽

...

Keyword(s):

Prior Knowledge ◽

Spectral Clustering ◽

Clustering Algorithm ◽

Detection Algorithm ◽

Complete Coverage ◽

Survival Difference ◽

Module Detection ◽

Module Discovery ◽

Module Size ◽

Expression Module

Background: Gene co-expression networks are widely studied in the biomedical field, with algorithms such as WGCNA and lmQCM having been developed to detect co-expressed modules. However, these algorithms have limitations such as insufficient granularity and unbalanced module size, which prevent full acquisition of knowledge from data mining. In addition, it is difficult to incorporate prior knowledge in current co-expression module detection algorithms.

Results: In this paper, we propose a novel module detection algorithm based on topology potential and spectral clustering to detect co-expressed modules in gene co-expression networks. In tests on TCGA data, our method provides more complete coverage of genes, more balanced module size and finer granularity than current methods in detecting modules with significant overall survival difference. In addition, the proposed algorithm can identify modules by incorporating prior knowledge.

Conclusion: In summary, we developed a method to obtain as much information as possible from networks, with increased input coverage and the ability to detect more size-balanced and granular modules. In addition, our method can integrate data from different sources. Our proposed method performs better than current methods, with complete coverage of input genes and finer granularity. Moreover, this method is designed not only for gene co-expression networks but can also be applied to any general fully connected weighted network.

MAGI-MS: Multiple seed-centric module discovery

10.1101/2021.09.14.460296 ◽

2021 ◽

Author(s):

Julie C. Chow ◽

Ryan Zhou ◽

Fereydoun Hormozdiari

Keyword(s):

Neurodevelopmental Disorders ◽

Biological Pathways ◽

Protein Interaction Networks ◽

Interaction Networks ◽

Glutamatergic Synapse ◽

Integrated Networks ◽

Complex Disorders ◽

Module Discovery ◽

Genetic Module ◽

Genetic And Environmental Factors

Abstract: Complex disorders manifest through the interaction of multiple genetic and environmental factors. Through the construction of genetic modules that consist of highly co-expressed genes, it is possible to identify genes that participate in common biological pathways relevant to specific phenotypes. We have previously developed the tools MAGI and MAGI-S for genetic module discovery by incorporating co-expression and protein-interaction networks. Here we introduce an extension to MAGI-S, denoted as Merging Affected Genes into Integrated Networks - Multiple Seeds (MAGI-MS), that permits the user to further specify a disease pathway of interest by selecting multiple seed genes likely to function in the same molecular mechanism. By providing MAGI-MS with pairs of seed genes involved in processes underlying certain classes of neurodevelopmental disorders, such as epilepsy, we demonstrate that MAGI-MS can reveal modules enriched in genes relevant to chemical synaptic transmission, glutamatergic synapse, and other functions associated with the provided seed genes.

Availability and implementation: MAGI-MS is free and is available at: https://github.com/jchow32/MAGI-MS

Cox-sMBPLS: An Algorithm for Disease Survival Prediction and Multi-Omics Module Discovery Incorporating Cis-Regulatory Quantitative Effects

Frontiers in Genetics ◽

10.3389/fgene.2021.701405 ◽

2021 ◽

Vol 12 ◽

Author(s):

Nasim Vahabi ◽

Caitrin W. McDonough ◽

Ankit A. Desai ◽

Larisa H. Cavallari ◽

Julio D. Duarte ◽

...

Keyword(s):

Feature Selection ◽

Disease Onset ◽

Prediction Performance ◽

Biological Information ◽

Survival Prediction ◽

Simulation Studies ◽

Time To Event ◽

Time To Event Data ◽

Proposed Model ◽

Module Discovery

Background: The development of high-throughput techniques has enabled profiling a large number of biomolecules across a number of molecular compartments. The challenge then becomes to integrate such multimodal Omics data to gain insights into biological processes and disease onset and progression mechanisms. Further, given the high dimensionality of such data, incorporating prior biological information on interactions between molecular compartments when developing statistical models for data integration is beneficial, especially in settings involving a small number of samples.

Results: We develop a supervised model for time to event data (e.g., death, biochemical recurrence) that simultaneously accounts for redundant information within Omics profiles and leverages prior biological associations between them through a multi-block PLS framework. The interactions between data from different molecular compartments (e.g., epigenome, transcriptome, methylome) were captured by using cis-regulatory quantitative effects in the proposed model. The model, coined Cox-sMBPLS, exhibits superior prediction performance and improved feature selection based on both simulation studies and analysis of data from heart failure patients.

Conclusion: The proposed supervised Cox-sMBPLS model can effectively incorporate prior biological information in the survival prediction system, leading to improved prediction performance and feature selection. It also enables the identification of multi-Omics modules of biomolecules that impact the patients’ survival probability and provides insights into potential relevant risk factors that merit further investigation.

Joint Lp-Norm and L2,1-Norm Constrained Graph Laplacian PCA for Robust Tumor Sample Clustering and Gene Network Module Discovery

Frontiers in Genetics ◽

10.3389/fgene.2021.621317 ◽

2021 ◽

Vol 12 ◽

Author(s):

Xiang-Zhen Kong ◽

Yu Song ◽

Jin-Xing Liu ◽

Chun-Hou Zheng ◽

Sha-Sha Yuan ◽

...

Keyword(s):

Gene Expression ◽

Principal Component Analysis ◽

Gene Network ◽

Principal Component ◽

Graph Laplacian ◽

Component Analysis ◽

Network Module ◽

Tumor Sample ◽

Lp Norm ◽

Module Discovery

Dimensionality reduction methods combined with different norm constraints play an important role in mining useful information from large-scale gene expression data. In this article, a novel method named Lp-norm and L2,1-norm constrained graph Laplacian principal component analysis (PL21GPCA), based on traditional principal component analysis (PCA), is proposed for robust tumor sample clustering and gene network module discovery. Three aspects are highlighted in the PL21GPCA method. First, to reduce the high sensitivity to outliers and noise, the non-convex proximal Lp-norm (0 < p < 1) constraint is applied to the loss function. Second, to enhance the sparsity of gene expression in cancer samples, the L2,1-norm constraint is used on one of the regularization terms. Third, to retain the geometric structure of the data, we introduce a graph Laplacian regularization term into the PL21GPCA optimization model. Extensive experiments on five gene expression datasets, including one benchmark dataset, two single-cancer datasets from The Cancer Genome Atlas (TCGA), and two integrated datasets of multiple cancers from TCGA, are performed to validate the effectiveness of our method. The experimental results demonstrate that the PL21GPCA method performs better than many other methods in terms of tumor sample clustering. Additionally, this method is used to discover gene network modules for the purpose of finding key genes that may be associated with some cancers.
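The two building blocks named in this abstract, the L2,1-norm and the graph Laplacian, can be illustrated with a minimal sketch. It assumes the usual row-wise definition of the L2,1-norm (sum of the Euclidean norms of the rows) and the unnormalised Laplacian L = D - W; the abstract does not state which variants PL21GPCA uses, and the function names `l21_norm` and `graph_laplacian` are hypothetical, not taken from the paper.

```python
import math


def l21_norm(matrix):
    """Row-wise L2,1-norm: sum over rows of each row's Euclidean norm.

    Assumption: rows are the grouped dimension; some papers group by columns.
    """
    return sum(math.sqrt(sum(x * x for x in row)) for row in matrix)


def graph_laplacian(weights):
    """Unnormalised graph Laplacian L = D - W of a symmetric weight matrix.

    D is the diagonal degree matrix whose i-th entry is the sum of row i of W.
    """
    n = len(weights)
    degrees = [sum(row) for row in weights]
    return [
        [(degrees[i] if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy data (illustrative only): a 3 x 2 matrix and a 3-node weighted graph.
toy_matrix = [[3, 4], [0, 0], [5, 12]]
toy_weights = [[0, 1, 0], [1, 0, 2], [0, 2, 0]]
toy_norm = l21_norm(toy_matrix)
toy_laplacian = graph_laplacian(toy_weights)
```

An all-zero row contributes nothing to the L2,1-norm, which is why penalising it tends to encourage row-sparse solutions. Every row of the unnormalised Laplacian sums to zero.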

Gene co-expression in the interactome: moving from correlation toward causation via an integrated approach to disease module discovery

npj Systems Biology and Applications ◽

10.1038/s41540-020-00168-0 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Paola Paci ◽

Giulia Fiscon ◽

Federica Conte ◽

Rui-Sheng Wang ◽

Lorenzo Farina ◽

...

Keyword(s):

Network Analysis ◽

Interaction Network ◽

Integrated Approach ◽

State Transitions ◽

Disease Genes ◽

Human Interactome ◽

Protein Protein Interaction ◽

Interactome Network ◽

Module Discovery ◽

Specific Disorders

Abstract: In this study, we integrate the outcomes of co-expression network analysis with the human interactome network to predict novel putative disease genes and modules. We first apply the SWItch Miner (SWIM) methodology, which predicts important (switch) genes within the co-expression network that regulate disease state transitions, then map them to the human protein–protein interaction network (PPI, or interactome) to predict novel disease–disease relationships (i.e., a SWIM-informed diseasome). Although the relevance of switch genes to an observed phenotype has been recently assessed, their performance at the system or network level constitutes a new, potentially fascinating territory yet to be explored. Quantifying the interplay between switch genes and human diseases in the interactome network, we found that switch genes associated with specific disorders are closer to each other than to other nodes in the network, and tend to form localized connected subnetworks. These subnetworks overlap between similar diseases and are situated in different neighborhoods for pathologically distinct phenotypes, consistent with the well-known topological proximity property of disease genes. These findings allow us to demonstrate how SWIM-based correlation network analysis can serve as a useful tool for efficient screening of potentially new disease gene associations. When integrated with an interactome-based network analysis, it not only identifies novel candidate disease genes, but also may offer testable hypotheses by which to elucidate the molecular underpinnings of human disease and reveal commonalities between seemingly unrelated diseases.
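The observation that a set of genes is "closer to each other than to other nodes" can be made concrete with a small sketch. It uses the mean pairwise shortest-path distance on an unweighted, connected graph as a stand-in proximity measure; this is an assumption for illustration, since the abstract does not say which distance statistic the authors computed. The graph, node names, and function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(graph, source):
    """Shortest-path hop counts from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def mean_pairwise_distance(graph, nodes):
    """Average shortest-path distance over all unordered pairs in nodes.

    Assumes the graph is connected and that at least two nodes are given.
    """
    nodes = list(nodes)
    dist_from = {u: bfs_distances(graph, u) for u in nodes}
    pairs = list(combinations(nodes, 2))
    return sum(dist_from[u][v] for u, v in pairs) / len(pairs)


# Toy path graph a-b-c-d-e-f (adjacency lists; illustrative only).
toy_graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b", "d"],
    "d": ["c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}
localized = mean_pairwise_distance(toy_graph, ["a", "b", "c"])
scattered = mean_pairwise_distance(toy_graph, ["a", "d", "f"])
```

In the toy graph the contiguous set has a smaller mean distance than the scattered set. A real analysis would compare the observed value against randomly drawn gene sets of the same size before calling a set localized.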

The effect of statistical normalisation on network propagation scores

Bioinformatics ◽

10.1093/bioinformatics/btaa896 ◽

2020 ◽

Author(s):

Sergio Picart-Armada ◽

Wesley K Thompson ◽

Alfonso Buil ◽

Alexandre Perera-Lluna

Keyword(s):

Protein Function ◽

Diffusion Processes ◽

Protein Function Prediction ◽

Interaction Network ◽

Mean Value ◽

Statistical Properties ◽

Label Propagation ◽

Supplementary Information ◽

Module Discovery ◽

Permutation Analysis

Motivation: Network diffusion and label propagation are fundamental tools in computational biology, with applications like gene-disease association, protein function prediction and module discovery. More recently, several publications have introduced a permutation analysis after the propagation process, due to concerns that network topology can bias diffusion scores. This raises the question of the statistical properties and the presence of bias of such diffusion processes in each of their applications. In this work, we characterised some common null models behind the permutation analysis and the statistical properties of the diffusion scores. We benchmarked seven diffusion scores on three case studies: synthetic signals on a yeast interactome, simulated differential gene expression on a protein-protein interaction network and prospective gene set prediction on another interaction network. For clarity, all the datasets were based on binary labels, but we also present theoretical results for quantitative labels.

Results: Diffusion scores starting from binary labels were affected by the label codification, and exhibited a problem-dependent topological bias that could be removed by statistical normalisation. Parametric and non-parametric normalisation addressed both points by being codification-independent and by equalising the bias. We identified and quantified two sources of bias (mean value and variance) that yielded performance differences when normalising the scores. We provided closed formulae for both and showed how the null covariance is related to the spectral properties of the graph. Although none of the proposed scores systematically outperformed the others, normalisation was preferred when the sought positive labels were not aligned with the bias. We conclude that the decision on bias removal should be problem and data-driven, i.e. based on a quantitative analysis of the bias and its relation to the positive entities.

Availability: The code is publicly available at https://github.com/b2slab/diffuBench

Supplementary information: Supplementary data are available at Bioinformatics online.
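The idea of normalising a diffusion score against a label-permutation null can be sketched as follows. The sketch assumes a raw score of the form f = K y for a fixed kernel matrix K and binary labels y, and a null model in which the k positive labels are reassigned uniformly at random among the n nodes. Under that assumption the null mean and variance of each node's score follow from standard sampling-without-replacement results, so no simulation is needed. This is not a reproduction of the paper's seven scores or of its closed formulae, and the function names are hypothetical.

```python
import math


def diffusion_scores(kernel, labels):
    """Raw scores f_i = sum_j K_ij * y_j for a kernel matrix and 0/1 labels."""
    return [sum(k_ij * y_j for k_ij, y_j in zip(row, labels)) for row in kernel]


def z_normalise(kernel, labels):
    """Z-score each raw score against a uniform label-permutation null.

    For node i, with n nodes and k positive labels:
      null mean = k * mean(K_i)
      null var  = k * var(K_i) * (n - k) / (n - 1)
    where mean and var are taken over row i of K (population variance).
    Assumes n >= 2; nodes with zero null variance get a z-score of 0.0.
    """
    n = len(labels)
    k = sum(labels)
    raw = diffusion_scores(kernel, labels)
    z_scores = []
    for row, score in zip(kernel, raw):
        row_mean = sum(row) / n
        row_var = sum((a - row_mean) ** 2 for a in row) / n
        null_mean = k * row_mean
        null_var = k * row_var * (n - k) / (n - 1)
        if null_var > 0:
            z_scores.append((score - null_mean) / math.sqrt(null_var))
        else:
            z_scores.append(0.0)
    return z_scores


# Toy 3-node kernel with a single positive label (illustrative only).
toy_kernel = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]
toy_labels = [1, 0, 0]
toy_raw = diffusion_scores(toy_kernel, toy_labels)
toy_z = z_normalise(toy_kernel, toy_labels)
```

Because the null mean and variance depend on each node's own kernel row, two nodes with the same raw score can receive different z-scores. This is the sense in which normalisation compensates for topology-dependent bias.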

MONET: Multi-omic module discovery by omic selection

PLoS Computational Biology ◽

10.1371/journal.pcbi.1008182 ◽

2020 ◽

Vol 16 (9) ◽

pp. e1008182

Author(s):

Nimrod Rappoport ◽

Roy Safra ◽

Ron Shamir

Keyword(s):

Module Discovery

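Development of the Serli Practicum Module (Discovery Learning) for Science Learning in Elementary Schools [original title: PENGEMBANGAN MODUL PRAKTIKUM SERLI (DISCOVERY LEARNING) UNTUK PEMBELAJARAN SAINS DI SEKOLAH DASAR]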

Profesi Pendidikan Dasar ◽

10.23917/ppd.v7i1.10817 ◽

2020 ◽

Vol 7 (1) ◽

pp. 53-64

Author(s):

Azizah Thalib ◽

Puji Winarti ◽

Nurul Kami Sani

Keyword(s):

Elementary School ◽

Elementary Schools ◽

Research And Development ◽

Science Learning ◽

Discovery Learning ◽

Development Model ◽

Development Research ◽

Is Research ◽

Module Discovery

This study aims to develop a Serli Practicum Module (Discovery Learning) for Science learning in Class VI of elementary school that meets the criteria of being valid, practical, and effective for use in Science learning in elementary schools. This is a research and development study that uses the development model of Peffers et al. The model comprises six phases: (1) identifying problems that motivate the research, (2) describing research objectives, (3) designing and developing products, (4) testing products, (5) evaluating trial results, and (6) communicating results. The results of the study show that the Serli Practicum Module (Discovery Learning) for Science learning in Class VI of elementary school is valid, practical and effective. The validity score obtained is 61, which meets the "very valid" criterion; the practicality score is 65.73 ("very practical"); and the effectiveness score is 0.70 ("effective").
