Combining gene expression data and prior knowledge for inferring gene regulatory networks via Bayesian networks using structural restrictions

Author(s):  
Luis M. de Campos ◽  
Andrés Cano ◽  
Javier G. Castellano ◽  
Serafín Moral

Abstract Gene Regulatory Networks (GRNs) are known as the most adequate instrument for gaining clear insight into, and understanding of, cellular systems. One of the most successful techniques for reconstructing GRNs from gene expression data is Bayesian networks (BNs), which have proven to be an ideal approach for integrating heterogeneous data in the learning process. Nevertheless, prior knowledge has so far been incorporated either through prior beliefs or by using networks as a starting point in the search process. In this work, the use of different kinds of structural restrictions within algorithms for learning BNs from gene expression data is considered. These restrictions codify prior knowledge in such a way that a BN must satisfy them. One aim of this work is therefore to give a detailed review of the use of prior knowledge and gene expression data to infer GRNs with BNs, but the major purpose of the paper is to investigate whether structural learning algorithms for BNs from expression data can achieve better outcomes by exploiting this prior knowledge through structural restrictions. The experimental study shows that this new way of incorporating prior knowledge leads to better reverse-engineered networks.
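As a rough illustration of what it means for a BN to "satisfy" structural restrictions, the sketch below checks a candidate directed graph against two kinds of restriction. The abstract does not say which kinds the authors use, so the required-edge and forbidden-edge sets here, along with every function and gene name, are assumptions made for illustration and are not the authors' implementation.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]  # (regulator, target)


def is_acyclic(nodes: Iterable[str], edges: Set[Edge]) -> bool:
    """Return True if the directed graph has no cycle (Kahn's algorithm)."""
    indegree = {n: 0 for n in nodes}
    children = {n: [] for n in indegree}
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1

    # Repeatedly remove nodes with no remaining incoming edges.
    ready = [n for n, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        node = ready.pop()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    # A cycle leaves at least one node that can never be removed.
    return removed == len(indegree)


def satisfies_restrictions(
    nodes: Iterable[str],
    edges: Iterable[Edge],
    required: Iterable[Edge] = (),
    forbidden: Iterable[Edge] = (),
) -> bool:
    """Check a candidate structure against hypothetical prior knowledge.

    required:  edges that must be present (assumed restriction type)
    forbidden: edges that must be absent (assumed restriction type)
    """
    edge_set = set(edges)
    if not set(required) <= edge_set:
        return False
    if set(forbidden) & edge_set:
        return False
    return is_acyclic(list(nodes), edge_set)


# Hypothetical three-gene example.
genes = ["TF1", "G1", "G2"]
candidate = {("TF1", "G1"), ("G1", "G2")}
required = {("TF1", "G1")}
forbidden = {("G2", "TF1")}
ok = satisfies_restrictions(genes, candidate, required, forbidden)
```

In a search-based learner, a check of this kind would typically be applied to each candidate move (edge addition, deletion, or reversal) so that structures violating the prior knowledge are never scored.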

2020 ◽  
Author(s):  
Yijie Wang ◽  
Justin M Fear ◽  
Isabelle Berger ◽  
Hangnoh Lee ◽  
Brian Oliver ◽  
...  

Abstract Gene Regulatory Networks (GRNs) control many aspects of cellular processes, including cell differentiation, maintenance of cell-type-specific states, signal transduction, and response to stress. Since GRNs provide information that is essential for understanding cell function, the inference of these networks is one of the key challenges in systems biology. Leading algorithms for reconstructing GRNs use, in addition to gene expression data, prior knowledge such as Transcription Factor (TF) DNA binding motifs or results of DNA binding experiments. However, such prior knowledge is typically incomplete, resulting in missing values. Therefore, integrating such incomplete prior knowledge with gene expression to elucidate the underlying GRNs remains difficult. To address this challenge we introduce NetREX-CF – Regulatory Network Reconstruction using EXpression and Collaborative Filtering – a GRN reconstruction approach that brings together a modern machine learning strategy (a Collaborative Filtering model) and a biologically justified model of gene expression (a sparse Network Component Analysis based model). The Collaborative Filtering (CF) model is able to overcome the incompleteness of the prior knowledge and recommend edges for building the GRN. Complementing CF, the sparse Network Component Analysis (NCA) model can use gene expression data to validate the recommended edges. Here we combine these two approaches using a novel data integration method and show that the new approach outperforms the currently leading GRN reconstruction methods. Furthermore, our mathematical formalization of the model has led to a complex optimization problem of a type that has not been attempted before. Specifically, the formulation contains an ℓ0 norm that cannot be separated from other variables. To fill this gap, we introduce a new method, Generalized PALM (GPALM), that allows us to solve a broad class of non-convex optimization problems, and we prove its convergence.


Biotechnology ◽  
2019 ◽  
pp. 265-304
Author(s):  
David Correa Martins Jr. ◽  
Fabricio Martins Lopes ◽  
Shubhra Sankar Ray

The inference of Gene Regulatory Networks (GRNs) is a very challenging problem that has attracted increasing attention since the development of high-throughput sequencing and gene expression measurement technologies. Many models and algorithms have been developed to identify GRNs, mainly using gene expression profiles as the data source. Because gene expression data usually have a limited number of samples and inherent noise, integrating gene expression with several other sources of information can be vital for accurately inferring GRNs. For instance, prior information about the overall topological structure of the GRN can guide inference techniques toward better results. In addition to gene expression data, biological information from heterogeneous data sources has recently been integrated by GRN inference methods as well. The objective of this chapter is to present an overview of GRN inference models and techniques, with a focus on the incorporation of prior information such as global and local topological features and on the integration of several heterogeneous data sources.


2020 ◽  
Author(s):  
Xanthoula Atsalaki ◽  
Lefteris Koumakis ◽  
George Potamias ◽  
Manolis Tsiknakis

Abstract High-throughput technologies, such as chromatin immunoprecipitation (ChIP) with massively parallel sequencing (ChIP-seq), have enabled cost- and time-efficient generation of immense amounts of genome data. The advent of advanced sequencing techniques has allowed biologists and bioinformaticians to investigate biological aspects of cell function and to understand or reveal unexplored disease etiologies. Systems biology attempts to formulate molecular mechanisms as mathematical models, and one of its most important areas is gene regulatory networks (GRNs), collections of DNA segments that interact with each other. GRNs incorporate valuable information about molecular targets that can be correlated to specific phenotypes. In our study we highlight the need to develop new explorative tools and approaches for the integration of different types of -omics data, such as ChIP-seq and GRNs, using pathway analysis methodologies. We present an integrative approach for ChIP-seq and gene expression data on GRNs. Using public microarray expression samples for lung cancer and healthy subjects along with the KEGG human gene regulatory networks, we identified ways to disrupt functional sub-pathways in lung cancer with the aid of CTCF ChIP-seq data, as a proof of concept. We expect that such a systems biology pipeline could assist researchers in identifying correlations and causality of transcription factors over functional or disrupted biological sub-pathways.

