XGRN: Reconstruction of Biological Networks Based on Boosted Trees Regression

Computation ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 48
Author(s):  
Georgios N. Dimitrakopoulos

In Systems Biology, the complex relationships between different entities in the cells are modeled and analyzed using networks. Towards this aim, a rich variety of gene regulatory network (GRN) inference algorithms has been developed in recent years. However, most algorithms rely solely on gene expression data to reconstruct the network. Because unrelated genes can have similar expression profiles, predictions can contain connections between biologically unrelated genes. Therefore, to obtain more consistent results, computational methods should also consider previously known biological information, such as experimentally validated interactions between transcription factors and target genes. In this work, we propose XGBoost for gene regulatory networks (XGRN), a supervised algorithm that combines gene expression data with previously known interactions for GRN inference. The key idea of our method is to train a regression model for each known interaction of the network and then use this model to predict new interactions. The regression is performed by XGBoost, a state-of-the-art algorithm using an ensemble of decision trees. In detail, XGRN learns a regression model based on the gene expression of the two interactors and then provides predictions using as input the gene expression of other candidate interactors. Application to benchmark datasets and a real large single-cell RNA-Seq experiment resulted in high performance compared to other unsupervised and supervised methods, demonstrating the ability of XGRN to provide reliable predictions.
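The idea of training one regression model per known interaction and reusing it on candidate interactors can be illustrated with a small sketch. This is not the XGRN implementation: a hand-rolled ensemble of boosted regression stumps stands in for XGBoost so that the example needs only the standard library, the expression values are made up, and the scoring rule (mean squared error between the model's output for a candidate and the target's expression) is an assumption, since the abstract does not specify how predictions are ranked.

```python
# Illustrative sketch only: boosted regression stumps stand in for XGBoost,
# and the candidate-scoring rule below is an assumption, not XGRN's own.
from statistics import mean

def fit_stump(x, residual):
    """Return (threshold, left_value, right_value) minimizing squared error."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2
        left = [r for v, r in zip(x, residual) if v <= thr]
        right = [r for v, r in zip(x, residual) if v > thr]
        lm, rm = mean(left), mean(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # x is constant, so no split is possible
        m = mean(residual)
        return float("inf"), m, m
    return best[1:]

def fit_boosted(x, y, n_rounds=50, lr=0.3):
    """Fit an additive ensemble of stumps to the residuals, round by round."""
    base = mean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residual = [t - p for t, p in zip(y, pred)]
        thr, lm, rm = fit_stump(x, residual)
        stumps.append((thr, lm, rm))
        pred = [p + lr * (lm if v <= thr else rm) for p, v in zip(pred, x)]
    return base, lr, stumps

def predict(model, x):
    base, lr, stumps = model
    return [base + lr * sum(lm if v <= thr else rm for thr, lm, rm in stumps)
            for v in x]

def mse(a, b):
    return mean((p - q) ** 2 for p, q in zip(a, b))

# Hypothetical expression values across 8 samples.
tf = [0.1, 0.5, 0.9, 1.3, 1.7, 2.1, 2.5, 2.9]      # known regulator
target = [2 * v + 1 for v in tf]                   # its known target gene
cand_a = [0.2, 0.4, 1.0, 1.2, 1.8, 2.0, 2.6, 2.8]  # profile close to the regulator
cand_b = [2.9, 0.1, 2.1, 0.5, 1.3, 2.5, 0.9, 1.7]  # unrelated profile

# Train on the known interaction, then feed candidate expression to the model.
model = fit_boosted(tf, target)
score_a = mse(predict(model, cand_a), target)
score_b = mse(predict(model, cand_b), target)
print(f"candidate A error: {score_a:.3f}")
print(f"candidate B error: {score_b:.3f}")
```

Under this assumed scoring rule, a lower error suggests that the candidate behaves like the known regulator with respect to the target, so candidate A would rank above candidate B. In practice, a gradient boosting library would replace the stump ensemble, and the ranking criterion should follow the published method.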

Biotechnology ◽  
2019 ◽  
pp. 265-304
Author(s):  
David Correa Martins Jr. ◽  
Fabricio Martins Lopes ◽  
Shubhra Sankar Ray

The inference of Gene Regulatory Networks (GRNs) is a very challenging problem that has attracted increasing attention since the development of high-throughput sequencing and gene expression measurement technologies. Many models and algorithms have been developed to identify GRNs, mainly using gene expression profiles as the data source. As gene expression data usually have a limited number of samples and inherent noise, the integration of gene expression with several other sources of information can be vital for accurately inferring GRNs. For instance, some prior information about the overall topological structure of the GRN can guide inference techniques toward better results. In addition to gene expression data, biological information from heterogeneous data sources has recently been integrated by GRN inference methods as well. The objective of this chapter is to present an overview of GRN inference models and techniques, with a focus on the incorporation of prior information such as global and local topological features and the integration of several heterogeneous data sources.


2020 ◽  
pp. 1052-1075 ◽  
Author(s):  
Dina Elsayad ◽  
A. Ali ◽  
Howida A. Shedeed ◽  
Mohamed F. Tolba

Gene expression analysis is an important research area in bioinformatics. It aims to understand gene interactions, gene functionality, and the effects of gene mutations. Gene regulatory network analysis is one of the tasks of gene expression data analysis; it studies the topological organization of gene interactions. The regulatory network is critical for understanding both pathological phenotypes and normal cell physiology. Many studies focus on gene regulatory network analysis, but some algorithms are affected by data size: because the algorithm runtime is proportional to the data size, parallel algorithms have been proposed to improve runtime and efficiency. This work presents background, mathematical models, and comparisons of different gene regulatory network analysis techniques. In addition, this work proposes a Parallel Architecture for Gene Regulatory Network (PAGeneRN).


Methods ◽  
2015 ◽  
Vol 85 ◽  
pp. 62-74 ◽  
Author(s):  
Peter J. Pemberton-Ross ◽  
Mikhail Pachkov ◽  
Erik van Nimwegen

2020 ◽  
Author(s):  
Xanthoula Atsalaki ◽  
Lefteris Koumakis ◽  
George Potamias ◽  
Manolis Tsiknakis

Abstract: High-throughput technologies, such as chromatin immunoprecipitation (ChIP) with massively parallel sequencing (ChIP-seq), have enabled cost- and time-efficient generation of an immense amount of genome data. The advent of advanced sequencing techniques has allowed biologists and bioinformaticians to investigate biological aspects of cell function and to understand or reveal unexplored disease etiologies. Systems biology attempts to formulate molecular mechanisms as mathematical models, and one of its most important areas is gene regulatory networks (GRNs), collections of DNA segments that interact with each other. GRNs incorporate valuable information about molecular targets that can be correlated to a specific phenotype. In our study we highlight the need to develop new explorative tools and approaches for the integration of different types of -omics data, such as ChIP-seq and GRNs, using pathway analysis methodologies. We present an integrative approach for ChIP-seq and gene expression data on GRNs. Using public microarray expression samples for lung cancer and healthy subjects along with the KEGG human gene regulatory networks, we identified ways to disrupt functional sub-pathways in lung cancer with the aid of CTCF ChIP-seq data, as a proof of concept. We expect that such a systems biology pipeline could assist researchers in identifying correlations and causality of transcription factors over functional or disrupted biological sub-pathways.

