ISMARA: Completely automated inference of gene regulatory networks from high-throughput data

Author(s):  
Mikhail Pachkov ◽  
Piotr J Balwierz ◽  
Phil Arnold ◽  
Andreas J Gruber ◽  
Mihaela Zavolan ◽  
...  

As the costs of high-throughput measurement technologies continue to fall, experimental approaches in biomedicine are increasingly data intensive and the advent of big data is justifiably seen as holding the promise to transform medicine. However, as data volumes mount, researchers increasingly realize that extracting concrete, reliable, and actionable biological predictions from high-throughput data can be very challenging. Our laboratory has pioneered a number of methods for inferring key gene regulatory interactions from high-throughput data. For example, we developed motif activity response analysis (MARA), which models genome-wide gene expression (RNA-Seq or microarray) and chromatin state (ChIP-Seq) data in terms of comprehensive predictions of regulatory sites for hundreds of mammalian regulators (TFs and micro-RNAs). Using these models, MARA identifies the key regulators driving gene expression and chromatin state changes, the activities of these regulators across the input samples, their target genes, and the sites on the genome through which these regulators act. We recently completely automated MARA in an integrated web-server (ismara.unibas.ch) that allows researchers to analyze their own data by simply uploading RNA-Seq or ChIP-Seq datasets, and provides results in an integrated web interface as well as in downloadable flat form.
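The idea of modeling expression in terms of predicted regulatory sites can be illustrated with a small sketch. It assumes a simplified linear model in which the expression of promoter p in sample s is a promoter-specific baseline plus the sum, over motifs, of the predicted site count times a per-sample motif activity. The function names, the optional ridge penalty, and the toy data are hypothetical and are not taken from the ISMARA implementation.

```python
def solve(A, B):
    """Solve A X = B for X by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    aug = [[float(v) for v in A[i]] + [float(v) for v in B[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def motif_activities(expression, site_counts, ridge=0.0):
    """Least-squares motif activities (motifs x samples), centered across samples.

    Assumed model: expression[p][s] = baseline[p] + sum_m site_counts[p][m] * activity[m][s].
    Centering each promoter across samples removes the baseline term.
    """
    n_promoters = len(expression)
    n_samples = len(expression[0])
    n_motifs = len(site_counts[0])
    centered = [[e - sum(row) / n_samples for e in row] for row in expression]
    # Normal equations: (N^T N + ridge * I) A = N^T E_centered
    gram = [
        [
            sum(site_counts[p][a] * site_counts[p][b] for p in range(n_promoters))
            + (ridge if a == b else 0.0)
            for b in range(n_motifs)
        ]
        for a in range(n_motifs)
    ]
    rhs = [
        [sum(site_counts[p][a] * centered[p][s] for p in range(n_promoters))
         for s in range(n_samples)]
        for a in range(n_motifs)
    ]
    return solve(gram, rhs)


# Hypothetical toy data: 4 promoters, 2 motifs, 3 samples.
site_counts = [[1, 0], [0, 1], [2, 1], [1, 3]]
true_activities = [[1.0, -1.0, 0.0], [-0.5, 0.0, 0.5]]
baseline = [5.0, 3.0, 4.0, 6.0]
expression = [
    [baseline[p] + sum(site_counts[p][m] * true_activities[m][s] for m in range(2))
     for s in range(3)]
    for p in range(4)
]
activities = motif_activities(expression, site_counts)
```

On this noise-free toy input the fitted activities match the values used to generate the data. With real data a positive `ridge` value would typically be needed to keep the system well conditioned.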


2016 ◽  
Vol 113 (13) ◽  
pp. E1835-E1843 ◽  
Author(s):  
Mina Fazlollahi ◽  
Ivor Muroff ◽  
Eunjee Lee ◽  
Helen C. Causton ◽  
Harmen J. Bussemaker

Regulation of gene expression by transcription factors (TFs) is highly dependent on genetic background and interactions with cofactors. Identifying specific context factors is a major challenge that requires new approaches. Here we show that exploiting natural variation is a potent strategy for probing functional interactions within gene regulatory networks. We developed an algorithm to identify genetic polymorphisms that modulate the regulatory connectivity between specific transcription factors and their target genes in vivo. As a proof of principle, we mapped connectivity quantitative trait loci (cQTLs) using parallel genotype and gene expression data for segregants from a cross between two strains of the yeast Saccharomyces cerevisiae. We identified a nonsynonymous mutation in the DIG2 gene as a cQTL for the transcription factor Ste12p and confirmed this prediction empirically. We also identified three polymorphisms in TAF13 as putative modulators of regulation by Gcn4p. Our method has potential for revealing how genetic differences among individuals influence gene regulatory networks in any organism for which gene expression and genotype data are available along with information on binding preferences for transcription factors.


2021 ◽  
Vol 30 (04) ◽  
pp. 2150022
Author(s):  
Sergio Peignier ◽  
Pauline Schmitt ◽  
Federica Calevro

Inferring gene regulatory networks from high-throughput gene expression data is a challenging problem addressed by the systems biology community. Most approaches that aim to unravel gene regulation mechanisms in a data-driven way analyze gene expression datasets to score potential regulatory links between transcription factors and target genes. So far, three major families of approaches have been proposed to score regulatory links. These methods rely respectively on correlation measures, mutual information metrics, and regression algorithms. In this paper we present a new family of data-driven inference methods. This new family, inspired by the regression-based paradigm, relies on the use of classification algorithms. This paper assesses and advocates for the use of this paradigm as a promising new approach to infer gene regulatory networks. Indeed, the development and assessment of five new inference methods based on well-known classification algorithms show that the classification-based inference family exhibits good results when compared to well-established paradigms.


eLife ◽  
2020 ◽  
Vol 9 ◽  
Author(s):  
Christopher A Jackson ◽  
Dayanne M Castro ◽  
Giuseppe-Antonio Saldi ◽  
Richard Bonneau ◽  
David Gresham

Understanding how gene expression programs are controlled requires identifying regulatory relationships between transcription factors and target genes. Gene regulatory networks are typically constructed from gene expression data acquired following genetic perturbation or environmental stimulus. Single-cell RNA sequencing (scRNAseq) captures the gene expression state of thousands of individual cells in a single experiment, offering advantages in combinatorial experimental design, large numbers of independent measurements, and accessing the interaction between the cell cycle and environmental responses that is hidden by population-level analysis of gene expression. To leverage these advantages, we developed a method for scRNAseq in budding yeast (Saccharomyces cerevisiae). We pooled diverse transcriptionally barcoded gene deletion mutants in 11 different environmental conditions and determined their expression state by sequencing 38,285 individual cells. We benchmarked a framework for learning gene regulatory networks from scRNAseq data that incorporates multitask learning and constructed a global gene regulatory network comprising 12,228 interactions.


Computation ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 48
Author(s):  
Georgios N. Dimitrakopoulos

In Systems Biology, the complex relationships between different entities in the cells are modeled and analyzed using networks. Towards this aim, a rich variety of gene regulatory network (GRN) inference algorithms has been developed in recent years. However, most algorithms rely solely on gene expression data to reconstruct the network. Due to possible expression profile similarity, predictions can contain connections between biologically unrelated genes. Therefore, computational methods should also consider previously known biological information, such as experimentally validated interactions between transcription factors and target genes, to obtain more consistent results. In this work, we propose XGBoost for gene regulatory networks (XGRN), a supervised algorithm, which combines gene expression data with previously known interactions for GRN inference. The key idea of our method is to train a regression model for each known interaction of the network and then utilize this model to predict new interactions. The regression is performed by XGBoost, a state-of-the-art algorithm using an ensemble of decision trees. In detail, XGRN learns a regression model based on gene expression of the two interactors and then provides predictions using as input the gene expression of other candidate interactors. Application on benchmark datasets and a real large single-cell RNA-Seq experiment resulted in high performance compared to other unsupervised and supervised methods, demonstrating the ability of XGRN to provide reliable predictions.


2021 ◽  
Author(s):  
Kenji Okubo ◽  
Kunihiko Kaneko

Abstract
Mendelian inheritance is a fundamental law of genetics. Considering two alleles in a diploid, a phenotype of a heterotype is dominated by a particular homotype according to the law of dominance. This picture is usually based on simple genotype-phenotype mapping in which one gene regulates one phenotype. However, in reality, some interactions between genes can result in deviation from Mendelian dominance. Here, by using the numerical evolution of diploid gene regulatory networks (GRNs), we discuss whether Mendelian dominance evolves beyond the classical case of one-to-one genotype-phenotype mapping. We examine whether complex genotype-phenotype mapping can achieve Mendelian dominance through the evolution of the GRN with interacting genes. Specifically, we extend the GRN model to a diploid case, in which two GRN matrices are added to give gene expression dynamics, and simulate evolution with meiosis and recombination. Our results reveal that Mendelian dominance evolves even under complex genotype-phenotype mapping. This dominance is achieved via a group of genotypes that differ from each other but have a common phenotype given by the expression of target genes. Calculating the degree of dominance shows that it increases through the evolution, correlating closely with the decrease in phenotypic fluctuations and the increase in robustness to initial noise. This evolution of Mendelian dominance is associated with phenotypic robustness against meiosis-induced genome mixing, whereas sexual recombination arising from the mixing of chromosomes from the parents further enhances dominance and robustness. Owing to this dominance, the robustness to genetic differences increases, while the optimal fitness is sustained up to a large difference between the two genomes. In summary, Mendelian dominance is achieved by groups of genotypes that are associated with the increase in phenotypic robustness to noise.

Author summary
Mendelian dominance is one of the most fundamental laws in genetics. When two conflicting characters occur in a single diploid, the dominant character is always chosen. Assuming that one gene makes one character, this law is simple to grasp. However, in reality, phenotypes are generated via interactions between several genes, which may alter Mendel’s dominance law. The evolution of robustness to noise and mutations has been investigated extensively using complex expression dynamics with gene regulatory networks. Here, we applied gene-expression dynamics with complex interactions to the case of a diploid and simulated the evolution of the gene regulatory network to generate the optimal phenotype given by a certain gene expression pattern. Interestingly, after evolution, Mendelian dominance is achieved via a group of genes. This group-based Mendelian dominance is shaped by phenotype insensitivity to genome mixing by meiosis and evolves concurrently with the robustness to noise. By focusing on the influence of phenotypic robustness, which has received considerable attention recently, our result provides a novel perspective as to why Mendel’s law of dominance is commonly observed.
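The diploid extension described in the abstract, in which two GRN matrices are added to give the expression dynamics, might be sketched as follows. The tanh update rule, the gain `beta`, the initial state, and the two-gene matrices are illustrative assumptions; the abstract does not specify the authors' exact dynamics, and evolution, meiosis, and recombination are not modeled here.

```python
import math


def step(x, J_a, J_b, beta=2.0):
    """One update: each gene responds to the summed input from both haploid GRNs."""
    n = len(x)
    return [
        math.tanh(beta * sum((J_a[i][j] + J_b[i][j]) * x[j] for j in range(n)))
        for i in range(n)
    ]


def run(x0, J_a, J_b, steps=50):
    """Iterate the dynamics from initial expression x0 and return the final state."""
    x = list(x0)
    for _ in range(steps):
        x = step(x, J_a, J_b)
    return x


# Hypothetical haploid interaction matrices; entry [i][j] is the effect of gene j on gene i.
J_a = [[1, 0], [1, 0]]  # genome a: gene 0 activates itself and gene 1
J_b = [[1, 0], [0, 0]]  # genome b: lacks the activation of gene 1 by gene 0

x0 = [0.5, 0.0]
x_het = run(x0, J_a, J_b)    # heterozygote: one copy of each genome
x_hom_a = run(x0, J_a, J_a)  # homozygote for genome a
x_hom_b = run(x0, J_b, J_b)  # homozygote for genome b
```

In this toy setting gene 1 is strongly expressed in both the heterozygote and the `J_a` homozygote but stays at zero in the `J_b` homozygote, so the heterozygote's expression pattern resembles that of one homozygote.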


2019 ◽  
Author(s):  
Julia Åkesson ◽  
Zelmina Lubovac-Pilav ◽  
Rasmus Magnusson ◽  
Mika Gustafsson

Abstract
Summary: Hub transcription factors, regulating many target genes in gene regulatory networks (GRNs), play important roles as disease regulators and potential drug targets. However, while numerous methods have been developed to predict individual regulator-gene interactions from gene expression data, few methods focus on inferring these hubs. We have developed ComHub, a tool to predict hubs in GRNs. ComHub makes a community prediction of hubs by averaging over predictions by a compendium of network inference methods. Benchmarking ComHub against the DREAM5 challenge data and an independent data set of human gene expression showed robust performance of ComHub over all data sets. Lastly, we implemented ComHub to work with both predefined networks and to do standard network inference, which we believe will make it generally applicable.
Availability: Code is available at https://gitlab.com/Gustafsson-lab/[email protected], [email protected]
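The community prediction described above, averaging hub predictions over several inference methods, can be sketched in a few lines. This is a minimal illustration that assumes each method outputs a list of (regulator, target) edges and that a regulator's hub score is its out-degree. The function names and data are hypothetical and do not reflect ComHub's actual interface.

```python
from collections import Counter


def out_degrees(edges):
    """Count how many targets each regulator has in one predicted network."""
    return Counter(regulator for regulator, _target in edges)


def community_hub_scores(networks):
    """Average each regulator's out-degree over all methods' predicted networks."""
    degrees = [out_degrees(edges) for edges in networks.values()]
    regulators = {reg for edges in networks.values() for reg, _ in edges}
    # Counter returns 0 for regulators a method did not predict at all.
    return {reg: sum(d[reg] for d in degrees) / len(degrees) for reg in regulators}


# Hypothetical edge lists (regulator, target) from three inference methods.
networks = {
    "method_a": [("TF1", "g1"), ("TF1", "g2"), ("TF2", "g1")],
    "method_b": [("TF1", "g1"), ("TF1", "g3"), ("TF1", "g2")],
    "method_c": [("TF2", "g2"), ("TF1", "g1")],
}
scores = community_hub_scores(networks)
ranking = sorted(scores, key=scores.get, reverse=True)
```

Here `TF1` averages two targets per method and ranks above `TF2`. Averaging degree rather than individual edges means a regulator can rank highly even when methods disagree on its specific targets.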


2021 ◽  
Author(s):  
Jacob W Freimer ◽  
Oren Shaked ◽  
Sahin Naqvi ◽  
Nasa Sinnott-Armstrong ◽  
Arwa Kathiria ◽  
...  

Complex gene regulatory networks ensure that important genes are expressed at precise levels. When gene expression is sufficiently perturbed it can lead to disease. To understand how gene expression disruptions percolate through a network, we must first map connections between regulatory genes and their downstream targets. However, we lack comprehensive knowledge of the upstream regulators of most genes. Here we developed an approach for systematic discovery of upstream regulators of critical immune factors - IL2RA, IL-2, and CTLA4 - in primary human T cells. Then, we mapped the network of these regulators' target genes and enhancers using CRISPR perturbations, RNA-Seq, and ATAC-Seq. These regulators form densely interconnected networks with extensive feedback loops. Furthermore, this network is highly enriched for immune-associated disease variants and genes. These results provide insight into how immune-associated disease genes are regulated in T cells and broader principles about the structure of human gene regulatory networks.

