Condition-adaptive fused graphical lasso (CFGL): an adaptive procedure for inferring condition-specific gene co-expression network

2018 ◽  
Author(s):  
Yafei Lyu ◽  
Lingzhou Xue ◽  
Feipeng Zhang ◽  
Hillary Koch ◽  
Laura Saba ◽  
...  

Abstract Co-expression network analysis provides useful information for studying gene regulation in biological processes. Examining condition-specific patterns of co-expression can provide insights into the underlying cellular processes activated in a particular condition. One challenge in this type of analysis is that the sample sizes in each condition are usually small, making the statistical inference of co-expression patterns highly underpowered. A joint network construction that borrows information from related structures across conditions has the potential to improve the power of the analysis. One possible approach to constructing the co-expression network is to use the Gaussian graphical model. Though several methods are available for joint estimation of multiple graphical models, they do not fully account for the heterogeneity between samples and between co-expression patterns introduced by condition specificity. Here we develop the condition-adaptive fused graphical lasso (CFGL), a data-driven approach to incorporate condition specificity in the estimation of co-expression networks. We show that this method improves the accuracy with which networks are learned. The application of this method to a rat multi-tissue dataset and The Cancer Genome Atlas (TCGA) breast cancer dataset provides interesting biological insights. In both analyses, we identify numerous modules enriched for Gene Ontology functions and observe that the modules that are upregulated in a particular condition are often involved in condition-specific activities. Interestingly, we observe that the genes strongly associated with survival time in the TCGA dataset are less likely to be network hubs, suggesting that genes associated with cancer progression are likely to govern specific functions, rather than regulating a large number of biological processes. 
Additionally, we observed that the tumor-specific hub genes tend to have few shared edges with normal tissue, revealing a tumor-specific regulatory mechanism. Author summary Gene co-expression networks provide insights into the mechanism of cellular activity and gene regulation. Condition-specific mechanisms may be identified by constructing and comparing co-expression networks of multiple conditions. We propose a novel statistical method to jointly construct co-expression networks for gene expression profiles from multiple conditions. By using a data-driven approach to capture condition-specific co-expression patterns, this method is effective in identifying both co-expression patterns that are specific to a condition and those that are common across conditions. The application of this method to real datasets reveals interesting biological insights.
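
For orientation, the fused graphical lasso that CFGL's name refers to estimates one precision matrix per condition by maximizing a penalized Gaussian log-likelihood. The display below is a sketch: the first two terms and the fusion penalty follow the standard fused graphical lasso form, while the weights w_ij^(k,k') are an assumption added here to illustrate how condition adaptivity could enter the penalty. The abstract does not spell out the exact form. The notation (K conditions, sample sizes n_k, sample covariance matrices S^(k), tuning parameters λ1 and λ2) is likewise introduced here for illustration.

```latex
\max_{\{\Theta^{(k)} \succ 0\}} \;
  \sum_{k=1}^{K} n_k \left[ \log\det \Theta^{(k)}
    - \operatorname{tr}\!\left( S^{(k)} \Theta^{(k)} \right) \right]
  - \lambda_1 \sum_{k=1}^{K} \sum_{i \neq j} \left| \theta_{ij}^{(k)} \right|
  - \lambda_2 \sum_{k < k'} \sum_{i,j} w_{ij}^{(k,k')}
      \left| \theta_{ij}^{(k)} - \theta_{ij}^{(k')} \right|
```

With every weight set to 1 this reduces to the ordinary fused graphical lasso. Setting a weight to 0 for a gene pair stops the penalty from pulling that edge toward a common value across the two conditions, which is one way to let condition-specific edges survive.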

2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Kai Zhao ◽  
Song Chen ◽  
Wenjing Yao ◽  
Zihan Cheng ◽  
Boru Zhou ◽  
...  

Abstract Background The bZIP gene family, which is widely present in plants, participates in varied biological processes including growth and development and stress responses. How do the genes regulate such biological processes? Systems biology is powerful for mechanistic understanding of gene functions. However, such studies have not yet been reported in poplar. Results In this study, we identified 86 poplar bZIP transcription factors and described their conserved domains. According to the phylogenetic tree, we divided these members into 12 groups with specific gene structures and motif compositions. The corresponding genes, which harbor a large number of segmental duplication events, are unevenly distributed on the 17 poplar chromosomes. In addition, we examined collinearity between these genes and the related genes from six other species. Evidence from transcriptomic data indicated that the bZIP genes in poplar displayed different expression patterns in roots, stems, and leaves. Furthermore, we identified 45 bZIP genes that respond to salt stress in the three tissues. We performed co-expression analysis on the representative genes, followed by gene set enrichment analysis. The results demonstrated that tissue differentially expressed genes, especially the co-expressing genes, are mainly involved in secondary metabolic and secondary metabolite biosynthetic processes. However, salt stress responsive genes and their co-expressing genes mainly participate in the regulation of metal ion transport and methionine biosynthesis. Conclusions Using comparative genomics and systems biology approaches, we, for the first time, systematically explore the structures and functions of the bZIP gene family in poplar. It appears that the bZIP gene family plays significant roles in regulation of poplar development and growth and salt stress responses through differential gene networks or biological processes. 
These findings provide the foundation for genetic breeding by engineering target regulators and corresponding gene networks into poplar lines.


2008 ◽  
Vol 5 (suppl_1) ◽  
Author(s):  
Madalena Chaves ◽  
Réka Albert

The segment polarity gene family, and its gene regulatory network, is at the basis of Drosophila embryonic development. The network's capacity for generating and robustly maintaining a specific gene expression pattern has been investigated through mathematical modelling. The models have provided several useful insights by suggesting essential network links, or uncovering the importance of the relative time scales of different biological processes in the formation of the segment polarity genes' expression patterns. But the developmental pattern formation process raises many other questions. Two of these questions are analysed here: the dependence of the signalling protein sloppy paired on the segment polarity genes and the effect of cell division on the segment polarity genes' expression patterns. This study suggests that cell division increases the robustness of the segment polarity network with respect to perturbations in biological processes.


2021 ◽  
Author(s):  
Enzo Grossi ◽  
Elisa Caminada ◽  
Beatrice Vescovo ◽  
Tristana Castrignano ◽  
Daniele Piscitelli ◽  
...  

Abstract Twenty expert caregivers wearing a body cam recorded 1868 video clips of 67 autistic subjects over a 3-month close follow-up. A team consisting of a senior child neuro-psychiatrist and a senior psychologist selected 780 of them as expressing repetitive behaviors (RB) and made an empirical classification according to components, complexity, body parts and sensory channels involved, with the aim of better understanding pattern complexity and correlating it with autism severity. The RB spectrum for each subject ranged from 1 to 33 different patterns (average = 11.6; S.D. = 6.82). Forty subjects expressed predominantly simple patterns and 27 predominantly complex patterns. No significant differences were found between the two groups according to ADOS score severity. This study represents a first attempt to systematically document expression patterns of RB with a data-driven approach. This may provide a better understanding of the pathophysiology, diagnosis, and treatment of RB.


2020 ◽  
Author(s):  
Leonardo Emberti Gialloreti ◽  
Roberto Enea ◽  
Valentina Di Micco ◽  
Daniele Di Giovanni ◽  
Paolo Curatolo

Abstract Background: Developments in gene-hunting techniques have identified several ASD-associated genes. Cluster analysis combined with gene network studies has revealed many disrupted key pathways in ASD, although unraveling its genetic underpinnings remains a challenging task. This study aims to determine, through a novel data-driven approach, how networks of mutated genes impact biological processes underlying autism. Methods: We analyzed the VariCarta dataset, which presents more than 200,000 genomic variant events collected from 13,069 people with ASD. First, we created a whole-genome and an exome sequencing subset. Then, for each subset, we compared the patients of each group pairwise to build “patient similarity matrices”. Hierarchical agglomerative clustering and heatmaps were used to identify clusters of patients with common occurrences of gene networks within these matrices. The subsequent enrichment analysis (EA) highlighted biological processes that might be impacted by the mutated genes of each subgroup. Results: Considering the whole-genome matrix, we identified three main genetic clusters of ASD patients, each one characterized by a network of shared genetic variants. We isolated 11,609 genetic variants shared by at least two subjects in each cluster; 4,187 of these variants (36.1%) were common to the three clusters. Only 331 patients (2.5%) shared none or very few mutated genes with anyone else. The EA highlighted common or cluster-specific biological processes related to the variants. Most of the common abnormal processes were involved in neuron projection guidance and morphogenesis, cell junctions and synapse assembly. Exome sequencing alone was not effective in identifying ASD subgroups. Limitations: Caution is warranted when interpreting our results, as we did not compare them with a control group and did not verify whether the identified subgroups were actually associated with different phenotypes. 
Future work will have to ascertain the strength and reproducibility of these results. Conclusions: Itemizing not just single mutated genes, but also gene networks and specific biological processes that characterize different ASD subpopulations might allow a better understanding of which networks of genetic variants play a major role in the etiopathology of ASD. The proposed methodology may represent a novel approach to help disentangle ASD complexity and an instrument to boost more focused genotype-phenotype studies.
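
The pairwise comparison and clustering steps described in the Methods above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the abstract does not name the similarity measure or the linkage rule, so the Jaccard index and average linkage used here are assumptions, and the patient and gene labels are made-up toy values.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of mutated genes (assumed measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_matrix(patients):
    """Return sorted patient ids and the symmetric patient-by-patient matrix."""
    ids = sorted(patients)
    sim = [[1.0 if p == q else jaccard(patients[p], patients[q]) for q in ids]
           for p in ids]
    return ids, sim


def agglomerative(ids, sim, n_clusters):
    """Average-linkage agglomerative clustering driven by a similarity matrix."""
    index = {pid: i for i, pid in enumerate(ids)}
    clusters = [[pid] for pid in ids]

    def avg_sim(c1, c2):
        total = sum(sim[index[a]][index[b]] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Merge the two clusters with the highest average similarity.
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: avg_sim(clusters[ij[0]], clusters[ij[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return [sorted(c) for c in clusters]


# Hypothetical toy profiles: patient id -> set of genes carrying a variant.
toy_patients = {
    "P1": {"g1", "g2", "g3"},
    "P2": {"g1", "g2"},
    "P3": {"g4", "g5"},
    "P4": {"g4", "g5", "g6"},
    "P5": {"g7"},
}
toy_ids, toy_sim = similarity_matrix(toy_patients)
print(agglomerative(toy_ids, toy_sim, n_clusters=3))
```

In this toy input, P5 shares no gene with anyone and stays in a cluster of its own, loosely mirroring the small group of patients the abstract reports as sharing none or very few mutated genes with anyone else.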


2016 ◽  
Vol 36 (suppl_1) ◽  
Author(s):  
Yuqi Zhao ◽  
Jing Chen ◽  
Johannes M Freudenberg ◽  
Qingying Meng ◽  
Deepak K Rajpal ◽  
...  

Objective: Recent genome-wide association studies (GWAS) of coronary artery disease (CAD) have revealed 58 genome-wide significant and 148 suggestive genetic loci. However, the molecular mechanisms through which they contribute to CAD and the clinical implications of these findings remain largely unknown. We aim to retrieve gene subnetworks of the 206 CAD loci and identify and prioritize candidate regulators to better understand the biological mechanisms underlying the genetic associations. Approach and Results: We devised a new integrative genomics approach that incorporated i) candidate genes from the top CAD loci, ii) the complete genetic association results from the CARDIoGRAM-C4D CAD GWAS, iii) tissue-specific gene regulatory networks that depict the potential relationship and interactions between genes, and iv) tissue-specific gene expression patterns between CAD patients and controls. The networks and top ranked regulators according to these data-driven criteria were further queried against literature, experimental evidence, and drug information to evaluate their disease relevance and potential as drug targets. Our analysis uncovered several potential novel regulators of CAD such as LUM and STAT3, which possess properties suitable as drug targets. We also revealed molecular relations and potential mechanisms through which the top CAD loci operate. Furthermore, we found that extracellular matrix genes coordinate multiple CAD-relevant biological processes such as complement and coagulation cascades and lipid metabolism through tissue-specific interactions in the CAD networks. Conclusion: Our data-driven integrative genomics framework unraveled tissue-specific relations among the candidate genes of the CAD GWAS loci and prioritized novel network regulatory genes orchestrating biological processes relevant to CAD.


Author(s):  
Elizabeth Ing-Simmons ◽  
Roshan Vaid ◽  
Xin Yang Bing ◽  
Michael Levine ◽  
Mattias Mannervik ◽  
...  

Abstract The relationship between chromatin organization and gene regulation remains unclear. While disruption of chromatin domains and domain boundaries can lead to misexpression of developmental genes, acute depletion of regulators of genome organization has a relatively small effect on gene expression. It is therefore uncertain whether gene expression and chromatin state drive chromatin organization or whether changes in chromatin organization facilitate cell-type-specific activation of gene expression. Here, using the dorsoventral patterning of the Drosophila melanogaster embryo as a model system, we provide evidence for the independence of chromatin organization and dorsoventral gene expression. We define tissue-specific enhancers and link them to expression patterns using single-cell RNA-seq. Surprisingly, despite tissue-specific chromatin states and gene expression, chromatin organization is largely maintained across tissues. Our results indicate that tissue-specific chromatin conformation is not necessary for tissue-specific gene expression but rather acts as a scaffold facilitating gene expression when enhancers become active.


2021 ◽  
Author(s):  
Priya Kumari ◽  
Vijay Gahlaut ◽  
Ekjot Kaur ◽  
Sanatsujat Singh ◽  
Sanjay Kumar ◽  
...  

Abstract In the past few years, plant-specific GRAS transcription factors (TFs) were reported to play an essential role in regulating several biological processes, such as plant growth and development, phytochrome signaling, arbuscular mycorrhiza (AM) symbiosis, and environmental stress responses. GRAS genes have been thoroughly studied in several plant species, but remain unexplored in Rosa chinensis (rose). In this study, 59 rose GRAS genes (RcGRAS) were identified. Phylogenetic analyses grouped RcGRAS genes into 17 subfamilies, of which subfamily Rc2 was Rosaceae family-specific. Gene structure analyses showed that most of the RcGRAS genes were intronless and were relatively conserved. Cis-element analyses suggested that RcGRAS genes may be involved in distinct biological processes and responsive to diverse abiotic stresses. Most of the genes were localized in the nucleus, except for a few in the cytoplasm. Gene expression analysis was also performed in various tissues and during gibberellin (GA) and drought stress treatment. The expression patterns of RcGRAS genes during GA treatment and in response to drought stress suggested the potential functions of these genes in regulating stress and hormone responses. In summary, a comprehensive exploration of the rose GRAS gene family was performed, and the generated information can be utilized for further functional studies on this family.


2016 ◽  
Author(s):  
Dong Li ◽  
James B. Brown ◽  
Luisa Orsini ◽  
Zhisong Pan ◽  
Guyu Hu ◽  
...  

Summary Gene co-expression network differential analysis is designed to help biologists understand gene expression patterns under different conditions. We have implemented an R package called MODA (Module Differential Analysis) for gene co-expression network differential analysis. Based on transcriptomic data, MODA can be used to estimate and construct condition-specific gene co-expression networks, and identify differentially expressed subnetworks as conserved or condition-specific modules which are potentially associated with relevant biological processes. The usefulness of the method is also demonstrated on synthetic data as well as Daphnia magna gene expression data under different environmental stresses.


2015 ◽  
Author(s):  
Samantha M Thomas ◽  
Courtney Kagan ◽  
Bryan J Pavlovic ◽  
Jonathan Burnett ◽  
Kristen Patterson ◽  
...  

Renewable in vitro cell cultures, such as lymphoblastoid cell lines (LCLs), have facilitated studies that contributed to our understanding of genetic influence on human traits. However, the degree to which cell lines faithfully maintain differences in donor-specific phenotypes is still debated. We have previously reported that standard cell line maintenance practice results in a loss of donor-specific gene expression signatures in LCLs. An alternative to the LCL model is the induced pluripotent stem cell (iPSC) system, which carries the potential to model tissue-specific physiology through the use of differentiation protocols. Still, existing LCL banks represent an important source of starting material for iPSC generation, and it is possible that the disruptions in gene regulation associated with long-term LCL maintenance could persist through the reprogramming process. To address this concern, we studied the effect of reprogramming mature LCLs to iPSCs on the ensuing gene expression patterns within and between six unrelated donor individuals. We show that the reprogramming process results in a recovery of donor-specific gene regulatory signatures. Since environmental contributions are unlikely to be a source of individual variation in our system of highly passaged cultured cell lines, our observations suggest that the effect of genotype on gene regulation is more pronounced in the iPSCs than in the LCL precursors. Our findings indicate that iPSCs can be a powerful model system for studies of phenotypic variation across individuals in general, and the genetic association with variation in gene regulation in particular. We further conclude that LCLs are an appropriate starting material for iPSC generation.


2019 ◽  
Author(s):  
Aurora Savino ◽  
Lidia Avalle ◽  
Emanuele Monteleone ◽  
Irene Miglio ◽  
Alberto Griffa ◽  
...  

Abstract The behaviour of complex biological systems is determined by the orchestrated activity of many components interacting with each other, and can be investigated by networks. In particular, gene co-expression networks have been widely used in the past years thanks to the increasing availability of huge gene expression databases. Breast cancer is a heterogeneous disease usually classified either according to immunohistochemical features or by expression profiling, which identifies the 5 subtypes luminal A, luminal B, basal-like, HER2-positive and normal-like. Basal-like tumours are the most aggressive subtype, for which so far no targeted therapy is available. Making use of the WGCNA clustering method to reconstruct breast cancer transcriptional networks from the METABRIC breast cancer dataset, we developed a platform to address specific questions related to breast cancer biology. In particular, we obtained gene modules significantly correlated with survival and age of onset, useful to understand how molecular features and gene expression patterns are organized in breast cancer. We next generated subtype-specific gene networks and in particular identified two modules that are significantly more connected in basal-like breast cancer with respect to all other subtypes, suggesting relevant biological functions. We demonstrate that network centrality (kWithin) is a suitable measure to identify relevant genes, since we could show that it correlates with clinical features and that it provides a means to select potential upstream regulators of a module with high reliability. Finally, we showed the feasibility of adding meaning to the networks by combining them with independently obtained data related to activated pathways. In conclusion, our platform allows the identification of groups of genes highly relevant in breast cancer and possibly amenable to drug targeting, due to their ability to regulate survival-related gene networks. 
This approach could be successfully extended to other BC subtypes, and to all tumor types for which enough expression data are available.
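
The kWithin centrality mentioned above is a gene's intramodular connectivity: the sum of its adjacencies to the other genes in the same module. The Python sketch below illustrates the idea and is not the WGCNA implementation. It assumes an unsigned network with adjacency |cor|^beta and beta = 6. The function names, gene labels, module assignment and expression values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (must not be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def adjacency(expr, beta=6):
    """Unsigned soft-threshold adjacency a_ij = |cor(x_i, x_j)| ** beta (assumed)."""
    genes = sorted(expr)
    return {g: {h: abs(pearson(expr[g], expr[h])) ** beta
                for h in genes if h != g}
            for g in genes}


def k_within(adj, modules):
    """Sum each gene's adjacencies to the other genes in its own module.

    Every gene in `adj` must be assigned to exactly one module.
    """
    module_of = {g: m for m, members in modules.items() for g in members}
    return {g: sum(a for h, a in adj[g].items()
                   if module_of[h] == module_of[g])
            for g in adj}


# Hypothetical toy expression profiles (gene -> values across four samples).
toy_expr = {
    "g1": [1, 2, 3, 4],
    "g2": [2, 4, 6, 8],
    "g3": [4, 3, 2, 1],
    "g4": [1, 3, 2, 4],
}
toy_modules = {"M1": ["g1", "g2", "g3"], "M2": ["g4"]}
print(k_within(adjacency(toy_expr), toy_modules))
```

Genes with the highest kWithin in a module are its hubs, which is why the measure is a natural starting point for shortlisting candidate upstream regulators.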

