Genome-Wide Co-Expression Distributions as a Metric to Prioritize Genes of Functional Importance

Genes ◽  
2020 ◽  
Vol 11 (10) ◽  
pp. 1231
Author(s):  
Pâmela A. Alexandre ◽  
Nicholas J. Hudson ◽  
Sigrid A. Lehnert ◽  
Marina R. S. Fortes ◽  
Marina Naval-Sánchez ◽  
...  

Genome-wide gene expression analyses are routinely used to gain a systems-level understanding of complex processes, including network connectivity. Network connectivity tends to be built on a small subset of extremely high co-expression signals that are deemed significant, but this overlooks the vast majority of pairwise signals. Here, we developed a computational pipeline to assign every gene's pairwise genome-wide co-expression distribution to one of eight template distribution shapes (unimodal or bimodal, skewed or symmetrical), each representing a different proportion of positive and negative correlations. We then used a hypergeometric test to determine whether specific genes (regulators versus non-regulators) and properties (differentially expressed or not) are associated with a particular distribution shape. We applied our methodology to five publicly available RNA sequencing (RNA-seq) datasets from four organisms in different physiological conditions and tissues. Our results suggest that genes can be assigned consistently to pre-defined distribution shapes, with respect to the enrichment of differentially expressed and regulatory genes, in situations involving contrasting phenotypes, time series, or physiological baseline data. There is indeed a striking additional biological signal in the genome-wide distribution of co-expression values, which would be overlooked by currently adopted approaches. Our method can be applied to extract further information from transcriptomic data and help uncover the molecular mechanisms involved in the regulation of complex biological processes and phenotypes.
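The hypergeometric enrichment step mentioned in the abstract can be illustrated with a minimal sketch. The function name, its parameters, and the gene counts below are hypothetical and are not taken from the paper; the code only shows the standard upper-tail hypergeometric probability, which is one common way to ask whether a gene class is over-represented in a group.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, successes, draws, observed):
    """Upper-tail probability P(X >= observed) for a hypergeometric X.

    population: total number of genes tested
    successes:  genes carrying the property (e.g. regulators)
    draws:      genes assigned to the distribution shape of interest
    observed:   genes in that shape that also carry the property
    """
    if not 0 <= successes <= population:
        raise ValueError("successes must lie between 0 and population")
    if not 0 <= draws <= population:
        raise ValueError("draws must lie between 0 and population")

    lower = max(observed, 0)
    upper = min(successes, draws)
    total = comb(population, draws)
    # comb(n, k) returns 0 when k > n, so impossible outcomes add nothing.
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(lower, upper + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only: 10,000 genes, 500 regulators,
# 800 genes assigned to one shape, 70 of which are regulators.
p_value = hypergeom_enrichment_pvalue(10_000, 500, 800, 70)
print(f"enrichment p-value: {p_value:.3g}")
```

Exact integer arithmetic via `math.comb` avoids overflow for counts of this size; a real analysis over eight shapes and several gene classes would also need a multiple-testing correction, which this sketch omits.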

2020 ◽  
Author(s):  
Pâmela A. Alexandre ◽  
Nicholas J. Hudson ◽  
Sigrid A. Lehnert ◽  
Marina R.S. Fortes ◽  
Marina Naval-Sánchez ◽  
...  

Abstract Genome-wide gene expression is routinely used as a tool to gain a systems-level understanding of complex biological processes. Numerical approaches that have been used to highlight influential genes include abundance, differential expression, differential variation, network connectivity and differential connectivity. Network connectivity tends to be built on a small subset of extremely high co-expression signals that are deemed significant, but this overlooks the vast majority of pairwise signals. Here, we aimed to assess a complementary strategy, namely whether the entire shape of the distribution of genome-wide co-expression values contains a meaningful biological signal that has hitherto remained hidden from view. We have developed a computational pipeline to assign one of eight distributions (including normal, skewed, bimodal, kurtotic, inverted) to every gene. We then used a hypergeometric enrichment process to determine whether particular genes (regulators versus non-regulators) and properties (differentially expressed or not) are associated with particular distributions more often than would be expected by chance. Examination of several distinct data sets spanning four species indicates that there is indeed an additional biological signal in the genome-wide distribution of co-expression values, which would be overlooked by currently adopted approaches.

Author summary: High-throughput technologies, such as RNA-Seq, enable access to a vast amount of data. Here, we describe a new approach to interrogate these data and extract further information to help researchers understand complex phenotypes. Our method is based on gene-level co-expression distributions, which were compared to eight possible template shapes to group genes with similar behaviours. The method was tested using five different datasets, and the consistency of the results indicates that it can be used as a complementary strategy to analyse transcriptomic data.
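A gene-level co-expression distribution, as described above, can be sketched as the set of correlations between one gene and every other gene, summarised as a histogram over [-1, 1]. The sketch below assumes Pearson correlation and a fixed number of equal-width bins; both choices, along with the function names and the toy expression values, are assumptions for illustration. The comparison against the eight template shapes is not reproduced here because the abstract does not define the templates.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences, or None if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def coexpression_distribution(expression, gene):
    """Correlation of `gene` with every other gene in `expression`.

    expression: dict mapping gene name -> list of expression values per sample.
    Genes with constant expression are skipped.
    """
    target = expression[gene]
    result = {}
    for other, values in expression.items():
        if other == gene:
            continue
        r = pearson(target, values)
        if r is not None:
            result[other] = r
    return result


def histogram(correlations, bins=10):
    """Counts of correlation values in equal-width bins covering [-1, 1]."""
    counts = [0] * bins
    for r in correlations:
        index = int((r + 1.0) / 2.0 * bins)
        index = max(0, min(index, bins - 1))  # guard against rounding at the edges
        counts[index] += 1
    return counts


# Toy data, invented for illustration: three genes measured in four samples.
toy_expression = {
    "g1": [1.0, 2.0, 3.0, 4.0],
    "g2": [2.0, 4.0, 6.0, 8.0],
    "g3": [4.0, 3.0, 2.0, 1.0],
}
toy_distribution = coexpression_distribution(toy_expression, "g1")
toy_counts = histogram(toy_distribution.values(), bins=4)
print(toy_counts)
```

With thousands of genes, the resulting histogram is the object whose shape (unimodal, bimodal, skewed and so on) would then be compared against templates.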


2019 ◽  
Author(s):  
Yin Liu ◽  
Shenglin Mei ◽  
Hongxiu Wang ◽  
Fang Wang ◽  
Ying Wang ◽  
...  

Abstract CRISPR/Cas9 cleavage efficiency is crucial in a genome editing experiment. However, the molecular mechanisms that underlie differences in cleavage efficiency remain unclear. In our study, we characterized genome-wide gene expression and epigenetic features in low- and high-efficiency CRISPR-Cas9 clones across three different cell lines. We show that the Cas9 expression level is higher in high-efficiency clones. Notably, histone mark ChIP-seq data demonstrate that differentially expressed genes also differ in their level of histone 3 lysine 27 acetylation (H3K27ac). Finally, we observed that PARVA is an important gene that can enhance CRISPR-Cas9 efficiency.


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Patricia Garcia ◽  
Rita Fernandez-Hernandez ◽  
Ana Cuadrado ◽  
Ignacio Coca ◽  
Antonio Gomez ◽  
...  

Abstract Cornelia de Lange syndrome (CdLS) is a rare disease affecting multiple organs and systems during development. Mutations in the cohesin loader, NIPBL/Scc2, were first described and are the most frequent in clinically diagnosed CdLS patients. The molecular mechanisms driving CdLS phenotypes are not understood. In addition to its canonical role in sister chromatid cohesion, cohesin is implicated in the spatial organization of the genome. Here, we investigate the transcriptome of CdLS patient-derived primary fibroblasts and observe the downregulation of genes involved in development and system skeletal organization, providing a link to the developmental alterations and limb abnormalities characteristic of CdLS patients. Genome-wide distribution studies demonstrate a global reduction of NIPBL at the NIPBL-associated high GC content regions in CdLS-derived cells. In addition, cohesin accumulates at NIPBL-occupied sites at CpG islands potentially due to reduced cohesin translocation along chromosomes, and fewer cohesin peaks colocalize with CTCF.


2021 ◽  
Author(s):  
Frank J Dekker ◽  
Martijn Zwinderman ◽  
Thamar Jessurun Lobo ◽  
Petra Van der Wouden ◽  
Diana C.J. Spierings ◽  
...  

Following DNA replication, equal amounts of histones are distributed over sister chromatids by re-deposition of parental histones and deposition of newly synthesized histones. Molecular mechanisms balancing the allocation of new and old histones remain largely unknown. Here, we studied the genome-wide distribution of new histones relative to parental DNA template strands and replication initiation zones using double-click-seq. In control conditions, new histones were preferentially found on DNA replicated by the lagging strand machinery. Strikingly, replication stress induced by hydroxyurea or curaxin treatment, and inhibition of ATR or p53 inactivation, inverted the observed histone deposition bias to the strand replicated by the leading strand polymerase in line with previously reported effects on RPA occupancy. We propose that asymmetric deposition of newly synthesized histones onto sister chromatids reflects differences in the processivity of leading and lagging strand synthesis.


2021 ◽  
Author(s):  
Meishan Zhang ◽  
Ning Li ◽  
Weiguang Yang ◽  
Bao Liu

Abstract Differential regulation of gene expression and alternative splicing (AS) are major molecular mechanisms dictating plant growth and development, as well as underpinning heterosis in F1 hybrids. Here, using deep RNA-sequencing we analyzed differences in genome-wide gene expression and AS between developing embryo and endosperm, and between F1 hybrids and their pure-line parents in sorghum. We uncover dramatic differences in both gene expression and AS between embryo and endosperm with respect to gene features and functions, which are consistent with the fundamentally different biological roles of the two tissues. Accordingly, F1 hybrids showed substantial and multifaceted differences in gene expression and AS compared with their pure-line parents, again with clear tissue specificities including extents of difference, genes involved and functional enrichments. Our results provide useful transcriptome resources as well as novel insights for further elucidation of seed yield heterosis in sorghum and related crops.


2020 ◽  
Vol 27 ◽  
Author(s):  
Giulia De Riso ◽  
Sergio Cocozza

Epigenetics is a field of biological sciences focused on the study of reversible, heritable changes in gene function that are not due to modifications of the genomic sequence. These changes are the result of a complex cross-talk between several molecular mechanisms, which is in turn orchestrated by genetic and environmental factors. The epigenetic profile captures the unique regulatory landscape of an individual and their exposure to environmental stimuli. It thus constitutes a valuable reservoir of information for personalized medicine, which aims to customize health-care interventions based on the unique characteristics of each individual. Nowadays, the complex milieu of epigenomic marks can be studied at the genome-wide level thanks to massive, high-throughput technologies. This new experimental approach is opening up new and interesting knowledge perspectives. However, the analysis of these complex omic data requires addressing important analytical issues. Artificial Intelligence, and in particular Machine Learning (ML), is emerging as a powerful resource to decipher epigenomic data. In this review, we will first describe the most used ML approaches in epigenomics. We will then recapitulate some of the recent applications of ML to epigenomic analysis. Finally, we will provide some examples of how the ML approach to epigenetic data can be useful for personalized medicine.


2021 ◽  
Vol 7 (3) ◽  
pp. eabd9036
Author(s):  
Sara Saez-Atienzar ◽  
Sara Bandres-Ciga ◽  
Rebekah G. Langston ◽  
Jonggeol J. Kim ◽  
Shing Wan Choi ◽  
...  

Despite the considerable progress in unraveling the genetic causes of amyotrophic lateral sclerosis (ALS), we do not fully understand the molecular mechanisms underlying the disease. We analyzed genome-wide data involving 78,500 individuals using a polygenic risk score approach to identify the biological pathways and cell types involved in ALS. This data-driven approach identified multiple aspects of the biology underlying the disease that resolved into broader themes, namely, neuron projection morphogenesis, membrane trafficking, and signal transduction mediated by ribonucleotides. We also found that genomic risk in ALS maps consistently to GABAergic interneurons and oligodendrocytes, as confirmed in human single-nucleus RNA-seq data. Using two-sample Mendelian randomization, we nominated six differentially expressed genes (ATG16L2, ACSL5, MAP1LC3A, MAPKAPK3, PLXNB2, and SCFD1) within the significant pathways as relevant to ALS. We conclude that the disparate genetic etiologies of this fatal neurological disease converge on a smaller number of final common pathways and cell types.

