Matrix factorization recovers consistent regulatory signals from disparate datasets

Author(s):  
Anand V. Sastry ◽  
Alyssa Hu ◽  
David Heckmann ◽  
Saugat Poudel ◽  
Erol Kavvas ◽  
...  

Abstract. The availability of gene expression data has dramatically increased in recent years. This data deluge could result in detailed inference of underlying regulatory networks, but the diversity of experimental platforms and protocols introduces critical biases that could hinder scalable analysis of existing data. Here, we show that the underlying structure of the E. coli transcriptome, as determined by Independent Component Analysis (ICA), is conserved across multiple independent datasets, including both RNA-seq and microarray datasets. We also show that echoes of this structure remain in the proteome, accelerating biological discovery through multi-omics analysis. We subsequently combined five transcriptomics datasets into a large compendium containing over 800 expression profiles and discovered that its underlying ICA-based structure was still comparable to that of the individual datasets. ICA thus enables deep analysis of disparate data to uncover new insights that were not visible in the individual datasets.

2021 ◽  
Vol 17 (2) ◽  
pp. e1008647 ◽  
Author(s):  
Anand V. Sastry ◽  
Alyssa Hu ◽  
David Heckmann ◽  
Saugat Poudel ◽  
Erol Kavvas ◽  
...  

The availability of bacterial transcriptomes has dramatically increased in recent years. This data deluge could result in detailed inference of underlying regulatory networks, but the diversity of experimental platforms and protocols introduces critical biases that could hinder scalable analysis of existing data. Here, we show that the underlying structure of the E. coli transcriptome, as determined by Independent Component Analysis (ICA), is conserved across multiple independent datasets, including both RNA-seq and microarray datasets. We subsequently combined five transcriptomics datasets into a large compendium containing over 800 expression profiles and discovered that its underlying ICA-based structure was still comparable to that of the individual datasets. With this understanding, we expanded our analysis to over 3,000 E. coli expression profiles and predicted three high-impact regulons that respond to oxidative stress, anaerobiosis, and antibiotic treatment. ICA thus enables deep analysis of disparate data to uncover new insights that were not visible in the individual datasets.


2022 ◽  
Vol 23 (2) ◽  
pp. 650
Author(s):  
Laís Reis-das-Mercês ◽  
Tatiana Vinasco-Sandoval ◽  
Rafael Pompeu ◽  
Aline Cruz Ramos ◽  
Ana K. M. Anaissi ◽  
...  

Gastric cancer (GC) is the fifth most common type of cancer and the third leading cause of cancer death in the world. It is a disease that encompasses a variety of molecular alterations, including in non-coding RNAs such as circular RNAs (circRNAs). In the present study, we investigated hsa_circ_0000211, hsa_circ_0000284, hsa_circ_0000524, hsa_circ_0001136 and hsa_circ_0004771 expression profiles using RT-qPCR in 71 gastric tissue samples from GC patients (tumor and tumor-adjacent samples) and volunteers without cancer. In order to investigate the suitability of circRNAs as minimally invasive biomarkers, we also evaluated their expression profile through RT-qPCR in peripheral blood samples from patients with and without GC (n = 41). We also investigated the predicted interactions between circRNA-miRNA-mRNA and circRNA-RBP using the KEGG and Reactome databases. Overall, our results showed that hsa_circ_0000211, hsa_circ_0000284 and hsa_circ_0004771 presented equivalent expression profiles when analyzed by different methods (RNA-Seq and RT-qPCR) and different types of samples (tissue and blood). Further, functional enrichment results identified important signaling pathways related to GC. Thus, our data support the consideration of circRNAs as new, minimally invasive biomarkers capable of aiding in the diagnosis of GC and with great potential to be applied in clinical practice.


2021 ◽  
Vol 11 (2) ◽  
pp. 138
Author(s):  
Yigit Koray Babal ◽  
Basak Kandemir ◽  
Isil Aksan Kurnaz

The ETS domain family of transcription factors is involved in a number of biological processes, and is commonly misregulated in various forms of cancer. Using microarray datasets from patients with different grades of glioma, we have analyzed the expression profiles of various ETS genes, and have identified ETV1, ELK3, ETV4, ELF4, and ETV6 as novel biomarkers for the identification of different glioma grades. We have further analyzed the gene regulatory networks of ETS transcription factors and compared them to previous microarray studies, where Elk-1-VP16 or PEA3-VP16 were overexpressed in neuroblastoma cell lines, and we identify unique and common regulatory networks for these ETS proteins.


2012 ◽  
Vol 80 (3) ◽  
pp. 1232-1242 ◽  
Author(s):  
Jason W. Sahl ◽  
David A. Rasko

Enterotoxigenic Escherichia coli (ETEC) is an important pathogenic variant (pathovar) of E. coli in developing countries from a human health perspective, causing significant morbidity and mortality. Previous studies have examined specific regulatory networks in ETEC, although little is known about the global effects of inter- and intrakingdom signaling on the expression of virulence and colonization factors in ETEC. In this study, an E. coli/Shigella pan-genome microarray, combined with quantitative reverse transcriptase PCR (qRT-PCR) and RNA sequencing (RNA-seq), was used to quantify the expression of ETEC virulence and colonization factors. Biologically relevant chemical signals were combined with ETEC isolate E24377A during growth in either Luria broth (LB) or Dulbecco's modified Eagle medium (DMEM), and transcription was examined during different phases of the growth cycle; chemical signals examined included glucose, bile salts, and preconditioned media from E. coli/Shigella isolates. The results demonstrate that the presence of bile salts, which are found in the intestine and thought to be bactericidal, upregulates the expression of many ETEC virulence factors, including heat-stable (estA) and heat-labile (eltA) enterotoxin genes. In contrast, the ETEC colonization factors CS1 and CS3 were downregulated in the presence of bile, consistent with findings in studies of other enteric pathogens. RNA-seq analysis demonstrated that one of the most differentially expressed genes in the presence of bile is a unique plasmid-encoded AraC-like transcriptional regulator (peaR); other previously unknown genetic elements were found as well. These results provide transcriptional targets and putative mechanisms that should help improve understanding of the global regulatory networks and virulence expression in this important human pathogen.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Océane Cassan ◽  
Sophie Lèbre ◽  
Antoine Martin

Abstract. Background: High-throughput transcriptomic datasets are often examined to discover new actors and regulators of a biological response. To this end, graphical interfaces have been developed that allow a broad range of users to conduct standard analyses of RNA-seq data, even with little programming experience. Although existing solutions usually provide adequate procedures for normalization, exploration or differential expression, more advanced features, such as gene clustering or regulatory network inference, are often missing or do not reflect current state-of-the-art methodologies. Results: We developed a user interface called DIANE (Dashboard for the Inference and Analysis of Networks from Expression data), designed to harness the potential of multi-factorial expression datasets from any organism through a precise set of methods. DIANE's interactive workflow provides normalization, dimensionality reduction, differential expression and ontology enrichment. Gene clustering can be performed and explored via configurable Mixture Models, and Random Forests are used to infer gene regulatory networks. DIANE also includes a novel procedure to assess the statistical significance of regulator-target influence measures based on permutations for Random Forest importance metrics. Throughout the pipeline, session reports and results can be downloaded to ensure clear and reproducible analyses. Conclusions: We demonstrate the value and the benefits of DIANE using a recently published data set describing the transcriptional response of Arabidopsis thaliana under the combination of temperature, drought and salinity perturbations. We show that DIANE can intuitively carry out informative exploration and statistical procedures with RNA-Seq data, perform model-based clustering of gene expression profiles and go further into gene network reconstruction, providing relevant candidate genes or signalling pathways to explore.
DIANE is available as a web service (https://diane.bpmp.inrae.fr), or can be installed and locally launched as a complete R package.
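
The permutation idea mentioned in the abstract above can be illustrated with a minimal sketch. This is not DIANE's implementation: the function names are hypothetical, and the importance score is replaced by an absolute Pearson correlation purely to keep the example self-contained (the abstract states that DIANE uses Random Forest importance metrics). The sketch shuffles the regulator's expression profile to build a null distribution and reports an empirical p-value for the observed score.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def influence(regulator, target):
    # Stand-in influence score (assumption): absolute correlation.
    # A Random Forest importance would be used here in a real pipeline.
    return abs(pearson(regulator, target))


def permutation_pvalue(regulator, target, n_perm=1000, seed=0):
    """Empirical p-value of the observed influence under shuffled regulator values."""
    rng = random.Random(seed)
    observed = influence(regulator, target)
    shuffled = list(regulator)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any regulator-target association
        if influence(shuffled, target) >= observed:
            hits += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

A regulator-target pair would be kept only if its empirical p-value falls below a chosen threshold; the threshold and the number of permutations are left to the analyst.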


Author(s):  
Vanika Gupta ◽  
Brian P. Lazzaro

Abstract. Gene expression profiles are typically described at the level of the tissue or, often in Drosophila, at the level of the whole organism. Collapsing the gene expression of entire tissues into single measures averages over potentially important heterogeneity among the cells that make up that tissue. The advent of single-cell RNA-sequencing technology (sc-RNAseq) allows transcriptomic evaluation of the individual cells that make up a tissue. However, sc-RNAseq requires a high-quality suspension of viable cells or nuclei, and cell dissociation methods that yield healthy cells and nuclei are still lacking for many important tissues. The insect fat body is a polyfunctional tissue responsible for diverse physiological processes and therefore is an important target for sc-RNAseq. The Drosophila adult fat body consists of fragile cells that are difficult to dissociate while maintaining cell viability. As an alternative, we developed a method to isolate single fat body nuclei for RNA-seq. Our isolation method is largely free of mitochondrial contamination and yields higher capture of transcripts per nucleus compared to other nuclei preparation methods. Our method works well for single cell nuclei sequencing and potentially can be implemented for bulk RNA-seq.


2017 ◽  
Author(s):  
VH Tierrafría ◽  
C Mejía-Almonte ◽  
JM Camacho-Zaragoza ◽  
H Salgado ◽  
K Alquicira ◽  
...  

Abstract. Motivation: A major component in our understanding of the biology of an organism is the mapping of its genotypic potential into the repertoire of its phenotypic expression profiles. This genotypic to phenotypic mapping is executed by the machinery of gene regulation that turns genes on and off, which in microorganisms is essentially studied by changes in growth conditions and genetic modifications. Although many efforts have been made to systematize the annotation of experimental conditions in microbiology, the available annotation is not based on a consistent and controlled vocabulary for the unambiguous description of growth conditions. This makes it difficult to identify biologically meaningful comparisons of knowledge generated in different experiments or laboratories, a task urgently needed given the massive amounts of data generated by high throughput (HT) technologies. Results: We curated terms related to experimental conditions that affect gene expression in E. coli K-12. Since this is the best studied microorganism, the collected terms are the seed for the first version of the Microbial Conditions Ontology (MCO), a controlled and structured vocabulary that can be expanded to annotate microbial conditions in general. Moreover, we developed an annotation framework using the MCO terms to describe experimental conditions, providing the foundation to identify regulatory networks that operate under a particular condition. MCO supports comparisons of HT-derived data from different repositories. In this sense, we started to map common RegulonDB terms and Colombos bacterial expression compendia terms to MCO. Availability and Implementation: As far as we know, MCO is the first ontology for growth conditions of any bacterial organism and it is available at http://regulondb.ccg.unam.mx/.
Furthermore, we will disseminate MCO throughout the Open Biomedical Ontology (OBO) Foundry in order to set a standard for the annotation of gene expression data derived from conventional as well as HT experiments in E. coli and other microbial organisms. This will enable the comparison of data from diverse data sources.


2020 ◽  
Vol 15 ◽  
Author(s):  
Yeqing Sun ◽  
Lei Chen ◽  
Yingqi Zhang ◽  
Jincheng Zhang ◽  
Shashi Ranjan Tiwari

Background: Osteoarthritis (OA), one of the most important causes of joint disability, has been considered an untreatable disease. A range of RNAs, including microRNAs, long non-coding RNAs (lncRNAs) and circular RNAs (circRNAs), have been reported to regulate the pathogenesis of OA. So far, the expression profiles and functions of lncRNAs, mRNAs, and circRNAs in OA are not fully understood. Objective: The present study aimed to identify differentially expressed genes in OA. Methods: The present study conducted RNA-seq to identify differentially expressed genes in OA. Gene Ontology (GO) analysis was used to analyze the Molecular Function and Biological Process categories. KEGG pathway analysis was used to map the differentially expressed lncRNAs to biological pathways. Results: Hierarchical clustering revealed that a total of 943 mRNAs, 518 lncRNAs, and 300 circRNAs were dysregulated in OA compared to normal samples. Furthermore, we constructed a protein-protein interaction network mediated by differentially expressed mRNAs, trans regulatory networks mediated by differentially expressed lncRNAs, and a competitive endogenous RNA (ceRNA) network to reveal the interactions among these genes in OA. Bioinformatics analysis revealed that these dysregulated genes were involved in regulating multiple biological processes, such as wound healing, negative regulation of ossification, sister chromatid cohesion, positive regulation of interleukin-1 alpha production, sodium ion transmembrane transport, positive regulation of cell migration, and negative regulation of inflammatory response. To the best of our knowledge, this study is the first to reveal the expression pattern of mRNAs, lncRNAs and circRNAs in OA. Conclusion: This study provides novel information suggesting that these differentially expressed RNAs may serve as possible biomarkers and targets in OA.


Genes ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 665
Author(s):  
Hui Yu ◽  
Yan Guo ◽  
Jingchun Chen ◽  
Xiangning Chen ◽  
Peilin Jia ◽  
...  

Transcriptomic studies of mental disorders using human brain tissue have been limited, and gene expression signatures in schizophrenia (SCZ) remain elusive. In this study, we applied three differential co-expression methods to analyze five transcriptomic datasets (three RNA-Seq and two microarray datasets) derived from SCZ and matched normal postmortem brain samples. We aimed to uncover biological pathways where internal correlation structure was rewired or inter-coordination was disrupted in SCZ. In total, we identified 60 rewired pathways, many of which were related to neurotransmitters, synapses, immune function, and cell adhesion. We found that the hub genes, which were at the center of rewired pathways, were highly mutually consistent among the five datasets. The combinatory list of 92 hub genes was generally multi-functional, suggesting their complex and dynamic roles in SCZ pathophysiology. In our constructed pathway crosstalk network, we found “Clostridium neurotoxicity” and “signaling events mediated by focal adhesion kinase” had the highest interactions. We further identified disconnected gene links underlying the disrupted pathway crosstalk. Among them, four gene pairs (PAK1:SYT1, PAK1:RFC5, DCTN1:STX1A, and GRIA1:MAP2K4) were normally correlated in universal contexts. In summary, we systematically identified rewired pathways, disrupted pathway crosstalk circuits, and critical genes and gene links in schizophrenia transcriptomes.


2020 ◽  
Vol 98 (Supplement_4) ◽  
pp. 286-286
Author(s):  
Kwangwook Kim ◽  
Sungbong Jang ◽  
Yanhong Liu

Abstract. Our previous studies have shown that supplementation of low-dose antibiotic growth promoter (AGP) impaired growth performance and exacerbated systemic inflammation of weaned pigs infected with pathogenic Escherichia coli (E. coli). The objective of this experiment, which is an extension of our previous report, was to investigate the effect of low-dose AGP on gene expression in ileal mucosa of weaned pigs experimentally infected with F18 E. coli. Thirty-four pigs (6.88 ± 1.03 kg BW) were individually housed in disease containment rooms and randomly allotted to one of three treatments (9 to 13 pigs/treatment). The three dietary treatments were a control diet (control) and 2 additional diets supplemented with 0.5 or 50 mg/kg of AGP (carbadox), respectively. The experiment lasted 18 d [7 d before and 11 d after first inoculation (d 0)]. The F18 E. coli inoculum was orally provided to all pigs at a dose of 10^10 cfu/3 mL for 3 consecutive days. Total RNA [4 to 6 pigs/treatment on d 5; 5 to 7 pigs/treatment on d 11 post-inoculation (PI)] was extracted from ileal mucosa to analyze gene expression profiles by Batch-Tag-Seq. Differentially expressed genes were defined by a 1.5-fold difference and a cutoff of P < 0.05 using the limma-voom package. All processed data were statistically analyzed and evaluated by the PANTHER classification system to determine the biological process function of genes in these lists. Compared to control, supplementation of recommended-dose AGP down-regulated genes related to inflammatory responses on d 5 and 11 PI, whereas feeding low-dose AGP up-regulated genes associated with negative regulation of metabolic process on d 5, but down-regulated the genes related to immune responses on d 11 PI. The present observations support the adverse effects of low-dose AGP in our previous study, as indicated by exacerbation of the detrimental effects of E. coli infection on pigs’ growth rate, diarrhea and systemic inflammation.
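
The selection rule quoted in the abstract above (a 1.5-fold difference together with P < 0.05) can be sketched as a simple filter. This is an illustrative sketch only, not the authors' limma-voom workflow: the function name, the input layout (gene, log2 fold change, p-value) and the gene labels are hypothetical, and the abstract does not say whether the P value was adjusted for multiple testing.

```python
import math


def select_differential(results, min_fold=1.5, max_p=0.05):
    """Keep genes whose absolute fold change and p-value pass both cutoffs.

    `results` is an iterable of (gene, log2_fold_change, p_value) tuples;
    this layout is an assumption made for the example.
    """
    threshold = math.log2(min_fold)  # a 1.5-fold change is about 0.585 on the log2 scale
    selected = []
    for gene, log2_fc, p_value in results:
        if abs(log2_fc) >= threshold and p_value < max_p:
            direction = "up" if log2_fc > 0 else "down"
            selected.append((gene, direction))
    return selected


# Hypothetical input values, for illustration only.
example = [
    ("geneA", 1.0, 0.01),   # passes both cutoffs, up-regulated
    ("geneB", -0.8, 0.03),  # passes both cutoffs, down-regulated
    ("geneC", 0.3, 0.001),  # significant, but fold change too small
    ("geneD", 2.0, 0.20),   # large fold change, but not significant
]
print(select_differential(example))
```

Working on the log2 scale treats up- and down-regulation symmetrically, which is why the fold cutoff is converted with `math.log2` before comparison.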

