Xenbase: deep integration of GEO & SRA RNA-seq and ChIP-seq data in a model organism database

Author(s):  
Joshua D Fortriede ◽  
Troy J Pells ◽  
Stanley Chu ◽  
Praneet Chaturvedi ◽  
DongZhuo Wang ◽  
...  

Abstract Xenbase (www.xenbase.org) is a knowledge base for researchers and biomedical scientists who employ the amphibian Xenopus as a model organism in biomedical research to gain a deeper understanding of developmental and disease processes. Through expert curation and automated data provisioning from various sources, Xenbase strives to integrate the body of knowledge on Xenopus genomics and biology together with the visualization of biologically significant interactions. Most current studies utilize next generation sequencing (NGS), but until now the results of different experiments were difficult to compare and were not integrated with other Xenbase content. Xenbase has developed a suite of tools, interfaces and data processing pipelines that transforms NCBI Gene Expression Omnibus (GEO) NGS content into deeply integrated gene expression and chromatin data, mapping all aligned reads to the most recent genome builds. This content can be queried and visualized via multiple tools and also provides the basis for future automated ‘gene expression as a phenotype’ and gene regulatory network analyses.

2020 ◽  
Vol 26 (29) ◽  
pp. 3619-3630
Author(s):  
Saumya Choudhary ◽  
Dibyabhaba Pradhan ◽  
Noor S. Khan ◽  
Harpreet Singh ◽  
George Thomas ◽  
...  

Background: Psoriasis is a chronic immune-mediated skin disorder with a global prevalence of 0.2-11.4%. Although mortality is rare, the severity of the disease is reflected in its accompanying comorbidities, which have even led to psychological problems in several patients. The cause and the disease mechanism still remain elusive. Objective: To identify potential therapeutic targets and affected pathways for better insight into the disease pathogenesis. Method: The gene expression profiles GSE13355 and GSE14905 were retrieved from the NCBI Gene Expression Omnibus database. The GEO profiles were integrated and the DEGs of lesional and non-lesional psoriasis skin were identified using the affy package in R software. The Kyoto Encyclopedia of Genes and Genomes pathways of the DEGs were analyzed using clusterProfiler. Cytoscape v3.7.1 was utilized to construct the protein interaction network and analyze the interactome map of candidate proteins encoded by the DEGs. Functionally relevant clusters were detected through Cytohubba and MCODE. Results: A total of 1013 genes were differentially expressed in lesional skin, of which 557 were upregulated and 456 were downregulated. Seven dysregulated genes were identified in non-lesional skin. The disease gene network of these DEGs revealed 75 newly identified differentially expressed genes that might have a role in the development and progression of the disease. GO analysis revealed keratinocyte differentiation and positive regulation of cytokine production to be the most enriched biological process and molecular function. Cytokine-cytokine receptor interaction was the most enriched pathway. Among the 1013 DEGs identified in the lesional group, 36 were found to have an altered genetic signature, including IL1B and STAT3, which are also reported as hub genes. CCNB1, CCNA2, CDK1, IL1B, CXCL8, MKI67, ESR1, UBE2C, STAT1 and STAT3 were the top 10 hub genes.
Conclusion: The hub genes, genomically altered DEGs and other newly identified dysregulated genes would improve our understanding of psoriasis pathogenesis; moreover, the hub genes could be explored as potential therapeutic targets for psoriasis.
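
The up/down split reported in the Results above can be illustrated with a minimal sketch. This is not the authors' R workflow: the function name `classify_degs`, the cutoffs `lfc_cutoff` and `alpha`, and the toy rows are all assumptions made for illustration, since the abstract does not state which thresholds were applied.

```python
from typing import Dict, Iterable, List, Tuple


def classify_degs(
    rows: Iterable[Tuple[str, float, float]],
    lfc_cutoff: float = 1.0,  # assumed |log2 fold change| threshold
    alpha: float = 0.05,      # assumed adjusted p-value threshold
) -> Dict[str, List[str]]:
    """Split (gene, log2FC, adjusted p) rows into up- and downregulated genes."""
    result: Dict[str, List[str]] = {"up": [], "down": []}
    for gene, log2fc, padj in rows:
        if padj >= alpha or abs(log2fc) < lfc_cutoff:
            continue  # not a DEG under the assumed cutoffs
        result["up" if log2fc > 0 else "down"].append(gene)
    return result


# Toy rows with placeholder gene names; the numbers are invented for the example.
toy_rows = [
    ("geneA", 2.3, 0.001),
    ("geneB", -1.8, 0.004),
    ("geneC", 0.4, 0.001),  # fold change below the cutoff
    ("geneD", 3.0, 0.20),   # adjusted p-value above alpha
]
degs = classify_degs(toy_rows)
print(len(degs["up"]), len(degs["down"]))
```

With real data, the two list lengths would correspond to the reported 557 upregulated and 456 downregulated genes, which together give the 1013 DEGs.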


2016 ◽  
Vol 45 (D1) ◽  
pp. D758-D768 ◽  
Author(s):  
Douglas G. Howe ◽  
Yvonne M. Bradford ◽  
Anne Eagle ◽  
David Fashena ◽  
Ken Frazer ◽  
...  

2021 ◽  
Author(s):  
Mathias N Stokholm ◽  
Maria B Rabaglino ◽  
Haja N Kadarmideen

Transcriptomic data is often expensive and difficult to generate in large cohorts in comparison to genomic data; it is therefore often important to integrate multiple transcriptomic datasets, from both microarray and next generation sequencing (NGS) platforms, across similar experiments or clinical trials to improve analytical power and the discovery of novel transcripts and genes. However, transcriptomic data integration presents a few challenges, including re-annotation and batch effect removal. We developed the Gene Expression Data Integration (GEDI) R package to enable transcriptomic data integration by combining already existing R packages. With just four functions, the GEDI R package makes constructing a transcriptomic data integration pipeline straightforward. Together, the functions overcome the complications in transcriptomic data integration by automatically re-annotating the data and removing the batch effect. The removal of the batch effect is verified with Principal Component Analysis and the data integration is verified using a logistic regression model with forward stepwise feature selection. To demonstrate the functionalities of the GEDI package, we integrated five bovine endometrial transcriptomic datasets from the NCBI Gene Expression Omnibus. The datasets included Affymetrix, Agilent and RNA-sequencing data. Furthermore, we compared the GEDI package to already existing tools and found that GEDI is the only tool that provides a full transcriptomic data integration pipeline including verification of both batch effect removal and data integration.
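
To make the idea of batch effect removal concrete, here is a minimal sketch for a single gene measured in two batches. It uses per-batch mean centering, which is a deliberately simple stand-in and not the method used by GEDI (the abstract does not name the underlying algorithm, and the package itself is written in R). The function names `center_by_batch` and `batch_means` and the toy values are assumptions for illustration.

```python
from statistics import mean
from typing import Dict, List


def batch_means(values: List[float], batches: List[str]) -> Dict[str, float]:
    """Return the mean expression value of each batch."""
    grouped: Dict[str, List[float]] = {}
    for value, batch in zip(values, batches):
        grouped.setdefault(batch, []).append(value)
    return {batch: mean(vals) for batch, vals in grouped.items()}


def center_by_batch(values: List[float], batches: List[str]) -> List[float]:
    """Shift every batch so that its mean equals the overall mean."""
    if len(values) != len(batches):
        raise ValueError("values and batches must have the same length")
    overall = mean(values)
    per_batch = batch_means(values, batches)
    return [v - per_batch[b] + overall for v, b in zip(values, batches)]


# One hypothetical gene measured on two platforms with a constant offset.
expr = [1.0, 2.0, 3.0, 11.0, 12.0, 13.0]
labels = ["array", "array", "array", "rnaseq", "rnaseq", "rnaseq"]

corrected = center_by_batch(expr, labels)
print(batch_means(expr, labels))       # batch means differ before correction
print(batch_means(corrected, labels))  # batch means coincide afterwards
```

Comparing the batch means before and after is a crude analogue of the PCA check described in the abstract. Mean centering also removes any real biological difference that happens to coincide with batch, so it should only be read as an illustration of the concept.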


2021 ◽  
Vol 2021 ◽  
pp. 1-11
Author(s):  
Yan Xu ◽  
Jiali Cai ◽  
Weibin Li ◽  
Jingkun Miao ◽  
Yan Mei ◽  
...  

Background. Pneumonia is a serious global health problem. In traditional Chinese medicine, acupuncture or moxibustion is used to directly stimulate select acupoints on the surface of the human body and produce physical stimulation to further stimulate regulatory functions in the body, strengthening bodily resistance, eliminating disease, and adjusting the viscera. However, this Chinese medicine knowledge does not include the specific mechanisms of action or targets of acupoints. Therefore, in-depth research is needed. Methods. An acupoint-element database was constructed, and the target elements of the Feishu point were screened. The UniProt-Swiss-Prot sublibrary was used to obtain correct gene name information. The National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database and GEO2R were used to analyze differentially expressed genes in pneumonia. The STRING database was used to analyze interactions, construct a network of the Feishu point efficacy system in pneumonia, and elucidate the mechanisms of action. Results. The Feishu point comprises 34 elements in total. The protein interaction network has 38 nodes and 115 edges. The Feishu point efficacy system-pneumonia system network shows that cytokine signaling in the immune system, signaling by interleukins (ILs), IL-4 and IL-13 signaling, and the immune system may be related to immunity and inflammation. The Feishu point efficacy system regulating pneumonia showed that FCER2, IL4R, FASLG, TGFB1, IL6R, STAT6, IL1B, CASP3, IL5RA, IL2RB, MYD88, SQSTM1, IL12RB1, IFNGR1, ADAM17, and CDH1 are the main targets. Conclusion. From the perspective of systematic acupuncture and moxibustion, the Feishu point regulates cytokine signaling in the immune system, signaling by ILs, IL-4 and IL-13 signaling, and the immune system by targeting FCER2, IL4R, FASLG, TGFB1, IL6R, STAT6, IL1B, CASP3, IL5RA, IL2RB, MYD88, SQSTM1, IL12RB1, IFNGR1, ADAM17, and CDH1, thereby regulating pneumonia.


2021 ◽  
Vol 30 (4) ◽  
pp. 444-452
Author(s):  
Kyung-Wan Baek ◽  
So-Jeong Kim ◽  
Ji-Seok Kim ◽  
Sun-Ok Kwon

PURPOSE: This study evaluates the differences in the expression of genes frequently analyzed in the field of exercise science between the skeletal muscle tissue and the various cell types that comprise it. METHODS: We summarized the genes and proteins expressed in the skeletal muscle that were published in the “Exercise Science” journal from 2015 to present. Thereafter, we selected the 15 most frequently analyzed genes and proteins in the skeletal muscle. These genes and proteins were horizontally compared for expression differences in skeletal muscle components and cultured cells based on NCBI Gene Expression Omnibus DataSets. RESULTS: The most analyzed genes (encoding analyzed proteins) in skeletal muscle tissues in “Exercise Science” were PPARGC1A, PPARD, MTOR, MAP1LC3A, MAP1LC3B, PRKAA1, AKT1, SLC2A4, MAPK1, COX4I1, MAPK14, MEF2A, MAPK8, RPS6KB1, and SOD1. Among them, PPARGC1A, AKT1, SLC2A4, MAPK1, and COX4I1 were specifically expressed in the skeletal muscle. However, expression of other genes was found to be significantly affected in other cell types of the skeletal muscle tissue. CONCLUSIONS: Genes such as PPARGC1A, which are specifically expressed in the skeletal muscle, may be analyzed without pretreating (such as perfusion) the skeletal muscle tissue. However, expression of other genes may depend on the skeletal muscle cell type. Thus, in such instances, pretreatment, such as perfusion and isolation, should be considered.


Author(s):  
Justine Dardaillon ◽  
Delphine Dauga ◽  
Paul Simion ◽  
Emmanuel Faure ◽  
Takeshi A Onuma ◽  
...  

Abstract ANISEED (https://www.aniseed.cnrs.fr) is the main model organism database for the worldwide community of scientists working on tunicates, the vertebrate sister-group. Information provided for each species includes functionally-annotated gene and transcript models with orthology relationships within tunicates, and with echinoderms, cephalochordates and vertebrates. Beyond genes, the system describes other genetic elements, including repeated elements and cis-regulatory modules. Gene expression profiles for several thousand genes are formalized in both wild-type and experimentally-manipulated conditions, using formal anatomical ontologies. These data can be explored through three complementary types of browsers, each offering a different viewpoint. A developmental browser summarizes the information in a gene- or territory-centric manner. Advanced genomic browsers integrate the genetic features surrounding genes or gene sets within a species. A Genomicus synteny browser explores the conservation of local gene order across deuterostomes. This new release covers an extended taxonomic range of 14 species, including for the first time a non-ascidian species, the appendicularian Oikopleura dioica. Functional annotations, provided for each species, were enhanced through a combination of manual curation of gene models and the development of an improved orthology detection pipeline. Finally, gene expression profiles and anatomical territories can be explored in 4D online through the newly developed Morphonet morphogenetic browser.


2020 ◽  
Vol 7 (1) ◽  
Author(s):  
Antonio Federico ◽  
Veera Hautanen ◽  
Nils Christian ◽  
Andreas Kremer ◽  
Angela Serra ◽  
...  

Abstract We present manually curated transcriptomics data of psoriasis and atopic dermatitis patients retrieved from the NCBI Gene Expression Omnibus and EBI ArrayExpress repositories. We collected 39 transcriptomics datasets, deriving from DNA microarrays and RNA-Sequencing technologies, for a total of 1677 samples. We provide quality-checked, homogenised and preprocessed gene expression matrices and their corresponding metadata tables along with the estimated surrogate variables. These data represent a ready-made valuable source of knowledge for translational researchers in the dermatology field.


2004 ◽  
Vol 5 (4) ◽  
pp. 362-369 ◽  
Author(s):  
Danforth Weems ◽  
Neil Miller ◽  
Margarita Garcia-Hernandez ◽  
Eva Huala ◽  
Seung Y. Rhee

The Arabidopsis Information Resource (TAIR) is a web-based community database for the model plant Arabidopsis thaliana. It provides an integrated view of genes, sequences, proteins, germplasms, clones, metabolic pathways, gene expression, ecotypes, polymorphisms, publications, maps and community information. TAIR is developed and maintained by collaboration between software developers and biologists. Biologists provide specification and use cases for the system, acquire, analyse and curate data, interact with users and test the software. Software developers design, implement and test the database and software. In this review, we briefly describe how TAIR was built and is being maintained.


2020 ◽  
Vol 9 (25) ◽  
Author(s):  
Kevin S. Myers ◽  
Michael Place ◽  
Daniel R. Noguera ◽  
Timothy J. Donohue

ABSTRACT We introduce COnTORT (COmprehensive Transcriptomic ORganizational Tool), a publicly available program that retrieves all available gene expression data and associated metadata for an organism from the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) database. The data are compiled into text files that can be used for downstream bioinformatic applications.
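
The kind of retrieval described above can be sketched with the public NCBI E-utilities ESearch endpoint. This is not COnTORT's implementation, which the abstract does not describe. The helper names `build_geo_search_url` and `to_tsv`, the example organism, and the placeholder record are assumptions for illustration. The sketch only builds the query URL and formats records as tab-delimited text; it makes no network request.

```python
from typing import Dict, List
from urllib.parse import urlencode

# Public E-utilities ESearch endpoint; "gds" is the GEO DataSets database.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_geo_search_url(organism: str, retmax: int = 100) -> str:
    """Build an ESearch URL that looks for GEO series (GSE) of one organism."""
    term = f'"{organism}"[Organism] AND gse[ETYP]'
    params = {"db": "gds", "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


def to_tsv(records: List[Dict[str, str]], columns: List[str]) -> str:
    """Render records as tab-delimited text with a header row."""
    lines = ["\t".join(columns)]
    for record in records:
        lines.append("\t".join(str(record.get(col, "")) for col in columns))
    return "\n".join(lines) + "\n"


url = build_geo_search_url("Bos taurus")
print(url)

# Placeholder record standing in for metadata parsed from a real response.
records = [{"accession": "GSE_PLACEHOLDER", "title": "placeholder title"}]
print(to_tsv(records, ["accession", "title"]))
```

A real tool would fetch the URL, follow up with ESummary or the GEO FTP site to obtain the expression data and metadata, and respect NCBI's request rate limits.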


2014 ◽  
Vol 66 (3) ◽  
pp. 983-988 ◽  
Author(s):  
Hui Li ◽  
Xiaolan Zhong ◽  
Chaomin Li ◽  
Lijing Peng ◽  
Wei Liu ◽  
...  

Coronary artery disease (CAD) is the leading cause of death worldwide. Microarray analysis is a practical approach to study gene transcription changes that may reflect signatures that underlie the pathogenesis of CAD. Using gene expression profile data from the Gene Expression Omnibus database, we identified differentially expressed genes that can contribute to the pathology of CAD. Further pathway and network analyses were also implemented to identify pathways and hub genes related to the disease. We observed 466 downregulated and 560 upregulated genes. The ribosome pathway was the most significantly over-represented pathway with differentially expressed genes. Over 35% of the genes in this pathway were downregulated. Hub genes in the network, such as IL7R, FYN, CALM1, ESR1 and PLCG1, may play crucial roles in the pathogenesis of CAD. Our results facilitate the identification of molecular mechanisms that underlie CAD.
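
Hub genes are commonly taken to be the most highly connected nodes of an interaction network. The sketch below ranks nodes by degree, which is one common definition; the abstract does not say which measure the authors used. The function name `top_hubs` and the toy edge list with placeholder node names are assumptions for illustration and do not represent real interactions.

```python
from typing import Dict, Iterable, List, Set, Tuple


def top_hubs(edges: Iterable[Tuple[str, str]], k: int = 3) -> List[Tuple[str, int]]:
    """Rank nodes of an undirected network by degree (distinct neighbours)."""
    neighbours: Dict[str, Set[str]] = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    degree = {node: len(adjacent) for node, adjacent in neighbours.items()}
    # Sort by descending degree, then by name so that ties are deterministic.
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:k]


# Toy undirected network with placeholder node names.
toy_edges = [("n1", "n2"), ("n1", "n3"), ("n1", "n4"), ("n2", "n3"), ("n4", "n5")]
hubs = top_hubs(toy_edges, k=2)
print(hubs)
```

In the toy network, `n1` has three neighbours and therefore ranks first; in a real analysis the edge list would come from a protein interaction resource restricted to the differentially expressed genes.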

