PhycoMine: A Microalgae Data Warehouse

2021 ◽  
Author(s):  
Rodrigo R. D. Goitia ◽  
Diego M. Riaño-Pachón ◽  
Alexandre Victor Fassio ◽  
Flavia V. Winck

Abstract: PhycoMine is a data warehouse system created to foster the analysis of complex and integrated data from microalgae species in a single computational environment. PhycoMine was developed on top of the InterMine software system and implements an extended database model, with a series of tools that help users analyze and mine individual and grouped data. The platform has widgets created to facilitate simultaneous data mining of different datasets. Among the widgets implemented in PhycoMine, there are options for mining chromosome distribution, gene expression variation via transcriptomics, proteomics sets, Gene Ontology enrichment, KEGG enrichment, publication enrichment, EggNOG, transcription factor and transcriptional regulator enrichment, and phenotypic data. These widgets were created to facilitate data visualization of the gene expression levels in different experimental setups, for which RNA-seq experimental data are available in data repositories. For comparative purposes, we have reanalyzed 200 RNA-seq datasets from Chlamydomonas reinhardtii, a model unicellular microalga, to optimize the performance and accuracy of data comparisons. We have also implemented widgets for metabolic pathway analysis of selected genes and proteins and options for biological network analysis. An option for the analysis of orthologous genes is also included. With this platform, users can perform data mining for a list of genes or proteins of interest in an integrated way by accessing data from different sources, visualizing them in graphics, and exporting them in table formats. The PhycoMine platform is freely available at https://PhycoMine.iq.usp.br.
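The enrichment widgets mentioned in the abstract above (Gene Ontology, KEGG, publication) ask whether an annotation term appears in a user's gene list more often than expected by chance. The abstract does not state which statistic PhycoMine uses; the sketch below assumes a one-sided hypergeometric test, a common choice for this kind of analysis. The function name and the example counts are hypothetical and are not taken from PhycoMine.

```python
from math import comb


def hypergeom_enrichment_p(population_size, annotated_in_population,
                           sample_size, annotated_in_sample):
    """One-sided hypergeometric p-value, P(X >= annotated_in_sample).

    population_size:         genes in the background (e.g. the whole genome)
    annotated_in_population: background genes carrying the term
    sample_size:             genes in the user's list
    annotated_in_sample:     genes in the list carrying the term
    """
    # The overlap can never exceed either the list size or the term size.
    max_overlap = min(annotated_in_population, sample_size)
    # Number of ways to draw any list of this size from the background.
    total = comb(population_size, sample_size)
    # Sum the probability mass for every overlap at least as large as observed.
    tail = sum(
        comb(annotated_in_population, k)
        * comb(population_size - annotated_in_population, sample_size - k)
        for k in range(annotated_in_sample, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 10 background genes, 3 annotated, list of 3 genes,
    # all 3 annotated.
    print(hypergeom_enrichment_p(10, 3, 3, 3))
```

In practice one p-value is computed per term, and the resulting set of p-values is then corrected for multiple testing before terms are reported as enriched.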

2017 ◽  
Author(s):  
Alexander Lachmann ◽  
Denis Torre ◽  
Alexandra B. Keenan ◽  
Kathleen M. Jagodnik ◽  
Hyojin J. Lee ◽  
...  

RNA-sequencing (RNA-seq) is currently the leading technology for genome-wide transcript quantification. While the volume of RNA-seq data is rapidly increasing, the currently publicly available RNA-seq data is provided mostly in raw form, with small portions processed non-uniformly. This is mainly because the computational demand, particularly for the alignment step, is a significant barrier for global and integrative retrospective analyses. To address this challenge, we developed all RNA-seq and ChIP-seq sample and signature search (ARCHS4), a web resource that makes the majority of previously published RNA-seq data from human and mouse freely available at the gene count level. Such uniformly processed data enables easy integration for downstream analyses. For developing the ARCHS4 resource, all available FASTQ files from RNA-seq experiments were retrieved from the Gene Expression Omnibus (GEO) and aligned using a cloud-based infrastructure. In total 137,792 samples are accessible through ARCHS4 with 72,363 mouse and 65,429 human samples. Through efficient use of cloud resources and dockerized deployment of the sequencing pipeline, the alignment cost per sample is reduced to less than one cent. ARCHS4 is updated automatically by adding newly published samples to the database as they become available. Additionally, the ARCHS4 web interface provides intuitive exploration of the processed data through querying tools, interactive visualization, and gene landing pages that provide average expression across cell lines and tissues, top co-expressed genes, and predicted biological functions and protein-protein interactions for each gene based on prior knowledge combined with co-expression. Benchmarking the quality of these predictions, co-expression correlation data created from ARCHS4 outperforms co-expression data created from other major gene expression data repositories such as GTEx and CCLE. ARCHS4 is freely accessible at: http://amp.pharm.mssm.edu/archs4


2019 ◽  
Author(s):  
Hidemasa Bono

Abstract: Gene expression data have been archived as microarray and RNA-seq datasets in two public databases, Gene Expression Omnibus (GEO) and ArrayExpress (AE). In 2018, the DNA DataBank of Japan started a similar repository called the Genomic Expression Archive (GEA). These databases are useful resources for the functional interpretation of genes, but have been separately maintained and may lack RNA-seq data, while the original sequence data are available in the Sequence Read Archive (SRA). We constructed an index for those gene expression data repositories, called All Of gene Expression (AOE), to integrate publicly available gene expression data. The web interface of AOE can graphically query data in addition to the application programming interface. By collecting gene expression data from RNA-seq in the SRA, AOE also includes data not included in GEO and AE. AOE is accessible as a search tool from the GEA website and is freely available at https://aoe.dbcls.jp/.


2020 ◽  
Author(s):  
Christian Escoto-Sandoval ◽  
Alan Flores-Díaz ◽  
M. Humberto Reyes-Valdés ◽  
Neftalí Ochoa-Alejo ◽  
Octavio Martinez

Abstract Background: Open data sharing is instrumental for the advance of biological sciences. Gene expression is the primary molecular phenotype, usually estimated through RNA-Seq experiments. Large-scale interpretation of RNA-Seq results is complicated by the extensive gene expression range, as well as by the diversity of biological sources and experimental treatments. Here we present “Salsa”, a self-contained R package for extracting useful knowledge about gene expression during the development of chili pepper fruit. Methods and Results: Data from 168 RNA-Seq libraries, comprising more than 3.4 billion reads, were analyzed and curated to represent standardized expression profiles (SEPs) for all genes expressed during fruit development in 12 chili pepper accessions. The accessions include domesticated varieties, wild ancestors and crosses, covering a broad spectrum of genotypes. Data are organized in a relational way, and functions allow data mining from the level of single genes up to whole genomes, grouping profiles by different criteria. These criteria include any combination of expression model, accession, protein description and gene ontology (GO) term, among others. Also, GO enrichment analysis can be performed over any set of genes. Conclusions: “Salsa” opens endless possibilities for mining the transcriptome of chili pepper during fruit development.


Plants ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 1647
Author(s):  
Charles Barros Vitoriano ◽  
Cristiane Paula Gomes Calixto

Rice (Oryza sativa L.) is a major food crop but heat stress affects its yield and grain quality. To identify mechanistic solutions to improve rice yield under rising temperatures, molecular responses of thermotolerance must be understood. Transcriptional and post-transcriptional controls are involved in a wide range of plant environmental responses. Alternative splicing (AS), in particular, is a widespread mechanism impacting the stress defence in plants but it has been completely overlooked in rice genome-wide heat stress studies. In this context, we carried out a robust data mining of publicly available RNA-seq datasets to investigate the extent of heat-induced AS in rice leaves. For this, datasets of interest were subjected to filtering and quality control, followed by accurate transcript-specific quantifications. Powerful differential gene expression (DE) and differential AS (DAS) analyses identified 17,143 and 2162 heat response genes, respectively, many of which are novel. Detailed analysis of DAS genes coding for key regulators of gene expression suggests that AS helps shape transcriptome and proteome diversity in response to heat. The knowledge resulting from this study confirmed a widespread transcriptional and post-transcriptional response to heat stress in plants, and it provided novel candidates for rapidly advancing rice breeding in response to climate change.


Genes ◽  
2021 ◽  
Vol 12 (5) ◽  
pp. 665
Author(s):  
Hui Yu ◽  
Yan Guo ◽  
Jingchun Chen ◽  
Xiangning Chen ◽  
Peilin Jia ◽  
...  

Transcriptomic studies of mental disorders using human brain tissues have been limited, and gene expression signatures in schizophrenia (SCZ) remain elusive. In this study, we applied three differential co-expression methods to analyze five transcriptomic datasets (three RNA-Seq and two microarray datasets) derived from SCZ and matched normal postmortem brain samples. We aimed to uncover biological pathways where internal correlation structure was rewired or inter-coordination was disrupted in SCZ. In total, we identified 60 rewired pathways, many of which were related to neurotransmitter, synapse, immune, and cell adhesion functions. We found that the hub genes, which were at the center of rewired pathways, were highly mutually consistent among the five datasets. The combined list of 92 hub genes was generally multi-functional, suggesting their complex and dynamic roles in SCZ pathophysiology. In our constructed pathway crosstalk network, we found that “Clostridium neurotoxicity” and “signaling events mediated by focal adhesion kinase” had the highest interactions. We further identified disconnected gene links underlying the disrupted pathway crosstalk. Among them, four gene pairs (PAK1:SYT1, PAK1:RFC5, DCTN1:STX1A, and GRIA1:MAP2K4) were normally correlated in universal contexts. In summary, we systematically identified rewired pathways, disrupted pathway crosstalk circuits, and critical genes and gene links in schizophrenia transcriptomes.


2021 ◽  
Vol 22 (5) ◽  
pp. 2746
Author(s):  
Dimitri Shcherbakov ◽  
Reda Juskeviciene ◽  
Adrián Cortés Sanchón ◽  
Margarita Brilkova ◽  
Hubert Rehrauer ◽  
...  

Mitochondrial misreading, conferred by mutation V338Y in mitoribosomal protein Mrps5, is associated in vivo with a subtle neurological phenotype. Brain mitochondria of homozygous knock-in mutant Mrps5V338Y/V338Y mice show decreased oxygen consumption and reduced ATP levels. Using a combination of unbiased RNA-Seq with untargeted metabolomics, we here demonstrate a concerted response, which alleviates the impaired functionality of OXPHOS complexes in Mrps5 mutant mice. This concerted response mitigates the age-associated decline in mitochondrial gene expression and compensates for impaired respiration by transcriptional upregulation of OXPHOS components together with anaplerotic replenishment of the TCA cycle (pyruvate, 2-ketoglutarate).


2021 ◽  
Vol 80 (Suppl 1) ◽  
pp. 12.2-12
Author(s):  
I. Muller ◽  
M. Verhoeven ◽  
H. Gosselt ◽  
M. Lin ◽  
T. De Jong ◽  
...  

Background: Tocilizumab (TCZ) is a monoclonal antibody that binds to the interleukin 6 receptor (IL-6R), inhibiting IL-6R signal transduction to downstream inflammatory mediators. TCZ has been shown to be effective as monotherapy in early rheumatoid arthritis (RA) patients (1). However, approximately one third of patients respond inadequately to therapy, and the biological mechanisms underlying lack of efficacy for TCZ remain elusive (1). Here we report gene expression differences, in both whole blood and peripheral blood mononuclear cell (PBMC) RNA samples, between early RA patients categorized by clinical TCZ response (reaching DAS28 < 3.2 at 6 months). These findings could lead to identification of predictive biomarkers for TCZ response and improve RA treatment strategies.

Objectives: To identify potential baseline gene expression markers for TCZ response in early RA patients using an RNA-sequencing approach.

Methods: Two cohorts of RA patients were included and blood was collected at baseline, before initiating TCZ treatment (8 mg/kg every 4 weeks, intravenously). DAS28-ESR scores were calculated at baseline and clinical response to TCZ was defined as DAS28 < 3.2 at 6 months of treatment. In the first cohort (n=21 patients, previously treated with DMARDs), RNA-sequencing (RNA-seq) was performed on baseline whole blood PAXgene RNA (Illumina TruSeq mRNA Stranded) and differential gene expression (DGE) profiles were measured between responders (n=14) and non-responders (n=7). For external replication, in a second cohort (n=95 therapy-naïve patients receiving TCZ monotherapy), RNA-seq was conducted on baseline PBMC RNA (SMARTer Stranded Total RNA-Seq Kit, Takara Bio) from the 2-year, multicenter, double-blind, placebo-controlled, randomized U-Act-Early trial (ClinicalTrials.gov identifier: NCT01034137) and DGE was analyzed between 84 responders and 11 non-responders.

Results: Whole blood DGE analysis showed two significantly higher expressed genes in TCZ non-responders (False Discovery Rate, FDR < 0.05): urotensin 2 (UTS2) and caveolin-1 (CAV1). Subsequent analysis of U-Act-Early PBMC DGE showed nine differentially expressed genes (FDR < 0.05), of which expression in clinical TCZ non-responders was significantly higher for eight genes (MTCOP12, ZNF774, UTS2, SLC4A1, FECH, IFIT1B, AHSP, and SPTB) and significantly lower for one gene (TND2P28M). Both analyses were corrected for baseline DAS28-ESR, age and gender. Expression of UTS2, with a proposed function in regulatory T-cells (2), was significantly higher in TCZ non-responders in both cohorts. Furthermore, gene ontology enrichment analysis revealed no distinct gene ontology or IL-6 related pathway(s) that were significantly different between TCZ responders and non-responders.

Conclusion: Several genes are differentially expressed at baseline between responders and non-responders to TCZ therapy at 6 months. Most notably, UTS2 expression is significantly higher in TCZ non-responders in both the whole blood and PBMC cohorts. UTS2 could be a promising target for further analyses as a potential predictive biomarker for TCZ response in RA patients in combination with clinical parameters (3).

References:
[1] Bijlsma JWJ, Welsing PMJ, Woodworth TG, et al. Early rheumatoid arthritis treated with tocilizumab, methotrexate, or their combination (U-Act-Early): a multicentre, randomised, double-blind, double-dummy, strategy trial. Lancet. 2016;388(10042):343-55.
[2] Bhairavabhotla R, Kim YC, Glass DD, et al. Transcriptome profiling of human FoxP3+ regulatory T cells. Human Immunology. 2016;77(2):201-13.
[3] Gosselt HR, Verhoeven MMA, Bulatovic-Calasan M, et al. Complex machine-learning algorithms and multivariable logistic regression on par in the prediction of insufficient clinical response to methotrexate in rheumatoid arthritis. Journal of Personalized Medicine. 2021;11(1).

Disclosure of Interests: None declared
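The tocilizumab abstract above reports genes at FDR < 0.05 but does not say how the false discovery rate was controlled. As an illustration only, the sketch below assumes the Benjamini-Hochberg procedure, which is widely used in differential expression analysis. The function name and the example p-values are hypothetical and are not data from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value can be
    # capped by the adjusted value of the next-larger p-value (monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for four genes.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p={p:.3f}  adjusted={q:.3f}  significant={q < 0.05}")
```

A gene is then called differentially expressed when its adjusted value falls below the chosen threshold, 0.05 in the abstract.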

