An R Package for Data Mining Chili Pepper Fruit Transcriptomes

Author(s):  
Christian Escoto-Sandoval ◽  
Alan Flores-Díaz ◽  
M. Humberto Reyes-Valdés ◽  
Neftalí Ochoa-Alejo ◽  
Octavio Martínez

Abstract Background: Open data sharing is instrumental for the advancement of the biological sciences. Gene expression is the primary molecular phenotype, usually estimated through RNA-Seq experiments. Large-scale interpretation of RNA-Seq results is complicated by the extensive range of gene expression, as well as by the diversity of biological sources and experimental treatments. Here we present “Salsa”, a self-contained R package for extracting useful knowledge about gene expression during the development of chili pepper fruit. Methods and Results: Data from 168 RNA-Seq libraries, comprising more than 3.4 billion reads, were analyzed and curated to represent standardized expression profiles (SEPs) for all genes expressed during fruit development in 12 chili pepper accessions. The accessions include domesticated varieties, wild ancestors and crosses, covering a broad spectrum of genotypes. Data are organized relationally, and functions allow data mining from the level of single genes up to whole genomes, grouping profiles by different criteria. These include any combination of expression model, accession, protein description and gene ontology (GO) term, among others. GO enrichment analysis can also be performed on any set of genes. Conclusions: “Salsa” opens a wide range of possibilities for mining the transcriptome of chili pepper during fruit development.

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Christian Escoto-Sandoval ◽  
Alan Flores-Díaz ◽  
M. Humberto Reyes-Valdés ◽  
Neftalí Ochoa-Alejo ◽  
Octavio Martínez

Abstract RNA-Seq experiments allow genome-wide estimation of relative gene expression. Estimating gene expression at different time points generates time expression profiles of phenomena of interest, such as fruit development. However, such profiles can be complex to analyze and interpret. We developed a methodology that transforms original RNA-Seq data from time course experiments into standardized expression profiles, which can be easily interpreted and analyzed. To exemplify this methodology we used RNA-Seq data obtained from 12 accessions of chili pepper (Capsicum annuum L.) during fruit development. All relevant data, as well as functions to perform analyses and interpretations from this experiment, were gathered into a publicly available R package: “Salsa”. Here we explain the rationale of the methodology and exemplify the use of the package to obtain valuable insights into the multidimensional time expression changes that occur during chili pepper fruit development. We hope that this tool will be of interest to researchers studying fruit development in chili pepper as well as in other angiosperms.
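As a rough illustration of what a standardized expression profile might look like, the following Python sketch assumes that standardization means centering a gene's per-time-point mean expression at zero and scaling it to unit standard deviation. The abstract does not state the exact transformation, so this definition, the function name `standardize_profile` and the example values are all hypothetical, and the code is not taken from the “Salsa” package.

```python
from statistics import mean, stdev


def standardize_profile(expression):
    """Return a profile with mean 0 and sample standard deviation 1.

    `expression` holds one mean expression value per time point.
    This z-score definition is an assumption made for illustration.
    """
    if len(expression) < 2:
        raise ValueError("need at least two time points")
    m = mean(expression)
    s = stdev(expression)
    if s == 0:
        # A flat profile carries no shape information; map it to zeros.
        return [0.0] * len(expression)
    return [(x - m) / s for x in expression]


# Hypothetical mean expression of one gene at seven time points.
raw = [2.0, 4.0, 8.0, 16.0, 8.0, 4.0, 2.0]
sep = standardize_profile(raw)
```

Under this assumption, profiles from genes with very different absolute expression levels end up on a common scale, so their shapes over time (for example, when they peak) can be compared directly.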


Plants ◽  
2021 ◽  
Vol 10 (3) ◽  
pp. 585
Author(s):  
Octavio Martínez ◽  
Magda L. Arce-Rodríguez ◽  
Fernando Hernández-Godínez ◽  
Christian Escoto-Sandoval ◽  
Felipe Cervantes-Hernández ◽  
...  

Chili pepper (Capsicum spp.) is an important crop, as well as a model for fruit development studies and domestication. Here, we performed a time-course experiment to estimate standardized gene expression profiles with respect to fruit development for six domesticated and four wild chili pepper ancestors. We sampled the transcriptomes every 10 days from flowering to fruit maturity, and found that the mean standardized expression profiles for domesticated and wild accessions significantly differed. The mean standardized expression was higher and peaked earlier for domesticated vs. wild genotypes, particularly for genes involved in the cell cycle that ultimately control fruit size. We postulate that these gene expression changes are driven by selection pressures during domestication and show a robust network of cell cycle genes with a time shift in expression, which explains some of the differences between domesticated and wild phenotypes.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Christian Escoto-Sandoval ◽  
Neftalí Ochoa-Alejo ◽  
Octavio Martínez

Abstract Gene expression is the primary molecular phenotype and can be estimated in specific organs or tissues at particular times. Here we analyzed genome-wide inheritance of gene expression in fruits of chili pepper (Capsicum annuum L.) in reciprocal crosses between a domesticated and a wild accession, estimating this parameter during fruit development. We defined a general hierarchical schema to classify gene expression inheritance, which can be employed for any quantitative trait. We found that inheritance of gene expression is affected by both the time of fruit development and the direction of the cross, and propose that such variations could be common in many developmental processes. We conclude that classification of inheritance patterns is important for a better understanding of the mechanisms underlying gene expression regulation, and demonstrate that sets of genes with specific inheritance patterns at particular times of fruit development are enriched in different biological processes, molecular functions and cell components. All curated data and functions for analysis and visualization are publicly available as an R package.


2014 ◽  
Vol 11 (92) ◽  
pp. 20130950 ◽  
Author(s):  
Guini Hong ◽  
Wenjing Zhang ◽  
Hongdong Li ◽  
Xiaopei Shen ◽  
Zheng Guo

Two strategies are often adopted for enrichment analysis of pathways: the analysis of all differentially expressed (DE) genes together or the analysis of up- and downregulated genes separately. However, few studies have examined the rationales of these enrichment analysis strategies. Using both microarray and RNA-seq data, we show that gene pairs with functional links in pathways tended to have positively correlated expression levels, which could result in an imbalance between the up- and downregulated genes in particular pathways. We then show that the imbalance could greatly reduce the statistical power for finding disease-associated pathways through the analysis of all-DE genes. Further, using gene expression profiles from five types of tumours, we illustrate that the separate analysis of up- and downregulated genes could identify more pathways that are really pertinent to phenotypic difference. In conclusion, analysing up- and downregulated genes separately is more powerful than analysing all of the DE genes together.
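The effect of an up/down imbalance on enrichment power can be sketched numerically. The Python example below assumes a one-sided hypergeometric test for over-representation, which is a common choice but is not named in the abstract. The function name and all gene counts are hypothetical and chosen only to show the direction of the effect.

```python
from math import comb


def hypergeom_upper_tail(k, universe, pathway, selected):
    """P(X >= k) when `selected` genes are drawn from `universe` genes,
    of which `pathway` belong to the pathway of interest."""
    total = comb(universe, selected)
    hits = sum(
        comb(pathway, i) * comb(universe - pathway, selected - i)
        for i in range(k, min(pathway, selected) + 1)
    )
    return hits / total


# Hypothetical counts: 1000 measured genes, a 20-gene pathway,
# 100 upregulated and 100 downregulated DE genes.
# The pathway contains 8 upregulated and 0 downregulated DE genes.
p_all = hypergeom_upper_tail(8, 1000, 20, 200)   # all DE genes together
p_up = hypergeom_upper_tail(8, 1000, 20, 100)    # upregulated genes only
p_down = hypergeom_upper_tail(0, 1000, 20, 100)  # downregulated genes only
```

With these made-up counts, the pathway's eight hits are diluted when the 100 downregulated genes are pooled in, so `p_up` is smaller than `p_all`. This matches the direction of the effect described above, but the numbers themselves do not come from the study.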


2020 ◽  
Author(s):  
Octavio Martínez ◽  
Magda L. Arce-Rodríguez ◽  
Fernando Hernández-Godínez ◽  
Christian Escoto-Sandoval ◽  
Felipe Cervantes-Hernández ◽  
...  

Abstract Chili pepper (Capsicum spp.) is both an important crop and a model for domestication studies. Here we performed a time course experiment to estimate standardized gene expression profiles across fruit development for six domesticated and four wild chili pepper ancestors. We sampled the transcriptome every 10 days, from flower to fruit at 60 days after anthesis (DAA), and found that the mean standardized expression profile for domesticated and wild accessions significantly differed. The mean standardized expression was higher and peaked earlier for domesticated vs. wild genotypes, particularly for genes involved in the cell cycle that ultimately control fruit size. We postulate that these gene expression changes are driven by selection pressures during domestication and show a robust network of cell cycle genes with a time shift in expression, which explains some of the differences between domesticated and wild phenotypes.


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Yanlei Yue ◽  
Ze Jiang ◽  
Enoch Sapey ◽  
Tingting Wu ◽  
Shi Sun ◽  
...  

Abstract Background: In soybean, some circadian clock genes have been identified as loci for maturity traits. However, the effects of these genes on soybean circadian rhythmicity and their impacts on maturity are unclear. Results: We used two geographically, phenotypically and genetically distinct cultivars, conventional juvenile Zhonghuang 24 (with functional J/GmELF3a, a homolog of the indispensable circadian clock component EARLY FLOWERING 3) and long juvenile Huaxia 3 (with dysfunctional j/Gmelf3a), to dissect the soybean circadian clock with time-series transcriptomic RNA-Seq analysis of unifoliate leaves on a day scale. The results showed that several known circadian clock components, including RVE1, GI, LUX and TOC1, phase differently in soybean than in Arabidopsis, demonstrating that the soybean circadian clock is clearly different from the canonical model in Arabidopsis. In contrast to the observation that ELF3 dysfunction results in clock arrhythmia in Arabidopsis, the circadian clock is conserved in soybean regardless of the functional status of J/GmELF3a. Soybean exhibits circadian rhythmicity in both gene expression and alternative splicing. Genes can be grouped into six clusters, C1-C6, with different expression profiles. Many more genes are grouped into the night clusters (C4-C6) than into the day cluster (C2), showing that night is essential for gene expression and regulation. Moreover, soybean chromosomes are activated with a circadian rhythmicity, indicating that high-order chromosome structure might impact circadian rhythmicity. Interestingly, night time points were clustered in one group, while day time points were separated into two groups, morning and afternoon, demonstrating that morning and afternoon are representative of different environments for soybean growth and development. However, no genes were consistently differentially expressed over different time points, indicating that it is necessary to perform a circadian rhythmicity analysis to more thoroughly dissect the function of a gene. Moreover, the analysis of the circadian rhythmicity of the GmFT family showed that GmELF3a might phase- and amplitude-modulate the GmFT family to regulate the juvenility and maturity traits of soybean. Conclusions: These results and the resultant RNA-seq data should be helpful in understanding the soybean circadian clock and elucidating the connection between the circadian clock and soybean maturity.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Chunyan Li ◽  
Xiaoyun He ◽  
Zijun Zhang ◽  
Chunhuan Ren ◽  
Mingxing Chu

Abstract Background: Long noncoding RNAs (lncRNAs) have been identified as important regulators in the hypothalamic-pituitary-ovarian axis associated with sheep prolificacy. However, little is known of their expression patterns and potential roles in the pineal gland of sheep. Herein, RNA-Seq was used to detect transcriptome expression patterns in the pineal gland between the follicular phase (FP) and luteal phase (LP) in FecBBB (MM) and FecB++ (ww) STH sheep, respectively, and differentially expressed (DE) lncRNAs and mRNAs associated with reproduction were identified. Results: Overall, 135 DE lncRNAs and 1360 DE mRNAs in the pineal gland between MM and ww sheep were screened. Of these, 39 DE lncRNAs and 764 DE mRNAs were identified (FP vs LP) in MM sheep, and 96 DE lncRNAs and 596 DE mRNAs were identified (FP vs LP) in ww sheep. Moreover, GO and KEGG enrichment analysis indicated that the targets of DE lncRNAs and DE mRNAs were annotated to multiple biological processes such as phototransduction, circadian rhythm, melanogenesis, GSH metabolism and steroid biosynthesis, which directly or indirectly participate in hormone activities to affect sheep reproductive performance. Additionally, lncRNA-mRNA co-expression analysis and network construction were performed based on correlation analysis; the DE lncRNAs can modulate target genes involved in related pathways to affect sheep fecundity. Specifically, these included XLOC_466330, XLOC_532771 and XLOC_028449 targeting RRM2B and GSTK1, XLOC_391199 targeting STMN1, XLOC_503926 targeting RAG2, and XLOC_187711 targeting DLG4. Conclusion: All of these differential lncRNA and mRNA expression profiles in the pineal gland provide a novel resource for elucidating the regulatory mechanisms underlying STH sheep prolificacy.


BMC Cancer ◽  
2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Baojie Wu ◽  
Shuyi Xi

Abstract Background: This study aimed to explore and identify key genes and signaling pathways that contribute to the progression of cervical cancer to improve prognosis. Methods: Three gene expression profiles (GSE63514, GSE64217 and GSE138080) were screened and downloaded from the Gene Expression Omnibus database (GEO). Differentially expressed genes (DEGs) were screened using the GEO2R and Venn diagram tools. Then, Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed. Gene set enrichment analysis (GSEA) was performed to analyze the three gene expression profiles. Moreover, a protein–protein interaction (PPI) network of the DEGs was constructed, and functional enrichment analysis was performed. On this basis, hub genes from critical PPI subnetworks were explored with Cytoscape software. The expression of these genes in tumors was verified, and survival analysis of potential prognostic genes from critical subnetworks was conducted. Functional annotation, multiple gene comparison and dimensionality reduction in candidate genes indicated the clinical significance of potential targets. Results: A total of 476 DEGs were screened: 253 upregulated genes and 223 downregulated genes. DEGs were enriched in 22 biological processes, 16 cellular components and 9 molecular functions in precancerous lesions and cervical cancer. DEGs were mainly enriched in 10 KEGG pathways. Through intersection analysis and data mining, 3 key KEGG pathways and related core genes were revealed by GSEA. Moreover, a PPI network of 476 DEGs was constructed, hub genes from 12 critical subnetworks were explored, and a total of 14 potential molecular targets were obtained. Conclusions: These findings promote the understanding of the molecular mechanism of and clinically related molecular targets for cervical cancer.

