Multiobjective triclustering of time-series transcriptome data reveals key genes of biological processes

2015 ◽  
Vol 16 (1) ◽  
Author(s):  
Anirban Bhar ◽  
Martin Haubrock ◽  
Anirban Mukhopadhyay ◽  
Edgar Wingender
2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Kyuri Jo ◽  
Inyoung Sung ◽  
Dohoon Lee ◽  
Hyuksoon Jang ◽  
Sun Kim

Abstract Cellular stages of biological processes have been characterized using fluorescence-activated cell sorting and genetic perturbations, charting a limited landscape of cellular states. Time series transcriptome data can help define new cellular states at the molecular level since the analysis of transcriptional changes can provide information on cell states and transitions. However, existing methods for inferring cell states from transcriptome data use additional information such as prior knowledge on cell types or cell-type-specific markers to reduce the complexity of data. In this study, we present a novel time series clustering framework to infer TRAnscriptomic Cellular States (TRACS) only from time series transcriptome data by integrating Gaussian process regression, shape-based distance, and the ranked pairs algorithm in a single computational framework. TRACS determines patterns that correspond to hidden cellular states by clustering gene expression data. TRACS was used to analyse single-cell and bulk RNA sequencing data and successfully generated cluster networks that reflected the characteristics of key stages of biological processes. Thus, TRACS has the potential to help reveal unknown cellular states and transitions at the molecular level using only time series transcriptome data. TRACS is implemented in Python and available at http://github.com/BML-cbnu/TRACS/.
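
The abstract above names shape-based distance as one ingredient of the clustering step. As a rough illustration only, the sketch below computes a shape-based distance in its commonly used form: one minus the maximum normalized cross-correlation over all shifts. It is not taken from the TRACS code base; the function name `shape_based_distance` and the toy expression profiles are assumptions made for this example.

```python
import math


def shape_based_distance(x, y):
    """Return 1 - max normalized cross-correlation over all shifts.

    Hypothetical helper for illustration; assumes two equal-length
    numeric sequences (for example, smoothed expression profiles).
    """
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    n = len(x)
    # Normalizing constant: product of the two Euclidean norms.
    norm = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    if norm == 0:
        # Convention chosen here: an all-zero series is uncorrelated.
        return 1.0
    best = -math.inf
    # Slide y against x; positions that fall outside are treated as zero.
    for shift in range(-(n - 1), n):
        cc = 0.0
        for i in range(n):
            j = i + shift
            if 0 <= j < n:
                cc += x[i] * y[j]
        best = max(best, cc / norm)
    return 1.0 - best


# Toy profiles (made up): the same peak, delayed by one time point.
early_peak = [0, 1, 2, 1, 0, 0]
late_peak = [0, 0, 1, 2, 1, 0]
distance = shape_based_distance(early_peak, late_peak)
```

Because the measure searches over shifts, two profiles with the same shape but different timing come out close together, which is the property that makes it attractive for time series clustering. The result lies between 0 and 2.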


2021 ◽  
Author(s):  
Santiago Gassó ◽  
Pawan Gupta ◽  
Paul Ginoux ◽  
Robert Levy

Aerosol transport processes in the Southern Hemisphere (SH) have been the center of renewed attention in the last two decades because of a number of major geophysical events such as volcanic eruptions (Chile and Argentina), biomass burning (Australia and Chile) and dust storms (Australia and Argentina). While volcanic and fire activity in the SH have been the focus of several studies, there is a dearth of satellite assessments of dust activity. The lack of such analysis impairs the understanding of biological processes in the Southern Ocean and of the provenance of dust found in snow at the surface of East Antarctica. This presentation will show an analysis of time series of Aerosol Optical Depths over the Patagonia desert in South America. Data from two aerosol algorithms (Dark Target and Deep Blue) will be jointly analyzed to establish a timeline of dust activity in the region. Also, dust proxies from both algorithms will be compared with ground-based observations of visibility at different airports in the area. Once an understanding of frequency and time evolution of the dust activity is achieved, first estimations of ocean-going dust fluxes will be derived.


PeerJ ◽  
2021 ◽  
Vol 9 ◽  
pp. e11321
Author(s):  
Di Zhang ◽  
Pengguang Yan ◽  
Taotao Han ◽  
Xiaoyun Cheng ◽  
Jingnan Li

Background Ulcerative colitis-associated colorectal cancer (UC-CRC) is a life-threatening complication of ulcerative colitis (UC). The mechanisms underlying UC-CRC remain to be elucidated. The purpose of this study was to explore the key genes and biological processes contributing to colitis-associated dysplasia (CAD) or carcinogenesis in UC via database mining, thus offering opportunities for early prediction and intervention of UC-CRC. Methods Microarray datasets (GSE47908 and GSE87466) were downloaded from Gene Expression Omnibus (GEO). Differentially expressed genes (DEGs) between groups of GSE47908 were identified using the “limma” R package. Weighted gene co-expression network analysis (WGCNA) based on DEGs between the CAD and control groups was conducted subsequently. Functional enrichment analysis was performed, and hub genes of selected modules were identified using the “clusterProfiler” R package. Single-gene gene set enrichment analysis (GSEA) was conducted to predict significant biological processes and pathways associated with the specified gene. Results Six functional modules were identified based on 4929 DEGs. Green and blue modules were selected because of their consistent correlation with UC and CAD, and the highest correlation coefficient with the progress of UC-associated carcinogenesis. Functional enrichment analysis revealed that genes of these two modules were significantly enriched in biological processes, including mitochondrial dysfunction, cell-cell junction, and immune responses. However, GSEA based on differential expression analysis between sporadic colorectal cancer (CRC) and normal controls from The Cancer Genome Atlas (TCGA) indicated that mitochondrial dysfunction may not be the major carcinogenic mechanism underlying sporadic CRC. Thirteen hub genes (SLC25A3, ACO2, AIFM1, ATP5A1, DLD, TFE3, UQCRC1, ADIPOR2, SLC35D1, TOR1AIP1, PRR5L, ATOX1, and DTX3) were identified. Their expression trends were validated in UC patients of GSE87466, and their potential carcinogenic effects in UC were supported by their known functions and other relevant studies reported in the literature. Single-gene GSEA indicated that biological processes and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways related to angiogenesis and immune response were positively correlated with the upregulation of TFE3, whereas those related to mitochondrial function and energy metabolism were negatively correlated with the upregulation of TFE3. Conclusions Using WGCNA, this study found two gene modules that were significantly correlated with CAD, of which 13 hub genes were identified as the potential key genes. The critical biological processes in which the genes of these two modules were significantly enriched include mitochondrial dysfunction, cell-cell junction, and immune responses. TFE3, a transcription factor related to mitochondrial function and cancers, may play a central role in UC-associated carcinogenesis.
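
The abstract above relies on weighted gene co-expression network analysis and hub genes. As a minimal sketch of the general idea, assuming an unsigned network in which adjacency is the absolute Pearson correlation raised to a soft-threshold power, the Python below builds the adjacency and ranks genes by connectivity. It does not reproduce the study's R workflow; the gene names, expression values, `beta=6`, and all function names are illustrative assumptions.

```python
import math


def pearson(u, v):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sd_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    sd_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    if sd_u == 0 or sd_v == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / (sd_u * sd_v)


def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(x_i, x_j)| ** beta (beta is assumed)."""
    genes = list(expr)
    return {
        g: {
            h: 1.0 if g == h else abs(pearson(expr[g], expr[h])) ** beta
            for h in genes
        }
        for g in genes
    }


def connectivity(adj):
    """Sum of each gene's adjacencies to all other genes."""
    return {
        g: sum(w for h, w in row.items() if h != g)
        for g, row in adj.items()
    }


# Made-up expression values for three hypothetical genes across four samples.
toy_expr = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 1, 3, 2],
}
adj = soft_threshold_adjacency(toy_expr)
k = connectivity(adj)
# Genes with the highest connectivity are hub candidates in this toy network.
ranked = sorted(k, key=k.get, reverse=True)
```

Raising the correlation to a power suppresses weak correlations while keeping strong ones near 1, so connectivity is dominated by tightly co-expressed partners.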


2015 ◽  
Vol 42 (10) ◽  
pp. 1485-1485
Author(s):  
Xun Xia ◽  
Bo Qu ◽  
Yuan Ma ◽  
Li-bin Yang ◽  
Hai-dong Huang ◽  
...  

2021 ◽  
Vol 12 ◽  
Author(s):  
Juan Chen ◽  
Ruixian Zhang ◽  
Min Xie ◽  
Chunyan Luan ◽  
Xiaolan Li

Dermatomyositis (DM), an inflammatory disorder, is often associated with interstitial lung disease (ILD). However, the underlying mechanism remains unclear. Our study performed RNA sequencing (RNA-seq) and integrative bioinformatics analysis of differentially expressed genes (DEGs) in patients with dermatomyositis-associated interstitial lung disease (DM-ILD) and healthy controls. A total of 2,018 DEGs were identified between DM-ILD and healthy blood samples. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis showed that DEGs were mainly involved in immune- and inflammatory-related biological processes and pathways. Disease ontology (DO) enrichment analysis identified 35 candidate key genes involved in both skin and lung diseases. Meanwhile, a total of 886 differentially expressed alternative splicing (AS) events were found between DM-ILD and healthy blood samples. After overlapping DEGs with differential AS genes, the plasminogen activator and urokinase receptor (PLAUR) involved in immune-related biological processes and complement and coagulation cascades was screened and identified as the most important gene associated with DM-ILD. The protein–protein interaction (PPI) network revealed that PLAUR had interactions with multiple candidate key genes. Moreover, we observed that there were significantly more neutrophils and fewer naive B cells in DM-ILD samples than in healthy samples, and the expression of PLAUR was significantly positively correlated with the abundance of neutrophils. A significantly higher abundance of PLAUR in DM-ILD patients than in healthy controls was validated by RT-qPCR. In conclusion, we identified PLAUR as an important player in regulating DM-ILD by neutrophil-associated immune response. These findings enrich our understanding, which may benefit DM-ILD patients.


2022 ◽  
Author(s):  
Chao Duan ◽  
Feng-Hua Tian ◽  
Lan Yao ◽  
Jian-Hua Lv ◽  
Chuan-Wen Jia ◽  
...  

Abstract In order to explore the molecular mechanism of the Sarcomyxa edulis response to lignocellulose degradation, the developmental transcriptomes were analyzed for six stages covering the whole developmental process, including mycelium growing to half bag (B1), mycelium in cold stimulation after full bag (B2), mycelium in primordia appearing (B3), primordia (B4), mycelium at the harvest stage (B5) and mature fruiting body (B6). A total of 6 samples were used for transcriptome sequencing, with three biological replicates. Based on the above transcriptome data, we constructed a co-expression network of weighted genes associated with extracellular enzyme physiological traits by WGCNA, and obtained 19 gene co-expression modules closely related to lignocellulose degradation. In addition, a number of key genes involved in lignocellulose degradation pathways were discovered from the four modules with the highest correlation with target traits. These results provide clues for further study on the molecular genetic mechanisms of Sarcomyxa edulis lignocellulose degradation.


2018 ◽  
Vol 70 (4) ◽  
pp. 629-637 ◽  
Author(s):  
Yong Zhou ◽  
Lingli Ge ◽  
Guanghua Li ◽  
Lunwei Jiang ◽  
Yingui Yang

The growth regulating factor (GRF) family is a conserved class of transcription factors involved in various biological processes in plants. However, there have been only a few studies of the GRF family genes in cucumber, Cucumis sativus (Cs). In this study, we identified and characterized 8 CsGRF genes in cucumber. Two highly conserved domains, QLQ and WRC, were identified to be present in all CsGRF proteins. In addition, three less conserved domains (FFD, TQL, and GGPL) were also detected in some CsGRF members. Based on phylogenetic analysis, the GRF genes from cucumber, Arabidopsis, tomato, rice and maize could be classified into 10 groups, and CsGRFs were clustered closer with the GRF genes from dicots (Arabidopsis and tomato) than with those from monocots (rice and maize). Promoter analysis revealed that the CsGRF genes were involved in cucumber growth and development as well as in responses to various hormones and stresses. Transcriptome data showed that the CsGRF genes have distinct expression patterns in different tissues, especially in ovaries and leaves. Expression profiling analysis indicated that all CsGRF genes were responsive to salt and drought stress treatments. These results demonstrate that the cucumber GRF gene family may function in organ development and plant stress responses.


2022 ◽  
Author(s):  
Yonas I. Tekle ◽  
Fang Wang ◽  
Hanh Tran ◽  
T. Danielle Hayes ◽  
Joseph F. Ryan

Abstract To date, genomic analyses in amoebozoans have been mostly limited to model organisms or medically important lineages. Consequently, the vast diversity of Amoebozoa genomes remains unexplored. A draft genome of Cochliopodium minus, an amoeba characterized by extensive cellular and nuclear fusions, is presented. C. minus has been a subject of recent investigation for its unusual sexual behavior. Cochliopodium’s sexual activity occurs during the vegetative stage, making it an ideal model for studying sexual development, which is sorely lacking in the group. Here we generate a C. minus draft genome assembly. From this genome, we detect a substantial number of lateral gene transfer (LGT) instances from bacteria (15%), archaea (0.9%) and viruses (0.7%), the majority of which are detected in our transcriptome data. We identify the complete meiosis toolkit genes in the C. minus genome, as well as the absence of several key genes involved in plasmogamy and karyogamy. Comparative genomics of amoebozoans reveals that variation in sexual mechanisms exists in the group. Similar to complex eukaryotes, C. minus (some amoebae) possesses Tyrosine kinases and duplicate copies of SPO11. We report a first example of alternative splicing in a key meiosis gene and draw important insights on the molecular mechanism of sex in C. minus using genomic and transcriptomic data.


BMC Genomics ◽  
2019 ◽  
Vol 20 (S11) ◽  
Author(s):  
Dongwon Kang ◽  
Hongryul Ahn ◽  
Sangseon Lee ◽  
Chai-Jin Lee ◽  
Jihye Hur ◽  
...  

Abstract Background Recently, a number of studies have been conducted to investigate how plants respond to stress at the cellular molecular level by measuring gene expression profiles over time. As a result, a set of time-series gene expression data for the stress response is available in databases. With the data, an integrated analysis of multiple stresses is possible, which identifies stress-responsive genes with higher specificity because considering multiple stresses can capture the effect of interference between stresses. To analyze such data, a machine learning model needs to be built. Results In this study, we developed StressGenePred, a neural network-based machine learning method, to integrate time-series transcriptome data of multiple stress types. StressGenePred is designed to detect single stress-specific biomarker genes by using a simple feature embedding method, a twin neural network model, and Confident Multiple Choice Learning (CMCL) loss. The twin neural network model consists of a biomarker gene discovery model and a stress type prediction model that share the same logical layer to reduce training complexity. The CMCL loss is used to make the twin model select biomarker genes that respond specifically to a single stress. In experiments using Arabidopsis gene expression data for four major environmental stresses (heat, cold, salt, and drought), StressGenePred classified the types of stress more accurately than the limma feature embedding method and the support vector machine and random forest classification methods. In addition, StressGenePred discovered known stress-related genes with higher specificity than the Fisher method. Conclusions StressGenePred is a machine learning method for identifying stress-related genes and predicting stress types for an integrated analysis of multiple stress time-series transcriptome data. This method can be applied to other phenotype-gene association studies.

