The knowledge of chromosomal sex is important for large-scale analysis of gene expression

The aim of the study was to determine fetal sex in gene expression data lacking this information, using the expression of Y-linked genes, and to elucidate the differences in sex-chromosome-linked gene expression between placental samples with XX and XY genotypes during pregnancy. We detected 27 differentially expressed sex-chromosome-linked genes. We showed that, in most cases, X-chromosome genes are expressed at higher levels in pregnancies with a female fetus than in pregnancies with a male fetus, but there are exceptions to this pattern, which must be taken into account in large-scale studies of gene expression. The direction of the difference in gene expression between pregnancies with female and male fetuses (positive or negative) persists throughout pregnancy, but the magnitude of the difference may remain unchanged or decrease from the first to the third trimester. Taking sexual dimorphism into account when analyzing large-scale gene expression data between trimesters of pregnancy increases the number of differentially expressed genes detected, which improves the informative value of the study and is important for elucidating the pathogenesis of pregnancy complications associated with placental dysfunction.
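As a rough illustration of calling fetal sex from Y-linked gene expression, the sketch below labels a sample XY when the mean expression of a few Y-linked genes exceeds a cutoff. The gene panel (RPS4Y1, DDX3Y, KDM5D), the threshold, the function name, and the toy values are all assumptions made for illustration; the abstract does not state which genes or cutoff the study used.

```python
from statistics import mean

# Assumed panel of Y-linked genes; not taken from the study.
Y_LINKED_GENES = ("RPS4Y1", "DDX3Y", "KDM5D")


def infer_fetal_sex(sample, y_genes=Y_LINKED_GENES, threshold=1.0):
    """Label a sample "XY" if the mean expression of the measured
    Y-linked genes exceeds `threshold` (a hypothetical cutoff), else "XX"."""
    values = [sample[g] for g in y_genes if g in sample]
    if not values:
        raise ValueError("no Y-linked genes measured in this sample")
    return "XY" if mean(values) > threshold else "XX"


# Toy log-scale expression values (made up for the example).
samples = {
    "placenta_a": {"RPS4Y1": 8.2, "DDX3Y": 6.9, "KDM5D": 5.4, "XIST": 0.3},
    "placenta_b": {"RPS4Y1": 0.1, "DDX3Y": 0.0, "KDM5D": 0.2, "XIST": 9.1},
}
calls = {name: infer_fetal_sex(expr) for name, expr in samples.items()}
```

In practice the cutoff would need to be chosen per platform, for example from the bimodal distribution of Y-linked expression across all samples.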


Consensus Clustering for Cancer Gene Expression Data - Large-Scale Analysis using Evidence Accumulation Approach

Proceedings of the 10th International Joint Conference on Biomedical Engineering Systems and Technologies ◽

10.5220/0006174501760183 ◽

2017 ◽

Author(s):

Isidora Šašić ◽

Sanja Brdar ◽

Tatjana Lončar-Turukalo ◽

Helena Aidos ◽

Ana Fred

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Large Scale ◽

Consensus Clustering ◽

Cancer Gene ◽

Expression Data ◽

Scale Analysis ◽

Evidence Accumulation ◽

Large Scale Analysis


Identification of metagenes and their Interactions through Large-scale Analysis of Arabidopsis Gene Expression Data

BMC Genomics ◽

10.1186/1471-2164-13-237 ◽

2012 ◽

Vol 13 (1) ◽

pp. 237 ◽

Cited By ~ 6

Author(s):

Tyler J Wilson ◽

Liming Lai ◽

Yuguang Ban ◽

Steven X Ge

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Large Scale ◽

Expression Data ◽

Scale Analysis ◽

Arabidopsis Gene ◽

Large Scale Analysis


Large-Scale Analysis of Gene Expression Data Reveals a Novel Gene Expression Signature Associated with Colorectal Cancer Distant Recurrence

PLoS ONE ◽

10.1371/journal.pone.0167455 ◽

2016 ◽

Vol 11 (12) ◽

pp. e0167455 ◽

Cited By ~ 10

Author(s):

Nehad M. Alajez

Keyword(s):

Gene Expression ◽

Colorectal Cancer ◽

Gene Expression Data ◽

Large Scale ◽

Gene Expression Signature ◽

Distant Recurrence ◽

Expression Data ◽

Scale Analysis ◽

Expression Signature ◽

Large Scale Analysis


Gene Set Correlation Analysis and Visualization Using Gene Expression Data

Current Bioinformatics ◽

10.2174/1574893615999200629124444 ◽

2020 ◽

Vol 15 ◽

Author(s):

Chen-An Tsai ◽

James J. Chen

Keyword(s):

Gene Expression ◽

Correlation Analysis ◽

Gene Expression Data ◽

Differentially Expressed Gene ◽

Differentially Expressed ◽

Superior Performance ◽

Expression Data ◽

Gene Set ◽

Gene Sets ◽

Set Correlation

Background: Gene set enrichment analyses (GSEA) provide a useful and powerful approach to identify differentially expressed gene sets with prior biological knowledge. Several GSEA algorithms have been proposed to perform enrichment analyses on groups of genes. However, many of these algorithms have focused on identification of differentially expressed gene sets in a given phenotype. Objective: In this paper, we propose a gene set analytic framework, Gene Set Correlation Analysis (GSCoA), that simultaneously measures within- and between-gene-set variation to identify sets of genes enriched for differential expression and highly co-related pathways. Methods: We apply co-inertia analysis to cross-gene-set comparisons in gene expression data to measure the co-structure of expression profiles in pairs of gene sets. Co-inertia analysis (CIA) is a multivariate method for identifying trends or co-relationships in multiple datasets that contain the same samples. The objective of CIA is to seek ordinations (dimension reduction diagrams) of two gene sets such that the squared covariance between the projections of the gene sets on successive axes is maximized. Simulation studies illustrate that CIA offers superior performance in identifying co-relationships between gene sets in all simulation settings when compared to correlation-based gene set methods. Results and Conclusion: We also combine between-gene-set CIA and GSEA to discover the relationships between gene sets significantly associated with phenotypes. In addition, we provide a graphical technique for visualizing and simultaneously exploring the associations between and within gene sets and their interaction and network. We then demonstrate integration of within- and between-gene-set variation using CIA and GSEA, applied to the p53 gene expression data using the c2 curated gene sets. Ultimately, the GSCoA approach provides an attractive tool for identification and visualization of novel associations between pairs of gene sets by integrating co-relationships between gene sets into gene set analysis.
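The CIA objective stated in this abstract (maximizing the squared covariance between projections of two gene sets measured on the same samples) can be illustrated with a minimal sketch. It assumes uniform sample weights and plain centering, in which case the first pair of axes are the leading singular vectors of the cross-covariance matrix, found here by power iteration. The function names and toy data are hypothetical, and this is not the GSCoA implementation.

```python
import math


def center(table):
    """Center each column (gene) of a samples-by-genes table."""
    n, p = len(table), len(table[0])
    means = [sum(row[j] for row in table) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in table]


def cross_covariance(x, y):
    """Cross-covariance between the genes of x and y (same samples, uniform weights)."""
    xc, yc = center(x), center(y)
    n = len(xc)
    return [[sum(xc[i][a] * yc[i][b] for i in range(n)) / n
             for b in range(len(yc[0]))] for a in range(len(xc[0]))]


def normalize(vec):
    norm = math.sqrt(sum(t * t for t in vec))
    if norm == 0.0:
        raise ValueError("degenerate start vector or zero cross-covariance")
    return [t / norm for t in vec]


def first_coinertia_axes(x, y, iterations=200):
    """Return unit vectors u, v maximizing cov(Xu, Yv)^2, and that covariance."""
    c = cross_covariance(x, y)
    p, q = len(c), len(c[0])
    u = [1.0 / math.sqrt(p)] * p
    for _ in range(iterations):
        v = normalize([sum(c[a][b] * u[a] for a in range(p)) for b in range(q)])
        u = normalize([sum(c[a][b] * v[b] for b in range(q)) for a in range(p)])
    cov = sum(u[a] * c[a][b] * v[b] for a in range(p) for b in range(q))
    return u, v, cov


# Toy data: 4 samples, two gene sets with 2 genes each (made-up values).
gene_set_x = [[1, 2], [2, 1], [3, 2], [4, 1]]
gene_set_y = [[2, 0], [4, 1], [6, 0], [8, 1]]
u, v, cov = first_coinertia_axes(gene_set_x, gene_set_y)
```

The returned covariance is never smaller in magnitude than any single entry of the cross-covariance matrix, which is the sense in which the projection summarizes the shared structure better than any one gene pair.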


Graph Convolutional Network for Drug Response Prediction Using Gene Expression Data

Mathematics ◽

10.3390/math9070772 ◽

2021 ◽

Vol 9 (7) ◽

pp. 772

Author(s):

Seonghun Kim ◽

Seockhun Bae ◽

Yinhua Piao ◽

Kyuri Jo

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Large Scale ◽

Drug Response ◽

Response Prediction ◽

Biological Data ◽

Expression Data ◽

Convolutional Network ◽

Essential Information ◽

Protein Protein Interaction

Genomic profiles of cancer patients, such as gene expression, have become a major source for predicting responses to drugs in the era of personalized medicine. As large-scale drug screening data with cancer cell lines are available, a number of computational methods have been developed for drug response prediction. However, few methods incorporate both gene expression data and the biological network, which can harbor essential information about the underlying process of the drug response. We proposed an analysis framework called DrugGCN for prediction of drug response using a Graph Convolutional Network (GCN). DrugGCN first generates a gene graph by combining a Protein-Protein Interaction (PPI) network and gene expression data with feature selection of drug-related genes, and the GCN model then detects local features, such as subnetworks of genes that contribute to the drug response, by localized filtering. We demonstrated the effectiveness of DrugGCN using biological data, showing its high prediction accuracy among the competing methods.
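To make the idea of filtering expression values over a gene graph concrete, here is a minimal sketch of one graph-convolution step, ReLU(D^-1/2 (A + I) D^-1/2 H W), on a three-gene toy graph. This widely used propagation rule is swapped in as an assumption; the abstract does not specify DrugGCN's exact filter, and the function names, graph, and weights below are hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a symmetric 0/1 adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(A_norm @ features @ weights)."""
    propagated = matmul(matmul(normalized_adjacency(adj), features), weights)
    return [[max(0.0, value) for value in row] for row in propagated]


# Toy gene graph: a path gene0 - gene1 - gene2 (hypothetical PPI edges).
ppi = [[0, 1, 0],
       [1, 0, 1],
       [0, 1, 0]]
expression = [[1.0], [2.0], [3.0]]  # one expression feature per gene
weights = [[1.0]]                   # fixed weight in place of a learned one
hidden = gcn_layer(ppi, expression, weights)
```

Each gene's output mixes its own expression with that of its graph neighbours, which is how local subnetworks can influence the prediction.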


Gene Discovery Methods from Large-Scale Gene Expression Data

Quantum Bio-Informatics III ◽

10.1142/9789814304061_0040 ◽

2010 ◽

Author(s):

Akifumi Shimizu ◽

Kentaro Yano

Keyword(s):

Gene Expression ◽

Gene Expression Data ◽

Large Scale ◽

Gene Discovery ◽

Expression Data


LSTrAP-Crowd: Prediction of novel components of bacterial ribosomes with crowd-sourced analysis of RNA sequencing data

10.1101/2020.04.20.005249 ◽

2020 ◽

Author(s):

Benedict Hew ◽

Qiao Wen Tan ◽

William Goh ◽

Jonathan Wei Xiong Ng ◽

Kenny Koh ◽

...

Keyword(s):

Gene Expression ◽

Protein Synthesis ◽

Rna Sequencing ◽

Gene Expression Data ◽

Large Scale ◽

Bacterial Resistance ◽

Expression Data ◽

Sequencing Data ◽

Novel Proteins ◽

Novel Antibiotics

Bacterial resistance to antibiotics is a growing problem that is projected to cause more deaths than cancer in 2050. Consequently, novel antibiotics are urgently needed. Since more than half of the available antibiotics target the bacterial ribosomes, proteins that are involved in protein synthesis are prime targets for the development of novel antibiotics. However, experimental identification of these potential antibiotic target proteins can be labor-intensive and challenging, as these proteins are likely to be poorly characterized and specific to few bacteria. In order to identify these novel proteins, we established a Large-Scale Transcriptomic Analysis Pipeline in Crowd (LSTrAP-Crowd), where 285 individuals processed 26 terabytes of RNA-sequencing data of the 17 most notorious bacterial pathogens. In total, the crowd processed 26,269 RNA-seq experiments and used the data to construct gene co-expression networks, which were used to identify more than a hundred uncharacterized genes that were transcriptionally associated with protein synthesis. We provide the identity of these genes together with the processed gene expression data. The data can be used to identify other vulnerabilities of bacteria, while our approach demonstrates how the processing of gene expression data can be easily crowdsourced.
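The co-expression step can be sketched as follows: genes whose expression profiles correlate strongly across experiments are linked, so an uncharacterized gene connected to a known protein-synthesis gene becomes a candidate. The sketch assumes Pearson correlation with a hypothetical cutoff of 0.9; the abstract does not state the similarity measure or threshold used by LSTrAP-Crowd, and the gene labels ("unk1" in particular) and values are made up.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def coexpression_edges(profiles, cutoff=0.9):
    """Return gene pairs whose profiles correlate above `cutoff` (assumed value)."""
    return [(g1, g2) for g1, g2 in combinations(sorted(profiles), 2)
            if pearson(profiles[g1], profiles[g2]) > cutoff]


# Toy expression profiles across four experiments (illustrative only).
profiles = {
    "rpsA": [1.0, 2.0, 3.0, 4.0],   # ribosomal protein gene
    "unk1": [2.0, 4.0, 6.0, 8.5],   # hypothetical uncharacterized gene
    "lacZ": [4.0, 1.0, 3.0, 2.0],   # unrelated profile
}
edges = coexpression_edges(profiles)
```

Here only the hypothetical "unk1" is linked to "rpsA", mirroring how guilt by association flags uncharacterized genes for follow-up.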
