Visual Comparison of Multiple Gene Expression Datasets in a Genomic Context

2008 ◽  
Vol 5 (2) ◽  
Author(s):  
Krzysztof Borowski ◽  
Jung Soh ◽  
Christoph W. Sensen

Summary: The need for novel methods of visualizing microarray data is growing. New perspectives help in finding patterns in expression data. The Bluejay genome browser provides an integrative way of visualizing gene expression datasets in a genomic context. We have now developed the functionality to display multiple microarray datasets simultaneously in Bluejay, in order to provide researchers with a comprehensive view of their datasets linked to a graphical representation of gene function. This will enable biologists to obtain valuable insights into expression patterns, by allowing them to analyze the expression values in relation to the gene locations as well as to compare expression profiles of related genomes or of different experiments for the same genome.

Author(s):  
Crescenzio Gallo

The possible applications of modeling and simulation in the field of bioinformatics are very extensive, ranging from understanding basic metabolic pathways to exploring genetic variability. Experiments carried out with DNA microarrays allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. In this chapter, the authors examine various methods for analyzing gene expression data, addressing the important topics of (1) selecting the most differentially expressed genes, (2) grouping them by means of their relationships, and (3) classifying samples based on gene expression.
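As an illustration of topic (1), the following is a minimal sketch of ranking genes by a two-group Welch t-statistic. The chapter abstract does not say which test the authors use, so the choice of statistic, the function names, and the toy expression values are all assumptions made for this example.

```python
import math
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t-statistic for two samples with possibly unequal variances.

    Assumes each sample has at least two values and that the two samples
    are not both constant (otherwise the standard error is zero).
    """
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se

def top_differential_genes(expr, group_a, group_b, k):
    """Rank genes by |t| between two groups of sample indices; return the top k names."""
    scores = {}
    for gene, values in expr.items():
        a = [values[i] for i in group_a]
        b = [values[i] for i in group_b]
        scores[gene] = abs(welch_t(a, b))
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical toy data: three control samples followed by three treated samples.
expr = {
    "geneA": [1.0, 1.1, 0.9, 3.0, 3.2, 2.9],
    "geneB": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "geneC": [5.0, 4.8, 5.1, 4.0, 4.1, 3.9],
}
top = top_differential_genes(expr, [0, 1, 2], [3, 4, 5], k=2)
print(top)
```

In practice a ranking like this would be followed by a multiple-testing correction, since thousands of genes are tested at once.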


2005 ◽  
Vol 14 (05) ◽  
pp. 771-789 ◽  
Author(s):  
JIONG YANG ◽  
HAIXUN WANG ◽  
WEI WANG ◽  
PHILIP S. YU

Microarrays are one of the latest breakthroughs in experimental molecular biology. They provide a powerful tool for monitoring the expression patterns of thousands of genes simultaneously and are already producing huge amounts of valuable data. The concept of a bicluster was introduced by Cheng and Church [1] to capture the coherence of a subset of genes and a subset of conditions. A set of heuristic algorithms was also designed to find either one bicluster or a set of biclusters; these consist of iterations of masking null values and discovered biclusters, coarse and fine node deletion, node addition, and the inclusion of inverted data. These heuristics inevitably suffer from serious drawbacks. The masking of null values and discovered biclusters with random numbers may result in the phenomenon of random interference, which in turn impacts the discovery of high-quality biclusters. To address this issue and to further accelerate the biclustering process, we generalize the model of the bicluster to incorporate null values and propose a probabilistic algorithm (FLOC) that can discover a set of k possibly overlapping biclusters simultaneously. Furthermore, this algorithm can easily be extended to support additional features that suit different requirements at very little cost. An experimental study on the yeast gene expression data [2] shows that the FLOC algorithm can offer substantial improvements over the previously proposed algorithm.
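The coherence score that Cheng and Church attach to a bicluster is the mean squared residue: for a submatrix with row set I and column set J, each entry contributes (a_ij - a_iJ - a_Ij + a_IJ)^2, where a_iJ, a_Ij and a_IJ are the row, column and overall means. The sketch below computes this score while skipping null entries instead of masking them with random numbers, in the spirit of the generalization described above. It is not the FLOC algorithm itself; the function name and the use of `None` for null values are assumptions made for this example.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix rows x cols.

    Null values are represented by None and are ignored: all means and the
    final average are taken over the specified (non-null) entries only.
    """
    cells = {(i, j): matrix[i][j]
             for i in rows for j in cols
             if matrix[i][j] is not None}
    if not cells:
        raise ValueError("bicluster has no specified entries")

    # Collect the specified values per row and per column.
    row_vals, col_vals = {}, {}
    for (i, j), v in cells.items():
        row_vals.setdefault(i, []).append(v)
        col_vals.setdefault(j, []).append(v)

    row_mean = {i: sum(v) / len(v) for i, v in row_vals.items()}
    col_mean = {j: sum(v) / len(v) for j, v in col_vals.items()}
    all_mean = sum(cells.values()) / len(cells)

    residues = ((v - row_mean[i] - col_mean[j] + all_mean) ** 2
                for (i, j), v in cells.items())
    return sum(residues) / len(cells)

# A perfectly additive (coherent) pattern has a residue of zero.
coherent = [[1, 2, 3],
            [2, 3, 4],
            [4, 5, 6]]
# An incoherent 2 x 2 pattern has a positive residue.
incoherent = [[1, 2],
              [2, 1]]
# A matrix with one null entry is scored over its five specified entries.
with_null = [[1, 2, None],
             [2, 3, 4]]

msr_coherent = mean_squared_residue(coherent, [0, 1, 2], [0, 1, 2])
msr_incoherent = mean_squared_residue(incoherent, [0, 1], [0, 1])
msr_with_null = mean_squared_residue(with_null, [0, 1], [0, 1, 2])
```

A lower score means the selected genes vary more coherently across the selected conditions; a biclustering search typically tries to keep this score under a threshold while making the bicluster as large as possible.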


2016 ◽  
Author(s):  
Yanhui Hu ◽  
Aram Comjean ◽  
Norbert Perrimon ◽  
Stephanie Mohr

Abstract. Background: Next-generation sequencing technologies have greatly increased our ability to identify gene expression levels, including at specific developmental stages and in specific tissues. Gene expression data can help researchers understand the diverse functions of genes and gene networks, as well as help in the design of specific and efficient functional studies, such as by helping researchers choose the most appropriate tissue for a study of a group of genes, or conversely, by limiting a long list of gene candidates to the subset that are normally expressed at a given stage or in a given tissue. Results: We report a Drosophila Gene Expression Tool (DGET, www.flyrnai.org/tools/dget/web/), which stores and facilitates search of RNA-Seq based expression profiles available from the modENCODE consortium and other public data sets. Using DGET, researchers are able to look up gene expression profiles, filter results based on threshold expression values, and compare expression data across different developmental stages, tissues and treatments. In addition, at DGET a researcher can analyze tissue or stage-specific enrichment for an inputted list of genes (e.g. ‘hits’ from a screen) and search for additional genes with similar expression patterns. We performed a number of analyses to demonstrate the quality and robustness of the resource. In particular, we show that evolutionarily conserved genes expressed at high or moderate levels in both fly and human tend to be expressed in similar tissues. Using DGET, we compared whole tissue profile and sub-region/cell-type specific datasets and estimated the potential cause of false positives in one dataset. We also demonstrated the usefulness of DGET for synexpression studies by querying genes with expression profiles similar to that of the mesodermal master regulator Twist. Conclusion: Altogether, DGET provides a flexible tool for expression data retrieval and analysis with short or long lists of Drosophila genes, which can help scientists to design stage- or tissue-specific in vivo studies and do other subsequent analyses.
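The two lookups described in the abstract, filtering by a threshold expression value and searching for genes with similar expression patterns, can be sketched as follows. This is not DGET's implementation or API: the abstract does not state how similarity is measured, so the use of Pearson correlation, the function names, and the toy profiles are all assumptions made for this example.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (neither may be constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def expressed_above(profiles, condition_index, threshold):
    """Genes whose expression in one tissue or stage meets the threshold."""
    return [gene for gene, profile in profiles.items()
            if profile[condition_index] >= threshold]

def most_similar(profiles, query, k):
    """The k genes whose profiles correlate best with the query gene's profile."""
    others = [gene for gene in profiles if gene != query]
    others.sort(key=lambda gene: pearson(profiles[query], profiles[gene]),
                reverse=True)
    return others[:k]

# Hypothetical expression values across four tissues or stages.
profiles = {
    "query": [1, 5, 9, 2],
    "g1": [2, 6, 10, 3],
    "g2": [9, 5, 1, 8],
    "g3": [4, 4, 5, 4],
}
passing = expressed_above(profiles, condition_index=2, threshold=5)
neighbours = most_similar(profiles, "query", k=2)
```

Correlation compares the shape of two profiles and ignores their absolute level, which is usually what a synexpression query is after; a distance measure would be the better choice if absolute abundance matters.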


Author(s):  
Georgia Tsiliki ◽  
Dimitrios Vlachakis ◽  
Sophia Kossida

With the extensive use of microarray technology as a potential prognostic and diagnostic tool, the comparison and reproducibility of results obtained from different platforms is of interest. The integration of those datasets can yield more informative results corresponding to numerous datasets and microarray platforms. We developed a novel integration technique for microarray gene-expression data derived from different studies, for the purpose of two-way Bayesian partition modelling, which estimates co-expression profiles under subsets of genes and between biological samples or experimental conditions. The suggested methodology transforms disparate gene-expression data onto a common probability scale to obtain inter-study-validated gene signatures. We evaluated the performance of our model using artificial data. Finally, we applied our model to six publicly available cancer gene-expression datasets and compared our results with well-known integrative microarray data methods. Our study shows that the suggested framework can relieve the limited sample size problem while reporting high accuracies by integrating multi-experiment data.


2018 ◽  
Author(s):  
Dongya Jia ◽  
Jason T. George ◽  
Satyendra C. Tripathi ◽  
Deepali L. Kundnani ◽  
Mingyang Lu ◽  
...  

Abstract: The epithelial-mesenchymal transition (EMT) plays a central role in cancer metastasis and drug resistance, two persistent clinical challenges. Epithelial cells can undergo a partial or full EMT, attaining either a hybrid epithelial/mesenchymal (E/M) or mesenchymal phenotype, respectively. Recent studies have emphasized that hybrid E/M cells may be more aggressive than their mesenchymal counterparts. However, mechanisms driving hybrid E/M phenotypes remain largely elusive. Here, to better characterize the hybrid E/M phenotype(s) and tumor aggressiveness, we integrate two computational methods: (a) RACIPE, to identify the robust gene expression patterns emerging from the dynamics of a given gene regulatory network, and (b) the EMT scoring metric, to calculate the probability that a given gene expression profile displays a hybrid E/M phenotype. We apply the EMT scoring metric to RACIPE-generated gene expression data from a core EMT regulatory network and classify the gene expression profiles into relevant categories (epithelial, hybrid E/M, mesenchymal). This categorization is broadly consistent with hierarchical clustering readouts of RACIPE-generated gene expression data. We show that the EMT scoring metric can be used to distinguish between samples composed of exclusively hybrid E/M cells and those containing mixtures of epithelial and mesenchymal subpopulations using the RACIPE-generated gene expression data.


Author(s):  
Crescenzio Gallo

The possible applications of modeling and simulation in the field of bioinformatics are very extensive, ranging from understanding basic metabolic pathways to exploring genetic variability. Experiments carried out with DNA microarrays allow researchers to measure expression levels for thousands of genes simultaneously, across different conditions and over time. A key step in the analysis of gene expression data is the detection of groups of genes that manifest similar expression patterns. In this chapter we examine various methods for analyzing gene expression data, addressing the important topics of (1) selecting the most differentially expressed genes, (2) grouping them by means of their relationships, and (3) classifying samples based on gene expression.


Author(s):  
Naveen Trivedi ◽  
Suvendu Kanungo

Background: Bi-clustering now plays a vital role in analyzing gene expression data from microarray technology. The technique clusters the rows and columns of expression data simultaneously. It determines the expression level of a set of genes under a subset of conditions or samples. The information obtained is collected in the form of a sub-matrix of microarray data that satisfies coherent expression patterns of subsets of genes with respect to subsets of conditions. These sub-matrices are called bi-clusters, and the overall process is called bi-clustering. In this paper, we propose a new hybrid meta-heuristic, ABC-MWOA-CC, which is based on the artificial bee colony (ABC) algorithm, the modified whale optimization algorithm (MWOA) and the Cheng and Church (CC) algorithm, to optimize the extracted bi-clusters. To validate this algorithm, we also examine the statistical and biological relevance of the extracted genes with respect to various conditions. However, most bi-clustering techniques do not address the biological significance of the genes belonging to the extracted bi-clusters. Objective: The major aim of the proposed work is to design and develop a novel hybrid multi-objective bi-clustering approach for microarray data that produces the desired number of valid bi-clusters. These extracted bi-clusters are then optimized to obtain an optimal solution. Method: In the proposed approach, a hybrid multi-objective bi-clustering algorithm based on ABC together with MWOA is used to group the data into the desired number of bi-clusters. The ABC with MWOA multi-objective optimization algorithm is then applied to optimize the solutions using a variety of fitness functions. Results: In the analysis of the results, the multi-objective functions used in the fitness calculation, namely Volume Mean (VM), Mean of Genes (GM), Mean of Conditions (CM) and Mean of MSR (MMSR), lead to improved performance of the CC bi-clustering algorithm on real-life data such as the yeast Saccharomyces cerevisiae cell cycle gene expression datasets. Conclusion: The effectiveness of the ABC-MWOA-CC algorithm is comprehensively demonstrated by comparing it with the well-known traditional ABC-CC, OPSM and CC algorithms in terms of VM, GM, CM and MMSR.


BMC Genomics ◽  
2008 ◽  
Vol 9 (1) ◽  
pp. 495 ◽  
Author(s):  
Evert Blom ◽  
Rainer Breitling ◽  
Klaas Hofstede ◽  
Jos BTM Roerdink ◽  
Sacha AFT van Hijum ◽  
...  

2021 ◽  
Vol 22 (4) ◽  
pp. 1901
Author(s):  
Brielle Jones ◽  
Chaoyang Li ◽  
Min Sung Park ◽  
Anne Lerch ◽  
Vimal Jacob ◽  
...  

Mesenchymal stromal cells derived from the fetal placenta, composed of an amnion membrane, chorion membrane, and umbilical cord, have emerged as promising sources for regenerative medicine. Here, we used next-generation sequencing technology to comprehensively compare amniotic stromal cells (ASCs) with chorionic stromal cells (CSCs) at the molecular and signaling levels. Principal component analysis showed a clear dichotomy of gene expression profiles between ASCs and CSCs. Unsupervised hierarchical clustering confirmed that the biological repeats of ASCs and of CSCs each grouped together. Supervised analysis identified differentially expressed genes, such as LMO3, HOXA11, and HOXA13, and differentially expressed isoforms, such as CXCL6 and HGF. Gene Ontology (GO) analysis showed that the GO terms of the extracellular matrix, angiogenesis, and cell adhesion were significantly enriched in CSCs. We further explored the factors associated with inflammation and angiogenesis using a multiplex assay. In comparison with ASCs, CSCs secreted higher levels of angiogenic factors, including angiogenin, VEGFA, HGF, and bFGF. The results of a tube formation assay proved that CSCs exhibited a strong angiogenic function. However, ASCs secreted two-fold more of an anti-inflammatory factor, TSG-6, than CSCs. In conclusion, our study demonstrated the differential gene expression patterns between ASCs and CSCs. CSCs have superior angiogenic potential, whereas ASCs exhibit increased anti-inflammatory properties.


Processes ◽  
2019 ◽  
Vol 7 (5) ◽  
pp. 301
Author(s):  
Muying Wang ◽  
Satoshi Fukuyama ◽  
Yoshihiro Kawaoka ◽  
Jason E. Shoemaker

Motivation: Immune cell dynamics are a critical factor in disease-associated pathology (immunopathology) and also impact the levels of mRNAs in diseased tissue. Deconvolution algorithms attempt to infer cell quantities in a tissue/organ sample based on gene expression profiles and are often evaluated using artificial, non-complex samples. Their accuracy in estimating cell counts from temporal tissue gene expression data remains poorly characterized and has never been characterized for diseased lung. Further, how to remove the effects of cell migration on transcript counts to improve discovery of disease factors is an open question. Results: Four cell count inference (i.e., deconvolution) tools are evaluated using microarray data from influenza-infected lung sampled at several time points post-infection. The analysis finds that inferred cell quantities are accurate only for select cell types and that there is a tendency for algorithms to have a good relative fit (R²) but a poor absolute fit (normalized mean squared error; NMSE), which suggests systemic biases exist. Nonetheless, using cell fraction estimates to adjust gene expression data, we show that genes associated with influenza virus replication and increased infection pathology are more likely to be identified as significant than when applying traditional statistical tests.
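The distinction between a good relative fit and a poor absolute fit can be made concrete with a small sketch. The abstract does not give the formulas used, so two assumptions are made here: R² is taken as the squared Pearson correlation between measured and inferred values, and NMSE is taken as the mean squared error divided by the variance of the measured values. The cell fractions are invented for illustration.

```python
import math

def r_squared(measured, inferred):
    """Squared Pearson correlation between measured and inferred values."""
    n = len(measured)
    mm, mi = sum(measured) / n, sum(inferred) / n
    cov = sum((a - mm) * (b - mi) for a, b in zip(measured, inferred))
    sm = math.sqrt(sum((a - mm) ** 2 for a in measured))
    si = math.sqrt(sum((b - mi) ** 2 for b in inferred))
    return (cov / (sm * si)) ** 2

def nmse(measured, inferred):
    """Mean squared error normalized by the variance of the measured values.

    This normalization is an assumption; other definitions of NMSE exist.
    """
    n = len(measured)
    mean_measured = sum(measured) / n
    mse = sum((a - b) ** 2 for a, b in zip(measured, inferred)) / n
    var = sum((a - mean_measured) ** 2 for a in measured) / n
    return mse / var

# Hypothetical cell fractions over four time points. The inferred values
# track the measured trend exactly but are scaled and shifted.
measured = [0.1, 0.2, 0.3, 0.4]
inferred = [2 * m + 0.1 for m in measured]

relative_fit = r_squared(measured, inferred)
absolute_fit = nmse(measured, inferred)
```

Here the inferred series follows the measured trend perfectly, so the relative fit is essentially 1, yet every inferred value is too large, so the normalized error is far above 1. This is the pattern the abstract interprets as a sign of bias in the inferred quantities.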

