IDENTIFYING DIFFERENTIALLY EXPRESSED GENES IN TIME-COURSE MICROARRAY EXPERIMENT WITHOUT REPLICATE

2007 ◽  
Vol 05 (02a) ◽  
pp. 281-296 ◽  
Author(s):  
XU HAN ◽  
WING-KIN SUNG ◽  
LIN FENG

Replication of time series in microarray experiments is costly. To analyze time series data without replicates, many model-specific approaches have been proposed. However, they fail to identify genes whose expression patterns do not fit the pre-defined models. Moreover, modeling the temporal expression patterns is difficult when the dynamics of gene expression in the experiment are poorly understood. We propose a method called Partial Energy ratio for Microarray (PEM) for the analysis of time course microarray data. In the PEM method, we assume that gene expression varies smoothly in the temporal domain. This assumption is comparatively weak, and hence the method is general enough to identify genes expressed in unexpected patterns. To identify the differentially expressed genes, a new statistic is developed by comparing the energies of two convoluted profiles. We further improve the statistic for microarray analysis by introducing the concept of partial energy. The PEM statistic can be easily incorporated into the SAM framework for significance analysis. We evaluated the PEM method with an artificial dataset and two published time course cDNA microarray datasets on yeast. The experimental results show the robustness and the generality of the PEM method in identifying the genes of interest.
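The idea of comparing the energies of two convoluted profiles can be illustrated with a rough sketch. The abstract does not give the kernels, so the two-tap smoothing kernel (1, 1) and differencing kernel (1, -1) below are assumptions, as are all function names. The partial energy refinement and the SAM step are not shown. Under the smoothness assumption, a profile with a real temporal trend should keep most of its energy after smoothing, while a noise-dominated profile should not.

```python
def convolve_pairs(profile, kernel):
    """Apply a two-tap kernel to each pair of adjacent time points."""
    a, b = kernel
    return [a * x + b * y for x, y in zip(profile, profile[1:])]


def energy(values):
    """Signal energy: the sum of squared values."""
    return sum(v * v for v in values)


def energy_ratio(profile, eps=1e-12):
    """Ratio of smoothed-profile energy to differenced-profile energy.

    The kernels (1, 1) and (1, -1) are illustrative assumptions,
    not kernels taken from the PEM paper.
    """
    smooth = convolve_pairs(profile, (1.0, 1.0))
    rough = convolve_pairs(profile, (1.0, -1.0))
    # eps guards against division by zero for perfectly flat profiles.
    return energy(smooth) / (energy(rough) + eps)


# Hypothetical log-ratio profiles: a smooth response and an alternating one.
smooth_gene = [1.0, 1.2, 1.4, 1.5, 1.4, 1.2]
noisy_gene = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
```

In this toy setting the smooth profile receives a much larger ratio than the alternating one, which is the kind of ranking a smoothness-based statistic is meant to produce.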

2021 ◽  
Vol 3 (1) ◽  
Author(s):  
Hitoshi Iuchi ◽  
Michiaki Hamada

Abstract Time-course experiments using parallel sequencers have the potential to uncover gradual changes in cells over time that cannot be observed in a two-point comparison. An essential step in time-series data analysis is the identification of temporal differentially expressed genes (TEGs) under two conditions (e.g. control versus case). Model-based approaches, which are typical TEG detection methods, often set one parameter (e.g. degree or degree of freedom) for one dataset. This approach risks modeling of linearly increasing genes with higher-order functions, or fitting of cyclic gene expression with linear functions, thereby leading to false positives/negatives. Here, we present a Jonckheere–Terpstra–Kendall (JTK)-based non-parametric algorithm for TEG detection. Benchmarks, using simulation data, show that the JTK-based approach outperforms existing methods, especially in long time-series experiments. Additionally, application of JTK in the analysis of time-series RNA-seq data from seven tissue types, across developmental stages in mouse and rat, suggested that the wave pattern contributes to the TEG identification of JTK, not the difference in expression levels. This result suggests that JTK is a suitable algorithm when focusing on expression patterns over time rather than expression levels, such as comparisons between different species. These results show that JTK is an excellent candidate for TEG detection.
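The observation that a rank-based test responds to the wave pattern rather than the expression level can be illustrated with Kendall's tau alone. The sketch below is not the authors' algorithm; it only computes tau-a between two profiles, and the function names are hypothetical. Because tau depends only on the ordering of values, rescaling or shifting one profile leaves it unchanged.

```python
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    score = 0
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            score += 1  # pair ordered the same way in both profiles
        elif product < 0:
            score -= 1  # pair ordered in opposite ways
    n_pairs = len(x) * (len(x) - 1) // 2
    return score / n_pairs


def pattern_similarity(control, case):
    """Compare the temporal ordering of two expression profiles."""
    return kendall_tau(control, case)


# Hypothetical profiles: same shape, very different expression levels.
control = [1.0, 3.0, 2.0, 5.0, 4.0]
case = [v * 100.0 + 5.0 for v in control]
```

Here `pattern_similarity(control, case)` is 1.0 even though the two profiles differ a hundredfold in level, which matches the abstract's point about comparisons across species.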


2021 ◽  
Vol 36 (Supplement_1) ◽  
Author(s):  
Y Zhang ◽  
Y W Zhao ◽  
C C Wang ◽  
T C Li

Abstract Study question: Do serum metabolomic profiles differ between pregnant and non-pregnant women during the early implantation period? Summary answer: Levels of progesterone-related hormones rise from ET day 3 in pregnant women compared with non-pregnant women. What is known already: Metabolomics uses high-throughput analytical methods to identify and quantify metabolites. Compared with other omics approaches, metabolomics is the closest to the phenotype, allowing the observation of dynamic changes in phenotype at specific time points. So far, no published work has described the metabolomic profile of the human early implantation period. Study design, size, duration: Study design: comparative study. Size: 14 pregnant women and 14 non-pregnant women. Duration: time course. Participants/materials, setting, methods: Participants: pregnant and non-pregnant women after embryo transfer (ET). Setting: university-based study. Methods: peripheral blood was collected at ET day 0, 3, 6 and 9. Metabolomic profiling of serum was performed by capillary electrophoresis–mass spectrometry (CE-MS) and liquid chromatography–mass spectrometry (LC-MS). Main results and the role of chance: There were no statistically significant differences between the two groups in age, BMI, basal FSH level, endometrial thickness on the day of embryo transfer, distribution of primary and secondary infertility, embryo transfer cycle, or type of infertility. After excluding metabolites with more than 50% missing data, 310 metabolites entered the statistical analysis. Among these 310 metabolites, lipids accounted for the largest share, nearly half of all metabolites. The second largest class was organic acids.
Combining the results of repeated-measures ANOVA (RM-ANOVA), ANOVA-simultaneous component analysis (ASCA) and multivariate empirical Bayes time-series analysis (MEBA), we found that progesterone-related hormones were the most important metabolites across the whole time series. These metabolites were significantly down-regulated from ET day 0 to ET day 3 and up-regulated from ET day 3 to ET day 9. Limitations, reasons for caution: The sample size of this study is limited, and further validation is necessary. Wider implications of the findings: The upregulation of progesterone-related hormones from day 3 in the pregnant group might be related to embryo-derived hCG, because by day 3 the embryo has entered the endometrium and produces cytokines, hCG and other factors that interact with the endometrium. Trial registration number: NA


2017 ◽  
Author(s):  
María José Nueda ◽  
Jordi Martorell-Marugan ◽  
Cristina Martí ◽  
Sonia Tarazona ◽  
Ana Conesa

Abstract As sequencing technologies improve their capacity to detect distinct transcripts of the same gene and to address complex experimental designs such as longitudinal studies, there is a need to develop statistical methods for the analysis of isoform expression changes in time series data. Iso-maSigPro is a new functionality of the R package maSigPro for transcriptomics time series data analysis. Iso-maSigPro identifies genes with a differential isoform usage across time. The package also includes new clustering and visualization functions that allow grouping of genes with similar expression patterns at the isoform level, as well as those genes with a shift in major expressed isoform. The package is freely available under the LGPL license from the Bioconductor web site (http://bioconductor.org).


2016 ◽  
Author(s):  
Luis F. Jover ◽  
Justin Romberg ◽  
Joshua S. Weitz

In communities with bacterial viruses (phage) and bacteria, the phage-bacteria infection network establishes which virus types infect which host types. The structure of the infection network is a key element in understanding community dynamics. Yet, this infection network is often difficult to ascertain. Introduced over 60 years ago, the plaque assay remains the gold standard for establishing who infects whom in a community. This culture-based approach does not scale to environmental samples with increased levels of phage and bacterial diversity, much of which is currently unculturable. Here, we propose an alternative method of inferring phage-bacteria infection networks. This method uses time series data of fluctuating population densities to estimate the complete interaction network without having to test each phage-bacteria pair individually. We use in silico experiments to analyze the factors affecting the quality of network reconstruction and find robust regimes where accurate reconstructions are possible. In addition, we present a multi-experiment approach where time series from different experiments are combined to improve estimates of the infection network and mitigate against the possibility of evolutionary changes to infection during the time course of measurement.


2016 ◽  
Vol 3 (11) ◽  
pp. 160654 ◽  
Author(s):  
Luis F. Jover ◽  
Justin Romberg ◽  
Joshua S. Weitz

In communities with bacterial viruses (phage) and bacteria, the phage–bacteria infection network establishes which virus types infect which host types. The structure of the infection network is a key element in understanding community dynamics. Yet, this infection network is often difficult to ascertain. Introduced over 60 years ago, the plaque assay remains the gold standard for establishing who infects whom in a community. This culture-based approach does not scale to environmental samples with increased levels of phage and bacterial diversity, much of which is currently unculturable. Here, we propose an alternative method of inferring phage–bacteria infection networks. This method uses time-series data of fluctuating population densities to estimate the complete interaction network without having to test each phage–bacteria pair individually. We use in silico experiments to analyse the factors affecting the quality of network reconstruction and find robust regimes where accurate reconstructions are possible. In addition, we present a multi-experiment approach where time series from different experiments are combined to improve estimates of the infection network. This approach also mitigates against the possibility of evolutionary changes to relevant phenotypes during the time course of measurement.


2019 ◽  
Vol 2 (4) ◽  
pp. 81 ◽  
Author(s):  
Oike ◽  
Ogawa ◽  
Oishi

Actograms are well-established methods used for visualizing periodic activity of animals in chronobiological research. They help in the understanding of the overall characteristics of rhythms and are instrumental in defining the direction of subsequent detailed analysis. Although there exists specialized software for creating actograms, new users such as students and researchers from other fields often find it inconvenient to use. In this study, we demonstrate a fast and easy method to create actograms using Microsoft Excel. As operations in Excel are simple and user-friendly, it takes only a few minutes to create an actogram. Using this method, it is possible to obtain a visual understanding of the characteristics of rhythms not only from typical activity data, but also from any kind of time-series data such as body temperature, blood sugar level, gene expression, sleep electroencephalogram, heartbeat, and so on. The actogram thus created can also be converted to the "heatogram" shown by color temperature. As opposed to conventional chronograms, this new type of chronogram facilitates easy understanding of rhythmic features in a more intuitive manner. This method is therefore convenient and beneficial for a broad range of researchers including students as it aids in the better understanding of periodic phenomena from a large amount of time-series data.
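The data layout behind an actogram can be sketched in a few lines. This is not the Excel procedure from the paper; it is a hypothetical Python helper that only shows the reshaping step. It assumes the common double-plotted convention (each row shows one day followed by the next), which the abstract does not state, and it pads the final row with zeros as a further assumption.

```python
def double_plot(activity, bins_per_day):
    """Arrange a flat activity series into double-plotted actogram rows.

    Row i holds day i followed by day i + 1, so each row spans two days.
    Trailing samples that do not fill a whole day are dropped.
    """
    n_days = len(activity) // bins_per_day
    days = [
        activity[d * bins_per_day:(d + 1) * bins_per_day]
        for d in range(n_days)
    ]
    blank = [0] * bins_per_day  # padding for the last row (assumed convention)
    return [
        days[d] + (days[d + 1] if d + 1 < n_days else blank)
        for d in range(n_days)
    ]


# Hypothetical series: 3 days sampled in 2 bins per day.
rows = double_plot([0, 1, 2, 3, 4, 5], bins_per_day=2)
```

Each row can then be drawn as a strip of bars (a classic actogram) or as color-coded cells, which corresponds to the heat-map style display the abstract calls a heatogram.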


2014 ◽  
Author(s):  
Jonathan Terhorst ◽  
Yun S. Song

Genomic time series data generated by evolve-and-resequence (E&R) experiments offer a powerful window into the mechanisms that drive evolution. However, standard population genetic inference procedures do not account for sampling serially over time, and new methods are needed to make full use of modern experimental evolution data. To address this problem, we develop a Gaussian process approximation to the multi-locus Wright-Fisher process with selection over a time course of tens of generations. The mean and covariance structure of the Gaussian process are obtained by computing the corresponding moments in discrete-time Wright-Fisher models conditioned on the presence of a linked selected site. This enables our method to account for the effects of linkage and selection, both along the genome and across sampled time points, in an approximate but principled manner. Using simulated data, we demonstrate the power of our method to correctly detect, locate and estimate the fitness of a selected allele from among several linked sites. We also study how this power changes for different values of selection strength, initial haplotypic diversity, population size, sampling frequency, experimental duration, number of replicates, and sequencing coverage depth. In addition to providing quantitative estimates of selection parameters from experimental evolution data, our model can be used by practitioners to design E&R experiments with requisite power. Finally, we explore how our likelihood-based approach can be used to infer other model parameters, including effective population size and recombination rate, and discuss extensions to more complex models.
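For orientation, the discrete-time Wright-Fisher process that the Gaussian approximation targets can be simulated directly. The sketch below is a deliberately reduced version: a single haploid locus with relative fitness 1 + s, with no linkage and no sequencing noise. These simplifications and all names are assumptions made here; the authors' multi-locus model and its moment calculations are not reproduced.

```python
import random


def expected_next_freq(p, s):
    """Expected allele frequency after selection (haploid, fitness 1 + s)."""
    return p * (1.0 + s) / (1.0 + s * p)


def simulate_wright_fisher(p0, s, pop_size, generations, rng):
    """Return the allele-frequency trajectory, starting from p0."""
    trajectory = [p0]
    p = p0
    for _ in range(generations):
        p_sel = expected_next_freq(p, s)
        # Binomial sampling of pop_size offspring models genetic drift.
        count = sum(1 for _ in range(pop_size) if rng.random() < p_sel)
        p = count / pop_size
        trajectory.append(p)
    return trajectory


# Hypothetical E&R-like setting: tens of generations, modest population size.
trajectory = simulate_wright_fisher(0.2, 0.1, 100, 20, random.Random(0))
```

Repeating the simulation with different seeds gives replicate trajectories, whose mean and covariance across time points are the kind of quantities a Gaussian process approximation would need to match.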


Author(s):  
Andrew Blanchard ◽  
Christopher Wolter ◽  
David S. McNabb ◽  
Eitan Gross

In this paper, the authors present a wavelet-based algorithm (Wave-SOM) to help visualize and cluster oscillatory time-series data in two-dimensional gene expression micro-arrays. Using various wavelet transformations, raw data are first de-noised by decomposing the time-series into low and high frequency wavelet coefficients. Following thresholding, the coefficients are fed as an input vector into a two-dimensional Self-Organizing-Map clustering algorithm. Transformed data are then clustered by minimizing the Euclidean (L2) distance between their corresponding fluctuation patterns. A multi-resolution analysis by Wave-SOM of expression data from the yeast Saccharomyces cerevisiae, exposed to oxidative stress and glucose-limited growth, identified 29 genes with correlated expression patterns that were mapped into 5 different nodes. The ordered clustering of yeast genes by Wave-SOM illustrates that the same set of genes (encoding ribosomal proteins) can be regulated by two different environmental stresses, oxidative stress and starvation. The algorithm provides heuristic information regarding the similarity of different genes. Using previously studied expression patterns of yeast cell-cycle and functional genes as test data sets, the authors’ algorithm outperformed five other competing programs.
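The de-noising stage described above (decompose, threshold, reconstruct) can be sketched with a single-level Haar transform. The abstract mentions "various wavelet transformations" without naming them, so the choice of Haar, the hard-threshold rule, and the function names are all assumptions; the Self-Organizing-Map clustering stage is not shown.

```python
import math


def haar_step(signal):
    """One level of the Haar transform: (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    root2 = math.sqrt(2.0)
    pairs = list(zip(signal[0::2], signal[1::2]))
    approx = [(a + b) / root2 for a, b in pairs]  # low-frequency part
    detail = [(a - b) / root2 for a, b in pairs]  # high-frequency part
    return approx, detail


def hard_threshold(coeffs, cutoff):
    """Zero every coefficient whose magnitude is below the cutoff."""
    return [c if abs(c) >= cutoff else 0.0 for c in coeffs]


def haar_inverse(approx, detail):
    """Rebuild the signal from one level of Haar coefficients."""
    root2 = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / root2, (a - d) / root2])
    return out


def denoise(signal, cutoff):
    """Threshold the detail coefficients, then reconstruct."""
    approx, detail = haar_step(signal)
    return haar_inverse(approx, hard_threshold(detail, cutoff))


# Hypothetical expression series with small point-to-point jitter.
cleaned = denoise([4.0, 4.2, 2.0, 1.8], cutoff=0.5)
```

In a pipeline like the one described, the thresholded coefficients (rather than the reconstructed signal) would be passed on as the input vector for clustering.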

