Dynamic Distribution Decomposition for Single-Cell Snapshot Time Series Identifies Subpopulations and Trajectories during iPSC Reprogramming

10.1101/367789 ◽

2018 ◽

Author(s):

Jake P. Taylor-King ◽

Asbjørn N. Riseth ◽

Manfred Claassen

Keyword(s):

Time Series ◽

Single Cell ◽

Continuous Time ◽

Time Series Data ◽

Synthetic Data ◽

Series Data ◽

High Dimensional ◽

Mass Cytometry ◽

Dynamic Distribution ◽

Time Points

Abstract: Recent high-dimensional single-cell technologies such as mass cytometry are enabling time series experiments to monitor the temporal evolution of cell state distributions and to identify dynamically important cell states, such as fate decision states in differentiation. However, these technologies are destructive and require analysis approaches that temporally map between cell state distributions across time points. Current approaches that approximate the single-cell time series as a dynamical system suffer from overly restrictive assumptions about the type of kinetics, or link together pairs of sequential measurements in a discontinuous fashion.

We propose Dynamic Distribution Decomposition (DDD), an operator approximation approach to infer a continuous distribution map between time points. On the basis of single-cell snapshot time series data, DDD approximates the continuous-time Perron-Frobenius operator by means of a finite set of basis functions. This procedure can be interpreted as a continuous-time Markov chain over a continuum of states. By assuming only a memoryless Markov (autonomous) process, the types of dynamics represented are more general than those represented by other common models, such as chemical reaction networks or stochastic differential equations. Additionally, the continuity assumption ensures that the same dynamical system maps between all time points, rather than changing arbitrarily at each time point. We demonstrate the ability of DDD to reconstruct dynamically important cell states and their transitions both on synthetic data and on mass cytometry time series of iPSC reprogramming of a fibroblast system. We use DDD to find previously identified subpopulations of cells and to visualize differentiation trajectories.

Dynamic Distribution Decomposition allows interpreting high-dimensional snapshot time series data as a low-dimensional Markov process, thereby enabling an interpretable dynamics analysis for a variety of biological processes by means of identifying their dynamically important cell states.

Author summary: High-dimensional single-cell snapshot measurements are now increasingly utilized to study dynamic processes. Such measurements enable us to evaluate cell population distributions and their evolution over time. However, it is not trivial to map these distributions across time and to identify dynamically important cell states, i.e. bottleneck regions of state space exhibiting a high degree of change. We present Dynamic Distribution Decomposition (DDD), which achieves this task by encoding single-cell measurements as a linear combination of basis function distributions and evolving these as a linear system. We demonstrate reconstruction of dynamically important states for synthetic data of a bifurcated diffusion process and mass cytometry data for iPSC reprogramming.
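
The idea of evolving basis-function coefficients as a linear system can be illustrated with a small sketch. This is not the authors' implementation: the three-state generator matrix, the initial coefficients, and the helper names `mat_mul`, `mat_exp` and `propagate` are all hypothetical, chosen only to show that a single time-independent operator maps between every pair of time points and conserves total probability mass.

```python
# Hypothetical 3-state illustration of a linear, memoryless distribution map.
# Coefficients c(t) evolve as dc/dt = G c, so c(t) = exp(t G) c(0).

def mat_mul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def mat_exp(m, terms=30, squarings=10):
    """Approximate exp(m) with a truncated Taylor series plus scaling and squaring."""
    n = len(m)
    scale = 2 ** squarings
    scaled = [[x / scale for x in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes scaled**k / k!
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result

def propagate(generator, coeffs, t):
    """Return exp(t * generator) applied to the coefficient vector."""
    step = mat_exp([[t * x for x in row] for row in generator])
    return [sum(step[i][j] * coeffs[j] for j in range(len(coeffs)))
            for i in range(len(coeffs))]

# Assumed generator: each column sums to zero, so total mass is conserved.
# State 0 drains into states 1 and 2; state 1 drains into state 2.
GENERATOR = [
    [-1.0, 0.0, 0.0],
    [0.6, -0.5, 0.0],
    [0.4, 0.5, 0.0],
]

c0 = [1.0, 0.0, 0.0]
c_direct = propagate(GENERATOR, c0, 3.0)
c_two_steps = propagate(GENERATOR, propagate(GENERATOR, c0, 1.0), 2.0)
print(c_direct, c_two_steps)
```

Propagating directly to t = 3 and propagating in two stages (t = 1, then a further 2) give the same coefficients up to rounding, which is the continuity property the abstract contrasts with methods that fit a separate map for each pair of consecutive time points.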

LSTM-Guided Coaching Assistant for Table Tennis Practice

Sensors ◽

10.3390/s18124112 ◽

2018 ◽

Vol 18 (12) ◽

pp. 4112 ◽

Cited By ~ 6

Author(s):

Se-Min Lim ◽

Hyeong-Cheol Oh ◽

Jaein Kim ◽

Juwon Lee ◽

Jooyoung Park

Keyword(s):

Time Series ◽

State Space ◽

Time Series Data ◽

State Space Model ◽

Skill Assessment ◽

Series Data ◽

High Dimensional ◽

Table Tennis ◽

Space Model ◽

Low Dimensional

Recently, wearable devices have become a prominent health care application domain by incorporating a growing number of sensors and adopting smart machine learning technologies. One closely related topic is the strategy of combining the wearable device technology with skill assessment, which can be used in wearable device apps for coaching and/or personal training. Particularly pertinent to skill assessment based on high-dimensional time series data from wearable sensors is classifying whether a player is an expert or a beginner, which skills the player is exercising, and extracting some low-dimensional representations useful for coaching. In this paper, we present a deep learning-based coaching assistant method, which can provide useful information in supporting table tennis practice. Our method uses a combination of LSTM (Long short-term memory) with a deep state space model and probabilistic inference. More precisely, we use the expressive power of LSTM when handling high-dimensional time series data, and state space model and probabilistic inference to extract low-dimensional latent representations useful for coaching. Experimental results show that our method can yield promising results for characterizing high-dimensional time series patterns and for providing useful information when working with wearable IMU (Inertial measurement unit) sensors for table tennis coaching.

Testing Serial Correlation and ARCH Effect of High-Dimensional Time-Series Data

Journal of Business and Economic Statistics ◽

10.1080/07350015.2019.1647844 ◽

2019 ◽

Vol 39 (1) ◽

pp. 136-147 ◽

Cited By ~ 1

Author(s):

Shiqing Ling ◽

Ruey S. Tsay ◽

Yaxing Yang

Keyword(s):

Time Series ◽

Serial Correlation ◽

Time Series Data ◽

Series Data ◽

High Dimensional ◽

Arch Effect

WATCH: Wasserstein Change Point Detection for High-Dimensional Time Series Data

10.1109/bigdata52589.2021.9671962 ◽

2021 ◽

Author(s):

Kamil Faber ◽

Roberto Corizzo ◽

Bartlomiej Sniezynski ◽

Michael Baron ◽

Nathalie Japkowicz

Keyword(s):

Time Series ◽

Change Point ◽

Time Series Data ◽

Change Point Detection ◽

Series Data ◽

High Dimensional ◽

Point Detection

Likelihood-based estimation of continuous-time epidemic models from time-series data: application to measles transmission in London

Journal of The Royal Society Interface ◽

10.1098/rsif.2007.1292 ◽

2008 ◽

Vol 5 (25) ◽

pp. 885-897 ◽

Cited By ~ 76

Author(s):

Simon Cauchemez ◽

Neil M Ferguson

Keyword(s):

Time Series ◽

Discrete Time ◽

Continuous Time ◽

Generation Time ◽

Data Augmentation ◽

Time Series Data ◽

Observation Interval ◽

Series Data ◽

Time Model ◽

Similar Time

We present a new statistical approach to analyse epidemic time-series data. A major difficulty for inference is that (i) the latent transmission process is partially observed and (ii) observed quantities are further aggregated temporally. We develop a data augmentation strategy to tackle these problems and introduce a diffusion process that mimics the susceptible–infectious–removed (SIR) epidemic process, but that is more tractable analytically. While methods based on discrete-time models require epidemic and data collection processes to have similar time scales, our approach, based on a continuous-time model, is free of such a constraint. Using simulated data, we found that all parameters of the SIR model, including the generation time, were estimated accurately if the observation interval was less than 2.5 times the generation time of the disease. Previous discrete-time TSIR models have been unable to estimate generation times, given that they assume the generation time is equal to the observation interval. However, we were unable to estimate the generation time of measles accurately from historical data. This indicates that simple models assuming homogeneous mixing (even with age structure), of the type that is standard in mathematical epidemiology, miss key features of epidemics in large populations.

Using matrix approximation for high-dimensional discrete optimization problems: Server consolidation based on cyclic time-series data

European Journal of Operational Research ◽

10.1016/j.ejor.2012.12.005 ◽

2013 ◽

Vol 227 (1) ◽

pp. 62-75 ◽

Cited By ~ 17

Author(s):

Thomas Setzer ◽

Martin Bichler

Keyword(s):

Time Series ◽

Discrete Optimization ◽

Time Series Data ◽

Optimization Problems ◽

Series Data ◽

High Dimensional ◽

Matrix Approximation ◽

Server Consolidation ◽

Cyclic Time ◽

Discrete Optimization Problems

TSmap3D: Browser visualization of high dimensional time series data

2016 IEEE International Conference on Big Data (Big Data) ◽

10.1109/bigdata.2016.7841022 ◽

2016 ◽

Cited By ~ 1

Author(s):

Supun Kamburugamuve ◽

Pulasthi Wickramasinghe ◽

Saliya Ekanayake ◽

Chathuri Wimalasena ◽

Milinda Pathirage ◽

...

Keyword(s):

Time Series ◽

Time Series Data ◽

Series Data ◽

High Dimensional

Piecewise Trend Approximation: A Ratio-Based Time Series Representation

Abstract and Applied Analysis ◽

10.1155/2013/603629 ◽

2013 ◽

Vol 2013 ◽

pp. 1-7 ◽

Cited By ~ 4

Author(s):

Jingpei Dan ◽

Weiren Shi ◽

Fangyan Dong ◽

Kaoru Hirota

Keyword(s):

Time Series ◽

Time Series Data ◽

Series Representation ◽

Feature Space ◽

Original Data ◽

Series Data ◽

High Dimensional ◽

Original Time Series ◽

Data Space ◽

Original Time

A time series representation, piecewise trend approximation (PTA), is proposed to improve the efficiency of time series data mining in high-dimensional large databases. PTA represents a time series in concise form while retaining the main trends of the original series; the dimensionality of the original data is therefore reduced while the key features are maintained. Unlike representations based on the original data space, PTA transforms the original data space into a feature space of ratios between consecutive data points in the original time series, whose sign and magnitude indicate the direction and degree of the local trend, respectively. In this ratio-based feature space, segmentation is performed such that any two adjacent segments have different trends, and each segment is then approximated by the ratio between its first and last points. To validate the proposed PTA, it is compared with the classical time series representations PAA and APCA on two classical datasets using the commonly used K-NN classification algorithm. On the ControlChart dataset, PTA achieves 3.55% and 2.33% higher classification accuracy than PAA and APCA, respectively; on the Mixed-BagShapes dataset, it achieves 8.94% and 7.07% higher accuracy, respectively. These results indicate that the proposed PTA is effective for high-dimensional time series data mining.
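
The segmentation described above can be sketched roughly as follows. The abstract does not give the exact ratio definition or segment encoding, so this sketch makes several assumptions: the ratio is taken to be the relative change `(b - a) / a`, the series values are positive and nonzero, a segment ends whenever the sign of that ratio changes, and each segment is stored as a hypothetical `(end_index, segment_ratio)` pair. The function names are illustrative, not the paper's.

```python
def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)

def relative_change(a, b):
    """Assumed ratio feature: change from a to b relative to a (requires a != 0)."""
    return (b - a) / a

def pta(series):
    """Return one (end_index, segment_ratio) pair per trend segment.

    Adjacent segments always differ in trend direction (up, down or flat).
    """
    if len(series) < 2:
        return []
    # Ratio i describes the local trend between points i and i + 1.
    ratios = [relative_change(a, b) for a, b in zip(series, series[1:])]
    segments = []
    start = 0
    for i in range(1, len(ratios) + 1):
        # Close the segment when the trend direction changes or the data ends.
        if i == len(ratios) or sign(ratios[i]) != sign(ratios[start]):
            segments.append((i, relative_change(series[start], series[i])))
            start = i
    return segments

series = [10.0, 12.0, 15.0, 12.0, 6.0, 6.0, 9.0]
segments = pta(series)
print(segments)
```

For this seven-point example the sketch yields four segments (rise, fall, flat, rise), so the series is summarized by four pairs instead of seven values.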

ARCO: An Artificial Counterfactual Approach For High-Dimensional Panel Time-Series Data

SSRN Electronic Journal ◽

10.2139/ssrn.2823687 ◽

2016 ◽

Author(s):

Carlos Carvalho

Keyword(s):

Time Series ◽

Time Series Data ◽

Series Data ◽

High Dimensional ◽

Counterfactual Approach

MITRE: predicting host status from microbiota time-series data

10.1101/447250 ◽

2018 ◽

Author(s):

Elijah Bogart ◽

Richard Creswell ◽

Georg K. Gerber

Keyword(s):

Machine Learning ◽

Time Series ◽

Time Series Data ◽

Synthetic Data ◽

Black Box ◽

Series Data ◽

Learning Approaches ◽

Rule Engine ◽

Microbiome Composition ◽

Host Status

Abstract: Longitudinal studies are crucial for discovering causal relationships between the microbiome and human disease. We present Microbiome Interpretable Temporal Rule Engine (MITRE), the first machine learning method specifically designed for predicting host status from microbiome time-series data. Our method maintains interpretability by learning predictive rules over automatically inferred time periods and phylogenetically related microbes. We validate MITRE’s performance on semi-synthetic data and on five real datasets measuring microbiome composition over time in infant and adult cohorts. Our results demonstrate that MITRE performs on par with or outperforms “black box” machine learning approaches, providing a powerful new tool enabling discovery of biologically interpretable relationships between microbiome and human host.

Developing an Embedding, Koopman and Autoencoder Technologies-Based Multi-Omics Time Series Predictive Model (EKATP) for Systems Biology research

Frontiers in Genetics ◽

10.3389/fgene.2021.761629 ◽

2021 ◽

Vol 12 ◽

Author(s):

Suran Liu ◽

Yujie You ◽

Zhaoqi Tong ◽

Le Zhang

Keyword(s):

Time Series ◽

Predictive Model ◽

Time Series Data ◽

Chaotic Behavior ◽

Flow Behavior ◽

Series Data ◽

High Dimensional ◽

Future State ◽

Disease Occurrence ◽

The Future

It is very important for systems biologists to predict the state of multi-omics time series for disease occurrence and health detection. However, such prediction is difficult because multi-omics time series data are high-dimensional, nonlinear and noisy. For this reason, this study proposes an Embedding, Koopman and Autoencoder technologies-based multi-omics time series predictive model (EKATP) to predict the future state of a high-dimensional nonlinear multi-omics time series. We evaluate EKATP using a genomics time series with chaotic behavior, a proteomics time series with oscillating behavior and a metabolomics time series with flow behavior. The computational experiments demonstrate that the proposed EKATP can substantially improve the accuracy, robustness and generalizability of future-state prediction for multi-omics time series data.