NeTOIF: A Network-based Approach for Time-Series Omics Data Imputation and Forecasting

2021 ◽  
Author(s):  
Min Shi ◽  
Shamim Mollah

Abstract: High-throughput studies of biological systems are rapidly generating a wealth of 'omics'-scale data. Many of these studies are time-series studies that collect proteomics and genomics data capturing dynamic observations. While time-series omics data are essential for unraveling the mechanisms of various diseases, they often include missing (or incomplete) values, resulting in data shortage. Missing data and data shortage are especially problematic for downstream applications such as omics data integration and computational analyses that need complete and sufficient data representations. Data imputation and forecasting methods have been widely used to mitigate these issues. However, existing imputation and forecasting techniques typically address static omics data representing a single time point and perform forecasting on data with complete values. As a result, these techniques lack the ability to capture the time-ordered nature of data and cannot handle omics data containing missing values at multiple time points. Result: We propose a network-based method for time-series omics data imputation and forecasting (NeTOIF) that handles omics data containing missing values at multiple time points. NeTOIF takes advantage of topological relationships (e.g., protein-protein and gene-gene interactions) among omics data samples and incorporates a graph convolutional network to first infer the missing values at different time points. Then, we combine these inferred values with the original omics data to perform time-series imputation and forecasting using a long short-term memory network. Evaluating NeTOIF with a proteomic and a genomic dataset demonstrated a distinct advantage of NeTOIF over existing data imputation and forecasting methods. The average mean square error of NeTOIF improved 11.3% for imputation and 6.4% for forecasting compared to the baseline methods.
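The idea of inferring a missing value from a molecule's interaction partners can be illustrated with a much simpler stand-in than the graph convolutional network the abstract describes. The sketch below is not NeTOIF: it is an untrained neighbor-mean aggregation, and the function name `neighbor_mean_impute`, the dictionary layout, and the use of `None` for a missing value are all assumptions made for illustration.

```python
def neighbor_mean_impute(values, neighbors):
    """Fill missing entries with the mean of observed neighbor values.

    values:    dict mapping node -> list of per-time-point values (None = missing)
    neighbors: dict mapping node -> list of adjacent nodes in the interaction graph
    Both structures are hypothetical; they are not taken from the paper.
    """
    filled = {}
    for node, series in values.items():
        new_series = []
        for t, value in enumerate(series):
            if value is not None:
                # Observed values are kept unchanged.
                new_series.append(value)
                continue
            # Collect the neighbors that were observed at the same time point.
            observed = [
                values[n][t]
                for n in neighbors.get(node, [])
                if values[n][t] is not None
            ]
            # Leave the entry missing when no neighbor was observed.
            new_series.append(sum(observed) / len(observed) if observed else None)
        filled[node] = new_series
    return filled
```

A trained graph convolutional layer would replace the plain mean with learned weights, and the completed series would then be passed to a sequence model such as the long short-term memory network mentioned in the abstract.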

2021 ◽  
Vol 13 (15) ◽  
pp. 3042
Author(s):  
Kateřina Gdulová ◽  
Jana Marešová ◽  
Vojtěch Barták ◽  
Marta Szostak ◽  
Jaroslav Červenka ◽  
...  

The availability of global digital elevation models (DEMs) from multiple time points allows their combination for analysing vegetation changes. The combination of models (e.g., SRTM and TanDEM-X) can contain errors, which can, due to their synergistic effects, yield incorrect results. We used a high-resolution LiDAR-derived digital surface model (DSM) to evaluate the accuracy of canopy height estimates of the aforementioned global DEMs. In addition, we subtracted SRTM and TanDEM-X data at 90 and 30 m resolutions, respectively, to detect deforestation caused by bark beetle disturbance and evaluated the associations of their difference with terrain characteristics. The study areas covered three Central European mountain ranges and their surrounding areas: Bohemian Forest, Erzgebirge, and Giant Mountains. We found that vertical bias of SRTM and TanDEM-X, relative to the canopy height, is similar with negative values of up to −2.5 m and LE90s below 7.8 m in non-forest areas. In forests, the vertical bias of SRTM and TanDEM-X ranged from −0.5 to 4.1 m and LE90s from 7.2 to 11.0 m, respectively. The height differences between SRTM and TanDEM-X show moderate dependence on the slope and its orientation. LE90s for TDX-SRTM differences tended to be smaller for east-facing than for west-facing slopes, and varied, with aspect, by up to 1.5 m in non-forest areas and 3 m in forests, respectively. Finally, subtracting SRTM and NASA DEMs from TanDEM-X and Copernicus DEMs, respectively, successfully identified large areas of deforestation caused by Hurricane Kyrill in 2007 and a subsequent bark beetle disturbance in the Bohemian Forest. However, local errors in TanDEM-X, associated mainly with forest-covered west-facing slopes, resulted in erroneous identification of deforestation. Therefore, caution is needed when combining SRTM and TanDEM-X data in multitemporal studies in a mountain environment. Still, we can conclude that SRTM and TanDEM-X data represent suitable near-global sources for the identification of deforestation in the period between the time points of their acquisition.
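The two quantities reported above, vertical bias and LE90, and the DEM differencing step can be sketched as follows. This is an illustration under stated assumptions, not the authors' workflow: LE90 is taken here as the 90th percentile of absolute vertical errors (the abstract does not define it), the inputs are flat lists of co-registered cell heights in metres, and the function names and the `threshold_m` parameter are hypothetical.

```python
def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def vertical_bias_and_le90(dem_heights, reference_heights):
    """Return (mean error, 90th percentile of absolute error) in metres.

    Assumes both lists cover the same cells in the same order.
    """
    errors = [d - r for d, r in zip(dem_heights, reference_heights)]
    bias = sum(errors) / len(errors)
    le90 = percentile(sorted(abs(e) for e in errors), 90)
    return bias, le90


def height_loss_mask(later, earlier, threshold_m):
    """Flag cells whose height dropped by more than threshold_m.

    'later' would be TanDEM-X and 'earlier' SRTM; threshold_m is a
    hypothetical tuning parameter, not a value from the study.
    """
    return [(l - e) < -threshold_m for l, e in zip(later, earlier)]
```

Because the flagged cells inherit the errors of both models, a threshold chosen well above the reported LE90 values would reduce false detections at the cost of missing smaller height losses.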


BMC Genomics ◽  
2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Henriette Miko ◽  
Yunjiang Qiu ◽  
Bjoern Gaertner ◽  
Maike Sander ◽  
Uwe Ohler

Abstract Background Co-localized combinations of histone modifications (“chromatin states”) have been shown to correlate with promoter and enhancer activity. Changes in chromatin states over multiple time points (“chromatin state trajectories”) have previously been analyzed at promoters and enhancers separately. With the advent of time series Hi-C data, it is now possible to connect promoters and enhancers and to analyze chromatin state trajectories at promoter-enhancer pairs. Results We present TimelessFlex, a framework for investigating chromatin state trajectories at promoters and enhancers and at promoter-enhancer pairs based on Hi-C information. TimelessFlex extends our previous approach Timeless, a Bayesian network for clustering multiple histone modification data sets at promoter and enhancer feature regions. We utilize time series ATAC-seq data measuring open chromatin to define promoters and enhancer candidates. We developed an expectation-maximization algorithm to assign promoters and enhancers to each other based on Hi-C interactions and jointly cluster their feature regions into paired chromatin state trajectories. We find jointly clustered promoter-enhancer pairs showing the same activation patterns on both sides but with a stronger trend at the enhancer side. While the promoter side remains accessible across the time series, the enhancer side becomes dynamically more open towards the gene activation time point. Promoter cluster patterns show strong correlations with gene expression signals, whereas Hi-C signals get only slightly stronger towards activation. The code of the framework is available at https://github.com/henriettemiko/TimelessFlex. Conclusions TimelessFlex clusters time series histone modifications at promoter-enhancer pairs based on Hi-C and it can identify distinct chromatin states at promoter and enhancer feature regions and their changes over time.


2012 ◽  
Vol 9 (5) ◽  
pp. 610-620 ◽  
Author(s):  
Thomas A Trikalinos ◽  
Ingram Olkin

Background Many comparative studies report results at multiple time points. Such data are correlated because they pertain to the same patients, but are typically meta-analyzed as separate quantitative syntheses at each time point, ignoring the correlations between time points. Purpose To develop a meta-analytic approach that estimates treatment effects at successive time points and takes account of the stochastic dependencies of those effects. Methods We present both fixed and random effects methods for multivariate meta-analysis of effect sizes reported at multiple time points. We provide formulas for calculating the covariance (and correlations) of the effect sizes at successive time points for four common metrics (log odds ratio, log risk ratio, risk difference, and arcsine difference) based on data reported in the primary studies. We work through an example of a meta-analysis of 17 randomized trials of radiotherapy and chemotherapy versus radiotherapy alone for the postoperative treatment of patients with malignant gliomas, where in each trial survival is assessed at 6, 12, 18, and 24 months post randomization. We also provide software code for the main analyses described in the article. Results We discuss the estimation of fixed and random effects models and explore five options for the structure of the covariance matrix of the random effects. In the example, we compare separate (univariate) meta-analyses at each of the four time points with joint analyses across all four time points using the proposed methods. Although results of univariate and multivariate analyses are generally similar in the example, there are small differences in the magnitude of the effect sizes and the corresponding standard errors. We also discuss conditional multivariate analyses where one compares treatment effects at later time points given observed data at earlier time points. 
Limitations Simulation and empirical studies are needed to clarify the gains of multivariate analyses compared with separate meta-analyses under a variety of conditions. Conclusions Data reported at multiple time points are multivariate in nature and are efficiently analyzed using multivariate methods. The latter are an attractive alternative or complement to performing separate meta-analyses.
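To make the contrast between separate and joint analyses concrete, the following sketch pools effect sizes at two time points with a standard fixed-effects generalized least squares estimator, using each study's within-study covariance matrix as the weight. It is an illustration, not the software code the authors provide: it assumes every study reports both time points, it does not derive the covariances from primary-study data, and the names `inv2` and `pool_two_time_points` are hypothetical.

```python
def inv2(m):
    """Invert a 2x2 matrix given as ((a, b), (c, d))."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular covariance matrix")
    return ((d / det, -b / det), (-c / det, a / det))


def pool_two_time_points(effects, covariances):
    """Fixed-effects multivariate pooling for two time points.

    effects:     list of (y_t1, y_t2) effect sizes, one pair per study
    covariances: list of 2x2 within-study covariance matrices
    Returns (pooled_effects, pooled_covariance).
    """
    w_sum = [[0.0, 0.0], [0.0, 0.0]]   # sum of W_i = S_i^{-1}
    wy_sum = [0.0, 0.0]                # sum of W_i y_i
    for y, s in zip(effects, covariances):
        w = inv2(s)
        for r in range(2):
            for c in range(2):
                w_sum[r][c] += w[r][c]
            wy_sum[r] += w[r][0] * y[0] + w[r][1] * y[1]
    pooled_cov = inv2(w_sum)           # (sum of W_i)^{-1}
    pooled = tuple(
        pooled_cov[r][0] * wy_sum[0] + pooled_cov[r][1] * wy_sum[1]
        for r in range(2)
    )
    return pooled, pooled_cov
```

When every off-diagonal covariance is zero, this reduces to two separate inverse-variance meta-analyses, which is the univariate approach the abstract compares against.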


Hydrology ◽  
2018 ◽  
Vol 5 (4) ◽  
pp. 63 ◽  
Author(s):  
Benjamin Nelsen ◽  
D. Williams ◽  
Gustavious Williams ◽  
Candace Berrett

Complete and accurate data are necessary for analyzing and understanding trends in time-series datasets; however, many of the available time-series datasets have gaps that affect the analysis, especially in the earth sciences. As most available data have missing values, researchers use various interpolation methods or ad hoc approaches to data imputation. Since analysis based on inaccurate data can lead to inaccurate conclusions, more accurate data imputation methods can support more accurate analysis. We present a spatial-temporal data imputation method using Empirical Mode Decomposition (EMD) based on spatial correlations. We call this method EMD-spatial data imputation or EMD-SDI. Though this method is applicable to other time-series data sets, here we demonstrate the method using temperature data. The EMD algorithm decomposes data into periodic components called intrinsic mode functions (IMFs) and exactly reconstructs the original signal by summing these IMFs. EMD-SDI initially decomposes the data from the target station and other stations in the region into IMFs. EMD-SDI evaluates each IMF from the target station in turn and selects the IMF from other stations in the region with periodic behavior most correlated with the target IMF. EMD-SDI then replaces a section of missing data in the target station IMF with the section from the most closely correlated IMF from the regional stations. We found that EMD-SDI selects the IMFs used for reconstruction from different stations throughout the region, not necessarily the station closest in the geographic sense. EMD-SDI accurately filled data gaps from 3 months to 5 years in length in our tests and compares favorably to a simple temporal method. EMD-SDI leverages regional correlation and the fact that different stations can be subject to different periodic behaviors. In addition to data imputation, the EMD-SDI method provides IMFs that can be used to better understand regional correlations and processes.
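The selection-and-replacement step described above can be sketched for a single IMF. The example assumes the decomposition has already been done, so the IMFs are given as plain lists, and it marks missing samples in the target IMF with `None`. Pearson correlation over the observed samples is an assumption about how "most correlated" is measured, and the function names are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length lists (NaN if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return float("nan")
    return cov / math.sqrt(var_a * var_b)


def fill_gap_from_best_imf(target_imf, candidate_imfs):
    """Fill None entries of target_imf from the best-correlated candidate IMF.

    candidate_imfs are complete IMFs from other stations, aligned in time
    with target_imf. Returns (filled_imf, index_of_chosen_candidate).
    """
    observed = [i for i, v in enumerate(target_imf) if v is not None]
    target_obs = [target_imf[i] for i in observed]
    best_idx, best_corr = None, -math.inf
    for idx, cand in enumerate(candidate_imfs):
        corr = pearson(target_obs, [cand[i] for i in observed])
        # A NaN correlation compares False, so constant candidates are skipped.
        if corr > best_corr:
            best_idx, best_corr = idx, corr
    if best_idx is None:
        raise ValueError("no candidate IMF has a defined correlation")
    donor = candidate_imfs[best_idx]
    filled = [donor[i] if v is None else v for i, v in enumerate(target_imf)]
    return filled, best_idx
```

Repeating this for each IMF of the target station and summing the filled IMFs would give a reconstructed series, consistent with the abstract's note that the IMFs sum back to the original signal.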


2002 ◽  
Vol 30 (4) ◽  
pp. 415-425 ◽  
Author(s):  
Meredith E. Coles ◽  
Cynthia L. Turk ◽  
Richard G. Heimberg

Cognitive-behavioral models (Clark & Wells, 1995; Rapee & Heimberg, 1997) and recent research suggest that individuals with social phobia (SP) experience both images (Hackmann, Surawy, & Clark, 1998) and memories (Coles, Turk, Heimberg, & Fresco, 2001; Wells, Clark, & Ahmad, 1998) of anxiety-producing social situations from an observer perspective. The current study examines memory perspective for two role-played situations (speech and social interaction) at multiple time points (immediate and 3 weeks post) in 22 individuals with generalized SP and 30 non-anxious controls (NACs). At both time points, SPs recalled the role-plays from a more observer/less field perspective than did NACs. Further, over time, the memory perspective of SPs became even more observer/less field while the memory perspective of NACs remained relatively stable.


Author(s):  
Dan Breznitz

This chapter acknowledges that, for many regions, the idea of attracting cutting-edge tech start-ups is almost irresistible. Seemingly every community aspires to become the next Silicon Valley. But is that feasible? This chapter makes these lessons concrete by elaborating on the rapid rise and the even faster and deeper decline of America’s first Silicon Valley: Cleveland, Ohio. It then shows the near impossibility of trying to become the next Silicon Valley by analyzing the mysterious failure of Atlanta, Georgia, a city that diligently followed all the advice ever given to an aspiring new start-up hub, but somehow was always left only with the “potential.” We will see how at multiple time points Atlanta’s companies were the leading innovators with the best products in the newest information and communication technologies (ICT), only to falter and be taken over by Silicon Valley companies without leaving any apparent impact on the region. It then brings in social-network research and the concept of embeddedness to explain why trying to recreate a Silicon Valley is a doomed (and expensive) enterprise.


2011 ◽  
Vol 11 (1) ◽  
Author(s):  
Qingjiang Hou ◽  
Zhiyue Lin ◽  
Reginald Dusing ◽  
Byron J Gajewski ◽  
Richard W McCallum ◽  
...  

Blood ◽  
2000 ◽  
Vol 96 (1) ◽  
pp. 1-8 ◽  
Author(s):  
Hyeoung Joon Kim ◽  
John F. Tisdale ◽  
Tong Wu ◽  
Masaaki Takatoku ◽  
Stephanie E. Sellers ◽  
...  

Retroviral insertion site analysis was used to track the contribution of retrovirally transduced primitive progenitors to hematopoiesis after autologous transplantation in the rhesus macaque model. CD34-enriched mobilized peripheral blood cells were transduced with retroviral marking vectors containing the neo gene and were reinfused after total body irradiation. High-level gene transfer efficiency allowed insertion site analysis of individual myeloid and erythroid colony-forming units (CFU) and of highly purified B- and T-lymphoid populations in 2 animals. At multiple time points up to 1 year after transplantation, retroviral insertion sites were identified by performing inverse polymerase chain reaction and sequencing vector-containing CFU or more than 99% pure T- and B-cell populations. Forty-eight unique insertion sequences were detected in the first animal and also in the second animal, and multiple clones contributed to hematopoiesis at 2 or more time points. Multipotential clones contributing to myeloid and lymphoid lineages were identified. These results support the concept that hematopoiesis in large animals is polyclonal and that individual multipotential stem or progenitor cells can contribute to hematopoiesis for prolonged periods. Gene transfer to long-lived, multipotent clones is shown and is encouraging for human gene therapy applications.

