Statistical significance approximation in local trend analysis of high-throughput time-series data using the theory of Markov chains

Abstract. Motivation: Identification of constitutive reference genes is critical for analysis of gene expression. Large numbers of high-throughput time series expression data sets are available, but current methods for identifying invariant expression are not tailored to time series. Identification of reference genes from these data sets can benefit from methods that incorporate the additional information they provide. Results: Here we show that we can improve identification of invariant expression from time series by modelling the time component of the data. We implement the Prediction Interval Ranking Score (PIRS) software, which screens high-throughput time series data and provides a ranked list of reference candidates. We expect that PIRS will improve the quality of gene expression analysis by allowing researchers to identify the best reference genes for their system from publicly available time series. Availability: PIRS can be downloaded and installed with its dependencies using ‘pip install pirs’, and the Python code and documentation are available for download at https://github.com/aleccrowell/[email protected]

Discriminate Supervised Weighted Scheme for the Classification of Time Series Signals

International Journal of Sociotechnology and Knowledge Development ◽

10.4018/ijskd.2021070101 ◽

2021 ◽

Vol 13 (3) ◽

pp. 1-16

Author(s):

Elangovan Ramanujam ◽

S. Padmavathi

Keyword(s):

Time Series ◽

Time Series Data ◽

State Of The Art ◽

Statistical Significance ◽

Series Data ◽

Bag Of Words ◽

Time Series Classification ◽

Problem Of Time ◽

Weighted Matrix

Innovations in, and the applicability of, time series data mining techniques have significantly increased researchers' interest in the problem of time series classification. Several algorithms have been proposed for this purpose, categorized as shapelet, interval, motif, and whole-series-based techniques. Among these, the bag-of-words technique, an extension of the text mining approach, performs well due to its simplicity and effectiveness. To improve the efficiency of the bag-of-words technique, this paper proposes a discriminate supervised weighted scheme to identify the characteristic and representative patterns of a class for efficient classification. The paper uses a modified weighted matrix that discriminates between representative and non-representative patterns, which enables interpretability in classification. Experiments were carried out to compare the performance of the proposed technique with state-of-the-art techniques in terms of accuracy and statistical significance.
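
The abstract does not say how words are formed from a series, so the following is only a minimal sketch of the general bag-of-words idea, assuming a SAX-like discretization (z-normalized sliding windows, segment means mapped to letters). The function name `bag_of_words`, its parameters, and the breakpoints are illustrative assumptions, and the paper's supervised weighting scheme is not implemented here.

```python
from collections import Counter
import string

import numpy as np


def bag_of_words(series, window=8, word_len=4, breakpoints=(-0.67, 0.0, 0.67)):
    """Count discretized sliding-window words in a 1-D series.

    Assumptions (not from the paper): `window` is divisible by `word_len`,
    and the breakpoints are approximate quartiles of a standard normal.
    """
    x = np.asarray(series, dtype=float)
    alphabet = string.ascii_lowercase[: len(breakpoints) + 1]
    bag = Counter()
    for start in range(len(x) - window + 1):
        w = x[start:start + window]
        std = w.std()
        # z-normalize the window; a flat window maps to all zeros
        w = (w - w.mean()) / std if std > 0 else np.zeros_like(w)
        # mean of each of the `word_len` equal-length segments
        segments = w.reshape(word_len, -1).mean(axis=1)
        # index of the breakpoint interval each segment mean falls into
        letters = np.searchsorted(breakpoints, segments)
        bag["".join(alphabet[i] for i in letters)] += 1
    return bag


# A steadily rising series produces the same rising word in every window.
ramp_bag = bag_of_words(np.arange(16.0))
```

A classifier would then compare such word histograms across classes; a weighting scheme like the one proposed would up-weight words that are representative of one class.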

Abstract 19225: Impact of Change in Resuscitation Guidelines on National Out-of-hospital Cardiac Arrest Outcomes: Fulfilled Expectations?

Circulation ◽

10.1161/circ.132.suppl_3.19225 ◽

2015 ◽

Vol 132 (suppl_3) ◽

Author(s):

Shaker M Eid ◽

Aiham Albaeni ◽

Rebeca Rios ◽

May Baydoun ◽

Bolanle Akinyele ◽

...

Keyword(s):

Time Series ◽

Cardiac Arrest ◽

Time Series Data ◽

Statistical Significance ◽

Interrupted Time Series ◽

The United States ◽

Series Data ◽

National Database ◽

Resuscitation Guidelines ◽

Hospital Cardiac Arrest

Background: The intent of the 5-yearly Resuscitation Guidelines is to improve outcomes. Previous studies have yielded conflicting reports of a beneficial impact of the 2005 guidelines on out-of-hospital cardiac arrest (OHCA) survival. Using a national database, we examined survival before and after the introduction of both the 2005 and 2010 guidelines. Methods: We used the 2000 through 2012 National Inpatient Sample database to select patients ≥18 years admitted to hospitals in the United States with non-traumatic OHCA (ICD-9 CM codes 427.5 & 427.41). A quasi-experimental (interrupted time series) design was used to compare monthly survival trends. Outcomes for OHCA were compared before and after the release of the 2005 and 2010 resuscitation guidelines as follows: 01/2000-09/2005 vs. 10/2005-09/2010 and 10/2005-09/2010 vs. 10/2010-12/2012. Segmented regression analyses of interrupted time series data were performed to examine changes in survival to hospital discharge. Results: For the pre- and post-guideline periods, 81600, 69139 and 36556 patients, respectively, survived to hospital admission following OHCA. Subsequent to the release of the 2005 guidelines, there was a statistically significant worsening in survival trends (β = -0.089, 95% CI -0.163 to -0.016, p = 0.018) until the release of the 2010 guidelines, when a sharp increase in survival was noted that persisted for the period of study (β = 0.054, 95% CI -0.143 to 0.251, p = 0.588) but did not achieve statistical significance (Figure). Conclusion: National clinical guidelines developed to impact outcomes must include mechanisms to assess whether benefit actually occurs. The worsening in OHCA survival following the 2005 guidelines is thought-provoking, but the improvement following the release of the 2010 guidelines is reassuring and worthy of perpetuation.
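
As an illustration of the segmented regression mentioned in the Methods, here is a minimal sketch assuming a single interruption, ordinary least squares, and no adjustment for autocorrelation or seasonality. The function name `segmented_regression`, the coefficient labels, and the synthetic series are hypothetical and are not taken from the study.

```python
import numpy as np


def segmented_regression(y, breakpoint):
    """Fit y_t = b0 + b1*t + b2*post_t + b3*(t - breakpoint)*post_t by OLS.

    b1 is the slope before the interruption, b2 the change in level at the
    interruption, and b3 the change in slope after it.
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= breakpoint).astype(float)  # 1 from the interruption onwards
    X = np.column_stack([np.ones_like(t), t, post, (t - breakpoint) * post])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["intercept", "pre_slope", "level_change", "slope_change"]
    return dict(zip(names, coef))


# Synthetic, noise-free monthly series with an interruption at month 12:
# the level jumps by 3 and the slope drops by 0.2 afterwards.
t = np.arange(24)
y = 10.0 + 0.5 * t
y[12:] += 3.0 - 0.2 * (t[12:] - 12)
fit = segmented_regression(y, breakpoint=12)
```

In a real analysis the standard errors and confidence intervals of the level and slope changes matter as much as the point estimates, and monthly data are often autocorrelated, so a method that accounts for this would normally be preferred over plain OLS.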

Trend analysis of time-series data: A novel method for untargeted metabolite discovery

Analytica Chimica Acta ◽

10.1016/j.aca.2010.01.038 ◽

2010 ◽

Vol 663 (1) ◽

pp. 98-104 ◽

Cited By ~ 14

Author(s):

Sonja Peters ◽

Hans-Gerd Janssen ◽

Gabriel Vivó-Truyols

Keyword(s):

Time Series ◽

Trend Analysis ◽

Time Series Data ◽

Series Data ◽

Novel Method ◽

Analysis Of Time Series

Trend Analysis of Rainfall Time Series in Shanxi Province, Northern China (1957–2019)

Water ◽

10.3390/w12092335 ◽

2020 ◽

Vol 12 (9) ◽

pp. 2335

Author(s):

Feng Gao ◽

Yunpeng Wang ◽

Xiaoling Chen ◽

Wenfu Yang

Keyword(s):

Time Series ◽

Water Supply ◽

Trend Analysis ◽

Agricultural Production ◽

Time Series Data ◽

Shanxi Province ◽

Series Data ◽

Rainfall Time Series ◽

Mk Test ◽

Wutai Shan

Changes in rainfall play an important role in agricultural production, water supply and management, and social and economic development in arid and semi-arid regions. The objective of this study was to examine the trend of rainfall series from 18 meteorological stations at monthly, seasonal, and annual scales in Shanxi province over the period 1957–2019. The Mann–Kendall (MK) test, Spearman’s Rho (SR) test, and the Revised Mann–Kendall (RMK) test were used to identify the trends. Sen’s slope estimator (SSE) was used to estimate the magnitude of the rainfall trend. An autocorrelation function (ACF) plot was used to examine the autocorrelation coefficients at various lags in order to improve the trend analysis by the application of the RMK test. The results indicate remarkable differences, with positive and negative trends (significant or non-significant) depending on the station. The largest number of stations showing decreasing trends occurred in March, with 10 out of 18 stations at the 10%, 5%, and 1% levels. Wutai Shan station has strong negative trends in January, March, April, November, and December at the 1% level. In addition, Wutai Shan station also experienced a significant decreasing trend over four seasons at significance levels of 1% and 10%. On the annual scale, no significant trend was detected by the three identification methods for most stations. The MK and SR tests have similar power for detecting monotonic trends in rainfall time series data. Although similar results were obtained by the MK/SR and RMK tests in this study, in some cases unreasonable trends may be produced by the RMK test. The findings of this study could benefit agricultural production activities, water supply and management, drought monitoring, and socioeconomic development in Shanxi province in the future.
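
The Mann–Kendall test and Sen's slope estimator named above can be sketched in a few lines. This is a minimal illustration assuming independent observations with no tied values (so the tie correction is omitted) and the usual normal approximation; it is the classical MK test, not the revised RMK variant, and the function names and sample values are hypothetical.

```python
import math
from itertools import combinations
from statistics import median


def mann_kendall(x):
    """Return (S, Z, two-sided p) for the classical Mann-Kendall trend test.

    Assumes no ties and no autocorrelation in `x`.
    """
    n = len(x)
    # S sums the signs of all later-minus-earlier differences
    s = sum((xj > xi) - (xj < xi) for xi, xj in combinations(x, 2))
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)  # continuity correction
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return s, z, p


def sens_slope(x):
    """Median of all pairwise slopes (x[j] - x[i]) / (j - i) for i < j."""
    return median(
        (x[j] - x[i]) / (j - i) for i, j in combinations(range(len(x)), 2)
    )


# Made-up series rising by exactly 2 units per step.
values = [3.0 + 2.0 * i for i in range(10)]
s, z, p = mann_kendall(values)
slope = sens_slope(values)
```

Because the classical test assumes independence, autocorrelated rainfall series can give misleading p-values, which is the motivation for checking the ACF and considering a modified test.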

Curve Fitting for Short Time Series Data from High Throughput Experiments with Correction for Biological Variation

Advances in Intelligent Data Analysis XI - Lecture Notes in Computer Science ◽

10.1007/978-3-642-34156-4_15 ◽

2012 ◽

pp. 150-160 ◽

Cited By ~ 2

Author(s):

Frank Klawonn ◽

Nada Abidi ◽

Evelin Berger ◽

Lothar Jänsch

Keyword(s):

Time Series ◽

High Throughput ◽

Curve Fitting ◽

Time Series Data ◽

Biological Variation ◽

Series Data ◽

Short Time Series ◽

Short Time ◽

High Throughput Experiments

Statistical significance approximation for local similarity analysis of dependent time series data

BMC Bioinformatics ◽

10.1186/s12859-019-2595-x ◽

2019 ◽

Vol 20 (1) ◽

Cited By ~ 1

Author(s):

Fang Zhang ◽

Fengzhu Sun ◽

Yihui Luan

Keyword(s):

Time Series ◽

Time Series Data ◽

Statistical Significance ◽

Series Data ◽

Similarity Analysis ◽

Local Similarity

Comparison of six statistical methods for interrupted time series studies: empirical evaluation of 190 published series

BMC Medical Research Methodology ◽

10.1186/s12874-021-01306-w ◽

2021 ◽

Vol 21 (1) ◽

Author(s):

Simon L. Turner ◽

Amalia Karahalios ◽

Andrew B. Forbes ◽

Monica Taljaard ◽

Jeremy M. Grimshaw ◽

...

Keyword(s):

Time Series ◽

Statistical Method ◽

Statistical Methods ◽

Time Series Data ◽

Statistical Significance ◽

Empirical Evaluation ◽

Interrupted Time Series ◽

Series Data ◽

Standard Errors ◽

The Impact

Background: The Interrupted Time Series (ITS) is a quasi-experimental design commonly used in public health to evaluate the impact of interventions or exposures. Multiple statistical methods are available to analyse data from ITS studies, but no empirical investigation has examined how the different methods compare when applied to real-world datasets. Methods: A random sample of 200 ITS studies identified in a previous methods review was included. Time series data from each of these studies were sought. Each dataset was re-analysed using six statistical methods. Point and confidence interval estimates for level and slope changes, standard errors, p-values and estimates of autocorrelation were compared between methods. Results: From the 200 ITS studies, including 230 time series, 190 datasets were obtained. We found that the choice of statistical method can importantly affect the level and slope change point estimates, their standard errors, width of confidence intervals and p-values. Statistical significance (categorised at the 5% level) often differed across the pairwise comparisons of methods, ranging from 4 to 25% disagreement. Estimates of autocorrelation differed depending on the method used and the length of the series. Conclusions: The choice of statistical method in ITS studies can lead to substantially different conclusions about the impact of the interruption. Pre-specification of the statistical method is encouraged, and naive conclusions based on statistical significance should be avoided.

Growth Score: a single metric to define growth in 96-well phenotype assays

PeerJ ◽

10.7717/peerj.4681 ◽

2018 ◽

Vol 6 ◽

pp. e4681

Author(s):

Daniel A. Cuevas ◽

Robert A. Edwards

Keyword(s):

Time Series ◽

Systems Biology ◽

High Throughput ◽

Growth Curve ◽

Time Series Data ◽

Series Data ◽

Direct Measurements ◽

Essential Growth ◽

Data Analytic ◽

Require Data

High-throughput phenotype assays are a cornerstone of systems biology as they allow direct measurements of mutations, genes, strains, or even different genera. High-throughput methods also require data analytic methods that reduce complex time-series data to a single numeric evaluation. Here, we present the Growth Score, an improvement on the previous Growth Level formula. There is strong correlation between Growth Score and Growth Level, but the new Growth Score contains only essential growth curve properties while the formula of the previous Growth Level was convoluted and not easily interpretable. Several programs can be used to estimate the parameters required to calculate the Growth Score metric, including our PMAnalyzer pipeline.
