BAYESIAN IDENTIFICATION OF MULTIPLE CHANGE POINTS IN POISSON DATA

The identification of multiple change points is a problem shared by many subject areas, including disease and criminality mapping, medical diagnosis, industrial control, and finance. An algorithm based on the Product Partition Model (PPM) is developed to solve the multiple change point identification problem in Poisson data sequences. To fit the PPM, a simple, easy-to-implement Gibbs sampling scheme is derived. A sensitivity analysis is performed for different prior specifications. The algorithm is then applied to the analysis of a real data sequence. The results show that the method is quite effective and provides useful inferences.
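For orientation, the standard product partition formulation (in the style of Barry and Hartigan) can be written as below. This is a sketch under stated assumptions, not the paper's exact derivation: the abstract does not give the prior, so the Gamma(α, β) prior on each block's Poisson rate, the generic cohesions c_{ij}, and the indicator variables U_r are assumptions chosen for illustration.

```latex
% Partition of n observations into b contiguous blocks:
%   \rho = \{i_0, i_1, \dots, i_b\}, \quad 0 = i_0 < i_1 < \dots < i_b = n.
\begin{align*}
P(\rho \mid X) &\propto \prod_{j=1}^{b} c_{i_{j-1} i_j}\, f\!\left(X_{i_{j-1} i_j}\right),
\\[4pt]
% Block marginal for Poisson counts x_{i+1}, \dots, x_j with a Gamma(\alpha,\beta) rate prior (assumed):
f(X_{ij}) &= \frac{\beta^{\alpha}}{\Gamma(\alpha)}\,
             \frac{\Gamma(\alpha + s_{ij})}{(\beta + j - i)^{\alpha + s_{ij}}}
             \prod_{k=i+1}^{j} \frac{1}{x_k!},
\qquad s_{ij} = \sum_{k=i+1}^{j} x_k,
\\[4pt]
% Gibbs step: U_r = 1 if there is no change between observations r and r+1, else 0.
% With x the last change point before r and y the next one after r:
R_r &= \frac{P(U_r = 1 \mid X, U_{-r})}{P(U_r = 0 \mid X, U_{-r})}
     = \frac{c_{xy}\, f(X_{xy})}{c_{xr}\, f(X_{xr})\; c_{ry}\, f(X_{ry})},
\qquad P(U_r = 1 \mid X, U_{-r}) = \frac{R_r}{1 + R_r}.
\end{align*}
```

Sweeping r = 1, ..., n-1 and drawing each U_r from this conditional gives one Gibbs iteration. The sampled partitions can then be summarized, for example by the posterior probability of a change at each position.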


APPLYING THE PRODUCT PARTITION MODEL TO THE IDENTIFICATION OF MULTIPLE CHANGE POINTS

Advances in Complex Systems ◽

10.1142/s0219525902000651 ◽

2002 ◽

Vol 05 (04) ◽

pp. 371-387 ◽

Cited By ~ 2

Author(s):

R. H. LOSCHI ◽

F. R. B. CRUZ

Keyword(s):

Bayesian Approach ◽

Bayesian Methods ◽

Change Point ◽

Medical Diagnosis ◽

Identification Problem ◽

Change Points ◽

Industrial Control ◽

Partition Model ◽

Practical Applications ◽

Subject Areas

The multiple change point identification problem may be encountered in many subject areas, including disease mapping, medical diagnosis, industrial control, and finance. One appealing way of tackling the problem is through the product partition model (PPM), a Bayesian approach. Practical applications of Bayesian methods have recently attracted attention, perhaps because of the widespread use of powerful and inexpensive personal computers. A simple, easy-to-implement Gibbs sampling scheme is used to obtain the estimates. We apply the algorithm to the analysis of two important Brazilian stock market data sets. The results show that the method is efficient and effective in analyzing change point problems.


Bayesian Online Learning of the Hazard Rate in Change-Point Problems

Neural Computation ◽

10.1162/neco_a_00007 ◽

2010 ◽

Vol 22 (9) ◽

pp. 2452-2476 ◽

Cited By ~ 83

Author(s):

Robert C. Wilson ◽

Matthew R. Nassar ◽

Joshua I. Gold

Keyword(s):

Change Point ◽

Stock Price ◽

Hazard Rate ◽

Real Data ◽

Generative Models ◽

Ideal Observer ◽

Change Points ◽

Brain State ◽

Observer Model ◽

Change Point Models

Change-point models are generative models of time-varying data in which the underlying generative parameters undergo discontinuous changes at different points in time known as change points. Change-points often represent important events in the underlying processes, like a change in brain state reflected in EEG data or a change in the value of a company reflected in its stock price. However, change-points can be difficult to identify in noisy data streams. Previous attempts to identify change-points online using Bayesian inference relied on specifying in advance the rate at which they occur, called the hazard rate (h). This approach leads to predictions that can depend strongly on the choice of h and is unable to deal optimally with systems in which h is not constant in time. In this letter, we overcome these limitations by developing a hierarchical extension to earlier models. This approach allows h itself to be inferred from the data, which in turn helps to identify when change-points occur. We show that our approach can effectively identify change-points in both toy and real data sets with complex hazard rates and how it can be used as an ideal-observer model for human and animal behavior when faced with rapidly changing inputs.
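The fixed-hazard baseline that the abstract contrasts itself with can be sketched as follows. This is a minimal illustration of run-length filtering with a constant hazard rate h, not the hierarchical model proposed in the letter. The Gaussian likelihood with known variance, the conjugate normal prior on the mean, the toy data, and all names (`bocpd_fixed_hazard`, `mu0`, `var0`, `var_x`) are assumptions made for this example.

```python
import numpy as np

def bocpd_fixed_hazard(x, h, mu0=0.0, var0=1.0, var_x=1.0):
    """Run-length posterior under a constant hazard rate h (assumed model:
    Gaussian data, known variance var_x, normal prior N(mu0, var0) on the mean).

    Returns R with R[t, r] = P(run length r after seeing x[:t])."""
    T = len(x)
    R = np.zeros((T + 1, T + 1))
    R[0, 0] = 1.0
    mu = np.array([mu0])     # posterior mean of the segment mean, per run length
    var = np.array([var0])   # posterior variance of the segment mean, per run length
    for t, xt in enumerate(x):
        # Predictive density of the new point under each current run length.
        pred_var = var + var_x
        pred = np.exp(-0.5 * (xt - mu) ** 2 / pred_var) / np.sqrt(2 * np.pi * pred_var)
        weighted = R[t, : t + 1] * pred
        R[t + 1, 1 : t + 2] = weighted * (1.0 - h)   # run continues
        R[t + 1, 0] = weighted.sum() * h             # change point: run resets
        R[t + 1] /= R[t + 1].sum()
        # Conjugate update for continuing runs; run length 0 restarts from the prior.
        new_var = 1.0 / (1.0 / var + 1.0 / var_x)
        new_mu = new_var * (mu / var + xt / var_x)
        mu = np.concatenate(([mu0], new_mu))
        var = np.concatenate(([var0], new_var))
    return R

# Toy data (hypothetical): mean near 0 for 50 points, then near 5 for 50 points.
x = np.array([0.1, -0.1] * 25 + [5.1, 4.9] * 25)
R = bocpd_fixed_hazard(x, h=0.02)
map_run_length = int(np.argmax(R[-1]))  # most probable current run length
```

On this toy sequence the most probable final run length should sit close to 50, the length of the second segment. Rerunning with a different `h` shows the sensitivity to the hazard rate that motivates inferring it from the data.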


Change Point Enhanced Anomaly Detection for IoT Time Series Data

Water ◽

10.3390/w13121633 ◽

2021 ◽

Vol 13 (12) ◽

pp. 1633

Author(s):

Elena-Simona Apostol ◽

Ciprian-Octavian Truică ◽

Florin Pop ◽

Christian Esposito

Keyword(s):

Time Series ◽

Anomaly Detection ◽

Change Point ◽

Time Series Data ◽

Multivariate Time Series ◽

Change Point Detection ◽

Change Points ◽

Series Data ◽

Prediction And Forecasting ◽

Point Detection

Due to the exponential growth of the Internet of Things networks and the massive amount of time series data collected from these networks, it is essential to apply efficient methods for Big Data analysis in order to extract meaningful information and statistics. Anomaly detection is an important part of time series analysis, improving the quality of further analysis, such as prediction and forecasting. Thus, detecting sudden change points that reflect normal behavior and distinguishing them from abnormal behavior, i.e., outliers, is a crucial step used to minimize the false positive rate and to build accurate machine learning models for prediction and forecasting. In this paper, we propose a rule-based decision system that enhances anomaly detection in multivariate time series using change point detection. Our architecture uses a pipeline that automatically detects real anomalies and removes the false positives introduced by change points. We employ both traditional and deep learning unsupervised algorithms: in total, five anomaly detection and five change point detection algorithms. Additionally, we propose a new confidence metric based on the support for a time series point to be an anomaly and the support for the same point to be a change point. In our experiments, we use a large real-world dataset containing multivariate time series about water consumption collected from smart meters. As an evaluation metric, we use Mean Absolute Error (MAE). The low MAE values show that the algorithms accurately determine anomalies and change points. The experimental results strengthen our assumption that anomaly detection can be improved by determining and removing change points, and they validate the correctness of our proposed rules in real-world scenarios. Furthermore, the proposed rule-based decision support systems enable users to make informed decisions regarding the status of the water distribution network and to perform predictive and proactive maintenance effectively.


Testing for long memory in the presence of a general trend

Journal of Applied Probability ◽

10.1017/s0021900200019215 ◽

2001 ◽

Vol 38 (04) ◽

pp. 1033-1054 ◽

Cited By ~ 19

Author(s):

Liudas Giraitis ◽

Piotr Kokoszka ◽

Remigijus Leipus

Keyword(s):

Long Memory ◽

Change Point ◽

General Trend ◽

Quantitative Description ◽

Stationary Sequence ◽

Change Points ◽

Asymptotic Size ◽

Short Memory ◽

The Impact ◽

Do So

The paper studies the impact of a broadly understood trend, which includes a change point in mean and monotonic trends studied by Bhattacharya et al. (1983), on the asymptotic behaviour of a class of tests designed to detect long memory in a stationary sequence. Our results pertain to a family of tests which are similar to Lo's (1991) modified R/S test. We show that both long memory and nonstationarity (presence of trend or change points) can lead to rejection of the null hypothesis of short memory, so that further testing is needed to discriminate between long memory and some forms of nonstationarity. We provide quantitative description of trends which do or do not fool the R/S-type long memory tests. We show, in particular, that a shift in mean of a magnitude larger than N^{-1/2}, where N is the sample size, affects the asymptotic size of the tests, whereas smaller shifts do not do so.


Detecting common breaks in the means of high dimensional cross-dependent panels

Econometrics Journal ◽

10.1093/ectj/utab028 ◽

2021 ◽

Author(s):

Lajos Horváth ◽

Zhenya Liu ◽

Gregory Rice ◽

Yuqian Zhao

Keyword(s):

Panel Data ◽

Common Factors ◽

Real Data ◽

Change Points ◽

High Dimensional ◽

Asymptotic Results ◽

Cross Sectional ◽

Data Set ◽

Monte Carlo Simulation Study ◽

Cross Sectional Dependence

The problem of detecting change points in the mean of high dimensional panel data with potentially strong cross-sectional dependence is considered. Under the assumption that the cross-sectional dependence is captured by an unknown number of common factors, a new CUSUM type statistic is proposed. We derive its asymptotic properties under three scenarios depending on the extent to which the common factors are asymptotically dominant. With panel data consisting of N cross sectional time series of length T, the asymptotic results hold under the mild assumption that min {N, T} → ∞, with an otherwise arbitrary relationship between N and T, allowing the results to apply to most panel data examples. Bootstrap procedures are proposed to approximate the sampling distribution of the test statistics. A Monte Carlo simulation study showed that our test outperforms several other existing tests in finite samples in a number of cases, particularly when N is much larger than T. The practical application of the proposed results is demonstrated with real data applications to detecting and estimating change points in the high dimensional FRED-MD macroeconomic data set.
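To make the CUSUM idea concrete, here is a minimal sketch of a textbook aggregated CUSUM for a common mean break in a panel. It is not the statistic proposed in the paper: it ignores common factors and cross-sectional dependence entirely. The function name `panel_cusum`, the per-series variance scaling, and the toy panel are assumptions made for illustration.

```python
import numpy as np

def panel_cusum(X):
    """Aggregated squared CUSUM for a common break in the means of an (N, T) panel.

    Returns (k_hat, stat), where k_hat is the estimated number of observations
    before the break and stat[k-1] is the aggregated statistic at candidate k."""
    N, T = X.shape
    S = np.cumsum(X, axis=1)                  # partial sums S_i(k), k = 1..T
    k = np.arange(1, T + 1)
    dev = S - np.outer(S[:, -1], k / T)       # S_i(k) - (k/T) * S_i(T)
    sigma2 = X.var(axis=1, ddof=1)            # crude per-series variance scale
    stat = (dev ** 2 / (sigma2[:, None] * T)).sum(axis=0)
    k_hat = int(np.argmax(stat)) + 1
    return k_hat, stat

# Toy panel (hypothetical): 5 series of length 40, each with a mean shift after t = 20.
t = np.arange(40)
noise = 0.05 * (-1.0) ** (t + 1)
X = np.array([noise + (1.0 + i) * (t >= 20) for i in range(5)])
k_hat, stat = panel_cusum(X)
```

In practice the critical value for the maximum of `stat` would come from asymptotic theory or a bootstrap, as the abstract describes. This sketch only locates the maximizer.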


Forecasting and Estimating Multiple Change-Point Models with an Unknown Number of Change Points

SSRN Electronic Journal ◽

10.2139/ssrn.628561 ◽

2004 ◽

Cited By ~ 4

Author(s):

Simon Potter ◽

Gary M. Koop

Keyword(s):

Change Point ◽

Change Points ◽

Unknown Number ◽

Change Point Models


Closed-Form Estimation of Multiple Change-Point Models

10.7287/peerj.preprints.90 ◽

2013 ◽

Author(s):

Greg Jensen

Keyword(s):

Time Series ◽

Linear Regression ◽

Change Point ◽

Marginal Likelihood ◽

Stationary Time Series ◽

Change Points ◽

Model Complexity ◽

Marginal Model ◽

General Strategy ◽

Change Point Models

Identifying discontinuities (or change-points) in otherwise stationary time series is a powerful analytic tool. This paper outlines a general strategy for identifying an unknown number of change-points using elementary principles of Bayesian statistics. Using a strategy of binary partitioning by marginal likelihood, a time series is recursively subdivided on the basis of whether adding divisions (and thus increasing model complexity) yields a justified improvement in the marginal model likelihood. When this approach is combined with the use of conjugate priors, it yields the Conjugate Partitioned Recursion (CPR) algorithm, which identifies change-points without computationally intensive numerical integration. Using the CPR algorithm, methods are described for specifying change-point models drawn from a host of familiar distributions, both discrete (binomial, geometric, Poisson) and continuous (exponential, Gaussian, uniform, and multiple linear regression), as well as multivariate distribution (multinomial, multivariate normal, and multivariate linear regression). Methods by which the CPR algorithm could be extended or modified are discussed, and several detailed applications to data published in psychology and biomedical engineering are described.
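The recursive partitioning strategy described above can be sketched for the Poisson case with a conjugate Gamma prior. This is a simplified illustration under stated assumptions, not the CPR algorithm as published. The Gamma(a, b) hyperparameters, the uniform prior over split locations, the acceptance rule (a split is accepted when the averaged split marginal exceeds the no-split marginal), and the names `log_marginal` and `find_change_points` are all choices made for this example.

```python
import math
import numpy as np

def log_marginal(x, a=1.0, b=1.0):
    """Log marginal likelihood of Poisson counts x under a Gamma(a, b) rate prior
    (b is a rate parameter). Closed form, no numerical integration needed."""
    n, s = len(x), sum(x)
    return (a * math.log(b) - math.lgamma(a) + math.lgamma(a + s)
            - (a + s) * math.log(b + n)
            - sum(math.lgamma(v + 1) for v in x))

def find_change_points(x, lo=0, hi=None, a=1.0, b=1.0):
    """Recursively split x[lo:hi] while a single split is better supported
    than no split. Returns sorted indices k such that a new segment starts at x[k]."""
    if hi is None:
        hi = len(x)
    if hi - lo < 2:
        return []
    whole = log_marginal(x[lo:hi], a, b)
    scores = np.array([log_marginal(x[lo:k], a, b) + log_marginal(x[k:hi], a, b)
                       for k in range(lo + 1, hi)])
    # Marginal of "one change at an unknown location", uniform over locations (assumed).
    log_split = np.logaddexp.reduce(scores) - math.log(len(scores))
    if log_split <= whole:
        return []
    k = lo + 1 + int(np.argmax(scores))
    return (find_change_points(x, lo, k, a, b) + [k]
            + find_change_points(x, k, hi, a, b))

# Toy counts (hypothetical): rate jumps after the 15th observation.
counts = [2] * 15 + [12] * 15
change_points = find_change_points(counts)
```

Because the Gamma prior is conjugate to the Poisson likelihood, every segment score is a closed-form expression, which is the property the abstract highlights. Swapping `log_marginal` for another conjugate pair would adapt the same recursion to other distributions.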


Optimal number and allocation of data collection points for linear spline growth curve modeling

International Journal of Behavioral Development ◽

10.1177/0165025416644076 ◽

2016 ◽

Vol 41 (4) ◽

pp. 550-558 ◽

Cited By ~ 3

Author(s):

Wei Wu ◽

Fan Jia ◽

Richard Kinai ◽

Todd D. Little

Keyword(s):

Data Collection ◽

Change Point ◽

Growth Models ◽

Optimal Number ◽

Change Points ◽

Change Processes ◽

Linear Spline ◽

Curve Modeling ◽

Data Points ◽

Two Phases

Spline growth modelling is a popular tool to model change processes with distinct phases and change points in longitudinal studies. Focusing on linear spline growth models with two phases and a fixed change point (the transition point from one phase to the other), we detail how to find optimal data collection designs that maximize the efficiency of detecting key parameters in the spline models, holding the total number of data points or sample size constant. We identify efficient designs for the cases where (a) the exact location of the change point is known (complete certainty), (b) only the interval that contains the change point is known (partial certainty), and (c) no prior knowledge on the location of the change point is available (zero certainty). We conclude with recommendations for optimal number and allocation of data collection points.


Efficient Change-Points Detection For Genomic Sequences Via Cumulative Segmented Regression

Bioinformatics ◽

10.1093/bioinformatics/btab685 ◽

2021 ◽

Author(s):

Shengji Jia ◽

Lei Shi

Keyword(s):

Change Point ◽

Serial Correlation ◽

Copy Number Variations ◽

Change Points ◽

Supplementary Information ◽

Genomic Sequences ◽

Segmented Regression ◽

Computationally Efficient ◽

R Program ◽

Point Estimator

Motivation: Knowing the number and the exact locations of multiple change points in genomic sequences serves several biological needs. The cumulative segmented algorithm (cumSeg) has been recently proposed as a computationally efficient approach for multiple change-points detection, which is based on a simple transformation of data and provides results quite robust to model mis-specifications. However, the errors are also accumulated in the transformed model so that heteroscedasticity and serial correlation will show up, and thus the variations of the estimated change points will be quite different, while the locations of the change points should be of the same importance in the original genomic sequences. Results: In this study, we develop two new change-points detection procedures in the framework of cumulative segmented regression. Simulations reveal that the proposed methods not only improve the efficiency of each change point estimator substantially but also provide the estimators with similar variations for all the change points. By applying these proposed algorithms to Coriel and SNP genotyping data, we illustrate their performance on detecting copy number variations. Supplementary information: The proposed algorithms are implemented in an R program and are available at Bioinformatics online.


Abstract 015: Critical Periods in Cardiovascular Health Across the Life Course: A Pooled Cohort Analysis

Circulation ◽

10.1161/circ.137.suppl_1.015 ◽

2018 ◽

Vol 137 (suppl_1) ◽

Author(s):

Norrina B Allen ◽

Amy Krefman ◽

Darwin Labarthe ◽

Philip Greenland ◽

Markus Juonala ◽

...

Keyword(s):

Change Point ◽

Mixed Model ◽

Cardiovascular Health ◽

Linear Mixed Model ◽

Cohort Analysis ◽

Change Points ◽

Rapid Decline ◽

Critical Periods ◽

Ideal Cardiovascular Health ◽

Inflection Points

Background: The prevalence of Ideal Cardiovascular Health (CVH) decreases with age, beginning in childhood. However, more precise estimates of trajectories of CVH across the lifespan are needed to guide intervention. The aims of this analysis are to describe trajectories in CVH from childhood through middle age and examine whether there are critical inflection points in the decline in CVH. Methods: We pooled data from five prospective childhood/early adulthood cohorts including Bogalusa, Young Finns, HB!, CARDIA, and STRIP. Clinical CVH factors (blood pressure, BMI, cholesterol, glucose) were categorized as poor, intermediate and ideal, then summed to create a clinical CVH score ranging from 0 to 8 (higher score = more ideal CVH). The association between clinical CVH score and age in years was modeled using a segmented linear mixed model, with a random participant intercept, fixed slopes, and fixed change points. Change points were estimated using an extension of the R package ‘segmented’, which utilizes a likelihood-based approach to iteratively determine one or more change points. All models were adjusted for race, gender and cohort. Results: This study included 18,290 participants (51% female, 67% White, 46% between the ages of 8-11 at baseline). CVH scores decline with age from 8 through 55 years. We found two ages at which the slope of the CVH trajectories changes significantly. CVH scores are generally stable from age 8 until the first change point at age 17 (95% CI 16.3-17.4), when they begin to decline more rapidly with a 0.08 CVH unit loss per year from age 17 to 30. The second change point occurs at age 30 (26.7-33.6), when the rate of decline increases by an additional 0.01 units per year. Conclusion: The clinical CVH score declines from favorable levels from childhood through adulthood, with a rapid decline starting at age 17 that becomes slightly steeper from age 30 to 55 years. These inflection points signal that there are critical periods in an individual’s clinical CVH trajectory during which prevention efforts may be targeted.
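As a worked check on the reported slopes, the fixed part of a segmented linear model can be written with hinge terms, one per change point. The sketch below uses only the change points (17 and 30) and slope changes (0.08 and an additional 0.01 units per year) from the abstract. The flat slope before age 17 is an approximation of "generally stable", and the intercept of 0 is a placeholder that cancels in differences. The function name `segmented_mean` is hypothetical, and random effects and covariate adjustments are omitted.

```python
def segmented_mean(age, intercept, base_slope, knots, slope_changes):
    """Piecewise-linear mean: intercept + base_slope * age
    + sum over knots of slope_change * max(age - knot, 0)."""
    value = intercept + base_slope * age
    for knot, delta in zip(knots, slope_changes):
        value += delta * max(age - knot, 0.0)
    return value

knots = (17.0, 30.0)            # change points reported in the abstract
slope_changes = (-0.08, -0.01)  # slope change at 17, additional change at 30

def score(age):
    # Assumed: flat before 17, placeholder intercept of 0.
    return segmented_mean(age, 0.0, 0.0, knots, slope_changes)

decline_17_to_30 = score(17) - score(30)   # 13 years at 0.08 per year
decline_30_to_55 = score(30) - score(55)   # 25 years at 0.09 per year
total_decline = score(17) - score(55)
```

Under these assumptions the implied decline is about 1.04 units between ages 17 and 30 and about 2.25 units between ages 30 and 55, roughly 3.29 units in total on the 0 to 8 scale.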
