Dynamic and Interpretable Hazard-Based Models of Traffic Incident Durations

2021 ◽  
Vol 2 ◽  
Author(s):  
Kieran Kalair ◽  
Colm Connaughton

Understanding and predicting the duration or “return-to-normal” time of traffic incidents is important for system-level management and optimization of road transportation networks. Increasing real-time availability of multiple data sources characterizing the state of urban traffic networks, together with advances in machine learning, offers the opportunity for new and improved approaches to this problem that go beyond static statistical analyses of incident duration. In this paper we consider two such improvements: dynamic update of incident duration predictions as new information about incidents becomes available, and automated interpretation of the factors responsible for these predictions. For our use case, we take one year of incident data and traffic state time-series data from the M25 motorway in London. We use it to train models that predict the probability distribution of incident durations, utilizing both time-invariant and time-varying features of the data. The latter allow predictions to be updated as an incident progresses and more information becomes available. For dynamic predictions, time-series features are fed into the Match-Net algorithm, a temporal convolutional hitting-time network recently developed for dynamical survival analysis in clinical applications. The predictions are benchmarked against static regression models for survival analysis and against an established dynamic technique known as landmarking, and are found to perform favourably by several standard comparison measures. To provide interpretability, we utilize the concept of Shapley values, recently developed in the domain of interpretable artificial intelligence, to rank the features most relevant to the model predictions at different time horizons. For example, the time of day is always a significantly influential time-invariant feature, whereas the time-series features strongly influence predictions at 5 and 60-min horizons. Although we focus here on traffic incidents, the methodology we describe can be applied to many survival analysis problems where time-series data is to be combined with time-invariant features.
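
As a rough illustration of how Shapley values rank features, the sketch below computes exact Shapley attributions for a toy prediction function by averaging marginal contributions over all feature orderings. It is not the authors' pipeline: the function `shapley_values`, the toy model, and the choice of replacing absent features with a single baseline vector are all assumptions made for illustration, and exact enumeration is only practical for a handful of features.

```python
from itertools import permutations

def shapley_values(predict, x, baseline):
    """Exact Shapley attributions for one input x.

    Assumption: a feature that is "absent" takes its value from `baseline`.
    The cost grows factorially with the number of features.
    """
    n = len(x)
    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for i in order:
            current[i] = x[i]              # switch feature i on
            updated = predict(current)
            phi[i] += updated - previous   # marginal contribution of i
            previous = updated
    return [total / len(orderings) for total in phi]

# Hypothetical toy model with one interaction term between features 0 and 2.
def toy_model(v):
    return 2 * v[0] + 3 * v[1] + v[0] * v[2]

attributions = shapley_values(toy_model, [1.0, 1.0, 1.0], [0.0, 0.0, 0.0])
```

The attributions sum to the difference between the prediction at `x` and at the baseline, and the interaction term is shared equally between the two features involved.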

2016 ◽  
Vol 50 (1) ◽  
pp. 41-57 ◽  
Author(s):  
Linghe Huang ◽  
Qinghua Zhu ◽  
Jia Tina Du ◽  
Baozhen Lee

Purpose – Wikis are a new form of information production and organization and have become one of the most important knowledge resources. In recent years, as the number of wiki users has grown, the “free rider problem” has become serious. In order to motivate editors to contribute more to a wiki system, it is important to fully understand their contribution behavior. The purpose of this paper is to explore the regularities in the dynamic contribution behavior of editors in wikis. Design/methodology/approach – After developing a dynamic model of contribution behavior, the authors employed both metrological and clustering methods to process the time series data. The experimental data were collected from Baidu Baike, a renowned Chinese wiki system similar to Wikipedia. Findings – There are four categories of editors: “testers,” “dropouts,” “delayers” and “stickers.” Testers contribute the least content and quickly stop contributing after editing a few articles. Dropouts stop contributing completely after editing a large amount of content. Delayers are editors who do not stop contributing during the observation time but may stop in the near future. Stickers, who keep contributing and edit the most content, are the core editors. In addition, there are significant time-of-day and holiday effects on the number of editors’ contributions. Originality/value – By using time series analysis, some new characteristics of editors and editor types were found. Compared with earlier studies, this research also had a larger sample. The results are therefore more representative and can help managers to better optimize wiki systems and formulate incentive strategies for editors.


2019 ◽  
Author(s):  
Aaron Jason Fisher ◽  
Peter D. Soyster

The present study sought to apply statistical classification methods to idiographic time series data in order to make accurate future predictions of behavior. We recruited 70 individuals who presented as regular smokers; 52 completed experience sampling method (ESM) data collection and provided sufficient time series data. Time stamps from ESM surveys were used to calculate the time of day, day of the week, and continuous time, the last of which was, in turn, used to calculate 12-hr and 24-hr cycles. Each individual’s time series was split into sequential training and testing sections, so that trained models could be tested on future observations. Prediction models were trained on the first 75% of the individual’s data and tested on the last 25%. Predictions of future behavior were made on a person-by-person basis. Two prediction algorithms were employed: elastic net regularization and naïve Bayes classification. Sample-wide area under the curve was nearly 80%, with some models demonstrating perfect prediction accuracies. Sensitivity and specificity were between 0.78 and 0.81 across the two approaches. Importantly, prediction models were based on a lagged data structure. Thus, in addition to supporting the prediction accuracy of our models with out-of-sample tests in time-forward data, the models themselves were time-lagged, such that each prediction was for the subsequent measurement. Such a system could be the basis for mobile, just-in-time interventions for substance use, as models that accurately predict future behavior could ostensibly be used for delivering personalized interventions at empirically-indicated moments of need.
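
The data preparation described above can be sketched in a few lines. The helper names below are hypothetical, and encoding the 12-hr and 24-hr cycles as sine and cosine terms of continuous time is an assumption, since the abstract does not say how the cycles were parameterized. The split is sequential rather than random, so the test section always lies in the future relative to the training section.

```python
import math

def cycle_features(hours, period):
    """Sine and cosine terms for a cycle of `period` hours (assumed encoding)."""
    angles = [2 * math.pi * h / period for h in hours]
    return [math.sin(a) for a in angles], [math.cos(a) for a in angles]

def lagged_pairs(predictors, outcomes):
    """Pair each observation's predictors with the outcome at the next one."""
    return list(zip(predictors[:-1], outcomes[1:]))

def sequential_split(rows, train_fraction=0.75):
    """Keep time order: first part for training, remainder for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Hypothetical usage on a short series of survey times (hours since start).
hours = [0, 6, 12, 18, 24, 30, 36, 42]
sin24, cos24 = cycle_features(hours, 24)
train, test = sequential_split(list(range(8)))
pairs = lagged_pairs([1, 2, 3], [0, 1, 0])
```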


2021 ◽  
Vol 1 (1) ◽  
Author(s):  
Marvin M. Mayerhofer ◽  
Falk Eigemann ◽  
Carsten Lackner ◽  
Jutta Hoffmann ◽  
Ferdi L. Hellweger

The functioning of microbial ecosystems has important consequences from global climate to human health, but quantitative mechanistic understanding remains elusive. The components of microbial ecosystems can now be observed at high resolution, but interactions still have to be inferred; e.g., a time series may show a bloom of bacteria X followed by virus Y, suggesting they interact. Existing inference approaches are mostly empirical, like correlation networks, which are not mechanistically constrained and do not provide quantitative mass fluxes, and thus have limited utility. We developed an inference method in which a mechanistic model with hundreds of species and thousands of parameters is calibrated to time series data. The large scale, nonlinearity and feedbacks pose a challenging optimization problem, which is overcome using a novel procedure that mimics natural speciation or diversification, e.g., a stepwise increase of bacteria species. The method allows for curation using species-level information from, e.g., physiological experiments or genome sequences. The product is a mass-balancing, mechanistically-constrained, quantitative representation of the ecosystem. We apply the method to characterize phytoplankton–heterotrophic bacteria interactions via dissolved organic matter (DOM) in a marine system. The resulting model predicts quantitative fluxes for each interaction and time point (e.g., 0.16 µmolC/L/d of chrysolaminarin to Polaribacter on April 16, 2009). At the system level, the flux network shows a strong correlation between the abundance of bacteria species and their carbon flux during blooms, with copiotrophs being relatively more important than oligotrophs. However, oligotrophs, like SAR11, are unexpectedly high carbon processors for weeks into blooms, due to their higher biomass. The fraction of exudates (vs. grazing/death products) in the DOM pool decreases during blooms, and they are preferentially consumed by oligotrophs. In addition, functional similarity of phytoplankton, i.e., what they produce, decouples their association with heterotrophs. The methodology is applicable to other microbial ecosystems, like the human microbiome or wastewater treatment plants.


2019 ◽  
Author(s):  
Aaron Jason Fisher ◽  
Hannah G Bosley

The present study tested a novel, person-specific method for identifying discrete mood profiles from time-series data, and examined the degree to which these profiles could be predicted by lagged mood and anxiety variables and time-based variables, including trends (linear, quadratic, cubic), cycles (12-hr, 24-hr, and 7-day), day of the week, and time of day. We analyzed ambulatory data from 45 individuals with mood and anxiety disorders prior to therapy. Data were collected four times daily for at least 30 days. Latent profile analysis was applied person-by-person to discretize each individual’s continuous multivariate time series of rumination, worry, fear, anger, irritability, anhedonia, hopelessness, depressed mood, and avoidance. That is, each time point was classified according to its unique blend of emotional states, and latent classes representing discrete mood profiles were identified for each participant. We found that the modal number of latent classes per person was three (mean = 3.04, median = 3), with a range of two to four classes. After splitting each individual’s time series into random halves for training and testing, we used elastic net regularization to identify the temporal and lagged predictors of each mood profile’s presence or absence in the training set. Prediction accuracy was evaluated in the testing set. Across 127 models, the average area under the curve was 0.77, with sensitivity of 0.81 and specificity of 0.75. Brier scores indicated an average prediction accuracy of 83%.
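
For readers unfamiliar with the evaluation measures named above, the following sketch shows the standard definitions of the Brier score, sensitivity, and specificity for a binary outcome. The function names, the 0.5 classification threshold, and the example numbers are assumptions for illustration; the abstract does not state the threshold used or how the Brier score was converted into a percentage.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    squared = [(p - y) ** 2 for p, y in zip(probabilities, outcomes)]
    return sum(squared) / len(squared)

def sensitivity_specificity(probabilities, outcomes, threshold=0.5):
    """True-positive rate and true-negative rate at an assumed threshold."""
    predicted = [1 if p >= threshold else 0 for p in probabilities]
    tp = sum(1 for p, y in zip(predicted, outcomes) if p == 1 and y == 1)
    fn = sum(1 for p, y in zip(predicted, outcomes) if p == 0 and y == 1)
    tn = sum(1 for p, y in zip(predicted, outcomes) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(predicted, outcomes) if p == 1 and y == 0)
    return tp / (tp + fn), tn / (tn + fp)

# Made-up predictions for four time points (not data from the study).
probs = [0.9, 0.2, 0.6, 0.4]
truth = [1, 0, 0, 1]
score = brier_score(probs, truth)
sens, spec = sensitivity_specificity(probs, truth)
```

A lower Brier score means better-calibrated probabilities; a score of 0 corresponds to perfect predictions.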


Algorithms ◽  
2020 ◽  
Vol 13 (9) ◽  
pp. 222 ◽  
Author(s):  
Eric S. Weber ◽  
Steven N. Harding ◽  
Lee Przybylski

We introduce a novel methodology for anomaly detection in time-series data. The method uses persistence diagrams and bottleneck distances to identify anomalies. Specifically, we generate multiple predictors by randomly bagging the data (reference bags) and then, for each data point, substituting that data point for a randomly chosen point in each bag (modified bags). The predictors are then the set of bottleneck distances for the reference/modified bag pairs. We prove the stability of the predictors as the number of bags increases. We apply our methodology to traffic data and measure the performance for identifying known incidents.
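
The bagging-and-substitution step can be sketched as below. This is only an outline of the bag construction under stated assumptions: the function names are hypothetical, and `range_gap` is a deliberately crude stand-in for the bottleneck distance between persistence diagrams, which the paper uses and which would normally come from a topological data analysis library.

```python
import random

def make_reference_bags(data, n_bags, bag_size, rng):
    """Draw reference bags by sampling the data without replacement."""
    return [rng.sample(data, bag_size) for _ in range(n_bags)]

def predictors_for_point(point, reference_bags, distance, rng):
    """One predictor per bag: distance between the reference bag and a copy
    in which a randomly chosen element is replaced by `point`."""
    predictors = []
    for reference in reference_bags:
        modified = list(reference)
        modified[rng.randrange(len(modified))] = point
        predictors.append(distance(reference, modified))
    return predictors

def range_gap(bag_a, bag_b):
    """Stand-in distance (assumption): difference in value range.
    Swap in a bottleneck distance on persistence diagrams for the real method."""
    return abs((max(bag_a) - min(bag_a)) - (max(bag_b) - min(bag_b)))

rng = random.Random(0)
series = [1.0, 1.1, 0.9] * 10          # made-up, well-behaved readings
bags = make_reference_bags(series, n_bags=20, bag_size=10, rng=rng)
normal_scores = predictors_for_point(1.0, bags, range_gap, rng)
outlier_scores = predictors_for_point(10.0, bags, range_gap, rng)
```

With this stand-in, substituting an extreme value changes every bag far more than substituting a typical value, which is the behaviour the predictors are meant to capture.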


2013 ◽  
Author(s):  
Stephen J. Tueller ◽  
Richard A. Van Dorn ◽  
Georgiy Bobashev ◽  
Barry Eggleston

Author(s):  
Rizki Rahma Kusumadewi ◽  
Wahyu Widayat

The exchange rate is one tool for measuring a country’s economic conditions. Stable growth in a currency’s value indicates that the country has relatively good or stable economic conditions. This study aims to analyze the factors that affect the exchange rate of the Indonesian Rupiah against the United States Dollar in the period 2000-2013. The data used are secondary time series data comprising exports, imports, inflation, the BI rate, Gross Domestic Product (GDP), and the money supply (M1) on a quarterly basis, from the first quarter of 2000 to the fourth quarter of 2013. A time series regression using ARCH-GARCH, with the ARCH model selected, indicates that the variables that significantly influence the exchange rate are exports, inflation, the central bank rate and the money supply (M1), whereas imports and GDP have no significant influence.
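
To make the ARCH idea concrete, the sketch below computes the conditional variance of an ARCH(1) process, in which the variance at each step depends on the squared regression residual from the previous step. The order of the model, the parameter values, and the residuals are assumptions for illustration; the abstract does not report the selected ARCH order or the estimated coefficients.

```python
def arch1_conditional_variance(residuals, omega, alpha):
    """Conditional variances for ARCH(1): var_t = omega + alpha * e_{t-1}^2.

    The first value uses the unconditional variance omega / (1 - alpha),
    which assumes 0 <= alpha < 1.
    """
    variances = [omega / (1.0 - alpha)]
    for previous_residual in residuals[:-1]:
        variances.append(omega + alpha * previous_residual ** 2)
    return variances

# Made-up residuals and parameters (not estimates from the study).
variances = arch1_conditional_variance([1.0, 2.0, 0.0], omega=0.5, alpha=0.5)
```

A large residual in one quarter raises the predicted variance for the next quarter, which is how the model captures volatility clustering in exchange-rate data.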

