Differentially Private Autocorrelation Time-Series Data Publishing Based on Sliding Window

2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jing Zhao ◽  
Shubo Liu ◽  
Xingxing Xiong ◽  
Zhaohui Cai

Privacy protection is one of the major obstacles to data sharing. Time-series data are characterized by autocorrelation, continuity, and large scale. Current research on time-series data publication largely ignores the correlation within time-series data and lacks adequate privacy protection. In this paper, we study the problem of correlated time-series data publication and propose a sliding window-based autocorrelation time-series data publication algorithm, called SW-ATS. Instead of using the global sensitivity of traditional differential privacy mechanisms, we propose periodic sensitivity to provide a stronger privacy guarantee. SW-ATS introduces a sliding window mechanism, with the correlation between the noise-adding sequence and the original time-series data guaranteed by sequence indistinguishability, to protect the privacy of the latest data. We prove that SW-ATS satisfies ε-differential privacy. Compared with the state-of-the-art algorithm, SW-ATS reduces the mean absolute error (MAE) by about 25%, improving data utility while providing stronger privacy protection.
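For orientation, the generic idea of spending a fixed privacy budget across a sliding window can be sketched as follows. This is not SW-ATS: the abstract does not specify periodic sensitivity or how correlated noise is generated, so the sketch uses the standard Laplace mechanism with independent noise and a uniform split of ε over the window. The names `publish_stream`, `window`, and `sensitivity` are assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two i.i.d. exponential variables with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def publish_stream(values, epsilon, window, sensitivity=1.0, seed=None):
    """Return a noisy copy of `values`.

    Assumption (not from the paper): the total budget `epsilon` is split
    uniformly over any `window` consecutive timestamps, so each timestamp
    is released with budget epsilon / window and independent Laplace noise.
    """
    if epsilon <= 0 or window <= 0 or sensitivity <= 0:
        raise ValueError("epsilon, window and sensitivity must be positive")
    rng = random.Random(seed)
    scale = sensitivity * window / epsilon  # Laplace scale per timestamp
    return [v + laplace_noise(scale, rng) for v in values]


def mean_absolute_error(original, released):
    """MAE between the original and the released series."""
    if len(original) != len(released):
        raise ValueError("series must have the same length")
    return sum(abs(a - b) for a, b in zip(original, released)) / len(original)


if __name__ == "__main__":
    series = [20.0, 21.5, 22.0, 21.0, 19.5, 20.5]
    noisy = publish_stream(series, epsilon=1.0, window=3, seed=0)
    print(noisy)
    print(mean_absolute_error(series, noisy))
```

Because the noise here is independent across timestamps, this baseline does not address the autocorrelation problem the abstract highlights; it only shows where the window size and the budget enter the noise scale.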

2020 ◽  
Author(s):  
Hsiao-Ko Chang ◽  
Hui-Chih Wang ◽  
Chih-Fen Huang ◽  
Feipei Lai

BACKGROUND In most of Taiwan’s medical institutions, congestion is a serious problem for emergency departments. Due to a lack of beds, patients spend more time in emergency retention zones, which makes it difficult to detect cardiac arrest (CA). OBJECTIVE We seek to develop a pharmaceutical early warning model to predict cardiac arrest in emergency departments via drug classification and medical expert suggestion. METHODS We propose a new early warning score model for detecting cardiac arrest via pharmaceutical classification and by using a sliding window; we apply learning-based algorithms to time-series data for a Pharmaceutical Early Warning Scoring Model (PEWSM). By treating pharmaceutical features as a dynamic time-series factor for cardiopulmonary resuscitation (CPR) patients, we increase sensitivity, reduce false alarm rates and mortality, and increase the model’s accuracy. To evaluate the proposed model we use the area under the receiver operating characteristic curve (AUROC). RESULTS Four important findings are as follows: (1) We identify the most important drug predictors: bits, and replenishers and regulators of water and electrolytes. The best AUROC of bits is 85%; that of replenishers and regulators of water and electrolytes is 86%. These two features are the most influential of the drug features in the task. (2) We verify feature selection, in which accounting for drugs improves accuracy: In Task 1, the best AUROC of vital signs is 77%, and that of all features is 86%. In Task 2, the best AUROC of all features is 85%, which demonstrates that accounting for drugs significantly affects prediction. (3) We use a better model: Alongside traditional machine learning, this study adds the long short-term memory (LSTM) model, whose best time-series accuracy is comparable to that of the traditional random forest (RF) model; both reach an AUROC of 85%. (4) We determine whether the event can be predicted beforehand: The best classifier is still an RF model, in which the observational starting time is 4 hours before the CPR event. Although the accuracy is impaired, the predictive accuracy still reaches 70%. Therefore, we believe that CPR events can be predicted four hours before the event. CONCLUSIONS This paper uses a sliding window to account for dynamic time-series data consisting of the patient’s vital signs and drug injections. In a comparison with NEWS, we improve predictive accuracy via feature selection, which includes drugs as features. In addition, LSTM yields better performance with time-series data. The proposed PEWSM, which offers 4-hour predictions, is better than the National Early Warning Score (NEWS) in the literature. This also confirms that the doctor’s heuristic rules are consistent with the results found by machine learning algorithms.


2021 ◽  
Author(s):  
Sadnan Al Manir ◽  
Justin Niestroy ◽  
Maxwell Adam Levinson ◽  
Timothy Clark

Introduction: Transparency of computation is a requirement for assessing the validity of computed results and research claims based upon them; and it is essential for access to, assessment, and reuse of computational components. These components may be subject to methodological or other challenges over time. While reference to archived software and/or data is increasingly common in publications, a single machine-interpretable, integrative representation of how results were derived, that supports defeasible reasoning, has been absent. Methods: We developed the Evidence Graph Ontology, EVI, in OWL 2, with a set of inference rules, to provide deep representations of supporting and challenging evidence for computations, services, software, data, and results, across arbitrarily deep networks of computations, in connected or fully distinct processes. EVI integrates FAIR practices on data and software, with important concepts from provenance models, and argumentation theory. It extends PROV for additional expressiveness, with support for defeasible reasoning. EVI treats any computational result or component of evidence as a defeasible assertion, supported by a DAG of the computations, software, data, and agents that produced it. Results: We have successfully deployed EVI for very-large-scale predictive analytics on clinical time-series data. Every result may reference its own evidence graph as metadata, which can be extended when subsequent computations are executed. Discussion: Evidence graphs support transparency and defeasible reasoning on results. They are first-class computational objects, and reference the datasets and software from which they are derived. They support fully transparent computation, with challenge and support propagation. The EVI approach may be extended to include instruments, animal models, and critical experimental reagents.


Algorithms ◽  
2020 ◽  
Vol 13 (4) ◽  
pp. 95 ◽  
Author(s):  
Johannes Stübinger ◽  
Katharina Adler

This paper develops the generalized causality algorithm and applies it to a multitude of data from the fields of economics and finance. Specifically, our parameter-free algorithm efficiently determines the optimal non-linear mapping and identifies varying lead–lag effects between two given time series. This procedure allows an elastic adjustment of the time axis to find similar but phase-shifted sequences; structural breaks in their relationship are also captured. A large-scale simulation study validates the outperformance in the vast majority of parameter constellations in terms of efficiency, robustness, and feasibility. Finally, the presented methodology is applied to real data from the areas of macroeconomics, finance, and metal. The pairs showing the highest similarity are gross domestic product and consumer price index (macroeconomics), S&P 500 index and Deutscher Aktienindex (finance), and gold and silver (metal). In addition, the algorithm makes full use of its flexibility and identifies both various structural breaks and regime patterns over time, which are (partly) well documented in the literature.


2018 ◽  
Vol 7 (3.3) ◽  
pp. 218 ◽  
Author(s):  
D Senthil ◽  
G Suseendran

Time series analysis is an important and complex problem in machine learning and statistics. In the existing system, Support Vector Machine (SVM) and Association Rule Mining (ARM) are used to process time series data. However, this approach suffers from lower accuracy and higher time complexity, and it also has problems with optimal rule discovery and segmentation of time series data. To avoid these issues, this research proposes a Sliding Window Technique based Improved ARM with Enhanced SVM (SWT-IARM with ESVM). In the proposed system, preprocessing is performed using Modified K-Means Clustering (MKMC). Indexing is done using an R-tree, which provides faster results. Segmentation is performed using SWT, which reduces the cost complexity through optimal segments. IARM is then applied for efficient rule discovery by generating the most frequent rules. Using the ESVM classification approach, the rules are classified more accurately.


2020 ◽  
Vol 496 (1) ◽  
pp. 629-637
Author(s):  
Ce Yu ◽  
Kun Li ◽  
Shanjiang Tang ◽  
Chao Sun ◽  
Bin Ma ◽  
...  

ABSTRACT Time series data of celestial objects are commonly used to study valuable and unexpected objects such as extrasolar planets and supernovae in time domain astronomy. Due to the rapid growth of data volume, traditional manual methods are becoming increasingly infeasible for continuously analysing accumulated observation data. To meet such demands, we designed and implemented a special tool named AstroCatR that can efficiently and flexibly reconstruct time series data from large-scale astronomical catalogues. AstroCatR can load original catalogue data from Flexible Image Transport System (FITS) files or databases, match each item to determine which object it belongs to, and finally produce time series data sets. To support the high-performance parallel processing of large-scale data sets, AstroCatR uses the extract-transform-load (ETL) pre-processing module to create sky zone files and balance the workload. The matching module uses the overlapped indexing method and an in-memory reference table to improve accuracy and performance. The output of AstroCatR can be stored in CSV files or be transformed into other formats as needed. In addition, the module-based software architecture ensures the flexibility and scalability of AstroCatR. We evaluated AstroCatR with actual observation data from the three Antarctic Survey Telescopes (AST3). The experiments demonstrate that AstroCatR can efficiently and flexibly reconstruct all time series data by setting relevant parameters and configuration files. Furthermore, the tool is approximately 3× faster than methods using relational database management systems at matching massive catalogues.
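The matching step described above (assigning each catalogue item to an object, then grouping by object into a time series) can be illustrated with a minimal zone-based cross-match. This is a generic sketch, not AstroCatR's overlapped indexing method, whose details the abstract does not give. The tuple layouts, the function names, and the default zone height are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def zone_of(dec, zone_height):
    """Index of the declination strip that contains `dec` (degrees)."""
    return int(math.floor((dec + 90.0) / zone_height))


def build_time_series(detections, reference, radius_deg, zone_height=0.5):
    """Group detections by their nearest reference object.

    detections: iterable of (time, ra, dec, magnitude)   [assumed layout]
    reference:  iterable of (object_id, ra, dec)         [assumed layout]
    Returns {object_id: [(time, magnitude), ...]} sorted by time.
    """
    if radius_deg > zone_height:
        # Searching only the neighbouring zones is valid only in this case.
        raise ValueError("radius_deg must not exceed zone_height")

    # In-memory reference table, bucketed by declination zone.
    index = defaultdict(list)
    for obj_id, ra, dec in reference:
        index[zone_of(dec, zone_height)].append((obj_id, ra, dec))

    series = defaultdict(list)
    for t, ra, dec, mag in detections:
        z = zone_of(dec, zone_height)
        best_id, best_sep = None, radius_deg
        # A match near a zone boundary may sit in an adjacent zone.
        for zz in (z - 1, z, z + 1):
            for obj_id, r_ra, r_dec in index.get(zz, ()):
                sep = angular_sep_deg(ra, dec, r_ra, r_dec)
                if sep <= best_sep:
                    best_id, best_sep = obj_id, sep
        if best_id is not None:
            series[best_id].append((t, mag))

    for points in series.values():
        points.sort()
    return dict(series)
```

Bucketing by declination keeps each lookup restricted to three small lists rather than the full reference table. Detections with no reference object within `radius_deg` are dropped in this sketch; a real pipeline would need a policy for them.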


2013 ◽  
Author(s):  
Zaixian Xie ◽  
Matthew O. Ward ◽  
Elke A. Rundensteiner
