Outlier Detection in Climatology Time Series with Sliding Window Prediction

Identifying outliers is important for climatology time series data. Better data quality improves decision capability, which in turn improves the overall operation. This paper proposes an algorithm that uses sliding window prediction to improve data decision capability. The time series is partitioned according to the size of the sliding window. A prediction model is then built from historical data to forecast new values. The difference between the predicted and measured values is compared with a predefined threshold; if the difference exceeds the threshold, the point is treated as an outlier. Experimental results show that the algorithm identifies outliers in climatology time series data and also improves correction efficiency.
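
The procedure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not name the prediction model, so the forecast here is simply the mean of the previous window, and the names `detect_outliers`, `window`, and `threshold` are hypothetical. Replacing a flagged point with its forecast is likewise an assumed correction step.

```python
from statistics import mean


def detect_outliers(series, window=5, threshold=3.0):
    """Flag points that deviate from a sliding-window forecast by more than threshold.

    Assumption: the forecast is the mean of the last `window` accepted values.
    The paper's actual prediction model is not given in the abstract.
    """
    history = list(series[:window])   # the first window is trusted as-is
    cleaned = list(series[:window])
    outliers = []
    for i in range(window, len(series)):
        predicted = mean(history[-window:])
        if abs(series[i] - predicted) > threshold:
            outliers.append(i)
            cleaned.append(predicted)  # assumed correction: use the forecast
            history.append(predicted)  # keep the outlier out of later forecasts
        else:
            cleaned.append(series[i])
            history.append(series[i])
    return outliers, cleaned


temps = [20.0, 20.5, 21.0, 20.8, 21.2, 35.0, 21.1, 20.9]
flagged, corrected = detect_outliers(temps, window=5, threshold=3.0)
print(flagged)
```

With this toy temperature series, only the reading at index 5 (35.0) is flagged. Feeding the forecast rather than the outlier back into the history keeps one bad reading from distorting the following predictions.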

2020 ◽  
Author(s):  
Hsiao-Ko Chang ◽  
Hui-Chih Wang ◽  
Chih-Fen Huang ◽  
Feipei Lai

BACKGROUND In most of Taiwan’s medical institutions, congestion is a serious problem for emergency departments. Due to a lack of beds, patients spend more time in emergency retention zones, which makes it difficult to detect cardiac arrest (CA). OBJECTIVE We seek to develop a pharmaceutical early warning model to predict cardiac arrest in emergency departments via drug classification and medical expert suggestion. METHODS We propose a new early warning score model for detecting cardiac arrest via pharmaceutical classification and by using a sliding window; we apply learning-based algorithms to time-series data for a Pharmaceutical Early Warning Scoring Model (PEWSM). By treating pharmaceutical features as a dynamic time-series factor for cardiopulmonary resuscitation (CPR) patients, we increase sensitivity, reduce false alarm rates and mortality, and increase the model’s accuracy. To evaluate the proposed model we use the area under the receiver operating characteristic curve (AUROC). RESULTS Four important findings are as follows: (1) We identify the most important drug predictors: bits, and replenishers and regulators of water and electrolytes. The best AUROC of bits is 85%; that of replenishers and regulators of water and electrolytes is 86%. These two features are the most influential of the drug features in the task. (2) We verify feature selection, in which accounting for drugs improves the accuracy: In Task 1, the best AUROC of vital signs is 77%, and that of all features is 86%. In Task 2, the best AUROC of all features is 85%, which demonstrates that accounting for the drugs significantly affects prediction. (3) We use a better model: For traditional machine learning, this study adds a new AI technology: the long short-term memory (LSTM) model with the best time-series accuracy, comparable to the traditional random forest (RF) model; the two AUROC measures are 85%.
(4) We determine whether the event can be predicted beforehand: The best classifier is still an RF model, in which the observational starting time is 4 hours before the CPR event. Although the accuracy is impaired, the predictive accuracy still reaches 70%. Therefore, we believe that CPR events can be predicted four hours before the event. CONCLUSIONS This paper uses a sliding window to account for dynamic time-series data consisting of the patient’s vital signs and drug injections. In a comparison with NEWS, we improve predictive accuracy via feature selection, which includes drugs as features. In addition, LSTM yields better performance with time-series data. The proposed PEWSM, which offers 4-hour predictions, is better than the National Early Warning Score (NEWS) in the literature. This also confirms that the doctor’s heuristic rules are consistent with the results found by machine learning algorithms.


Author(s):  
Achintya Mukhopadhyay ◽  
Subhashis Datta ◽  
Dipankar Sanyal

The effect of tailpipe friction on the combustion dynamics inside a thermal pulse combustor has been investigated using a nonlinear model consisting of four coupled first-order ordinary differential equations. The dynamics of the system are represented through time series plots, time-delay phase plots, and Poincaré maps. The results indicate that as the tailpipe friction factor is lowered, the system undergoes a transition from steady combustion through oscillating combustion to an intermittent combustion with chaotic characteristics before extinction. The time series data are shown to be a useful indicator for early detection of extinction. In one approach (thresholding), the occurrence of a local peak pressure below a predefined threshold value is identified as an event, and the number of events (event count) and the largest number of successive cycles with such events (event duration) are recorded as the friction factor is lowered. In another approach, the statistical moments (kurtosis) of the data are used. The number of kurtosis peaks above a prescribed value and the variance of the kurtosis values are recorded for decreasing values of the friction factor. All these numbers increase sharply as the system approaches extinction.
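
The two indicators described in this abstract might be computed along the following lines. This is a minimal sketch, not the authors' code: the function names are hypothetical, a "local peak" is assumed to be a sample strictly greater than both neighbours, and kurtosis is taken as the plain fourth standardized moment of the population.

```python
def local_peaks(pressure):
    """Return samples that are strictly greater than both neighbours."""
    return [pressure[i] for i in range(1, len(pressure) - 1)
            if pressure[i - 1] < pressure[i] > pressure[i + 1]]


def event_stats(peaks, threshold):
    """Return (event count, event duration) for peaks below the threshold.

    Event duration is the longest run of consecutive below-threshold peaks.
    """
    count = longest = run = 0
    for p in peaks:
        if p < threshold:
            count += 1
            run += 1
            longest = max(longest, run)
        else:
            run = 0
    return count, longest


def kurtosis(values):
    """Fourth standardized moment (population form, not excess kurtosis)."""
    n = len(values)
    m = sum(values) / n
    var = sum((v - m) ** 2 for v in values) / n
    return sum((v - m) ** 4 for v in values) / (n * var ** 2)


peaks = [1.2, 0.8, 0.7, 1.3, 0.9]          # illustrative peak pressures
print(event_stats(peaks, threshold=1.0))   # three weak peaks, longest run of two
```

In practice these statistics would be recomputed for each value of the friction factor, and their growth tracked as the system approaches extinction.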


2021 ◽  
Vol 2021 ◽  
pp. 1-10
Author(s):  
Jing Zhao ◽  
Shubo Liu ◽  
Xingxing Xiong ◽  
Zhaohui Cai

Privacy protection is one of the major obstacles to data sharing. Time-series data are characterized by autocorrelation, continuity, and large scale. Current research on time-series data publication largely ignores the correlation of time-series data and the resulting lack of privacy protection. In this paper, we study the problem of correlated time-series data publication and propose a sliding window-based autocorrelation time-series data publication algorithm, called SW-ATS. Instead of using global sensitivity as in traditional differential privacy mechanisms, we propose periodic sensitivity to provide a stronger privacy guarantee. SW-ATS introduces a sliding window mechanism, with the correlation between the noise-adding sequence and the original time-series data guaranteed by sequence indistinguishability, to protect the privacy of the latest data. We prove that SW-ATS satisfies ε-differential privacy. Compared with the state-of-the-art algorithm, SW-ATS reduces the MAE by about 25%, improves the utility of the data, and provides stronger privacy protection.


Axioms ◽  
2020 ◽  
Vol 9 (2) ◽  
pp. 49
Author(s):  
Anton Romanov ◽  
Valeria Voronina ◽  
Gleb Guskov ◽  
Irina Moshkina ◽  
Nadezhda Yarushkina

The development of the economy and the transition to Industry 4.0 create new challenges for artificial intelligence methods. Such challenges include the processing of large volumes of data, the analysis of various dynamic indicators, the discovery of complex dependencies in the accumulated data, and the forecasting of the state of processes. The main point of this study is the development of a set of analytical and prognostic methods. The methods described in this article are based on fuzzy logic, statistics, and time series data mining, because data extracted from dynamic systems are initially incomplete and have a high degree of uncertainty. The ultimate goal of the study is to improve the quality of data analysis in industrial and economic systems. The advantages of the proposed methods are flexibility and an orientation toward high interpretability of dynamic data. The high level of interpretability and interoperability of dynamic data is achieved through a combination of time series data mining and knowledge base engineering methods. Merging a set of rules extracted from the time series with knowledge base rules allows a forecast to be made when the length and nature of the time series are insufficient. The proposed methods are also based on the summarization of the results of process modeling for diagnosing technical systems, forecasting the economic condition of enterprises, and approaches to the technological preparation of production in a multi-productive production program, with the application of type 2 fuzzy sets for time series modeling. Intelligent systems based on the proposed methods demonstrate an increase in the quality and stability of their functioning. This article contains a set of experiments to support this statement.


Author(s):  
Subhashis Datta ◽  
Achintya Mukhopadhyay ◽  
Dipankar Sanyal

A nonlinear fourth-order dynamic model of a thermal pulse combustor has been developed. In this work, the time series data generated by solving the fourth-order system are converted into a set of symbols based on the values of the pressure variables. The key step in symbolization is the transformation of the original values into a stream of discretized symbols by partitioning the range of observed values into a finite number of regions and then assigning a symbol to each measurement based on the region in which it falls. Once all the measured values are symbolized, a symbol sequence vector consisting of L successive temporal observations is defined and its relative frequency is determined. In this work, the relative frequencies of different symbol sequences are computed by scanning the time series data in the forward and reverse directions. The difference between the relative frequencies obtained in forward and reverse scanning is termed the "irreversibility" of the process. It is observed that, for given alphabet and word sizes, the "irreversibility" increases as the system approaches extinction. The effects of different choices of alphabet and word sizes are also considered.
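
The symbolization and forward/reverse comparison described here can be illustrated with a short sketch. Several details are assumptions, since the abstract does not fix them: the partition is equal-width over the observed range, the "difference" is aggregated as the sum of absolute differences over all words, and the function names are hypothetical.

```python
from collections import Counter


def symbolize(values, alphabet_size):
    """Map each value to an integer symbol via an equal-width partition (assumed)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / alphabet_size or 1.0   # guard against a constant series
    return [min(int((v - lo) / width), alphabet_size - 1) for v in values]


def word_frequencies(symbols, word_len):
    """Relative frequency of each word of `word_len` successive symbols."""
    words = [tuple(symbols[i:i + word_len])
             for i in range(len(symbols) - word_len + 1)]
    total = len(words)
    return {w: c / total for w, c in Counter(words).items()}


def irreversibility(values, alphabet_size=2, word_len=3):
    """Sum of absolute differences between forward and reverse word frequencies."""
    symbols = symbolize(values, alphabet_size)
    fwd = word_frequencies(symbols, word_len)
    rev = word_frequencies(symbols[::-1], word_len)
    return sum(abs(fwd.get(w, 0.0) - rev.get(w, 0.0))
               for w in set(fwd) | set(rev))


sawtooth = [0, 1, 2, 0, 1, 2, 0, 1, 2]     # clearly time-asymmetric
symmetric = [0, 1, 0, 1, 0]                # reads the same in both directions
print(irreversibility(sawtooth, alphabet_size=3, word_len=2))
print(irreversibility(symmetric, alphabet_size=2, word_len=2))
```

A series that reads the same forward and backward gives zero, while the sawtooth shares no words between the two scanning directions and gives the maximum value under this measure.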


2013 ◽  
Vol 63 (2) ◽  
Author(s):  
M. H. Osman ◽  
Z. M. Nopiah ◽  
S. Abdullah ◽  
A. Lennie

An overlapping segmentation method is often applied to time series data to prepare a training dataset, i.e. the population of instances, for classification data mining. A large number of redundant instances burdens the training process with heavy computation. This happens when practitioners fail to choose an appropriate amount of overlap when segmenting the time series. Fortunately, the risk can be reduced if knowledge preferences can be determined to guide the overlapping criteria in the segmentation algorithm. Thus, this study investigates how the Varri method can contribute to a better understanding of preparing a training dataset consisting of non-redundant fatigue segments from the loading history (fatigue signal). Generally, the method locates segment boundaries at local maxima of the difference function that exceed the assigned threshold. In the present study, the mean and standard deviation are used to define the function, because predicting attributes are the key components in defining instance redundancy. The resulting dataset from the proposed method is used to train three classification algorithms under the supervision of a genetic algorithm-based feature selection wrapper approach. The average performance index shows an additional advantage of the proposed method over the conventional procedure for preparing a training dataset.
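
The boundary rule in this abstract (local maxima of a difference function above a threshold) can be sketched as below. This is a loose illustration rather than the study's implementation: the abstract says only that the mean and standard deviation define the function, so combining them as a Euclidean distance between adjacent windows is an assumption, and the names `difference_function` and `segment_boundaries` are hypothetical.

```python
from math import hypot
from statistics import mean, pstdev


def difference_function(signal, window):
    """Compare the window ending at index i with the window starting at i.

    Assumption: the difference is the Euclidean distance between the
    (mean, standard deviation) pairs of the two adjacent windows.
    """
    g = []
    for i in range(window, len(signal) - window + 1):
        left, right = signal[i - window:i], signal[i:i + window]
        g.append(hypot(mean(left) - mean(right),
                       pstdev(left) - pstdev(right)))
    return g


def segment_boundaries(signal, window, threshold):
    """Return signal indices where the difference function peaks above threshold."""
    g = difference_function(signal, window)
    bounds = []
    for k in range(1, len(g) - 1):
        if g[k] > threshold and g[k - 1] < g[k] >= g[k + 1]:
            bounds.append(k + window)   # shift back to an index in `signal`
    return bounds


load = [0] * 6 + [10] * 6               # toy loading history with one step change
print(segment_boundaries(load, window=3, threshold=5.0))
```

For this toy signal the only boundary found is at index 6, where the step occurs. Raising the threshold yields fewer, more distinct segments, which is the lever the abstract relates to instance redundancy.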


2020 ◽  
Vol 29 (07n08) ◽  
pp. 2040010
Author(s):  
Shao-Pei Ji ◽  
Yu-Long Meng ◽  
Liang Yan ◽  
Gui-Shan Dong ◽  
Dong Liu

Time series data from real problems have nonlinear, non-smooth, and multi-scale composite characteristics. This paper first proposes a gated recurrent unit-correction (GRU-corr) network model, which adds a correction layer to the GRU neural network. Then, an adaptive staged variation PSO (ASPSO) is proposed. Finally, to overcome the imprecise selection of the GRU-corr network parameters and obtain a high-precision global optimization of those parameters, the weight parameters and the number of hidden nodes of GRU-corr are optimized by ASPSO, and a time series prediction model (ASPSO-GRU-corr) is proposed based on the GRU-corr optimized by ASPSO. In the experiment, the optimization performance of ASPSO was compared on a benchmark function to verify its validity, and the ASPSO-GRU-corr model was then used to predict ship motion cross-sway angle data. The results show that ASPSO has better optimization performance and convergence speed than the other algorithms, while ASPSO-GRU-corr has higher generalization performance and lower architecture complexity. ASPSO-GRU-corr can reveal the intrinsic multi-scale composite features of the time series, making it a reliable prediction method for nonlinear, non-steady time series.


2018 ◽  
Vol 7 (3.3) ◽  
pp. 218 ◽  
Author(s):  
D Senthil ◽  
G Suseendran

Time series analysis is an important and complex problem in machine learning and statistics. In the existing system, Support Vector Machine (SVM) and Association Rule Mining (ARM) are used to process time series data. However, this approach suffers from lower accuracy and higher time complexity, and it also has difficulty with optimal rule discovery and segmentation of time series data. To avoid these issues, this research proposes a Sliding Window Technique based Improved ARM with Enhanced SVM (SWT-IARM with ESVM). In the proposed system, preprocessing is performed using Modified K-Means Clustering (MKMC). Indexing is done with an R-tree, which provides faster results. Segmentation is performed using SWT, which reduces the cost complexity by producing optimal segments. IARM is then applied for efficient rule discovery by generating the most frequent rules. With the ESVM classification approach, the rules are classified more accurately.


2017 ◽  
Vol 04 (04) ◽  
pp. 1750045 ◽  
Author(s):  
Dilip B. Madan ◽  
King Wang

Market clichés assert that markets take escalators up and elevators down. The observation suggests differentiating models for up and down moves. Non-diffusive models allow for this and we model the move as the difference of two independent mean reverting increasing processes driven by gamma process shocks. The model is estimated on time series data as well as option data. Broadly speaking, the rise occurs with more frequent and smaller jumps with a faster rate of convergence to equilibrium. The down tick process has larger, less frequent moves with longer memories. Applications to delta hedging and the setting of profit targets and stop losses are also presented.


2010 ◽  
Vol 113-116 ◽  
pp. 1367-1370 ◽  
Author(s):  
Bin Sheng Liu ◽  
Ying Wang ◽  
Xue Ping Hu

There are many ways to predict drinking water quality, such as neural networks, gray models, and ARIMA, but their prediction precision needs improvement. This paper proposes a new forecasting method based on the characteristics of drinking water quality, and the evidence shows that the prediction is effective. The method can therefore be used in actual prediction.

