MODIFIED NEAREST NEIGHBOR METHOD FOR MULTISTEP AHEAD TIME SERIES FORECASTING

Multistep ahead time series forecasting has become an important activity in various fields of science and technology due to its usefulness in managing future events. Nearest neighbor search is a pattern matching algorithm for forecasting, and its accuracy depends considerably on how similar the pattern found in the database is to the reference pattern. The original time series is embedded into an optimal dimension, which is determined from the autocorrelation function plot. The last vector in the embedded matrix is taken as the reference vector and all previous vectors as candidate vectors. In the nearest neighbor algorithm, the reference vector is matched against all candidate vectors in terms of Euclidean distance, and the best matched pattern is used for forecasting. In this paper, we propose a hybrid distance measure, based on cross-correlation and Euclidean distance, to improve the search for the nearest neighbor. The candidate patterns are shortlisted by using cross-correlation, and then Euclidean distance is used to select the best matched pattern. Moreover, in multistep ahead forecasting, the standard nearest neighbor method introduces a bias in the search, which results in higher forecasting errors. We modify the search methodology to remove this bias by ignoring the latest forecasted value during the nearest neighbor search in the subsequent iteration. The proposed algorithm is evaluated on two benchmark time series as well as two real-life time series.
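The two-stage search described in this abstract can be sketched roughly as follows. This is a minimal one-step illustration, not the authors' implementation: the function names, the unit delay in the embedding, the use of zero-lag Pearson correlation as the cross-correlation score, and the `shortlist` size are all assumptions made for the example. The multistep bias correction is not shown.

```python
import math

def embed(series, dim):
    """Return all sliding windows of length `dim` (delay 1 is assumed)."""
    return [series[i:i + dim] for i in range(len(series) - dim + 1)]

def correlation(a, b):
    """Zero-lag normalized cross-correlation of two equal-length vectors."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den else 0.0

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def hybrid_nn_forecast(series, dim, shortlist=5):
    """One-step forecast: shortlist by correlation, then pick by Euclidean distance."""
    vectors = embed(series, dim)
    reference = vectors[-1]
    # Every vector except the last is a candidate; the value that followed
    # candidate i in the series is series[i + dim].
    candidates = range(len(vectors) - 1)
    ranked = sorted(candidates,
                    key=lambda i: correlation(vectors[i], reference),
                    reverse=True)
    best = min(ranked[:shortlist], key=lambda i: euclidean(vectors[i], reference))
    return series[best + dim]

series = [1, 2, 3, 4] * 5
forecast = hybrid_nn_forecast(series, dim=3)
```

On this periodic toy series the reference vector is `[2, 3, 4]`. Windows such as `[1, 2, 3]` correlate with it just as strongly, so the Euclidean stage is what separates the exact match from shape-only matches.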

LONG RANGE TIME SERIES FORECASTING BY UPSAMPLING AND USING CROSS-CORRELATION BASED SELECTION OF NEAREST NEIGHBOR

International Journal of Pattern Recognition and Artificial Intelligence ◽

10.1142/s021800140600523x ◽

2006 ◽

Vol 20 (08) ◽

pp. 1261-1278 ◽

Cited By ~ 1

Author(s):

SYED RAHAT ABBAS ◽

MUHAMMAD ARIF

Keyword(s):

Time Series ◽

Long Range ◽

Euclidean Distance ◽

Cross Correlation ◽

Nearest Neighbor ◽

Time Series Forecasting ◽

Embedding Dimension ◽

Auto Correlation Function ◽

Original Time ◽

Selection Of

Long range or multistep-ahead time series forecasting is an important issue in various fields of business, science and technology. In this paper, we propose a modified nearest neighbor based algorithm that can be used for long range time series forecasting. The selection of an embedding dimension that can unfold the dynamics of the system is improved by upsampling the original time series. Zeroth order cross-correlation and Euclidean distance criteria are used to select the nearest neighbor from the upsampled time series. The embedding dimension and the number of candidate vectors for nearest neighbor selection play an important role in forecasting. The embedding dimension is optimized by using the autocorrelation function (ACF) plot of the time series. It is observed that the proposed algorithm outperforms the standard nearest neighbor algorithm, and that the cross-correlation based criterion performs better than the Euclidean distance criterion.

The Earth Mover’s Distance as a Metric for the Space of Inorganic Compositions

10.26434/chemrxiv.12777566.v1 ◽

2020 ◽

Author(s):

Cameron Hargreaves ◽

Matthew Dyer ◽

Michael Gaultois ◽

Vitaliy Kurlin ◽

Matthew J Rosseinsky

Keyword(s):

Euclidean Distance ◽

Nearest Neighbor ◽

Nearest Neighbor Search ◽

Inorganic Crystal Structure Database ◽

Chemical Similarity ◽

Earth Mover's Distance ◽

Neighbor Search ◽

The Earth ◽

Binary Compounds

It is a core problem in any field to reliably tell how close two objects are to being the same, and once this relation has been established we can use this information to precisely quantify potential relationships, both analytically and with machine learning (ML). For inorganic solids, the chemical composition is a fundamental descriptor, which can be represented by assigning the ratio of each element in the material to a vector. These vectors are a convenient mathematical data structure for measuring similarity, but unfortunately, the standard metric (the Euclidean distance) gives little to no variance in the resultant distances between chemically dissimilar compositions. We present the Earth Mover’s Distance (EMD) for inorganic compositions, a well-defined metric which enables the measure of chemical similarity in an explainable fashion. We compute the EMD between two compositions from the ratio of each of the elements and the absolute distance between the elements on the modified Pettifor scale. This simple metric shows clear strength at distinguishing compounds and is efficient to compute in practice. The resultant distances have greater alignment with chemical understanding than the Euclidean distance, which is demonstrated on the binary compositions of the Inorganic Crystal Structure Database (ICSD). The EMD is a reliable numeric measure of chemical similarity that can be incorporated into automated workflows for a range of ML techniques. We have found that with no supervision the use of this metric gives a distinct partitioning of binary compounds into clear trends and families of chemical property, with future applications for nearest neighbor search queries in chemical database retrieval systems and supervised ML techniques.
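As a rough illustration of the idea, when elements sit on a one-dimensional scale and both compositions are normalized to sum to 1, the Earth Mover's Distance reduces to the area between the two cumulative distributions. The sketch below uses that closed form. Whether the authors compute it this way is not stated in the abstract, and the element labels and scale positions are hypothetical placeholders, not real modified Pettifor numbers.

```python
def emd_1d(comp_a, comp_b, position):
    """EMD between two compositions given as {element: ratio}.

    Ratios in each composition are assumed to sum to 1, and `position`
    maps each element to its coordinate on a one-dimensional scale.
    """
    elements = sorted(set(comp_a) | set(comp_b), key=lambda e: position[e])
    total = 0.0
    surplus = 0.0  # running difference of the two cumulative distributions
    for current, following in zip(elements, elements[1:]):
        surplus += comp_a.get(current, 0.0) - comp_b.get(current, 0.0)
        # The surplus mass must be carried across the gap to the next element.
        total += abs(surplus) * (position[following] - position[current])
    return total

# Hypothetical scale positions for three made-up elements.
position = {"A": 0, "B": 1, "C": 3}
d = emd_1d({"A": 0.5, "B": 0.5}, {"A": 0.5, "C": 0.5}, position)
```

Here half of the mass has to move from B to C, a distance of 2 on the placeholder scale, so the result is 1.0.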

Road Short-Term Travel Time Prediction Method Based on Flow Spatial Distribution and the Relations

Mathematical Problems in Engineering ◽

10.1155/2016/7626875 ◽

2016 ◽

Vol 2016 ◽

pp. 1-14 ◽

Cited By ~ 1

Author(s):

Mingjun Deng ◽

Shiru Qu

Keyword(s):

Time Series ◽

Spatial Distribution ◽

Travel Time ◽

Nonparametric Regression ◽

Nearest Neighbor ◽

Nearest Neighbor Search ◽

Short Term ◽

Combination Model ◽

The Road ◽

Neighbor Search

Many short-term road travel time forecasting studies are based on time series alone, yet road travel time depends not only on the historical travel time series but also on the historical flow of the road and its adjacent sections. Few studies have taken this into account. This paper builds a forecasting model on the correlation between the spatial distribution of flow and the road travel time series, applying nearest neighbor and nonparametric regression methods. For the spatial nearest neighbor search, three different spatial distances are defined. In addition, two forecasting functions are introduced: one combines the neighbors' values with equal (mean) weights, and the other uses the reciprocal of each nearest neighbor's distance as its weight. Each of the three distances is applied in the nearest neighbor search and paired with each of the two forecasting functions. Nearest neighbor and nonparametric regression are also applied to the travel time series. The combination model is then established with minimization of the forecast error variance as its objective. The empirical results show that the combination model clearly improves forecast performance. In addition, an evaluation of computational complexity shows that the proposed method can satisfy real-time requirements.
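The two forecasting functions mentioned in this abstract can be sketched as below. This is an illustration only: the function names and the small `eps` guard against division by zero are assumptions, and the paper's three spatial distances and its combination model are not reproduced.

```python
def mean_weight_forecast(neighbor_values):
    """Combine the neighbors' values with equal weights."""
    return sum(neighbor_values) / len(neighbor_values)

def inverse_distance_forecast(neighbor_values, distances, eps=1e-9):
    """Weight each neighbor by the reciprocal of its distance to the query.

    `eps` is an assumed guard so that an exact match (distance 0)
    does not cause a division by zero.
    """
    weights = [1.0 / (d + eps) for d in distances]
    weighted_sum = sum(w * v for w, v in zip(weights, neighbor_values))
    return weighted_sum / sum(weights)

# Two neighbors with travel times 10 and 20 at distances 1 and 3.
equal = mean_weight_forecast([10.0, 20.0])
weighted = inverse_distance_forecast([10.0, 20.0], [1.0, 3.0])
```

With equal weights the forecast is the midpoint of the two values, while the reciprocal-distance version pulls the forecast toward the closer neighbor.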

Multidimensional k-nearest neighbor model based on EEMD for financial time series forecasting

Physica A Statistical Mechanics and its Applications ◽

10.1016/j.physa.2017.02.072 ◽

2017 ◽

Vol 477 ◽

pp. 161-173 ◽

Cited By ~ 38

Author(s):

Ningning Zhang ◽

Aijing Lin ◽

Pengjian Shang

Keyword(s):

Time Series ◽

Nearest Neighbor ◽

Financial Time Series ◽

Time Series Forecasting ◽

Financial Time ◽

Model Based

Efficient Shared Execution Processing of k-Nearest Neighbor Joins in Road Networks

Mobile Information Systems ◽

10.1155/2018/1243289 ◽

2018 ◽

Vol 2018 ◽

pp. 1-17 ◽

Cited By ~ 1

Author(s):

Hyung-Ju Cho

Keyword(s):

Euclidean Distance ◽

Nearest Neighbor ◽

Real Life ◽

Road Networks ◽

Nearest Neighbors ◽

Superior Performance ◽

K Nearest Neighbor ◽

Wide Range ◽

Primitive Operation ◽

Nested Loop

We investigate the k-nearest neighbor (kNN) join in road networks to determine the k-nearest neighbors (NNs) from a dataset S to every object in another dataset R. The kNN join is a primitive operation and is widely used in many data mining applications. However, it is an expensive operation because it combines the kNN query and the join operation, whereas most existing methods assume the use of the Euclidean distance metric. We alternatively consider the problem of processing kNN joins in road networks where the distance between two points is the length of the shortest path connecting them. We propose a shared execution-based approach called the group-nested loop (GNL) method that can efficiently evaluate kNN joins in road networks by exploiting grouping and shared execution. The GNL method can be easily implemented using existing kNN query algorithms. Extensive experiments using several real-life roadmaps confirm the superior performance and effectiveness of the proposed method in a wide range of problem settings.

A methodology for applying k-nearest neighbor to time series forecasting

Artificial Intelligence Review ◽

10.1007/s10462-017-9593-z ◽

2017 ◽

Vol 52 (3) ◽

pp. 2019-2037 ◽

Cited By ~ 13

Author(s):

Francisco Martínez ◽

María Pilar Frías ◽

María Dolores Pérez ◽

Antonio Jesús Rivera

Keyword(s):

Time Series ◽

Nearest Neighbor ◽

Time Series Forecasting ◽

K Nearest Neighbor

Fuzzy Time Series Forecasting Models Evaluation Based on A Novel Distance Measure

Advances in Time Series Forecasting - Advances in Time Series Forecasting Volume 2 ◽

10.2174/9781681085289117020003 ◽

2017 ◽

pp. 1-23

Keyword(s):

Time Series ◽

Distance Measure ◽

Time Series Forecasting ◽

Fuzzy Time Series ◽

Forecasting Models

Query-sensitive distance measure selection for time series nearest neighbor classification

Intelligent Data Analysis ◽

10.3233/ida-150791 ◽

2016 ◽

Vol 20 (1) ◽

pp. 5-27 ◽

Cited By ~ 5

Author(s):

Alexios Kotsifakos ◽

Vassilis Athitsos ◽

Panagiotis Papapetrou

Keyword(s):

Time Series ◽

Nearest Neighbor ◽

Distance Measure ◽

Nearest Neighbor Classification ◽

Selection For ◽

Neighbor Classification

Discovery of Meaningful Rules by using DTW based on Cubic Spline Interpolation

Revista Tecnología en Marcha ◽

10.18845/tm.v33i2.4073 ◽

2020 ◽

Author(s):

Luis Alexander Calvo-Valverde ◽

David Elías Alfaro-Barboza

Keyword(s):

Time Series ◽

Data Science ◽

Distance Measure ◽

Spline Interpolation ◽

Real Life ◽

Cubic Spline ◽

Data Types ◽

Vast Number ◽

Cubic Spline Interpolation ◽

Using Data

The ability to make short- or long-term predictions is at the heart of much of science. In the last decade, the data science community has been highly interested in predicting real-life events, using data mining techniques to discover meaningful rules or patterns in different data types, including time series. Short-term predictions based on "the shape" of meaningful rules lead to a vast number of applications. The discovery of meaningful rules is achieved through efficient algorithms equipped with a robust and accurate distance measure. Consequently, it is important to choose a distance measure that can deal with noise, entropy and other technical constraints, in order to obtain accurate similarity results when comparing two time series. In this work, we argue that Dynamic Time Warping based on Cubic Spline Interpolation (SIDTW) can be useful for the similarity computation in two specific algorithms: (1) DiscoverRules() and (2) TestRules(). Mohammad Shokoohi-Yekta et al. developed a framework that uses these two algorithms to find and test meaningful rules in time series. Our research expands the scope of their project by adding a set of well-known similarity search measures, including SIDTW as a novel and enhanced version of DTW.

Hierarchical Electricity Time Series Forecasting for Integrating Consumption Patterns Analysis and Aggregation Consistency

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/487 ◽

2018 ◽

Cited By ~ 3

Author(s):

Yue Pang ◽

Bo Yao ◽

Xiangdong Zhou ◽

Yong Zhang ◽

Yiming Xu ◽

...

Keyword(s):

Time Series ◽

Demand Forecasting ◽

Real Life ◽

Forecast Accuracy ◽

Electricity Consumption ◽

Electricity Demand ◽

Time Series Forecasting ◽

Consumption Patterns ◽

Consumption Pattern ◽

Bottom Level

Electricity demand forecasting is a very important problem for energy supply and environmental protection. It can be formalized as a hierarchical time series forecasting problem with aggregation constraints that follow the geographical hierarchy, since the sum of the predictions for the disaggregated time series should equal the prediction for the aggregated one. However, in most previous work, aggregation consistency is ensured at the cost of forecast accuracy. In this paper, we propose a novel clustering-based hierarchical electricity time series forecasting approach. Instead of dealing with the geographical hierarchy directly, we explore electricity consumption patterns through clustering analysis and build a new time series hierarchy based on consumption patterns. We then present a novel hierarchical forecasting method with consumption hierarchical aggregation constraints to improve the electricity demand predictions at the bottom level, followed by a "bottom-up" method to obtain forecasts for the higher geographical levels. In particular, we observe that in our consumption pattern based hierarchy the reconciliation error of a bottom level time series is "correlated" with its degree of membership in the corresponding cluster (consumption pattern), and hence we apply this correlation as the regularization term in our forecasting objective function. Extensive experiments on real-life datasets verify that our approach achieves the best prediction accuracy compared with state-of-the-art methods.
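The "bottom-up" step and the aggregation constraint can be illustrated with a small sketch: forecasts exist only for the bottom-level series, and every higher-level forecast is obtained by summing its children. The hierarchy layout, node names, and numbers below are made up for the example. The clustering and regularized forecasting described in the abstract are not shown.

```python
def bottom_up(hierarchy, bottom_forecasts):
    """Aggregate bottom-level forecasts up a hierarchy.

    `hierarchy` maps each parent node to its list of children, and
    `bottom_forecasts` maps each leaf to a list of forecasts, one per horizon.
    """
    def total(node):
        children = hierarchy.get(node)
        if not children:
            return list(bottom_forecasts[node])
        # Sum the children's forecasts horizon by horizon.
        return [sum(step) for step in zip(*(total(child) for child in children))]

    return {node: total(node) for node in hierarchy}

# Hypothetical two-level hierarchy with two-step-ahead forecasts.
hierarchy = {"city": ["north", "south"], "north": ["n1", "n2"], "south": ["s1"]}
bottom = {"n1": [1, 2], "n2": [3, 4], "s1": [5, 6]}
aggregated = bottom_up(hierarchy, bottom)
```

By construction, each parent forecast equals the sum of its children's forecasts, so aggregation consistency holds at every level.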