A scalable framework for large time series prediction

2021 ◽  
Vol 63 (5) ◽  
pp. 1093-1116
Author(s):  
Youssef Hmamouche ◽  
Lotfi Lakhal ◽  
Alain Casali

Knowledge discovery systems are now expected to store and process very large volumes of data. With big time series, multivariate prediction becomes increasingly difficult, because using all the variables does not yield the most accurate predictions and poses problems for classical prediction models. In this article, we present a scalable process for large time series prediction, including a new algorithm for identifying time series predictors, which analyses the dependencies between time series using the mutual reinforcement principle between Hubs and Authorities of the HITS (Hyperlink-Induced Topic Search) algorithm. The proposed framework is evaluated on 3 real datasets. The results show that the best predictions are obtained using a very small number of predictors compared to the initial number of variables. The proposed feature selection algorithm shows promising results compared to widely known algorithms, such as classic and kernel principal component analysis, factor analysis, and the fast correlation-based filter method, and improves the prediction accuracy for many time series in the datasets used.
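The mutual reinforcement between Hubs and Authorities mentioned in the abstract can be illustrated with a generic HITS power iteration. This is only a minimal sketch of the standard HITS update, not the authors' predictor selection algorithm: the function name `hits`, the weighted adjacency matrix `adj` (assumed here to hold some pairwise dependency measure between series), and the iteration count are all illustrative assumptions.

```python
import math


def hits(adj, iterations=50):
    """Generic HITS power iteration on a weighted directed graph.

    adj[i][j] is the (assumed) dependency weight of the edge i -> j.
    Returns (hub_scores, authority_scores), each L2-normalised.
    """
    n = len(adj)
    hubs = [1.0] * n
    auths = [1.0] * n
    for _ in range(iterations):
        # Authority of j: weighted sum of the hub scores of nodes pointing to j.
        auths = [sum(adj[i][j] * hubs[i] for i in range(n)) for j in range(n)]
        norm = math.sqrt(sum(a * a for a in auths)) or 1.0
        auths = [a / norm for a in auths]
        # Hub of i: weighted sum of the authority scores of nodes i points to.
        hubs = [sum(adj[i][j] * auths[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(h * h for h in hubs)) or 1.0
        hubs = [h / norm for h in hubs]
    return hubs, auths


# Toy graph (hypothetical): series 0 and 1 both point to series 2.
toy_adj = [
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0],
]
toy_hubs, toy_auths = hits(toy_adj)
```

In this toy graph, node 2 receives all the authority weight while nodes 0 and 1 share the hub weight equally. How such scores are turned into a set of predictors is specific to the paper and is not reproduced here.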

Energies ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 2392
Author(s):  
Antonello Rosato ◽  
Rodolfo Araneo ◽  
Amedeo Andreotti ◽  
Federico Succetti ◽  
Massimo Panella

Here, we propose a new deep learning scheme to solve the energy time series prediction problem. The model implementation is based on the use of Long Short-Term Memory networks and Convolutional Neural Networks. These techniques are combined in such a fashion that inter-dependencies among several different time series can be exploited and used for forecasting purposes by filtering and joining their samples. The resulting learning scheme can be summarized as a superposition of network layers, resulting in a stacked deep neural architecture. We proved the accuracy and robustness of the proposed approach by testing it on real-world energy problems.


2020 ◽  
Vol 12 (11) ◽  
pp. 4730 ◽  
Author(s):  
Ping Wang ◽  
Hongyinping Feng ◽  
Guisheng Zhang ◽  
Daizong Yu

An accurate, reliable and stable air quality prediction system is conducive to public health and to the management of the atmospheric ecological environment; therefore, many models, individual or hybrid, have been widely implemented to deal with the prediction problem. However, many of these models either ignore or improperly extract the period information in air quality index (AQI) time series, which greatly impairs the models' learning efficiency. In this paper, a period extraction algorithm is proposed using a Luenberger observer, and a novel period-aware hybrid model combining the period extraction algorithm with traditional time series models is then built to exploit the comprehensive forecasting capacity for AQI time series with nonlinear and non-stationary noise. The hybrid model requires a multi-phase implementation. In the first step, the Luenberger observer is used to estimate the implied period function in the one-dimensional AQI series, and the analyzed time series is then mapped to the period space through this function to obtain the period information sub-series of the original series. In the second step, the period sub-series is combined with the original input vector, as additional input vector components aligned by time point, to establish a new data set. Finally, the new data set containing period information is used to train the traditional time series prediction models. Both theoretical proof and experimental results obtained on the hourly AQI values of Beijing, Tianjin, Taiyuan and Shijiazhuang in North China show that the hybrid model with period information is more robust and forecasts more accurately than the traditional benchmark models.
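The second step described above, combining the period sub-series with the original input vector, might look roughly like the following. This is a hedged sketch only: the abstract does not specify the exact alignment, so the choice to append the period values for the same time points as the lag window is an assumption, as are the function name `build_period_aware_dataset` and the parameter `n_lags`. The period sub-series is taken as given; the Luenberger observer step is not implemented here.

```python
def build_period_aware_dataset(series, period_series, n_lags):
    """Build (inputs, targets) for one-step-ahead forecasting.

    Each input is the lag window of the original series followed by the
    period sub-series values at the same time points (an assumed alignment).
    """
    if len(series) != len(period_series):
        raise ValueError("series and period_series must have equal length")
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for t in range(n_lags, len(series)):
        window = list(series[t - n_lags:t])
        period_window = list(period_series[t - n_lags:t])
        inputs.append(window + period_window)  # original lags + period info
        targets.append(series[t])
    return inputs, targets


# Hypothetical values; the period sub-series here is a stand-in,
# not the output of an actual observer.
aqi = [1, 2, 3, 4, 5]
period = [0, 1, 0, 1, 0]
X, y = build_period_aware_dataset(aqi, period, n_lags=2)
```

The resulting `X` and `y` could then be passed to any conventional time series model, which corresponds to the final training step in the abstract.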


2018 ◽  
Vol 2018 ◽  
pp. 1-14 ◽  
Author(s):  
Wen-Pei Chen ◽  
Shih-Hao Chang ◽  
Chuan-Yi Tang ◽  
Ming-Li Liou ◽  
Suh-Jen Jane Tsai ◽  
...  

Periodontitis is an inflammatory disease involving complex interactions between oral microorganisms and the host immune response. Understanding the structure of the microbiota community associated with periodontitis is essential for improving classifications and diagnoses of various types of periodontal diseases and will facilitate clinical decision-making. In this study, we used a 16S rRNA metagenomics approach to investigate and compare the compositions of the microbiota communities from 76 subgingival plaque samples, including 26 from healthy individuals and 50 from patients with periodontitis. Furthermore, we propose a novel feature selection algorithm for selecting the more informative features from many variables; these features were combined with machine learning methods to construct models for predicting the health status of patients with periodontal disease. We identified a total of 12 phyla, 124 genera, and 355 species and observed differences between health- and periodontitis-associated bacterial communities at all phylogenetic levels. We discovered that the genera Porphyromonas, Treponema, Tannerella, Filifactor, and Aggregatibacter were more abundant in patients with periodontal disease, whereas Streptococcus, Haemophilus, Capnocytophaga, Gemella, Campylobacter, and Granulicatella were found at higher levels in healthy controls. Using our feature selection algorithm, random forests performed better in terms of predictive power than other methods and consumed the least amount of computational time.


2006 ◽  
Vol 15 (06) ◽  
pp. 893-915 ◽  
Author(s):  
JIANG LI ◽  
JIANHUA YAO ◽  
RONALD M. SUMMERS ◽  
NICHOLAS PETRICK ◽  
MICHAEL T. MANRY ◽  
...  

We present an efficient feature selection algorithm for computer aided detection (CAD) computed tomographic (CT) colonography. The algorithm (1) determines an appropriate piecewise linear network (PLN) model by cross validation, (2) applies the orthonormal least square (OLS) procedure to the PLN model utilizing a Modified Schmidt procedure, and (3) uses a floating search algorithm to select features that minimize the output variance. The undesirable "nesting effect" is prevented by the floating search approach, and the piecewise linear OLS procedure makes this algorithm very computationally efficient because the Modified Schmidt procedure only requires one data pass during the whole searching process. The selected features are compared to those obtained by other methods, through cross validation with support vector machines (SVMs).
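The floating search in step (3) can be sketched generically as a sequential forward floating selection loop. This is an illustration under stated assumptions, not the authors' PLN/OLS procedure: the function name `floating_search` and the callable `cost` (standing in for the output variance the abstract says is minimized) are hypothetical. The backward step is what lets a previously added feature be dropped again, which is how the "nesting effect" is avoided.

```python
def floating_search(n_features, cost, k):
    """Sequential forward floating selection of k features (1 <= k <= n_features).

    cost(subset) returns a value to minimise for a list of feature indices.
    """
    selected = []
    best = {}  # subset size -> (cost, subset) for the best subset seen so far
    while len(selected) < k:
        # Forward step: add the feature that gives the lowest cost.
        remaining = [f for f in range(n_features) if f not in selected]
        add = min(remaining, key=lambda f: cost(selected + [f]))
        selected = selected + [add]
        c = cost(selected)
        size = len(selected)
        if size not in best or c < best[size][0]:
            best[size] = (c, list(selected))
        # Floating backward steps: drop a feature while doing so beats
        # the best subset previously recorded at the smaller size.
        while len(selected) > 2:
            drop = min(selected,
                       key=lambda f: cost([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            c = cost(reduced)
            if c < best[len(reduced)][0]:
                selected = reduced
                best[len(reduced)] = (c, list(reduced))
            else:
                break
    return best[k][1]


# Hypothetical cost: squared distance of the summed weights from a target.
weights = [4, 3, 2, 1]
chosen = floating_search(4, lambda s: (sum(weights[f] for f in s) - 5) ** 2, 2)
```

Every accepted backward step strictly improves the recorded best cost at that subset size, so the loop terminates. In the paper the criterion is evaluated through the piecewise linear OLS procedure rather than by calling an arbitrary function as done here.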


Author(s):  
Ronald Wesonga ◽  
Fabian Nabugoomu ◽  
Brian Masimbi

Flight delays affect passenger travel satisfaction and increase airline costs. The authors explore differences between airlines, focusing on their delays, using autoregressive integrated moving average models. Daily aviation data were used in the analysis and model development. Time series modelling for six airlines was done to predict delays as a function of the airport's timeliness performance. Findings show differences in the time series prediction models by airline. Differential analysis of the time series prediction models for airline delay suggests variations in airline efficiencies even at the same airport. The differences could be attributed to different management styles in the countries where the airlines originate. Thus, to improve airport timeliness performance, the study recommends airline-disaggregated studies to explore the dynamics attributable to the determinants of each airline's unique characteristics.

