Cross-City Transfer Learning for Deep Spatio-Temporal Prediction

Author(s):  
Leye Wang ◽  
Xu Geng ◽  
Xiaojuan Ma ◽  
Feng Liu ◽  
Qiang Yang

Spatio-temporal prediction is a key type of task in urban computing, e.g., traffic flow and air quality prediction. Adequate data is usually a prerequisite, especially when deep learning is adopted. However, the development levels of different cities are unbalanced, and many cities still suffer from data scarcity. To address the problem, we propose a novel cross-city transfer learning method for deep spatio-temporal prediction tasks, called RegionTrans. RegionTrans aims to effectively transfer knowledge from a data-rich source city to a data-scarce target city. More specifically, we first learn an inter-city region matching function to match each target city region to a similar source city region. A neural network is then designed to effectively extract region-level representations for spatio-temporal prediction. Finally, an optimization algorithm is proposed to transfer learned features from the source city to the target city with the region matching function. Using citywide crowd flow prediction as a demonstration experiment, we verify the effectiveness of RegionTrans. Results show that RegionTrans can outperform state-of-the-art fine-tuned deep spatio-temporal prediction models, reducing prediction error by up to 10.7%.
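The region matching step can be illustrated with a minimal sketch. The abstract does not say how similarity between regions is measured, so the code below assumes, purely for illustration, that each target region is matched to the source region whose time series has the highest Pearson correlation with it. The names `pearson` and `match_regions` and the toy data are hypothetical and are not taken from RegionTrans.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

def match_regions(target_series, source_series):
    """Map each target region id to the most correlated source region id.

    Assumption: similarity = Pearson correlation over an aligned time window.
    """
    matching = {}
    for t_id, t_vals in target_series.items():
        best = max(source_series,
                   key=lambda s_id: pearson(t_vals, source_series[s_id]))
        matching[t_id] = best
    return matching

# Toy crowd-flow series: S1 rises over time, S2 falls.
source = {"S1": [10, 20, 30, 40], "S2": [40, 30, 20, 10]}
target = {"T1": [1, 2, 3, 5], "T2": [9, 7, 4, 2]}
matching = match_regions(target, source)
```

In this toy case the rising target region `T1` pairs with `S1` and the falling region `T2` pairs with `S2`. A transfer method could then use such a mapping to decide which source-region features to align with each target region.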

PeerJ ◽  
2018 ◽  
Vol 6 ◽  
pp. e5518 ◽  
Author(s):  
Tomislav Hengl ◽  
Madlene Nussbaum ◽  
Marvin N. Wright ◽  
Gerard B.M. Heuvelink ◽  
Benedikt Gräler

Random forest and similar Machine Learning techniques are already used to generate spatial predictions, but spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still existent in the cross-validation residuals, indicates that the predictions may be biased, and this is suboptimal. This paper presents a random forest for spatial predictions framework (RFsp) where buffer distances from observation points are used as explanatory variables, thus incorporating geographical proximity effects into the prediction process. The RFsp framework is illustrated with examples that use textbook datasets and apply spatial and spatio-temporal prediction to numeric, binary, categorical, multivariate and spatiotemporal variables. Performance of the RFsp framework is compared with the state-of-the-art kriging techniques using fivefold cross-validation with refitting. The results show that RFsp can obtain predictions that are as accurate and unbiased as those of different versions of kriging. Advantages of using RFsp over kriging are that it needs no rigid statistical assumptions about the distribution and stationarity of the target variable, it is more flexible towards incorporating, combining and extending covariates of different types, and it possibly yields more informative maps characterizing the prediction error. RFsp appears to be especially attractive for building multivariate spatial prediction models that can be used as “knowledge engines” in various geoscience fields. Some disadvantages of RFsp are the exponentially growing computational intensity with increase of calibration data and covariates and the high sensitivity of predictions to input data quality.
The key to the success of the RFsp framework might be the training data quality, especially the quality of spatial sampling (to minimize extrapolation problems and any type of bias in data) and the quality of model validation (to ensure that accuracy is not affected by overfitting). For many data sets, especially those with a smaller number of points and covariates and close-to-linear relationships, model-based geostatistics can still lead to more accurate predictions than RFsp.
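The idea of using buffer distances as explanatory variables can be sketched as a feature-construction step. The snippet below assumes plain Euclidean distance in a projected coordinate system and builds one distance column per observation point. The function name `buffer_distance_features` and the coordinates are hypothetical, and the random forest itself is left out.

```python
from math import hypot

def buffer_distance_features(locations, observations):
    """Return one row per location holding its distance to every observation point.

    Assumption: coordinates are planar (x, y) pairs and distance is Euclidean.
    """
    return [
        [hypot(x - ox, y - oy) for ox, oy in observations]
        for x, y in locations
    ]

# Two observation points and three prediction locations (toy coordinates).
observations = [(0.0, 0.0), (3.0, 4.0)]
grid = [(0.0, 0.0), (3.0, 0.0), (6.0, 8.0)]
features = buffer_distance_features(grid, observations)
```

Each row of `features` could then be passed to any random forest implementation, alongside other covariates, so that the trees can split on proximity to individual observation points. The number of such columns grows with the number of observations, which is consistent with the computational cost the abstract mentions.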


Energies ◽  
2020 ◽  
Vol 13 (13) ◽  
pp. 3440
Author(s):  
Arnas Uselis ◽  
Mantas Lukoševičius ◽  
Lukas Stasytis

Convolutional Neural Networks (CNNs) possess many positive qualities when it comes to spatial raster data. Translation invariance enables CNNs to detect features regardless of their position in the scene. However, in some domains, such as geospatial data, not all locations are exactly equal. In this work, we propose localized convolutional neural networks that enable convolutional architectures to learn local features in addition to the global ones. We investigate their instantiations in the form of learnable inputs, local weights, and a more general form. They can be added to any convolutional layers, are easily trained end-to-end, introduce minimal additional complexity, and let CNNs retain most of their benefits to the extent that they are needed. In this work we address spatio-temporal prediction: we test the effectiveness of our methods on a synthetic benchmark dataset and tackle three real-world wind prediction datasets. For one of them, we propose a method to spatially order the unordered data. We compare the recent state-of-the-art spatio-temporal prediction models on the same data. Models that use convolutional layers can be and are extended with our localizations. In all these cases our extensions improve the results, and thus often the state-of-the-art. We share all the code at a public repository.
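One way to picture the "learnable inputs" variant is as an extra input channel holding one trainable value per location, convolved together with the data. The 1D sketch below is an illustrative reading of that idea, not the authors' implementation. The function names, kernels and values are hypothetical, and the local values are fixed by hand here rather than trained.

```python
def conv1d_same(signal, kernel):
    """1D cross-correlation with zero padding; output has the input's length."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out

def localized_conv1d(x, local_input, kernel_x, kernel_local):
    """Shared convolution over the data plus a convolution over a per-location channel.

    Assumption: local_input plays the role of a learnable input, one value per
    position, which would be updated by gradient descent during training.
    """
    shared = conv1d_same(x, kernel_x)
    local = conv1d_same(local_input, kernel_local)
    return [s + l for s, l in zip(shared, local)]

x = [1.0, 1.0, 1.0, 1.0]            # identical data at every position
local_input = [0.0, 0.0, 2.0, 0.0]  # position 2 carries a location-specific value
identity = [0.0, 1.0, 0.0]
plain = conv1d_same(x, identity)
localized = localized_conv1d(x, local_input, identity, identity)
```

With identical data at every position, the plain convolution produces identical outputs everywhere, while the localized version produces a different output at position 2. This is the kind of location dependence a purely translation-invariant layer cannot express.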


2022 ◽  
Vol 165 ◽  
pp. 106511
Author(s):  
Cheuk Ki Man ◽  
Mohammed Quddus ◽  
Athanasios Theofilatos

2018 ◽  
Author(s):  
Tomislav Hengl ◽  
Madlene Nussbaum ◽  
Marvin N Wright ◽  
Gerard B.M. Heuvelink ◽  
Benedikt Gräler

Random forest and similar Machine Learning techniques are already used to generate spatial predictions, but spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still existent in the cross-validation residuals, indicates that the predictions may be biased, and this is suboptimal. This paper presents a random forest for spatial predictions framework (RFsp) where buffer distances from observation points are used as explanatory variables, thus incorporating geographical proximity effects into the prediction process. The RFsp framework is illustrated with examples that use textbook datasets and apply spatial and spatio-temporal prediction to numeric, binary, categorical, multivariate and spatiotemporal variables. Performance of the RFsp framework is compared with the state-of-the-art kriging techniques using 5-fold cross-validation with refitting. The results show that RFsp can obtain predictions that are as accurate and unbiased as those of different versions of kriging. Advantages of using RFsp over kriging are that it needs no rigid statistical assumptions about the distribution and stationarity of the target variable, it is more flexible towards incorporating, combining and extending covariates of different types, and it possibly yields more informative maps characterizing the prediction error. RFsp appears to be especially attractive for building multivariate spatial prediction models that can be used as "knowledge engines" in various geoscience fields. Some disadvantages of RFsp are the exponentially growing computational intensity with increase of calibration data and covariates, sensitivity of predictions to input data quality, and extrapolation problems.
The key to the success of the RFsp framework might be the training data quality, especially the quality of spatial sampling (to minimize extrapolation problems and any type of bias in data) and the quality of model validation (to ensure that accuracy is not affected by overfitting). For many data sets, especially those with a smaller number of points and covariates and close-to-linear relationships, model-based geostatistics can still lead to more accurate predictions than RFsp.


Author(s):  
Tomislav Hengl ◽  
Madlene Nussbaum ◽  
Marvin N Wright ◽  
Gerard B.M. Heuvelink

Random forest and similar Machine Learning techniques are already used to generate spatial predictions, but spatial location of points (geography) is often ignored in the modeling process. Spatial auto-correlation, especially if still existent in the cross-validation residuals, indicates that the predictions may be biased, and this is suboptimal. This paper presents a random forest for spatial predictions framework (RFsp) where buffer distances from observation points are used as explanatory variables, thus incorporating geographical proximity effects into the prediction process. The RFsp framework is illustrated with examples that use textbook datasets and apply spatial and spatio-temporal prediction to numeric, binary, categorical, multivariate and spatiotemporal variables. Performance of the RFsp framework is compared with the state-of-the-art kriging techniques using 5-fold cross-validation with refitting. The results show that RFsp can obtain predictions that are as accurate and unbiased as those of different versions of kriging. Advantages of using RFsp over kriging are that it needs no rigid statistical assumptions about the distribution and stationarity of the target variable, it is more flexible towards incorporating, combining and extending covariates of different types, and it possibly yields more informative maps characterizing the prediction error. RFsp appears to be especially attractive for building multivariate spatial prediction models that can be used as "knowledge engines" in various geoscience fields. Some disadvantages of RFsp are the exponentially growing computational intensity with increase of calibration data and covariates and the high sensitivity of predictions to input data quality. For many data sets, especially those with a smaller number of points and covariates and close-to-linear relationships, model-based geostatistics can still lead to more accurate predictions than RFsp.


Author(s):  
Saeedeh Zebhi ◽  
SMT Almodarresi ◽  
Vahid Abootalebi

A Gait History Image (GHI) is a spatial template that accumulates regions of motion into a single image in which moving pixels are brighter than others. A new descriptor named Time-Sliced Averaged Gradient Boundary Magnitude (TAGBM) is also designed to show the time variations of motion. The spatial and temporal information of each video can be condensed using these templates. Based on this observation, a new method is proposed in this paper. Each video is split into N and M groups of consecutive frames, and the GHI and TAGBM are computed for each group, resulting in spatial and temporal templates. Transfer learning with the fine-tuning technique is used for classifying these templates. The proposed method achieves recognition accuracies of 96.50%, 92.30% and 97.12% for the KTH, UCF Sport and UCF-11 action datasets, respectively. It is also compared with state-of-the-art approaches, and the results show that the proposed method has the best performance.
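The general notion of accumulating motion into one template image can be sketched with simple frame differencing. This is not the GHI or TAGBM formulation from the paper, whose exact definitions the abstract does not give. The function `motion_template`, its `threshold` parameter and the toy frames are hypothetical. The sketch marks a pixel as moving when its intensity changes by more than the threshold between consecutive frames, then scales the per-pixel motion count to the 0-255 range.

```python
def motion_template(frames, threshold=10):
    """Accumulate thresholded frame differences into one grayscale template.

    frames: list of 2D lists of pixel intensities, all with the same shape.
    Pixels that move in more frame pairs end up brighter in the template.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    counts = [[0] * cols for _ in range(rows)]
    for prev, curr in zip(frames, frames[1:]):
        for r in range(rows):
            for c in range(cols):
                if abs(curr[r][c] - prev[r][c]) > threshold:
                    counts[r][c] += 1
    pairs = len(frames) - 1
    # Integer scaling of the motion count to the 0-255 intensity range.
    return [[255 * n // pairs for n in row] for row in counts]

# Three tiny 1x3 frames: a bright blob moves from the middle to the right.
frames = [
    [[0, 0, 0]],
    [[0, 200, 0]],
    [[0, 0, 200]],
]
template = motion_template(frames)
```

In this toy clip the middle pixel changes in both frame pairs and the right pixel in one, so the middle pixel is brightest in the template and the static left pixel stays black. Splitting a video into groups of consecutive frames, as the abstract describes, would amount to calling such a function once per group.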


Author(s):  
Jianpeng Xu ◽  
Xi Liu ◽  
Tyler Wilson ◽  
Pang-Ning Tan ◽  
Pouyan Hatami ◽  
...  

In climate and environmental sciences, vast amounts of spatio-temporal data have been generated at varying spatial resolutions from satellite observations and computer models. Integrating such diverse sources of data has proven to be useful for building prediction models, as the multi-scale data may capture different aspects of the Earth system. In this paper, we present a novel framework called MUSCAT for predictive modeling of multi-scale, spatio-temporal data. MUSCAT performs a joint decomposition of multiple tensors from different spatial scales, taking into account the relationships between the variables. The latent factors derived from the joint tensor decomposition are used to train the spatial and temporal prediction models at different scales for each location. The outputs from this ensemble of spatial and temporal models are aggregated to generate future predictions. An incremental learning algorithm is also proposed to handle the massive size of the tensors. Experimental results on real-world data from the United States Historical Climate Network (USHCN) showed that MUSCAT outperformed other competing methods in more than 70% of the locations.

