A Solution for Absent Spatial Data: The Common Correlated Effects Estimator

International Regional Science Review ◽

10.1177/0160017620959132 ◽

2020 ◽

pp. 016001762095913

Author(s):

Michael Beenstock ◽

Daniel Felsenstein

Keyword(s):

Time Series ◽

Spatial Data ◽

Time Series Data ◽

Housing Stock ◽

National Level ◽

Series Data ◽

Data Series ◽

Regional Data ◽

Common Correlated Effects ◽

The Common

Informed regional policy needs good regional data. As regional data series for key economic variables are generally absent whereas national-level time series data for the same variables are ubiquitous, we suggest an approach that leverages this advantage. We hypothesize the existence of a pervasive “common factor” represented by the national time series that affects regions differentially. We provide an empirical illustration in which national FDI is used in place of panel data for FDI, which are absent. The proposed methodology is tested empirically with respect to the determinants of regional demand for housing. We use a quasi-experimental approach to compare the results of a “common correlated effects” (CCE) estimator with a benchmark case when absent regional data are omitted. Using three common factors relating to national population, income and housing stock, we find mixed support for the common correlated effects hypothesis. We conclude by discussing how our experimental design may serve as a methodological prototype for further tests of CCE as a solution to the absent spatial data problem.

An easy way to create duration variables in binary cross-sectional time-series data

The Stata Journal: Promoting communications on statistics and Stata ◽

10.1177/1536867x20976322 ◽

2020 ◽

Vol 20 (4) ◽

pp. 916-930

Author(s):

Andrew Q. Philips

Keyword(s):

Time Series ◽

Missing Data ◽

Time Series Data ◽

Series Data ◽

Duration Dependence ◽

Cross Sectional ◽

Common Solution ◽

The Common

In cross-sectional time-series data with a dichotomous dependent variable, failing to account for duration dependence when it exists can lead to faulty inferences. A common solution is to include duration dummies, polynomials, or splines to proxy for duration dependence. Because creating these is not easy for the common practitioner, I introduce a new command, mkduration, that is a straightforward way to generate a duration variable for binary cross-sectional time-series data in Stata. mkduration can handle various forms of missing data and allows the duration variable to easily be turned into common parametric and nonparametric approximations.
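The idea of a duration variable can be illustrated outside Stata. The following is a minimal Python sketch, not the mkduration command itself. The function names make_duration and cubic_terms are hypothetical, the counting convention (periods elapsed since the unit's last event, starting at zero) is an assumption, and missing observations are not handled.

```python
from typing import List, Tuple


def make_duration(events: List[int]) -> List[int]:
    """Return, for each period, the number of periods since the last event.

    `events` is one unit's time-ordered binary outcome (1 = event occurred).
    Assumed convention: the counter starts at 0 in the first period and
    restarts at 0 in the period following an event.
    """
    durations = []
    counter = 0
    for y in events:
        durations.append(counter)
        # Reset after an event, otherwise keep counting.
        counter = 0 if y == 1 else counter + 1
    return durations


def cubic_terms(duration: int) -> Tuple[int, int, int]:
    """Return t, t^2, t^3: a cubic polynomial proxy for duration dependence."""
    return duration, duration ** 2, duration ** 3


if __name__ == "__main__":
    outcome = [0, 0, 1, 0, 0, 0, 1, 0]
    spell = make_duration(outcome)
    print(spell)
    print([cubic_terms(d) for d in spell])
```

For panel data, the same function would be applied separately to each cross-sectional unit, so that the counter never carries over from one unit to the next.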

An Empirical Mode-Spatial Model for Environmental Data Imputation

Hydrology ◽

10.3390/hydrology5040063 ◽

2018 ◽

Vol 5 (4) ◽

pp. 63 ◽

Cited By ~ 1

Author(s):

Benjamin Nelsen ◽

D. Williams ◽

Gustavious Williams ◽

Candace Berrett

Keyword(s):

Time Series ◽

Spatial Data ◽

Missing Values ◽

Time Series Data ◽

Environmental Data ◽

Series Data ◽

Data Imputation ◽

Accurate Data ◽

Target Station ◽

Periodic Components

Complete and accurate data are necessary for analyzing and understanding trends in time-series datasets; however, many of the available time-series datasets have gaps that affect the analysis, especially in the earth sciences. As most available data have missing values, researchers use various interpolation methods or ad hoc approaches to data imputation. Because analysis based on inaccurate data can lead to inaccurate conclusions, more accurate imputation methods support more reliable analysis. We present a spatial-temporal data imputation method using Empirical Mode Decomposition (EMD) based on spatial correlations. We call this method EMD-spatial data imputation or EMD-SDI. Though this method is applicable to other time-series data sets, here we demonstrate the method using temperature data. The EMD algorithm decomposes data into periodic components called intrinsic mode functions (IMFs) and exactly reconstructs the original signal by summing these IMFs. EMD-SDI initially decomposes the data from the target station and other stations in the region into IMFs. EMD-SDI evaluates each IMF from the target station in turn and selects the IMF from other stations in the region whose periodic behavior is most correlated with the target IMF. EMD-SDI then replaces a section of missing data in the target station IMF with the section from the most closely correlated IMF from the regional stations. We found that EMD-SDI selects the IMFs used for reconstruction from different stations throughout the region, not necessarily the station closest in the geographic sense. EMD-SDI accurately filled data gaps from 3 months to 5 years in length in our tests and compares favorably to a simple temporal method. EMD-SDI leverages regional correlation and the fact that different stations can be subject to different periodic behaviors. In addition to data imputation, the EMD-SDI method provides IMFs that can be used to better understand regional correlations and processes.
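The selection-and-replacement step described in this abstract can be sketched in Python. This is an illustration under stated assumptions, not the authors' code: the IMFs are taken as already computed (the EMD decomposition itself is not implemented), missing values in the target IMF are represented by None, correlation is measured with Pearson's r over the observed positions only, and the names pearson and fill_gap are hypothetical.

```python
from math import sqrt
from typing import List, Optional, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def fill_gap(target_imf: List[Optional[float]],
             candidate_imfs: List[List[float]]) -> List[float]:
    """Fill None entries of target_imf from the best-correlated candidate IMF.

    Candidates are assumed to be complete IMFs from other stations, aligned
    in time with the target. Values are copied as-is, with no rescaling.
    """
    observed = [i for i, v in enumerate(target_imf) if v is not None]
    target_obs = [target_imf[i] for i in observed]

    def score(candidate: List[float]) -> float:
        return pearson(target_obs, [candidate[i] for i in observed])

    best = max(candidate_imfs, key=score)
    return [best[i] if v is None else v for i, v in enumerate(target_imf)]


if __name__ == "__main__":
    target = [0.0, 1.0, None, None, 0.0, -1.0]
    station_a = [0.0, 2.0, 3.0, 2.0, 0.0, -2.0]
    station_b = [1.0, 0.0, 5.0, 5.0, -1.0, 0.0]
    print(fill_gap(target, [station_a, station_b]))
```

In the full method this step would be repeated for each IMF of the target station, and the filled IMFs summed to reconstruct the series.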

Reconstructing Three Decades of Land Use and Land Cover Changes in Brazilian Biomes with Landsat Archive and Earth Engine

Remote Sensing ◽

10.3390/rs12172735 ◽

2020 ◽

Vol 12 (17) ◽

pp. 2735 ◽

Cited By ~ 16

Author(s):

Carlos M. Souza ◽

Julia Z. Shimbo ◽

Marcos R. Rosa ◽

Leandro L. Parente ◽

Ane A. Alencar ◽

...

Keyword(s):

Land Use ◽

Time Series ◽

Land Cover ◽

Time Series Data ◽

Google Earth ◽

Series Data ◽

Data Series ◽

Forest Conversion ◽

Pixel Resolution ◽

Historical Land Use

Brazil has a monitoring system to track annual forest conversion in the Amazon and, more recently, the Cerrado biome. However, annual land use and land cover (LULC) information is still lacking across all Brazilian biomes. Existing countrywide efforts to map land use and land cover lack the regular updates and high-spatial-resolution time-series data needed to better understand historical land use and land cover dynamics and the subsequent impacts on the country's biomes. In this study, we describe a novel approach and the results achieved by a multi-disciplinary network called MapBiomas to reconstruct annual land use and land cover information between 1985 and 2017 for Brazil, based on random forest applied to the Landsat archive using Google Earth Engine. We mapped five major classes: forest, non-forest natural formation, farming, non-vegetated areas, and water. These classes were broken into two sub-classification levels, leading to the most comprehensive and detailed mapping for the country at a 30 m pixel resolution. The average overall accuracy of the land use and land cover time-series, based on a stratified random sample of 75,000 pixel locations, was 89%, ranging from 73 to 95% in the biomes. The 33 years of LULC change data series revealed that Brazil lost 71 Mha of natural vegetation, mostly to cattle ranching and agriculture activities. Pasture expanded by 46% from 1985 to 2017, and agriculture by 172%, mostly replacing old pasture fields. We also identified that 86 Mha of the converted native vegetation was undergoing some level of regrowth. Several applications of the MapBiomas dataset are underway, suggesting that reconstructing historical land use and land cover change maps is useful for advancing the science and guiding social, economic and environmental policy decision-making processes in Brazil.

Sur l’estimation des équations de CANDIDE-R

L'Actualité économique ◽

10.7202/800650ar ◽

2009 ◽

Vol 51 (4) ◽

pp. 626-633

Author(s):

Alban D’Amours

Keyword(s):

Time Series ◽

Cross Sections ◽

Economic Model ◽

Time Series Data ◽

Series Data ◽

Necessary Condition ◽

Regional Data ◽

Regional Models ◽

The Core ◽

Error Components Models

CANDIDE-R is a huge simultaneous macro-economic model that raises estimation difficulties. We avoid the identification problem by assuming that the large number of variables in the model makes it impossible for the necessary condition not to be satisfied. We also assume that the system converges to a solution, which resolves the identification problem. The core of the paper justifies the procedure we adopted to estimate CANDIDE-R. Because of the presence of regional equations and the limited amount of regional data, we are bound to pool cross-section and time series data. We then justify the use of Zellner's approach instead of error components models within the class of regional models built on national premises.

AN OVERVIEW OF SIMILARITY SEARCH IN TIME SERIES DATA

Science and Technology Development Journal ◽

10.32508/stdj.v14i2.1911 ◽

2011 ◽

Vol 14 (2) ◽

pp. 71-79

Author(s):

Anh Tuan Duong

Keyword(s):

Time Series ◽

Large Time ◽

Time Series Data ◽

Query Sequence ◽

Real Life ◽

Series Data ◽

Science And Engineering ◽

Access Methods ◽

The Common ◽

Spatial Access Methods

Time series data occur in many real-life applications, ranging from science and engineering to business. In many of these applications, searching a large time series database based on a query sequence is often desirable. Such similarity-based retrieval is also the basic subroutine in several advanced time series data mining tasks such as clustering, classification, finding motifs, detecting anomaly patterns, rule discovery and visualization. Although several different approaches have been developed, most are based on the common premise of dimensionality reduction and spatial access methods. This survey gives an overview of recent research and shows how the methods fit into a general framework of feature extraction.

Accuracy of automatic forecasting methods for univariate time series data: A case study predicting the results of the 2018 Swedish general election using decades-long data series

Communications in Statistics: Case Studies, Data Analysis and Applications ◽

10.1080/23737484.2021.1964407 ◽

2021 ◽

pp. 1-19

Author(s):

Andreas Karlsson Rosenblad

Keyword(s):

Time Series ◽

Time Series Data ◽

General Election ◽

Series Data ◽

Data Series ◽

Forecasting Methods ◽

Univariate Time Series ◽

Automatic Forecasting

An ECM Approach for Long Run Relationship Between Real Exchange Rate and Output Growth: Evidence from Bangladesh

Dhaka University Journal of Science ◽

10.3329/dujs.v63i2.24444 ◽

2015 ◽

Vol 63 (2) ◽

pp. 105-110 ◽

Cited By ~ 1

Author(s):

Khnd Md Mostafa Kamal

Keyword(s):

Time Series ◽

Exchange Rate ◽

Real Exchange Rate ◽

Time Series Data ◽

Output Growth ◽

Series Data ◽

Data Series ◽

Currency Exchange ◽

Long Run ◽

Short Run

The currency exchange rate is an important aspect of a modern economy, indicating the strength of the domestic currency relative to international currencies. This study uses 42 years (1972 to 2013) of time series data for Bangladesh to determine empirically whether the real exchange rate has a significant impact on output growth, using an error correction model (ECM). The time series econometric properties of the data series have been thoroughly investigated in order to apply the ECM approach. The empirical evidence suggests mixed results: in the short run a low exchange rate has a positive and significant effect, while in the long run output growth is positively affected by high exchange rate pass-through. Dhaka Univ. J. Sci. 63(2):105-110, 2015 (July)

Time Series Outlier Detection Based on Sliding Window Prediction

Mathematical Problems in Engineering ◽

10.1155/2014/879736 ◽

2014 ◽

Vol 2014 ◽

pp. 1-14 ◽

Cited By ~ 23

Author(s):

Yufeng Yu ◽

Yuelong Zhu ◽

Shijin Li ◽

Dingsheng Wan

Keyword(s):

Time Series ◽

Outlier Detection ◽

Time Series Data ◽

Series Data ◽

Forecasting Model ◽

Data Series ◽

Hydrologic Time Series ◽

Prediction Confidence ◽

Evaluation Of Data ◽

Operation And Management

To detect outliers in hydrological time series data and thereby improve data quality and the quality of decisions related to the design, operation, and management of water resources, this research develops a time series outlier detection method for hydrologic data that can be used to identify data that deviate from historical patterns. The method first builds a forecasting model on historical data and then uses it to predict future values. Anomalies are assumed to occur if the observed values fall outside a given prediction confidence interval (PCI), which is calculated from the predicted value and a confidence coefficient. The use of the PCI as the threshold rests mainly on the fact that it accounts for the uncertainty in the data series parameters in the forecasting model, which addresses the problem of selecting a suitable threshold. The method performs fast, incremental evaluation of data as it becomes available, scales to large quantities of data, and requires no preclassification of anomalies. Experiments with different real-world hydrologic time series showed that the proposed methods are fast, correctly identify abnormal data, and can be used for hydrologic time series analysis.
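The prediction-interval idea in this abstract can be sketched in Python. This is a simplified illustration, not the authors' implementation: their forecasting model is replaced here by a moving average over the previous observations, the interval half-width is taken as z times the sample standard deviation of that window, and the names detect_outliers, window, and z are hypothetical.

```python
from statistics import mean, stdev
from typing import List


def detect_outliers(series: List[float], window: int = 5,
                    z: float = 3.0) -> List[int]:
    """Return indices whose value falls outside a sliding prediction interval.

    For each point t >= window, the previous `window` values serve as the
    history. The forecast is their mean (an assumed stand-in for a real
    forecasting model) and the interval is forecast +/- z * stdev(history).
    """
    flagged = []
    for t in range(window, len(series)):
        history = series[t - window:t]
        predicted = mean(history)
        half_width = z * stdev(history)
        # An observation outside the interval is treated as an anomaly.
        if abs(series[t] - predicted) > half_width:
            flagged.append(t)
    return flagged


if __name__ == "__main__":
    levels = [10.0, 10.2, 9.9, 10.1, 10.0, 10.1, 25.0, 10.0, 9.8, 10.2]
    print(detect_outliers(levels, window=5, z=3.0))
```

Because each point is judged only against the preceding window, the evaluation is incremental: a new observation can be checked as soon as it arrives. In this sketch a flagged value still enters later windows and widens their intervals; a fuller version might exclude or replace it.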

Detection and mapping of burnt areas from time series of MODIS-derived NDVI data in a Mediterranean region

Open Geosciences ◽

10.2478/s13533-012-0167-y ◽

2014 ◽

Vol 6 (1) ◽

Cited By ~ 1

Author(s):

Miguel García ◽

José Alloza ◽

Ángeles Mayor ◽

Susana Bautista ◽

Francisco Rodríguez

Keyword(s):

Remote Sensing ◽

Time Series ◽

Mediterranean Region ◽

Time Series Data ◽

Series Data ◽

Data Series ◽

Regional Government ◽

Fire Event ◽

Fire Scar ◽

Burnt Areas

Moderate resolution remote sensing data, as provided by MODIS, can be used to detect and map active or past wildfires from daily records of suitable combinations of reflectance bands. The objective of the present work was to develop and test simple algorithms and variations for automatic or semiautomatic detection of burnt areas from time series data of MODIS biweekly vegetation indices for a Mediterranean region. MODIS-derived NDVI 250 m time series data for the Valencia region, eastern Spain, were subjected to a two-step process for the detection of candidate burnt areas, and the results were compared with available fire event records from the Valencia Regional Government. For each pixel and date in the data series, a model was fitted to both the previous and posterior time series data. Combining drops between two consecutive points and 1-year average drops, we used discrepancies or jumps between the pre and post models to identify seed pixels, and then delimited fire scars for each potential wildfire using an extension algorithm from the seed pixels. The resulting maps of the detected burnt areas showed very good agreement with the perimeters registered in the database of fire records used as reference. Overall accuracies and indices of agreement were very high, and omission and commission errors were similar to or lower than those in previous studies that used automatic or semiautomatic fire scar detection based on remote sensing. This supports the effectiveness of the method for detecting and mapping burnt areas in the Mediterranean region.

Modelling aboveground net primary production (ANPP) of an Atlantic mountain grassland based on time series approach

Cuadernos de Investigación Geográfica ◽

10.18172/cig.3561 ◽

2019 ◽

Vol 45 (2) ◽

pp. 551

Author(s):

A. Salaberria ◽

G. García-Baquero ◽

I. Odriozola ◽

A. Aldezabal

Keyword(s):

Time Series ◽

Primary Production ◽

Time Series Data ◽

Net Primary Production ◽

Series Data ◽

Data Series ◽

Treatment Level ◽

Aboveground Net Primary Production ◽

Smoothing Methods ◽

Mountain Grassland

Because primary productivity is related both to the energy that sustains food webs and to species diversity, it is usually considered a key ecosystem property and a reliable indicator of available forage. In this work the aboveground net primary production (ANPP) of an Atlantic mountain grassland system was modelled in an attempt to produce short-term forecasts. Since grazing influences productivity, two treatment levels (grazing and exclusion) were experimentally applied in each of three field sites. Monthly ANPP data were then collected over three consecutive vegetative periods (2006-2008), thereby obtaining six time series (one per plot). Since no significant differences among sites (within treatments) were found, these six series were later reduced through averaging to only two series (one per treatment level). Two kinds of statistical models were then used to attempt monthly ANPP forecasting: exponential smoothing methods and ARIMA models. Both methodologies produced inadequate forecasts because of marked local features (innovative outliers) in our relatively short time-series data. Nonetheless, useful information for more innovative shepherding management was revealed (e.g. the presence of within-year variation in ANPP, and differences between the grazing and exclusion treatments). Longer data series, which would require a greater sampling effort, are likely necessary to obtain adequate forecasts using these time series methodologies.
