Measurement validity in historical conflict data: Comparing datasets from Spain

2021 ◽  
Author(s):  
Francisco Villamil

Conflict research usually suffers from data availability problems, which sometimes motivates the use of proxy variables for violent events. But since proxies are usually the only available measure of violence patterns, there are no ground-truth data to compare them against. This limitation explains why there are no studies assessing their validity. This research note exploits a case where two sources on political violence exist: the Spanish Civil War. Comparing georeferenced mass graves and direct records of victimization, I show that the differences between these two datasets are not random but stem from different data-generation processes, introducing important biases. The results highlight the need for a more careful assessment when using proxy variables for political violence.
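A minimal sketch of the kind of proxy comparison described above, assuming two hypothetical municipality-level tables (mass-grave counts and direct victimization records); the file and column names are illustrative, not the author's data.

```python
# Sketch: compare two georeferenced violence proxies at the municipality level.
# Assumes hypothetical CSVs sharing a municipality identifier; names are illustrative.
import pandas as pd

graves = pd.read_csv("mass_graves.csv")      # columns: muni_id, n_victims_graves
records = pd.read_csv("direct_records.csv")  # columns: muni_id, n_victims_records

merged = graves.merge(records, on="muni_id", how="outer").fillna(0)

# If both proxies measured the same underlying process, counts should track closely.
print("Pearson r:", merged["n_victims_graves"].corr(merged["n_victims_records"]))

# Non-random disagreement: a gap that is systematically off-center (or that
# correlates with covariates such as rurality) hints at distinct data-generation
# processes rather than noise.
merged["gap"] = merged["n_victims_graves"] - merged["n_victims_records"]
print(merged["gap"].describe())
```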

Author(s):  
J. S. Lopez-Villa ◽  
H. D. Insuasti-Ceballos ◽  
S. Molina-Giraldo ◽  
A. Alvarez-Meza ◽  
G. Castellanos-Dominguez

2012 ◽  
Vol 16 (8) ◽  
pp. 2801-2811 ◽  
Author(s):  
M. T. Vu ◽  
S. V. Raghavan ◽  
S. Y. Liong

Abstract. Many research studies that focus on basin hydrology have applied the SWAT model using station data to simulate runoff. Over regions lacking robust station data, however, applying the model to study hydrological responses is problematic. For some countries and remote areas, rainfall data availability may be constrained for many reasons, such as a lack of technology, wartime, or financial limitations, making it difficult to construct runoff data. To overcome this limitation, this study uses some of the available global high-resolution gridded precipitation datasets to simulate runoff. Five popular gridded observational precipitation datasets: (1) Asian Precipitation Highly Resolved Observational Data Integration Towards the Evaluation of Water Resources (APHRODITE), (2) Tropical Rainfall Measuring Mission (TRMM), (3) Precipitation Estimation from Remote Sensing Information using Artificial Neural Network (PERSIANN), (4) Global Precipitation Climatology Project (GPCP), (5) a modified version of the Global Historical Climatology Network (GHCN2), and one reanalysis dataset, the National Centers for Environmental Prediction/National Center for Atmospheric Research (NCEP/NCAR) reanalysis, are used to simulate runoff over the Dak Bla river (a small tributary of the Mekong River) in Vietnam. Wherever possible, available station data are also used for comparison. Bilinear interpolation of these gridded datasets is used to obtain the precipitation data at the grid points closest to the station locations. Sensitivity analysis and auto-calibration are performed for the SWAT model. The Nash-Sutcliffe Efficiency (NSE) and coefficient of determination (R²) indices are used to benchmark model performance. Results indicate that the APHRODITE dataset performed very well in daily-scale discharge simulation, with a good NSE of 0.54 and R² of 0.55, compared to the discharge simulation using station data (0.68 and 0.71). GPCP proved to be the next best dataset for runoff modelling, with an NSE and R² of 0.46 and 0.51, respectively. The PERSIANN- and TRMM-driven runoff did not agree well with the station data, as both the NSE and R² indices were low, around 0.3. GHCN2 and NCEP also did not show good correlations. The varied results across these datasets indicate that, although the gauge-based and satellite-gauge merged products use some ground-truth data, the different interpolation techniques and merging algorithms can also be a source of uncertainty. This calls for a good understanding of the hydrological model's response to different datasets and a quantification of the uncertainties in these datasets. Such a methodology is also useful for rainfall-runoff planning and even reservoir/river management at both rural and urban scales.
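For reference, a minimal Python sketch of the two benchmark indices used above, NSE and R², applied to observed and simulated discharge series; the arrays are placeholders, not the study's data.

```python
import numpy as np

def nse(obs, sim):
    """Nash-Sutcliffe Efficiency: 1 - SSE / variance of the observations."""
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def r2(obs, sim):
    """Coefficient of determination as the squared Pearson correlation."""
    return np.corrcoef(obs, sim)[0, 1] ** 2

# Placeholder daily discharge series (m^3/s); in the study these would come
# from SWAT runs driven by station or gridded precipitation.
obs = np.array([10.0, 12.5, 30.0, 22.0, 15.0])
sim = np.array([11.0, 13.0, 26.0, 24.0, 14.0])
print(f"NSE = {nse(obs, sim):.2f}, R2 = {r2(obs, sim):.2f}")
```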


2021 ◽  
Vol 13 (13) ◽  
pp. 2619
Author(s):  
Joao Fonseca ◽  
Georgios Douzas ◽  
Fernando Bacao

In remote sensing, Active Learning (AL) has become an important technique to collect informative ground truth data "on-demand" for supervised classification tasks. Despite its effectiveness, it still relies heavily on user interaction, which makes it both expensive and time-consuming to implement. Most of the current literature focuses on optimizing AL by modifying the selection criteria and the classifiers used. Although improvements in these areas will result in more effective data collection, the use of artificial data sources to reduce human-computer interaction remains unexplored. In this paper, we introduce a new component to the typical AL framework, the data generator, a source of artificial data that reduces the amount of user-labeled data required in AL. The proposed AL framework is implemented using Geometric SMOTE as the data generator. We compare the new AL framework to the original one using similar acquisition functions and classifiers over three AL-specific performance metrics on seven benchmark datasets. We show that this modification of the AL framework significantly reduces the cost and time requirements for a successful AL implementation in all of the datasets used in the experiment.
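A schematic sketch of an AL loop with a data-generator step between labeling rounds, in the spirit of the framework described above. Geometric SMOTE's own API is not assumed here; imbalanced-learn's standard SMOTE stands in as the generator, and the dataset and loop parameters are placeholders.

```python
# Sketch: uncertainty-sampling AL loop with a synthetic-data generator step.
import numpy as np
from collections import Counter
from imblearn.over_sampling import SMOTE   # stand-in generator, not Geometric SMOTE
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_classes=3, n_informative=5, random_state=0)
rng = np.random.default_rng(0)
labeled = rng.choice(len(X), size=60, replace=False)   # initial user-labeled seed set
pool = np.setdiff1d(np.arange(len(X)), labeled)

clf = RandomForestClassifier(random_state=0)
for _ in range(5):                                     # AL iterations
    X_lab, y_lab = X[labeled], y[labeled]
    # Data generator: synthesize extra points from the labeled set only,
    # here by asking SMOTE to double each class.
    target = {c: 2 * n for c, n in Counter(y_lab).items()}
    X_aug, y_aug = SMOTE(sampling_strategy=target, k_neighbors=3).fit_resample(X_lab, y_lab)
    clf.fit(X_aug, y_aug)
    # Acquisition: query the 10 pool points the classifier is least sure about.
    proba = clf.predict_proba(X[pool])
    query = pool[np.argsort(proba.max(axis=1))[:10]]
    labeled = np.concatenate([labeled, query])         # user labels these
    pool = np.setdiff1d(pool, query)
```

The point of the extra step is that each round of user labeling is stretched further by synthetic neighbors, so fewer queries are needed to reach a given accuracy.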


2021 ◽  
pp. 000276422110216
Author(s):  
Scott Althaus ◽  
Buddy Peyton ◽  
Dan Shalmon

Understanding how useful any particular set of event data might be for conflict research requires appropriate methods for assessing validity when ground truth data about the population of interest do not exist. We argue that a total error framework can provide better leverage on these critical questions than previous methods have been able to deliver. We first define a total event data error approach for identifying 19 types of error that can affect the validity of event data. We then address the challenge of applying a total error framework when authoritative ground truth about the actual distribution of relevant events is lacking. We argue that carefully constructed gold standard datasets can effectively benchmark validity problems even in the absence of ground truth data about event populations. To illustrate the limitations of conventional strategies for validating event data, we present a case study of Boko Haram activity in Nigeria over a 3-month offensive in 2015 that compares events generated by six prominent event extraction pipelines—ACLED, SCAD, ICEWS, GDELT, PETRARCH, and the Cline Center’s SPEED project. We conclude that conventional ways of assessing validity in event data using only published datasets offer little insight into potential sources of error or bias. Finally, we illustrate the benefits of validating event data using a total error approach by showing how the gold standard approach used to validate SPEED data offers a clear and robust method for detecting and evaluating the severity of temporal errors in event data.
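A minimal sketch of the gold-standard benchmarking idea described above: candidate events from one pipeline are matched against a hand-verified gold standard on location and date, within a tolerance. All file and column names are illustrative, not the actual schemas of ACLED, ICEWS, or the other pipelines.

```python
# Sketch: recall of a candidate event dataset against a gold standard,
# matching on location and date within a one-day window.
import pandas as pd

gold = pd.read_csv("gold_standard.csv", parse_dates=["date"])    # date, location
cand = pd.read_csv("pipeline_events.csv", parse_dates=["date"])  # date, location

matched = 0
for _, g in gold.iterrows():
    hits = cand[(cand["location"] == g["location"]) &
                ((cand["date"] - g["date"]).abs() <= pd.Timedelta(days=1))]
    matched += int(len(hits) > 0)

print(f"recall vs. gold standard: {matched / len(gold):.2%}")
# Precision would require auditing each candidate event against sources,
# which is exactly the manual verification a gold-standard dataset formalizes.
# Widening or shifting the date window probes the temporal errors discussed above.
```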


Author(s):  
Georgios Papamakarios ◽  
Dimitris Giakoumis ◽  
Konstantinos Votis ◽  
Sofia Segkouli ◽  
Dimitrios Tzovaras ◽  
...  

2014 ◽  
Vol 129 (3) ◽  
pp. 569-581 ◽  
Author(s):  
O. Ibáñez ◽  
F. Cavalli ◽  
B. R. Campomanes-Álvarez ◽  
C. Campomanes-Álvarez ◽  
A. Valsecchi ◽  
...  

Author(s):  
G. Häufel ◽  
D. Bulatov ◽  
P. Helmholz

Abstract. In recent years, the task of land cover classification from airborne image and elevation data has advanced considerably due to the enhanced applicability of CNNs (Convolutional Neural Networks). Nevertheless, CNNs require a huge amount of training data. Traditionally, a few essential feature values, such as elevation or vegetation index, were chosen to provide a coarse distinction of classes, but very often these values have to be adapted depending on the imagery. To improve this process, freely available GIS data are combined with spectral and spatial features (and their variations) using the K-Means and Mean-Shift algorithms. Based on the cluster assignments of pixels, a statistical analysis extracts plausible values for distinguishing between land cover classes. The resulting labeled databases are evaluated against ground truth data and will form the basis for the training data required by CNNs.
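A minimal sketch of the clustering-then-labeling step described above: pixels are clustered on spectral/elevation features, and each cluster inherits the majority class of a coarse GIS layer, yielding plausible CNN training labels. The arrays are synthetic placeholders.

```python
# Sketch: K-Means clustering of pixel features, labeled by majority GIS class.
import numpy as np
from sklearn.cluster import KMeans

h, w = 100, 100
features = np.random.rand(h * w, 4)           # e.g., R, G, NDVI, nDSM height per pixel
gis_labels = np.random.randint(0, 3, h * w)   # coarse rasterized GIS classes

clusters = KMeans(n_clusters=8, n_init=10, random_state=0).fit_predict(features)

# Majority GIS class per cluster -> candidate training labels for a CNN.
labels = np.empty(h * w, dtype=int)
for c in np.unique(clusters):
    mask = clusters == c
    labels[mask] = np.bincount(gis_labels[mask]).argmax()
```

Mean-Shift (sklearn.cluster.MeanShift) can be swapped in for K-Means when the number of clusters should not be fixed in advance.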


Land ◽  
2021 ◽  
Vol 10 (4) ◽  
pp. 402
Author(s):  
Charalampos Kontoes ◽  
Constantinos Loupasakis ◽  
Ioannis Papoutsis ◽  
Stavroula Alatza ◽  
Eleftheria Poyiadji ◽  
...  

The exploitation of remote sensing techniques has substantially improved pre- and post-disaster landslide management over the last decade. A variety of landslide susceptibility methods exists, with capabilities and limitations related to scale and spatial accuracy, as well as data availability. Interferometric Synthetic Aperture Radar (InSAR) capabilities have significantly contributed to the detection, monitoring, and mapping of landslide phenomena. The present study aims to highlight the contribution of InSAR data to landslide detection and to evaluate two landslide models at different scales by comparing a heuristic method to a statistical method for rainfall-induced landslide hazard assessment. To include areas with both high and low landslide occurrence frequencies, the study area covers a large part of the Aetolia–Acarnania and Evritania prefectures in Central and Western Greece. The landslide susceptibility product provided by the weights of evidence (WoE) method proved more accurate, benefitting from expert opinion and the landslide inventory. On the other hand, the Norwegian Geological Institute (NGI) methodology has the edge in immediacy of implementation, with minimum data requirements. Finally, it was shown that using sequential SAR image acquisitions yields an updated landslide inventory, allowing updated landslide susceptibility maps to be generated on request.
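A minimal sketch of the weights-of-evidence calculation for one binary evidence layer (for example, a slope-threshold raster) against a binary landslide-inventory raster; the arrays are synthetic placeholders, not the study's data.

```python
# Sketch: weights of evidence (WoE) for a single binary evidence layer.
# W+ = ln[P(B|L)/P(B|~L)], W- = ln[P(~B|L)/P(~B|~L)], contrast C = W+ - W-.
import numpy as np

rng = np.random.default_rng(0)
evidence = rng.random(10_000) < 0.3    # True where the evidence factor is present
landslide = rng.random(10_000) < 0.05  # True at inventoried landslide cells

def woe(evidence, landslide, eps=1e-9):
    p_b_l = (evidence & landslide).sum() / landslide.sum()        # P(B|L)
    p_b_nl = (evidence & ~landslide).sum() / (~landslide).sum()   # P(B|~L)
    w_plus = np.log((p_b_l + eps) / (p_b_nl + eps))
    w_minus = np.log((1 - p_b_l + eps) / (1 - p_b_nl + eps))
    return w_plus, w_minus, w_plus - w_minus

w_plus, w_minus, contrast = woe(evidence, landslide)
print(f"W+ = {w_plus:.3f}, W- = {w_minus:.3f}, contrast = {contrast:.3f}")
```

Summing the weights of several evidence layers per cell gives the susceptibility score that is then compared against the inventory.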

