Selection Bias in Comparative Research: The Case of Incomplete Data Sets

2003 ◽  
Vol 11 (3) ◽  
pp. 255-274 ◽  
Author(s):  
Simon Hug

Selection bias is an important but often neglected problem in comparative research. While comparative case studies pay some attention to this problem, broader cross-national studies do so less, even though the problem may arise there through the way the data are generated. The article discusses three examples: studies of the success of newly formed political parties, research on protest events, and recent work on ethnic conflict. In all cases the data at hand are likely to be afflicted by selection bias. Failing to take this problem into account leads to serious biases in the estimation of simple relationships. Empirical examples illustrate a possible solution (a variation of a Tobit model) to the problems in these cases. The article also discusses results of Monte Carlo simulations, illustrating under what conditions the proposed estimation procedures lead to improved results.
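The bias described in this abstract can be made concrete with a small Monte Carlo sketch. This is not the article's Tobit variant. It assumes a hypothetical linear model y = beta * x + e with standard normal x and e, and a hypothetical selection rule that keeps a case only when y exceeds a threshold. All function names and parameter values are illustrative assumptions.

```python
import random


def ols_slope(xs, ys):
    """Ordinary least squares slope of y on x (single regressor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def simulate(n=20000, beta=1.0, threshold=0.0, seed=0):
    """Compare the OLS slope on the full sample and on a selected sample.

    Assumption: a case enters the data set only if its outcome y is
    above `threshold`, so selection depends on the dependent variable.
    """
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [beta * x + rng.gauss(0.0, 1.0) for x in xs]

    # Keep only the cases that pass the (hypothetical) selection rule.
    kept = [(x, y) for x, y in zip(xs, ys) if y > threshold]
    kept_xs = [x for x, _ in kept]
    kept_ys = [y for _, y in kept]

    return ols_slope(xs, ys), ols_slope(kept_xs, kept_ys), len(kept)


if __name__ == "__main__":
    full, selected, n_kept = simulate()
    print(f"slope, full sample:     {full:.3f}")
    print(f"slope, selected sample: {selected:.3f} (n = {n_kept})")
```

Under these assumptions the slope from the full sample stays close to the true value of 1, while the slope from the selected sample is pulled toward zero. This is the kind of distortion that a Tobit-type estimator is meant to address.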

2014 ◽  
Vol 22 (2) ◽  
pp. 205-223 ◽  
Author(s):  
James Lo ◽  
Sven-Oliver Proksch ◽  
Thomas Gschwend

This article presents a scaling approach to jointly estimate the locations of voters, parties, and European political groups on a common left-right scale. Although most comparative research assumes that cross-national comparisons of voters and parties are possible, few correct for systematic biases commonly known to exist in surveys or examine whether survey data are comparable across countries. Our scaling method addresses scale perception in surveys and links cross-national surveys through new bridging observations. We apply our approach to the 2009 European Election Survey and demonstrate that the improvement in party estimates that one gains from fixing various survey bias issues is significant. Our scaling strategy provides left-right positions of voters and of 162 political parties, and we demonstrate that variables based on rescaled voter and party positions on the left-right dimension significantly improve the fit of a cross-national vote choice model.


1980 ◽  
Vol 40 (1) ◽  
pp. 69-76 ◽  
Author(s):  
Roger Benjamin

Recent work on Japanese politics, as presented in five representative works reviewed here, is based increasingly on comparative (cross-national) frameworks of analysis. The studies allow us to consider once again the relationship between area-based (contextual) and comparative-based knowledge. This essay argues that we are further ahead in our understanding of Japanese politics if we move toward strategies of theory construction that place Japanese and comparative research in explicit juxtaposition. Arguments both for and against area-based and comparative-based explanations form the foundations of the conclusions developed.


1992 ◽  
Vol 19 (3) ◽  
pp. 188-189 ◽  
Author(s):  
John F. Walsh

Courses in statistics and experimental design can be enhanced through use of crafted data sets. The use of examples highlights the interface between data and statistical routine. FORTRAN programs utilizing the International Mathematical and Statistical Library subroutines permit the user to control the variance-covariance structure of multivariate normal variables and build data sets that have instructional value. Scale transformations and Monte Carlo simulations of the data can be performed as well.
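The idea of building a data set with a chosen variance-covariance structure can be sketched as follows. The programs described in the abstract are written in FORTRAN and rely on library subroutines. The version below is a Python sketch that uses only the standard library and a Cholesky factorization, which is an assumption about one common way to do this and not the author's routine. The target covariance matrix and all names are hypothetical.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive definite a."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(d)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def sample_mvn(mean, cov, n, seed=0):
    """Draw n rows from a multivariate normal via x = mean + L z, z ~ N(0, I)."""
    rng = random.Random(seed)
    lower = cholesky(cov)
    dim = len(mean)
    rows = []
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        rows.append([
            mean[i] + sum(lower[i][k] * z[k] for k in range(i + 1))
            for i in range(dim)
        ])
    return rows


def sample_cov(rows):
    """Unbiased sample covariance matrix of the rows."""
    n = len(rows)
    dim = len(rows[0])
    means = [sum(r[i] for r in rows) / n for i in range(dim)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


if __name__ == "__main__":
    target = [[1.0, 0.8], [0.8, 2.0]]  # hypothetical covariance matrix
    data = sample_mvn([0.0, 0.0], target, n=20000)
    print(sample_cov(data))
```

With a large enough sample, the covariance matrix recovered from the generated data should be close to the target, which lets an instructor decide in advance what correlations students will find.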


2020 ◽  
Vol 57 (5) ◽  
pp. 789-809
Author(s):  
Andrey Simonov ◽  
Jean-Pierre Dubé ◽  
Günter Hitsch ◽  
Peter Rossi

The authors analyze the initial conditions bias in the estimation of brand choice models with structural state dependence. Using a combination of Monte Carlo simulations and empirical case studies of shopping panels, they show that popular, simple solutions that misspecify the initial conditions are likely to lead to bias even in relatively long panel data sets. The magnitude of the bias in the state dependence parameter can be as large as a factor of 2–2.5. The authors propose a solution to the initial conditions problem that samples the initial states as auxiliary variables in a Markov chain Monte Carlo procedure. The approach assumes that the joint distribution of prices and consumer choices is in equilibrium, which is plausible for the mature consumer packaged goods products commonly used in empirical applications. In Monte Carlo simulations, the approach recovers the true parameter values even in relatively short panels. Finally, the authors propose a diagnostic tool that uses common, biased approaches to bound the values of the state dependence and construct a computationally light test for state dependence.


Author(s):  
Harold W. Hatch ◽  
Gordon W. McCann

We describe a methodology for constructing tabular potentials of supertoroids with short-range interactions, which requires the calculation of the volume of overlap of these shapes for many relative positions and orientations. Recent advances in the synthesis of anisotropic colloids have made experimental realizations of such particles feasible and have increased the practical impact of fundamental simulation studies of these families of shapes. This extends our recent work on superquadric potentials to now include a family of ring-like shapes with a hole in the middle. Along with the addition of supertoroids, the ability to make tables for nonidentical particles and particle pairs with multiple, disconnected overlap volumes was added. Using newly developed extensions to a previously published algorithm, we produced tabular potentials for all of these new cases. The algorithmic developments in this work will enable Monte Carlo simulations of a wider variety of shapes to predict thermodynamic properties over a range of conditions.


Author(s):  
Haitham Yousof ◽  
Ahmed Z Afify ◽  
Morad Alizadeh ◽  
G. G. Hamedani ◽  
S. Jahanshahi ◽  
...  

In this work, we introduce a new class of continuous distributions called the generalized Poisson family, which extends the quadratic rank transmutation map. We provide some special models for the new family. Some of its mathematical properties, including Rényi and q-entropies, order statistics, and characterizations, are derived. The model parameters are estimated by the maximum likelihood method. Monte Carlo simulations are used to assess the performance of the maximum likelihood estimators. The flexibility of the proposed family is illustrated by means of two applications to real data sets.


Data in Brief ◽  
2018 ◽  
Vol 19 ◽  
pp. 564-569
Author(s):  
Ekaterina Baibuz ◽  
Simon Vigonski ◽  
Jyri Lahtinen ◽  
Junlei Zhao ◽  
Ville Jansson ◽  
...  

2016 ◽  
Vol 22 (12) ◽  
pp. 4359-4363
Author(s):  
Nur Azimah Rahim Abdul ◽  
Norazan Mohamed Ramli ◽  
Nor Azura Md Ghani

Plant Disease ◽  
2007 ◽  
Vol 91 (8) ◽  
pp. 1002-1012 ◽  
Author(s):  
David H. Gent ◽  
William W. Turechek ◽  
Walter F. Mahaffee

Hop powdery mildew (caused by Podosphaera macularis) is an important disease of hops (Humulus lupulus) in the Pacific Northwest. Sequential sampling models for estimation and classification of the incidence of powdery mildew on leaves of hop were developed based on the beta-binomial distribution, using parameter estimates of the binary power law determined in previous studies. Stop lines, models that indicate that enough information has been collected to estimate disease incidence and cease sampling, for sequential estimation were validated by bootstrap simulations of a select group of 18 data sets (out of a total of 198 data sets) from the model-construction data, and through simulated sampling of 104 data sets collected independently (i.e., validation data sets). The achieved coefficient of variation (C) approached prespecified C values as the achieved disease incidence ([Formula: see text]) increased. Achieving a C of 0.1 was not possible for data sets in which [Formula: see text] < 0.10. The 95% confidence interval of the median difference between the true p and [Formula: see text] included zero for 16 of 18 data sets evaluated at C = 0.2 and all data sets when C = 0.1. For sequential classification, Monte-Carlo simulations were used to determine the probability of classifying mean disease incidence as less than a threshold incidence, pt (operating characteristic [OC]), and average sample number (ASN) curves for 16 combinations of candidate stop lines and error levels (α and β). Four pairs of stop lines were selected for further evaluation based on the results of the Monte-Carlo simulations. Bootstrap simulations of the 18 selected data sets indicated that the OC and ASN curves of the sequential sampling plans for each of the four sets of stop lines were similar to OC and ASN values determined by Monte Carlo simulation. 
Correct classification of disease incidence as being above or below preselected thresholds was 2.0 to 7.7% higher when stop lines were determined by the beta-binomial approximation than when stop lines were calculated using the binomial distribution. Correct decision rates differed depending on the location where sampling was initiated in the hop yard; however, in all instances they were greater than 86% when stop lines were determined using the beta-binomial approximation. The sequential sampling plans evaluated in this study should allow for rapid and accurate estimation and classification of the incidence of hop leaves with powdery mildew, and aid in sampling for pest management decision making.
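One reason stop lines based on the beta-binomial distribution can behave differently from binomial ones is overdispersion: when incidence varies between sampling units, counts are more variable than the binomial distribution predicts. The sketch below illustrates only that point. It does not reproduce the study's stop lines or parameter estimates; the group size, the Beta parameters, and all names are hypothetical.

```python
import random


def simulate_counts(n_units, alpha, beta, n_groups, seed=0):
    """Simulate diseased-unit counts per group under a beta-binomial model.

    Each group gets its own incidence p drawn from Beta(alpha, beta),
    then each of its n_units is diseased with probability p.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_groups):
        p = rng.betavariate(alpha, beta)
        counts.append(sum(1 for _ in range(n_units) if rng.random() < p))
    return counts


def variance(values):
    """Unbiased sample variance."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


def binomial_variance(n_units, p):
    """Variance of a count if every unit shared the same incidence p."""
    return n_units * p * (1.0 - p)


def beta_binomial_variance(n_units, alpha, beta):
    """Variance of a count when incidence varies as Beta(alpha, beta)."""
    p = alpha / (alpha + beta)
    inflation = (alpha + beta + n_units) / (alpha + beta + 1.0)
    return n_units * p * (1.0 - p) * inflation


if __name__ == "__main__":
    # Hypothetical setting: 10 units per group, mean incidence 0.2.
    counts = simulate_counts(n_units=10, alpha=2.0, beta=8.0, n_groups=20000)
    print("simulated variance:    ", round(variance(counts), 3))
    print("binomial variance:     ", binomial_variance(10, 0.2))
    print("beta-binomial variance:", round(beta_binomial_variance(10, 2.0, 8.0), 3))
```

In this hypothetical setting the simulated variance tracks the beta-binomial value and clearly exceeds the binomial one, so decision rules that assume binomial variability would treat the data as more precise than they are.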

