Ecological Regression and Ecological Inference

2004 ◽  
pp. 123-143 ◽  
Author(s):  
Bernard Grofman ◽  
Samuel Merrill

2019 ◽  
Vol 28 (1) ◽  
pp. 65-86
Author(s):  
Wenxin Jiang ◽  
Gary King ◽  
Allen Schmaltz ◽  
Martin A. Tanner

Ecological inference (EI) is the process of learning about individual behavior from aggregate data. We relax assumptions by allowing for “linear contextual effects,” which previous works have regarded as plausible but avoided due to nonidentification, a problem we sidestep by deriving bounds instead of point estimates. In this way, we offer a conceptual framework to improve on the Duncan–Davis bound, derived more than 65 years ago. To study the effectiveness of our approach, we collect and analyze 8,430 $2\times 2$ EI datasets with known ground truth from several sources—thus bringing considerably more data to bear on the problem than the existing dozen or so datasets available in the literature for evaluating EI estimators. For the 88% of real data sets in our collection that fit a proposed rule, our approach reduces the width of the Duncan–Davis bound, on average, by about 44%, while still capturing the true district-level parameter about 99% of the time. The remaining 12% revert to the Duncan–Davis bound.
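For orientation, the classical Duncan–Davis bound that this abstract uses as its baseline can be computed from two aggregate shares alone. The sketch below is not the authors' method: it covers only the standard deterministic bound for a single $2\times 2$ district, and the function name and argument names are hypothetical.

```python
def duncan_davis_bounds(x, t):
    """Return (lower, upper) bounds on the outcome rate within one group.

    x: share of the district's population belonging to the group (0 < x <= 1)
    t: share of the district's population with the outcome (0 <= t <= 1)
    Both names are illustrative, not taken from the paper.
    """
    if not 0.0 < x <= 1.0:
        raise ValueError("x must lie in (0, 1]")
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    # Accounting identity: t = x * beta_group + (1 - x) * beta_other,
    # where both unknown rates are confined to [0, 1].
    lower = max(0.0, (t - (1.0 - x)) / x)  # other group at its maximum rate of 1
    upper = min(1.0, t / x)                # other group at its minimum rate of 0
    return lower, upper


# Example with made-up shares: the group is 80% of the district, outcome share is 90%.
lo, hi = duncan_davis_bounds(x=0.8, t=0.9)
print(f"group rate lies in [{lo:.3f}, {hi:.3f}]")
```

The bound is informative only when the group dominates the district or the outcome share is extreme; with x = 0.5 and t = 0.5 it spans the whole unit interval, which is the width the paper's approach aims to reduce.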


1969 ◽  
Vol 63 (4) ◽  
pp. 1183-1196 ◽  
Author(s):  
W. Phillips Shively

Because they are inexpensive and easy to obtain, because they may be available under circumstances in which survey data are unavailable, and because they eliminate many of the measurement problems of survey research, data on geographic units such as counties or census tracts are often used by political scientists to measure individual behavior. This has involved us in the long-standing problem of inferring individual-level relationships from aggregate data, which was first raised by W. S. Robinson in the early nineteen fifties. In this paper, I shall first discuss the problem raised by Robinson. I shall then review three partial solutions to the problem: the Duncan-Davis method of setting limits, Blalock's version of ecological regression, and Goodman's version of ecological regression. Finally, I shall propose some ways in which Goodman's method may be used so as to reduce the problem of bias in its estimates, and make it a more reasonable tool for research. Our difficulty, as Robinson showed, is that we cannot necessarily infer the correlation between variables, taking people as the unit of analysis, on the basis of correlations between the same variables based on groups of people as units. For example, the “ecological” correlation between per cent black and per cent illiterate is +0.946, whereas the correlation between color and illiteracy among individuals is only +0.203.
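As a rough illustration of Goodman's version of ecological regression mentioned above, the sketch below fits a straight line to district-level shares and reads the two group rates off the intercept and slope. It assumes the textbook form of the method, in which each group's rate is constant across districts; the function name and the data are hypothetical, and the refinements proposed in the paper are not reproduced.

```python
def goodman_regression(x, t):
    """Estimate (group_rate, other_rate) by ordinary least squares.

    x: per-district share of the population in the group
    t: per-district share of the population with the outcome
    Assumes t_i = other_rate + (group_rate - other_rate) * x_i + noise,
    i.e. both rates are constant across districts (an assumption of this sketch).
    """
    n = len(x)
    if n != len(t) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 districts")
    mean_x = sum(x) / n
    mean_t = sum(t) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x must vary across districts")
    sxt = sum((xi - mean_x) * (ti - mean_t) for xi, ti in zip(x, t))
    slope = sxt / sxx
    intercept = mean_t - slope * mean_x
    # The fitted line at x = 0 is the other group's rate; at x = 1, the group's rate.
    return intercept + slope, intercept


# Made-up districts generated with rates 0.7 (group) and 0.4 (other).
shares = [0.1, 0.3, 0.5, 0.9]
outcomes = [0.4 + 0.3 * s for s in shares]
group_rate, other_rate = goodman_regression(shares, outcomes)
print(round(group_rate, 3), round(other_rate, 3))
```

Nothing in the least-squares fit constrains the estimates to the unit interval, so with real data the extrapolated rates can fall below 0 or above 1, one symptom of the bias the abstract refers to.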


1976 ◽  
Vol 6 (1) ◽  
pp. 43-81 ◽  
Author(s):  
Ivor Crewe ◽  
Clive Payne

This article develops a number of themes first raised in an earlier paper where we attempted to publicize the existence of Census data based, for the first time, on British parliamentary constituencies, and where we briefly described the potential and limits of a variety of available statistical techniques of analysis. Until the earlier paper was published, studies of British electoral behaviour using aggregate data were largely historical, generally used only the simplest statistical techniques such as cross-tabulations, and usually proceeded blithely unaware of the snares of ecological inference. A small number of more advanced analyses had appeared but none focused on Britain or even on England as a whole. Since our earlier article appeared, there have been two attempts to construct predictive models of Labour support by applying multivariate statistical analysis to aggregate-level data. As we show in this paper, both Barnett and Rasmussen produce models that are statistically less powerful than our own and are subject to various weaknesses, of which the most important is the failure to tackle the problem of ecological inference.


2015 ◽  
Author(s):  
Alejandro Corvalan ◽  
Emerson Melo ◽  
Robert P Sherman ◽  
Matthew Shum

Author(s):  
Derek J. N. Young ◽  
Sean M. A. Jeronimo ◽  
Derek J. Churchill ◽  
Van R. Kane ◽  
Andrew M. Latimer

2016 ◽  
Vol 24 (2) ◽  
pp. 263-272 ◽  
Author(s):  
Kosuke Imai ◽  
Kabir Khanna

In both political behavior research and voting rights litigation, turnout and vote choice for different racial groups are often inferred using aggregate election results and racial composition. Over the past several decades, many statistical methods have been proposed to address this ecological inference problem. We propose an alternative method to reduce aggregation bias by predicting individual-level ethnicity from voter registration records. Building on the existing methodological literature, we use Bayes's rule to combine the Census Bureau's Surname List with various information from geocoded voter registration records. We evaluate the performance of the proposed methodology using approximately nine million voter registration records from Florida, where self-reported ethnicity is available. We find that it is possible to reduce the false positive rate among Black and Latino voters to 6% and 3%, respectively, while maintaining the true positive rate above 80%. Moreover, we use our predictions to estimate turnout by race and find that our estimates yield substantially less bias and root mean squared error than standard ecological inference estimates. We provide open-source software to implement the proposed methodology.
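The Bayes's rule step described above can be sketched in a few lines. This is not the authors' open-source software: it assumes that surname and location are conditionally independent given race, so that P(race | surname, location) is proportional to P(race | surname) × P(location | race). The function name, category labels, and all probabilities are made up for illustration.

```python
def posterior_race(p_race_given_surname, p_geo_given_race):
    """Combine surname-based and location-based information with Bayes's rule.

    p_race_given_surname: dict mapping race -> P(race | surname)
    p_geo_given_race:     dict mapping race -> P(location | race)
    Assumes surname and location are independent given race (sketch assumption).
    """
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total <= 0:
        raise ValueError("probabilities must give positive total mass")
    # Normalize so the posterior sums to one across races.
    return {race: value / total for race, value in unnormalized.items()}


# Hypothetical inputs: a surname that is uninformative on its own,
# observed in an area where group "b" is three times as likely to live.
prior = {"a": 0.5, "b": 0.5}
likelihood = {"a": 0.2, "b": 0.6}
print(posterior_race(prior, likelihood))
```

In this toy case the location evidence moves an even surname-based split to roughly 25% versus 75%, which is the kind of sharpening that geocoded records add to surname information alone.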

