Explaining Rare Events in International Relations

2001 ◽  
Vol 55 (3) ◽  
pp. 693-715 ◽  
Author(s):  
Gary King ◽  
Langche Zeng

Some of the most important phenomena in international conflict are coded as “rare events”: binary dependent variables with dozens to thousands of times fewer events, such as wars and coups, than “nonevents.” Unfortunately, rare events data are difficult to explain and predict, a problem stemming from at least two sources. First, and most important, the data-collection strategies used in international conflict studies are grossly inefficient. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables. As it turns out, more efficient sampling designs exist for making valid inferences, such as sampling all available events (wars, for example) and a tiny fraction of nonevents (peace). This enables scholars to save as much as 99 percent of their (nonfixed) data-collection costs or to collect much more meaningful explanatory variables. Second, logistic regression, and other commonly used statistical procedures, can underestimate the probability of rare events. We introduce some corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. We also provide easy-to-use methods and software that link these two results, enabling both types of corrections to work simultaneously.

2001 ◽  
Vol 9 (2) ◽  
pp. 137-163 ◽  
Author(s):  
Gary King ◽  
Langche Zeng

We study rare events data, binary dependent variables with dozens to thousands of times fewer ones (events, such as wars, vetoes, cases of political activism, or epidemiological infections) than zeros (“nonevents”). In many literatures, these variables have proven difficult to explain and predict, a problem that seems to have at least two sources. First, popular statistical procedures, such as logistic regression, can sharply underestimate the probability of rare events. We recommend corrections that outperform existing methods and change the estimates of absolute and relative risks by as much as some estimated effects reported in the literature. Second, commonly used data collection strategies are grossly inefficient for rare events data. The fear of collecting data with too few events has led to data collections with huge numbers of observations but relatively few, and poorly measured, explanatory variables, such as in international conflict data with more than a quarter-million dyads, only a few of which are at war. As it turns out, more efficient sampling designs exist for making valid inferences, such as sampling all available events (e.g., wars) and a tiny fraction of nonevents (peace). This enables scholars to save as much as 99% of their (nonfixed) data collection costs or to collect much more meaningful explanatory variables. We provide methods that link these two results, enabling both types of corrections to work simultaneously, and software that implements the methods developed.
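
The abstract above does not state the correction itself. As a rough illustration only, the sketch below applies the standard prior correction for a logit intercept estimated on a sample that keeps all events and only a fraction of nonevents. The function names, and the inputs `tau` (assumed population fraction of events) and `ybar` (fraction of events in the sample), are hypothetical and not taken from the authors' software.

```python
import math


def prior_corrected_intercept(b0_sample: float, tau: float, ybar: float) -> float:
    """Adjust a logit intercept estimated on an event-enriched sample.

    b0_sample: intercept estimated on the sampled data
    tau:       assumed fraction of events in the population (hypothetical input)
    ybar:      fraction of events in the sample
    """
    if not (0.0 < tau < 1.0 and 0.0 < ybar < 1.0):
        raise ValueError("tau and ybar must lie strictly between 0 and 1")
    # Subtract the log of the ratio of sample odds to population odds.
    return b0_sample - math.log(((1.0 - tau) / tau) * (ybar / (1.0 - ybar)))


def event_probability(intercept: float, linear_part: float = 0.0) -> float:
    """Logistic probability for a given intercept and covariate contribution."""
    return 1.0 / (1.0 + math.exp(-(intercept + linear_part)))


# Intercept-only example: a sample with half events gives b0 = 0.
# If events make up 1% of the population, the corrected probability is 1%.
b0 = prior_corrected_intercept(0.0, tau=0.01, ybar=0.5)
p = event_probability(b0)
```

Only the intercept is adjusted in this sketch. The slope coefficients are left as estimated, which is why sampling on the dependent variable can still support valid inferences once the event fraction in the population is known or assumed.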


2003 ◽  
Vol 57 (3) ◽  
pp. 617-642 ◽  
Author(s):  
Gary King ◽  
Will Lowe

Despite widespread recognition that aggregated summary statistics on international conflict and cooperation miss most of the complex interactions among nations, the vast majority of scholars continue to employ annual, quarterly, or (occasionally) monthly observations. Daily events data, coded from some of the huge volume of news stories produced by journalists, have not been used much for the past two decades. We offer some reason to change this practice, which we feel should lead to considerably increased use of these data. We address advances in event categorization schemes and software programs that automatically produce data by “reading” news stories without human coders. We design a method that makes it feasible, for the first time, to evaluate these programs when they are applied in areas with the particular characteristics of international conflict and cooperation data, namely event categories with highly unequal prevalences, and where rare events (such as highly conflictual actions) are of special interest. We use this rare events design to evaluate one existing program, and find it to be as good as trained human coders, but obviously far less expensive to use. For large-scale data collections, the program dominates human coding. Our new evaluative method should be of use in international relations, as well as more generally in the field of computational linguistics, for evaluating other automated information extraction tools. We believe that the data created by programs similar to the one we evaluated should see dramatically increased use in international relations research. To facilitate this process, we are releasing with this article data on 3.7 million international events, covering the entire world for the past decade.


1997 ◽  
Vol 28 (3) ◽  
pp. 288-296 ◽  
Author(s):  
Jack S. Damico ◽  
Sandra K. Damico

One aspect of therapeutic discourse that has not been fully investigated in language intervention is the way that interactional dominance is established and maintained within the therapeutic encounter. Using various data collection strategies, therapeutic discourse from 10 language intervention sessions was collected and analyzed. By employing an analytic device known as the "dominant interpretive framework," the interactional styles and strategies of two speech-language pathologists were investigated. Data revealed several systematic patterns of interaction that constrained the ranges of interaction between the clinician and the client. Several implications regarding client empowerment, mediation, and assimilation into the school culture are discussed.


2021 ◽  
Vol 30 (20) ◽  
pp. 1190-1197
Author(s):  
Pam Hodge ◽  
Nora Cooper ◽  
Brian P Richardson

Aims: To offer child health student nurses a broader learning experience in practice with an autonomous choice of a volunteer placement area. To reflect the changing nature of health care and the move of care closer to home in the placement experience. To evaluate participants' experiences. Design: This study used descriptive and interpretative methods of qualitative data collection. This successive cross-sectional data collection ran from 2017 to 2020. All data were thematically analysed using Braun and Clarke's model. Methods: Data collection strategies included two focus groups (n=14) and written reflections (n=19). Results: Students identified their increased confidence, development as a professional, wider learning and community engagement. They also appreciated the relief from formal assessment of practice and the chance to focus on the experience. Conclusion: Students positively evaluated this experience, reporting a wider understanding of health and wellbeing in the community. Consideration needs to be given to risk assessments in the areas where students undertake placements and to embedding the experience into the overall curriculum.


2003 ◽  
Vol 9 (3) ◽  
pp. 125-129 ◽  
Author(s):  
Pamela Whitten ◽  
Inez Adams

We studied two rural telemedicine projects in the state of Michigan: one that enjoyed success and steady growth in activity, and one that experienced frustration and a lack of clinical utilization. Multiple data collection strategies were employed during study periods, which lasted approximately one year. Both projects enjoyed a grassroots approach and had dedicated project coordinators. However, the more successful project benefited from resources and expertise not available to the less successful project. In addition, the more successful project possessed a more formalized organizational structure for the telemedicine application. A comparison of the two projects leads to a simple conclusion. Telemedicine programmes are positioned within larger health organizations and do not operate in a vacuum. It is crucial that the organization in which it is intended to launch telemedicine is examined carefully first. Each organization operates within a larger environment, which is often constrained by fiscal, geographical and personnel factors. All these will affect the introduction of telemedicine.


Author(s):  
John C. Mace ◽  
Nipun Thekkummal ◽  
Charles Morisset ◽  
Aad Van Moorsel

2014 ◽  
Vol 20 (2) ◽  
pp. 347-372
Author(s):  
Scott Y. Lin ◽  
Carlos Seiglie

Studying the determinants of international conflict, researchers have found a series of influential variables, but few have addressed the robustness of the results to changes in the definition of the dependent variable, conflict. The two main sources for operationalizing conflict in empirical work are data on militarized interstate disputes (MIDs) and events data. In this paper, we find that a χ²-test indicates a correlation between events data and MIDs data. However, detailed regression analysis indicates that there are some contradictory findings depending on whether we use events data as opposed to MIDs data to measure conflict.
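
The abstract does not say how the test was set up. As a minimal sketch, assuming the simplest case of a 2x2 cross-tabulation (conflict recorded or not in events data, against conflict recorded or not in MIDs data), a Pearson chi-square statistic can be computed as follows. The function name and the cell counts are hypothetical and purely illustrative; they are not the paper's data.

```python
def chi_square_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square statistic (no continuity correction) for the table

                     MIDs: yes   MIDs: no
        events: yes      a           b
        events: no       c           d
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)


# Hypothetical counts, for illustration only.
statistic = chi_square_2x2(10, 20, 30, 40)

# With one degree of freedom, the 5% critical value is about 3.841.
associated_at_5_percent = statistic > 3.841
```

A statistic above the critical value would indicate that the two measures are associated. As the abstract notes, an association of this kind does not guarantee that regression results agree across the two measures.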


1983 ◽  
Vol 245 (5) ◽  
pp. R620-R623
Author(s):  
M. Berman ◽  
P. Van Eerdewegh

A measure is proposed for the information content of data with respect to models. A model, defined by a set of parameter values in a mathematical framework, is considered a point in a hyperspace. The proposed measure expresses the information content of experimental data as the contribution they make, in units of information bits, in defining a model to within a desired region of the hyperspace. This measure is then normalized to conventional statistical measures of uncertainty. It is shown how the measure can be used to estimate the information of newly planned experiments and help in decisions on data collection strategies.

