An Introduction to Bayesian Inference via Variational Approximations

2011 ◽  
Vol 19 (1) ◽  
pp. 32-47 ◽  
Author(s):  
Justin Grimmer

Markov chain Monte Carlo (MCMC) methods have facilitated an explosion of interest in Bayesian methods. MCMC is an incredibly useful and important tool but can face difficulties when used to estimate complex posteriors or models applied to large data sets. In this paper, we show how a recently developed tool in computer science for fitting Bayesian models, variational approximations, can be used to facilitate the application of Bayesian models to political science data. Variational approximations are often much faster than MCMC for fully Bayesian inference, and in some instances they make it feasible to estimate models that would otherwise be impossible to estimate. As a deterministic posterior approximation method, variational approximations are guaranteed to converge, and convergence is easily assessed. But variational approximations do have some limitations, which we detail below. Therefore, variational approximations are best suited to problems where fully Bayesian inference would otherwise be impossible. Through a series of examples, we demonstrate how variational approximations are useful for a variety of political science research, including models to describe legislative voting blocs and statistical models for political texts. The code that implements the models in this paper is available in the supplementary material.
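As a minimal illustration of the deterministic updates behind variational approximations (not the voting-bloc or text models from the paper), the sketch below runs coordinate-ascent mean-field variational inference (CAVI) on a toy conjugate model: a Gaussian with unknown mean and precision. The priors and variable names are illustrative assumptions.

```python
import numpy as np

# Mean-field VI (CAVI) for a Gaussian with unknown mean mu and precision tau.
# Model (a standard conjugate setup, not the paper's models):
#   x_i ~ N(mu, 1/tau),  mu | tau ~ N(mu0, 1/(lam0*tau)),  tau ~ Gamma(a0, b0)
# Factorized posterior q(mu, tau) = q(mu) q(tau) with closed-form updates.

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=1.5, size=500)     # synthetic data
N, xbar, x2sum = len(x), x.mean(), np.sum(x**2)

mu0, lam0, a0, b0 = 0.0, 1.0, 1.0, 1.0           # weak priors (assumed)
E_tau = a0 / b0                                  # initial guess for E[tau]

for _ in range(50):                              # CAVI iterations
    # q(mu) = N(mu_N, 1/lam_N)
    mu_N = (lam0 * mu0 + N * xbar) / (lam0 + N)
    lam_N = (lam0 + N) * E_tau
    E_mu, E_mu2 = mu_N, mu_N**2 + 1.0 / lam_N
    # q(tau) = Gamma(a_N, b_N)
    a_N = a0 + (N + 1) / 2
    b_N = b0 + 0.5 * (x2sum - 2 * xbar * N * E_mu + N * E_mu2
                      + lam0 * (E_mu2 - 2 * mu0 * E_mu + mu0**2))
    E_tau = a_N / b_N

print(f"E[mu] = {mu_N:.3f}, approx posterior sd = {np.sqrt(b_N / a_N):.3f}")
```

Because every update is deterministic, convergence can be assessed simply by checking that the variational parameters have stopped changing, which is the property the abstract highlights.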

BioScience ◽  
2020 ◽  
Author(s):  
Corey T Callaghan ◽  
Alistair G B Poore ◽  
Thomas Mesaglio ◽  
Angela T Moles ◽  
Shinichi Nakagawa ◽  
...  

Abstract Citizen science is fundamentally shifting the future of biodiversity research. Yet although citizen science observations contribute an increasingly large proportion of biodiversity data, they feature in only a relatively small percentage of research papers on biodiversity. We provide our perspective on three frontiers of citizen science research, areas that have to date received minimal scientific exploration but that we believe deserve greater attention because they present substantial opportunities for the future of biodiversity research: sampling the undersampled, capitalizing on citizen science's unique ability to sample poorly sampled taxa and regions of the world and thereby reducing taxonomic and spatial biases in global biodiversity data sets; estimating abundance and density in space and time, developing techniques to derive taxon-specific densities from presence/absence and presence-only data; and capitalizing on secondary data collection, moving beyond data on the occurrence of single species to gain further understanding of ecological interactions among species or habitats. The contribution of citizen science to understanding the important biodiversity questions of our time should be more fully realized.


1990 ◽  
Vol 2 ◽  
pp. 153-171 ◽  
Author(s):  
Michael S. Lewis-Beck ◽  
Andrew Skalaban

In political science research these days, the R² is out of fashion. A chorus of our best methodologists sounds notes of caution, at varying degrees of pitch. Berry and Feldman (1985, 15) remark in their popular regression monograph: "A researcher should be careful to recognize the limitations of R² as a measure of goodness of fit." In their more general statistics text, Hanushek and Jackson (1977, 59) claim that "one must be extremely cautious in interpreting the R² value for an estimation and particularly in comparing R² values for models that have been estimated with different data sets." Perhaps the most pointed attack comes from Achen (1982, 61), who argues that the R² "measures nothing of serious importance." His contention is that it should be abandoned and the standard error of the regression (SEE) substituted as a goodness-of-fit measure. Developing these lines of inquiry further, King (1986) provides the latest set of criticisms. Accordingly, "In most practical political science situations, it makes little sense to use [the R²]" (King 1986, 669). And, concerning the "proportion of variance explained" definition more particularly, "it is not clear how this interpretation adds meaning to political analyses" (King 1986, 678).
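To make the two goodness-of-fit measures under debate concrete, here is a minimal sketch (synthetic data, not from the article) computing both R² and the SEE for an ordinary least squares fit. Note that the SEE is expressed in the units of the dependent variable, which underlies Achen's preference for it.

```python
import numpy as np

# Fit y = b0 + b1*x by OLS on synthetic data, then compute the two
# competing goodness-of-fit measures discussed above.

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, size=100)
y = 1.0 + 0.5 * x + rng.normal(scale=2.0, size=100)

X = np.column_stack([np.ones_like(x), x])        # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # OLS estimates
resid = y - X @ beta

n, k = X.shape
r2 = 1 - resid @ resid / np.sum((y - y.mean())**2)  # proportion of variance "explained"
see = np.sqrt(resid @ resid / (n - k))              # typical prediction error, in units of y

print(f"R^2 = {r2:.3f}, SEE = {see:.3f}")
```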


2012 ◽  
Vol 24 (6) ◽  
pp. 1462-1486 ◽  
Author(s):  
Ke Yuan ◽  
Mark Girolami ◽  
Mahesan Niranjan

This letter considers how a number of modern Markov chain Monte Carlo (MCMC) methods can be applied for parameter estimation and inference in state-space models with point process observations. We quantified the efficiencies of these MCMC methods on synthetic data, and our results suggest that the Riemannian manifold Hamiltonian Monte Carlo method offers the best performance. We further compared this method with a previously tested variational Bayes method on two experimental data sets. Results indicate similar performance on the large data sets and superior performance on small ones. The work offers an extensive suite of MCMC algorithms evaluated on an important class of models for physiological signal analysis.
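For readers unfamiliar with the family of samplers compared in the letter, the sketch below implements plain Hamiltonian Monte Carlo with a leapfrog integrator on a toy Gaussian target. It is a deliberate simplification: the Riemannian-manifold variant the letter favors additionally adapts a position-dependent metric (mass matrix) to the local geometry of the posterior.

```python
import numpy as np

def hmc(log_prob, grad_log_prob, x0, n_samples=1000, eps=0.1, n_leap=20, seed=0):
    """Basic HMC: leapfrog trajectory plus a Metropolis accept/reject step."""
    rng = np.random.default_rng(seed)
    x, samples = np.asarray(x0, float), []
    for _ in range(n_samples):
        p = rng.normal(size=x.shape)                 # resample momentum
        x_new, p_new = x.copy(), p.copy()
        p_new += 0.5 * eps * grad_log_prob(x_new)    # leapfrog: half momentum step
        for _ in range(n_leap - 1):
            x_new += eps * p_new
            p_new += eps * grad_log_prob(x_new)
        x_new += eps * p_new
        p_new += 0.5 * eps * grad_log_prob(x_new)    # final half momentum step
        # accept/reject on the joint (position, momentum) energy
        log_accept = (log_prob(x_new) - 0.5 * p_new @ p_new) \
                   - (log_prob(x) - 0.5 * p @ p)
        if np.log(rng.uniform()) < log_accept:
            x = x_new
        samples.append(x.copy())
    return np.array(samples)

# Toy target: standard bivariate Gaussian
log_prob = lambda x: -0.5 * x @ x
grad_log_prob = lambda x: -x
draws = hmc(log_prob, grad_log_prob, x0=np.zeros(2))
print(draws.mean(axis=0), draws.std(axis=0))
```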


2021 ◽  
Author(s):  
Ijay Ushaka

Theory in Information Systems (IS) is very important to the development of the field. Theory building and theory testing seek to accumulate knowledge about the relationships between people and technology. Testing theory can be difficult to accomplish, especially when it involves humans, a diversity of methods and sources, multiple experiments, large data sets, and careful tuning of conditions and instruments. Crowdsourcing is a strategy for distributing activities to crowd workers, which suggests that it may be used to support theory testing. This exploratory study seeks to analyse the adoption of crowdsourcing in theory testing and to develop guidance for researchers to instantiate the strategy in their research projects. The study adopts the design science research paradigm to explore incorporating the crowdsourcing strategy in theory testing and to evaluate its viability and utility. Following the principles of design science research, the study is structured around the construction of several interconnected IS artefacts: 1) a conceptual framework articulating the main principles of theory testing; 2) a pattern model of theory testing, which codifies existing research approaches to theory testing; and 3) a decision tool, which codifies guidelines for researchers deciding which research activities to crowdsource. To build the conceptual framework and pattern model, the study conducts a systematic review of theory testing in the IS domain. Both the conceptual framework and the pattern model are then operationalized in the decision tool. The utility of the various artefacts is then assessed with the participation of research practitioners. This study is relevant because it synthesizes knowledge about theory testing, builds innovative artefacts supporting the adoption of crowdsourcing in theory testing, helps academic researchers understand the theory testing process, and enables them to adopt crowdsourcing for theory testing.


Author(s):  
Anthony Scime ◽  
Gregg R. Murray

Social scientists address some of the most pressing issues of society, such as health and wellness, government processes and citizen reactions, individual and collective knowledge, working conditions and socio-economic processes, and societal peace and violence. In an effort to understand these and many other consequential issues, social scientists invest substantial resources to collect large quantities of data, much of which are not fully explored. This chapter proffers the argument that privacy protection and responsible use are not the only ethical considerations related to mining social data. Given (1) the substantial resources allocated and (2) the leverage these "big data" give on such weighty issues, this chapter suggests that social scientists are ethically obligated to conduct comprehensive analyses of their data. Data mining techniques provide pertinent tools for identifying attributes in large data sets that may be useful for addressing important issues in the social sciences. By using these comprehensive analytical processes, a researcher may discover a set of attributes that is useful for making behavioral predictions, validating social science theories, and creating rules for understanding behavior in social domains. Taken together, these attributes and their values often reveal previously unknown knowledge that may have important applied and theoretical consequences for a domain, social scientific or otherwise. The chapter concludes with examples of important social problems studied using various data mining methodologies, along with a discussion of the ethical concerns they raise.
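As one hedged illustration of the attribute-level exploration the chapter advocates (not a method taken from the chapter itself), the following sketch assumes scikit-learn and ranks which attributes of a synthetic data set carry predictive signal using a tree ensemble. The data set and importance threshold are invented for demonstration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Fit a tree ensemble to a synthetic "behavioral" data set and rank the
# attributes by predictive importance, as a starting point for theory work.
X, y = make_classification(n_samples=2000, n_features=25,
                           n_informative=5, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

ranked = sorted(enumerate(model.feature_importances_),
                key=lambda t: t[1], reverse=True)
for idx, imp in ranked[:5]:                      # top 5 candidate attributes
    print(f"attribute {idx}: importance {imp:.3f}")
```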


2014 ◽  
Vol 22 (2) ◽  
pp. 224-242 ◽  
Author(s):  
Vito D'Orazio ◽  
Steven T. Landis ◽  
Glenn Palmer ◽  
Philip Schrodt

Due in large part to the proliferation of digitized text, much of it available for little or no cost from the Internet, political science research has experienced a substantial increase in the number of data sets and large-n research initiatives. As the ability to collect detailed information on events of interest expands, so does the need to efficiently sort through the volumes of available information. Automated document classification presents a particularly attractive methodology for accomplishing this task: it is efficient, widely applicable to a variety of data collection efforts, and considerably flexible in tailoring its application to specific research needs. This article offers a holistic review of the application of automated document classification for data collection in political science research by discussing the process in its entirety. We argue that a two-stage support vector machine (SVM) classification process offers advantages over other well-known alternatives, because SVMs are discriminative classifiers that effectively address two primary attributes of textual data: high dimensionality and extreme sparseness. Evidence for this claim is presented through a discussion of the efficiency gains derived from using automated document classification on the Militarized Interstate Dispute 4 (MID4) data collection project.
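Below is a minimal sketch of a two-stage SVM classifier in the spirit described above, assuming scikit-learn; the tiny corpus, labels, and category names are invented for illustration and are not MID4 data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Stage 1 separates relevant from irrelevant documents; stage 2 assigns
# the relevant documents to finer-grained categories.
docs = ["troops exchanged fire at the border",
        "parliament passed a new budget",
        "naval vessels seized a fishing boat",
        "the minister opened a trade fair"]
relevant = [1, 0, 1, 0]                          # stage-1 labels: dispute-related?

stage1 = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(docs, relevant)

rel_docs = [d for d, r in zip(docs, relevant) if r]
dispute_type = ["use of force", "seizure"]       # stage-2 labels (hypothetical)
stage2 = make_pipeline(TfidfVectorizer(), LinearSVC()).fit(rel_docs, dispute_type)

new = ["troops exchanged fire near the frontier"]
if stage1.predict(new)[0]:                       # only classify relevant documents
    print("dispute type:", stage2.predict(new)[0])
```

Training the second stage only on documents the first stage deems relevant is what makes the pipeline efficient: the expensive fine-grained coding is never applied to the (typically much larger) irrelevant pool.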


2020 ◽  
Vol 30 (5) ◽  
pp. 374-381 ◽  
Author(s):  
Benjamin J. Narang ◽  
Greg Atkinson ◽  
Javier T. Gonzalez ◽  
James A. Betts

The analysis of time series data is common in nutrition and metabolism research for quantifying the physiological responses to various stimuli. Reducing the many data points of a time series to one or more summary statistics can help quantify and communicate the overall response in a more straightforward way and in line with a specific hypothesis. Nevertheless, researchers have adopted many different summary statistics, and some approaches remain complex. The time-intensive nature of such calculations can be a burden, especially for large data sets, and may therefore introduce computational errors that are difficult to recognize and correct. In this short commentary, the authors introduce a newly developed tool that automates many of the processes commonly used by researchers for discrete time series analysis, with particular emphasis on how the tool may be implemented within nutrition and exercise science research.
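As a concrete example of the kind of discrete time series reduction being automated, the sketch below computes total and incremental area under the curve (AUC and iAUC) for a hypothetical postprandial glucose response using the trapezoid rule. The data and the pointwise-clipping convention for iAUC are illustrative assumptions, not the tool's actual implementation.

```python
import numpy as np

def trapz(y, x):
    """Trapezoid-rule area under y(x)."""
    return float(np.sum(np.diff(x) * (y[1:] + y[:-1]) / 2.0))

t = np.array([0, 15, 30, 45, 60, 90, 120])                # minutes after a meal
glucose = np.array([5.0, 6.8, 7.9, 7.4, 6.6, 5.9, 5.2])   # mmol/L (hypothetical)

auc = trapz(glucose, t)                                   # total AUC
iauc = trapz(np.clip(glucose - glucose[0], 0, None), t)   # area above baseline only

print(f"AUC = {auc:.1f}, iAUC = {iauc:.1f} mmol/L*min")
```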


Author(s):  
John A. Hunt

Spectrum-imaging is a useful technique for comparing different processing methods on very large data sets that are identical for each method. This paper is concerned with comparing methods of electron energy-loss spectroscopy (EELS) quantitative analysis on the Al-Li system. The spectrum-image analyzed here was obtained from an Al-10 at% Li foil aged to produce δ' precipitates that can span the foil thickness. Two 1024-channel EELS spectra offset in energy by 1 eV were recorded and stored at each pixel in the 80x80 spectrum-image (25 Mbytes). An energy range of 39-89 eV (20 channels/eV) is represented. During processing, the spectra are either subtracted to create an artifact-corrected difference spectrum, or the energy offset is numerically removed and the spectra are added to create a normal spectrum. The spectrum-images are processed into 2D floating-point images using methods and software described in [1].
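The two per-pixel reductions described above can be sketched in a few lines of NumPy. The arrays below are synthetic and smaller than the original 80x80 spectrum-image, but the subtraction and shift-and-add steps follow the description (a 1 eV offset corresponds to 20 channels at 20 channels/eV).

```python
import numpy as np

# Synthetic spectrum-image: each pixel holds two 1024-channel EELS spectra
# offset in energy by 1 eV (= 20 channels). Shapes are illustrative only.
nx, ny, nch, shift = 8, 8, 1024, 20
rng = np.random.default_rng(0)
spec_a = rng.poisson(100.0, size=(nx, ny, nch)).astype(float)
spec_b = rng.poisson(100.0, size=(nx, ny, nch)).astype(float)

# (1) artifact-corrected difference spectrum: subtract the offset pair so
# that fixed-pattern channel-to-channel artifacts cancel
diff_image = spec_a - spec_b

# (2) normal spectrum: numerically remove the 1 eV offset, then add
normal_image = spec_a[..., :-shift] + spec_b[..., shift:]

print(diff_image.shape, normal_image.shape)
```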

