Generalized Ideal Point Models for Time-Varying and Missing-Data Inference

Author(s):  
Robert Kubinec

This paper presents an item-response theory parameterization of ideal points that unifies existing approaches to ideal point models while also extending them. For time-varying inference, the model permits ideal points to evolve as a random walk, as a stationary autoregressive process, or as a semi-parametric Gaussian process. For missing data, the model implements a two-stage selection adjustment to account for non-ignorable missingness. In addition, the ideal point model is extended to handle new distributions, including continuous, positive-continuous, and ordinal data. To enable modeling of data sets that mix discrete and continuous responses, I incorporate joint modeling of different distributions. Finally, I also address ways of implementing Bayesian inference with big data sets, including variational inference and within-chain MCMC parallelization.
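The random-walk option described above can be illustrated with a short simulation (a hedged sketch: the 2PL logistic link, the parameter names, and the scales are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
T, n_items = 20, 50     # time periods and votes per period
rho, sigma = 1.0, 0.3   # rho = 1: random walk; |rho| < 1: stationary AR(1)

# Latent ideal point evolving over time: x_t = rho * x_{t-1} + sigma * eps_t.
x = np.empty(T)
x[0] = rng.normal()
for t in range(1, T):
    x[t] = rho * x[t - 1] + sigma * rng.normal()

# 2PL item-response link: P(vote = 1) = logistic(gamma * x_t - beta).
gamma = rng.normal(1.0, 0.25, size=(T, n_items))   # discriminations
beta = rng.normal(0.0, 1.0, size=(T, n_items))     # difficulties
p = 1.0 / (1.0 + np.exp(-(gamma * x[:, None] - beta)))
votes = rng.binomial(1, p)   # simulated roll-call matrix, one row per period
print(votes.shape)  # → (20, 50)
```

Setting `rho` below one switches the same sketch to the stationary autoregressive specification.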

2021 ◽  
pp. 107699862110571
Author(s):  
Kuan-Yu Jin ◽  
Yi-Jhen Wu ◽  
Hui-Fang Chen

For surveys of complex issues that entail multiple steps, multiple reference points, and nongradient attributes (e.g., social inequality), this study proposes a new multiprocess model that integrates ideal-point and dominance approaches into a treelike structure (IDtree). In the IDtree, an ideal-point approach describes an individual’s attitude and then a dominance approach describes their tendency for using extreme response categories. Evaluation of IDtree performance via two empirical data sets showed that the IDtree fit these data better than other models. Furthermore, simulation studies showed a satisfactory parameter recovery of the IDtree. Thus, the IDtree model sheds light on the response processes of a multistage structure.
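A minimal sketch of the tree structure described above, assuming a squared-distance kernel for the ideal-point node and a logistic (2PL-style) kernel for the dominance node; the function names and the three-category layout are illustrative, not the IDtree's exact parameterization:

```python
import math

def ideal_point_node(theta, delta, a=1.0):
    """P(endorse): squared-distance ideal-point kernel, highest when
    the attitude theta is closest to the item location delta."""
    return math.exp(-a * (theta - delta) ** 2)

def dominance_node(theta, b, a=1.0):
    """P(extreme category): monotone logistic (2PL-style) dominance kernel."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def idtree_probs(theta, delta, b):
    """Category probabilities for a three-branch tree: first decide
    agreement (ideal-point node), then extremity (dominance node)."""
    p_agree = ideal_point_node(theta, delta)
    p_extreme = dominance_node(theta, b)
    return {
        "disagree": 1.0 - p_agree,
        "agree_moderate": p_agree * (1.0 - p_extreme),
        "agree_extreme": p_agree * p_extreme,
    }

probs = idtree_probs(theta=0.2, delta=0.0, b=1.0)
print({k: round(v, 3) for k, v in probs.items()})
```

The branch probabilities multiply along the tree, so the three categories always sum to one.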


2016 ◽  
Vol 110 (4) ◽  
pp. 631-656 ◽  
Author(s):  
KOSUKE IMAI ◽  
JAMES LO ◽  
JONATHAN OLMSTED

Estimation of ideological positions among voters, legislators, and other actors is central to many subfields of political science. Recent applications include large data sets of various types including roll calls, surveys, and textual and social media data. To overcome the resulting computational challenges, we propose fast estimation methods for ideal points with massive data. We derive the expectation-maximization (EM) algorithms to estimate the standard ideal point model with binary, ordinal, and continuous outcome variables. We then extend this methodology to dynamic and hierarchical ideal point models by developing variational EM algorithms for approximate inference. We demonstrate the computational efficiency and scalability of our methodology through a variety of real and simulated data. In cases where a standard Markov chain Monte Carlo algorithm would require several days to compute ideal points, the proposed algorithm can produce essentially identical estimates within minutes. Open-source software is available for implementing the proposed methods.
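The flavor of the EM approach can be sketched for the binary probit ideal point model, where the E-step replaces each latent utility with its truncated-normal expectation and the M-step runs least-squares updates. This is an illustrative reconstruction under simplifying assumptions (fixed iteration count, ad hoc identification by standardizing the ideal points), not the authors' released implementation:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n, J = 200, 40

# Simulate binary roll calls from a probit ideal point model:
# y*_ij = alpha_j + beta_j * x_i + e_ij,  y_ij = 1 if y*_ij > 0.
x_true = rng.normal(size=n)
alpha_true = rng.normal(size=J)
beta_true = rng.normal(size=J)
Y = (alpha_true + np.outer(x_true, beta_true)
     + rng.normal(size=(n, J)) > 0).astype(float)

# EM on the latent utilities.
x = rng.normal(size=n)
a, b = np.zeros(J), np.ones(J)
for _ in range(50):
    m = a + np.outer(x, b)
    # E-step: truncated-normal expectation of y* given the observed vote.
    ystar = np.where(Y == 1,
                     m + norm.pdf(m) / norm.cdf(m),
                     m - norm.pdf(m) / norm.cdf(-m))
    # M-step: per-item least squares of ystar on the current ideal points ...
    X = np.column_stack([np.ones(n), x])
    coef, *_ = np.linalg.lstsq(X, ystar, rcond=None)
    a, b = coef[0], coef[1]
    # ... then per-person update of the ideal point given the item parameters.
    x = (ystar - a) @ b / (b @ b)
    x = (x - x.mean()) / x.std()   # fix location and scale for identification

# Recovered ideal points should correlate strongly with the truth
# (up to sign, which is not identified).
print(round(float(abs(np.corrcoef(x, x_true)[0, 1])), 2))
```

Because every step is a closed-form expectation or regression, each iteration costs far less than an MCMC sweep, which is the source of the speedups the abstract reports.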


2020 ◽  
Vol 14 (1) ◽  
pp. 12
Author(s):  
Julien Chevallier

In the Dynamic Conditional Correlation with Mixed Data Sampling (DCC-MIDAS) framework, we scrutinize the correlations between the macro-financial environment and CO2 emissions in the aftermath of the COVID-19 diffusion. The main original idea is that the economy’s lockdown will alleviate part of the greenhouse gas burden that human activity places on the environment. We capture the time-varying correlations between, on the one hand, U.S. COVID-19 confirmed cases, deaths, and recovered cases recorded by the Johns Hopkins Coronavirus Center and, on the other hand, the U.S. Total Industrial Production Index and Total Fossil Fuels CO2 emissions from the U.S. Energy Information Administration. High-frequency data for U.S. stock markets are included through five-minute realized volatility from the Oxford-Man Institute of Quantitative Finance. The DCC-MIDAS approach indicates that COVID-19 confirmed cases and deaths negatively influence the macro-financial variables and CO2 emissions. We find that the time-varying correlations of CO2 emissions with either COVID-19 confirmed cases or COVID-19 deaths decrease sharply, by 15% to 30%. The main takeaway is that we track correlations and reveal a recessionary outlook against the background of the pandemic.
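A hedged sketch of the building block behind such time-varying correlation estimators: an exponentially weighted moving correlation (far simpler than DCC-MIDAS, which adds GARCH dynamics and a MIDAS long-run component). The series and the decay parameter here are synthetic illustrations:

```python
import numpy as np

def ewma_corr(x, y, lam=0.94):
    """Exponentially weighted moving correlation: the simplest
    building block behind DCC-type time-varying correlations."""
    x = np.asarray(x, float) - np.mean(x)
    y = np.asarray(y, float) - np.mean(y)
    vx = vy = cxy = 1e-8
    out = []
    for xi, yi in zip(x, y):
        vx = lam * vx + (1 - lam) * xi * xi
        vy = lam * vy + (1 - lam) * yi * yi
        cxy = lam * cxy + (1 - lam) * xi * yi
        out.append(cxy / np.sqrt(vx * vy))
    return np.array(out)

# Synthetic series: positively related in the first half,
# negatively related in the second (a regime change at t = 100).
rng = np.random.default_rng(2)
z = rng.normal(size=200)
x = z + 0.3 * rng.normal(size=200)
y = np.concatenate([z[:100], -z[100:]]) + 0.3 * rng.normal(size=200)
rho = ewma_corr(x, y)
print(round(float(rho[95]), 2), round(float(rho[199]), 2))
```

The estimated correlation tracks the regime change, flipping from strongly positive to strongly negative, which is the kind of sharp shift the abstract quantifies.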


2019 ◽  
Vol 19 (1) ◽  
pp. 3-23
Author(s):  
Aurea Soriano-Vargas ◽  
Bernd Hamann ◽  
Maria Cristina F de Oliveira

We present an integrated interactive framework for the visual analysis of time-varying multivariate data sets. As part of our research, we performed in-depth studies concerning the applicability of visualization techniques to obtain valuable insights. We consolidated the considered analysis and visualization methods in one framework, called TV-MV Analytics. TV-MV Analytics effectively combines visualization and data mining algorithms providing the following capabilities: (1) visual exploration of multivariate data at different temporal scales, and (2) a hierarchical small multiples visualization combined with interactive clustering and multidimensional projection to detect temporal relationships in the data. We demonstrate the value of our framework for specific scenarios, by studying three use cases that were validated and discussed with domain experts.


2003 ◽  
Vol 1836 (1) ◽  
pp. 132-142 ◽  
Author(s):  
Brian L. Smith ◽  
William T. Scherer ◽  
James H. Conklin

Many states have implemented large-scale transportation management systems to improve mobility in urban areas. These systems are highly prone to missing and erroneous data, which results in drastically reduced data sets for analysis and real-time operations. Imputation is the practice of filling in missing data with estimated values. Currently, the transportation industry generally does not use imputation as a means for handling missing data. Other disciplines have recognized the importance of addressing missing data and, as a result, methods and software for imputing missing data are becoming widely available. The feasibility and applicability of imputing missing traffic data are addressed, and a preliminary analysis of several heuristic and statistical imputation techniques is performed. Results from the case study were excellent and indicate that the statistical techniques are more accurate while maintaining the natural characteristics of the data.
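As an illustration of the kind of comparison described above, the toy example below imputes gaps in a synthetic 5-minute traffic-volume series two ways; the daily profile, noise level, and missingness pattern are invented for this sketch, not drawn from the study's data:

```python
import numpy as np
import pandas as pd

# Hypothetical day of 5-minute traffic volumes with sensor dropouts.
rng = np.random.default_rng(3)
t = np.arange(288)
true = 400 + 300 * np.sin(2 * np.pi * t / 288)      # smooth daily profile
observed = pd.Series(true + rng.normal(0, 10, 288))
missing = rng.choice(np.arange(1, 287), size=40, replace=False)
observed.iloc[missing] = np.nan

# Heuristic 1: fill with the overall mean. Heuristic 2: linear interpolation.
mean_imp = observed.fillna(observed.mean())
interp_imp = observed.interpolate()

def rmse(est):
    return float(np.sqrt(np.mean((est.iloc[missing] - true[missing]) ** 2)))

# Interpolation tracks the daily profile; mean imputation flattens it.
print(round(rmse(mean_imp)), round(rmse(interp_imp)))
```

Techniques that respect the temporal structure recover the missing values with far lower error, mirroring the paper's point that method choice matters for preserving the data's natural characteristics.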


2018 ◽  
Author(s):  
Robert Kubinec ◽  
Sharan Grewal

Is power-sharing an effective way for endangered transitional democracies to reduce political tensions and improve government performance? We provide one of the first quantitative tests of this question in Tunisia, the Arab Spring's only success story. We argue that power-sharing may reduce polarization for a limited time, but at the cost of undermining democratic institutions. To measure polarization, we examine all roll-call votes from Tunisia's first and second post-transition parliaments. We employ a time-varying ideal point model and examine whether power-sharing agreements led to convergence in political parties' ideal points. Our analysis reveals that Tunisia's national unity government in 2015 temporarily moderated political tensions and allowed for parliamentary activity to resume. However, despite a broadening of the coalition in mid-2016, polarization reemerged and crucial legislation stalled. Moreover, longitudinal survey data suggest that the failure of power-sharing in Tunisia contributed to disillusionment with political parties, parliament, and democracy.


2011 ◽  
pp. 24-32 ◽  
Author(s):  
Nicoleta Rogovschi ◽  
Mustapha Lebbah ◽  
Younès Bennani

Most traditional clustering algorithms are limited to handling data sets that contain either continuous or categorical variables. However, data sets with mixed types of variables are common in the data mining field. In this paper we introduce a weighted self-organizing map for clustering, analysis, and visualization of mixed (continuous/binary) data. The weights and prototypes are learned simultaneously, ensuring an optimized clustering: the higher a variable's weight, the more the clustering algorithm takes into account the information carried by that variable. The learning of these topological maps is combined with a weighting process over the different variables, computing weights that influence the quality of the clustering. We illustrate the power of this method with data sets taken from a public repository: a handwritten digit data set, the Zoo data set, and three other mixed data sets. The results show good quality of the topological ordering and homogeneous clustering.
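The role of variable weights in a mixed-data distance can be sketched as follows (an illustrative weighted distance combining squared Euclidean and Hamming terms; the self-organizing map's actual learning rule is not reproduced here):

```python
import numpy as np

def mixed_distance(x_cont, x_bin, proto_cont, proto_bin, w_cont, w_bin):
    """Weighted distance for a mixed record: squared Euclidean on the
    continuous block, Hamming mismatch on the binary block. A variable
    with a larger weight counts for more in the assignment."""
    d_cont = np.sum(w_cont * (x_cont - proto_cont) ** 2)
    d_bin = np.sum(w_bin * (x_bin != proto_bin))
    return float(d_cont + d_bin)

# Two toy prototypes (continuous part, binary part) and one record.
protos = [
    (np.array([0.0, 0.0]), np.array([0, 0])),
    (np.array([5.0, 5.0]), np.array([1, 1])),
]
x_cont, x_bin = np.array([0.4, -0.2]), np.array([0, 1])
w_cont, w_bin = np.ones(2), np.ones(2)

d = [mixed_distance(x_cont, x_bin, pc, pb, w_cont, w_bin) for pc, pb in protos]
assign = int(np.argmin(d))
print(assign)  # → 0
```

Raising the weight on a single variable pulls assignments toward prototypes that match that variable, which is how the weighting process steers the clustering.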


Author(s):  
Fereshteh Shahoveisi ◽  
Atena Oladzad ◽  
Luis E. del Rio Mendoza ◽  
Seyedali Hosseinirad ◽  
Susan Ruud ◽  
...  

The polyploid nature of canola (Brassica napus) represents a challenge for the accurate identification of single nucleotide polymorphisms (SNPs) and the detection of quantitative trait loci (QTL). In this study, combinations of eight phenotype scoring systems and six SNP calling and filtering parameters were evaluated for their efficiency in detecting QTL associated with response to Sclerotinia stem rot, caused by Sclerotinia sclerotiorum, in two doubled haploid (DH) canola mapping populations. Most QTL were detected in the lesion length, relative area under the disease progress curve (rAUDPC) for lesion length, and binomial plant-mortality data sets. Binomial data derived from lesion size were less efficient for QTL detection. Including additional phenotypic sets in the analysis increased the number of significant QTL by 2.3-fold; however, the continuous data sets were more efficient. Of the two filtering parameters used to analyze genotyping-by-sequencing (GBS) data, imputation of missing data increased QTL detection in one population with a high level of missing data but not in the other. Including segregation-distorted SNPs increased QTL detection but did not significantly affect their R2 values. Twelve of the 16 detected QTL were on chromosomes A02 and C01; the rest were on A07, A09, and C03. Marker A02-7594120, associated with a QTL on chromosome A02, was detected in both populations. The results suggest that the impact of genotypic variant calling and filtering parameters may be population-dependent, while deriving additional phenotype scoring systems, such as rAUDPC and binary mortality data sets, may improve QTL detection efficiency.
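The rAUDPC phenotype mentioned above is a standard trapezoid-rule summary; a minimal sketch, with hypothetical assessment dates and severity ratings:

```python
import numpy as np

def raudpc(days, severity, max_severity=100.0):
    """Relative AUDPC: trapezoid-rule area under the disease progress
    curve, divided by the maximum possible area over the window."""
    days = np.asarray(days, float)
    sev = np.asarray(severity, float)
    audpc = np.sum((sev[1:] + sev[:-1]) / 2.0 * np.diff(days))
    return float(audpc / (max_severity * (days[-1] - days[0])))

# Hypothetical severity ratings (0-100 scale) at four assessment dates.
print(round(raudpc([0, 3, 7, 14], [0, 20, 50, 80]), 3))  # → 0.446
```

Normalizing by the maximum possible area bounds the statistic in [0, 1], which makes disease progress comparable across trials with different assessment windows.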

