Assessing Fit Quality and Testing for Misspecification in Binary-Dependent Variable Models

2012 ◽  
Vol 20 (4) ◽  
pp. 480-500 ◽  
Author(s):  
Justin Esarey ◽  
Andrew Pierce

In this article, we present a technique and critical test statistic for assessing the fit of a binary-dependent variable model (e.g., a logit or probit). We examine how closely a model's predicted probabilities match the observed frequency of events in the data set, and whether these deviations are systematic or merely noise. Our technique allows researchers to detect problems with a model's specification that obscure substantive understanding of the underlying data-generating process, such as missing interaction terms or unmodeled nonlinearities. We also show that these problems go undetected by the fit statistics most commonly used in political science.
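The core idea of comparing predicted probabilities with observed event frequencies can be illustrated with a simple binned comparison. The sketch below is not the authors' technique or test statistic; the function name `calibration_table`, the equal-width binning, and the default of five bins are all assumptions made for illustration.

```python
def calibration_table(probs, outcomes, n_bins=5):
    """Compare mean predicted probability with observed event frequency per bin.

    probs    -- predicted probabilities in [0, 1] (e.g., from a logit or probit)
    outcomes -- observed binary outcomes (0 or 1), same length as probs
    n_bins   -- number of equal-width probability bins (an illustrative choice)
    Returns a list of (count, mean_predicted, observed_frequency) tuples,
    one per non-empty bin, ordered from the lowest bin to the highest.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        # A prediction of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))

    rows = []
    for members in bins:
        if not members:
            continue  # skip bins with no observations
        n = len(members)
        mean_pred = sum(p for p, _ in members) / n
        obs_freq = sum(y for _, y in members) / n
        rows.append((n, mean_pred, obs_freq))
    return rows


if __name__ == "__main__":
    table = calibration_table([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1], n_bins=2)
    for n, mean_pred, obs_freq in table:
        print(n, round(mean_pred, 3), round(obs_freq, 3))
```

Large gaps between the second and third columns suggest poor fit in that probability range; deciding whether such gaps are systematic or merely noise requires a formal test such as the one the article proposes.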

1987 ◽  
Vol 65 (3) ◽  
pp. 691-707 ◽  
Author(s):  
A. F. L. Nemec ◽  
R. O. Brinkhurst

A data matrix of 23 generic or subgeneric taxa versus 24 characters and a shorter matrix of 15 characters were analyzed by means of ordination, cluster analyses, parsimony, and compatibility methods (the last two of which are phylogenetic tree reconstruction methods) and the results were compared inter alia and with traditional methods. Various measures of fit for evaluating the parsimony methods were employed. There were few compatible characters in the data set, and much homoplasy, but most analyses separated a group based on Stylaria from the rest of the family, which could then be separated into four groups, recognized here for the first time as tribes (Naidini, Derini, Pristinini, and Chaetogastrini). There was less consistency of results within these groups. Modern methods produced results that do not conflict with traditional groupings. The Jaccard coefficient minimizes the significance of symplesiomorphy and complete linkage avoids chaining effects and corresponds to actual similarities, unlike single or average linkage methods, respectively. Ordination complements cluster analysis. The Wagner parsimony method was superior to the less flexible Camin–Sokal approach and produced better measure of fit statistics. All of the aforementioned methods contain areas susceptible to subjective decisions but, nevertheless, they lead to a complete disclosure of both the methods used and the assumptions made, and facilitate objective hypothesis testing rather than the presentation of conflicting phylogenies based on the different, undisclosed premises of manual approaches.


Author(s):  
Alexander Baturo ◽  
Johan A. Elkink

How can one assess which countries select more experienced leaders for the highest office? There is wide variation in prior career paths of national leaders within, and even more so between, regime types. It is therefore challenging to obtain a truly comparative measure of political experience; empirical studies have to rely on proxies instead. This article proposes PolEx, a measure of political experience that abstracts away from the details of career paths and generalizes based on the duration, quality and breadth of an individual's experience in politics. The analysis draws on a novel data set of around 2,000 leaders from 1950 to 2017 and uses a Bayesian latent variable model to estimate PolEx. The article illustrates how the new measure can be used comparatively to assess whether democracies select more experienced leaders. The authors find that while on average they do, the difference with non-democracies has declined dramatically since the early 2000s. Future research may leverage PolEx to investigate the role of prior political experience in, for example, policy making and crisis management.


2021 ◽  
Vol 13 (1) ◽  
pp. 56
Author(s):  
Josephine Njeri Ngure ◽  
Anthony Gichuhi Waititu

A nonparametric autoregressive conditional heteroscedastic (ARCH) model for financial return series is considered, in which the conditional mean and volatility functions are estimated nonparametrically using the Nadaraya-Watson kernel estimator. A test statistic is considered for an unknown abrupt change point in volatility; it takes into account conditional heteroskedasticity, dependence, heterogeneity, and the fourth moment of financial returns, since kurtosis is a function of the fourth moment. The test is based on the L2 norm of the conditional variance functions of the squared residuals. A nonparametric change point estimator for the volatility of financial returns is further obtained. The consistency of the estimator is shown theoretically and through simulation. The estimator is applied to change point estimation in the volatility of a United States Dollar/Kenya Shilling exchange rate returns data set. Through a binary segmentation procedure, three change points in the volatility of the exchange rate returns are estimated and further accounted for.
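For readers unfamiliar with the Nadaraya-Watson estimator mentioned above, the following is a minimal sketch of the generic kernel regression estimate at a single point. The Gaussian kernel, the function name `nadaraya_watson`, and the fixed `bandwidth` argument are assumptions for illustration; the abstract does not state which kernel or bandwidth selection rule the authors use.

```python
import math


def nadaraya_watson(x0, xs, ys, bandwidth):
    """Nadaraya-Watson kernel regression estimate of E[Y | X = x0].

    x0        -- point at which to evaluate the estimate
    xs, ys    -- observed predictor and response values (equal length)
    bandwidth -- positive smoothing parameter (assumed fixed here)
    Uses a Gaussian kernel, which is an illustrative choice.
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    # Unnormalised Gaussian weights; the normalising constant cancels
    # in the ratio below.
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    # Weighted average of the responses.
    return sum(w * y for w, y in zip(weights, ys)) / total


if __name__ == "__main__":
    # Two observations placed symmetrically around 0 get equal weight.
    print(nadaraya_watson(0.0, [-1.0, 1.0], [0.0, 2.0], bandwidth=1.0))
```

In an autoregressive setting one would typically take `xs` to be lagged returns and `ys` to be current returns (for the conditional mean) or squared residuals (for the conditional variance); that pairing is a common convention and is stated here as an assumption, not as the authors' exact construction.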


Symmetry ◽  
2020 ◽  
Vol 12 (1) ◽  
pp. 80 ◽  
Author(s):  
Martynas Narmontas ◽  
Petras Rupšys ◽  
Edmundas Petrauskas

In this work, we employ stochastic differential equations (SDEs) to model tree stem taper. SDE stem taper models have some theoretical advantages over the commonly employed regression-based stem taper modeling techniques, as SDE models have both simple analytic forms and a high level of accuracy. We perform fixed- and mixed-effect parameters estimation for the stem taper models by developing an approximated maximum likelihood procedure and using a data set of longitudinal measurements from 319 mountain pine trees. The symmetric Vasicek- and asymmetric Gompertz-type diffusion processes used adequately describe stem taper evolution. The proposed SDE stem taper models are compared to four regression stem taper equations and four volume equations. Overall, the best goodness-of-fit statistics are produced by the mixed-effect parameters SDEs stem taper models. All results are obtained in the Maple computer algebra system.


2012 ◽  
Vol 36 (4) ◽  
pp. 81-94 ◽  
Author(s):  
Emmanouil Benetos ◽  
Simon Dixon

In this work, a probabilistic model for multiple-instrument automatic music transcription is proposed. The model extends the shift-invariant probabilistic latent component analysis method, which is used for spectrogram factorization. Proposed extensions support the use of multiple spectral templates per pitch and per instrument source, as well as a time-varying pitch contribution for each source. Thus, this method can effectively be used for multiple-instrument automatic transcription. In addition, the shift-invariant aspect of the method can be exploited for detecting tuning changes and frequency modulations, as well as for visualizing pitch content. For note tracking and smoothing, pitch-wise hidden Markov models are used. For training, pitch templates from eight orchestral instruments were extracted, covering their complete note range. The transcription system was tested on multiple-instrument polyphonic recordings from the RWC database, a Disklavier data set, and the MIREX 2007 multi-F0 data set. Results demonstrate that the proposed method outperforms leading approaches from the transcription literature, using several error metrics.


2018 ◽  
Vol 74 (5) ◽  
pp. 1053-1073 ◽  
Author(s):  
Wolfgang Zenk-Möltgen ◽  
Esra Akdeniz ◽  
Alexia Katsanidou ◽  
Verena Naßhoven ◽  
Ebru Balaban

Purpose: Open data and data sharing should improve transparency of research. The purpose of this paper is to investigate how different institutional and individual factors affect the data sharing behavior of authors of research articles in sociology and political science.
Design/methodology/approach: Desktop research analyzed attributes of sociology and political science journals (n=262) from their websites. A second data set of articles (n=1,011; published 2012-2014) was derived from ten of the main journals (five from each discipline) and stated data sharing was examined. A survey of the authors used the Theory of Planned Behavior to examine motivations, behavioral control, and perceived norms for sharing data. Statistical tests (Spearman’s ρ, χ²) examined correlations and associations.
Findings: Although many journals have a data policy for their authors (78 percent in sociology, 44 percent in political science), only around half of the empirical articles stated that the data were available, and for only 37 percent of the articles could the data be accessed. Journals with higher impact factors, those with a stated data policy, and younger journals were more likely to offer data availability. Of the authors surveyed, 446 responded (44 percent). Statistical analysis indicated that authors’ attitudes, reported past behavior, social norms, and perceived behavioral control affected their intentions to share data.
Research limitations/implications: Less than 50 percent of the authors contacted provided responses to the survey. Results indicate that data sharing would improve if journals had explicit data sharing policies but authors also need support from other institutions (their universities, funding councils, and professional associations) to improve data management skills and infrastructures.
Originality/value: This paper builds on previous similar research in sociology and political science and explains some of the barriers to data sharing in social sciences by combining journal policies, published articles, and authors’ responses to a survey.


2018 ◽  
Vol 28 (8) ◽  
pp. 2418-2438
Author(s):  
Xi Shen ◽  
Chang-Xing Ma ◽  
Kam C Yuen ◽  
Guo-Liang Tian

Bilateral correlated data are often encountered in medical research such as ophthalmologic (or otolaryngologic) studies, in which each unit contributes information from paired organs to the data analysis, and the measurements from such paired organs are generally highly correlated. Various statistical methods have been developed to tackle intra-class correlation in bilateral correlated data analysis. In practice, it is very important to adjust for the effect of confounders on statistical inferences, since ignoring either the intra-class correlation or the confounding effect may lead to biased results. In this article, we propose three approaches for testing common risk difference for stratified bilateral correlated data under the assumption of equal correlation. Five confidence intervals of common difference of two proportions are derived. The performance of the proposed test methods and confidence interval estimations is evaluated by Monte Carlo simulations. The simulation results show that the score test statistic outperforms other statistics in the sense that the former has robust type I error rates with high powers. The score confidence interval induced from the score test statistic performs satisfactorily in terms of coverage probabilities with reasonable interval widths. A real data set from an otolaryngologic study is used to illustrate the proposed methodologies.


2018 ◽  
Vol 28 (9) ◽  
pp. 2868-2875
Author(s):  
Zhongxue Chen ◽  
Qingzhong Liu ◽  
Kai Wang

Several gene- or set-based association tests have been proposed recently in the literature. Powerful statistical approaches are still highly desirable in this area. In this paper we propose a novel statistical association test, which uses information of the burden component and its complement from the genotypes. This new test statistic has a simple null distribution, which is a special and simplified variance-gamma distribution, and its p-value can be easily calculated. Through a comprehensive simulation study, we show that the new test can control type I error rate and has superior detecting power compared with some popular existing methods. We also apply the new approach to a real data set; the results demonstrate that this test is promising.


Author(s):  
Naveen K. Bansal ◽  
Mehdi Maadooliat ◽  
Steven J. Schrodi

We consider a multiple hypotheses problem with directional alternatives in a decision theoretic framework. We obtain an empirical Bayes rule subject to a constraint on mixed directional false discovery rate (mdFDR≤α) under the semiparametric setting where the distribution of the test statistic is parametric, but the prior distribution is nonparametric. We propose separate priors for the left tail and right tail alternatives, as may be required for many applications. The proposed Bayes rule is compared through simulation against rules proposed by Benjamini and Yekutieli and by Efron. We illustrate the proposed methodology for two sets of data from biological experiments: HIV-transfected cell-line mRNA expression data, and a quantitative trait genome-wide SNP data set. We have developed a user-friendly web-based Shiny app for the proposed method, which is available at https://npseb.shinyapps.io/npseb/. The HIV and SNP data can be directly accessed, and the results presented in this paper can be reproduced.


1990 ◽  
Vol 66 (6) ◽  
pp. 600-605 ◽  
Author(s):  
R. T. Morton ◽  
T. I. Grabowski ◽  
S. J. Titus ◽  
G. M. Bonnor

In 1985, a survey of nine provinces and two territories was conducted to summarize operational tree volume estimation methods. Based on those results, six tree volume estimation functions were evaluated to answer the question: can a single model be used nation-wide for tree volume estimation? The six models were fitted to nation-wide data for 980 white spruce trees distributed nearly equally among the provinces and territories. Based on goodness of fit statistics and analysis of residuals, Schumacher's (1933) model and the Quebec combined variable model performed marginally better than the others. Further, the analyses did not reveal any significant differences between territories and provinces. It appears that any of these models could be applied to broad regions of Canada without suffering significant losses in accuracy.

