Improving Data Analysis in Political Science

1969 ◽  
Vol 21 (4) ◽  
pp. 641-654 ◽  
Author(s):  
Edward R. Tufte

Students of politics use statistical and quantitative techniques to: summarize a large body of numbers into a small collection of typical values; confirm (and perhaps sanctify) the results of the analysis by using tests of statistical significance that help protect against sampling and measurement error; discover what's going on in their data and expose some new relationships; and inform their audience what's going on in the data.

2007 ◽  
Vol 14 (1) ◽  
pp. 79-88 ◽  
Author(s):  
D. V. Divine ◽  
F. Godtliebsen

Abstract. This study proposes and justifies a Bayesian approach to modeling wavelet coefficients and finding statistically significant features in wavelet power spectra. The approach draws on ideas from scale-space smoothing methods and wavelet data analysis. We treat each scale of the discrete wavelet decomposition as a sequence of independent random variables and then apply Bayes' rule to construct the posterior distribution of the smoothed wavelet coefficients. Samples drawn from the posterior are subsequently used to estimate the true wavelet spectrum at each scale. The method offers two different significance testing procedures for wavelet spectra. A traditional approach assesses statistical significance against a red noise background. The second procedure tests for homoscedasticity of the wavelet power by assessing whether the derivative of the spectrum differs significantly from zero at each point. Case studies with simulated data and climatic time series show the method to be a potentially useful tool in data analysis.
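
To make the per-scale model concrete, here is a minimal sketch in Python using PyWavelets: a conjugate Gaussian model is applied to each scale's detail coefficients and posterior samples of the wavelet power are drawn. The zero-mean prior and the noise-variance heuristic are our assumptions for illustration, not necessarily the authors' exact formulation.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

rng = np.random.default_rng(0)

# Toy series: a sine with period 64 plus white noise stands in for a
# record with a known oscillation.
n = 512
signal = np.sin(2 * np.pi * np.arange(n) / 64) + rng.normal(0, 0.5, n)

coeffs = pywt.wavedec(signal, "db4", level=5)

# Per scale: independent Gaussian model c_i ~ N(theta_i, sigma2) with a
# zero-mean Gaussian prior theta_i ~ N(0, tau2). The conjugate posterior
# is N(shrink * c_i, shrink * sigma2) with shrink = tau2 / (tau2 + sigma2).
# sigma2 is estimated from the finest-scale coefficients (a common
# heuristic, not necessarily the authors' choice).
sigma2 = (np.median(np.abs(coeffs[-1])) / 0.6745) ** 2

n_draws = 1000
for j, c in enumerate(coeffs[1:], start=1):  # detail coefficients only
    tau2 = max(np.var(c) - sigma2, 1e-12)    # moment-based prior variance
    shrink = tau2 / (tau2 + sigma2)
    post_mean = shrink * c
    post_var = shrink * sigma2               # scalar posterior variance
    # Posterior draws of the smoothed coefficients, then the implied
    # wavelet power (squared coefficients) with a credible interval.
    draws = rng.normal(post_mean, np.sqrt(post_var), size=(n_draws, c.size))
    power = draws ** 2
    lo, hi = np.percentile(power, [2.5, 97.5], axis=0)
    print(f"scale {j}: mean power {power.mean():.3f}, "
          f"95% CI width {np.mean(hi - lo):.3f}")
```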


1983 ◽  
Vol 38 ◽  
pp. 1-9
Author(s):  
Herbert F. Weisberg

We are now entering a new era of computing in political science. The first era was marked by punched-card technology. Initially, the most sophisticated analyses possible were frequency counts and tables produced on a counter-sorter, a machine that specialized in chewing up data cards. By the early 1960s, batch processing on large mainframe computers became the predominant mode of data analysis, with turnaround times of up to a week. By the late 1960s, turnaround time was cut to a matter of minutes, and OSIRIS and then SPSS (and more recently SAS) were developed as general-purpose data analysis packages for the social sciences. Even today, use of these packages in batch mode remains one of the most efficient means of carrying out large-scale data analysis.


2020 ◽  
Author(s):  
Sebastian Bergrath ◽  
Tobias Strapatsas ◽  
Michael Tuemen ◽  
Thorsten Reith ◽  
Marc Deussen ◽  
...  

Abstract Background: The outbreak of the coronavirus disease 2019 (COVID-19), caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), led to severe disruption of social and economic life. The present study analyzes the impact of the local COVID-19 epidemic on emergency resources across all hospitals in a major urban center (Moenchengladbach, Germany). Methods: An observational multicenter study was performed involving all four acute care hospitals. Systemic and emergency department (ED) parameters from weeks 4 to 24 of 2020 were compared to the corresponding period in 2019 for each hospital and in a summative data analysis using a logistic regression model. Outcomes: ED visits, ED to hospital admission, ED to intensive care unit (ICU) admission, medical specialties of admitted patients, and work-related accidents. Results: In week 9 of 2020, the first SARS-CoV-2-positive patients were detected in our region. All hospitals decided to minimize elective admissions to ensure operational capability for COVID-19 patients. The summative number of ED visits dropped from 34,659 to 28,008. From week 8 onward, weekly numbers decreased by up to 38% to 48% per hospital, and began to rise again from week 16. The pooled data analysis showed statistically significant decreases in outpatient ED visits (20,152 vs. 16,477, p<0.001), hospital admissions of ED patients (14,507 vs. 11,531, p<0.001), and work-related accidents (2,290 vs. 1,468, p<0.001). The decrease in admissions from the ED to the ICU did not reach statistical significance (2,093 vs. 1,566, p=0.255). The decline in ED cases was mainly driven by a decrease in non-trauma and non-surgical patients. Conclusion: The regional COVID-19 outbreak led to significantly reduced ED contacts after the first COVID-19 cases appeared. Hospital admissions and ED-to-ICU admissions also decreased, which is potentially dangerous because the ratio of emergency outpatients to inpatients remained stable; one can therefore assume that patients with severe medical problems often did not seek ED care. The decline in patients occurred earlier than in other German hospitals, and contrasts with findings from the U.S. and Italy, where ED visits and hospital admissions in medical disciplines increased.
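
As a rough illustration of the pooled comparison, the sketch below fits a binomial GLM (logistic regression on aggregated counts) asking whether the odds of hospital admission, given an ED visit, differed between years. The counts come from the abstract, but the model specification is our simplified stand-in, not the authors' exact analysis.

```python
import numpy as np
import statsmodels.api as sm

# Counts from the abstract: ED visits that led to hospital admission vs.
# all ED visits, for 2019 and 2020.
admitted = np.array([14507, 11531])
visits = np.array([34659, 28008])
year_2020 = np.array([0, 1])  # indicator: 0 = 2019, 1 = 2020

# Binomial GLM on aggregated counts (logistic regression): does the odds
# of admission, given an ED visit, differ between the two years?
endog = np.column_stack([admitted, visits - admitted])  # (successes, failures)
exog = sm.add_constant(year_2020)
result = sm.GLM(endog, exog, family=sm.families.Binomial()).fit()
print(result.summary())
```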


Author(s):  
Soobia Saeed ◽  
N. Z. Jhanjhi ◽  
Mehmood Naqvi ◽  
Mamoona Humayun ◽  
Vasaki Ponnusamy

Human beings are prone to error. When effectiveness and rapid progress depend on the ability to acknowledge faults and repair them quickly, corrective actions that identify and rectify such errors in a minimum of time are essential. This commentary reviews an audit-module software application for handling IT complaints, a notable instrument from the field of data analysis that quickly and effectively assesses the defects and grievances reported by a company's users. The aims of this study are to evaluate the statistical significance of the relationship between client complaint behaviour and client loyalty, to evaluate the impact of strong responsiveness on client loyalty, to measure the effect of client complaint behaviour on service quality, and to test the impact of service quality on client loyalty.


2020 ◽  
Vol 19 (3) ◽  
pp. 339-357
Author(s):  
Papar Kananurak ◽  
Aeggarchat Sirisankanan

Purpose There are several different factors that can influence self-employment. However, there is little evidence stemming from direct examination of the impact of financial development (FD) on self-employment. This study aims to formulate empirical specification models to examine the effect of FD on self-employment. Design/methodology/approach Panel data analysis of 136 sample countries was performed for the period from 2000 to 2017. This study implemented the new financial development index developed by the International Monetary Fund (IMF) to examine the impact of FD on self-employment. Panel data analyses, including pooled, fixed-effects and random-effects models, were carried out. Findings The empirical results show that the financial institutions index has a significant negative impact on self-employment of considerable magnitude, whereas the financial markets index does not show any statistical significance. The results also show that the government effectiveness index has a negative and statistically significant effect on self-employment. Originality/value There are several different factors which can influence self-employment. Nevertheless, there is little evidence from direct examination of the impact of FD on self-employment. This study investigated the impact of FD on self-employment using the new FD index created by the IMF. The findings may help policymakers to implement FD along with other institutional policies to control self-employment.
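
As a sketch of what such a specification can look like in code, the snippet below fits pooled, fixed-effects and random-effects panel models with the Python linearmodels package. All variable names and the generated data are hypothetical placeholders, not the paper's data or exact specification.

```python
import numpy as np
import pandas as pd
from linearmodels.panel import PooledOLS, PanelOLS, RandomEffects

rng = np.random.default_rng(1)

# Synthetic stand-in for the paper's panel: 136 countries over 2000-2017.
# Variable names ('fin_inst', 'fin_mkt', 'gov_eff', 'self_emp') are
# hypothetical placeholders for the IMF indices and the outcome.
countries = [f"c{i:03d}" for i in range(136)]
years = range(2000, 2018)
idx = pd.MultiIndex.from_product([countries, years], names=["country", "year"])
df = pd.DataFrame({
    "fin_inst": rng.uniform(0, 1, len(idx)),   # financial institutions index
    "fin_mkt": rng.uniform(0, 1, len(idx)),    # financial markets index
    "gov_eff": rng.normal(0, 1, len(idx)),     # government effectiveness
}, index=idx)
# Outcome with a built-in negative FD effect, purely for illustration.
df["self_emp"] = (40 - 15 * df["fin_inst"] - 2 * df["gov_eff"]
                  + rng.normal(0, 3, len(idx)))

exog = df[["fin_inst", "fin_mkt", "gov_eff"]].assign(const=1.0)

pooled = PooledOLS(df["self_emp"], exog).fit()
fixed = PanelOLS(df["self_emp"], df[["fin_inst", "fin_mkt", "gov_eff"]],
                 entity_effects=True).fit()   # country fixed effects
random = RandomEffects(df["self_emp"], exog).fit()
print(fixed.params)
```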


2019 ◽  
Vol 5 (1) ◽  
pp. 17 ◽  
Author(s):  
José Solís-Lemus ◽  
Brian Stramer ◽  
Greg Slabaugh ◽  
Constantino Reyes-Aldasoro

This paper presents a novel software framework, called macrosight, which incorporates routines to detect, track, and analyze the shape and movement of objects, with special emphasis on macrophages. The key feature of macrosight is an algorithm to assess the changes of direction derived from cell–cell contact, where an interaction is assumed to occur. The main biological motivation is the determination of certain cell interactions that influence cell migration. Thus, the main objective of this work is to provide insights into the notion that interactions between cell structures cause a change in orientation. Macrosight analyzes the change of direction of cells before and after they come in contact with another cell. Interactions are determined when the cells overlap and form clumps of two or more cells. The framework integrates a segmentation technique capable of detecting overlapping cells and a tracking framework into a tool for analyzing the trajectories of cells before and after they overlap. Preliminary results show promise for the proposed analysis and hypothesis, and lay the groundwork for further developments. Extensive experimentation and data analysis show, with statistical significance, that under certain conditions, movement before and after an interaction differs from movement in control cases.
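
The core measurement, a change of direction around a contact event, can be illustrated compactly. The NumPy sketch below is our own simplification, not macrosight's actual routines: it compares the mean heading of a trajectory in a window before and after a hypothetical contact frame.

```python
import numpy as np

def heading(track):
    """Mean unit heading (direction of travel) of a 2-D trajectory,
    given as an (n, 2) array of positions over consecutive frames."""
    steps = np.diff(track, axis=0)
    mean_step = steps.mean(axis=0)
    return mean_step / np.linalg.norm(mean_step)

def direction_change(track, contact_frame, window=10):
    """Angle (degrees) between the mean heading in the `window` frames
    before and after `contact_frame` (the frame where cells overlap)."""
    before = heading(track[max(contact_frame - window, 0):contact_frame + 1])
    after = heading(track[contact_frame:contact_frame + window + 1])
    cosang = np.clip(np.dot(before, after), -1.0, 1.0)
    return np.degrees(np.arccos(cosang))

# Toy trajectory that turns at frame 10, mimicking a contact event.
t1 = np.cumsum(np.tile([1.0, 0.0], (10, 1)), axis=0)
t2 = t1[-1] + np.cumsum(np.tile([0.4, 0.9], (10, 1)), axis=0)
track = np.vstack([[0.0, 0.0], t1, t2])
print(f"turn at contact: {direction_change(track, 10):.1f} degrees")
```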


2015 ◽  
Vol 105 (11) ◽  
pp. 1400-1407 ◽  
Author(s):  
L. V. Madden ◽  
D. A. Shah ◽  
P. D. Esker

The P value (significance level) is possibly the most widely used, and also misused, quantity in data analysis. P has been heavily criticized on philosophical and theoretical grounds, especially from a Bayesian perspective. In contrast, a properly interpreted P has been strongly defended as a measure of evidence against the null hypothesis, H0. We discuss the meaning of P and null-hypothesis statistical testing, and present some key arguments concerning their use. P is the probability of observing data as extreme as, or more extreme than, the data actually observed, conditional on H0 being true. However, P is often mistakenly equated with the posterior probability that H0 is true conditional on the data, which can lead to exaggerated claims about the effect of a treatment, experimental factor or interaction. Fortunately, a lower bound for the posterior probability of H0 can be approximated using P and the prior probability that H0 is true. When one is completely uncertain about the truth of H0 before an experiment (i.e., when the prior probability of H0 is 0.5), the posterior probability of H0 is much higher than P, which means that one needs P values much lower than typically accepted for statistical significance (e.g., P = 0.05) to have strong evidence against H0. We support the continued use of a properly interpreted P as one component of a data analysis that emphasizes data visualization and estimation of effect sizes (treatment effects).
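
This calibration is easy to make concrete. The sketch below computes the Sellke–Bayarri–Berger minimum Bayes factor bound, one standard way of turning a P value and a prior into a lower bound on the posterior probability of H0; whether this is the exact approximation the authors use is our assumption.

```python
import numpy as np

def posterior_h0_lower_bound(p, prior_h0=0.5):
    """Approximate lower bound on P(H0 | data) from a p-value, using the
    minimum Bayes factor bound BF >= -e * p * ln(p) (valid for p < 1/e),
    as in Sellke, Bayarri & Berger (2001)."""
    if not 0 < p < 1 / np.e:
        raise ValueError("bound requires 0 < p < 1/e")
    min_bf = -np.e * p * np.log(p)          # smallest plausible BF for H0
    prior_odds = prior_h0 / (1 - prior_h0)  # prior odds in favor of H0
    post_odds = prior_odds * min_bf         # lower bound on posterior odds
    return post_odds / (1 + post_odds)

# With a 50:50 prior, p = 0.05 still leaves at least ~29% posterior
# probability that H0 is true -- far higher than the p-value suggests.
for p in (0.05, 0.01, 0.001):
    print(f"p = {p:<6} -> P(H0 | data) >= {posterior_h0_lower_bound(p):.3f}")
```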


2018 ◽  
Vol 96 (7) ◽  
pp. 738-748 ◽  
Author(s):  
Peter D. Wentzell ◽  
Chelsi C. Wicks ◽  
Jez W.B. Braga ◽  
Liz F. Soares ◽  
Tereza C.M. Pastore ◽  
...  

The analysis of multivariate chemical data is commonplace in fields ranging from metabolomics to forensic classification. Many of these studies rely on exploratory visualization methods that represent multidimensional data in spaces of lower dimensionality, such as hierarchical cluster analysis (HCA) or principal components analysis (PCA). However, such methods rely on assumptions of independent measurement errors with uniform variance, and can fail to reveal important information when these assumptions are violated, as they often are for chemical data. This work demonstrates how two alternative methods, maximum likelihood principal components analysis (MLPCA) and projection pursuit analysis (PPA), can reveal chemical information hidden from more traditional techniques. The experimental data used to compare the methods consist of near-infrared (NIR) reflectance spectra from 108 samples of wood derived from four species of Brazilian trees. The measurement error characteristics of the spectra are examined, and it is shown that, by incorporating measurement error information into the data analysis (through MLPCA) or using alternative projection criteria (i.e., PPA), samples can be separated by species. These techniques are proposed as powerful tools for multivariate data analysis in chemistry.
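
To make the MLPCA idea concrete, the sketch below implements a simplified maximum likelihood PCA for the uncorrelated-error case via alternating weighted least squares, broadly in the spirit of Wentzell et al. (1997). It is an illustrative toy on synthetic data, not the authors' implementation.

```python
import numpy as np

def mlpca_uncorrelated(X, var, rank, n_iter=100, tol=1e-10):
    """Maximum likelihood PCA for uncorrelated measurement errors.

    Finds a rank-`rank` estimate Xhat minimizing the error-weighted sum
    of squares sum((X - Xhat)**2 / var) by alternating row- and
    column-wise weighted least squares.
    """
    V = np.linalg.svd(X, full_matrices=False)[2][:rank].T  # n x rank basis
    obj_old = np.inf
    for _ in range(n_iter):
        # Row step: weighted LS projection of each row of X onto basis V.
        T = np.empty((X.shape[0], rank))
        for i in range(X.shape[0]):
            Vw = V / var[i][:, None]          # W_i V, W_i = diag(1/var_i)
            T[i] = np.linalg.solve(Vw.T @ V, Vw.T @ X[i])
        U = np.linalg.svd(T @ V.T, full_matrices=False)[0][:, :rank]
        # Column step: weighted LS projection of each column onto basis U.
        P = np.empty((X.shape[1], rank))
        for j in range(X.shape[1]):
            Uw = U / var[:, j][:, None]
            P[j] = np.linalg.solve(Uw.T @ U, Uw.T @ X[:, j])
        Xhat = U @ P.T
        obj = np.sum((X - Xhat) ** 2 / var)   # error-weighted objective
        if abs(obj_old - obj) <= tol * obj:
            break
        obj_old = obj
        V = np.linalg.svd(Xhat, full_matrices=False)[2][:rank].T
    return Xhat, obj

# Toy demo: rank-2 "spectra" with known heteroscedastic noise variances.
rng = np.random.default_rng(2)
true = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 50))
var = rng.uniform(0.01, 0.5, size=true.shape)
X = true + rng.normal(0.0, np.sqrt(var))
Xhat, obj = mlpca_uncorrelated(X, var, rank=2)
print(f"error-weighted SSE of the rank-2 fit: {obj:.1f}")
```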


1980 ◽  
Vol 37 (2) ◽  
pp. 290-294 ◽  
Author(s):  
K. H. Reckhow

Water quality sampling and data analysis are undertaken to acquire and convey information. Therefore, when data are presented, the form of the presentation should be such that information transfer is high. For example, a graph or table of average values is often an inadequate summary of batches of data. As an alternative, a technique originally developed for exploratory data analysis is presented that can display several sets of data on a single graph, indicating median, spread, skew, size of data set, and statistical significance of the median. This technique is useful in the study of phosphorus concentration variability in lakes. Additions to, and modifications of, this procedure are easily made and will often enhance the analysis of a particular problem. Some suggestions are made for useful modifications of the plots for the study and display of lake phosphorus data and models.

Key words: limnology, exploratory data analysis, statistics, phosphorus, water quality, models, lakes
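
The description (median, spread, skew, set size, and significance of the median on one graph) matches the notched box-and-whisker plot of exploratory data analysis. Assuming that is the technique intended, a minimal matplotlib rendition with hypothetical lake data might look like this:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(3)

# Hypothetical total-phosphorus concentrations (ug/L) for three lakes;
# lognormal draws mimic the skewed batches typical of nutrient data.
samples = [rng.lognormal(mean=m, sigma=0.5, size=n)
           for m, n in [(2.5, 40), (3.0, 25), (2.2, 60)]]

fig, ax = plt.subplots()
# notch=True draws a confidence notch around each median, so that
# non-overlapping notches suggest a significant difference in medians;
# box widths scaled by sqrt(n) encode the size of each data set.
widths = [0.3 * np.sqrt(len(s) / max(len(t) for t in samples))
          for s in samples]
ax.boxplot(samples, labels=["Lake A", "Lake B", "Lake C"], notch=True,
           widths=widths)
ax.set_ylabel("Total phosphorus (ug/L)")
plt.show()
```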


2019 ◽  
Author(s):  
Jochem H. Smit ◽  
Yichen Li ◽  
Eliza M. Warszawik ◽  
Andreas Herrmann ◽  
Thorben Cordes

Abstract. Single-molecule fluorescence microscopy studies of bacteria provide unique insights into the mechanisms of cellular processes and protein machineries in ways that are unrivalled by any other technique. With the cost of microscopes dropping and fully automated microscopes becoming available, the volume of microscopy data produced has increased tremendously. These developments have moved the bottleneck of throughput from image acquisition and sample preparation to data analysis. Furthermore, requirements for analysis procedures have become more stringent, given that various journals now require data and analysis procedures to be made available. To address this, we have developed a new data analysis package for fluorescence microscopy data of rod-like cells. Our software, ColiCoords, structures microscopy data at the single-cell level and implements a coordinate system describing each cell. This allows the transformation of Cartesian coordinates of both cellular images (e.g. from transmission light or fluorescence microscopy) and single-molecule localization microscopy (SMLM) data into cellular coordinates. Using this transformation, many cells can be combined to increase the statistical significance of fluorescence microscopy datasets of any kind. ColiCoords is open source, implemented in the programming language Python, and extensively documented. This allows it to be modified for specific needs, and analysis procedures to be inspected and published. By providing a format that allows easy sharing of code and associated data, we intend to promote open and reproducible research. The source code and documentation can be found via the project's GitHub page.
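
The coordinate transform at the heart of the package is easy to sketch generically. The snippet below (plain NumPy, not ColiCoords' actual API) maps Cartesian points, such as SMLM localizations, to a longitudinal/transverse coordinate pair defined by a sampled cell midline.

```python
import numpy as np

def to_cell_coords(points, midline):
    """Map Cartesian points to cell coordinates (l, d): arc length along
    the cell midline and signed perpendicular distance from it. A generic
    illustration of the kind of transform ColiCoords performs.

    `midline` is an (m, 2) array of densely sampled points on the cell axis.
    """
    seg = np.diff(midline, axis=0)
    arclen = np.concatenate([[0.0], np.cumsum(np.linalg.norm(seg, axis=1))])
    out = np.empty((len(points), 2))
    for k, p in enumerate(points):
        d2 = np.sum((midline - p) ** 2, axis=1)
        i = np.argmin(d2)                     # nearest midline sample
        out[k, 0] = arclen[i]                 # longitudinal coordinate l
        t = seg[min(i, len(seg) - 1)]         # local tangent
        n_vec = np.array([-t[1], t[0]])       # normal (90-degree rotation)
        out[k, 1] = np.dot(p - midline[i], n_vec) / np.linalg.norm(n_vec)
    return out

# Toy example: a gently curved midline and two localizations near it.
s = np.linspace(0, 10, 500)
midline = np.column_stack([s, 0.05 * s ** 2])
locs = np.array([[2.0, 0.6], [7.0, 2.0]])
print(to_cell_coords(locs, midline))
```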

