Understanding the exploratory/confirmatory data analysis continuum: Moving beyond the “replication crisis”.

2021 ◽  
Author(s):  
Dustin A. Fife ◽  
Joseph Lee Rodgers
1987 ◽  
Vol 26 (02) ◽  
pp. 77-88 ◽  
Author(s):  
K. Abt

Summary: Confirmatory Data Analysis (CDA) in randomized comparative (“controlled”) studies with many variables and/or time points of interest finds its limitations in the multiplicity of desired inferential statements which leads to unfeasibly small adjusted significance levels (“Bonferronization”) and, thereby, to unduly increased risks of not rejecting false hypotheses. In general, analytical models adequate for such complex data structures and suitable for practical use do not exist as yet. Exploratory Data Analysis (EDA), on the other hand, is usually intended to generate hypotheses and not to lead to final conclusions based on the results of the study. In this paper, it is proposed to fill the conceptual gap between CDA and EDA by “Descriptive Data Analysis” (“DDA”) which concept is mainly based on descriptive inferential statements. The results of a DDA in a controlled study are interpreted simultaneously on the basis of the investigator’s experience with respect to numerically relevant treatment effect differences and on “descriptive significances” as they appear in “near regular” patterns corresponding to the resulting relevant effect differences. A DDA may also contain confirmatory parts and/or tests on global hypotheses at a prechosen maximum risk α of erroneously rejecting true hypotheses. The paper is in parts expository and is addressed to investigators as well as statisticians.
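The multiplicity problem described in this summary can be illustrated with a minimal sketch. It is not taken from the paper: it assumes a plain Bonferroni adjustment (α divided by the number of tests) and a two-sided z-test, and the standardized effect of 2.8 is an arbitrary value chosen for illustration. The function names are hypothetical.

```python
from statistics import NormalDist


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance level after a Bonferroni adjustment for m tests."""
    return alpha / m


def power_two_sided_z(effect: float, alpha: float) -> float:
    """Power of a two-sided z-test whose statistic has mean `effect` and unit variance."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)  # critical value at the given level
    # Probability that the statistic falls beyond either critical value.
    return (1 - z.cdf(crit - effect)) + z.cdf(-crit - effect)


if __name__ == "__main__":
    effect = 2.8  # assumed standardized effect, for illustration only
    for m in (1, 10, 100):
        a = bonferroni_alpha(0.05, m)
        beta = 1 - power_two_sided_z(effect, a)  # risk of not rejecting a false hypothesis
        print(f"tests={m:4d}  adjusted alpha={a:.5f}  type II risk={beta:.3f}")
```

As the number of tests grows, the adjusted level shrinks and the risk of not rejecting a false hypothesis rises, which is the limitation the summary attributes to CDA with many variables or time points.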


2019 ◽  
Author(s):  
Dustin Fife ◽  
Joseph Lee Rodgers

In light of the “replication crisis,” some (e.g., Nelson, Simmons, & Simonsohn, 2018) advocate for greater policing and transparency in research methods. Others (Baumeister, 2016; Finkel, Eastwick, & Reis, 2017; Goldin-Meadow, 2016; Levenson, 2017) argue against rigid requirements that may inadvertently restrict discovery. We embrace both positions and argue that proper understanding and implementation of the well-established paradigm of Exploratory Data Analysis (EDA; Tukey, 1977) is necessary to push beyond the replication crisis. Unfortunately, many don’t realize EDA exists (Goldin-Meadow, 2016), fail to understand the philosophy and proper tools for exploration (Baumeister, 2016), or reject EDA as unscientific (Lindsay, 2015). EDA’s mistreatment is unfortunate, and is usually based on misunderstanding the nature and goal of EDA. We develop an expanded typology that situates EDA, CDA, and rough CDA in the same framework with fishing, p-hacking, and HARKing, and argue that most, if not all, questionable research practices (QRPs) would be resolved by understanding and implementing the EDA/CDA gradient. We argue most psychological research is “rough CDA,” which has often and inadvertently used the wrong tools. We conclude with guidelines about how these typologies can be integrated into a cumulative research program that is necessary to move beyond the replication crisis.


2020 ◽  
Author(s):  
Alfonso J. Rodriguez-Morales ◽  
Ram Kumar Singh ◽  
S.S. Singh ◽  
A. K. Pandey ◽  
Vinod Kumar ◽  
...  

BACKGROUND: The highly contagious coronavirus disease (COVID-19) pandemic has affected nearly all nations across the world. It emerged as the most swiftly spreading disease worldwide, and more than 2934 lakh people were affected within four months, as of April 26, 2020. Its first epicenter was Wuhan, China, in December 2019. Currently, the most affected population and the new epicenter of the coronavirus are in the United States of America (USA). It is identified as the most severe pandemic in human history during the past 100 years. Because no specific medication is available, the World Health Organization (WHO) suggested various precautionary measures and social distancing to restrict the spread of COVID-19. The administrations of various nations, including the Indian government, called for regional and local lockdowns. OBJECTIVE: We predict the confirmed COVID-19 cases for May 2020, map the magnitude of COVID-19 for Indian states, and model the paucity of COVID-19 with a statistical confirmatory data analysis model for the declining rate of cases, expressed as a proportion of the Indian population. METHODS: The ARIMA model was used to predict short-term case counts based on a moving average of past confirmed cases. The restriction of the COVID-19 pandemic was analyzed with predicted cases for May 2020; the prediction at 95 percent confidence is more than 2.5 lakh cases. RESULTS: The confirmatory data analysis model estimates that the paucity of cases will take between six and eighteen months. The confirmatory model considers recovery rate and social, economic, and government policy. Complete recovery from COVID-19 cases takes, on average, more than the next ten months. CONCLUSIONS: The disease impacts also depend on administrative support and on local people's support for self-quarantine and other measures. More than 17% of India's Gross Domestic Product (GDP) is based on agricultural production; owing to the prolonged effect of the disease and the extended lockdown period, it will be severely affected. However, all economic activities will resume at full intensity after the complete paucity of COVID-19 spread.


1990 ◽  
Vol 3 (1) ◽  
pp. 3-12 ◽  
Author(s):  
Frank H. Duffy ◽  
Kenneth Jones ◽  
Peter Bartels ◽  
Marilyn Albert ◽  
Gloria B. McAnulty ◽  
...  

2010 ◽  
Vol 3 (1) ◽  
pp. 4-8
Author(s):  
Fernando Marmolejo-Ramos

In 1968 John Tukey gave a speech at the American Psychological Association in San Francisco about the relevance of proper data analysis in Psychology (Tukey, 1969). His closing message was that “data analysis needs to be both exploratory and confirmatory” (p. 90). Exploratory data analysis (or EDA) is an approach to analysing data in order to formulate sound hypotheses, whereas confirmatory data analysis (CDA) is a method to test those hypotheses (a.k.a., statistical hypothesis testing). As Tukey announced in his speech, these two analytical tools have been, and are somewhat still, at odds. This special issue presents sixteen papers that cover relevant topics in EDA and CDA with the purpose of bringing together seemingly disparate issues.


2020 ◽  
Author(s):  
Alfonso J. Rodriguez-Morales ◽  
Ram Kumar Singh ◽  
S. S. Singh ◽  
A. K. Pandey ◽  
Vinod Kumar ◽  
...  

Abstract Background: The highly contagious coronavirus disease (COVID-19) pandemic has affected nearly all nations across the world. It emerged as the most swiftly spreading disease worldwide, and more than 2934 lakh people were affected within four months, as of April 26, 2020. Its first epicenter was Wuhan, China, in December 2019. Currently, the most affected population and the new epicenter of the coronavirus are in the United States of America (USA). The administrations of various nations, including the Indian government, called for regional and local lockdowns. We predict the confirmed COVID-19 cases for May 2020, map the magnitude of COVID-19 for Indian states, and model the paucity of COVID-19 with a statistical confirmatory data analysis model for the declining rate of cases, expressed as a proportion of the Indian population. Method: The ARIMA model was used to predict short-term case counts based on a moving average of past confirmed cases. The restriction of the COVID-19 pandemic was analyzed with predicted cases for May 2020; the prediction at 95 percent confidence is more than 2.5 lakh cases. Results: The confirmatory data analysis model estimates that the paucity of cases will take between six and eighteen months. The confirmatory model considers recovery rate and social, economic, and government policy. Complete recovery from COVID-19 cases takes, on average, more than the next ten months. Conclusion: The disease impacts also depend on administrative support and on local people's support for self-quarantine and other measures. More than 17% of India's Gross Domestic Product (GDP) is based on agricultural production; owing to the prolonged effect of the disease and the extended lockdown period, it will be severely affected. However, all economic activities will resume at full intensity after the complete paucity of COVID-19 spread.
Keywords: SARS-CoV-2; Lockdown; GDP; Nobel-Corona; Confirmatory data model


2014 ◽  
Vol 53 (1) ◽  
pp. 1-14 ◽  
Author(s):  
Saima Naeem ◽  
Asad Zaman

Razzaque (2009) studied the role of gender in the ultimatum game by running experiments on students in various cities in Pakistan. He used standard confirmatory data analysis techniques, which work well in familiar contexts, where relevant hypotheses of interest are known in advance. Our goal in this paper is to demonstrate that exploratory data analysis is much better suited to the study of experimental data where the goal is to discover patterns of interest. Our exploratory re-analysis of the original data set of Razzaque (2009) leads to several new insights. While we re-confirm the main finding of Razzaque regarding the greater generosity of males, additional analysis suggests that this is driven by student subculture in Pakistan, and would not generalise to the population at large. In addition, we find a strong effect of urbanisation. Our exploratory data analysis also offers considerable additional insights into the learning process that takes place over the course of a sequence of games. JEL Classification: C78, C81, C91, J16 Keywords: Ultimatum Game, Gender Differences, Exploratory Data Analysis


2021 ◽  
Author(s):  
Attila Krajcsi

Current data analysis practice and statistics education are suboptimal in many senses, which contributes to the replication crisis. To address some of these issues, a new type of statistical and data analysis software solution is proposed here in which most of the analysis steps are compiled automatically based on the task and the measurement level of the variables and in which the result output is carefully optimized for informativeness and understandability. Automatic data analysis and optimized output can contribute to making data analysis more coherent across studies, remedying some aspects of the issues leading to the replication crisis, making analysis more efficient for users and helping to promote and teach better data analysis practices. Such a solution can be useful for researchers to conduct faster and more precise analyses, for students to see illustrations and demonstrations of data analysis solutions, and for methodologists to formulate straightforward analysis procedures that can promote more precise and more coherent data analysis practice in the literature. A possible implementation of such software, CogStat, is presented here, in which additional design considerations make the results more understandable and more precise, the analyses more accessible, and the analysis more efficient.

