Predicting the Replicability of Social Science Lab Experiments

2019 ◽  
Author(s):  
Adam Altmejd ◽  
Anna Dreber ◽  
Eskil Forsell ◽  
Teck Hua Ho ◽  
Juergen Huber ◽  
...  

We measure how accurately the replication of experimental results can be predicted by a black-box statistical model. With data from four large-scale replication projects in experimental psychology and economics, and techniques from machine learning, we train a predictive model and study which variables drive predictable replication. The model predicts binary replication with a cross-validated accuracy rate of 70% (AUC of 0.79) and relative effect size with a Spearman ρ of 0.38. The accuracy level is similar to the market-aggregated beliefs of peer scientists (Camerer et al., 2016; Dreber et al., 2015). The predictive power is validated in a pre-registered out-of-sample test on the outcomes of Camerer et al. (2018b), where 71% (AUC of 0.73) of replications are predicted correctly and effect-size correlations amount to ρ = 0.25. Basic features, such as the sample and effect sizes in original papers and whether reported effects are single-variable main effects or two-variable interactions, are predictive of successful replication. The models presented in this paper are simple tools for producing cheap, prognostic replicability metrics. These models could be useful in institutionalizing the evaluation of new findings and in guiding resources to those direct replications that are likely to be most informative.
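The abstract's headline metric, AUC, measures how well a score ranks replicated studies above failed ones. A minimal sketch of that evaluation, using the kinds of features the abstract names (original sample size, main effect vs. interaction) as a toy score; the studies, the scoring rule, and all numbers below are invented for illustration, not the authors' model or data:

```python
# Toy scoring rule: larger original samples and main effects (rather than
# two-variable interactions) score higher, mimicking the features the
# abstract reports as predictive. Data are synthetic.

def auc(scores, labels):
    """Probability that a random replicated study outscores a random failed one."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    pairs = [(p, n) for p in pos for n in neg]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p, n in pairs)
    return wins / len(pairs)

studies = [  # (original sample size, is_interaction, replicated?)
    (250, 0, 1), (40, 1, 0), (120, 0, 1), (30, 1, 0),
    (80, 0, 0), (300, 0, 1), (60, 1, 0), (150, 1, 1),
]
scores = [n * (0.5 if interaction else 1.0) for n, interaction, _ in studies]
labels = [y for _, _, y in studies]
print(f"AUC = {auc(scores, labels):.2f}")
```

An AUC of 0.5 means the score is uninformative; the paper's reported 0.79 means the model ranks a randomly chosen successful replication above a failed one about four times out of five.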

2018 ◽  
Author(s):  
Eskil Forsell ◽  
Domenico Viganola ◽  
Thomas Pfeiffer ◽  
Johan Almenberg ◽  
Brad Wilson ◽  
...  

Understanding and improving reproducibility is crucial for scientific progress. Prediction markets and related methods of eliciting peer beliefs are promising tools for predicting replication outcomes. We invited researchers in the field of psychology to judge the replicability of 24 studies replicated in the large-scale Many Labs 2 project. We elicited peer beliefs in prediction markets and surveys about two replication-success metrics: the probability that the replication yields a statistically significant effect in the original direction (p < 0.001), and the relative effect size of the replication. The prediction markets correctly predicted 75% of the replication outcomes and were highly correlated with them. Survey beliefs were also significantly correlated with replication outcomes, but had higher prediction errors. The prediction markets for relative effect sizes attracted little trading and thus did not work well. The survey beliefs about relative effect sizes performed better and were significantly correlated with the observed relative effect sizes. These results suggest that replication outcomes can be predicted and that eliciting peer beliefs can increase our knowledge about scientific reproducibility and the dynamics of hypothesis testing.
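A final market price can be read as a probability that the study replicates, so scoring a market reduces to comparing prices with binary outcomes. A minimal sketch of the two standard summaries, hit rate and Brier score, on made-up prices and outcomes (not the Many Labs 2 data):

```python
# Final market prices interpreted as P(replication succeeds), scored
# against binary replication outcomes. All values are illustrative.

prices  = [0.8, 0.3, 0.6, 0.2, 0.9, 0.55]
outcome = [1,   0,   1,   0,   1,   0]

# Hit rate: count a hit when price > 0.5 matches the outcome.
hits = sum((p > 0.5) == bool(y) for p, y in zip(prices, outcome))
accuracy = hits / len(prices)

# Brier score: mean squared error of the probabilities (lower is better).
brier = sum((p - y) ** 2 for p, y in zip(prices, outcome)) / len(prices)
print(f"accuracy = {accuracy:.2f}, Brier = {brier:.3f}")
```

The Brier score rewards calibration as well as discrimination, which is why it is a common companion to the raw percentage-correct figure quoted in the abstract.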


2020 ◽  
Vol 34 (04) ◽  
pp. 6340-6347
Author(s):  
Zifeng Wang ◽  
Hong Zhu ◽  
Zhenhua Dong ◽  
Xiuqiang He ◽  
Shao-Lun Huang

In the era of Big Data, training complex models on large-scale data sets is challenging, which makes it appealing to reduce data volume, and thereby save computation, by subsampling. Most previous subsampling works are weighted methods designed to bring the performance of a subset model close to that of the full-set model, so weighted methods have no chance of acquiring a subset model that is better than the full-set model. This raises the question: how can we achieve a better model with less data? In this work, we propose a novel Unweighted Influence Data Subsampling (UIDS) method and prove that the subset model acquired through our method can outperform the full-set model. Moreover, we show that overconfidence in the particular test set used for sampling is common in influence-based subsampling methods, which can eventually cause the subset model to fail in out-of-sample tests. To mitigate this, we develop a probabilistic sampling scheme that controls the worst-case risk over all distributions close to the empirical distribution. The experimental results demonstrate our methods' superiority over existing subsampling methods in diverse tasks, such as text classification, image classification, and click-through prediction.
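The contrast the abstract draws, a hard influence-based selection rule versus a probabilistic one that hedges against over-fitting a single validation set, can be sketched generically. This is not the authors' UIDS algorithm; the influence scores, the sigmoid keep-probability, and all parameters below are invented for illustration:

```python
import math
import random

# Synthetic per-example "influence" scores on validation loss
# (negative = keeping the example helps validation performance).
random.seed(0)
influence = [random.gauss(0.0, 1.0) for _ in range(100)]

# Deterministic rule: keep exactly the examples with non-positive influence.
keep_det = [i for i, s in enumerate(influence) if s <= 0.0]

def keep_prob(score, temperature=1.0):
    """Smooth keep-probability: sigmoid of the negated score, so harmful
    examples can still survive with small probability."""
    return 1.0 / (1.0 + math.exp(score / temperature))

# Probabilistic rule: softens the dependence on any single validation set.
keep_sto = [i for i, s in enumerate(influence) if random.random() < keep_prob(s)]
print(len(keep_det), len(keep_sto))
```

Lowering `temperature` makes the probabilistic rule approach the deterministic one; raising it spreads the kept subset more evenly across the data.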


Author(s):  
Yanzhe Sun ◽  
Kai Sun ◽  
Tianyou Wang ◽  
Yufeng Li ◽  
Zhen Lu

Emission and fuel consumption in swirl-supported diesel engines depend strongly on the in-cylinder turbulent flows, but the physical effects of the squish flow on the tangential flow and on turbulence production are still far from well understood. To identify these effects, particle image velocimetry (PIV) experiments are performed in a motored optical diesel engine equipped with different bowls. By comparing and associating the large-scale flow and the turbulent kinetic energy (k), the main effects of the squish flow are clarified. The effect of the squish flow on turbulence production in the r−θ plane lies in the axial asymmetry of the annular distribution of the radial flow and in the deviation between the ensemble-averaged swirl field and a rigid-body swirl field. A larger squish flow can drive the swirl center toward the cylinder axis and reduce the deformation of the swirl center, which decreases the axial asymmetry of the annular distribution of the radial flow and in turn results in lower turbulence production from the shear stress. Moreover, a larger squish flow increases the radial fluctuation velocity, which contributes to k on a par with the tangential component. The understanding of the squish flow and its correlations with the tangential flow and turbulence obtained in this study is beneficial for designing and optimizing the in-cylinder turbulent flow.
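The quantity k named in the abstract is computed from ensemble statistics of the PIV velocity fields: at each point, k = ½⟨u′² + v′²⟩ over the in-plane (radial and tangential) fluctuation components. A minimal single-point sketch with invented cycle-to-cycle velocity samples, not the study's measurements:

```python
# One (u, v) in-plane velocity sample per engine cycle at a single PIV
# interrogation point, in m/s. Values are synthetic.
fields = [(2.1, 5.0), (1.8, 5.4), (2.4, 4.7), (1.9, 5.2), (2.3, 4.9)]

n = len(fields)
u_mean = sum(u for u, _ in fields) / n   # ensemble-averaged radial component
v_mean = sum(v for _, v in fields) / n   # ensemble-averaged tangential component

# Turbulent kinetic energy from the two resolved fluctuation components:
# k = 0.5 * (<u'^2> + <v'^2>)
k = 0.5 * sum((u - u_mean) ** 2 + (v - v_mean) ** 2 for u, v in fields) / n
print(f"k = {k:.4f} m^2/s^2")
```

With only two in-plane components resolved, this is the standard planar-PIV estimate; the out-of-plane contribution is either neglected or modeled by an isotropy assumption.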


2013 ◽  
Vol 03 (03n04) ◽  
pp. 1350016 ◽  
Author(s):  
Jing-Zhi Huang ◽  
Zhijian Huang

Empirical evidence on the out-of-sample performance of asset-pricing anomalies is mixed so far, and arguably it is often subject to data-snooping bias. This paper proposes a method that can significantly reduce this bias. Specifically, we consider a long-only strategy that involves only published anomalies and non-forward-looking filters and that each year recursively picks the best past performer among such anomalies over a given training period. We find that this strategy can outperform the equity market even after transaction costs. Overall, our results suggest that published anomalies persist even after controlling for data-snooping bias.
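The recursive rule is simple to state in code: each year, rank the candidate anomalies by their return over the preceding training window and hold the winner for the next year. A minimal sketch with two hypothetical anomalies and invented annual returns (not the paper's universe or data):

```python
# anomaly -> annual returns for years 0..5 (synthetic illustration)
returns = {
    "momentum": [0.12, 0.08, -0.02, 0.15, 0.10, 0.05],
    "value":    [0.05, 0.11,  0.09, 0.01, 0.07, 0.12],
}
window = 2  # training period in years

# Each year, pick the best past performer over the window, then record the
# return that choice actually earns in the following (out-of-sample) year.
realized = []
for year in range(window, 6):
    best = max(returns, key=lambda a: sum(returns[a][year - window:year]))
    realized.append(returns[best][year])
print(realized)
```

Because selection uses only past returns, the realized series is out-of-sample by construction, which is the mechanism the paper uses to sidestep data snooping.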


2014 ◽  
Vol 09 (02) ◽  
pp. 1440001 ◽  
Author(s):  
Marc S. Paolella

Simple, fast methods for modeling the portfolio distribution corresponding to a non-elliptical, leptokurtic, asymmetric, and conditionally heteroskedastic set of asset returns are entertained. Portfolio optimization via simulation is demonstrated, and its benefits are discussed. An augmented mixture-of-normals model is shown to be superior to both the standard (no-short-selling) Markowitz portfolio and the equally weighted portfolio in terms of out-of-sample returns and Sharpe-ratio performance.
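A mixture of normals captures the leptokurtic, asymmetric return shape the abstract describes by blending a calm regime with a rarer crisis regime. A minimal simulation sketch with invented parameters (a two-component mixture assuming daily returns, which is not the paper's augmented model), ending in the Sharpe ratio used for comparison:

```python
import math
import random

random.seed(1)

def draw():
    """One daily return from a two-component mixture of normals."""
    if random.random() < 0.9:
        return random.gauss(0.0005, 0.01)   # calm regime
    return random.gauss(-0.003, 0.03)       # crisis regime: fat left tail

rets = [draw() for _ in range(10_000)]
mean = sum(rets) / len(rets)
sd = math.sqrt(sum((r - mean) ** 2 for r in rets) / (len(rets) - 1))
sharpe = mean / sd * math.sqrt(252)         # annualized, zero risk-free rate
print(f"annualized Sharpe ≈ {sharpe:.2f}")
```

In a simulation-based optimizer, the same draw would be repeated for each candidate weight vector, with the weights chosen to maximize the resulting out-of-sample criterion.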


2014 ◽  
Vol 6 (2) ◽  
pp. 23-36
Author(s):  
Fatma Molu

Complex financial conversion projects with large budgets face many different challenges. For companies that want to survive under tough competition, legacy (old) systems must continue to provide the required service throughout the project life cycle and, in some circumstances, even partly after project completion. In this case, the term coexistence comes into prominence. During this period the testing phase takes on a more critical role, as the complexity and risk of the integrated systems increase. Determining which testing approach to use is essential to make sure that both the transformed and the legacy systems provide service synchronously. In this paper, testing practices applied in long conversion processes are discussed. First, the basic features of critical financial systems are addressed, and the main adoption methods in the literature are summarized. Then a variety of testing methodologies are presented, depending on those adoption methods. These samples are based on real-life experiences of a transformation project. The most extensive example of real-time online financial systems is a core banking system; this paper covers the testing life cycle of a large-scale core banking transformation project at a bank in Turkey.
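The coexistence testing the abstract describes typically boils down to a parallel run: the same business transaction is processed by both the legacy and the transformed system, and the two records are reconciled field by field. A minimal sketch of such a reconciliation check; the record layout, field names, and tolerance handling are hypothetical, not a real core-banking interface:

```python
def reconcile(legacy_rec, new_rec, tolerances=None):
    """Return the list of fields on which the two systems disagree.
    Numeric fields may differ by a per-field tolerance (e.g. rounding)."""
    tolerances = tolerances or {}
    diffs = []
    for field, a in legacy_rec.items():
        b = new_rec.get(field)
        if isinstance(a, (int, float)) and isinstance(b, (int, float)):
            if abs(a - b) > tolerances.get(field, 0):
                diffs.append(field)
        elif a != b:
            diffs.append(field)
    return diffs

legacy = {"balance": 1520.00,  "currency": "TRY", "status": "ACTIVE"}
new    = {"balance": 1520.004, "currency": "TRY", "status": "ACTIVE"}
print(reconcile(legacy, new, tolerances={"balance": 0.01}))
```

Per-field tolerances matter in practice because the two platforms often round interest and fee calculations differently even when both are correct.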


2016 ◽  
Vol 8 (9) ◽  
pp. 226
Author(s):  
Tsung-Hsun Lu ◽  
Jun-De Lee

This paper investigates whether abnormal trading volume provides information about future movements in stock prices. Using data on the Taiwan 50 Index from October 29, 2002 to December 31, 2013, the researchers employ trading volume rather than stock price to test the principles of resistance and support levels used in technical analysis. The empirical results suggest that abnormal trading volume provides profitable information for investors in the Taiwan stock market. An out-of-sample test and a sensitivity analysis are conducted to check the robustness of the results.
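An "abnormal volume" signal is commonly operationalized as a day whose volume exceeds a multiple of its trailing moving average; such days become the candidate support/resistance events. The abstract does not give the authors' exact definition, so the window and multiplier below are illustrative assumptions on synthetic data:

```python
# Synthetic daily volumes; day 5 is a deliberate volume spike.
volume = [100, 95, 110, 105, 98, 300, 102, 99, 104, 101]
window, mult = 5, 2.0  # hypothetical trailing window and spike multiplier

signals = []
for t in range(window, len(volume)):
    ma = sum(volume[t - window:t]) / window   # trailing moving average
    if volume[t] > mult * ma:                 # abnormal-volume day
        signals.append(t)
print(signals)
```

A strategy test would then examine returns following each flagged day, with the training/evaluation split providing the out-of-sample check the abstract mentions.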


2021 ◽  
Author(s):  
Karim Ibrahim ◽  
Stephanie Noble ◽  
George He ◽  
Cheryl Lacadie ◽  
Michael Crowley ◽  
...  

Disruptions in the frontoparietal networks supporting emotion regulation have long been implicated in maladaptive childhood aggression. However, the association between connectivity among large-scale functional networks in the human connectome and aggressive behavior has not been tested. Using a data-driven, machine-learning approach, we show that the functional organization of the connectome during emotion processing predicts the severity of aggression in children (n=129). Connectivity predictive of aggression was identified within and between large-scale networks implicated in cognitive control (frontoparietal), social functioning (default mode), and emotion processing (subcortical). Out-of-sample replication and generalization of the findings were conducted in an independent sample of children from the Adolescent Brain Cognitive Development study (n=1,791; n=1,701). These results define novel connectivity-based networks of child aggression that can serve as biomarkers to inform targeted treatments for aggression.
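Data-driven approaches of this kind typically start by selecting connectome edges whose strength correlates with the behavioral score in training subjects, then carrying those edges forward as the predictive network. A generic sketch of that selection step, not the authors' pipeline; subjects, edges, scores, and the 0.4 threshold are all synthetic assumptions:

```python
import random

random.seed(42)
n_sub, n_edges = 60, 50
aggression = [random.gauss(0.0, 1.0) for _ in range(n_sub)]
# Synthetic connectomes: edge 0 carries signal tied to the score; the
# remaining edges are pure noise.
conn = [[aggression[s] * (1.5 if e == 0 else 0.0) + random.gauss(0.0, 1.0)
         for e in range(n_edges)] for s in range(n_sub)]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)

# Select edges on the training half only; held-out subjects would then be
# summarized by their connectivity over the selected edges.
train = range(n_sub // 2)
selected = [e for e in range(n_edges)
            if abs(pearson([conn[s][e] for s in train],
                           [aggression[s] for s in train])) > 0.4]
print("signal edge recovered:", 0 in selected)
```

Restricting selection to the training split is what makes the subsequent evaluation on held-out children genuinely out-of-sample, mirroring the replication logic in the abstract.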


2020 ◽  
Vol 19 (2) ◽  
pp. 42-50
Author(s):  
A. A. Korneenkov ◽  
I. V. Fanta

The article discusses the concept of effect-size measures for clinical interventions, quantitative methods for calculating and interpreting them, and their importance for medical decision-making. Algorithms for calculating effect measures are described for different clinical-trial endpoints, represented by quantitative (numerical) or binary variables, and for different types of effect-size indicator (absolute effect size, relative effect size, or a clinical-effectiveness indicator). It is shown that, in the context of assessing therapeutic effects and clinical efficacy in general, measuring effect size provides a valuable tool for data analysis. Evaluating and interpreting a therapeutic modality only on the basis of the significance level p obtained from hypothesis testing, without specifying the size of the effect, is not sufficient to understand how important the effect is in clinical practice. For an adequate quantitative assessment of the effect and its interpretation, the effect-size framework offers a convenient and widely used system of methods. To illustrate the calculation and interpretation of effect sizes, published data from clinical studies of the effectiveness of local anesthesia in reducing pain after septoplasty were used. It is shown how, using the presented technique, one can efficiently calculate and easily interpret the effect measures of applying local anesthesia. All calculations were performed in the statistical program R.
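The effect measures the article enumerates are short formulas: for a binary endpoint, absolute risk reduction (ARR), relative risk (RR), and number needed to treat (NNT = 1/ARR); for a numeric endpoint, a standardized difference such as Cohen's d. A sketch in Python rather than the article's R, with invented counts that are not the septoplasty data:

```python
import math

# Binary endpoint: patients with pain at 24 h, treated vs. control (synthetic).
events_t, n_t = 12, 60
events_c, n_c = 24, 60
risk_t, risk_c = events_t / n_t, events_c / n_c
arr = risk_c - risk_t        # absolute risk reduction
rr = risk_t / risk_c         # relative risk
nnt = 1 / arr                # patients treated per adverse outcome avoided

# Numeric endpoint: pain score, Cohen's d with a pooled standard deviation.
mean_t, sd_t = 3.1, 1.2
mean_c, sd_c = 4.0, 1.4
sd_pool = math.sqrt((sd_t ** 2 + sd_c ** 2) / 2)
d = (mean_c - mean_t) / sd_pool
print(f"ARR={arr:.2f}, RR={rr:.2f}, NNT={nnt:.0f}, d={d:.2f}")
```

This illustrates the article's point: the same trial can be summarized on an absolute scale (ARR, NNT), a relative scale (RR), or a standardized scale (d), and each answers a different clinical question that a p-value alone cannot.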

