The purpose of experiments: Ecological validity versus comparing hypotheses

1996 ◽  
Vol 19 (1) ◽  
pp. 20-20 ◽  
Author(s):  
Robyn M. Dawes

As illustrated by research Koehler himself cites (Dawes et al. 1993), the purpose of experiments is to choose between contrasting explanations of past observations – rather than to seek statistical generalizations about the prevalence of effects. True external validity results not from sampling various problems that are representative of “real world” decision making, but from reproducing an effect in the laboratory with minimal contamination (including from real world factors).

2021 ◽  
pp. 62-82
Author(s):  
James Wilson

This chapter argues that the scale of the challenge posed by external validity requires a similarly sizeable response. Not only should practitioners approach evidence collection and interventions in policy differently, but philosophers should also change the way they conceive of ethics. The default should no longer be to start from simplistic causal models or thought experiments, while being dimly aware that these approaches will exclude some features that would be relevant for real-world decision-making. Rather, both practitioners and philosophers should start from the premise that social processes are complex systems. Moreover, complex systems are in important aspects performative: for example, what counts as a breach of trust, or a violation of privacy, is not something that can be discovered once and for all, but is partly constituted by social norms and individual expectations, which will themselves change in response to government action.


2014 ◽  
Vol 25 (4) ◽  
pp. 233-238 ◽  
Author(s):  
Martin Peper ◽  
Simone N. Loeffler

Current ambulatory technologies are highly relevant for neuropsychological assessment and treatment as they provide a gateway to real life data. Ambulatory assessment of cognitive complaints, skills and emotional states in natural contexts provides information that has a greater ecological validity than traditional assessment approaches. This issue presents an overview of current technological and methodological innovations, opportunities, problems and limitations of these methods designed for the context-sensitive measurement of cognitive, emotional and behavioral function. The usefulness of selected ambulatory approaches is demonstrated and their relevance for an ecologically valid neuropsychology is highlighted.


2017 ◽  
Author(s):  
David Skylan Chester

The Taylor Aggression Paradigm (TAP) is a frequently-used laboratory measure of aggressive behavior. However, the flexibility inherent in its implementation and analysis can undermine its validity. To test whether the TAP was a valid aggression measure irrespective of this flexibility, I conducted a preregistered study of a 25-trial version of the TAP using a single scoring approach with 160 diverse undergraduate participants. TAP scores showed agreement with other laboratory aggression measures and were magnified by an experimental provocation manipulation. Mixed evidence was found for associations with aggressive dispositions and real-world violence. These results provide preliminary support for this approach to the TAP to measure state-level aggressive behavior. However, more evidence is needed to assess the TAP’s external validity and ability to measure dispositional forms of aggression. Using preregistered designs, researchers should validate specific variants of their behavioral tasks in order to optimize the veridicality and reproducibility of psychological science.


2021 ◽  
Author(s):  
Paula Jimenez-Fonseca ◽  
Alberto Carmona-Bayonas ◽  
Angela Lamarca ◽  
Jorge Barriuso ◽  
Angel Castaño ◽  
...  

Introduction: Somatostatin analogues (SSA) prolong progression-free survival (PFS) in patients with well-differentiated gastroenteropancreatic neuroendocrine tumors (GEP-NETs). However, the eligibility criteria in randomized clinical trials (RCTs) have been restricted, which contrasts with the vast heterogeneity found in NETs. Methods: We identified patients with well-differentiated (Ki67% ≤20%), metastatic GEP-NETs treated in first-line with SSA monotherapy from the Spanish R-GETNE registry. The therapeutic effect was evaluated using a Bayesian Cox model. The objective was to compare survival-based outcomes from real world clinical practice versus RCTs. Results: The dataset contained 535 patients with a median age of 62 years (range: 26-89). The median Ki67% was 4 (range: 0-20). The most common primary tumor sites were: midgut, 46%; pancreas, 34%; unknown primary, 10%; and colorectal, 10%. Half of the patients received octreotide LAR (n=266) and half, lanreotide autogel (n=269). The median PFS was 28.0 months (95% CI, 22.1-32.0) for octreotide vs 30.1 months (95% CI, 23.1-38.0) for lanreotide. The overall hazard ratio for lanreotide vs octreotide was 0.90 (95% credible interval, 0.71-1.12). The probability of effect sizes >30% with lanreotide vs octreotide was 2% and 6% for midgut and foregut NENs, respectively. Conclusion: Our study evaluated the external validity of RCTs examining SSAs in the real world, as well as the main effect-modifying factors (progression status, symptoms, tumor site, specific metastases, and analytical data). Our results indicate that both octreotide LAR and lanreotide autogel had a similar effect on PFS. Consequently, both represent valid alternatives in patients with well-differentiated, metastatic GEP-NENs.


2021 ◽  
Vol 11 (6) ◽  
pp. 2817
Author(s):  
Tae-Gyu Hwang ◽  
Sung Kwon Kim

A recommender system (RS) refers to an agent that recommends items that are suitable for users, and it is implemented through collaborative filtering (CF). CF has a limitation in improving the accuracy of recommendations based on matrix factorization (MF). Therefore, a new method is required for analyzing preference patterns that existing studies could not derive. This study aimed at solving the existing problems through bias analysis. By analyzing users’ and items’ biases of user preferences, the bias-based predictor (BBP) was developed and shown to outperform memory-based CF. In this paper, in order to enhance BBP, multiple bias analysis (MBA) was proposed to efficiently reflect decision-making in the real world. The experimental results using movie data revealed that MBA enhanced BBP accuracy, and that the hybrid models outperformed MF and SVD++. Based on this result, MBA is expected to improve performance when used as a system in related studies and provide useful knowledge in any areas that need features that can represent users.
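To make the idea of user and item biases concrete, here is a minimal sketch of the classic baseline predictor (global mean plus a user offset plus an item offset). This is not the BBP or MBA method from the abstract above, whose details are not given here; the function names and the toy ratings are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def fit_bias_baseline(ratings):
    """Fit a global mean, per-user biases, and per-item biases.

    `ratings` is a list of (user, item, rating) tuples. This is the
    textbook baseline predictor, used here only as an illustration.
    """
    mu = sum(r for _, _, r in ratings) / len(ratings)

    # User bias: average deviation of a user's ratings from the global mean.
    user_residuals = defaultdict(list)
    for user, _, r in ratings:
        user_residuals[user].append(r - mu)
    user_bias = {u: sum(v) / len(v) for u, v in user_residuals.items()}

    # Item bias: average deviation left over after removing the user bias.
    item_residuals = defaultdict(list)
    for user, item, r in ratings:
        item_residuals[item].append(r - mu - user_bias[user])
    item_bias = {i: sum(v) / len(v) for i, v in item_residuals.items()}

    return mu, user_bias, item_bias


def predict(model, user, item):
    """Predict a rating; unseen users or items contribute a bias of zero."""
    mu, user_bias, item_bias = model
    return mu + user_bias.get(user, 0.0) + item_bias.get(item, 0.0)


# Toy data (invented): two users rating two movies.
toy_ratings = [("u1", "a", 5), ("u1", "b", 3), ("u2", "a", 4), ("u2", "b", 2)]
model = fit_bias_baseline(toy_ratings)
known_pair = predict(model, "u1", "a")
cold_start_user = predict(model, "u3", "a")
```

In this toy data, a user who rates generously gets a positive offset and a well-liked item gets a positive offset, so a prediction for a new user falls back to the global mean adjusted by the item's bias alone.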


Author(s):  
Jessica M. Franklin ◽  
Kai‐Li Liaw ◽  
Solomon Iyasu ◽  
Cathy Critchlow ◽  
Nancy Dreyer

2021 ◽  
pp. 1-27 ◽  
Author(s):  
Brandon de la Cuesta ◽  
Naoki Egami ◽  
Kosuke Imai

Conjoint analysis has become popular among social scientists for measuring multidimensional preferences. When analyzing such experiments, researchers often focus on the average marginal component effect (AMCE), which represents the causal effect of a single profile attribute while averaging over the remaining attributes. What has been overlooked, however, is the fact that the AMCE critically relies upon the distribution of the other attributes used for the averaging. Although most experiments employ the uniform distribution, which equally weights each profile, both the actual distribution of profiles in the real world and the distribution of theoretical interest are often far from uniform. This mismatch can severely compromise the external validity of conjoint analysis. We empirically demonstrate that estimates of the AMCE can be substantially different when averaging over the target profile distribution instead of uniform. We propose new experimental designs and estimation methods that incorporate substantive knowledge about the profile distribution. We illustrate our methodology through two empirical applications, one using a real-world distribution and the other based on a counterfactual distribution motivated by a theoretical consideration. The proposed methodology is implemented through an open-source software package.
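The dependence of the AMCE on the averaging distribution can be shown with a small numeric sketch. It assumes a single other attribute with three levels and treats the conditional effects of the attribute of interest as known; all numbers, level names, and the `amce` helper are invented for illustration and are not taken from the paper or its software package.

```python
def amce(conditional_effects, weights):
    """Average the conditional effects over the other attribute's levels.

    `conditional_effects[level]` is the effect of the attribute of interest
    when the other attribute is fixed at `level`; `weights[level]` is the
    probability of that level under the chosen profile distribution.
    """
    total = sum(weights.values())
    return sum(conditional_effects[lvl] * w for lvl, w in weights.items()) / total


# Hypothetical conditional effects at each level of the other attribute.
conditional_effects = {"low": 0.30, "medium": 0.10, "high": -0.10}

# Uniform weighting, as in most conjoint designs.
uniform = {"low": 1 / 3, "medium": 1 / 3, "high": 1 / 3}

# A hypothetical target distribution in which "high" profiles dominate.
target = {"low": 0.1, "medium": 0.2, "high": 0.7}

amce_uniform = amce(conditional_effects, uniform)
amce_target = amce(conditional_effects, target)
```

With these invented numbers the uniform-weighted AMCE is positive while the target-weighted AMCE is slightly negative, which is the kind of mismatch the abstract warns about: the same conditional effects can yield different summaries depending on the profile distribution used for averaging.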


2021 ◽  
Vol 21 (1) ◽  
Author(s):  
Alan Brnabic ◽  
Lisa M. Hess

Background: Machine learning is a broad term encompassing a number of methods that allow the investigator to learn from the data. These methods may permit large real-world databases to be more rapidly translated to applications to inform patient-provider decision making. Methods: This systematic literature review was conducted to identify published observational research that employed machine learning to inform decision making at the patient-provider level. The search strategy was implemented and studies meeting eligibility criteria were evaluated by two independent reviewers. Relevant data related to study design, statistical methods and strengths and limitations were identified; study quality was assessed using a modified version of the Luo checklist. Results: A total of 34 publications from January 2014 to September 2020 were identified and evaluated for this review. There were diverse methods, statistical packages and approaches used across identified studies. The most common methods included decision tree and random forest approaches. Most studies applied internal validation but only two conducted external validation. Most studies utilized one algorithm, and only eight studies applied multiple machine learning algorithms to the data. Seven items on the Luo checklist failed to be met by more than 50% of published studies. Conclusions: A wide variety of approaches, algorithms, statistical software, and validation strategies were employed in the application of machine learning methods to inform patient-provider decision making. There is a need to ensure that multiple machine learning approaches are used, that the model selection strategy is clearly defined, and that both internal and external validation are performed, so that decisions for patient care are made with the highest quality evidence. Future work should routinely employ ensemble methods incorporating multiple machine learning algorithms.


2021 ◽  
Vol 15 (3) ◽  
pp. 1-33
Author(s):  
Wenjun Jiang ◽  
Jing Chen ◽  
Xiaofei Ding ◽  
Jie Wu ◽  
Jiawei He ◽  
...  

In online systems, including e-commerce platforms, many users rely on reviews or comments written by previous consumers when making decisions, although they have limited time to read many reviews. Therefore, a review summary that contains all important features in user-generated reviews is expected. In this article, we study “how to generate a comprehensive review summary from a large number of user-generated reviews.” This can be implemented by text summarization, which has two main types of approach: extractive and abstractive. Both of these approaches can deal with both supervised and unsupervised scenarios, but the former may generate redundant and incoherent summaries, while the latter can avoid redundancy but usually can only deal with short sequences. Moreover, both approaches may neglect the sentiment information. To address the above issues, we propose comprehensive Review Summary Generation frameworks to deal with the supervised and unsupervised scenarios. We design two different preprocess models of re-ranking and selecting to identify the important sentences while keeping users’ sentiment in the original reviews. These sentences can be further used to generate review summaries with text summarization methods. Experimental results in seven real-world datasets (Idebate, Rotten Tomatoes, Amazon, Yelp, and three unlabelled product review datasets in Amazon) demonstrate that our work performs well in review summary generation. Moreover, the re-ranking and selecting models show different characteristics.


Author(s):  
Pedro Serrano-Aguilar ◽  
Iñaki Gutierrez-Ibarluzea ◽  
Pilar Díaz ◽  
Iñaki Imaz-Iglesia ◽  
Jesús González-Enríquez ◽  
...  

The Monitoring Studies (MS) program, the approach developed by RedETS to generate postlaunch real-world evidence (RWE), is intended to complement and enhance the conventional health technology assessment process to support health policy decision making in Spain, besides informing other interested stakeholders, including clinicians and patients. The MS program is focused on specific uncertainties about the real effect, safety, costs, and routine use of new and insufficiently assessed relevant medical devices carefully selected to ensure the value of the additional research needed, by means of structured, controlled, participative, and transparent procedures. However, despite a clear political commitment and economic support from national and regional health authorities, several difficulties were identified during the development and implementation of the first wave of MS, delaying its execution and final reporting. Resolution of these difficulties at the regional and national levels and a greater collaborative impulse in the European Union, given the availability of an appropriate methodological framework already provided by EUnetHTA, might provide a faster and more efficient comparative RWE of improved quality and reliability at the national and international levels.

