Benchmarking Machine Learning Methods for Performance Modeling of Scientific Applications

Author(s): Preeti Malakar, Prasanna Balaprakash, Venkatram Vishwanath, Vitali Morozov, Kalyan Kumaran
2021
Author(s): Marco Del Giudice

In this paper, I highlight a problem that has become ubiquitous in scientific applications of machine learning methods, and can lead to seriously distorted inferences about the phenomena under study. I call it the prediction-explanation fallacy. The fallacy occurs when researchers use prediction-optimized models for explanatory purposes, without considering the tradeoffs between explanation and prediction. This is a problem for at least two reasons. First, prediction-optimized models are often deliberately biased and unrealistic in order to prevent overfitting, and hence fail to accurately explain the phenomenon of interest. In other cases, they have an exceedingly complex structure that is hard or impossible to interpret, which greatly limits their explanatory value. Second, different predictive models trained on the same or similar data can be biased in different ways, so that multiple models may predict equally well but suggest conflicting explanations of the underlying phenomenon. In this paper, I introduce the tradeoffs between prediction and explanation in a non-technical fashion, present some illustrative examples from neuroscience, and end by discussing some mitigating factors and methods that can be used to limit or circumvent the problem.
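As a purely illustrative sketch (not taken from either paper listed above), the short Python example below shows one way the second point can play out in practice: two regularized linear models fit to the same data can reach nearly identical predictive accuracy while assigning credit to different predictors. The data-generating setup, the choice of Lasso versus Ridge, and the hyperparameter values are all assumptions made for illustration only.

# Hypothetical illustration: equally good predictions, conflicting "explanations".
# Two highly correlated predictors; only x1 truly drives the outcome.
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)          # near-duplicate of x1
X = np.column_stack([x1, x2])
y = 2.0 * x1 + rng.normal(scale=0.5, size=n)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

lasso = Lasso(alpha=0.1).fit(X_tr, y_tr)     # sparsity-inducing penalty
ridge = Ridge(alpha=10.0).fit(X_tr, y_tr)    # weight-spreading penalty

print("Lasso R^2:", round(lasso.score(X_te, y_te), 3), "coefficients:", lasso.coef_.round(2))
print("Ridge R^2:", round(ridge.score(X_te, y_te), 3), "coefficients:", ridge.coef_.round(2))

On a typical run, both models predict the held-out data about equally well, but the Lasso concentrates its weight on one of the correlated predictors while the Ridge spreads it across both. A researcher reading the coefficients as an explanation would draw a different conclusion from each model, which is the kind of conflict the abstract describes.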

