scholarly journals Statistical Hypothesis Testing versus Machine Learning Binary Classification: Distinctions and Guidelines

Patterns ◽  
2020 ◽  
Vol 1 (7) ◽  
pp. 100115
Author(s):  
Jingyi Jessica Li ◽  
Xin Tong
Author(s):  
Sach Mukherjee

A number of important problems in data mining can be usefully addressed within the framework of statistical hypothesis testing. However, while the conventional treatment of statistical significance deals with error probabilities at the level of a single variable, practical data mining tasks tend to involve thousands, if not millions, of variables. This Chapter looks at some of the issues that arise in the application of hypothesis tests to multi-variable data mining problems, and describes two computationally efficient procedures by which these issues can be addressed.


Sign in / Sign up

Export Citation Format

Share Document