Supervised machine learning methods in psychology: A practical introduction with annotated R code
Machine learning methods for pattern detection and prediction are increasingly prevalent in psychological research. We provide a comprehensive overview of machine learning, its applications, and how to implement models for research. We review fundamental concepts of machine learning, such as prediction accuracy and out-of-sample evaluation, and summarize four standard prediction algorithms: linear regression, ridge regression, decision trees, and random forests (k-nearest neighbors, Naïve Bayes classifiers, and support vector machines are covered in the supplementary material). Together, these algorithms form a set of powerful models that are used regularly in machine learning projects. We demonstrate each method with examples and annotated R code, and discuss best practices for determining sample sizes, comparing model performance, tuning prediction models, preregistering prediction models, and reporting results. Finally, we discuss the value of machine learning methods in maintaining psychology’s status as a predictive science.