Feeding the machine: challenges to reproducible predictive modeling in network neuroscience
Machine learning offers a promising set of prediction tools that have recently seen growing application in network neuroscience. In this NETN Perspectives article, we examine the current application of predictive models, e.g., classifiers trained using machine learning (ML), within the clinical network neurosciences. Our review covers 118 published studies that used ML and functional MRI (fMRI) to infer various dimensions of the human functional connectome. We identify several important methodological challenges in this literature. For example, more than half of the studies focused almost exclusively on maximizing the accuracy of classifying brain functional connectomes into one of several predetermined categories (e.g., disease versus healthy), with significantly less emphasis on the reproducibility and generalizability of the findings. There was also a concerning lack of transparency across many of the key steps in training and evaluating predictive models using machine learning. Our summary of this literature underscores the importance of external validation (i.e., lockbox or test-set data) and highlights several methodological pitfalls that can be addressed by the imaging community. We offer recommendations for the principled application of machine learning in the clinical neurosciences to advance imaging biomarkers, understand causative determinants of health risks, and track the trajectory of heterogeneous patient outcomes.