Assessment of the need for separate test set and number of medical images necessary for deep learning: a sub-sampling study
Abstract
Deep learning algorithms have tremendous potential utility in the classification of biomedical images. For example, images acquired with retinal optical coherence tomography (OCT) can be used to accurately classify patients with age-related macular degeneration (AMD) and distinguish them from healthy control patients. However, previous research has suggested that large amounts of data are required to train deep learning algorithms, because of the large number of parameters that need to be fit. Here, we show that a moderate amount of data (data from approximately 1,800 patients) may be enough to reach close-to-maximal performance in the classification of AMD patients from OCT images. These results suggest that deep learning algorithms can be trained on moderate amounts of data, provided that images are relatively homogeneous and the effective number of parameters is sufficiently small. Furthermore, we demonstrate that in this application, cross-validation with a separate test set that is not used in any part of the training does not differ substantially from cross-validation with a validation dataset used to determine the optimal stopping point for training.
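The sub-sampling design summarized above can be illustrated with a minimal sketch. The abstract does not specify how the data were partitioned, so everything here is an assumption made for illustration: the function names `split_patients` and `subsample`, the 10% validation and test fractions, and the choice to split at the patient level (so that no patient contributes images to more than one partition). The validation group stands in for the data used to choose the stopping point, and the test group for data never touched during training.

```python
import random


def split_patients(patient_ids, val_frac=0.1, test_frac=0.1, seed=0):
    """Partition unique patient IDs into train, validation, and test groups.

    Splitting by patient (an assumption, not stated in the abstract) keeps
    all images from one patient inside a single partition.
    """
    ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_val = int(len(ids) * val_frac)
    n_test = int(len(ids) * test_frac)
    val = ids[:n_val]                    # used only to pick the stopping point
    test = ids[n_val:n_val + n_test]     # never used during training
    train = ids[n_val + n_test:]
    return train, val, test


def subsample(train_ids, fraction, seed=0):
    """Draw a random subset of the training patients without replacement."""
    rng = random.Random(seed)
    k = max(1, int(len(train_ids) * fraction))
    return rng.sample(train_ids, k)


# Hypothetical cohort of 2,000 patient IDs; the fractions are illustrative.
train, val, test = split_patients(range(2000))
for fraction in (0.25, 0.5, 1.0):
    subset = subsample(train, fraction)
    print(f"fraction={fraction}: {len(subset)} training patients")
```

In a study of this kind, a model would be trained on each subset and its accuracy on `val` compared with its accuracy on `test`; the training step itself is omitted here because the abstract gives no architectural details.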