Biomedical image classification made easier thanks to transfer and semi-supervised learning

2021, Vol 198, pp. 105782
Author(s): A. Inés, C. Domínguez, J. Heras, E. Mata, V. Pascual
2020
Author(s): Sayedali Shetab Boushehri, Ahmad Bin Qasim, Dominik Waibel, Fabian Schmich, Carsten Marr

Abstract: Deep learning image classification algorithms typically require large annotated datasets. In contrast to real-world images, where labels are typically cheap and easy to get, biomedical applications require experts’ time for annotation, which is often expensive and scarce. Therefore, identifying methods that maximize performance with a minimal amount of annotation is crucial. A number of active learning algorithms address this problem by iteratively identifying the most informative images for annotation. However, they are mostly benchmarked on natural image datasets, and it is not clear how they perform on biomedical image data with strong class imbalance, little color variance, and high similarity between classes. Moreover, active learning neglects the typically abundant unlabeled data available.

In this paper, we thus explore strategies combining active learning with pre-training and semi-supervised learning to increase performance on biomedical image classification tasks. We first benchmarked three active learning algorithms, three pre-training methods, and two training strategies on a dataset containing almost 20,000 white blood cell images, split into ten classes. Both pre-training using self-supervised learning and pre-trained ImageNet weights boost the performance of active learning algorithms. A further improvement was achieved using semi-supervised learning. An extensive grid search through the different active learning algorithms, pre-training methods, and training strategies on three biomedical image datasets showed that a specific combination of these methods should be used. This recommended strategy improved the results over conventional annotation-efficient classification strategies by 3% to 14% macro recall in every case. We propose this strategy for other biomedical image classification tasks and expect it to boost performance whenever scarce annotation is a problem.
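To make two ideas from the abstract concrete, here is a minimal Python sketch of (a) one selection step of pool-based active learning and (b) the macro recall metric. The abstract does not name the three active learning algorithms it benchmarks, so the entropy-based acquisition below is an assumption chosen purely for illustration, and the function names `entropy_acquisition` and `macro_recall` are hypothetical, not taken from the paper.

```python
import numpy as np


def entropy_acquisition(probs, k):
    """Pick the k unlabeled samples whose predicted class distribution
    has the highest entropy (illustrative choice, not the paper's method).

    probs: array of shape (n_samples, n_classes) with softmax outputs.
    Returns indices into the pool, most uncertain first.
    """
    probs = np.asarray(probs, dtype=float)
    eps = 1e-12  # avoids log(0) for fully confident predictions
    entropy = -np.sum(probs * np.log(probs + eps), axis=1)
    # argsort is ascending, so reverse it and keep the first k
    return np.argsort(entropy)[::-1][:k]


def macro_recall(y_true, y_pred):
    """Unweighted mean of per-class recall, so rare classes count
    as much as frequent ones."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(y_true)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in classes]
    return float(np.mean(recalls))
```

Macro recall is a natural fit for the class imbalance the abstract mentions: a classifier that predicts only the majority class on labels `[0, 0, 0, 0, 1]` reaches 80% accuracy, but `macro_recall` returns 0.5 because the minority class is missed entirely.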

