Normalization of HE-Stained Histological Images using Cycle Consistent Generative Adversarial Networks
Abstract

Background: Histological images show huge variance (e.g. in illumination, color, and staining quality) due to differences in image acquisition, tissue processing, staining, etc. This variance can impede many image analyses, such as staining intensity evaluation or classification. Methods that reduce this variance are gathered under the term image normalization.

Methods: We present the application of CycleGAN, a cycle-consistent Generative Adversarial Network, to color normalization in hematoxylin-eosin (HE) stained histological images, using typical clinical data including variability of internal staining. The network consists of a generator network G_B that learns to map an image X from a source domain A to a target domain B, i.e. G_B : X_A → X_B. In addition, a discriminator network D_B is trained to distinguish whether an image from domain B is an original or a generated one. The same process is applied to another generator-discriminator pair (G_A, D_A) for the inverse mapping G_A : X_B → X_A. Cycle consistency ensures that a generated image is close to the original image when mapped backwards, i.e. G_A(G_B(X_A)) ≈ X_A and vice versa. We validate the CycleGAN approach on a breast cancer challenge dataset and a follicular thyroid carcinoma dataset for various stain variations. We evaluate the quality of the generated images against the original images using similarity measures.

Results: We present qualitative results of the images generated by our network compared to the original color distributions. Our evaluation shows that by mapping images from a source domain to a target domain, the similarity to original images from the target domain improves up to 96%. We also achieve high cycle consistency for the inverse mapping, obtaining similarity indices greater than 0.9.

Conclusions: CycleGANs have proven to normalize HE-stained images efficiently. The approach makes it possible to compensate for deviations resulting from image acquisition (e.g. different scanning devices) as well as from tissue staining (e.g. different staining protocols), and thus overcomes the staining variations in images from various institutions. The code is publicly available at https://github.com/m4ln/stainTransfer_CycleGAN_pytorch. The dataset supporting the solutions is available at https://heidata.uni-heidelberg.de/privateurl.xhtml?token=12493b50-1538-4bdf-aca5-03352a1399a8.
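
To make the cycle-consistency idea from the Methods concrete, the following is a minimal sketch, not the networks used in this work. The functions `g_b` and `g_a` are hypothetical stand-ins for the forward (A to B) and inverse (B to A) generators, modeled here as simple per-channel color shifts, and the mean absolute (L1) difference is an assumed error measure, since the abstract does not state which loss is used. The sketch only shows what it means for an image to return close to itself after being mapped to the other domain and back.

```python
# Toy illustration of cycle consistency. The "generators" below are
# hypothetical per-channel affine colour maps, not trained networks.

SCALE = (0.9, 0.8, 1.0)     # assumed per-channel gain (R, G, B)
OFFSET = (0.05, 0.10, 0.0)  # assumed per-channel shift (R, G, B)


def g_b(pixel):
    """Hypothetical forward mapping, domain A -> domain B."""
    return tuple(s * c + o for c, s, o in zip(pixel, SCALE, OFFSET))


def g_a(pixel):
    """Hypothetical inverse mapping, domain B -> domain A."""
    return tuple((c - o) / s for c, s, o in zip(pixel, SCALE, OFFSET))


def l1_cycle_error(image, forward, backward):
    """Mean absolute difference between image and backward(forward(image))."""
    total = 0.0
    count = 0
    for pixel in image:
        restored = backward(forward(pixel))
        for original, cycled in zip(pixel, restored):
            total += abs(original - cycled)
            count += 1
    return total / count


# Two RGB pixels with intensities in [0, 1], standing in for an image X_A.
image_a = [(0.8, 0.4, 0.7), (0.6, 0.2, 0.5)]

# A perfectly inverse pair gives a cycle error of (numerically) zero.
error = l1_cycle_error(image_a, g_b, g_a)

# Replacing the inverse generator with the identity breaks the cycle.
broken_error = l1_cycle_error(image_a, g_b, lambda p: p)
```

In actual training the two generators are learned rather than fixed, and an error of this kind would be minimized jointly with the adversarial objectives of the two discriminators.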