Deep learning vs. atlas-based models for fast auto-segmentation of the masticatory muscles on head and neck CT images
Abstract Background: Impaired function of masticatory muscles will lead to trismus. Routine delineation of these muscles during planning may improve dose tracking and facilitate dose reduction resulting in decreased radiation-related trismus. This study aimed to compare a deep learning model with a commercial atlas-based model for fast auto-segmentation of the masticatory muscles on head and neck computed tomography (CT) images. Material and methods: Paired masseter (M), temporalis (T), medial and lateral pterygoid (MP, LP) muscles were manually segmented on 56 CT images. CT images were randomly divided into training (n=27) and validation (n=29) cohorts. Two methods were used for automatic delineation of masticatory muscles (MMs): Deep learning auto-segmentation (DLAS) and atlas-based auto-segmentation (ABAS). The automatic algorithms were evaluated using Dice similarity coefficient (DSC), recall, precision, Hausdorff distance (HD), HD95, and mean surface distance (MSD). A consolidated score was calculated by normalizing the metrics against interobserver variability and averaging over all patients. Differences in dose (∆Dose) to MMs for DLAS and ABAS segmentations were assessed. A paired t-test was used to compare the geometric and dosimetric difference between DLAS and ABAS methods.Results: DLAS outperformed ABAS in delineating all MMs (p < 0.05). The DLAS mean DSC for M, T, MP, and LP ranged from 0.83±0.03 to 0.89±0.02, the ABAS mean DSC ranged from 0.79±0.05 to 0.85±0.04. The mean value for recall, HD, HD95, MSD also improved with DLAS for auto-segmentation. Interobserver variation revealed the highest variability in DSC and MSD for both T and MP, and the highest scores were achieved for T by both automatic algorithms. With few exceptions, the mean ∆D98%, ∆D95%, ∆D50%, and ∆D2% for all structures were below 10% for DLAS and ABAS and had no detectable statistical difference (P >0.05). DLAS based contours had dose endpoints more closely matched with that of the manually segmented when compared with ABAS. Conclusions: DLAS auto-segmentation of masticatory muscles for the head and neck radiotherapy had improved segmentation accuracy compared with ABAS with no qualitative difference in dosimetric endpoints compared to manually segmented contours.