Effects of Spatial Transformer Location on Segmentation Performance of a Dense Transformer Network
Semantic segmentation is the task of labelling every pixel in an image with its class label, and it remains an important unsolved problem. While significant work has gone into using deep learning to solve this problem, almost all existing research uses methods that do not modify the spatial context considered for the pixel being labelled. Spatial information is an important cue in tasks such as segmentation, and reusing the same spatial span for every pixel and every label may not be the best approach. Spatial Transformer Networks have shown promising results in improving the classification performance of existing networks by allowing them to actively manipulate their input data. Our work shows the benefit of incorporating Spatial Transformer Networks and their corresponding decoders into networks tailored to semantic segmentation. Our experiments show an improvement in performance over baseline networks when using networks augmented with Spatial Transformers.