Learning to Disentangle the Complex Causes of Data
<p>The ability to extract and model the meaning in data has been key to the success of modern machine learning. Typically, data reflects a combination of multiple sources that are mixed together. For example, photographs of people’s faces reflect the subject of the photograph, the lighting conditions, the camera angle, and the background scene. It is therefore natural to wish to extract these multiple, largely independent, sources, a task known in the literature as disentangling. Disentangling brings additional benefits: the resulting representation is simpler, with fewer free parameters, which mitigates the curse of dimensionality and aids learning. While there has been a great deal of research into finding disentangled representations, it remains an open problem. This thesis considers a number of approaches to a particularly difficult version of this task: disentangling the complex causes of data in an entirely unsupervised setting. That is, given access only to unlabelled, entangled data, we search for algorithms that can identify the generative factors of that data, which we call causes. Further, we assume that causes can themselves be complex and require a high-dimensional representation. We consider three approaches to this challenge: as an inference problem, as an extension of independent component analysis, and as a learning problem. Each method is motivated, described, and tested on a set of datasets built from entangled combinations of images, most commonly MNIST digits. Where the results fall short of disentangling, the reasons for this are dissected and analysed. The last method that we describe, which is based on combinations of autoencoders that learn to predict each other’s output, shows some promise on this extremely challenging problem.</p>