Part of Advances in Neural Information Processing Systems 22 (NIPS 2009)
Ian Goodfellow, Honglak Lee, Quoc Le, Andrew Saxe, Andrew Ng
For many computer vision applications, the ideal image feature would be invariant to multiple confounding image properties, such as illumination and viewing angle. Recently, deep architectures trained in an unsupervised manner have been proposed as an automatic method for extracting useful features. However, these learning algorithms can be difficult to evaluate other than by using their learned features in a classifier. In this paper, we propose a number of empirical tests that directly measure the degree to which these learned features are invariant to different image transformations. We find that deep autoencoders become invariant to increasingly complex image transformations with depth. This finding further justifies the use of “deep” vs. “shallower” representations. Our performance metrics agree with existing measures of invariance. They can also be used to evaluate future work in unsupervised deep learning, and thus help the development of future algorithms.