Generalized Lasso based Approximation of Sparse Coding for Visual Recognition

Part of Advances in Neural Information Processing Systems 24 (NIPS 2011)


Nobuyuki Morioka, Shin'ichi Satoh


Sparse coding, a method of explaining sensory data with as few dictionary bases as possible, has attracted much attention in computer vision. For visual object category recognition, L1-regularized sparse coding is combined with a spatial pyramid representation to obtain state-of-the-art performance. However, because its optimization is iterative, applying sparse coding to every local feature descriptor extracted from an image database can become a major bottleneck. To overcome this computational challenge, this paper presents "Generalized Lasso based Approximation of Sparse coding" (GLAS). By representing the distribution of sparse coefficients with the slice transform, we fit a piecewise linear mapping function with the generalized lasso. We also propose an efficient post-refinement procedure that performs mutual inhibition between bases, which is essential in an overcomplete setting. The experiments show that GLAS obtains performance comparable to L1-regularized sparse coding while achieving a significant speedup, demonstrating its effectiveness for large-scale visual recognition problems.
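To make the computational bottleneck concrete, the following is a minimal sketch of L1-regularized sparse coding for a single descriptor, solved with ISTA (iterative soft-thresholding). ISTA is chosen here only for illustration; the abstract does not say which solver is used as the baseline, and this is not the GLAS method itself. The names `soft_threshold`, `ista_sparse_code`, `lam`, and `n_iter` are hypothetical. The sketch also shows why a piecewise linear mapping is a natural approximation target: for an orthonormal dictionary, the exact solution is the soft-thresholding function, which is itself piecewise linear.

```python
import numpy as np


def soft_threshold(z, t):
    """Element-wise soft-thresholding, a piecewise linear function of z."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def ista_sparse_code(x, D, lam, n_iter=200):
    """Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 over a.

    x: descriptor of shape (d,); D: dictionary of shape (d, k).
    Illustrative solver only (an assumption, not the paper's baseline).
    """
    # Lipschitz constant of the gradient of the quadratic term.
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)                 # gradient of the data term
        a = soft_threshold(a - grad / L, lam / L)  # proximal step for L1
    return a


def objective(x, D, a, lam):
    """Value of the L1-regularized reconstruction objective."""
    return 0.5 * np.sum((x - D @ a) ** 2) + lam * np.sum(np.abs(a))


# Orthonormal dictionary: the solution equals a single soft-thresholding.
x_small = np.array([1.0, -0.2, 0.5])
a_small = ista_sparse_code(x_small, np.eye(3), lam=0.3)

# Overcomplete dictionary: every descriptor needs the full iterative loop.
rng = np.random.default_rng(0)
D_over = rng.standard_normal((8, 32))
D_over /= np.linalg.norm(D_over, axis=0)  # unit-norm bases
x_over = rng.standard_normal(8)
a_over = ista_sparse_code(x_over, D_over, lam=0.1)
```

Every descriptor pays for `n_iter` matrix-vector products in this loop, which is the per-feature cost that an approximation with a fixed mapping function aims to avoid.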