
Is Input Sparsity Time Possible for Kernel Low-Rank Approximation?

Part of Advances in Neural Information Processing Systems 30 (NIPS 2017)


Authors

Cameron Musco, David Woodruff

Abstract

Low-rank approximation is a common tool used to accelerate kernel methods: the $n \times n$ kernel matrix $K$ is approximated via a rank-$k$ matrix $\tilde{K}$, which can be stored in much less space and processed more quickly. In this work we study the limits of computationally efficient low-rank kernel approximation. We show that for a broad class of kernels, including the popular Gaussian and polynomial kernels, computing a relative-error rank-$k$ approximation to $K$ is at least as difficult as multiplying the input data matrix $A \in \mathbb{R}^{n \times d}$ by an arbitrary matrix $C \in \mathbb{R}^{d \times k}$. Barring a breakthrough in fast matrix multiplication, when $k$ is not too large, this requires $\Omega(\mathrm{nnz}(A) \cdot k)$ time, where $\mathrm{nnz}(A)$ is the number of non-zeros in $A$. This lower bound matches, in many parameter regimes, recent work on subquadratic-time algorithms for low-rank approximation of general kernels [MM16, MW17], demonstrating that these algorithms are unlikely to be significantly improved, in particular to $O(\mathrm{nnz}(A))$ input sparsity runtimes. At the same time there is hope: we show for the first time that $O(\mathrm{nnz}(A))$-time approximation is possible for general radial basis function kernels (e.g., the Gaussian kernel) for the closely related problem of low-rank approximation of the kernelized dataset.
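To make the objects in the abstract concrete, here is a minimal NumPy sketch of the problem setup only, not the paper's algorithms or lower-bound construction: it forms the Gaussian kernel matrix $K$ from a data matrix $A$ and computes a rank-$k$ approximation $\tilde{K}$ by truncated eigendecomposition. The helper names (`gaussian_kernel`, `best_rank_k`) and the bandwidth parameter `sigma` are illustrative choices, not from the paper.

```python
import numpy as np

def gaussian_kernel(A, sigma=1.0):
    """Form the n x n Gaussian kernel matrix K from an n x d data matrix A.
    K[i, j] = exp(-||a_i - a_j||^2 / (2 sigma^2))."""
    sq_norms = np.sum(A ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2 * (A @ A.T)
    return np.exp(-np.maximum(sq_dists, 0) / (2 * sigma ** 2))

def best_rank_k(K, k):
    """Best rank-k approximation of the symmetric PSD kernel matrix K
    (in Frobenius norm) via truncated eigendecomposition.
    Note: this brute-force route already costs O(n^2 d) to form K plus
    O(n^3) to factor it; the paper asks how much faster a relative-error
    approximation can possibly be computed."""
    vals, vecs = np.linalg.eigh(K)          # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]        # indices of the top-k eigenvalues
    return vecs[:, idx] @ np.diag(vals[idx]) @ vecs[:, idx].T

# Illustrative usage: n = 500 points in d = 20 dimensions, rank k = 10.
A = np.random.randn(500, 20)
K = gaussian_kernel(A)
K_tilde = best_rank_k(K, k=10)
print(np.linalg.norm(K - K_tilde, "fro") / np.linalg.norm(K, "fro"))
```

The point of the sketch is only to fix notation: $\tilde{K}$ takes $O(nk)$ space in factored form versus $O(n^2)$ for $K$, and the paper's question is whether a relative-error $\tilde{K}$ can be found in time near $\mathrm{nnz}(A)$ rather than via the expensive route above.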