Multi-Stage Dantzig Selector

Part of Advances in Neural Information Processing Systems 23 (NIPS 2010)

Bibtex Metadata Paper Supplemental


Ji Liu, Peter Wonka, Jieping Ye


We consider the following sparse signal recovery (or feature selection) problem: given a design matrix $X\in \mathbb{R}^{n\times m}$ $(m\gg n)$ and a noisy observation vector $y\in \mathbb{R}^{n}$ satisfying $y=X\beta^*+\epsilon$ where $\epsilon$ is the noise vector following a Gaussian distribution $N(0,\sigma^2I)$, how to recover the signal (or parameter vector) $\beta^*$ when the signal is sparse? The Dantzig selector has been proposed for sparse signal recovery with strong theoretical guarantees. In this paper, we propose a multi-stage Dantzig selector method, which iteratively refines the target signal $\beta^*$. We show that if $X$ obeys a certain condition, then with a large probability the difference between the solution $\hat\beta$ estimated by the proposed method and the true solution $\beta^*$ measured in terms of the $l_p$ norm ($p\geq 1$) is bounded as \begin{equation*} \|\hat\beta-\beta^*\|_p\leq \left(C(s-N)^{1/p}\sqrt{\log m}+\Delta\right)\sigma, \end{equation*} $C$ is a constant, $s$ is the number of nonzero entries in $\beta^*$, $\Delta$ is independent of $m$ and is much smaller than the first term, and $N$ is the number of entries of $\beta^*$ larger than a certain value in the order of $\mathcal{O}(\sigma\sqrt{\log m})$. The proposed method improves the estimation bound of the standard Dantzig selector approximately from $Cs^{1/p}\sqrt{\log m}\sigma$ to $C(s-N)^{1/p}\sqrt{\log m}\sigma$ where the value $N$ depends on the number of large entries in $\beta^*$. When $N=s$, the proposed algorithm achieves the oracle solution with a high probability. In addition, with a large probability, the proposed method can select the same number of correct features under a milder condition than the Dantzig selector.