Kenji Fukumizu, Francis Bach, Michael Jordan
We propose a novel method of dimensionality reduction for supervised learning. Given a regression or classiﬁcation problem in which we wish to predict a variable Y from an explanatory vector X, we treat the prob- lem of dimensionality reduction as that of ﬁnding a low-dimensional “ef- fective subspace” of X which retains the statistical relationship between X and Y . We show that this problem can be formulated in terms of conditional independence. To turn this formulation into an optimization problem, we characterize the notion of conditional independence using covariance operators on reproducing kernel Hilbert spaces; this allows us to derive a contrast function for estimation of the effective subspace. Un- like many conventional methods, the proposed method requires neither assumptions on the marginal distribution of X, nor a parametric model of the conditional distribution of Y .