Andrew Howard, Tony Jebara
A discriminative method is proposed for learning monotonic transforma- tions of the training data while jointly estimating a large-margin classi(cid:12)er. In many domains such as document classi(cid:12)cation, image histogram classi(cid:12)- cation and gene microarray experiments, (cid:12)xed monotonic transformations can be useful as a preprocessing step. However, most classi(cid:12)ers only explore these transformations through manual trial and error or via prior domain knowledge. The proposed method learns monotonic transformations auto- matically while training a large-margin classi(cid:12)er without any prior knowl- edge of the domain. A monotonic piecewise linear function is learned which transforms data for subsequent processing by a linear hyperplane classi(cid:12)er. Two algorithmic implementations of the method are formalized. The (cid:12)rst solves a convergent alternating sequence of quadratic and linear programs until it obtains a locally optimal solution. An improved algorithm is then derived using a convex semide(cid:12)nite relaxation that overcomes initializa- tion issues in the greedy optimization problem. The e(cid:11)ectiveness of these learned transformations on synthetic problems, text data and image data is demonstrated.