{"title": "From Regularization Operators to Support Vector Kernels", "book": "Advances in Neural Information Processing Systems", "page_first": 343, "page_last": 349, "abstract": "", "full_text": "From Regularization Operators \n\nto Support Vector Kernels \n\nAlexander J. Smola \n\nGMDFIRST \n\nRudower Chaussee 5 \n12489 Berlin, Germany \n\nsmola@first.gmd.de \n\nBernhard Scholkopf \n\nMax-Planck-Institut fur biologische Kybernetik \n\nSpemannstra.Be 38 \n\n72076 Ttibingen, Germany \n\nbs-@mpik-tueb.mpg.de \n\nAbstract \n\nWe derive the correspondence between regularization operators used in \nRegularization Networks and Hilbert Schmidt Kernels appearing in Sup(cid:173)\nport Vector Machines. More specifica1ly, we prove that the Green's Func(cid:173)\ntions associated with regularization operators are suitable Support Vector \nKernels with equivalent regularization properties. As a by-product we \nshow that a large number of Radial Basis Functions namely condition(cid:173)\nally positive definite functions may be used as Support Vector kernels. \n\n1 INTRODUCTION \n\nSupport Vector (SV) Machines for pattern recognition, regression estimation and operator \ninversion exploit the idea of transforming into a high dimensional feature space where \nthey perform a linear algorithm. Instead of evaluating this map explicitly, one uses Hilbert \nSchmidt Kernels k(x, y) which correspond to dot products of the mapped data in high \ndimensional space, i.e. \n\nk(x, y) = (*(x) \u00b7 **(y)) \n\n(I) \nwith ** : .!Rn --* :F denoting the map into feature space. Mostly, this map and many of \nits properties are unknown. Even worse, so far no general rule was available. which kernel \nshould be used, or why mapping into a very high dimensional space often provides good \nresults, seemingly defying the curse of dimensionality. 
We will show that each kernel k(x, y) corresponds to a regularization operator P, the link being that k is the Green's function of P*P (with P* denoting the adjoint operator of P). For the sake of simplicity we shall only discuss the case of regression; our considerations, however, also hold true for the other cases mentioned above. \n\nWe start by briefly reviewing the concept of SV Machines (section 2) and of Regularization Networks (section 3). Section 4 contains the main result stating the equivalence of both methods. In section 5, we show some applications of this finding to known SV machines. Section 6 introduces a new class of possible SV kernels, and, finally, section 7 concludes the paper with a discussion. \n\n2 SUPPORT VECTOR MACHINES \n\nThe SV algorithm for regression estimation, as described in [Vapnik, 1995] and [Vapnik et al., 1997], exploits the idea of computing a linear function in high dimensional feature space F (furnished with a dot product) and thereby computing a nonlinear function in the space of the input data R^n. The functions take the form f(x) = (w \u00b7 \u03a6(x)) + b. \n\nA more subtle reasoning probably will be necessary for understanding the capacity bounds [Vapnik, 1995] from a Regularization Network point of view. Future work will include an analysis of the family of polynomial kernels, which perform very well in Pattern Classification [Sch\u00f6lkopf et al., 1995]. \n\nAcknowledgements \n\nAS is supported by a grant of the DFG (# Ja 379/51). BS is supported by the Studienstiftung des deutschen Volkes. The authors thank Chris Burges, Federico Girosi, Leo van Hemmen, Klaus-Robert M\u00fcller and Vladimir Vapnik for helpful discussions and comments. \n\nReferences \n\nM. A. Aizerman, E. M. Braverman, and L. I. Rozonoer. Theoretical foundations of the potential function method in pattern recognition learning. 
Automation and Remote Control, 25:821-837, 1964. \n\nN. Dyn. Interpolation and approximation by radial and related functions. In C. K. Chui, L. L. Schumaker, and D. J. Ward, editors, Approximation Theory VI, pages 211-234. Academic Press, New York, 1991. \n\nF. Girosi. An equivalence between sparse approximation and support vector machines. A.I. Memo No. 1606, MIT, 1997. \n\nF. Girosi, M. Jones, and T. Poggio. Priors, stabilizers and basis functions: From regularization to radial, tensor and additive splines. A.I. Memo No. 1430, MIT, 1993. \n\nW. R. Madych and S. A. Nelson. Multivariate interpolation and conditionally positive definite functions. II. Mathematics of Computation, 54(189):211-230, 1990. \n\nC. A. Micchelli. Interpolation of scattered data: distance matrices and conditionally positive definite functions. Constructive Approximation, 2:11-22, 1986. \n\nI. J. Schoenberg. Metric spaces and completely monotone functions. Ann. of Math., 39:811-841, 1938. \n\nB. Sch\u00f6lkopf, C. Burges, and V. Vapnik. Extracting support data for a given task. In U. M. Fayyad and R. Uthurusamy, editors, Proc. KDD 1, Menlo Park, 1995. AAAI Press. \n\nB. Sch\u00f6lkopf, K. Sung, C. Burges, F. Girosi, P. Niyogi, T. Poggio, and V. Vapnik. Comparing support vector machines with gaussian kernels to radial basis function classifiers. IEEE Trans. Sign. Processing, 45:2758-2765, 1997. \n\nA. J. Smola and B. Sch\u00f6lkopf. On a kernel-based method for pattern recognition, regression, approximation and operator inversion. Algorithmica, 1998. See also GMD Technical Report 1997-1064, URL: http://svm.first.gmd.de/papers.html. \n\nV. Vapnik. The Nature of Statistical Learning Theory. Springer Verlag, New York, 1995. \n\nV. Vapnik, S. Golowich, and A. Smola. Support vector method for function approximation, regression estimation, and signal processing. In NIPS 9, San Mateo, CA, 1997. \n\nA. Yuille and N. Grzywacz. 
The motion coherence theory. In Proceedings of the International Conference on Computer Vision, pages 344-354, Washington, D.C., 1988. IEEE Computer Society Press.", "award": [], "sourceid": 1372, "authors": [{"given_name": "Alex", "family_name": "Smola", "institution": null}, {"given_name": "Bernhard", "family_name": "Sch\u00f6lkopf", "institution": null}]}