Part of Advances in Neural Information Processing Systems 13 (NIPS 2000)
Eiji Mizutani, James Demmel
This paper describes a method of dogleg trust-region steps, or restricted Levenberg-Marquardt steps, based on a projection process onto the Krylov subspaces for neural-network nonlinear least squares problems. In particular, the linear conjugate gradient (CG) method works as the inner iterative algorithm for solving the linearized Gauss-Newton normal equation, whereas the outer nonlinear algorithm repeatedly takes so-called "Krylov-dogleg" steps, relying only on matrix-vector multiplication without explicitly forming the Jacobian matrix or the Gauss-Newton model Hessian. That is, our iterative dogleg algorithm can reduce both operation counts and memory space by a factor of O(n) (the number of parameters) in comparison with a direct linear-equation solver. This memory-less property is useful for large-scale problems.
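To make the matrix-free inner iteration concrete, the following is a minimal sketch (not the authors' code) in JAX: truncated CG applied to the Gauss-Newton normal equation (J^T J) p = -J^T r, using only Jacobian-vector and vector-Jacobian products, with a Steihaug-style truncation at the trust-region boundary standing in for the paper's exact Krylov-dogleg construction. The residual model, trust-radius handling, and the outer loop are illustrative assumptions; a full implementation would also adapt the trust radius from the ratio of actual to predicted reduction.

```python
import jax
import jax.numpy as jnp

def krylov_dogleg_step(residuals, theta, trust_radius, cg_iters=20):
    """One matrix-free trust-region step (a sketch): truncated CG on the
    Gauss-Newton normal equation (J^T J) p = -J^T r, using only
    Jacobian-vector and vector-Jacobian products, stopping at the
    trust-region boundary."""
    r, vjp_fn = jax.vjp(residuals, theta)      # residual r and a J^T(.) closure
    g = vjp_fn(r)[0]                           # gradient J^T r

    def JtJ(v):                                # (J^T J) v without forming J
        _, Jv = jax.jvp(residuals, (theta,), (v,))
        return vjp_fn(Jv)[0]

    p = jnp.zeros_like(theta)
    res = -g                                   # CG residual for (J^T J) p = -g
    d, res_sq = res, res @ res
    for _ in range(cg_iters):
        Hd = JtJ(d)
        alpha = res_sq / (d @ Hd)
        p_next = p + alpha * d
        if jnp.linalg.norm(p_next) >= trust_radius:
            # Step left the trust region: truncate to the boundary along d.
            a, b, c = d @ d, 2.0 * (p @ d), p @ p - trust_radius ** 2
            tau = (-b + jnp.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)
            return p + tau * d
        p = p_next
        res_new = res - alpha * Hd
        res_new_sq = res_new @ res_new
        if jnp.sqrt(res_new_sq) < 1e-8:
            break
        d = res_new + (res_new_sq / res_sq) * d
        res, res_sq = res_new, res_new_sq
    return p

# Illustrative usage on a small nonlinear least squares fit
# (hypothetical data and model, with a fixed trust radius).
x = jnp.linspace(0.0, 3.0, 50)
y = jnp.exp(-0.8 * x) + 0.2
residuals = lambda theta: jnp.exp(-theta[0] * x) + theta[1] - y
theta = jnp.array([0.3, 0.0])
for _ in range(10):
    theta = theta + krylov_dogleg_step(residuals, theta, trust_radius=0.5)
print(theta)  # should move toward [0.8, 0.2]
```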