\hat{y}(x_i) = \sum_{j=1}^{N} K(x_i, x_j)\, t_j    (2)

and the response at point x_i is a weighted average of the sampled target values across the entire dataset. Furthermore, the response can be viewed as a least squares estimate for \hat{y}(x_i) because we can write it as the solution to the minimization problem:

\hat{y}(x_i) = \arg\min_{y} \sum_{j=1}^{N} K(x_i, x_j)\, (t_j - y)^2    (3)

We can combine the kernel functions to define the smoother matrix S, given by S_{ij} = K(x_i, x_j), so that \hat{y} = S t.

\Phi^{+} is the pseudo-inverse of the transformed data matrix \Phi. The network output can then be expressed as:

\hat{y} = \Phi w = \Phi \Phi^{+} t    (8)

so that the smoother matrix of the network is

S = \Phi \Phi^{+}    (9)
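The two smoothers described above can be sketched numerically. The snippet below is a minimal illustration, not the paper's implementation: it builds a row-normalised Gaussian kernel smoother (so each response is a weighted average of the targets, as in equation (2)) and the pseudo-inverse smoother matrix S = Φ Φ⁺ of equations (8)-(9). All function names, the Gaussian kernel choice, and the kernel width are illustrative assumptions.

```python
import numpy as np

def kernel_smoother_matrix(X, width=1.0):
    """Row-normalised Gaussian kernel weights S[i, j] = K(x_i, x_j),
    so that (S @ t)[i] is a weighted average of the targets t_j."""
    d2 = (X[:, None] - X[None, :]) ** 2          # pairwise squared distances
    K = np.exp(-d2 / (2.0 * width**2))
    return K / K.sum(axis=1, keepdims=True)      # each row sums to 1

def rbf_smoother_matrix(X, centres, width=1.0):
    """Smoother matrix S = Phi @ pinv(Phi) for a Gaussian RBF design
    matrix Phi, matching the pseudo-inverse form of eqs. (8)-(9)."""
    Phi = np.exp(-((X[:, None] - centres[None, :]) ** 2) / (2.0 * width**2))
    return Phi @ np.linalg.pinv(Phi)

rng = np.random.default_rng(0)
X = np.linspace(0.0, 1.0, 20)
t = np.sin(2.0 * np.pi * X) + 0.1 * rng.standard_normal(20)

S = kernel_smoother_matrix(X, width=0.1)
y_hat = S @ t            # eq. (2): weighted average of the sampled targets

S_net = rbf_smoother_matrix(X, centres=X, width=0.1)
y_net = S_net @ t        # eq. (8): network output Phi Phi^+ t
```

With one basis function centred on every data point, Φ is square and (generically) invertible, so Φ Φ⁺ ≈ I and the network interpolates the targets; the kernel smoother, by contrast, averages neighbouring targets and so smooths the data.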