Chuanyi Ji, Demetri Psaltis
A general relationship is developed between the VC-dimension and the statistical lower epsilon-capacity, which shows that the VC-dimension can be lower bounded (in order) by the statistical lower epsilon-capacity of a network trained with random samples. This relationship explains quantitatively how generalization takes place after memorization, and relates the concept of generalization (consistency) to the capacity of the optimal classifier over a class of classifiers with the same structure and to the capacity of the Bayesian classifier. Furthermore, it provides a general methodology for evaluating a lower bound on the VC-dimension of feedforward multilayer neural networks. This methodology is applied to two types of networks that are important for hardware implementations: two-layer (N - 2L - 1) networks with binary weights, integer thresholds for the hidden units, and a zero threshold for the output unit; and a single neuron, i.e. (N - 1) networks, with binary weights and a zero threshold. Specifically, we obtain $O(W / \ln L) \le d_2 \le O(W)$ and $d_1 \sim O(N)$. Here $W$ is the total number of weights of the (N - 2L - 1) networks, and $d_1$ and $d_2$ denote the VC-dimensions of the (N - 1) and (N - 2L - 1) networks, respectively.
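
To make the (N - 2L - 1) architecture concrete, the following is a minimal sketch of such a network and of its weight count W. It is an illustration under stated assumptions, not the authors' construction: binary weights and inputs are taken to be ±1, a weighted sum of exactly zero is mapped to +1, and W counts only the connection weights (not the hidden thresholds). The function names `sign`, `forward`, and `weight_count` are hypothetical.

```python
import numpy as np

def sign(x):
    # Hard threshold to {-1, +1}; mapping 0 to +1 is an assumption.
    return np.where(x >= 0, 1, -1)

def forward(x, w_hidden, thresholds, w_out):
    # Hidden layer: 2L units with binary weights and integer thresholds.
    hidden = sign(w_hidden @ x - thresholds)
    # Output unit: binary weights and a zero threshold.
    return int(sign(w_out @ hidden))

def weight_count(n, l):
    # Assumed convention: N * 2L input-to-hidden weights plus 2L
    # hidden-to-output weights; thresholds are not counted.
    return 2 * l * n + 2 * l

rng = np.random.default_rng(0)
N, L = 8, 3
w_hidden = rng.choice([-1, 1], size=(2 * L, N))       # binary weights
thresholds = rng.integers(-N, N + 1, size=2 * L)      # integer thresholds
w_out = rng.choice([-1, 1], size=2 * L)               # binary weights
x = rng.choice([-1, 1], size=N)                       # one random input

y = forward(x, w_hidden, thresholds, w_out)
W = weight_count(N, L)
print("output:", y, "total weights W:", W)
```

Under this counting convention W grows as 2L(N + 1), which is the quantity that the upper bound O(W) on d2 is expressed in.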