{"title": "Analytic Solutions to the Formation of Feature-Analysing Cells of a Three-Layer Feedforward Visual Information Processing Neural Net", "book": "Advances in Neural Information Processing Systems", "page_first": 160, "page_last": 165, "abstract": null, "full_text": "160 \n\nTang \n\nAnalytic Solutions to the Formation of Feature-Analysing Cells of a Three-Layer Feedforward Visual Information Processing Neural Net \n\nD.S. Tang \nMicroelectronics and Computer Technology Corporation \n3500 West Balcones Center Drive \nAustin, TX 78759-6509 \nemail: tang@mcc.com \n\nABSTRACT \n\nAnalytic solutions to the information-theoretic evolution equation of the connection strength of a three-layer feedforward neural net for visual information processing are presented. The results are: (1) the receptive fields of the feature-analysing cells correspond to the eigenvector of the maximum eigenvalue of the Fredholm integral equation of the first kind derived from the evolution equation of the connection strength; (2) a symmetry-breaking mechanism (parity violation) has been identified as responsible for the changes in the morphology of the receptive field; (3) the conditions for the formation of different morphologies are explicitly identified. \n\n1 INTRODUCTION \n\nThe application of Shannon's information theory (Shannon and Weaver, 1949) to the study of neural nets has proved very instructive in explaining the formation of different receptive fields in early visual information processing, as evidenced by the work of Linsker (1986, 1988). It has been demonstrated that the connection strengths which maximize the information rate from one layer of neurons to the next exhibit center-surround, all-excitatory/all-inhibitory and orientation-selective properties. 
This could lead to a better understanding of the mechanisms by which the cells self-organize to achieve adaptive responses to a changing environment. However, results from these studies are mainly numerical in nature and therefore do not provide deeper insight into how and under what conditions the morphologies of the feature-analyzing cells are formed. In this paper we present accurate analytic solutions to the problems posed by Linsker. Namely, we solve analytically the evolution equation of the connection strength, obtain closed-form expressions for the receptive fields and derive the formation conditions for different classes of morphologies. These results are crucial to the understanding of the architecture of a neural net as an information processing system. Below, we briefly summarize the analytic techniques involved and the main results obtained. \n\n2 THREE-LAYER FEEDFORWARD NEURAL NET \n\nThe neural net configuration (Fig. 1) is identical to that reported in references 2 and 3, in which a feedforward three-layer neural net is considered. The layers are labelled consecutively as layer-A, layer-B and layer-C. \n\n[Diagram: layers A, B and C stacked from top to bottom] \n\nFigure 1: The neural net configuration \n\nThe input-output relation for the signals propagating from one layer to the next is assumed to be linear, \n\n$$M_j = \\sum_{i=1}^{N_j} C_{ji} (L_i + \\nu). \\qquad (1)$$ \n\n$\\nu$ is assumed to be additive Gaussian white noise with constant standard deviation $\\alpha$ and zero mean. $L_i$ and $M_j$ are the $i$th stochastic input signal and the $j$th stochastic output signal respectively. $C_{ji}$ is the connection strength which defines the morphology of the receptive field and is to be determined by maximizing the information rate. 
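As a rough illustration of the linear relation in equation (1), the following Python sketch draws random inputs, adds zero-mean Gaussian noise to each one, and forms the weighted sum for a single output cell. The function name layer_output, the sizes, the unit-variance uncorrelated inputs and the choice of an independent noise sample per input are all assumptions made for this illustration; they are not taken from the paper.

```python
import random

def layer_output(C, L, noise_std, rng):
    # Equation (1) for one output cell: M = sum_i C[i] * (L[i] + noise_i).
    # Assumption: an independent zero-mean Gaussian noise sample per input.
    return sum(c * (x + rng.gauss(0.0, noise_std)) for c, x in zip(C, L))

rng = random.Random(0)
n_inputs, n_samples, noise_std = 20, 5000, 0.5
C = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]  # hypothetical strengths

outputs = []
for _ in range(n_samples):
    L = [rng.gauss(0.0, 1.0) for _ in range(n_inputs)]  # hypothetical inputs
    outputs.append(layer_output(C, L, noise_std, rng))

mean = sum(outputs) / n_samples
sample_var = sum((m - mean) ** 2 for m in outputs) / n_samples
# For uncorrelated unit-variance inputs the output variance should be
# close to (1 + noise_std**2) * sum(c**2).
predicted_var = (1.0 + noise_std ** 2) * sum(c * c for c in C)
print(sample_var, predicted_var)
```

Under these assumptions the signal and noise contributions to the output variance separate cleanly, so the two printed numbers should roughly agree, and the agreement should tighten as n_samples grows.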
The spatial summation in equation (1) runs over all $N_j$ inputs, whose locations are Gaussian distributed within the same layer, with the center of the distribution lying directly above the location of the $M_j$ output signal. If the statistical behavior of the input signal is assumed to be Gaussian, \n\n(2) \n\nthen the information rate can be derived and is given by \n\n$$R(M) = \\frac{1}{2} \\ln\\left[ 1 + \\frac{\\sum_{i,j} C_i Q_{ij} C_j}{\\alpha^2 \\sum_i C_i^2} \\right], \\qquad (3)$$ \n\nwith $\\alpha$ the standard deviation of the noise in equation (1). The matrix $Q$ is the correlation of the $L$'s, $Q_{ij} = E[(L_i - \\bar{L})(L_j - \\bar{L})]$, with mean $\\bar{L}$. The set of connection strengths which optimize the information rate subject to a normalization condition, $\\sum_i C_i^2 = A$, and to a constraint on their overall mean, $(\\sum_i C_i)^2 = B$, constitute physically plausible receptive fields. The solutions to this problem are given below. \n\n3 FREDHOLM INTEGRAL EQUATION \n\nThe evolution equation for the connection strength $C_n$ which maximizes the information rate subject to the constraints is \n\n$$\\dot{C}_n = \\frac{1}{N} \\sum_{i=1}^{N} (Q_{ni} + k_2) C_i. \\qquad (4)$$ \n\n$k_2$ is the Lagrange multiplier. First, we assume that the statistical ensemble of the visual images has the highest information content under the condition of fixed variance. Then, from the maximum entropy principle, it can be shown that the Gaussian distribution with a correlation $Q_{ij}$ that is a constant multiple of the Kronecker delta function describes the statistics of this ensemble of visual images. It can be shown that the solution to the above equation with $Q_{ni}$ a Kronecker delta function is a constant. Therefore, the connection strengths which define the linear input-output relation from layer A to layer B are either all-excitatory or all-inhibitory. Hence, without loss of generality, we take the layer A to layer B connection strengths to be all-excitatory. Making use of this result, the correlation function of the output signals at layer B (i.e. 
the input signals to layer C) is derived \n\n(5) \n\nwhere $r$ is the distance between the $n$th and the $i$th output signals and $C_Q$ is a constant [its definition is garbled in the source]. To study the connection strengths of the input-output relation from layer B to layer C, it is more convenient to work with continuous spatial variables. Then the solutions to the discrete evolution equation which maximize the information rate are solutions to the following Fredholm integral equation of the first kind with the maximum eigenvalue $\\lambda$, \n\n$$C(\\mathbf{r}) = \\frac{1}{\\lambda} \\int_{-\\infty}^{\\infty} K(\\mathbf{R}|\\mathbf{r}) C(\\mathbf{R}) \\, d\\mathbf{R} \\qquad (6)$$ \n\nwhere the kernel is $K(\\mathbf{R}|\\mathbf{r}) = (Q(\\mathbf{R}-\\mathbf{r}) + k_2) \\rho(\\mathbf{R})$ and $\\rho(r)$ is the Gaussian input population distribution density with prefactor $C_\\rho$ [the exponent of $\\rho(r)$ and the value of $C_\\rho$ are garbled in the source]. In continuous variables, the connection strength is denoted by $C(\\mathbf{r})$. A complete set of solutions to this Fredholm integral equation can be analytically derived. We are interested only in the solutions with the maximum eigenvalues. Below we present the results. \n\n4 ANALYTIC SOLUTIONS \n\nThe solution with the maximum eigenvalue has few nodes. It can be constructed as a linear superposition of an infinite number of Gaussian functions with different variances and means, which are treated as independent variables to be solved for with the Fredholm integral equation. Full details are contained in reference 3. \n\n(a) Symmetric solution $C(-\\mathbf{r}) = C(\\mathbf{r})$: \nFor $k_2 \\neq 0$, the connection strength is \n\n$$C(r) = b [ \\cdots ] \\qquad (7)$$ \n\n[the bracket in equation (7) is garbled in the source; the legible fragments show a constant term plus two Gaussian terms in $r^2$ with amplitudes involving $G$ and $H$. The definitions of $G$ and $H$ are also garbled; the legible fragments are the values 0.73205 and 0.66667 and the relation $a = C_Q C_\\rho / (N\\lambda)$.] \nThe eigenvalue is given by \n\n$$\\lambda = \\frac{k_2 C_\\rho \\pi}{N} [ \\cdots ] \\qquad (8)$$ \n\n[the bracketed factor in equation (8) is garbled in the source.] \nFor $k_2 = 0$, the connection strength is \n\n(9) \n\nand the eigenvalue is \n\n$$\\lambda = \\frac{C_Q C_\\rho \\pi}{N [ \\cdots ]} \\qquad (10)$$ \n\n[the bracketed denominator in equation (10) is garbled in the source.] \nThese can be shown to be identical to the case of $k_2 \\neq 0$ when the limit $k_2 \\to 0$ is appropriately taken. \n\n(b) Antisymmetric solution $C(-\\mathbf{r}) = -C(\\mathbf{r})$: \nThe connection strength is \n\n$$C(\\mathbf{r}) = (f x + g y) \\exp( \\cdots ) \\qquad (11)$$ \n\n[the exponent in equation (11) is garbled in the source.] \nThe eigenvalue is \n\n(12) \n\n[equation (12) is garbled in the source; the legible prefactor is $\\pi C_Q C_\\rho / N$.] \n\nIn the above equations, $b$, $f$ and $g$ are normalization constants. \n\nBelow are the conditions under which the different morphologies (Fig. 2) are formed. \n(i) $k_2 > 0$: the symmetric solution has the largest eigenvalue. The receptive field is either all-excitatory or all-inhibitory, Fig. 2a. \n(ii) $-0.891 C_Q < k_2 < 0$: the symmetric solution has the largest eigenvalue. The receptive field has a Mexican-hat appearance, Fig. 2b. \n(iii) $k_2 < -0.891 C_Q$: the antisymmetric solution has the largest eigenvalue. The receptive field has two regions divided by a straight line of arbitrary direction (degeneracy). The two regions are mirror images of each other: one is totally inhibitory and the other is totally excitatory, Fig. 2c. 
\n\n[Plot against k/Cq, with one region labelled Symmetric solution and one labelled Antisymmetric solution, and inset receptive-field profiles] \n\nFigure 2: Relations between the receptive field and the maximum eigenvalues. Insets are examples of the connection strength $C(r)$ versus the spatial dimension in the x-direction. \n\nNote that the information rate as given by Eq. (3) is invariant under the spatial reflection $\\mathbf{r} \\to -\\mathbf{r}$. The solutions to the optimization problem violate parity conservation as the overall mean of the connection strength (i.e. equivalently $k_2$) changes to different values. \n\nResults from numerical simulations agree very well with the analytic results. Numerical simulations are performed with 80 to 600 synapses. The agreement is good even for the case in which the number of synapses is 200. \n\nIn summary, we have shown precisely how the Mexican-hat morphology emerges, as identified by (ii) above. Furthermore, a symmetry-breaking (parity-violation) mechanism has been identified to explain the change of the morphology from spatially symmetric to antisymmetric appearance as $k_2$ passes through $-0.891 C_Q$. It is very likely that similar symmetry-breaking mechanisms are present in neural nets with lateral connections. \n\nReferences \n\n1. C.E. Shannon and W. Weaver, The Mathematical Theory of Communication (Univ. of Illinois Press, Urbana, 1949). \n2. R. Linsker, Proc. Natl. Acad. Sci. USA 83, 7508 (1986); Computer 21 (3), 105 (1988). \n3. D.S. Tang, Phys. Rev. A 40, 6626 (1989). \n", "award": [], "sourceid": 288, "authors": [{"given_name": "Dun-Sung", "family_name": "Tang", "institution": null}]}