{"title": "A Bifurcation Theory Approach to the Programming of Periodic Attractors in Network Models of Olfactory Cortex", "book": "Advances in Neural Information Processing Systems", "page_first": 459, "page_last": 467, "abstract": null, "full_text": "A BIFURCATION THEORY APPROACH TO THE PROGRAMMING OF PERIODIC ATTRACTORS IN NETWORK MODELS OF OLFACTORY CORTEX\n\nBill Baird\nDepartment of Biophysics\nU.C. Berkeley\n\nABSTRACT\n\nA new learning algorithm for the storage of static and periodic attractors in biologically inspired recurrent analog neural networks is introduced. For a network of n nodes, n static or n/2 periodic attractors may be stored. The algorithm allows programming of the network vector field independent of the patterns to be stored. Stability of patterns, basin geometry, and rates of convergence may be controlled. For orthonormal patterns, the learning operation reduces to a kind of periodic outer product rule that allows local, additive, commutative, incremental learning. Standing or traveling wave cycles may be stored to mimic the kind of oscillating spatial patterns that appear in the neural activity of the olfactory bulb and prepyriform cortex during inspiration and suffice, in the bulb, to predict the pattern recognition behavior of rabbits in classical conditioning experiments. These attractors arise, during simulated inspiration, through a multiple Hopf bifurcation, which can act as a critical \"decision point\" for their selection by a very small input pattern.\n\nINTRODUCTION\n\nThis approach allows the construction of biological models and the exploration of engineering or cognitive networks that employ the type of dynamics found in the brain. 
Patterns of 40 to 80 Hz oscillation have been observed in the large-scale activity of the olfactory bulb and cortex (Freeman and Baird 86) and even visual neocortex (Freeman 87, Grey and Singer 88), and found to predict the olfactory and visual pattern recognition responses of a trained animal. Here we use analytic methods of bifurcation theory to design algorithms for determining synaptic weights in recurrent network architectures, like those found in olfactory cortex, for associative memory storage of these kinds of dynamic patterns.\n\nThe \"projection algorithm\" introduced here employs higher order correlations, and is the most analytically transparent of the algorithms to come from the bifurcation theory approach (Baird 88). Alternative numerical algorithms employing unused capacity or hidden units instead of higher order correlations are discussed in (Baird 89). All of these methods provide solutions to the problem of storing exact analog attractors, static or dynamic, in recurrent neural networks, and allow programming of the ambient vector field independent of the patterns to be stored. The stability of cycles or equilibria, the geometry of basins of attraction, the rates of convergence to attractors, and the location in parameter space of primary and secondary bifurcations can all be programmed in a prototype vector field - the normal form.\n\nTo store cycles by the projection algorithm, we start with the amplitude equations of a polar coordinate normal form, with coupling coefficients chosen to give stable fixed points on the axes, and transform to Cartesian coordinates. 
The axes of this system of nonlinear ordinary differential equations are then linearly transformed into desired spatial or spatio-temporal patterns by projecting the system into network coordinates - the standard basis - using the desired vectors as columns of the transformation matrix. This method of network synthesis is roughly the inverse of the usual procedure in bifurcation theory for the analysis of a given physical system.\n\nProper choice of normal form couplings will ensure that the axis attractors are the only attractors in the system - there are no \"spurious attractors\". If symmetric normal form coefficients are chosen, then the normal form becomes a gradient vector field. It is exactly the gradient of an explicit potential function, which is therefore a strict Liapunov function for the system. Identical normal form coefficients make the normal form vector field equivariant under permutation of the axes, which forces identical scale and rotation invariant basins of attraction bounded by hyperplanes. Very complex periodic attractors may be established by a kind of Fourier synthesis as linear combinations of the simple cycles chosen for a subset of the axes, when those are programmed to be unstable, and a single \"mixed mode\" in the interior of that subspace is made stable. Proofs and details on vector field programming appear in (Baird 89).\n\nIn the general case, the network resulting from the projection algorithm has fourth order correlations, but restrictions on the detail of vector field programming and on the types of patterns to be stored result in network architectures requiring only second order correlations. 
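The projection of the linear term can be illustrated with a minimal NumPy sketch (not from the paper; the two patterns, phases, and frequencies below are made-up examples): the desired patterns are placed as columns of a transformation matrix P, and the canonical block matrix J is projected back into network coordinates, making the stored patterns the complex eigenvectors of the network coupling.

```python
import numpy as np

# Sketch of the linear part of the projection operation, T = P J P^(-1):
# the desired patterns become the complex eigenvectors of the network's
# linear coupling, with the programmed eigenvalues 1 +/- i*w_s.
n = 4
w = [1.5, 2.5]                       # assumed frequencies for two cycles

# J: 2x2 blocks [[1, -w_s], [w_s, 1]] along the diagonal, the real
# canonical form of complex conjugate eigenvalue pairs 1 +/- i*w_s.
J = np.zeros((n, n))
for s, ws in enumerate(w):
    J[2*s:2*s+2, 2*s:2*s+2] = [[1.0, -ws], [ws, 1.0]]

# Columns of P: real and imaginary parts [x cos(theta), x sin(theta)]
# of two illustrative patterns of amplitudes x and phases theta.
x1, th1 = np.ones(n), np.array([0, np.pi/2, 0, np.pi/2])
x2, th2 = np.ones(n), np.array([0, np.pi/2, np.pi, 3*np.pi/2])
P = np.column_stack([x1*np.cos(th1), x1*np.sin(th1),
                     x2*np.cos(th2), x2*np.sin(th2)])

T = P @ J @ np.linalg.inv(P)         # learned linear coupling

# The eigenvalues of T are exactly those programmed into J:
# 1 +/- 1.5i and 1 +/- 2.5i.
print(np.sort_complex(np.linalg.eigvals(T)))
```

This is the sense in which the synthesis runs the usual bifurcation analysis in reverse: instead of diagonalizing a given system to find its modes, the modes are chosen first and the coupling matrix is constructed from them.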
For biological modeling, where possibly the patterns to be stored are sparse and nearly orthogonal, the learning rule for periodic patterns becomes a \"periodic\" outer product rule which is local, additive, commutative, and incremental. It reduces to the usual Hebb-like rule for static attractors.\n\nCYCLES\n\nThe observed physiological activity may be idealized mathematically as a \"cycle\", r x_j e^{i(θ_j + ωt)}, j = 1, 2, ..., n. Such a cycle is a \"periodic attractor\" if it is stable. The global amplitude r is just a scaling factor for the pattern x, and the global phase ω in e^{iωt} is a periodic scaling that scales x by a factor between ±1 at frequency ω as t varies.\n\nThe same vector x^s or \"pattern\" of relative amplitudes can appear in space as a standing wave, like that seen in the bulb, if the relative phase θ^s_i of each compartment (component) is the same, θ^s_{i+1} = θ^s_i, or as a traveling wave, like that seen in the prepyriform cortex, if the relative phase components of x^s form a gradient in space, θ^s_{i+1} = θ^s_i + 1/α. The traveling wave will \"sweep out\" the amplitude pattern x^s in time, but the root-mean-square amplitude measured in an experiment will be the same x^s, regardless of the phase pattern. For an arbitrary phase vector, these \"simple\" single frequency cycles can make very complicated looking spatio-temporal patterns. From the mathematical point of view, the relative phase pattern θ is a degree of freedom in the kinds of patterns that can be stored. Patterns of uniform amplitude x which differed only in the phase locking pattern θ could be stored as well.\n\nTo store the kind of patterns seen in the bulb, the amplitude vector x is assumed to be parsed into equal numbers of excitatory and inhibitory components, where each class of component has identical phase, 
but there is a phase difference of 60 to 90 degrees between the classes. The traveling wave in the prepyriform cortex is modeled by introducing an additional phase gradient into both excitatory and inhibitory classes.\n\nPROJECTION ALGORITHM\n\nThe central result of this paper is most compactly stated as the following:\n\nTHEOREM\n\nAny set S, s = 1, 2, ..., n/2, of cycles r^s x^s e^{i(θ^s + ω^s t)} of linearly independent vectors of relative component amplitudes x^s ∈ R^n and phases θ^s ∈ S^n, with frequencies ω^s ∈ R and global amplitudes r^s ∈ R, may be established in the vector field of the analog fourth order network:\n\ndx_i/dt = -τ x_i + Σ_j T_ij x_j - Σ_jkl T_ijkl x_j x_k x_l + b_i δ(t) ,\n\nby some variant of the projection operation:\n\nT_ij = Σ_mn P_im J_mn P^{-1}_nj ,  T_ijkl = Σ_mn P_im A_mn P^{-1}_mj P^{-1}_nk P^{-1}_nl ,\n\nwhere the n x n matrix P contains the real and imaginary components [x^s cos θ^s, x^s sin θ^s] of the complex eigenvectors x^s e^{iθ^s} as columns, J is an n x n matrix of complex conjugate eigenvalues in diagonal blocks, A_mn is an n x n matrix of 2x2 blocks of repeated coefficients of the normal form equations, and the input b_i δ(t) is a delta function in time that establishes an initial condition. The vector field of the dynamics of the global amplitudes r_s and phases θ_s is then given exactly by the normal form equations:\n\ndr_s/dt = u_s r_s - r_s Σ_k a_sk r_k^2 ,  dθ_s/dt = ω_s + Σ_k b_sk r_k^2 .\n\nIn particular, for a_sk > 0 and a_ss/a_ks < 1, for all s and k, the cycles s = 1, 2, ..., n/2 are stable, and have amplitudes r_s = (u_s/a_ss)^{1/2}, where u_s = 1 - τ.\n\nNote that there is a multiple Hopf bifurcation of codimension n/2 at τ = 1. Since there are no approximations here, however, the theorem is not restricted to the neighborhood of this bifurcation, and can be discussed without further reference to bifurcation theory. 
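The theorem can be checked numerically. The following sketch (illustrative; the patterns, frequencies, and coefficient values are made-up, and the cubic term carries the sign convention of the amplitude equations above) builds the fourth order network by the projection operation for two stored cycles and integrates it from an initial condition biased toward one of them.

```python
import numpy as np

# Build the fourth-order network by the projection operation for two
# stored cycles, integrate it, and check convergence to a stored cycle
# with the programmed amplitude r_s = sqrt(u_s / a_ss).
n, tau = 4, 0.5
w = [1.5, 2.5]                        # frequencies of the two cycles
a = np.array([[1.0, 2.0],             # normal form coefficients:
              [2.0, 1.0]])            # a_ss/a_ks < 1 -> axis cycles stable

# J: blocks [[1, -w_s], [w_s, 1]]; A: 2x2 blocks of repeated a_sk.
J = np.zeros((n, n))
for s, ws in enumerate(w):
    J[2*s:2*s+2, 2*s:2*s+2] = [[1.0, -ws], [ws, 1.0]]
A = np.kron(a, np.ones((2, 2)))

# Two made-up patterns (amplitudes x^s, phases theta^s) as columns of P.
x1, th1 = np.ones(n), np.array([0, np.pi/2, 0, np.pi/2])
x2, th2 = np.ones(n), np.array([0, np.pi/2, np.pi, 3*np.pi/2])
P = np.column_stack([x1*np.cos(th1), x1*np.sin(th1),
                     x2*np.cos(th2), x2*np.sin(th2)])
Pinv = np.linalg.inv(P)

T2 = P @ J @ Pinv                                               # T_ij
T4 = np.einsum('im,mn,mj,nk,nl->ijkl', P, A, Pinv, Pinv, Pinv)  # T_ijkl

# Euler-integrate dx_i/dt = -tau x_i + sum_j T_ij x_j
#                           - sum_jkl T_ijkl x_j x_k x_l
# from an initial condition biased toward cycle 1 (the b_i delta(t) input).
x = 0.4*P[:, 0] + 0.05*P[:, 2]
dt = 0.01
for _ in range(20000):
    x = x + dt*(-tau*x + T2 @ x - np.einsum('ijkl,j,k,l->i', T4, x, x, x))

# Mode amplitudes recovered by projecting back to normal form coordinates.
v = Pinv @ x
r1, r2 = np.hypot(v[0], v[1]), np.hypot(v[2], v[3])
# r1 approaches sqrt((1 - tau)/a_11) ~ 0.707 (up to Euler discretization
# error); r2, the competing cycle, decays to ~0.
print(r1, r2)
```

The competitive coefficients a_sk make the two cycles mutually exclusive: whichever mode the input delta function favors suppresses the other, which is the winner-take-all selection the theorem programs.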
The normal form equations for dr_s/dt and dθ_s/dt determine how r^s and θ^s for pattern s evolve in time in interaction with all the other patterns of the set S. This could be thought of as the process of phase locking of the pattern that finally emerges. The unusual power of this algorithm lies in the ability to precisely specify these nonlinear interactions. In general, determination of the modes of the linearized system alone (Li and Hopfield 89) is insufficient to say what the attractors of the nonlinear system will be.\n\nPROOF\n\nThe proof of the theorem is instructive since it is a constructive proof, and we can use it to explain the learning algorithm. We proceed by showing first that there are always fixed points on the axes of these amplitude equations, whose stability is given by the coefficients of the nonlinear terms. Then the network above is constructed from these equations by two coordinate transformations. The first is from polar to Cartesian coordinates, and the second is a linear transformation from these canonical \"mode\" coordinates into the standard basis e_1, e_2, ..., e_n, or \"network coordinates\". This second transformation constitutes the \"learning algorithm\", because it transforms the simple fixed points of the amplitude equations into the specific spatio-temporal memory patterns desired for the network.\n\nAmplitude Fixed Points\n\nBecause the amplitude equations are independent of the rotation θ, the fixed points of the amplitude equations characterize the asymptotic states of the underlying oscillatory modes. The stability of these cycles is therefore given by the stability of the fixed points of the amplitude equations. 
On each axis r_s, the other components r_j are zero by definition,\n\ndr_j/dt = r_j (u_j - Σ_k a_jk r_k^2) = 0, for r_j = 0, which leaves\n\ndr_s/dt = r_s (u_s - a_ss r_s^2), with dr_s/dt = 0 for r_s^2 = u_s/a_ss.\n\nThere is thus an equilibrium on each axis s, at r_s = (u_s/a_ss)^{1/2}, as claimed. Now the Jacobian of the amplitude equations at some fixed point r* has elements\n\nJ_ii = u_i - 3 a_ii r*_i^2 - Σ_{j≠i} a_ij r*_j^2 ,  J_ij = -2 a_ij r*_i r*_j .\n\nFor a fixed point r*_s on axis s, J_ij = 0 for i ≠ j, since r*_i or r*_j = 0, making J a diagonal matrix whose entries are therefore its eigenvalues. Now J_ii = u_i - a_is r*_s^2, for i ≠ s, and J_ss = u_s - 3 a_ss r*_s^2. Since r*_s^2 = u_s/a_ss, we get J_ss = -2 u_s, and J_ii = u_i - a_is (u_s/a_ss). This gives a_is/a_ss > u_i/u_s as the condition for the negative eigenvalues that assure the stability of r*_s. Choice of a_ji/a_ii > u_j/u_i, for all i, j, therefore guarantees stability of all axis fixed points.\n\nCoordinate Transformations\n\nWe now construct the neural network from these well behaved equations by the following transformations. First, polar to Cartesian, (r_s, θ_s) to (v_{2s-1}, v_{2s}): using v_{2s-1} = r_s cos θ_s and v_{2s} = r_s sin θ_s, and differentiating these, gives\n\ndv_{2s-1}/dt = (dr_s/dt) cos θ_s - r_s (dθ_s/dt) sin θ_s ,  dv_{2s}/dt = (dr_s/dt) sin θ_s + r_s (dθ_s/dt) cos θ_s ,\n\nby the chain rule. Now substituting cos θ_s = v_{2s-1}/r_s and sin θ_s = v_{2s}/r_s gives:\n\ndv_{2s-1}/dt = (v_{2s-1}/r_s) dr_s/dt - v_{2s} dθ_s/dt ,  dv_{2s}/dt = (v_{2s}/r_s) dr_s/dt + v_{2s-1} dθ_s/dt .\n\nEntering the expressions of the normal form for dr_s/dt and dθ_s/dt, and using r_s^2 = v_{2s-1}^2 + v_{2s}^2, gives:\n\ndv_{2s-1}/dt = u_s v_{2s-1} - ω_s v_{2s} - Σ_j^{n/2} [v_{2s-1} a_sj + v_{2s} b_sj] (v_{2j-1}^2 + v_{2j}^2) .\n\nSimilarly,\n\ndv_{2s}/dt = u_s v_{2s} + ω_s v_{2s-1} - Σ_j^{n/2} [v_{2s} a_sj - v_{2s-1} b_sj] (v_{2j-1}^2 + v_{2j}^2) .\n\nSetting b_sj = 0 for simplicity, choosing u_s = 1 - τ to get a standard network form, and reindexing i, j = 1, 2, ...
, n, we get the Cartesian equivalent of the polar normal form equations:\n\ndv_i/dt = -τ v_i + Σ_j^n J_ij v_j - v_i Σ_j^n A_ij v_j^2 .\n\nHere J is a matrix containing 2x2 blocks along the diagonal of the local couplings of the linear terms of each pair of the previous equations v_{2s-1}, v_{2s}, with τ separated out of the diagonal terms. The matrix A has 2x2 blocks of the identical coefficients a_sj of the nonlinear terms from each pair:\n\nJ = diag( [1, -ω_1; ω_1, 1], [1, -ω_2; ω_2, 1], ... ) ,  A = [a_11 a_11 a_12 a_12 ...; a_11 a_11 a_12 a_12 ...; a_21 a_21 a_22 a_22 ...; a_21 a_21 a_22 a_22 ...; ...] .\n\nLearning Transformation - Linear Term\n\nSecond, J is the canonical form of a real matrix with complex conjugate eigenvalues, where the conjugate pairs appear in blocks along the diagonal as shown. The Cartesian normal form equations describe the interaction of these linearly uncoupled complex modes due to the coupling of the nonlinear terms. We can interpret the normal form equations as network equations in eigenvector (or \"memory\") coordinates, given by some diagonalizing transformation P containing those eigenvectors as its columns, so that J = P^{-1} T P. Then it is clear that T may instead be determined by the reverse projection T = P J P^{-1} back into network coordinates, if we start with desired eigenvectors and eigenvalues. We are free to choose as columns in P the real and imaginary vectors [x^s cos θ^s, x^s sin θ^s] of the cycles x^s e^{iθ^s} of any linearly independent set S of patterns to be learned. If we write the matrix expression for the projection in component form, we recover the expression given in the theorem for T_ij.\n\nNonlinear Term Projection\n\nThe nonlinear terms are transformed as well, but the expression cannot be easily written in matrix form. 
Using the component form of the transformation, x_i = Σ_j P_ij v_j with v_j = Σ_k P^{-1}_jk x_k, and substituting into the Cartesian normal form, gives:\n\ndx_i/dt = -τ Σ_j P_ij (Σ_k P^{-1}_jk x_k) + Σ_j P_ij Σ_k J_jk (Σ_l P^{-1}_kl x_l) - Σ_j P_ij (Σ_k P^{-1}_jk x_k) Σ_l A_jl (Σ_m P^{-1}_lm x_m)(Σ_n P^{-1}_ln x_n) .\n\nRearranging the orders of summation gives:\n\ndx_i/dt = -τ x_i + Σ_l (Σ_j Σ_k P_ij J_jk P^{-1}_kl) x_l - Σ_k Σ_m Σ_n (Σ_j Σ_l P_ij A_jl P^{-1}_jk P^{-1}_lm P^{-1}_ln) x_k x_m x_n .\n\nFinally, performing the bracketed summations and relabeling indices gives us the network of the theorem,\n\ndx_i/dt = -τ x_i + Σ_j T_ij x_j - Σ_jkl T_ijkl x_j x_k x_l ,\n\nwith the expression for the tensor of the nonlinear term,\n\nT_ijkl = Σ_mn P_im A_mn P^{-1}_mj P^{-1}_nk P^{-1}_nl .  Q.E.D.\n\nLEARNING RULE EXTENSIONS\n\nThis is the core of the mathematical story, and it may be extended in many ways. When the columns of P are orthonormal, then P^{-1} = P^T, and the formula above for the linear network coupling becomes T = P J P^T. Then, for complex eigenvectors,\n\nT_ij = Σ_s x^s_i x^s_j [cos(θ^s_i - θ^s_j) + ω_s sin(θ^s_i - θ^s_j)] .\n\nThis is now a local, additive, incremental learning rule for synapse ij, and the system can be truly self-organizing because the net can modify itself based on its own activity. Between units of equal phase, or when θ^s_i = θ^s_j = 0 for a static pattern, this reduces to the usual Hebb rule.\n\nIn a similar fashion, the learning rule for the higher order nonlinear terms becomes a multiple periodic outer product rule when the matrix A is chosen to have a simple form. Given our present ignorance of the full biophysics of intracellular processing, it is not entirely impossible that some dimensionality of the higher order weights in the mathematical network could be implemented locally within the cells of a biological network, using the information available on the primary lines given by the linear connections discussed above. 
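The equivalence between the projection form T = P J P^T and the local synapse-by-synapse rule can be checked directly. A minimal NumPy sketch (illustrative; the single stored cycle, its phases, and its frequency are made-up values chosen so that the columns of P come out orthonormal):

```python
import numpy as np

# For orthonormal columns of P, the projection T = P J P^T coincides with
# the local periodic outer product rule
#   T_ij = x_i x_j [cos(th_i - th_j) + w sin(th_i - th_j)],
# which uses only quantities available at synapse ij.
n, w1 = 2, 1.7
x = np.ones(n)                         # amplitudes of one stored cycle
th = np.array([0.0, np.pi/2])          # phases chosen so P is orthonormal

P = np.column_stack([x*np.cos(th), x*np.sin(th)])
J = np.array([[1.0, -w1],
              [w1,  1.0]])
T_proj = P @ J @ P.T                   # projection form of the rule

# Local, additive form: each entry depends only on the pre- and
# post-synaptic amplitudes and phases of the stored cycle.
T_local = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        T_local[i, j] = x[i]*x[j]*(np.cos(th[i]-th[j])
                                   + w1*np.sin(th[i]-th[j]))

print(np.allclose(T_proj, T_local))    # True
```

With equal phases the sine term vanishes and each entry collapses to x_i x_j, the ordinary Hebbian outer product; the sine term is what encodes the phase pattern of the stored cycle into the weights.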
When the A matrix is chosen to have uniform entries A_ij = c for all of its off-diagonal 2x2 blocks, and uniform entries A_ij = c - d for the diagonal blocks, then, for orthonormal P, the tensor reduces to the multiple periodic outer product\n\nT_ijkl = c δ_ij δ_kl - d Σ_s D^s_ij D^s_kl , where D^s_ij = x^s_i x^s_j cos(θ^s_i - θ^s_j) .\n\nThe network architecture generated by this learning rule is\n\ndx_i/dt = -τ x_i + Σ_j T_ij x_j - c x_i Σ_j x_j^2 + d Σ_s (Σ_j D^s_ij x_j)(Σ_kl D^s_kl x_k x_l) .\n\nThis reduces to an architecture without higher order correlations in the case that we choose a completely uniform A matrix (A_ij = c, for all i, j). Then\n\ndx_i/dt = -τ x_i + Σ_j T_ij x_j - c x_i Σ_j x_j^2 .\n\nThis network has fixed points on the axes of the normal form as always, but the stability condition is not satisfied, since the diagonal normal form coefficients are equal to, not less than, the remaining A matrix entries. In (Baird 89) we describe how clamped input (inspiration) can break this symmetry and make the nearest stored pattern the only attractor.\n\nAll of the above results hold as well for networks with sigmoids, provided their coupling is such that they have a Taylor's expansion which is equal to the above networks up to third order. The results then hold only in the neighborhood of the origin for which the truncated expansion is accurate. The expected performance of such systems has been verified in simulations.\n\nAcknowledgements\n\nSupported by AFOSR-87-0317. I am very grateful for the support of Walter Freeman and the invaluable assistance of Morris Hirsch.\n\nReferences\n\nB. Baird. Bifurcation Theory Methods for Programming Static or Periodic Attractors and Their Bifurcations in Dynamic Neural Networks. Proc. IEEE Int. Conf. Neural Networks, San Diego, CA, p. I-9, July (1988).\n\nB. Baird. Bifurcation Theory Approach to Vectorfield Programming for Periodic Attractors. Proc. INNS/IEEE Int. Conf. on Neural Networks, Washington D.C., June (1989).\n\nW. J. Freeman & B. Baird. Relation of Olfactory EEG to Behavior: Spatial Analysis. 
Behavioral Neuroscience (1986).\n\nW. J. Freeman & B. W. van Dijk. Spatial Patterns of Visual Cortical EEG During Conditioned Reflex in a Rhesus Monkey. Brain Research, 422, p. 267 (1987).\n\nC. M. Grey and W. Singer. Stimulus Specific Neuronal Oscillations in Orientation Columns of Cat Visual Cortex. PNAS, in press (1988).\n\nZ. Li & J. J. Hopfield. Modeling the Olfactory Bulb. Biological Cybernetics, submitted (1989).\n", "award": [], "sourceid": 145, "authors": [{"given_name": "Bill", "family_name": "Baird", "institution": null}]}