{"title": "Multi-modular Associative Memory", "book": "Advances in Neural Information Processing Systems", "page_first": 52, "page_last": 58, "abstract": null, "full_text": "Multi-modular Associative Memory

Nir Levy

David Horn

School of Physics and Astronomy
Tel-Aviv University, Tel Aviv 69978, Israel

Eytan Ruppin

Departments of Computer Science & Physiology
Tel-Aviv University, Tel Aviv 69978, Israel

Abstract

Motivated by the findings of modular structure in the association cortex, we study a multi-modular model of associative memory that can successfully store memory patterns with different levels of activity. We show that the segregation of synaptic conductances into intra-modular linear and inter-modular nonlinear ones considerably enhances the network's memory retrieval performance. Compared with the conventional, single-module associative memory network, the multi-modular network has two main advantages: it is less susceptible to damage to columnar input, and its response is consistent with the cognitive data pertaining to category-specific impairment.

1 Introduction

Cortical modules were observed in the somatosensory and visual cortices a few decades ago. These modules differ in their structure and functioning, but are likely to be an elementary unit of processing in the mammalian cortex. Within each module the neurons are interconnected. Input and output fibers from and to other cortical modules and subcortical areas connect to these neurons. More recently, modules were also found in the association cortex [1], where memory processes supposedly take place. Ignoring the modular structure of the cortex, most theoretical models of associative memory have treated single-module networks. This paper develops a novel multi-modular network that mimics the modular structure of the cortex.
In this framework we investigate the computational rationale behind cortical multi-modular organization, in the realm of memory processing.

Does multi-modular structure lead to computational advantages? Naturally one may think that modules are necessary in order to accommodate memories of different coding levels. We show in the next section that this is not the case, since one may accommodate such memories in a standard sparse-coding network. In fact, when trying to capture the same results in a modular network we run into problems, as shown in the third section: if both inter- and intra-modular synapses have linear characteristics, the network can sustain memory patterns with only a limited range of activity levels. The solution proposed here is to distinguish between intra-modular and inter-modular couplings, endowing the inter-modular ones with nonlinear characteristics. From a computational point of view, this leads to a modular network that has a large capacity for memories with different coding levels. The resulting network is particularly stable with regard to damage to modular inputs. From a cognitive perspective it is consistent with the data concerning category-specific impairment.

2 Homogeneous Network

We study an excitatory-inhibitory associative memory network [2], having N excitatory neurons. We assume that the network stores M_1 memory patterns \eta^\mu of sparse coding level p and M_2 patterns \xi^\nu with coding level f such that p < f << 1.
The synaptic efficacy J_{ij} between the jth (presynaptic) neuron and the ith (postsynaptic) neuron is chosen in the Hebbian manner

J_{ij} = \frac{1}{Np} \sum_{\mu=1}^{M_1} \eta_i^\mu \eta_j^\mu + \frac{1}{Nf} \sum_{\nu=1}^{M_2} \xi_i^\nu \xi_j^\nu .   (1)

The updating rule for the activity state V_i of the ith binary neuron is given by

V_i(t+1) = \Theta(h_i(t) - \theta) ,   (2)

where \Theta is the step function and \theta is the threshold. The local field, or membrane potential,

h_i(t) = \tilde{h}_i(t) - \frac{\gamma}{p} Q(t)   (3)

includes the excitatory Hebbian coupling of all other excitatory neurons,

\tilde{h}_i(t) = \sum_{j \neq i}^{N} J_{ij} V_j(t) ,   (4)

and global inhibition that is proportional to the total activity of the excitatory neurons,

Q(t) = \frac{1}{N} \sum_{j}^{N} V_j(t) .   (5)

The overlap m(t) between the network activity and the memory patterns is defined for the two memory populations as

m_\xi^\nu(t) = \frac{1}{Nf} \sum_{j}^{N} \xi_j^\nu V_j(t) ,   (6)

and analogously for m_\eta^\mu(t), with f replaced by p.

The storage capacity \alpha = M/N of this network has two critical capacities: \alpha_c^\xi, above which the population of \xi^\nu patterns is unstable, and \alpha_c^\eta, above which the population of \eta^\mu patterns is unstable. We derived equations for the overlap and total activity of the two populations using mean-field analysis. Here we give the fixed-point equations for the case of M_1 = M_2 = M/2 and \gamma = M_1 f^2 + M_2 p^2. The resulting equations are

m_\eta = \Phi\!\left( \frac{\theta - m_\eta}{\sqrt{\gamma}} \right)   (7)

and

Q = p\, m_\eta + \Phi\!\left( \frac{\theta}{\sqrt{\gamma}} \right) ,   (8)

where

\Phi(x) = \int_x^{\infty} \frac{dz}{\sqrt{2\pi}} \exp\!\left( -\frac{z^2}{2} \right) .   (9)

Figure 1: (a) The critical capacity \alpha_c^\eta vs. f and p for f \geq p, \theta = 0.8 and N = 1000. (b) (\alpha_c^\eta - \alpha_c^\xi)/\alpha_c^\eta versus f and p for the same parameters as in (a). The validity of these analytical results was tested and verified in simulations.
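For concreteness, the homogeneous model above can be sketched in a few lines of NumPy. This is a scaled-down illustration only: the sizes, the inhibition strength gamma, and all identifiers are our own assumptions, not the paper's simulation code.

```python
import numpy as np

# Illustrative sketch of the homogeneous excitatory-inhibitory network.
# Sizes, gamma and theta are assumed values, not taken from the paper.
rng = np.random.default_rng(0)
N, M1, M2 = 1000, 10, 10
p, f = 0.05, 0.15          # coding levels, p < f << 1
theta, gamma = 0.8, 1.0    # threshold and assumed inhibition strength

eta = (rng.random((M1, N)) < p).astype(float)  # sparse patterns
xi = (rng.random((M2, N)) < f).astype(float)   # denser patterns

# Hebbian couplings, each population normalized by its own coding level
J = (eta.T @ eta) / (N * p) + (xi.T @ xi) / (N * f)
np.fill_diagonal(J, 0.0)   # no self-coupling (j != i)

def step(V):
    """One parallel update: Hebbian field minus global inhibition."""
    Q = V.mean()                       # total activity
    h = J @ V - gamma * Q / p          # local field
    return (h > theta).astype(float)   # step-function neurons

def overlap(pattern, V, coding):
    """Overlap of the network state with a stored pattern."""
    return pattern @ V / (N * coding)

V = step(eta[0])  # one update starting from a stored low-activity memory
```

A stored pattern has overlap close to 1 with itself under this normalization; whether it remains a fixed point depends on the load and on the inhibition and threshold settings, as analyzed in the mean-field treatment above.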
Next, we look for the critical capacities, \alpha_c^\eta and \alpha_c^\xi, at which the fixed-point equations become marginally stable. The results are shown in Figure 1. Figure 1(a) shows \alpha_c^\eta vs. the coding levels f and p (f \geq p). Similar results were obtained for \alpha_c^\xi. As evident, the critical capacities of both populations are smaller than the one observed in a homogeneous network in which f = p. One hence necessarily pays a price for the ability to store patterns with different levels of activity.

Figure 1(b) plots the relative capacity difference (\alpha_c^\eta - \alpha_c^\xi)/\alpha_c^\eta vs. f and p. The function is non-negative, i.e., \alpha_c^\eta \geq \alpha_c^\xi for all f and p. Thus, low-activity memories are more stable than high-activity ones.

Assuming that high activity codes more features [3], these results seem to be at odds with the view [3, 4] that memories that contain more semantic features, and therefore correspond to larger Hebbian cell assemblies, are more stable, such as concrete versus abstract words. The homogeneous network, in which the memories with high activity are more susceptible to damage, cannot account for these observations. In the next section we show how a modular network can store memories with different activity levels and account for this cognitive phenomenon.

3 Modular Network

We study a multi-modular excitatory-inhibitory associative memory network, storing M memory patterns in L modules of N neurons each. The memories are coded such that in every memory a variable number n of 1 to L modules is active. This number will be denoted as the modular coding. The coding level inside the modules is sparse and fixed, i.e., each modular Hebbian cell assembly consists of pN active neurons with p << 1.
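The memory coding just described can be made concrete with a short generator: each memory activates n of the L modules, and within each active module a random pN-neuron cell assembly fires. The sizes and the helper name are our own illustrative choices.

```python
import numpy as np

# Illustrative generation of modular memory patterns: n active modules,
# each with a random pN-neuron cell assembly. Names are ours, not the paper's.
rng = np.random.default_rng(1)
L, N, p = 10, 500, 0.05

def make_memory(n):
    """Return an (L, N) binary pattern with modular coding n."""
    pattern = np.zeros((L, N))
    active_modules = rng.choice(L, size=n, replace=False)
    for l in active_modules:
        assembly = rng.choice(N, size=int(p * N), replace=False)
        pattern[l, assembly] = 1.0
    return pattern

mem = make_memory(n=4)
print(int(mem.sum()))  # 4 active modules * 25 neurons each = 100
```

With these parameters every active module contributes exactly pN = 25 active neurons, so total activity scales linearly with the modular coding n.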
The synaptic efficacy J_{ij}^{lk} between the jth (presynaptic) neuron from the kth module and the ith (postsynaptic) neuron from the lth module is chosen in a Hebbian manner

J_{ij}^{lk} = \frac{1}{Np} \sum_{\mu=1}^{M} \eta_{il}^\mu \eta_{jk}^\mu ,   (11)

where \eta_{il}^\mu are the stored memory patterns. The updating rule for the activity state V_i^l of the ith binary neuron in the lth module is given by

V_i^l(t+1) = S(h_i^l(t) - \theta_s) ,   (12)

where \theta_s is the threshold, and S(x) is a stochastic sigmoid function, getting the value 1 with probability (1 + e^{-x})^{-1} and 0 otherwise. The neuron's local field, or membrane potential, has two components,

h_i^l(t) = h_i^{l,internal}(t) + h_i^{l,external}(t) .   (13)

The internal field, h_i^{l,internal}(t), includes the contributions from all other excitatory neurons that are situated in the lth module, and inhibition that is proportional to the total modular activity of the excitatory neurons, i.e.,

h_i^{l,internal}(t) = \sum_{j \neq i}^{N} J_{ij}^{ll} V_j^l(t) - \gamma_s Q^l(t) ,   (14)

where

Q^l(t) = \frac{1}{Np} \sum_{j}^{N} V_j^l(t) .   (15)

The external field component, h_i^{l,external}(t), includes the contributions from all other excitatory neurons that are situated outside the lth module, and inhibition that is proportional to the total network activity,

h_i^{l,external}(t) = g\!\left( \sum_{k \neq l}^{L} \sum_{j}^{N} J_{ij}^{lk} V_j^k(t) - \gamma_d \sum_{k \neq l}^{L} Q^k(t) - \theta_d \right) .   (16)

We allow here for the freedom of using more complicated behavior than the standard g(x) = x. In fact, as we will see, the linear case is problematic, since only memory storage with limited modular coding is possible.

The retrieval quality at each trial is measured by the overlap function, defined by

m^\mu(t) = \frac{1}{N n^\mu p} \sum_{k=1}^{L} \sum_{i=1}^{N} \eta_{ik}^\mu V_i^k(t) ,   (17)

where n^\mu is the modular coding of \eta^\mu.

In the simulations we constructed a network of L = 10 modules, where each module contains N = 500 neurons.
The network stores M = 50 memory patterns randomly distributed over the modules. Five sets of ten memories each are defined; in each set the modular coding is distributed homogeneously between one and ten active modules. The sparse coding level within each module was set to p = 0.05. Every simulation experiment is composed of many trials. In each trial we use as initial condition a corrupted version of a stored memory pattern with an error rate of 5%, and check the network's retrieval after it converges to a stable state.

Figure 2: Quality of retrieval vs. memory modular coding. The dark shading represents the mean overlap achieved by a network with linear intra-modular and inter-modular synaptic couplings. The light shading represents the mean overlap of a network with sigmoidal inter-modular connections, which is perfect for all memory patterns. The simulation parameters were: L = 10, N = 500, M = 50, p = 0.05, \lambda = 0.7, \theta_d = 2 and \theta_s = 0.6.

We start with the standard choice of g(x) = x, i.e., treating similarly the intra-modular and inter-modular synaptic couplings. The performance of this network is shown in Figure 2. As evident, the network can store only a relatively narrow span of memories with high modular coding levels, and completely fails to retrieve memories with low modular coding levels (see also [5]). If, however, g is chosen to be a sigmoid function, a completely stable system is obtained, with all possible coding levels allowed. A sigmoid function on the external connections is hence very effective in enhancing the span of modular coding of memories that the network can sustain.
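The two-component local field with a sigmoidal inter-modular transmission g can be sketched as follows. The scaled-down sizes, the parameter values, and the einsum wiring are our assumptions; only the structure (linear internal field plus sigmoid-gated external field) follows the text.

```python
import numpy as np

# Minimal sketch of the internal + external local field of the modular
# network, with sigmoidal inter-modular transmission g. Sizes and parameter
# values are assumed for illustration.
rng = np.random.default_rng(2)
L, N, p, M = 4, 100, 0.05, 10
eta = (rng.random((M, L, N)) < p).astype(float)  # memories active in all modules

# Hebbian couplings J[l, i, k, j] between module pairs (intra-modular when k == l)
J = np.einsum('mli,mkj->likj', eta, eta) / (N * p)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def field(V, gamma_s=0.6, gamma_d=0.7, theta_d=2.0, lam=0.7):
    """Internal (linear) plus external (sigmoid-gated) field for every neuron."""
    Q = V.sum(axis=1) / (N * p)   # per-module activity Q^l
    h = np.empty_like(V)
    for l in range(L):
        internal = J[l, :, l, :] @ V[l] - gamma_s * Q[l]
        ext = sum(J[l, :, k, :] @ V[k] for k in range(L) if k != l)
        ext = ext - gamma_d * (Q.sum() - Q[l]) - theta_d
        h[l] = internal + lam * sigmoid(ext)  # nonlinear inter-modular term
    return h

h = field(eta[0])
print(h.shape)  # (4, 100)
```

Replacing `sigmoid` with the identity recovers the linear choice g(x) = x, which, as Figure 2 shows, fails to retrieve memories with low modular coding.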
The segregation of the synaptic inputs into internal and external connections has been motivated by observed patterns of cortical connectivity: axons forming excitatory intra-modular connections make synapses more proximal to the cell body than do inter-modular connections [6]. Dendrites, having active conductances, embody a rich repertoire of nonlinear electrical and chemical dynamics (see [7] for a review). In our model, setting g to be a sigmoid function crudely mimics these active conductance properties.

We may go on and envisage the use of a nested set of sigmoidal dendritic transmission functions. This turns out to be useful when we test the effects of pathologic alterations on the retrieval of memories with different modular codings. The remarkable result is that if the damage is done to modular inputs, the highly nonlinear transmission functions are very resistant to it. An example is shown in Figure 3. Here we compare two nonlinear functions:

g_1 = \lambda \Theta\!\left[ \sum_{k \neq l} \sum_{j} J_{ij}^{lk} V_j^k(t) - \gamma_d \sum_{k \neq l} Q^k(t) - \theta_d \right] ,

g_2 = \lambda \Theta\!\left[ \sum_{k \neq l} \Theta\!\left( \sum_{j} J_{ij}^{lk} V_j^k(t) - \gamma_k Q^k(t) - \theta_k \right) - \theta_d \right] .

The second one is the nested sigmoidal function mentioned above. Two types of input cues are compared: a correct \eta_{il}^\mu to one of the modules and no input to the rest, or partial input to all modules.

Figure 3: The performance of modular networks with different types of nonlinear inter-connections when partial input cues are given. The mean overlap is plotted vs. the overlap of the input cue. The solid line represents the performance of the network with g_2 and the dash-dot line represents g_1.
The left curve of g_2 corresponds to the case when full input is presented to only one module (out of the 5 that comprise a memory), while the right solid curve corresponds to partial input to all modules. The two g_1 curves describe partial input to all modules, but correspond to two different choices of the threshold parameter \theta_d, 1.5 (left) and 2 (right). Parameters are L = 5, N = 1000, p = 0.05, \lambda = 0.8, n = 5, \theta_s = 0.7 and \theta_k = 0.7.

As we can see, the nested nonlinearities enable retrieval even if only the input to a single module survives. One may therefore conclude that, under such conditions, patterns of high modular coding have a greater chance of being retrieved from an input to a single module and thus are more resilient to afferent damage. Adopting the assumption that different modules code for distinct semantic features, we now find that a multi-modular network with nonlinear dendritic transmission can account for the view of [3], that memories with more features are more robust.

4 Summary

We have studied the ability of homogeneous (single-module) and modular networks to store memory patterns with variable activity levels. Although homogeneous networks can store such memory patterns, the critical capacity of low-activity memories was shown to be larger than that of high-activity ones. This result seems to be inconsistent with the pertaining cognitive data concerning category-specific semantic impairment, which seem to imply that high-activity memories should be the more stable ones.

Motivated by the findings of modular structure in the association cortex, we developed a multi-modular model of associative memory.
Adding the assumption that dendritic nonlinear processing operates on the signals of inter-modular synaptic connections, we obtained a network that has two important features: the coexistence of memories with different modular codings, and the retrieval of memories from cues presented to a small fraction of all modules. The latter implies that memories encoded in many modules should be more resilient to damage in afferent connections; hence it is consistent with the conventional interpretation of the data on category-specific impairment.

References

[1] R. F. Hevner. More modules. TINS, 16(5):178, 1993.

[2] M. V. Tsodyks. Associative memory in neural networks with the Hebbian learning rule. Modern Physics Letters B, 3(7):555-560, 1989.

[3] G. E. Hinton and T. Shallice. Lesioning an attractor network: investigations of acquired dyslexia. Psychological Review, 98(1):74-95, 1991.

[4] G. V. Jones. Deep dyslexia, imageability, and ease of predication. Brain and Language, 24:1-19, 1985.

[5] R. Lauro Grotto, S. Reich, and M. A. Virasoro. The computational role of conscious processing in a model of semantic memory. In Proceedings of the IIAS Symposium on Cognition, Computation and Consciousness, 1994.

[6] P. A. Hetherington and L. M. Shapiro. Simulating Hebb cell assemblies: the necessity for partitioned dendritic trees and a post-not-pre LTD rule. Network, 4:135-153, 1993.

[7] R. Yuste and D. W. Tank. Dendritic integration in mammalian neurons a century after Cajal. Neuron, 16:701-716, 1996.
", "award": [], "sourceid": 1345, "authors": [{"given_name": "Nir", "family_name": "Levy", "institution": null}, {"given_name": "David", "family_name": "Horn", "institution": null}, {"given_name": "Eytan", "family_name": "Ruppin", "institution": null}]}