{"title": "Analog Memories in a Balanced Rate-Based Network of E-I Neurons", "book": "Advances in Neural Information Processing Systems", "page_first": 2231, "page_last": 2239, "abstract": "The persistent and graded activity often observed in cortical circuits is sometimes seen as a signature of autoassociative retrieval of memories stored earlier in synaptic efficacies. However, despite decades of theoretical work on the subject, the mechanisms that support the storage and retrieval of memories remain unclear. Previous proposals concerning the dynamics of memory networks have fallen short of incorporating some key physiological constraints in a unified way. Specifically, some models violate Dale's law (i.e. allow neurons to be both excitatory and inhibitory), while some others restrict the representation of memories to a binary format, or induce recall states in which some neurons fire at rates close to saturation. We propose a novel control-theoretic framework to build functioning attractor networks that satisfy a set of relevant physiological constraints. We directly optimize networks of excitatory and inhibitory neurons to force sets of arbitrary analog patterns to become stable fixed points of the dynamics. The resulting networks operate in the balanced regime, are robust to corruptions of the memory cue as well as to ongoing noise, and incidentally explain the reduction of trial-to-trial variability following stimulus onset that is ubiquitously observed in sensory and motor cortices. 
Our results constitute a step forward in our understanding of the neural substrate of memory.", "full_text": "Analog Memories in a Balanced Rate-Based Network of E-I Neurons\n\nDylan Festa\ndf325@cam.ac.uk\n\nGuillaume Hennequin\ngjeh2@cam.ac.uk\n\nM\u00e1t\u00e9 Lengyel\nm.lengyel@eng.cam.ac.uk\n\nComputational & Biological Learning Lab, Department of Engineering\nUniversity of Cambridge, UK\n\nAbstract\n\nThe persistent and graded activity often observed in cortical circuits is sometimes\nseen as a signature of autoassociative retrieval of memories stored earlier\nin synaptic efficacies. However, despite decades of theoretical work on the subject,\nthe mechanisms that support the storage and retrieval of memories remain\nunclear. Previous proposals concerning the dynamics of memory networks have\nfallen short of incorporating some key physiological constraints in a unified way.\nSpecifically, some models violate Dale\u2019s law (i.e. allow neurons to be both excitatory\nand inhibitory), while some others restrict the representation of memories to\na binary format, or induce recall states in which some neurons fire at rates close\nto saturation. We propose a novel control-theoretic framework to build functioning\nattractor networks that satisfy a set of relevant physiological constraints. We\ndirectly optimize networks of excitatory and inhibitory neurons to force sets of\narbitrary analog patterns to become stable fixed points of the dynamics. The resulting\nnetworks operate in the balanced regime, are robust to corruptions of the\nmemory cue as well as to ongoing noise, and incidentally explain the reduction\nof trial-to-trial variability following stimulus onset that is ubiquitously observed\nin sensory and motor cortices. 
Our results constitute a step forward in our under-\nstanding of the neural substrate of memory.\n\n1\n\nIntroduction\n\nMemories are thought to be encoded in the joint, persistent activity of groups of neurons. According\nto this view, memories are embedded via long-lasting modi\ufb01cations of the synaptic connections\nbetween neurons (storage) such that partial or noisy initialization of the network activity drives\nthe collective dynamics of the neurons into the corresponding memory state (recall) [1]. Models of\nmemory circuits following these principles abound in the theoretical neuroscience literature, but few\nrespect some of the most fundamental properties of brain networks, including: i) the separation of\nneurons into distinct classes of excitatory (E) and inhibitory (I) cells \u2013 known as Dale\u2019s law \u2013, ii) the\npresence of recurrent and sparse synaptic connections, iii) the possibility for each neuron to sustain\ngraded levels of activity in different memories, iv) the \ufb01ring of action potentials at reasonably low\nrates, and v) a dynamic balance of E and I inputs.\nIn the original Hop\ufb01eld network [1], connectivity must be symmetrical, which violates Dale\u2019s law.\nMoreover, just as in much of the work following it up, memories are encoded in binary neuronal\nresponses and so converge towards effectively binary recall states even if the recall dynamics for-\nmally uses graded activities [2]. Subsequent work considered non-binary pattern distributions [3, 4],\nand derived high theoretical capacity limits for them, but those capacities proved dif\ufb01cult \u2013 if not\nimpossible \u2013 to realise in practice [5, 6], and the network dynamics therein did not explicitly model\ninhibitory neurons thus implicitly assuming instantaneous inhibitory feedback. More recent work\n\n1\n\n\fFigure 1: (a) Examples of analog patterns of excitatory neuronal activities, drawn from a log-normal\ndistribution. 
In all our training experiments, network parameters were optimized to stabilize a set\nof such analog patterns and the baseline, uniform activity state (top row). For ease of visualization,\nonly 30 of the 100 excitatory neurons are shown. (b) Optimized values of the inhibitory (auxiliary)\nneuronal \ufb01ring rates for 5 of 30 learned memories (corresponding to those in panel a). Only 30 of\nthe 50 auxiliary neurons are shown. (c) Empirical distributions of \ufb01ring rates across neurons and\nmemory patterns, for each population.\n\nincorporated Dale\u2019s law, and described neurons using the more realistic, leaky integrate-and-\ufb01re\n(LIF) neuron model [7]. However, the stability of the recall states still relied critically on the satu-\nrating behavior of the LIF input-output transfer function at high rates. Although it was later shown\nthat dynamic feedback inhibition can stabilize relatively low \ufb01ring rates in subpopulations of more\ntightly connected neurons [8, 9], inhibitory feedback in these models is global, and calibrated for a\nsingle stereotypical level of excitation for all memories, implying effectively binary memories again.\nFinally, spatially connected networks are able to sustain graded activity patterns (spatial \u201cbumps\u201d),\nbut make strong assumptions about the spatial structure of both the connectivity and the memory\npatterns, and are sensitive to ongoing noise (e.g. [10, 11]). Ref. [12] provides a rare example of\nspike timing-based graded memory network, but it again did not contain inhibitory units.\nHere we propose a general control-theoretic framework that overcomes all of the above limitations\nwith minimal additional assumptions. We formalize memory storage as implying two conditions:\nthat the desired activity states be \ufb01xed points of the dynamics, and that the dynamics be stable\naround those \ufb01xed points. 
We directly optimize the network parameters, including the synaptic\nconnectivity, to satisfy both conditions for a collection of arbitrary, graded memory patterns (Fig. 1).\nThe \ufb01xed point condition is achieved by minimizing the time derivative of the neural activity, such\nthat ideally it reaches zero, at each of the desired attractor states. Stability, however, is more dif\ufb01cult\nto achieve because the \ufb01xed-point constraints tend to create strong positive feedback loops in the\nrecurrent circuitry, and direct measures of dynamical stability (eg. the spectral abscissa) do not admit\nef\ufb01cient, gradient-based optimization. Thus, we use recently developed methods from robust control\ntheory, namely the minimization of the Smoothed Spectral Abscissa (SSA, [13, 14]) to perform\nrobust stability optimization. To satisfy biological constraints, we parametrize the networks that we\noptimize such that they have realistic \ufb01ring rate dynamics and their connectivities obey Dale\u2019s law.\nWe show that despite these constraints the resulting networks perform memory recall that is robust\nto noise in both the recall cue and the ongoing dynamics, and is stabilized through a tight dynamic\nbalance of excitation and inhibition. This novel way of constructing structurally realistic memory\nnetworks should open new routes to the understanding of memory and its neural substrate.\n\n2 Methods\n\nWe study a network of n = nE (excitatory) +nI (inhibitory) neurons. The activity of neuron i is\nrepresented by a single scalar potential vi, which is converted into a \ufb01ring rate ri via a threshold-\nquadratic gain function (e.g. 
[15]):\n\n$$r_i = g(v_i) := \\begin{cases} \\gamma v_i^2 & \\text{if } v_i > 0 \\\\ 0 & \\text{otherwise} \\end{cases} \\qquad (1)$$\n\nWe set \u03b3 to 0.04, such that g(vi) spans a few tens of Hz when vi spans a few tens of mV, as\nexperimentally observed in cortical areas (e.g. cat\u2019s V1 [16]). The instantaneous state of the system\ncan be expressed as a vector v(t) := (v1(t), . . . , vn(t)). We denote the activity of the excitatory or\ninhibitory subpopulation by vexc and vinh, respectively. The recurrent interactions between neurons\nare governed by a synaptic weight matrix W, in which the sign of each element Wij depends on\nthe nature (excitatory or inhibitory) of the presynaptic neuron j. We enforce Dale\u2019s law via a\nre-parameterization of the synaptic weights:\n\n$$W_{ij} = s_j \\log(1 + \\exp \\beta_{ij}) \\quad \\text{with} \\quad s_j = \\begin{cases} +1 & \\text{if } j \\leq n_E \\\\ -1 & \\text{otherwise} \\end{cases} \\qquad (2)$$\n\nwhere the \u03b2ij\u2019s are free, unconstrained parameters. (We do not allow for autapses, i.e. we fix Wii =\n0). The network dynamics are thus given by:\n\n$$\\tau_i \\frac{dv_i}{dt} = -v_i + \\sum_{j=1}^{n} W_{ij}\\, g(v_j) + h_i, \\qquad (3)$$\n\nwhere \u03c4i is the membrane time constant, and hi is a constant external input, independent of the\nmemory we wish to recall.\nIt is worth noting that, since the gain function g(vi) defined in Eq. 1 has no upper saturation,\nrecurrent interactions can easily result in runaway excitation and firing rates growing unbounded.\nHowever, our optimization algorithm will naturally seek stable solutions, in which firing rates are\nkept within a limited range due to a fine dynamic balance of excitation and inhibition [14].\n\nOptimizing network parameters to embed attractor memories\n\nWe are going to build and study networks that have a desired set of analog activity patterns as stable\nfixed points of their dynamics. 
Let $\\{v^\\mu_{\\mathrm{exc}}\\}_{\\mu=1,\\ldots,m}$ be a set of m target analog patterns (Fig. 1),\ndefined in the space of excitatory neuronal activity (potentials). For a given pattern \u00b5, the inhibitory\nneurons will be free to adjust their steady state firing rates $v^\\mu_{\\mathrm{inh}}$ to whatever pattern proves to be\noptimal to maintain stability. In other words, we think of the activity of inhibitory neurons as\n\u201cauxiliary\u201d variables.\nA given activity pattern $v^\\mu \\equiv (v^{\\mu\\top}_{\\mathrm{exc}}, v^{\\mu\\top}_{\\mathrm{inh}})^\\top$ is a stable fixed point of the network dynamics if, and\nonly if, it satisfies the following two conditions:\n\n$$\\left. \\frac{dv}{dt} \\right|_{v=v^\\mu} = 0 \\quad \\text{and} \\quad \\alpha(J^\\mu) < 0 \\qquad (4)$$\n\nwhere $J^\\mu$ is the Jacobian matrix of the dynamics in Eq. 3, i.e. $J^\\mu_{ij} := W_{ij}\\, g'(v^\\mu_j) - \\delta_{ij}$ (Kronecker\u2019s\ndelta), and $\\alpha(J^\\mu)$ denotes the spectral abscissa (SA), defined as the largest real part in the eigenvalue\nspectrum of $J^\\mu$. The first condition makes $v^\\mu$ a fixed point of the dynamics, while the second\ncondition makes that fixed point asymptotically stable with respect to small local perturbations.\nNote that the width of the basin of attraction is not captured by the SA.\nThe two conditions in Eq. 4 depend on a set of network parameters that we will allow ourselves\nto optimize. These are all the synaptic weight parameters ($\\beta_{ij}$, $i \\neq j$), as well as the values of the\ninhibitory neurons\u2019 firing rates in each attractor ($v^\\mu_{\\mathrm{inh}}$, \u00b5 = 1, . . . , m). Thus, we may adjust a total\nof n(n \u2212 1) + nI m parameters.\nUsing Eq. 3, the first condition in Eq. 4 can be rewritten as $v^\\mu_i - \\sum_{j=1}^{n} W_{ij}\\, g(v^\\mu_j) - h_i = 0$.\nDespite this equation being linear in the synaptic weights, the re-parameterization of Eq. 2 makes\nit nonlinear in \u03b2, and it is in any case nonlinear in $v^\\mu_{\\mathrm{inh}}$. We will therefore seek to satisfy this\ncondition by minimizing $\\| dv/dt|_{v=v^\\mu} \\|^2$, which quantifies how fast the potentials drift away when\ninitialized in the desired attractor state $v^\\mu$. When it is zero, $v^\\mu$ is a fixed point of the dynamics. Our\noptimization procedure (see below) may not be able to set this term to exactly zero, especially as we\ntry to store a large number of memories, but in practice we find it becomes small enough that the\nJacobian-based stability criterion remains valid.\nMeeting the stability condition (second condition in Eq. 4) turns out to be more involved. The SA\nis, in general, a non-smooth function of the matrix elements and is therefore difficult to minimize.\nA more suitable stability measure has been introduced recently in the context of robust control\ntheory [13, 14], called the Smoothed Spectral Abscissa (SSA), which we will use here and denote\nby $\\tilde{\\alpha}_\\varepsilon(J^\\mu)$. The SSA, defined for some smoothness parameter \u03b5 > 0, is a differentiable relaxation of\nthe SA, with the properties $\\alpha(J^\\mu) < \\tilde{\\alpha}_\\varepsilon(J^\\mu)$ and $\\lim_{\\varepsilon \\to 0} \\tilde{\\alpha}_\\varepsilon(J^\\mu) = \\alpha(J^\\mu)$. Therefore, the criterion\n$\\tilde{\\alpha}_\\varepsilon(J^\\mu) \\leq 0$ implies $\\alpha(J^\\mu) < 0$, and can therefore be used as an indication of local stability.\nBoth the SSA and its gradient are straightforward to evaluate numerically, making it amenable to\nminimization through gradient descent. Note that the SSA depends on the Jacobian matrix elements\n$\\{J^\\mu_{ij}\\}$, which in turn depend both on the connectivity parameters $\\{\\beta_{ij}\\}$ and on $v^\\mu_{\\mathrm{inh}}$. Note also that\nthe parameter \u03b5 > 0 controls how tightly the SSA hugs the SA. Small values make it a tight upper\nbound, with increasingly ill-behaved gradients. 
Large values imply more smoothness, but may no\nlonger guarantee that the SSA has a negative minimum even though the SA might have one. In our\nsystem of n = 150 neurons we found \u03b5 = 0.01 to yield a good compromise. In the general case the\ndistance between SA and SSA grows linearly with the number of dimensions. To keep it invariant, \u03b5\nshould be scaled accordingly. We therefore used the following heuristic rule: \u03b5 = 0.01 \u00b7 150/n.\nWe summarize the above objective into a global cost function by lumping together the fixed point\nand stability conditions, summing over the entire set of m target memory patterns, and adding an L2\npenalty term on the synaptic weights to regularize:\n\n$$\\psi(\\{\\beta_{ij}\\}, \\{v^\\mu_{\\mathrm{inh}}\\}) := \\frac{1}{m} \\sum_{\\mu=1}^{m} \\left( \\frac{1}{n} \\left\\| \\frac{dv}{dt} \\right\\|^2_{v=v^\\mu} + \\eta_s\\, \\tilde{\\alpha}_\\varepsilon(J^\\mu) \\right) + \\frac{\\eta_F}{n^2} \\|W\\|_F^2. \\qquad (5)$$\n\nwhere $\\|W\\|_F^2$ is the squared Frobenius norm of W, i.e. the sum of its squared elements, and the\nparameters \u03b7s and \u03b7F control the relative importance of each component of the objective function.\nWe set them heuristically (Table 1). We used a variant of the low-storage BFGS algorithm included\nin the open source library NLopt [17] to minimize \u03c8.\n\nChoice of initial parameters and attractors\n\nThe synaptic weights are initially drawn randomly from a Gamma distribution with a shape factor of\n2 and a mean that depends only on the type of pre- and post-synaptic population. The mean synaptic\nweights of the four synapse types were computed using a mean-field reduction of the full network\nto meet the condition that the network initially exhibits a stable baseline state $v^{\\mu=1}_{\\mathrm{exc}}$ in which all\nexcitatory firing rates equal rbaseline = 5 Hz (Table 1, and Supplementary Material). 
This baseline\nstate was included in every set of m target attractors that we used and was thus stable from\nthe beginning, by construction. For the remaining target patterns, $\\{v^\\mu_{\\mathrm{exc}}\\}_{\\mu=2,\\ldots,m}$ were generated\nby inverting (using $g^{-1}$) firing rates that were sampled from a log-normal distribution with a mean\nmatching the baseline firing rate, rbaseline (Fig. 1a) and a variance of 5 Hz. This log-normal distribution\nwas chosen to roughly capture the skewed and heavy-tailed nature of firing rate distributions\nobserved in vivo (see e.g. [18] for a review). The inhibitory potentials in the memory states, $\\{v^\\mu_{\\mathrm{inh}}\\}$,\nwere initialized to the baseline, $g^{-1}$(5 Hz), and were subsequently used as free parameters by the\nlearning algorithm (cf. above; see also Fig. 1b).\n\n3 Results\n\nExample of successful storage\n\nFigure 2 shows an example of stability optimization: in this specific run we used 150 neurons to embed\n30 graded attractors (examples of which were shown in Fig. 1), yielding a storage capacity of\n0.2. Other parameters are listed in Table 1. Gradient descent gradually reduces each of the attractor-specific\nsub-objectives in Eq. 5, namely the SSA, the SA, and the potential velocities $\\|dv/dt\\|^2$ in\neach target state (Fig. 2). After convergence, the SSA has become negative for all desired states,\nindicating stability. Note, however, that $\\|dv/dt\\|$ after convergence is small but non-zero in each\nof the target memories. Thus, strictly speaking, the target patterns haven\u2019t become fixed points of\nthe dynamics, but only slow points from which the system will eventually drift away. In practice\nthough, we found that stability was robust enough that an exact, stable fixed point had in fact been\ncreated very near each target pattern. 
This is detailed below.\n\nFigure 2: (a) Decrease of the SA (solid line) and of the SSA (dotted line) during learning in systems\nwith 30 (purple) and 50 attractors (orange). Thick lines show averages across attractors, flanking\nlines show the corresponding standard deviations. The x-axis marks the actual duration of the run of\nthe learning algorithm. (b) Euclidean norm of the velocity at the fixed point during learning. Lines\nand colors as in a. Note the logarithmic y-axis.\n\nTable 1: Parameter settings\n\nParameter    Value\nnE           100\nnI           50\nm            30\n\u03c4E           20 ms\n\u03c4I           10 ms\nrbaseline    5 Hz\n\u03b7s           0.02\n\u03b7F           0.001\n\nMemory recall performance and robustness\n\nFor recall, we initialize neuronal activities at a noisy version of one of the target patterns, and study\nthe subsequent evolution of the network state. The network performs well if its dynamics clean up\nthe noise and home in on the target pattern (autoassociative behavior) and if it achieves this robustly\neven in the face of large amounts of noise.\nInitial cues are chosen to be linear combinations of the form $r(t = 0) = \\sigma \\tilde{r} + (1 - \\sigma) r^\\mu$, where $r^\\mu$\nis the memory we intend to recall and $\\tilde{r}$ is an independent random vector with the same log-normal\nstatistics used to generate the memory patterns themselves. The parameter \u03c3 regulates the noise\nlevel: \u03c3 = 0 sets the network activity directly in the desired attractor, while \u03c3 = 1 initializes it with\ncompletely random values.\nThe deviation of the momentary network state $r(t) \\equiv g(v(t))$ from the target pattern $r^\\mu \\equiv g(v^\\mu)$\nis measured in terms of the squared Euclidean distance, further normalized by the expected squared\ndistance between $r^\\mu$ and a random pattern drawn from the same distribution (log-normal in our\ncase). 
Formally:\n\n$$d^\\mu(t) := \\frac{\\| r_{\\mathrm{exc}}(t) - r^\\mu_{\\mathrm{exc}} \\|^2}{\\langle \\| \\tilde{r}_{\\mathrm{exc}} - r^\\mu_{\\mathrm{exc}} \\|^2 \\rangle_{\\tilde{r}}}. \\qquad (6)$$\n\nFigure 3a shows the temporal evolution of $d^\\mu(t)$ on a few sample recall trials, for two different noise\nlevels \u03c3. For \u03c3 = 0.5, recalls are always successful, as the network state converges to the right target\npattern on each trial. For \u03c3 = 0.75, the network activity occasionally settles in another, well distinct\nattractor.\nWe used the convention that a trial is deemed successful if the distance $d^\\mu(t)$ falls below 0.001. (A\n\u223c3 Hz deviation from the target in only one of the 100 exc. neurons, with all other 99 neurons\nbehaving perfectly, would be sufficient to cross this threshold and fail the test.) We further measure\nperformance as the probability of successful recall, which we estimated from many independent\ntrials with different realizations of the noise $\\tilde{r}$ in the initial condition (Figure 3b). The network\nperformance is also compared to an \u201cideal observer\u201d [6] that has direct access to all the stored\nmemories (rather than just their reflection in the synaptic weights) and simply returns that pattern\nin the training set $\\{r^\\mu\\}$ to which the initial cue is closest (Fig. 3b). Thus, as an upper bound on\nperformance, the ideal observer only produces a wrong recall when the added noise brings the\ninitial state closer to an attractor that is different from the target. Remarkably, our network dynamics\n\nFigure 3: (a) Example recall trials for a single memory $r^\\mu$, which is presented to the network at\ntime t = 0 in a corrupted version that is different on every trial, for two different values of the\nnoise level \u03c3 (colors). 
Shown here is the temporal evolution of the momentary distance between the\nvector of excitatory \ufb01ring rates rexc(t) and the memory pattern r\u00b5\nexc. Different lines correspond to\ndifferent trials. (b) Fraction of trials that converged onto the correct attractor (\ufb01nal distance d\u00b5(t =\n\u221e) < 0.001, cf. text) as a function of the normalized distance between the initial condition and the\ndesired attractor, d\u00b5(t = 0). Thick lines show medians across attractors, \ufb02anking thin lines show\nthe 25th and 75th percentiles. The performance of the baseline state is shown separately (orange).\nThe dashed lines show the performance of an \u201cideal observer\u201d, always selecting the memory closest\nto the initial condition, for the same trials.\n\n(continuous lines) and the ideal observer (dashed lines) have comparable performances. When trying\nto recall the uniform pattern of baseline activity, the performance appears much better (orange line)\nboth for the ideal observer and the network. This is simply because the random vectors used to\nperturb the system have a high probability of lying closer to the mean of the log normal distribution\n(that is, the baseline state) than to any other memory pattern. Moreover, the network was initialized\nprior to learning with the baseline as the single global attractor, and this might account for the\nadditional tendency of the network (solid orange line) to fall on such state, as compared to the ideal\nobserver (dotted orange line).\n\nOnly a few strong synaptic weights contribute to memory recall\n\nSynaptic weights after learning (Fig. 4a) are sparse: their distribution shows the characteristic peak\nnear zero and the long tail observed in real cortical circuits [19, 20] (Fig. 4b). This sparseness cannot\nbe accounted for by the L2 norm regularizer in the cost function (Eq. 5) as it does not promote\nsparsity as an L1 term would. 
Thus, the observed sparsity in the trained network must be a genuine\nconsequence of having optimized the connectivity for robust stability.\nIf we assume that weights |Wij| \u2264 0.01 correspond to functionally silent synapses, then 52% of\nexcitatory synapses and 46% of inhibitory synapses in the trained network are silent (Fig. 4c). We\nwondered if those weak, \u201csilent\u201d synapses are necessary for stability of memory recall, or could be\nremoved altogether without affecting performance. To test that, we clipped those synapses {|Wij| <\n0.01} to zero, and computed recall performance again (Fig. 4d). This clipping turns out to slightly\nshift the position of the attractors in state space, so we increased the distance threshold that defines\na successful recall trial to 0.08. The test reveals that one of the attractors loses stability, reducing\nthe average performance. However, the remaining 29 attractors are robust to this removal of weak\nsynapses and show nearly the same recall performance as above. This demonstrates that small weights,\nthough numerous, are not necessary for competent recall performance.\n\nBalanced state\n\nAs a result of the connection weight distributions and robust stability, the trained network produces\na regime in which excitation and inhibition balance each other, precisely tuning each neuron to\nits target frequency in each attractor. Excitatory and inhibitory inputs are defined as\n$h^{\\mathrm{exc}}_i(t) = \\sum_{j=1}^{n} \\lfloor W_{ij} \\rfloor_+ r_j(t)$ and $h^{\\mathrm{inh}}_i(t) = \\sum_{j=1}^{n} \\lfloor -W_{ij} \\rfloor_+ r_j(t)$, so that the difference $h^{\\mathrm{exc}}_i(t) - h^{\\mathrm{inh}}_i(t)$\ncorresponds to the total recurrent input, i.e. the second term on the r.h.s. of Eq. 3.\n\nFigure 4: (a) Synaptic weight matrix after learning. Note the logarithmic color scale. 
(b) Distribution\nof the excitatory (red) and inhibitory (blue) weights. (c) Cumulative weight distribution of\nabsolute weight values. Gray line marks the 0.01 threshold we use to define \u201csilent\u201d synapses. (d)\nPerformance of the network after clipping the weights below 0.01 to zero (black, median with 25th\nand 75th percentiles), compared to the performance of the unperturbed network redrawn from Fig. 3\n(purple).\n\nFigure 5: (a) Dynamics of the excitatory and inhibitory inputs during a memory recall trial, for\nthree sample neurons. (b) Scatter plot of steady-state excitatory versus inhibitory inputs. Each dot\ncorresponds to a different memory pattern, and several neurons are shown in different colors. (c)\nHistogram of E and I input correlations across all memories for each neuron (for example, one value\nbinned in this histogram would be the correlation between all green dots in b).\n\nFigure 5a shows the evolution of $h^{\\mathrm{exc}}_i(t)$ and $h^{\\mathrm{inh}}_i(t)$ during a recall trial for one of the stored random\nattractors, for 3 different neurons. Neuron 3 has a target rate of 9 Hz, well above average; therefore its\nexcitation is much higher than inhibition. Neuron 72 has a steady state firing rate of 2 Hz, below\naverage: its inhibitory input is greater than the excitatory one, and firing is driven by the external\ncurrent. Finally, neuron 101 is inhibitory and has a target rate of 0, and indeed its inhibitory input\nis large enough to overwhelm the combined effects of the external and recurrent excitatory inputs.\nNotably, in all these cases, both E and I input currents are fairly large but cancel each other to leave\nsomething smaller, either positive or negative.\nFigure 5b shows the E vs. I inputs at steady-state across all the embedded attractors, for various\nneurons plotted in different colors. These E and I inputs tend to be correlated across attractors for\nevery single neuron (dots in Fig. 
5 tend to hug the identity line), with relative differences fine-tuned\nto yield the desired firing rates. These across-attractors E/I correlations are summarized in Fig. 5c\nas a histogram over neurons.\n\nRobustness to ongoing noise and reduction of across-trial variability following recall onset\n\nFinally, to probe the system under more realistic dynamics, we added time-varying, Gaussian white\nnoise such that, in an excitatory neuron free from network interactions, the potential would fluctuate\nwith standard deviation 0.33.\n\nFigure 6: (a) Normalized distance calculated according to Eq. 6 between the network activity and\neach of the attractors (targeted attractor: green line; others: orange lines) during a noisy recall\nepisode. (b) Trial-to-trial variability, expressed as the standard deviation of a neuron\u2019s activity across\nmultiple repetitions with random initial conditions. At time t = 0.5 s the network receives a pulse\nin the direction of one target attractor (\u00b5 = 2). Gray lines are for single neurons; the black line is\nan average over the population.\n\nFigure 6a shows the momentary distance $d^\\mu(t)$ of the network state\nfrom the attractor closest to the initial cue (green), and for all other attractors (orange), during a\nrecall trial. It is clear that the system revolves around the desired attractor, performing successful\nrecall despite the ongoing noise. In a second experiment, we ran many trials in which the initialization\nat time t = 0 was random, while the same spatially patterned stimulation \u2013 aligned onto a\nchosen attractor \u2013 was given to the network in each trial at time t = 0.5 s. 
Figure 6b shows the standard\ndeviation of the internal state of a neuron across trials, averaged across the neural population.\nFollowing stimulus onset, neurons are always pushed towards the target attractor, and this greatly\nreduces trial-by-trial variability, compared to the initial spontaneous regime in which each neuron\nwould fluctuate around any of the activity levels corresponding to its assigned attractors. Interestingly,\nsuch stimulus-induced variability reduction has been observed very broadly across sensory\nand motor cortical areas [21]. This extends previous work, e.g. [22] and [23], showing variability\nreduction in a multiple-attractor scenario with effectively binary patterns, to the case of patterns with\ngraded activities.\n\n4 Discussion\n\nWe have provided a proof of concept that model cortical networks of E and I neurons can embed\nmultiple analog memories as stable fixed points of their dynamics. Memories are stable in the face\nof ongoing noise and corruption of the recall cues. Neuronal activities do not saturate, and indeed,\nour single-neuron model did not explicitly incorporate an upper saturation mechanism: dynamic\nfeedback inhibition, precisely matched to the level of excitation incurred by each attractor, ensures\nthat each neuron can fire at a relatively low rate during recall. As a result, excitation and inhibition\nare tightly balanced.\nWe have used a rate-based formulation of the circuit dynamics, which raises the question of the\napplicability of our method to understanding spiking memory networks. Once the connectivity\nin the rate model is generated and optimized, it could still be used in a spiking model, provided\nthe gain function we have used here matches that of the single spiking neurons. 
In this respect,\nthe gain function we have used here is likely an appropriate choice: in physiological conditions,\ncortical neurons have input-output gain functions that are well approximated by a rectified power-law\nfunction over their entire dynamic range [24, 25, 26].\nAn important question for future research is how local synaptic learning rules can achieve the stabilization\nobjective that we have approached here from an optimal, algorithmic viewpoint. Inhibitory\nsynaptic plasticity is a promising candidate, as it has already been shown to enable self-regulation of\nthe spontaneous, baseline activity regime, and also to promote the stable storage of binary memory\npatterns [27]. More work is required in this direction.\nAcknowledgements. This work was supported by the Wellcome Trust (GH, ML), the European\nUnion Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 269921 (BrainScaleS)\n(DF, ML), and the Swiss National Science Foundation (GH).\n\nReferences\n[1] Hopfield J. Neural networks and physical systems with emergent collective computational abilities, Proceedings of the National Academy of Sciences 79:2554, 1982.\n[2] Hopfield J. Neurons with graded response have collective computational properties like those of two-state neurons, Proceedings of the National Academy of Sciences 81:3088, 1984.\n[3] Treves A. Graded-response neurons and information encodings in autoassociative memories, Phys. Rev. A 42:2418, 1990.\n[4] Treves A, Rolls ET. What determines the capacity of autoassociative memories in the brain?, Network: Computation in Neural Systems 2:371, 1991.\n[5] Battaglia FP, Treves A. Stable and rapid recurrent processing in realistic autoassociative memories, Neural Comput 10:431, 1998.\n[6] Lengyel M, Dayan P. 
Rate- and phase-coded autoassociative memory, In Advances in Neural Information Processing Systems 17, 769, Cambridge, MA, 2005. MIT Press.\n[7] Amit D, Brunel N. Dynamics of a recurrent network of spiking neurons before and following learning, Network: Computation in Neural Systems 8:373, 1997.\n[8] Latham P, Nirenberg S. Computing and stability in cortical networks, Neural Computation 16:1385, 2004.\n[9] Roudi Y, Latham PE. A balanced memory network, PLoS Computational Biology 3:e141, 2007.\n[10] Ben-Yishai R, et al. Theory of orientation tuning in visual cortex, Proc. Natl. Acad. Sci. USA 92:3844, 1995.\n[11] Goldberg JA, et al. Patterns of ongoing activity and the functional architecture of the primary visual cortex, Neuron 42:489, 2004.\n[12] Lengyel M, et al. Matching storage and recall: hippocampal spike timing\u2013dependent plasticity and phase response curves, Nature Neuroscience 8:1677, 2005.\n[13] Vanbiervliet J, et al. The smoothed spectral abscissa for robust stability optimization, SIAM Journal on Optimization 20:156, 2009.\n[14] Hennequin G, et al. Optimal control of transient dynamics in balanced networks supports generation of complex movements, Neuron 82:1394, 2014.\n[15] Ahmadian Y, et al. Analysis of the stabilized supralinear network, Neural Comput. 25:1994, 2013.\n[16] Anderson JS, et al. The contribution of noise to contrast invariance of orientation tuning in cat visual cortex, Science 290:1968, 2000.\n[17] Johnson SG. The NLopt nonlinear-optimization package, http://ab-initio.mit.edu/nlopt .\n[18] Roxin A, et al. On the distribution of firing rates in networks of cortical neurons, The Journal of Neuroscience 31:16217, 2011.\n[19] Song S, et al. Highly nonrandom features of synaptic connectivity in local cortical circuits, PLoS Biol 3:e68, 2005.\n[20] Lefort S, et al. 
The excitatory neuronal network of the C2 barrel column in mouse primary somatosensory cortex, Neuron 61:301, 2009.\n[21] Churchland MM, et al. Stimulus onset quenches neural variability: a widespread cortical phenomenon, Nat Neurosci 13:369, 2010.\n[22] Litwin-Kumar A, Doiron B. Slow dynamics and high variability in balanced cortical networks with clustered connections, Nat Neurosci 15:1498, 2012.\n[23] Deco G, Hugues E. Neural network mechanisms underlying stimulus driven variability reduction, PLoS Computational Biology 8:e1002395, 2012.\n[24] Priebe NJ, Ferster D. Direction selectivity of excitation and inhibition in simple cells of the cat primary visual cortex, Neuron 45:133, 2005.\n[25] Priebe NJ, Ferster D. Mechanisms underlying cross-orientation suppression in cat visual cortex, Nat Neurosci 9:552, 2006.\n[26] Finn IM, et al. The emergence of contrast-invariant orientation tuning in simple cells of cat visual cortex, Neuron 54:137, 2007.\n[27] Vogels TP, et al. Inhibitory plasticity balances excitation and inhibition in sensory pathways and memory networks, Science 334:1569, 2011.\n", "award": [], "sourceid": 1173, "authors": [{"given_name": "Dylan", "family_name": "Festa", "institution": "University of Cambridge"}, {"given_name": "Guillaume", "family_name": "Hennequin", "institution": "University of Cambridge"}, {"given_name": "Mate", "family_name": "Lengyel", "institution": "University of Cambridge"}]}