{"title": "Spike-based Learning Rules and Stabilization of Persistent Neural Activity", "book": "Advances in Neural Information Processing Systems", "page_first": 199, "page_last": 208, "abstract": null, "full_text": "Spike-based learning rules and stabilization of \n\npersistent neural activity \n\nXiaohui Xie and H. Sebastian Seung \n\nDept. of Brain & Cog. Sci., MIT, Cambridge, MA 02139 \n\n{xhxie, seung}@mit.edu \n\nAbstract \n\nWe analyze the conditions under which synaptic learning rules based \non action potential timing can be approximated by learning rules based \non firing rates. In particular, we consider a form of plasticity in which \nsynapses depress when a presynaptic spike is followed by a postsynaptic \nspike, and potentiate with the opposite temporal ordering. Such differen(cid:173)\ntial anti-Hebbian plasticity can be approximated under certain conditions \nby a learning rule that depends on the time derivative of the postsynaptic \nfiring rate. Such a learning rule acts to stabilize persistent neural activity \npatterns in recurrent neural networks. \n\n1 \n\nINTRODUCTION \n\n\u00b0 \n\n1000 \n\ntime (ms) \n\n2000 \n\nexperiments \n\no \n- t \n\npre \n\nt \npost \n\nA o~i =~=::====: \nB 0L - . i _:3/_-----' \n\n~re 11111111111 111111111111111111 \npost 11111111111 11111111 \u2022 \u2022 11. \nt .:oLl ____ \\:;J \n__ ~ \n\nRecent \nhave \ndemonstrated types of synaptic \nplasticity that depend on the \ntemporal ordering of presynap(cid:173)\ntic and postsynaptic spiking. At \ncortical [ I] and hippocampal[2] \nsynapses, \nlong-term potenti(cid:173)\nation is induced by repeated \npairing of a presynaptic spike \nand a succeeding postsynaptic \nspike, while long-term depres(cid:173)\nsion results when the order \nis reversed. The dependence \nin synaptic \nof \nthe difference \nstrength on \nl:!..t = \ntpre between \npostsynaptic and presynaptic \nspike times has been measured \nquantitatively. 
\nThis pairing \nfunction, sketched in Figure \nlA, has positive and negative lobes correspond to potentiation and depression. and a \nwidth of tens of milliseconds. We will refer to synaptic plasticity associated with this \npairing function as differential Hebbian plasticity-Hebbian because the conditions for \n\nFigure I: (A) Pairing function for differential Heb(cid:173)\nbian learning. The change in synaptic strength is plot(cid:173)\nted versus the time difference between postsynaptic \nand presynaptic spikes. (B) Pairing function for dif(cid:173)\nferential anti-Hebbian learning. (C) Differential anti(cid:173)\nHebbian learning is driven by changes in firing rates. \nThe synaptic learning rule of Eq. (l) is applied to two \nPoisson spike trains. The synaptic strength remains \nroughly constant in time, except when the postsynap(cid:173)\ntic rate changes. \n\nthe change \n\ntpost -\n\n\f200 \n\nX Xie and H. S. Seung \n\npotentiation are as predicted by Hebb[3], and differential because it is driven by the \ndifference between the opposing processes of potentiation and depression. \n\nThe pairing function of Figure IA is not characteristic of all synapses. For example, an \nopposite temporal dependence has been observed at electrosensory lobe synapses of elec(cid:173)\ntric fish[4]. As shown in Figure IB, these synapses depress when a presynaptic spike is \nfollowed by a postsynaptic one, and potentiate when the order is reversed. We will refer to \nthis as differential anti-Hebbian plasticity. \n\nAccording to these experiments, the maximum ranges of the differential Hebbian and anti(cid:173)\nHebbian pairing functions are roughly 20 and 40 ms, respectively. This is fairly short, and \nseems more compatible with descriptions of neural activity based on spike timing rather \nthan instantaneous firing rates[5, 6]. In fact, we will show that there are some conditions \nunder which spike-based learning rules can be approximated by rate-based learning rules. 
Others have also studied the relationship between spike-based and rate-based learning rules[7, 8].

The pairing functions of Figures 1A and 1B lead to rate-based learning rules like those traditionally used in neural networks, except that they depend on temporal derivatives of firing rates as well as firing rates themselves. We will argue that the differential anti-Hebbian learning rule of Figure 1B could be a general mechanism for tuning the strength of positive feedback in networks that maintain a short-term memory of an analog variable in persistent neural activity. A number of recurrent network models have been proposed to explain memory-related neural activity in motor[9] and prefrontal[10] cortical areas, as well as the head direction system[11] and oculomotor integrator[12, 13, 14]. All of these models require precise tuning of synaptic strengths in order to maintain continuously variable levels of persistent activity. As a simple illustration of tuning by differential anti-Hebbian learning, we study a model of persistent activity maintained by an integrate-and-fire neuron with an excitatory autapse.

2 SPIKE-BASED LEARNING RULE

Pairing functions like those of Figure 1 have been measured using repeated pairing of a single presynaptic spike with a single postsynaptic spike. Quantitative measurements of synaptic changes due to more complex patterns of spiking activity have not yet been done. We will assume a simple model in which the synaptic change due to arbitrary spike trains is the sum of contributions from all possible pairings of presynaptic with postsynaptic spikes. The model is unlikely to be an exact description of real synapses, but could turn out to be approximately valid.

We will write the spike train of the ith neuron as a series of Dirac delta functions, S_i(t) = Σ_n δ(t − T_i^n), where T_i^n is the nth spike time of the ith neuron.
The synaptic weight from neuron j to i at time t is denoted by W_ij(t). Then the change in synaptic weight induced by presynaptic spikes occurring in the time interval [0, T] is modeled as

W_ij(T + λ) − W_ij(λ) = ∫_0^T dt_j ∫_{−∞}^{+∞} dt_i f(t_i − t_j) S_i(t_i) S_j(t_j)    (1)

Each presynaptic spike is paired with all postsynaptic spikes produced before and after. For each pairing, the synaptic weight is changed by an amount depending on the pairing function f. The pairing function is assumed to be nonzero inside the interval [−τ, τ], and zero outside. We will refer to τ as the pairing range.

According to our model, each presynaptic spike results in induction of plasticity only after a latency λ. Accordingly, the arguments T + λ and λ of W_ij on the left hand side of the equation are shifted relative to the limits T and 0 of the integral on the right hand side. We will assume that the latency λ is greater than the pairing range τ, so that W_ij at any time is only influenced by presynaptic and postsynaptic spikes that happened before that time, and therefore the learning rule is causal.

3 RELATION TO RATE-BASED LEARNING RULES

The learning rule of Eq. (1) is driven by correlations between presynaptic and postsynaptic activities. This dependence can be made explicit by making the change of variables u = t_i − t_j in Eq. (1), which yields

W_ij(T + λ) − W_ij(λ) = ∫_{−τ}^{τ} du f(u) C_ij(u)    (2)

where we have defined the cross-correlation

C_ij(u) = ∫_0^T dt S_i(t + u) S_j(t)    (3)

and made use of the fact that f vanishes outside the interval [−τ, τ]. Our immediate goal is to relate Eq. (2) to learning rules that are based on the cross-correlation between firing rates,

C_ij^rate(u) = ∫_0^T dt ν_i(t + u) ν_j(t)    (4)

There are a number of ways of defining instantaneous firing rates.
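In a simulation, the delta functions reduce the double integral of Eq. (1) to a double sum over spike pairs. The following minimal sketch (our illustration, not code from the paper) implements this sum, assuming the antisymmetric pairing function f(t) = −A sin(πt/τ) that is used later in Section 5:

```python
import numpy as np

def pairing_function(dt, A=1.5e-4, tau=0.1):
    """Differential anti-Hebbian pairing function f(t_post - t_pre),
    nonzero only inside the pairing range [-tau, tau] (times in seconds)."""
    return np.where(np.abs(dt) <= tau, -A * np.sin(np.pi * dt / tau), 0.0)

def delta_w(post_times, pre_times, f=pairing_function):
    """Change in synaptic weight from Eq. (1): every presynaptic spike is
    paired with every postsynaptic spike, and the contributions
    f(t_i - t_j) of all pairs are summed."""
    dt = np.subtract.outer(np.asarray(post_times), np.asarray(pre_times))
    return float(f(dt).sum())
```

For example, a presynaptic spike at t = 0 followed by a postsynaptic spike at t = 50 ms contributes f(0.05) = −A sin(π/2) = −A, i.e. depression, while the reversed ordering contributes the same amount of potentiation.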
Sometimes they are \ncomputed by averaging over repeated presentations of a stimulus. In other situations, they \nare defined by temporal filtering of spike trains. The following discussion is general, and \nshould apply to these and other definitions of firing rates. \n\nThe \"rate correlation\" is commonly subtracted from the total correlation to obtain the \"spike \ncorrelation\" C:rke = Cij - Cijate. To derive a rate-based approximation to the learning \nrule (2), we rewrite it as \n\nWij(T + >.) - Wij(>') = iTT du f(u)Cijate(u) + iTT du f(u)C:r ke (u) \n\n(5) \n\nand simply neglect the second term. Shortly we will discuss the conditions under which \nthis is a good approximation. But first we derive another form for the first term by applying \nthe approximation Vi(t + u) ~ Vi(t) + UVi(t) to obtain \n\nj T duf(u)Crre(u) ~ iT dt[fiovi(t) + 131Vi(t)]VJ (t) \n\n-T \n\n0 \n\nwhere we define \n\n(6) \n\n(7) \n\nThis approximation is good when firing rates vary slowly compared to the pairing range \nT . The learning rule depends on the postsynaptic rate through fio Vi + 131 Vi . When the \nfirst term dominates the second, then the learning rule is the conventional one based on \ncorrelations between firing rates, and the sign of fio determines whether the rule is Hebbian \nor anti-Hebbian. \nIn the remainder of the paper, we will discuss the more novel case where 130 = O. This \nholds for the pairing functions shown in Figures lA and IB, which have positive and neg(cid:173)\native lobes with areas that exactly cancel in the definition of 130. Then the dependence on \n\n\f202 \n\nX Xie and H. S. Seung \n\npostsynaptic activity is purely on the time derivative of the firing rate. Differential Hebbian \nlearning corresponds to /31 > 0 (Figure IA), while differential anti-Hebbian learning leads \nto /31 < 0 (Figure IB). To summarize the /30 = 0 case, the synaptic changes due to rate \ncorrelations are approximated by \n\nWij ex: -ViVj \n\n(diff. 
anti-Hebbian) \n\n(8) \n\nfor slowly varying rates. These formulas imply that a constant postsynaptic firing rate \ncauses no net change in synaptic strength. Instead, changes in rate are required to induce \nsynaptic plasticity. \n\nTo illustrate this point, Figure lC shows the result of applying differential anti-Hebbian \nlearning to two spike trains. The presynaptic spike train was generated by a 50 Hz Poisson \nprocess, while the postsynaptic spike train was generated by an inhomogeneous Poisson \nprocess with rate that shifted from 50 Hz to 200 Hz at 1 sec. Before and after the shift, \nthe synaptic strength fluctuates but remains roughly constant. But the upward shift in firing \nrate causes a downward shift in synaptic strength, in accord with the sign of the differential \nanti-Hebbian rule in Eq. (8). \n\nThe rate-based approximation works well for this example, because the second term of Eq. \n(5) is not so important. Let us return to the issue of the general conditions under which \n\nthis term can be neglected. With Poisson spike trains, the spike correlations C: Pike (u) are \nzero in the limit T -7 00, but for finite T they fluctuate about zero. The integr~l over u in \nthe second term of (5) dampens these fluctuations. The amount of dampening depends on \nthe pairing range T, which sets the limits of integration. In Figure 1 C we used a relatively \nlong pairing range of 100 ms, which made the fluctuations small even for small T. On the \nother hand, if T were short, the fluctuations would be small only for large T_ Averaging \nover large T is relevant when the amplitUde of f is small, so that the rate of learning is \nslow. In this case, it takes a long time for significant synaptic changes to accumulate, so \nthat plasticity is effectively driven by integrating over long time periods T in Eq. (l). \nIn the brain, nonvanishing spike correlations are sometimes observed even in the T -7 00 \nlimit, unlike with Poisson spike trains. 
These correlations are often roughly symmetric \nabout zero, in which case they should produce little plasticity if the pairing functions are \nantisymmetric as in Figures lA and lB. On the other hand, if the spike correlations are \nasymmetric, they could lead to substantial effects[6]. \n\n4 EFFECTS ON RECURRENT NETWORK DYNAMICS \n\nThe learning rules of Eq. (8) depend on both presynaptic and postsynaptic rates, like learn(cid:173)\ning rules conventionally used in neural networks. They have the special feature that they \ndepend on time derivatives, which has computational consequences for recurrent neural \nnetworks of the form \n\nXi + Xi = L Wiju(Xj) + bi \n\n(9) \n\nj \n\nSuch classical neural network equations can be derived from more biophysically realistic \nmodels using the method of averaging[ 15] or a mean field approximation[ 16]. The firing \nrate of neuron j is conventionally identified with Vj = u(Xj). \nThe cost function E( {Xi}; {Wij}) = ~ Li v; quantifies the amount of drift in firing rate at \nthe point Xl , ... , X N in the state space of the network. If we consider Vi to be a function of \nXi and Wij defined by (9), then the gradient ofthe cost function with respect to Wij is given \nby BE / BWij = U' (Xi)ViVj. Assuming that U is a monotonically increasing function so that \nu' (xd > 0, it follows that the differential Hebbian update of (8) increases the cost function, \n\n\fSpike-based Learning and Stabilization of Persistent Neural Activity \n\n203 \n\nand hence increases the magnitude of the drift velocity. In contrast, the differential anti(cid:173)\nHebbian update decreases the drift velocity. This suggests that the differential anti-Hebbian \nupdate could be useful for creating fixed points of the network dynamics (9). \n\n5 PERSISTENT ACTIVITY IN A SPIKING AUTAPSE MODEL \n\nThe preceding arguments about drift velocity were based on approximate rate-based de(cid:173)\nscriptions of learning and network dynamics. 
It is important to implement spike-based learning in a spiking network dynamics, to check that our approximations are valid. Therefore we have numerically simulated the simple recurrent circuit of integrate-and-fire neurons shown in Figure 2. The core of the circuit is the "memory neuron," which makes an excitatory autapse onto itself. It also receives synaptic input from three input neurons: a tonic neuron, an excitatory burst neuron, and an inhibitory burst neuron. It is known that this circuit can store a short-term memory of an analog variable in persistent activity, if the strengths of the autapse and tonic synapse are precisely tuned[17]. Here we show that this tuning can be accomplished by the spike-based learning rule of Eq. (1), with a differential anti-Hebbian pairing function like that of Figure 1B.

Figure 2: Circuit diagram for the autapse model.

The memory neuron is described by the equations

C_m dV/dt = −g_L (V − V_L) − g_E (V − V_E) − g_I (V − V_I)    (10)

τ_syn dr/dt + r = α_s Σ_n δ(t − T^n)    (11)

where V is the membrane potential. When V reaches V_thres, a spike is considered to have occurred, and V is reset to V_reset. Each spike at time T^n causes a jump in the synaptic activation r of size α_s/τ_syn, after which r decays exponentially with time constant τ_syn until the next spike.

The synaptic conductances of the memory neuron are given by

g_E = W r + W_0 r_0 + W_+ r_+,    g_I = W_− r_−    (12)

The term W r is recurrent excitation from the autapse, where W is the strength of the autapse. The synaptic activations r_0, r_+, and r_− of the tonic, excitatory burst, and inhibitory burst neurons are governed by equations like (10) and (11), with a few differences. These neurons have no synaptic input; their firing patterns are instead determined by applied currents I_app,0, I_app,+ and I_app,−.
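As a sanity check on the integrate-and-fire dynamics of Eqs. (10) and (11), the sketch below integrates a single neuron of this type driven only by a constant applied current (i.e. an input neuron, with the synaptic conductances omitted), using forward Euler and the parameter values quoted in the Figure 3 caption. This is our illustrative reimplementation, not the paper's simulation code:

```python
import numpy as np

# Parameters from the Figure 3 caption (units: nF, uS, mV, nA, ms)
Cm, gL, VL = 1.0, 0.025, -70.0
Vthres, Vreset = -52.0, -59.0
tau_syn, alpha_s = 100.0, 1.0
Iapp = 0.5203                      # constant applied current of the tonic neuron

dt = 0.01                          # Euler time step (ms)
V, r = VL, 0.0
spike_times = []
for step in range(int(1000.0 / dt)):        # simulate 1 second
    V += dt * (-gL * (V - VL) + Iapp) / Cm  # Eq. (10), synaptic conductances omitted
    r += dt * (-r / tau_syn)                # Eq. (11) between spikes
    if V >= Vthres:                         # threshold crossing: emit a spike
        V = Vreset
        r += alpha_s / tau_syn              # jump in synaptic activation
        spike_times.append(step * dt)

rate = len(spike_times)            # spikes in 1 s, i.e. mean rate in Hz
```

With these values the membrane time constant is C_m/g_L = 40 ms and the steady-state voltage V_L + I_app/g_L ≈ −49.2 mV lies above threshold, giving an interspike interval of about 50 ms, consistent with the roughly 20 Hz tonic firing described below.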
The tonic neuron has a constant applied current, which makes it fire repetitively at roughly 20 Hz (Figure 3). For the excitatory and inhibitory burst neurons the applied current is normally zero, except for brief 100 ms current pulses that cause bursts of action potentials.

As shown in Figure 3, if the synaptic strengths W and W_0 are arbitrarily set before learning, the burst neurons cause only transient changes in the firing rate of the memory neuron. After applying the spike-based learning rule (1) to tune both W and W_0, the memory neuron is able to maintain persistent activity.

Figure 3: Untuned and tuned autapse activity. The middle three traces are the membrane potentials of the three input neurons in Figure 2 (spikes are drawn at the reset times of the integrate-and-fire neurons). Before learning, the activity of the memory neuron is not persistent, as shown in the top trace. After the spike-based learning rule (1) is applied to the synaptic weights W and W_0, the burst inputs cause persistent changes in activity. C_m = 1 nF, g_L = 0.025 µS, V_L = −70 mV, V_E = 0 mV, V_I = −70 mV, V_thres = −52 mV, V_reset = −59 mV, α_s = 1, τ_syn = 100 ms, I_app,0 = 0.5203 nA, I_app,± = 0 or 0.95 nA, τ_syn,0 = 100 ms, τ_syn,+ = τ_syn,− = 5 ms, W_+ = 0.1, W_− = 0.05.

During the interburst intervals (from λ after one burst until λ before the next), we made synaptic changes using the differential anti-Hebbian pairing function f(t) = −A sin(πt/τ) for spike time differences in the range [−τ, τ], with A = 1.5 × 10⁻⁴ and τ = λ = 120 ms.
The resulting increase in persistence time can be seen in Figure 4A, along with the values of the synaptic weights versus time.

To quantify the performance of the system at maintaining persistent activity, we determined the relationship between dν/dt and ν using a long sequence of interburst intervals, where ν was defined as the reciprocal of the interspike interval. If W and W_0 are fixed at optimally tuned values, there is still a residual drift, as shown in Figure 4B. But if these parameters are allowed to adapt continuously, even after good tuning has been achieved, then the residual drift is even smaller in magnitude. This is because the learning rule tweaks the synaptic weights during each interburst interval, reducing the drift for that particular firing rate.

Autapse learning is driven by the autocorrelation of the spike train, rather than a cross-correlation. The peak in the autocorrelogram at zero lag has no effect, since the pairing function is zero at the origin. Since the autocorrelation is zero for small time lags, we used a fairly large pairing range in our simulations. In a recurrent network of many neurons, a shorter pairing range would suffice, as the cross-correlation does not vanish near zero.

6 DISCUSSION

We have shown that differential anti-Hebbian learning can tune a recurrent circuit to maintain persistent neural activity. This behavior can be understood by reducing the spike-based learning rule (1) to the rate-based learning rules of Eqs. (6) and (8). The rate-based approximations are good if two conditions are satisfied. First, the pairing range must be large, or the rate of learning must be slow. Second, spike synchrony must be weak, or have little effect on learning due to the shape of the pairing function.
The differential anti-Hebbian pairing function results in a learning rule that uses −ν̇_i as a negative feedback signal to reduce the amount of drift in firing rate, as illustrated by our simulations of an integrate-and-fire neuron with an excitatory autapse. More generally, the learning rule could be relevant for tuning the strength of positive feedback in networks that maintain a short-term memory of an analog variable in persistent neural activity.

Figure 4: Tuning the autapse. (A) The persistence time of activity increases as the weights W and W_0 are tuned. Each transition is driven by pseudorandom bursts of input. (B) Systematic relationship between the drift dν/dt in firing rate and ν, as measured from a long sequence of interburst intervals. If the weights are continuously fine-tuned ('*'), the drift is less than with fixed well-tuned weights ('o').

For example, the learning rule could be used to improve the robustness of the oculomotor integrator[12, 13, 14] and head direction system[11] to mistuning of parameters. In deriving the differential forms of the learning rules in (8), we assumed that the areas under the positive and negative lobes of the pairing function are equal, so that the integral defining β_0 vanishes. In reality, this cancellation might not be exact. Then the ratio of β_1 and β_0 would limit the persistence time that can be achieved by the learning rule.
Both the oculomotor integrator and the head direction system are also able to integrate vestibular inputs to produce changes in activity patterns. The problem of finding generalizations of the present learning rules that train networks to integrate is still open.

References

[1] H. Markram, J. Lubke, M. Frotscher, and B. Sakmann. Science, 275(5297):213-215, 1997.
[2] G. Q. Bi and M. M. Poo. J. Neurosci., 18(24):10464-10472, 1998.
[3] D. O. Hebb. Organization of Behavior. Wiley, New York, 1949.
[4] C. C. Bell, V. Z. Han, Y. Sugawara, and K. Grant. Nature, 387(6630):278-281, 1997.
[5] W. Gerstner, R. Kempter, J. L. van Hemmen, and H. Wagner. Nature, 383(6595):76-81, 1996.
[6] L. F. Abbott and S. Song. Adv. Neural Info. Proc. Syst., 11, 1999.
[7] P. D. Roberts. J. Comput. Neurosci., 7:235-246, 1999.
[8] R. Kempter, W. Gerstner, and J. L. van Hemmen. Phys. Rev. E, 59(4):4498-4514, 1999.
[9] A. P. Georgopoulos, M. Taira, and A. Lukashin. Science, 260:47-52, 1993.
[10] M. Camperi and X. J. Wang. J. Comput. Neurosci., 5(4):383-405, 1998.
[11] K. Zhang. J. Neurosci., 16:2112-2126, 1996.
[12] S. C. Cannon, D. A. Robinson, and S. Shamma. Biol. Cybern., 49:127-136, 1983.
[13] H. S. Seung. Proc. Natl. Acad. Sci. USA, 93:13339-13344, 1996.
[14] H. S. Seung, D. D. Lee, B. Y. Reis, and D. W. Tank. Neuron, 2000.
[15] B. Ermentrout. Neural Comput., 6:679-695, 1994.
[16] O. Shriki, D. Hansel, and H. Sompolinsky. Soc. Neurosci. Abstr., 24:143, 1998.
[17] H. S. Seung, D. D. Lee, B. Y. Reis, and D. W. Tank. J. Comput. Neurosci., 2000.
", "award": [], "sourceid": 1658, "authors": [{"given_name": "Xiaohui", "family_name": "Xie", "institution": null}, {"given_name": "H. Sebastian", "family_name": "Seung", "institution": null}]}