{"title": "Noisy Spiking Neurons with Temporal Coding have more Computational Power than Sigmoidal Neurons", "book": "Advances in Neural Information Processing Systems", "page_first": 211, "page_last": 217, "abstract": null, "full_text": "Noisy Spiking Neurons with Temporal Coding have more Computational Power than Sigmoidal Neurons \n\nWolfgang Maass \n\nInstitute for Theoretical Computer Science \n\nTechnische Universitaet Graz, Klosterwiesgasse 32/2 \nA-8010 Graz, Austria, e-mail: maass@igi.tu-graz.ac.at \n\nAbstract \n\nWe exhibit a novel way of simulating sigmoidal neural nets by networks \nof noisy spiking neurons in temporal coding. Furthermore \nit is shown that networks of noisy spiking neurons with temporal \ncoding have a strictly larger computational power than sigmoidal \nneural nets with the same number of units. \n\n1 Introduction and Definitions \n\nWe consider a formal model SNN for a spiking neuron network that is basically \na reformulation of the spike response model (and of the leaky integrate and fire \nmodel) without using $\\delta$-functions (see [Maass, 1996a] or [Maass, 1996b] for further \nbackground). \n\nAn SNN consists of a finite set $V$ of spiking neurons, a set $E \\subseteq V \\times V$ of synapses, a \nweight $w_{u,v} \\geq 0$ and a response function $\\varepsilon_{u,v} : \\mathbf{R}^+ \\to \\mathbf{R}$ for each synapse $\\langle u,v \\rangle \\in E$ \n(where $\\mathbf{R}^+ := \\{x \\in \\mathbf{R} : x \\geq 0\\}$), and a threshold function $\\Theta_v : \\mathbf{R}^+ \\to \\mathbf{R}^+$ for each \nneuron $v \\in V$. \n\nIf $F_u \\subseteq \\mathbf{R}^+$ is the set of firing times of a neuron $u$, then the potential at the trigger \nzone of neuron $v$ at time $t$ is given by \n\n$$P_v(t) := \\sum_{u : \\langle u,v \\rangle \\in E} \\; \\sum_{s \\in F_u : s < t} w_{u,v} \\cdot \\varepsilon_{u,v}(t - s).$$ \n\n[...] $\\Delta > 0$ such that each response function $\\varepsilon_{u,v}(x)$ is of the \nform $\\alpha_{u,v} \\cdot (x - d_{u,v})$ with $\\alpha_{u,v} \\in \\{-1, 1\\}$ for $x \\in [d_{u,v}, d_{u,v} + \\Delta]$, and $\\varepsilon_{u,v}(x) = 0$ \nfor $x \\in [0, d_{u,v}]$. \n\nConsider a spiking neuron $v$ that receives postsynaptic potentials from $n$ presynaptic \nneurons $a_1$, ... 
, $a_n$. For simplicity we assume that interspike intervals are so large \nthat the firing time $t_v$ of neuron $v$ depends just on a single firing time $t_{a_i}$ of each \nneuron $a_i$, and $\\Theta_v$ has returned to its \"resting value\" $\\Theta_v(0)$ before $v$ fires again. \nThen if the next firing of $v$ occurs at a time when the postsynaptic potentials \ndescribed by $w_{a_i,v} \\cdot \\varepsilon_{a_i,v}(t - t_{a_i})$ are all in their initial linear phase, its firing time \n$t_v$ is determined in the noise-free model for $w_i := w_{a_i,v} \\cdot \\alpha_{a_i,v}$ by the equation \n$\\sum_{i=1}^{n} w_i \\cdot (t_v - t_{a_i} - d_{a_i,v}) = \\Theta_v(0)$, or equivalently \n\n$$t_v = \\frac{\\Theta_v(0)}{\\sum_{i=1}^{n} w_i} + \\frac{\\sum_{i=1}^{n} w_i \\cdot (t_{a_i} + d_{a_i,v})}{\\sum_{i=1}^{n} w_i}. \\qquad (1)$$ \n\nThis equation reveals the somewhat surprising fact that (for a certain range of \ntheir parameters) spiking neurons can compute a weighted sum in terms of firing \ntimes, i.e. temporal coding. One should also note that in the case where all delays \n$d_{a_i,v}$ have the same value, the \"weights\" $w_i$ of this weighted sum are encoded in \nthe \"strengths\" $w_{a_i,v}$ of the synapses and their \"sign\" $\\alpha_{a_i,v}$, as in the \"firing rate \ninterpretation\". Finally, according to (1), the coefficients of the presynaptic firing \ntimes $t_{a_i}$ are automatically normalized, which appears to be of biological interest. \n\nIn the simplest scheme for temporal coding (which is closely related to that in \n[Hopfield, 1995]) an analog variable $x \\in [0,1]$ is encoded by the firing time $T - \\gamma \\cdot x$ \nof a neuron, where $T$ is assumed to be independent of $x$ (in a biological context $T$ \nmight be time-locked to the onset of a stimulus, or to some oscillation) and $\\gamma$ is \nsome constant that is determined in the proof of Theorem 2.1 (e.g. $\\gamma = \\Delta/2$ in the \nnoise-free case). In contrast to [Hopfield, 1995] we assume that both the inputs and \nthe outputs of computations are encoded in this fashion. This has the advantage \nthat one can compose computational modules. \n\nWe will first focus in Theorem 2.1 on the simulation of sigmoidal neural nets that \nemploy the piecewise linear \"linear saturated\" activation function $\\pi : \\mathbf{R} \\to [0,1]$ \ndefined by $\\pi(y) = 0$ if $y < 0$, $\\pi(y) = y$ if $0 \\leq y \\leq 1$, and $\\pi(y) = 1$ if $y > 1$. \nTheorem 3.1 in the next section will imply that one can also simulate with spiking \nneurons sigmoidal neural nets that employ arbitrary continuous activation \nfunctions. Apart from the previously mentioned assumptions we will assume for \nthe proofs of Theorems 2.1 and 3.1 that any EPSP satisfies $\\varepsilon_{u,v}(x) = 0$ for all sufficiently \nlarge $x$, and $\\varepsilon_{u,v}(x) \\geq \\varepsilon_{u,v}(d_{u,v} + \\Delta)$ for all $x \\in [d_{u,v} + \\Delta, d_{u,v} + \\Delta + \\gamma]$. \nWe assume that each IPSP is continuous, and has value 0 except for some interval \nof $\\mathbf{R}$. Furthermore we assume for each EPSP and IPSP that $|\\varepsilon_{u,v}(x)|$ grows at \nmost linearly during the interval $[d_{u,v} + \\Delta, d_{u,v} + \\Delta + \\gamma]$. In addition we assume \nthat $\\Theta_v(x) = \\Theta_v(0)$ for sufficiently large $x$, and that $\\Theta_v(x)$ is sufficiently large for \n$0 < x$ [...] \n\nTheorem 2.1 For any given $\\varepsilon, \\delta > 0$ one can simulate any given feedforward sigmoidal \nneural net $N$ with activation function $\\pi$ by a network $N_{N,\\varepsilon,\\delta}$ of noisy spiking \nneurons in temporal coding. More precisely, for any network input $x_1, \\ldots, x_m \\in [0,1]$ \nthe output of $N_{N,\\varepsilon,\\delta}$ differs with probability $\\geq 1 - \\delta$ by at most $\\varepsilon$ from that \nof $N$. Furthermore the computation time of $N_{N,\\varepsilon,\\delta}$ depends neither on the number \nof gates in $N$ nor on the parameters $\\varepsilon, \\delta$, but only on the number of layers of the \nsigmoidal neural network $N$. \n\nWe refer to [Maass, 1997] for details of the somewhat complicated proof. One \nemploys the mechanism described by (1) to simulate through the firing time of a \nspiking neuron $v$ a sigmoidal gate with activation function $\\pi$ for those gate-inputs \nwhere $\\pi$ operates in its linearly rising range. 
With the help of an auxiliary spiking \nneuron that fires at time $T$ one can avoid the automatic \"normalization\" of the \nweights $w_i$ that is provided by (1), and thereby compute a weighted sum with \narbitrary given weights. In order to simulate in temporal coding the behaviour of \nthe gate in the input range where $\\pi$ is \"saturated\" (i.e. constant), it suffices to \nemploy some auxiliary spiking neurons which make sure that $v$ fires exactly once \nduring the relevant time window (and not shortly before that). \n\nSince inputs and outputs of the resulting modules for each single gate of $N$ are all \ngiven in temporal coding, one can compose these modules to simulate the multilayer \nsigmoidal neural net $N$. With a bit of additional work one can ensure that \nthis construction also works with noisy spiking neurons. \n\u2022 \n\n3 Universal Approximation Property of Networks of Noisy Spiking Neurons with Temporal Coding \n\nIt is known [Leshno et al., 1993] that feedforward sigmoidal neural nets whose gates \nemploy the activation function $\\pi$ can approximate with a single hidden layer for \nany $n, k \\in \\mathbf{N}$ any given continuous function $F : [0,1]^n \\to [0,1]^k$ within any $\\varepsilon > 0$ \nwith regard to the $L_\\infty$-norm (i.e. uniform convergence). Hence we can derive the \nfollowing result from Theorem 2.1: \n\nTheorem 3.1 Any given continuous function $F : [0,1]^n \\to [0,1]^k$ can be approximated \nwithin any given $\\varepsilon > 0$ with arbitrarily high reliability in temporal coding by \na network of noisy spiking neurons (SNN) with a single hidden layer (and hence \nwithin 15 ms for biologically realistic values of their time-constants). \n\u2022 \n\nBecause of its generality this Theorem implies the same result also for more general \nschemes of coding analog variables by the firing times of neurons, besides the particular \none that we have considered so far. 
In fact it implies that the same result \nholds for any other coding scheme $C$ that is \"continuously related\" to the previously \nconsidered one in the sense that the transformation between firing times that \nencode an analog variable $x$ in the coding scheme considered here and in the coding \nscheme $C$ can be described by uniformly continuous functions in both directions. \n\n4 Spiking Neurons have more Computational Power than Sigmoidal Neurons \n\nWe consider the \"element distinctness function\" $ED_n : (\\mathbf{R}^+)^n \\to \\{0,1\\}$ defined \nby \n\n$$ED_n(s_1, \\ldots, s_n) = \\begin{cases} 1, & \\text{if } s_i = s_j \\text{ for some } i \\neq j \\\\ 0, & \\text{if } |s_i - s_j| \\geq 1 \\text{ for all } i, j \\text{ with } i \\neq j \\\\ \\text{arbitrary}, & \\text{else.} \\end{cases}$$ \n\nIf one encodes the value of input variable $s_i$ by a firing of input neuron $a_i$ at time \n$T_{in} - c \\cdot s_i$, then for sufficiently large values of the constant $c > 0$ a single noisy \nspiking neuron $v$ can compute $ED_n$ with arbitrarily high reliability. This holds for \nany reasonable type of response functions, e.g. the ones shown in Fig. 1. The binary \noutput of this computation is assumed to be encoded by the firing/non-firing of $v$. \nHair-trigger situations are avoided since no assumptions have to be made about the \nfiring or non-firing of $v$ if EPSP's arrive with a temporal distance between 0 and $c$. \n\nOn the other hand the following result shows that a fairly large sigmoidal neural \nnet is needed to compute the same function. Its proof provides the first application \nfor Sontag's recent results about a new type of \"dimension\" $d$ of a neural network \n$N$, where $d$ is chosen maximal so that every subset of $d$ inputs is shattered by $N$. \nFurthermore it expands a method due to [Koiran, 1995] for using the VC-dimension \nto prove lower bounds on network size. \n\nTheorem 4.1 Any sigmoidal neural net $N$ that computes $ED_n$ has at least $\\frac{n-4}{2} - 1$ \nhidden units. \n\nProof: Let $N$ be an arbitrary sigmoidal neural net with $k$ gates that computes \n$ED_n$. Consider any set $S \\subseteq \\mathbf{R}^+$ of size $n - 1$. 
Let $\\lambda > 0$ be sufficiently large so that \nthe numbers in $\\lambda \\cdot S$ have pairwise distance $\\geq 2$. Let $A$ be a set of $n - 1$ numbers \n$> \\max(\\lambda \\cdot S) + 2$ with pairwise distance $\\geq 2$. \nBy assumption $N$ can decide for $n$ arbitrary inputs from $\\lambda \\cdot S \\cup A$ whether they \nare all different. Let $N_\\lambda$ be a variation of $N$ where all weights on edges from the \nfirst input variable are multiplied with $\\lambda$. Then $N_\\lambda$ can compute any function from \n$S$ into $\\{0, 1\\}$ after one has assigned a suitable fixed set of $n - 1$ pairwise different \nnumbers from $\\lambda \\cdot S \\cup A$ to the last $n - 1$ input variables. \nThus if one considers as programmable parameters of $N$ the factor $\\lambda$ in the weights \non edges from the first input variable and the $\\leq k$ thresholds of gates that are \nconnected to some of the other $n - 1$ input variables, then $N$ shatters $S$ with these \n$k + 1$ programmable parameters. \nSince $S \\subseteq \\mathbf{R}^+$ of size $n - 1$ was chosen arbitrarily, we can now apply the result from \n[Sontag, 1996], which yields an upper bound of $2w + 1$ for the maximal number $d$ \nsuch that every set of $d$ different inputs can be shattered by a sigmoidal neural net \nwith $w$ programmable parameters (note that this parameter $d$ is in general much \nsmaller than the VC-dimension of the neural net). For $w := k + 1$ this implies in \nour case that $n - 1 \\leq 2(k + 1) + 1$, hence $k \\geq (n - 4)/2$. Thus $N$ has at least \n$(n - 4)/2$ computation nodes, and therefore at least $(n - 4)/2 - 1$ hidden units. One \nshould point out that due to the generality of Sontag's result this lower bound is \nvalid for all common activation functions of sigmoidal gates, and even if $N$ employs \nHeaviside gates besides sigmoidal gates. \n\u2022 \n\nTheorem 4.1 yields a lower bound of 4997 for the number of hidden units in any \nsigmoidal neural net that computes $ED_n$ for $n = 10\\,000$, where 10 000 is a common \nestimate for the number of inputs (i.e. synapses) of a biological neuron. 
\n\nFinally we would like to point out that to the best of our knowledge Theorem 4.1 \nprovides the largest known lower bound for any concrete function with $n$ inputs on \na sigmoidal neural net. The largest previously known lower bound for sigmoidal \nneural nets was $\\Omega(n^{1/4})$, due to [Koiran, 1995]. \n\n5 Conclusions \n\nTheorems 2.1 and 3.1 provide a model for analog computations in networks of spiking \nneurons that is consistent with experimental results on the maximal computation \nspeed of biological neural systems. As explained after Theorem 3.1, this result holds \nfor a large variety of possible schemes for encoding analog variables by firing times. \n\nThese theoretical results hold rigorously only for a rather small time window of \nlength $\\gamma$ for temporal coding. However a closer inspection of the construction \nshows that the actual shape of EPSP's and IPSP's in biological neurons provides \nan automatic adjustment of extreme values of the inputs $t_{a_i}$ towards their average, \nwhich allows them to carry out rather similar computations for a substantially larger \nwindow size. It also appears to be of interest from the biological point of view that \nthe synaptic weights play for temporal coding in our construction basically the same \nrole as for rate coding, and hence the same network is in principle able to compute \nclosely related analog functions in both coding schemes. \n\nWe have focused in our constructions on feedforward nets, but our method can for \nexample also be used to simulate a Hopfield net with graded response by a network \nof noisy spiking neurons in temporal coding. A stable state of the Hopfield net \ncorresponds then to a firing pattern of the simulating SNN where all neurons fire \nat the same frequency, with the \"pattern\" of the stable state encoded in their phase \ndifferences. 
\n\nThe theoretical results in this article may also provide additional goals and directions \nfor a new computer technology based on artificial spiking neurons. \n\nAcknowledgement \n\nI would like to thank David Haussler, Pascal Koiran, and Eduardo Sontag for helpful \ncommunications. \n\nReferences \n\n[Bair & Koch, 1996] W. Bair, C. Koch, \"Temporal precision of spike trains in \nextrastriate cortex of the behaving macaque monkey\", Neural Computation, \nvol. 8, pp 1185-1202, 1996. \n\n[Bialek & Rieke, 1992] W. Bialek, and F. Rieke, \"Reliability and information transmission \nin spiking neurons\", Trends in Neuroscience, vol. 15, pp 428-434, 1992. \n\n[Hopfield, 1995] J. J. Hopfield, \"Pattern recognition computation using action potential \ntiming for stimulus representations\", Nature, vol. 376, pp 33-36, 1995. \n\n[Koiran, 1995] P. Koiran, \"VC-dimension in circuit complexity\", Proc. of the 11th \nIEEE Conference on Computational Complexity, pp 81-85, 1996. \n\n[Leshno et al., 1993] M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, \"Multilayer \nfeedforward networks with a nonpolynomial activation function can approximate \nany function\", Neural Networks, vol. 6, pp 861-867, 1993. \n\n[Maass, 1996a] W. Maass, \"On the computational power of noisy spiking neurons\", \nAdvances in Neural Information Processing Systems, vol. 8, pp 211-217, MIT \nPress, Cambridge, 1996. \n\n[Maass, 1996b] W. Maass, \"Networks of spiking neurons: the third generation of \nneural network models\", FTP-host: archive.cis.ohio-state.edu, FTP-filename: \n/pub/neuroprose/maass.third-generation.ps.Z, Neural Networks, to appear. \n\n[Maass, 1997] W. Maass, \"Fast sigmoidal networks via spiking neurons\", to appear \nin Neural Computation. 
FTP-host: archive.cis.ohio-state.edu FTP-filename: \n/pub/neuroprose/maass.sigmoidal-spiking.ps.Z, Neural Computation, to appear \nin vol. 9, 1997. \n\n[de Ruyter van Steveninck & Bialek, 1988] R. de Ruyter van Steveninck, and \nW. Bialek, \"Real-time performance of a movement sensitive neuron in the \nblowfly visual system\", Proc. Roy. Soc. B, vol. 234, pp 379-414, 1988. \n\n[Sontag, 1996] E. D. Sontag, \"Shattering all sets of k points in 'general position' requires \n(k - 1)/2 parameters\", http://www.math.rutgers.edu/~sontag/ , follow \nlinks to FTP archive. \n\n[Thorpe & Imbert, 1989] S. T. Thorpe, and M. Imbert, \"Biological constraints \non connectionist modelling\", In: Connectionism in Perspective, R. Pfeifer, \nZ. Schreter, F. Fogelman-Soulie, and L. Steels, eds., Elsevier, North-Holland, \n1989. \n", "award": [], "sourceid": 1307, "authors": [{"given_name": "Wolfgang", "family_name": "Maass", "institution": null}]}