{"title": "A Model for Real-Time Computation in Generic Neural Microcircuits", "book": "Advances in Neural Information Processing Systems", "page_first": 229, "page_last": 236, "abstract": null, "full_text": "A Model for Real-Time Computation in Generic\n\nNeural Microcircuits\n\nWolfgang Maass , Thomas Natschl\u00a8ager\nInstitute for Theoretical Computer Science\nTechnische Universitaet Graz, Austria\n@igi.tu-graz.ac.at\n\u0001 maass, tnatschl\n\nAbstract\n\nHenry Markram\nBrain Mind Institute\n\nEPFL, Lausanne, Switzerland\n\nhenry.markram@ep\ufb02.ch\n\nA key challenge for neural modeling is to explain how a continuous\nstream of multi-modal input from a rapidly changing environment can be\nprocessed by stereotypical recurrent circuits of integrate-and-\ufb01re neurons\nin real-time. We propose a new computational model that is based on\nprinciples of high dimensional dynamical systems in combination with\nstatistical learning theory. It can be implemented on generic evolved or\nfound recurrent circuitry.\n\n1 Introduction\n\nDiverse real-time information processing tasks are carried out by neural microcircuits in\nthe cerebral cortex whose anatomical and physiological structure is quite similar in many\nbrain areas and species. However a model that could explain the potentially universal com-\nputational capabilities of such recurrent circuits of neurons has been missing. Common\nmodels for the organization of computations, such as for example Turing machines or at-\ntractor neural networks, are not suitable since cortical microcircuits carry out computations\non continuous streams of inputs. Often there is no time to wait until a computation has\nconverged, the results are needed instantly (\u201canytime computing\u201d) or within a short time\nwindow (\u201creal-time computing\u201d). 
Furthermore, biological data show that cortical microcircuits can support several real-time computational tasks in parallel, a fact that is inconsistent with most modeling approaches. In addition, the components of biological neural microcircuits, neurons and synapses, are highly diverse [1] and exhibit complex dynamical responses on several temporal scales. This makes them completely unsuitable as building blocks of computational models that require simple uniform components, such as virtually all models inspired by computer science or artificial neural nets. Finally, computations in common computational models are partitioned into discrete steps, each of which requires convergence to some stable internal state, whereas the dynamics of cortical microcircuits appears to be continuously changing. In this article we present a new conceptual framework for the organization of computations in cortical microcircuits that is not only compatible with all these constraints, but actually requires these biologically realistic features of neural computation. Furthermore, like the Turing machine, this conceptual approach is supported by theoretical results that prove the universality of the computational model, but for the biologically more relevant case of real-time computing on continuous input streams.

* The work was partially supported by the Austrian Science Fund FWF, project #P15386.

[Figure 1: panel B plots state distance against time [sec], with one curve per input distance d(u,v) = 0, 0.1, 0.2, and 0.4.]

Figure 1: A Structure of a Liquid State Machine (LSM), here shown with just a single readout. B Separation property of a generic neural microcircuit.
Plotted on the y-axis is the value of ||x_u(t) - x_v(t)||, where ||.|| denotes the Euclidean norm, and x_u(t), x_v(t) denote the liquid states at time t for Poisson spike trains u and v as inputs, averaged over many u and v with the same distance d(u,v), where d(u,v) is defined as the distance (L2-norm) between low-pass filtered versions of u and v.

2 A New Conceptual Framework for Real-Time Neural Computation

Our approach is based on the following observations. If one excites a sufficiently complex recurrent circuit (or other medium) with a continuous input stream u(s), and looks at a later time t > s at the current internal state x(t) of the circuit, then x(t) is likely to hold a substantial amount of information about recent inputs u(s) (for the case of neural circuit models this was first demonstrated by [2]). We as human observers may not be able to understand the \u201ccode\u201d by which this information about u(s) is encoded in the current circuit state x(t), but that is not essential. Essential is whether a readout neuron that has to extract such information at time t for a specific task can accomplish this. But this amounts to a classical pattern recognition problem, since the temporal dynamics of the input stream u(s) has been transformed by the recurrent circuit into a high dimensional spatial pattern x(t). A related approach for artificial neural nets was independently explored in [3].

In order to analyze the potential capabilities of this approach, we introduce the abstract model of a Liquid State Machine (LSM), see Fig. 1A. As the name indicates, this model has some weak resemblance to a finite state machine. But whereas the finite state set and the transition function of a finite state machine have to be custom designed for each particular computational task, a liquid state machine might be viewed as a universal finite state machine whose \u201cliquid\u201d high dimensional analog state x(t) changes continuously over time. Furthermore, if this analog state x(t) is sufficiently high dimensional and its dynamics is sufficiently complex, then it has embedded in it the states and transition functions of many concrete finite state machines. Formally, an LSM M consists of a filter L^M, i.e. a function that maps input streams u(.) onto streams x(.), where x(t) may depend not just on u(t), but in a quite arbitrary nonlinear fashion also on previous inputs u(s); in mathematical terminology this is written x(t) = (L^M u)(t). In addition it consists of a (potentially memoryless) readout function f^M that maps at any time t the filter output x(t) (i.e., the \u201cliquid state\u201d) into some target output y(t). Hence the LSM itself computes a filter that maps u(.) onto y(.).

In our application to neural microcircuits, the recurrently connected microcircuit could be viewed in a first approximation as an implementation of a general purpose filter L^M (for example some unbiased analog memory), from which different readout neurons extract and recombine diverse components of the information contained in the input u(.).
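This formal picture (a filter L^M producing the liquid state, plus a memoryless readout f^M) can be made concrete with a toy stand-in. The tanh state update, dimensions, and class name below are our own illustrative choices, not the paper's integrate-and-fire microcircuit:

```python
import numpy as np

rng = np.random.default_rng(0)

class ToyLSM:
    # Toy LSM: a fixed random 'liquid' filter L^M plus a memoryless linear
    # readout f^M. The tanh dynamics is only a stand-in chosen to make the
    # abstraction concrete; the paper uses a spiking microcircuit instead.
    def __init__(self, n_in, n_liquid):
        self.W = rng.normal(scale=1.0 / np.sqrt(n_liquid), size=(n_liquid, n_liquid))
        self.Win = rng.normal(size=(n_liquid, n_in))
        self.w_out = np.zeros(n_liquid)   # readout weights, set by training
        self.x = np.zeros(n_liquid)       # liquid state x(t)

    def step(self, u_t):
        # x(t) = (L^M u)(t): depends on u(t) and, through x, on all earlier inputs
        self.x = np.tanh(self.W @ self.x + self.Win @ u_t)
        return self.x

    def readout(self):
        # y(t) = f^M(x(t)): memoryless map of the current liquid state
        return self.w_out @ self.x

lsm = ToyLSM(n_in=4, n_liquid=50)
states = np.array([lsm.step(u_t) for u_t in rng.normal(size=(100, 4))])
```

Note that only `w_out` is task-specific; the filter itself is fixed and shared by all readouts, which is the point of the construction.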
The liquid state x(t) is that part of the internal circuit state at time t that is accessible to readout neurons. An example where u(.) consists of 4 spike trains is shown in Fig. 2. The generic microcircuit model (270 neurons) was drawn from the distribution discussed in section 3.

[Figure 2: traces of the 4 input spike trains and of the following target functions: f1(t): sum of rates of inputs 1&2 in the interval [t-30 ms, t]; f2(t): sum of rates of inputs 3&4 in the interval [t-30 ms, t]; f3(t): sum of rates of inputs 1-4 in the interval [t-60 ms, t-30 ms]; f4(t): sum of rates of inputs 1-4 in the interval [t-150 ms, t]; f5(t): spike coincidences of inputs 1&3 in the interval [t-20 ms, t]; f6(t): a simple nonlinear combination; f7(t): a randomly chosen complex nonlinear combination of the preceding values.]

Figure 2: Multi-tasking in real-time.
Input spike trains were randomly generated in such a way that at any time t the input contained no information about preceding input more than 30 ms ago. Firing rates r1(t) were randomly drawn from the uniform distribution over [0 Hz, 80 Hz] every 30 ms, and input spike trains 1 and 2 were generated for the present 30 ms time segment as independent Poisson spike trains with this firing rate r1(t). This process was repeated (with independent drawings of r1(t) and Poisson spike trains) for each 30 ms time segment. Spike trains 3 and 4 were generated in the same way, but with independent drawings of another firing rate r2(t) every 30 ms. The results shown in this figure are for test data that were never before shown to the circuit. Below the 4 input spike trains the targets (dashed curves) and actual outputs (solid curves) of 7 linear readout neurons are shown in real-time (on the same time axis). Targets were to output every 30 ms the actual firing rate (rates are normalized to a maximum rate of 80 Hz) of spike trains 1&2 during the preceding 30 ms (f1), the firing rate of spike trains 3&4 (f2), the sum of f1 and f2 in an earlier time interval [t-60 ms, t-30 ms] (f3) and during the interval [t-150 ms, t] (f4), spike coincidences between inputs 1&3 (f5(t) is defined as the number of spikes which are accompanied by a spike in the other spike train within 5 ms during the interval [t-20 ms, t]), a simple nonlinear combination f6, and a randomly chosen complex nonlinear combination f7 of the earlier described values. Since all readouts were linear units, these nonlinear combinations are computed implicitly within the generic microcircuit model.
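Targets such as f1 and f5 are simple window statistics of the input spike trains; a minimal sketch of how they can be computed from spike times (the helper names and the toy spike trains are ours, not from the paper):

```python
import numpy as np

def window_rate(spikes, t, width):
    # Number of spikes falling in [t - width, t], expressed as a rate in Hz.
    spikes = np.asarray(spikes)
    n = np.sum((spikes >= t - width) & (spikes <= t))
    return n / width

def coincidences(spikes_a, spikes_b, t, width=0.020, tol=0.005):
    # Spikes of train a in [t - width, t] accompanied by a spike of train b
    # within tol (5 ms), as in the definition of f5.
    a = np.asarray(spikes_a)
    b = np.asarray(spikes_b)
    a = a[(a >= t - width) & (a <= t)]
    return int(np.sum([np.any(np.abs(b - s) <= tol) for s in a]))

# toy spike trains (times in seconds)
s1 = [0.471, 0.482, 0.493]
s3 = [0.472, 0.490]
f1 = window_rate(s1, t=0.5, width=0.030)   # rate of train 1 over the last 30 ms
f5 = coincidences(s1, s3, t=0.5)           # coincidences of trains 1 and 3
```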
Average correlation coefficients between targets and outputs for 200 test inputs of length 1 s for f1 to f7 were 0.91, 0.92, 0.79, 0.75, 0.68, 0.87, and 0.65. In this case the 7 readout neurons f1 to f7 (modeled for simplicity just as linear units with a membrane time constant of 30 ms, applied to the spike trains from the neurons in the circuit) were trained to extract completely different types of information from the input stream u(.), which require different integration times stretching from 30 to 150 ms. Since the readout neurons had a biologically realistic short time constant of just 30 ms, additional temporally integrated information had to be contained at any instance t in the current firing state x(t) of the recurrent circuit (its \u201cliquid state\u201d). In addition a large number of nonlinear combinations of this temporally integrated information are also \u201cautomatically\u201d precomputed in the circuit, so that they can be pulled out by linear readouts.
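Training such a linear readout on recorded liquid states reduces to a single least-squares fit. The sketch below uses synthetic stand-in states and a synthetic target; the paper does not commit to a particular fitting algorithm, so the `lstsq` choice here is our assumption:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in liquid states: T time points, N = 135 low-pass filtered spike trains.
T, N = 500, 135
X = rng.normal(size=(T, N))

# Target trace the readout should reproduce (in the paper: e.g. a windowed
# firing rate); here a random linear-plus-noise target just to exercise the fit.
w_true = rng.normal(size=N)
y = X @ w_true + 0.1 * rng.normal(size=T)

# A linear readout is one weight vector; least-squares training is one solve.
w, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ w

# Correlation between target and output, the score reported for Fig. 2.
corr = np.corrcoef(y, y_hat)[0, 1]
```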
Whereas the information extracted by some of the readouts can be described in terms of commonly discussed schemes for \u201cneural codes\u201d, this example demonstrates that it is hopeless to capture the dynamics or the information content of the primary engine of the neural computation, the liquid state of the neural circuit, in terms of simple coding schemes.

3 The Generic Neural Microcircuit Model

We used a randomly connected circuit consisting of leaky integrate-and-fire (I&F) neurons, 20% of which were randomly chosen to be inhibitory, as generic neural microcircuit model.1 Parameters were chosen to fit data from microcircuits in rat somatosensory cortex (based on [1], [4] and unpublished data from the Markram Lab).2 It turned out to be essential to keep the connectivity sparse, like in biological neural systems, in order to avoid chaotic effects.

In the case of a synaptic connection from neuron a to neuron b we modeled the synaptic dynamics according to the model proposed in [4], with the synaptic parameters U (use), D (time constant for depression), and F (time constant for facilitation) randomly chosen from Gaussian distributions that were based on empirically found data for such connections.3 We have shown in [5] that without such synaptic dynamics the computational power of these microcircuit models decays significantly. For each simulation, the initial conditions of each I&F neuron, i.e. the membrane voltage at time t = 0, were drawn randomly (uniform distribution) from the interval [13.5 mV, 15.0 mV].
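The U, D, F dynamics of [4] can be written as a short recurrence for the relative amplitude of successive postsynaptic responses. The update equations below are the standard form of that model; the defaults are the EE mean values from footnote 3:

```python
import numpy as np

def tm_amplitudes(spike_times, U=0.5, D=1.1, F=0.05, w=1.0):
    # Relative response amplitudes under the dynamic-synapse model of [4],
    # written in its standard recurrence form. Defaults are the EE means
    # (U = .5, D = 1.1 s, F = .05 s) from footnote 3; w is the scaling A.
    u, R = U, 1.0
    amps = [w * u * R]                     # first spike: amplitude w * U
    for dt in np.diff(np.asarray(spike_times, dtype=float)):
        R = 1.0 + (R * (1.0 - u) - 1.0) * np.exp(-dt / D)   # depression recovery
        u = U + u * (1.0 - U) * np.exp(-dt / F)             # facilitation decay
        amps.append(w * u * R)
    return amps

amps = tm_amplitudes([0.0, 0.1, 0.2, 0.3])   # regular 10 Hz train: depresses
```

With the EE parameters a regular train yields steadily decreasing amplitudes, i.e. a depressing synapse, as expected for this parameter regime.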
The \u201cliquid state\u201d x(t) of the recurrent circuit consisting of n neurons was modeled by an n-dimensional vector computed by applying a low pass filter with a time constant of 30 ms to the spike trains generated by the n neurons in the recurrent microcircuit.

1 The software used to simulate the model is available via www.lsm.tugraz.at .
2 Neuron parameters: membrane time constant 30 ms, absolute refractory period 3 ms (excitatory neurons), 2 ms (inhibitory neurons), threshold 15 mV (for a resting membrane potential assumed to be 0), reset voltage 13.5 mV, constant nonspecific background current 13.5 nA, input resistance 1 MOhm. Connectivity structure: The probability of a synaptic connection from neuron a to neuron b (as well as that of a synaptic connection from neuron b to neuron a) was defined as C * exp(-D(a,b)^2 / lambda^2), where lambda is a parameter which controls both the average number of connections and the average distance between neurons that are synaptically connected (we set lambda = 2, see [5] for details). We assumed that the neurons were located on the integer points of a 3 dimensional grid in space, where D(a,b) is the Euclidean distance between neurons a and b. Depending on whether a and b were excitatory (E) or inhibitory (I), the value of C was 0.3 (EE), 0.2 (EI), 0.4 (IE), 0.1 (II).
3 Depending on whether a and b were excitatory (E) or inhibitory (I), the mean values of the three parameters U, D, F (with D, F expressed in seconds, s) were chosen to be .5, 1.1, .05 (EE), .05, .125, 1.2 (EI), .25, .7, .02 (IE), .32, .144, .06 (II). The SD of each parameter was chosen to be 50% of its mean.
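The connectivity rule of footnote 2 is easy to reproduce. In the sketch below, the 15 x 3 x 3 grid is the column used in section 5, and the mapping of the four E/I pair types to the C values is our reading of the footnote:

```python
import numpy as np

rng = np.random.default_rng(1)

# Neurons on the integer points of a 15 x 3 x 3 grid (the column of section 5).
shape = (15, 3, 3)
pos = np.array([(x, y, z) for x in range(shape[0])
                          for y in range(shape[1])
                          for z in range(shape[2])], dtype=float)
n = len(pos)                   # 135 neurons
inhib = rng.random(n) < 0.2    # 20% randomly chosen to be inhibitory

lam = 2.0                      # lambda: controls connection number and range
# (pre, post) type -> C, with 0 = excitatory, 1 = inhibitory: EE, EI, IE, II.
C = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.4, (1, 1): 0.1}

# Connection probability C * exp(-D(a,b)^2 / lambda^2), Euclidean distance D.
D2 = np.sum((pos[:, None, :] - pos[None, :, :]) ** 2, axis=-1)
Cmat = np.array([[C[(int(inhib[a]), int(inhib[b]))] for b in range(n)]
                 for a in range(n)])
p = Cmat * np.exp(-D2 / lam ** 2)
np.fill_diagonal(p, 0.0)       # no self-connections
adj = rng.random((n, n)) < p   # adj[a, b]: synapse from neuron a to neuron b
```

Because the probability decays with squared distance, the resulting graph is sparse and dominated by local connections, which is the property the text identifies as essential for avoiding chaotic effects.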
The mean of the scaling parameter A (in nA) was chosen to be 30 (EE), 60 (EI), -19 (IE), -19 (II). In the case of input synapses the parameter A had a value of 18 nA if projecting onto an excitatory neuron and 9 nA if projecting onto an inhibitory neuron. The SD of the A parameter was chosen to be 100% of its mean, and A was drawn from a gamma distribution. The postsynaptic current was modeled as an exponential decay exp(-t/tau_s), with tau_s = 3 ms (tau_s = 6 ms) for excitatory (inhibitory) synapses. The transmission delays between liquid neurons were chosen uniformly to be 1.5 ms (EE), and 0.8 ms for the other connections.

4 Towards a non-Turing Theory for Real-Time Neural Computation

Whereas the famous results of Turing have shown that one can construct Turing machines that are universal for digital sequential offline computing, we propose here an alternative computational theory that is more adequate for analyzing parallel real-time computing on analog input streams. Furthermore we present a theoretical result which implies that within this framework the computational units of the system can be quite arbitrary, provided that sufficiently diverse units are available (see the separation property and approximation property discussed below). It also is not necessary to construct circuits to achieve substantial computational power. Instead sufficiently large and complex \u201cfound\u201d circuits (such as the generic circuit used as the main building block for Fig.
2) tend to have already large computational power, provided that the reservoir from which their units are chosen is sufficiently rich and diverse.

Consider a class B of basis filters (that may for example consist of the components of neural LSMs, such as dynamic synapses) that are available for building filters L^M. We say that this class B has the point-wise separation property if for any two input functions u(.), v(.) with u(s) != v(s) for some s <= t there exists some filter B in B with (Bu)(t) != (Bv)(t).4 There exist completely different classes B of filters that satisfy this point-wise separation property: B = {all delay lines}, B = {all linear filters}, and, biologically more relevant, B = {models for dynamic synapses} (see [6]).

The complementary requirement that is demanded from the class F of functions from which the readout maps f^M are to be picked is the well-known universal approximation property: for any continuous function h and any closed and bounded domain one can approximate h on this domain with any desired degree of precision by some f in F. An example for such a class is F = {feedforward sigmoidal neural nets}. A rigorous mathematical theorem [5] states that for any class B of filters that satisfies the point-wise separation property and for any class F of functions that satisfies the universal approximation property one can approximate any given real-time computation on time-varying inputs with fading memory (and hence any biologically relevant real-time computation) by an LSM M whose filter L^M is composed of finitely many filters in B, and whose readout map f^M is chosen from the class F.
This theoretical result supports the following pragmatic procedure: In order to implement a given real-time computation with fading memory it suffices to take a filter L whose dynamics is \u201csufficiently complex\u201d, and train a \u201csufficiently flexible\u201d readout to assign for each time t and state x(t) the target output y(t). Actually, we found that if the neural microcircuit model is not too small, it usually suffices to use linear readouts. Thus the microcircuit automatically assumes \u201con the side\u201d the computational role of a kernel for support vector machines.

For physical implementations of LSMs it makes more sense to study, instead of the theoretically relevant point-wise separation property, the following qualitative separation property as a test for the computational capability of a filter L: how different are the liquid states x_u(t) and x_v(t) for two different input histories u(.) and v(.)? This is evaluated in Fig. 1B for the case where u(.) and v(.) are Poisson spike trains and L is a generic neural microcircuit model. It turns out that the difference between the liquid states scales roughly proportionally to the difference between the two input histories. This appears to be desirable from the practical point of view, since it implies that saliently different input histories can be distinguished more easily and in a more noise robust fashion by the readout.
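This separation test can be run on any state-producing filter. The sketch below uses a leaky tanh network as a stand-in liquid (not the paper's I&F circuit) and measures mean state distance for increasingly different input histories:

```python
import numpy as np

rng = np.random.default_rng(2)

def liquid_states(inputs, W, win, leak=0.9):
    # Stand-in 'liquid': a leaky random recurrent tanh network -- not the
    # paper's I&F circuit, just enough dynamics to illustrate the test.
    x = np.zeros(W.shape[0])
    out = []
    for u_t in inputs:
        x = np.tanh(leak * (W @ x) + win * u_t)
        out.append(x.copy())
    return np.array(out)

N, T = 100, 200
W = rng.normal(scale=0.1, size=(N, N)) / np.sqrt(N)
win = rng.normal(size=N)

u = rng.poisson(0.5, size=T).astype(float)   # base input stream
su = liquid_states(u, W, win)

seps = []
for noise in (0.0, 0.2, 0.4):                # growing input distance d(u, v)
    v = u + noise * rng.normal(size=T)
    sv = liquid_states(v, W, win)
    seps.append(float(np.linalg.norm(su - sv, axis=1).mean()))
```

Identical inputs give zero state distance, and the distance grows with the perturbation of the input history, which is the qualitative behavior shown in Fig. 1B.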
We propose to use such evaluation of the separation capability of neural microcircuits as a new standard test for their computational capabilities.

4 Note that it is not required that there exists a single B in B which achieves this separation for any two different input histories u(.), v(.).

5 A Generic Neural Microcircuit on the Computational Test Stand

The theoretical results sketched in the preceding section can be interpreted as saying that there are no strong a priori limitations for the power of neural microcircuits for real-time computing with fading memory, provided they are sufficiently large and their components are sufficiently heterogeneous. In order to evaluate this somewhat surprising theoretical prediction, we use a well-studied computational benchmark task for which data have been made publicly available5: the speech recognition task considered in [7] and [8]. The dataset consists of 500 input files: the words \u201czero\u201d, \u201cone\u201d, ..., \u201cnine\u201d are spoken by 5 different (female) speakers, 10 times by each speaker. The task was to construct a network of I&F neurons that could recognize each of the 10 spoken words w. Each of the 500 input files had been encoded in the form of 40 spike trains, with at most one spike per spike train6 signaling onset, peak, or offset of activity in a particular frequency band. A network was presented in [8] that could solve this task with an error7 of 0.15 for recognizing the pattern \u201cone\u201d.
No better result had been achieved by any competing networks constructed during a widely publicized internet competition [7]. The network constructed in [8] transformed the 40 input spike trains into linearly decaying input currents from 800 pools, each consisting of a \u201clarge set of closely similar unsynchronized neurons\u201d [8]. Each of the 800 currents was delivered to a separate pair of neurons consisting of an excitatory \u201c\u03b1-neuron\u201d and an inhibitory \u201c\u03b2-neuron\u201d. To accomplish the particular recognition task some of the synapses between \u03b1-neurons (\u03b2-neurons) are set to have equal weights, the others are set to zero. A particular achievement of this network (resulting from the smoothly and linearly decaying firing activity of the 800 pools of neurons) is that it is robust with regard to linear time-warping of the input spike pattern.

We tested our generic neural microcircuit model on the same task (in fact on exactly the same 500 input files). A randomly chosen subset of 300 input files was used for training, the other 200 for testing. The generic neural microcircuit model was drawn from the distribution described in section 3, hence from the same distribution as the circuit drawn for the completely different task discussed in Fig. 2, with randomly connected I&F neurons located on the integer points of a 15 x 3 x 3 column. The synaptic weights of 10 linear readout neurons f_w, which received inputs from the 135 I&F neurons in the circuit, were optimized (like for SVMs with linear kernels) to fire whenever the input encoded the spoken word w. Hence the whole circuit consisted of 145 I&F neurons, less than 1/30 of the size of the network constructed in [8] for the same task.8
Nevertheless the average error achieved after training by these randomly generated generic microcircuit models was 0.14 (measured in the same way, for the same word \u201cone\u201d), hence slightly better than that of the 30 times larger network custom designed for this task. The score given is the average for 50 randomly drawn generic microcircuit models.

The comparison of the two different approaches also provides a nice illustration of the

5 http://moment.princeton.edu/~mus/Organism/Competition/digits_data.html
6 The network constructed in [8] required that each spike train contained at most one spike.
7 The error (or \u201crecognition score\u201d) e_w for a particular word w was defined in [8] by e_w = n_f+/n_c+ + n_f-/n_c-, where n_f+ (n_c+) is the number of false (correct) positives and n_f- and n_c- are the numbers of false and correct negatives. We use the same definition of error to facilitate comparison of results. The recognition scores of the network constructed in [8] and of competing networks of other researchers can be found at http://moment.princeton.edu/~mus/Organism/Docs/winners.html. For the competition the networks were allowed to be constructed especially for their task, but only one single pattern for each word could be used for setting the synaptic weights.
Since our microcircuit models were not prepared for this task, they had to be trained with substantially more examples.
8 If one assumes that each of the 800 \u201clarge\u201d pools of neurons in that network would consist of just 5 neurons, it contains together with the \u03b1- and \u03b2-neurons 5600 neurons.

[Figure 3: four columns of panels for the inputs \u201cone\u201d (speaker 5), \u201cone\u201d (speaker 3), \u201cfive\u201d (speaker 1), \u201ceight\u201d (speaker 4); rows labeled input, microcircuit, and readout, plotted over time [s].]

Figure 3: Application of our generic neural microcircuit model to the speech recognition task from [8]. Top row: input spike patterns. Second row: spiking response of the 135 I&F neurons in the neural microcircuit model. Third row: output of an I&F neuron that was trained to fire as soon as possible when the word \u201cone\u201d was spoken, and as little as possible else.

difference between offline computing, real-time computing, and any-time computing. Whereas the network of [8] implements an algorithm that needs a few hundred ms of processing time between the end of the input pattern and the answer to the classification task (450 ms in the example of Fig. 2 in [8]), the readout neurons from the generic neural microcircuit were trained to provide their answer (through firing or non-firing) immediately when the input pattern ended. In fact, as illustrated in Fig. 3, one can even train the readout neurons quite successfully to provide provisional answers long before the input pattern has ended (thereby implementing an \u201canytime\u201d algorithm).
More precisely, each of the 10 linear readout neurons was trained to recognize the spoken word at any multiple of 20 ms while the word was spoken. An error score of 1.4 was achieved for this anytime speech recognition task.

We also compared the noise robustness of the generic microcircuit models with that of [8], which had been constructed to be robust with regard to linear time warping of the input pattern. Since no benchmark input data were available to calculate this noise robustness, we constructed such data by creating as templates 10 patterns consisting each of 40 randomly drawn Poisson spike trains at 4 Hz over 0.5 s. Noisy variations of these templates were created by first multiplying their time scale with a randomly drawn factor from [1/3, 3] (thereby allowing for a 9-fold time warp), and subsequently dislocating each spike by an amount drawn independently from a Gaussian distribution with mean 0 and SD 32 ms. These spike patterns were given as inputs to the same generic neural microcircuit models consisting of 135 I&F neurons as discussed before. 10 linear readout neurons were trained (with 1000 randomly drawn training examples) to recognize which of the 10 templates had been used to generate a particular input. On 500 novel test examples (drawn from the same distribution) they achieved an error of 0.09 (average performance of 30 randomly generated microcircuit models).
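The construction of these noisy benchmark inputs can be sketched directly from the description above. The warp range [1/3, 3] follows from the stated 9-fold warp, but the uniform sampling law for the warp factor is our assumption:

```python
import numpy as np

rng = np.random.default_rng(3)

def poisson_template(n_trains=40, rate=4.0, duration=0.5):
    # One template: 40 Poisson spike trains at 4 Hz over 0.5 s.
    return [np.sort(rng.uniform(0, duration, rng.poisson(rate * duration)))
            for _ in range(n_trains)]

def noisy_variation(template, jitter_sd=0.032):
    # Linear time warp by a factor from [1/3, 3] (up to 9-fold warp), then
    # Gaussian dislocation of each spike with SD 32 ms. Uniform sampling of
    # the warp factor is our assumption; the paper states only the range.
    warp = rng.uniform(1.0 / 3.0, 3.0)
    return [np.sort(t * warp + jitter_sd * rng.normal(size=len(t)))
            for t in template]

tpl = poisson_template()
var = noisy_variation(tpl)
```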
As a consequence of achieving this noise robustness generically, rather than by a construction tailored to a specific type of noise, we found that the same generic microcircuit models are also robust with regard to nonlinear time warp of the input. For the case of nonlinear (sinusoidal) time warp9 an average (50 microcircuits) error of 0.2

9 A spike at time t was transformed into a spike at time g(t) for a sinusoidal warping function g whose parameters were drawn randomly for each pattern (one factor drawn from [0.5, 2], with the remaining parameters chosen such that the warp is consistent over the duration of the pattern).

is achieved. This demonstrates that it is not necessary to build noise robustness explicitly into the circuit. A randomly generated microcircuit model has at least the same noise robustness as a circuit especially constructed to achieve that.

This test also implicitly demonstrated another point. Whereas the network of [8] was only able to classify spike patterns consisting of at most one spike per spike train, a generic neural microcircuit model can classify spike patterns without that restriction.
It can for example also classify the original version of the speech data, encoded into onsets, peaks, and offsets in various frequency bands, before all except the first events of each kind were artificially removed to fit the requirements of the network from [8].

The performance of the same generic neural microcircuit model on completely different computational tasks (recall of information from preceding input segments, movement prediction, and estimation of the direction of movement of extended moving objects) has also turned out to be quite remarkable, see [5], [9] and [10]. Hence this microcircuit model appears to have quite universal capabilities for real-time computing on time-varying inputs.

6 Discussion

We have presented a new conceptual framework for analyzing computations in generic neural microcircuit models that satisfies the biological constraints listed in section 1. Thus for the first time one can now take computer models of neural microcircuits, which can be as realistic as one wants, and use them not just for demonstrating dynamic effects such as synchronization or oscillations, but to really carry out demanding computations with these models. Furthermore, our new conceptual framework for analyzing computations in neural circuits not only provides theoretical support for their seemingly universal capabilities for real-time computing, but also throws new light on key concepts such as neural coding. Finally, since in contrast to virtually all computational models the generic neural microcircuit models that we consider have no preferred direction of information processing, they offer an ideal platform for investigating the interaction of bottom-up and top-down processing of information in neural systems.

References

[1] A. Gupta, Y. Wang, and H. Markram. Organizing principles for a diversity of GABAergic interneurons and synapses in the neocortex. Science, 287:273\u2013278, 2000.

[2] D. V.
Buonomano and M. M. Merzenich. Temporal information transformed into a spatial code by a neural network with realistic properties. Science, 267:1028\u20131030, 1995.

[3] H. Jaeger. The \u201cecho state\u201d approach to analysing and training recurrent neural networks. German National Research Center for Information Technology, Report 148, 2001.

[4] H. Markram, Y. Wang, and M. Tsodyks. Differential signaling via the same axon of neocortical pyramidal neurons. Proc. Natl. Acad. Sci., 95:5323\u20135328, 1998.

[5] W. Maass, T. Natschl\u00e4ger, and H. Markram. Real-time computing without stable states: A new framework for neural computation based on perturbations. Neur. Comp., 14:2531\u20132560, 2002.

[6] W. Maass and E. D. Sontag. Neural systems as nonlinear filters. Neur. Comp., 12:1743\u20131772, 2000.

[7] J. J. Hopfield and C. D. Brody. What is a moment? \u201cCortical\u201d sensory integration over a brief interval. Proc. Natl. Acad. Sci. USA, 97(25):13919\u201313924, 2000.

[8] J. J. Hopfield and C. D. Brody. What is a moment? Transient synchrony as a collective mechanism for spatiotemporal integration. Proc. Natl. Acad. Sci. USA, 98(3):1282\u20131287, 2001.

[9] W. Maass, R. A. Legenstein, and H. Markram. A new approach towards vision suggested by biologically realistic neural microcircuit models. In H. H. Buelthoff, S. W. Lee, T. A. Poggio, and C. Wallraven, editors, Proc. of the 2nd International Workshop on Biologically Motivated Computer Vision 2002, volume 2525 of LNCS, pages 282\u2013293. Springer, 2002.

[10] W. Maass, T. Natschl\u00e4ger, and H. Markram. Computational models for generic cortical microcircuits. In J. Feng, editor, Computational Neuroscience: A Comprehensive Approach. CRC-Press, 2002.
To appear.", "award": [], "sourceid": 2307, "authors": [{"given_name": "Wolfgang", "family_name": "Maass", "institution": null}, {"given_name": "Thomas", "family_name": "Natschl\u00e4ger", "institution": null}, {"given_name": "Henry", "family_name": "Markram", "institution": null}]}