{"title": "Methods for Estimating the Computational Power and Generalization Capability of Neural Microcircuits", "book": "Advances in Neural Information Processing Systems", "page_first": 865, "page_last": 872, "abstract": null, "full_text": "Methods for Estimating the Computational Power and Generalization Capability of Neural Microcircuits\n\nWolfgang Maass, Robert Legenstein, Nils Bertschinger\nInstitute for Theoretical Computer Science\nTechnische Universität Graz\nA-8010 Graz, Austria\n{maass, legi, nilsb}@igi.tugraz.at\n\nAbstract\n\nWhat makes a neural microcircuit computationally powerful? Or more precisely, which measurable quantities could explain why one microcircuit C is better suited for a particular family of computational tasks than another microcircuit C'? We propose in this article quantitative measures for evaluating the computational power and generalization capability of a neural microcircuit, and apply them to generic neural microcircuit models drawn from different distributions. We validate the proposed measures by comparing their predictions with direct evaluations of the computational performance of these microcircuit models. This procedure is applied first to microcircuit models that differ with regard to the spatial range of synaptic connections and with regard to the scale of synaptic efficacies in the circuit, and then to microcircuit models that differ with regard to the level of background input currents and the level of noise on the membrane potential of neurons. 
In this case the proposed method allows us to quantify differences in the computational power and generalization capability of circuits in different dynamic regimes (UP- and DOWN-states) that have been demonstrated through intracellular recordings in vivo.\n\n1 Introduction\n\nRather than constructing particular microcircuit models that carry out particular computations, we pursue in this article a different strategy, which is based on the assumption that the computational function of cortical microcircuits is not fully genetically encoded, but rather emerges through various forms of plasticity (\"learning\") in response to the actual distribution of signals that the neural microcircuit receives from its environment. From this perspective the question about the computational function of a cortical microcircuit C turns into the questions:\n\na) What functions (i.e. maps from circuit inputs to circuit outputs) can the circuit C learn to compute?\n\nb) How well can the circuit C generalize a specific learned computational function to new inputs?\n\nWe propose in this article a conceptual framework and quantitative measures for the investigation of these two questions. In order to make this approach feasible, in spite of numerous unknowns regarding synaptic plasticity and the distribution of electrical and biochemical signals impinging on a cortical microcircuit, we make in the present first step of this approach the following simplifying assumptions:\n\n1. Particular neurons (\"readout neurons\") learn via synaptic plasticity to extract specific information encoded in the spiking activity of neurons in the circuit.\n\n2. We assume that the cortical microcircuit itself is highly recurrent, but that the impact of feedback that a readout neuron might send back into this circuit can be neglected.1\n\n3. We assume that synaptic plasticity of readout neurons enables them to learn arbitrary linear transformations. 
More precisely, we assume that the input to such a readout neuron can be approximated by a term Σ_{i=1}^{n-1} w_i x_i(t), where n - 1 is the number of presynaptic neurons, x_i(t) results from the output spike train of the i-th presynaptic neuron by filtering it according to the low-pass filtering property of the membrane of the readout neuron,2 and w_i is the efficacy of the synaptic connection. Thus w_i x_i(t) models the time course of the contribution of previous spikes from the i-th presynaptic neuron to the membrane potential at the soma of this readout neuron. We will refer to the vector x(t) as the circuit state at time t.\n\nUnder these unpleasant but apparently unavoidable simplifying assumptions we propose new quantitative criteria based on rigorous mathematical principles for evaluating a neural microcircuit C with regard to questions a) and b). We will compare in sections 4 and 5 the predictions of these quantitative measures with the actual computational performance achieved by 132 different types of neural microcircuit models, for a fairly large number of different computational tasks. All microcircuit models that we consider are based on biological data for generic cortical microcircuits (as described in section 3), but have different settings of their parameters.\n\n2 Measures for the kernel-quality and generalization capability of neural microcircuits\n\nOne interesting measure for probing the computational power of a neural circuit is the pairwise separation property considered in [Maass et al., 2002]. This measure tells us to what extent the current circuit state x(t) reflects details of the input stream that occurred some time back in the past (see Fig. 1). Both circuit 2 and circuit 3 could be described as being chaotic, since state differences resulting from earlier input differences persist. The \"edge-of-chaos\" [Langton, 1990] lies somewhere between points 1 and 2 according to Fig. 1c). 
But the best computational performance occurs between points 2 and 3 (see Fig. 2b). Hence the \"edge-of-chaos\" is not a reliable predictor of computational power for circuits of spiking neurons. In addition, most real-world computational tasks require that the circuit gives a desired output not just for 2, but for a fairly large number m of significantly different inputs. One could of course test whether a circuit C can separate each of the m(m-1)/2 pairs of such inputs. But even if the circuit can do this, we do not know whether a neural readout from such a circuit would be able to produce given target outputs for these m inputs.\n\n1This assumption is best justified if such a readout neuron is located for example in another brain area that receives massive input from many neurons in this microcircuit and only has diffuse backwards projections. But it is certainly problematic and should be addressed in future elaborations of the present approach.\n2One can be even more realistic and filter it also by a model for the short term dynamics of the synapse into the readout neuron, but this turns out to make no difference for the analysis proposed in this article.\n\nFigure 1: Pairwise separation property for different types of neural microcircuit models as specified in section 3. Each circuit C was tested for two arrays u and v of 4 input spike trains at 20 Hz over 3 s that differed only during the first second. a) Euclidean differences between resulting circuit states x_u(t) and x_v(t) for t = 3 s, averaged over 20 circuits C and 20 pairs u, v for each indicated value of λ and W_scale (see section 3). b) Temporal evolution of ||x_u(t) - x_v(t)|| for 3 different circuits with values of λ, W_scale according to the 3 points marked in panel a) (λ = 1.4, 2, 3 and W_scale = 0.3, 0.7, 2 for circuit 1, 2, and 3 respectively). c) Pairwise separation along a straight line between point 1 and point 2 of panel a).\n\nTherefore we propose here the linear separation property as a more suitable quantitative measure for evaluating the computational power of a neural microcircuit (or more precisely: the kernel-quality of a circuit; see below). To evaluate the linear separation property of a circuit C for m different inputs u_1, ..., u_m (which are in this article always functions of time, i.e. input streams such as for example multiple spike trains) we compute the rank of the n × m matrix M whose columns are the circuit states x_{u_i}(t_0) resulting at some fixed time t_0 for the preceding input stream u_i. If this matrix has rank m, then it is guaranteed that any given assignment of target outputs y_i ∈ R at time t_0 for the inputs u_i can be implemented by this circuit C (in combination with a linear readout). In particular, each of the 2^m possible binary classifications of these m inputs can then be carried out by a linear readout from this fixed circuit C. Obviously such insight is much more informative than a demonstration that some particular classification task can be carried out by such a circuit C. If the rank of this matrix M has a value r < m, then this value r can still be viewed as a measure for the computational power of this circuit C, since r is the number of \"degrees of freedom\" that a linear readout has in assigning target outputs y_i to these inputs u_i (in a way which can be made mathematically precise with concepts of linear algebra). Note that this rank-measure for the linear separation property of a circuit C may be viewed as an empirical measure for its kernel-quality, i.e. 
for the complexity and diversity of nonlinear operations carried out by C on its input stream in order to boost the classification power of a subsequent linear decision-hyperplane (see [Vapnik, 1998]).\n\nObviously the preceding measure addresses only one component of the computational performance of a neural circuit C. Another component is its capability to generalize a learned computational function to new inputs. Mathematical criteria for generalization capability are derived in [Vapnik, 1998] (see ch. 4 of [Cherkassky and Mulier, 1998] for a compact account of results relevant for our arguments). According to this mathematical theory one can quantify the generalization capability of any learning device in terms of the VC-dimension of the class H of hypotheses that are potentially used by that learning device.3 More precisely: if VC-dimension(H) is substantially smaller than the size of the training set S_train, one can prove that this learning device generalizes well, in the sense that the hypothesis (or input-output map) produced by this learning device is likely to have for new examples an error rate which is not much higher than its error rate on S_train, provided that the new examples are drawn from the same distribution as the training examples (see equ. 4.22 in [Cherkassky and Mulier, 1998]).\n\n3The VC-dimension (of a class H of maps from some universe S_univ of inputs into {0, 1}) is defined as the size of the largest subset S ⊆ S_univ which can be shattered by H. One says that S ⊆ S_univ is shattered by H if for every map f : S → {0, 1} there exists a map H in H such that H(u) = f(u) for all u ∈ S (this means that every possible binary classification of the inputs u ∈ S can be carried out by some hypothesis H in H).\n\nWe apply this mathematical framework to the class H_C of all maps from a set S_univ of inputs u into {0, 1} which can be implemented by a circuit C. More precisely: H_C consists of all maps from S_univ into {0, 1} that a linear readout from circuit C with fixed internal parameters (weights etc.) but arbitrary weights w ∈ R^n of the readout (which classifies the circuit input u as belonging to class 1 if w · x_u(t_0) ≥ 0, and to class 0 if w · x_u(t_0) < 0) could possibly implement.\n\nWhereas it is very difficult to achieve tight theoretical bounds for the VC-dimension of even much simpler neural circuits (see [Bartlett and Maass, 2003]), one can efficiently estimate the VC-dimension of the class H_C that arises in our context for some finite ensemble S_univ of inputs (that contains all examples used for training or testing) by using the following mathematical result (which can be proved with the help of Radon's Theorem):\n\nTheorem 2.1 Let r be the rank of the n × s matrix consisting of the s vectors x_u(t_0) for all inputs u in S_univ (we assume that S_univ is finite and contains s inputs). Then r ≤ VC-dimension(H_C) ≤ r + 1.\n\nWe propose to use the rank r defined in Theorem 2.1 as an estimate of VC-dimension(H_C), and hence as a measure that informs us about the generalization capability of a neural microcircuit C. It is assumed here that the set S_univ contains many noisy variations of the same input signal, since otherwise learning with a randomly drawn training set S_train ⊆ S_univ has no chance to generalize to new noisy variations. Note that each family of computational tasks induces a particular notion of what aspects of the input are viewed as noise, and what input features are viewed as signals that carry information which is relevant for the target output for at least one of these computational tasks. For example, for computations on spike patterns some small jitter in the spike timing is viewed as noise. 
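Both proposed measures reduce to a numerical rank computation on a matrix whose columns are circuit states. The following sketch (numpy; random toy vectors stand in for simulated circuit states, and the rank tolerance is our own illustrative choice, not a value from the paper) shows the kernel-quality rank for m clearly different inputs, and the VC-dimension estimate of Theorem 2.1 for an ensemble of noisy variations of a few inputs:

```python
import numpy as np

def linear_separation_rank(states, tol=None):
    """Rank of the n x m matrix M whose columns are circuit states
    x_{u_i}(t_0) for m different input streams u_i.  Applied to m clearly
    different inputs this is the kernel-quality measure; applied to all s
    inputs of S_univ (including their noisy variations) it estimates
    VC-dimension(H_C), which by Theorem 2.1 lies in [r, r + 1]."""
    M = np.column_stack(states)
    return int(np.linalg.matrix_rank(M, tol=tol))

rng = np.random.default_rng(0)
n = 50  # toy number of neurons; the circuits in the paper have 540

# kernel-quality: m = 10 clearly different inputs -> states in general
# position, so the rank reaches its maximum value m
distinct_states = [rng.standard_normal(n) for _ in range(10)]
r_kernel = linear_separation_rank(distinct_states)

# VC-dimension estimate: s = 40 noisy variations of only 2 underlying
# inputs; a circuit that suppresses the noise yields a low rank
base = [rng.standard_normal(n) for _ in range(2)]
noisy_states = [base[i % 2] + 1e-9 * rng.standard_normal(n) for i in range(40)]
r_vc = linear_separation_rank(noisy_states, tol=1e-6)

print(r_kernel, r_vc)  # prints: 10 2
```

Since the rank is computed from singular values, the numerical tolerance decides which small state variations count as separate dimensions; the paper does not prescribe a tolerance, so this choice is part of the estimation procedure.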
For computations on firing rates even the sequence of interspike intervals and temporal relations between spikes that arrive from different input sources are viewed as noise, as long as these input spike trains represent the same firing rates. Examples for both families of computational tasks will be discussed in this article.\n\n3 Models for generic cortical microcircuits\n\nWe test the validity of the proposed measures by comparing their predictions with direct evaluations of the computational performance for a large variety of models for generic cortical microcircuits consisting of 540 neurons. We used leaky-integrate-and-fire neurons4 and biologically quite realistic models for dynamic synapses.5 Neurons (20 % of which were randomly chosen to be inhibitory) were located on the grid points of a 3D grid of dimensions 6 × 6 × 15 with edges of unit length. The probability of a synaptic connection from neuron a to neuron b was proportional to exp(-D^2(a, b)/λ^2), where D(a, b) is the Euclidean distance between a and b, and λ regulates the spatial scaling of synaptic connectivity.\n\n4Membrane voltage V_m modeled by τ_m dV_m/dt = -(V_m - V_resting) + R_m (I_syn(t) + I_background + I_noise), where τ_m = 30 ms is the membrane time constant, I_syn models synaptic inputs from other neurons in the circuit, I_background models a constant unspecific background input, and I_noise models noise in the input.\n5Short term synaptic dynamics was modeled according to [Markram et al., 1998], with distributions of synaptic parameters U (initial release probability), D (time constant for depression), F (time constant for facilitation) chosen to reflect empirical data (see [Maass et al., 2002] for details).\n\n
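The two model ingredients just described, the distance-dependent connection probability and the leaky-integrate-and-fire dynamics of footnote 4, can be sketched as follows (τ_m = 30 ms and the 15 mV firing threshold are from the paper; the Euler integration, the proportionality factor C, and the remaining parameter values are illustrative assumptions, not the paper's simulation settings):

```python
import numpy as np

def connection_prob(a, b, lam, C=1.0):
    """Probability of a synaptic connection from neuron a to neuron b,
    proportional to exp(-D^2(a, b)/lambda^2), with D(a, b) the Euclidean
    distance on the 3D grid (the proportionality factor C is an
    illustrative assumption)."""
    D2 = float(np.sum((np.asarray(a) - np.asarray(b)) ** 2))
    return C * np.exp(-D2 / lam**2)

def lif_step(Vm, I_syn, dt=1.0, tau_m=30.0, V_resting=0.0, Rm=1.0,
             I_background=13.5, I_noise=0.0, V_thresh=15.0):
    """One Euler step of tau_m dVm/dt = -(Vm - V_resting)
    + Rm*(I_syn + I_background + I_noise) from footnote 4.
    The simple reset without refractoriness is a simplification."""
    dV = (-(Vm - V_resting) + Rm * (I_syn + I_background + I_noise)) / tau_m
    Vm = Vm + dt * dV
    if Vm >= V_thresh:
        return V_resting, True  # spike and reset
    return Vm, False

# nearby neurons connect with high probability, distant ones almost never
p_near = connection_prob((0, 0, 0), (1, 0, 0), lam=2.0)   # exp(-1/4)
p_far = connection_prob((0, 0, 0), (5, 5, 14), lam=2.0)

# with this constant drive the membrane potential relaxes towards
# V_resting + Rm*I_background = 13.5 mV, just below the 15 mV threshold
Vm, spiked = 0.0, False
for _ in range(300):
    Vm, spiked = lif_step(Vm, I_syn=0.0)
print(round(p_near, 3), round(Vm, 2))  # prints: 0.779 13.5
```

Raising `I_background` or adding `I_noise` pushes the membrane potential across the threshold, which is exactly the knob turned in section 5 to move the circuit between DOWN- and UP-like regimes.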
Synaptic efficacies w were chosen randomly from distributions that reflect biological data (as in [Maass et al., 2002]), with a common scaling factor W_scale.\n\nFigure 2: Performance of different types of neural microcircuit models for classification of spike patterns. a) In the top row are two examples of the 80 spike patterns that were used (each consisting of 4 Poisson spike trains at 20 Hz over 200 ms), and in the bottom row are examples of noisy variations (Gaussian jitter with SD 10 ms) of these spike patterns which were used as circuit inputs. b) Fraction of examples (out of 200 test examples) that were correctly classified by a linear readout (trained by linear regression with 500 training examples). Results are shown for 90 different types of neural microcircuits C with λ varying on the x-axis and W_scale on the y-axis (20 randomly drawn circuits and 20 target classification functions randomly drawn from the set of 2^80 possible classification functions were tested for each of the 90 different circuit types, and the resulting correctness-rates were averaged; the mean SD of the results is 0.028). Points 1, 2, 3 defined as in Fig. 1.\n\nLinear readouts from circuits with n - 1 neurons were assumed to compute a weighted sum Σ_{i=1}^{n-1} w_i x_i(t) + w_0 (see section 1). In order to simplify notation we assume that the vector x(t) contains an additional constant component x_0(t) = 1, so that one can write w · x(t) instead of Σ_{i=1}^{n-1} w_i x_i(t) + w_0. 
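The notational convention of absorbing the bias w_0 into an augmented state vector can be sketched as follows (toy numbers, not values from the paper):

```python
import numpy as np

def augment(x):
    """Prepend the constant component x_0(t) = 1, so that the affine
    readout sum_i w_i x_i(t) + w_0 becomes the plain dot product w . x(t)."""
    return np.concatenate(([1.0], np.asarray(x, dtype=float)))

x = np.array([0.2, 0.0, 0.5])        # toy state of n - 1 = 3 filtered spike trains
w = np.array([-0.1, 1.0, 2.0, 0.4])  # w[0] plays the role of the bias w_0

readout = float(w @ augment(x))      # w_0 + sum_i w_i x_i(t), here about 0.3
```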
In the case of classification tasks we assume that the readout outputs 1 if w · x(t) ≥ 0, and 0 otherwise.\n\n4 Evaluating the influence of synaptic connectivity on computational performance\n\nNeural microcircuits were drawn from the distribution described in section 3 for 10 different values of λ (which scales the number and average distance of synaptically connected neurons) and 9 different values of W_scale (which scales the efficacy of all synaptic connections). 20 microcircuit models C were drawn for each of these 90 different assignments of values to λ and W_scale. For each circuit a linear readout was trained to perform one (randomly chosen) out of 2^80 possible classification tasks on noisy variations u of 80 fixed spike patterns as circuit inputs. The target for any such circuit input was to output at time t = 100 ms the class (0 or 1) of the spike pattern from which the preceding circuit input had been generated (for some arbitrary partition of the 80 fixed spike patterns into two classes). Each spike pattern u consisted of 4 Poisson spike trains over 200 ms. Performance results are shown in Fig. 2b for 90 different types of neural microcircuit models.\n\nWe now test the predictive quality of the two proposed measures for the computational power of a microcircuit on spike patterns. One should keep in mind that the proposed measures do not attempt to test the computational capability of a circuit for one particular computational task, but for any distribution on S_univ and for a very large (in general infinitely large) family of computational tasks that only have in common a particular bias regarding which aspects of the incoming spike trains may carry information that is relevant for the target output of computations, and which aspects should be viewed as noise. Fig. 3a explains why the lower left part of the parameter map in Fig. 
2b is less suitable for any such computation, since there the kernel-quality of the circuits is too low.\n\nFigure 3: Values of the proposed measures for computations on spike patterns. a) Kernel-quality for spike patterns for 90 different circuit types (average over 20 circuits, mean SD = 13; for each circuit, the average over 5 different sets of spike patterns was used).6 b) Generalization capability for spike patterns: estimated VC-dimension of H_C (for a set S_univ of inputs u consisting of 500 jittered versions of 4 spike patterns), for 90 different circuit types (average over 20 circuits, mean SD = 14; for each circuit, the average over 5 different sets of spike patterns was used). c) Difference of both measures (mean SD = 5.3). This should be compared with the actual computational performance plotted in Fig. 2b. Points 1, 2, 3 defined as in Fig. 1.\n\nFig. 3b explains why the upper right part of the parameter map in Fig. 2b is less suitable, since a higher VC-dimension (for a training set of fixed size) entails poorer generalization capability. We are not aware of a theoretically founded way of combining both measures into a single value that predicts overall computational performance. But if one just takes the difference of both measures then the resulting number (see Fig. 3c) predicts quite well which types of neural microcircuit models perform well for the particular computational tasks considered in Fig. 
2b.\n\n5 Evaluating the computational power of neural microcircuit models in UP- and DOWN-states\n\nData from numerous intracellular recordings suggest that neural circuits in vivo switch between two different dynamic regimes that are commonly referred to as UP- and DOWN-states. UP-states are characterized by a bombardment with synaptic inputs from recurrent activity in the circuit, resulting in a membrane potential whose average value is significantly closer to the firing threshold, but which also has larger variance. We have simulated these different dynamic regimes by varying the background current I_background and the noise current I_noise. Fig. 4a shows that one can simulate in this way different dynamic regimes of the same circuit where the time course of the membrane potential qualitatively matches data from intracellular recordings in UP- and DOWN-states (see e.g. [Shu et al., 2003]). We have tested the computational performance of circuits in 42 different dynamic regimes (for 7 values of I_background and 6 values of I_noise) with 3 complex nonlinear computations on firing rates of circuit inputs.7 Inputs u consisted of 4 Poisson spike trains with time-varying rates (drawn independently every 30 ms from the interval of 0 to 80 Hz, separately for the first two and the second two of the 4 input spike trains; see the middle row of Fig. 4a for a sample). Let f_1(t) (f_2(t)) be the actual sum of rates, normalized to the interval [0, 1], for the first two (second two) input spike trains, computed from the time interval [t - 30 ms, t]. The computational tasks considered in Fig. 4 were to compute online (and in real-time) every 30 ms the product f_1(t) · f_2(t) (see panel e), to decide whether the value of the product f_1(t) · f_2(t) lies in the interval [0.1, 0.3] or outside of this interval (see panel f), and to decide whether the absolute value of the difference f_1(t) - f_2(t) is greater than 0.25 (see panel g).\n\n6The rank of the matrix consisting of 500 circuit states x_u(t) for t = 200 ms was computed for 500 spike patterns over 200 ms as described in section 2, see Fig. 2a.\n7Computations on firing rates were chosen as benchmark tasks both because UP-states were conjectured to enhance the performance for such tasks, and because we want to show that the proposed measures are applicable to other types of computational tasks than those considered in section 4.\n\nFigure 4: Analysis of the computational power of simulated neural microcircuits in different dynamic regimes. a) Membrane potential (for a firing threshold of 15 mV) of two randomly selected neurons from circuits in the two parameter regimes marked in panel b), as well as spike rasters for the same two parameter regimes (with the actual circuit inputs shown between the two rows). b) Estimates of the kernel-quality for input streams u with 3^4 different combinations of firing rates from {0, 20, 40} Hz in the 4 input spike trains (mean SD = 12). c) Estimate of the VC-dimension for a set S_univ of inputs consisting of 200 different spike trains u that represent 2 different combinations of firing rates (mean SD = 4.6). d) Difference of the measures from panels b and c (after scaling each linearly into a common range [0, 1]). e), f), g) Evaluation of the computational performance (correlation coefficient; all for test data; mean SD is 0.06, 0.04, and 0.03 for panels e), f), and g) respectively) of the same circuits in different dynamic regimes for computations involving multiplication and absolute value of differences of firing rates (see text). The theoretically predicted parameter regime with good computational performance for any computations on firing rates (see panel d) agrees quite well with the intersection of the areas with good computational performance in panels e, f, g.\n\nWe wanted to test whether the proposed measures for computational power and generalization capability were able to make reasonable predictions for this completely different parameter map, and for computations on firing rates instead of spike patterns. It turns out that also in this case the kernel-quality (Fig. 4b) explains why circuits in the dynamic regime corresponding to the left-hand side of the parameter map have inferior computational power for all three computations on firing rates (see Fig. 4e,f,g). The VC-dimension (Fig. 4c) explains the decline of computational performance in the right part of the parameter map. The difference of both measures (Fig. 4d) predicts quite well the dynamic regime where high performance is achieved for all three computational tasks considered in Fig. 4e,f,g. Note that Fig. 4e has high performance in the upper right corner, in spite of a very high VC-dimension. 
This could be explained by the inherent bias of linear readouts to compute smooth functions on firing rates, which fits particularly well with this particular target output.\n\nIf one estimates kernel-quality and VC-dimension for the same circuits, but for computations on sparse spike patterns (for an input ensemble S_univ similar to that in section 4), one finds that circuits at the lower left corner of this parameter map (corresponding to DOWN-states) are predicted to have better computational performance for these computations on sparse input. This agrees quite well with direct evaluations of computational performance (not shown). Hence the proposed quantitative measures may provide a theoretical foundation for understanding the computational function of different states of neural activity.\n\n6 Discussion\n\nWe have proposed a new method for understanding why one neural microcircuit C is computationally more powerful than another neural microcircuit C'. This method is in principle applicable not just to circuit models, but also to neural microcircuits in vivo and in vitro. Here it can be used to analyze (for example by optical imaging) for which family of computational tasks a particular microcircuit in a particular dynamic regime is well-suited. The main assumption of the method is that (approximately) linear readouts from neural microcircuits have the task of producing the actual outputs of specific computations. We are not aware of specific theoretically founded rules for choosing the sizes of the ensembles of inputs for which the kernel-measure and the VC-dimension are to be estimated. Obviously both have to be chosen sufficiently large so that they produce a significant gradient over the parameter map under consideration (taking into account that their maximal possible value is bounded by the circuit size). 
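The heuristic combination used in Fig. 3c and Fig. 4d, scaling each measure linearly into [0, 1] over the parameter map and taking the difference, can be sketched as follows (toy 2 × 2 parameter map with invented values, since the measured maps themselves are not tabulated in the text):

```python
import numpy as np

def performance_predictor(kernel_quality, vc_estimate):
    """Difference of the two rank measures after scaling each linearly
    into the common range [0, 1] over the whole parameter map, as done
    for Fig. 3c and Fig. 4d (the paper notes there is no theoretically
    founded way to combine the two measures into a single value)."""
    def unit_scale(a):
        a = np.asarray(a, dtype=float)
        return (a - a.min()) / (a.max() - a.min())
    return unit_scale(kernel_quality) - unit_scale(vc_estimate)

# invented 2 x 2 parameter map: high kernel-quality combined with a low
# VC-dimension estimate should be predicted to perform best
kq = np.array([[200.0, 450.0], [250.0, 440.0]])
vc = np.array([[200.0, 430.0], [210.0, 250.0]])
pred = performance_predictor(kq, vc)
best = tuple(int(i) for i in np.unravel_index(np.argmax(pred), pred.shape))
print(best)  # prints: (1, 1)
```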
To achieve theoretical guarantees for the performance of the proposed predictor of the generalization capability of a neural microcircuit one should apply it to a relatively large ensemble S_univ of circuit inputs (and the dimension n of circuit states should be even larger). But the computer simulations of 132 types of neural microcircuit models that were discussed in this article suggest that practically quite good predictions can already be achieved for a much smaller ensemble of circuit inputs.\n\nAcknowledgment: The work was partially supported by the Austrian Science Fund FWF, project # P15386, and PASCAL project # IST2002-506778 of the European Union.\n\nReferences\n\n[Bartlett and Maass, 2003] Bartlett, P. L. and Maass, W. (2003). Vapnik-Chervonenkis dimension of neural nets. In Arbib, M. A., editor, The Handbook of Brain Theory and Neural Networks, pages 1188-1192. MIT Press (Cambridge), 2nd edition.\n\n[Cherkassky and Mulier, 1998] Cherkassky, V. and Mulier, F. (1998). Learning from Data. Wiley, New York.\n\n[Langton, 1990] Langton, C. G. (1990). Computation at the edge of chaos. Physica D, 42:12-37.\n\n[Maass et al., 2002] Maass, W., Natschläger, T., and Markram, H. (2002). Real-time computing without stable states: A new framework for neural computation based on perturbations. Neural Computation, 14(11):2531-2560.\n\n[Markram et al., 1998] Markram, H., Wang, Y., and Tsodyks, M. (1998). Differential signaling via the same axon of neocortical pyramidal neurons. PNAS, 95:5323-5328.\n\n[Shu et al., 2003] Shu, Y., Hasenstaub, A., and McCormick, D. A. (2003). Turning on and off recurrent balanced cortical activity. Nature, 423:288-293.\n\n[Vapnik, 1998] Vapnik, V. N. (1998). Statistical Learning Theory. John Wiley (New York).\n", "award": [], "sourceid": 2737, "authors": [{"given_name": "Wolfgang", "family_name": "Maass", "institution": null}, {"given_name": "Robert", "family_name": "Legenstein", "institution": null}, {"given_name": "Nils", "family_name": "Bertschinger", "institution": null}]}