{"title": "Optimal Information Decoding from Neuronal Populations with Specific Stimulus Selectivity", "book": "Advances in Neural Information Processing Systems", "page_first": 937, "page_last": 944, "abstract": null, "full_text": " Optimal information decoding from neuronal\n populations with specific stimulus selectivity\n\n\n\n Marcelo A. Montemurro Stefano Panzeri \n The University of Manchester The University of Manchester\n Faculty of Life Sciences Faculty of Life Sciences\n Moffat Building Moffat Building\n PO Box 88, Manchester M60 1QD, UK PO Box 88, Manchester M60 1QD, UK\n m.montemurro@manchester.ac.uk s.panzeri@manchester.ac.uk\n\n\n\n\n Abstract\n\n A typical neuron in visual cortex receives most inputs from other cortical\n neurons with a roughly similar stimulus preference. Does this arrange-\n ment of inputs allow efficient readout of sensory information by the tar-\n get cortical neuron? We address this issue by using simple modelling of\n neuronal population activity and information theoretic tools. We find that\n efficient synaptic information transmission requires that the tuning curve\n of the afferent neurons is approximately as wide as the spread of stim-\n ulus preferences of the afferent neurons reaching the target neuron. By\n meta analysis of neurophysiological data we found that this is the case\n for cortico-cortical inputs to neurons in visual cortex. We suggest that\n the organization of V1 cortico-cortical synaptic inputs allows optimal in-\n formation transmission.\n\n\n\n1 Introduction\n\nA typical neuron in visual cortex receives most of its inputs from other visual cortical neu-\nrons. The majority of cortico-cortical inputs arise from afferent cortical neurons with a\npreference to stimuli which is similar to that of the target neuron [1, 2, 3]. 
For exam-\nple, orientation-selective neurons in superficial layers of ferret visual cortex receive more\nthan 50% of their cortico-cortical excitatory inputs from neurons whose orientation prefer-\nence is less than 30° apart. However, this input structure is rather broad in terms\nof stimulus specificity: cortico-cortical connections between neurons tuned to dissimilar\nstimulus orientations also exist [4]. The structure and spread of the stimulus specificity of\ncortico-cortical connections have received a lot of attention because of their importance for\nunderstanding the mechanisms of generation of orientation tuning (see [4] for a review).\nHowever, little is known about whether this structure of inputs allows efficient transmis-\nsion of sensory information across cortico-cortical synapses.\n\nIt is likely that the efficiency of information transmission across cortico-cortical synapses also\ndepends on the width of the tuning curves of the afferent cortical neurons. In fact,\ntheoretical work on population coding has shown that the width of the tuning curves has\nan important influence on the quality and the nature of the information encoded in cortical\npopulations [5, 6, 7, 8]. Another factor that may influence the efficiency of cortico-cortical\nsynaptic information transmission is the biophysical capability of the target neuron. To\nconserve all information during synaptic transmission, the target neuron must conserve the\n\"label\" of the spikes arriving from multiple input neurons at different places on its dendritic\ntree [9]. Because of biophysical limitations, a target neuron that can, for example, only sum inputs\nat the soma may lose a large part of the information present in the afferent activity. 
The\noptimal arrangement of cortico-cortical synapses may therefore also depend on the capability of\npostsynaptic neurons to process spikes from different neurons separately.\n\nIn this paper, we address the problem of whether cortico-cortical synaptic systems encode\ninformation efficiently. We introduce a simple model of neuronal information processing\nthat takes into account both the selective distribution of stimulus preferences typical of\ncortico-cortical connections and the potential biophysical limitations of cortical neurons.\nWe use this model and information theoretic tools to investigate whether there is an opti-\nmal trade-off between the spread of the distribution of stimulus preferences across the afferent\nneurons and the tuning width of the afferent neurons themselves. We find that efficient synaptic\ninformation transmission requires that the tuning curve of the afferent neurons is approx-\nimately as wide as the spread of stimulus preferences of the afferent fibres reaching the\ntarget neuron. By reviewing anatomical and physiological data, we argue that this optimal\ntrade-off is approximately reached in visual cortex. These results suggest that neurons in\nvisual cortex are wired to decode information optimally from a stimulus-specific distribu-\ntion of synaptic inputs.\n\n\n2 Model of the activity of the afferent neuronal population\n\nWe consider a simple model for the activity of the afferent neuronal population based on\nthe known tuning properties and spatial and synaptic organisation of sensory areas.\n\n\n2.1 Stimulus tuning of individual afferent neurons\n\nWe assume that the population is made of a large number $N$ of neurons (for a real\ncortical neuron, the number $N$ of afferents is of the order of a few thousand [10]). The\nresponse of each neuron $r_k$ ($k = 1, \\ldots, N$) is quantified as the number of spikes fired in\na salient post-stimulus time window of length $\\tau$. 
Thus, the overall neuronal population\nresponse is represented as a spike count vector $\\mathbf{r} = (r_1, \\ldots, r_N)$.\n\nWe assume that the neurons are tuned to a small number $D$ of relevant stimulus parameters\n[11, 12], such as orientation, speed or direction of motion of a visual object. The\nstimulus variable will thus be described as a vector $\\mathbf{s} = (s_1, \\ldots, s_D)$ of dimension $D$. The\nnumber of stimulus features that are encoded by the neuron will be left as a free parameter\nto be varied within the range 1-5 for two reasons. First, although there is evidence that the\nnumber of stimulus features encoded by a single neuron is limited [11, 12], more research\nis still needed to determine exactly how many stimulus parameters are encoded in different\nareas. Second, a previous related study [8] has shown that, when considering large neuronal\npopulations with a uniform distribution of stimulus preferences (such as a hypercolumn\nin V1 containing all stimulus orientations), the tuning width of individual neurons which is\noptimal for population coding depends crucially on the number of stimulus features being\nencoded. Thus, it is interesting to investigate how the optimal arrangement of cortico-\ncortical synaptic systems depends on the number of stimulus features being encoded.\n\nThe neuronal tuning function of the $k$-th neuron ($k = 1, \\ldots, N$), which quantifies the\nmean spike count of the $k$-th neuron in response to the presented stimulus, is modelled as a Gaussian\nfunction, characterised by the following parameters: preferred stimulus $\\mathbf{s}^{(k)}$, tuning\nwidth $\\sigma_f$, and response modulation $m$:\n\n$$f^{(k)}(\\mathbf{s}) = m \\exp\\left( -\\frac{(\\mathbf{s} - \\mathbf{s}^{(k)})^2}{2\\sigma_f^2} \\right) \\qquad (1)$$\n\nThe Gaussian tuning curve is a good description of the tuning properties of e.g. V1 or\nMT neurons to variables such as stimulus orientation or motion direction [13, 14, 15], and is\nhence widely used in models of sensory coding [16, 17]. 
Large values of $\\sigma_f$ indicate coarse\ncoding, whereas small values of $\\sigma_f$ indicate sharp tuning.\n\nSpike count responses of each neuron on each trial are assumed to follow a Poisson distri-\nbution whose mean is given by the above neuronal tuning function (Eq. 1). The Poisson\nmodel is widely used because it is the simplest model of neuronal firing that captures a\nsalient property of neuronal firing: the variance of the spike count is proportional to its\nmean. The Poisson model neglects all correlations between spikes. This assumption is\ncertainly a simplification but it is sufficient to account for the majority of the information\ntransmitted by real cortical neurons [18, 19, 20] and, as we shall see later, it is mathemati-\ncally convenient because it makes our model tractable.\n\n\n2.2 Distribution of stimulus preferences among the afferent population\n\nNeurons in sensory cortex receive a large number of inputs from other neurons with a vari-\nety of stimulus preferences. However, the majority of their inputs come from neurons with\nroughly similar stimulus preference [1, 2, 3]. To characterise this type of spread\nof stimulus preference among the afferent population correctly, we assume (unlike in previous stud-\nies [8]) that the probability distribution of the preferred stimulus among afferent neurons\nfollows a Gaussian distribution:\n\n$$P(\\hat{\\mathbf{s}}) = \\frac{1}{(2\\pi)^{D/2} \\sigma_p^D} \\exp\\left( -\\frac{(\\hat{\\mathbf{s}} - \\hat{\\mathbf{s}}_0)^2}{2\\sigma_p^2} \\right) \\qquad (2)$$\n\nIn Eq. (2) the parameter $\\hat{\\mathbf{s}}_0$ represents the center of the distribution, and is thus the\nmost represented preferred stimulus in the population. (We set, without loss of general-\nity, $\\hat{\\mathbf{s}}_0 = 0$.) The parameter $\\sigma_p$ controls the spread of stimulus preferences of the afferent\nneuronal population: a small value of $\\sigma_p$ indicates that a large fraction of the population\nhas similar stimulus preferences, and a large value of $\\sigma_p$ indicates that all stimuli are\nrepresented similarly. 
A Gaussian distribution of stimulus preferences of the afferent pop-\nulation fits well the empirical data on the distribution of preferred orientations of synaptic inputs\nof neurons in both deep and superficial layers of ferret primary visual cortex [3].\n\n\n3 Width of tuning and spread of stimulus preferences in visual cortex\n\nTo estimate the width of tuning $\\sigma_f$ and the spread of stimulus preferences $\\sigma_p$ of cortico-\ncortical afferent populations in visual cortex, we critically reviewed published anatomical\nand physiological data. We concentrated on excitatory synaptic inputs, which form the\nmajority of inputs to a cortical pyramidal neuron [10]. We computed $\\sigma_p$ by fitting (by a least\nsquares method) the published histograms of synaptic connections as a function of the stimulus\npreference of the input neuron to Gaussian distributions. Similarly, we determined $\\sigma_f$ by\nfitting spike count histograms to Gaussian tuning curves.\n\nWhen considering a target neuron in ferret primary visual cortex and using orientation as\nthe stimulus parameter, the spread of stimulus preferences $\\sigma_p$ of its inputs is 20° for\nlayer 5/6 neurons [3], and 16° [3] to 23° [21] for layer 2/3 neurons. The orientation tuning\nwidth $\\sigma_f$ of the cortical inputs to the V1 target neuron is that of other V1 neurons that\nproject to it. This $\\sigma_f$ is 17° for layer 4 neurons [22], and it is similar for neurons in deep\nand superficial layers [3]. When considering a target neuron in layer 4 of cat visual cortex\nand orientation tuning, the spread of stimulus preference $\\sigma_p$ is 20° [2] and $\\sigma_f$ is 17°.\nWhen considering a target neuron in ferret visual cortex and motion direction tuning, the\nspread of tuning of its inputs $\\sigma_p$ is 30° [1]. 
The motion direction tuning width of macaque\nneurons is 28°, and this width is similar across species (see [13]).\n\nThe most notable finding of our meta-analysis of published data is that $\\sigma_p$ and $\\sigma_f$ appear\nto be approximately of the same size and their ratio $\\sigma_f/\\sigma_p$ is distributed around 1, in the\nrange 0.7 to 1.1 for the above data. We will use our model to understand whether this range\nof $\\sigma_f/\\sigma_p$ corresponds to an optimal way to transmit information across a synaptic system.\n\n\n4 Information theoretic quantification of population decoding\n\n\nTo characterise how a target neuronal system can decode the information about sensory\nstimuli contained in the activity of its afferent neuronal population, we use mutual infor-\nmation [23]. The mutual information between a set of stimuli and the neuronal responses\nquantifies how well any decoder can discriminate among stimuli by observing the neuronal\nresponses. This measure has the advantage of being independent of the decoding mecha-\nnism used, and thus puts precise constraints on the information that can be decoded by any\nbiological system operating on the afferent activity.\n\nPrevious studies on the information content of an afferent neuronal population [7, 8] have\nassumed that the target neuronal decoding system can extract all the information during\nsynaptic transmission. To do so, the target neuron must conserve the \"label\" of the spikes\narriving from multiple neurons at different sites on its dendritic tree [9]. Given the poten-\ntial biophysical difficulty in processing each spike separately, a simple alternative to spike\nlabelling has been proposed: spike pooling [10, 24]. In this scheme, the target neuron\nsimply sums up the afferent activity. 
To characterize how the decoding of afferent informa-\ntion would work in both cases, we compute both the information that can be decoded by\na system that processes spikes from different neurons separately (the \"labeled-line\ninformation\") and the information available to a decoder that sums all incoming spikes (the\n\"pooled information\") [9, 24]. In the next two subsections we define these quantities and\nwe explain how we compute them in our model.\n\n\n4.1 The information available to the labeled-line decoder\n\nThe mutual information between the set of the stimuli and the labeled-line neuronal popu-\nlation activity is defined as follows [9, 24]:\n\n$$I_{LL}(S, R) = \\int d\\mathbf{s}\\, P(\\mathbf{s}) \\sum_{\\mathbf{r}} P(\\mathbf{r}|\\mathbf{s}) \\log \\frac{P(\\mathbf{r}|\\mathbf{s})}{P(\\mathbf{r})} \\qquad (3)$$\n\nwhere $P(\\mathbf{s})$ is the probability of stimulus occurrence (here taken for simplicity as a uni-\nform distribution over the hypersphere of $D$ dimensions and \"radius\" s). $P(\\mathbf{r}|\\mathbf{s})$ is the\nprobability of observing a neuronal population response $\\mathbf{r}$ conditional on the occurrence of\nstimulus $\\mathbf{s}$, and $P(\\mathbf{r}) = \\int d\\mathbf{s}\\, P(\\mathbf{s}) P(\\mathbf{r}|\\mathbf{s})$. Since the response vector $\\mathbf{r}$ keeps separate the\nspike counts of each neuron, the amount of information in Eq. (3) is only accessible to a\ndecoder that can keep the label of which neuron fired which spike [9, 24]. The probability\n$P(\\mathbf{r}|\\mathbf{s})$ is computed according to the Poisson distribution, which is entirely determined by\nthe knowledge of the tuning curves [5]. The labeled-line mutual information is difficult to\ncompute for large populations, because it requires the knowledge of the probability of the\nlarge-dimensional response vector $\\mathbf{r}$. 
However, since in our model we assume\na very large number of independent neurons in the population and that the total activity of\nthe system is of the order of its size, we can use the following simpler (but still exact)\nexpression [16, 25]:\n\n$$I_{LL}(S, R) = H(S) - \\frac{D}{2} \\ln(2\\pi e) + \\frac{1}{2} \\int d\\mathbf{s}\\, P(\\mathbf{s}) \\ln\\left( |J(\\mathbf{s})| \\right) \\qquad (4)$$\n\nwhere $H(S)$ is the entropy of the prior stimulus presentation distribution $P(S)$, $J(\\mathbf{s})$ is\nthe Fisher information matrix and $| \\ldots |$ stands for the determinant. The Fisher information\nmatrix is a $D \\times D$ matrix whose elements $i, j$ are defined as follows:\n\n$$J_{i,j}(\\mathbf{s}) = -\\sum_{\\mathbf{r}} P(\\mathbf{r}|\\mathbf{s}) \\frac{\\partial^2}{\\partial s_i \\partial s_j} \\log P(\\mathbf{r}|\\mathbf{s}) \\qquad (5)$$\n\nFisher information is a useful measure of the accuracy with which a particular stimulus can\nbe reconstructed from a single trial observation of neuronal population activity. However,\nin this paper it is used only as a step to obtain a computationally tractable expression for the\nlabeled-line mutual information. The Fisher information matrix can be computed by taking\ninto account that for a population of Poisson neurons it is just the sum of the Fisher informa-\ntion for individual neurons, and the latter has a simple expression in terms of tuning curves\n[16]. Since the neuronal population size $N$ is large, the sum over the Fisher information\nof individual neurons can be replaced by an integral over the stimulus preferences of the\nneurons in the population, weighted by their probability density $P(\\hat{\\mathbf{s}})$. After performing\nthe integral over the distribution of preferred stimuli, we arrive at the following result for\nthe elements of the Fisher information matrix:\n\n$$J_{i,j}(\\mathbf{s}) = N m \\, \\frac{\\sigma^{D-2}}{\\sigma_p^2 (1 + \\sigma^2)^{D/2 + 2}} \\left[ \\delta_{i,j} + \\sigma^2 \\left( \\delta_{i,j} + \\xi_i \\xi_j \\right) \\right] e^{-\\frac{\\xi^2}{2(1 + \\sigma^2)}} \\qquad (6)$$\n\nwhere we have introduced the following short-hand notation: $\\sigma \\equiv \\sigma_f/\\sigma_p$ and $\\boldsymbol{\\xi} \\equiv \\mathbf{s}/\\sigma_p$;\n$\\delta_{i,j}$ stands for the Kronecker delta. From Eq. 
(6) it is possible to compute explicitly the\ndeterminant $|J(\\mathbf{s})|$, which has the following form:\n\n$$|J(\\mathbf{s})| = \\prod_{i=1}^{D} \\lambda_i = \\alpha(\\xi)^D (1 + \\sigma^2)^{D-1} \\left[ 1 + \\sigma^2 (1 + \\xi^2) \\right] \\qquad (7)$$\n\nwhere $\\lambda_i$ are the eigenvalues of $J(\\mathbf{s})$ and $\\alpha(\\xi)$ is given by:\n\n$$\\alpha(\\xi) = N m \\, \\frac{\\sigma^{D-2}}{\\sigma_p^2 (1 + \\sigma^2)^{D/2 + 2}} \\, e^{-\\frac{\\xi^2}{2(1 + \\sigma^2)}} \\qquad (8)$$\n\nInserting Eq. (7) into Eq. (4), one obtains a tractable but still exact expression for the\nmutual information, which has the advantage over Eq. (3) of requiring only an integral\nover a $D$-dimensional stimulus rather than a sum over an infinite population.\n\nWe have studied numerically the dependence of the labeled-line information on the pa-\nrameters $\\sigma_f$ and $\\sigma_p$ as a function of the number of encoded stimulus features $D$ (footnote 1). We\ninvestigated this by fixing $\\sigma_p$ and then varying the ratio $\\sigma_f/\\sigma_p$ over a wide range. Results\n(obtained for $\\sigma_p = 1$ but representative of a wide $\\sigma_f$ range) are reported in Fig. 1. We\nfound that, unlike the case of a uniform distribution of stimulus preferences [8], there is a\nfinite value of the width of tuning $\\sigma_f$ that maximizes the information for all $D \\geq 2$. Inter-\nestingly, for $D \\geq 2$ the range $0.7 \\leq \\sigma_f/\\sigma_p \\leq 1.1$ found in visual cortex either contains the\nmaximum or corresponds to near optimal values of information transmission. For $D = 1$,\ninformation is maximal for very narrow tuning curves. However, in this case too the in-\nformation values are still efficient in the cortical range $\\sigma_f/\\sigma_p \\approx 1$, in that the tail of the\n$D = 1$ information curve is avoided in that region. 
Thus, the range of values of $\\sigma_f$ and $\\sigma_p$\nfound in visual cortex allows efficient synaptic information transmission over a wide range\nof numbers of stimulus features encoded by the neuron.\n\n Footnote 1: We found (data not shown) that other parameters, such as $m$ and $\\tau$, had a weak or null effect on\nthe optimal configuration; see [17] for a $D = 1$ example in a different context.\n\n[Figure 1 plot: $I_{LL}(S,R)$ (y-axis) versus $\\sigma_f/\\sigma_p$ (x-axis, 0 to 8); curves labeled from D=1 (top) to D=5 (bottom).]\n\nFigure 1: Mutual labeled-line information as a function of the ratio of tuning curve width\nand stimulus preference spread $\\sigma_f/\\sigma_p$. The curves for each stimulus dimensionality $D$\nwere shifted by a constant factor to separate them for visual inspection (lower curves cor-\nrespond to higher values of $D$). The y-axis is thus in arbitrary units. The position of the\nmaximal information for each stimulus dimension falls either inside the range of values\nof $\\sigma_f/\\sigma_p$ found in visual cortex, or very close to it (see text). Parameters are as follows:\ns = 2, $r_{max}$ = 50 Hz, $\\tau$ = 10 ms.\n\n\n4.2 The information available to the pooling decoder\n\nWe now consider the case in which the target neuron cannot process separately spikes\nfrom different neurons (for example, a neuron that just sums up post-synaptic potentials\nof approximately equal weight at the soma). In this case the label of the neuron that fired\neach spike is lost by the target neuron, and it can only operate on the pooled neuronal\nsignal, in which the identity of each spike is lost. Pooling mechanisms have been proposed\nas simple information processing strategies for the nervous system. We now study how\npooling changes the requirements for efficient decoding by the target neuron.\n\nMathematically speaking, pooling maps the vector $\\mathbf{r}$ onto a scalar $\\rho$ equal to the sum of the\nindividual activities: $\\rho = \\sum_k r_k$. 
Thus, the mutual information that can be extracted by any\ndecoder that only pools its inputs is given by the following expression:\n\n$$I_P(S, R) = \\int d\\mathbf{s}\\, P(\\mathbf{s}) \\sum_{\\rho} P(\\rho|\\mathbf{s}) \\log \\frac{P(\\rho|\\mathbf{s})}{P(\\rho)} \\qquad (9)$$\n\nwhere $P(\\rho|\\mathbf{s})$ and $P(\\rho)$ are the stimulus-conditional and stimulus-unconditional proba-\nbilities of observing a pooled population response $\\rho$ on a single trial. The probability $P(\\rho|\\mathbf{s})$\ncan be computed by noting that a sum of Poisson-distributed responses is still a Poisson-\ndistributed response whose tuning curve to stimuli is just the sum of the individual tuning\ncurves. The pooled mutual information is thus a function of a single Poisson-distributed\nresponse variable and can be computed easily also for large populations.\n\nThe dependence of the pooled information on the parameters $\\sigma_f$ and $\\sigma_p$ as a function of\nthe number of encoded stimulus features $D$ is reported in Fig. 2. There is one important\ndifference with respect to the labeled-line information transmission case: for the pooled\ninformation, there is a finite optimal value for information transmission\nalso when the neurons are tuned to a one-dimensional stimulus feature. For all cases of stim-\nulus dimensionality considered, efficient information transmission through the pooled\nneuronal decoder can still be reached in the visual cortical range $0.7 \\leq \\sigma_f/\\sigma_p \\leq 1.1$. This\nfinding shows that the range of values of $\\sigma_f$ and $\\sigma_p$ found in visual cortex allows effi-\ncient synaptic information transmission even if the target neuron does not rely on complex\ndendritic processing to conserve the label of the neuron that fired the spike.\n\n[Figure 2 plot: $I_P(S,R)$ (y-axis) versus $\\sigma_f/\\sigma_p$ (x-axis, 0 to 4); curves labeled from D=1 (top) to D=3 (bottom).]\n\nFigure 2: Pooled mutual information as a function of the ratio of tuning curve width and\nstimulus preference spread $\\sigma_f/\\sigma_p$. The maxima are inside the range of experimental values\nof $\\sigma_f/\\sigma_p$ found in the visual cortex, or very close to it (see text). As for Fig. 1, the curves\nfor each stimulus dimensionality $D$ were shifted by a constant factor to separate them for\nvisual inspection (lower curves correspond to higher values of $D$). The y-axis is thus in\narbitrary units. Parameters are as follows: s = 2, $r_{max}$ = 50 Hz, $\\tau$ = 10 ms.\n\n\n5 Conclusions\n\nThe stimulus specificity of cortico-cortical connections is important for understanding the\nmechanisms of generation of orientation tuning (see [4] for a review). Here, we have\nshown that the stimulus-specific structure of cortico-cortical connections may also have im-\nplications for understanding cortico-cortical information transmission. Our results suggest\nthat, whatever the exact role of cortico-cortical synapses in generating orientation tuning,\ntheir wiring allows efficient transmission of sensory information.\n\n\nAcknowledgments\n\nWe thank A. Nevado and R. Petersen for many interesting discussions. Research supported\nby ICTP, HFSP, Royal Society and Wellcome Trust 066372/Z/01/Z.\n\n\nReferences\n\n [1] B. Roerig and J. P. Y. Kao. Organization of intracortical circuits in relation to direction\n preference maps in ferret visual cortex. J. Neurosci., 19:RC44(105), 1999.\n\n [2] T. Yousef, T. Bonhoeffer, D-S. Kim, U. T. Eysel, E. Toth, and Z. F. Kisvarday. Orien-\n tation topography of layer 4 lateral networks revealed by optical imaging in cat visual\n cortex (area 18). European J. Neurosci., 11:4291-4308, 1999.\n\n [3] B. Roerig and B. Chen. Relations of local inhibitory and excitatory circuits to orien-\n tation preference maps in ferret visual cortex. Cerebral Cortex, 12:187-198, 2002.\n\n [4] K. A. C. Martin. Microcircuits in visual cortex. Current Opinion in Neurobiology,\n 12:418-425, 2002.\n\n [5] P. Dayan and L. F. Abbott. Theoretical Neuroscience. MIT Press, 2001.\n\n [6] D. C. Fitzpatrick, R. Batra, T. R. Stanford, and S. Kuwada. A neuronal population\n code for sound localization. Nature, 388:871-874, 1997.\n\n [7] A. Pouget, S. Deneve, J-C. 
Ducom, and P.E. Latham. Narrow versus wide tuning\n curves: what's best for a population code? Neural Computation, 11:85-90, 1999.\n\n [8] K. Zhang and T.J. Sejnowski. Neuronal tuning: to sharpen or to broaden? Neural\n Computation, 11:75-84, 1999.\n\n [9] D. S. Reich, F. Mechler, and J. D. Victor. Independent and redundant information in\n nearby cortical neurons. Science, 294:2566-2568, 2001.\n\n[10] M. N. Shadlen and W. T. Newsome. The variable discharge of cortical neurons:\n implications for connectivity, computation and coding. J. Neurosci., 18(10):3870-3896,\n 1998.\n\n[11] N. Brenner, W. Bialek, and R. de Ruyter van Steveninck. Adaptive rescaling maxi-\n mizes information transmission. Neuron, 26:695-702, 2000.\n\n[12] J. Touryan, B. Lau, and Y. Dan. Isolation of relevant visual features from random\n stimuli for cortical complex cells. J. Neurosci., 22:10811-10818, 2002.\n\n[13] T. D. Albright. Direction and orientation selectivity of neurons in visual area MT of\n the macaque. J. Neurophysiol., 52:1106-1130, 1984.\n\n[14] K.H. Britten, M. N. Shadlen, W. T. Newsome, and J. A. Movshon. The analysis\n of visual-motion - a comparison of neuronal and psychophysical performance. J.\n Neurosci., 12:4745-4765, 1992.\n\n[15] K. Kang, R.M. Shapley, and H. Sompolinsky. Information tuning of population of neu-\n rons in primary visual cortex. J. Neurosci., 24:3726-3735, 2004.\n\n[16] N. Brunel and J. P. Nadal. Mutual information, Fisher information and population\n coding. Neural Computation, 10:1731-1757, 1998.\n\n[17] A. Nevado, M.P. Young, and S. Panzeri. Functional imaging and neural information\n coding. Neuroimage, 21:1095-1095, 2004.\n\n[18] S. Nirenberg, S. M. Carcieri, A.L. Jacobs, and P. E. Latham. Retinal ganglion cells\n act largely as independent encoders. Nature, 411:698-701, 2001.\n\n[19] R. S. Petersen, S. Panzeri, and M.E. Diamond. Population coding of stimulus location\n in rat somatosensory cortex. Neuron, 32:503-514, 2001.\n\n[20] M. W. Oram, N.G. 
Hatsopoulos, B.J. Richmond, and J.P. Donoghue. Excess syn-\n chrony in motor cortical neurons provides redundant direction information with that\n from coarse temporal measures. J. Neurophysiol., 86:1700-1716, 2001.\n\n[21] M. B. Dalva, M. Weliky, and L. Katz. Relations between local synaptic connections\n and orientation domains in primary visual cortex. Neuron, 19:871-880, 1997.\n\n[22] W. M. Usrey, M. P. Sceniak, and B. Chapman. Receptive fields and response prop-\n erties of neurons in layer 4 of ferret visual cortex. J. Neurophysiol., 89:1003-1015,\n 2003.\n\n[23] T.M. Cover and J.A. Thomas. Elements of information theory. John Wiley, 1991.\n\n[24] S. Panzeri, F. Petroni, R.S. Petersen, and M.E. Diamond. Decoding neuronal popu-\n lation activity in rat somatosensory cortex: role of columnar organization. Cerebral\n Cortex, 13:45-52, 2003.\n\n[25] K. Kang and H. Sompolinsky. Mutual information of population codes and distance\n measures in probability space. Phys. Rev. Lett., 86:4958-4961, 2001.\n", "award": [], "sourceid": 2649, "authors": [{"given_name": "Marcelo", "family_name": "Montemurro", "institution": null}, {"given_name": "Stefano", "family_name": "Panzeri", "institution": null}]}