{"title": "Group Redundancy Measures Reveal Redundancy Reduction in the Auditory Pathway", "book": "Advances in Neural Information Processing Systems", "page_first": 173, "page_last": 180, "abstract": null, "full_text": "Group Redundancy Measures Reveal Redundancy Reduction in the Auditory Pathway \n\nGal Chechik Amir Globerson Naftali Tishby \n\nSchool of Computer Science and Engineering \n\nand The Interdisciplinary Center for Neural Computation \n\nHebrew University of Jerusalem, Israel \n\nggal@cs.huji.ac.il \n\nMichael J. Anderson \n\nEric D. Young \n\nDepartment of Biomedical Engineering \n\nJohns Hopkins University, Baltimore, MD, USA \n\nIsrael Nelken \n\nDepartment of Physiology, Hadassah Medical School \n\nand The Interdisciplinary Center for Neural Computation \n\nHebrew University of Jerusalem, Israel \n\nAbstract \n\nThe way groups of auditory neurons interact to code acoustic information \nis investigated using an information theoretic approach. \nWe develop measures of redundancy among groups of neurons, and \napply them to the study of collaborative coding efficiency in two \nprocessing stations in the auditory pathway: the inferior colliculus \n(IC) and the primary auditory cortex (AI). Under two schemes for \nthe coding of the acoustic content, acoustic segments coding and \nstimulus identity coding, we show differences both in information \ncontent and group redundancies between IC and AI neurons. These \nresults provide for the first time direct evidence for redundancy \nreduction along the ascending auditory pathway, as has been hypothesized \non theoretical grounds [Barlow 1959, 2001]. The \nredundancy effects under the single-spikes coding scheme are significant \nonly for groups larger than ten cells, and cannot be revealed \nwith redundancy measures that use only pairs of cells. 
The \nresults suggest that the auditory system transforms low level representations \nthat contain redundancies due to the statistical structure \nof natural stimuli into a representation in which cortical neurons \nextract rare and independent components of complex acoustic \nsignals that are useful for auditory scene analysis. \n\n1 Introduction \n\nHow do groups of sensory neurons interact to code information and how do these \ninteractions change along the ascending sensory pathways? According to a \ncommon view, sensory systems are composed of a series of processing stations, \nrepresenting more and more complex aspects of sensory inputs. The changes in \nrepresentations of stimuli along the sensory pathway reflect the information processing \nperformed by the system. Several computational principles that govern \nthese changes have been suggested, such as information maximization and redundancy \nreduction [2, 3, 11]. In order to investigate such changes in practice, it is necessary \nto develop methods to quantify information content and redundancies among \ngroups of neurons, and trace these measures along the sensory pathway. \n\nInteractions and high order correlations between neurons have mostly been investigated \nwithin single brain areas at the level of pairs of cells (but also for larger groups of \ncells [9]), showing both synergistic and redundant interactions [8, 10, 21, 6, 7, 13]. \nThe current study develops information theoretic redundancy measures for larger \ngroups of neurons, focusing on the case of stimulus-conditioned independence. We \nthen compare these measures in electrophysiological recordings from two auditory \nstations: the auditory midbrain and the primary auditory cortex. 
\n\n2 Redundancy measures for groups of neurons \n\nTo investigate high order correlations and interactions within groups of neurons \nwe start by defining information measures for groups of cells and then develop \ninformation redundancy measures for such groups. The properties of these measures \nare then further discussed for the specific case of stimulus-conditioned independence. \n\nFormally, the level of independence of two variables X and Y is commonly quantified \nby their mutual information (MI) [17, 5]. This well known quantity, now widely used \nin the analysis of neural data, is defined by \n\nI(X; Y) = D_{KL}[P(X, Y) || P(X)P(Y)] = \\sum_{x,y} p(x, y) \\log \\frac{p(x, y)}{p(x)p(y)} \n\n(1) \n\nand measures how close the joint distribution P(X, Y) is to the factorization by the \nmarginal distributions P(X)P(Y) (D_{KL} is the Kullback-Leibler divergence [5]). \n\nFor larger groups of cells, an important generalized measure quantifies the information \nthat several variables provide about each other. This multi-information \nmeasure [18] is defined by \n\nI(X_1; ...; X_N) = D_{KL}[P(X_1, ..., X_N) || \\prod_{i=1}^{N} P(X_i)] = \\sum_{x_1, ..., x_N} p(x_1, ..., x_N) \\log \\frac{p(x_1, ..., x_N)}{\\prod_{i=1}^{N} p(x_i)} \n\n(2) \n\nSimilar to the mutual information case, the multi-information measures how close \nthe joint distribution is to the factorization by the marginals. It thus vanishes when \nthe variables are independent and is otherwise positive. \n\nWe now turn to develop measures for group redundancies. Consider first the simple \ncase of a pair of neurons (X_1, X_2) conveying information about the stimulus S. In \nthis case, the redundancy-synergy index [4, 7] is defined by \n\nRS_{pairs} = I(X_1, X_2; S) - I(X_1; S) - I(X_2; S) \n\n(3) \n\nIntuitively, RS_{pairs} measures the amount of information about the stimulus S gained \nby observing the joint distribution of both X_1 and X_2, as compared with observing \nthe two cells independently. In the extreme case where X_1 = X_2, the two cells \nare completely redundant and provide the same information about the stimulus, \nyielding RS_{pairs} = I(X_1, X_2; S) - I(X_1; S) - I(X_2; S) = -I(X_1; S), which is always \nnon-positive. 
On the other hand, positive RS_{pairs} values indicate synergistic \ninteraction between X_1 and X_2 [8, 7, 4]. \n\nFor larger groups of neurons, several different measures of redundancy-synergy may \nbe considered, which encompass different levels of interaction. For example, one can \nquantify the residual information obtained from a group of N neurons compared \nto all its N-1 subgroups. As with inclusion-exclusion calculations, this measure \ntakes the form of a telescopic sum: RS_{N|N-1} = I(X^N; S) - \\sum_{\\{X^{N-1}\\}} I(X^{N-1}; S) + \n... + (-1)^{N-1} \\sum_{\\{X^1\\}} I(X^1; S), where \\{X^k\\} are all the subgroups of size k out of the \nN available neurons. Unfortunately, this measure involves 2^N information terms, \nmaking its calculation infeasible even for moderate values of N (see footnote 1). \n\nA different RS measure quantifies the information embodied in the joint distribution \nof N neurons compared to that provided by N single independent neurons, and is \ndefined by \n\nRS_{N|1} = I(X_1, ..., X_N; S) - \\sum_{i=1}^{N} I(X_i; S) \n\n(4) \n\nInterestingly, this synergy-redundancy measure may be rewritten as the difference \nbetween two multi-information terms \n\nI(X_1, ..., X_N; S) - \\sum_{i=1}^{N} I(X_i; S) = \n\nH(X_1, ..., X_N) - H(X_1, ..., X_N | S) - \\sum_{i=1}^{N} [H(X_i) - H(X_i | S)] = \n\nI(X_1; ...; X_N | S) - I(X_1; ...; X_N) \n\n(5) \n\nwhere H(X) = -\\sum_{x} P(x) \\log P(x) is the entropy of X (see footnote 2). We conclude that the \nindex RS_{N|1} can be separated into two terms: one that is always non-negative \nand measures the coding synergy, and a second that is always non-positive and \nquantifies the redundancy. These two terms correspond to two types of interactions \nbetween neurons. The first type are within-stimulus correlations (sometimes termed \nnoise correlations) that emerge from functional connections between neurons and \ncontribute to synergy. 
The second type are between-stimulus correlations (or across-stimulus \ncorrelations) that reflect the fact that the cells have similar responses per \nstimulus, and contribute to redundancy. Being interested in the latter type of \ncorrelations, we limit the discussion to the redundancy term -I(X_1; ...; X_N). \n\nFormulating RS_{N|1} as in equation 5 proves highly useful when neural activities \nare independent given the stimulus, P(X|S) = \\prod_{i=1}^{N} P(X_i|S). In this case, the \nfirst (synergy) term vanishes, thus limiting neural interactions to the redundant \nregime. More importantly, under the independence assumption we only have to \nestimate the marginal distributions P(X_i|S = s) for each stimulus s instead of \nthe full distribution P(X|S = s). \nThis allows us to estimate an exponentially \nsmaller number of parameters, which in our case of small sample sizes provides \nmore accurate information estimates. This approximation makes it possible to \ninvestigate redundancy among considerably larger groups of neurons than the 2-3 \nneuron groups considered previously in the literature. \n\nFootnote 1: Our results below suggest that some redundancy effects become significant only for \ngroups larger than 10-15 cells. \n\nFootnote 2: When comparing redundancy in different processing stations, one must consider the \neffects of the baseline information conveyed by single neurons. We thus use the normalized \nredundancy (compare with [15] p.315 and [4]) defined by RS_{N|1} / I(X_1; ...; X_N; S). \n\nHow reasonable is the conditional-independence approximation? It is a good approximation \nwhenever neuronal activity is mostly determined by the presented stimulus \nand to a lesser extent by interactions with nearby neurons. A possible example \nis the high input regime of cortical neurons receiving thousands of inputs, where \na single input has only a limited influence on the activity of the target cell. 
The \nexperimental evidence in this regard is, however, mixed (see e.g. [9]). One should note, \nhowever, that stimulus-conditioned independence is implicitly assumed in the analysis \nof non-simultaneously recorded data. \n\nTo summarize, the stimulus-conditioned independence assumption limits interactions \nto the redundant regime, but allows us to compare the extent of redundancy \namong large groups of cells in different brain areas. \n\n3 Experimental Methods \n\nTo investigate redundancy in the auditory pathway, we analyze extracellular recordings \nfrom two brain areas of gas-anesthetized cats: 16 cells from the Inferior Colliculus \n(IC), the third processing station of the ascending auditory pathway, and \n19 cells from the Primary Auditory Cortex (AI), the fifth station. Neural activity \nwas recorded non-simultaneously from a total of 6 different animals responding to \na set of complex natural and modified stimuli. Because cortical auditory neurons \nrespond differently to simple and complex stimuli [12, 1], we refrain from using \nartificial over-simplified acoustic stimuli and instead use a set of stimuli based on bird \nvocalizations, which contains complex 'real-life' acoustic features. A representative \nexample is shown in figure 1. \n\nFigure 1: A representative stimulus containing a short bird vocalization recorded in \na natural environment. The set of stimuli consisted of similar natural and modified \nrecordings. A. Signal in time domain. B. Signal in frequency domain. Both panels are plotted against time (milliseconds). \n\n4 Experimental Results \n\nIn practice, in order to estimate the information conveyed by neural activity from \nlimited data, one must assume a decoding procedure, such as focusing on a simple \nstatistic of the spike trains that encompasses some of its informative properties. 
In \nthis paper we consider two extreme cases: coding short acoustic segments with single \nspikes and coding the stimulus identity with spike counts in a long window. In \naddition, we estimated information and redundancy obtained with two other statistics: \nfirst, the latency of the first spike after stimulus onset, and second, a statistic \nthat generalizes the count statistics for a general renewal process [19]. These calculations \nyielded higher information content on average, but redundancies similar \nto those presented below. Their detailed results will be reported elsewhere. \n\n[Figure: panel titled Auditory Cortex (AI); x-axis: number of cells.] \n\nWe have developed information theoretic measures of redundancy among groups of \nneurons and applied them to investigate the collaborative coding efficiency in the \nauditory modality. Under two different coding paradigms, we show differences in \nboth information content and group redundancies between IC and cortical auditory \nneurons. Single IC neurons carry more information about the presented stimulus, \nbut are also more redundant. On the other hand, auditory cortical neurons carry \nless information but are more independent, thus allowing information to be summed \nalmost linearly when considering groups of a few tens of neurons. The results provide \nfor the first time direct evidence for redundancy reduction along the ascending \nauditory pathway, as has been hypothesized by Barlow [2, 3]. The redundancy \neffects under the single-spikes coding paradigm are significant only for groups larger \nthan ten cells, and cannot be revealed with the standard redundancy measures that \nuse only pairs of cells. 
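
As a toy illustration of the group redundancy index discussed above, the following Python sketch computes the group information, the index RS_{N|1}, and the multi-information for a small population of cells that are independent given the stimulus. It is a minimal sketch under stated assumptions, not the estimation procedure used on the recordings: the cells are binary, the stimuli are equiprobable, the spike probabilities are made up, and the function names entropy and redundancy_terms are ours.

```python
import itertools
import math


def entropy(probs):
    # Shannon entropy in bits of an iterable of probabilities.
    return -sum(p * math.log2(p) for p in probs if p > 0)


def redundancy_terms(spike_prob):
    # spike_prob[i][s] is a hypothetical probability that binary cell i fires
    # given stimulus s. Stimuli are assumed equiprobable and cells are assumed
    # independent given the stimulus.
    n_cells = len(spike_prob)
    n_stim = len(spike_prob[0])
    p_s = 1.0 / n_stim

    # Joint distribution P(x_1, ..., x_N, s) built from conditional marginals.
    joint = {}
    for s in range(n_stim):
        for x in itertools.product((0, 1), repeat=n_cells):
            p = p_s
            for i, xi in enumerate(x):
                p *= spike_prob[i][s] if xi else 1.0 - spike_prob[i][s]
            joint[(x, s)] = p

    # Marginal distribution of the group response, P(x_1, ..., x_N).
    p_x = {}
    for (x, s), p in joint.items():
        p_x[x] = p_x.get(x, 0.0) + p

    h_s = entropy([p_s] * n_stim)
    h_x = entropy(p_x.values())
    group_info = h_x + h_s - entropy(joint.values())  # I(X_1, ..., X_N; S)

    single_info = []
    single_entropy = []
    for i in range(n_cells):
        p_xi_s = {}
        for (x, s), p in joint.items():
            p_xi_s[(x[i], s)] = p_xi_s.get((x[i], s), 0.0) + p
        p_xi = {}
        for (xi, s), p in p_xi_s.items():
            p_xi[xi] = p_xi.get(xi, 0.0) + p
        h_xi = entropy(p_xi.values())
        single_entropy.append(h_xi)
        single_info.append(h_xi + h_s - entropy(p_xi_s.values()))  # I(X_i; S)

    rs = group_info - sum(single_info)      # RS_{N|1}
    multi_info = sum(single_entropy) - h_x  # I(X_1; ...; X_N)
    return group_info, rs, multi_info


# Three cells with identical (made-up) tuning to two stimuli.
group_info, rs, multi_info = redundancy_terms([[0.9, 0.1]] * 3)
print(group_info, rs, multi_info)
```

Because the conditional multi-information vanishes under stimulus-conditioned independence, the returned index equals minus the multi-information up to rounding error. Cells with identical tuning give a negative index, while adding an untuned cell to a tuned one leaves the index at zero.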
\n\nOur results suggest that transformations leading to redundancy reduction are not \nlimited to low level sensory processing (aimed at reducing redundancy in input statistics) \nbut are applied even at cortical sensory stations. We suggest that an essential \nexperimental prerequisite to reveal these effects is the use of complex acoustic stimuli \nwhose processing occurs at high level processing stations. \n\nThe above findings are in agreement with the view that along the ascending sensory \npathways, the number of neurons increases, their firing rates decrease, and neurons \nbecome tuned to more complex and independent features. Together, these suggest \nthat the neural representation is mapped into a representation with higher effective \ndimensionality. Interestingly, recent advances in kernel-methods learning [20] have \nshown that nonlinear mapping into higher dimension and over-complete representations \nmay be useful for learning of complex classifications. It is therefore possible \nthat such mappings provide easier readout and more efficient learning in the brain. \n\nAcknowledgements \n\nThis work was supported in part by a Human Frontier Science Project (HFSP) grant \nRG 0133/1998 and by a grant from the Israeli Ministry of Science. \n\nReferences \n\n[1] O. Bar-Yosef and I. Nelken. Responses of neurons in cat primary auditory cortex to \nbird chirps: Effects of temporal and spectral context. J. Neuroscience, in press, 2001. \n\n[2] H.B. Barlow. Sensory mechanisms, the reduction of redundancy, and intelligence. In \nMechanisation of thought processes, pages 535-539. Her Majesty's Stationery Office, \nLondon, 1959. \n\n[3] H.B. Barlow. Redundancy reduction revisited. Network: Computation in neural \nsystems, 12:241-253, 2001. \n\n[4] N. Brenner, S.P. Strong, R. Koberle, R. de Ruyter van Steveninck, and W. Bialek. \nSynergy in a neural code. Neural Computation, 13(7):1531, 2000. \n\n[5] T.M. Cover and J.A. Thomas. 
The elements of information theory. Plenum Press, \nNew York, 1991. \n\n[6] Y. Dan, J.M. Alonso, W.M. Usrey, and R.C. Reid. Coding of visual information \nby precisely correlated spikes in the lateral geniculate nucleus. Nature Neuroscience, \n1(6):501-507, 1998. \n\n[7] I. Gat and N. Tishby. Synergy and redundancy among brain cells of behaving monkeys. \nIn M.S. Kearns, S.A. Solla, and D.A. Cohn, editors, Advances in Neural Information \nProcessing Systems, volume 11, Cambridge, MA, 1999. MIT Press. \n\n[8] T.J. Gawne and B.J. Richmond. How independent are the messages carried by adjacent \ninferior temporal cortical neurons? J. Neurosci., 13(7):2758-2771, 1993. \n\n[9] P.M. Gochin, M. Colombo, G.A. Dorfman, G.L. Gerstein, and C.G. Gross. Neural \nensemble coding in inferior temporal cortex. J. Neurophysiol., 71:2325-2337, 1994. \n\n[10] M. Meister. Multineural codes in retinal signaling. Proc. Natl. Acad. Sci., 93:609-614, \n1996. \n\n[11] J.P. Nadal, N. Brunel, and N. Parga. Nonlinear feedforward networks with stochastic \noutputs: infomax implies redundancy reduction. Network: Computation in neural \nsystems, 9:207-217, 1998. \n\n[12] I. Nelken, Y. Rotman, and O. Bar-Yosef. Specialization of the auditory system for \nthe analysis of natural sounds. In J. Brugge and P.F. Poon, editors, Central Auditory \nProcessing and Neural Modeling. Plenum, New York, 1997. \n\n[13] S. Nirenberg, S.M. Carcieri, A.L. Jacobs, and P.E. Latham. Retinal ganglion cells act \nlargely as independent encoders. Nature, 411:698-701, 2001. \n\n[14] L.M. Optican, T.J. Gawne, B.J. Richmond, and P.J. Joseph. Unbiased measures of \ntransmitted information and channel capacity from multivariate neuronal data. Biol. \nCybern., 65:305-310, 1991. \n\n[15] E.T. Rolls and A. Treves. Neural Networks and Brain Function. Oxford Univ. Press, \n1998. \n\n[16] I. Samengo. 
Independent neurons representing a finite set of stimuli: dependence of \nthe mutual information on the number of units sampled. Network: Comput. Neural \nSyst., 12:21-31, 2001. \n\n[17] C.E. Shannon. A mathematical theory of communication. The Bell System Technical \nJournal, 27:379-423, 623-656, 1948. \n\n[18] M. Studeny and J. Vejnarova. The multiinformation function as a tool for measuring \nstochastic dependence. In M.I. Jordan, editor, Learning in Graphical Models, pages \n261-297. Dordrecht: Kluwer, 1998. \n\n[19] C. van Vreeswijk. Information transmission with renewal neurons. In J.M. Bower, \neditor, Computational Neuroscience: Trends in Research. Elsevier Press, 2001. \n\n[20] V.N. Vapnik. The nature of statistical learning theory. Springer-Verlag, Berlin, 1995. \n\n[21] D.K. Warland, P. Reinagel, and M. Meister. Decoding visual information from a \npopulation of retinal ganglion cells. J. Neurophysiol., 78:2336-2350, 1997. \n\n", "award": [], "sourceid": 2021, "authors": [{"given_name": "Gal", "family_name": "Chechik", "institution": null}, {"given_name": "Amir", "family_name": "Globerson", "institution": null}, {"given_name": "M.", "family_name": "Anderson", "institution": null}, {"given_name": "E.", "family_name": "Young", "institution": null}, {"given_name": "Israel", "family_name": "Nelken", "institution": null}, {"given_name": "Naftali", "family_name": "Tishby", "institution": null}]}