{"title": "A lattice filter model of the visual pathway", "book": "Advances in Neural Information Processing Systems", "page_first": 1709, "page_last": 1717, "abstract": "Early stages of visual processing are thought to decorrelate, or whiten, the incoming temporally varying signals. Because the typical correlation time of natural stimuli, as well as the extent of temporal receptive fields of lateral geniculate nucleus (LGN) neurons, is much greater than neuronal time constants, such decorrelation must be done in stages combining contributions of multiple neurons. We propose to model temporal decorrelation in the visual pathway with the lattice filter, a signal processing device for stage-wise decorrelation of temporal signals. The stage-wise architecture of the lattice filter maps naturally onto the visual pathway (photoreceptors -> bipolar cells -> retinal ganglion cells -> LGN) and its filter weights can be learned using Hebbian rules in a stage-wise sequential manner. Moreover, predictions of neural activity from the lattice filter model are consistent with physiological measurements in LGN neurons and fruit fly second-order visual neurons. Therefore, the lattice filter model is a useful abstraction that may help unravel visual system function.", "full_text": "A lattice \ufb01lter model of the visual pathway\n\nKarol Gregor\n\nDmitri B. Chklovskii\n\nJanelia Farm Research Campus, HHMI\n\n19700 Helix Drive, Ashburn, VA\n\n{gregork, mitya}@janelia.hhmi.org\n\nAbstract\n\nEarly stages of visual processing are thought to decorrelate, or whiten, the incom-\ning temporally varying signals. Motivated by the cascade structure of the visual\npathway (retina \u2192 lateral geniculate nucelus (LGN) \u2192 primary visual cortex, V1)\nwe propose to model its function using lattice \ufb01lters - signal processing devices\nfor stage-wise decorrelation of temporal signals. 
Lattice \ufb01lter models predict neuronal responses consistent with physiological recordings in cats and primates. In particular, they predict temporal receptive \ufb01elds of two different types resembling so-called lagged and non-lagged cells in the LGN. Moreover, connection weights in the lattice \ufb01lter can be learned using Hebbian rules in a stage-wise sequential manner reminiscent of the neuro-developmental sequence in mammals. In addition, lattice \ufb01lters can model visual processing in insects. Therefore, the lattice \ufb01lter is a useful abstraction that captures temporal aspects of visual processing.\n\nOur sensory organs face an ongoing barrage of stimuli from the world and must transmit as much information about them as possible to the rest of the brain [1]. This is a formidable task because, in sensory modalities such as vision, the dynamic range of natural stimuli (more than three orders of magnitude) greatly exceeds the dynamic range of relay neurons (less than two orders of magnitude) [2]. The reason why high-\ufb01delity transmission is possible at all is that the continuity of objects in the physical world leads to correlations in natural stimuli, which imply redundancy. In turn, such redundancy can be eliminated by compression performed by the front end of the visual system, leading to the reduction of the dynamic range [3, 4].\nA compression strategy appropriate for redundant natural stimuli is called predictive coding [5, 6, 7]. In predictive coding, a prediction of the incoming signal value is computed from past values delayed in the circuit. This prediction is subtracted from the actual signal value and only the prediction error is transmitted. In the absence of transmission noise such compression is lossless, as the original signal can be decoded on the receiving end by inverting the encoder. If predictions are accurate, the dynamic range of the error is much smaller than that of the natural stimuli. 
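This range reduction can be demonstrated with a small numerical sketch (the signal, names and parameters below are our own illustrative choices; the weight update is the least mean square rule introduced in Section 1):

```python
import random

def predictive_coder(signal, order=2, eta=0.01):
    """Transmit only the errors of an online linear prediction.

    Illustrative sketch: the function name, signal and parameters are our
    own choices, and the weight update is the least mean square rule.
    """
    w = [0.0] * order            # prediction weights, adapted online
    past = [0.0] * order         # most recent past values, newest first
    errors = []
    for x in signal:
        pred = sum(wi * zi for wi, zi in zip(w, past))
        e = x - pred             # only this prediction error is transmitted
        errors.append(e)
        w = [wi + eta * e * zi for wi, zi in zip(w, past)]  # LMS update
        past = [x] + past[:-1]
    return errors

# A correlated "natural" stimulus: a slowly varying first-order process.
random.seed(0)
sig, x = [], 0.0
for _ in range(5000):
    x = 0.99 * x + random.gauss(0.0, 0.1)
    sig.append(x)

err = predictive_coder(sig)
# After adaptation, compare signal and error power over the second half.
half = len(sig) // 2
sig_power = sum(s * s for s in sig[half:]) / half
err_power = sum(e * e for e in err[half:]) / half
print(sig_power, err_power)
```

On this sequence the transmitted error carries far less power, and hence a far smaller dynamic range, than the raw signal; a receiver running the same weight updates can invert the code exactly.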
Therefore, minimizing dynamic range using predictive coding reduces to optimizing prediction.\nExperimental support for viewing the front end of the visual system as a predictive encoder comes from the measurements of receptive \ufb01elds [6, 7]. In particular, predictive coding suggests that, for natural stimuli, the temporal receptive \ufb01elds should be biphasic and the spatial receptive \ufb01elds - center-surround. These predictions are borne out by experimental measurements in retinal ganglion cells [8], lateral geniculate nucleus (LGN) neurons [9] and \ufb02y second-order visual neurons called large monopolar cells (LMCs) [2]. In addition, the experimentally measured receptive \ufb01elds vary with signal-to-noise ratio as would be expected from optimal prediction theory [6]. Furthermore, experimentally observed whitening of the transmitted signal [10] is consistent with removing correlated components from the incoming signals [11].\nAs natural stimuli contain correlations on time scales greater than a hundred milliseconds, experimentally measured receptive \ufb01elds of LGN neurons are equally long [12]. Decorrelation over such long time scales requires equally long delays. How can such extended receptive \ufb01elds be produced by biological neurons and synapses whose time constants are typically less than a hundred milliseconds [13]?\nThe \ufb01eld of signal processing offers a solution to this problem in the form of a device called a lattice \ufb01lter, which decorrelates signals in stages, sequentially adding longer and longer delays [14, 15, 16, 17]. Motivated by the cascade structure of visual systems [18], we propose to model decorrelation in them by lattice \ufb01lters. Naturally, visual systems are more complex than lattice \ufb01lters and perform many other operations. 
However, we show that the lattice \ufb01lter model explains several existing observations in vertebrate and invertebrate visual systems and makes testable predictions. Therefore, we believe that lattice \ufb01lters provide a convenient abstraction for modeling temporal aspects of visual processing.\nThis paper is organized as follows. First, we brie\ufb02y summarize relevant results from linear prediction theory. Second, we explain the operation of the lattice \ufb01lter in discrete and continuous time. Third, we compare lattice \ufb01lter predictions with physiological measurements.\n\n1 Linear prediction theory\n\nDespite the non-linear nature of neurons and synapses, the operation of some neural circuits in vertebrates [19] and invertebrates [20] can be described by linear systems theory. The advantage of linear systems is that optimal circuit parameters may be obtained analytically and the results are often intuitively clear. Perhaps not surprisingly, the \ufb01eld of signal processing relies heavily on linear prediction theory, offering a convenient framework [15, 16, 17]. Below, we summarize the results from linear prediction that will be used to explain the operation of the lattice \ufb01lter.\nConsider a scalar sequence y = {y_t} where time t = 1, . . . , n. Suppose that y_t at each time point depends on side information provided by a vector z_t. Our goal is to generate a series of linear predictions \u02c6y_t from the vector z_t: \u02c6y_t = w \u00b7 z_t. 
We de\ufb01ne a prediction error as:\n\ne_t = y_t \u2212 \u02c6y_t = y_t \u2212 w \u00b7 z_t\n\n(1)\n\nand look for values of w that minimize the mean squared error:\n\n\u27e8e^2\u27e9 = (1/n) \u03a3_t e_t^2 = (1/n) \u03a3_t (y_t \u2212 w \u00b7 z_t)^2.\n\n(2)\n\nThe weight vector w is optimal for prediction of sequence y from sequence z if and only if the prediction error sequence e = y \u2212 w \u00b7 z is orthogonal to each component of vector z:\n\n\u27e8ez\u27e9 = 0.\n\n(3)\n\nWhen the whole series y is given in advance, i.e. in the of\ufb02ine setting, these so-called normal equations can be solved for w, for example, by Gaussian elimination [21]. However, in signal processing and neuroscience applications, another setting, called online, is more relevant: at every time step t, the prediction \u02c6y_t must be made using only the current values of z_t and w. Furthermore, after a prediction is made, w is updated based on the prediction \u02c6y_t and the observed y_t, z_t.\nIn the online setting, an algorithm called stochastic gradient descent is often used, where, at each time step, w is updated in the direction of the negative gradient of e_t^2:\n\nw \u2192 w \u2212 \u03b7 \u2207_w (y_t \u2212 w \u00b7 z_t)^2.\n\n(4)\n\nThis leads to the following weight update, known as least mean square (LMS) [15], for predicting sequence y from sequence z:\n\nw \u2192 w + \u03b7 e_t z_t,\n\n(5)\n\nwhere \u03b7 is the learning rate. The value of \u03b7 represents the relative in\ufb02uence of more recent observations compared to more distant ones. The larger the learning rate, the faster the system adapts to recent observations and the less past it remembers.\nIn this paper, we are interested in predicting the current value x_t of sequence x from its past values x_{t\u22121}, . . . , x_{t\u2212k}, restricted by the prediction order k > 0:\n\n\u02c6x_t = w_k \u00b7 (x_{t\u22121}, . . . 
, x_{t\u2212k})^T.\n\n(6)\n\nThis problem is a special case of the online linear prediction framework above, where y_t = x_t, z_t = (x_{t\u22121}, . . . , x_{t\u2212k})^T. Then the gradient update is given by:\n\nw_k \u2192 w_k + \u03b7 e_t (x_{t\u22121}, . . . , x_{t\u2212k})^T.\n\n(7)\n\nWhile the LMS algorithm can \ufb01nd the weights that optimize linear prediction (6), the \ufb01lter w_k has a long temporal extent, making it dif\ufb01cult to implement with neurons and synapses.\n\n2 Lattice \ufb01lters\nOne way to generate long receptive \ufb01elds in circuits of biological neurons is to use a cascade architecture, known as the lattice \ufb01lter, which calculates optimal linear predictions for temporal sequences and transmits prediction errors [14, 15, 16, 17]. In this section, we explain the operation of a discrete-time lattice \ufb01lter, then adapt it to continuous-time operation.\n\n2.1 Discrete-time implementation\nThe \ufb01rst stage of the lattice \ufb01lter, Figure 1, calculates the error of the \ufb01rst-order optimal prediction (i.e. using only the preceding element of the sequence), the second stage uses the output of the \ufb01rst stage and calculates the error of the second-order optimal prediction (i.e. using only two previous values), etc. To make such stage-wise error computations possible, the lattice \ufb01lter calculates at every stage not only the error of optimal prediction of x_t from past values x_{t\u22121}, . . . , x_{t\u2212k}, called the forward error,\n\nf^k_t = x_t \u2212 w_k \u00b7 (x_{t\u22121}, . . . , x_{t\u2212k})^T,\n\n(8)\n\nbut, perhaps non-intuitively, also the error of optimal prediction of a past value x_{t\u2212k} from the more recent values x_{t\u2212k+1}, . . . , x_t, called the backward error:\n\nb^k_t = x_{t\u2212k} \u2212 w\u2032_k \u00b7 (x_{t\u2212k+1}, . . . 
, x_t)^T,\n\n(9)\n\nwhere w_k and w\u2032_k are the weights of the optimal predictions.\nFor example, the \ufb01rst stage of the \ufb01lter calculates the forward error f^1_t = x_t \u2212 u_1 x_{t\u22121} of the optimal prediction of x_t from x_{t\u22121}, as well as the backward error b^1_t = x_{t\u22121} \u2212 v_1 x_t of the optimal prediction of x_{t\u22121} from x_t, Figure 1. Here, we assume that the coef\ufb01cients u_1 and v_1 that give optimal linear prediction are known and return to learning them below.\nEach following stage of the lattice \ufb01lter performs a stereotypic operation on its inputs, Figure 1. The k-th stage (k > 1) receives the forward, f^{k\u22121}_t, and backward, b^{k\u22121}_t, errors from the previous stage, delays the backward error by one time step and computes a forward error\n\nf^k_t = f^{k\u22121}_t \u2212 u_k b^{k\u22121}_{t\u22121},\n\n(10)\n\nthe error of the optimal linear prediction of f^{k\u22121}_t from b^{k\u22121}_{t\u22121}. In addition, each stage computes a backward error\n\nb^k_t = b^{k\u22121}_{t\u22121} \u2212 v_k f^{k\u22121}_t,\n\n(11)\n\nthe error of the optimal linear prediction of b^{k\u22121}_{t\u22121} from f^{k\u22121}_t.\nAs can be seen in Figure 1, the lattice \ufb01lter contains forward prediction error (top) and backward prediction error (bottom) branches, which interact at every stage via cross-links. Operation of the lattice \ufb01lter can be characterized by the linear \ufb01lters acting on the input, x, to compute forward or backward errors of consecutive order, so-called prediction-error \ufb01lters (blue bars in Figure 1). Because of the delays in the backward error branch, the temporal extent of the \ufb01lters grows from stage to stage.\nIn the next section, we will argue that prediction-error \ufb01lters correspond to the measurements of temporal receptive \ufb01elds in neurons. 
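The stage-wise recursion (10, 11) is compact enough to state directly in code. A minimal sketch (function and variable names are ours, and the cross-link weights are taken as given here; the text learns them with Hebbian updates):

```python
import random

def lattice_filter(x, u, v):
    """Run a discrete-time lattice filter over the sequence x.

    u[k] and v[k] are the cross-link weights of stage k+1, assumed known.
    Returns the forward and backward error sequences of the last stage.
    """
    f = list(x)                  # stage-0 "errors" are the input itself
    b = list(x)
    for uk, vk in zip(u, v):
        b_del = [0.0] + b[:-1]   # backward error delayed by one time step
        f_new = [ft - uk * bd for ft, bd in zip(f, b_del)]  # eq. (10)
        b_new = [bd - vk * ft for ft, bd in zip(f, b_del)]  # eq. (11)
        f, b = f_new, b_new
    return f, b

# For an AR(1) input x_t = a*x_{t-1} + noise, the optimal first-stage
# weights are u_1 = v_1 = a, and the forward error reduces to the
# unpredictable innovation, so its variance drops sharply.
random.seed(1)
a, x, sig = 0.9, 0.0, []
for _ in range(2000):
    x = a * x + random.gauss(0.0, 0.1)
    sig.append(x)

f1, b1 = lattice_filter(sig, [a], [a])
var = lambda s: sum(t * t for t in s) / len(s)
print(var(sig), var(f1))
```

Adding further stages with appropriate weights removes correlations at correspondingly longer lags, which is the stage-wise decorrelation described above.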
For detailed comparison with physiological measurements we will use the result that, for bi-phasic prediction-error \ufb01lters, such as the ones in Figure 1, the \ufb01rst bar of the forward prediction-error \ufb01lter has larger weight, by absolute value, than the combined weights of the remaining coef\ufb01cients of the corresponding \ufb01lter. Similarly, in backward prediction-error \ufb01lters, the last bar has greater weight than the rest of them combined. This fact arises from the observation that forward prediction-error \ufb01lters are minimum phase, while backward prediction-error \ufb01lters are maximum phase [16, 17].\n\nFigure 1: Discrete-time lattice \ufb01lter performs stage-wise computation of forward and backward prediction errors. In the \ufb01rst stage, the optimal prediction of x_t from x_{t\u22121} is computed by delaying the input by one time step and multiplying it by u_1. The upper summation unit subtracts the predicted x_t from the actual value and outputs the prediction error f^1_t. Similarly, the optimal prediction of x_{t\u22121} from x_t is computed by multiplying the input by v_1. The lower summation unit subtracts the optimal prediction from the actual value and outputs the backward error b^1_t. In each following stage k, the optimal prediction of f^{k\u22121}_t from b^{k\u22121}_t is computed by delaying b^{k\u22121}_t by one time step and multiplying it by u_k. The upper summation unit subtracts the prediction from the actual f^{k\u22121}_t and outputs the prediction error f^k_t. Similarly, the optimal prediction of b^{k\u22121}_{t\u22121} from f^{k\u22121}_t is computed by multiplying it by v_k. The lower summation unit subtracts the optimal prediction from the actual value and outputs the backward error b^k_t. Black connections have unitary weights and red connections have learnable negative weights. One can view forward and backward error calculations as applications of so-called prediction-error \ufb01lters (blue) to the input sequence. Note that the temporal extent of the \ufb01lters gets longer from stage to stage.\n\nNext, we derive a learning rule for \ufb01nding the optimal coef\ufb01cients u and v in the online setting. The u_k is used for predicting f^{k\u22121}_t from b^{k\u22121}_{t\u22121} to obtain the error f^k_t. By substituting y_t = f^{k\u22121}_t, z_t = b^{k\u22121}_{t\u22121} and e_t = f^k_t into (5), the update of u_k becomes\n\nu_k \u2192 u_k + \u03b7 f^k_t b^{k\u22121}_{t\u22121}.\n\n(12)\n\nSimilarly, v_k is updated by\n\nv_k \u2192 v_k + \u03b7 b^k_t f^{k\u22121}_t.\n\n(13)\n\nInterestingly, the updates of the weights are given by the product of the activities of the outgoing and incoming nodes of the corresponding cross-links. Such updates are known as Hebbian learning rules, thought to be used by biological neurons [22, 23].\nFinally, we give a simple proof that, in the of\ufb02ine setting, when the entire sequence x is known, f^k and b^k, given by equations (10, 11), are indeed the errors of optimal k-th order linear prediction. Let D be the one-step time delay operator, (Dx)_t = x_{t\u22121}. The induction statement at k is that f^k and b^k are the k-th order forward and backward errors of optimal linear prediction, which is equivalent to f^k and b^k being of the form f^k = x \u2212 w^k_1 Dx \u2212 . . . \u2212 w^k_k D^k x and b^k = D^k x \u2212 w\u2032^k_1 D^{k\u22121} x \u2212 . . . \u2212 w\u2032^k_k x and, from the normal equations (3), satisfying \u27e8f^k D^i x\u27e9 = 0 and \u27e8Db^k D^i x\u27e9 = \u27e8b^k D^{i\u22121} x\u27e9 = 0 for i = 1, . . . , k. That this is true for k = 1 directly follows from the de\ufb01nitions of f^1 and b^1. Now we assume that this is true for k \u2212 1 \u2265 1 and show it is true for k. It is easy to see from the forms of f^{k\u22121} and b^{k\u22121} and from f^k = f^{k\u22121} \u2212 u_k Db^{k\u22121} that f^k has the correct form f^k = x \u2212 w^k_1 Dx \u2212 . . . \u2212 w^k_k D^k x. Regarding orthogonality, for i = 1, . . . , k \u2212 1 we have \u27e8f^k D^i x\u27e9 = \u27e8(f^{k\u22121} \u2212 u_k Db^{k\u22121}) D^i x\u27e9 = \u27e8f^{k\u22121} D^i x\u27e9 \u2212 u_k \u27e8(Db^{k\u22121}) D^i x\u27e9 = 0, using the induction assumptions of orthogonality at k \u2212 1. For the remaining i = k, we note that f^k is the error of the optimal linear prediction of f^{k\u22121} from Db^{k\u22121} and therefore 0 = \u27e8f^k Db^{k\u22121}\u27e9 = \u27e8f^k (D^k x \u2212 w\u2032^{k\u22121}_1 D^{k\u22121} x \u2212 . . . \u2212 w\u2032^{k\u22121}_{k\u22121} Dx)\u27e9 = \u27e8f^k D^k x\u27e9 as desired. The b^k case can be proven similarly.\n\n2.2 Continuous-time implementation\nThe last hurdle remaining for modeling neuronal circuits, which operate in continuous time, with a lattice \ufb01lter is its discrete-time operation. To obtain a continuous-time implementation of the lattice \ufb01lter we cannot simply take the time step size to zero, as prediction-error \ufb01lters would become in\ufb01nitesimally short. Here, we adapt the discrete-time lattice \ufb01lter to continuous-time operation in two steps.\nFirst, we introduce a discrete-time Laguerre lattice \ufb01lter [24, 17] which uses Laguerre polynomials rather than the shift operator to generate its basis functions, Figure 2. The input signal passes through a leaky integrator whose leakage constant \u03b1 de\ufb01nes a time-scale distinct from the time step (14). A delay, D, at every stage is replaced by an all-pass \ufb01lter, L, (15) with the same constant \u03b1, which preserves the magnitude of every Fourier component of the input but shifts its phase in a frequency-dependent manner. 
Such an all-pass \ufb01lter reduces to a single time-step delay when \u03b1 = 0. The optimality of a general discrete-time Laguerre lattice \ufb01lter can be proven similarly to that of the discrete-time \ufb01lter, simply by replacing the operator D with L in the proof of section 2.1.\n\nFigure 2: Continuous-time lattice \ufb01lter using Laguerre polynomials. Compared to the discrete-time version, it contains a leaky integrator, L0, (16), and replaces delays with all-pass \ufb01lters, L, (17).\nSecond, we obtain a continuous-time formulation of the lattice \ufb01lter by replacing t \u2212 1 \u2192 t \u2212 \u03b4t, de\ufb01ning the inverse time scale \u03b3 = (1 \u2212 \u03b1)/\u03b4t and taking the limit \u03b4t \u2192 0 while keeping \u03b3 \ufb01xed. As a result, L0 and L are given by:\n\nDiscrete time:\nL0(x)_t = \u03b1 L0(x)_{t\u22121} + x_t\n(14)\nL(x)_t = \u03b1 (L(x)_{t\u22121} \u2212 x_t) + x_{t\u22121}\n(15)\n\nContinuous time:\ndL0(x)/dt = \u2212\u03b3 L0(x) + x\n(16)\nL(x) = x \u2212 2\u03b3 L0(x)\n(17)\n\nRepresentative impulse responses of the continuous Laguerre \ufb01lter are shown in Figure 2. Note that, similarly to the discrete-time case, the area under the \ufb01rst (peak) phase is greater than the area under the second (rebound) phase in the forward branch, and the opposite is true in the backward branch. Moreover, the temporal extent of the rebound is greater than that of the peak not just in the forward branch, like in the basic discrete-time implementation, but also in the backward branch. As will be seen in the next section, these predictions are con\ufb01rmed by physiological recordings.\n\n3 Experimental evidence for the lattice \ufb01lter in visual pathways\n\nIn this section we demonstrate that physiological measurements from visual pathways in vertebrates and invertebrates are consistent with the predictions of the lattice \ufb01lter model. 
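The model responses used for such comparisons can be generated by integrating (16, 17) numerically. A minimal forward-Euler sketch (our own code; the two-stage weights and the time scale 1/\u03b3 = 50 ms match those quoted for the non-lagged model cell of Figure 4, and the simple integration scheme is an assumption):

```python
def laguerre_lattice_step(n_steps=1000, dt=1e-3, gamma=20.0,
                          u=(0.4, 0.2), v=(0.4, 0.2)):
    """Forward-Euler simulation of a two-stage continuous-time lattice.

    The stimulus is a unit step. The front end is the leaky integrator L0
    of eq. (16); each stage applies the all-pass filter L of eq. (17) to
    its backward input, keeping one leaky-integrator state per stage.
    Returns the forward and backward error traces of the last stage.
    """
    s0 = 0.0                      # state of the front-end integrator L0
    s = [0.0] * len(u)            # one integrator state per all-pass stage
    F, B = [], []
    for _ in range(n_steps):
        x = 1.0                   # unit step stimulus
        s0 += dt * (-gamma * s0 + x)            # eq. (16)
        f = b = s0                # stage-0 forward and backward signals
        for k in range(len(u)):
            s[k] += dt * (-gamma * s[k] + b)
            lb = b - 2.0 * gamma * s[k]         # all-pass output, eq. (17)
            f, b = f - u[k] * lb, lb - v[k] * f
        F.append(f)
        B.append(b)
    return F, B

F, B = laguerre_lattice_step()
print(F[-1], B[-1])
```

Plotting F and B against time gives step responses whose transient and sustained phases can be compared with the lagged and non-lagged responses discussed below.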
For the purpose of modeling visual pathways, we identify summation units of the lattice \ufb01lter with neurons and propose that neural activity represents forward and backward errors. In the \ufb02y visual pathway, neuronal activity is represented by continuously varying graded potentials. In the vertebrate visual system, all neurons starting with ganglion cells are spiking, and we identify their \ufb01ring rate with the activity in the lattice \ufb01lter.\n\n3.1 Mammalian visual pathway\nIn mammals, visual processing is performed in stages. In the retina, photoreceptors synapse onto bipolar cells, which in turn synapse onto retinal ganglion cells (RGCs). RGCs send axons to the LGN, where they synapse onto LGN relay neurons projecting to the primary visual cortex, V1. In addition to this feedforward pathway, at each stage there are local circuits involving (usually inhibitory) inter-neurons such as horizontal and amacrine cells in the retina. Neurons of each class come in many types, which differ in their connectivity, morphology and physiological response. The bewildering complexity of these circuits has posed a major challenge to visual neuroscience.\n\nFigure 3: Electrophysiologically measured temporal receptive \ufb01elds get progressively longer along the cat visual pathway. Left: A cat LGN cell (red) has a longer receptive \ufb01eld than a corresponding RGC (blue) (adapted from [12], which also reports population data). Right (A,B): Extent of the temporal receptive \ufb01elds of simple cells in cat V1 is greater than that of corresponding LGN cells, as quanti\ufb01ed by the peak (A) and zero-crossing (B) times. Right (C): In the temporal receptive \ufb01elds of cat LGN and V1 cells the peak can be stronger or weaker than the rebound (adapted from [25]).\n\nHere, we point out several experimental observations related to temporal processing in the visual system consistent with the lattice \ufb01lter model. 
First, measurements of temporal receptive \ufb01elds demonstrate that they get progressively longer at each consecutive stage: i) LGN neurons have longer receptive \ufb01elds than corresponding pre-synaptic ganglion cells [12], Figure 3, left; ii) simple cells in V1 have longer receptive \ufb01elds than corresponding pre-synaptic LGN neurons [25], Figure 3, right (A,B). These observations are consistent with the progressively greater temporal extent of the prediction-error \ufb01lters (blue plots in Figure 2).\nSecond, the weight of the peak (integrated area under the curve) may be either greater or less than that of the rebound, both in LGN relay cells [26] and in simple cells of V1 [25], Figure 3, right (C). Neurons with peak weight exceeding that of the rebound are often referred to as non-lagged, while the others are known as lagged; both are found in cat [27, 28, 29] and monkey [30]. The reason for this becomes clear from the response to a step stimulus, Figure 4 (top).\nBy comparing experimentally measured receptive \ufb01elds with those of the continuous lattice \ufb01lter, Figure 4, we identify non-lagged neurons with the forward branch and lagged neurons with the backward branch. Another way to characterize the step-stimulus response is whether the sign of the transient is the same (non-lagged) or different (lagged) relative to the sustained response.\nThird, measurements of cross-correlation between RGC and LGN cell spikes in lagged and non-lagged neurons reveal a difference in the transfer function indicative of a difference in the underlying circuitry [31]. This is consistent with the backward branch circuit of the Laguerre lattice \ufb01lter, Figure 2, being different from that of the forward branch (which results in a different transfer function). In particular, a combination of different glutamate receptors, such as AMPA and NMDA, as well as GABA receptors, is thought to be responsible for the observed responses in lagged cells [32]. However, further investigation of the corresponding circuitry, perhaps using connectomics technology, is desirable.\nFourth, the cross-link weights of the lattice \ufb01lter can be learned using Hebbian rules (12, 13), which are biologically plausible [22, 23]. Interestingly, if these weights are learned sequentially, starting from the \ufb01rst stage, they do not need to be re-learned when additional stages are added or learned. This property maps naturally onto the fact that in the course of mammalian development the visual pathway matures in a stage-wise fashion - starting with the retina, then LGN, then V1 - implying that the more peripheral structures do not need to adapt to the maturation of the downstream ones.\n\n
[Figure panel adapted from Alonso et al., J. Neurosci. 21(11):4002\u20134015, 2001: distributions of peak time, zero-crossing time and rebound index for geniculate and simple cells.]\n\nFigure 4: Comparison of electrophysiologically measured responses of cat LGN cells with the continuous-time lattice \ufb01lter model. Top: Experimentally measured temporal receptive \ufb01elds and step-stimulus responses of LGN cells (adapted from [26]). Bottom: Typical examples of responses in the continuous-time lattice \ufb01lter model. Lattice \ufb01lter coef\ufb01cients were u_1 = v_1 = 0.4, u_2 = v_2 = 0.2 and 1/\u03b3 = 50 ms to model the non-lagged cell, and u_1 = v_1 = u_2 = v_2 = 0.2 and 1/\u03b3 = 60 ms to model the lagged cell. To model the photoreceptor contribution to the responses, an additional leaky integrator L0 was added to the circuit of Figure 2.\n\nWhile Hebbian rules are biologically plausible, one may get an impression from Figure 2 that they must apply to inhibitory cross-links. We point out that this circuit is meant to represent only the computation performed rather than the speci\ufb01c implementation in terms of neurons. As the same linear computation can be performed by circuits with a different arrangement of the same components, there are multiple implementations of the lattice \ufb01lter. For example, the activity of non-lagged OFF cells may be seen as representing minus the forward error. Then the cross-links between the non-lagged OFF pathway and the lagged ON pathway would be excitatory. 
In general, the classi\ufb01cation of cells into lagged and non-lagged seems independent of their ON/OFF and X/Y classi\ufb01cation [31, 28, 29], but see [33].\n\n3.2 Insect visual pathway\n\nIn insects, two cell types, L1 and L2, both post-synaptic to photoreceptors, play an important role in visual processing. Physiological responses of L1 and L2 indicate that they decorrelate visual signals by subtracting their predictable parts. In fact, receptive \ufb01elds of these neurons were used as the \ufb01rst examples of predictive coding in neuroscience [6]. Yet, as the numbers of synapses from photoreceptors to L1 and L2 are the same [34] and their physiological properties are similar, it has been a mystery why insects have not just one but a pair of such seemingly redundant neurons per facet. Previously, it was suggested that L1 and L2 provide inputs to two pathways that map onto the ON and OFF pathways of the vertebrate retina [35, 36].\nHere, we put forward a hypothesis that the role of L1 and L2 in visual processing is similar to that of the two branches of the lattice \ufb01lter. We do not incorporate the ON/OFF distinction in the effectively linear lattice \ufb01lter model but anticipate that such a combined description will materialize in the future.\nAs was argued in Section 2, in forward prediction-error \ufb01lters the peak has greater weight than the rebound, while in backward prediction-error \ufb01lters the opposite is true. Such a difference implies that, in response to a step-stimulus, the signs of the sustained responses compared to the initial transients are different between the branches. Indeed, Ca2+ imaging shows that responses of L1 and L2 to a step-stimulus are different, as predicted by the lattice \ufb01lter model [35], Figure 5b. Interestingly, the activity of L1 seems to represent minus the forward error and that of L2 plus the backward error, suggesting that the lattice \ufb01lter cross-links are excitatory. 
To summarize, the predictions of the lattice filter model seem to be consistent with the physiological measurements in the fly visual system and may help understand its operation.

Figure 5: Response of the lattice filter and fruit fly LMCs to a step stimulus. Left: Responses of the first-order discrete-time lattice filter to a step stimulus. Right: Responses of fly L1 and L2 cells to a moving step stimulus (adapted from [35]). Predicted and experimentally measured responses have qualitatively the same shape: a transient followed by a sustained response, which has the same sign for the forward error and L1 and the opposite sign for the backward error and L2.

4 Discussion

Motivated by the cascade structure of the visual pathway, we propose to model its operation with the lattice filter. We demonstrate that the predictions of the continuous-time lattice filter model are consistent with the course of neural development and with physiological measurements in the LGN and V1 of cat and monkey, as well as in fly LMC neurons. Therefore, lattice filters may offer a useful abstraction for understanding aspects of temporal processing in the visual systems of vertebrates and invertebrates.

Previously, [11] proposed that lagged and non-lagged cells could be a result of rectification by spiking neurons. Although we agree with [11] that the LGN performs temporal decorrelation, our explanation does not rely on non-linear processing but rather on the cascade architecture and, hence, is fundamentally different.
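The stage-wise Hebbian learning of cross-link weights invoked above can be illustrated with a small sketch (our construction, not the paper's algorithm: each weight is nudged by the product of the two signals meeting at its cross-link, which amounts to stochastic gradient descent on the squared prediction errors):

```python
import numpy as np

rng = np.random.default_rng(0)

# Temporally correlated "stimulus": a first-order autoregressive signal,
# standing in for natural input with a long correlation time (our assumption).
T, a = 50000, 0.8
s = np.zeros(T)
for t in range(1, T):
    s[t] = a * s[t - 1] + rng.standard_normal()

# One lattice stage learned online. Each cross-link weight is updated by the
# product of its presynaptic signal and the postsynaptic error (Hebbian).
u = v = 0.0
eta = 2e-4                        # learning rate (illustrative value)
for t in range(1, T):
    f1 = s[t] - u * s[t - 1]      # forward prediction error
    b1 = s[t - 1] - v * s[t]      # backward prediction error
    u += eta * f1 * s[t - 1]      # decorrelates f1 from the delayed input
    v += eta * b1 * s[t]          # decorrelates b1 from the current input

# Both weights should settle near the lag-1 correlation of the input (0.8),
# at which point the prediction errors are decorrelated from their predictors.
```

Because each stage's errors feed the next stage, later stages can be trained the same way after the earlier ones have converged, consistent with the stage-wise sequential learning described in the paper.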
Our model generates the following predictions that are not obvious in [11]: i) not only are LGN receptive fields longer than those of RGCs, but V1 receptive fields are also longer than those of the LGN; ii) even a linear model can generate a difference in the peak/rebound ratio; iii) the circuit from RGC to LGN should be different for lagged and non-lagged cells, consistent with [31]; iv) the lattice filter circuit can self-organize using Hebbian rules, which gives a mechanistic explanation of receptive fields beyond the normative framework of [11].

In light of the redundancy-reduction arguments given in the introduction, we note that, if the only goal of the system were to compress incoming signals using a given number of lattice filter stages, then after the compression is performed only one kind of prediction error, forward or backward, needs to be transmitted. Therefore, having two channels may seem redundant in the absence of noise. However, transmitting both forward and backward errors gives one the flexibility to continue decorrelation further by adding stages that perform relatively simple operations.

We are grateful to D.A. Butts, E. Callaway, M. Carandini, D.A. Clark, J.A. Hirsch, T. Hu, S.B. Laughlin, D.N. Mastronarde, R.C. Reid, H. Rouault, A. Saul, L. Scheffer, F.T. Sommer, and X. Wang for helpful discussions.

References

[1] F. Rieke, D. Warland, R.R. van Steveninck, and W. Bialek. Spikes: Exploring the Neural Code. MIT Press, 1999.

[2] S.B. Laughlin. Matching coding, circuits, cells, and molecules to signals: general principles of retinal design in the fly's eye. Progress in Retinal and Eye Research, 13(1):165–196, 1994.

[3] F. Attneave. Some informational aspects of visual perception. Psychological Review, 61(3):183, 1954.

[4] H. Barlow. Redundancy reduction revisited. Network: Computation in Neural Systems, 12(3):241–253, 2001.

[5] R.M. Gray. Linear Predictive Coding and the Internet Protocol.
Now Publishers, 2010.

[6] M.V. Srinivasan, S.B. Laughlin, and A. Dubs. Predictive coding: a fresh view of inhibition in the retina. Proceedings of the Royal Society of London. Series B. Biological Sciences, 216(1205):427–459, 1982.

[7] T. Hosoya, S.A. Baccus, and M. Meister. Dynamic predictive coding by the retina. Nature, 436:71, 2005.

[8] H.K. Hartline, H.G. Wagner, and E.F. MacNichol Jr. The peripheral origin of nervous activity in the visual system. In Studies on Excitation and Inhibition in the Retina: A Collection of Papers from the Laboratories of H. Keffer Hartline, page 99, 1974.

[9] N.A. Lesica, J. Jin, C. Weng, C.I. Yeh, D.A. Butts, G.B. Stanley, and J.M. Alonso. Adaptation to stimulus contrast and correlations during natural visual stimulation. Neuron, 55(3):479–491, 2007.

[10] Y. Dan, J.J. Atick, and R.C. Reid. Efficient coding of natural scenes in the lateral geniculate nucleus: experimental test of a computational theory. The Journal of Neuroscience, 16(10):3351–3362, 1996.

[11] D.W. Dong and J.J. Atick. Statistics of natural time-varying images. Network: Computation in Neural Systems, 6(3):345–358, 1995.

[12] X. Wang, J.A. Hirsch, and F.T. Sommer. Recoding of sensory information across the retinothalamic synapse. The Journal of Neuroscience, 30(41):13567–13577, 2010.

[13] C. Koch. Biophysics of Computation: Information Processing in Single Neurons. Oxford University Press, 2005.

[14] F. Itakura and S. Saito. On the optimum quantization of feature parameters in the PARCOR speech synthesizer. In Conference Record, 1972 International Conference on Speech Communication and Processing, Boston, MA, pages 434–437, 1972.

[15] B. Widrow and S.D. Stearns. Adaptive Signal Processing. Prentice-Hall, Englewood Cliffs, NJ, 1985.

[16] S. Haykin. Adaptive Filter Theory.
Prentice-Hall, Englewood Cliffs, NJ, 2003.

[17] A.H. Sayed. Fundamentals of Adaptive Filtering. Wiley-IEEE Press, 2003.

[18] D.J. Felleman and D.C. Van Essen. Distributed hierarchical processing in the primate cerebral cortex. Cerebral Cortex, 1(1):1–47, 1991.

[19] X. Wang, F.T. Sommer, and J.A. Hirsch. Inhibitory circuits for visual processing in thalamus. Current Opinion in Neurobiology, 2011.

[20] S.B. Laughlin, J. Howard, and B. Blakeslee. Synaptic limitations to contrast coding in the retina of the blowfly Calliphora. Proceedings of the Royal Society of London. Series B. Biological Sciences, 231(1265):437–467, 1987.

[21] D.C. Lay. Linear Algebra and Its Applications. Addison-Wesley/Longman, New York/London, 2000.

[22] D.O. Hebb. The Organization of Behavior: A Neuropsychological Theory. Lawrence Erlbaum, 2002.

[23] O. Paulsen and T.J. Sejnowski. Natural patterns of activity and long-term synaptic plasticity. Current Opinion in Neurobiology, 10(2):172–180, 2000.

[24] Z. Fejzo and H. Lev-Ari. Adaptive Laguerre-lattice filters. IEEE Transactions on Signal Processing, 45(12):3006–3016, 1997.

[25] J.M. Alonso, W.M. Usrey, and R.C. Reid. Rules of connectivity between geniculate cells and simple cells in cat primary visual cortex. The Journal of Neuroscience, 21(11):4002–4015, 2001.

[26] D. Cai, G.C. Deangelis, and R.D. Freeman. Spatiotemporal receptive field organization in the lateral geniculate nucleus of cats and kittens. Journal of Neurophysiology, 78(2):1045–1061, 1997.

[27] D.N. Mastronarde. Two classes of single-input X-cells in cat lateral geniculate nucleus. I. Receptive-field properties and classification of cells. Journal of Neurophysiology, 57(2):357–380, 1987.

[28] J. Wolfe and L.A. Palmer. Temporal diversity in the lateral geniculate nucleus of cat. Visual Neuroscience, 15(04):653–675, 1998.

[29] A.B. Saul and A.L. Humphrey.
Spatial and temporal response properties of lagged and nonlagged cells in cat lateral geniculate nucleus. Journal of Neurophysiology, 64(1):206–224, 1990.

[30] A.B. Saul. Lagged cells in alert monkey lateral geniculate nucleus. Visual Neuroscience, 25:647–659, 2008.

[31] D.N. Mastronarde. Two classes of single-input X-cells in cat lateral geniculate nucleus. II. Retinal inputs and the generation of receptive-field properties. Journal of Neurophysiology, 57(2):381–413, 1987.

[32] P. Heggelund and E. Hartveit. Neurotransmitter receptors mediating excitatory input to cells in the cat lateral geniculate nucleus. I. Lagged cells. Journal of Neurophysiology, 63(6):1347–1360, 1990.

[33] J. Jin, Y. Wang, R. Lashgari, H.A. Swadlow, and J.M. Alonso. Faster thalamocortical processing for dark than light visual targets. The Journal of Neuroscience, 31(48):17471–17479, 2011.

[34] M. Rivera-Alba, S.N. Vitaladevuni, Y. Mischenko, Z. Lu, S. Takemura, L. Scheffer, I.A. Meinertzhagen, D.B. Chklovskii, and G.G. de Polavieja. Wiring economy and volume exclusion determine neuronal placement in the Drosophila brain. Current Biology, 21(23):2000–5, 2011.

[35] D.A. Clark, L. Bursztyn, M.A. Horowitz, M.J. Schnitzer, and T.R. Clandinin. Defining the computational structure of the motion detector in Drosophila. Neuron, 70(6):1165–1177, 2011.

[36] M. Joesch, B. Schnell, S.V. Raghu, D.F. Reiff, and A. Borst. ON and OFF pathways in Drosophila motion vision. Nature, 468(7321):300–304, 2010.