{"title": "Reconstructing Stimulus Velocity from Neuronal Responses in Area MT", "book": "Advances in Neural Information Processing Systems", "page_first": 34, "page_last": 40, "abstract": null, "full_text": "Reconstructing Stimulus Velocity from Neuronal Responses in Area MT\n\nWyeth Bair, James R. Cavanaugh, J. Anthony Movshon\n\nHoward Hughes Medical Institute, and\nCenter for Neural Science\nNew York University\n4 Washington Place, Room 809\nNew York, NY 10003\n\nwyeth@cns.nyu.edu, jamesc@cns.nyu.edu, tony@cns.nyu.edu\n\nAbstract\n\nWe employed a white-noise velocity signal to study the dynamics of the response of single neurons in the cortical area MT to visual motion. Responses were quantified using reverse correlation, optimal linear reconstruction filters, and reconstruction signal-to-noise ratio (SNR). The SNR and lower bound estimates of information rate were lower than we expected. Ninety percent of the information was transmitted below 18 Hz, and the highest lower bound on bit rate was 12 bits/s. A simulated opponent motion energy sub-unit with Poisson spike statistics was able to out-perform the MT neurons. The temporal integration window, measured from the reverse correlation half-width, ranged from 30-90 ms. The window was narrower when a stimulus moved faster, but did not change when temporal frequency was held constant.\n\n1 INTRODUCTION\n\nArea MT neurons can show precise and rapid modulation in response to dynamic noise stimuli (Bair and Koch, 1996); however, computational models of these neurons and their inputs (Adelson and Bergen, 1985; Heeger, 1987; Grzywacz and Yuille, 1990; Emerson et al., 1992; Qian et al., 1994; Nowlan and Sejnowski, 1995) have primarily been compared to electrophysiological results based on time- and ensemble-averaged responses to deterministic stimuli, e.g., drifting sinusoidal gratings. 
\nUsing methods introduced by Bialek et al. (1991) and further analyzed by Gabbiani and Koch (1996) for the estimation of information transmission by a neuron about a white-noise stimulus, we set out to compare the responses of MT neurons to white-noise velocity signals with those of a model based on opponent motion energy sub-units.\n\nThe results of two analyses are summarized here. In the first, we compute a lower bound on information transmission using optimal linear reconstruction filters and examine the SNR as a function of temporal frequency. The second analysis examines changes in the reverse correlation (the cross-correlation between the stimulus and the resulting spike trains) as a function of spatial frequency and temporal frequency of the moving stimulus pattern.\n\n2 EXPERIMENTAL METHODS\n\nSpike trains were recorded extracellularly from 26 well-isolated single neurons in area MT of four anesthetized, paralyzed macaque monkeys using methods described in detail elsewhere (Levitt et al., 1994). The size of the receptive fields and the spatial and temporal frequency preferences of the neurons were assessed quantitatively using drifting sinusoidal gratings, after which a white-noise velocity signal, s(t), was used to modulate the position (within a fixed square aperture) of a low-pass filtered 1D Gaussian white-noise (GWN) pattern. The frame rate of the display was 54 Hz or 81 Hz. The spatial noise pattern consisted of 256 discrete intensity values, one per spatial unit. Every 19 ms (or 12 ms at 81 Hz), the pattern shifted, or jumped, Δ spatial units along the axis of the neuron's preferred direction, where Δ, the jump size, was chosen according to a Gaussian, binary, or uniform probability distribution. The maximum spatial frequency in the pattern was limited to prevent aliasing. 
\n\nIn the first type of experiment, 10 trials of a 30 s noise sequence, s(t), and 10 trials of its reverse, -s(t), were interleaved. A standard GWN spatial pattern and velocity modulation pattern were used for all cells, but for each cell, the stimulus was scaled for the receptive field size and aligned to the axis of preferred motion. Nine cells were tested with Gaussian noise at 81 Hz, 15 cells with binary noise at 81 Hz and 54 Hz, and 10 cells with uniform noise at 54 Hz.\n\nIn another experiment, a sinusoidal spatial pattern (rather than GWN) moved according to a binary white-noise velocity signal. Trials were interleaved with all combinations of four spatial frequencies at octave intervals and four relative jump sizes: 1/4, 1/8, 1/16, and 1/32 of each spatial period. Typically 10 trials of length 3 s were run. Four cells were tested at 54 Hz and seven at 81 Hz.\n\n3 ANALYSIS AND MODELING METHODS\n\nWe used the linear reconstruction methods introduced by Bialek et al. (1991) and further analyzed by Gabbiani and Koch (1996) to compute an optimal linear estimate of the stimulus, s(t), described above, based on the neuronal response, x(t). A single neuronal response was defined as the spike train produced by s(t) minus the spike train produced by -s(t). This overcomes the neuron's limited dynamic range in response to anti-preferred direction motion (Bialek et al., 1991). 
\n\nThe linear filter, h(t), which when convolved with the response yields the minimum mean square error estimate, s_est, of the stimulus can be described in terms of its Fourier transform,\n\nH(ω) = Rsx(-ω) / Rxx(ω),    (1)\n\nwhere Rsx(ω) is the Fourier transform of the cross-correlation rsx(τ) of the stimulus and the resulting spike train and Rxx(ω) is the power spectrum, i.e., the Fourier transform of the auto-correlation, of the spike train (for details and references, see Bialek et al., 1991; Gabbiani and Koch, 1996). The noise, n(t), is defined as the difference between the stimulus and the reconstruction,\n\nn(t) = s_est(t) - s(t),    (2)\n\nand the SNR is defined as\n\nSNR(ω) = Rss(ω) / Rnn(ω),    (3)\n\nwhere Rss(ω) is the Fourier power spectrum of the stimulus and Rnn(ω) is the power spectrum of the noise. If the stimulus amplitude distribution is Gaussian, then the logarithm of SNR(ω) can be integrated over frequency to give a lower bound on the rate of information transmission in bits/s (Gabbiani and Koch, 1996).\n\nThe motion energy model consisted of opponent energy sub-units (Adelson and Bergen, 1985) implemented with Gabor functions (Heeger, 1987; Grzywacz and Yuille, 1990) in two spatial dimensions and time. The spatial frequency of the Gabor function was set to match the spatial frequency of the stimulus, and the temporal frequency was set to match that induced by a sequence of jumps equal to the standard deviation (SD) of the amplitude distribution (which is the jump size in the case of a binary distribution). We approximated causality by shifting the output forward in time before computing the optimal linear filter. The model operated on the same stimulus patterns and noise sequences that were used to generate stimuli for the neurons. 
The time-varying response of the model was broken into two half-wave rectified signals which were interpreted as the firing probabilities of two units, a neuron and an anti-neuron that preferred the opposite direction of motion. From each unit, ten 30 s long spike trains were generated with inhomogeneous Poisson statistics. These 20 model spike trains were used to reconstruct the velocity signal in the same manner as the MT neuron output.\n\n4 RESULTS\n\nStimulus reconstruction. Optimal linear reconstruction filters, h(t), were computed for 26 MT neurons from responses to 30 s sequences of white-noise motion. A typical h(t), shown in Fig. 1A (large dots), was dominated by a single positive lobe, often preceded by a smaller negative lobe. It was narrower than the reverse correlation rsx(τ) (Fig. 1A, small dots) from which it was derived due to the division by the low-pass power spectrum of the spikes (see Eqn. 1). Also, rsx(τ) occasionally had a slower, trailing negative lobe but did not have the preceding negative lobe of h(t). On average, h(t) peaked at -69 ms (SD 17) and was 33 ms (SD 12) wide at half-height. The peak for rsx(τ) occurred at the same time, but the width was 41 ms (SD 15), ranging from 30-90 ms. The point of half-rise on the right side of the peak was -53 ms (SD 9) for h(t) and -51 ms (SD 9) for rsx(τ). For all plots, vertical axes for velocity show normalized stimulus velocity, i.e., stimulus velocity was scaled to have unity SD before all computations.\n\nFig. 1C (dots) shows the SNR for the reconstruction using the h(t) in panel A. 
For 8 cells tested with Gaussian velocity noise, the integral of the log of the SNR gives a lower bound for information transmission, which was 6.7 bits/s (SD 2.8), with a high value of 12.3 bits/s. Most of the information was carried below 10 Hz, and 90% of the information was carried below 18.4 Hz (SD 2.1). In Fig. 1D, the failure of the reconstruction (dots) to capture higher frequencies in the stimulus (thick line) is directly visible. Both h(t) and SNR(ω) were similar but on average slightly greater in amplitude for tests using binary and uniform distributed noise. Gaussian noise has many jumps at or near zero which may induce little or no response.\n\n[Figure 1 plots omitted. Axis labels recoverable from the extraction: Time relative to spike (ms) for panels A (Neuron) and B (Model), Frequency (Hz) for panel C, Time (ms) for panel D.]\n\nFigure 1: (A) Optimal linear filter h(t) (big dots) from Eqn. 1 and cross-correlation rsx(τ) (small dots) for one MT neuron. (B) h(t) (thick line) and rsx(τ) (thin line) for an opponent motion energy model. (C) SNR(ω) for the neuron (dots) and the model (line). (D) Reconstruction for the neuron (dots) and model (thin line) of the stimulus velocity (thick line). Velocity was normalized to unity SD. Curves for rsx(τ) were scaled by 0.5. Note the different vertical scale in B.\n\nAn opponent motion energy model using Gabor functions was simulated with spatial SD 0.5°, spatial frequency 0.625 cyc/°, temporal SD 12.5 ms, and temporal frequency 20 Hz. The model was tested with a Gaussian velocity stimulus with SD 32°/s. 
Because an arbitrary scaling of the spatial parameters in the model does not affect the temporal properties of the information transmission, this was effectively the same stimulus that yielded the neuronal data shown in Fig. 1A. Spike trains were generated from the model at 20 Hz (matched to the neuron) and used to compute h(t) (Fig. 1B, thick line). The model h(t) was narrower than that for the MT neuron, but was similar to h(t) for V1 neurons that have been tested (unpublished analysis). This simple model of a putative input sub-unit to MT transmitted 29 bits/s, more than the best MT neurons studied here. The SNR and the reconstruction for the model are shown in Fig. 1C,D (thin lines). The filter h(t) for the model (Fig. 1B, thick line) was more symmetric than that for the neuron due to the symmetry of the Gabor function used in the model.\n\n[Figure 2 plots omitted. Four panels, Neuron 1 (left) and Neuron 2 (right); horizontal axis: Time relative to spike (ms). Bottom panels are labeled 1/32 (left) and 1/16 (right).]\n\nFigure 2: The width of rsx(τ) changes with temporal frequency, but not spatial frequency. Data from two neurons are shown, one on the left, one on the right. Top: rsx(τ) is shown for binary velocity stimuli with jump sizes 1/4, 1/8, 1/16, and 1/32 (thick to thin lines) of the spatial period (10 trials, 3 s/trial). The left side of the peak shifts leftward as jump size decreases. See text for statistics. Bottom: The relative jump size, thus temporal frequency, was constant for the four cases in each panel (1/32 on the left, 1/16 on the right). The peaks do not shift left or right as spatial frequency and jump size change inversely. Thicker lines represent larger jumps and lower spatial frequencies. 
\n\nChanges in rsx(τ). We tested 11 neurons with a set of binary white-noise motion stimuli that varied in spatial frequency and jump size. The spatial patterns were sinusoidal gratings. The peaks in rsx(τ) and h(t) were wider when smaller jumps (slower velocities) were used to move the same spatial pattern. Fig. 2 shows data for two neurons plotted for constant spatial frequency (top) and constant effective temporal frequency, or contrast frequency (bottom). Jump sizes were 1/4, 1/8, 1/16, and 1/32 (thick to thin lines, top panels) of the period of the spatial pattern. (Note that a 1/2 period jump would make the direction of motion ambiguous.) Relative jump size was constant in the bottom panels, but both the spatial period and the velocity increased in octaves from thin to thick lines. One of the plots in the upper panel also appears in the lower panel for each neuron. For 26 conditions in 11 MT neurons (up to 4 spatial frequencies per neuron) the left and right half-rise points of the peak of rsx(τ) shifted leftward by 19 ms (SD 12) and 4.5 ms (SD 4.0), respectively, as jump size decreased. The width, therefore, increased by 14 ms (SD 12). These changes were statistically significant (p < 0.001, t-test). In fact, the left half-rise point moved leftward in all 26 cases, and in no case did the width at half-height decrease. On the other hand, there was no significant change in the peak width or half-rise times when temporal frequency was constant, as demonstrated in the lower panels of Fig. 2.\n\n5 DISCUSSION\n\nFrom other experiments using frame-based displays to present moving stimuli to MT cells, we know that roughly half of the cells can modulate to a 60 Hz signal in the preferred direction and that some provide reliable bursts of spikes on each frame but do not respond to null direction motion. 
Therefore, one might expect that these cells could easily be made to transmit nearly 60 bits/s by moving the stimulus randomly in either the preferred or null direction on each video frame. However, the stimuli that we employed here did not result in such high frequency modulation, nor did our best lower bound estimate of information transmission for an MT cell, 12 bits/s, approach the 64 bits/s capacity of the motion sensitive H1 neuron in the fly (Bialek et al., 1991). In recordings from seven V1 neurons (not shown here), two directional complex cells responded to the velocity noise with high temporal precision and fired a burst of spikes on almost every preferred motion frame and no spikes on null motion frames. At 53 frames/s, these cells transmitted over 40 bits/s. We hope that further investigation will reveal whether the lack of high frequency modulation in our MT experiments was due to statistical variation between animals, the structure of the stimulus, or possibly to anesthesia.\n\nIn spite of finding less high frequency bandwidth than expected, we were able to document consistent changes, namely narrowing, of the temporal integration window of MT neurons as temporal frequency increased. Similar changes in the time constant of motion processing have been reported in the fly visual system, where it appears that neither velocity nor temporal frequency alone can account for all changes (de Ruyter et al., 1986; Borst & Egelhaaf, 1987). The narrowing of rsx(τ) with higher temporal frequency does not occur in our simple motion energy model, which lacks adaptive mechanisms, but it could occur in a model which integrated signals from many motion energy units having distributed temporal frequency tuning, even without other sources of adaptation. 
\n\nWe were not able to assess whether changes in the integration window developed quickly at the beginning of individual trials, but an analysis not described here at least indicates that there was very little change in the position and width of rsx(τ) and h(t) after the first few seconds during the 30 s trials.\n\nAcknowledgements\n\nThis work was funded by the Howard Hughes Medical Institute. We thank Fabrizio Gabbiani and Christof Koch for helpful discussion, Lawrence P. O'Keefe for assistance with electrophysiology, and David Tanzer for assistance with software.\n\nReferences\n\nAdelson EH, Bergen JR (1985) Spatiotemporal energy models for the perception of motion. J Opt Soc Am A 2:284-299.\n\nBair W, Koch C (1996) Temporal precision of spike trains in extrastriate cortex of the behaving macaque monkey. Neural Comp 8:1185-1202.\n\nBialek W, Rieke F, de Ruyter van Steveninck RR, Warland D (1991) Reading a neural code. Science 252:1854-1857.\n\nBorst A, Egelhaaf M (1987) Temporal modulation of luminance adapts time constant of fly movement detectors. Biol Cybern 56:209-215.\n\nEmerson RC, Bergen JR, Adelson EH (1992) Directionally selective complex cells and the computation of motion energy in cat visual cortex. Vision Res 32:203-218.\n\nGabbiani F, Koch C (1996) Coding of time-varying signals in spike trains of integrate-and-fire neurons with random threshold. Neural Comp 8:44-66.\n\nGrzywacz NM, Yuille AL (1990) A model for the estimate of local image velocity by cells in the visual cortex. Proc R Soc Lond B 239:129-161.\n\nHeeger DJ (1987) Model for the extraction of image flow. J Opt Soc Am A 4:1455-1471.\n\nLevitt JB, Kiper DC, Movshon JA (1994) Receptive fields and functional architecture of macaque V2. J Neurophysiol 71:2517-2542.\n\nNowlan SJ, Sejnowski TJ (1994) Filter selection model for motion segmentation and velocity integration. 
J Opt Soc Am A 11:3177-3200.\n\nQian N, Andersen RA, Adelson EH (1994) Transparent motion perception as detection of unbalanced motion signals. III. Modeling. J Neurosci 14:7381-7392.\n\nde Ruyter van Steveninck R, Zaagman WH, Mastebroek HAK (1986) Adaptation of transient responses of a movement-sensitive neuron in the visual system of the blowfly Calliphora erythrocephala. Biol Cybern 54:223-236.", "award": [], "sourceid": 1301, "authors": [{"given_name": "Wyeth", "family_name": "Bair", "institution": null}, {"given_name": "James", "family_name": "Cavanaugh", "institution": null}, {"given_name": "J.", "family_name": "Movshon", "institution": null}]}