{"title": "Time-rescaling methods for the estimation and assessment of non-Poisson neural encoding models", "book": "Advances in Neural Information Processing Systems", "page_first": 1473, "page_last": 1481, "abstract": "Recent work on the statistical modeling of neural responses has focused on modulated renewal processes in which the spike rate is a function of the stimulus and recent spiking history. Typically, these models incorporate spike-history dependencies via either: (A) a conditionally-Poisson process with rate dependent on a linear projection of the spike train history (e.g., generalized linear model); or (B) a modulated non-Poisson renewal process (e.g., inhomogeneous gamma process). Here we show that the two approaches can be combined, resulting in a {\\it conditional renewal} (CR) model for neural spike trains. This model captures both real and rescaled-time effects, and can be fit by maximum likelihood using a simple application of the time-rescaling theorem [1]. We show that for any modulated renewal process model, the log-likelihood is concave in the linear filter parameters only under certain restrictive conditions on the renewal density (ruling out many popular choices, e.g. gamma with $\\kappa \\neq1$), suggesting that real-time history effects are easier to estimate than non-Poisson renewal properties. Moreover, we show that goodness-of-fit tests based on the time-rescaling theorem [1] quantify relative-time effects, but do not reliably assess accuracy in spike prediction or stimulus-response modeling. We illustrate the CR model with applications to both real and simulated neural data.", "full_text": "Time-rescaling methods for the estimation and\n\nassessment of non-Poisson neural encoding models\n\nJonathan W. 
Pillow

Departments of Psychology and Neurobiology
University of Texas at Austin
pillow@mail.utexas.edu

Abstract

Recent work on the statistical modeling of neural responses has focused on modulated renewal processes in which the spike rate is a function of the stimulus and recent spiking history. Typically, these models incorporate spike-history dependencies via either: (A) a conditionally-Poisson process with rate dependent on a linear projection of the spike train history (e.g., generalized linear model); or (B) a modulated non-Poisson renewal process (e.g., inhomogeneous gamma process). Here we show that the two approaches can be combined, resulting in a conditional renewal (CR) model for neural spike trains. This model captures both real-time and rescaled-time history effects, and can be fit by maximum likelihood using a simple application of the time-rescaling theorem [1]. We show that for any modulated renewal process model, the log-likelihood is concave in the linear filter parameters only under certain restrictive conditions on the renewal density (ruling out many popular choices, e.g. gamma with shape κ ≠ 1), suggesting that real-time history effects are easier to estimate than non-Poisson renewal properties. Moreover, we show that goodness-of-fit tests based on the time-rescaling theorem [1] quantify relative-time effects, but do not reliably assess accuracy in spike prediction or stimulus-response modeling. We illustrate the CR model with applications to both real and simulated neural data.

1 Introduction

A central problem in computational neuroscience is to develop functional models that can accurately describe the relationship between external variables and neural spike trains.
All attempts to measure information transmission in the nervous system are fundamentally attempts to quantify this relationship, which can be expressed by the conditional probability P({t_i}|X), where {t_i} is a set of spike times generated in response to an external stimulus X.

Recent work on the neural coding problem has focused on extensions of the Linear-Nonlinear-Poisson (LNP) "cascade" encoding model, which describes the neural encoding process using a linear receptive field, a point nonlinearity, and an inhomogeneous Poisson spiking process [2, 3]. While this model provides a simple, tractable tool for characterizing neural responses, one obvious shortcoming is the assumption of Poisson spiking. Neural spike trains exhibit spike-history dependencies (e.g., refractoriness, bursting, adaptation), violating the Poisson assumption that spikes in disjoint time intervals are independent. Such dependencies, moreover, have been shown to be essential for extracting complete stimulus information from spike trains in a variety of brain areas [4, 5, 6, 7, 8, 9, 10, 11].

Previous work has considered two basic approaches for incorporating spike-history dependencies into neural encoding models. One approach is to model spiking as a non-Poisson inhomogeneous renewal process (e.g., a modulated gamma process [12, 13, 14, 15]). Under this approach, spike times are Markovian, depending on the most recent spike time via a (non-exponential) renewal density, which may be rescaled in proportion to the instantaneous spike rate. A second approach is to use a conditionally Poisson process in which the intensity (or spike rate) is a function of the recent spiking history [4, 16, 17, 18, 19, 20]. The output of such a model is a conditionally Poisson process, but not Poisson, since the spike rate itself depends on the spike history.

Figure 1: The conditional renewal (CR) model and time-rescaling transform. (A) Stimuli are convolved with a filter k then passed through a nonlinearity f, whose output is the rate λ(t) for an inhomogeneous spiking process with renewal density q. The post-spike filter h provides recurrent additive input to f for every spike emitted. (B) Illustration of the time-rescaling transform and its inverse. Top: the intensity λ(t) (here independent of spike history) in response to a one-second stimulus. Bottom left: interspike intervals (left, intervals between red dots) are drawn i.i.d. in rescaled time from renewal density q, here set to gamma with shape κ = 20. Samples are mapped to spikes in real time (bottom) via \Lambda^{-1}(t), the inverse of the cumulative intensity. Alternatively, \Lambda(t) maps the true spike times (bottom) to samples from a homogeneous renewal process in rescaled time (left edge).

The time-rescaling theorem, described elegantly for applications to neuroscience in [1], provides a powerful tool for connecting these two basic approaches, which is the primary focus of this paper. We begin by reviewing inhomogeneous renewal models and generalized linear point-process models for neural spike trains.

2 Point process neural encoding models

2.1 Definitions and Terminology

Let {t_i} be a sequence of spike times on the interval (0, T], with 0 < t_0 < t_1 < ... < t_n ≤ T, and let λ(t) denote the intensity (or "spike rate") for the point process, where λ(t) ≥ 0, ∀t. Generally, this intensity is a function of some external variable (e.g., a visual stimulus). The cumulative intensity function is given by the integrated intensity,

    \Lambda(t) = \int_0^t \lambda(s)\,ds,    (1)

and is also known as the time-rescaling transform [1]. This function rescales the original spike times into spikes from a (homogeneous) renewal process, that is, a process in which the intervals are i.i.d.
samples from a fixed distribution. Let {u_i} denote the inter-spike intervals (ISIs) of the rescaled process, which are given by the integral of the intensity between successive spikes, i.e.,

    u_i = \Lambda_{t_{i-1}}(t_i) = \int_{t_{i-1}}^{t_i} \lambda(s)\,ds.    (2)

Intuitively, this transformation stretches time in proportion to the spike rate λ(t), so that when the rate λ(t) is high, ISIs are lengthened, and when λ(t) is low, ISIs are compressed. (See fig. 1B for illustration.)

Let q(u) denote the renewal density, the probability density function from which the rescaled-time intervals {u_i} are drawn. A Poisson process arises if q is exponential, q(u) = e^{-u}; for any other density, the probability of spiking depends on the most recent spike time. For example, if q(u) is zero for u ∈ [0, a], the neuron exhibits a refractory period (whose duration varies with λ(t)).

To sample from this model (illustrated in fig. 1B), we can draw independent intervals u_i from the renewal density q(u), then apply the inverse time-rescaling transform to obtain ISIs in real time:

    (t_i - t_{i-1}) = \Lambda^{-1}_{t_{i-1}}(u_i),    (3)

where \Lambda^{-1}_{t_{i-1}}(t) is the inverse of the time-rescaling transform (eq. 2).¹

We will generally define the intensity function (which we will refer to as the base intensity²) in terms of a linear-nonlinear cascade, with linear dependence on some external covariates of the response (optionally including spike-history), followed by a point nonlinearity.
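The sampling scheme of eq. (3) can be sketched in a few lines. The following is an illustrative sketch only, not the authors' code: the names `sample_cr_spikes` and `rate`, the discrete-time grid, and the choice of a unit-mean gamma renewal density for q are all our own assumptions.

```python
import math
import random

def sample_cr_spikes(rate, T, kappa, dt=1e-3, seed=0):
    """Sample spike times on (0, T] from a modulated renewal process:
    draw unit-mean gamma ISIs in rescaled time, then map each back to
    real time by accumulating the intensity until it crosses the ISI
    (a discrete-time inversion of the cumulative intensity Lambda)."""
    rng = random.Random(seed)
    spikes = []
    t = 0.0
    acc = 0.0                                 # integral of rate since last spike
    u = rng.gammavariate(kappa, 1.0 / kappa)  # rescaled-time ISI drawn from q
    while t < T:
        acc += rate(t) * dt
        if acc >= u:                          # rescaled interval completed: spike
            spikes.append(t)
            acc = 0.0
            u = rng.gammavariate(kappa, 1.0 / kappa)
        t += dt
    return spikes
```

With kappa = 1 this reduces to an inhomogeneous Poisson process; larger kappa yields more regular spiking, as in fig. 1B.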
The intensity in this case can be written:

    \lambda(t) = f(x_t \cdot k + y_t \cdot h),    (4)

where x_t is a vector representing the stimulus at time t, k is a stimulus filter, y_t is a vector representing the spike history at t, and h is a spike-history filter. We assume that the nonlinearity f is fixed.

2.2 The conditional renewal model

We refer to the most general version of this model, in which λ(t) is allowed to depend on both the stimulus and spike train history, and q(u) is an arbitrary (finite-mean) density on R+, as a conditional renewal (CR) model (see fig. 1A). The output of this model forms an inhomogeneous renewal process conditioned on the process history. Although it is mathematically straightforward to define such a model, to our knowledge, no previous work has sought to incorporate both real-time (via h) and rescaled-time (via q) dependencies in a single model.

Specific (restricted) cases of the CR model include the generalized linear model (GLM) [17], and the modulated renewal model with λ = f(x · k) and q a right-skewed, non-exponential renewal density [13, 15]. (Popular choices for q include gamma, inverse Gaussian, and log-normal distributions.)

The conditional probability distribution over spike times {t_i} given the external variables X can be derived using the time-rescaling transformation. In rescaled time, the CR model specifies a probability over the ISIs,

    P(\{u_i\}|X) = \prod_{i=1}^n q(u_i).    (5)

A change-of-variables t_i = \Lambda^{-1}_{t_{i-1}}(u_i) + t_{i-1} (eq.
3) provides the conditional probability over spike times:

    P(\{t_i\}|X) = \prod_{i=1}^n \lambda(t_i)\, q(\Lambda_{t_{i-1}}(t_i)).    (6)

This probability, considered as a function of the parameters defining λ(t) and q(u), is the likelihood function for the CR model, as derived in [13].³ The log-likelihood function can be approximated in discrete time, with bin-size dt taken small enough to ensure ≤ 1 spike per bin:

    \log P(\{t_i\}|X) = \sum_{i=1}^n \log \lambda(t_i) + \sum_{i=1}^n \log q\Big(\sum_{j=t_{i-1}+1}^{t_i} \lambda(j)\,dt\Big),    (7)

where t_i indicates the bin for the ith spike. This approximation becomes exact in the limit as dt → 0.

¹Note that \Lambda_{t^*}(t) is invertible for all spike times t_i, since necessarily t_i ∈ {t; λ(t) > 0}.
²A note on terminology: we follow [13] in defining λ(t) to be the instantaneous rate for an inhomogeneous renewal process, which is not identical to the hazard function H(t) = P(t_i ∈ [t, t + Δ] | t_i > t_{i-1})/Δ, also known as the conditional intensity [1]. We will use "base intensity" for λ(t) to avoid this confusion.
³For simplicity, we have ignored the intervals (0, t_0], the time to the first spike, and (t_n, T], the time after the last spike, which are simple to compute but contribute only a small fraction to the total likelihood.

Figure 2: Time-rescaling and likelihood-based goodness-of-fit tests with simulated data. Left: Stimulus filter and renewal density for three point process models (all with nonlinearity f(x) = e^x and history-independent intensity). "True" spikes were generated from (a), a conditional renewal model with a gamma renewal density (κ = 10).
These responses were fit by: (b), a Poisson model with the correct stimulus filter; and (c), a modulated renewal process with incorrect stimulus filter (set to the negative of the correct filter), and renewal density estimated nonparametrically from the transformed intervals (eq. 10). Middle: Repeated responses from all three models to a novel 1-s stimulus, showing that spike rate is well predicted by (b) but not by (c). Right: KS plots (above) show time-rescaling based goodness-of-fit. Here, (b) fails badly, while (c) passes easily, with cdf entirely within the 99% confidence region (gray lines). Likelihood-based cross-validation tests (below) show that (b) preserves roughly 1/3 as much information about spike times as (a), while (c) carries slightly less information than a homogeneous Poisson process with the correct spike rate.

3 Convexity condition for inhomogeneous renewal models

We now turn to the tractability of estimating the CR model parameters from data. Here, we present an extension to the results of [21], which proved a convexity condition for maximum-likelihood estimation of a conditionally Poisson encoding model (i.e., generalized linear model). Specifically, [21] showed that the log-likelihood for the filter parameters θ = {k, h} is concave (i.e., has no non-global local maxima) if the nonlinear function f is both convex and log-concave (meaning log f is concave). Under these conditions⁴, minimizing the negative log-likelihood is a convex optimization problem.

By extension, we can ask whether the estimation problem remains convex when we relax the Poisson assumption and allow for a non-exponential renewal density q.
Let us write the log-likelihood function for the linear filter parameters θ = [k^T, h^T]^T as

    L_{\{D,q\}}(\theta) = \sum_{i=1}^n \log f(X(t_i) \cdot \theta) + \sum_{i=1}^n \log q\Big(\int_{t_{i-1}}^{t_i} f(X(t) \cdot \theta)\,dt\Big),    (8)

where X(t) = [x_t^T, y_t^T]^T is a vector containing the relevant stimulus and spike history at time t, and D = {{t_i}, {X(t)}} represents the full set of observed data. The condition we obtain is:

Theorem 1. The CR model log-likelihood L_{\{D,q\}}(\theta) is concave in the filter parameters θ, for any observed data D, if: (1) the nonlinearity f is convex and log-concave; and (2) the renewal density q is log-concave and non-increasing on (0, ∞].

Proof. It suffices to show that both terms in equation (8) are concave in θ, since the sum of two concave functions is concave. The first term is obviously concave, since log f is concave. For the second term, note that \int f(X \cdot \theta) is a convex function of θ, since it is the integral of a convex function over a convex region. Then \log q[\int f(X \cdot \theta)] is a concave, non-increasing function of a convex function, since log q is concave and non-increasing; such a function is necessarily concave.⁵ The second term is therefore also a sum of concave functions, and thus concave.

⁴Allowed nonlinearities must grow monotonically, at least linearly and at most exponentially: e.g., exp(x); log(1 + exp(x)); ⌊x⌋^p, p ≥ 1.

Maximum likelihood filter estimation under the CR model is therefore a convex problem so long as the renewal density q is both log-concave and non-increasing.
This restriction rules out a variety of renewal densities that are commonly employed to model neural data [13, 14, 15]. Specifically, the log-normal and inverse-Gaussian densities both have increasing regimes on a subset of [0, ∞), as does the gamma density q(u) ∝ u^{κ-1} e^{-uκ} when κ > 1. For κ < 1, gamma fails to be log-concave, meaning that the only gamma density satisfying both conditions is the exponential (κ = 1).

There are nevertheless many densities (besides the exponential) for which these conditions are met, including

• q(u) ∝ e^{-u^p/σ²}, for any p ≥ 1
• q(u) = uniform density
• q(u) ∝ ⌊f(u)⌋, or q(u) ∝ e^{f(u)}, for any concave, decreasing function f(u)

Unfortunately, no density in this family can exhibit refractory effects, since this would require a q that is initially zero and then rises. From an estimation standpoint, this suggests that it is easier to incorporate certain well-known spike-history dependencies using recurrent spike-history filters (i.e., using the GLM framework) than via a non-Poisson renewal density.

An important corollary of this convexity result is that the decoding problem of estimating stimuli {x_t} from a set of observed spike times {t_i} using the maximum of the posterior (i.e., computing the MAP estimate) is also a convex problem under the same restrictions on f and q, so long as the prior over stimuli is log-concave.

4 Nonparametric Estimation of the CR model

In practice, we may wish to optimize both the filter parameters governing the base intensity λ(t) and the renewal density q, which is not in general a convex problem.
We may proceed, however, bearing in mind that gradient ascent may not achieve the global maximum of the likelihood function.

Here we formulate a slightly different interval-rescaling function that allows us to non-parametrically estimate renewal properties using a density on the unit interval. Let us define the mapping

    v_i = 1 - \exp(-\Lambda_{t_{i-1}}(t_i)),    (9)

which is the cumulative density function (cdf) for the intervals from a conditionally Poisson process with cumulative intensity Λ(t). This function maps spikes from a conditionally Poisson process to i.i.d. samples from U[0, 1]. Any discrepancy between the distribution of {v_i} and the uniform distribution represents failures of a Poisson model to correctly describe the renewal statistics. (This is the central idea underlying the time-rescaling based goodness-of-fit test, which we will discuss shortly.)

We propose to estimate a density φ(v) for the rescaled intervals {v_i} using cubic splines (piecewise 3rd-order polynomials with continuous 2nd derivatives), with evenly spaced knots on the interval [0, 1].⁶ This allows us to rewrite the likelihood function (6) as the product of two identifiable terms:

    P(\{t_i\}|X) = \Big(\prod_{i=1}^n \lambda(t_i)\, e^{-\Lambda_0(T)}\Big)\Big(\prod_{i=1}^n \phi(v_i)\Big),    (10)

where the first term is the likelihood under the conditional Poisson model [17], and the second is the probability of the rescaled intervals {v_i} under the density φ(v).
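The rescaling of eq. (9) is straightforward to compute from a discretized intensity. The following is a minimal sketch of our own (not code from the paper): the names `rescale_intervals`, `lam`, and `spike_bins` are assumptions, the intensity is taken piecewise constant on bins of width dt, and, mirroring footnote 3, the first interval is simply measured from the start of the record.

```python
import math

def rescale_intervals(lam, spike_bins, dt):
    """Map each ISI to v_i = 1 - exp(-Lambda_{t_{i-1}}(t_i))  (eq. 9).
    `lam` is the intensity per time bin; `spike_bins` holds the sorted
    bin indices of the spikes. Under a correct conditionally Poisson
    model the returned v_i are i.i.d. uniform on [0, 1]."""
    vs = []
    prev = 0
    for ti in spike_bins:
        # integrated intensity between the previous spike and this one
        Lam = sum(lam[j] * dt for j in range(prev, ti + 1))
        vs.append(1.0 - math.exp(-Lam))
        prev = ti + 1
    return vs
```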
This formulation allows us to separate the (real-time) contributions of the intensity function, under the assumption of conditionally Poisson spiking, from the (rescaled-time) contributions of a non-Poisson renewal density. (For a conditionally Poisson process, φ is the uniform density on [0, 1], and makes zero contribution to the total log-likelihood.)

⁵To see this, note that if g is concave (g'' ≤ 0) and non-increasing (g' ≤ 0), and f is convex (f'' ≥ 0), then d²/dx² g(f(x)) = g''(f(x)) f'(x)² + g'(f(x)) f''(x) ≤ 0, implying g(f(x)) is concave.
⁶ML estimation of the spline parameters is a convex problem with one linear equality constraint \int_0^1 \phi(v)\,dv = 1 and a family of inequality constraints φ(v) ≥ 0, ∀v, which can be optimized efficiently.

Figure 3: Left: pairwise dependencies between successive rescaled ISIs from model ("a", see fig. 2) when fit by a non-Poisson renewal model "c". Center: fitted model of the conditional distribution over rescaled ISIs given the previous ISI, discretized into 7 intervals for the previous ISI. Right: rescaling the intervals using the cdf Φ, obtained from the conditional π(z_{i+1}|z_i), produces successive ISIs which are much more independent. This transformation adds roughly 3 bits/s to the likelihood-based cross-validation performance of model (c).

We fit this model to simulated data (fig. 2), and to real neural data, using alternating coordinate ascent of the filter parameters and the renewal density parameters (fig. 4). In fig.
2, we plot the renewal distribution q̂(u) (red trace), which can be obtained from the estimated φ̂(v) via the transformation q̂(u) = φ̂(1 - e^{-u}) e^{-u}.

4.1 Incorporating dependencies between intervals

The cdf defined by the CR model, \Phi(v) = \int_0^v \phi(s)\,ds, maps the transformed ISIs {v_i} so that the marginal distribution over z_i = Φ(v_i) is uniform on [0, 1]. However, there is no guarantee that the resulting random variables are independent, as assumed in the likelihood (eq. 10). We can examine dependencies between successive ISIs by making a scatter plot of pairs (z_i, z_{i+1}) (see fig. 3). Departures from independence can then be modeled by introducing a nonparametric estimator for the conditional distribution π(z_i|z_{i-1}). In this case, the likelihood becomes

    P(\{t_i\}|X) = \Big(\prod_{i=1}^n \lambda(t_i)\, e^{-\Lambda_0(T)}\Big)\Big(\prod_{i=1}^n \phi(v_i)\Big)\Big(\prod_{i=2}^n \pi(z_i|z_{i-1})\Big),    (11)

which now has three terms, corresponding (respectively) to the effects of the base intensity, non-Poisson renewal properties, and dependencies between successive intervals.

5 The time-rescaling goodness-of-fit test

If a particular point-process model provides an accurate description of a neuron's response, then the cumulative intensity function defines a mapping from real time to rescaled time such that the rescaled interspike intervals have a common distribution. Time-rescaling can therefore be used as a tool for assessing the goodness-of-fit of a point process model [1, 22].
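Such a test can be sketched in a few lines (our own illustration, not code from the paper): under a correct conditionally Poisson model the rescaled intervals v_i of eq. (9) should be i.i.d. uniform on [0, 1], so we compare their empirical cdf to the uniform cdf via the Kolmogorov-Smirnov distance; the 1.36/√n threshold is the standard large-sample 95% KS critical value, and the function names are our own.

```python
import math

def ks_uniform(vs):
    """Kolmogorov-Smirnov distance between the empirical cdf of vs
    and the uniform cdf on [0, 1]."""
    vs = sorted(vs)
    n = len(vs)
    # empirical cdf jumps from i/n to (i+1)/n at each sorted point
    return max(max(abs(v - i / n), abs(v - (i + 1) / n))
               for i, v in enumerate(vs))

def passes_ks_test(vs, coeff=1.36):
    """Crude large-sample KS test at roughly the 95% level."""
    return ks_uniform(vs) <= coeff / math.sqrt(len(vs))
```

A clumped sample fails this check while a near-uniform one passes, which is exactly the distinction a KS plot visualizes.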
Specifically, after remapping a set of observed spike times according to the (model-defined) cumulative intensity, one can perform a distributional test (e.g., Kolmogorov-Smirnov, or KS test) to assess whether the rescaled intervals have the expected distribution.⁷ For example, for a conditionally Poisson model, the KS test can be applied to the rescaled intervals {v_i} (eq. 9) to assess their fit to a uniform distribution.

⁷Although we have defined the time-rescaling transform using the base intensity instead of the conditional intensity as in [1], the resulting tests are equivalent provided the K-S test is applied using the appropriate distribution.

This approach to model validation has grown in popularity in recent years [14, 23], and has in some instances been used as the only metric for comparing models. We wish to point out that time-rescaling based tests are sensitive to one kind of error (i.e., errors in modeling rescaled ISIs), but may be insensitive to other kinds of model error (i.e., errors in modeling the stimulus-dependent spike rate). Inspection of the CR model likelihood (eq. 10) makes it clear that time-rescaling based goodness-of-fit tests are sensitive only to the accuracy with which φ(v) (or equivalently, q(u)) models the rescaled intervals. The test can in fact be independent of the accuracy with which the model describes the transformation from stimulus to spikes, a point that we illustrate with an (admittedly contrived) example in fig. 2.

For this example, spikes were generated from a "true" model (denoted "a"), a CR model with a biphasic stimulus filter and a gamma renewal density (κ = 10).
Responses from this model were fit by two sub-optimal approximate models: "b", a Poisson (LNP) model, which was specified to have the correct stimulus filter; and "c", a CR model in which the stimulus filter was mis-specified (set to the negative of the true filter), and a renewal density φ(v) was estimated non-parametrically from the rescaled intervals {v_i} (rescaled under the intensity defined by this model).

Although the time-varying spike-rate predictions of model (c) were badly mismatched to those of model (a) (fig. 2, middle), a KS plot (upper right) shows that (c) exhibits near-perfect goodness-of-fit on a time-rescaling test, which the Poisson model (b) fails badly. We cross-validated these models by computing the log-likelihood of novel data, which provides a measure of predictive information about novel spike trains in units of bits/s [24, 18]. Using this measure, the "true" model (a) provides approximately 24 bits/s about the spike response to a novel stimulus. The Poisson model (b) captures only 8 bits/s, but is still much more accurate than the mis-specified renewal model (c), for which the information is slightly negative (indicating that performance is slightly worse than that of a homogeneous Poisson process with the correct rate).

Fig. 3 shows that model (c) can be improved by modeling the dependencies between successive rescaled interspike intervals. We constructed a spline-based non-parametric estimate of the density π(z_{i+1}|z_i), where z_i = Φ(v_i). (We discretized z_i into 7 bins, based on visual inspection of the pairwise dependency structure, and fit a cubic spline with 10 evenly spaced knots on [0, 1] to the density within each bin.) Rescaling these intervals using the cdf of the augmented model yields intervals that are both uniform on [0, 1] and approximately independent (fig.
3, right; independence for non-successive intervals not shown). The augmented model raises the cross-validation score of model (c) to 1 bit/s, meaning that by incorporating dependencies between intervals, the model carries slightly more predictive information than a homogeneous Poisson model, despite the mis-specified stimulus filter. However, despite passing time-rescaling tests of both marginal distribution and independence, this model still carries less information about spike times than the inhomogeneous Poisson model (b).

6 Application to neural data

Figure 4 shows several specific cases of the CR model fit to spiking data from an ON parasol cell in primate retina, which was visually stimulated with binary spatio-temporal white noise (i.e., a flickering checkerboard, [18]). We fit parameters for the CR model with and without spike-history filters, and with and without a non-Poisson renewal density (estimated non-parametrically as described above).

As expected, a non-parametric renewal density allows for remapping of ISIs to the correct (uniform) marginal distribution in rescaled time (fig. 4, left), and leads to near-perfect scores on the time-rescaling goodness-of-fit test (middle). Even when incorporating spike-history filters, the model with conditionally Poisson spiking (red) fails the time-rescaling test at the 95% level, though not so badly as the inhomogeneous Poisson model (blue). However, the conditional Poisson model with spike-history filter (red) outperforms the non-parametric renewal model without spike-history filter (dark gray) on likelihood-based cross-validation, carrying 14% more predictive information. For this neuron, incorporating non-Poisson renewal properties into a model with spike-history-dependent intensity (light gray) provides only a modest (<1%) increase in cross-validation performance.
Thus, in addition to being more tractable for estimation, it appears that the generalized linear modeling framework captures spike-train dependencies more accurately than a non-Poisson renewal process (at least for this neuron). We are in the process of applying this analysis to more data.

Figure 4: Evaluation of four specific cases of the conditional renewal model, fit to spike responses from a retinal ganglion cell stimulated with a time-varying white noise stimulus. Left: marginal distribution over the interspike intervals {z_i}, rescaled according to their cdf defined under four different models: (a) Inhomogeneous Poisson (i.e., LNP) model, without spike-history filter. (b) Conditional renewal model without spike-history filter, with non-parametrically estimated renewal density φ. (c) Conditional Poisson model, with spike-history filter (GLM). (d) Conditional renewal model with spike-history filter and non-parametrically estimated renewal density. A uniform distribution indicates good model fit under the time-rescaling test. Middle: The difference between the empirical cdf of the rescaled intervals (under all four models) and their quantiles. As expected, (a) fares poorly, (c) performs better but slightly exceeds the 95% confidence interval (black lines), and (b) and (d) exhibit near-perfect time-rescaling properties. Right: Likelihood-based cross-validation performance. Adding a non-parametric renewal density adds 4% to the Poisson model performance, but <1% to the GLM performance. Overall, a spike-history filter improves cross-validation performance more than the use of a non-Poisson renewal process.

7 Discussion

We have connected two basic approaches for incorporating spike-history effects into neural encoding models: (1) non-Poisson renewal processes; and (2) conditionally Poisson processes with an intensity that depends on spike train history.
We have shown that both kinds of effects can be regarded as special cases of a conditional renewal (CR) process model, and have formulated the model likelihood in a manner that separates the contributions from these two kinds of mechanisms.

Additionally, we have derived a condition on the CR model renewal density under which the likelihood function over filter parameters is log-concave, guaranteeing that ML estimation of filters (and MAP stimulus decoding) is a convex optimization problem.

We have shown that incorporating a non-parametric estimate of the CR model renewal density ensures near-perfect performance on the time-rescaling goodness-of-fit test, even when the model itself has little predictive accuracy (e.g., due to a poor model of the base intensity). Thus, we would argue that K-S tests based on the time-rescaled interspike intervals should not be used in isolation, but rather in conjunction with other tools for model comparison (e.g., cross-validated log-likelihood). Failure under the time-rescaling test indicates that model performance may be improved by incorporating a non-Poisson renewal density, which, as we have shown, may be estimated directly from rescaled intervals.

Finally, we have applied the CR model to neural data, and shown that it can capture spike-history dependencies in both real and rescaled time. In future work, we will examine larger datasets and explore whether rescaled-time or real-time models provide more accurate descriptions of the dependencies in spike trains from a wider variety of neural datasets.

Acknowledgments

Thanks to E. J. Chichilnisky, A. M. Litke, A. Sher and J. Shlens for retinal data, and to J. Shlens and L. Paninski for helpful discussions.

References

[1] E. Brown, R. Barbieri, V.
Ventura, R. Kass, and L. Frank. The time-rescaling theorem and its application to neural spike train data analysis. Neural Computation, 14:325-346, 2002.

[2] E. J. Chichilnisky. A simple white noise analysis of neuronal light responses. Network: Computation in Neural Systems, 12:199-213, 2001.

[3] E. P. Simoncelli, L. Paninski, J. W. Pillow, and O. Schwartz. Characterization of neural responses with stochastic stimuli. In M. Gazzaniga, editor, The Cognitive Neurosciences, III, chapter 23, pages 327-338. MIT Press, 2004.

[4] M. Berry and M. Meister. Refractoriness and neural precision. Journal of Neuroscience, 18:2200-2211, 1998.

[5] D. S. Reich, F. Mechler, K. P. Purpura, and J. D. Victor. Interspike intervals, receptive fields, and information encoding in primary visual cortex. Journal of Neuroscience, 20(5):1964-1974, 2000.

[6] N. Brenner, W. Bialek, and R. de Ruyter van Steveninck. Adaptive rescaling optimizes information transmission. Neuron, 26:695-702, 2000.

[7] W. Gerstner. Population dynamics of spiking neurons: Fast transients, asynchronous states, and locking. Neural Computation, 12(1):43-89, 2000.

[8] P. Reinagel and R. C. Reid. Temporal coding of visual information in the thalamus. Journal of Neuroscience, 20:5392-5400, 2000.

[9] J. W. Pillow, L. Paninski, V. J. Uzzell, E. P. Simoncelli, and E. J. Chichilnisky. Prediction and decoding of retinal ganglion cell responses with a probabilistic spiking model. Journal of Neuroscience, 25:11003-11013, 2005.

[10] M. A. Montemurro, S. Panzeri, M. Maravall, A. Alenda, M. R. Bale, M. Brambilla, and R. S. Petersen. Role of precise spike timing in coding of dynamic vibrissa stimuli in somatosensory thalamus. Journal of Neurophysiology, 98(4):1871, 2007.

[11] A. L. Jacobs, G. Fridman, R. M. Douglas, N. M. Alam, P. Latham, et al. Ruling out and ruling in neural codes.
Proceedings of the National Academy of Sciences, 106(14):5936, 2009.

[12] M. Berman. Inhomogeneous and modulated gamma processes. Biometrika, 68(1):143-152, 1981.

[13] R. Barbieri, M. C. Quirk, L. M. Frank, M. A. Wilson, and E. N. Brown. Construction and analysis of non-Poisson stimulus-response models of neural spiking activity. Journal of Neuroscience Methods, 105(1):25-37, 2001.

[14] E. Rossoni and J. Feng. A nonparametric approach to extract information from interspike interval data. Journal of Neuroscience Methods, 150(1):30-40, 2006.

[15] K. Koepsell and F. T. Sommer. Information transmission in oscillatory neural activity. Biological Cybernetics, 99(4):403-416, 2008.

[16] R. E. Kass and V. Ventura. A spike-train probability model. Neural Computation, 13(8):1713-1720, 2001.

[17] W. Truccolo, U. T. Eden, M. R. Fellows, J. P. Donoghue, and E. N. Brown. A point process framework for relating neural spiking activity to spiking history, neural ensemble and extrinsic covariate effects. Journal of Neurophysiology, 93(2):1074-1089, 2005.

[18] J. W. Pillow, J. Shlens, L. Paninski, A. Sher, A. M. Litke, E. J. Chichilnisky, and E. P. Simoncelli. Spatio-temporal correlations and visual signaling in a complete neuronal population. Nature, 454:995-999, 2008.

[19] S. Gerwinn, J. H. Macke, M. Seeger, and M. Bethge. Bayesian inference for spiking neuron models with a sparsity prior. Advances in Neural Information Processing Systems, 2008.

[20] I. H. Stevenson, J. M. Rebesco, L. E. Miller, and K. P. Körding. Inferring functional connections between neurons. Current Opinion in Neurobiology, 18(6):582-588, 2008.

[21] L. Paninski. Maximum likelihood estimation of cascade point-process neural encoding models. Network: Computation in Neural Systems, 15:243-262, 2004.

[22] J. W. Pillow. Likelihood-based approaches to modeling the neural code. In K. Doya, S. Ishii, A. Pouget, and R. P.
Rao, editors, Bayesian Brain: Probabilistic Approaches to Neural Coding, pages 53-70. MIT Press, 2007.

[23] T. P. Coleman and S. Sarma. Using convex optimization for nonparametric statistical analysis of point processes. In IEEE International Symposium on Information Theory (ISIT 2007), pages 1476-1480, 2007.

[24] L. Paninski, M. Fellows, S. Shoham, N. Hatsopoulos, and J. Donoghue. Superlinear population encoding of dynamic hand trajectory in primary motor cortex. Journal of Neuroscience, 24:8551-8561, 2004.
", "award": [], "sourceid": 516, "authors": [{"given_name": "Jonathan", "family_name": "Pillow", "institution": null}]}