{"title": "Rescaling, thinning or complementing? On goodness-of-fit procedures for point process models and Generalized Linear Models", "book": "Advances in Neural Information Processing Systems", "page_first": 703, "page_last": 711, "abstract": "Generalized Linear Models (GLMs) are an increasingly popular framework for modeling neural spike trains. They have been linked to the theory of stochastic point processes and researchers have used this relation to assess goodness-of-fit using methods from point-process theory, e.g. the time-rescaling theorem. However, high neural firing rates or coarse discretization lead to a breakdown of the assumptions necessary for this connection. Here, we show how goodness-of-fit tests from point-process theory can still be applied to GLMs by constructing equivalent surrogate point processes out of time-series observations. Furthermore, two additional tests based on thinning and complementing point processes are introduced. They augment the instruments available for checking model adequacy of point processes as well as discretized models.", "full_text": "Rescaling, thinning or complementing? On goodness-of-fit procedures for point process models and Generalized Linear Models\n\nFelipe Gerhard\nBrain Mind Institute\nEcole Polytechnique F\u00e9d\u00e9rale de Lausanne\n1015 Lausanne EPFL, Switzerland\nfelipe.gerhard@epfl.ch\n\nWulfram Gerstner\nBrain Mind Institute\nEcole Polytechnique F\u00e9d\u00e9rale de Lausanne\n1015 Lausanne EPFL, Switzerland\nwulfram.gerstner@epfl.ch\n\nAbstract\n\nGeneralized Linear Models (GLMs) are an increasingly popular framework for\nmodeling neural spike trains. They have been linked to the theory of stochastic\npoint processes and researchers have used this relation to assess goodness-of-fit\nusing methods from point-process theory, e.g. the time-rescaling theorem. 
However, high neural firing rates or coarse discretization lead to a breakdown of the as-\nsumptions necessary for this connection. Here, we show how goodness-of-fit tests\nfrom point-process theory can still be applied to GLMs by constructing equiva-\nlent surrogate point processes out of time-series observations. Furthermore, two\nadditional tests based on thinning and complementing point processes are intro-\nduced. They augment the instruments available for checking model adequacy of\npoint processes as well as discretized models.\n\n1 Introduction\n\nAction potentials are stereotyped all-or-nothing events, meaning that their amplitude is not consid-\nered to transmit any information and only the exact time of occurrence matters. This view suggests\nmodeling neurons\u2019 responses in the mathematical framework of point processes. An observation\nis a sequence of spike times and their stochastic properties are captured by a single function, the\nconditional intensity [1]. For point processes on the time line, several approaches for evaluating\ngoodness-of-fit have been proposed [2]. The most popular in the neuroscientific community has\nbeen a test based on the time-rescaling theorem [3].\n\nIn practice, neural data is binned such that a spike train is represented as a sequence of spike counts\nper time bin. Specifically, Generalized Linear Models (GLMs) are built on this representation. Such\ndiscretized models of time series have mostly been seen as an approximation to continuous point\nprocesses and hence, the time-rescaling theorem was also applied to such models [4, 5, 6, 7, 8].\n\nHere we ask whether the time-rescaling theorem can be translated to discrete time. We\nreview the approximations necessary for the transition to discrete time and point out a procedure\nto create surrogate point processes even when these approximations do not hold (section 2). 
Two\nnovel tests based on two different operations on point processes are introduced: random thinning\nand random complementing. These ideas are applied to a series of examples (section 3), followed\nby a discussion (section 4).\n\nFigure 1: Spike train representations. (A) A trace of the membrane potential of a spiking neuron.\n(B) Information is conveyed in the timings and number of action potentials. This supports the\nrepresentation of neural activity as a point process in which each spike is assumed to be a singular\nevent in time. (C) When time is divided into large bins, the spike train is represented as a time series\nof discrete counts. (D) If the bin width is chosen small enough, the spike train corresponds to a\nbinary time series, indicating the presence of a single spike inside a given time bin.\n\n2 Methods\n\n2.1 Representations of neural activity\n\nWe characterize a neuron by its response in terms of trains of action potentials using the theory of\npoint processes (Figures 1A and 1B). An observation consists of a list of times, each denoting the\ntime point of one action potential. Following a common notation [3, 9], let (0, T] be the time interval\nof the measurement and {ui} be the set of n event times. The stochastic properties of a point process\nare characterized by its conditional intensity function \u03bb(t|H(t)), defined as [1]:\n\n\u03bb(t|Ht) = lim_{\u2206\u21920} P[spike in (t, t + \u2206) | Ht] / \u2206,   (1)\n\nwhere Ht is the history of the stochastic process up to time t and possibly includes other covariates\nof interest. For fitting and evaluating different parameter sets of the conditional intensity function, a\nmaximum-likelihood approach is followed [10, 11]. The log-likelihood of a point process model is\ngiven by [1]:\n\nlog L(point process) = \u2211_{i=1}^{n} log \u03bb(ui|Hui) \u2212 \u222b_0^T \u03bb(t|Ht) dt.   (2)\n\nOne possibility is to use binning-free models (like renewal processes or other parametric models). Alternatively,\n\u03bb(t|Ht) can be modeled as a piece-wise constant function with each piece having length \u2206.\nIn this case, the history term Ht covers the history up to the time of the left edge of the current bin.\nInside the bin, the process locally behaves like a Poisson process with constant rate \u03bbk = \u03bb(tk|Hk)\nwith tk = \u2206k and Hk = Htk. Using the number of spikes ck per bin as a representation of the observation,\nthe discretized version of Equation 2 is equivalent to the log-likelihood of a series of Poisson\nsamples (apart from terms that are not dependent on \u03bb(t|Ht)). Hence, for finding the maximum-likelihood\nsolution for the point process, it is sufficient to maximize the likelihood of\nsuch a Poisson regression model. The result of fitting will be a sequence of \u00b5i for each bin, where\n\u00b5i is the expected number of counts. Since a local Poisson process is assumed within the bins, \u00b5i is\nrelated to \u03bbi via: \u03bbi = \u00b5i/\u2206.\nA complementary approach to the point process framework is to see spike trains as time series,\ne.g. as a sequence of counts {ci} or binary events {bi} (Figures 1C and 1D). For Poisson-GLMs,\na sequence of Poisson-distributed count variables ci is modeled and the linear sum of covariates\nis linked to the expected mean of the Poisson distribution \u00b5i. Binary time series can be modeled\nas a sequence of conditionally independent Bernoulli trials with outcomes 0 and 1 and success\nprobabilities {pi}. For Bernoulli-GLMs, the pi are linked via a non-linear transfer function to\na linear sum of covariates. 
Defined this way, the likelihood for an observed sequence bi given a particular model of pi is given by log L(Bernoulli) = \u2211_k bk log(pk / (1 \u2212 pk)) + \u2211_k log(1 \u2212 pk). In the\napproximation of \u00b5i \u226a 1, \u00b5i becomes approximately pi and the likelihoods of the Bernoulli and\nPoisson series become equivalent. Moreover, using the same approximation, it is possible to link\nthe Bernoulli series to the conditional intensity function \u03bb(t|Ht) via \u03bbi \u2248 pi/\u2206. Traditionally,\nthis path was chosen to relate the time series to the theory of point processes and to be able to use\ngoodness-of-fit analyses available for such point processes [9].\n\n[Figure 2: three panels. (A) Time-rescaling: original spike train and rescaled spike train, with t\u2032i = \u222b_0^{ti} \u03bb(t|Ht) dt. (B) Random thinning: original spike train and thinned spike train, with pi = B/\u03bbi. (C) Complementing: original spike train, complementary process \u03bbC and complemented spike train.]\n\nFigure 2: Overview of goodness-of-fit tests for point-process models. (A) Using the time-rescaling\ntheorem, the time of each spike is rescaled according to the integral of the conditional intensity\nfunction. (B) Assuming that the conditional intensity function has a lower limit B, spikes of the\noriginal spike train are thinned by keeping a spike only with probability B/\u03bbi. (C) Assuming that\nthe conditional intensity function has an upper limit C, a complementary process \u03bbC = C \u2212 \u03bb can\nbe constructed. Adding samples from this inhomogeneous Poisson process to the observed spikes\nresults in a homogeneous Poisson process with rate C.\n\n2.2 Goodness-of-fit tests for point processes\n\nStatistical tests are usually evaluated using two measures: The specificity (fraction of correct models\nthat pass the test) and the sensitivity or test power (fraction of wrong models that are properly\nrejected by the test). The specificity is set by the significance level: With significance level \u03b1, the\nspecificity is 1 \u2212 \u03b1. The sensitivity of a given test depends on how strongly the\nmodeled intensity function departs from the true intensity.\n\n2.2.1 The time-rescaling theorem\n\nA popular way for verifying point-process-based models has been the time-rescaling theorem [3, 12].\nIt states that if {ui} is a realization of events from a point process with conditional intensity \u03bb(t|Ht),\nthen rescaling via the transformation u\u2032i = \u222b_0^{ui} \u03bb(t|Ht) dt will yield a unit-rate Poisson process.\nWe call the following transformation the na\u00efve time-rescaling when it is applied to binary sequences.\nThe spike time ui falling into bin j is transformed into: u\u2032i = \u2211_{k=1}^{j} pk.\n\n2.2.2 Thinning point processes\n\nIt is well known that an inhomogeneous point process can be simulated by generating a homo-\ngeneous Poisson process with constant intensity C with C \u2265 max \u03bb(t) (the so-called dominant\nprocess) and keeping every spike at time ti with probability p = \u03bb(ti)/C [13, 2]. In reverse, this\ncan be used to do model-checking [14]: Let B be a lower bound of the fitted conditional intensity\n\u03bb(t|H(t)). Now take \u03bb(t|H(t)) as the dominant process with samples ui. Thin the process by keeping\na spike with probability B/\u03bb(ti|Ht). For a correctly specified model \u03bb(t|Ht), the thinned process\nwill be a homogeneous Poisson process with rate B (Figure 2B).\n\nTypically, B = min \u03bb(t) \u226a \u03bb\u0304(t) (due to absolute refractoriness in most renewal process models\nand GLMs), such that the thinned process will have a prohibitively low rate and only very few spikes\nwill be selected. Testing the Poisson hypothesis on a handful of spikes will result in a vanishingly\nlow power.\n\nTo circumvent this problem, we propose the following remedy: Let B\u2217 be a threshold which may\nbe higher than the lower bound B. Then consider only the intervals of \u03bb for which \u03bb > B\u2217 and\nconcatenate those into a new point process. After applying the thinning procedure on all spikes of\nthe stitched process, the thinned process should be a Poisson process with rate B\u2217. This procedure\ncan be repeated K times for a range of uniformly spaced B\u2217s ranging from B to C (upper bound).\nStretching each thinned process by a factor of B\u2217 creates a set of K unit-rate processes. Each of\nthem is tested for the Poisson hypothesis by a Kolmogorov-Smirnov test on the inter-spike intervals.\nThe model is rejected when at least one of the null hypotheses is significantly rejected. To correct for\nthe multiple tests, we employ Simes\u2019 procedure. It tests the global null hypothesis that all tested\nsub-hypotheses are true against the alternative hypothesis that at least one hypothesis is false. To\nthis end, it transforms the ordered list of p-values p(1), ..., p(K) into Kp(1)/1, Kp(2)/2, ..., Kp(K)/K. If any\nof the transformed p-values is less than the significance level \u03b1 = .05, the model is rejected [15] (see footnote 1).\n\n2.2.3 Complementing point processes\n\nThe idea of thinning might also be used the other way round. 
Assume the observations ui have been\ngenerated by thinning a homogeneous Poisson process with rate C using the modeled conditional\nintensity \u03bb(t|Ht) as the lower bound. Then we can define a complementary process \u03bbc(t) = C \u2212\n\u03bb(t|Ht) such that adding spikes from the complementary point process to the observed spikes\nyields a homogeneous Poisson process with rate C. This algorithm is a straightforward\ninversion of the thinning algorithms discussed in [2, 1].\n\nIt might happen that the upper bound C of the modeled intensity is much larger than the average\n\u03bb(t). In that case, the observed spike pattern would be distorted by a high number of Poisson spikes\nfrom the complementary process and the test power would be low. To avoid this, a technique similar\nto the one used for the thinning procedure can be employed. Define a threshold C\u2217 \u2264 C and consider only the\nregion of the spike train for which \u03bb(t|H(t)) < C\u2217. Apply the complementing procedure on these\nparts of the spike train to obtain a point process with rate C\u2217 when concatenating the intervals. This\nprocess can be repeated K times with values C\u2217 ranging from B to C. A multiple-test correction\nmust be used; again, we propose Simes\u2019 method (see previous section).\n\n2.3 Creating surrogate point processes from time series\n\nSince the time-rescaling theorem can only be used when \u03bb(t|Ht) and the exact spike times {ui} are\nknown, it is not a priori clear how it applies to discretized time-series models. For such cases,\nwe propose to generate surrogate point process samples that are equivalent to the observed time\nseries. To apply the time-rescaling theorem on discretized models such as GLMs, the integral of the\ntime transformation is replaced by a discrete sum over bins (the na\u00efve time-rescaling). 
Taking the\nsimplest example of a homogeneous Poisson process, it is evident that the possible values for the\nrescaled intervals form a finite set. This contradicts the time-rescaling theorem that states that the\nintervals are (continuously) exponentially distributed. Hence, using the time-rescaling theorem on\ndiscretized data produces a bias [17].\n\nWhile Haslinger et al. considered a modification of the time-rescaling theorem to explicitly ac-\ncount for the discrete nature of the model [17], we propose a general, simple scheme for forming\nsurrogate point processes from Poisson- and Bernoulli-GLMs that can be used for the continuous\ntime-rescaling theorem as well as for any other goodness-of-fit test designed for point-process data\n(Figure 3).\n\n[Figure 3: flow diagram. Binning-free model: spike times {ui}; conditional intensity \u03bb(t|H(t)). Bernoulli-GLM: binary observations {bi}; spiking probabilities {pi}; set \u00b5i = \u2212ln(1 \u2212 pi); draw {ci} from the Poisson distribution with mean \u00b5i conditioned on ci \u2265 1. Poisson-GLM: spike counts {ci}; expected counts {\u00b5i}. Both GLM branches: draw spike times {ui} from Unif(0, \u2206) inside each bin; set \u03bbi = \u00b5i/\u2206. All branches: apply goodness-of-fit procedures based on point process {ui} and conditional intensity \u03bb(t|H(t)).]\n\nFigure 3: Creating surrogate point processes from time series. For bin-free point process models\nfor which the spike times and a conditional intensity \u03bb(t|H(t)) are available, goodness-of-fit tests\nfor point processes can be readily applied. For Poisson-GLMs, exact spike times are drawn inside\neach bin for the specified number of spikes that were observed. The piece-wise constant conditional\nintensity function is linked to the modeled number of counts per bin via \u03bbi = \u00b5i/\u2206. For Bernoulli-\nGLMs, the probability of obtaining at least one spike per bin pi is modeled. For each bin with spikes\n(bi = 1) \u2013 assuming a local Poisson process \u2013 a sample ci from a biased Poisson distribution with\nmean \u00b5i = \u2212 ln(1 \u2212 pi) is drawn together with corresponding spike times. Finally, point-process\nbased goodness-of-fit tests may be applied to this surrogate spike train.\n\nFootnote 1: The K tests contain overlapping regions of the same spike train, hence, we expect the statistical tests to be\ncorrelated. In these cases, a simple Bonferroni-correction would be too conservative [16].\n\nPoisson-GLMs: The observation consists of a sequence of count variables ci that is modeled as\na sample from Poisson distributions with mean \u00b5i. Hence, the modeled process can be regarded\nas a piecewise-constant intensity function. The expected number of spikes of a Poisson process is\nrelated to its intensity via \u00b5i = \u03bbi\u2206 such that we can construct the conditional intensity function as\npiece-wise constant with values \u03bbi = \u00b5i/\u2206. 
Conditioned on the number of spikes that occurred in\na homogeneous Poisson process of rate \u03bbi, the exact spike times are uniformly distributed inside bin\ni. A surrogate point process can be constructed from a Poisson-GLM by generating random spike\ntimes (i \u2212 1 + Unif(0, 1))\u2206 for each spike within bin i (1 \u2264 i \u2264 N) for all bins with ci > 0. One\ncan then proceed to the point-process-based goodness-of-fit tools using the surrogate spike train and\nits conditional intensity \u03bbi.\nBernoulli-GLMs: Based on the observed binary spike train {bi}, the sequence of probabilities pi\nof spiking within bin i is modeled. We can relate this to the point process framework using the\nfollowing observations: Assume that pi denotes the probability of finding at least one spike within\neach bin (see footnote 2) and that locally, the process behaves like a Poisson process. Then, writing P_{\u00b5i} for the\nPoisson distribution with mean \u00b5i, pi = P_{\u00b5i}(X \u2265 1) = 1 \u2212 P_{\u00b5i}(X = 0) = 1 \u2212 exp(\u2212\u00b5i). The conditional intensity is given by \u03bbi = \u00b5i/\u2206 =\n\u2212ln(1 \u2212 pi)/\u2206. In practice, for each bin with bi = 1, we draw the number of spikes within the\nbin by first sampling from the distribution P_{\u00b5i}(X = k | k \u2265 1) and sample exact spike times\nuniformly as in the case of the Poisson-GLMs.\n\n3 Results\n\nHere, we compare the performance of the three different approaches in detecting wrongly specified\nmodels, using examples of models that are commonly applied in neural data analysis. For the\nthinning and complementing procedure, K = 10 partitions were chosen (see section 2.2.2). Unless\notherwise noted, we report the test power at a specificity of 1 \u2212 \u03b1 = .95. 
The Poisson hypothesis in\nthe proposed procedures is tested by a Kolmogorov-Smirnov test on the inter-spike intervals of the\ntransformed process.\n\nFootnote 2: Such clipping is implicitly performed in many studies, e.g. in [18, 19, 20].\n\n3.1 Example: Inhomogeneous Poisson process\n\n[Figure 4: three panels. (a) intensity function: intensity function against time [s]; legend: no jitter, medium jitter, high jitter. (b) test power: test power against jitter strength; legend: complementing, thinning, rescaling, naive rescaling. (c) ROC curve: test power against 1\u2212specificity; legend: complementing, thinning, rescaling, naive rescaling.]\n\nFigure 4: Inhomogeneous Poisson process. (A) Sample intensity functions for an undistorted inten-\nsity (black line) and two models with jitters in the coefficients (\u03b2 = 12, medium jitter and \u03b2 = 30,\nlarge jitter). (B) The test power of each test as a function of the jitter strength. The dashed line\nindicates the level of the medium jitter strength (red line in figure A). (C) ROC curve analysis for\nan intermediate jitter strength of \u03b2 = 12. The intersection of the curves with the dashed line corre-\nsponds to the test power at a significance level of \u03b1 = .05.\n\nConsider an inhomogeneous Poisson process with band-limited intensity: \u03bb(t|Ht) = \u03bb(t) =\n20 Hz + \u2211_{j=1}^{J=40} uj sin(2\u03c0f(t \u2212 jT/J)) / (\u03c0(t \u2212 jT/J)) with f = 1 Hz and J = 40 coefficients that were randomly\ndrawn from a uniform distribution on the interval [0, 20]. The process was simulated over a length\nof T = 20 s and the intensity was discretized with \u2206 = 1 ms. Negative intensities were clipped\nto zero. 
A binary spike train was generated by calculating the probability of at least one spike in\neach time bin as pi = 1 \u2212 exp(\u2212\u03bb(ti)\u2206) and drawing samples from a Bernoulli distribution with\nspecified probabilities pi.\nFor evaluating the different algorithms, wrong models for the intensity were created with jittered\ncoefficients u\u2032k = uk + \u03b2 Unif(\u22121, 1) where \u03b2 indicates the strength of the deviation from the true\nmodel. For each jitter strength, N = 1000 spike trains were generated from the true model and\n\u03bb(t|Ht) was constructed using the wrong model (Figure 4A). For any \u03b2 > 0, the fraction of rejected\nmodels defines the sensitivity or test power. For \u03b2 = 0, the fraction of accepted models defines the\nspecificity which was controlled to be at 1 \u2212 \u03b1 = .95 for each test.\nAll three methods (rescaling, thinning, complementing) show a specified type-I error of approx-\nimately 5% (\u03b2 = 0) and progressively detect the wrong models. Notably, the complementing\nand thinning procedures detect a departure from the correct model at smaller jitter strengths than the classical\nrescaling (Figure 4B). For comparison, the na\u00efve implementation of the rescaling transformation is\nalso shown. The significance level for the KS test used for the na\u00efve time-rescaling was adjusted to\n\u03b1 = .015 to achieve a 95% specificity. The adjustment was necessary due to the discretization bias\n(see section 2.3).\n\nFor models with an intermediate jitter strength (\u03b2 = 12), ROC curves were constructed. Here, for a\ngiven significance level \u03b1, a pair of true and false positive rates can be calculated and plotted for each\ntest (taking N = 1000 repetitions using the true model and the model with jittered coefficients). 
It\ncan be seen that especially for intermediate jitter strengths, complementing and thinning outperform\ntime-rescaling (Figure 4C), independent of the chosen significance level.\n\n3.2 Example: Renewal process\n\nIn a second example, we consider renewal processes, i.e. inter-spike intervals are an i.i.d. sample\nfrom a specific probability distribution p(\u2206t). In this case, the conditional intensity is given by\n\u03bb(t|Ht) = p(t \u2212 t\u2217) / (1 \u2212 \u222b_0^{t\u2212t\u2217} p(u) du), where t\u2217 denotes the time of the last spike prior to time t. For this\nexample, we chose the Gamma distribution as it is commonly used to model real spike trains [4, 3, 7].\n\nThe spike train was generated from a true model, following a Gamma distribution with scale param-\neter A = 0.032 and shape parameter B = 6.25: p(\u2206t) = (\u2206t)^{B\u22121} e^{\u2212\u2206t/A} / (A^B \u0393(B)). Wrong models were\ngenerated by scaling the shape and scale parameter by a factor of 1 + \u03b2 (\u201cjitter\u201d) while keeping the\nexpected value of the distribution constant (i.e. B\u2032 = (1 + \u03b2)B, A\u2032 = (1 + \u03b2)^{\u22121}A) (Figure 5A).\nFor each jitter strength, N = 1000 data sets of length T = 20 s were generated from the true model\nand the wrong model and the tests were applied.\n\n[Figure 5: three panels. (a) intensity function: probability density function against inter-spike interval [s]; legend: sample ISI distribution, no jitter, medium jitter, high jitter. (b) test power: test power against jitter strength; legend: rescaling, thinning, complementing, naive rescaling. (c) ROC curve: test power against 1\u2212specificity; legend: rescaling, thinning, complementing, naive rescaling.]\n\nFigure 5: Renewal process. 
(A) Inter-spike interval distributions for the undistorted (black line) and\ndistorted models (medium jitter, \u03b2 = 0.5 and strong jitter, \u03b2 = 1.0). For comparison, a sample ISI\nhistogram from one of the simulations is shown in gray. Note that the mean of the three distributions\nis matched to be the same (vertical dashed line). (B) The test power of each test as a function of the\njitter strength. The dashed line indicates the level of the medium jitter strength (red line in figure A).\n(C) ROC curve analysis for an intermediate jitter strength of \u03b2 = 0.5. The intersection of the curves\nwith the dashed line corresponds to the test power at a significance level of \u03b1 = .05.\n\nThe analysis of test power for each test and the ROC curve analysis for an intermediate jitter strength\nreveal that time-rescaling is slightly superior to thinning and complementing (Figure 5B and C). The\nna\u00efve time-rescaling performs worst (adjusted significance level for the KS test, \u03b1 = .017).\n\n3.3 Example: Inhomogeneous Spike Response Model\n\nWe model an inhomogeneous spike response model with escape noise using a Bernoulli-GLM [21].\nThe spiking probability is modulated by an inhomogeneous rate r(t). Additionally, for each spike,\na post-spike kernel is added to the process intensity. The rate function is modeled like in the first\nexample as a band-limited function rti = r(ti) = \u2211_{j=1}^{J=40} uj sin(2\u03c0f(ti \u2212 jT/J)) / (\u03c0(ti \u2212 jT/J)) with f = 1 Hz\nand J = 40 coefficients that were randomly drawn from a uniform distribution on the interval\n[\u22120.2, 0.2]. The post-spike kernel \u03b7(\u2206t) is modeled as a sum of three exponential functions (\u03c4 =\n5 ms, 25 ms and 1 s) with appropriate amplitudes as to mimic a relative refractory period, a small\nrebound and a slow (inhibitory) adaptation. To construct the Bernoulli-GLM, the spiking probability\npi per bin of length \u2206 = 1 ms is pi = 1 / (1 + exp(\u2212si)) with si = \u22123 + rti + \u2211_{uj < ti} \u03b7(uj \u2212 ti).\n\nA binary time series (the spike train) was generated for a duration of T = 20 s. The jittered models\nwere constructed by adding a jitter \u03b2 on the coefficients of the inhomogeneous rate modulation\n(Figure 6A). For each jitter strength, N = 1000 data sets were generated from the true model and\nthe wrong model and the tests were applied.\n\nBoth thinning and complementing are able to detect smaller distortions than time-rescaling\non either the surrogate or the discrete data (Figure 6B, adjusted significance level for the na\u00efve rescaling,\n\u03b1 = .018). A ROC curve analysis for an intermediate jitter strength (\u03b2 = 0.4) supports this finding\n(Figure 6C).\n\n4 Discussion\n\nAssessing goodness-of-fit for Generalized Linear Models has mostly been done by applying the\ntime-rescaling transformation that is defined for point processes, assuming a match between those\napproaches. When the per-bin probability of spiking cannot be regarded as low, this approximation\nbreaks down and creates a bias when applying the time-rescaling transformation [17]. In a first\nstep, we proposed a procedure to create surrogate point processes from discretized models, such as\nBernoulli- and Poisson-GLMs, that do not exhibit this bias. Throughout all the examples, the time-\nrescaling theorem applied to the surrogate point process was systematically better than applying the\nna\u00efve time-rescaling on the discrete data. 
Since only the adjusted time-rescaling procedure allows\nthe specificity of the test to be controlled reliably, it should be preferred over the classical time-rescaling\nin all cases where discretized models are used.\n\n[Figure 6: three panels. (a) intensity function: intensity function against time [s]; legend: no jitter, medium jitter, high jitter. (b) test power: test power against jitter strength; legend: thinning, complementing, rescaling, naive rescaling. (c) ROC curve: test power against 1\u2212specificity; legend: thinning, complementing, rescaling, naive rescaling.]\n\nFigure 6: Inhomogeneous Spike Response Model. (A) Sample intensity functions for an undistorted\nintensity (black line) and two misspecified models (medium jitter, \u03b2 = 0.4 and strong jitter, \u03b2 =\n1.0). (B) The test power of each test as a function of the jitter strength. The dashed line indicates the\nlevel of the medium jitter strength (red line in figure A). (C) ROC curve analysis for an intermediate\njitter strength of \u03b2 = 0.4. The intersection of the curves with the dashed line corresponds to the test\npower at a significance level of \u03b1 = .05.\n\nWe have presented two alternatives to an application of the time-rescaling theorem: For the first\nprocedure, the observed spike train is thinned according to the value of the conditional intensity at\nthe time of spikes. The resulting process is then a homogeneous Poisson process with a rate that is\nequal to the lower bound on the conditional intensity. 
The second proposed method builds on the\nidea that an intensity function \u03bb(t) with an upper bound C can be filled up to a homogeneous Poisson\nprocess of rate C by adding spike samples from the complementary process C \u2212 \u03bb(t). The proposed\ntests work best if the lower and upper bounds are tight. However, in most practical cases, the lower\nbound in particular will be too low to apply any statistical test on the thinned process. As a\nremedy, we proposed to consider only regions of \u03bb(t|H(t)) for which the intensity exceeds a given\nthreshold and repeat the thinning for different thresholds. This successfully overcomes the limitation\nthat may have \u2013 up to now \u2013 prevented the use of the thinning algorithm as a goodness-of-fit measure\nfor neural models.\n\nThe three tests are complementary in the sense that they are sensitive to different deviations between the\nmodeled and true intensity functions. Time-rescaling is only sensitive to the total integral of the\nintensity function between spikes, while thinning exclusively considers the intensity function at the\ntime of spikes and is insensitive to its value at places where no spikes occurred. Complementing is\nsensitive to the exact shape of \u03bb(t) regardless of where the spikes from the original observations are.\nFor the examples of an inhomogeneous Poisson process and the Spike Response Model, thinning\nand complementing are more sensitive than the simple time-rescaling procedure. They can\ndetect deviations from the model that are only half as large as the ones necessary to alert the test\nbased on time-rescaling. For modeling renewal processes, time-rescaling was slightly advantageous\ncompared to the two other methods. 
This should not come as a surprise since the time-rescaling test\nis known to be sensitive to how well the distribution of inter-spike intervals is modeled [3].\n\nApart from likelihood criteria [12, 22, 23], there exist few goodness-of-fit tools for neural models\nbased on Generalized Linear Models [2, 24]. With the proposed procedure for surrogate point\nprocesses, we bridge the gap between such discrete models and point processes. This enables the\nuse of additional tests from this domain, such as thinning and complementing procedures. We\nexpect these to be valuable contributions to the general practice of statistical evaluation in modeling\nsingle neurons as well as neural populations.\n\nAcknowledgments\n\nFelipe Gerhard thanks Gordon Pipa and Robert Haslinger for helpful discussions. Felipe Gerhard\nis supported by the Swiss National Science Foundation (SNSF) under the grant number 200020-117975.\n\nReferences\n\n[1] Daley, D. J., & Vere-Jones, D. (2002). An Introduction to the Theory of Point Processes, Volume 1 (2nd ed.). New York: Springer.\n\n[2] Ogata, Y. (1981). On Lewis\u2019 simulation method for point processes. IEEE Transactions on Information Theory, 27(1).\n\n[3] Brown, E. N., Barbieri, R., Ventura, V., Kass, R. E., & Frank, L. M. (2002). The time-rescaling theorem and its application to neural spike train data analysis. Neural Computation, 14(2), 325\u2013346.\n\n[4] Barbieri, R., Quirk, M. C., Frank, L. M., Wilson, M. A., & Brown, E. N. (2001). Construction and analysis of non-Poisson stimulus-response models of neural spiking activity. Journal of Neuroscience Methods, 105(1), 25\u201337.\n\n[5] Koyama, S., & Kass, R. E. (2008). Spike train probability models for stimulus-driven leaky integrate-and-fire neurons. Neural Computation, 20(7), 1776\u20131795.\n\n[6] Rigat, F., de Gunst, M., & van Pelt, J. (2006). Bayesian modelling and analysis of spatio-temporal neuronal networks. 
Bayesian Analysis, 1(4), 733\u2013764.\n\n[7] Shimokawa, T., & Shinomoto, S. (2009). Estimating instantaneous irregularity of neuronal firing. Neural Computation, 21(7), 1931\u20131951.\n\n[8] Wojcik, D. K., Mochol, G., Jakuczan, W., Wypych, M., & Waleszczyk, W. (2009). Direct estimation of inhomogeneous Markov interval models of spike trains. Neural Computation, 21(8), 2105\u20132113.\n\n[9] Truccolo, W., Eden, U. T., Fellows, M. R., Donoghue, J. P., & Brown, E. N. (2005). A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. Journal of Neurophysiology, 93(2), 1074\u20131089.\n\n[10] Pawitan, Y. (2001). In all likelihood: statistical modelling and inference using likelihood. Oxford: Oxford University Press.\n\n[11] Doya, K., Ishii, S., Pouget, A., & Rao, R. P. N. (2007). Bayesian brain: Probabilistic approaches to neural coding. Cambridge, MA: MIT Press.\n\n[12] Pillow, J. W. (2009). Time-rescaling methods for the estimation and assessment of non-Poisson neural encoding models. In Y. Bengio, D. Schuurmans, J. Lafferty, C. K. I. Williams, & A. Culotta (Eds.), Advances in Neural Information Processing Systems 22 (pp. 1473\u20131481). Cambridge, MA: MIT Press.\n\n[13] Lewis, P. A. W., & Shedler, G. S. (1979). Simulation of nonhomogeneous Poisson processes by thinning. Naval Research Logistics Quarterly, 26, 403\u2013413.\n\n[14] Schoenberg, F. P. (2003). Multidimensional residual analysis of point process models for earthquake occurrences. Journal of the American Statistical Association, 98(464), 789\u2013795.\n\n[15] Simes, R. J. (1986). An improved Bonferroni procedure for multiple tests of significance. Biometrika, 73(3), 751\u2013754.\n\n[16] Rodland, E. A. (2006). Simes\u2019 procedure is \u2018valid on average\u2019. Biometrika, 93(3), 742\u2013746.\n\n[17] Haslinger, R., Pipa, G., & Brown, E. (2010). 
Discrete time rescaling theorem: Determining goodness of fit for discrete time statistical models of neural spiking. Neural Computation, 22(10), 2477\u20132506.\n\n[18] Schneidman, E., Berry, M. J., Segev, R., & Bialek, W. (2006). Weak pairwise correlations imply strongly correlated network states in a neural population. Nature, 440(7087), 1007\u20131012.\n\n[19] Pillow, J. W., Shlens, J., Paninski, L., Sher, A., Litke, A. M., Chichilnisky, E. J., & Simoncelli, E. P. (2008). Spatio-temporal correlations and visual signalling in a complete neuronal population. Nature, 454(7207), 995\u2013999.\n\n[20] Tang, A., Jackson, D., Hobbs, J., Chen, W., Smith, J. L., Patel, H., Prieto, A., Petrusca, D., Grivich, M. I., Sher, A., Hottowy, P., Dabrowski, W., Litke, A. M., & Beggs, J. M. (2008). A maximum entropy model applied to spatial and temporal correlations from cortical networks in vitro. Journal of Neuroscience, 28(2), 505\u2013518.\n\n[21] Gerstner, W., & Kistler, W. M. (2002). Spiking Neuron Models. Cambridge: Cambridge University Press.\n\n[22] Wood, F., Roth, S., & Black, M. (2006). Modeling neural population spiking activity with Gibbs distributions. In Y. Weiss, B. Sch\u00f6lkopf, & J. Platt (Eds.), Advances in Neural Information Processing Systems 18 (pp. 1537\u20131544). Cambridge, MA: MIT Press.\n\n[23] Pillow, J., Berkes, P., & Wood, F. (2009). Characterizing neural dependencies with copula models. In D. Koller, D. Schuurmans, Y. Bengio, & L. Bottou (Eds.), Advances in Neural Information Processing Systems 21 (pp. 129\u2013136). Cambridge, MA: MIT Press.\n\n[24] Brown, E. N., Barbieri, R., Eden, U. T., & Frank, L. M. (2003). Likelihood methods for neural spike train data analysis. In J. Feng (Ed.), Computational Neuroscience: A comprehensive approach (pp. 
253\u2013286). London: Chapman and Hall.", "award": [], "sourceid": 767, "authors": [{"given_name": "Felipe", "family_name": "Gerhard", "institution": null}, {"given_name": "Wulfram", "family_name": "Gerstner", "institution": null}]}