{"title": "Sparse Space-Time Deconvolution for Calcium Image Analysis", "book": "Advances in Neural Information Processing Systems", "page_first": 64, "page_last": 72, "abstract": "We describe a unified formulation and algorithm to find an extremely sparse representation for Calcium image sequences in terms of cell locations, cell shapes, spike timings and impulse responses. Solution of a single optimization problem yields cell segmentations and activity estimates that are on par with the state of the art, without the need for heuristic pre- or postprocessing. Experiments on real and synthetic data demonstrate the viability of the proposed method.", "full_text": "Sparse space-time deconvolution\n\nfor Calcium image analysis\n\nFerran Diego\n\nFred A. Hamprecht\n\nHeidelberg Collaboratory for Image Processing (HCI)\nInterdisciplinary Center for Scienti\ufb01c Computing (IWR)\nUniversity of Heidelberg, Heidelberg 69115, Germany\n\n{ferran.diego,fred.hamprecht}@iwr.uni-heidelberg.de\n\nAbstract\n\nWe describe a uni\ufb01ed formulation and algorithm to \ufb01nd an extremely sparse rep-\nresentation for Calcium image sequences in terms of cell locations, cell shapes,\nspike timings and impulse responses. Solution of a single optimization problem\nyields cell segmentations and activity estimates that are on par with the state of\nthe art, without the need for heuristic pre- or postprocessing. Experiments on real\nand synthetic data demonstrate the viability of the proposed method.\n\n1\n\nIntroduction\n\nA detailed understanding of brain function is a still-elusive grand challenge. Experimental evidence\nis collected mainly by electrophysiology and \u201cCalcium imaging\u201d. In the former, multi-electrode\narray recordings allow the detailed study of hundreds neurons, while \ufb01eld potentials reveal the col-\nlective action of dozens or hundreds of neurons. 
The more recent Calcium imaging, on the other hand, is a fluorescence microscopy technique that allows the concurrent monitoring of the individual actions of thousands of neurons. While its temporal resolution is limited by the chemistry of the employed fluorescent markers, its great information content makes Calcium imaging an experimental technique of first importance in the study of neural processing, both in vitro [16, 6] and in vivo [5, 7]. However, the acquired image sequences are large, and in laboratory practice the analysis remains a semi-manual, tedious and subjective task.\n\nCalcium image sequences reveal the activity of neural tissue over time. Whenever a neuron fires, its fluorescence signal first increases and then decays in a characteristic time course. Evolutionary and energetic constraints on the brain guarantee that, in most cases, neural activity is sparse in both space (only a fraction of neurons fire at a given instant) and time (most neurons fire only intermittently). The problem setting can be formalized as follows: given an image sequence as input, the desired output is (i) a set of cells1 and (ii) a set of time points at which these cells were triggered. We here propose an unsupervised learning formulation and algorithm that leverages the known structure of the data to produce the sparsest representations published to date, and allows for meaningful automated analysis.\n\n1.1 Prior Art\n\nStandard laboratory practice is to delineate each cell manually by a polygon, and then integrate its fluorescence response over the polygon, for each point in time. The result is a set of time series, one per cell.\n\n1Optical sectioning by techniques such as confocal or two-photon microscopy implies that we see only parts of a neuron, such as a slice through its cell body or a dendrite, in an image plane. 
For brevity, we simply refer to these as "cells" in the following.\n\na) Matrix factorization [13, 15, 4, 3, 12]. b) Convolutional sparse coding [8, 25, 20, 17, 14].\n\nFigure 1: Sketch of selected previous work. Left: Decomposition of an image sequence into a sum of K components. Each component is given by the Cartesian product of a spatial component or basis image Dk and its temporal evolution uk. In this article, we represent such Cartesian products by the convolution of multidimensional arrays. Right: Description of a single image in terms of a sum of latent feature maps Dk convolved with filters Hk.\n\nGiven that the fluorescence signal impulse response to a stimulus is stereotypic, these time series can then be deconvolved to obtain a sparse temporal representation for each cell using nonnegative sparse deconvolution [24, 5, 10].\n\nThe problem of automatically identifying the cells has received much less attention, possibly due to the following difficulties [16, 23]: i) low signal-to-noise ratio (SNR); ii) large variation in luminance and contrast; iii) heterogeneous background; iv) partial occlusion; and v) pulsations due to heartbeat or breathing in live animals. Existing work either hard-codes prior knowledge on the appearance of specific cell types [16, 22], uses supervised learning (pixel- and object-level classification, [23]), or uses unsupervised learning (convolutional sparse block coding, [14]).\n\nClosest in spirit to our work are attempts to simultaneously segment the cells and estimate their time courses. 
This is accomplished by matrix factorization techniques such as independent component analysis [13], nonnegative matrix factorization [12] and (hierarchical) dictionary learning [4, 3]. However, none of the above gives results that are truly sparse in time; and all of the above have to go to some lengths to obtain reasonable cell segmentations: [13, 4] resort to heuristic post-processing, while [3] invokes structured sparsity inducing norms and [15] uses an additional clustering step as initialization. All these extra steps are necessary to ensure that each spatial component represents exactly one cell.\n\nIn terms of mathematical modeling, we build on recent advances and experiments in convolutional sparse coding such as [8, 25, 20, 17, 14]. Ref. [21] already applies convolutional sparse coding to video, but achieves sparsity only in space and not in time (where only pairs of frames are used to learn latent representations). Refs. [19, 18] consider time series which they deconvolve, without however using (or indeed needing, given their data) a sparse spatial representation.\n\n1.2 Contributions\n\nSummarizing prior work, we see three strands: i) Fully automated methods that require an extrinsic cell segmentation, but can find a truly2 sparse representation of the temporal activity. ii) Fully automated methods that can detect and segment cells, but do not estimate time courses in the same framework. iii) Techniques that both segment cells and estimate their time courses. Unfortunately, existing techniques either produce temporal representations that are not truly sparse [12, 4, 3] or do not offer a unified formulation of segmentation and activity detection that succeeds without extraneous clustering steps [15].\n\nIn response, we offer the first unified formulation in terms of a single optimization problem: its solution simultaneously yields all cells along with their actions over time. 
The representation of activity is truly sparse, ideally in terms of a single nonzero coefficient for each distinct action of a cell. This is accomplished by sparse space-time deconvolution. Given a motion-corrected sequence of Calcium images, it estimates i) locations of cells and ii) their activity, along with iii) typical cell shapes and iv) typical impulse responses. Taken together, these ingredients afford the sparsest, and thus hopefully most interpretable, representation of the raw data. In addition, our joint formulation allows us to estimate a nonuniform and temporally variable background. Experiments on difficult artificial and real-world data show the viability of the proposed formulation.\n\n2We distinguish a sparse representation, in which the estimated time course of a cell has many zeros; and a "truly sparse" representation in which a single action of a cell is ideally represented in terms of a single nonzero coefficient.\n\nFigure 2: Summary of sparse space-time deconvolution. Top: Unified formulation in terms of a single optimization problem. Bottom: Illustration on a tiny subset of data. Left: raw data. The fluorescence level to be estimated is heavily degraded by Poisson shot noise that is unavoidable at the requisite short exposure times. Middle: smoothed raw data. Right: approximation of the data in terms of a Cartesian product of estimated cell shapes and temporal activities. Each temporal activity is further decomposed as a convolution of estimated impulse responses and very few nonzero coefficients.\n\nNotation. Boldface symbols describe multidimensional arrays. We define A * B as the convolution of multidimensional arrays A and mirror(B), with the result trimmed to the dimensions of A. Here, the "mirror" operation flips a multidimensional array along every dimension. A ⊛ B is the full convolution result of multidimensional arrays A and mirror(B). 
These definitions are analogous to the "convn" command in MATLAB with shape arguments "same" and "full", respectively. ‖·‖_0 counts the number of nonzero coefficients, and ‖·‖_F is the Frobenius norm.\n\n2 Sparse space-time deconvolution (SSTD)\n\n2.1 No background subtraction\n\nAn illustration of the proposed formulation is given in Fig. 2 and our notation is summarized in Table 1. We seek to explain image sequence X in terms of up to K cells and their activity over time. In so doing, all cells are assumed to have exactly one (Eq. 1.1) of J << K possible appearances, and to reside at a unique location (Eq. 1.1). These cell locations are encoded in K latent binary feature maps. The activity of each cell is further decomposed in terms of a convolution of impulses (giving the precise onset of each burst) with exactly one of L << K types of impulse responses. A single cell may "use" different impulse responses at different times, but just one type at any one time (Eq. 
1.2).\n\nAll of the above is achieved by solving the following optimization problem:\n\nmin_{D,H,f,s} ‖ X − Σ_{k=1}^K ( Σ_{j=1}^J D_{k,j} * H_j ) ⊛ ( Σ_{l=1}^L s_{k,l} * f_l ) ‖_F^2    (1)\n\nsuch that\nΣ_j ‖D_{k,j}‖_0 ≤ 1, ∀k (at most one location and appearance per component) (1.1)\nΣ_l ‖s_{t,k,l}‖_0 ≤ 1, ∀k, t (only one type of activation at each time per cell) (1.2)\n‖H_j‖_F^2 ≤ 1, ∀j (prevent cell appearance from becoming large) (1.3)\n‖f_l‖_2^2 ≤ 1, ∀l (prevent impulse filter from becoming large) (1.4)\n\nHere, the optimization is with respect to the cell detection maps D, cell appearances H, activity patterns or impulse responses f, as well as "truly sparse" activity indicator vectors s. This optimization is subject to the two constraints mentioned earlier plus upper bounds on the norms of the learned filters.\n\nThe user needs to select the following parameters: an upper bound K on the number of cells as well as the size in pixels H of their matched filters / convolution kernels H; upper bounds J on the number of different appearances and L on the number of different activity patterns that cells may have; as well as the number of coefficients F that the learned impulse responses may have. Considering that we propose a method for both cell detection and sparse time course estimation, this total of six user-adjustable parameters compares favourably to previous work. 
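To make the forward model in (1) concrete, the following NumPy sketch builds a small random instance of the variables and evaluates the reconstruction. Array names mirror the notation above (D, H, s, f); the sizes are arbitrary illustrative choices, and the boundary conventions of the convolutions are only approximated (mode="same" for *, an outer product for the space-time full convolution ⊛, since one factor is purely spatial and the other purely temporal):

```python
import numpy as np
from scipy.signal import fftconvolve

rng = np.random.default_rng(0)
M, N, T = 32, 32, 100      # image size and sequence length
K, J, L = 3, 2, 2          # cells, appearance types, impulse-response types
Hsz, F = 7, 20             # filter size in pixels, impulse-response length

# D[k, j]: binary location maps; constraint (1.1): one location and one appearance per cell.
D = np.zeros((K, J, M, N))
for k in range(K):
    D[k, rng.integers(J), rng.integers(M), rng.integers(N)] = 1.0
Hf = rng.random((J, Hsz, Hsz))                                # cell appearances H_j
# s[k, l]: sparse activity indicators; here three impulses of one response type per cell.
s = np.zeros((K, L, T))
for k in range(K):
    s[k, rng.integers(L), rng.choice(T, 3, replace=False)] = 1.0
f = np.exp(-np.arange(F) / 15.0)[None, :].repeat(L, axis=0)   # impulse responses f_l

# Reconstruction: sum over cells of (spatial map) x (temporal activity).
Xhat = np.zeros((M, N, T))
for k in range(K):
    A_k = sum(fftconvolve(D[k, j], Hf[j], mode="same") for j in range(J))  # Σ_j D_{k,j} * H_j
    a_k = sum(np.convolve(s[k, l], f[l])[:T] for l in range(L))            # Σ_l s_{k,l} * f_l, causal and trimmed
    Xhat += A_k[:, :, None] * a_k[None, None, :]   # ⊛ of a spatial and a temporal array is an outer product
```

With the constraints of (1) satisfied by construction, Xhat is the noiseless model prediction that the Frobenius data term compares against X.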
Methods that decouple these steps typically need more parameters altogether, and the heuristics that prior work on joint optimization uses also have a large number of (implicit) parameters.\n\nWhile many other approximations, such as Σ_{k=1}^K D_k ⊛ s_k * f_k or Σ_{k=1}^K Σ_{j=1}^J D_{k,j} * H_j ⊛ s_{k,j} * f_j, are conceivable and may make sense in other application areas, the proposed formulation is the most parsimonious of its kind. Indeed, it uses a small pool of J shapes and L firing patterns, which can be combined freely, to represent all cells and their activities. It is owing to this fact that we dub the method sparse space-time deconvolution (SSTD).\n\n2.2 SSTD with background subtraction\n\nIn actual experiments, the observed fluorescence level is a sum of the signal of interest plus a nuisance background signal. This background is typically nonuniform in the spatial domain and, while it can be modeled as constant over time [15, 24], is often also observed to vary over time, prompting robust local normalization as a preprocessing step [7, 4].\n\nHere, we generalize the formulation from (1) to correct for a background that is assumed to be spatially smooth and time-varying. In more detail, we model the background in terms of the direct product B_s ⊛ b_t of a spatial component B_s ∈ R_+^{M×N×1} and a time series b_t ∈ R_+^{1×1×T}. Insights into the physics and biology of Calcium imaging suggest that (except for saturation regimes characterized by high neuron firing rates) it is reasonable to assume that the normalized quantity (observed fluorescence minus background) divided by background, typically dubbed ΔF/F0, is linearly related to the intracellular Calcium concentration [24, 10]. In keeping with this notion, we now propose our final model, viz.\n\nmin_{D,H,f,s,B_s,b_t} ‖ ( X − Σ_{k=1}^K ( Σ_{j=1}^J D_{k,j} * H_j ) ⊛ ( Σ_{l=1}^L s_{k,l} * f_l ) − B_s ⊛ b_t ) ⊘ ( B_s ⊛ b_t ) ‖_F^2 + λ ‖B_s‖_TV    such that (1.1)-(1.4), B_s > 0, b_t > 0    (2)\n\nwith "⊘" denoting an elementwise division. Note that the optimization now also runs over the spatial and temporal components of the background, with the total variation (TV) regularization term3 enforcing spatial smoothness of the spatial background component [2].\n\nIn addition to the previously defined parameters, the user also needs to select the parameter λ, which determines the smoothness of the background estimate.\n\n2.3 Optimization\n\nThe optimization problem in (2) is convex in either the spatial or the temporal filters H, f alone when keeping all other unknowns fixed; but it is nonconvex in general. 
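For concreteness, the background-normalized data term of (2) can be evaluated along the following lines (a minimal NumPy sketch; sstd_data_term is a hypothetical helper name, the rank-one background B_s ⊛ b_t is formed as an outer product, and the TV penalty λ‖B_s‖_TV is omitted):

```python
import numpy as np

def sstd_data_term(X, Xhat, Bs, bt):
    """Squared Frobenius norm of the normalized residual in (2),
    ((X - Xhat - B) / B) with B = Bs (outer product) bt.
    The term lambda * TV(Bs) would be added separately."""
    B = Bs[:, :, None] * bt[None, None, :]   # background: spatial map x time series
    R = (X - Xhat - B) / B                   # elementwise division, the "⊘" in (2)
    return float(np.sum(R ** 2))
```

Because the residual is divided by the background, a given mismatch costs more where the baseline fluorescence is low, matching the ΔF/F0 interpretation above.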
In our experiments, we use a block coordinate descent strategy [1, Section 2.7] that iteratively optimizes one group of variables while fixing all others (see supplementary material for details). The nonconvex l0-norm constraints require that cell centroids D and spike trains s are estimated by techniques such as convolutional matching pursuit [20]; while the spatio-temporal filters can be learned using simpler gradient descent [25], K-SVD [20] or simple algebraic expressions.\n\nAll unknowns are initialized with standard Gaussian noise truncated to nonnegative values. The limiting number of cells K can be set to a generous upper bound on the expected true number because spatial components without activity are automatically set to zero during optimization.\n\n3TV measures the sum of the absolute values of the spatial gradient.\n\nTable 1: Notation\nX ∈ R_+^{M×N×T}: image sequence of length T, each image is M × N\nK ∈ N_+: number of cells\nJ ∈ N_+: number of distinct cell appearances\nH_j ∈ R_+^{H×H×1}: jth cell appearance / spatial filter / matched filter of size H × H\nD_{k,j} ∈ {0,1}^{M×N×1}: indicator matrix of the kth cell for the jth cell appearance\nL ∈ N_+: number of distinct impulse responses / activity patterns\nf_l ∈ R_+^{1×1×F}: lth impulse response of length F\ns_{k,l} ∈ R_+^{1×1×T}: indicator vector of the kth spike train for the lth impulse response\n\n3 Experimental Setup\n\nThis section describes the data and algorithms used for experiments and benchmarks.\n\n3.1 Inferring Spike Trains\n\nThe following methods assume that cell segmentation has already been performed by some means, and that the fluorescence signal of individual pixels has been summed up for each cell and every time step. 
They can hence concentrate exclusively on the estimation of a "truly sparse" representation of the respective activities in terms of a "spike train".\n\nData. We follow [24, 5] in generating 1100 sequences consisting of one-sided exponential decays with a constant amplitude of 1 and decay rate τ = 1/2 s, sampled at 30 fps with firing rates ranging uniformly from 1 to 10 Hz and different Gaussian noise levels σ ∈ [0.1, 0.6].\n\nFast non-negative deconvolution (FAST) [24] uses a one-sided exponential decay as a parametric model for the impulse response by invoking a first-order autoregressive process. Given that our artificial data are free of a nuisance background signal, we disregard its ability to also model such background. The sole remaining parameter, the rate of the exponential decay, can be fit using maximum likelihood estimation or a method-of-moments approach [15].\n\nPeeling [5] finds spikes by means of a greedy approach that iteratively removes one impulse response at a time from the residual fluorescence signal. Importantly, this stereotypical transient must be manually defined a priori.\n\nSparse temporal deconvolution (STD) with a single impulse response is a special case of this work for given nonoverlapping cell segmentations and L = 1; it is also a special case of [14]. The impulse response can be specified beforehand (amounting to sparse coding) or learned from the data (that is, performing dictionary learning on time-series data).\n\n3.2 Segmenting Cells and Estimating Activities\n\nData. Following the procedure described in [4, 12, 13], we have created 80 synthetic sequences with a duration of 15 s each at a frame rate of 30 fps and image sizes M = N = 512 pixels. The cells are randomly drawn from 36 cell shapes extracted from real data, and are placed at random locations with a maximum spatial overlap of 30%. 
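The one-sided exponential calcium transients used in both benchmarks (the traces of Sect. 3.1 and the activation patterns of the sequences described here) can be simulated along these lines (a sketch with the Sect. 3.1 parameters; a Bernoulli train stands in for the firing process, and the variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
fps, duration, tau, rate, sigma = 30, 10.0, 0.5, 5.0, 0.3   # 30 fps, tau = 1/2 s, 5 Hz firing, noise level
T = int(fps * duration)

spikes = (rng.random(T) < rate / fps).astype(float)   # Bernoulli thinning approximates the firing process
kernel = np.exp(-np.arange(T) / (tau * fps))          # one-sided exponential decay, amplitude 1
clean = np.convolve(spikes, kernel)[:T]               # causal convolution, trimmed to T samples
noisy = clean + sigma * rng.standard_normal(T)        # additive Gaussian noise
```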
Each cell fires according to a dependent Poisson process, and its activation pattern follows a one-sided exponential decay with a scale selected uniformly at random between 500 and 800 ms. The average number of active cells per frame varies from 1 to 10. Finally, the data have been distorted by additive white Gaussian noise with a relative amplitude (max. intensity − mean intensity)/σ_noise ∈ {3, 5, 7, 10, 12, 15, 17, 20}. By construction, the identity, location and activity patterns of all cells are known. The supplemental material shows an example with its corresponding inferred neural activity.\n\nReal-world data come from two-photon microscopy of mouse motor cortex recorded in vivo [7], which has been motion-corrected. These sequences allow us to conduct qualitative experiments.\n\nADINA [4] relies on dictionary learning [11] to find both spatial components and their time courses. Both have many zero coefficients, but are not "truly sparse" in the sense of this paper. The method comes with a heuristic post-processing to separate coactivated cells into distinct spatial components.\n\nNMF+ADINA uses non-negative matrix factorization to infer both the spatial and temporal primitives of an image sequence as in [12, 15]. In contrast to [15], which uses a k-means clustering of highly confident spike vectors to provide a good initialization in the search for spatial components, we couple NMF with the postprocessing of ADINA.\n\nCSBC+SC combines convolutional sparse block coding [14] based on a single still image (obtained from the temporal mean or median image, or a maximum intensity projection across time) with temporal sparse coding.\n\nCSBC+STD combines the same convolutional sparse block coding [14] with the sparse temporal deconvolution proposed in Sect. 3.1.\n\nSSTD is the method described here. 
We used J = L = 2, K = 200 and F = 200, with H = 31 and H = 15 for the artificial and real data, respectively.\n\n4 Results\n\n4.1 Inferring Spike Trains\n\nTo quantify the accuracy of activity detection, we first threshold the estimated activities and then compute, by summing over each step in every time series, the number of true and false negatives and positives. For a fair comparison, the thresholds were adjusted separately for each method to give optimal accuracy. Sensitivity, precision and accuracy computed from the above implicitly measure both the quality of the segmentation and the quality of the activity estimation. An additional measure, the SPIKE distance [9], emphasizes any temporal deviations between the true and estimated spike locations in a truly sparse representation.\n\nFig. 3 shows that, unsurprisingly, the best results are obtained when methods use the true impulse response rather than learning it from the data. This finding does not carry over to real data, where a "true" impulse response is typically not known. Given the true impulse response, both FAST and STD fare better than Peeling, showing that a greedy algorithm is faster but gives somewhat worse results. Even when learning the impulse response, FAST and STD are no worse than Peeling. When learning the parameters, FAST has an advantage over STD on this artificial data because FAST already uses the correct parametric form of the impulse response that was used to generate the data and only needs to learn a single parameter, while STD learns a more general but nonparametric activity model with many degrees of freedom.\n\nThe great spread of all quality measures results from the wide range of noise levels used, and the overall deficiencies in accuracy attest to the difficulty of these simulated data sets.\n\n4.2 Segmenting Cells and Inferring Spike Trains\n\nFig. 4 shows that all the methods from Sect. 
3.2 reach respectable and comparable performance in the task of identifying neural activity from non-trivial synthetic image sequences. CSBC+SC reaches the highest sensitivity while SSTD has the greatest precision. SSTD apparently achieves comparable performance to the other methods without the need for a heuristic pre- or postprocessing. Multiple random initializations lead to similar learned filters (results not shown), so the optimization problem seems to be well-posed. The price to pay for the elegance of a unified formulation is a much higher computational cost of this more involved optimization. Again, the spread of sensitivities, precisions and accuracies results from the range of noise levels used in the simulations. The plots suggest that SSTD may have fewer "catastrophic failure" cases, but an even larger set of sequences will be required to verify this tendency.\n\nFigure 3: Sensitivity, precision, accuracy (higher is better) and SPIKE distance (lower is better) of different methods for spike train estimation. Methods that need to learn the activation pattern perform worse than those using the true (but generally unknown) activation pattern and its parameters. FAST is at an advantage here because it happens to use the very impulse response that was used in generating the data.\n\nFigure 4: Quality of cell detection and the estimation of cell activities. SSTD does as well as the competing methods that rely on heuristic pre- or post-processing.\n\nReal Sequences: Qualitative results are shown in Fig. 5. SSTD is able to distinguish both cells with spatial overlap and cells with high temporal correlation. It compensates for large variations in luminance and contrast, and can discriminate between different types of cells. 
Exploiting truly sparse but independent representations in both the spatial and the temporal domain allows us to infer plausible neural activity and, at the same time, to reduce the noise in the underlying Calcium image sequence.\n\n5 Discussion\n\nThe proposed SSTD combines the decomposition of the data into low-rank components with the finding of a convolutional sparse representation for each of those components. The formalism allows exploiting sparseness and the repetitive motifs that are so characteristic of biological data. Users need to choose the number and size of filters, which indirectly determine the number of cell types found and their activation patterns.\n\nAs shown in Fig. 5, the approach gives credible interpretations of raw data in terms of an extremely sparse and hence parsimonious representation.\n\nThe decomposition of a space-time volume into a Cartesian product of spatial shapes and their time courses is only possible when cells do not move over time. This assumption holds for in vitro experiments, and can often be satisfied by good fixation in in vivo experiments, but is not universally valid. Correcting for motion in a generalized unified framework is an interesting direction for future work. The experiments in Section 4.1 suggest that it may also be worthwhile to investigate the use of more parametric forms for the impulse response instead of the completely unbiased variant used here.\n\nFigure 5: Qualitative results on two real data sets. 
The data on the left column shows mostly cell bodies, while the data on the right shows both cell bodies (large) and dendrites (small). For each data set, the top left shows an average projection of the relative fluorescence change across time with cell centroids D (black dots) and contours of segmented cells, and the top right shows the learned impulse responses. In the middle, the fluorescence levels integrated over the segmented cells are shown in random colors. The bottom shows by means of small disks the location, type and strength of the impulses that summarize all the data shown in the middle. Together with the cell shapes, the impulses form part of the "truly sparse" representation that we propose. When convolving these spikes with the impulse responses from the top right insets, we obtain the time courses shown in random colors.\n\nSuch advances will further help make Calcium imaging an enabling tool for the neurosciences.\n\nReferences\n\n[1] D. P. Bertsekas. Nonlinear Programming. Athena Scientific, 1999.\n\n[2] A. Chambolle. 
An algorithm for total variation minimization and applications, 2004.\n\n[3] F. Diego and F. A. Hamprecht. Learning multi-level sparse representations. In NIPS, 2013.\n\n[4] F. Diego, S. Reichinnek, M. Both, and F. A. Hamprecht. Automated identification of neuronal activity from calcium imaging by sparse dictionary learning. In ISBI 2013 Proceedings, pages 1058-1061, 2013.\n\n[5] B. F. Grewe, D. Langer, H. Kasper, B. M. Kampa, and F. Helmchen. High-speed in vivo calcium imaging reveals neuronal network activity with near-millisecond precision. Nat Meth, 7(5):399-405, May 2010.\n\n[6] C. Grienberger and A. Konnerth. Imaging Calcium in Neurons. Neuron, 73:862-885, Mar 2012.\n\n[7] D. Huber, D. A. Gutnisky, S. Peron, D. H. O'Connor, J. S. Wiegert, L. Tian, T. G. Oertner, L. L. Looger, and K. Svoboda. Multiple dynamic representations in the motor cortex during sensorimotor learning. Nature, 484(7395):473-478, Apr 2012.\n\n[8] K. Kavukcuoglu, P. Sermanet, Y. Boureau, K. Gregor, M. Mathieu, and Y. LeCun. Learning convolutional feature hierarchies for visual recognition. In NIPS, 2010.\n\n[9] T. Kreuz, D. Chicharro, C. Houghton, R. G. Andrzejak, and F. Mormann. Monitoring spike train synchrony. Journal of Neurophysiology, 2012.\n\n[10] H. Luetcke, F. Gerhard, F. Zenke, W. Gerstner, and F. Helmchen. Inference of neuronal network spike dynamics and topology from calcium imaging data. Frontiers in Neural Circuits, 7(201), 2013.\n\n[11] J. Mairal, F. Bach, J. Ponce, and G. Sapiro. Online learning for matrix factorization and sparse coding. Journal of Machine Learning Research, 2010.\n\n[12] R. Maruyama, K. Maeda, H. Moroda, I. Kato, M. Inoue, H. Miyakawa, and T. Aonishi. Detecting cells using non-negative matrix factorization on calcium imaging data. Neural Networks, 55:11-19, 2014.\n\n[13] E. A. Mukamel, A. Nimmerjahn, and M. J. Schnitzer. 
Automated analysis of cellular signals from large-scale calcium imaging data. Neuron, 2009.\n\n[14] M. Pachitariu, A. M. Packer, N. Pettit, H. Dalgleish, M. Hausser, and M. Sahani. Extracting regions of interest from biological images with convolutional sparse block coding. In NIPS, 2013.\n\n[15] E. A. Pnevmatikakis and L. Paninski. Sparse nonnegative deconvolution for compressive calcium imaging: algorithms and phase transitions. In NIPS, 2013.\n\n[16] S. Reichinnek, A. von Kameke, A. M. Hagenston, E. Freitag, F. C. Roth, H. Bading, M. T. Hasan, A. Draguhn, and M. Both. Reliable optical detection of coherent neuronal activity in fast oscillating networks in vitro. NeuroImage, 60(1), 2012.\n\n[17] R. Rigamonti, A. Sironi, V. Lepetit, and P. Fua. Learning separable filters. In Conference on Computer Vision and Pattern Recognition, 2013.\n\n[18] M. N. Schmidt and M. Mørup. Nonnegative matrix factor 2-D deconvolution for blind single channel source separation. In ICA, 2006.\n\n[19] P. Smaragdis. Non-negative matrix factor deconvolution; extraction of multiple sound sources from monophonic inputs. In ICA, pages 494-499, 2004.\n\n[20] A. Szlam, K. Kavukcuoglu, and Y. LeCun. Convolutional matching pursuit and dictionary training. Computer Research Repository (arXiv), 2010.\n\n[21] G. W. Taylor, R. Fergus, Y. LeCun, and C. Bregler. Convolutional learning of spatio-temporal features, 2010.\n\n[22] J. Tomek, O. Novak, and J. Syka. Two-photon processor and SeNeCa: a freely available software package to process data from two-photon calcium imaging at speeds down to several milliseconds per frame. J Neurophysiol, 110, 2013.\n\n[23] I. Valmianski, A. Y. Shih, J. D. Driscoll, D. W. Matthews, Y. Freund, and D. Kleinfeld. Automatic identification of fluorescently labeled brain cells for rapid functional imaging. Journal of Neurophysiology, 2010.\n\n[24] J. T. Vogelstein, A. M. Packer, T. A. Machado, T. Sippy, B. 
Babadi, R. Yuste, and L. Paninski. Fast non-negative deconvolution for spike train inference from population calcium imaging. Journal of Neurophysiology, 2010.\n\n[25] M. Zeiler, D. Krishnan, G. Taylor, and R. Fergus. Deconvolutional networks. In CVPR, 2010.\n", "award": [], "sourceid": 61, "authors": [{"given_name": "Ferran", "family_name": "Diego Andilla", "institution": "University of Heidelberg"}, {"given_name": "Fred", "family_name": "Hamprecht", "institution": "University of Heidelberg"}]}