{"title": "Automatic Neuron Detection in Calcium Imaging Data Using Convolutional Networks", "book": "Advances in Neural Information Processing Systems", "page_first": 3270, "page_last": 3278, "abstract": "Calcium imaging is an important technique for monitoring the activity of thousands of neurons simultaneously. As calcium imaging datasets grow in size, automated detection of individual neurons is becoming important. Here we apply a supervised learning approach to this problem and show that convolutional networks can achieve near-human accuracy and superhuman speed. Accuracy is superior to the popular PCA/ICA method based on precision and recall relative to ground truth annotation by a human expert. These results suggest that convolutional networks are an efficient and flexible tool for the analysis of large-scale calcium imaging data.", "full_text": "Automatic Neuron Detection in Calcium Imaging\n\nData Using Convolutional Networks\n\nNoah J. Apthorpe1\u2217 Alexander J. Riordan2\u2217 Rob E. Aguilar1\nYi Gu2 David W. Tank2 H. Sebastian Seung12\n\nJan Homann2\n\n1Computer Science Department\n\n2Princeton Neuroscience Institute\n\nPrinceton University\n\n{apthorpe, ariordan, dwtank, sseung}@princeton.edu\n\n\u2217These authors contributed equally to this work\n\nAbstract\n\nCalcium imaging is an important technique for monitoring the activity of thousands\nof neurons simultaneously. As calcium imaging datasets grow in size, automated\ndetection of individual neurons is becoming important. Here we apply a supervised\nlearning approach to this problem and show that convolutional networks can achieve\nnear-human accuracy and superhuman speed. Accuracy is superior to the popular\nPCA/ICA method based on precision and recall relative to ground truth annotation\nby a human expert. 
These results suggest that convolutional networks are an\nef\ufb01cient and \ufb02exible tool for the analysis of large-scale calcium imaging data.\n\n1\n\nIntroduction\n\nTwo-photon calcium imaging is a powerful technique for monitoring the activity of thousands of\nindividual neurons simultaneously in awake, behaving animals [1, 2]. Action potentials cause\ntransient changes in the intracellular concentration of calcium ions. Such changes are detected by\nobserving the \ufb02uorescence of calcium indicator molecules, typically using two-photon microscopy\nin the mammalian brain [3]. Repeatedly scanning a single image plane yields a time series of 2D\nimages. This is effectively a video in which neurons blink whenever they are active [4, 5].\nIn the traditional work\ufb02ow for extracting neural activities from the video, a human expert manually\nannotates regions of interest (ROIs) corresponding to individual neurons [5, 1, 2]. Within each\nROI, pixel values are summed for each frame of the video, which yields the calcium signal of the\ncorresponding neuron versus time. A subsequent step may deconvolve the temporal \ufb01ltering of\nthe intracellular calcium dynamics for an estimate of neural activity with better time resolution.\nThe traditional work\ufb02ow has the de\ufb01ciency that manual annotation becomes laborious and time-\nconsuming for very large datasets. Furthermore, manual annotation does not de-mix the signals from\nspatially overlapping neurons.\nUnsupervised basis learning methods (PCA/ICA [6], CNMF [7], dictionary learning [8], and sparse\nspace-time deconvolution [9]) express the video as a time-varying superposition of basis images.\nThe basis images play a similar role as ROIs in the traditional work\ufb02ow, and their time-varying\ncoef\ufb01cients are intended to correspond to neural activities. 
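In the traditional workflow described above, turning ROIs into calcium signals is a simple reduction: sum the pixel values inside each ROI for every frame of the video. A minimal numpy sketch of that step (the function name `extract_traces` and the `(T, H, W)` array layout are our assumptions, not the authors' code):

```python
import numpy as np

def extract_traces(video, roi_masks):
    # video: (T, H, W) array of frames; roi_masks: list of (H, W) boolean masks.
    # For each ROI, sum the pixel values inside the mask on every frame,
    # yielding one fluorescence trace per neuron (shape: n_rois x T).
    return np.array([video[:, m].sum(axis=1) for m in roi_masks])
```

A subsequent deconvolution step (not shown) would convert each trace into an estimate of neural activity with better time resolution.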
While basis learning methods are useful\nfor \ufb01nding active neurons, they do not detect low-activity cells\u2014making these methods inappropriate\nfor studies involving neurons that may be temporarily inactive depending on context or learning [10].\nSuch subtle dif\ufb01culties may explain the lasting popularity of manual annotation. At \ufb01rst glance, the\nvideos produced by calcium imaging seem simple (neurons blinking on and off). Yet automating\nimage analysis has not been trivial. One dif\ufb01culty is that images are corrupted by noise and artifacts\ndue to brain motion. Another dif\ufb01culty is variability in the appearance of cell bodies, which vary\nin shape, size, and resting-level \ufb02uorescence.\n\n30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.\n\nAdditionally, different neuroscience studies\nmay require differing ROI selection criteria. Some may require only cell bodies [5, 11], while others\ninvolve dendrites [6]. Some may require only active cells, while others necessitate both active and\ninactive cells [10]. Some neuroscientists may wish to reject slightly out-of-focus neurons. For all of\nthese reasons, a neuroscientist may spend hours or days tuning the parameters of nominally automated\nmethods, or may never succeed in \ufb01nding a set of parameters that produces satisfactory results.\nAs a way of dealing with these dif\ufb01culties, we focus here on a supervised learning approach to\nautomated ROI detection. An automated ROI detector could be used to replace manual ROI detection\nby a human expert in the traditional work\ufb02ow, or could be used to make the basis learning algorithms\nmore reliable by providing good initial conditions for basis images. However, the usability of an\nautomated algorithm strongly depends on its attaining high accuracy. A supervised learning method can\nadapt to different ROI selection criteria and generalize them to new datasets. 
Supervised learning has\nbecome the dominant approach for attaining high accuracy in many computer vision problems [12].\nWe assemble ground truth datasets consisting of calcium imaging videos along with ROIs drawn by\nhuman experts and employ a precision-recall formalism for quantifying accuracy. We train a sliding\nwindow convolutional network (ConvNet) to take a calcium video as input and output a 2D image\nthat matches the human-drawn ROIs as well as possible. The ConvNet achieves near-human accuracy\nand exceeds that of PCA/ICA [6].\nThe prior work most similar to ours used supervised learning based on boosting with hand-designed\nfeatures [13]. Other previous attempts to automate ROI detection did not employ supervised machine\nlearning. For example, hand-designed \ufb01ltering operations [14] and normalized cuts [15] were applied\nto image pixel correlations.\nThe major cost of supervised learning is the human effort required to create the training set. As a\nrough guide, our results suggest that on the order of 10 hours of effort or 1000 annotated cells are\nsuf\ufb01cient to yield a ConvNet with usable accuracy. This initial time investment, however, is more than\nrepaid by the speed of a ConvNet at classifying new data. Furthermore, the marginal effort required\nto create a training set is essentially zero for those neuroscientists who already have annotated data.\nNeuroscientists can also agree to use the same trained ConvNets for uniformity of ROI selection\nacross labs.\nFrom the deep learning perspective, an interesting aspect of our work is that a ConvNet that processes\na spatiotemporal (2+1)D image is trained using only spatial (2D) annotations. Full spatiotemporal\nannotations (spatial locations and times of activation) would have been more laborious to collect. The\nuse of purely spatial annotations is possible because the neurons in our videos are stationary (apart\nfrom motion artifacts). 
This makes our task simpler than other applications of ConvNets to video\nprocessing [16].\n\n2 Neuron detection benchmark\n\nWe use a precision-recall framework to quantify accuracy of neuron detection. Predicted ROIs are\nclassi\ufb01ed as false positives (FP), false negatives (FN), and true positives (TP) relative to ground truth\nROIs. Precision and recall are de\ufb01ned by\n\nprecision = TP / (TP + FP)    recall = TP / (TP + FN)    (1)\n\nBoth measures would be equal to 1 if predictions were perfectly accurate, i.e. higher numbers are\nbetter. If a single measure of accuracy is required, we use the harmonic mean of precision and recall,\n1/F1 = (1/precision + 1/recall)/2. The F1 score favors neither precision nor recall, but in practice\na neuroscientist may care more about one measure than the other. For example, some neuroscientists\nmay be satis\ufb01ed if the algorithm fails to detect many neurons (low recall) so long as it produces\nfew false positives (high precision). Other neuroscientists may want the algorithm to \ufb01nd as many\nneurons as possible (high recall) even if there are many false positives (low precision).\nFor computing precision and recall, it is helpful to de\ufb01ne the overlap between two ROIs R1 and R2\nas the Jaccard similarity coef\ufb01cient |R1 \u2229 R2|/|R1 \u222a R2| where |R| denotes the number of pixels in\nR. For each predicted ROI, we \ufb01nd the ground truth ROI with maximal overlap. The ground truth\nROIs with overlap greater than 0.5 are assigned to the predicted ROIs with which they overlap the\nmost. These assignments are true positives. Leftover ROIs are the false positives and false negatives.\n\nWe prefer the precision-recall framework over the receiver operating characteristic (ROC), which\nwas previously used as a quantitative measure of neuron detection accuracy [13]. This is because\nprecision and recall do not depend on true negatives, which are less well-de\ufb01ned. 
(The ROC depends\non true negatives through the false positive rate.)\nGround truth generation by human annotation The quantitative measures of accuracy proposed\nabove depend on the existence of ground truth. For the vast majority of calcium imaging datasets,\nno objectively de\ufb01ned ground truth exists, and we must rely on subjective evaluation by human\nexperts. For a dataset with low noise in which the desired ROIs are cell bodies, human experts\nare typically con\ufb01dent about most of their ROIs, though some are borderline cases that may be\nambiguous. Therefore our measures of accuracy should be able to distinguish between algorithms\nthat differ widely in their performance but may not be adequate to distinguish between algorithms\nthat are very similar.\nTwo-photon calcium imaging data were gathered from both the primary visual cortex (V1) and medial\nentorhinal cortex (MEC) from awake-behaving mice (Supplementary Methods). All experiments\nwere performed according to the Guide for the Care and Use of Laboratory Animals, and procedures\nwere approved by Princeton University\u2019s Animal Care and Use Committee.\nEach time series of calcium images was corrected for motion artifacts (Supplementary Methods),\naverage-pooled over time with stride 167, and then max-pooled over time with stride 6. This\ndownsampling in time was arbitrarily chosen to reduce noise and make the dataset into a more\nmanageable size. Human experts then annotated ROIs using the ImageJ Cell Magic Wand Tool [17],\nwhich automatically generates a region of interest (ROI) based on a single mouse click. The human\nexperts found 4006 neurons in the V1 dataset with an average of 148 neurons per image series and\n538 neurons in the MEC dataset with an average of 54 neurons per image series.\nHuman experts used the following criteria to select neurons: 1. 
the soma was in the focal plane of\nthe image\u2014apparent as a light doughnut-like ring (the soma cytosol) surrounding a dark area (the\nnucleus), or 2. the area showed signi\ufb01cantly changing brightness distinguishable from background\nand had the same general size and shape expected from a neuron in the given brain region.\nAfter motion correction, downsampling, and human labeling, the V1 dataset consisted of 27 16-bit\ngrayscale multi-page TIFF image series ranging from 28 to 142 frames per series with 512 \u00d7 512\npixels per frame. The MEC dataset consisted of 10 image series ranging from 5 to 28 frames in the\nsame format. Human annotation time was estimated at one hour per image series for the V1 dataset\nand 40 minutes per image series for the MEC dataset. Each human-labeled ROI was represented as\na 512 \u00d7 512 pixel binary mask.\n\n3 Convolutional network\n\nPreprocessing of images and ground truth ROIs. Microscopy image series from the V1 and MEC\ndatasets were preprocessed prior to network training (Figure 1). Image contrast was enhanced by\nclipping all pixel values above the 99th percentile and below the 3rd percentile. Pixel values were\nthen normalized to [0, 1]. We divided the V1 series into 60% training, 20% validation, and 20% test\nsets and the MEC series into 50% training, 20% validation, and 30% test sets.\nNeighboring ground truth ROIs often touched or even overlapped with each other. For the purpose\nof ConvNet training, we shrank the ground truth ROIs by replacing each one with a 4-pixel radius\ndisk located at the centroid of the ROI. The shrinkage was intended to encourage the ConvNets to\nseparate neighboring neurons.\nConvolutional network architecture and training. The architecture of the (2+1)D ConvNet is\ndepicted in Figure 2. The input is an image stack containing T time slices. There are four convolutional\nlayers, a max pooling over all time slices, and then two pixelwise fully connected layers. 
This yields\ntwo 2D grayscale images as output, which together represent the softmax probability of each pixel\nbeing inside an ROI centroid.\nThe convolutional layers were chosen to contain only 2D kernels, because the temporal downsampling\nused in the preprocessing (\u00a72) caused most neural activity to last for only a single time frame. Each\noutput pixel depended on a 37 \u00d7 37 \u00d7 T pixel \ufb01eld of view in the input, where T is the number of\nframes in the input image stack\u2014governed by the length of the imaging experiment and the imaging\n\n3\n\n\fFigure 1: Preprocessing steps for calcium images and human-labeled ROIs. Col 1) Calcium imaging\nstacks were motion-corrected and downsampled in time. Col 2) Image contrast was enhanced by\nclipping pixel intensities below the 3rd and above the 99th percentile then linearly rescaling pixel\nintensities between these new bounds. Col 3) Human-labeled ROIs were converted into binary\nmasks. Col 4) Networks were trained to detect 4-pixel radius circular centroids of human-labeled\nROIs. Primary visual cortex (V1, Row 1) and medial entorhinal cortex (MEC, Row 2) datasets were\npreprocessed identically.\n\nsampling rate. T was equalized to 50 for all image stacks in the V1 dataset and 5 for all image stacks\nin the MEC dataset using averaging and bicubic interpolation. In the future, we will consider less\ntemporal downsampling and the use of 3D kernels in the convolutional layers.\nThe ConvNet was applied in a 37 \u00d7 37 \u00d7 T window, sliding in two dimensions over the input image\nstack to produce an output pixel for every location of the window fully contained within the image\nbounds. For comparison, we also trained a 2D ConvNet that took as input the time-averaged image\nstack and did no temporal computation (Figure 2).\nWe used ZNN, an open-source sliding window ConvNet package with multi-core CPU parallelism\nand FFT-based convolution [18]. 
ZNN automatically augmented training sets by random rotations\n(multiples of 90 degrees) and re\ufb02ections of image patches to facilitate ConvNet learning of invariances.\nThe training sets were also rebalanced by the fraction of pixels in human-labeled ROIs to the total\nnumber of pixels. See Supplementary Methods for further details.\nThe (2+1)D network was trained with softmax loss and output patches of size 120 \u00d7 120. The\nlearning rate parameter was annealed by hand from 0.01 to 0.002, and the momentum parameter was\nannealed by hand from 0.9 to 0.5. The network was trained for 16800 stochastic gradient descent\n(SGD) updates for the V1 dataset, which took approximately 1.2 seconds/update (\u223c 5.5hrs) on an\nAmazon EC2 c4.8xlarge instance (Supplementary Figure 1). The network was trained for 200000\nSGD updates for the MEC dataset, which took approximately 0.1 seconds/update (\u223c 5.5hrs).\nThe 2D network training omitted annealing of the learning rate and momentum parameters. The\n2D network was trained for 14000 SGD updates for the V1 dataset, which took approximately 0.9\nseconds/update (\u223c 3.75hrs) on an Amazon EC2 c4.8xlarge instance (Supplementary Figure 1). We\nperformed early stopping on the network after 10200 SGD updates based on the validation loss.\nNetwork output postprocessing. Network outputs were converted into individual ROIs by: 1.\nThresholding out pixels with low probability values, 2. Removing small connected components,\n3. Weighting resulting pixels with a normalized distance transform, 4. Performing marker-based\nwatershed labeling with local max markers, 5. Merging small watershed regions, and 6. Automat-\nically applying the ImageJ Cell Magic Wand tool to the original images at the centroids of the\nwatershed regions. Thresholding and minimum size values were optimized using the validation sets\n(Supplementary Methods).\nSource code. 
A ready-to-use pipeline, including pre- and postprocessing, ConvNet training, and\nprecision-recall scoring, will be publicly available for community use\n(https://github.com/NoahApthorpe/ConvnetCellDetection).\n\nFigure 2: A) Schematic of the (2+1)D network architecture. The (2+1)D network transforms 3D\ncalcium imaging stacks \u2013 stacks of 2D calcium images changing over time \u2013 into 2D images of\npredicted neuron locations. All convolutional \ufb01lters are 2D except for the 1x1xT max \ufb01lter layer,\nwhere T is the number of frames in the image stack. B) The 2D network architecture. The 2D network\ntakes as input calcium imaging stacks that are mean projected over time down to two dimensions.\n\n4 Results\n\nConvNets successfully detect cells in calcium images. A sample image from the V1 test set and\nConvNet output is shown in Figure 4. Postprocessing of the ConvNet output yielded predicted ROIs,\nmany of which are the same as the human ROIs (Figure 4c). As described in Section 2, we quanti\ufb01ed\nagreement between ConvNet and human using the precision-recall formalism. Both (2+1)D and 2D\nnetworks attained the same F1 score (0.71). Full precision-recall curves are given in Supplementary\nFigure 1.\nInspection of the ConvNet-human disagreements suggested that some were not actually ConvNet\nerrors. To investigate this hypothesis, the original human expert reevaluated all disagreements\nwith the (2+1)D network. After reevaluation, 131 false positives became true positives, and 30\nfalse negatives became true negatives (Figure 4D). Some of these reversals appeared to involve\nunambiguous human errors in the original annotation, while others were ambiguous cases (Figure 4E\u2013\nG). After reevaluation, the F1 score of the (2+1)D network increased to 0.82. 
The F1 score of the\nhuman expert\u2019s reevaluation relative to his original annotation was 0.89. These results indicate that\nthe ConvNet is nearing human performance.\n(2+1)D versus 2D network. The (2+1)D and 2D networks achieved similar precision, recall, and\nF1 scores on the V1 dataset; however, the (2+1)D network produced raw output with less noise\nthan the 2D network (Figure 3). Qualitative inspection also indicates that the (2+1)D network \ufb01nds\ntransiently active and transiently in-focus neurons missed by the 2D network (Figure 3). Although\nsuch neurons occurred infrequently in the V1 dataset and did not noticeably affect network scores,\nthese results suggest that datasets with larger populations of transiently active or variably focused\ncells will particularly bene\ufb01t from (2+1)D network architectures.\nConvNet segmentation outperforms PCA/ICA. The (2+1)D network was also able to successfully\nlocate neurons in the MEC dataset (Figure 5). For comparison, we also implemented and applied\nPCA/ICA as described by Ref. [6]. The (2+1)D network achieved an F1 score of 0.51, while\nPCA/ICA achieved 0.27. Precision and recall numbers are given in Figure 5.\n\nFigure 3: A) The (2+1)D network detected neurons that the 2D network failed to locate. The sequence\nof greyscale images shows a patch of V1 neurons over time. Both transiently active neurons and\nneurons that wane in and out of the focal plane are visible. The color image shows the output of both\nnetworks. The (2+1)D network detects these transiently visible neurons, whereas the 2D network is\nunable to \ufb01nd these cells using only the mean-\ufb02attened image. B) The raw outputs of the (2+1)D and\n2D networks. 
C) Representative histogram of output pixel intensities. The (2+1)D network output\nhas more values clustered around 0 and 1 compared to the 2D network. This suggests that (2+1)D\nnetwork output has a higher signal to noise ratio than 2D network output.\n\nConvNet accuracy was lower on the MEC dataset than the V1 dataset, probably because the former\nhas more noise and larger motion artifacts. The amount of training data for the MEC dataset was also\nmuch smaller.\nPCA/ICA accuracy was numerically worse, but this result should be interpreted cautiously. PCA/ICA\nis intended to identify active neurons, while the ground truth included both active and inactive\nneurons. Furthermore, the ground truth depends on the human expert\u2019s selection criteria, which are\nnot accessible to PCA/ICA.\nTraining and post-processing optimization for ConvNet segmentation took \u223c6 hours with a forward\npass taking \u223c1.2 seconds per image series. Parameter optimization for PCA/ICA performed by a\nhuman expert took \u223c2.5 hours with a forward pass taking \u223c40 minutes. This amounted to \u223c6 hours\ntotal computation time for the ConvNet and \u223c9 hours for the PCA/ICA algorithm. This suggests that\nConvNet segmentation is faster than PCA/ICA for all but the smallest datasets.\n\n5 Discussion\n\nThe lack of quantitative difference between (2+1)D and 2D ConvNet accuracy (same F1 score on\nthe V1 dataset) may be due to limitations of our study, such as imperfect ground truth and temporal\ndownsampling in preprocessing. It may also be because the vast majority of neurons in the V1 dataset\nare clearly visible in the time-averaged image. We do have qualitative evidence that the (2+1)D\narchitecture may turn out to be superior for other datasets, because its output looks cleaner, and it is\nable to detect transiently active or transiently in-focus cells (Figure 3).\nThe (2+1)D ConvNet outperformed PCA/ICA in the precision-recall metrics. 
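The precision-recall comparisons above rely on the matching rule of Section 2: each predicted ROI is assigned the ground-truth ROI it overlaps most, and a Jaccard overlap above 0.5 counts as a true positive. A hedged reimplementation of that bookkeeping (illustrative only, not the released scoring code):

```python
import numpy as np

def jaccard(a, b):
    # Jaccard similarity of two boolean ROI masks: |A and B| / |A or B|.
    return np.logical_and(a, b).sum() / np.logical_or(a, b).sum()

def score_rois(predicted, ground_truth, threshold=0.5):
    # Each predicted ROI claims the ground-truth ROI it overlaps most;
    # overlaps above `threshold` are true positives, leftovers are FP/FN.
    matched, tp = set(), 0
    for p in predicted:
        overlaps = [jaccard(p, g) for g in ground_truth]
        best = int(np.argmax(overlaps)) if overlaps else -1
        if best >= 0 and overlaps[best] > threshold and best not in matched:
            matched.add(best)
            tp += 1
    fp = len(predicted) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```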
We are presently\nworking to compare against recently released basis learning methods [7]. ConvNets readily locate\ninactive neurons and process new images rapidly once trained. ConvNets adapt to the selection\ncriteria of the neuroscientist if they are implicitly contained in the training set. They do not depend on\nhand-designed features and so require little expertise in computer vision. ConvNet speed could enable\nnovel applications involving online ROI detection, such as computer-guided single-cell optogenetics\n[11] or real-time neural feedback experiments.\n\nFigure 4: The (2+1)D network successfully detected neurons in the V1 test set with near-human\naccuracy. A) Slice from preprocessed calcium imaging stack input to network. B) Network softmax\nprobability output. Brighter regions are considered by the network to have higher probability of\nbeing a neuron. C) ROIs found by the (2+1)D network after post-processing, overlaid with human\nlabels. Network output is shown by green outlines, whereas human labels are red. Regions of\nagreement are indicated by yellow overlays. D) ROI labels added by human reevaluation are shown\nin blue. ROI labels removed by reevaluation are shown in magenta. Post hoc assessment of network\noutput revealed a sizable portion of ROIs that were initially missed by human labeling. E) Examples\nof formerly negative ROIs that were reevaluated as positive. F) Initial positive labels that were\nreevaluated to be false. G) Examples of ROIs that remained negative even after reevaluation. H) F1\nscores for (2+1)D and 2D networks before and after ROI reevaluation. Human labels before and after\nreevaluation were also compared to assess human labeling variability. 
Boxplots depict the variability\nof F1 scores around the median score across test images.\n\nFigure 5: The (2+1)D network successfully detected neurons in the MEC test set with higher\nprecision and recall than PCA/ICA. A) Slice from preprocessed calcium imaging stack that was input\nto network. B) Network output, normalized by softmax. C) ROIs found by the (2+1)D network after\npostprocessing, overlaid with ROIs previously labeled by a human. Network output is shown by red\noutlines, whereas human labels are green. Regions of agreement are indicated by yellow overlays. D)\nThe ROIs found by PCA/ICA are overlaid in blue. E) Quantitative comparison of F1 score, precision,\nand recall for (2+1)D network and PCA/ICA on human-labeled MEC data.\n\nAcknowledgments\n\nWe thank Kisuk Lee, Jingpeng Wu, Nicholas Turner, and Jeffrey Gauthier for technical assistance.\nWe also thank Sue Ann Koay, Niranjani Prasad, Cyril Zhang, and Hussein Nagree for discussions.\nThis work was supported by IARPA D16PC00005 (HSS), the Mathers Foundation (HSS), NIH\nR01 MH083686 (DWT), NIH U01 NS090541 (DWT, HSS), NIH U01 NS090562 (HSS), Simons\nFoundation SCGB (DWT), and U.S. Army Research Of\ufb01ce W911NF-12-1-0594 (HSS).\n\nReferences\n[1] Daniel Huber, DA Gutnisky, S Peron, DH O\u2019connor, JS Wiegert, Lin Tian, TG Oertner, LL Looger, and\nK Svoboda. Multiple dynamic representations in the motor cortex during sensorimotor learning. 
Nature,\n484(7395):473\u2013478, 2012.\n\n[2] Daniel A Dombeck, Anton N Khabbaz, Forrest Collman, Thomas L Adelman, and David W Tank. Imaging\n\nlarge-scale neural activity with cellular resolution in awake, mobile mice. Neuron, 56(1):43\u201357, 2007.\n\n[3] Winfried Denk, James H Strickler, Watt W Webb, et al. Two-photon laser scanning \ufb02uorescence microscopy.\n\nScience, 248(4951):73\u201376, 1990.\n\n[4] Tsai-Wen Chen, Trevor J Wardill, Yi Sun, Stefan R Pulver, Sabine L Renninger, Amy Baohan, Eric R\nSchreiter, Rex A Kerr, Michael B Orger, Vivek Jayaraman, et al. Ultrasensitive \ufb02uorescent proteins for\nimaging neuronal activity. Nature, 499(7458):295\u2013300, 2013.\n\n[5] Christopher D Harvey, Philip Coen, and David W Tank. Choice-speci\ufb01c sequences in parietal cortex during\n\na virtual-navigation decision task. Nature, 484(7392):62\u201368, 2012.\n\n[6] Eran A Mukamel, Axel Nimmerjahn, and Mark J Schnitzer. Automated analysis of cellular signals from\n\nlarge-scale calcium imaging data. Neuron, 63(6):747\u2013760, 2009.\n\n[7] Eftychios A Pnevmatikakis, Daniel Soudry, Yuanjun Gao, Timothy A Machado, Josh Merel, David Pfau,\nThomas Reardon, Yu Mu, Clay Lace\ufb01eld, Weijian Yang, et al. Simultaneous denoising, deconvolution, and\ndemixing of calcium imaging data. Neuron, 89(2):285\u2013299, 2016.\n\n[8] Marius Pachitariu, Adam M Packer, Noah Pettit, Henry Dalgleish, Michael Hausser, and Maneesh Sahani.\nExtracting regions of interest from biological images with convolutional sparse block coding. In Advances\nin Neural Information Processing Systems, pages 1745\u20131753, 2013.\n\n[9] Ferran Diego Andilla and Fred A Hamprecht. Sparse space-time deconvolution for calcium image analysis.\n\nIn Advances in Neural Information Processing Systems, pages 64\u201372, 2014.\n\n[10] David S Greenberg, Arthur R Houweling, and Jason ND Kerr. Population imaging of ongoing neuronal\n\nactivity in the visual cortex of awake rats. 
Nature neuroscience, 11(7):749\u2013751, 2008.\n\n[11] John Peter Rickgauer, Karl Deisseroth, and David W Tank. Simultaneous cellular-resolution optical\n\nperturbation and imaging of place cell \ufb01ring \ufb01elds. Nature neuroscience, 17(12):1816\u20131824, 2014.\n\n[12] Yann LeCun, Koray Kavukcuoglu, Cl\u00e9ment Farabet, et al. Convolutional networks and applications in\n\nvision. In ISCAS, pages 253\u2013256, 2010.\n\n[13] Ilya Valmianski, Andy Y Shih, Jonathan D Driscoll, David W Matthews, Yoav Freund, and David\nKleinfeld. Automatic identi\ufb01cation of \ufb02uorescently labeled brain cells for rapid functional imaging.\nJournal of neurophysiology, 104(3):1803\u20131811, 2010.\n\n[14] Spencer L Smith and Michael H\u00e4usser. Parallel processing of visual space by neighboring neurons in\n\nmouse visual cortex. Nature neuroscience, 13(9):1144\u20131149, 2010.\n\n[15] Patrick Kaifosh, Jeffrey D Zaremba, Nathan B Danielson, and Attila Losonczy. Sima: Python software for\n\nanalysis of dynamic \ufb02uorescence imaging data. Frontiers in neuroinformatics, 8, 2014.\n\n[16] Andrej Karpathy, George Toderici, Sanketh Shetty, Thomas Leung, Rahul Sukthankar, and Li Fei-Fei.\nLarge-scale video classi\ufb01cation with convolutional neural networks. In Proceedings of the IEEE conference\non Computer Vision and Pattern Recognition, pages 1725\u20131732, 2014.\n\n[17] Theo Walker. Cell magic wand tool. http://www.maxplanckflorida.org/fitzpatricklab/\n\nsoftware/cellMagicWand/index.html. 2014.\n\n[18] Aleksandar Zlateski, Kisuk Lee, and H Sebastian Seung. ZNN \u2013 a fast and scalable algorithm for training\n3d convolutional networks on multi-core and many-core shared memory machines. 
arXiv:1510.06706,\n2015.\n\n9\n\n\f", "award": [], "sourceid": 1632, "authors": [{"given_name": "Noah", "family_name": "Apthorpe", "institution": "Princeton University"}, {"given_name": "Alexander", "family_name": "Riordan", "institution": "Princeton University"}, {"given_name": "Robert", "family_name": "Aguilar", "institution": "Princeton University"}, {"given_name": "Jan", "family_name": "Homann", "institution": "Princeton University"}, {"given_name": "Yi", "family_name": "Gu", "institution": "Princeton University"}, {"given_name": "David", "family_name": "Tank", "institution": "Princeton University"}, {"given_name": "H. Sebastian", "family_name": "Seung", "institution": "Princeton University"}]}