{"title": "Inferring neural population dynamics from multiple partial recordings of the same neural circuit", "book": "Advances in Neural Information Processing Systems", "page_first": 539, "page_last": 547, "abstract": "Simultaneous recordings of the activity of large neural populations are extremely valuable as they can be used to infer the dynamics and interactions of neurons in a local circuit, shedding light on the computations performed. It is now possible to measure the activity of hundreds of neurons using 2-photon calcium imaging. However, many computations are thought to involve circuits consisting of thousands of neurons, such as cortical barrels in rodent somatosensory cortex. Here we contribute a statistical method for \"stitching\" together sequentially imaged sets of neurons into one model by phrasing the problem as fitting a latent dynamical system with missing observations. This method allows us to substantially expand the population-sizes for which population dynamics can be characterized, beyond the number of simultaneously imaged neurons. In particular, we demonstrate using recordings in mouse somatosensory cortex that this method makes it possible to predict noise correlations between non-simultaneously recorded neuron pairs.", "full_text": "Inferring neural population dynamics from multiple partial recordings of the same neural circuit\n\nSrinivas C. Turaga\u22171,2, Lars Buesing1, Adam M. Packer2, Henry Dalgleish2, Noah Pettit2, Michael H\u00e4usser2 and Jakob H. 
Macke3,4\n\n1Gatsby Computational Neuroscience Unit, University College London\n2Wolfson Institute for Biomedical Research, University College London\n\n3Max-Planck Institute for Biological Cybernetics, T\u00fcbingen\n4Bernstein Center for Computational Neuroscience, T\u00fcbingen\n\nAbstract\n\nSimultaneous recordings of the activity of large neural populations are extremely\nvaluable as they can be used to infer the dynamics and interactions of neurons in\na local circuit, shedding light on the computations performed. It is now possible\nto measure the activity of hundreds of neurons using 2-photon calcium imaging.\nHowever, many computations are thought to involve circuits consisting of thousands\nof neurons, such as cortical barrels in rodent somatosensory cortex. Here we\ncontribute a statistical method for \u201cstitching\u201d together sequentially imaged sets of\nneurons into one model by phrasing the problem as \ufb01tting a latent dynamical system\nwith missing observations. This method allows us to substantially expand the\npopulation-sizes for which population dynamics can be characterized, beyond\nthe number of simultaneously imaged neurons. In particular, we demonstrate using\nrecordings in mouse somatosensory cortex that this method makes it possible\nto predict noise correlations between non-simultaneously recorded neuron pairs.\n\n1 Introduction\n\nThe computation performed by a neural circuit is a product of the properties of single neurons in the\ncircuit and their connectivity. Simultaneous measurements of the collective dynamics of all neurons\nin a neural circuit will help us understand their function and test theories of neural computation.\nHowever, experimental limitations make it dif\ufb01cult to measure the joint activity of large populations\nof neurons. Recent progress in 2-photon calcium imaging now allows for recording of the activity\nof hundreds of neurons nearly simultaneously [1, 2]. 
However, in neocortex where circuits or\nsubnetworks can span thousands of neurons, current imaging techniques are still inadequate.\nWe present a computational method to more effectively leverage currently available experimental\ntechnology. To illustrate our method consider the following example: A whisker barrel in the mouse\nsomatosensory cortex consists of a few thousand neurons responding to stimuli from one whisker.\nModern microscopes can only image a small fraction\u2014a few hundred neurons\u2014of this circuit. But\nsince nearby neurons couple strongly to one another [3], by moving the microscope to nearby loca-\ntions, one can expect to image neurons which are directly coupled to the \ufb01rst population of neurons.\nIn this paper we address the following question: Could we characterize the joint dynamics of the\n\ufb01rst and second populations of neurons, even though they were not imaged simultaneously? Can we\nestimate correlations in variability across the two populations? Surprisingly, the answer is yes.\nWe propose a statistical tool for \u201cstitching\u201d together measurements from multiple partial observa-\ntions of the same neural circuit. We show that we can predict the correlated dynamics of large\n\n\u2217sturaga@gatsby.ucl.ac.uk\n\n1\n\n\fFigure 1: Inferring neuronal interactions from non-simultaneous measurements. a) If two\nsubsets of a neural population can only be recorded from in two separate imaging sessions, can\nwe infer the connectivity across the sub-populations (red connections)? b) We want to infer the\nfunctional connectivity matrix, and in particular those entries which correspond to pairs of neurons\nthat were not simultaneously measured (red off-diagonal block). While the two sets of neurons\nare pictured as non-overlapping here, we will also be interested in the case of partially overlapping\nmeasurements.\n\npopulations of neurons even if many of the neurons have not been imaged simultaneously. 
In sen-\nsory cortical neurons, where large variability in the evoked response is observed [4, 5], our model\ncan successfully predict the magnitude of (so-called) noise correlations between non-simultaneously\nrecorded neurons. Our method can help us build data-driven models of large cortical circuits and\nhelp test theories of circuit function.\nRelated recent research. Numerous studies have addressed the question of inferring functional\nconnectivity from 2-photon imaging data [6, 7] or electrophysiological measurements [8, 9, 10, 11].\nThese approaches include detailed models of the relationship between \ufb02uorescence measure-\nments, calcium transients and spiking activity [6] as well as model-free information-theoretic ap-\nproaches [7]. However, these studies do not attempt to infer functional connections between\nnon-simultaneously observed neurons. On the other hand, a few studies have presented statisti-\ncal methods for dealing with sub-sampled observations of neural activity or connectivity, but these\napproaches are not applicable to our problem: A recent study [12] presented a method for predict-\ning noise correlations between non-simultaneously recorded neurons, but this method requires the\nstrong assumption that noise correlations are monotonically related to stimulus correlations. [13]\npresented an algorithm for latent GLMs, but this algorithm does not scale to the population sizes\nof interest here. [14] presented a method for inferring synaptic connections on dendritic trees from\nsub-sampled voltage observations. In this setting, one typically obtains a measurement from each\nlocation every few imaging frames, and it is therefore possible to interpolate these observations.\nIn contrast, in our application, imaging sessions are of much longer duration than the time-scale\nof neural dynamics. 
Finally, [15] presented a statistical framework for reconstructing anatomical\nconnectivity by superimposing partial connectivity matrices derived from \ufb02uorescent markers.\n\n2 Methods\n\nOur goal is to estimate a joint model of the activity of a neural population which captures the correlation\nstructure and stimulus selectivity of the population from partial observations of the population\nactivity. We model the problem as \ufb01tting a latent dynamical system with missing observations. In\nprinciple, any latent dynamical system model [13] can be used; here we demonstrate our main point\nusing the simple linear Gaussian dynamical system for its computational tractability.\n\n2.1 A latent dynamical system model for combining multiple measurements of population activity\n\nLinear dynamics. We denote by $x^k$ the activity of N neurons in the population on recording session\nk, and model its dynamics as linear with Gaussian innovations in discrete time,\n\n$x^k_t = A x^k_{t-1} + B u^k_t + \\eta_t \\qquad (1)$\n\nwhere $\\eta_t \\sim \\mathcal{N}(0, Q)$.\nHere, the N \u00d7 N coupling matrix A models correlations across neurons and time. An entry $A_{ij}$\nbeing non-zero implies that activity of neuron j at time t has a statistical in\ufb02uence on the activity of\nneuron i on the next time-step t + 1, but does not necessarily imply a direct synaptic connection. For\nthis reason, entries of A are usually referred to as the \u2018functional\u2019 (rather than anatomical) couplings\nor connectivity of the population. The entries of A also shape trial-to-trial variability which is\ncorrelated across neurons, i.e. noise-correlations. Further, we include an external, observed stimulus\n$u^k_t$ (of dimension $N_u$) as well as receptive \ufb01elds B (of size $N \\times N_u$) which model the stimulus\ndependence of the population activity. 
We model neural noise (which could include the effect of\nother in\ufb02uences not modeled explicitly) using zero-mean innovations $\\eta_t$, which are Gaussian i.i.d.\nwith covariance matrix Q, assuming the latter to be diagonal (see below for how our framework also\ncan allow for correlated noise). The mean $x_0$ and covariance $Q_0$ of the initial state $x^k_0$ were chosen\nsuch that the system is stationary (apart from the stimulus contribution $B u^k_t$), i.e. $x_0 = 0$ and $Q_0$\nsatis\ufb01es the Lyapunov equation $Q_0 = A Q_0 A^\\top + Q$.\nFor the sake of simplicity, we work directly in the space of continuous valued imaging measurements\n(rather than on the underlying spiking activity), i.e. $x^k_t$ models the relative calcium \ufb02uorescence signal.\nWhile this model does not capture the nonlinear and non-Gaussian cascade of neural couplings,\ncalcium dynamics, \ufb02uorescence measurements and imaging noise [16, 6], we will show that this\nmodel nevertheless is able to predict correlations across non-simultaneously observed pairs of neurons.\nIncomplete observations. In each imaging session k we measure the activity of $N_k$ neurons simultaneously,\nwhere $N_k$ is smaller than the total number of neurons N. Since these measurements are\nnoisy and incomplete observations of the full state vector, the true underlying activity of all neurons\n$x^k_t$ is treated as a latent variable. The vector of the $N_k$ measurements at time t in session k is denoted\nas $y^k_t$ and is related to the underlying population activity by\n\n$y^k_t = C^k (x^k_t + d + \\epsilon_t) \\qquad (2)$\n\nwhere the \u2018measurement matrix\u2019 $C^k$ is of size $N_k \\times N$. Further assuming that the recording sites\ncorrespond to identi\ufb01ed cells (which typically is the case for 2-photon calcium imaging), we can\nassume $C^k$ to be known and of the following form: The element $C^k_{ij}$ is 1 if neuron j of the population\nis being recorded from on session k (as the i-th recording site); the remaining elements of $C^k$ are\n0. 
The measurement noise is modeled as a zero-mean Gaussian random variable $\\epsilon_t \\sim \\mathcal{N}(0, R)$ with covariance R, and\nthe parameter d captures a constant offset. One can also envisage using our model with dimensions\nof $x^k_t$ which are never observed; such latent dimensions would then model correlated noise or the\ninput from unobserved neurons into the population [17, 18].\nFitting the model. Our goal is to estimate the parameters (A, B, Q, R) of the latent linear dynamical\nsystem (LDS) model described by equations (1) and (2) from experimental data. One can learn these\nparameters using the standard expectation maximization (EM) algorithm that \ufb01nds a local maximum\nof the log-likelihood of the observed data [19]. The E-step can be performed via Kalman Smoothing\n(with a different $C^k$ for each session). In the M-step, the updates for A, B and Q are as in standard\nlinear dynamical systems, and the updates for R and d are element-wise given by\n\n$d_j = \\frac{1}{T n_j} \\sum_{k,t} \\chi^k_j \\left( y^k_{t,\\sigma^k_j} - \\langle x^k_{t,j} \\rangle \\right)$,\n\n$R_{jj} = \\frac{1}{T n_j} \\sum_{k,t} \\chi^k_j \\left\\langle \\left( y^k_{t,\\sigma^k_j} - x^k_{t,j} - d_j \\right)^2 \\right\\rangle$,\n\nwhere $\\langle \\cdot \\rangle$ denotes the expectation over the posterior distribution calculated in the E-step, and T is\nthe number of time steps in each recording session (assumed to be the same for each session for\nthe sake of simplicity). Furthermore, $\\chi^k_j := \\sum_i C^k_{ij}$ is 1 if neuron j was imaged in session k and 0\notherwise, $n_j = \\sum_k \\chi^k_j$ is the total number of sessions in which neuron j was imaged and $\\sigma^k_j$ is the\nindex of the recording site of neuron j during session k. 
To improve the computational ef\ufb01ciency of\nthe \ufb01tting procedure as well as to avoid shallow local maxima, we used a variant of online-EM with\nrandomly selected mini-batches [20] followed by full batch EM for \ufb01ne-tuning.\n\n2.2 Details of simulated and experimental data\n\nSimulated data. We simulated a population of 60 neurons which were split into 3 pools (\u2018cell\ntypes\u2019) of 20 neurons each, with both connection probability and strength being cell-type speci\ufb01c.\nWithin each pool, pairs were coupled with probability 50% and random weights, cell-types one and\ntwo had excitatory connections onto the other cells, and type three had weak but dense inhibitory\ncouplings (see Figure 2a, top left). Coupling weights were truncated at \u00b10.2. The 4-dimensional external\nstimulus was delivered into the \ufb01rst pool. On average, 24% of the variance of each neuron was\nnoise, 2% driven by the stimulus, 25% by self-couplings and a further 49% by network-interactions.\nAfter shuf\ufb02ing the ordering of neurons (resulting in the connectivity matrix displayed in Fig. 2a,\ntop middle), we simulated K = 10 trials of length T = 1000 samples from the population. We\nthen pretended that the population was imaged in two sessions with non-overlapping subsets of 30\nneurons each (Figure 2a, green outlined blocks) of K = 5 trials each, and that observation noise $\\epsilon$\nwas uncorrelated and very small, std($\\epsilon_{ii}$) = 0.006.\nExperimental data. We also applied the stitching method to two calcium imaging datasets recorded\nin the somatosensory cortex of awake or anesthetized mice. 
We imaged calcium signals in the\nsuper\ufb01cial layers of mouse barrel cortex (S1) in-vivo using 2-photon laser scanning microscopy [1].\nA genetically encoded calcium indicator (GCaMP6s) was virally expressed, driven pan-neuronally\nby the human-synapsin promoter, in the C2 whisker barrel and the activity of about 100-200 neurons\nwas imaged simultaneously in-vivo at about 3 Hz, compatible with the slow timescales of the calcium\ndynamics revealed by GCaMP6s. The anesthetized dataset was collected during an experiment in\nwhich the C2 whisker of an anesthetized mouse was repeatedly \ufb02icked randomly in one of three\ndifferent directions (rostrally, caudally or ventrally). About 200 neurons were imaged for about\n27 min at a depth of 240 \u00b5m in the C2 whisker barrel. The awake dataset was collected while an\nawake animal was performing a whisker \ufb02ick detection task. In this session, about 80 neurons were\nimaged for about 55 min at a depth of 190 \u00b5m, also in the C2 whisker barrel. Regions of interest\n(ROI) corresponding to putative GCaMP expressing soma (and in some instances isolated neuropil)\nwere manually de\ufb01ned and the time-series corresponding to the calcium signal for each such ROI\nwas extracted. The calcium time-series were high-pass \ufb01ltered with a time-constant of 1 s.\n\n2.3 Quantifying and comparing model performance\n\nFictional imaging scenario in experimental data. To evaluate how well stitching works on real\ndata, we created a \ufb01ctional imaging scenario. We pretended that the neurons, which were in reality\nsimultaneously imaged, were not imaged in one session but instead were \u2018imaged\u2019 in two subsets in\ntwo different sessions. The subsets corresponding to different \u2018sessions\u2019 each contained c = 60% of the neurons,\nmeaning that the subsets overlapped and had a few neurons in common. We also experimented with\nc = 50% as in our simulation above, but failed to get good performance without any overlapping\nneurons. 
We imagined that we spent the \ufb01rst 40% of the time \u2018imaging\u2019 subset 1 and the second 40%\nof the time \u2018imaging\u2019 subset 2. The \ufb01nal 20% of the data was withheld for use as the test set. We\nthen used our stitching method to predict pairwise correlations from the \ufb01ctional imaging session.\nUpper and lower bounds on performance. We wanted to benchmark how well our method is doing\nboth compared to the theoretical optimum and to a conventional approach. On synthetic data, we\ncan use the ground-truth parameters as the optimal model. In lieu of ground-truth on the real data,\nwe \ufb01t a \u2018fully observed\u2019 model to the simultaneous imaging data of all neurons (which would be\nimpossible of course in practice, but is possible in our \ufb01ctional imaging scenario). We also analyzed\nthe data using a conventional, \u2018naive\u2019 approach in which we separately \ufb01t dynamical system models\nto each of the two imaging sessions and then combined their parameters. We set coef\ufb01cients of\nnon-simultaneously recorded pairs to 0 and averaged coef\ufb01cients for neurons which were part of both\nimaging sessions (in the c = 60% scenario). The \u201cfully observed\u201d and the \u201cnaive\u201d models constitute\nan upper and lower bound respectively on our performance. Certainly we cannot expect to do better\nat predicting correlations than if we had observed all neurons simultaneously.\n\n3 Results\n\nWe tested our ability to stitch multiple observations into one coherent model which is capable of\npredicting statistics of the joint dynamics, such as correlations across non-simultaneously imaged\n\nFigure 2: Noise correlations and coupling parameters can be well recovered in a simulated\ndataset. a) A coupling matrix for 60 neurons arranged in 3 blocks was generated (true coupling\nmatrix) and shuf\ufb02ed. We simulated the imaging of non-overlapping subsets of 30 neurons each in\ntwo sessions. 
Couplings were recovered using a \u201cnaive\u201d strategy and using our proposed \u201cstitching\u201d\nmethod. b) Noise correlations estimated by our stitching method match true noise correlations\nwell. c) Couplings between non-simultaneously imaged neuron pairs (red off-diagonal block) are\nestimated well by our method.\n\nneuron pairs. We \ufb01rst apply our method to a synthetic dataset to explain its properties, and then\ndemonstrate that it works for real calcium imaging measurements from the mouse somatosensory\ncortex.\n\n3.1 Inferring correlations and model parameters in a simulated population\n\nIt might seem counterintuitive that one can infer the cross-couplings, and hence noise-correlations,\nbetween neurons observed in separate sessions. An intuition for why this might work nevertheless\ncan be gained by considering the arti\ufb01cial scenario of a network of linearly interacting neurons driven\nby Gaussian noise: Suppose that during the \ufb01rst recording session we image half of these neurons.\nWe can \ufb01t a linear state-space model to the data in which the other, unobserved half of the population\nconstitutes the latent space. Given enough data, the maximum likelihood estimate of the model\nparameters (which is consistent) lets us identify the true joint dynamics of the whole population up\nto an invertible linear transformation of the unobserved dimensions [21]. After the second imaging\nsession, where we image the second (and previously unobserved) half of the population, we can\nidentify this linear transformation, and thus identify all model parameters uniquely, in particular the\ncross-couplings. To demonstrate this intuition, we simulated such an arti\ufb01cial dataset (described in\nSection 2.2) and describe here the results of the stitching procedure.\nRecovering the coupling matrix. 
Our stitching method was able to recover the true coupling\nmatrix, including the off-diagonal blocks which correspond to pairs of neurons that were not imaged\nsimultaneously (see red-outlined blocks in Figure 2a, bottom middle). As expected, recovery was better for\ncouplings across observed pairs (correlation between true and estimated parameters 0.95, excluding\nself-couplings) than for non-simultaneously recorded pairs (Figure 2c; correlation 0.73). With the\n\u201cnaive\u201d approach couplings between non-simultaneously observed pairs cannot be recovered, and\neven for simultaneously observed pairs, the estimate of couplings is biased (correlation 0.75).\nRecovering noise correlations. We also quanti\ufb01ed the degree to which we are able to predict\nstatistics of the joint dynamics of the whole network, in particular noise correlations across pairs\nof neurons that were never observed simultaneously. We calculated noise correlations by computing\ncorrelations in variability of neural activity after subtracting contributions due to the stimulus.\nWe found that the stitching method was able to accurately recover the noise-correlations of\nnon-simultaneously recorded pairs (correlation between predicted and true correlations was 0.92; Figure\n2b). In fact, we generally found the prediction of correlations to be more accurate than prediction\n\nFigure 3: Examples of correlation and coupling recovery in the anesthetized calcium imaging\nexperiments. a) Coupling matrices \ufb01t to calcium signal using all neurons (fully observed) or \ufb01t after\n\u201cimaging\u201d two overlapping subsets of 60% neurons each (stitched and naive). 
The naive approach\nis unable to estimate coupling terms for \u201cnon-simultaneously imaged\u201d neurons, so these are set to\nzero. b) Scatter plot of coupling terms for \u201cnon-simultaneously imaged\u201d neuron pairs estimated\nusing the stitching method vs the fully observed estimates. c) Correlations predicted using the\ncoupling matrices. d) Scatter plot of correlations in c for \u201cnon-simultaneously imaged\u201d neuron pairs\nestimated using the stitching and the naive approaches.\n\nof the underlying coupling parameters. In contrast, a naive approach would not be able to estimate\nnoise correlations between non-simultaneously observed pairs. (We note that, as the stimulus drive\nin this simulation was very weak, inferring noise correlations from stimulus correlations [12] would\nbe impossible).\nPredicting unobserved neural activity. Given activity measurements from a subset of neurons,\nour method can predict the activity of neurons in the unobserved subset. This prediction can be\ncalculated by doing inference in the resulting LDS, i.e. by calculating the posterior mean\n$\\mu^k_{1:T} = E(x^k_{1:T} \\mid y^k_{1:T}, h^k_{1:T})$ and looking at those entries of $\\mu^k_{1:T}$ which correspond to unobserved neurons.\nOn our simulated data, we found that this prediction was strongly correlated with the underlying\nground-truth activity (average correlation 0.70 \u00b1 0.01 s.e.m. across neurons, using a separate test\nset which was not used for parameter \ufb01tting). The upper bound for this prediction metric can be\nobtained by using the ground-truth parameters to calculate the posterior mean. Use of this ground-truth\nmodel resulted in a performance of 0.82 \u00b1 0.01. 
In contrast, the \u2018naive\u2019 approach can only\nutilize the stimulus, but not the activity of the observed population for prediction and therefore only\nachieved a correlation of 0.23 \u00b1 0.01.\n\n3.2 Inferring correlations in mouse somatosensory cortex\n\nNext, we applied our stitching method to two real datasets: anesthetized and awake (described in\nSection 2.2). We demonstrate that it can predict correlations between non-simultaneously accessed\nneuron pairs with accuracy approaching that of the upper bound (\u201cfully observed\u201d model trained on\nall neurons), and substantially better than the lower bound \u201cnaive\u201d model.\nExample results. Figure 3a displays coupling matrices of a population consisting of the 50 most\ncorrelated neurons in the anesthetized dataset (see Section 2.2 for details) estimated using all three\nmethods. Our stitching method yielded a coupling matrix with structure similar to the fully observed\nmodel (Figure 3a, central panel), even in the off-diagonal blocks which correspond to\nnon-simultaneously recorded pairs. In contrast, the naive method, by de\ufb01nition, is unable to infer couplings\nfor non-simultaneously recorded pairs, and therefore over-estimates the magnitude of observed\ncouplings (Figure 3a, right panel). Even for non-simultaneously recorded pairs, the stitched\nmodel predicted couplings which were correlated with the fully observed predictions (Figure 3b,\ncorrelation 0.38).\n\nFigure 4: Recovering correlations and coupling parameters in real calcium imaging experiments.\n100 neurons were simultaneously imaged in an anesthetized mouse (top row) and an awake\nmouse (bottom row). 
Random populations of these neurons, ranging in size from 10 to 100, were\nchosen and split into two slightly overlapping sub-sets each containing 60% of the neurons. The\nactivity of these sub-sets was imagined to be \u201cimaged\u201d in two separate \u201cimaging\u201d sessions (see\nSection 2.3). a) Pairwise correlations for \u201cnon-simultaneously imaged\u201d neuron pairs estimated by\nthe \u201cnaive\u201d and our \u201cstitched\u201d strategies compared to correlations predicted by a model \ufb01t to all neurons\n(\u201cfull obs\u201d). b) Accuracy of predicting the activity of one sub-set of neurons, given the activities\nof the other sub-set of neurons. c) Comparison of estimated couplings for \u201cnon-simultaneously imaged\u201d\nneuron pairs to those estimated using the \u201cfully observed\u201d model. Note that true coupling\nterms are unavailable here.\n\nHowever, of greater interest is how well our model can recover pairwise correlations between\nnon-simultaneously measured neuron pairs. We found that our stitching method, but not the naive\nmethod, was able to accurately reconstruct these correlations (Figure 3c). As expected, the naive\nmethod strongly under-estimated correlations in the non-simultaneously recorded blocks, as it can\nonly model stimulus-correlations but not noise-correlations across neurons. 1 In contrast, our stitching\nmethod predicted correlations well, matching those of the fully observed model (correlation 0.84\nfor stitchLDS, 0.15 for naiveLDS, Figure 3d).\nSummary results across multiple populations. Here, we investigate the robustness of our \ufb01ndings.\nWe drew random neuronal populations of sizes ranging from 10 to 80 (for awake) or 100\n(for anesthetized) from the full datasets. 
For each population, we \ufb01t three models (fully observed,\nstitch, naive) and compared their correlations, parameters and activity cross-prediction accuracy.\nWe repeated this process 20 times for each population size and dataset (anesthetized/awake) to characterize\nthe variability. We found that for both datasets, the correlations predicted by the stitching\nmethod for non-simultaneously recorded pairs were similar to the fully observed ones, and that this\nsimilarity is almost independent of population size (Figure 4a). In fact, for the awake data (in which\nthe overall level of correlation was higher), the correlation matrices were extremely similar (lower\npanel). The stitching method also substantially outperformed the naive approach, for which the\nsimilarity was lower by a factor of about 2.\n\n1 The naive approach also over-estimated correlations within each view. This is a consequence of biases\nresulting from averaging couplings across views for neurons in the overlap between the two \ufb01ctional sessions.\n\nWe compared the accuracy of the models at predicting the neural activity of one subset of neurons\ngiven the stimulus and the activity of the other subset (Figure 4b). We \ufb01nd that our model\nmakes signi\ufb01cantly better predictions than the lower bound naive model, whose performance comes\nfrom modeling the stimulus and neurons in the overlap between both subsets. Indeed for the more\nactive and correlated awake dataset, predictions are nearly as good as those of the fully observed\nmodel. 
We also found that prediction accuracy increased slightly with population size, perhaps\nsince a larger population provides more neurons from which the activity of the other subset can be\npredicted. Apparently, this gain in accuracy from additional neurons outweighed any potential drop\nin performance resulting from increased potential for over-\ufb01tting on larger populations.\nWhile we have no access to the true cross-couplings for the real data, we can nonetheless compare\nthe couplings from our stitched model to those estimated by the fully observed model. We \ufb01nd\nthat the stitching model is indeed able to estimate couplings that correlate positively with the fully\nobserved couplings, even for non-simultaneously imaged neuron pairs. Interestingly, this correlation\ndrops with increasing population size, perhaps due to possible near degeneracy of parameters for\nlarge systems.\n\n4 Discussion\n\nIt has long been appreciated that a dynamical system can be reconstructed from observations of only\na subset of its variables [22, 23, 21]. These theoretical results suggest that while only measuring\nthe activity of one population of neurons, we can infer the activity of a second neural population\nthat strongly interacts with the \ufb01rst, up to re-parametrization. Here, we go one step further. By later\nmeasuring the activity of the second population, we recover the true parametrization allowing us to\npredict aspects of the joint dynamics of the two populations, such as noise correlations.\nOur essential \ufb01nding is that we can put these theoretical insights to work using a simple linear\ndynamical system model that \u201cstitches\u201d together data from non-simultaneously recorded but strongly\ninteracting populations of neurons. 
We applied our method to analyze 2-photon population calcium\nimaging measurements from the super\ufb01cial layers of the somatosensory cortex of both anesthetized\nand awake mice, and found that our method was able to successfully combine data not accessed\nsimultaneously. In particular, this approach allowed us to accurately predict correlations even for\npairs of non-simultaneously recorded neurons.\nIn this paper, we focused our demonstration to stitching together two populations of neurons. Our\nframework can be generalized to more than two populations, however it remains to be empirically\nseen how well larger numbers of populations can be combined. An experimental variable of interest\nis the degree of overlap (shared neurons) between different populations of neurons. We found that\nsome overlap was critical for stitching to work, and increasing overlap improves stitching perfor-\nmance. Given a \ufb01xed imaging time budget, determining a good trade-off between overlap and total\ncoverage is an intriguing open problem in experimental design.\nWe emphasise that our linear gaussian dynamical system provides only a statistical description of the\nobserved data. However, even this simple model makes accurate predictions of correlations between\nnon-simultaneously observed neurons. Nevertheless, more realistic models [16, 6] can help improve\nthe accuracy of these predictions and disentangle the contributions of spiking activity, calcium dy-\nnamics, \ufb02uorescence measurements and imaging noise to the observed statistics. Similarly, better\npriors on neural connectivity [24] might improve reconstruction performance. Indeed, we found\nin unreported simulations that using a sparsifying penalty on the connectivity matrix [6] improves\nparameter estimates slightly. 
We note that our model can easily be extended to model potential\ncommon input from neurons which are never observed [13] as a low-dimensional LDS [17, 18].\nThe simultaneous measurement of the activity of all neurons in a neural circuit will shed much light\non the nature of neural computation. While there is much progress in developing faster imaging\nmodalities, there are fundamental physical limits to the number of neurons which can be\nsimultaneously imaged. Our paper suggests a means for expanding our limited capabilities. With more\npowerful algorithmic tools, we can imagine mapping population dynamics of all the neurons in an\nentire neural circuit such as the zebrafish larval olfactory bulb, or layers 2 & 3 of a whisker barrel,\nan ambitious goal which has until now been out of reach.\n\nAcknowledgements\n\nWe thank Peter Dayan for valuable comments on our manuscript and members of the Gatsby Unit for\ndiscussions. We are grateful for support from the Gatsby Charitable Trust, Wellcome Trust, ERC, EMBO, People\nProgramme (Marie Curie Actions) and German Federal Ministry of Education and Research (BMBF; FKZ:\n01GQ1002, Bernstein Center T\u00fcbingen).\n\nReferences\n[1] J. N. D. Kerr and W. Denk, \u201cImaging in vivo: watching the brain in action,\u201d Nat Rev Neurosci, vol. 9,\nno. 3, pp. 195\u2013205, 2008.\n[2] C. Grienberger and A. Konnerth, \u201cImaging calcium in neurons,\u201d Neuron, vol. 73, no. 5, pp. 862\u2013885,\n2012.\n[3] S. Lefort, C. Tomm, J.-C. Floyd Sarria, and C. C. H. Petersen, \u201cThe excitatory neuronal network of the\nC2 barrel column in mouse primary somatosensory cortex,\u201d Neuron, vol. 61, no. 2, pp. 301\u2013316, 2009.\n[4] D. J. Tolhurst, J. A. Movshon, and A. F. Dean, \u201cThe statistical reliability of signals in single neurons in\ncat and monkey visual cortex,\u201d Vision research, vol. 23, no. 8, pp. 775\u2013785, 1983.\n[5] W. R. Softky and C. 
Koch, \u201cThe highly irregular firing of cortical cells is inconsistent with temporal\nintegration of random EPSPs,\u201d The Journal of Neuroscience, vol. 13, no. 1, pp. 334\u2013350, 1993.\n[6] Y. Mishchenko, J. T. Vogelstein, and L. Paninski, \u201cA Bayesian approach for inferring neuronal\nconnectivity from calcium fluorescent imaging data,\u201d The Annals of Applied Statistics, vol. 5, no. 2B,\npp. 1229\u20131261, 2011.\n[7] O. Stetter, D. Battaglia, J. Soriano, and T. Geisel, \u201cModel-free reconstruction of excitatory neuronal\nconnectivity from calcium imaging signals,\u201d PLoS Comp Bio, vol. 8, no. 8, p. e1002653, 2012.\n[8] J. W. Pillow, J. Shlens, L. Paninski, A. Sher, A. M. Litke, E. J. Chichilnisky, and E. P. Simoncelli,\n\u201cSpatio-temporal correlations and visual signalling in a complete neuronal population,\u201d Nature, vol. 454,\nno. 7207, pp. 995\u2013999, 2008.\n[9] I. H. Stevenson, J. M. Rebesco, L. E. Miller, and K. P. K\u00f6rding, \u201cInferring functional connections between\nneurons,\u201d Current opinion in neurobiology, vol. 18, no. 6, pp. 582\u2013588, 2008.\n[10] A. Singh and N. A. Lesica, \u201cIncremental mutual information: A new method for characterizing the\nstrength and dynamics of connections in neuronal circuits,\u201d PLoS Comp Bio, vol. 6, no. 12, p. e1001035,\n2010.\n[11] D. Song, H. Wang, C. Y. Tu, V. Z. Marmarelis, R. E. Hampson, S. A. Deadwyler, and T. W. Berger,\n\u201cIdentification of sparse neural functional connectivity using penalized likelihood estimation and basis\nfunctions,\u201d J Comp Neurosci, pp. 1\u201323, 2013.\n[12] A. Wohrer, R. Romo, and C. Machens, \u201cLinear readout from a neural population with partial correlation\ndata,\u201d in Advances in Neural Information Processing Systems, vol. 22, Curran Associates, Inc., 2010.\n[13] J. W. Pillow and P. 
Latham, \u201cNeural characterization in partially observed populations of spiking\nneurons,\u201d Adv Neural Information Processing Systems, vol. 20, no. 3.5, 2008.\n[14] A. Pakman, J. H. Huggins, and P. L., \u201cFast penalized state-space methods for inferring dendritic synaptic\nconnectivity,\u201d Journal of Computational Neuroscience, 2013.\n[15] Y. Mishchenko and L. Paninski, \u201cA Bayesian compressed-sensing approach for reconstructing neural\nconnectivity from subsampled anatomical data,\u201d J Comp Neurosci, vol. 33, no. 2, pp. 371\u2013388, 2012.\n[16] J. T. Vogelstein, B. O. Watson, A. M. Packer, R. Yuste, B. Jedynak, and L. Paninski, \u201cSpike inference from\ncalcium imaging using sequential Monte Carlo methods,\u201d Biophysical Journal, vol. 97, no. 2,\npp. 636\u2013655, 2009.\n[17] M. Vidne, Y. Ahmadian, J. Shlens, J. Pillow, J. Kulkarni, A. Litke, E. Chichilnisky, E. Simoncelli, and\nL. Paninski, \u201cModeling the impact of common noise inputs on the network activity of retinal ganglion\ncells,\u201d J Comput Neurosci, 2011.\n[18] J. H. Macke, L. B\u00fcsing, J. P. Cunningham, B. M. Yu, K. V. Shenoy, and M. Sahani, \u201cEmpirical models of\nspiking in neural populations,\u201d in Advances in Neural Information Processing Systems, vol. 24, Curran\nAssociates, Inc., 2012.\n[19] A. P. Dempster, N. M. Laird, and D. B. Rubin, \u201cMaximum likelihood from incomplete data via the EM\nalgorithm,\u201d J R Stat Soc Ser B, vol. 39, no. 1, pp. 1\u201338, 1977.\n[20] P. Liang and D. Klein, \u201cOnline EM for unsupervised models,\u201d in NAACL \u201909: Proceedings of Human\nLanguage Technologies: The 2009 Annual Conference of the North American Chapter of the Association\nfor Computational Linguistics, Association for Computational Linguistics, 2009.\n[21] T. Katayama, Subspace methods for system identification. Springer Verlag, 2005.\n[22] L. E. Baum and T. 
Petrie, \u201cStatistical Inference for Probabilistic Functions of Finite State Markov\nChains,\u201d The Annals of Mathematical Statistics, vol. 37, no. 6, pp. 1554\u20131563, 1966.\n[23] F. Takens, \u201cDetecting Strange Attractors In Turbulence,\u201d in Dynamical Systems and Turbulence (D. A.\nRand and L. S. Young, eds.), vol. 898 of Lecture Notes in Mathematics, (Warwick), pp. 366\u2013381,\nSpringer-Verlag, Berlin, 1981.\n[24] S. W. Linderman and R. P. Adams, \u201cInferring functional connectivity with priors on network topology,\u201d\nin Cosyne Abstracts, 2013.", "award": [], "sourceid": 348, "authors": [{"given_name": "Srini", "family_name": "Turaga", "institution": "Gatsby Unit, UCL"}, {"given_name": "Lars", "family_name": "Buesing", "institution": "Gatsby Unit, UCL"}, {"given_name": "Adam", "family_name": "Packer", "institution": "UCL"}, {"given_name": "Henry", "family_name": "Dalgleish", "institution": "UCL"}, {"given_name": "Noah", "family_name": "Pettit", "institution": "UCL"}, {"given_name": "Michael", "family_name": "Hausser", "institution": "UCL"}, {"given_name": "Jakob", "family_name": "Macke", "institution": "MPI for Biological Cybernetics"}]}