{"title": "fMRI-Based Inter-Subject Cortical Alignment Using Functional Connectivity", "book": "Advances in Neural Information Processing Systems", "page_first": 378, "page_last": 386, "abstract": "The inter-subject alignment of functional MRI (fMRI) data is important for improving the statistical power of fMRI group analyses. In contrast to existing anatomically-based methods, we propose a novel multi-subject algorithm that derives a functional correspondence by aligning spatial patterns of functional connectivity across a set of subjects. We test our method on fMRI data collected during a movie viewing experiment. By cross-validating the results of our algorithm, we show that the correspondence successfully generalizes to a secondary movie dataset not used to derive the alignment.", "full_text": "fMRI-Based Inter-Subject Cortical Alignment Using Functional Connectivity\n\nBryan R. Conroy1 Benjamin D. Singer2 James V. Haxby3\u2217 Peter J. Ramadge1\n1 Department of Electrical Engineering, 2 Neuroscience Institute, Princeton University\n3 Department of Psychology, Dartmouth College\n\nAbstract\n\nThe inter-subject alignment of functional MRI (fMRI) data is important for improving the statistical power of fMRI group analyses. In contrast to existing anatomically-based methods, we propose a novel multi-subject algorithm that derives a functional correspondence by aligning spatial patterns of functional connectivity across a set of subjects. We test our method on fMRI data collected during a movie viewing experiment. By cross-validating the results of our algorithm, we show that the correspondence successfully generalizes to a secondary movie dataset not used to derive the alignment.\n\n1 Introduction\n\nFunctional MRI (fMRI) studies of human neuroanatomical organization commonly analyze fMRI\ndata across a population of subjects. 
The effective use of this data requires deriving a spatial cor-\nrespondence across the set of subjects, i.e., the data must be aligned, or registered, into a common\ncoordinate space. Current inter-subject registration techniques derive this correspondence by align-\ning anatomically-de\ufb01ned features, e.g. major sulci and gyri, across subjects, either in the volume or\non extracted cortical surfaces. Talairach normalization [1], for example, derives a piecewise af\ufb01ne\ntransformation by matching a set of major anatomical landmarks in the brain volume. More ad-\nvanced techniques match a denser set of anatomical features, such as cortical curvature [2], and\nderive nonlinear transformations between a reference space and each subject\u2019s cortical surface.\nIt is known, however, that an accurate inter-subject functional correspondence cannot be derived\nusing only anatomical features, since the size, shape and anatomical location of functional loci\nvary across subjects [3], [4]. Because of this de\ufb01ciency in current alignment methods, it is com-\nmon practice to spatially smooth each subject\u2019s functional data prior to a population based analysis.\nHowever, this incurs the penalty of blurring the functional data within and across distinct cortical\nregions. Thus, the functional alignment of multi-subject fMRI data remains an important problem.\nWe propose to register functional loci directly by using anatomical and functional data to learn an\ninter-subject cortical correspondence. This approach was \ufb01rst explored in [5], where subject cortices\nwere registered by maximizing the inter-subject correlation of the functional response elicited by a\ncommon stimulus (a movie viewing). In essence, the correspondence was selected to maximize the\ncorrelation of the fMRI time series between subjects. This relies on the functional response being\ntime-locked with the experimental stimulus. 
Large regions of visual and auditory cortex stimulated\nby a movie viewing do indeed show consistent inter-subject synchrony [6]. However, other areas in\nthe intrinsic [7] or default [8] system fail to exhibit signi\ufb01cant correlations across repeated stimulus\ntrials. The technique of [5] is hence not expected to improve alignment in these intrinsic regions.\nIn contrast to [5], we propose to achieve inter-subject alignment by aligning intra-subject patterns\nof cortical functional connectivity. By functional connectivity, we mean within-subject similarity of\nthe temporal response of remote regions of cortex [9]. This can be estimated from fMRI data, for\nexample, by correlating the functional time series between pairs of cortical nodes within a subject.\nThis yields a dense set of functional features for each subject from which we learn an inter-subject\ncorrespondence. Unlike other functional connectivity work (see e.g. [10]), we de\ufb01ne connectivity\nbetween pairs of cortical nodes rather than with respect to anatomical regions of interest. Our\napproach is inspired by studies showing that the patterns of functional connectivity in the intrinsic\nnetwork are consistent across subjects [7], [11]. This suggests that our method has the potential to\nlearn an inter-subject functional correspondence within both extrinsic and intrinsic cortical networks.\n\n\u2217This work was funded by a grant from the National Institute of Mental Health (5R01MH075706-02)\n\nIn summary, we formulate a multi-subject cortical alignment algorithm that minimizes the difference\nbetween functional connectivity vectors of corresponding cortical nodes across subjects. We do so\nby learning a dense-deformation \ufb01eld on the cortex of each subject, suitably regularized to preserve\ncortical topology [2]. 
Our key contributions are: a) the novel alignment objective, b) a principled\nalgorithm for accomplishing the alignment, and c) experimental veri\ufb01cation on fMRI data.\nThe paper is organized as follows. In \u00a72 we formulate the multi-subject alignment problem, followed\nby a detailed exposition of the algorithm in \u00a73 and \u00a74. Finally, we exhibit results of the algorithm\napplied to multi-subject fMRI data in \u00a75 and draw conclusions in \u00a76.\n\n2 Formulation of the Multi-Subject Alignment Problem\n\nFor each subject we are given volumetric anatomical MRI data and fMRI data. The anatomical\ndata is used to extract a two-dimensional surface model of cortex. This greatly facilitates cortical\nbased analysis and subsequent visualization [12], [13], [14]. Cortex is segmented, then each cortical\nhemisphere is in\ufb02ated to obtain a smooth surface, which is projected to the sphere, S2, represented\nby a discrete spherical mesh Ms = {pk \u2208 S2; 1 \u2264 k \u2264 Nv/2}. The two cortical hemispheres\nare hence modeled by the disjoint union S = S2 \u228e S2, represented by the corresponding disjoint\nunion of mesh points M = Ms \u228e Ms. Anatomical cortical features, such as cortical curvature, are\nfunctions Da : S \u2192 RNa sampled on M. Thus, our analysis is restricted to cortex only.\nThe volumetric fMRI data is \ufb01rst aligned with the anatomical scan, then mapped onto S. This assigns\neach mesh node pk \u2208 M a \u201cvolumetric cortical voxel\u201d vk \u2208 R3, with associated functional time\nseries fk \u2208 RNt. The functional time series data is then a function Df : S \u2192 RNt sampled on M.\nAs indicated in the introduction, we do not directly register the fMRI time series but instead\nregister the functional connectivity derived from the time series. Let \u03c3(f1, f2) denote a similar-\nity measure on pairs of time series f1, f2 \u2208 RNt. 
A useful example is empirical correlation:\n\u03c3(f1, f2) = corr(f1, f2); another possibility is an estimate of the mutual information between\nthe pairwise entries of f1, f2. De\ufb01ne the functional connectivity of the fMRI data under \u03c3 as the\nmap C(pi, pj) = \u03c3(Df (pi), Df (pj)), i.e., the similarity of the functional time series at the pairs of\ncortical nodes. Functional connections both within and across cortical hemispheres are considered.\nFunctional connectivity can be conceptualized as the adjacency matrix of an edge-weighted graph\non all cortical nodes. The edge between nodes pi, pj is weighted by the pairwise similarity measure\n\u03c3(fi, fj) codifying the functional similarity of pi and pj. In the case of correlation, C is the correla-\ntion matrix of the time series data. For typical values of Nv (\u2248 72,000), the functional connectivity\ndata structure is huge. Hence we need ef\ufb01cient mechanisms for working with C.\nWe are given the data discussed above for Ns subjects. Subject k\u2019s training data is speci\ufb01ed by sam-\nples of the functions Da,k : Sk \u2192 RNa, Df,k : Sk \u2192 RNt, and the derived functional connectivity\nCk, all sampled on the mesh Mk, k = 1, . . . , Ns. Our objective is to learn a relation consisting\nof Ns-tuples of corresponding points across the set of cortices. To do so, we could select a node\nfrom M1 for subject 1 and learn the corresponding points on the cortices of the remaining Ns \u2212 1\nsubjects through smooth and invertible mappings gk : S1 \u2192 Sk, k = 2, . . . , Ns. However, this arbi-\ntrarily and undesirably gives special status to one subject. Instead, we introduce a reference model\nSref = S2 \u228e S2 with mesh Mref. For each node p \u2208 Mref on Sref, we seek to learn the Ns-tuple of\ncorresponding points (g1(p), g2(p), . . . , gNs(p)), parameterized by gk : Sref \u2192 Sk, k = 1, . . . 
, Ns.\nIn general terms, we can now summarize our task as follows: use the functional connectivity data\nCk, in conjunction with the anatomical data Da,k, k = 1, . . . , Ns, to estimate warping functions\n{gk : k = 1, . . . , Ns}, subject to speci\ufb01ed regularity conditions, that bring some speci\ufb01ed balance\nof anatomy and functional connectivity into alignment across subjects. That said, for the remainder\nof the paper we restrict attention to aligning only functional connectivity across subjects. There is\nno doubt that anatomy must be an integral part of a full solution; but that aspect is not new, and is\nalready well understood. Restricting attention to the alignment of functional connectivity will allow\nus to concentrate on the most novel and important aspects of our approach.\nTo proceed, assume a reference connectivity Cref, such that for each subject k = 1, . . . , Ns,\n\n$$C_k(g_k(p_i), g_k(p_j)) = C_{\\mathrm{ref}}(p_i, p_j) + \\epsilon_k(p_i, p_j), \\qquad p_i, p_j \\in M_{\\mathrm{ref}} \\qquad (1)$$\n\nwhere Ck(gk(pi), gk(pj)) = \u03c3(Df,k(gk(pi)), Df,k(gk(pj))), and \u03b5k is zero-mean random noise.\nSince gk(p) may not be a mesh point, computation of Df,k(gk(p)) requires interpolation of the time\nseries using mesh nodes in a neighborhood of gk(p). This will be important as we proceed.\nGiven (1), we estimate g by maximizing a regularized log likelihood:\n\n$$\\hat{g} = \\arg\\max_{g = (g_1, \\cdots, g_{N_s})} \\; \\log P(C_1, \\ldots, C_{N_s} \\mid g) - \\lambda \\sum_k \\mathrm{Reg}(g_k) \\qquad (2)$$\n\nwhere Reg(gk) constrains each warping function gk to be smooth and invertible. Here, we will\nfocus on the log likelihood term and delay the discussion of regularization to \u00a73. 
Optimization of (2) is complicated by the fact that $C_{\\mathrm{ref}}$ is a latent variable, so it must be estimated along with $g$. We use Expectation-Maximization to iteratively alternate between computing an expectation of $C_{\\mathrm{ref}}$ (E-step), and a maximum likelihood estimate of $g$ given both the observed and estimated unobserved data (M-step) [15]. In the E-step, the expectation of $C_{\\mathrm{ref}}$, $\\bar{C}_{\\mathrm{ref}}$, conditioned on the current estimate of $g$, $\\hat{g}$, is computed by averaging the connectivity across subjects:\n\n$$\\bar{C}_{\\mathrm{ref}}(p_i, p_j) = \\frac{1}{N_s} \\sum_{k=1}^{N_s} C_k(\\hat{g}_k(p_i), \\hat{g}_k(p_j)), \\qquad p_i, p_j \\in S_{\\mathrm{ref}} \\qquad (3)$$\n\nIn the M-step, the estimate $\\hat{g}$ is refined to maximize the likelihood of the full data:\n\n$$\\hat{g} = \\arg\\max_{g = (g_1, \\cdots, g_{N_s})} \\; \\log P(\\bar{C}_{\\mathrm{ref}}, C_1, C_2, \\cdots, C_{N_s} \\mid g) \\qquad (4a)$$\n\n$$\\phantom{\\hat{g}} = \\arg\\min_{g = (g_1, \\cdots, g_{N_s})} \\; \\sum_{k=1}^{N_s} \\sum_{p_i, p_j \\in S_{\\mathrm{ref}}} \\left( \\bar{C}_{\\mathrm{ref}}(p_i, p_j) - C_k(g_k(p_i), g_k(p_j)) \\right)^2 \\qquad (4b)$$\n\nwhere we have assumed that the noise in (1) is i.i.d. Gaussian. Because (4b) decouples, we can optimize over each subject\u2019s warp separately, i.e., these optimizations can be done in parallel:\n\n$$\\hat{g}_k = \\arg\\min_{g_k} \\sum_{p_i, p_j \\in S_{\\mathrm{ref}}} \\left( \\bar{C}_{\\mathrm{ref}}(p_i, p_j) - C_k(g_k(p_i), g_k(p_j)) \\right)^2 \\qquad (5)$$\n\nHowever, an interesting alternative is to perform these sequentially with an E-step after each that updates the reference estimate $\\bar{C}_{\\mathrm{ref}}$. This also allows some other interesting adaptations. We note:\n\n$$\\left( \\bar{C}_{\\mathrm{ref}}(p_i, p_j) - C_k(g_k(p_i), g_k(p_j)) \\right)^2 \\propto \\left( \\bar{C}_k(p_i, p_j) - C_k(g_k(p_i), g_k(p_j)) \\right)^2 \\qquad (6a)$$\n\nwhere\n\n$$\\bar{C}_k(p_i, p_j) = \\frac{1}{N_s - 1} \\sum_{n \\neq k} C_n(\\hat{g}_n(p_i), \\hat{g}_n(p_j)), \\qquad p_i, p_j \\in M_{\\mathrm{ref}}, \\qquad (7)$$\n\nis the leave-one-out template for subject $k$, which is independent of $g_k$. Thus, we replace (5) by:\n\n$$\\hat{g}_k = \\arg\\min_{g_k} \\sum_{p_i, p_j \\in S_{\\mathrm{ref}}} \\left( \\bar{C}_k(p_i, p_j) - C_k(g_k(p_i), g_k(p_j)) \\right)^2 \\qquad (8)$$\n\nFrom (5) and (8) we observe that the multi-subject alignment problem reduces to a sequence of pairwise registrations, each of which registers one subject to an average of connectivity matrices. If we use (5), each round of pairwise registrations can be done in parallel and the results used to update the average template. The difficulty is the computational update of $\\bar{C}_{\\mathrm{ref}}$. Alternatively, using (8) we do the pairwise registrations sequentially and compute a new leave-one-out template after each registration. This is the approach we pursue. An algorithm for solving the pairwise registration is derived in the next section and we examine the computation of leave-one-out templates in \u00a74.\n\n3 Pairwise Cortical Alignment\n\nWe now develop an algorithm for aligning one subject, with connectivity $C_F$, to a reference, with connectivity $C_R$, with $C_F, C_R \\in \\mathbb{R}^{N_v \\times N_v}$. For concreteness, from this point forward we let $\\sigma(f_1, f_2) = \\mathrm{corr}(f_1, f_2)$ and assume that the time series have zero mean and unit norm.\nA function $g : M_R \\to S_F$ maps a reference mesh point $p_i \\in M_R$ to $g(p_i) \\in S_F$. By interpolating the floating subject\u2019s time series at the points $g(p_i) \\in S_F$ we obtain the associated warped functional connectivity: $\\tilde{C}_F = [\\sigma(f^F_{g(p_i)}, f^F_{g(p_j)})]$. We seek $\\hat{g}$ that best matches $\\tilde{C}_F$ to $C_R$ in the sense:\n\n$$\\hat{g} = \\arg\\min_g \\; \\|\\tilde{C}_F - C_R\\|_f^2 + \\lambda \\mathrm{Reg}(g) \\qquad (9)$$\n\nHere $\\|\\cdot\\|_f$ is the matrix Frobenius norm and the regularization term $\\mathrm{Reg}(g)$ serves as a prior over the space of allowable mappings. In the following steps, we examine how to efficiently solve (9).\nStep 1: Parameterizing the dependence of $\\tilde{C}_F$ on the warp. We first develop the dependence of the matrix $\\tilde{C}_F$ on the warping function $g$. This requires specifying how the time series at the warped points $g(p_i) \\in S_F$ is interpolated using the time series data $\\{f^F_i \\in \\mathbb{R}^{N_t}, i = 1, \\ldots, N_v\\}$ at the mesh points $\\{p^F_i \\in M_F, i = 1, \\ldots, N_v\\}$. Here, we employ linear interpolation with a spherical kernel $\\Phi$: $f^F(p) = \\sum_{i=1}^{N_v} f^F_i \\Phi(p, p_i)$, $p \\in S_F$. The kernel should be matched to the following specific objectives: (a) The kernel should be monomodal. Since the gradient of the registration objective depends on the derivative of the interpolation kernel, this will reduce the likelihood of the algorithm converging to a local minimum; (b) The support of the kernel should be finite. This will limit interpolation complexity. However, as the size of the support decreases, so will the capture range of the algorithm. At the initial stages of the algorithm, the kernel should have a broad extent, due to higher initial uncertainty, and become increasingly more localized as the algorithm converges. Thus, (c) The support of the kernel should be easily adjustable.\nWith these considerations in mind, we select $\\Phi(p, p_i)$ to be a spherical radial basis function $\\Phi_i : S^2 \\to \\mathbb{R}$ centered at $p_i \\in S^2$ and taking the form: $\\Phi_i(p) = \\varphi(d(p, p_i))$, $p \\in S^2$, where $\\varphi : [0, \\pi] \\to \\mathbb{R}$ and $d(p, p_i)$ is the spherical geodesic distance between $p$ and $p_i$ [16]. Then $\\Phi_i(p)$ is monomodal with a maximum at $p_i$, it depends only on the distance between $p$ and $p_i$ and is radially symmetric. In detail, we employ the particular spherical radial basis function:\n\n$$\\Phi_i(p) = \\varphi(d(p, p_i)) = \\left(1 - (2/r) \\sin(d(p, p_i)/2)\\right)^4_+ \\left((8/r) \\sin(d(p, p_i)/2) + 1\\right) \\qquad (10)$$\n\nwhere $r$ is a fixed parameter, and $(a)_+ = a \\mathbf{1}\\{a \\geq 0\\}$. $\\Phi_i(p)$ has two continuous derivatives and its support is $\\{p \\in S^2 : d(p, p_i) < 2 \\sin^{-1}(r/2)\\}$. Note that the support can be easily adjusted through the parameter $r$. So the kernel has all of our desired properties.\nWe can now make the dependence of $\\tilde{C}_F$ on $g$ more explicit. Let $T_F = [f^F_1, f^F_2, \\cdots, f^F_{N_v}]$. Then $\\tilde{T}_F = [f^F(g(p_1)) \\;\\; f^F(g(p_2)) \\;\\; \\cdots \\;\\; f^F(g(p_{N_v}))] = T_F A$ where $A = [\\Phi_i(g(p_j))]$ is the $N_v \\times N_v$ matrix of interpolation coefficients dependent on $g$ and the interpolation kernel. Next, noting that $C_F = T_F^T T_F$, we use $A$ to write the post-warp correlation matrix as:\n\n$$\\tilde{C}_F = D A^T C_F A D \\qquad (11)$$\n\nwhere $D = \\mathrm{diag}(d_1, d_2, \\cdots, d_{N_v})$ serves to normalize the updated data to unit norm: $d_j = \\|f^F(g(p_j))\\|^{-1}$. Finally, we use $\\tilde{A} = AD$ to write:\n\n$$\\|\\tilde{C}_F - C_R\\|_f^2 = \\|\\tilde{A}^T C_F \\tilde{A} - C_R\\|_f^2 \\qquad (12)$$\n\nHere, (12) encodes the dependence of the registration objective on $g$ through the matrix $\\tilde{A}$. It is also important to note that since the interpolation kernel is locally supported, $\\tilde{A}$ is a sparse matrix.\nStep 2: Efficient Representation/Computation of the Registration Objective. We now consider the $N_v \\times N_v$ matrices $C_F$ and $C_R$. At a spatial resolution of 2 mm, the spherical model of human cortex can yield $N_v \\approx 72{,}000$ total mesh points. In this situation, direct computation with $C_F$ and $C_R$ is prohibitive. Hence we need an efficient way to represent and compute the objective (12).\nFor fMRI data it is reasonable to assume that $N_t \\ll N_v$. Hence, since the data has been centered, the rank of $C_F = T_F^T T_F$ and of $C_R = T_R^T T_R$ is at most $N_t - 1$. For simplicity, we make the reasonable assumption that $\\mathrm{rank}(T_F) = \\mathrm{rank}(T_R) = d$. Then $C_F$ and $C_R$ can be efficiently represented by compact $d$-dimensional SVDs $C_F = V_F \\Sigma_F V_F^T$ and $C_R = V_R \\Sigma_R V_R^T$. Moreover, these can be computed directly from SVDs of the data matrices: $T_F = U_{T_F} \\Sigma_{T_F} V_{T_F}^T$ and $T_R = U_{T_R} \\Sigma_{T_R} V_{T_R}^T$. In detail: $V_F = V_{T_F}$, $V_R = V_{T_R}$, $\\Sigma_F = \\Sigma_{T_F}^T \\Sigma_{T_F}$, and $\\Sigma_R = \\Sigma_{T_R}^T \\Sigma_{T_R}$.\nThe above representation avoids computing $C_F$ and $C_R$, but we must also show that it enables efficient evaluation of (12). To this end, introduce the following linear transformation:\n\n$$B = W_F^T \\tilde{A} W_R \\qquad (13)$$\n\nwhere $W_F = [V_F \\;\\; V_F^\\perp]$, $W_R = [V_R \\;\\; V_R^\\perp]$, are orthogonal with the $N_v - d$ columns of $V_F^\\perp$ and $V_R^\\perp$ forming orthonormal bases for $\\mathrm{range}(V_F)^\\perp$ and $\\mathrm{range}(V_R)^\\perp$, respectively. Write $B$ as:\n\n$$B = \\begin{bmatrix} B_1 & B_2 \\\\ B_3 & B_4 \\end{bmatrix} \\qquad (14)$$\n\nwith $B_1 \\in \\mathbb{R}^{d \\times d}$, $B_2 \\in \\mathbb{R}^{d \\times (N_v - d)}$, $B_3 \\in \\mathbb{R}^{(N_v - d) \\times d}$ and $B_4 \\in \\mathbb{R}^{(N_v - d) \\times (N_v - d)}$. Substituting (13) and (14) into (12) and simplifying yields:\n\n$$\\|\\tilde{C}_F - C_R\\|_f^2 = \\|B_1^T \\Sigma_F B_1 - \\Sigma_R\\|_f^2 + 2 \\|B_1^T \\Sigma_F B_2\\|_f^2 + \\|B_2^T \\Sigma_F B_2\\|_f^2 \\qquad (15)$$\n\nwith\n\n$$B_1 = V_F^T \\tilde{A} V_R \\quad \\text{and} \\quad B_2 = V_F^T \\tilde{A} V_R^\\perp \\qquad (16)$$\n\nThe $d \\times d$ matrix $B_1$ is readily computed since $V_F$, $V_R$ are of manageable size. Computation of the $d \\times (N_v - d)$ matrix $B_2$ depends on $V_R^\\perp$. This has orthonormal (ON) columns spanning the $N_v - d$ dimensional subspace $\\mathrm{null}(C_R)$. Since there is residual freedom in the choice of $V_R^\\perp$ and $B_2$ is large, its selection merits closer examination. Now (16) can be viewed as a projection of the rows of $V_F^T \\tilde{A}$ onto the columns of $V_R$ and $V_R^\\perp$. The columns of $\\tilde{A}^T V_F - V_R B_1^T$ lie in $\\mathrm{null}(C_R)$ and $B_2^T = (V_R^\\perp)^T (\\tilde{A}^T V_F - V_R B_1^T)$. Hence a QR-factorization $QR = \\tilde{A}^T V_F - V_R B_1^T$ yields $d$ ON vectors in $\\mathrm{null}(C_R)$. Choosing these as the first $d$ columns of $V_R^\\perp$ yields $B_2 = [R^T, 0]$, i.e., $B_2$ is very sparse.\nIn summary, we have derived the following efficient means of evaluating the objective. By one-time preprocessing of the time series data we obtain $\\Sigma_F$, $\\Sigma_R$ and $V_F$, $V_R$. Then given a warp $g$, we compute: the interpolation matrix $\\tilde{A}$, $B_1 = V_F^T \\tilde{A} V_R$, and finally $B_2$ via QR factorization of $\\tilde{A}^T V_F - V_R B_1^T$. Then we evaluate (15).\nStep 3: The Transformation Space and Regularization. We now examine the specification of $g$ in greater detail. We allow each mesh point to move freely (locally) in two directions. The use of such nonlinear warp models for inter-subject cortical alignment has been validated over, for example, rigid-body transformations [17]. To specify $g$, we first need to set up a coordinate system on the sphere. Let $U = \\{(\\phi, \\theta);\\ 0 < \\phi < \\pi,\\ 0 < \\theta < 2\\pi\\}$. Then the sphere can be parameterized by $x : U \\to \\mathbb{R}^3$ with $x(\\phi, \\theta) = (\\sin\\phi \\cos\\theta, \\sin\\phi \\sin\\theta, \\cos\\phi)$. Here, $\\phi$ is a zenith angle measured against one of the principal axes, and $\\theta$ is an azimuthal angle measured in one of the projective planes (i.e., xy-plane, xz-plane, or yz-plane). Note that $x$ omits a semicircle of $S^2$; so at least two such parameterizations are required to cover the entire sphere [18].\nConsider $p_i \\in S^2$ parameterized by $x(\\phi, \\theta)$ such that $p_i = x(\\phi_i, \\theta_i)$. Then the warp field at $p_i$ is:\n\n$$g(p_i) = x(\\phi_i + \\Delta\\phi_i, \\theta_i + \\Delta\\theta_i) = x(\\tilde{\\phi}_i, \\tilde{\\theta}_i) \\qquad (17)$$\n\nfor displacements $\\Delta\\phi_i$ and $\\Delta\\theta_i$. The warp g is thus parameterized by: { \u02dc\u03c6i, \u02dc\u03b8i, i = 1, . . . 
, Nv}.\nThe warp $g$ must be regularized to avoid undesired topological distortions (e.g. folding and excessive expansion) and to avoid over-fitting the data. This is achieved by adding a regularization term to the objective that penalizes such distortions. There are several ways this can be done. Here we follow [14] and regularize $g$ by penalizing both metric and areal distortion. The metric distortion term penalizes warps that disrupt local distances between neighboring mesh nodes. This has the effect of limiting the expansion/contraction of cortex. The areal distortion term seeks to preserve a consistent orientation of the surface. Given a triangularization of the spherical mesh, each triangle is given an oriented normal vector that initially points radially outward from the sphere. Constraining the oriented area of all triangles to be positive prevents folds in the surface [14].\nStep 4: Optimization of the objective. We optimize (9) over $g$ by gradient descent. Denote the objective by $S(g)$, let $\\tilde{a}_{ij} = a_{ij} d_j$ be the $(i, j)$-th entry of $\\tilde{A} = AD$ and $a(p) = [\\Phi_1(p) \\;\\; \\Phi_2(p) \\;\\; \\cdots \\;\\; \\Phi_{N_v}(p)]^T$. From the parameterization of the warp (17), we see that $\\tilde{a}_{ij} = \\Phi_i(x(\\tilde{\\phi}_j, \\tilde{\\theta}_j)) \\, \\|T_F \\, a(x(\\tilde{\\phi}_j, \\tilde{\\theta}_j))\\|^{-1}$ depends only on the warp parameters of the $j$th mesh node, $\\tilde{\\phi}_j$ and $\\tilde{\\theta}_j$. Then, by the chain rule, the partial derivative of $S(g)$ with respect to $\\tilde{\\phi}_j$ is given by:\n\n$$\\frac{\\partial S(g)}{\\partial \\tilde{\\phi}_j} = \\sum_{i=1}^{N_v} \\frac{\\partial \\|\\tilde{C}_F - C_R\\|_f^2}{\\partial \\tilde{a}_{ij}} \\frac{\\partial \\tilde{a}_{ij}}{\\partial \\tilde{\\phi}_j} + \\lambda \\frac{\\partial \\mathrm{Reg}(g)}{\\partial \\tilde{\\phi}_j} \\qquad (18)$$\n\nA similar expression is obtained for the partial derivative with respect to $\\tilde{\\theta}_j$. Since the interpolation kernel is supported locally, the summation in (18) is taken over a small number of terms. A full expression for $\\partial S / \\partial \\tilde{\\phi}_j$ is given in the supplemental, and that of $\\partial \\mathrm{Reg}(g) / \\partial \\tilde{\\phi}_j$ in [14].\n\nAlgorithm 1 Pairwise algorithm\n1: Given: SVD of floating dataset $\\Sigma_F$, $V_F$ and reference dataset $\\Sigma_R$, $V_R$\n2: Given: Initial warp estimate $g^{(0)}$\n3: Given: Sequence $r_1 > r_2 > \\cdots > r_M$ of spatial resolutions\n4: for $m = 1$ to $M$ do\n5:   Set the kernel $\\Phi_i$ in (10), with $r = r_m$\n6:   Smooth the reference to resolution $r_m$\n7:   Solve for $\\hat{g}$ in (9) by gradient descent with initial condition $g^{(m-1)}$\n8:   Set $g^{(m)} = \\hat{g}$\n9: end for\n10: Output result: $g^{(M)}$\n\nAlgorithm 2 Multi-subject algorithm\n1: Given: SVD of datasets, $\\{\\Sigma_k, V_k\\}_{k=1}^{N_s}$\n2: Initialize $g_k^{(0)}$ to identity, $k = 1, \\ldots, N_s$\n3: for $t = 1$ to $T$ do\n4:   for $k = 1$ to $N_s$ do\n5:     Construct $\\bar{C}_k$ as explained in \u00a74\n6:     Align $C_k$ to $\\bar{C}_k$ by Algorithm 1 with initial condition $g_k^{(t-1)}$\n7:     Set $g_k^{(t)}$ to the output of the alignment\n8:     Use $g_k^{(t)}$ to update $\\Sigma_k$, $V_k$\n9:   end for\n10: end for\n11: Output result: $g = \\{g_1^{(T)}, \\ldots, g_{N_s}^{(T)}\\}$\n\nFigure 1: The registration algorithms.\n\nTo help avoid local minima we take a multi-resolution optimization approach [19]. The registration is run on a sequence of spatial resolutions $r_1 > r_2 > \\cdots > r_M$, with $r_M$ given by the original resolution of the data. The result at resolution $r_m$ is used to initialize the alignment at resolution $r_{m+1}$. The alignment for $r_m$ is performed by matching the kernel parameter $r$ in (10) to $r_m$. Note that the reference dataset is also spatially smoothed at each $r_m$ by the transformation in (11), with $A = [a(p_1) \\;\\; a(p_2) \\;\\; \\cdots \\;\\; a(p_{N_v})]$. 
The pairwise algorithm is summarized as Algorithm 1 in Figure 1.\n\n4 Multi-Subject Alignment: Computing Leave-one-out Templates\n\nWe now return to the multi-subject alignment problem, which is summarized as Algorithm 2 in Figure 1. It only remains to discuss efficient computation of the leave-one-out-template (7). Since $\\bar{C}_k$ is an average of $N_s - 1$ positive semi-definite matrices each of rank $d$, the rank $\\bar{d}$ of $\\bar{C}_k$ is bounded as follows: $d \\leq \\bar{d} \\leq (N_s - 1) d$. Assume that $\\tilde{C}_n$, the connectivity matrix of subject $n$ after warp $g_n$ (see (11)), has an efficient $d \\ll N_v$ dimensional SVD representation $\\tilde{C}_n = \\tilde{V}_n \\tilde{\\Sigma}_n \\tilde{V}_n^T$.\nTo compute the SVD for $\\bar{C}_k$, we exploit the sequential nature of the multi-subject alignment algorithm by refining the SVD of the leave-one-out template for subject $k - 1$, $\\bar{C}_{k-1} = \\bar{V}_{k-1} \\bar{\\Sigma}_{k-1} \\bar{V}_{k-1}^T$, computed in the previous iteration. This is achieved by expressing $\\bar{C}_k$ in terms of $\\bar{C}_{k-1}$:\n\n$$\\bar{C}_k = \\bar{C}_{k-1} + \\frac{1}{N_s - 1} (\\tilde{C}_{k-1} - \\tilde{C}_k) \\qquad (19)$$\n\nand computing matrix decompositions for the singular vectors of $\\tilde{C}_{k-1}$ and $\\tilde{C}_k$ in terms of $\\bar{V}_{k-1}$:\n\n$$\\tilde{V}_{k-1} = \\bar{V}_{k-1} P_{k-1} + Q_{k-1} R_{k-1} \\qquad (20a)$$\n\n$$\\tilde{V}_k = \\bar{V}_{k-1} P_k \\qquad (20b)$$\n\nwhere $P_j = \\bar{V}_{k-1}^T \\tilde{V}_j \\in \\mathbb{R}^{\\bar{d} \\times d}$, for $j = k - 1, k$, projects the columns of $\\tilde{V}_j$ onto the columns of $\\bar{V}_{k-1}$. The second term of (20a), $Q_{k-1} R_{k-1}$, is the QR-decomposition of the residual components of $\\tilde{V}_{k-1}$ after projection onto $\\mathrm{range}(\\bar{V}_{k-1})$. Since $\\bar{C}_{k-1}$ is an average of positive semi-definite matrices that includes $\\tilde{C}_k$, we are sure that $\\mathrm{range}(\\tilde{V}_k) \\subseteq \\mathrm{range}(\\bar{V}_{k-1})$ (supplementary material).\nUsing the matrix decompositions (20a) and (20b), $\\bar{C}_k$ in (19) above can be expressed as:\n\n$$\\bar{C}_k = \\begin{bmatrix} \\bar{V}_{k-1} & Q_{k-1} \\end{bmatrix} G \\begin{bmatrix} \\bar{V}_{k-1} & Q_{k-1} \\end{bmatrix}^T \\qquad (21)$$\n\nwhere $G$ is the symmetric $(\\bar{d} + d) \\times (\\bar{d} + d)$ matrix:\n\n$$G = \\begin{bmatrix} \\bar{\\Sigma}_{k-1} & 0 \\\\ 0 & 0 \\end{bmatrix} + \\frac{1}{N_s - 1} \\left( \\begin{bmatrix} P_{k-1} \\tilde{\\Sigma}_{k-1} P_{k-1}^T & P_{k-1} \\tilde{\\Sigma}_{k-1} R_{k-1}^T \\\\ R_{k-1} \\tilde{\\Sigma}_{k-1} P_{k-1}^T & R_{k-1} \\tilde{\\Sigma}_{k-1} R_{k-1}^T \\end{bmatrix} - \\begin{bmatrix} P_k \\tilde{\\Sigma}_k P_k^T & 0 \\\\ 0 & 0 \\end{bmatrix} \\right) \\qquad (22)$$\n\nWe now compute the SVD of $G = V_G \\Sigma_G V_G^T$. Then, using (21), we obtain the SVD for $\\bar{C}_k$ as:\n\n$$\\bar{V}_k = \\begin{bmatrix} \\bar{V}_{k-1} & Q_{k-1} \\end{bmatrix} V_G \\quad \\text{and} \\quad \\bar{\\Sigma}_k = \\Sigma_G \\qquad (23)$$\n\nFor a moderate number of subjects, $(\\bar{d} + d) \\leq N_s d \\ll N_v$, this approach is more efficient than a brute-force $O(N_v^3)$ SVD. Additionally, it works directly on the singular values $\\tilde{\\Sigma}_k$ and vectors $\\tilde{V}_k$ of each warped connectivity matrix $\\tilde{C}_k$, alleviating the need to store large $N_v \\times N_v$ matrices.\n\n5 Experimental Results\n\nWe tested the algorithm using fMRI data collected from 10 subjects viewing a movie split into 2 sessions separated by a short break. The data was preprocessed following [5]. For each subject, a structural scan was acquired before each session, from which the cortical surface model was derived (\u00a72) and then anatomically aligned to a template using FreeSurfer (Fischl, http://surfer.nmr.mgh.harvard.edu). 
Similar to [5], we \ufb01nd that anatomical alignment based on cor-\ntical curvature serves as a superior starting point for functional alignment over Talairach alignment.\nFirst, functional connectivity was found for each subject and session: Ck,i, k = 1, . . . , Ns, i = 1, 2.\nThese were then aligned within subjects, Ck,1 \u2194 Ck,2, and across subjects, Ck,1 \u2194 Cj,2, using Al-\ngorithm 1. Since the data starts in anatomical correspondence, we expect small warp displacements\nwithin subject and larger ones across subjects. The mean intra-subject warp displacement was 0.72\nmm (\u03c3 = 0.48), with 77% of the mesh nodes warped less than 1 mm and fewer than 1.5% warped by\nmore than the data spatial resolution (2 mm). In contrast, the mean inter-subject warp displacement\nwas 1.46 mm (\u03c3 = 0.92 mm), with 22% of nodes warped more than 2 mm. See Figures 2(a)-(b).\nIn a separate analysis, each subject was aligned to its leave-one-out template on each session using\nAlgorithm 1, yielding a set of warps gk,i(pj), k = 1, . . . , Ns, i = 1, 2, j = 1, . . . , Nv. To evaluate\nthe consistency of the correspondence derived from different sessions, we compared the warps gk,1\nto gk,2 for each subject k. Here, we only consider nodes that are warped by at least the data resolu-\ntion. This analysis provides a measure of the sensitivity to noise present in the fMRI data. At node\npj, we compute the angle 0 \u2264 \u03b8 \u2264 \u03c0 between the warp tangent vectors of gk,1(pj) and gk,2(pj).\nThis measures the consistency of the direction of the warp across sessions: smaller values of \u03b8 sug-\ngest a greater warp coherence across sessions. Figure 2(c) shows a histogram of \u03b8 averaged across\nthe cortical nodes of all 10 subjects. The tight distribution centered near \u03b8 = 0 suggests signi\ufb01cant\nconsistency in the warp direction across sessions. 
In particular, 93% of the density for \u03b8 lies inside\n\u03c0/2, 81% inside \u03c0/4, and 58% inside \u03c0/8. As a secondary comparison, we compute a normalized\nconsistency measure WNC(pj) = d(gk,1(pj), gk,2(pj))/(d(gk,1(pj), pj) + d(gk,2(pj), pj)), where\nd(\u00b7,\u00b7) is spherical geodesic distance. The measure takes variability in both warp angle and magni-\ntude into account; it is bounded between 0 and 1, and WNC(pj) = 0 only if gk,1(pj) = gk,2(pj). A\nhistogram for WNC is given in 2(d); WNC exhibits a peak at 0.15, with a mean of 0.28 (\u03c3 = 0.22).\nFinally, Algorithm 2 was applied to the \ufb01rst session fMRI data to learn a set of warps g =\n(g1, . . . , gNs) for 10 subjects. The alignment required approximately 10 hours on a Intel 3.8GHz\nNehalem quad-core processor with 12GB RAM. To evaluate the alignment, we apply the warps to\nthe held out second session fMRI data, where subjects viewed a different segment of the movie. This\nwarping yields data {f k\ngk(pi)} for each subject k, with interpolation performed in the original vol-\nume to avoid arti\ufb01cial smoothing. The cross-validated inter-subject correlation ISC(pi) is the mean\n\n7\n\n\f(a)\n\n(b)\n\n(c)\n\n(d)\n\nFigure 2: Consistency Histograms. 
(a) Intra-subject warp distances; (b) Inter-subject warp distances; (c) Angle\nbetween warp vectors across sessions; (d) Across-session normalized warp consistency measure W_NC.\n\nFigure 3: Map of ISC on right cortical hemisphere, alignment: anatomical (top), functional (bottom).\nPanels: (a), (d) Lateral View; (b), (e) Medial View; (c), (f) Ventral View.\n\ncorrelation of each subject\u2019s functional time series with the mean time series of the other subjects:\n\nISC(p_i) = (1/N_s) \u2211_{k=1}^{N_s} corr(f^k_{g_k(p_i)}, \u2211_{n\u2260k} f^n_{g_n(p_i)}),  p_i \u2208 M_ref.    (24)\n\nWe also compute the mean inter-subject correlation, ISC = (1/N_v) \u2211_{i=1}^{N_v} ISC(p_i).\n\nWe compare the cross-validated ISC map with the ISC map of the second session movie viewing\ncomputed under anatomical correspondence. Mean ISC improved by 18%, from 0.072 to 0.085. In\naddition, the number of significant inter-subject correlations (ISC(p_i) > 0.1, P < 0.01) increased\nby 22.9%, from 19,362 to 23,789. Figure 3 shows the ISC maps computed under anatomical\nalignment and functional alignment on the inflated right cortical hemisphere. As expected, the areas\nof improvement in inter-subject correlation are consistent with the extrinsic regions of cortex [6].\n\n6 Conclusion\n\nWe have proposed a novel cortical registration algorithm that produces a functional correspondence\nacross a set of subjects. The algorithm uses the fMRI data directly to align the spatial patterns of\nfunctional response elicited by a movie viewing. Despite the high-dimensionality of the data under\nconsideration, the algorithm is efficient in both space and time complexity.\nBy comparing the inter-subject alignments derived from different fMRI experimental sessions, we\nshow that the correspondence is consistent and robust to noise and variability in the fMRI temporal\nresponse. 
We also cross-validate the correspondence on independent test data that was not used\nto derive the alignment. On the test data, the algorithm produces a consistent increase in\ninter-subject correlation of fMRI time series, suggesting that functional alignment of extrinsic regions of\ncortex that are directly driven by the movie viewing experiment, such as visual and auditory areas,\nis improved considerably. Further testing is warranted to evaluate improvement in intrinsic areas of\ncortex whose response is not temporally synchronized with the experimental stimulus.\n\n[Figure 2 axes: (a), (b) warp distance (mm) vs. frequency; (c) angle between tangent vectors (radians/\u03c0) vs. frequency; (d) normalized consistency vs. frequency.]\n\nReferences\n\n[1] J. Talairach and P. Tournoux. Co-planar Stereotaxic Atlas of the Human Brain. Thieme Publishing Group, 1988.\n\n[2] B. Fischl, R.B.H. Tootell, and A.M. Dale. High-resolution intersubject averaging and a coordinate system for the cortical surface. Human Brain Mapping, 8:272\u2013284, 1999.\n\n[3] J.D.G. Watson, R. Myers, R.S.F. Frackowiak, J.V. Hajnal, R.P. Woods, J.C. Mazziotta, S. Shipp, and S. Zeki. Area V5 of the human brain: evidence from a combined study using positron emission tomography and magnetic resonance imaging. Cerebral Cortex, 3:79\u201394, 1993.\n\n[4] J. Rademacher, V.S. Caviness, H. Steinmetz, and A.M. Galaburda. Topographical variation of the human primary cortices: implications for neuroimaging, brain mapping and neurobiology. Cerebral Cortex, 3:313\u2013329, 1993.\n\n[5] M.R. Sabuncu, B.D. Singer, B. Conroy, R.E. Bryan, P.J. Ramadge, and J.V. Haxby. Function-based inter-subject alignment of human cortical anatomy. Cerebral Cortex Advance Access published on May 6, 2009, DOI 10.1093/cercor/bhp085.\n\n[6] U. Hasson, Y. Nir, G. Fuhrmann, and R. 
Malach. Intersubject synchronization of cortical activity during natural vision. Science, 303:1634\u20131640, 2004.\n\n[7] Y. Golland, S. Bentin, H. Gelbard, Y. Benjamini, R. Heller, Y. Nir, U. Hasson, and R. Malach. Extrinsic and intrinsic systems in the posterior cortex of the human brain revealed during natural sensory stimulation. Cerebral Cortex, 17:766\u2013777, 2007.\n\n[8] M.E. Raichle, A.M. MacLeod, A.Z. Snyder, W.J. Powers, D.A. Gusnard, and G.L. Shulman. A default mode of brain function. PNAS, 98:676\u2013682, 2001.\n\n[9] K.J. Friston. Functional and effective connectivity in neuroimaging. Human Brain Mapping, 2:56\u201378, 1994.\n\n[10] M.D. Greicius, B. Krasnow, A.L. Reiss, and V. Menon. Functional connectivity in the resting brain: A network analysis of the default mode hypothesis. PNAS, 100:253\u2013258, 2003.\n\n[11] J.L. Vincent, A.Z. Snyder, M.D. Fox, B.J. Shannon, J.R. Andrews, M.E. Raichle, and R.L. Buckner. Coherent spontaneous activity identifies a hippocampal-parietal memory network. J. Neurophysiol, 96:3517\u20133531, 2006.\n\n[12] D.C. Van Essen, H.A. Drury, J. Dickson, J. Harwell, D. Hanlon, and C.H. Anderson. An integrated software suite for surface-based analyses of cerebral cortex. J. Am. Med. Inform. Assoc., 8:443\u2013459, 2001.\n\n[13] A.M. Dale, B. Fischl, and M.I. Sereno. Cortical surface-based analysis. I. Segmentation and surface reconstruction. NeuroImage, 9:179\u2013194, 1999.\n\n[14] B. Fischl, M.I. Sereno, and A.M. Dale. Cortical surface-based analysis. II. Inflation, flattening, and a surface-based coordinate system. NeuroImage, 9:195\u2013207, 1999.\n\n[15] G.J. McLachlan and T. Krishnan. The EM Algorithm and Extensions. Wiley, 1997.\n\n[16] G.E. Fasshauer and L.L. Schumaker. Scattered data fitting on the sphere. Proceedings of the international conference on mathematical methods for curves and surfaces II, pages 117\u2013166, 1998.\n\n[17] B.A. Ardekani, A.H. 
Bachman, S.C. Strother, Y. Fujibayashi, and Y. Yonekura. Impact of inter-subject image registration on group analysis of fMRI data. International Congress Series, 1265:49\u201359, 2004.\n\n[18] M. Do Carmo. Differential Geometry of Curves and Surfaces. Prentice Hall, 1976.\n\n[19] R. Bajcsy and S. Kovacic. Multiresolution elastic matching. Computer Vision, Graphics, and Image Processing, 46:1\u201321, 1989.", "award": [], "sourceid": 673, "authors": [{"given_name": "Bryan", "family_name": "Conroy", "institution": null}, {"given_name": "Ben", "family_name": "Singer", "institution": null}, {"given_name": "James", "family_name": "Haxby", "institution": null}, {"given_name": "Peter", "family_name": "Ramadge", "institution": null}]}