{"title": "Clustering Signed Networks with the Geometric Mean of Laplacians", "book": "Advances in Neural Information Processing Systems", "page_first": 4421, "page_last": 4429, "abstract": "Signed networks allow to model positive and negative relationships. We analyze existing extensions of spectral clustering to signed networks. It turns out that existing approaches do not recover the ground truth clustering in several situations where either the positive or the negative network structures contain no noise. Our analysis shows that these problems arise as existing approaches take some form of arithmetic mean of the Laplacians of the positive and negative part. As a solution we propose to use the geometric mean of the Laplacians of positive and negative part and show that it outperforms the existing approaches. While the geometric mean of matrices is computationally expensive, we show that eigenvectors of the geometric mean can be computed efficiently, leading to a numerical scheme for sparse matrices which is of independent interest.", "full_text": "Clustering Signed Networks with the\n\nGeometric Mean of Laplacians\n\nPedro Mercado1, Francesco Tudisco2 and Matthias Hein1\n\n1Saarland University, Saarbr\u00fccken, Germany\n\n2University of Padua, Padua, Italy\n\nAbstract\n\nSigned networks allow to model positive and negative relationships. We analyze\nexisting extensions of spectral clustering to signed networks. It turns out that\nexisting approaches do not recover the ground truth clustering in several situations\nwhere either the positive or the negative network structures contain no noise. Our\nanalysis shows that these problems arise as existing approaches take some form of\narithmetic mean of the Laplacians of the positive and negative part. As a solution\nwe propose to use the geometric mean of the Laplacians of positive and negative\npart and show that it outperforms the existing approaches. 
While the geometric\nmean of matrices is computationally expensive, we show that eigenvectors of the\ngeometric mean can be computed ef\ufb01ciently, leading to a numerical scheme for\nsparse matrices which is of independent interest.\n\n1\n\nIntroduction\n\nA signed graph is a graph with positive and negative edge weights. Typically positive edges model\nattractive relationships between objects such as similarity or friendship and negative edges model\nrepelling relationships such as dissimilarity or enmity. The concept of balanced signed networks\ncan be traced back to [10, 3]. Later, in [5], a signed graph is de\ufb01ned as k-balanced if there exists\na partition into k groups where only positive edges are within the groups and negative edges are\nbetween the groups. Several approaches to \ufb01nd communities in signed graphs have been proposed\n(see [23] for an overview). In this paper we focus on extensions of spectral clustering to signed\ngraphs. Spectral clustering is a well established method for unsigned graphs which, based on the\n\ufb01rst eigenvectors of the graph Laplacian, embeds nodes of the graphs in Rk and then uses k-means\nto \ufb01nd the partition. In [16] the idea is transferred to signed graphs. They de\ufb01ne the signed ratio\nand normalized cut functions and show that the spectrum of suitable signed graph Laplacians yield a\nrelaxation of those objectives. In [4] other objective functions for signed graphs are introduced. They\nshow that a relaxation of their objectives is equivalent to weighted kernel k-means by choosing an\nappropriate kernel. While they have a scalable method for clustering, they report that they can not\n\ufb01nd any cluster structure in real world signed networks.\nWe show that the existing extensions of the graph Laplacian to signed graphs used for spectral\nclustering have severe de\ufb01ciencies. 
Our analysis of the stochastic block model for signed graphs\nshows that, even for the perfectly balanced case, recovery of the ground-truth clusters is not guaranteed.\nThe reason is that the eigenvectors encoding the cluster structure do not necessarily correspond to\nthe smallest eigenvalues, thus leading to a noisy embedding of the data points and in turn failure\nof k-means to recover the cluster structure. The implicit mathematical reason is that all existing\nextensions of the graph Laplacian are based on some form of arithmetic mean of operators of the\npositive and negative graphs. In this paper we suggest as a solution to use the geometric mean of\nthe Laplacians of positive and negative part. In particular, we show that in the stochastic block\nmodel the geometric mean Laplacian allows in expectation to recover the ground-truth clusters in\n\n30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.\n\n\fany reasonable clustering setting. A main challenge for our approach is that the geometric mean\nLaplacian is computationally expensive and does not scale to large sparse networks. Thus a main\ncontribution of this paper is showing that the \ufb01rst few eigenvectors of the geometric mean can still be\ncomputed ef\ufb01ciently. Our algorithm is based on the inverse power method and the extended Krylov\nsubspace technique introduced by [8] and allows to compute eigenvectors of the geometric mean\nA#B of two matrices A, B without ever computing A#B itself.\nIn Section 2 we discuss existing work on Laplacians on signed graphs. In Section 3 we discuss the\ngeometric mean of two matrices and introduce the geometric mean Laplacian which is the basis of our\nspectral clustering method for signed graphs. In Section 4 we analyze our and existing approaches for\nthe stochastic block model. 
In Section 5 we introduce our ef\ufb01cient algorithm to compute eigenvectors\nof the geometric mean of two matrices, and \ufb01nally in Section 6 we discuss performance of our\napproach on real world graphs. Proofs have been moved to the supplementary material.\n\n2 Signed graph clustering\n\nNetworks encoding positive and negative relations among the nodes can be represented by weighted\nsigned graphs. Consider two symmetric non-negative weight matrices W + and W \u2212, a vertex set\nV = {v1, . . . , vn}, and let G+ = (V, W +) and G\u2212 = (V, W \u2212) be the induced graphs. A signed\ngraph is the pair G\u00b1 = (G+, G\u2212) where G+ and G\u2212 encode positive and the negative relations,\nrespectively.\nThe concept of community in signed networks is typically related to the theory of social balance.\nThis theory, as presented in [10, 3], is based on the analysis of affective ties, where positive ties are a\nsource of balance whereas negative ties are considered as a source of imbalance in social groups.\nDe\ufb01nition 1 ([5], k-balance). A signed graph is k-balanced if the set of vertices can be partitioned\ninto k sets such that within the subsets there are only positive edges, and between them only negative.\nThe presence of k-balance in G\u00b1 implies the presence of k groups of nodes being both assortative\nin G+ and dissassortative in G\u2212. However this situation is fairly rare in real world networks and\nexpecting communities in signed networks to be a perfectly balanced set of nodes is unrealistic.\nIn the next section we will show that Laplacians inspired by De\ufb01nition 1 are based on some form of\narithmetic mean of Laplacians. As an alternative we propose the geometric mean of Laplacians and\nshow that it is able to recover communities when either G+ is assortative, or G\u2212 is disassortative, or\nboth. 
Results of this paper will make clear that the use of the geometric mean of Laplacians allows us to recognize communities where previous approaches fail.

2.1 Laplacians on Unsigned Graphs

Spectral clustering of undirected, unsigned graphs using the Laplacian matrix is a well-established technique (see [19] for an overview). Given an unsigned graph G = (V, W), the Laplacian and its normalized version are defined as

L = D − W,    Lsym = D^{−1/2} L D^{−1/2}    (1)

where D_ii = Σ^n_{j=1} w_ij is the diagonal matrix of the degrees of G. Both Laplacians are positive semidefinite, and the multiplicity k of the eigenvalue 0 is equal to the number of connected components in the graph. Further, the Laplacian is suitable in assortative cases [19], i.e. for the identification of clusters under the assumption that the amount of edges inside clusters is larger than the amount of edges between them.

For disassortative cases, i.e. for the identification of clusters where the amount of edges is larger between clusters than inside clusters, the signless Laplacian is a better choice [18]. Given the unsigned graph G = (V, W), the signless Laplacian and its normalized version are defined as

Q = D + W,    Qsym = D^{−1/2} Q D^{−1/2}    (2)

Both Laplacians are positive semi-definite, and the smallest eigenvalue is zero if and only if the graph has a bipartite component [6].

2.2 Laplacians on Signed Graphs

Recently a number of Laplacian operators for signed networks have been introduced. Consider the signed graph G± = (G+, G−). 
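Before turning to signed graphs, the constructions in (1) and (2) can be sketched directly in NumPy. This is our own minimal illustration (the helper name `laplacians` and the tiny path graph are not from the paper), assuming every node has positive degree so that D^{−1/2} is well defined:

```python
import numpy as np

def laplacians(W):
    """Return (L, Lsym, Q, Qsym) for a symmetric non-negative weight matrix W.

    Assumes no isolated nodes, so D^{-1/2} is well defined.
    """
    d = W.sum(axis=1)
    D = np.diag(d)
    D_isqrt = np.diag(1.0 / np.sqrt(d))
    L = D - W                      # Laplacian, eq. (1)
    Q = D + W                      # signless Laplacian, eq. (2)
    return L, D_isqrt @ L @ D_isqrt, Q, D_isqrt @ Q @ D_isqrt

# Path graph on 3 nodes: connected (eigenvalue 0 of L is simple)
# and bipartite (so the smallest eigenvalue of Qsym is 0 as well).
W = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
L, Lsym, Q, Qsym = laplacians(W)
```

For this bipartite example the smallest eigenvalue of Qsym is numerically zero, matching the characterization cited from [6].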
Let D+_ii = Σ^n_{j=1} w+_ij be the diagonal matrix of the degrees of G+ and D̄_ii = Σ^n_{j=1} (w+_ij + w−_ij) the one of the overall degrees in G±.

The following Laplacians for signed networks have been considered so far

LBR = D+ − W+ + W−,   LBN = D̄^{−1} LBR,   (balance ratio/normalized Laplacian)
LSR = D̄ − W+ + W−,   LSN = D̄^{−1/2} LSR D̄^{−1/2},   (signed ratio/normalized Laplacian)   (3)

and spectral clustering algorithms have been proposed for G±, based on these Laplacians [16, 4]. Let L+ and Q− be the Laplacian and the signless Laplacian matrices of the graphs G+ and G−, respectively. We note that the matrix LSR blends the information from G+ and G− into (twice) the arithmetic mean of L+ and Q−, namely the following identity holds

LSR = L+ + Q− .   (4)

Thus, as an alternative to the normalization defining LSN from LSR, it is natural to consider the arithmetic mean of the normalized Laplacians LAM = L+sym + Q−sym. In the next section we introduce the geometric mean of L+sym and Q−sym and propose a new clustering algorithm for signed graphs based on that matrix. The analysis and experiments of the next sections will show that blending the information from the positive and negative graphs through the geometric mean overcomes the deficiencies shown by the arithmetic-mean-based operators.

3 Geometric mean of Laplacians

We define here the geometric mean of matrices and introduce the geometric mean of normalized Laplacians for clustering signed networks. Let A^{1/2} be the unique positive definite solution of the matrix equation X^2 = A, where A is positive definite.

Definition 2. Let A, B be positive definite matrices. The geometric mean of A and B is the positive definite matrix A#B defined by A#B = A^{1/2}(A^{−1/2} B A^{−1/2})^{1/2} A^{1/2}.

One can prove that A#B = B#A (see [1] for details). 
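Definition 2 can be checked directly on small dense matrices. The following sketch is our own illustration (the helpers `spd_sqrt` and `geometric_mean` are not code from the paper); it uses an eigendecomposition-based principal square root, which is valid for symmetric positive definite inputs, and verifies the symmetry A#B = B#A:

```python
import numpy as np

def spd_sqrt(M):
    """Principal square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def geometric_mean(A, B):
    """A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}  (Definition 2)."""
    A_half = spd_sqrt(A)
    A_half_inv = np.linalg.inv(A_half)
    return A_half @ spd_sqrt(A_half_inv @ B @ A_half_inv) @ A_half

A = np.array([[2., 1.], [1., 2.]])
B = np.array([[3., 0.], [0., 1.]])
G = geometric_mean(A, B)
```

For commuting matrices the definition reduces to the scalar geometric mean, e.g. the geometric mean of 2I and 8I is 4I.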
Further, there are several useful ways to represent the geometric mean of positive definite matrices (see, for instance, [1, 12])

A#B = A(A^{−1}B)^{1/2} = (BA^{−1})^{1/2}A = B(B^{−1}A)^{1/2} = (AB^{−1})^{1/2}B   (5)

The next result reveals further consistency with the scalar case: in fact we observe that if A and B have some eigenvectors in common, then A + B and A#B have those eigenvectors, with eigenvalues given by the arithmetic and geometric mean of the corresponding eigenvalues of A and B, respectively.

Theorem 1. Let u be an eigenvector of A and B with eigenvalues λ and µ, respectively. Then, u is an eigenvector of A + B and A#B with eigenvalues λ + µ and √(λµ), respectively.

3.1 Geometric mean for signed networks clustering

Consider the signed network G± = (G+, G−). We define the normalized geometric mean Laplacian of G± as

LGM = L+sym # Q−sym   (6)

We propose Algorithm 1 for clustering signed networks, based on the spectrum of LGM. By Definition 2, the matrix geometric mean A#B requires A and B to be positive definite. As both the Laplacian and the signless Laplacian are positive semi-definite, in what follows we shall assume that the matrices L+sym and Q−sym in (6) are modified by a small diagonal shift, ensuring positive definiteness. That is, in practice, we consider L+sym + ε1 I and Q−sym + ε2 I, with ε1 and ε2 small positive numbers. For the sake of brevity, we do not explicitly write the shifting matrices.

Input: Symmetric weight matrices W+, W− ∈ R^{n×n}, number k of clusters to construct.
Output: Clusters C1, . . . , Ck.
1 Compute the k eigenvectors u1, . . . , uk corresponding to the k smallest eigenvalues of LGM.
2 Let U = (u1, . . . , uk).
3 Cluster the rows of U with k-means into clusters C1, . . . 
, Ck.

Algorithm 1: Spectral clustering with LGM on signed networks

(E+)    p+_out < p+_in
(E−)    p−_in < p−_out
(Ebal)  p−_in + p+_out < p+_in + p−_out
(Evol)  p−_in + (k − 1) p−_out < p+_in + (k − 1) p+_out
(Econf) ( k p+_out / (p+_in + (k − 1) p+_out) ) ( k p−_in / (p−_in + (k − 1) p−_out) ) < 1
(EG)    ( k p+_out / (p+_in + (k − 1) p+_out) ) ( 1 + (p−_in − p−_out) / (p−_in + (k − 1) p−_out) ) < 1

Table 1: Conditions for the Stochastic Block Model analysis of Section 4

The main bottleneck of Algorithm 1 is the computation of the eigenvectors in step 1. In Section 5 we propose a scalable Krylov-based method to handle this problem.

Let us briefly discuss the motivating intuition behind the proposed clustering strategy. Algorithm 1, as well as state-of-the-art clustering algorithms based on the matrices in (3), relies on the k smallest eigenvalues of the considered operator and their corresponding eigenvectors. Thus the relative ordering of the eigenvalues plays a crucial role. Assume the eigenvalues to be enumerated in ascending order. Theorem 1 states that the functions (A, B) ↦ A + B and (A, B) ↦ A#B map eigenvalues of A and B having the same corresponding eigenvectors into the arithmetic mean λ_i(A) + λ_j(B) and geometric mean √(λ_i(A) λ_j(B)), respectively, where λ_i(·) is the i-th smallest eigenvalue of the corresponding matrix. Note that the indices i and j are not the same in general, as the eigenvectors shared by A and B may be associated to eigenvalues having different positions in the relative ordering of A and B. 
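This re-ordering effect can be observed numerically. In the sketch below (our own construction, not from the paper), A and B share an orthonormal eigenvector basis V; by Theorem 1 the spectrum of A#B consists of the geometric means of the paired eigenvalues, and the direction attaining the smallest eigenvalue of A#B differs from the one attaining the smallest eigenvalue of A + B:

```python
import numpy as np

def geometric_mean(A, B):
    # A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}, via eigh-based sqrt
    w, V = np.linalg.eigh(A)
    A_half = (V * np.sqrt(w)) @ V.T
    A_half_inv = np.linalg.inv(A_half)
    ws, Vs = np.linalg.eigh(A_half_inv @ B @ A_half_inv)
    return A_half @ ((Vs * np.sqrt(ws)) @ Vs.T) @ A_half

rng = np.random.default_rng(0)
V, _ = np.linalg.qr(rng.standard_normal((3, 3)))  # shared eigenvectors
a = np.array([0.01, 1.0, 3.0])   # eigenvalues of A: small in direction 0
b = np.array([2.0, 0.04, 3.0])   # eigenvalues of B: small in direction 1
A = (V * a) @ V.T
B = (V * b) @ V.T

eig_sum = np.linalg.eigvalsh(A + B)                # arithmetic means, sorted
eig_gm = np.linalg.eigvalsh(geometric_mean(A, B))  # geometric means, sorted
```

Here the smallest eigenvalue of A + B comes from the direction where neither factor is particularly small, while the smallest eigenvalue of A#B comes from the direction where A alone is small, illustrating why the geometric mean reacts to a cluster structure present in either graph.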
This intuitively suggests that small eigenvalues of A + B are related to small eigenvalues of both A and B, whereas those of A#B are associated with small eigenvalues of either A or B, or both. Therefore the relative ordering of the small eigenvalues of LGM is influenced by the presence of assortative clusters in G+ (related to small eigenvalues of L+sym) or by disassortative clusters in G− (related to small eigenvalues of Q−sym), whereas the ordering of the small eigenvalues of the arithmetic mean takes into account only the simultaneous presence of both those situations.

In the next section, for networks following the stochastic block model, we analyze in expectation the spectrum of the normalized geometric mean Laplacian as well as that of the normalized Laplacians previously introduced. In this case the expected spectrum can be computed explicitly, and we observe that in expectation the ordering induced by blending the information of G+ and G− through the geometric mean allows us to recover the ground truth clusters perfectly, whereas the use of the arithmetic mean introduces a bias which reverberates into a significantly higher clustering error.

4 Stochastic block model on signed graphs

In this section we present an analysis of different signed graph Laplacians based on the Stochastic Block Model (SBM). The SBM is a widespread benchmark generative model for networks showing a clustering, community, or group behaviour [22]. Given a prescribed set of groups of nodes, the SBM defines the presence of an edge as a random variable with probability depending on which groups it joins. To our knowledge this is the first analysis of spectral clustering on signed graphs with the stochastic block model. Let C1, . . . 
,Ck be ground truth clusters, all having the same size |C|. We let p+_in (p−_in) be the probability that there exists a positive (negative) edge between nodes in the same cluster, and let p+_out (p−_out) denote the probability of a positive (negative) edge between nodes in different clusters.

Calligraphic letters denote matrices in expectation. In particular W+ and W− denote the weight matrices in expectation. We have W+_{i,j} = p+_in and W−_{i,j} = p−_in if vi, vj belong to the same cluster, whereas W+_{i,j} = p+_out and W−_{i,j} = p−_out if vi, vj belong to different clusters. Sorting nodes according to the ground truth clustering shows that W+ and W− have rank k.

Consider the relations in Table 1. Conditions E+ and E− describe the presence of assortative or disassortative clusters in expectation. Note that, by Definition 1, a graph is balanced if and only if p+_out = p−_in = 0. We can see that if E+ ∩ E− holds then both G+ and G− give information about the cluster structure. Further, if E+ ∩ E− holds then Ebal holds. Similarly, Econf characterizes a graph where the relative amount of conflicts - i.e. positive edges between the clusters and negative edges inside the clusters - is small. Condition EG is strictly related to such a setting. In fact when E− ∩ EG holds then Econf holds. Finally, condition Evol implies that the expected volume in the negative graph is smaller than the expected volume in the positive one. This condition is therefore not related to any signed clustering structure.

Let

χ1 = 1,    χi = (k − 1) 1_{Ci} − 1_{C̄i},    i = 2, . . . , k.

The use of k-means on χi, i = 1, . . . , k identifies the ground truth communities Ci. 
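The block structure of the expected matrices is easy to verify numerically. The sketch below is our own illustration (parameter values and variable names are ours): it builds an expected weight matrix for k equal-size clusters, checks that it has rank k, and checks that the all-ones vector and the cluster-wise constant vectors taking value (k−1) on one cluster and −1 elsewhere are its eigenvectors:

```python
import numpy as np

k, c = 3, 4                        # k clusters of equal size |C| = c
n = k * c
p_in, p_out = 0.8, 0.1             # e.g. an assortative positive graph
labels = np.repeat(np.arange(k), c)
same = labels[:, None] == labels[None, :]
W_exp = np.where(same, p_in, p_out)   # expected weight matrix

# chi vectors: chi_1 = all-ones, chi_i = (k-1) on cluster i, -1 elsewhere
chi = [np.ones(n)]
for i in range(1, k):
    chi.append(np.where(labels == i, k - 1.0, -1.0))
```

Acting on a cluster-wise constant vector whose cluster values sum to zero, the expected matrix scales it by |C|(p_in − p_out), so all chi_i with i ≥ 2 share one eigenvalue; the all-ones vector gets the expected degree |C| p_in + (k−1)|C| p_out.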
As spectral\nclustering relies on the eigenvectors corresponding to the k smallest eigenvalues (see Algorithm 1)\nwe derive here necessary and suf\ufb01cient conditions such that in expectation the eigenvectors \u03c7i, i =\n1, . . . , k correspond to the k smallest eigenvalues of the normalized Laplacians introduced so far. In\nparticular, we observe that condition EG affects the ordering of the eigenvalues of the normalized\ngeometric mean Laplacian. Instead, the ordering of the eigenvalues of the operators based on the\narithmetic mean is related to Ebal and Evol. The latter is not related to any clustering, thus introduces\na bias in the eigenvalues ordering which reverberates into a noisy embedding of the data points and in\nturn into a signi\ufb01cantly higher clustering error.\nTheorem 2. Let LBN and LSN be the normalized Laplacians de\ufb01ned in (3) of the expected graphs.\nThe following statements are equivalent:\n\n1. \u03c71, . . . , \u03c7k are the eigenvectors corresponding to the k smallest eigenvalues of LBN .\n2. \u03c71, . . . , \u03c7k are the eigenvectors corresponding to the k smallest eigenvalues of LSN .\n3. The two conditions Ebal and Evol hold simultaneously.\n\nTheorem 3. Let LGM = L+\nsym be the geometric mean of the Laplacians of the expected\ngraphs. Then \u03c71, . . . , \u03c7k are the eigenvectors corresponding to the k smallest eigenvalues of LGM\nif and only if condition EG holds.\n\nsym#Q\u2212\n\nConditions for the geometric mean Laplacian of diagonally shifted Laplacians are available in the\nsupplementary material. Intuition suggests that a good model should easily identify clusters when\nE+ \u2229 E\u2212. However, unlike condition EG, condition Evol \u2229 Ebal is not directly satis\ufb01ed under that\nregime. Speci\ufb01cally, we have\nCorollary 1. Assume that E+ \u2229 E\u2212 holds. Then \u03c71, . . . , \u03c7k are eigenvectors corresponding to the\nk smallest eigenvalues of LGM . 
Let p(k) denote the proportion of cases where χ1, . . . , χk are the eigenvectors of the k smallest eigenvalues of LSN or LBN; then p(k) ≤ 1/6 + 2/(3(k−1)) + 1/(k−1)^2.

In order to grasp the difference in expectation between LBN, LSN and LGM, in Fig. 1 we present the proportion of cases where Theorems 2 and 3 hold under different settings. Experiments are done with all four parameters discretized in [0, 1] with 100 steps. The expected proportion of cases where EG holds (Theorem 3) is far above the corresponding proportion for Evol ∩ Ebal (Theorem 2), showing that in expectation the geometric mean Laplacian is superior to the other signed Laplacians. In Fig. 2 we present experiments on sampled graphs with k-means on top of the k smallest eigenvectors. In all cases we consider clusters of size |C| = 100 and present the median clustering error (i.e., error when clusters are labeled via majority vote) over 50 runs. The results show that the analysis made in expectation closely resembles the actual behavior. In fact, even if we expect only one noisy eigenvector for LBN and LSN, the use of the geometric mean Laplacian significantly outperforms any other previously proposed technique in terms of clustering error. 
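The expectation analysis can be reproduced on a small dense example. The sketch below is our own illustration (helper names, parameters, and the small diagonal shift are ours): it forms LGM of the expected graphs explicitly, which is feasible only for small n, and checks that the embedding given by the k smallest eigenvectors is constant on each ground-truth cluster, i.e. that the subsequent k-means step would recover the clusters exactly:

```python
import numpy as np

def spd_sqrt(M):
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(np.maximum(w, 0))) @ V.T

def geometric_mean(A, B):
    Ah = spd_sqrt(A)
    Ahi = np.linalg.inv(Ah)
    return Ah @ spd_sqrt(Ahi @ B @ Ahi) @ Ah

k, c = 3, 10
n = k * c
labels = np.repeat(np.arange(k), c)
same = labels[:, None] == labels[None, :]
Wp = np.where(same, 0.5, 0.05)     # expected positive graph: assortative
Wn = np.where(same, 0.05, 0.5)     # expected negative graph: disassortative

def norm_lap(W, signless, eps=1e-8):
    """Normalized (signless) Laplacian plus a small shift for definiteness."""
    d = W.sum(axis=1)
    Disq = np.diag(1.0 / np.sqrt(d))
    M = np.diag(d) + W if signless else np.diag(d) - W
    return Disq @ M @ Disq + eps * np.eye(len(W))

LGM = geometric_mean(norm_lap(Wp, False), norm_lap(Wn, True))
w, V = np.linalg.eigh(LGM)
U = V[:, :k]                       # embedding: k smallest eigenvectors

# averaging within clusters: projector onto cluster-wise constant vectors
P = same / c
```

With these parameters condition EG holds, so by Theorem 3 the k smallest eigenvectors span the cluster-indicator space, and a clear spectral gap separates them from the rest of the spectrum.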
LSN and LBN achieve good\nclustering only when the graph resembles a k-balanced structure, whereas they fail even in the ideal\nsituation where either the positive or the negative graphs are informative about the cluster structure.\nAs shown in Section 6, the advantages of LGM over the other Laplacians discussed so far allow us to\nidentify a clustering structure on the Wikipedia benchmark real world signed network, where other\nclustering approaches have failed.\n\n5 Krylov-based inverse power method for small eigenvalues of L+\n\nsym#Q\u2212\n\nsym\n\nThe computation of the geometric mean A#B of two positive de\ufb01nite matrices of moderate size\nhas been discussed extensively by various authors [20, 11, 12, 13]. However, when A and B have\nlarge dimensions, the approaches proposed so far become unfeasible, in fact A#B is in general a full\nmatrix even if A and B are sparse. In this section we present a scalable algorithm for the computation\nof the smallest eigenvectors of L+\nsym. The method is discussed for a general pair of matrices\nA and B, to emphasize its general applicability which is therefore interesting in itself. We remark that\n\nsym#Q\u2212\n\n5\n\n\fFigure 1: Fraction of cases where in expectation \u03c71, . . . , \u03c7k correspond to the k smallest eigenvalues\nunder the SBM.\n\nFigure 2: Median clustering error under the stochastic block model over 50 runs.\n\nthe method takes advantage of the sparsity of A and B and does not require to explicitly compute the\nmatrix A#B. To our knowledge this is the \ufb01rst effective method explicitly built for the computation\nof the eigenvectors of the geometric mean of two large and sparse positive de\ufb01nite matrices.\nGiven a positive de\ufb01nite matrix M with eigenvalues \u03bb1 \u2264 \u00b7\u00b7\u00b7 \u2264 \u03bbn, let H be any eigenspace of M\nassociated to \u03bb1, . . . , \u03bbt. 
The inverse power method (IPM) applied to M is a method that converges\nto an eigenvector x associated to the smallest eigenvalue \u03bbH of M such that \u03bbH (cid:54)= \u03bbi, i = 1, . . . , t.\nThe pseudocode of IPM applied to A#B = A(A\u22121B)1/2 is shown in Algorithm 2. Given a vector\nv and a matrix M, the notation solve{M, v} is used to denote a procedure returning the solution\nx of the linear system M x = v. At each step the algorithm requires the solution of two linear\nsystems. The \ufb01rst one (line 2) is solved by the preconditioned conjugate gradient method, where the\npreconditioner is obtained by the incomplete Cholesky decomposition of A. Note that the conjugate\ngradient method is very fast, as A is assumed sparse and positive de\ufb01nite, and it is matrix-free, i.e. it\nrequires to compute the action of A on a vector, whereas it does not require the knowledge of A (nor\nits inverse). The solution of the linear system occurring in line 3 is the major inner-problem of the\nproposed algorithm. Its ef\ufb01cient solution is performed by means of an extended Krylov subspace\ntechnique that we describe in the next section. The proposed implementation ensures the whole IPM\nis matrix-free and scalable.\n\n5.1 Extended Krylov subspace method for the solution of the linear system (A\u22121B)1/2x = y\nWe discuss here how to apply the technique known as Extended Krylov Subspace Method (EKSM) for\nthe solution of the linear system (A\u22121B)1/2x = y. Let M be a large and sparse matrix, and y a given\nvector. When f is a function with a single pole, EKSM is a very effective method to approximate\nthe vector f (M )y without ever computing the matrix f (M ) [8]. Note that, given two positive\nde\ufb01nite matrices A and B and a vector y, the vector we want to compute is x = (A\u22121B)\u22121/2y,\nso that our problem boils down to the computation of the product f (M )y, where M = A\u22121B and\nf (X) = X\u22121/2. 
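The structure of the IPM can be illustrated with dense stand-ins for the two inner solves. The sketch below is our own toy version, not the paper's scalable implementation: it replaces the preconditioned conjugate gradient of line 2 by a dense solve with A, and the EKSM solve of line 3 by the dense identity (A^{−1}B)^{−1/2} = A^{−1/2} S^{−1/2} A^{1/2} with S = A^{−1/2} B A^{−1/2}, then runs plain inverse iteration on A#B = A(A^{−1}B)^{1/2}:

```python
import numpy as np

def spd_sqrt(M):
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

rng = np.random.default_rng(1)
X = rng.standard_normal((8, 8)); A = X @ X.T + 8 * np.eye(8)
Y = rng.standard_normal((8, 8)); B = Y @ Y.T + 0.5 * np.eye(8)

Ah = spd_sqrt(A); Ahi = np.linalg.inv(Ah)
S = Ahi @ B @ Ahi
Si_half = np.linalg.inv(spd_sqrt(S))

def apply_inv_gm(v):
    """(A#B)^{-1} v = (A^{-1}B)^{-1/2} A^{-1} v, using A#B = A (A^{-1}B)^{1/2}."""
    u = np.linalg.solve(A, v)           # scalable method: PCG on sparse A
    return Ahi @ (Si_half @ (Ah @ u))   # scalable method: EKSM, matrix-free

x = rng.standard_normal(8)
for _ in range(500):                    # inverse iteration on A#B
    x = apply_inv_gm(x)
    x /= np.linalg.norm(x)
lam = x @ (Ah @ spd_sqrt(S) @ Ah) @ x   # Rayleigh quotient with dense A#B
```

The point of the factored form is that each step only needs a solve with A and an application of (A^{−1}B)^{−1/2} to a vector, which is exactly what the Krylov machinery of Section 5.1 provides without ever forming A#B.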
The general idea of the s-th EKSM iteration is to project M onto the subspace

Ks(M, y) = span{y, My, M^{−1}y, . . . , M^{s−1}y, M^{1−s}y} ,

and solve the problem there. The projection onto Ks(M, y) is realized by means of the Lanczos process, which produces a sequence of matrices Vs with orthogonal columns, such that the first