{"title": "Dynamic Network Model from Partial Observations", "book": "Advances in Neural Information Processing Systems", "page_first": 9862, "page_last": 9872, "abstract": "Can evolving networks be inferred and modeled without directly observing their nodes and edges? In many applications, the edges of a dynamic network might not be observed, but one can observe the dynamics of stochastic cascading processes (e.g., information diffusion, virus propagation) occurring over the unobserved network. While there have been efforts to infer networks based on such data, providing a generative probabilistic model that is able to identify the underlying time-varying network remains an open question. Here we consider the problem of inferring generative dynamic network models based on network cascade diffusion data. We propose a novel framework for providing a non-parametric dynamic network model---based on a mixture of coupled hierarchical Dirichlet processes---based on data capturing cascade node infection times. Our approach allows us to infer the evolving community structure in networks and to obtain an explicit predictive distribution over the edges of the underlying network---including those that were not involved in transmission of any cascade, or are likely to appear in the future. We show the effectiveness of our approach using extensive experiments on synthetic as well as real-world networks.", "full_text": "Dynamic Network Model from Partial Observations\n\nElahe Ghalebi\n\nTU Wien\n\neghalebi@cps.tuwien.ac.at\n\nBaharan Mirzasoleiman\n\nStanford University\n\nbaharanm@cs.stanford.edu\n\nRadu Grosu\n\nTU Wien\n\nradu.grosu@tuwien.ac.at\n\nJure Leskovec\n\nStanford University\n\njure@cs.stanford.edu\n\nAbstract\n\nCan evolving networks be inferred and modeled without directly observing their\nnodes and edges? 
In many applications, the edges of a dynamic network might not be observed, but one can observe the dynamics of stochastic cascading processes (e.g., information diffusion, virus propagation) occurring over the unobserved network. While there have been efforts to infer networks based on such data, providing a generative probabilistic model that is able to identify the underlying time-varying network remains an open question. Here we consider the problem of inferring generative dynamic network models based on network cascade diffusion data. We propose a novel framework for providing a non-parametric dynamic network model—based on a mixture of coupled hierarchical Dirichlet processes—based on data capturing cascade node infection times. Our approach allows us to infer the evolving community structure in networks and to obtain an explicit predictive distribution over the edges of the underlying network—including those that were not involved in transmission of any cascade, or are likely to appear in the future. We show the effectiveness of our approach using extensive experiments on synthetic as well as real-world networks.

1 Introduction

Networks of interconnected entities are widely used to model pairwise relations between objects in many important problems in sociology, finance, computer science, and operations research [1, 2, 3]. Oftentimes, these networks are dynamic, with nodes or edges appearing or disappearing over time, and the underlying network structure evolving over time. As a result, there is a growing interest in developing dynamic network models allowing for the study of evolving networks.
Non-parametric models are especially useful when there is no prior knowledge or assumption about the shape or size of the network, as they can automatically address the model selection problem. 
Non-parametric Bayesian approaches mostly rely on the assumption of vertex exchangeability, in which the distribution of a graph is invariant to the order of its vertices [4, 5, 6]. Vertex-exchangeable models, such as the Stochastic Block Model and its variants, explain the data by means of an underlying latent clustering structure. However, such models yield dense graphs [7] and are less appropriate for predicting unseen interactions. Recently, an alternative notion of edge-exchangeability was introduced for graphs, in which the distribution of a graph is invariant to the order of its edges [8, 9]. Edge-exchangeable models can exhibit the sparsity and small-world behavior of real-world networks. Such models allow both the latent dimensionality of the model and the number of nodes to grow over time, and are suitable for predicting future interactions.

32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada.

Existing models, however, aim to model a fully observed network [4, 5, 8, 9], but in many real-world problems, the underlying network structure is not known. What is often known are partial observations of a stochastic cascading process that is spreading over the network. A cascade is created by a contagion (e.g., a social media post, a virus) that starts at some node of the network and then spreads like an epidemic from node to node over the edges of the underlying (unobserved) network. The observations are often in the form of the times when different nodes get infected by different contagions. A fundamental problem, therefore, is to infer the underlying network structure from these partial observations. In recent years, there has been a body of research on inferring diffusion networks from node infection times. However, these efforts mostly rely on a fixed cascade transmission model—describing how nodes spread contagions—to infer the set of most likely edges [2, 10, 11, 12]. 
More recently, there have been attempts to predict the transmission probabilities from infection times, either by learning node representations [13], or by learning diffusion representations using the underlying network structure [13, 14, 15]. However, it remains an open problem to provide a generative probabilistic model for the underlying network from partial observations.
Here we propose a novel online dynamic network inference framework, DYFERENCE, for providing non-parametric edge-exchangeable network models from partial observations. We build upon the non-parametric network model of [8], namely MDND, which assumes that the network clusters into groups and places a mixture of Dirichlet processes over the outgoing and incoming edges in each cluster, while coupling the network using a shared discrete base measure. However, our framework is easily extended to arbitrary generative models by replacing the MDND with other choices of latent representations, such as the network models presented in [9, 16, 17, 18]. Given a set of cascades spreading over the network, we process observations in time intervals. For each time interval, we first find a probability distribution over the diffusion trees that may have been involved in each cascade. We then calculate the marginal probabilities for all the edges involved in the diffusion trees. Finally, we sample a set of edges from this distribution and provide the sampled edges to a Gibbs sampler to update the model variables. In the next iteration, we use the updated edge probabilities provided by the model to update the probability distributions over the edges supported by each cascade. We continue this iterative process until the model no longer changes considerably. 
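The iterative scheme just described can be sketched end-to-end at toy scale. The sketch below is illustrative only: it stands in simple infection-time-gap weights for the DPP marginals and a count-based re-estimate for the Gibbs sampler, and the helper names (`candidate_edges`, `fit_interval`) are hypothetical, not from any released implementation.

```python
import math
import random

def candidate_edges(t):
    """All (u, v) pairs compatible with cascade t (u infected strictly before v)."""
    return [(u, v) for u in t for v in t if t[u] < t[v] < math.inf]

def fit_interval(cascades, q=3, max_iters=50, tol=1e-3, seed=0):
    """Toy rendering of the iterative idea: alternate between (1) weighting each
    cascade's candidate edges and (2) re-estimating a global edge-probability
    table from edges sampled under those weights, until the table stabilizes.
    Assumes at least one cascade with two or more finitely timed nodes."""
    rng = random.Random(seed)
    probs = {}  # current "model": edge -> probability
    for _ in range(max_iters):
        counts = {}
        for t in cascades:
            edges = candidate_edges(t)
            if not edges:
                continue
            # first pass: ad-hoc weights from infection-time gaps;
            # later passes reuse the current table's edge probabilities
            w = [probs.get(e, math.exp(t[e[0]] - t[e[1]])) for e in edges]
            for e in rng.choices(edges, weights=w, k=q):  # q samples per cascade
                counts[e] = counts.get(e, 0) + 1
        total = sum(counts.values())
        new = {e: c / total for e, c in counts.items()}
        keys = set(new) | set(probs)
        delta = max(abs(new.get(e, 0.0) - probs.get(e, 0.0)) for e in keys)
        probs = new
        if delta < tol:  # model no longer changes considerably
            break
    return probs
```

A cascade is passed as a dict of infection times (with `math.inf` for uninfected nodes); the returned table is a normalized distribution over the edges the cascades support.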
Extensive experiments on synthetic and real-world networks show that DYFERENCE is able to track changes in the structure of dynamic networks and provides accurate online estimates of the time-varying edge probabilities for different network topologies. We also apply DYFERENCE to diffusion prediction and to predicting the most influential nodes in the Twitter and MemeTracker datasets, as well as to bankruptcy prediction in a financial transaction network.

2 Related Work
There is a body of work on inferring diffusion networks from partial observations. NETINF [19] and MULTITREE [20] formulate the problem as submodular optimization. NETRATE [21] and CONNIE [2] further infer the transmission rates using convex optimization. INFOPATH [22] considers inferring varying transmission rates in an online manner using stochastic convex optimization. The above methods assume that diffusion rates are derived from a predefined parametric probability distribution. In contrast, we do not make any assumption about the transmission model. EMBEDDEDIC [13] embeds nodes in a latent space based on the Independent Cascade model, and infers diffusion probabilities based on the relative node positions in the latent space. DEEPCAS [15] and TOPOLSTM [14] use the network structure to learn diffusion representations and predict diffusion probabilities. Our work differs in nature from the existing methods in that we aim at providing a generative probabilistic model for the underlying dynamic network from diffusion data.
There has also been a growing interest in developing probabilistic network models that can capture network evolution from full observations. 
Bayesian generative models such as the stochastic block model [4], the mixed membership stochastic block model [23], the infinite relational model [24, 25], the latent space model [26], the latent feature relational model [27], the infinite latent attribute model [17, 28], and the random function model [7] are among the vertex-exchangeable examples. A limitation of the vertex-exchangeable models is that they generate dense or empty networks with probability one [29, 30]. This is in contrast with the sparse nature of many real-world networks. Recently, edge-exchangeable models have been proposed and shown to exhibit sparsity [8, 9, 16]. However, these models assume that networks are fully observed. In contrast, our work considers the setting where the network is unobserved and what we observe are the node infection times of a stochastic cascading process spreading over the network.

2

3 Preliminaries
In this section, we first formally define the problem of dynamic network inference from partial observations. We then review the non-parametric edge-exchangeable network model of [8] that we will build upon in the rest of this paper. Finally, we give a brief overview of Bayesian inference for inferring the latent model variables.

3.1 Dynamic Network Inference Problem

Consider a hidden directed dynamic network where nodes and edges may appear or disappear over time. At each time step t, the network G^t = (V^t, E^t) consists of a set of vertices V^t and a set of edges E^t. A set C of cascades spreads over the edges of the network from infected to non-infected nodes. For each cascade c ∈ C, we observe a sequence t^c := (t^c_1, ···, t^c_{|V|}), recording the times when each node got infected by the cascade c. If node u is not infected by the cascade c, we set t^c_u = ∞. For each cascade, we only observe the time t^c_u when node u got infected, but not what node and which edge infected node u. 
Our goal is to infer a model M that captures the latent structure of the network G^t over which cascades propagated, using these partial observations. Such a model, in particular, can provide us with the probabilities of all the |V^t|^2 potential edges between nodes u, v ∈ V^t.

3.2 Non-parametric Edge-exchangeable Network Model

We adopt the Bayesian non-parametric model of [8], which combines structure elucidation with predictive performance. Here, the network is modeled as an exchangeable sequence of directed edges, and can grow over time. More specifically, each community in the network is modeled by a mixture of Dirichlet network distributions (MDND).
The model M can be described as:

D := (d_k, k ∈ N) ∼ GEM(α)
H := Σ_{i=1}^∞ h_i δ_{θ_i} ∼ DP(γ, Θ)
A_k := Σ_{i=1}^∞ a_{k,i} δ_{θ_i} ∼ DP(τ, H), k = 1, 2, ···
B_k := Σ_{i=1}^∞ b_{k,i} δ_{θ_i} ∼ DP(τ, H)
ς_uv ∼ D, u, v ∈ V
u ∼ A_{ς_uv}, v ∼ B_{ς_uv}
z_uv = Σ_{i,j∈V} I(i = u, j = v)     (1)

The edges of the network are modeled by a Dirichlet distribution H. Here, Θ is the measurable space, δ_{θ_i} denotes an indicator function centered on θ_i, and h_i is the corresponding probability of an edge existing at θ_i, with Σ_{i=1}^∞ h_i = 1. The concentration parameter γ controls the number of edges in the network, with larger γ resulting in more edges. The size and number of communities are modeled by a stick-breaking distribution GEM(α) with concentration parameter α. For every community k, two Dirichlet distributions A_k and B_k model the outlinks and inlinks in community k. To ensure that outlinks and inlinks are defined on the same set of locations θ, the distributions A_k and B_k are coupled using the shared, discrete base measure H.
To generate an edge e_uv, we first select a cluster ς_uv according to D. We then select a pair of nodes u and v according to the cluster-specific distributions A_{ς_uv}, B_{ς_uv}. The concentration parameter τ controls the overlap between clusters, with smaller τ resulting in smaller overlaps. Finally, z_ij is the integer-valued weight of edge e_ij.

3.3 Bayesian Inference
Having specified the model M in terms of the joint distribution in Eq. 1, we can infer the latent model variables for a fully observed network using Bayesian inference. In the full-observation setting, where we can observe all the edges in the network, the posterior distribution of the latent variables conditioned on a set of observed edges X can be updated using Bayes' rule:

p(Φ | X, M) = p(X | Φ, M) p(Φ | M) / ∫ p(X, Φ | M) dΦ     (2)

3

Here, Φ is the infinite-dimensional parameter vector of the model M specified in Eq. 1. The denominator in the above equation is difficult to handle, as it involves summation over all possible parameter values. Consequently, we need to resort to approximate inference. In Section 4, we show how we extract our set of observations from diffusion data and construct a collapsed Gibbs sampler to update the posterior distributions of the latent variables.

4 DYFERENCE: Dynamic Network Inference from Partial Observations

In this section, we describe our algorithm, DYFERENCE, for inferring the latent structure of the underlying dynamic network from diffusion data. 
DYFERENCE works based on the following iterative idea: in each iteration, we (1) find a probability distribution over all the edges that could be involved in each cascade; and (2) sample a set of edges from the probability distribution associated with each cascade, and provide the sampled edges as observations to a Gibbs sampler to update the posterior distribution of the latent variables of our non-parametric network model. We start by explaining our method on a static directed network, over which we observe a set C of cascades {t^{c_1}, ···, t^{c_|C|}}. In Section 4.3, we then show how we generalize our method to dynamic networks.

4.1 Extracting Observations from Diffusion Data

The set of edges that could have been involved in the transmission of a cascade c is the set of all edges e_uv for which u is infected before v, i.e., E_c = {e_uv | t^c_u < t^c_v < ∞}. Similarly, V_c = {u | t^c_u < ∞} is the set of all infected nodes in cascade c. To find the probability distribution over all the edges in E_c, we first assume that every infected node in cascade c gets infected through one of its neighbors, and therefore each cascade propagates as a directed tree. For a cascade c, each possible way in which the cascade could spread over the underlying network G creates a tree. To calculate the probability of a cascade spreading as a tree T, we use the following Gibbs measure [31],

p(T | d/λ) = (1 / Z(d/λ)) exp(−Σ_{e_uv ∈ T} d_uv/λ),     (3)

where λ is the temperature parameter. The normalizing constant Z(d/λ) = Σ_T exp(−Σ_{e_uv ∈ T} d_uv/λ) is the partition function that ensures that the distribution is normalized, and d_uv is the weight of edge e_uv. The most probable tree for cascade c is a MAP configuration of the above distribution, and the distribution concentrates on the MAP configurations as λ → 0.
To calculate the probability distribution over the edges in E_c, we use the result of [32], who showed that the probability distribution over subsets of edges associated with all the spanning trees in a graph is a Determinantal Point Process (DPP), where the probability of every subset R ⊆ T can be calculated as:

E_{P(T | d/λ)}[R ⊆ T] = det K_R.     (4)

Here, K_R is the |R| × |R| restriction of the DPP kernel K to the entries indexed by elements of R. For constructing the kernel matrix K, we take the incidence matrix A ∈ {−1, 0, +1}^{(|V_c|−1) × |E_c|}, in which A_ij ∈ {1, 0, −1} indicates whether edge j is an outlink/inlink of node i, and where an arbitrary vertex has been removed from the graph. We then construct the Laplacian L = A diag(e^{−d/λ}) A^T and compute H = L^{−1/2} A diag(e^{−d/2λ}) and K = H^T H.
Finally, the marginal probability of an edge e_uv in E_c can be calculated as:

p(e_uv | d/λ) = e^{−d_uv/λ} (a_u − a_v)^T L^{−1} (a_u − a_v),     (5)

where a_i is the vector with coordinates equal to zero, except the i-th coordinate, which is one. All marginal probabilities can be calculated in time Õ(r|V_c|²/ε²), where ε is the desired relative precision and r = (1/λ)(max_e d(e) − min_e d(e)) [33].
To construct our multiset of observations X—in which each edge can appear multiple times—for each c we sample a set S_c of q edges from the probability distribution over the edges in E_c, i.e.,

X = {S_{c_1}, ···, S_{c_|C|}}.     (6)

Note that an edge could be sampled multiple times from the probability distributions corresponding to multiple cascades. 
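For intuition, the edge marginals behind Eq. 5 can be computed in the undirected case from the weighted graph Laplacian and its Moore-Penrose pseudo-inverse (the classical Matrix-Tree identity, with edge weights playing the role of e^{−d_uv/λ}). This is a hedged stand-in for the paper's directed construction, and `tree_edge_marginals` is a hypothetical helper:

```python
import numpy as np

def tree_edge_marginals(n, edges, weights):
    """P(e in T) for a random spanning tree T of an undirected graph drawn
    with P(T) proportional to the product of its edge weights. Uses the
    identity P(e_uv) = w_uv * (a_u - a_v)^T L^+ (a_u - a_v), where L is the
    weighted Laplacian and L^+ its Moore-Penrose pseudo-inverse."""
    L = np.zeros((n, n))
    for (u, v), w in zip(edges, weights):
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    Lp = np.linalg.pinv(L)  # pseudo-inverse handles the rank-(n-1) Laplacian
    marginals = []
    for (u, v), w in zip(edges, weights):
        a = np.zeros(n)
        a[u], a[v] = 1.0, -1.0
        marginals.append(w * a @ Lp @ a)  # weight times effective resistance
    return np.array(marginals)
```

On a connected graph the marginals always sum to n − 1, since every spanning tree has exactly n − 1 edges; that property is a convenient sanity check.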
The number of times each edge e_uv is sampled is the integer-valued weight z_uv in Eq. 1. Initially, without any prior knowledge about the structure of the underlying network, we initialize d_uv ∝ Σ_{c∈C} (t^c_v − t^c_u) for all e_uv ∈ E_c, and d_uv = 0 otherwise. However, in the subsequent iterations, when we get the updated posterior probabilities from our model, we use d_uv = p(e_uv | Φ, M).
The pseudocode for extracting observations from diffusion data is shown in Algorithm 1.

4

Algorithm 1 EXTRACT_OBSERVATIONS
Input: Set of cascades {t^{c_1}, ···, t^{c_|C|}}, sample size q.
Output: Extracted multiset of edges X from cascades.
1: X ← {}
2: for c ∈ C do
3:   Calculate p(e_uv | d/λ) for all e_uv ∈ E_c using Eq. 5
4:   S_c ← Sample q edges from the above probability distribution.
5:   X ← {X, S_c}
6: end for

4.2 Updating Latent Model Variables

To update the posterior distribution of the latent model variables conditioned on the extracted observations, we construct a collapsed Gibbs sampler by sweeping through each variable to sample from its conditional distribution with the remaining variables fixed to their current values.

Sampling cluster assignments ς. Following [8], we model the posterior probability of an edge e_uv belonging to cluster k as a function of the importance of the cluster in the network, the importance of u as a source and v as a destination in cluster k, as well as the importance of u, v in the network. To this end, we measure the importance of a cluster by the total number of its edges, i.e., η_k = Σ_{u,v∈V} I[ς_uv = k]. Similarly, the importance of u as a source in cluster k is measured by the number of outlinks of u associated with cluster k, i.e., l(k)_{u·}, and the importance of v as a destination by the number of inlinks of v associated with cluster k, i.e., l(k)_{·v}. Finally, the importance β of node i in the network is determined by the probability mass of its outlinks h_{i·} and inlinks h_{·i}, i.e., β_i = Σ h_{i·} + Σ h_{·i}.
The distribution over the cluster assignment ς_uv of an edge e_uv, given the end nodes u, v, the cluster assignments for all other edges, and β is given by:

p(ς_uv = k | u, v, ς^{¬e_uv}_{1:M}, β_{1:N}) ∝
  η^{¬e_uv}_k (l(k)^{¬e_uv}_{u·} + τ β_u)(l(k)^{¬e_uv}_{·v} + τ β_v)   if η^{¬e_uv}_k > 0
  α τ² β_u β_v   if η^{¬e_uv}_k = 0     (7)

where ¬e_uv is used to exclude the variables associated with the current edge being observed. As discussed in Section 3.2, α, τ, and γ control the number of clusters, the cluster overlaps, and the number of nodes in the network. Moreover, N and M are the number of nodes and edges in the network.

Sampling edge probabilities e. Due to edge-exchangeability, we can treat e_uv as the last variable being sampled. 
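The cluster-assignment step in Eq. 7 amounts to a categorical draw over existing clusters plus one "new cluster" option. A minimal sketch, assuming the counts passed in already exclude the edge being resampled; the function and argument names (`sample_cluster`, `eta`, `l_out_u`, `l_in_v`) are hypothetical:

```python
import random

def sample_cluster(eta, l_out_u, l_in_v, beta_u, beta_v, alpha, tau, rng):
    """Draw a cluster index for edge (u, v) following the form of Eq. 7:
    an existing cluster k gets mass eta[k] * (l_out_u[k] + tau*beta_u)
                                          * (l_in_v[k]  + tau*beta_v),
    and a brand-new cluster gets mass alpha * tau**2 * beta_u * beta_v.
    Returning len(eta) means 'open a new cluster'."""
    weights = [eta[k] * (l_out_u[k] + tau * beta_u) * (l_in_v[k] + tau * beta_v)
               for k in range(len(eta))]
    weights.append(alpha * tau ** 2 * beta_u * beta_v)  # new-cluster mass
    return rng.choices(range(len(weights)), weights=weights)[0]
```

Note that a cluster with `eta[k] == 0` receives zero mass, which reproduces the case split in Eq. 7: when no existing cluster has edges, only the new-cluster term survives.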
The conditional posterior for e_uv given the rest of the variables can be calculated as:

p(e_uv = e_ij | ς_{1:M}, e^{¬e_uv}_{1:M}, β_{1:N}) =
  Σ_{k=1}^{K+} [η_k / (M + α)] [(l(k)^{¬e_uv}_{u·} + τ β_u) / (η_k + τ)] [(l(k)^{¬e_uv}_{·v} + τ β_v) / (η_k + τ)] + [α / (M + α)] β_u β_v   if i, j ∈ V
  Σ_{k=1}^{K+} [η_k / (M + α)] [(l(k)^{¬e_uv}_{u·} + τ β_u) / (η_k + τ)] β_n + [α / (M + α)] β_u β_n   if i ∈ V, j ∉ V
  Σ_{k=1}^{K+} [η_k / (M + α)] [(l(k)^{¬e_uv}_{·v} + τ β_v) / (η_k + τ)] β_n + [α / (M + α)] β_n β_v   if i ∉ V, j ∈ V
  β_n²   if i, j ∉ V     (8)

where β_n = Σ_{i=N+1}^∞ h_i is the probability mass of all the edges that may appear in the network in the future, and K is the number of clusters. We observe that an edge may appear between existing nodes in the network, or because one or two new nodes have appeared in the network. Note that the predictive distribution for a new link appearing in the network can be calculated similarly using Eq. 8.

5

Algorithm 2 UPDATE_NETWORK_MODEL
Input: Model M(ς_{1:M}, e_{1:M}, β_{1:N}), set of cascades {t^{c_1}, ···, t^{c_|C|}}.
Output: Updated model M*(ς_{1:M}, e_{1:M}, β_{1:N})
1: for i = 1, 2, ··· until convergence do
2:   X ← Extract_Observations({t^{c_1}, ···, t^{c_|C|}})
3:   for j = 1, 2, ··· until convergence do
4:     Select e_uv randomly from X
5:     Sample ς from the conditional distribution p(ς_uv = k | u, v, ς^{¬e_uv}_{1:M}, β_{1:N})   ▷ Eq. 7
6:     Sample e from the conditional distribution p(e_uv = e_ij | ς_{1:M}, e^{¬e_uv}_{1:M}, β_{1:N})   ▷ Eq. 8
7:     Sample ρ from the conditional distribution p(ρ(k)_{u·} = ρ | ς_{1:M}, ρ(k)^{¬e_uv}_{u·}, β_{1:N})   ▷ Eq. 9
8:     Sample (β_1, ···, β_{|V|}, β_n) ∼ Dir(ρ(·)_1, ···, ρ(·)_{|V|}, γ)   ▷ Eq. 10
9:   end for
10: end for

Sampling outlink and inlink probabilities ρ. The probability masses on the outlinks and inlinks of node i associated with cluster k are modeled by the variables ρ(k)_{i·} and ρ(k)_{·i}. The posterior distribution of ρ(k)_{u·} (similarly, ρ(k)_{·v}) can be calculated using:

p(ρ(k)_{u·} = ρ | ς_{1:M}, ρ(k)^{¬e_uv}_{u·}, β_{1:N}) = [Γ(τ β_u) / Γ(τ β_u + l(k)_{u·})] s(l(k)_{u·}, ρ) (τ β_u)^ρ,     (9)

where s(l(k)_{u·}, ρ) are the unsigned Stirling numbers of the first kind, i.e., s(0, 0) = s(1, 1) = 1, s(n, 0) = 0 for n > 0, and s(n, m) = 0 for m > n. Other entries can be computed as s(n + 1, m) = s(n, m − 1) + n s(n, m). However, for large l(k)_{u·}, it is often more efficient to sample ρ_{k,i} by simulating the table assignments of the Chinese restaurant process according to Eq. 8 [34].

Sampling node probabilities β. 
Finally, the probability of each node is the sum of the probability masses on its edges and is modeled by a Dirichlet distribution, i.e.,

(β_1, ···, β_N, β_n) ∼ Dir(ρ(·)_1, ···, ρ(·)_N, γ),     (10)

where ρ(·)_i = Σ_k (ρ(k)_{i·} + ρ(k)_{·i}).
The pseudocode for inferring the latent network variables from diffusion data is given in Algorithm 2.

4.3 Online Dynamic Network Inference

In order to capture the dynamics of the underlying network and keep the model updated over time, we consider time intervals of length w. For the i-th interval, we only consider the infection times t^c ∈ [(i − 1)w, iw) for all c ∈ C, and update the model conditioned on the observations in the current time interval. Updating the model over intervals resembles the continuous-time updates with larger steps. Indeed, we can update the model in a continuous manner upon observing every infected node (w = dt). However, the observations provided to the Gibbs sampler from a single infection time are limited to the direct neighborhood of the infected node. This increases the overall mixing time as well as the probability of getting stuck in a local optimum. Updating the model using very small intervals has the same effect, by providing the Gibbs sampler with limited information about the direct neighborhoods of a few infected nodes.
Note that we do not infer a new model for the network based on the infection times in each time interval. Instead, we use the new observations to update the latent variables from the previous time interval. Updating the model with the observations in the current interval results in a higher probability for the observed edges, and a lower probability for the edges that have not been observed recently. Therefore, we do not need to consider an aging factor to take into account the older cascades. 
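The interval scheme above amounts to bucketing infection times by window before each model update. A minimal sketch, assuming cascades are dicts of infection times (with infinity for uninfected nodes); `window_batches` is a hypothetical helper whose output would be fed, interval by interval, to the model-update step initialized from the previous interval:

```python
import math

def window_batches(cascades, w):
    """Split infection times into consecutive intervals of length w (the outer
    loop of the online scheme): batch i holds, for each cascade, only the
    nodes infected during [i*w, (i+1)*w). Assumes at least one finite time."""
    horizon = max(t for c in cascades for t in c.values() if t < math.inf)
    batches = []
    i = 0
    while i * w <= horizon:
        lo, hi = i * w, (i + 1) * w
        batch = [{u: t for u, t in c.items() if lo <= t < hi} for c in cascades]
        batches.append([b for b in batch if b])  # drop cascades silent in this window
        i += 1
    return batches
```

Each batch is deliberately small and local in time, matching the observation in the text that very small windows starve the Gibbs sampler of context while very large ones let the model drift between updates.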
For a large w, the model may change considerably from one interval to the next. Hence, updating the model from the previous interval may harm the solution quality. However, if w is not very large, initializing the parameters from the previous interval significantly improves the running time, while the quality of the solutions is preserved.

6

Algorithm 3 DYNAMIC_NETWORK_INFERENCE (DYFERENCE)
Input: Set of infection times {t^{c_1}, ···, t^{c_|C|}}, interval length w.
Output: Updated network model M^t at times t = iw.
1: t = w, initialize M^0 randomly.
2: while t < last infection time do
3:   for all c ∈ C do
4:     t^c_w ← t^c_u ∈ [t − w, t)
5:     Y^t ← {Y^t, t^c_w}
6:   end for
7:   M^t ← Update_Network_Model(M^{t−w}, Y^t)
8:   t = t + w.
9: end while

Finally, a very large sample size q provides the Gibbs sampler with uninformative observations, including edges with a low probability, and results in an increased mixing time. Since the model from the previous interval has the information about all the infections that happened so far, if w and q are not too large, we expect the parameters to change smoothly over the intervals. We observed that q = Θ(|E_c|) works well in practice.
The pseudocode of our dynamic inference method is given in Algorithm 3.

5 Experiments
In this section, we address the following questions: (1) What is the predictive performance of DYFERENCE in static and dynamic networks, and how does it compare to existing network inference algorithms? (2) How does the predictive performance of DYFERENCE change with the number of cascades? (3) How does the running time of DYFERENCE compare to the baselines? And (4) how does DYFERENCE perform on the tasks of predicting diffusions and influential nodes?
Baselines. We compare the performance of DYFERENCE to NETINF [19], NETRATE [21], TOPOLSTM [14], DEEPCAS [15], EMBEDDEDIC [13] and INFOPATH [22]. 
INFOPATH is the only method able to infer dynamic networks; hence, we can only compare the performance of DYFERENCE on dynamic networks with INFOPATH.
Evaluation Metrics. For performance comparison, we use Precision, Recall, F1 score, MAP@k and Hits@k. Precision is the fraction of edges in the inferred network present in the true network, Recall is the fraction of edges of the true network present in the inferred network, and the F1 score is 2 × (precision × recall)/(precision + recall). MAP@k is the classical mean average precision measure, and Hits@k is the rate of the top-k ranked nodes containing the next infected node.
In all the experiments, we use a sample size of q = |E_c| − 1 for all the cascades c ∈ C. We further consider a window of length w = 1 day in our dynamic network inference experiments in Fig. 1, and w = 2 years in Table 3.
Synthetic Experiments. We generated synthetic networks consisting of 1024 nodes and about 2500 edges using the Kronecker graph model [35]: a core-periphery network (CP) (parameters [0.9,0.5;0.5,0.3]) and a hierarchical community network (HC) (parameters [0.9,0.1;0.1,0.9]), and the Forest Fire model [36] with forward and backward burning probabilities 0.2 and 0.17. 
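The edge-level metrics defined above follow directly from the two edge sets; `edge_prf` below is a hypothetical helper implementing those definitions:

```python
def edge_prf(true_edges, inferred_edges):
    """Precision, recall, and F1 between a true and an inferred edge set:
    precision = |true ∩ inferred| / |inferred|,
    recall    = |true ∩ inferred| / |true|,
    F1        = 2 * precision * recall / (precision + recall)."""
    true_set, inf_set = set(true_edges), set(inferred_edges)
    tp = len(true_set & inf_set)  # correctly inferred edges
    precision = tp / len(inf_set) if inf_set else 0.0
    recall = tp / len(true_set) if true_set else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall > 0 else 0.0)
    return precision, recall, f1
```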
For dynamic networks, we assign a pattern to each edge uniformly at random from a set of five edge evolution patterns: Slab and Hump (to model outlinks of nodes that temporarily become popular), Square and Chainsaw (to model inlinks of nodes that update periodically), and Constant (to model long-term interactions) [22]. Transmission rates are generated for each edge according to its evolution pattern for 100 time steps. We then generate 500 cascades per time step (1 day) on the network with a random initiator [10].
Figures 1a and 1b compare the precision, recall, and F1 score of DYFERENCE to INFOPATH for online dynamic network inference on the CP-Kronecker network with an exponential edge transmission model, and on the HC-Kronecker network with a Rayleigh edge transmission model. It can be seen that DYFERENCE outperforms INFOPATH in terms of F1 score as well as precision and recall on different network topologies and different transmission models. Figures 1c, 1d, and 1e compare the F1 score of DYFERENCE to INFOPATH and NETRATE for static network inference with a varying number of cascades over the CP-Kronecker network with Rayleigh and Exponential edge transmission models, and the Forest Fire network with a Power-law edge transmission model. We observe that DYFERENCE consistently outperforms the baselines in terms of accuracy and is robust to the varying number of cascades.

7

Table 1: Performance of DYFERENCE for diffusion prediction compared to DEEPCAS, TOPOLSTM, and EMBEDDEDIC on the Twitter and Memes datasets (TOPOLSTM requires the underlying network).
Twitter / Memes
MAP@k: DEEPCAS, TOPOLSTM, EMBEDDED-IC, DYFERENCE
Hits@k: DEEPCAS, TOPOLSTM, EMBEDDEDIC, DYFERENCE
19.4 29.9 19.3 31.5
9.8 20.8 12.5 20.9
9.8 20.8 12.4 20.8
18.2 29.0 18.3 29.4
@10 @50 @100 @10 @50 @100
19.6 9.3 30.0 20.5 19.4 12.0 20.6 32.4
@10 @50 @100 @10 @50 @100
70.0 25.7 76.8 28.3 65.0 25.1 30.0 84.0
60.5 69.5 56.0 71.0
31.1 33.1 33.5 34.3
33.2 34.9 36.6 36.7
43.9 50.8 35.1 47.4

Table 2: Top 10 predicted influential websites of Memes (Linkedin) on 30-06-2011. The correct predictions are indicated in bold.
DYFERENCE: pressrelated.com, arsipberita.com, news.yahoo.com, in.news.yahoo.com, podrobnosti.ua, article.wn.com, ctv.ca, fair-news.de, fanfiction.net, bbc.co.uk
INFOPATH: podrobnosti.ua, scribd.com, derstandard.at, heraldonline.com, startribune.com, canadaeast.com, news.yahoo.com, proceso.com.mx, article.wn.com, prnewswire.com

Table 3: Performance of DYFERENCE for dynamic bankruptcy prediction compared to INFOPATH on a financial transaction network from 2010 to 2016. In 2010, a financial crisis hit the network.
2012 / 2014 / 2016
MAP@k: INFOPATH, DYFERENCE
Hits@k: INFOPATH, DYFERENCE
6.6 20.6
5.3 19.1
@10 @20 @30 @10 @20 @30 @10 @20 @30
65.0 4.0 17.6 85.7
@10 @20 @30 @10 @20 @30 @10 @20 @30
20.0 65.0 70.0 40.0
35.0 62.0
54.7 69.6
34.5 51.9
30.0 38.1
65.0 85.7
80.0 80.0
50.0 70.0
65.0 70.0
25.0 45.0
26.6 46.6
55.0 65.0
50.0 50.0

Real-world Experiments. We applied DYFERENCE to three real-world datasets: (1) Twitter [37] contains the diffusion of URLs on Twitter during 2010 and the follower graph of users; the network consists of 6,126 nodes and 12,045 edges, with 5,106 cascades of length 17 on average. (2) Memes [38] contains the diffusion of memes from March 2011 to February 2012 over online news websites; the real diffusion network is constructed from the temporal dynamics of hyperlinks created between news sites. The network consists of 5,000 nodes and 313,669 edges, with 54,847 cascades of length 14 on average. (3) A European country's financial transaction network. 
The data is collected from the entire country's transaction log for all transactions larger than 50K Euros over 10 years, from 2007 to 2017, and includes 1,197,116 transactions between 103,497 companies. 2,765 companies are labeled as bankrupt, with corresponding timestamps. In 2010, a financial crisis hit the network. For every 2 years from 2010, we built bankruptcy cascades, with an average length of 85, between the 200 bankrupt nodes that had the highest amount of transactions each year.
Figures 1g, 1h, 1i, and 1j compare the F1 score of DYFERENCE to INFOPATH for online dynamic network inference on the time-varying hyperlink network with four different topics over time, from March 2011 to July 2011. As we observe, DYFERENCE outperforms INFOPATH in terms of prediction accuracy on all the networks. Figure 1f compares the running time of DYFERENCE to that of INFOPATH. We can see that DYFERENCE has a running time comparable to INFOPATH, while consistently outperforming it in terms of prediction accuracy.
Diffusion Prediction. Table 1 compares MAP@k and Hits@k for DYFERENCE vs. TOPOLSTM, DEEPCAS, and EMBEDDED-IC. We use the infection times in the first 80% of the total time interval for training and the remaining 20% for testing. It can be seen that DYFERENCE performs very well on the diffusion prediction task. Note that TOPOLSTM needs complete information about the underlying network structure for predicting transmission probabilities, and INFOPATH relies on predefined parametric probability distributions for transmission rates. On the other hand, DYFERENCE does not need any information about the network structure or the transmission rates.
Table 3 compares MAP@k and Hits@k for DYFERENCE vs. INFOPATH for dynamic bankruptcy prediction on the financial transaction network since the crisis hit the network in 2010.
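The MAP@k and Hits@k scores reported in the tables can be computed as follows — a minimal sketch under one common definition (the paper does not spell out its exact variant), where each evaluation query is a ranked list of predicted nodes together with the ground-truth set:

```python
def average_precision_at_k(ranked, relevant, k):
    """AP@k: average of precision@i over the ranks i <= k where a relevant item appears,
    normalized by the maximum achievable number of hits, min(|relevant|, k)."""
    hits, score = 0, 0.0
    for i, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            score += hits / i  # precision at rank i
    return score / min(len(relevant), k) if relevant else 0.0

def map_at_k(queries, k):
    """MAP@k over a list of (ranked_list, relevant_set) pairs."""
    return sum(average_precision_at_k(r, rel, k) for r, rel in queries) / len(queries)

def hits_at_k(queries, k):
    """Fraction of queries with at least one relevant item in the top k."""
    return sum(any(x in rel for x in r[:k]) for r, rel in queries) / len(queries)
```

For example, with two queries `[(["a","b","c","d"], {"a","c"}), (["x","y","z","w"], {"w"})]`, MAP@2 is 0.25 and Hits@2 is 0.5, since only the first query places a relevant item in its top two.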
We used windows of length w = 2 years to build cascades between bankrupt nodes and predict, among the neighbors of the bankrupt nodes, the companies that will go bankrupt in the next year. It can be seen that DYFERENCE significantly outperforms INFOPATH for bankruptcy prediction.
Influence Prediction. Table 2 shows the sets of influential websites found based on the dynamic Memes network predicted by DYFERENCE vs. INFOPATH. The dynamic Memes network for Linkedin is predicted up to 30-06-2011, and the influential websites are found using the method of [39]. We observe that, using the network predicted by DYFERENCE, we can identify the influential nodes with good accuracy.

(a) CP - Exp  (b) HC - Ray  (c) HC - Ray  (d) CP - Exp  (e) FF - Pwl  (f) Memes  (g) Occupy  (h) Linkedin  (i) NBA  (j) News

Figure 1: Precision, Recall, and F1 score of DYFERENCE. (a) Compared to INFOPATH for dynamic network inference over time on a Core-Periphery (CP) Kronecker network with an exponential transmission model, and (b) on a Hierarchical (HC) Kronecker network with a Rayleigh transmission model. (c) Accuracy of DYFERENCE compared to INFOPATH and NETRATE for static network inference with a varying number of cascades over a CP-Kronecker network with a Rayleigh, and (d) an exponential transmission model, and (e) over a Forest Fire network with a power-law transmission model. (f) Running time of DYFERENCE compared with INFOPATH for online dynamic network inference on the time-varying hyperlink network with four different topics: Occupy (1,875 sites, 655,183 memes), Linkedin (1,035 sites, 155,755 memes), NBA (1,875 sites, 655,183 memes), and News (1,035 sites, 101,836 memes).
(g), (h), (i), (j) compare the accuracy of DYFERENCE to INFOPATH for online dynamic network inference on the same dataset and four topics from March 2011 to July 2011.
6 Conclusion
We considered the problem of developing generative dynamic network models from partial observations, i.e., diffusion data. We proposed a novel framework, DYFERENCE, for providing a non-parametric edge-exchangeable network model based on a mixture of coupled hierarchical Dirichlet processes (MDND). Our framework is not restricted to MDND, however, and can be used along with any generative network model to capture the underlying dynamic network structure from partial observations. DYFERENCE provides online time-varying estimates of the probabilities of all potential edges in the underlying network and tracks the evolution of the underlying community structure over time. We showed the effectiveness of our approach using extensive experiments on synthetic as well as real-world networks.

Acknowledgment. This research was partially supported by SNSF P2EZP2_172187.

References
[1] A Namaki, AH Shirazi, R Raei, and GR Jafari. Network analysis of a financial market based on genuine correlation and threshold method. Physica A: Statistical Mechanics and its Applications, 390(21):3835–3841, 2011.

[2] Seth Myers and Jure Leskovec. On the convexity of latent social network inference. In Advances in Neural Information Processing Systems, pages 1741–1749, 2010.

[3] Amr Ahmed and Eric P Xing. Recovering time-varying networks of dependencies in social and biological studies. Proceedings of the National Academy of Sciences, 106(29):11878–11883, 2009.

[4] Paul W Holland, Kathryn Blackmond Laskey, and Samuel Leinhardt. Stochastic blockmodels: First steps. Social Networks, 5(2):109–137, 1983.

[5] Tom AB Snijders and Krzysztof Nowicki. Estimation and prediction for stochastic blockmodels for graphs with latent block structure. Journal of Classification, 14(1):75–100, 1997.

[6] EM Airoldi. The exchangeable graph model. Technical Report 1, Department of Statistics, Harvard University, 2009.

[7] James Lloyd, Peter Orbanz, Zoubin Ghahramani, and Daniel M Roy. Random function priors for exchangeable arrays with applications to graphs and relational data. In Advances in Neural Information Processing Systems, pages 998–1006, 2012.

[8] S Williamson. Nonparametric network models for link prediction. Journal of Machine Learning Research, 17(202):1–21, 2016.

[9] Diana Cai, Trevor Campbell, and Tamara Broderick. Edge-exchangeable graphs and sparsity.
In Advances in Neural Information Processing Systems, pages 4249–4257, 2016.

[10] Manuel Gomez-Rodriguez, David Balduzzi, and Bernhard Schölkopf. Uncovering the temporal dynamics of diffusion networks. In Proceedings of the 28th International Conference on Machine Learning (ICML), 2011.

[11] Manuel Gomez-Rodriguez, Jure Leskovec, and Andreas Krause. Inferring networks of diffusion and influence. In Proceedings of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1019–1028. ACM, 2010.

[12] Manuel Gomez-Rodriguez and Bernhard Schölkopf. Submodular inference of diffusion networks from multiple trees. arXiv preprint arXiv:1205.1671, 2012.

[13] Simon Bourigault, Sylvain Lamprier, and Patrick Gallinari. Representation learning for information diffusion through social networks: an embedded cascade model. In Proceedings of the Ninth ACM International Conference on Web Search and Data Mining, pages 573–582. ACM, 2016.

[14] Jia Wang, Vincent W Zheng, Zemin Liu, and Kevin Chen-Chuan Chang. Topological recurrent neural network for diffusion prediction. arXiv preprint arXiv:1711.10162, 2017.

[15] Cheng Li, Jiaqi Ma, Xiaoxiao Guo, and Qiaozhu Mei. DeepCas: An end-to-end predictor of information cascades. In Proceedings of the 26th International Conference on World Wide Web, pages 577–586. International World Wide Web Conferences Steering Committee, 2017.

[16] Harry Crane and Walter Dempsey. Edge exchangeable models for network data. arXiv preprint arXiv:1603.04571, 2016.

[17] Konstantina Palla, David A Knowles, and Zoubin Ghahramani. An infinite latent attribute model for network data. In Proceedings of the 29th International Conference on Machine Learning, pages 395–402. Omnipress, 2012.

[18] Tue Herlau, Mikkel N Schmidt, and Morten Mørup.
Completely random measures for modelling block-structured sparse networks. In Advances in Neural Information Processing Systems, pages 4260–4268, 2016.

[19] Manuel Gomez-Rodriguez, Jure Leskovec, and Andreas Krause. Inferring networks of diffusion and influence. ACM Transactions on Knowledge Discovery from Data (TKDD), 5(4):21, 2012.

[20] Manuel Gomez-Rodriguez and Bernhard Schölkopf. Submodular inference of diffusion networks from multiple trees. In Proceedings of the 29th International Conference on Machine Learning, pages 1587–1594. Omnipress, 2012.

[21] Manuel Gomez-Rodriguez, David Balduzzi, and Bernhard Schölkopf. Uncovering the temporal dynamics of diffusion networks. arXiv preprint arXiv:1105.0697, 2011.

[22] Manuel Gomez-Rodriguez, Jure Leskovec, and Bernhard Schölkopf. Structure and dynamics of information pathways in online media. In Proceedings of the Sixth ACM International Conference on Web Search and Data Mining, pages 23–32. ACM, 2013.

[23] Edoardo M Airoldi, David M Blei, Stephen E Fienberg, and Eric P Xing. Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9(Sep):1981–2014, 2008.

[24] Charles Kemp, Joshua B Tenenbaum, Thomas L Griffiths, Takeshi Yamada, and Naonori Ueda. Learning systems of concepts with an infinite relational model. In AAAI, volume 3, page 5, 2006.

[25] Katsuhiko Ishiguro, Tomoharu Iwata, Naonori Ueda, and Joshua B Tenenbaum. Dynamic infinite relational model for time-varying relational data analysis. In Advances in Neural Information Processing Systems, pages 919–927, 2010.

[26] Peter D Hoff, Adrian E Raftery, and Mark S Handcock. Latent space approaches to social network analysis. Journal of the American Statistical Association, 97(460):1090–1098, 2002.

[27] Kurt Miller, Michael I Jordan, and Thomas L Griffiths.
Nonparametric latent feature models for link prediction. In Advances in Neural Information Processing Systems, pages 1276–1284, 2009.

[28] K Palla, F Caron, and YW Teh. A Bayesian nonparametric model for sparse dynamic networks. arXiv preprint, 2016.

[29] David J Aldous. Representations for partially exchangeable arrays of random variables. Journal of Multivariate Analysis, 11(4):581–598, 1981.

[30] Douglas N Hoover. Relations on probability spaces and arrays of random variables. Preprint, Institute for Advanced Study, Princeton, NJ, 2, 1979.

[31] Josip Djolonga and Andreas Krause. Learning implicit generative models using differentiable graph tests. arXiv preprint arXiv:1709.01006, 2017.

[32] Russell Lyons. Determinantal probability measures. Publications Mathématiques de l'Institut des Hautes Études Scientifiques, 98(1):167–212, 2003.

[33] Daniel A Spielman and Shang-Hua Teng. Nearly linear time algorithms for preconditioning and solving symmetric, diagonally dominant linear systems. SIAM Journal on Matrix Analysis and Applications, 35(3):835–885, 2014.

[34] Emily B Fox, Erik B Sudderth, Michael I Jordan, and Alan S Willsky. The sticky HDP-HMM: Bayesian nonparametric hidden Markov models with persistent states.

[35] Jure Leskovec and Christos Faloutsos. Scalable modeling of real graphs using Kronecker multiplication. In Proceedings of the 24th International Conference on Machine Learning, ICML '07, pages 497–504, New York, NY, USA, 2007. ACM.

[36] Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. Graphs over time: densification laws, shrinking diameters and possible explanations. In KDD '05: Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pages 177–187, New York, NY, USA, 2005. ACM Press.

[37] Nathan Oken Hodas and Kristina Lerman. The simple rules of social contagion.
CoRR, abs/1308.5015, 2013.

[38] Jure Leskovec, Lars Backstrom, and Jon Kleinberg. Meme-tracking and the dynamics of the news cycle. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '09, pages 497–506, New York, NY, USA, 2009. ACM.

[39] David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 137–146. ACM, 2003.