{"title": "On the Completeness of First-Order Knowledge Compilation for Lifted Probabilistic Inference", "book": "Advances in Neural Information Processing Systems", "page_first": 1386, "page_last": 1394, "abstract": "Probabilistic logics are receiving a lot of attention today because of their expressive power for knowledge representation and learning. However, this expressivity is detrimental to the tractability of inference, when done at the propositional level. To solve this problem, various lifted inference algorithms have been proposed that reason at the first-order level, about groups of objects as a whole. Despite the existence of various lifted inference approaches, there are currently no completeness results about these algorithms. The key contribution of this paper is that we introduce a formal definition of lifted inference that allows us to reason about the completeness of lifted inference algorithms relative to a particular class of probabilistic models. We then show how to obtain a completeness result using a first-order knowledge compilation approach for theories of formulae containing up to two logical variables.", "full_text": "On the Completeness of First-Order Knowledge\nCompilation for Lifted Probabilistic Inference\n\nDepartment of Computer Science, Katholieke Universiteit Leuven\n\nGuy Van den Broeck\n\nCelestijnenlaan 200A, B-3001 Heverlee, Belgium\n\nguy.vandenbroeck@cs.kuleuven.be\n\nAbstract\n\nProbabilistic logics are receiving a lot of attention today because of their expres-\nsive power for knowledge representation and learning. However, this expressivity\nis detrimental to the tractability of inference, when done at the propositional level.\nTo solve this problem, various lifted inference algorithms have been proposed\nthat reason at the \ufb01rst-order level, about groups of objects as a whole. Despite\nthe existence of various lifted inference approaches, there are currently no com-\npleteness results about these algorithms. 
The key contribution of this paper is\nthat we introduce a formal definition of lifted inference that allows us to reason\nabout the completeness of lifted inference algorithms relative to a particular class\nof probabilistic models. We then show how to obtain a completeness result using\na first-order knowledge compilation approach for theories of formulae containing\nup to two logical variables.\n\n1 Introduction and related work\n\nProbabilistic logic models build on first-order logic to capture relational structure and on graphical\nmodels to represent and reason about uncertainty [1, 2]. Due to their expressivity, these models can\nconcisely represent large problems with many interacting random variables. While the semantics of\nthese logics is often defined through grounding the models [3], performing inference at the\npropositional level is \u2013 as for first-order logic \u2013 inefficient. This has motivated the quest for lifted inference\nmethods that exploit the structure of probabilistic logic models for efficient inference, by reasoning\nabout groups of objects as a whole and avoiding repeated computations. The first approaches to\nexact lifted inference have upgraded the variable elimination algorithm to the first-order level [4, 5, 6].\nMore recent work is based on methods from logical inference [7, 8, 9, 10], such as knowledge\ncompilation. While these approaches often yield dramatic improvements in runtime over propositional\ninference methods on specific problems, it is still largely unclear for which classes of models these\nlifted inference operators will be useful and for which ones they will eventually have to resort to\npropositional inference. 
One notable exception in this regard is lifted belief propagation [11], which\nperforms exact lifted inference on any model whose factor graph representation is a tree.\nA first contribution of this paper is that we introduce a notion of domain lifted inference, which\nformally defines what lifting means, and which can be used to characterize the classes of\nprobabilistic models to which lifted inference applies. Domain lifted inference essentially requires that\nprobabilistic inference runs in polynomial time in the domain size of the logical variables appearing\nin the model. As a second contribution we show that the class of models expressed as 2-WFOMC\nformulae (weighted first-order model counting with up to 2 logical variables per formula) can be\ndomain lifted using an extended first-order knowledge compilation approach [10]. The resulting\napproach allows for lifted inference even in the presence of (anti-)symmetric or total relations in a\ntheory. These are extremely common and useful concepts that cannot be lifted by any of the existing\nfirst-order knowledge compilation inference rules.\n\n2 Background\n\nWe will use standard concepts of function-free first-order logic (FOL). An atom p(t1, ..., tn)\nconsists of a predicate p/n of arity n followed by n arguments, which are either constants or logical\nvariables. An atom is ground if it does not contain any variables. A literal is an atom a or its\nnegation \u00aca. A clause is a disjunction l1 \u2228 ... \u2228 lk of literals. If k = 1, it is a unit clause. An expression\nis an atom, literal or clause. The pred(a) function maps an atom to its predicate and the vars(e)\nfunction maps an expression to its logical variables. A theory in conjunctive normal form (CNF) is\na conjunction of clauses. We often represent theories by their set of clauses and clauses by their set\nof literals. 
Furthermore, we will assume that all logical variables are universally quantified.\nIn addition, we associate a set of constraints with each clause or atom, either of the form X \u2260 t,\nwhere X is a logical variable and t is a constant or variable, or of the form X \u2208 D, where D is a\ndomain, or the negation of these constraints. These define a finite domain for each logical variable.\nAbusing notation, we will use constraints of the form X = t to denote a substitution of X by t. The\nfunction atom(e) maps an expression e to its atoms, now associating the constraints on e with each\natom individually. To add the constraint c to an expression e, we use the notation e \u2227 c. Two atoms\nunify if there is a substitution which makes them identical and if the conjunction of the constraints on\nboth atoms with the substitution is satisfiable. Two expressions e1 and e2 are independent, written\ne1 \u22a5\u22a5 e2, if no atom a1 \u2208 atom(e1) unifies with an atom a2 \u2208 atom(e2).\nWe adopt the Weighted First-Order Model Counting (WFOMC) [10] formalism to represent\nprobabilistic logic models, building on the notion of a Herbrand interpretation. Herbrand interpretations\nare subsets of the Herbrand base HB(T), which consists of all ground atoms that can be constructed\nwith the available predicates and constant symbols in T. The atoms in a Herbrand interpretation are\nassumed to be true. All other atoms in HB(T) are assumed to be false. An interpretation I satisfies\na theory T, written as I |= T, if it satisfies all the clauses c \u2208 T. The WFOMC problem is defined\non a weighted logic theory T, which is a logic theory augmented with a positive weight function w\nand a negative weight function w\u0304, which assign a weight to each predicate. 
The WFOMC problem\ninvolves computing\n\n$$\\mathrm{wmc}(T, w, \\bar{w}) = \\sum_{I \\models T} \\; \\prod_{a \\in I} w(\\mathrm{pred}(a)) \\prod_{a \\in \\mathrm{HB}(T) \\setminus I} \\bar{w}(\\mathrm{pred}(a)). \\qquad (1)$$\n\n3 First-order knowledge compilation for lifted probabilistic inference\n\n3.1 Lifted probabilistic inference\n\nA first-order probabilistic model defines a probability distribution P over the set of Herbrand\ninterpretations H. Probabilistic inference in these models is concerned with computing the posterior\nprobability P(q|e) of query q given evidence e, where q and e are logical expressions in general:\n\n$$P(q \\mid e) = \\frac{\\sum_{h \\in H,\\, h \\models q \\wedge e} P(h)}{\\sum_{h \\in H,\\, h \\models e} P(h)} \\qquad (2)$$\n\nWe propose one notion of lifted inference for first-order probabilistic models, defined in terms of the\ncomputational complexity of inference w.r.t. the domains of logical variables. It is clear that other\nnotions of lifted inference are conceivable, especially in the case of approximate inference.\nDefinition 1 (Domain Lifted Probabilistic Inference). A probabilistic inference procedure is domain\nlifted for a model m, query q and evidence e iff the inference procedure runs in polynomial time in\n|D1|, ..., |Dk| with Di the domain of the logical variable vi \u2208 vars(m, q, e).\nDomain lifted inference does not prohibit the algorithm from being exponential in the size of the\nvocabulary, that is, the number of predicates, arguments and constants, of the probabilistic model, query\nand evidence. In fact, the definition allows inference to be exponential in the number of constants\nwhich occur in arguments of atoms in the theory, query or evidence, as long as it is polynomial in\nthe cardinality of the logical variable domains. 
This definition of lifted inference stresses the ability\nto efficiently deal with the domains of the logical variables that arise, regardless of their size, and\nformalizes what seems to be generally accepted in the lifted inference literature.\n\nA class of probabilistic models is a set of probabilistic models expressed in a particular formalism.\nAs examples, consider Markov logic networks (MLN) [12] or parfactors [4], or the weighted FOL\ntheories for WFOMC that we introduced above, when the weights are normalized.\nDefinition 2 (Completeness). Restricting queries to atoms and evidence to a conjunction of literals,\na procedure that is domain lifted for all probabilistic models m in a class of models M and for all\nqueries q and evidence e, is called complete for M.\n\n3.2 First-order knowledge compilation\n\nFirst-order knowledge compilation is an approach to lifted probabilistic inference consisting of the\nfollowing three steps (see Van den Broeck et al. [10] for details):\n\n1. Convert the probabilistic logical model to a weighted CNF. 
Converting MLNs or parfactors\nrequires adding new atoms to the theory that represent the (truth) value of each factor or formula.\n\n(a) MLN Model\n2  friends(X, Y) \u2227 smokes(X) \u21d2 smokes(Y)\n\n(b) CNF Theory\nsmokes(Y) \u2228 \u00ac smokes(X) \u2228 \u00ac friends(X, Y) \u2228 \u00ac f(X, Y)\nfriends(X, Y) \u2228 f(X, Y)\nsmokes(X) \u2228 f(X, Y)\n\u00ac smokes(Y) \u2228 f(X, Y).\n\n(c) Weight Functions\nPredicate | w   | w\u0304\nfriends   | 1   | 1\nsmokes    | 1   | 1\nf         | e^2 | 1\n\n(d) First-Order d-DNNF Circuit\n[Circuit diagram. Node labels: set-disjunction over Smokers \u2286 People; decomposable conjunction; unit clause leaves smokes(X), X \u2208 Smokers; \u00ac smokes(Y), Y \u2209 Smokers; f(X, Y), Y \u2208 Smokers; f(X, Y), X \u2209 Smokers, Y \u2209 Smokers; set-conjunction over x \u2208 Smokers, y \u2209 Smokers; deterministic disjunction over the leaves f(x, y), \u00ac friends(x, y), friends(x, y), \u00ac f(x, y).]\n\nFigure 1: Friends-smokers example (taken from [10])\n\nExample 1. The MLN in Figure 1a assigns a weight to a formula in FOL. Figure 1b represents\nthe same model as a weighted CNF, introducing a new atom f(X, Y) to encode the truth value of\nthe MLN formula. The probabilistic information is captured by the weight functions in Figure 1c.\n\n2. Compile the logical theory into a First-Order d-DNNF (FO d-DNNF) circuit. Figure 1d shows\nan example of such a circuit. Leaves represent unit clauses. Inner nodes represent the\ndisjunction or conjunction of their children l and r, but with the constraint that disjunctions must be\ndeterministic (l \u2227 r is unsatisfiable) and conjunctions must be decomposable (l \u22a5\u22a5 r).\n\n3. 
Perform WFOMC inference to compute posterior probabilities. In a FO d-DNNF circuit,\nWFOMC is polynomial in the size of the circuit and the cardinality of the domains.\n\nTo compile the CNF theory into a FO d-DNNF circuit, Van den Broeck et al. [10] propose a set of\ncompilation rules, which we will refer to as CR1. We will now briefly describe these rules.\nUnit Propagation introduces a decomposable conjunction when the theory contains a unit clause.\nIndependence creates a decomposable conjunction when the theory contains independent subtheories.\nShannon decomposition applies when the theory contains ground atoms and introduces a\ndeterministic disjunction between two modified theories: one where the ground atom is true, and one where\nit is false. Shattering splits clauses in the theory until all pairs of atoms represent either a disjoint or\nidentical set of ground atoms.\nExample 2. In Figure 2a, the first two clauses are made independent from the friends(X, X) clause\nand split off in a decomposable conjunction by unit propagation. 
The unit clause becomes a leaf of\nthe FO d-DNNF circuit, while the other operand requires further compilation.\n\n(a) Unit propagation of friends(X, X)\nBefore:\nfriends(X, Y) \u2228 dislikes(X, Y)\n\u00ac friends(X, Y) \u2228 likes(X, Y)\nfriends(X, X)\nAfter (decomposable conjunction \u2227 of the leaf friends(X, X) and the theory):\nfriends(X, Y) \u2228 dislikes(X, Y), X \u2260 Y\n\u00ac friends(X, Y) \u2228 likes(X, Y), X \u2260 Y\nlikes(X, X)\n\n(b) Independent partial grounding\nBefore:\ndislikes(X, Y) \u2228 friends(X, Y)\nfun(X) \u2228 \u00ac friends(X, Y)\nAfter (set-conjunction over x \u2208 People):\ndislikes(x, Y) \u2228 friends(x, Y)\nfun(x) \u2228 \u00ac friends(x, Y)\n\n(c) Atom counting of fun(X)\nBefore:\nfun(X) \u2228 \u00ac friends(X, Y)\nfun(X) \u2228 \u00ac friends(Y, X)\nAfter (set-disjunction over FunPeople \u2286 People):\nfun(X), X \u2208 FunPeople\n\u00ac fun(X), X \u2209 FunPeople\nfun(X) \u2228 \u00ac friends(X, Y)\nfun(X) \u2228 \u00ac friends(Y, X)\n\nFigure 2: Examples of compilation rules. Circles are FO d-DNNF inner nodes. White rectangles\nshow theories before and after applying the rule. All variable domains are People. (taken from [10])\n\nIndependent Partial Grounding creates a decomposable conjunction over a set of child circuits,\nwhich are identical up to the value of a grounding constant. Since they are structurally identical,\nonly one child circuit is actually compiled. Atom Counting applies when the theory contains an atom\nwith a single logical variable X \u2208 D. It explicitly represents the domain D_\u22a4 \u2286 D of X for which\nthe atom is true. It compiles the theory into a deterministic disjunction between all possible such\ndomains. Again, these child circuits are identical up to the value of D_\u22a4 and only one is compiled.\nExample 3. The theory in Figure 2b is compiled into a decomposable set-conjunction of theories\nthat are independent and identical up to the value of the x constant. The theory in Figure 2c contains\nan atom with one logical variable: fun(X). 
Atom counting compiles it into a deterministic\nset-disjunction over theories that differ in FunPeople, which is the domain of X for which fun(X) is\ntrue. Subsequent steps of unit propagation remove the fun(X) atoms from the theory entirely.\n\n3.3 Completeness\n\nWe will now characterize those theories where the CR1 compilation rules cannot be used, and where\nthe inference procedure has to resort to grounding out the theory to propositional logic. For these,\nfirst-order knowledge compilation using CR1 is not yet domain lifted.\nWhen a logical theory contains symmetric, anti-symmetric or total relations, such as\n\nfriends(X, Y) \u21d2 friends(Y, X),    (3)\nparent(X, Y) \u21d2 \u00ac parent(Y, X), X \u2260 Y,    (4)\n\u2264(X, Y) \u2228 \u2264(Y, X),    (5)\n\nor more general formulas, such as\n\nenemies(X, Y) \u21d2 \u00ac friend(X, Y) \u2227 \u00ac friend(Y, X),    (6)\n\nnone of the CR1 rules apply. Intuitively, the underlying problem is the presence of either:\n\u2022 Two unifying (not independent) atoms in the same clause which contain the same logical variable\nin different positions of the argument list. Examples include (the CNF of) Formulas 3, 4 and 5,\nwhere the X and Y variables are bound by unifying two atoms from the same clause.\n\u2022 Two logical variables that bind when unifying one pair of atoms but appear in different positions\nof the argument list of two other unifying atoms. Examples include Formula 6, which in CNF is\n\n\u00ac friend(X, Y) \u2228 \u00ac enemies(X, Y)\n\u00ac friend(Y, X) \u2228 \u00ac enemies(X, Y)\n\nHere, unifying the enemies(X, Y) atoms binds the X variables from both clauses, which appear\nin different positions of the argument lists of the unifying atoms friend(X, Y) and friend(Y, X).\n\nBoth of these properties preclude the use of CR1 rules. 
Also in the context of other model classes,\nsuch as MLNs, probabilistic versions of the above formulas cannot be processed by CR1 rules.\n\nEven though first-order knowledge compilation with CR1 rules does not have a clear completeness\nresult, we can show some properties of theories to which none of the compilation rules apply. First,\nwe need to distinguish between the arity of an atom and its dimension. A predicate with arity two\nmight have atoms with dimension one, when one of the arguments is ground or both are identical.\nDefinition 3 (Dimension of an Expression). The dimension of an expression e is the number of\nlogical variables it contains: dim(e) = |vars(e)|.\nLemma 1 (CR1 Postconditions). The CR1 rules remove all atoms from the theory T which have\nzero or one logical variable arguments, such that afterwards \u2200a \u2208 atom(T) : dim(a) > 1. When\nno CR1 rule applies, the theory is shattered and contains no independent subtheories.\nProof. Ground atoms are removed by the Shannon decomposition operator followed by unit\npropagation. Atoms with a single logical variable (including unary relations) are removed by the atom\ncounting operator followed by unit propagation. If T contains independent subtheories, the\nindependence operator can be applied. Shattering is always applied when T is not yet shattered.\n\n4 Extending first-order knowledge compilation\n\nIn this section we introduce a new operator which does apply to the theories from Section 3.3.\n\n4.1 Logical variable properties\n\nTo formally define the operator we propose, and prove its correctness, we first introduce some\nmathematical concepts related to the logical variables in a theory (partly after Jha et al. [8]).\nDefinition 4 (Binding Variables). Two logical variables X, Y are directly binding b(X, Y) if they\nare bound by unifying a pair of atoms in the theory. 
The binding relationship b^+(X, Y) is the\ntransitive closure of the directly binding relation b(X, Y).\nExample 4. In the theory\n\n\u00ac p(W, X) \u2228 \u00ac q(X)\nr(Y) \u2228 \u00ac q(Y)\n\u00ac r(Z) \u2228 s(Z)\n\nthe variable pairs (X, Y) and (Y, Z) are directly binding. The variables X, Y and Z are binding.\nVariable W does not bind to any other variable. Note that the binding relationship b^+(X, Y) is an\nequivalence relation that defines two equivalence classes: {X, Y, Z} and {W}.\nLemma 2 (Binding Domains). After shattering, binding logical variables have identical domains.\n\nProof. During shattering (see Section 3.2), when two atoms unify, binding two variables with\npartially overlapping domains, the atoms\u2019 clauses are split up into clauses where the domain of the\nvariables is identical, and clauses where the domains are disjoint and the atoms no longer unify.\nDefinition 5 (Root Binding Class). A root variable is a variable that appears in all the atoms in its\nclause. A root binding class is an equivalence class of binding variables where all variables are root.\nExample 5. In the theory of Example 4, {X, Y, Z} is a root binding class and {W} is not.\n\n4.2 Domain recursion\n\nWe will now introduce the new domain recursion operator, starting with its preconditions.\nDefinition 6. A theory allows for domain recursion when (i) the theory is shattered, (ii) the theory\ncontains no independent subtheories and (iii) there exists a root binding class.\n\nFrom now on, we will denote with C the set of clauses of the theory at hand and with B a root\nbinding class guaranteed to exist if C allows for domain recursion. Lemma 2 states that all variables\nin B have identical domains. We will denote the domain of these variables with D.\nThe intuition behind the domain recursion operator is that it modifies D by making one element\nexplicit: D = D' \u222a {x_D} with x_D \u2209 D'. 
This explicit domain element is introduced by the SPLITD\nfunction, which splits clauses w.r.t. the new subdomain D' and element x_D.\n\nDefinition 7 (SPLITD). For a clause c and given set of variables $\\mathcal{V}_c$ \u2286 vars(c) with domain D, let\n\n$$\\mathrm{SPLITD}(c, \\mathcal{V}_c) = \\begin{cases} c, & \\text{if } \\mathcal{V}_c = \\emptyset \\\\ \\mathrm{SPLITD}(c_1, \\mathcal{V}_c \\setminus \\{V\\}) \\cup \\mathrm{SPLITD}(c_2, \\mathcal{V}_c \\setminus \\{V\\}), & \\text{if } \\mathcal{V}_c \\neq \\emptyset \\end{cases} \\qquad (7)$$\n\nwhere c_1 = c \u2227 (V = x_D) and c_2 = c \u2227 (V \u2260 x_D) \u2227 (V \u2208 D') for some V \u2208 $\\mathcal{V}_c$. For a set of\nclauses C and set of variables $\\mathcal{V}$ with domain D:\n\n$$\\mathrm{SPLITD}(C, \\mathcal{V}) = \\bigcup_{c \\in C} \\mathrm{SPLITD}(c, \\mathcal{V} \\cap \\mathrm{vars}(c)).$$\n\nThe domain recursion operator creates three sets of clauses: SPLITD(C, B) = C_x \u222a C_v \u222a C_r, with\n\n$$C_x = \\{ c \\wedge \\bigwedge_{V \\in B \\cap \\mathrm{vars}(c)} (V = x_D) \\mid c \\in C \\}, \\qquad (8)$$\n\n$$C_v = \\{ c \\wedge \\bigwedge_{V \\in B \\cap \\mathrm{vars}(c)} (V \\neq x_D) \\wedge (V \\in D') \\mid c \\in C \\}, \\qquad (9)$$\n\n$$C_r = \\mathrm{SPLITD}(C, B) \\setminus C_x \\setminus C_v. \\qquad (10)$$\n\nProposition 3. The conjunction of the domain recursion sets is equivalent to the original theory:\n\n$$\\bigwedge_{c \\in C} c \\equiv \\bigwedge_{c \\in \\mathrm{SPLITD}(C, B)} c \\quad \\text{and therefore} \\quad \\bigwedge_{c \\in C} c \\equiv \\Big(\\bigwedge_{c \\in C_x} c\\Big) \\wedge \\Big(\\bigwedge_{c \\in C_v} c\\Big) \\wedge \\Big(\\bigwedge_{c \\in C_r} c\\Big).$$\n\nWe will now show that these sets are independent and that their conjunction is decomposable.\nTheorem 4. The theories C_x, C_v and C_r are independent: C_x \u22a5\u22a5 C_v, C_x \u22a5\u22a5 C_r and C_v \u22a5\u22a5 C_r.\nThe proof of Theorem 4 relies on the following Lemma.\nLemma 5. If the theory allows for domain recursion, all clauses and atoms contain the same number\nof variables from B:\n\n$$\\exists n, \\forall c \\in C, \\forall a \\in \\mathrm{atom}(C) : |\\mathrm{vars}(c) \\cap B| = |\\mathrm{vars}(a) \\cap B| = n.$$\n\nProof. Denote with C_n the clauses in C that contain n logical variables from B and with C_n^c its\ncomplement in C. If C is nonempty, there is an n > 0 for which C_n is nonempty. 
Then every atom\nin C_n contains exactly n variables from B (Definition 5). Since the theory contains no independent\nsubtheories, there must be an atom a in C_n which unifies with an atom a_c in C_n^c, or C_n^c is empty.\nAfter shattering, all unifications bind one variable from a to a single variable from a_c. Because a\ncontains exactly n variables from B, a_c must also contain exactly n (Definition 4), and because B is\na root binding class, the clause of a_c also contains exactly n, which contradicts the definition of C_n^c.\nTherefore, C_n^c is empty, and because the variables in B are root, they also appear in all atoms.\n\nProof of Theorem 4. From Lemma 5, all atoms in C contain the same number of variables from B.\nIn C_x, these variables are all constrained to be equal to x_D, while in C_v and C_r at least one variable\nis constrained to be different from x_D. An attempt to unify an atom from C_x with an atom from C_v\nor C_r therefore creates an unsatisfiable set of constraints. Similarly, atoms from C_v and C_r cannot\nbe unified.\n\nFinally, we extend the FO d-DNNF language proposed in Van den Broeck et al. [10] with a new\nnode, the recursive decomposable conjunction \u2227_r, and define the domain recursion compilation\nrule.\nDefinition 8 (\u2227_r). The FO d-DNNF node \u2227_r(n_x, n_r, D, D', V) represents a decomposable\nconjunction between the d-DNNF nodes n_x, n_r and a d-DNNF node isomorphic to the \u2227_r node itself.\nIn particular, the isomorphic operand is identical to the node itself, except for the size of the domain\nof the variables in V, which becomes one smaller, going from D to D' in the isomorphic operand.\n\nWe have shown that the conjunction between sets C_x, C_v and C_r is decomposable (Theorem 4) and\nlogically equivalent to the original theory (Proposition 3). Furthermore, C_v is identical to C, up\nto the constraints on the domain of the variables in B. 
This leads us to the following definition of\ndomain recursion.\nDefinition 9 (Domain Recursion). The domain recursion compilation rule compiles C into\n\u2227_r(n_x, n_r, D, D', B), where n_x, n_r are the compiled circuits for C_x, C_r. The third set C_v is\nrepresented by the recursion on D, according to Definition 8.\n\n[Circuit diagram. The root \u2227_r node has three operands. n_x, compiled from C_x = {\u00ac friends(x, x) \u2228 friends(x, x)}: a disjunction of the leaves \u00ac friends(x, x) and friends(x, x). n_r, compiled from C_r = {\u00ac friends(x, X) \u2228 friends(X, x), X \u2260 x; \u00ac friends(X, x) \u2228 friends(x, X), X \u2260 x}: a set-conjunction over x' \u2208 Person, x' \u2260 x of a disjunction between \u00ac friends(x, x') \u2227 \u00ac friends(x', x) and friends(x, x') \u2227 friends(x', x). n_v: the recursion, with Person \u2190 Person \\ {x}.]\n\nFigure 3: Circuit for the symmetric relation in Equation 3, rooted in a recursive conjunction.\n\nExample 6. Figure 3 shows the FO d-DNNF circuit for Equation 3. The theory is split up into\nthree independent theories: C_r and C_x, shown in Figure 3, and C_v = {\u00ac friends(X, Y) \u2228\nfriends(Y, X), X \u2260 x, Y \u2260 x}. The conjunction of these theories is equivalent to Equation 3.\nTheory C_v is identical to Equation 3, up to the inequality constraints on X and Y.\nTheorem 6. Given a function size, which maps domains to their size, the weighted first-order model\ncount of a \u2227_r(n_x, n_r, D, D', V) node is\n\n$$\\mathrm{wmc}(\\wedge_r(n_x, n_r, D, D', V), \\mathrm{size}) = \\mathrm{wmc}(n_x, \\mathrm{size})^{\\mathrm{size}(D)} \\prod_{s=0}^{\\mathrm{size}(D)-1} \\mathrm{wmc}(n_r, \\mathrm{size} \\cup \\{D' \\mapsto s\\}), \\qquad (11)$$\n\nwhere size \u222a {D' \u21a6 s} adds to the size function that the subdomain D' has cardinality s.\nProof. 
If C allows for domain recursion, due to Theorem 4, the weighted model count is\n\n$$\\mathrm{wmc}(C, \\mathrm{size}) = \\begin{cases} 1, & \\text{if } \\mathrm{size}(D) = 0 \\\\ \\mathrm{wmc}(C_x) \\cdot \\mathrm{wmc}(C_v, \\mathrm{size}') \\cdot \\mathrm{wmc}(C_r, \\mathrm{size}'), & \\text{if } \\mathrm{size}(D) > 0 \\end{cases} \\qquad (12)$$\n\nwhere size' = size \u222a {D' \u21a6 size(D) \u2212 1}.\nTheorem 7. The Independent Partial Grounding compilation rule is a special case of the domain\nrecursion rule, where \u2200c \u2208 C : |vars(c) \u2229 B| = 1 (and therefore C_r = \u2205).\n\n4.3 Completeness\n\nIn this section, we introduce a class of models for which first-order knowledge compilation with\ndomain recursion is complete.\nDefinition 10 (k-WFOMC). The class of k-WFOMC consists of WFOMC theories with clauses that\nhave up to k logical variables.\n\nA first completeness result is for 2-WFOMC, using the set of knowledge compilation rules CR2,\nwhich are the rules in CR1 extended with domain recursion.\nTheorem 8 (Completeness for 2-WFOMC). First-order knowledge compilation using the CR2\ncompilation rules is a complete domain lifted probabilistic inference algorithm for 2-WFOMC.\n\nProof. From Lemma 1, after applying the CR1 rules, the theory contains only atoms with dimension\nlarger than or equal to two. From Definition 10, each clause has dimension smaller than or equal to\ntwo. Therefore, each logical variable in the theory is a root variable and according to Definition 5,\nevery equivalence class of binding variables is a root binding class. Because of Lemma 1, the theory\nallows for domain recursion, which requires further compilation of two theories: C_x and C_r into n_x\nand n_r. Both have dimension smaller than 2 and can be lifted by CR1 compilation rules.\n\nThe properties of 2-WFOMC are a sufficient but not necessary condition for first-order knowledge\ncompilation to be domain lifted. We can obtain a similar result for MLNs or parfactors by reducing\nthem to a WFOMC problem. 
If an MLN contains only formulae with up to k logical variables, then\nits WFOMC representation will be in k-WFOMC.\n\nThis result for 2-WFOMC is not trivial. Van den Broeck et al. [10] showed in their experiments that\ncounting first-order variable elimination (C-FOVE) [6] fails to lift the \u201cFriends Smoker Drinker\u201d\nproblem, which is in 2-WFOMC. We will show in the next section that the CR1 rules fail to lift\nthe theory in Figure 4a, which is in 2-WFOMC. Note that there are also useful theories that are not\nin 2-WFOMC, such as those containing the transitive relation friends(X, Y) \u2227 friends(Y, Z) \u21d2\nfriends(X, Z).\n\n5 Empirical evaluation\n\nTo complement the theoretical results of the previous section, we extended the WFOMC\nimplementation\u00b9 with the domain recursion rule. We performed experiments with the theory in Figure 4a,\nwhich is a version of the friends and smokers model [11] extended with the symmetric relation of\nEquation 3. We evaluate the performance querying P(smokes(bob)) with increasing domain size,\ncomparing our approach to the existing WFOMC implementation and its propositional counterpart,\nwhich first grounds the theory and then compiles it with the c2d compiler [13] to a propositional\nd-DNNF circuit. We did not compare to C-FOVE [6] because it cannot perform lifted inference on\nthis model.\nPropositional inference quickly becomes intractable when there are more than 20 people. The lifted\ninference algorithms scale much better. The CR1 rules can exploit some regularities in the model.\nFor example, they eliminate all the smokes(X) atoms from the theory. They do, however, resort\nto grounding at a later stage of the compilation process. With the domain recursion rule, there is\nno need for grounding. This advantage is clear in the experiments: our approach has an almost\nconstant inference time in this range of domain sizes. 
Note that the runtimes for c2d include\ncompilation and evaluation of the circuit, whereas the WFOMC runtimes only represent evaluation\nof the FO d-DNNF. After all, propositional compilation depends on the domain size but first-order\ncompilation does not. First-order compilation takes a constant two seconds for both rule sets.\n\n(a) MLN Model\n2  smokes(X) \u2227 friends(X, Y) \u21d2 smokes(Y)\nfriends(X, Y) \u21d2 friends(Y, X).\n\n(b) Evaluation Runtime\n[Plot. Runtime [s] on a logarithmic axis from 0.01 to 10000 against Number of People from 10 to 80; series: c2d, WFOMC - CR1, WFOMC - CR2.]\n\nFigure 4: Symmetric friends and smokers experiment, comparing propositional knowledge\ncompilation (c2d) to WFOMC using compilation rules CR1 and CR2 (which includes domain recursion).\n\n6 Conclusions\n\nWe proposed a definition of complete domain lifted probabilistic inference w.r.t. classes of\nprobabilistic logic models. This definition considers algorithms to be lifted if they are polynomial in\nthe size of logical variable domains. Existing first-order knowledge compilation turns out not to\nadmit an intuitive completeness result. Therefore, we generalized the existing Independent Partial\nGrounding compilation rule to the domain recursion rule. With this one extra rule, we showed that\nfirst-order knowledge compilation is complete for a significant class of probabilistic logic models,\nwhere the WFOMC representation has up to two logical variables per clause.\n\nAcknowledgments\nThe author would like to thank Luc De Raedt, Jesse Davis and the anonymous reviewers for valuable\nfeedback. This work was supported by the Research Foundation-Flanders (FWO-Vlaanderen).\n\n\u00b9 http://dtai.cs.kuleuven.be/wfomc/\n\nReferences\n[1] Lise Getoor and Ben Taskar, editors. An Introduction to Statistical Relational Learning. MIT\nPress, 2007.\n\n[2] Luc De Raedt, Paolo Frasconi, Kristian Kersting, and Stephen Muggleton, editors. 
Probabilistic inductive logic programming: theory and applications. Springer-Verlag, Berlin, Heidelberg,\n2008.\n\n[3] Daan Fierens, Guy Van den Broeck, Ingo Thon, Bernd Gutmann, and Luc De Raedt. Inference\nin probabilistic logic programs using weighted CNF\u2019s. In Proceedings of UAI, pages 256\u2013265,\n2011.\n\n[4] David Poole. First-order probabilistic inference. In Proceedings of IJCAI, pages 985\u2013991,\n2003.\n\n[5] Rodrigo de Salvo Braz, Eyal Amir, and Dan Roth. Lifted first-order probabilistic inference. In\nProceedings of IJCAI, pages 1319\u20131325, 2005.\n\n[6] Brian Milch, Luke S. Zettlemoyer, Kristian Kersting, Michael Haimes, and Leslie Pack\nKaelbling. Lifted Probabilistic Inference with Counting Formulas. In Proceedings of AAAI, pages\n1062\u20131068, 2008.\n\n[7] Vibhav Gogate and Pedro Domingos. Exploiting Logical Structure in Lifted Probabilistic\nInference. In Proceedings of StarAI, 2010.\n\n[8] Abhay Jha, Vibhav Gogate, Alexandra Meliou, and Dan Suciu. Lifted Inference Seen from the\nOther Side: The Tractable Features. In Proceedings of NIPS, 2010.\n\n[9] Vibhav Gogate and Pedro Domingos. Probabilistic theorem proving. In Proceedings of UAI,\npages 256\u2013265, 2011.\n\n[10] Guy Van den Broeck, Nima Taghipour, Wannes Meert, Jesse Davis, and Luc De Raedt. Lifted\nProbabilistic Inference by First-Order Knowledge Compilation. In Proceedings of IJCAI,\npages 2178\u20132185, 2011.\n\n[11] Parag Singla and Pedro Domingos. Lifted first-order belief propagation. In Proceedings of\nAAAI, pages 1094\u20131099, 2008.\n\n[12] Matthew Richardson and Pedro Domingos. Markov logic networks. Machine Learning, 62(1):\n107\u2013136, 2006.\n\n[13] Adnan Darwiche. New advances in compiling CNF to decomposable negation normal form.\nIn Proceedings of ECAI, pages 328\u2013332, 2004.\n", "award": [], "sourceid": 805, "authors": [{"given_name": "Guy", "family_name": "Broeck", "institution": null}]}