{"title": "Counting the Optimal Solutions in Graphical Models", "book": "Advances in Neural Information Processing Systems", "page_first": 12114, "page_last": 12124, "abstract": "We introduce #opt, a new inference task for graphical models which calls for counting the number of optimal solutions of the model. We describe a novel variable elimination based approach for solving this task, as well as a depth-first branch and bound algorithm that traverses the AND/OR search space of the model. The key feature of the proposed algorithms is that their complexity is exponential in the induced width of the model only. It does not depend on the actual number of optimal solutions. Our empirical evaluation on various benchmarks demonstrates the effectiveness of the proposed algorithms compared with existing depth-first and best-first search based approaches that enumerate explicitly the optimal solutions.", "full_text":

Counting the Optimal Solutions in Graphical Models

Radu Marinescu
IBM Research, Dublin, Ireland
radu.marinescu@ie.ibm.com

Rina Dechter
University of California, Irvine, Irvine, CA 92697, USA
dechter@ics.uci.edu

Abstract

We introduce #opt, a new inference task for graphical models which calls for counting the number of optimal solutions of the model. We describe a novel variable elimination based approach for solving this task, as well as a depth-first branch and bound algorithm that traverses the AND/OR search space of the model. The key feature of the proposed algorithms is that their complexity is exponential in the induced width of the model only. It does not depend on the actual number of optimal solutions.
Our empirical evaluation on various benchmarks demonstrates the effectiveness of the proposed algorithms compared with existing depth-first and best-first search based approaches that enumerate explicitly the optimal solutions.

1 Introduction

Graphical models such as belief networks, Markov networks, constraint networks or influence diagrams provide a powerful framework for reasoning with probabilistic and deterministic information. Combinatorial optimization tasks such as finding the minimum or maximum cost solutions arise in many applications and often can be solved efficiently by search or variable elimination schemes. Although finding the optimal solution is paramount in many practical situations, we argue that it is important to also know how many optimal solutions there are and, possibly, to enumerate all or just a fraction thereof. Indeed, in genetic linkage analysis, finding the number of maximum likelihood haplotype configurations may shed additional light on how the genetic information is transmitted from ancestors to descendants in a pedigree [1]. In computational protein design, finding the number of optimal protein side-chain resonance assignments could be indicative of the protein structure determination [2]. Similarly, knowing the number of optimal frequency assignments to the radio links in a communication network could help engineers produce more reliable network designs [3]. In post-optimality analysis we may be interested in estimating the distribution of optimal solutions over the values of a certain target variable, for which we clearly require the count of optimal solutions. The number of optimal solutions may also be used as a feature for explaining the hardness of finding an optimal solution to a problem instance.
It thus may be employed to guide a random problem generator to produce hard problem instances for optimization.

One approach that gained attention in the past decade has focused on knowledge compilation techniques that produce a more compact representation of all optimal solutions [4, 5]. In particular, [5] described an efficient way to compile a graphical model into a compact AND/OR Multi-Valued Decision Diagram (AOMDD) which represents all its optimal solutions. More recently, [6] introduced a collection of depth-first and best-first search algorithms for computing the set of m-best solutions of a graphical model. However, both compilation-based approaches and specialized m-best algorithms yield a count of the optimal solutions only by enumeration. In particular, the compilation-based techniques typically count the number of optimal solutions during a secondary pass over the compiled decision diagram, whereas the m-best algorithms must rely on a sequence of searches, each with a different value of m, in order to recover the actual number of optimal solutions. In contrast, the algorithms we present and explore do not depend on the number of optimal solutions.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

Figure 1: A simple graphical model. (a) Primal graph; (b) pseudo tree; (c) functions.

Contributions In this paper, we define the #opt task for graphical models, and show how the common algorithmic principles of variable elimination and search can be extended to this task. Specifically, the idea behind variable elimination for #opt is to first capture all the optimal solutions as a constraint representation by flattening the relation of the cost-to-go functions, and then apply the counting variable elimination algorithm over the resulting set of constraints. In our algorithm these two phases are interleaved.
Subsequently, we present a depth-first branch and bound algorithm that traverses an AND/OR search space of the graphical model. For each node the algorithm computes a pair of values, one representing the optimal cost below the node in the search space and the other the number of the corresponding optimal solutions. We also provide a formulation of the #opt task within the semiring framework [7], thus placing it in relation to other well known graphical model tasks. It is well known that counting the number of all solutions in a graphical model can be done without enumeration in complexity exponential in the induced width of the model [8, 9]. We show here that counting the number of optimal solutions can be done without enumeration as well, with the same complexity. An empirical evaluation on various benchmark problems demonstrates the effectiveness of the proposed algorithms compared with enumeration-based extensions of depth-first and best-first search to this task.

The #opt task involves both summation and optimization. Other tasks involving these two operations are Marginal MAP and finding optimal policies that maximize the expected utility, defined over influence diagrams [10, 11, 12]. But there is a notable difference. While, as we will show, #opt can be formulated within the semiring framework [7] and solved exactly by traditional algorithms [13], it does not lend itself to the common approximation schemes such as the mini-bucket approach [14], at least not in a straightforward manner. This, we think, makes it unique and deserving of more focused attention.

2 Background

A graphical model is a tuple M = ⟨X, D, F⟩, where X = {Xi : i ∈ V} is a set of variables indexed by set V and D = {Di : i ∈ V} is the set of their finite domains of values.
F = {ψα : α ∈ F} is a set of discrete real-valued local functions defined on subsets of variables, where we use α ⊆ V and Xα ⊆ X to indicate the scope of function ψα, i.e., Xα = var(ψα) = {Xi : i ∈ α}. A solution is a complete assignment to the variables, namely x = (X1 = x1, ..., Xn = xn). Given a set of variables S = {X1, ..., Xk}, we denote by Ω(S) the Cartesian product of their domains, namely Ω(S) = D1 × ··· × Dk. The function scopes yield a primal graph G whose vertices are the variables and whose edges connect any two variables that appear in the scope of the same function. The graphical model M represents a global function, whose scope is X and which is the combination of all the local functions, namely: F(x) = Σ_{α∈F} ψα(xα).

The most common optimization task (opt) for graphical models is to compute the optimal value V* = min_x F(x) and its optimizing configuration x* = argmin_x F(x). The latter is also known as the optimal solution. However, a graphical model may have more than one optimal solution. We next define formally the task of counting the number of optimal solutions of a graphical model, which we abbreviate hereafter by #opt.

DEFINITION 1 (#opt). Given a graphical model M = ⟨X, D, F⟩, the #opt task is to compute |S|, where S = {x | F(x) = V*, V* = min_x Σ_{α∈F} ψα(xα)}.

We also define the task e-opt which calls for enumerating explicitly all the optimal solutions:

DEFINITION 2 (e-opt).
Given a graphical model M = ⟨X, D, F⟩, the e-opt task is to enumerate all elements in the set S = {x | F(x) = V*, V* = min_x Σ_{α∈F} ψα(xα)}.

An important feature of graphical models in characterizing complexity is the induced width (or treewidth). The induced graph of G relative to an ordering τ of its variables is obtained by processing the nodes in reverse order of τ. For each node all its earlier neighbors are connected, including neighbors connected by previously added edges. The width of a node is the number of edges connecting it to nodes lower in the ordering. The induced width of G along τ, denoted w*(τ), is the maximum width of the nodes in the induced graph.

Example 1. Figure 1(a) depicts the primal graph of a simple graphical model representing a global function over 4 variables X = {A, B, C, D} with 4 local functions (shown in Figure 1(c)) defined by the arcs (each pair is the scope of one function). There are 6 optimal solutions with optimal value 3, namely S = {0000, 0010, 1000, 1001, 1010, 1011}.

We describe next brute-force search and variable elimination based schemes for counting the optimal solutions.

Depth-First Branch and Bound Solving #opt can be done by a simple extension of depth-first branch and bound search [15]. The algorithm, called BnB, traverses the space of partial assignments in a depth-first manner while maintaining an upper bound U on the optimal value and a counter c which is updated every time a new solution value V is found: c = c + 1 if V = U and c = 1 if V < U, respectively. In the latter case, V becomes the current best upper bound. Throughout the search, the algorithm also attempts to prune unpromising regions of the search space.
Namely, at each node n it computes a heuristic lower bound f(n) on the best solution extending the current partial assignment and prunes the respective subtree if the heuristic estimate is strictly greater than the current upper bound (f(n) > U). The strict inequality is required in order to find all optimal solutions. When search terminates, the value of the counter c gives the number of optimal solutions.

Best-First Search Alternatively, we can use a best-first search strategy such as A* search [16]. Algorithm A* for #opt maintains the search frontier in the OPEN list and always expands first the node n with the smallest f-value f(n) = g(n) + h(n), where g(n) is the cost of the path from the root of the search space to n and h(n) is a lower bound on the best extension to a solution [17]. When the optimal solution is encountered, the algorithm saves its value V*, initializes a counter c to 1 and continues the search. Every time a new optimal solution is found, the counter c is incremented. Search terminates when OPEN is empty, in which case c gives the number of optimal solutions.

Variable Elimination An immediate extension to #opt within the compilation paradigm is to apply a variable elimination scheme for optimization along an ordering τ and then enumerate all optimal solutions one by one in a greedy fashion using the intermediate messages generated during the elimination procedure [13]. The only difference between this algorithm and regular variable elimination is that the forward decoding pass does not terminate with the first optimal solution.

Complexity of Brute-Force Approaches All the algorithms presented above require enumerating all the optimal solutions and thus have a factor of #opt in their complexity. If |T| captures the search space size explored by best-first search for finding the first optimal solution and #opt is their number, best-first search can be bounded by O(|T| + #opt).
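The counter-update rule shared by these enumeration-based schemes (reset c on a strictly better value, increment it on a tie, prune only when the bound strictly exceeds the incumbent) can be sketched as follows. This is an illustration-only toy with tabular cost functions and a zero heuristic below unassigned factors; the representation and names are ours, not the paper's:

```python
def count_optimal_bnb(variables, domains, factors):
    """Toy counting branch and bound. `factors` is a list of (scope, table)
    pairs with nonnegative costs; the bound is the cost of fully assigned
    factors (an admissible lower bound, i.e., h = 0 elsewhere)."""
    best = [float("inf")]
    count = [0]

    def g(assignment):
        total = 0
        for scope, table in factors:
            if all(v in assignment for v in scope):
                total += table[tuple(assignment[v] for v in scope)]
        return total

    def dfs(i, assignment):
        bound = g(assignment)
        if bound > best[0]:          # strict inequality keeps all ties alive
            return
        if i == len(variables):
            if bound < best[0]:      # strictly better value: reset the counter
                best[0], count[0] = bound, 1
            elif bound == best[0]:   # tie with the incumbent: increment
                count[0] += 1
            return
        X = variables[i]
        for x in domains[X]:
            assignment[X] = x
            dfs(i + 1, assignment)
            del assignment[X]

    dfs(0, {})
    return best[0], count[0]
```

Like the schemes above, this sketch still visits every optimal leaf, which is exactly the #opt factor in the complexity bounds discussed here.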
Branch and bound is often faster and has better memory management than best-first search, yet it is harder to bound and we cannot give a better bound than O(|T| · #opt). The simple variable elimination algorithm we described generates a compiled representation in time and memory exponential in the induced width w*(τ) along an ordering τ. Then, generating each new optimal solution requires consulting the compiled structure and is in the worst case exponential in w*(τ), implying an overall worst-case complexity of O(n · k^{w*(τ)} · #opt), where n is the number of variables and k bounds the domain size.

3 Bucket Elimination for #opt

The main drawback of the brute-force search and variable elimination approaches described in Section 2 is that they must enumerate explicitly all optimal solutions (i.e., they actually solve the e-opt task). If the number of optimal solutions is large then the computational overhead can be significant. In this and the next section we describe more efficient methods for solving #opt that are based on either bucket elimination or on depth-first branch and bound search over an AND/OR search space, but avoid explicit enumeration of the optimal solutions.

Algorithm 1 BE for #opt
Require: graphical model M = ⟨X, D, F⟩, elimination order τ = X1, ..., Xn
1: Let Ψ = {ψα : ψα ∈ F}
2: for all variables Xp in the reversed order τ do
3:   Create bucket Bp and its associated set Ψp
4:   Let Ψp = {ψα ∈ Ψ : Xp ∈ vars(ψα)}
5:   Let Ψ ← Ψ \ Ψp, Λp = ∅ and Γp = ∅
6: for all variables Xp in the reversed order τ do
7:   Let Ψp = {ψ1, ..., ψr}, Λp = {λ1, ..., λm} and Γp = {γ1, ..., γq}
8:   Let ψp ← Σ_{i=1}^{r} ψi
9:   Let λp ← min_{xp} (ψp + Σ_{j=1}^{m} λj)
10:  Let γp ← Σ_{x'p} ( ψ̄p · Π_{k=1}^{q} γk ), where x'p ∈ argmin_{xp} (ψp + Σ_{j=1}^{m} λj) and ψ̄p is the flattened ψp (see text below for details)
11:  Add λp and γp to the sets Λ and Γ of the highest bucket corresponding to a variable in vars(λp); if Xp is the first variable then add λp to Λ0 and γp to Γ0
12: Let v* ← Σ_{λ∈Λ0} λ and c* ← Π_{γ∈Γ0} γ
13: return ⟨v*, c*⟩

As noted, the optimal solution to a graphical model can be obtained by using the bucket elimination (BE) algorithm, which eliminates (minimizes over) the variables in sequence [13]. In order to extend it to counting the optimal solutions and avoid enumeration we use the concept of a flattened representation of a function. Specifically, given a function ψ, its flattening, denoted by ψ̄, is defined as: ψ̄(y) = 1 if ψ(y) ≠ ∞ and ψ̄(y) = 0 otherwise, for all y ∈ Ω(vars(ψ)). When a variable X is eliminated, we first record a standard cost message that represents the cost-to-go corresponding to minimizing over the domain values of X. Then, we count the minimizing configurations of X and record them in a new count message which has the same scope as the cost message. The latter can be viewed as a counting step over a constraint representation of the flattened cost-to-go function.

Algorithm 1 presents the BE procedure for solving #opt. Given a variable ordering τ = X1, ..., Xn, the functions are first partitioned into their corresponding buckets such that a bucket Bp is associated with a single variable Xp and a function is placed in the bucket of its argument that appears latest in the ordering (lines 1–5). With each bucket Bp we also associate three sets Ψp, Λp and Γp to store the original functions as well as the cost and count messages, respectively.

The algorithm processes each bucket in reversed order, from last to first, by a variable elimination procedure that computes a new cost message and a new count message, which are both placed into a lower bucket. Specifically, let Xp be the current variable and let ψp be the bucket function obtained by summing all original functions in the bucket (line 8). The cost message (or λ-message) λp computed in bucket Bp is obtained by minimizing out variable Xp from the compound function that combines by summation the bucket function and all incoming cost messages to this bucket, namely λp ← min_{xp} (ψp + Σ_{j=1}^{m} λj). This is the usual cost-to-go message when computing the optimal solution. The count message (or γ-message) originating from Bp is computed by first multiplying all the incoming count messages to this bucket with the flattened bucket function ψ̄p and then summing over the bucket's variable, namely γp ← Σ_{x'p} (ψ̄p · Π_{k=1}^{q} γk). Notice that the summation is performed only over the minimizing domain values x'p of Xp, namely over x'p ∈ argmin_{xp} (ψp + Σ_{j=1}^{m} λj). Finally, after the first variable in the ordering is processed, the sets Λ0 and Γ0 contain all the constant messages generated during the execution.
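For concreteness, one elimination step of this kind can be sketched on a single tabular bucket function as below. The sketch computes the λ message and the matching γ message; for simplicity it handles one finite-valued function (so the flattened ψ̄ is 1 everywhere) and no incoming messages, and the representation is our own, not the paper's:

```python
from itertools import product

def eliminate_bucket(scope, table, var, domains):
    """One bucket step: minimize `var` out of a tabular function (the λ
    message) and count its minimizing values (the γ message, same scope)."""
    rest = [v for v in scope if v != var]
    lam, gam = {}, {}
    for ctx in product(*(domains[v] for v in rest)):
        assign = dict(zip(rest, ctx))
        costs = []
        for x in domains[var]:
            assign[var] = x
            costs.append(table[tuple(assign[v] for v in scope)])
        best = min(costs)
        lam[ctx] = best                           # cost-to-go message
        gam[ctx] = sum(c == best for c in costs)  # number of minimizers
    return lam, gam
```

In the full algorithm the count table would additionally be multiplied by the incoming γ messages before summing over the minimizing values.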
Therefore, the optimal value is obtained by summing up all constants in Λ0, while the number of optimal solutions is calculated as the product of all constants in Γ0 (line 12). We can show that BE for #opt is time and space exponential in the induced width w*(τ) of the ordering τ, i.e., O(n · k^{w*(τ)}), where n is the number of variables and k bounds the domain size.

4 AND/OR Branch and Bound for #opt

Significant recent improvements in search for optimization in graphical models have been achieved by using AND/OR search spaces, which often capture problem structure far better than standard OR search methods [18, 19, 20, 21]. We next introduce a depth-first branch and bound algorithm that traverses an AND/OR search space to compute the number of optimal solutions as well as the optimal value.

Figure 2: AND/OR search tree of the problem from Figure 1.

AND/OR Search Spaces The AND/OR search space is defined relative to a pseudo tree of the primal graph, which captures problem decomposition.

DEFINITION 3 (pseudo tree). A pseudo tree of an undirected graph G = (V, E) is a directed rooted tree T = (V, E′) such that every arc of G not included in E′ is a back-arc in T connecting a node in T to one of its ancestors. The arcs in E′ may not all be included in E.

Given a graphical model M = ⟨X, D, F⟩ with primal graph G and pseudo tree T of G, the AND/OR search tree S_T based on T has alternating levels of OR nodes corresponding to the variables, and AND nodes corresponding to the values of the OR parent's variable, with arc weights extracted from the original functions F.
Identical subproblems, identified by their context (the partial instantiation that separates the subproblem from the rest of the problem graph), can be merged, yielding an AND/OR search graph. Merging all context-mergeable nodes yields the context minimal AND/OR search graph, denoted C_T. The size of C_T is exponential in the induced width of G along a depth-first traversal of T (see also [18]).

A solution tree x̂ is a subtree of C_T such that: (1) it contains the root of C_T; (2) if an internal OR node n ∈ C_T is in x̂ then exactly one of its AND children is in x̂; (3) if an internal AND node n ∈ C_T is in x̂ then all its OR children are in x̂; (4) every tip node in x̂ is a terminal node.

4.1 Arc Weights and Node Values

The OR-to-AND arcs in the AND/OR search space are associated with weights that are defined based on the graphical model's functions [19]. In order to solve the #opt task, each node n in C_T is associated with two values denoted by v(n) and c(n), respectively. The optimal value below n is given by v(n), while c(n) captures the number of optimal solutions of the conditioned subproblem rooted at n. Based on previous work [18], the value v(n) can be computed recursively from the values of n's successors in C_T, as follows:

v(n) = 0, if n is a terminal AND node;
v(n) = Σ_{m∈succ(n)} v(m), if n is an AND node;
v(n) = min_{m∈succ(n)} (w(n,m) + v(m)), if n is an OR node.   (1)

Similarly, we can compute c(n) recursively as:

c(n) = 1, if n is a terminal AND node;
c(n) = Π_{m∈succ(n)} c(m), if n is an AND node;
c(n) = Σ_{m′∈succ(n)} c(m′), where m′ ∈ argmin_{m∈succ(n)} (w(n,m) + v(m)), if n is an OR node.   (2)

Clearly, the values v(s) and c(s) of the root node s represent the optimal value and the number of optimal solutions of the initial problem.

Example 2.
Figure 2 displays the weighted AND/OR search tree of the problem from Figure 1 based on the pseudo tree from Figure 1(b). The node values v(n) and c(n) are shown next to each node (in black and red, respectively). Consider the highlighted subtree rooted at the OR node labeled C. The optimal value of the subproblems rooted by its AND children is 3, and since both values of C are optimal in this case it follows that the number of optimal solutions below C is 2.

Algorithm 2 AOBB for #opt
Require: graphical model M = ⟨X, D, F⟩, pseudo tree T, heuristic h(n)
1: function AOBB(x̂, X, D, F)
2:   if X = ∅ then
3:     return (0, 1)
4:   else
5:     Xi ← SelectVar(T)
6:     Let n be the OR node labeled by ⟨Xi⟩
7:     if Ctxt(n) in cache then
8:       (v(n), c(n)) ← ReadCache(Ctxt(n))
9:     else
10:      v(n) ← ∞; c(n) ← 0; ch(n) ← ∅
11:      for all domain values xi ∈ Di do
12:        Extend partial solution x̂ ← x̂ ∪ {xi}
13:        Let m be the AND node ⟨Xi, xi⟩
14:        Let v(m) ← w(n,m); c(m) ← 1
15:        Calculate f(x̂) using the h(m) of the unexpanded leaves m of x̂
16:        if f(x̂) ≤ v(s) then
17:          for all k = 1 ... q do
18:            (v, c) ← AOBB(x̂, Xk, Dk, Fk)
19:            v(m) ← v(m) + v
20:            c(m) ← c(m) × c
21:          ch(n) ← ch(n) ∪ {m}
22:      v(n) ← min_{m∈ch(n)} v(m)
23:      c(n) ← Σ_{m′∈ch(n)} c(m′), where m′ ∈ argmin_{m∈ch(n)} v(m)
24:      WriteCache(Ctxt(n), v(n), c(n))
25:   return (v(n), c(n))

Branch and Bound Search The depth-first AND/OR branch and bound (AOBB) search method for the #opt task is described by Algorithm 2.
The following notation is used: (X, D, F) is the problem with which the procedure is called, x̂ is the current partial solution subtree, Ctxt(n) denotes the context of a node n, while v(n) and c(n) are the node values that are updated based on the values of their successors in the search space (see also Equations 1 and 2). The weight w(n,m) labels the arc from the OR node n to its AND child m. The algorithm assumes that variables are selected according to a pseudo tree T. If the set X is empty, then the result is trivially computed (lines 2–3). Otherwise, AOBB selects a variable Xi and expands the OR node n labeled by Xi, namely it iterates over its domain values xi (line 11) to compute the node values v(n) and c(n), respectively. The algorithm attempts to retrieve the results cached at the OR nodes (lines 7–8). If a valid cache entry is found for the current OR node then the node values are updated (line 8) and the search continues. Before expanding the AND node m labeled by ⟨Xi, xi⟩, AOBB uses the h(·) values of the unexpanded leaf nodes in x̂ to compute the heuristic evaluation function f(x̂), which yields a lower bound on the optimal extension of x̂. Subsequently, it safely prunes the search space below m if f(x̂) > v(s), where v(s) is the current value of the root node s and is an upper bound on the optimal solution value. Notice that, unlike in regular branch and bound search, a strict inequality is required in order to account for all optimal solutions. The problem is then decomposed into a set of q independent subproblems, one for each child Xk of Xi in the pseudo tree, which are then solved sequentially (line 17). After trying all feasible values of variable Xi, the minimal cost as well as the number of optimal solutions to the problem rooted by Xi remain in v(n) and c(n), which are returned (line 25).
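The value recursion of Equations 1 and 2, which AOBB evaluates with pruning and caching on top, can be sketched on an explicitly built AND/OR tree as follows; the tree encoding is our own illustration, not the paper's data structure:

```python
from math import inf

# AND node: ("AND", [or_children]); a terminal AND node has no children.
# OR node:  ("OR", [(arc_weight, and_child), ...]).
def node_values(node):
    """Return (v(n), c(n)) for an AND/OR tree node, per Equations 1 and 2."""
    kind, children = node
    if kind == "AND":
        v, c = 0, 1                  # terminal AND node yields (0, 1)
        for child in children:       # AND node: sum costs, multiply counts
            cv, cc = node_values(child)
            v, c = v + cv, c * cc
        return v, c
    best, count = inf, 0             # OR node: minimize, add counts on ties
    for w, child in children:
        cv, cc = node_values(child)
        total = w + cv
        if total < best:
            best, count = total, cc
        elif total == best:
            count += cc
    return best, count
```

The OR case mirrors line 23 of Algorithm 2: only children attaining the minimum contribute their counts.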
Based on previous work [18], we can show that the complexity of AOBB is time and space O(n · k^{w*(T)}), where n is the number of variables, k bounds the domain size and w*(T) is the induced width along a depth-first traversal of the pseudo tree.

Although algorithms AOBB and BE have the same worst-case complexity for #opt, in practice AOBB is likely to be more effective than BE because it can exploit a heuristic evaluation function to prune the search space. We will illustrate this experimentally on several problem benchmarks.

5 The Semiring Formulation for #opt

We next show how to formulate and solve the #opt task within a semiring based system [7]. Specifically, consider the semiring A = ⟨R², ⊗, ⊕⟩ over pairs of real values, where the operations ⊗ and ⊕ are defined as follows (intuitively, v is the cost of a solution and c is the count):

(v1, c1) ⊗ (v2, c2) = (v1 + v2, c1 · c2)   (3)

(v1, c1) ⊕ (v2, c2) = (v1, c1) if v1 < v2; (v2, c2) if v1 > v2; (v1, c1 + c2) if v1 = v2.   (4)

It is easy to verify that ⊗ and ⊕ are commutative and associative, namely a ⊗ b = b ⊗ a, a ⊕ b = b ⊕ a, a ⊗ (b ⊗ c) = (a ⊗ b) ⊗ c and a ⊕ (b ⊕ c) = (a ⊕ b) ⊕ c, for all a, b, c ∈ R², respectively. In order to facilitate local computations over the semiring valuations, ⊗ must distribute over ⊕ [22].

PROPOSITION 1 (distributivity).
Given the semiring A = ⟨R², ⊗, ⊕⟩, then ∀a, b, c ∈ R² we have that a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c).

Proof. Let a = (v1, c1), b = (v2, c2) and c = (v3, c3). Assume that v2 < v3 (the other cases v2 > v3 and v2 = v3 can be shown in a similar manner). Clearly, v1 + v2 < v1 + v3 also holds. Then a ⊗ (b ⊕ c) = (v1, c1) ⊗ (v2, c2) = (v1 + v2, c1 · c2). We also have that (a ⊗ b) ⊕ (a ⊗ c) = (v1 + v2, c1 · c2) ⊕ (v1 + v3, c1 · c3) = (v1 + v2, c1 · c2), which concludes the proof.

Given a graphical model ⟨X, D, F⟩, each local function ψα(Xα) ∈ F can be expressed as a semiring valuation φα : Ω(Xα) → R² such that ∀y ∈ Ω(Xα), φα(y) = (ψα(y), 1). We use the two operations ⊕ and ⊗ in A to define the combination and elimination operators, as follows. Given two valuations φ1(Y) and φ2(Z) such that Y, Z ⊆ X, the combination ⊗ is defined by (φ1 ⊗ φ2)(yz) = φ1(y) ⊗ φ2(z) for all y ∈ Ω(Y) and z ∈ Ω(Z). Similarly, given a valuation φ(Y) such that Y ⊆ X and Y = {X} ∪ Z, the elimination ⊕ is defined by (⊕_X φ)(z) = ⊕_{x∈Ω(X)} φ(xz), for all z ∈ Ω(Z). Clearly, solving #opt corresponds to computing (v*, c*) = ⊕_{X1} ··· ⊕_{Xn} ⊗_{α∈F} φα, where v* is the optimal solution cost and c* is the number of optimal solutions.
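The two operations of Equations 3 and 4 translate directly into code, and the distributivity of Proposition 1 can be checked by brute force over a small grid of valuations; this is a sketch with names of our own choosing:

```python
from itertools import product

def combine(a, b):
    # (v1, c1) ⊗ (v2, c2) = (v1 + v2, c1 · c2)
    return (a[0] + b[0], a[1] * b[1])

def eliminate(a, b):
    # (v1, c1) ⊕ (v2, c2): keep the cheaper pair, add the counts on a tie
    if a[0] < b[0]:
        return a
    if a[0] > b[0]:
        return b
    return (a[0], a[1] + b[1])

# Exhaustive check of a ⊗ (b ⊕ c) = (a ⊗ b) ⊕ (a ⊗ c) on small valuations.
pairs = [(v, c) for v in range(3) for c in range(1, 3)]
for a, b, c in product(pairs, repeat=3):
    assert combine(a, eliminate(b, c)) == eliminate(combine(a, b), combine(a, c))
```

Running the bucket elimination or search algorithms of the previous sections over these two operations instead of (min, count) pairs yields the same (v*, c*).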
This can be done using bucket elimination and search based algorithms such as those presented in the previous sections.

Mini-Bucket Approximation and #opt Mini-Bucket Elimination (MBE) is a classic relaxation of exact bucket elimination that approximates each elimination operator in order to let the user control a bound on the space and time complexity [14]. The idea is to partition each bucket into smaller subsets called mini-buckets, each containing at most i distinct variables (where i is a user-selected parameter called the i-bound). The mini-buckets are processed independently by the same variable elimination procedure, resulting in messages over fewer variables and thus requiring less time and memory (namely, O(n · k^i), where n is the number of variables and k bounds the domain size). The distributivity property from Proposition 1 allows us to extend MBE to the #opt task as well. Specifically, let Xk be the current variable. The bucket Bk = {φ1, ..., φm} is partitioned into r mini-buckets Qk = {Qk1, ..., Qkr} such that Qkj = {φj1, ..., φjl}. Then, the exact elimination of Xk from bucket Bk, namely ⊕_{Xk} ⊗_{φ∈Bk} φ, can be approximated by ⊗_{Qkj∈Qk} ⊕_{Xk} ⊗_{φ∈Qkj} φ.

Unfortunately, this computation does not provide a bound on the number of optimal solutions. In fact, we can show that the count can go up or down in an unpredictable manner. Consider for example the following valuations: a1 = (2, 1), b1 = (1, 1), a2 = (2, 1) and b2 = (2, 1). We would like to bound the exact computation (v*, c*) = (a1 ⊗ b1) ⊕ (a2 ⊗ b2) by (v, c) = (a1 ⊕ a2) ⊗ (b1 ⊕ b2). In this case, we have that (v*, c*) = (3, 1) ⊕ (4, 1) = (3, 1) and (v, c) = (2, 2) ⊗ (1, 1) = (3, 2). Clearly, c = 2 is an upper bound on c* = 1. On the other hand, if a1 = (2, 1), b1 = (1, 1), a2 = (1, 2) and b2 = (2, 1), then (v*, c*) = (3, 1) ⊕ (3, 2) = (3, 3) and (v, c) = (1, 2) ⊗ (1, 1) = (2, 2), in which case c = 2 is a lower bound on c* = 3.

This observation is in stark contrast with what we know about other tasks for graphical models such as counting all solutions and finding the optimal solution of the model, where the MBE scheme is guaranteed to produce an upper and, respectively, a lower bound on the value of the exact computation. Therefore, we leave the extension of MBE into a correct bounding scheme for #opt as future work.

6 Experiments

We evaluate empirically our proposed counting algorithms on four benchmarks for graphical models. All experiments were run on a 2.6GHz processor with 10GB of RAM.

Benchmarks and Algorithms For our purpose, we considered two random problem domains: (1) grid, which consists of random m-by-m grid networks, and (2) random, which consists of random networks with n variables and 2·n binary functions, respectively. We generated random problem instances for each domain, as follows: for grid problems, m ranged between 8 and 14 (so that the number of variables varied between 64 and 196); for random problems, the number of variables ranged between 60 and 120. In all cases, the domain size of the variables was set to 3. The function values were distributed uniformly at random between 1 and 10.
In order to control the number of optimal solutions, we post-processed each function by randomly setting p percent of the function values to 1. We refer to p as the perturbation parameter; intuitively, as p increases, the number of optimal solutions should increase as well.

In addition, we considered two collections of real-world WCSP instances derived from the ISCAS circuits [23] and the SPOT5 satellite scheduling benchmark [24], respectively. For the SPOT5 instances the goal is to find the optimal schedule for an Earth observing satellite; the number of optimal schedules may indicate certain degrees of freedom in operating the satellite in orbit. The ISCAS instances correspond to diagnosis of digital circuits, where the goal is to compute the most likely explanation of a small subset of failed components; here, too, the number of optimal solutions may speak to the reliability of the circuit. The original problem instances, which we obtained from the UCI Graphical Models Repository (graphmod.ics.uci.edu), are specified as Markov networks with real-valued potentials between 0 and 1. We converted these instances into equivalent WCSPs by taking the negative log of the potential values and rounding up to the nearest integer value.

Table 1: Results for grid (left) and random (right) networks.

Table 2: Number of optimal solutions (#opt) and CPU time (sec) on ISCAS (left) and SPOT5 (right) instances.

We evaluated algorithms BE and AOBB and compared them with the A* search and the depth-first branch and bound (BnB) that enumerate explicitly the optimal solutions.
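The Markov-network-to-WCSP conversion described above can be sketched as follows. This is a minimal sketch; the potential values shown are hypothetical and only illustrate the transformation.

```python
import math

# Sketch of the conversion described in the text: each Markov network
# potential value in (0, 1] becomes the integer WCSP cost
# ceil(-log(potential)), so a potential of 1 maps to cost 0 and
# smaller potentials map to larger costs.
def potential_to_cost(p):
    return math.ceil(-math.log(p))

# Hypothetical potential table for illustration.
potentials = [1.0, 0.5, 0.1, 0.01]
costs = [potential_to_cost(p) for p in potentials]
print(costs)   # -> [0, 1, 3, 5]
```

Because the costs are rounded up, the resulting WCSP is an integer-valued model whose minimum-cost solutions approximate the most likely configurations of the original network.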
All search algorithms were guided by a static mini-bucket heuristic MBE(i), which was pre-compiled along a min-fill elimination ordering [14, 19, 25]. The heuristic uses a parameter called the i-bound to control the accuracy of its estimates. We set the i-bound to 10 and allowed a 1 hour time limit for all algorithms.

Measures of Performance We report the average CPU time in seconds (time), the number of problem instances solved (N) and the average number of optimal solutions over the solved instances (#opt). In addition, we also report the problems' parameters: the number of variables (n), the maximum domain size (k), and the average induced width (w*) obtained along a min-fill based elimination ordering [26]. The best performance points are highlighted. A "-" denotes that the respective algorithm exceeded the time or the memory limit.

Results Table 1 summarizes the results obtained on the grid and random domains. The left-most column indicates the problem sizes, while the remaining columns are divided into 3 groups corresponding to different values of the perturbation parameter p ∈ {0.20, 0.50, 0.80}. Each data point represents an average over 10 random instances. We can see that AOBB is the overall best performing algorithm, both in terms of running time and the number of problem instances solved (over 96% of instances solved). Furthermore, algorithms A* and BnB are competitive only for problems with small and moderate numbers of optimal solutions (e.g., grid and random with p = 0.20 and p = 0.50, respectively). However, they typically fail to solve most of the problem instances where the number of optimal solutions is very large (e.g., grid and random with p = 0.80) because of the prohibitively large overhead associated with enumerating the optimal solutions.

Figure 3: CPU time in seconds (red) versus number of optimal solutions (blue) for grid (n=144, k=3, w*=16) and random (n=120, k=3, w*=24) networks.
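The perturbation-based generation scheme used for the grid and random benchmarks can be sketched as below. This is a hypothetical helper (the paper does not give the generator's code), assuming costs are drawn uniformly from 1..10 and that p is the fraction of entries reset to 1.

```python
import random

# Sketch of the benchmark perturbation described in the experiments:
# cost-function values are drawn uniformly from 1..10, then a fraction
# p of the entries is reset to 1, creating ties at the minimum cost and
# thus increasing the number of optimal solutions as p grows.
def perturbed_function(num_entries, p, seed=0):
    rng = random.Random(seed)
    values = [rng.randint(1, 10) for _ in range(num_entries)]
    return [1 if rng.random() < p else v for v in values]

# A binary function over two ternary variables has 3 x 3 = 9 entries.
table = perturbed_function(9, p=0.8)
print(table)
```

With p = 1.0 every entry becomes 1, so every complete assignment is optimal; with p = 0 the function keeps its uniformly drawn costs, which makes ties at the optimum rare.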
Finally, BE is competitive only on problem instances with relatively small induced widths.

In Table 2 we show the results for solving the ISCAS and SPOT5 problem instances. We see again that AOBB offers the best performance, especially on the SPOT5 instances, which have the largest number of optimal solutions. As before, BE can only handle problems with relatively small induced width, regardless of the number of optimal solutions. In this case, the performance of A* and BnB is quite poor compared with AOBB and BE because of the large number of optimal solutions.

Figure 3 plots the running time of AOBB and the number of optimal solutions as a function of the perturbation value (p) for two representative problem classes from the grid and random domains. Each data point represents an average over 100 random instances generated for the respective p value. We see that as the number of optimal solutions increases (i.e., as p increases), the problems become easier to solve and the running time decreases. This observation is especially relevant when designing random problem generators for optimization, where one wants to control the hardness of the generated instances.

7 Related Work

Model counting (#SAT), solution counting (#CSP) and weighted model counting (WMC) are well-known #P-complete problems with many applications in fields such as verification, planning and automated reasoning. Exact approaches to counting solutions are based either on extending systematic search-based SAT/CSP solvers such as DPLL and AND/OR search [27, 28, 29, 30], or on variable elimination algorithms [14], which are known to be time and space exponential in the induced width of the problem. Our algorithms build on these ideas by combining summation and optimization without resorting to explicit enumeration. Approximate model counting techniques based on hashing were recently proposed [31, 32].
However, these methods are not directly applicable to the #opt task, at least not in a straightforward manner. Maximum model counting (Max#SAT) is a recent extension of #SAT [33] that is also related to #opt. The work of [34] develops a semiring-based formalism for counting weighted subgraphs in an explicit larger graph.

8 Conclusion

We introduced the #opt task for graphical models, and presented and evaluated variable elimination and depth-first AND/OR branch and bound algorithms for this task. We also described a semiring-based formulation of the task. The complexity of the proposed algorithms is exponential in the induced width and does not depend on the number of optimal solutions. Our empirical evaluation demonstrated their effectiveness compared with brute-force search approaches that rely on explicitly enumerating the optimal solutions. Overall, our proposed AOBB version appears to be superior.

References

[1] M. Fishelson, N. Dovgolevsky, and D. Geiger. Maximum likelihood haplotyping for general pedigrees. Human Heredity, 59(1):41–60, 2005.

[2] J. Zeng, P. Zhou, and B. Donald. A Markov random field framework for protein side-chain resonance assignment. In Research in Computational Molecular Biology (RECOMB), pages 550–570, 2010.

[3] B. Cabon, S. de Givry, L. Lobjois, T. Schiex, and J. Warners. Radio link frequency assignment. Constraints, 4:79–89, 1999.

[4] T. Hadzic and J. Hooker. Postoptimality analysis for integer programming using binary decision diagrams. Technical report, Carnegie Mellon University, 2006.

[5] R. Mateescu, R. Dechter, and R. Marinescu.
AND/OR multi-valued decision diagrams (AOMDDs) for graphical models. Journal of Artificial Intelligence Research, 33:465–519, 2008.

[6] N. Flerova, R. Marinescu, and R. Dechter. Searching for the m best solutions in graphical models. Journal of Artificial Intelligence Research, 55(1):889–952, 2016.

[7] S. Bistarelli, U. Montanari, and F. Rossi. Semiring-based constraint satisfaction and optimization. Journal of the ACM, 44(2):201–236, 1997.

[8] R. Dechter. Constraint Processing. Elsevier Morgan Kaufmann, 2003.

[9] K. Kask, R. Dechter, and V. Gogate. Counting-based look-ahead schemes for constraint satisfaction. In International Conference on Principles and Practice of Constraint Programming (CP), pages 317–331, 2004.

[10] R. Marinescu, J. Lee, R. Dechter, and A. Ihler. Anytime best+depth-first search for bounding marginal MAP. In AAAI Conference on Artificial Intelligence (AAAI), pages 1749–1755, 2017.

[11] R. Shachter. Probabilistic inference and influence diagrams. Operations Research, 36(4):589–604, 1988.

[12] J. Pearl. Probabilistic Reasoning in Intelligent Systems. Morgan Kaufmann, 1988.

[13] R. Dechter. Reasoning with Probabilistic and Deterministic Graphical Models: Exact Algorithms. Synthesis Lectures on Artificial Intelligence and Machine Learning. Morgan & Claypool Publishers, 2013.

[14] R. Dechter and I. Rish. Mini-buckets: A general scheme of approximating inference. Journal of the ACM, 50(2):107–153, 2003.

[15] E. Lawler and D. Wood. Branch-and-bound methods: A survey. Operations Research, 14(4):699–719, 1966.

[16] P. Hart, N. Nilsson, and B. Raphael. A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics, 4(2):100–107, 1968.

[17] R. Dechter and J. Pearl. Generalized best-first search strategies and the optimality of A*.
Journal of the ACM, 32(3):505–536, 1985.

[18] R. Dechter and R. Mateescu. AND/OR search spaces for graphical models. Artificial Intelligence, 171(2-3):73–106, 2007.

[19] R. Marinescu and R. Dechter. AND/OR branch-and-bound search for combinatorial optimization in graphical models. Artificial Intelligence, 173(16-17):1457–1491, 2009.

[20] R. Marinescu and R. Dechter. Memory intensive AND/OR search for combinatorial optimization in graphical models. Artificial Intelligence, 173(16-17):1492–1524, 2009.

[21] W. Lam, K. Kask, J. Larrosa, and R. Dechter. Residual-guided look-ahead in AND/OR search for graphical models. Journal of Artificial Intelligence Research, 60:287–346, 2017.

[22] J. Kohlas and N. Wilson. Semiring induced valuation algebras: exact and approximate local computation algorithms. Artificial Intelligence, 172(11):1360–1399, 2008.

[23] F. Brglez and H. Fujiwara. A neutral netlist of 10 combinational benchmark circuits and a target translator in Fortran. In IEEE International Symposium on Circuits and Systems, 1985.

[24] E. Bensana, M. Lemaitre, and G. Verfaillie. Earth observation satellite management. Constraints, 4(3):293–299, 1999.

[25] K. Kask and R. Dechter. A general scheme for automatic generation of search heuristics from specification dependencies. Artificial Intelligence, 129(1-2):91–131, 2001.

[26] U. Kjaerulff. Triangulation of graphs: algorithms giving small total state space. Technical report, University of Aalborg, Denmark, 1990.

[27] R. Bayardo and J. Pehoushek. Counting models using connected components. In National Conference on Artificial Intelligence (AAAI), pages 157–162, 2000.

[28] R. Dechter and R. Mateescu. The impact of AND/OR search spaces on constraint satisfaction and counting.
In International Conference on Principles and Practice of Constraint Programming (CP), pages 731–736, 2004.

[29] T. Sang, P. Beame, and H. Kautz. Solving Bayesian networks by weighted model counting. In National Conference on Artificial Intelligence (AAAI), pages 475–482, 2005.

[30] M. Chavira and A. Darwiche. On probabilistic inference by weighted model counting. Artificial Intelligence, 172(6-7):772–799, 2008.

[31] S. Ermon, C. Gomes, A. Sabharwal, and B. Selman. Taming the curse of dimensionality: Discrete integration by hashing and optimization. In International Conference on Machine Learning (ICML), pages 334–342, 2013.

[32] S. Chakraborty, D. Fremont, K. S. Meel, S. Seshia, and M. Y. Vardi. Distribution-aware sampling and weighted model counting for SAT. In AAAI Conference on Artificial Intelligence (AAAI), pages 1722–1730, 2014.

[33] D. Fremont, M. Rabe, and S. Seshia. Maximum model counting. In AAAI Conference on Artificial Intelligence (AAAI), pages 3885–3892, 2017.

[34] V. Vassilevska and R. Williams. Finding, minimizing and counting weighted subgraphs. In ACM Symposium on Theory of Computing (STOC), pages 455–464, 2009.