{"title": "The Statistical Mechanics of k-Satisfaction", "book": "Advances in Neural Information Processing Systems", "page_first": 439, "page_last": 446, "abstract": null, "full_text": "The Statistical Mechanics of k-Satisfaction\n\nScott Kirkpatrick*\nRacah Institute for Physics and Center for Neural Computation\nHebrew University\nJerusalem, 91904 Israel\nkirk@fiz.huji.ac.il\n\nGeza Gyorgyi\nInstitute for Theoretical Physics\nEotvos University\n1-1088 Puskin u. 5-7\nBudapest, Hungary\ngyorgyi@ludens.elte.hu\n\nNaftali Tishby and Lidror Troyansky\nInstitute of Computer Science and Center for Neural Computation\nThe Hebrew University of Jerusalem\n91904 Jerusalem, Israel\n{tishby, lidrort}@cs.huji.ac.il\n\nAbstract\n\nThe satisfiability of random CNF formulae with precisely k variables per clause (\"k-SAT\") is a popular testbed for the performance of search algorithms. Formulae have M clauses from N variables, randomly negated, keeping the ratio α = M/N fixed. For k = 2, this model has been proven to have a sharp threshold at α = 1 between formulae which are almost always satisfiable and formulae which are almost never satisfiable as N → ∞. Computer experiments for k = 2, 3, 4, 5 and 6 (carried out in collaboration with B. Selman of AT&T Bell Labs) show similar threshold behavior for each value of k. Finite-size scaling, a theory of critical point phenomena used in statistical physics, is shown to characterize the size dependence near the threshold. Annealed and replica-based mean field theories give a good account of the results.\n\n*Permanent address: IBM TJ Watson Research Center, Yorktown Heights, NY 10598 USA (kirk@watson.ibm.com). Portions of this work were done while visiting the Salk Institute, with support from the McDonnell-Pew Foundation. 
\n\n1 Large-scale computation without a length scale\n\nIt is increasingly possible to model the natural world on a computer. Condensed matter physics has strategies to manage the complexities of such calculations, usually depending on a characteristic length. For example, molecules or atoms with finite ranged interactions can be broken down into weakly interacting smaller parts. We may also use symmetry to identify natural modes of the system as a whole. Even in the most difficult case, continuous phase transitions correlated over a wide range of scales, the renormalization group provides a way of collapsing the problem down to its \"relevant\" parts by providing a generator of behavior on all scales in terms of the critical point itself.\n\nBut length scales are not much help in organizing another sort of large calculation. Examples include large rule-based \"expert systems\" that model the particulars of complex industrial processes. Digital Equipment, for example, has used a network of three or more expert systems (originally called \"R1/XCON\") to check computer orders for completeness and internal consistency, to schedule production and shipping, and to help a salesman anticipate customers' needs. This very detailed set of tasks in 1979 required 2 programmers and 250 rules to deal with 100 parts. In the ten years described by Barker and O'Connor (1989), it grew 100X, employing 60 programmers and nearly 20,000 rules to deal with 30,000 part numbers. 100X in ten years is only moderate growth, and it would be valuable to understand how technical, social, and business factors have constrained it.\n\nMany important commercial and scientific problems without length scales are ready for attack by computer modelling or automatic classification, and lie within a few decades of XCON's size. 
Retail industries routinely track 10^5 to 10^6 distinct items kept in stock. Banks, credit card companies, and specialized information providers are building models of what 10^8 Americans have bought and might want to buy next. In biology, human metabolism is currently described in terms of > 1000 substances coupled through > 10,000 reactions, and the data is doubling yearly. Similarly, amino acid sequences are known for > 60,000 proteins.\n\nA deeper understanding of the computational cost of these problems of order 10^{6±2} is needed to see which are practical and how they can be simplified. We study an idealization of XCON-style resolution search, and find obvious collective effects which may be at the heart of its computational complexity.\n\n2 Threshold Phenomena and Random k-SAT\n\nProperties of randomly generated combinatorial structures often exhibit sharp threshold phenomena analogous to the phase transitions studied in condensed matter physics. Recently, thresholds have been observed in randomly generated Boolean formulae. Mitchell et al. (1992) consider the k-satisfiability problem (k-SAT). An instance of k-SAT is a Boolean formula in conjunctive normal form (CNF), i.e., a conjunction (logical AND) of disjunctions or clauses (logical ORs), where each disjunction contains exactly k literals. A literal is a Boolean variable or, with equal probability, its negation. The task is to determine whether there is an assignment to the variables such that all clauses evaluate to true. Here, we will use N to denote the number of variables and M for the number of clauses in a formula. 
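
The instance model just defined can be made concrete with a short sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names random_ksat and is_satisfiable are hypothetical, each clause is assumed to draw k distinct variables, and an exhaustive check over all assignments stands in for a real search procedure, so it is usable only for small N.

```python
import itertools
import random


def random_ksat(n_vars, n_clauses, k, rng):
    # Assumption: each clause draws k distinct variables from 1..n_vars and
    # negates each one with probability 1/2. A negative integer denotes a
    # negated variable.
    formula = []
    for _ in range(n_clauses):
        variables = rng.sample(range(1, n_vars + 1), k)
        formula.append(tuple(v if rng.random() < 0.5 else -v for v in variables))
    return formula


def is_satisfiable(formula, n_vars):
    # Exhaustive search over all 2**n_vars assignments (illustration only).
    for bits in itertools.product((False, True), repeat=n_vars):
        # A clause is true if at least one literal agrees with the assignment.
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in formula):
            return True
    return False


rng = random.Random(0)
formula = random_ksat(n_vars=10, n_clauses=30, k=3, rng=rng)  # M/N = 3
print(len(formula), is_satisfiable(formula, 10))
```

Repeating the last three lines over many random formulae at a fixed ratio M/N gives an estimate of the fraction of unsatisfiable instances at that ratio.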
\n\nFor randomly generated 2-SAT instances, it has been shown analytically that for large N, when the ratio α = M/N is less than 1 the instances are almost all satisfiable, whereas for ratios larger than 1, almost all instances are unsatisfiable (Chvatal and Reed 1992; Goerdt 1992). For k ≥ 3, a rigorous analysis has proven to be elusive. Experimental evidence, however, strongly suggests a threshold with α ≈ 4.3 for 3-SAT (Mitchell et al. 1992; Crawford and Auton 1993; Larrabee and Tsuji 1993).\n\nOne of the main reasons for studying randomly generated 3CNF formulae is their use in the empirical evaluation of combinatorial search algorithms. 3CNF formulae are good candidates for the evaluation of such algorithms because determining their satisfiability is an NP-complete problem. This also holds for larger values of k. For k = 1 or 2, the satisfiability problem can be solved efficiently (Aspvall et al. 1979). Despite the worst-case complexity, simple heuristic methods can usually determine the satisfiability of random formulae. However, computationally challenging test instances are found by generating formulae at or near the threshold (Mitchell et al. 1992). Cheeseman et al. (1991) have made a similar observation of increased computational cost for heuristic search at a boundary between two distinct phases or behaviors of a combinatorial model.\n\nWe will provide a precise characterization of the N-dependence of the threshold phenomena for k-SAT with k ranging from 2 to 6. We will employ finite size scaling, a method from statistical physics in which direct observation of the width of the threshold, or \"critical region\" of a transition, is used to characterize the \"universal\" behavior of quantities across the entire critical region, extending the analysis to combinatorial problems in which N characterizes the size of the model observed. 
For discussion of the applicability of finite-size scaling to systems without a metric, see Kirkpatrick and Selman (1993).\n\nFig. 1: Fraction of unsatisfiable formulae for 2-, 3-, 4-, 5- and 6-SAT, plotted against M/N.\n\n3 Experimental data\n\nWe have generated extensive data on the satisfiability of randomly generated k-CNF formulae with k ranging from 2 to 6. Fig. 1 shows the fraction of random k-SAT formulae that is unsatisfiable as a function of the ratio, α. For example, the left-most curve in Fig. 1 shows the fraction of formulae that is unsatisfiable for random 2CNF formulae with 50 variables over a range of values of α.\n\nEach data point was generated using 10000 randomly generated formulae, giving 1% accuracy. We used a highly optimized implementation of the Davis-Putnam procedure (Crawford and Auton 1993). The procedure works best on formulae with smaller k. Data was obtained for k = 2 on samples with N ≤ 500, for k = 3 with N ≤ 100, and for k = 5 with N ≤ 40, all at comparable computing cost.\n\nFig. 1 (for N ranging from 10 to 50) shows a threshold for each value of k. Except for the case k = 2, the curves cross at a single point and sharpen up with increasing N. For k = 2, the intersections between the curves for the largest values of N seem to be converging to a single point as well, although the curves for smaller N deviate. 
The point where 50% of the formulae are unsatisfiable is thought to be where the computationally hardest problems are found (Mitchell et al. 1992; Cheeseman et al. 1991). The 50% point lies consistently to the right of the scale-invariant point (the point where the curves cross each other), and shifts with N.\n\nThere is a simple explanation for the rapid shift of the thresholds to the right with increasing k. The probability that a given clause is satisfied by a random input configuration is (2^k − 1)/2^k = 1 − 2^{−k} ≡ γ_k. If we treat the clauses as independent, the probability that all clauses are satisfied is γ_k^M = γ_k^{αN}. We define the entropy, S, per input as 1/N times the log_2 of the expected number of satisfying configurations, 2^N γ_k^{αN}. Then S = 1 + α log_2(γ_k) = 1 − α/α_ann, and the vanishing of the entropy gives an estimate of the threshold, identical to the upper bound derived by several workers (see Franco and Paull (1983) and citations in Chvatal and Reed (1992)): α_ann = −(log_2(1 − 2^{−k}))^{−1} ≈ (ln 2) 2^k. This is called an annealed estimate for α_c, because it ignores the interactions between clauses, just as annealed theories of materials (see Mezard et al. 1986) average over many details of the disorder. We have marked α_ann with an arrow for each k in the figures, and tabulate it in Table 1.\n\n4 Results of Finite-Size Scaling Analysis\n\nFrom Fig. 1, it is clear that the threshold \"sharpens up\" for larger values of N. Both the threshold shift and the increasing slope in the curves of Fig. 1 can be accounted for by finite size scaling. (See Stauffer and Aharony (1992) or Kirkpatrick and Swendsen (1985).) We plot the fraction of samples unsatisfied against the dimensionless rescaled variable,\n\ny = N^{1/ν} (α − α_c)/α_c.\n\nValues for α_c and ν must be derived from the experimental data. First α_c is determined as the crossing point of the curves for large N in Fig. 1. 
Then ν is determined to make the slopes match up through the critical region. In Fig. 2 (for k = 3) we find that these two parameters capture both the threshold shift and the steepening of the curves, using α_c = 4.17 and ν = 1.5. We see that F, the fraction of unsatisfiable formulae, is given by F(N, α) = f(y), where the invariant function, f, is that graphed in Fig. 2.\n\nFig. 2: Rescaled 3-SAT data using α_c = 4.17, ν = 1.5.\nFig. 3: Rescaled data for 2-, 3-, 4-, 5-, and 6-SAT approach annealed limit.\n\nA description of the 50% threshold shift follows immediately. If we define y' by f(y') = 0.5, then α_50 = α_c (1 + y' N^{−1/ν}). From Fig. 2 we find that α_50 ≈ 4.17 + 3.1 N^{−2/3}. Crawford and Auton (1993) fit their data on the 50% point as a function of N by arbitrarily assuming that the leading correction will be O(1/N). They obtain α_50 = 4.24 + 6/N. However, the two expressions differ by only a few percent as N ranges from 10 to ∞.\n\nWe also obtained good results in rescaling the data for the other values of k. In Table 1 we give the critical parameters obtained from this analysis. The error bars are subjective, and show the range of each parameter over which the best fits were obtained. Note that ν appears to be tending to 1, and α_ann becomes an increasingly good approximation to α_c as k increases. The success of finite-size scaling with different powers, ν, is strong evidence for criticality, i.e., diverging correlations, even in the absence of any length.\n\nFinally, we found that all the crossovers were similar in shape. 
In fact, combining the various rescaled curves in Fig. 3 shows that the curves for k ≥ 3 all coincide in the vicinity of the 50% point, and tend to a limiting form, which can be obtained by extending the annealed arguments of the previous section. If we define\n\nthen the probability that a formula remains unsatisfied for all 2^N configurations is\n\nThe curve for k = 2 is similar in form, but shifted to the right from the other ones.\n\nk    α_ann    α_2     α_c         y'      ν\n2    2.41     1.38    1.0         2.25    2.6±.2\n3    5.19     4.25    4.17±.03    0.74    1.5±.1\n4    10.74    9.58    9.75±.05    0.67    1.25±.05\n5    21.83    20.6    20.9±.1     0.71    1.1±.05\n6    44.01    42.8    43.2±.2     0.69    1.05±.05\n\nTable 1: Critical parameters for random k-SAT.\n\n5 Outline of Statistical Mechanics Analysis\n\nSpace permits only a sketch of our analysis of this model. Since the N inputs are binary, we may represent them as a vector, X, of Ising spins:\n\nX = {x_i = ±1}, i = 1, ..., N.\n\nEach random formula, F, can be written as a sum of its M clauses, C_j,\n\nF = Σ_{j=1}^{M} C_j,\n\nwhere\n\nC_j = Π_{l=1}^{k} (1 − J_{j,l}·X)/2,\n\nand the vector, J_{j,l}, has only one non-zero element, ±1, at the input which it selects. F evaluates to the number of clauses left unsatisfied by a particular configuration. It is natural to take the value of F to be the energy. The partition function,\n\nZ = tr_{x_i} e^{−βF} = tr_{x_i} Π_j e^{−βC_j},\n\nwhere β is the inverse of a fictitious temperature, factors into contributions from each clause. The \"annealed\" approximation mentioned above consists simply of taking the trace over each subproduct individually, neglecting their interactions. In this construction, we expect both energy and entropy, S, to be extensive quantities, that is, proportional to N. Fig. 4 shows that this is indeed the case for S(α). 
The lines in Fig. 4 are the annealed predictions S(α, k) = 1 − α/α_ann. Expressions for the energy can also be obtained from the annealed theory, and used to compare the specific heat observed in numerical experiments with the simple limit in which the clauses do not interact. This gives evidence supporting the identification of the unsatisfied phase as a spin glass. Finally, a plausible phase diagram for the spin glass-like \"unsatisfied\" phase is obtained by solving for S(T) = 0 at finite temperatures.\n\nTo perform the averaging over the random clauses correctly requires introducing replicas (see Mezard et al. 1986), which are identical copies of the random formula, and defining q, the overlap between the expectation values of the spins in any two replicas, as the new order parameter. The results appear to be capable of accounting for the difference between experiment and the annealed predictions at finite k. For example, an uncontrolled approximation in which we consider just two replicas gives the values of α_2 in Table 1, and accounts rather closely for the average overlap found experimentally between pairs of lowest energy states, as shown in Fig. 5. The 2-replica theory gives q as the solution of\n\na(k, q) = 2k(1 + q)k-l(4k - 2k+l + (1 - ql)/ln\u00abl + q)(l - q))\n\nfor q as a function of α. This gives the lines in Fig. 5. We defined α_2 (in Table 1) as the point of inflection, or the maximum in the slope of q(α).\n\n[Fig. 4 plot title: Entropy for k-SAT, k = 2, 3, 4, 5]\n\n... α_c, even for k = 2. 
Therefore, if both diverging correlations (diverging in size if no lengths are defined) and random sign or \"spin-glass\" effects are present, we expect a local search like Davis-Putnam to be exponentially difficult on average. But these characteristics do not imply NP-completeness.\n\n7 References\n\nAspvall, B., Plass, M.F., and Tarjan, R.E. (1979). A linear-time algorithm for testing the truth of certain quantified Boolean formulae. Inform. Process. Lett., Vol. 8, 1979, 289-314.\n\nBarker, V.E., and O'Connor, D. (1989). Commun. Assoc. for Computing Machinery, 32(3), 1989, 298-318.\n\nCheeseman, P., Kanefsky, B., and Taylor, W.M. (1991). Where the really hard problems are. Proceedings IJCAI-91, 1991, 163-169.\n\nClearwater, S.H., Huberman, B.A., and Hogg, T. (1991). Cooperative Solution of Constraint Satisfaction Problems. Science, Vol. 254, 1991, 1181-1183.\n\nCrawford, J.M. and Auton, L.D. (1993). Experimental Results on the Crossover Point in Satisfiability Problems. Proc. of AAAI-93, 1993.\n\nChvatal, V. and Reed, B. (1992). Mick Gets Some: The Odds are on his Side. Proc. of STOC, 1992, 620-627.\n\nFu, Y. (1989). The Uses and Abuses of Statistical Mechanics in Computational Complexity. In Lectures in the Sciences of Complexity, ed. D. Stein, pp. 815-826, Addison-Wesley, 1989.\n\nFranco, J. and Paull, M. (1983). Probabilistic Analysis of the Davis-Putnam Procedure for solving the Satisfiability Problem. Discrete Applied Math., Vol. 5, 77-87, 1983.\n\nGoerdt, A. (1992). A threshold for unsatisfiability. Proc. 17th Int. Symp. on the Math. Foundations of Comp. Sc., Prague, Czechoslovakia, 1992.\n\nKirkpatrick, S. and Swendsen, R.H. (1985). Statistical Mechanics and Disordered Systems. CACM, Vol. 28, 1985, 363-373.\n\nKirkpatrick, S., and Selman, B. (1993), submitted for publication.\n\nLarrabee, T. and Tsuji, Y. (1993). Evidence for a Satisfiability Threshold for Random 3CNF Formulas, Proc. 
of the AAAI Spring Symposium on AI and NP-hard problems, Palo Alto, CA, 1993.\n\nMezard, M., Parisi, G., and Virasoro, M.A. (1986). Spin Glass Theory and Beyond. Singapore: World Scientific, 1986.\n\nMitchell, D., Selman, B., and Levesque, H.J. (1992). Hard and Easy Distributions of SAT problems. Proc. of AAAI-92, 1992, 456-465.\n\nStauffer, D. and Aharony, A. (1992). Introduction to Percolation Theory. London: Taylor and Francis, 1992. See especially Ch. 4.", "award": [], "sourceid": 737, "authors": [{"given_name": "Scott", "family_name": "Kirkpatrick", "institution": null}, {"given_name": "G\u00e9za", "family_name": "Gy\u00f6rgyi", "institution": null}, {"given_name": "Naftali", "family_name": "Tishby", "institution": null}, {"given_name": "Lidror", "family_name": "Troyansky", "institution": null}]}