{"title": "Multi-objective Maximization of Monotone Submodular Functions with Cardinality Constraint", "book": "Advances in Neural Information Processing Systems", "page_first": 9493, "page_last": 9504, "abstract": "We consider the problem of multi-objective maximization of monotone submodular functions subject to a cardinality constraint, often formulated as $\\max_{|A|=k}\\min_{i\\in\\{1,\\dots,m\\}}f_i(A)$. While it is widely known that greedy methods work well for a single objective, the problem becomes much harder with multiple objectives. In fact, Krause et al.\\ (2008) showed that when the number of objectives $m$ grows as fast as the cardinality $k$, i.e., $m=\\Omega(k)$, the problem is inapproximable (unless $P=NP$). On the other hand, when $m$ is constant, Chekuri et al.\\ (2010) showed a randomized $(1-1/e)-\\epsilon$ approximation with runtime (number of queries to the function oracle) $n^{m/\\epsilon^3}$.\n\t\n\tWe focus on finding a fast and practical algorithm that has (asymptotic) approximation guarantees even when $m$ is super constant. We first modify the algorithm of Chekuri et al.\\ (2010) to achieve a $(1-1/e)$ approximation for $m=o(\\frac{k}{\\log^3 k})$. This demonstrates a steep transition from constant factor approximability to inapproximability around $m=\\Omega(k)$. Then, using Multiplicative-Weight-Updates (MWU), we find a much faster $\\tilde{O}(n/\\delta^3)$ time asymptotic $(1-1/e)^2-\\delta$ approximation. While the above results are all randomized, we also give a simple deterministic $(1-1/e)-\\epsilon$ approximation with runtime $kn^{m/\\epsilon^4}$. 
Finally, we run synthetic experiments using Kronecker graphs and find that our MWU inspired heuristic outperforms existing heuristics.", "full_text": "Multi-objective Maximization of Monotone Submodular Functions with Cardinality Constraint\n\nRajan Udwani\n\nOperations Research Center, M.I.T.\n\nrudwani@alum.mit.edu\n\nAbstract\n\nWe consider the problem of multi-objective maximization of monotone submodular functions subject to a cardinality constraint, often formulated as max_{|A|=k} min_{i\u2208{1,...,m}} fi(A). While it is widely known that greedy methods work well for a single objective, the problem becomes much harder with multiple objectives. In fact, Krause et al. (2008) showed that when the number of objectives m grows as fast as the cardinality k, i.e., m = \u2126(k), the problem is inapproximable (unless P = NP). On the other hand, when m is constant, Chekuri et al. (2010) showed a randomized (1 \u2212 1/e) \u2212 \u03b5 approximation with runtime (number of queries to the function oracle) n^{m/\u03b5^3}.\nWe focus on finding a fast and practical algorithm that has (asymptotic) approximation guarantees even when m is super constant. We first modify the algorithm of Chekuri et al. (2010) to achieve a (1 \u2212 1/e) \u2212 \u03b5 approximation for m = o(k/log^3 k), with \u03b5 \u2192 0 as k \u2192 \u221e. This demonstrates a steep transition from constant factor approximability to inapproximability around m = \u2126(k). Then, using Multiplicative-Weight-Updates (MWU), we find a much faster \u02dcO(n/\u03b4^3) time asymptotic (1 \u2212 1/e)^2 \u2212 \u03b4 approximation. While the above results are all randomized, we also give a simple deterministic (1 \u2212 1/e) \u2212 \u03b5 approximation with runtime kn^{m/\u03b5^4}. Finally, we run synthetic experiments using Kronecker graphs and find that our MWU inspired heuristic outperforms existing heuristics.\n\n1 Introduction\n\nSeveral well known objectives in combinatorial optimization exhibit two common properties: the marginal value of any given element is non-negative and it decreases as more and more elements are selected. The notions of submodularity and monotonicity^1 nicely capture this property, resulting in the appearance of constrained monotone submodular maximization in a wide and diverse array of modern applications, including feature selection ([KG05, TCG+09]), network monitoring ([LKG+07]), news article recommendation ([EAVSG09]), sensor placement and information gathering ([OUS+08, GKS05, KGGK06, KLG+08]), viral marketing and influence maximization ([KKT03, HK16]), document summarization ([LB11]) and crowd teaching ([SB14]).\nIn this paper, we are interested in scenarios where multiple objectives, all monotone submodular, need to be simultaneously maximized subject to a cardinality constraint. This problem has an established line of work in both machine learning ([KMGG08]) and the theory community ([CVZ10]). Broadly speaking, there are two ways in which this paradigm has been applied:\n\n^1A set function f : 2^N \u2192 R on the ground set N is called submodular when f(A + a) \u2212 f(A) \u2264 f(B + a) \u2212 f(B) for all B \u2286 A \u2286 N and a \u2208 N \\ A. The function is monotone if f(B) \u2264 f(A) for all B \u2286 A. We assume f(\u2205) = 0; then due to monotonicity we have that f is non-negative.\n\n32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montr\u00e9al, Canada.\n\n\fWhen there are several natural criteria that need to be simultaneously optimized: such as in network monitoring, sensor placement and information gathering [OUS+08, LKG+07, KLG+08, KMGG08]. 
For example, in the problem of intrusion detection [OUS+08], one usually wants to maximize the likelihood of detection while also minimizing the time until intrusion is detected, and the population affected by intrusion. The first objective is often monotone submodular and the latter objectives are monotonically decreasing supermodular functions [LKG+07, KLG+08]. Therefore, the problem is often formulated as an instance of cardinality constrained maximization with a small number of submodular objectives.\n\nWhen looking for solutions robust to the uncertainty in objective: such as in feature selection [KMGG08, GR06], variable selection and experimental design [KMGG08], and robust influence maximization [HK16]. In these cases, there is often inherently just a single submodular objective which is highly prone to uncertainty, either due to dependence on a parameter that is estimated from data, or due to multiple possible scenarios that each give rise to a different objective. Therefore, one often seeks to optimize over the worst case realization of the uncertain objective, resulting in an instance of multi-objective submodular maximization. In some applications the number of objectives is given by the problem structure and can be larger even than the cardinality parameter. However, in applications such as robust influence maximization, variable selection and experimental design, the number of objectives is a design choice that trades off optimality with robustness.\n\n1.1 Related Work\n\nThe problem of maximizing a monotone submodular function subject to a cardinality constraint,\n\nP0 := max_{A\u2286N,|A|\u2264k} f(A),\n\ngoes back to the work of [NWF78, NW78], where they showed that the greedy algorithm gives a guarantee of (1 \u2212 1/e) and this is best possible in the value-oracle model. Later, [Fei98] showed that this is also the best possible approximation unless P=NP. 
While this settled the hardness and approximability of the problem, finding faster approximations remained an open line of inquiry. Notably, [BV14] found a faster algorithm for P0 that improved the quadratic O(nk) query complexity of the classical greedy algorithm to nearly linear complexity, by trading off on the approximation guarantee. This was later improved by [MBK+15].\nFor the more general problem max_{A\u2208I} f(A), where I is the collection of independent sets of a matroid, [CCPV11, Von08], in a breakthrough, achieved a (1 \u2212 1/e) approximation by (approximately) maximizing the multilinear extension of submodular functions, followed by suitable rounding. Based on this framework, tremendous progress was made over the last decade for a variety of different settings [CCPV11, Von08, FNS11, Von13, VCZ11, CVZ10, DV12].\nIn the multi-objective setting, [KMGG08] amalgamated various applications and formally introduced the following problem,\n\nP1 = max_{A\u2286N,|A|\u2264k} min_{i\u2208{1,2,...,m}} fi(A),\n\nwhere fi(.) is monotone submodular for every i. They call this the Robust Submodular Observation Selection (RSOS) problem and show that in general the problem is inapproximable unless P = NP. Consequently, they proceeded to give a bi-criterion approximation algorithm, called SATURATE, which achieves the optimal answer by violating the cardinality constraint. Note that their inapproximability result only holds when m = \u2126(k). Another bi-criterion approximation was given more recently in [CLSS17].\nOn the other hand, [CVZ10] found a randomized (1 \u2212 1/e) \u2212 \u03b5 approximation for constant m in the more general case of matroid constraint, as an application of a new technique for rounding over a matroid polytope, called swap rounding. The runtime scales as O(n^{m/\u03b5^3} + mn^8).^2 
Note, [CVZ10] consider a different but equivalent formulation of the problem that stems from the influential paper on multi-objective optimization [PY00]. The alternative formulation, which we introduce in Section 2, is the reason we call this a multi-objective maximization problem (same as [CVZ10]). For the special case of cardinality constraint (which will be our focus here), [OSU18] recently showed that the greedy algorithm can be generalized to achieve a deterministic 1 \u2212 1/e \u2212 \u03b5 approximation for the special case of bi-objective maximization. Their runtime scales as n^{1+1/\u03b5} and \u03b5 \u2264 1/2. To the best of our knowledge, when m = o(k) no constant factor approximation algorithms or inapproximability results were known prior to this work.\n\n^2The n^8 term could potentially be improved to n^5 by leveraging subsequent work [BV14, FW14].\n\n1.2 Our Contributions\n\nOur focus here is on the regime m = o(k). This setting is essential to understanding the approximability of the problem for super-constant m and includes several of the applications we referred to earlier. For instance, in network monitoring and sensor placement, the number of objectives is usually a small constant ([KMGG08, LKG+07]). For robust influence maximization, the number of objectives depends on the underlying uncertainty but is often small ([HK16]). And in settings like variable selection and experimental design ([KMGG08]), the number of objectives considered is a design choice. We show three algorithmic results with asymptotic approximation guarantees for m = o(k).\n1. Asymptotically optimal approximation algorithm: We give a (1 \u2212 1/e \u2212 \u03b5)(1 \u2212 m/k\u03b5^3) approximation, which for m = o(k/log^3 k) and \u03b5 = min{1/(8 ln m), (m/k)^{1/4}} tends to 1 \u2212 1/e as k \u2192 \u221e. The algorithm is randomized and outputs such an approximation w.h.p. Observe that this implies a steep transition around m = \u2126(k), due to the inapproximability result (to within any non-trivial factor) in that regime.\nWe obtain this via extending the algorithm of [CVZ10], which relies on the continuous greedy approach, resulting in a runtime of \u02dcO(mn^8). Note that there is no \u03b5 dependence in the runtime, unlike the result from [CVZ10]. The key idea behind the result is quite simple, and relies on exploiting the fact that we are dealing with a cardinality constraint, which is far more structured than a general matroid.\n2. Fast and practical approximation algorithm: In practice, n can range from tens of thousands to millions ([OUS+08, LKG+07]), which makes the above runtime intractable. To this end, we develop a fast O((n/\u03b4^3) log m log(n/\u03b4)) time (1 \u2212 1/e)^2(1 \u2212 m/k\u03b5^3) \u2212 \u03b5 \u2212 \u03b4 approximation. Under the same asymptotic conditions as above, the guarantee simplifies to (1 \u2212 1/e)^2 \u2212 \u03b4. We achieve this via the Multiplicative-Weight-Updates (MWU) framework, which replaces the bottleneck continuous greedy process. This costs us another factor of (1 \u2212 1/e) in the guarantee but allows us to leverage the runtime improvements for P0 achieved in [BV14, MBK+15].\nMWU has proven to be a vital tool in the past few decades ([GK94, AK07, Bie06, Fle00, GK04, GK07, KY07, You95, You01, PST91, AHK12]). Linear functions and constraints have been the primary setting of interest in these works, but recent applications have shown its usefulness when considering non-linear and in particular submodular objectives ([AG12, CJV15]). Unlike these recent applications, we instead apply the MWU framework in the vein of the Plotkin-Shmoys-Tardos scheme for linear programming ([PST91]), essentially showing that the non-linearity only costs us another factor of (1 \u2212 1/e) in the guarantee and yields a nearly linear time algorithm. 
Independently, [CLSS17] applied the MWU framework in a similar manner and gave a new bi-criterion approximation. We further discuss how our result differs from theirs in Section 3.2.\n3. Finding a deterministic approximation for small m: While the above results are all randomized, we also show a simple greedy based deterministic 1 \u2212 1/e \u2212 \u03b5 approximation with runtime kn^{m/\u03b5^4}. This follows by establishing an upper bound on the increase in optimal solution value as a function of cardinality k, which also resolves a weaker version of a conjecture posed in [OSU18].\nOutline: We start with definitions and preliminaries in Section 2, where we also review relevant parts of the algorithm in [CVZ10] that are essential for understanding the results here. In Section 3, we state and prove the main results. Since the guarantees we present are asymptotic and technically converge to the constant factors indicated as k becomes large, in Section 4 we test the performance of a heuristic closely inspired by our MWU based algorithm on Kronecker graphs [LCK+10] of various sizes and find improved performance over previous heuristics even for small k and large m.\n\n2 Preliminaries\n\n2.1 Definitions & review\n\nWe work with a ground set N of n elements and recall that we use P0 to denote the single objective (classical) problem. [NWF78, NW78] showed that the natural greedy algorithm for P0, summarized as: starting with \u2205, at each step add to the current set an element which adds the maximum marginal value, until k elements are chosen; achieves a guarantee of 1 \u2212 1/e for P0 and that this is best possible. Formally, given set A the marginal increase in value of function f due to inclusion of set X is given by, f(X|A) = f(A \u222a X) \u2212 f(A).\nWe use the notation xS for the support vector of a set S (1 along dimension i if i \u2208 S and 0 otherwise). We also use the shorthand |x| to denote the \u2113_1 norm of x. Given f : 2^N \u2192 R, recall that its multilinear extension over x = (x1, . . . , xn) \u2208 [0, 1]^n is defined as, F(x) = \u2211_{S\u2286N} f(S) \u220f_{i\u2208S} xi \u220f_{j\u2209S} (1 \u2212 xj). This function acts as a natural replacement for the original function f in the continuous greedy algorithm ([CCPV11]). Like the greedy algorithm, the continuous version always moves in a feasible direction that best increases the value of the function F. While evaluating the exact value of this function is naturally hard in general, for the purpose of using this function in optimization algorithms, approximations obtained using a sampling based oracle suffice ([BV14, CVZ10, CCPV11]). Given two vectors x, y \u2208 [0, 1]^n, let x \u2228 y denote the component wise maximum. Then we define marginals for this function as, F(x|y) = F(x \u2228 y) \u2212 F(y).\nNow, we briefly discuss another formulation of the multi-objective maximization problem, call it P2, introduced in [CVZ10]. In P2 we are given a target value Vi (a positive real) with each function fi and the goal is to find a set S\u2217 of size at most k, such that fi(S\u2217) \u2265 Vi, \u2200i \u2208 {1, . . . , m}, or certify that no such S\u2217 exists. More feasibly, one aims to efficiently find a set S \u2208 I such that fi(S) \u2265 \u03b1Vi for all i and some factor \u03b1, or certify that there is no set S\u2217 such that fi(S\u2217) \u2265 Vi, \u2200i. Observe that w.l.o.g. 
we can assume Vi = 1, \u2200i (since we can consider the functions fi(.)/Vi instead) and therefore P2 is equivalent to the decision version of P1: Given t > 0, find a set S\u2217 of size at most k such that min_i fi(S\u2217) \u2265 t, or give a certificate of infeasibility.\nWhen considering formulation P2, since we can always consider the modified submodular objectives min{fi(.), Vi}, we w.l.o.g. assume that fi(S) \u2264 Vi for every set S and every function fi. Finally, for both P1, P2 we use Sk to denote an optimal/feasible set (optimal for P1, and feasible for P2) and OPTk to denote the optimal solution value for formulation P1. We now give an overview of the algorithm from [CVZ10], which is based on P2. To simplify the description we focus on cardinality constraint, even though it is designed more generally for matroid constraint. We refer to it as Algorithm 1; it has three stages. Recall, the algorithm runs in time O(n^{m/\u03b5^3} + mn^8).\nStage 1: Intuitively, this is a pre-processing stage with the purpose of picking a small initial set consisting of elements with 'large' marginal values, i.e. marginal value at least \u03b5^3 Vi for some function fi. This is necessary for technical reasons due to the rounding procedure in Stage 3.\nGiven a set S of size k, fix a function fi and index elements in S = {s1, . . . , sk} in the order in which the greedy algorithm would pick them. There are at most 1/\u03b5^3 elements such that fi(sj|{s1, . . . , sj\u22121}) \u2265 \u03b5^3 Vi, since otherwise by monotonicity fi(S) > Vi (violating our w.l.o.g. assumption that fi(S) \u2264 Vi \u2200i). In fact, due to decreasing marginal values we have, fi(sj|{s1, . . . , sj\u22121}) < \u03b5^3 Vi for every j > 1/\u03b5^3. Therefore, we focus on sets of size \u2264 m/\u03b5^3 (at most 1/\u03b5^3 elements for each function) to find an initial set such that the remaining elements have marginal value \u2264 \u03b5^3 Vi for fi, for every i. In particular, one can try all possible initial sets of this size (i.e. run subsequent stages with different starting sets), leading to the n^{m/\u03b5^3} term in the runtime. Stages 2 and 3 have runtime polynomial in m (in fact Stage 3 has runtime independent of m), hence Stage 1 is really the bottleneck. It is not obvious at all if one can do better than brute force enumeration over all possible starting sets and still retain the approximation guarantee, since the final solution must be an independent set of a matroid. However, as we show later, for cardinality constraints one can easily avoid enumeration.\nStage 2: Given a starting set S from Stage 1, this stage works with the ground set N \u2212 S and runs the continuous greedy algorithm. If a feasible set Sk exists for the problem, then for the right starting set S1 \u2286 Sk, this stage outputs a fractional point x(k1) \u2208 [0, 1]^n with |x(k1)| = k1 = k \u2212 |S| such that Fi(x(k1)|xS) \u2265 (1 \u2212 1/e \u2212 \u03b5)(Vi \u2212 fi(S1)) for every i, where \u03b5 = 1/\u2126(k). The stage is computationally expensive and takes time \u02dcO(mn^8). We refer the interested reader to [CVZ10] for further details (which will not be necessary for subsequent discussion).\nStage 3: For the right starting set S1 (if one exists), Stage 2 successfully outputs a point x(k1). Stage 3 now follows a random process that converts x(k1) into a set S2 of size k1 such that, S2 \u2286 N \u2212 S1 and fi(S1 \u222a S2) \u2265 (1 \u2212 1/e)(1 \u2212 \u03b5)Vi, \u2200i, as long as \u03b5 < 1/8 ln m. The rounding procedure is called swap rounding and we include a specialized version of the formal lemma below.\nLemma 1. 
([CVZ10] Theorem 1.4, Theorem 7.2) Given m monotone submodular functions fi(.) with the maximum value of singletons in [0, \u03b5^3 Vi] for every i, a fractional point x, and \u03b5 < 1/(8\u03b3 ln m), Swap Rounding yields a set R with cardinality |x|, such that,\n\n\u2211_i Pr[fi(R) < (1 \u2212 \u03b5)Fi(x)] < m e^{\u22121/8\u03b5} < 1/m^{\u03b3\u22121}.\n\nRemark: For any \u03b3 > 1, the above can be converted to a result w.h.p. by standard repetition. Also, this is a simplified version of the matroid based result in [CVZ10].\n\n3 Main Results\n\n3.1 Asymptotic (1 \u2212 1/e) approximation for m = o(k/log^3 k)\n\nWe replace the enumeration in Stage 1 with a single starting set, obtained by scanning once over the ground set. The main idea is simply that for the cardinality constraint case, any starting set that fulfills the Stage 3 requirement of small marginals will be acceptable (not true for general matroids).\nNew Stage 1: Start with S1 = \u2205 and pass over all elements once in arbitrary order. For each element e, add it to S1 if for some i, fi(e|S1) \u2265 \u03b5^3 Vi. Note that we add at most m/\u03b5^3 elements (at most 1/\u03b5^3 for each function). When the subroutine terminates, for every remaining element e \u2208 N\\S1, fi(e|S1) < \u03b5^3 Vi, \u2200i (as required by Lemma 1). Let k1 = k \u2212 |S1| and note k1 \u2265 k \u2212 m/\u03b5^3.\nStage 2 remains the same as Algorithm 1 and outputs a fractional point x(k1) with |x(k1)| = k1. Using basic properties of the multilinear extension and the continuous greedy framework, we show for \u03b5' = 1/\u2126(k),\n\nFi(x(k1)|xS1) \u2265 (k1/k)(1 \u2212 1/e \u2212 \u03b5')(Vi \u2212 fi(S1)), \u2200i.    (1)\n\nThe details are deferred to the supplementary material. Stage 3 rounds x(k1) to S2 of size k1, and the final output is S1 \u222a S2.\nTheorem 2. For \u03b5 = min{1/(8 ln m), (m/k)^{1/4}} we have, fi(S1 \u222a S2) \u2265 (1 \u2212 \u03b5)(1 \u2212 1/e)(1 \u2212 m/k\u03b5^3)Vi \u2200i with constant probability. For m = o(k/log^3 k), the guarantee approaches (1 \u2212 1/e) asymptotically and the algorithm makes \u02dcO(mn^8) queries.\nProof. From (1) and applying Lemma 1 we have, fi(S2|S1) \u2265 (1 \u2212 \u03b5)(1 \u2212 1/e \u2212 \u03b5')(1 \u2212 m/k\u03b5^3)(Vi \u2212 fi(S1)), \u2200i. Therefore, fi(S1 \u222a S2) \u2265 (1 \u2212 \u03b5)(1 \u2212 1/e \u2212 \u03b5')(1 \u2212 m/k\u03b5^3)Vi, \u2200i. To refine the guarantee, we choose \u03b5 = min{1/(8 ln m), (m/k)^{1/4}}, where the 1/(8 ln m) term is due to Lemma 1 and the (m/k)^{1/4} term is to balance \u03b5 and m/k\u03b5^3. Also \u03b5' = 1/\u2126(k); therefore, the resulting guarantee becomes (1 \u2212 1/e)(1 \u2212 h(k)), where the function h(k) \u2192 0 as k \u2192 \u221e, so long as m = o(k/log^3 k).\nNote that the runtime is now independent of \u03b5. The first stage makes O(mn) oracle queries; the second stage runs the continuous greedy algorithm on all functions simultaneously and makes \u02dcO(n^8) queries to each function oracle, contributing O(mn^8) to the runtime. Stage 2 results in a fractional solution that can be written as a convex combination of O(nk^2) sets of cardinality k each (bases) (ref. Appendix A in [CCPV11]). For cardinality constraint, swap rounding can merge two bases in O(k) time; hence, the last stage takes time O(nk^3).\n\n3.2 Fast, asymptotic (1 \u2212 1/e)^2 \u2212 \u03b4 approximation for m = o(k/log^3 k)\n\nWhile the previous algorithm achieves the best possible asymptotic guarantee, it is infeasible to use in practice. The main underlying issue was our usage of the continuous greedy algorithm in Stage 2, which has runtime \u02dcO(mn^8), but the flexibility offered by continuous greedy was key to maximizing the multilinear extensions of all functions at once. 
To improve the runtime we avoid continuous greedy and find an alternative in Multiplicative-Weight-Updates (MWU) instead. MWU allows us to combine multiple submodular objectives together into a single submodular objective and utilize fast algorithms for P0 at every step.\nThe algorithm consists of 3 stages as before. Stage 1 remains the same as the New Stage 1 introduced in the previous section. Let S1 be the output of this stage as before. Stage 2 is replaced with a fast MWU based subroutine that runs for T = O(ln m/\u03b4^2) rounds and solves an instance of P0 during each round. Here \u03b4 is an artifact of MWU and manifests as a subtractive term in the approximation guarantee. The currently fastest algorithm for P0, in [MBK+15], has runtime O(n log(1/\u03b4')) and an expected guarantee of (1 \u2212 1/e) \u2212 \u03b4'. However, the slightly slower, but still nearly linear time O((n/\u03b4') log(n/\u03b4')) thresholding algorithm in [BV14], has (the usual) deterministic guarantee of (1 \u2212 1/e) \u2212 \u03b4'. Both of these are known to perform well in practice and using either would lead to a runtime of T \u00d7 \u02dcO(n/\u03b4) = \u02dcO(n/\u03b4^3), which is a vast improvement over the previous algorithm.\nNow, fix some algorithm A for P0 with guarantee \u03b1, and let A(f, k) denote the set it outputs given monotone submodular function f and cardinality constraint k as input. Note that \u03b1 can be as large as 1 \u2212 1/e, and we have k1 = k \u2212 |S1| as before. Then the new Stage 2 is,\n\nAlgorithm 2 Stage 2: MWU\n1: Input: \u03b4, T = 2 ln m/\u03b4^2, \u03bb^1_i = 1/m, \u02dcfi(.) = fi(.|S1)/(Vi \u2212 fi(S1))\n2: while 1 \u2264 t \u2264 T do\n3: g^t(.) = \u2211_{i=1}^m \u03bb^t_i \u02dcfi(.)\n4: X^t = A(g^t, k1)\n5: m^t_i = \u02dcfi(X^t) \u2212 \u03b1\n6: \u03bb^{t+1}_i = \u03bb^t_i(1 \u2212 \u03b4 m^t_i)\n7: t = t + 1\n8: Output: x2 = (1/T) \u2211_{t=1}^T X^t\n\nThe point x2 obtained above is rounded to a set S2 in Stage 3 (which remains unchanged). The final output is S1 \u222a S2. Note that by abuse of notation we use the sets X^t to also denote the respective support vectors. We continue to use X^t and x_{X^t} interchangeably in what follows.\nThis application of MWU is unlike [AG12, CJV15], where broadly speaking the MWU framework is applied in a novel way to determine how an individual element is picked (or how a direction for movement is chosen in the case of continuous greedy). In contrast, we use standard algorithms for P0 and pick an entire set before changing weights. Also, [CJV15] uses MWU along with the continuous greedy framework to tackle harder settings, but for our setting using the continuous greedy framework eliminates the need for MWU altogether; in fact, we use MWU as a replacement for continuous greedy. Subsequent to our work we discovered a resembling application of MWU in [CLSS17]. Their application differs from Stage 2 above only in minor details, but unlike our result they give a bi-criterion approximation where the output is a set S of cardinality up to k log m/(V\u03b5^2) such that fi(S) \u2265 (1 \u2212 1/e \u2212 2\u03b5)V.\nNow, consider the following intuitive schema. We would like to find a set X of size k such that fi(X) \u2265 \u03b1Vi for every i. While this seems hard, consider the combination \u2211_i \u03bbi fi(.), which is also monotone submodular for non-negative \u03bbi. We can easily find a set X\u03bb such that \u2211_i \u03bbi fi(X\u03bb) \u2265 \u03b1 \u2211_i \u03bbi Vi, since this is a single objective problem and we have fast approximations for P0. However, for a fixed set of scalar weights \u03bbi, solving the P0 problem instance need not give a set that has sufficient value for every individual function fi(.). 
This is where MWU comes into the picture. We start with uniform weights for the functions, and solve an instance of P0 to get a set X^1. Then we change weights to down-weight the functions for which fi(X^1) was closer to the target value and stress more on functions for which fi(X^1) was small, and repeat now with new weights. After running many rounds of this, we have a collection of sets X^t for t \u2208 {1, . . . , T}. Using tricks from standard MWU analysis ([AHK12]) along with submodularity and monotonicity, we show that (1/T) \u2211_t fi(X^t|S1) \u2243 (1 \u2212 1/e)(Vi \u2212 fi(S1)). Thus far, this resembles how MWU has been used in the literature for linear objectives, for instance the Plotkin-Shmoys-Tardos framework for solving LPs.\nHowever, a new issue now arises due to the non-linearity of the functions fi. As an example, suppose that by some coincidence x2 = (1/T) \u2211_{t=1}^T X^t turns out to be a binary vector, so we easily obtain the set S2 from x2. We want to lower bound fi(S2|S1), and while we have a good lower bound on (1/T) \u2211_t fi(X^t|S1), it is unclear how the two quantities are related. More generally, we would like to show that Fi(x2|xS1) \u2265 \u03b2 (1/T) \u2211_t fi(X^t|S1), and this would then give us a \u03b2\u03b1 = \u03b2(1 \u2212 1/e) approximation using Lemma 1. Indeed, we show that \u03b2 \u2265 (1 \u2212 1/e), resulting in a (1 \u2212 1/e)^2 approximation. In the lemmas that follow, we state this more concretely (proofs deferred to supplementary material).\nLemma 3. g^t(X^t) \u2265 (k1/k) \u03b1 \u2211_i \u03bb^t_i, \u2200t.\nLemma 4. (1/T) \u2211_t \u02dcfi(X^t) \u2265 (k1/k)(1 \u2212 1/e) \u2212 \u03b4, \u2200i.\nLemma 5. Given a monotone submodular function f, its multilinear extension F, sets X^t for t \u2208 {1, . . . , T}, and a point x = (1/T) \u2211_t X^t, we have,\n\nF(x) \u2265 (1 \u2212 1/e) (1/T) \u2211_{t=1}^T f(X^t).\n\nTheorem 6. For \u03b5 = min{1/(8 ln m), (m/k)^{1/4}}, the algorithm makes O((n/\u03b4^3) log m log(n/\u03b4)) queries, and with constant probability outputs a feasible (1 \u2212 \u03b5)(1 \u2212 1/e)^2(1 \u2212 m/k\u03b5^3) \u2212 \u03b4 approximate set. Asymptotically, it is (1 \u2212 1/e)^2 \u2212 \u03b4 approximate for m = o(k/log^3 k).\nProof. Combining Lemmas 4 & 5 we have, \u02dcFi(x2) \u2265 (1 \u2212 1/e)(1/T) \u2211_t \u02dcfi(X^t) \u2265 (k1/k)(1 \u2212 1/e)^2 \u2212 \u03b4, \u2200i. The asymptotic result follows just as in Theorem 2. For runtime, note that Stage 1 takes time O(n). Stage 2 runs an instance of A(.), T times, leading to an upper bound of O((n/\u03b4) log(n/\u03b4) \u00d7 (ln m)/\u03b4^2) = O((n/\u03b4^3) log m log(n/\u03b4)), if we use the thresholding algorithm in [BV14] (at the cost of a multiplicative factor of (1 \u2212 \u03b4) in the approximation guarantee). Finally, swap rounding proceeds in T rounds and each round takes O(k) time, leading to total runtime O((k/\u03b4^2) log m) for Stage 3. Combining all three we get a runtime of O((n/\u03b4^3) log m log(n/\u03b4)).\n\n3.3 Variation in optimal solution value and derandomization\n\nConsider the problem P0 with cardinality constraint k. Given an optimal solution Sk with value OPTk for the problem, it is not difficult to see that for arbitrary k' \u2264 k, there is a subset Sk' \u2286 Sk of size k', such that f(Sk') \u2265 (k'/k) OPTk. 
For instance, indexing the elements in Sk using the greedy algorithm, and choosing the set given by the first k' elements, gives such a set. This implies OPTk' \u2265 (k'/k) OPTk, and the bound is easily seen to be tight.\nThis raises a natural question: Can we generalize this bound on the variation of optimal solution value with varying k to multi-objective maximization? A priori, this isn't obvious even for modular functions. In particular, note that indexing elements in the order they are picked by the greedy algorithm doesn't suffice, since there are many functions and we need to balance values amongst all of them. We show that one can indeed derive such a bound (proof in supplementary material).\nLemma 7. Given that there exists a set Sk such that fi(Sk) \u2265 Vi, \u2200i and \u03b5 < 1/(8 ln m). For every k' \u2208 [m/\u03b5^3, k], there exists Sk' \u2286 Sk of size k', such that,\n\nfi(Sk') \u2265 (1 \u2212 \u03b5) ((k' \u2212 m/\u03b5^3)/(k \u2212 m/\u03b5^3)) Vi, \u2200i.\n\nConjecture in [OSU18]: Note that this resolves a slightly weaker version of the conjecture in [OSU18] for constant m. The original conjecture states that for constant m and every k' \u2265 m, there exists a set S of size k', such that fi(S) \u2265 ((k' \u2212 \u0398(1))/k) Vi, \u2200i. Asymptotically, both (k' \u2212 m/\u03b5^3)/(k \u2212 m/\u03b5^3) and (k' \u2212 \u0398(1))/k tend to k'/k. This implies that for large enough k', we can choose sets of size k' (k'-tuples) at each step to get a deterministic (asymptotically) (1 \u2212 1/e) \u2212 \u03b5 approximation with runtime O(kn^{m/\u03b5^4}) for the multi-objective maximization problem, when m is constant (all previously known approximation algorithms, as well as the ones presented earlier, are randomized). We defer the proof to supplementary material.\n\n\fTheorem 8. 
For k' = m/ε⁴, choosing k'-tuples greedily w.r.t. h(·) = min_i f_i(·) is asymptotically (1 − 1/e)(1 − 2ε) approximate, while making kn^{m/ε⁴} queries.

4 Experiments on Kronecker Graphs

We choose synthetic experiments where we can control the parameters to see how the algorithm performs in various scenarios, especially since we would like to test how the MWU algorithm performs for small values of k and for m = Ω(k). We work with formulation P1 of the problem and consider a multi-objective version of the max-k-cover problem on graphs. Random graphs for our experiments were generated using the Kronecker graph framework introduced in [LCK+10]. These graphs exhibit several natural properties and are considered a good approximation of real networks (especially social networks [HK16]).
We compare three algorithms: (i) a baseline greedy heuristic, labeled GREEDY, which focuses on one objective at a time and successively picks k/m elements greedily w.r.t. each function (formally stated in the supplementary material); (ii) a bi-criterion approximation called SATURATE from [KMGG08], which to the best of our knowledge is the state-of-the-art for the problem; (iii) a heuristic inspired by our MWU algorithm. This heuristic differs from the algorithm discussed earlier in two ways. First, we eliminate Stage 1, which was key for the technical analysis but in practice makes the algorithm perform similarly to GREEDY. Second, instead of simply using the swap-rounded set S₂, we output the best set out of {X¹, . . . , X^T} and S₂. For both SATURATE and MWU we estimate the target value t using binary search and consider the capped functions min{f_i(·), t}. For the MWU stage, we tested δ = 0.5 and 0.2.

Algorithm 3 GREEDY
1: Input: k, m, f_i(·) for i ∈ [m]
2: S = ∅, i = 1
3: while |S| ≤ k − 1 do
4:   S = S + arg max_{x ∈ N−S} f_i(x|S)
5:   i = (i mod m) + 1
6: Output: S

Algorithm 4 SATURATE
1: Input: k, t, f_1, . . . , f_m and set A = ∅
2: g(·) = ∑_i min{f_i(·), t}
3: while |A| < k do A = A + arg max_{x ∈ N−A} g(x|A)
4: Output: A

We pick Kronecker graphs of sizes n ∈ {64, 512, 1024} with random initiator matrices³, and for each n, we test m ∈ {10, 50, 100}. Note that each graph here represents an objective, so for a fixed n, we generate m Kronecker graphs to get m max-cover objectives. For each setting of n, m we evaluate the solution value of the heuristics as k increases, and show the average performance over 30 trials for each setting. All experiments were done using MATLAB.

³To generate a Kronecker graph one needs a small initiator matrix. Using [LCK+10] as a guideline, we use random matrices of size 2 × 2, each entry chosen uniformly at random (and independently) from [0, 1]. Matrices with sum of entries smaller than 1 are discarded to avoid highly disconnected graphs.

Figure 1: Plots for graphs of size 64. The number of objectives increases from left to right. The X axis is the cardinality parameter k, and the Y axis is the number of vertices covered by MWU (resp. SATURATE) minus the number of vertices covered by GREEDY for the same k. MWU outperforms the other algorithms in all cases, with a max. gain (on SATURATE) of 9.80% for m = 10, 12.14% for m = 50 and 16.12% for m = 100.

Figure 2: Plots for graphs of size 512. MWU outperforms SATURATE in all cases, with a max. gain (on SATURATE) of 7.95% for m = 10, 10.08% for m = 50 and 10.01% for m = 100.

Figure 3: Plots for graphs of size 1024. MWU outperforms SATURATE in all cases, with a max.
gain (on SATURATE) of 6.89% for m = 10, 5.02% for m = 50 and 7.4% for m = 100.

5 Open Problems

A natural open question here is whether one can achieve similar approximations under a general matroid constraint. It is also of interest to ask whether there are fast algorithms with a guarantee closer to 1 − 1/e, in contrast to the guarantee of (1 − 1/e)² shown here. Further, it is unclear if one can extend the results right up to m = o(k).

Acknowledgments

The author gratefully acknowledges partial support from ONR Grant N00014-17-1-2194. The author would also like to thank James B. Orlin and the anonymous referees for their insightful comments and feedback on early drafts of this work.

References

[AG12] Y. Azar and I. Gamzu. Efficient submodular function maximization under linear packing constraints. In ICALP, pages 38–50, 2012.

[AHK12] S. Arora, E. Hazan, and S. Kale. The multiplicative weights update method: a meta-algorithm and applications. Theory of Computing, 8:121–164, 2012.

[AK07] S. Arora and S. Kale. A combinatorial, primal-dual approach to semidefinite programs. In STOC, pages 227–236, 2007.

[Bie06] D. Bienstock. Potential function methods for approximately solving linear programming problems: theory and practice, volume 53. Springer, 2006.

[BV14] A. Badanidiyuru and J. Vondrák. Fast algorithms for maximizing submodular functions. In SODA '14, pages 1497–1514. SIAM, 2014.

[CCPV11] G. Calinescu, C. Chekuri, M. Pál, and J. Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740–1766, 2011.

[CJV15] C. Chekuri, T.S. Jayram, and J. Vondrák. On multiplicative weight updates for concave and submodular function maximization. In ITCS, pages 201–210, 2015.

[CLSS17] R. S. Chen, B. Lucier, Y. Singer, and V. Syrgkanis. Robust optimization for non-convex objectives.
In Advances in Neural Information Processing Systems, pages 4705–4714, 2017.

[CVZ10] C. Chekuri, J. Vondrák, and R. Zenklusen. Dependent randomized rounding via exchange properties of combinatorial structures. In FOCS '10, pages 575–584. IEEE, 2010.

[DV12] S. Dobzinski and J. Vondrák. From query complexity to computational complexity. In STOC '12, pages 1107–1116. ACM, 2012.

[EAVSG09] K. El-Arini, G. Veda, D. Shahaf, and C. Guestrin. Turning down the noise in the blogosphere. In ACM SIGKDD, pages 289–298, 2009.

[Fei98] U. Feige. A threshold of ln n for approximating set cover. Journal of the ACM (JACM), 45(4):634–652, 1998.

[Fle00] L. Fleischer. Approximating fractional multicommodity flow independent of the number of commodities. SIAM Journal on Discrete Mathematics, 13(4):505–520, 2000.

[FNS11] M. Feldman, J.S. Naor, and R. Schwartz. A unified continuous greedy algorithm for submodular maximization. In FOCS '11, pages 570–579, 2011.

[FW14] Y. Filmus and J. Ward. Monotone submodular maximization over a matroid via non-oblivious local search. SIAM Journal on Computing, 43:514–542, 2014.

[GK94] M. Grigoriadis and L. Khachiyan. Fast approximation schemes for convex programs with many blocks and coupling constraints. SIAM Journal on Optimization, 4(1):86–107, 1994.

[GK04] N. Garg and R. Khandekar. Fractional covering with upper bounds on the variables: Solving LPs with negative entries. In European Symposium on Algorithms, pages 371–382, 2004.

[GK07] N. Garg and J. Koenemann. Faster and simpler algorithms for multicommodity flow and other fractional packing problems. SIAM Journal on Computing, 37(2):630–652, 2007.

[GKS05] C. Guestrin, A. Krause, and A.P. Singh. Near-optimal sensor placements in Gaussian processes. In Proceedings of the 22nd international conference on Machine learning, pages 265–272.
ACM, 2005.

[GR06] A. Globerson and S. Roweis. Nightmare at test time: robust learning by feature deletion. In Proceedings of the 23rd international conference on Machine learning, pages 353–360. ACM, 2006.

[HK16] X. He and D. Kempe. Robust influence maximization. In SIGKDD, pages 885–894, 2016.

[KG05] A. Krause and C. Guestrin. Near-optimal nonmyopic value of information in graphical models. In UAI '05, pages 324–331, 2005.

[KGGK06] A. Krause, C. Guestrin, A. Gupta, and J. Kleinberg. Near-optimal sensor placements: Maximizing information while minimizing communication cost. In Proceedings of the 5th international conference on Information processing in sensor networks, pages 2–10. ACM, 2006.

[KKT03] D. Kempe, J. Kleinberg, and É. Tardos. Maximizing the spread of influence through a social network. In ACM SIGKDD, pages 137–146, 2003.

[KLG+08] A. Krause, J. Leskovec, C. Guestrin, J. VanBriesen, and C. Faloutsos. Efficient sensor placement optimization for securing large water distribution networks. Journal of Water Resources Planning and Management, 134(6):516–526, 2008.

[KMGG08] A. Krause, H.B. McMahan, C. Guestrin, and A. Gupta. Robust submodular observation selection. Journal of Machine Learning Research, 9:2761–2801, 2008.

[KY07] C. Koufogiannakis and N. Young. Beating simplex for fractional packing and covering linear programs. In FOCS, pages 494–504, 2007.

[LB11] H. Lin and J. Bilmes. A class of submodular functions for document summarization. In ACL, pages 510–520, 2011.

[LCK+10] J. Leskovec, D. Chakrabarti, J. Kleinberg, C. Faloutsos, and Z. Ghahramani. Kronecker graphs: An approach to modeling networks. JMLR, 11:985–1042, 2010.

[LKG+07] J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, and N. Glance. Cost-effective outbreak detection in networks.
In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 420–429. ACM, 2007.

[MBK+15] B. Mirzasoleiman, A. Badanidiyuru, A. Karbasi, J. Vondrák, and A. Krause. Lazier than lazy greedy. In AAAI, 2015.

[NW78] G.L. Nemhauser and L.A. Wolsey. Best algorithms for approximating the maximum of a submodular set function. Mathematics of Operations Research, 3(3):177–188, 1978.

[NWF78] G.L. Nemhauser, L.A. Wolsey, and M.L. Fisher. An analysis of approximations for maximizing submodular set functions—I. Mathematical Programming, 14(1):265–294, 1978.

[OSU18] J.B. Orlin, A.S. Schulz, and R. Udwani. Robust monotone submodular function maximization. Mathematical Programming, 172(1):505–537, Nov 2018.

[OUS+08] A. Ostfeld, J.G. Uber, E. Salomons, J.W. Berry, W.E. Hart, C.A. Phillips, J.P. Watson, G. Dorini, P. Jonkergouw, Z. Kapelan, and F. di Pierro. The battle of the water sensor networks (BWSN): A design challenge for engineers and algorithms. Journal of Water Resources Planning and Management, 2008.

[PST91] S. Plotkin, D. Shmoys, and E. Tardos. Fast approximation algorithms for fractional packing and covering problems. In FOCS, pages 495–504, 1991.

[PY00] C. Papadimitriou and M. Yannakakis. On the approximability of trade-offs and optimal access of web sources. In FOCS, pages 86–92, 2000.

[SB14] A. Singla and I. Bogunovic. Near-optimally teaching the crowd to classify. In ICML, pages 154–162, 2014.

[TCG+09] M. Thoma, H. Cheng, A. Gretton, J. Han, H.P. Kriegel, A.J. Smola, L. Song, S.Y. Philip, X. Yan, and K.M. Borgwardt. Near-optimal supervised feature selection among frequent subgraphs. In SDM, pages 1076–1087. SIAM, 2009.

[VCZ11] J. Vondrák, C. Chekuri, and R. Zenklusen. Submodular function maximization via the multilinear relaxation and contention resolution schemes.
In STOC '11, pages 783–792. ACM, 2011.

[Von08] J. Vondrák. Optimal approximation for the submodular welfare problem in the value oracle model. In STOC, pages 67–74, 2008.

[Von13] J. Vondrák. Symmetry and approximability of submodular maximization problems. SIAM Journal on Computing, 42(1):265–304, 2013.

[You95] N. Young. Randomized rounding without solving the linear program. In SODA, volume 95, pages 170–178, 1995.

[You01] N.E. Young. Sequential and parallel algorithms for mixed packing and covering. In FOCS, pages 538–546, 2001.