{"title": "Learning without the Phase: Regularized PhaseMax Achieves Optimal Sample Complexity", "book": "Advances in Neural Information Processing Systems", "page_first": 8641, "page_last": 8652, "abstract": "The problem of estimating an unknown signal, $\\mathbf x_0\\in \\mathbb R^n$, from a vector $\\mathbf y\\in \\mathbb R^m$ consisting of $m$ magnitude-only measurements of the form $y_i=|\\mathbf a_i\\mathbf x_0|$, where $\\mathbf a_i$'s are the rows of a known measurement matrix $\\mathbf A$ is a classical problem known as phase retrieval. This problem arises when measuring the phase is costly or altogether infeasible. In many applications in machine learning, signal processing, statistics, etc., the underlying signal has certain structure (sparse, low-rank, finite alphabet, etc.), opening of up the possibility of recovering $\\mathbf x_0$ from a number of measurements smaller than the ambient dimension, i.e., $m Cn amplitude-only measurements, where C > 1 is a constant that\ndepends on the algorithm as well as the measurement vectors. However, many interesting signals\nin practice contain fewer degrees of freedom than the ambient dimension (sparse signals, low-rank\nmatrices, \ufb01nite alphabet signals, etc.). Such low-dimensional structures open up the possibility of\nperfect signal recovery with a number of measurements signi\ufb01cantly smaller than n.\n\n1.1 Summary of contributions\n\nIn this paper we propose a new approach for recovering structured signals. Inspired by the PhaseMax\nalgorithm, we introduce a new convex formulation and investigate necessary and suf\ufb01cient conditions,\nin terms of the number of measurements, for perfect recovery. We refer to this new framework as\nregularized PhaseMax. The constrained set in this optimization is obtained by relaxing the non-convex\nequality constraints in the original phase retrieval problem to convex inequality constraints. The\nobjective function consists of two terms. 
One is a linear functional that relies on an initial estimate of the true signal, which must be externally provided. The second term is an additive regularization term that is formed based on a priori structural information about the signal.

We utilize the recently developed Convex Gaussian Min-Max Theorem (CGMT) [39] to precisely compute the necessary and sufficient number of measurements for perfect signal recovery when the entries of the measurement matrix are i.i.d. Gaussian. To the best of our knowledge, this is the first convex optimization formulation for the problem of structured signal recovery from phaseless linear Gaussian measurements that provably requires an order-optimal number of measurements. In this paper we focus on real signals and real measurements. The complex case is more involved, requires a different analysis, and will be considered in a separate work. Through our analysis, we make the following main contributions:

• We first provide a sufficient recovery condition, in Section 3.1, in terms of the number of measurements, for perfect signal recovery. We use this to infer that our proposed method is order-wise optimal.
• We characterize the exact phase transition behavior for the class of absolutely scalable regularization functions.
• We apply our findings to two special examples: unstructured signal recovery and sparse recovery. We observe that the theory matches the results of the numerical simulations well for these two examples.

1.2 Prior work

Phase retrieval for structured signals has gained significant attention in recent years. A review of all of the results is beyond the scope of this paper, and we instead briefly mention some of the most relevant literature for the Gaussian measurement model. Oymak et al.
[30] analyzed the performance of the regularized PhaseLift algorithm and observed that the required sample complexity is of a suboptimal order compared to the optimal number of measurements required when phase information is available. For the special case of sparse phase retrieval, similar results have been reported in [24], which indicates that O(k² log n) measurements are required for recovering a k-sparse signal using regularized PhaseLift. Recently, there has been a stream of work on solving phase retrieval using non-convex methods [6, 47]. In particular, Soltanolkotabi [37] has shown that amplitude-based Wirtinger flow can break the O(k² log n) barrier. We also note that the paper [20] analyzed the PhaseMax algorithm with an ℓ1 regularizer and observed that it achieves perfect recovery with O(k log(n/k)) samples, provided a well-correlated initialization point.

2 Preliminaries

2.1 Problem setup

Let x0 ∈ R^n denote the underlying structured signal. We consider the real phase retrieval problem with the goal of recovering x0 from m magnitude-only measurements of the form

    y_i = |a_i^T x0|,  i = 1, 2, . . . , m,        (1)

where {a_i ∈ R^n, i = 1, . . . , m} is the set of (known) measurement vectors. In practice, this set is determined by the experimental setup; however, throughout this paper (for analysis purposes) we assume that the a_i's are drawn independently from a Gaussian distribution with mean zero and covariance matrix I_n. In order to exploit the structure of the signal, we assume f(·) is a convex function that measures the "complexity" of the structured solution. The regularized PhaseMax algorithm also relies on an initial estimate of the true signal; xinit is used to represent this initial guess. Our analysis is based on the critical assumption that both xinit and x0 are independent of all the measurement vectors.
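Concretely, when f is taken to be the ℓ1 norm, the recovery program (2) introduced next is a linear program, since both the objective and the relaxed constraints |a_i^T x| ≤ y_i are polyhedral. The following sketch is illustrative only and is not the authors' code: the helper name, the way xinit is constructed from x0 plus fresh noise (which keeps it independent of the measurement vectors, as assumed above), and the parameter choices are all ours.

```python
import numpy as np
from scipy.optimize import linprog

def regularized_phasemax_l1(A, y, x_init, lam):
    """Solve min -x_init^T x + lam*||x||_1  s.t. |A x| <= y  as an LP.

    Epigraph trick: introduce t with t >= |x| componentwise, so the LP
    variable is z = [x; t] in R^{2n}. Illustrative sketch only.
    """
    m, n = A.shape
    c = np.concatenate([-x_init, lam * np.ones(n)])
    I = np.eye(n)
    Z = np.zeros((m, n))
    # Constraints:  x - t <= 0,  -x - t <= 0,  A x <= y,  -A x <= y.
    A_ub = np.block([[I, -I], [-I, -I], [A, Z], [-A, Z]])
    b_ub = np.concatenate([np.zeros(2 * n), y, y])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (2 * n), method="highs")
    return res.x[:n]

rng = np.random.default_rng(0)
n, k, m = 20, 3, 300                     # heavily oversampled toy instance
x0 = np.zeros(n)
x0[:k] = 1.0 / np.sqrt(k)                # unit-norm 3-sparse target
A = rng.standard_normal((m, n))          # i.i.d. Gaussian measurement vectors
y = np.abs(A @ x0)                       # magnitude-only measurements (1)
x_init = 0.9 * x0 + 0.1 * rng.standard_normal(n)  # crude correlated initialization
x_hat = regularized_phasemax_l1(A, y, x_init, lam=0.2)
print(np.linalg.norm(x_hat - x0))        # should be near zero this far above the transition
```

At the optimum the epigraph variables satisfy t = |x| componentwise, so λ·1^T t equals λ||x̂||₁; any LP solver could stand in for SciPy's HiGHS backend.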
The constraint set in regularized PhaseMax is derived by simply relaxing the equality constraints in (1) into convex inequality constraints. We introduce the following convex optimization problem to recover the signal:

    x̂ = argmin_{x ∈ R^n}  L_λ(x) = −xinit^T x + λ f(x)
         subject to:  |a_i^T x| ≤ y_i,  for 1 ≤ i ≤ m.        (2)

The function f is assumed to be sign invariant, i.e., f(x) = f(−x) for all x ∈ R^n (−x has the same "complexity" as x). Note that because of the global phase ambiguity of the measurements in (1), we can only estimate x0 up to a sign. Up to this sign ambiguity, we can use the normalized mean squared error (NMSE), defined as ||x̂ − x0||² / ||x0||², to measure the performance of the solution. In this paper we investigate the conditions under which the optimization program (2) uniquely identifies the true signal, i.e., x̂ = x0 (up to the sign). Our results are asymptotic, valid when m, n → ∞.

2.2 Background on convex analysis

Our results give the required number of measurements as a function of certain geometric properties of the descent cone of the objective function. Here, we recall these definitions from convex analysis.

Definition 1 (Descent cone). For a function R : R^n → R, the descent (tangent) cone at a point x is defined as

    T_R(x) = cone({z ∈ R^n : R(x + z) ≤ R(x)}),        (3)

where cone(S) denotes the closed conical hull of the set S.

Definition 2. Let S be a closed convex set in R^n. For x ∈ R^n, the projection of x onto S, denoted by Π_S(x), is defined as

    Π_S(x) := argmin_{y ∈ S} ||x − y||,        (4)

where ||·|| is the Euclidean norm. The distance function is defined as dist_S(x) = ||x − Π_S(x)||.

Definition 3
(Statistical dimension) [1]. The statistical dimension of a closed convex cone C in R^n is defined as

    d(C) = E_g[ ||Π_C(g)||² ],        (5)

where g ∈ R^n is a standard normal vector.

The statistical dimension canonically extends the dimension of linear subspaces to convex cones. This quantity has been extensively studied in linear inverse problems. It is well known that as n → ∞, m > d(T_{L_λ}(x0)) is the necessary and sufficient condition for perfect signal recovery under noiseless linear Gaussian measurements [11, 38]. Our analysis indicates that, given phaseless linear measurements, the regularized PhaseMax algorithm requires O(d(T_{L_λ}(x0))) measurements for perfect signal reconstruction. Therefore, it is order-wise optimal in that sense.

3 Main Results

In this section we present the main results of the paper, which provide the required number of measurements for perfect signal recovery in the regularized PhaseMax optimization (2). This gives the value m0 = m0(n, x0, xinit, λ) such that the regularized PhaseMax algorithm uniquely identifies the underlying signal x0 with high probability whenever m > m0.

In Section 3.1, we find sufficient conditions for recovery of the underlying signal. Theorem 1 provides an upper bound on the number of measurements that is equal to a constant factor times the statistical dimension of the descent cone, d(T_{L_λ}(x0)). Therefore, even though our analysis is not exact in this section, it leads us to the important observation that our proposed method is order-wise optimal in terms of the required sample complexity for perfect signal reconstruction.

In Section 3.2, we provide an exact analysis of the phase transition behavior of regularized PhaseMax when the regularizer is an absolutely scalable function. We apply this result to the case of unstructured phaseless recovery as well as sparse phaseless recovery to compute the exact phase transitions.
We then compare the results of the theory with the empirical results from numerical simulations.

3.1 Sufficient recovery condition

Let P := (1/||x0||²) x0 x0^T and P⊥ := I − P denote the projectors onto the span of x0 and onto its orthogonal complement, respectively, where ||·|| denotes the ℓ2-norm. We also define d^(n) := d(T_{L_λ}(x0)) as the statistical dimension of the descent cone of the objective function at the point x0. Our analysis rigorously characterizes the phase transition behavior of regularized PhaseMax in the large system limit, i.e., when n → ∞ while m and d^(n) grow at a proportional ratio δ = m / d^(n). δ is often called the oversampling ratio. Here, the superscript (n) is used to denote the elements of a sequence; to streamline notation, we often drop it when it is understood from the context.

Theorem 1 provides sufficient conditions for the successful recovery of x0. The recovery threshold depends on λ and the initialization vector, xinit. We define ρinit := xinit^T x0 to quantify the caliber of the initial estimate. Due to the sign invariance property of the solution, we can assume without loss of generality that ρinit ≥ 0. Before stating the theorem, we introduce the function R(·) : (2, +∞) → R+.

[Figure 1: R(x) for different values of x. R is a monotonically decreasing function.]

Definition 4. For x > 2, R(x) is the unique nonzero solution t of the equation

    t² = (x/π) ((1 + t²) arctan(t) − t).        (6)

Figure 1 depicts the function R(x) for different input values x. As observed, R(x) is decreasing in x and approaches zero as x grows to infinity. It can be shown that for large values of x, R(x) decays at the rate 1/x.

Theorem 1 (Sufficient recovery condition).
For a fixed oversampling ratio δ > 2, the regularized PhaseMax optimization (2) perfectly recovers the target signal (in the sense that lim_{n→∞} P{||x̂ − x0||² > ε ||x0||²} = 0 for any fixed ε > 0) if

    R(δ) < sup_{v ∈ ∂L_λ(x0)} ||P v|| / ||P⊥ v||,        (7)

where ∂L_λ(x0) denotes the sub-differential set of the objective function L_λ(·) at the point x0.

It is worth noting that ∂L_λ(x0) is a convex and compact set, and it can be expressed in terms of the sub-differential of the regularization function, ∂f(x0), as follows:

    ∂L_λ(x0) = {λu − xinit : u ∈ ∂f(x0)}.        (8)

Observe that since R(·) is a monotonically decreasing function, the inequality (7) gives a lower bound on the oversampling ratio δ. Indeed, we can restate the result in terms of this lower bound as the following corollary.

Corollary 1. If there exists a fixed constant τ > 0 such that

    sup_{v ∈ ∂L_λ(x0)} ||P v|| / ||P⊥ v|| > τ,        (9)

then the regularized PhaseMax optimization (2) achieves perfect recovery for δ > C, where C is a constant that depends only on τ.

Proof. This is an immediate consequence of Theorem 1, choosing C = R^{-1}(τ) and noting that R(·) is monotonically decreasing.

This result indicates that if xinit and λ are chosen in such a way that the inequality (9) is satisfied for some positive constant τ, then one needs m > C d^(n) measurement samples for perfect recovery, where C is a constant and d^(n) (= d) is the statistical dimension of the descent cone of the objective function at the point x0. As motivating examples, we use Theorem 1 to find upper bounds on the phase transition when x0 has no structure or is a sparse signal.

Example 1: Assume the target signal x0 has no a priori structure.
The objective function in this case would be L(x) = −xinit^T x, and ∂L(x0) = {−xinit}. It can be shown that the statistical dimension is d^(n) = n − 1/2. Due to the absence of the regularization term in this case, without loss of generality we can assume ||x0|| = ||xinit|| = 1. Theorem 1 provides the following sufficient condition for perfect recovery:

    ||P xinit|| / ||P⊥ xinit|| = ρinit / √(1 − ρinit²) > R(δ).        (10)

This indicates that O(n) measurements are sufficient for perfect recovery as long as ρinit ≥ ρ0, where ρ0 > 0 is a constant that does not approach zero as n → ∞. The exact phase transition for the unstructured case (PhaseMax) has been derived in [14], which is consistent with this result. Figure 2(a) shows the result of numerical simulation for different values of δ and ρinit, when n = 128. As depicted in the figure, the sufficient recovery condition from Theorem 1 is approximately a factor of 2 away from the actual phase transition.

[Figure 2: Phase transition regimes for the regularized PhaseMax problem in terms of the oversampling ratio δ and ρinit = xinit^T x0, for the cases of (a) x0 with no structure and (b) sparse signal recovery. The blue lines indicate the theoretical estimate of the phase transition derived from Theorem 2. The red line in (a) corresponds to the upper bound calculated from Theorem 1. In the simulations we used signals of size n = 128; the results are averaged over 10 independent realizations of the measurements.]

Example 2: Let x0 be a k-sparse signal.
In this case we use ||·||₁ as the regularization function. We show in Section 5.5 that if λ > c/√k, then d^(n) ≤ C k log(n/k), for some constants c, C > 0. This matches the well-known order for the statistical dimension derived in the compressive sensing literature [38]. Moreover, in order to satisfy the condition in Corollary 1 we need ρinit / ||x0||₁ > (1 + ε)λ, for some ε > 0. Therefore, x0 can be perfectly recovered from O(k log(n/k)) samples when the hyper-parameter λ is tuned properly, i.e., c/√k < λ < ρinit / ||x0||₁. Figure 3(a) compares this upper bound with the precise analysis that we present in Section 3.2. As depicted in that figure, the sufficient recovery condition is a valid upper bound on the phase transition, but it is not sharp.

3.2 Precise phase transition

So far, we have provided a sufficient condition for perfect signal recovery in regularized PhaseMax. In this section we give the exact phase transition, i.e., the minimum number of measurements m0 required for perfect recovery of the unknown vector x0. For our analysis, we assume that the function f(x) is absolutely homogeneous (scalable), i.e., f(τ·x) = |τ|·f(x) for any scalar τ. This covers a large range of regularization functions, such as norms and semi-norms. Let ∂L_λ^⊥(x0) ⊂ R^n denote the projection of the sub-differential set onto the orthogonal complement of x0, i.e.,

    ∂L_λ^⊥(x0) = {P⊥ u : u ∈ ∂L_λ(x0)},        (11)

which is a convex and compact set.
To state the result in a general framework, we require a further assumption on the functions L_λ^(n)(·).

[Figure 3: (a) Comparing the upper bounds on the phase transition derived from Theorem 1 (dashed lines) with the precise phase transition from Theorem 2 (solid lines), for three values of the sparsity factor s = k/n. (b) The phase transition behavior as a function of the regularization parameter λ, derived from the result of Theorem 2.]

Assumption 1 (Asymptotic functionals). We say Assumption 1 holds if the following uniform limits exist as n → ∞:

    β − E[ (1/√n) h^T Π_{∂L_λ^⊥(x0)}( (β/√n) h ) ]  →  F_λ(β),   and
    E[ dist_{∂L_λ^⊥(x0)}( (β/√n) h ) ]  →  G_λ(β),        (12)

where h ∈ R^n has i.i.d. standard normal entries, the convergence is uniform, and F_λ, G_λ : R+ → R denote the limit functions.

One can show that, under some mild conditions on the regularization function f(·), Assumption 1 holds and moreover F_λ(β) = G_λ(β) G_λ'(β), where G_λ'(·) denotes the derivative of the function G_λ(·). This assumption holds in particular for the class of separable regularizers, where f(v) = Σ_i f̃(v_i) (e.g., the ℓ1 norm in the case of sparse phase retrieval). Later in this section, we verify the validity of this assumption for the two examples discussed earlier in Section 3. Our precise phase transition results give the required number of measurements as the solution of a system of two nonlinear equations in two unknowns. We define a new parameter α := m/n, where α_opt = m0/n indicates the exact phase transition of the regularized PhaseMax optimization. The following theorem gives an implicit formula for α_opt.

Theorem 2 (Precise phase transition).
Let x̂ be the solution to the regularized PhaseMax optimization (2) with the objective function L_λ(x) = −xinit^T x + λf(x), where the convex function f(·) is absolutely homogeneous and Assumption 1 holds. The regularized PhaseMax optimization perfectly recovers the target signal x0 if and only if:

1. α > α_opt, where α_opt is the solution of the following system of nonlinear equations in the two unknowns α and β,

    { −G_λ(β) L_λ(x0) = tan( (π/(αβ)) F_λ(β) ) · ( G_λ²(β) − β F_λ(β) ),
      tan( (π/(αβ)) F_λ(β) ) · ( G_λ(β) + (π/(αβ)) F_λ(β) L_λ(x0) ) = (π/(αβ)) F_λ(β) G_λ(β),        (13)

2. and L_λ(x0) < L_λ(0) = 0,

where the functions F_λ(·) and G_λ(·) are defined in (12).

A few remarks are in order for this theorem:

[Solving equations (13)] The system of nonlinear equations (13) involves only the two scalars β and α, and the functions F_λ(β) and G_λ(β) are determined by the objective function L_λ(x). For our numerical simulations in the examples of Section 3.2.1 and Section 3.2.2, we used a fixed-point iteration method that quickly finds the solution given a proper initialization.

[Tuning λ] Theorem 2 requires the objective function to satisfy L_λ(x0) = λf(x0) − ρinit < 0. Therefore, it is necessary to choose λ such that λ < ρinit / f(x0). Additional assumptions on the unknown vector x0 enable us to calculate the proper range for λ. For instance, if we consider a random ensemble for x0 where the non-zero entries of x0 are Gaussian (or other) random variables, E[f(x0)] gives a reasonable estimate of f(x0) that can help us choose λ appropriately.
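As a quick numerical illustration of this tuning rule (our own sketch, for one assumed random ensemble, not a prescription from the paper): if the k non-zero entries of x0 are i.i.d. N(0, 1/k), so that E||x0||² = 1, then E||x0||₁ = √(2k/π), and for the regularizer f(x) = ||x||₁/√n used in Section 3.2.2 the condition λ < ρinit / f(x0) suggests, on average, λ below roughly ρinit·√(πn/(2k)).

```python
import numpy as np

def expected_l1_ksparse(k: int) -> float:
    # E||x0||_1 when the k nonzeros are i.i.d. N(0, 1/k):
    # k * (1/sqrt(k)) * E|N(0,1)| = sqrt(k) * sqrt(2/pi).
    return np.sqrt(2.0 * k / np.pi)

rng = np.random.default_rng(1)
n, k, trials = 1000, 50, 4000
# Monte Carlo estimate of E||x0||_1 over `trials` random draws of x0.
samples = np.abs(rng.standard_normal((trials, k)) / np.sqrt(k)).sum(axis=1)
mc_mean = samples.mean()
print(mc_mean, expected_l1_ksparse(k))   # the two should agree closely

# With f(x) = ||x||_1 / sqrt(n), the requirement lambda < rho_init / f(x0)
# gives an (average-case) admissible upper end for lambda:
rho_init = 0.8
lam_max = rho_init * np.sqrt(n) / mc_mean
print(lam_max)
```

The constant `rho_init = 0.8` is an assumed initialization quality, chosen only to make the estimate concrete.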
We will see an example of such a case in Section 3.2.2. Figure 3(b) shows an example of how the phase transition of regularized PhaseMax, or equivalently the required sample complexity, behaves as a function of the hyper-parameter λ.

In the next sections, we use the result of Theorem 2 to compute the exact phase transition for the case of unstructured signal recovery as well as sparse signal recovery. Since the regularizer f(x) is absolutely scalable, for both examples we assume that ||x0|| = 1.

3.2.1 Unstructured signal recovery

When there is no a priori information about the structure of the target signal, we use the following optimization (PhaseMax) for signal recovery:

    x̂ = argmin_{x ∈ R^n}  L(x) = −xinit^T x
         subject to:  |a_i^T x| ≤ y_i,  for 1 ≤ i ≤ m.        (14)

Due to the absence of the regularization term, without loss of generality we can assume ||xinit|| = 1. Moreover, L(x0) = −ρinit < 0, which shows that the second condition in Theorem 2 is satisfied.
To apply the result of our theorem, we first compute explicit formulas for the functions F_λ(β) and G_λ(β):

    F_λ(β) = β,   G_λ(β) = √(β² + 1 − ρinit²).        (15)

We can now form the system of nonlinear equations (13) as follows:

    { √(β² + 1 − ρinit²) · ρinit / (1 − ρinit²) = tan(π/α),
      tan(π/α) · ( √(β² + 1 − ρinit²) − π ρinit / α ) = (π/α) √(β² + 1 − ρinit²).        (16)

Finally, solving equations (16) yields the following necessary and sufficient condition for perfect recovery:

    π / (α tan(π/α)) > 1 − ρinit²,        (17)

which also verifies the result of [14].

Figure 2(a) shows the result of numerical simulations of the PhaseMax algorithm for different values of ρinit and δ. The color intensity of each square in Figure 2 represents the error of PhaseMax in recovering x0. As seen in the figure, although our theoretical results have been established for the asymptotic setting (when the problem dimensions approach infinity), the blue line, which is derived from (17), predicts the phase transition reasonably well for n = 128. The sufficient condition derived from Theorem 1 is also depicted by the red line in the same figure.

3.2.2 Sparse recovery

We consider the case where the target signal x0 is sparse with k non-zero entries.
The convex function f(x) = (1/√n) ||x||₁, which is known to be a proper regularizer that enforces sparsity [41], is used in the regularized PhaseMax optimization to recover x0:

    x̂ = argmin_{x ∈ R^n}  L_λ(x) = −xinit^T x + (λ/√n) ||x||₁
         subject to:  |a_i^T x| ≤ y_i,  for 1 ≤ i ≤ m.        (18)

To streamline notation, we assume the non-zero entries of x0 are the first k entries, and we decompose a vector v ∈ R^n as v = [v_Δ; v_Δc], where v_Δ ∈ R^k denotes the first k entries of v and v_Δc ∈ R^{n−k} the remaining n − k entries. As m, n → ∞, we would like to apply the result of Theorem 2 to compute the exact phase transition. Due to the rotational invariance of the Gaussian distribution, it can be shown that multiplying the last n − k entries of xinit by a unitary matrix U ∈ R^{(n−k)×(n−k)} does not change the phase transition behavior in (2). Hence, we can assume the entries of xinit_Δc have a Gaussian distribution, i.e.,

    xinit = [xinit_Δ; xinit_Δc],   xinit_Δc = (1/√(n−k)) ||xinit_Δc|| g,        (19)

where g ∈ R^{n−k} has standard normal entries. This observation enables us to establish the following lemma.

Lemma 1. Consider the optimization problem (18) to recover the k-sparse signal x0. We assume the
We assume the\ninit, where sign(\u00b7) denotes\nentries of xinit are distributed as in (19) and de\ufb01ne \u02dc\u03c1 := 1\u221a\nthe component-wise sign function. Then, Assumption 1 holds with:\n\nsign(x\u2206\n\n0 )Tx\u2206\n\nk\n\n1\u221a\nn \u2212 k\n\n||x\u2206c\n\ninit|| g ,\n\nwhere Q(\u00b7) is the tail distribution of the standard normal distribution, H has standard normal\ndistribution and s := k/n is the sparsity factor. The shrinkage function shrink(\u00b7,\u00b7) : R \u00d7 R+ \u2192 R+\nis de\ufb01ned as:\n\nshrink(x, \u03c4 ) = (|x| \u2212 \u03c4 )1{|x| \u2265 \u03c4} .\n\n(21)\nIt is worth noting that the function shrink(\u00b7,\u00b7) also appeared in computing the statistical dimension\nfor (cid:96)1 regularization (see Section 5.5) which indicates some implicit relation to \u03b1opt.\nWe have numerically computed the solution of the nonlinear system (20). Figure 2(a), and Figure\n2(b) shows the error of regularized PhaseMax over a range of \u03c1init and \u03b4. The comparison between\nour upper bound derived from Theorem 1 and precise analysis of Theorem 2 is depicted in Figure 3(a)\nfor three values of the sparsity factor s = 0.05, 0.1, 0.2. Observe that the upper bound is only a\nconstant factor away from the precise phase transition, while its derivation involves simpler formulas.\nFinally, Figure 3(b), illustrates impact of the regularization parameter \u03bb on the phase transition of\nthe regularized PhaseMax optimization for four values of \u03c1init. The values of \u03bb in this \ufb01gure are\nnormalized by \u03c1init\n\n(cid:107)x0(cid:107) , which is the maximum acceptable value of \u03bb in the regularized PhaseMax.\n\n\u221a\n\nn\n\n4 Conclusion and Future Directions\n\nIn this paper, we introduced a new convex optimization framework, regularized PhaseMax, to\nsolve the structured phase retrieval problem. 
We have shown that, given a proper initialization, the regularized PhaseMax optimization perfectly recovers the underlying signal from a number of phaseless measurements that is only a constant factor away from the number of measurements required when the phase information is available. We explicitly computed this constant factor.

An important (yet still open) research problem is to investigate the required sample complexity to construct a proper initialization vector, xinit. As an example, for the case of sparse phase retrieval, even though our analysis indicates that O(k log(n/k)) is the required sample complexity of the regularized PhaseMax optimization, the best known initialization technique [6] needs O(k² log n) samples to generate a meaningful initialization, which is suboptimal. An important future direction is to study initialization techniques that break this sample complexity barrier, or to use information-theoretic arguments (as in [28]) to show that the sample complexity for the initialization cannot be improved.

To form the objective function in regularized PhaseMax, we exploited a priori knowledge about the structure of the underlying signal. In many practical settings such prior information is not available. There have been some interesting recent publications (e.g., [4, 48]) which introduce efficient algorithms to learn the structure of the underlying signal. An interesting research direction is to investigate new optimization frameworks that do not rely on prior information about the structure of the underlying signal.

References

[1] Dennis Amelunxen, Martin Lotz, Michael B McCoy, and Joel A Tropp. Living on the edge: Phase transitions in convex programs with random data. Information and Inference: A Journal of the IMA, 3(3):224–294, 2014.

[2] Sohail Bahmani and Justin Romberg. Efficient compressive phase retrieval with constrained sensing vectors.
In Advances in Neural Information Processing Systems, pages 523\u2013531, 2015.\n\n[3] Sohail Bahmani and Justin Romberg. Phase retrieval meets statistical learning theory: A \ufb02exible\n\nconvex relaxation. In Arti\ufb01cial Intelligence and Statistics, pages 252\u2013260, 2017.\n\n[4] Milad Bakhshizadeh, Arian Maleki, and Shirin Jalali. Compressive phase retrieval of structured\nsignals. In 2018 IEEE International Symposium on Information Theory (ISIT), pages 2291\u20132295.\nIEEE, 2018.\n\n[5] Mohsen Bayati and Andrea Montanari. The lasso risk for gaussian matrices. IEEE Transactions\n\non Information Theory, 58(4):1997\u20132017, 2012.\n\n[6] T Tony Cai, Xiaodong Li, Zongming Ma, et al. Optimal rates of convergence for noisy sparse\nphase retrieval via thresholded wirtinger \ufb02ow. The Annals of Statistics, 44(5):2221\u20132251, 2016.\n\n[7] Emmanuel J Candes, Yonina C Eldar, Thomas Strohmer, and Vladislav Voroninski. Phase\n\nretrieval via matrix completion. SIAM review, 57(2):225\u2013251, 2015.\n\n[8] Emmanuel J Candes, Xiaodong Li, and Mahdi Soltanolkotabi. Phase retrieval from coded\n\ndiffraction patterns. Applied and Computational Harmonic Analysis, 39(2):277\u2013299, 2015.\n\n[9] Emmanuel J Candes, Xiaodong Li, and Mahdi Soltanolkotabi. Phase retrieval via wirtinger\n\ufb02ow: Theory and algorithms. IEEE Transactions on Information Theory, 61(4):1985\u20132007,\n2015.\n\n[10] Emmanuel J Candes, Thomas Strohmer, and Vladislav Voroninski. Phaselift: Exact and stable\nsignal recovery from magnitude measurements via convex programming. Communications on\nPure and Applied Mathematics, 66(8):1241\u20131274, 2013.\n\n[11] Venkat Chandrasekaran, Benjamin Recht, Pablo A Parrilo, and Alan S Willsky. The convex\ngeometry of linear inverse problems. Foundations of Computational mathematics, 12(6):805\u2013\n849, 2012.\n\n[12] Yuxin Chen and Emmanuel Candes. Solving random quadratic systems of equations is nearly\nas easy as solving linear systems. 
In Advances in Neural Information Processing Systems, pages 739–747, 2015.

[13] Oussama Dhifallah and Yue M Lu. Fundamental limits of phasemax for phase retrieval: A replica analysis. In Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP), 2017 IEEE 7th International Workshop on, pages 1–5. IEEE, 2017.

[14] Oussama Dhifallah, Christos Thrampoulidis, and Yue M Lu. Phase retrieval via polytope optimization: Geometry, phase transitions, and new algorithms. arXiv preprint arXiv:1805.09555, 2018.

[15] Martin Dierolf, Andreas Menzel, Pierre Thibault, Philipp Schneider, Cameron M Kewish, Roger Wepf, Oliver Bunk, and Franz Pfeiffer. Ptychographic x-ray computed tomography at the nanoscale. Nature, 467(7314):436–439, 2010.

[16] David L Donoho, Arian Maleki, and Andrea Montanari. Message-passing algorithms for compressed sensing. Proceedings of the National Academy of Sciences, 106(45):18914–18919, 2009.

[17] C Fienup and J Dainty. Phase retrieval and image reconstruction for astronomy. Image Recovery: Theory and Application, pages 231–275, 1987.

[18] James R Fienup. Phase retrieval algorithms: a comparison. Applied Optics, 21(15):2758–2769, 1982.

[19] Tom Goldstein and Christoph Studer. Phasemax: Convex phase retrieval via basis pursuit. IEEE Transactions on Information Theory, 2018.

[20] Paul Hand and Vladislav Voroninski. Compressed sensing from phaseless gaussian measurements via linear programming in the natural parameter space. arXiv preprint arXiv:1611.05985, 2016.

[21] Paul Hand and Vladislav Voroninski. An elementary proof of convex phase retrieval in the natural parameter space via the linear program phasemax. arXiv preprint arXiv:1611.03935, 2016.

[22] Kishore Jaganathan, Yonina C Eldar, and Babak Hassibi. Phase retrieval: An overview of recent developments.
arXiv preprint arXiv:1510.07713, 2015.

[23] Kishore Jaganathan, Samet Oymak, and Babak Hassibi. Sparse phase retrieval: Uniqueness guarantees and recovery algorithms. IEEE Transactions on Signal Processing, 65(9):2402–2410, 2017.

[24] Xiaodong Li and Vladislav Voroninski. Sparse signal recovery from quadratic measurements via convex programming. SIAM Journal on Mathematical Analysis, 45(5):3019–3033, 2013.

[25] Yue M. Lu and Gen Li. Phase transitions of spectral initialization for high-dimensional nonconvex estimation. arXiv preprint arXiv:1702.06435, 2017.

[26] Junjie Ma, Ji Xu, and Arian Maleki. Optimization-based AMP for phase retrieval: The impact of initialization and ℓ2-regularization. arXiv preprint arXiv:1801.01170, 2018.

[27] Rick P. Millane. Phase retrieval in crystallography and optics. JOSA A, 7(3):394–411, 1990.

[28] Marco Mondelli and Andrea Montanari. Fundamental limits of weak recovery with applications to phase retrieval. arXiv preprint arXiv:1708.05932, 2017.

[29] Praneeth Netrapalli, Prateek Jain, and Sujay Sanghavi. Phase retrieval using alternating minimization. In Advances in Neural Information Processing Systems, pages 2796–2804, 2013.

[30] Samet Oymak, Amin Jalali, Maryam Fazel, Yonina C. Eldar, and Babak Hassibi. Simultaneously structured models with application to sparse and low-rank matrices. IEEE Transactions on Information Theory, 61(5):2886–2908, 2015.

[31] Samet Oymak, Christos Thrampoulidis, and Babak Hassibi. The squared-error of generalized LASSO: A precise analysis. arXiv preprint arXiv:1311.0830, 2013.

[32] Ralph Tyrell Rockafellar. Convex Analysis. Princeton University Press, 2015.

[33] Mark Rudelson and Roman Vershynin. Sparse reconstruction by convex relaxation: Fourier and Gaussian measurements. In 2006 40th Annual Conference on Information Sciences and Systems, pages 207–212. IEEE, 2006.

[34] Walter Rudin.
Principles of Mathematical Analysis, volume 3. McGraw-Hill, New York, 1976.

[35] Fariborz Salehi, Ehsan Abbasi, and Babak Hassibi. A precise analysis of PhaseMax in phase retrieval. In 2018 IEEE International Symposium on Information Theory (ISIT), pages 976–980. IEEE, 2018.

[36] Fariborz Salehi, Kishore Jaganathan, and Babak Hassibi. Multiple illumination phaseless super-resolution (MIPS) with applications to phaseless DOA estimation and diffraction imaging. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 3949–3953. IEEE, 2017.

[37] Mahdi Soltanolkotabi. Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization. arXiv preprint arXiv:1702.06175, 2017.

[38] Mihailo Stojnic. Various thresholds for ℓ1-optimization in compressed sensing. 2009.

[39] Christos Thrampoulidis, Ehsan Abbasi, and Babak Hassibi. Precise error analysis of regularized M-estimators in high dimensions. IEEE Transactions on Information Theory, 2018.

[40] Christos Thrampoulidis, Samet Oymak, and Babak Hassibi. Regularized linear regression: A precise analysis of the estimation error. In Conference on Learning Theory, pages 1683–1709, 2015.

[41] Robert Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B (Methodological), pages 267–288, 1996.

[42] Joel A. Tropp. Convex recovery of a structured signal from independent random linear measurements. In Sampling Theory, a Renaissance, pages 67–101. Springer, 2015.

[43] Joel A. Tropp and Anna C. Gilbert. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Transactions on Information Theory, 53(12):4655–4666, 2007.

[44] Roman Vershynin. High-Dimensional Probability: An Introduction with Applications, 2016.

[45] Adriaan Walther. The question of phase retrieval in optics.
Journal of Modern Optics, 10(1):41–49, 1963.

[46] Gang Wang, Georgios B. Giannakis, and Yonina C. Eldar. Solving systems of random quadratic equations via truncated amplitude flow. IEEE Transactions on Information Theory, 64(2):773–794, 2018.

[47] Gang Wang, Liang Zhang, Georgios B. Giannakis, Mehmet Akcakaya, and Jie Chen. Sparse phase retrieval via truncated amplitude flow. IEEE Transactions on Signal Processing, 66(2):479–491, 2016.

[48] Shanshan Wu, Alexandros G. Dimakis, Sujay Sanghavi, Felix X. Yu, Daniel Holtmann-Rice, Dmitry Storcheus, Afshin Rostamizadeh, and Sanjiv Kumar. The sparse recovery autoencoder. arXiv preprint arXiv:1806.10175, 2018.

[49] Teng Zhang. Phase retrieval using alternating minimization in a batch setting. arXiv preprint arXiv:1706.08167, 2017.