{"title": "Don't take it lightly: Phasing optical random projections with unknown operators", "book": "Advances in Neural Information Processing Systems", "page_first": 14855, "page_last": 14865, "abstract": "In this paper we tackle the problem of recovering the phase of complex linear measurements when only magnitude information is available and we control the input. We are motivated by the recent development of dedicated optics-based hardware for rapid random projections which leverages the propagation of light in random media. A signal of interest $\\mathbf{\\xi} \\in \\mathbb{R}^N$ is mixed by a random scattering medium to compute the projection $\\mathbf{y} = \\mathbf{A} \\mathbf{\\xi}$, with $\\mathbf{A} \\in \\mathbb{C}^{M \\times N}$ being a realization of a standard complex Gaussian iid random matrix. Such optics-based matrix multiplications can be much faster and energy-efficient than their CPU or GPU counterparts, yet two difficulties must be resolved: only the intensity ${|\\mathbf{y}|}^2$ can be recorded by the camera, and the transmission matrix $\\mathbf{A}$ is unknown. We show that even without knowing $\\mathbf{A}$, we can recover the unknown phase of $\\mathbf{y}$ for some equivalent transmission matrix with the same distribution as $\\mathbf{A}$. Our method is based on two observations: first, conjugating or changing the phase of any row of $\\mathbf{A}$ does not change its distribution; and second, since we control the input we can interfere $\\mathbf{\\xi}$ with arbitrary reference signals. We show how to leverage these observations to cast the measurement phase retrieval problem as a Euclidean distance geometry problem. We demonstrate appealing properties of the proposed algorithm in both numerical simulations and real hardware experiments. 
Not only does our algorithm accurately recover the missing phase, but it mitigates the effects of quantization and the sensitivity threshold, thus improving the measured magnitudes.", "full_text": "Don\u2019t take it lightly: Phasing optical random\n\nprojections with unknown operators\n\nSidharth Gupta\n\nUniversity of Illinois at Urbana-Champaign\n\ngupta67@illinois.edu\n\nR\u00e9mi Gribonval\n\nUniv Rennes, Inria, CNRS, IRISA\n\nremi.gribonval@inria.fr\n\nLaurent Daudet\nLightOn, Paris\n\nlaurent@lighton.ai\n\nIvan Dokmani\u00b4c\n\nUniversity of Illinois at Urbana-Champaign\n\ndokmanic@illinois.edu\n\nAbstract\n\nIn this paper we tackle the problem of recovering the phase of complex linear\nmeasurements when only magnitude information is available and we control the\ninput. We are motivated by the recent development of dedicated optics-based\nhardware for rapid random projections which leverages the propagation of light\nin random media. A signal of interest \u03be \u2208 RN is mixed by a random scattering\nmedium to compute the projection y = A\u03be, with A \u2208 CM\u00d7N being a realization\nof a standard complex Gaussian iid random matrix. Such optics-based matrix\nmultiplications can be much faster and energy-ef\ufb01cient than their CPU or GPU\ncounterparts, yet two dif\ufb01culties must be resolved: only the intensity |y|2 can be\nrecorded by the camera, and the transmission matrix A is unknown. We show\nthat even without knowing A, we can recover the unknown phase of y for some\nequivalent transmission matrix with the same distribution as A. Our method is\nbased on two observations: \ufb01rst, conjugating or changing the phase of any row\nof A does not change its distribution; and second, since we control the input\nwe can interfere \u03be with arbitrary reference signals. We show how to leverage\nthese observations to cast the measurement phase retrieval problem as a Euclidean\ndistance geometry problem. 
We demonstrate appealing properties of the proposed algorithm in both numerical simulations and real hardware experiments. Not only does our algorithm accurately recover the missing phase, but it mitigates the effects of quantization and the sensitivity threshold, thus improving the measured magnitudes.

1 Introduction

Random projections are at the heart of many algorithms in machine learning, signal processing and numerical linear algebra. Recent developments ranging from classification with random features [16], kernel approximation [25] and sketching for matrix optimization [24, 27], to sublinear-complexity transforms [26] and randomized linear algebra are all enabled by random projections. Computing random projections for realistic signals such as images, videos, and modern big data streams is computation- and memory-intensive. Thus, from a practical point of view, any increase in the size and speed at which one can do the required processing is highly desirable.

This fact has motivated work on using dedicated hardware based on physics rather than traditional CPU and GPU computation to obtain random projections. A notable example is the scattering of light in random media (Figure 1 (left)) with an optical processing unit (OPU). The OPU enables rapid (20 kHz) projections of high-dimensional data such as images, with input dimension scaling up

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

[Figure 1 diagram: a laser light source illuminates a DMD encoding of (x_q − x_r); the modulated beam passes through a random scattering medium with a ~ N(0, I) + jN(0, I), and a camera takes 8-bit measurements of |⟨a, x_q − x_r⟩|² = |y_q − y_r|²; the right panel shows y_q and y_r as points in the complex plane separated by |y_q − y_r|².]

Figure 1: Left: The optical processing unit (OPU) is an example application of where the MPR problem appears. 
A coherent laser beam spatially encodes a signal (xq \u2212 xr) via a digital micro-\nmirror device (DMD) which is then shined through a random medium. A camera measures the\nsquared magnitude of the scattered light which is equivalent to the Euclidean distance between\ncomplex numbers yq \u2208 C and yr \u2208 C. Furthermore the camera takes quantized measurements; Right:\nyq and yr are points on the two-dimensional complex plane. We can measure the squared Euclidean\ndistance between points and use these distances to localize points on the complex plane and obtain\ntheir phase. Note that transformations such as rotations and re\ufb02ections do not change the distances.\n\nto one million and output dimension also in the million range. It works by \u201cimprinting\u201d the input\ndata \u03be \u2208 RN onto a coherent light beam using a digital micro-mirror device (DMD) and shining the\nmodulated light through a multiple scattering medium such as titanium dioxide white paint. The\nscattered light\ufb01eld in the sensor plane can then be written as\n\ny = A\u03be\n\nwhere A \u2208 CM\u00d7N is the transmission matrix of the random medium with desirable properties.\nOne of the major challenges associated with this approach is that A is in general unknown. Though it\ncould in principle be learned via calibration [6], such a procedure is slow and inconvenient, especially\nat high resolution. On the other hand, the system can be designed so that the distribution of A\nis approximately iid standard complex Gaussian. Luckily, this fact alone is suf\ufb01cient for many\nalgorithms and the actual values of A are not required.\nAnother challenge is that common light sensors are only sensitive to intensity, so we can only\nmeasure the intensity of scattered light, |y|2, where | \u00b7 | is the elementwise absolute value. The\nphase information is thus lost. 
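The intensity-only measurement process described above is easy to simulate; the following is a minimal NumPy sketch, not the OPU interface, and the sizes M, N and the 8-bit scaling below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 100, 64  # illustrative sizes; the OPU scales to dimensions near one million

# Transmission matrix A: iid standard complex Gaussian. In practice A is
# unknown; only its distribution is known.
A = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

xi = rng.standard_normal(N)   # input signal imprinted by the DMD
y = A @ xi                    # scattered light field (not directly observable)

# The camera records only the intensity, quantized to 8 bits; the phase of y
# is lost at this point.
intensity = np.abs(y) ** 2
kappa = intensity.max()       # scale to fill the camera's dynamic range
b = np.round(intensity / kappa * 255).astype(np.uint8)
```

Note that `b` carries no phase information at all; recovering the phase of `y` from such measurements is the subject of this paper.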
While the use of interferometric measurements with a reference could enable estimating the phase, the practical setup is more complex, sensitive, and it does not share the convenience and simplicity of the one illustrated in Figure 1 (left).

This motivates us to consider the measurement phase retrieval (MPR) problem. The MPR sensor data is modeled as

b = |y|² + η = |Aξ|² + η,   (1)

where b ∈ R^M, ξ ∈ R^N, A ∈ C^{M×N}, y ∈ C^M, and η ∈ R^M is noise. The goal is to recover the phase of each complex-valued element of y, y_i for 1 ≤ i ≤ M, from its magnitude measurements b when ξ is known and the entries of A are unknown. The classical phase retrieval problem, which has received much attention over the last decade [15, 4], has the same quadratic form as (1) but with a known A and the task being to recover ξ instead of y. While at a glance it might seem that not knowing A precludes computing the phase of Aξ, we show in this paper that it is in fact possible via an exercise in distance geometry.

The noise η is primarily due to quantization because standard camera sensors measure low precision values, 8-bit in our case (integers between 0 and 255 inclusive). Furthermore, cameras may perform poorly at low intensities. This is another data-dependent noise source which is modelled in (2) by a binary mask vector w ∈ R^M which is zero when the intensity is below some threshold and one otherwise; ⊙ denotes the elementwise product:

b = w ⊙ (|y|² + η) = w ⊙ (|Aξ|² + η).   (2)

The distribution of A follows from the properties of random scattering media [14, 6]. It has iid standard complex Gaussian entries, a_mn ∼ N(0, 1) + jN(0, 1) for all 1 ≤ m ≤ M and 1 ≤ n ≤ N.

The usefulness of phase is obvious. 
While in some applications having only the magnitude of the\nrandom projection is enough (see [17] for an example related to elliptic kernels), most applications\nrequire the phase. For example, with the phase one can implement a more diverse range of kernels as\nwell as randomized linear algebra routines like randomized singular value decomposition (SVD). We\nreport the results of the latter on real hardware in Section 3.1.\n\nOur contributions. We develop an algorithm based on distance geometry to solve the MPR\nproblem (1). We exploit the fact that we control the input to the system, which allows us to mix \u03be\nwith arbitrary reference inputs. By interpreting each pixel value as a point in the complex plane, this\nleads to a formulation of the MPR problem as a pure distance geometry problem (see Section 2.2\nand Figure 1 (right)). With enough pairwise distances (corresponding to reference signals) we can\nlocalize the points on the complex plane via a variant of multidimensional scaling (MDS) [23, 5],\nand thus compute the missing phase.\nAs we demonstrate, the proposed algorithm not only accurately recovers the phase, but also improves\nthe number of useful bits of the magnitude information thanks to the multiple views. Established\nEuclidean distance geometry bounds imply that even with many distances below the sensitivity\nthreshold and coarse quantization, the proposed algorithm allows for accurate recovery. This fact,\nwhich we verify experimentally, could have bearing on the design of future random projectors by\nnavigating the tradeoff between physics and computation.\n\n1.1 Related work\n\nThe classical phase retrieval problem looks at the case where A is known and \u03be has to be recovered\nfrom b in (1) [7, 21, 10]. A modi\ufb01ed version of the classical problem known as holographic phase\nretrieval is related to our approach: a known reference signal is concatenated with \u03be to facilitate the\nphase estimation [1]. 
Interference with known references for classical phase retrieval has also been\nstudied for known (Fourier) operators [3, 11] .\nAn optical random projection setup similar to the one we consider has been used for kernel-based\nclassi\ufb01cation [17], albeit using only magnitudes. A phaseless approach to classi\ufb01cation with the\nmeasured magnitudes fed into a convolutional neural network was reported by Satat et al. [18].\nAn alternative to obtaining the measurement phase is to measure, or calibrate, the unknown trans-\nmission matrix A. This has been attempted in compressive imaging applications but the process\nis impractical at even moderate pixel counts [6, 14]. Estimating A can take days and even the\nlatest GPU-accelerated methods take hours for moderately sized A [20]. Other approaches forego\ncalibration and use the measured magnitudes to learn an inverse map of x (cid:55)\u2192 |Ax|2 for use with the\nmagnitude measurements [9].\nLeaving hardware approaches aside, there have been multiple algorithmic efforts to improve the speed\nof random projections [12, 25] for machine learning and signal processing tasks. Still, ef\ufb01ciently\nhandling high-dimensional input remains a formidable challenge.\n\n2 The measurement phase retrieval problem\nWe will denote the signal of interest by \u03be \u2208 RN , and the K reference anchor signals by rk \u2208 RN\nfor 1 \u2264 k \u2264 K. To present the full algorithm we will need to use multiple signals of interest\nwhich we will then denote \u03be1, . . . , \u03beS; each \u03bes is called a frame. We set the last, Kth anchor to be\nthe origin, rK = 0. We ascribe \u03be and the anchors to the columns of the matrix X \u2208 RN\u00d7Q, so\nthat X = [\u03be, r1, r2,\u00b7\u00b7\u00b7 , rK] and let Q = K + 1. The qth column of X is denoted xq. For any\n1 \u2264 q, r \u2264 Q, we let yq = Axq and yqr := A(xq \u2212 xr), with yqr,m being its mth entry. 
Finally,\nthe mth row of A will be denoted by am so that yqr,m = (cid:104)am, xq \u2212 xr(cid:105).\n\n2.1 Problem statement and recovery up to a reference phase and conjugation\n\nSince we do not know A, it is clear that recovering the absolute phase of A\u03be is impossible. On the\nother hand, many algorithms do not require any knowledge of A except that it is iid standard complex\nGaussian, and that it does not change throughout the computations.\n\n3\n\n\fLet R be an operator which adds a constant phase to each row of its argument (multiplies it by\ndiag(ej\u03c61, . . . , ej\u03c6m) for some \u03c61, . . . , \u03c6m) and conjugates a subset of its rows. Since a standard\ncomplex Gaussian is circularly symmetric, R(A) has the same distribution as A. Therefore, since we\ndo not know A, it does not matter whether we work with A itself or with R(A) for some possibly\nunknown R. As long as the same effective R is used for all inputs during algorithm operation, the\nrelative phases between the frames will be the same whether we use R(A) or A.1\nProblem 1. Given a collection of input frames \u03be1, . . . , \u03beS to be randomly projected and a device\nillustrated in Figure 1 (left) with an unknown transmission matrix A \u2208 CM\u00d7N and a b-bit camera,\ncompute the estimates of projections \u02c6y1, . . . , \u02c6yS up to a global row-wise phase and conjugation; that\nis, so that there exists some R such that \u02c6ys \u2248 R(ys) for all 1 \u2264 s \u2264 S.\n2.2 MPR as a distance geometry problem\n\nSince the rows of A are statistically independent, we can explain our algorithm for a single row and\nthen repeat the same steps for the remaining rows. 
We will therefore omit the row subscript/superscript\nm except where explicitly necessary.\nInstead of randomly projecting \u03be and measuring the corresponding projection magnitude |A\u03be|2,\nconsider randomly projecting the difference between \u03be and some reference vector, or more generally\na difference between two columns in X, thus measuring |(cid:104)a, xq \u2212 xr(cid:105)|2 = |yq \u2212 yr|2. Interpreting\nyq and yr as points in the complex plane, we see that the camera sensor measures exactly the squared\nEuclidean distance between them. Since we control the input to the OPU, we can indeed set it to\nxq \u2212 xr and measure |yq \u2212 yr|2 for all 1 \u2264 q, r \u2264 Q.\nThis is the key point: as we can measure pairwise distances between a collection of two-dimensional\nvectors in the two-dimensional complex plane, we can use established distance geometry algorithms\nsuch as multidimensional scaling (MDS) to localize points and get their phase. This is illustrated in\nFigure 1 (right). The same \ufb01gure also illustrates the well known fact that rigid transformations of a\npoint set cannot be recovered from distance data. We need to worry about three things: translations,\nre\ufb02ections and rotations.\nThe translation ambiguity can be easily dealt with if one notes that for any column xq of X,\n|yq| = |(cid:104)a, xq(cid:105)| gives us the distance of yq to the origin which is a \ufb01xed point, ultimately resolving\nthe translation ambiguity. There is, however, no similar simple way to do away with the rotation and\nre\ufb02ection ambiguity, so it might seem that there is no way to uniquely determine the phase of (cid:104)a, \u03be(cid:105).\nThis is where the discussion from the preceding subsection comes to the rescue. 
Since R is arbitrary, as long as it is kept fixed for all the frames, we can arbitrarily set the orientation of any given frame and use it as a reference, making sure that the relative phases are computed correctly.

2.3 Proposed algorithm

As defined previously, the columns of X ∈ R^{N×Q} list the signal of interest and the anchors. Recall that all the entries of X are known. Using the OPU, we can compute a noisy (quantized) version of

|y_qr|² = |⟨a, x_q − x_r⟩|² = |y_q − y_r|²,   (3)

for all (q, r), which gives us Q(Q − 1)/2 squared Euclidean distances between points {y_q ∈ C}_{q=1}^{Q} on the complex plane. These distances can be used to populate a Euclidean (squared) distance matrix D ∈ R^{Q×Q} as D = (d²_qr)_{q,r=1}^{Q} = (|y_qr|²)_{q,r=1}^{Q}, which we will use to localize all complex points y_q.

We start by defining the matrix of all the complex points in R² which we want to recover as

Υ = [Re(y_1) Re(y_2) ··· Re(y_Q); Im(y_1) Im(y_2) ··· Im(y_Q)] ∈ R^{2×Q}.

Denoting the qth column of Υ by υ_q, we have d²_qr = ‖υ_q − υ_r‖²₂ = υ_qᵀυ_q − 2υ_qᵀυ_r + υ_rᵀυ_r so that

D = diag(G) 1_Qᵀ − 2G + 1_Q diag(G)ᵀ =: K(G),   (4)

where diag(G) ∈ R^Q is the column vector of the diagonal entries in the Gram matrix G := ΥᵀΥ ∈ R^{Q×Q} and 1_Q ∈ R^Q is the column vector of Q ones. This establishes a relationship between the measured distances in D and the locations of the complex points in R² which we seek. 

¹Up to a sign.
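The identity (4) relating the Gram matrix to the squared distances is easy to verify numerically; a small NumPy sketch with an arbitrary synthetic point set:

```python
import numpy as np

rng = np.random.default_rng(1)
Q = 6
# Points y_q in the complex plane, stacked as columns of a 2 x Q real matrix.
Upsilon = rng.standard_normal((2, Q))

G = Upsilon.T @ Upsilon          # Gram matrix G = Upsilon^T Upsilon
g = np.diag(G)[:, None]          # column vector of squared norms diag(G)
D = g + g.T - 2 * G              # K(G) from (4)

# Direct pairwise squared distances for comparison.
D_direct = ((Upsilon[:, :, None] - Upsilon[:, None, :]) ** 2).sum(axis=0)
assert np.allclose(D, D_direct)
```

The matrix `D` built from the Gram matrix matches the directly computed pairwise squared distances, which is exactly the relationship the algorithm inverts.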
We denote by J the geometric centering matrix, J := I − (1/Q) 1_Q 1_Qᵀ, so that

Ĝ = −(1/2) J D J = J G J = (Υ J)ᵀ (Υ J)   (5)

is the Gram matrix of the centered point set in terms of Υ. Ĝ and J are known as the Gram matrix of the centered point set and the geometric centering matrix because Υ J is the points in Υ with their mean subtracted. An estimate Υ̂ of the centered point set, Υ J, is then obtained by eigendecomposition as Ĝ = V diag(λ_1, . . . , λ_Q) Vᵀ and taking Υ̂ = [√λ_1 v_1, √λ_2 v_2]ᵀ, where v_1 and v_2 are the first and second columns of V and assuming that the eigenvalue sequence is nonincreasing. This process is the classical MDS algorithm [23, 5]. Finally, the phases can be calculated via a four-quadrant inverse tangent, φ(y_q) = arctan(υ_q2, υ_q1).

Procrustes analysis. As we recovered a centered point set via MDS with a geometric centering matrix J, the point set will have its centroid at the origin. This centroid is an artifact of the algorithm and need not coincide with the true origin. As described above, we know that |y_q|² defines squared distances to the origin and y_Q = ⟨a, x_Q⟩ = 0 + 0j (as x_Q was set to the origin), meaning that we can correctly center the recovered points by translating the point set, Υ̂, by −υ_Q.

The correct absolute rotation and reflection cannot be recovered. However, since we only care about working with some effective R(A) with the correct distribution, we only need to ensure that the relative phases between the frames are correct. We can thus designate the first frame as the reference frame and set the rotation (which directly corresponds to the phase) and reflection (corresponding to conjugation) arbitrarily. Once these are chosen, the anchors r_1, . . .
, r_K are fixed, which in turn fixes the phasing–conjugation operator R.

Since A is unknown, R is also unknown, but fixed anchors allow us to compute the correct relative phase with respect to R(A) for the subsequent frames. Namely, upon receiving a new input ξ_s to be randomly projected, we now localize it with respect to a fixed set of anchors. This is achieved by Procrustes analysis. Denote by Υ̃_1 our reference estimate of the anchor positions in frame 1 (columns 2, . . . , Q of Υ̂ above, which was recovered from Ĝ in (5)), and by Υ̃_s the MDS estimate of the anchor positions in frame s, adequately centered. Let Υ̃_s Υ̃_1ᵀ = U Σ Vᵀ be the singular value decomposition of Υ̃_s Υ̃_1ᵀ. The optimal transformation matrix in the least squares sense is then R = V Uᵀ, so that R Υ̃_s ≈ Υ̃_1 [19].

Finally, we note that with a good estimate of the anchors, one can imagine not relocalizing them in every frame. The localization problem for ξ then boils down to multilateration, cf. Section C in the supplementary material.

2.4 Sensitivity threshold and missing measurements

As we further elaborate in Section A of the supplementary material, in practice some measurements fall below the sensitivity threshold of the camera and produce spurious values. A nice benefit of multiple "views" of ξ via its interaction with reference signals is that we can ignore those measurements. This introduces missing values in D which can be modeled via a binary mask matrix W. 
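The Procrustes step described in Section 2.3 reduces to a single SVD; a minimal NumPy sketch (the rotation angle and reflection below are synthetic test data, not hardware values):

```python
import numpy as np

def procrustes(anchors_ref, anchors_s):
    """Least-squares orthogonal R with R @ anchors_s ~ anchors_ref.

    Both inputs are 2 x K matrices of already-centered anchor positions.
    Following the text: with anchors_s @ anchors_ref.T = U S V^T, R = V U^T.
    """
    U, _, Vt = np.linalg.svd(anchors_s @ anchors_ref.T)
    return Vt.T @ U.T

rng = np.random.default_rng(2)
K = 5
ref = rng.standard_normal((2, K))       # frame-1 anchor estimates (reference)

# Simulate a frame whose MDS output is rotated and reflected w.r.t. frame 1;
# a reflection corresponds to conjugation of the row of A.
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
flip = np.diag([1.0, -1.0])
frame_s = rot @ flip @ ref

R = procrustes(ref, frame_s)
assert np.allclose(R @ frame_s, ref)    # anchors realigned with frame 1
```

Because `R` is only constrained to be orthogonal (not a pure rotation), the same step also resolves the reflection (conjugation) ambiguity relative to the reference frame.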
The recovery problem can be modeled as estimating Υ from W ⊙ (D + E), where W ∈ R^{Q×Q} contains zeros for the entries which fall below some prescribed threshold, and ones otherwise.

We can predict the performance of the proposed method when modeling the entries of W as iid Bernoulli random variables with parameter p, where 1 − p is the probability that an entry falls below the sensitivity threshold, and E as uniform quantization noise distributed as U(−κ/(2(2^b − 1)), κ/(2(2^b − 1))), where b is the number of bits and κ is an upper bound on the entries of D (in our case 2^8 − 1 = 255). Adapting existing results on the performance of multidimensional scaling [28] (by noting that E is sub-Gaussian), we can get the following scaling of the distance recovery error with the number of anchors K (for K sufficiently large),

E[ ‖D̂ − D‖_F ] ≲ κ √(1/(pK)),   (6)

where ≲ denotes inequality up to a constant which depends on the number of bits b, the sub-Gaussian norm of the entries in E, and the dimension of the ambient space (here R²). An important implication is that even for coarse quantization (small b) and for a large fraction of entries below the sensitivity threshold (small p), we can achieve arbitrarily small amplitude and phase errors per point by increasing the number of reference signals K.

Algorithm 1 MPR algorithm for S frames.
Input: Squared distances [|y_{jQ,m} − y_{lQ,m}|²]_s for all 1 ≤ j, l ≤ Q, frames 1 ≤ s ≤ S and rows 1 ≤ m ≤ M; [·]_s denotes frame s.
Output: Y ∈ C^{M×S} containing all localized points such that y_s = R(A) ξ_s for some fixed R.
1: Y ← 0_{M×S}    ▷ Initialize Y
2: m ← 1
3: while m ≤ M do    ▷ Solve each row separately
4:   Populate all frame s = 1 distances into distance matrix D    ▷ D ∈ R^{Q×Q}
5:   [Υ]_1 ← MDS(D)    ▷ [Υ]_1 ∈ R^{2×Q}
6:   [Υ]_1 ← GradientDescent(D, [Υ]_1)
7:   [Υ]_1 ← [Υ]_1 − [υ_Q]_1 1ᵀ    ▷ Translate to align with origin
8:   s ← 2
9:   while s ≤ S do
10:     Populate all frame s distances into distance matrix D
11:     [Υ]_s ← MDS(D)
12:     [Υ]_s ← GradientDescent(D, [Υ]_s)
13:     [Υ]_s ← [Υ]_s − [υ_Q]_s 1ᵀ    ▷ Translate to align with origin
14:     R ← Procrustes([υ_2, . . . , υ_Q]_1, [υ_2, . . . , υ_Q]_s)    ▷ R aligns frame 1 and frame s anchors
15:     [Υ]_s ← Align([Υ]_s, R, [υ_2, . . . , υ_Q]_1)    ▷ Align anchors
16:     s ← s + 1
17:   end while
18:   U ← [[υ_1]_1, [υ_1]_2, . . . , [υ_1]_S]    ▷ U ∈ R^{2×S}
19:   y_m ← u_1 + j u_2    ▷ Multiply second row of U with j and add to first row
20:   m ← m + 1
21: end while

Refinement with gradient descent. The output of the classical MDS method described above can be further refined via a local search. A standard differentiable objective called the squared stress is defined as follows,

min_Z f(Z) = min_Z ‖ W ⊙ (D − K(Zᵀ Z)) ‖²_F,   (7)

where K(·) is as defined in (4) and Z ∈ R^{2×Q} is the point matrix induced by row m of A. In our experiments we report the result of refining the classical MDS results via gradient descent on (7). Note that the optimization (7) is nonconvex. 
The complete procedure is thus analogous to the usual approach to nonconvex phase retrieval by spectral initialization followed by gradient descent [15, 4]. Algorithm 1 summarizes our proposed method.

3 Experimental verification and application

We test the proposed MPR algorithm via simulations and experiments on a real OPU. For hardware experiments, we use a scikit-learn interface to a publicly available cloud-based OPU.²

²https://www.lighton.ai/lighton-cloud/. Reproducible code available at https://github.com/swing-research/opu_phase under the MIT License.

Evaluation metrics. The main challenge is to evaluate the performance without knowing the transmission matrix A. To this end, we propose to use the linearity error. The rationale behind this metric is that with the phase correctly recovered, the end-to-end system should be linear. That is, if we recover y and z from |y|² = |Aξ_1|² and |z|² = |Aξ_2|², then we should get (y + z) when applying the method to |v|² = |A(ξ_1 + ξ_2)|². With this notation, the relative linearity error is defined as

linearity error = (1/M) Σ_{m=1}^{M} |(y_m + z_m) − v_m| / |v_m|.   (8)

The second metric we use is the number of "good" or correct bits. This metric can only be evaluated in simulation since it requires the knowledge of the ground truth measurements. 
Letting |y|² = |⟨a, ξ⟩|² and ŷ be our estimate of y, the number of good bits is defined as

good bits = −(20/6.02) log( ||y|² − |ŷ|²| / |y|² ).

It is proportional to the signal-to-quantization-noise ratio if the distances uniformly cover all quantization levels.³

3.1 Experiments

In all simulations, intensity measurements are quantized to 8 bits and all signals and references are iid standard (complex) Gaussian random vectors.

We first test the phase recovery performance by evaluating the linearity error. In simulation, we draw random frames ξ_1, ξ_2, and A ∈ C^{100×64²}. We apply Algorithm 1 to |Aξ_1|², |Aξ_2|² and |A(ξ_1 + ξ_2)|² and calculate the linearity error (8). We use classical MDS and MDS with gradient descent (MDS-GD). Figure 2a shows that the system is indeed approximately linear and that the linearity error becomes smaller as the number of reference signals grows. In Figure 2b, we set the sensitivity threshold to τ = 6 and zero the distances below the threshold per (2). Again, the linearity error quickly becomes small as the number of anchors increases, showing that the overall system is robust and that it allows recovery of phase for small-intensity signals.

Next, we test the linearity error with a real hardware OPU. The OPU gives 8-bit unsigned integer measurements. A major challenge is that the DMD (see Figure 1) only allows binary input signals. This is a property of the particular OPU we use and while it imposes restrictions on reference design, the method is unchanged as our algorithm does not assume a particular type of signal. Section A in the supplementary material describes how we create binary references and addresses other hardware-related practicalities.

Figure 2c reports the linearity error on the OPU with suitably designed references and the same size A. 
The empirically determined sensitivity threshold of the camera is \u03c4 = 6, and the measurements\nbelow the threshold were not used. We ignore rows of A which give points with small norms (less\nthan two) because they are prone to noise and disproportionately in\ufb02uence the relative error. Once\nagain, we observe that the end-to-end system with Algorithm 1 is approximately linear and that the\nlinearity improves as we increase the number of anchors.\nFinally, we demonstrate the magnitude denoising performance. We draw a \u2208 C100, a random signal\n\u03be \u2208 R100 and a set of random reference anchor signals. We run our algorithm for number of anchors\nvarying between 2 and 15. For each number of anchors, we recover \u02c6y for |y|2 = |(cid:104)a, \u03be(cid:105)|2 using\neither classical MDS or MDS-GD. We then measure the number of good bits. The average results\nover 100 trials are shown in Figure 3a. Figure 3b reports the same experiment with the sensitivity\nthreshold set to \u03c4 = 6 (that is, the entries below \u03c4 are zeroed in the distance matrix per (2)). Both\n\ufb01gures show that the proposed algorithm signi\ufb01cantly improves the estimated magnitudes in addition\nto recovering the phases. The approximately 1 additional good bit with gradient descent in Figure\n3b corresponds to the relative value of 21/28 \u2248 0.8% which is consistent with the gradient descent\nimprovement in Figure 2b.\nWe also test a scenario where the anchor positions on the complex plane are known exactly and\nwe only have to localize a single measurement. We compare this to localizing the anchors and the\n\n3Note that the quantity registered by the camera is actually the squared magnitude, hence the factor 20.\n\n7\n\n\f(a)\n\n(b)\n\n(c)\n\nFigure 2: Experiments in simulation and on real hardware to evaluate the linearity error as de\ufb01ned\nin (8). The input signals are of dimension 642, M in (8) is 100 and the number of anchors signals\nare increased. 
The classical MDS and MDS with gradient descent (MDS-GD) are used. In all cases\nthe error decreases as the number of anchors increases. (a) In simulation with Gaussian signals and\nGaussian reference signals; (b) In simulation with Gaussian signals and Gaussian reference signals\nwith sensitivity threshold \u03c4 = 6; (c) On a real OPU with binary signals and binary references.\n\n(a)\n\n(b)\n\n(c)\n\nFigure 3: (a) Magnitude denoising performance of MDS and MDS-GD over 100 trials. Input signals\nare Gaussian and of dimension 100; (b) Magnitude denoising performance of MDS and MDS-GD over\n100 trials with 100-dimensional Gaussian signals and sensitivity threshold \u03c4 = 6; (c) Comparison\nbetween recovering a single point and recovering the point and anchors at the same time. SR-LS is\nused to locate a single point when anchors are known and MDS is used to locate all points when\nanchors are unknown.\n\nmeasurements jointly. Localizing a single point via multilateration is performed by minimizing the\nSR-LS objective (see (9) in the supplementary material). The input signal dimension is 642 and\nwe recover \u02c6y for |y|2 = |(cid:104)a, \u03be(cid:105)|2. We perform 100 trials and calculate the SNR of the recovered\ncomplex points. Figure 3c shows that although having perfect knowledge of anchor locations helps,\nclassical MDS alone does not perform much worse.\n\nOptical randomized singular value decomposition. We use Algorithm 1 to implement random-\nized singular value decomposition (RSVD) as described in Halko et al. [8] on the OPU. We use 5\nanchors in all RSVD experiments. 
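For reference, the basic RSVD of Halko et al. [8] can be sketched as below; in our setting the Gaussian sketch BΩ is the product the OPU computes optically (here it is simply simulated with NumPy, and the 2K sketch columns follow the text):

```python
import numpy as np

def rsvd(B, k, rng=None):
    """Basic randomized SVD (Halko et al.): sketch, orthogonalize, project."""
    rng = np.random.default_rng() if rng is None else rng
    N = B.shape[1]
    Omega = rng.standard_normal((N, 2 * k))   # 2K random projections per row of B
    Y = B @ Omega                             # on the OPU, this product is optical
    Q, _ = np.linalg.qr(Y)                    # orthonormal basis for the range of B
    U_small, s, Vt = np.linalg.svd(Q.T @ B, full_matrices=False)
    return (Q @ U_small)[:, :k], s[:k], Vt[:k]

# Example: a rank-3 matrix is recovered almost exactly with k = 3.
rng = np.random.default_rng(0)
B = rng.standard_normal((50, 3)) @ rng.standard_normal((3, 40))
U, s, Vt = rsvd(B, k=3, rng=rng)
assert np.linalg.norm(U * s @ Vt - B) / np.linalg.norm(B) < 1e-8
```

The only step that touches the large matrix B through a random operator is the sketch `B @ Omega`, which is why it is the natural candidate for optical acceleration.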
The original RSVD algorithm and a variant with adaptations for the OPU are described in Algorithms 2 and 3 in the supplementary material.

One of the steps in the RSVD algorithm for an input matrix B ∈ R^{M×N} requires the computation of BΩ, where Ω ∈ R^{N×2K} is a standard real Gaussian matrix, K is the target number of singular vectors, and 2K may be interpreted as the number of random projections for each row of B. We use the OPU to compute this random matrix multiplication. An interesting observation is that, since in Algorithm 1 we recover the result of multiplication by a complex matrix with independent real and imaginary parts, we can halve the number of projections when using the OPU with respect to the original algorithm. By treating each row of B as an input frame, we can obtain Y ∈ C^{K×M} via Algorithm 1 where |Y|^2 = |AB^T|^2, with A as defined in Problem 1 with K rows. Then we can construct P = [Re(Y*) Im(Y*)] ∈ R^{M×2K}, which is equivalent to computing BΩ for real Ω. Section B in the supplementary material describes this in more detail.

Figure 4 shows the results when the OPU is used to perform the random matrix multiplication of the RSVD algorithm on a matrix B. Figure 4 (left) reports experiments with a random binary matrix B ∈ R^{10×10^4}, different numbers of random projections (the number of rows in A), and ten trials per number of projections.
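The projection-halving observation above can be checked numerically. The sketch below (an illustration assuming NumPy, with hypothetical variable names; Y* is read as the conjugate transpose of Y) verifies that stacking the real and imaginary parts of Y* equals BΩ for a real iid Gaussian Ω built from the real and imaginary parts of A:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, K = 8, 50, 3

# Complex iid Gaussian matrix with independent real and imaginary parts
A = rng.standard_normal((K, N)) + 1j * rng.standard_normal((K, N))
B = rng.standard_normal((M, N))          # input matrix, one frame per row

Y = A @ B.T                              # K x M complex projections
Yh = Y.conj().T                          # M x K conjugate transpose (Y*)
P = np.hstack([Yh.real, Yh.imag])        # M x 2K real-valued sketch

# P equals B @ Omega for the real iid Gaussian Omega below, so 2K real
# random projections cost only K complex (optical) ones
Omega = np.hstack([A.real.T, -A.imag.T]) # N x 2K
assert np.allclose(P, B @ Omega)
```

Since Re(A) and Im(A) are independent standard Gaussians, Ω here is itself a standard real Gaussian matrix, as the RSVD step requires.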
We plot the average error per entry when reconstructing B from its RSVD matrices and singular values. Next, we take 500 samples of size 28 × 28 from the MNIST dataset [13], threshold them to be binary, vectorize them, and stack them into a matrix B ∈ R^{500×28^2}. Figure 4 (right) shows the seven leading right singular vectors reshaped to 28 × 28. The top row shows the singular vectors obtained when using the OPU with 500 projections and the bottom row shows the result when using Python. The error is negligible.

Figure 4: Left: Average RSVD error over 10 trials with a varying number of projections on hardware with an input matrix of size 10 × 1000; Right: Reshaped leading right singular vectors of an MNIST matrix of size 500 × 28^2. The top row shows the leading right singular vectors after performing RSVD with the OPU and using our algorithm. The bottom row shows the leading right singular vectors from Python. The relative error (3.79e-13, 4.96e-12, 8.36e-13, 2.90e-12, 3.97e-12, 1.72e-12, 2.67e-12) is shown below each singular vector.

4 Conclusion

Traditional computation methods are often too slow for processing tasks which involve large data streams. This motivates alternatives which instead use fast physics to "compute" the desired functions. In this work, we looked at using optics and multiple scattering media to obtain linear random projections. A common difficulty with optical systems is that off-the-shelf camera sensors only register the intensity of the scattered light. Our results show that there is nevertheless no need to reach for more complicated and more expensive coherent setups. We showed that measurement phase retrieval can be cast as a problem in distance geometry, and that the unknown phase of random projections can be recovered even without knowing the transmission matrix of the medium.

Simulations and experiments on real hardware show that the OPU setup combined with our algorithm indeed approximates an end-to-end linear system. What is more, we also improve the intensity measurements. The fact that we obtain full complex measurements allows us to implement a whole new spectrum of randomized algorithms; we demonstrated this potential with randomized singular value decomposition. These benefits come at the expense of a reduction in data throughput. Future work will have to precisely quantify the smallest achievable data-rate reduction due to allocating a part of the duty cycle to reference measurements, though we note that optical processing data rates are very high to begin with.

Acknowledgement

Sidharth Gupta and Ivan Dokmanić would like to acknowledge support from the National Science Foundation under Grant CIF-1817577.

References

[1] David A. Barmherzig, Ju Sun, Emmanuel J. Candès, T. J. Lane, and Po-Nan Li. Holographic phase retrieval and optimal reference design. arXiv preprint arXiv:1901.06453, 2019.

[2] Amir Beck, Petre Stoica, and Jian Li. Exact and approximate solutions of source localization problems. IEEE Transactions on Signal Processing, 56(5):1770–1778, 2008.

[3] Robert Beinert. One-dimensional phase retrieval with additional interference intensity measurements. Results in Mathematics, 72(1-2):1–24, 2017.

[4] Emmanuel J. Candès, Xiaodong Li, and Mahdi Soltanolkotabi. Phase retrieval via Wirtinger flow: Theory and algorithms. IEEE Transactions on Information Theory, 61(4):1985–2007, 2015.

[5] Ivan Dokmanić, Reza Parhizkar, Juri Ranieri, and Martin Vetterli. Euclidean distance matrices: Essential theory, algorithms, and applications.
IEEE Signal Processing Magazine, 32(6):12–30, 2015.

[6] Angélique Drémeau, Antoine Liutkus, David Martina, Ori Katz, Christophe Schülke, Florent Krzakala, Sylvain Gigan, and Laurent Daudet. Reference-less measurement of the transmission matrix of a highly scattering material using a DMD and phase retrieval techniques. Optics Express, 23(9):11898–11911, 2015.

[7] James R. Fienup. Phase retrieval algorithms: a comparison. Applied Optics, 21(15):2758–2769, 1982.

[8] Nathan Halko, Per-Gunnar Martinsson, and Joel A. Tropp. Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions. SIAM Review, 53(2):217–288, 2011.

[9] Ryoichi Horisaki, Ryosuke Takagi, and Jun Tanida. Learning-based imaging through scattering media. Optics Express, 24(13):13738–13743, 2016.

[10] Kishore Jaganathan, Yonina C. Eldar, and Babak Hassibi. Phase retrieval: An overview of recent developments. arXiv preprint arXiv:1510.07713, 2015.

[11] Wooshik Kim and Monson H. Hayes. Phase retrieval using two Fourier-transform intensities. JOSA A, 7(3):441–449, 1990.

[12] Quoc Le, Tamás Sarlós, and Alex Smola. Fastfood: Approximating kernel expansions in loglinear time. In Proceedings of the International Conference on Machine Learning, volume 85, 2013.

[13] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.

[14] Antoine Liutkus, David Martina, Sébastien Popoff, Gilles Chardon, Ori Katz, Geoffroy Lerosey, Sylvain Gigan, Laurent Daudet, and Igor Carron. Imaging with nature: Compressive imaging using a multiply scattering medium. Scientific Reports, 4:5552, 2014.

[15] Praneeth Netrapalli, Prateek Jain, and Sujay Sanghavi. Phase retrieval using alternating minimization. In Advances in Neural Information Processing Systems, pages 2796–2804, 2013.

[16] Ali Rahimi and Benjamin Recht. Random features for large-scale kernel machines. In Advances in Neural Information Processing Systems, pages 1177–1184, 2008.

[17] Alaa Saade, Francesco Caltagirone, Igor Carron, Laurent Daudet, Angélique Drémeau, Sylvain Gigan, and Florent Krzakala. Random projections through multiple optical scattering: Approximating kernels at the speed of light. In 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 6215–6219. IEEE, 2016.

[18] Guy Satat, Matthew Tancik, Otkrist Gupta, Barmak Heshmat, and Ramesh Raskar. Object classification through scattering media with deep learning on time resolved measurement. Optics Express, 25(15):17466–17479, 2017.

[19] Peter Hans Schoenemann. A solution of the orthogonal Procrustes problem with applications to orthogonal and oblique rotation. PhD thesis, University of Illinois at Urbana-Champaign, 1964.

[20] Manoj Sharma, Christopher A. Metzler, Sudarshan Nagesh, Oliver Cossairt, Richard G. Baraniuk, and Ashok Veeraraghavan. Inverse scattering via transmission matrices: Broadband illumination and fast phase retrieval algorithms. IEEE Transactions on Computational Imaging, 2019.

[21] Yoav Shechtman, Yonina C. Eldar, Oren Cohen, Henry Nicholas Chapman, Jianwei Miao, and Mordechai Segev. Phase retrieval with application to optical imaging: a contemporary overview. IEEE Signal Processing Magazine, 32(3):87–109, 2015.

[22] Petre Stoica and Jian Li. Lecture notes: Source localization from range-difference measurements. IEEE Signal Processing Magazine, 23(6):63–66, 2006.

[23] Warren S. Torgerson. Multidimensional scaling: I. Theory and method. Psychometrika, 17(4):401–419, 1952.

[24] Joel A. Tropp, Alp Yurtsever, Madeleine Udell, and Volkan Cevher. Practical sketching algorithms for low-rank matrix approximation. SIAM Journal on Matrix Analysis and Applications, 38(4):1454–1485, 2017.

[25] Yun Yang, Mert Pilanci, and Martin J. Wainwright. Randomized sketches for kernels: Fast and optimal nonparametric regression. The Annals of Statistics, 45(3):991–1023, 2017.

[26] Felix Xinnan X. Yu, Ananda Theertha Suresh, Krzysztof M. Choromanski, Daniel N. Holtmann-Rice, and Sanjiv Kumar. Orthogonal random features. In Advances in Neural Information Processing Systems, pages 1975–1983, 2016.

[27] Alp Yurtsever, Madeleine Udell, Joel A. Tropp, and Volkan Cevher. Sketchy decisions: Convex low-rank matrix optimization with optimal storage. arXiv preprint arXiv:1702.06838, 2017.

[28] Huan Zhang, Yulong Liu, and Hong Lei. Localization from incomplete Euclidean distance matrix: Performance analysis for the SVD-MDS approach. IEEE Transactions on Signal Processing, 2019.