{"title": "Computing Motion Using Resistive Networks", "book": "Neural Information Processing Systems", "page_first": 422, "page_last": 431, "abstract": null, "full_text": "422 \n\nCOMPUTING MOTION USING RESISTIVE NETWORKS \n\nChristof Koch, Jin Luo, Carver Mead \n\nCalifornia Institute of Technology, 216-76, Pasadena, Ca. 91125 \n\nJet Propulsion Laboratory, California Institute of Technology \n\nJames Hutchinson \n\nPasadena, Ca. 91125 \n\nINTRODUCTION \n\nTo us, and to other biological organisms, vision seems effortless. We open \nour eyes and we \"see\" the world in all its color, brightness, and movement. \nYet, we have great difficulties when trying to endow our machines with similar \nabilities. In this paper we shall describe recent developments in the theory of \nearly vision which lead from the formulation of the motion problem as an ill(cid:173)\nposed one to its solution by minimizing certain \"cost\" functions. These cost \nor energy functions can be mapped onto simple analog and digital resistive \nnetworks. Thus, we shall see how the optical flow can be computed by injecting \ncurrents into resistive networks and recording the resulting stationary voltage \ndistribution at each node. These networks can be implemented in cMOS VLSI \ncircuits and represent plausible candidates for biological vision systems. \n\nAPERTURE PROBLEM AND SMOOTHNESS ASSUMPTION \n\nIn this study, we use intensity-based schemes for recovering motion. Let us \nderive an equation relating the change in image brightness to the motion of the \nimage (seel ). Let us assume that the brightness of the image is constant over \ntime: dI(~,y,t)/dt = o. On the basis of the chain rule of differentiation, this \ntransforms into \n\n81 d~ \n8~ dt + 8y dt + at = Izu + Iyv + It = 'V I\u00b7 v + It = 0, \n\n81 dy \n\n81 \n\n(1) \n\nwhere we define the velocity v as (u,v) = (d:1)/dt,dy/dt). 
Because we assume that we can compute these spatial and temporal image gradients, we are now left with a single linear equation in two unknowns, u and v, the two components of the velocity vector (the aperture problem). Any measuring system with a finite aperture, whether biological or artificial, can only sense the velocity component along the spatial gradient, that is, perpendicular to the edge, of magnitude −I_t/|∇I|. The component of motion perpendicular to the gradient cannot, in principle, be registered. The problem remains unchanged even if we measure these velocity components at many points throughout the image. \n\n© American Institute of Physics 1988 \n\nHow can this problem be made well-posed, that is, having a unique solution depending continuously on the data? One form of \"regularizing\" ill-posed problems is to restrict the class of admissible solutions by imposing appropriate constraints [2]. Applying this method to motion, we shall argue that in general objects are smooth, except at isolated discontinuities, and undergo smooth movements. Thus, in general, neighboring points in the world will have similar velocities and the projected velocity field should reflect this fact. We therefore impose on the velocity field the constraint that it should be the smoothest field that also satisfies the data. As a measure of smoothness we choose the square of the velocity field gradient. The final velocity field (u, v) is the one that minimizes \n\nE(u, v) = ∫∫ (I_x u + I_y v + I_t)² dx dy + λ ∫∫ [(∂u/∂x)² + (∂u/∂y)² + (∂v/∂x)² + (∂v/∂y)²] dx dy. (2) \n\nFig. 1. (a) The location of the horizontal (l^h_ij) and vertical (l^v_ij) line processes relative to the motion field grid. (b) The hybrid resistive network, computing the optical flow in the presence of discontinuities.
The conductances T_{c,ij} connecting both grids depend on the brightness gradient, as do the conductances g_ij and ḡ_ij connecting each node with the battery. For clarity, only two such elements are shown. The battery E_ij depends on both the temporal and the spatial gradient and is zero if no brightness change occurs. The x (resp. y) component of the velocity is given by the voltage in the top (resp. bottom) network. Binary switches, which make or break the resistive connections between nodes, implement motion discontinuities. These switches could be under the control of distributed digital processors. Analog cMOS implementations are also feasible [3]. \n\nThe first term implements the constraint that the final solution should follow as closely as possible the measured data, whereas the second term imposes the smoothness constraint on the solution. The degree to which one or the other term is minimized is governed by the parameter λ. If the data is very accurate, it should be \"expensive\" to violate the first term and λ will be small. If, conversely, the data is unreliable (low signal-to-noise), much more emphasis will be placed on the smoothness term. Horn and Schunck [1] first formulated this variational approach to the motion problem. \n\nThe energy E(u, v) is quadratic in the unknowns u and v. It then follows from the standard calculus of variations that the associated Euler-Lagrange equations will be linear in u and v: \n\nI_x² u + I_x I_y v − λ∇²u + I_x I_t = 0 \nI_x I_y u + I_y² v − λ∇²v + I_y I_t = 0. (3) \n\nWe now have two linear equations at every point and our problem is therefore completely determined. \n\nANALOG RESISTIVE NETWORKS \n\nLet us assume that we are formulating eqs. (2) and (3) on a discrete 2-D grid, such as the one shown in fig. 1a. Equation (3) then transforms into \n\nI²_{x,ij} u_ij + I_{x,ij} I_{y,ij} v_ij − λ(u_{i+1,j} + u_{i,j+1} − 4u_ij + u_{i−1,j} + u_{i,j−1}) + I_{x,ij} I_{t,ij} = 0 \nI_{x,ij} I_{y,ij} u_ij + I²_{y,ij} v_ij − λ(v_{i+1,j} + v_{i,j+1} − 4v_ij + v_{i−1,j} + v_{i,j−1}) + I_{y,ij} I_{t,ij} = 0, (4) \n\nwhere we replaced the Laplacian with its 5-point approximation on a rectangular grid. We shall now show that this set of linear equations can be solved naturally using a particularly simple resistive network. Let us apply Kirchhoff's current law to the node i, j in the top layer of the resistive network shown in fig. 1b. We then have the following update equation: \n\nC du_ij/dt = T(u_{i+1,j} + u_{i,j+1} − 4u_ij + u_{i−1,j} + u_{i,j−1}) + g_ij(E_ij − u_ij) + T_{c,ij}(v_ij − u_ij), (5) \n\nwhere v_ij is the voltage at node i, j in the bottom network. Once du_ij/dt = 0 and dv_ij/dt = 0, this equation is seen to be identical with eq. (4), if we identify \n\nT_{c,ij} ≙ −I_{x,ij} I_{y,ij} \ng_ij ≙ I_{x,ij}(I_{x,ij} + I_{y,ij}) \nḡ_ij ≙ I_{y,ij}(I_{x,ij} + I_{y,ij}) \nE_ij ≙ −I_{t,ij}/(I_{x,ij} + I_{y,ij}). (6) \n\nFig. 2. Motion sequence using synthetic data. (a) and (b) Two images of three high-contrast squares on a homogeneous background. (c) The initial velocity data. The insides of both squares contain no data. (d) The final state of the network after 240 iterations, corresponding to the smooth optical flow field. (e) Optical flow in the presence of motion discontinuities (indicated by solid lines). (f) Discontinuities are strongly encouraged to form at the location of intensity edges [4]. Both (e) and (f) show the state of the hybrid network after six analog-digital cycles. \n\nOnce we set the batteries and the conductances to the values indicated in eq. (6), the network will settle, following Kirchhoff's laws, into the state of least power dissipation. The associated stationary voltages correspond to the sought solution: u_ij is equivalent to the x component and v_ij to the y component of the optical flow field.
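In a digital simulation, the settling of this network can be mimicked by a standard Jacobi relaxation of eq. (4). The sketch below is a minimal software analogue of that process, not the paper's Hypercube code; the boundary handling, grid sizes, and parameter values are illustrative. The closed-form per-pixel update is the standard Horn-Schunck solution of the 2-by-2 linear system of eq. (4):

```python
# Sketch of the relaxation described by eqs. (4)-(5): each node repeatedly
# moves toward the value dictated by its four neighbours and its data term.
# A Jacobi iteration stands in for the analog network settling.

def relax(Ix, Iy, It, lam, n_iter=2000):
    """Solve eq. (4) on a grid by Jacobi iteration.

    Ix, Iy, It: per-pixel gradient estimates (2-D lists, same shape).
    lam: smoothness weight lambda. Returns the (u, v) flow fields.
    """
    rows, cols = len(Ix), len(Ix[0])
    u = [[0.0] * cols for _ in range(rows)]
    v = [[0.0] * cols for _ in range(rows)]
    for _ in range(n_iter):
        nu = [row[:] for row in u]
        nv = [row[:] for row in v]
        for i in range(rows):
            for j in range(cols):
                # Average of the 4-neighbourhood; edge nodes reuse their own
                # value, a crude stand-in for the copied boundary data.
                su = sv = 0.0
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    a, b = i + di, j + dj
                    if 0 <= a < rows and 0 <= b < cols:
                        su += u[a][b]; sv += v[a][b]
                    else:
                        su += u[i][j]; sv += v[i][j]
                ubar, vbar = su / 4.0, sv / 4.0
                # Jacobi update solving the 2x2 system of eq. (4) at this
                # pixel (Horn-Schunck form; denominator 4*lam + Ix^2 + Iy^2).
                ix, iy, it = Ix[i][j], Iy[i][j], It[i][j]
                t = (ix * ubar + iy * vbar + it) / (4.0 * lam + ix * ix + iy * iy)
                nu[i][j] = ubar - ix * t
                nv[i][j] = vbar - iy * t
        u, v = nu, nv
    return u, v
```

For spatially constant gradients I_x = 1, I_y = 0, I_t = −1 the iteration converges to u = 1, v = 0, the smoothest field consistent with the data.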
\n\nWe simulated the behavior of these networks by solving the above circuit equations on parallel computers of the Hypercube family. As boundary conditions we copied the initial velocity data at the edge of the image into the nodes lying directly adjacent but outside the image. \n\nThe sequences in figs. 2 and 3 illustrate the resulting optical flow for synthetic and natural images. As discussed by Horn and Schunck [1], the smoothness constraint leads to a qualitatively correct estimate of the velocity field. Thus, one undifferentiated blob appears to move to the lower right and one blob to the upper left. However, at the occluding edge where both squares overlap, the smoothness assumption results in a spatial average of the two opposing velocities, and the estimated velocity is very small or zero. In parts of the image where the brightness gradient is zero and thus no initial velocity data exist (for instance, the interiors of the two squares), the velocity estimates are simply the spatial average of the neighboring velocity estimates. These empty areas eventually fill in from the boundary, similar to the flow of heat across a uniform flat plate with \"hot\" boundaries. \n\nMOTION DISCONTINUITIES \n\nThe smoothness assumption of Horn and Schunck [1] regularizes the aperture problem and leads to the qualitatively correct velocity field inside moving objects. However, this approach fails to detect the locations at which the velocity changes abruptly or discontinuously. Thus, it smooths over the figure-ground discontinuity or completely fails to detect the boundary between two objects with differing velocities, because the algorithm combines velocity information across motion boundaries.
\n\nA quite successful strategy for dealing with discontinuities was proposed by Geman and Geman [5]. We shall not rigorously develop their approach, which is based on Bayesian estimation theory (for details see [5, 6]). Suffice it to say that a priori knowledge, for instance that the velocity field should in general be smooth, can be formulated in terms of a Markov Random Field model of the image. Given such an image model, and given noisy data, we then estimate the \"best\" flow field by some likelihood criterion. The one we use here is the maximum a posteriori estimate, although other criteria are possible and have certain advantages [6]. This can be shown to be equivalent to minimizing an expression such as eq. (2). \n\nIn order to reconstruct images consisting of piecewise constant segments, Geman and Geman [5] further introduced the powerful idea of a line process l. For our purposes, we will assume that a line process can be in either one of two states: \"on\" (l = 1) or \"off\" (l = 0). The line processes are located on a regular lattice set between the original pixel lattice (see fig. 1a), such that each pixel i, j has a horizontal l^h_ij and a vertical l^v_ij line process associated with it. If the appropriate line process is turned on, the smoothness term between the two adjacent pixels will be set to zero. In order to prevent line processes from forming everywhere, and furthermore in order to incorporate additional knowledge regarding discontinuities into the line processes, we must include an additional term V_C(l) in the new energy function: \n\nE(u, v, l^h, l^v) = Σ_{i,j} (I_{x,ij} u_ij + I_{y,ij} v_ij + I_{t,ij})² \n+ λ Σ_{i,j} (1 − l^h_ij) [(u_{i+1,j} − u_ij)² + (v_{i+1,j} − v_ij)²] \n+ λ Σ_{i,j} (1 − l^v_ij) [(u_{i,j+1} − u_ij)² + (v_{i,j+1} − v_ij)²] + V_C(l). (7) \n\nV_C contains a number of different terms, penalizing or encouraging specific configurations of line processes: \n\nV_C(l^h) = C_C Σ_{i,j} l^h_ij + C_P Σ_{i,j} l^h_ij (l^h_{i,j+1} + l^h_{i,j+2}) + C_L V_L, \n\nplus the corresponding expression for the vertical line process l^v_ij (obtained by interchanging i with j and l^v_ij with l^h_ij). The first term penalizes each introduction of a line process, since the cost C_C has to be \"paid\" every time a line process is turned on. The second term prevents the formation of parallel lines: if either l^h_{i,j+1} or l^h_{i,j+2} is turned on, this term will tend to prevent l^h_ij from turning on. The third term, C_L V_L, embodies the fact that in general motion discontinuities occur along extended contours and rarely intersect (for more details see [7]). \n\nWe obtain the optical flow by minimizing the cost function in eq. (7) with respect to both the velocity field (u, v) and the line processes l^h and l^v. To find an optimal solution to this non-quadratic minimization problem, we follow Koch et al. [7] and use a purely deterministic algorithm, based on solving Kirchhoff's equations for a mixed analog/digital network (see also [8]). Our algorithm exploits the fact that for a fixed distribution of line processes, the energy function (7) is quadratic. Thus, we first initialize the analog resistive network (see fig. 1b) according to eq. (6) and with no line processes on. The network then converges to the smoothest solution. Subsequently, we update the line processes by deciding at each site of the line process lattice whether the overall energy can be lowered by setting or breaking the line process; that is, l^h_ij will be turned on if E(u, v, l^h_ij = 1, l^v) < E(u, v, l^h_ij = 0, l^v); otherwise, l^h_ij = 0. Line processes are switched on by breaking the appropriate resistive connection between the two neighboring nodes.
After the completion of one such analog-digital cycle, we reiterate and compute, for the newly updated distribution of line processes, the smoothest state of the analog network. Although there is no guarantee that the system will converge to the global minimum, since we are using a gradient descent rule, it seems to find next-to-optimal solutions in about 10 to 15 analog-digital cycles. \n\nFigure 3. Optical flow of a moving person. (a) and (b) Two 128 by 128 pixel images captured by a video camera. The person in the foreground is moving toward the right while the person in the background is stationary. The noise in the lower part of the image is a camera artifact. (c) Zero-crossings superimposed on the initial velocity data. (d) The smooth optical flow after 1000 iterations. Note that the noise in the lower part of both images is completely smoothed away. (e) The final piecewise smooth optical flow. The velocity field is subsampled to improve visibility. The evolution of the hybrid network is shown after the 1st (a), 3rd (b), 5th (c), 7th (d), 10th (e), and 13th (f) analog-digital cycle in the right part of the figure. \n\nThe synthetic motion sequence in fig. 2 demonstrates the effect of the line processes. The optical flow outside the discontinuities approximately delineating the boundaries of the moving squares is zero, as it should be (fig. 2e). However, where the two squares overlap, the velocity gradient is high and multiple intersecting discontinuities exist. To restrict further the location of discontinuities, we adopt a technique used by Gamble and Poggio [4] to locate depth discontinuities, by requiring that depth discontinuities coincide with the location of intensity edges.
Our rationale behind this additional constraint is that, with very few exceptions, the physical processes and the geometry of the 3-dimensional scene giving rise to the motion discontinuity will also give rise to an intensity edge. As edges we use the zero-crossings of a Laplacian of a Gaussian convolved with the original image [9]. We now add a new term V_{ZC,ij} to our energy function E, such that V_{ZC,ij} is zero if l_ij is off, or if l_ij is on and a zero-crossing exists between locations i and j. If l_ij = 1 in the absence of a zero-crossing, V_{ZC,ij} is set to 1000. This strategy effectively prevents motion discontinuities from forming at locations where no zero-crossings exist, unless the data strongly suggest it. Conversely, however, zero-crossings by themselves will not induce the formation of discontinuities in the absence of motion gradients (figs. 2f and 3). \n\nANALOG VLSI NETWORKS \n\nEven with the approximations and optimizations described above, the computations involved in this and similar early vision tasks require minutes to hours on computers. It is fortunate, then, that modern integrated circuit technology gives us a medium in which extremely complex, analog real-time implementations of these computational metaphors can be realized [3]. \n\nWe can achieve a very compact implementation of a resistive network using an ordinary cMOS process, provided the transistors are run in the sub-threshold range where their characteristics are ideal for implementing low-current analog functions. The effect of a resistor is achieved by a circuit configuration, such as the one shown in fig. 4, rather than by using the resistance of a special layer in the process. The value of the resulting resistance can be controlled over three orders of magnitude by setting the bias voltages on the upper and lower current-source transistors.
The current-voltage curve saturates above about 100 mV, a feature that can be used to advantage in many applications. When the voltage gradients are small, we can treat the circuit just as if it were a linear resistor. Resistances with an effective negative resistance value can also easily be realized. \n\nIn two dimensions, the ideal configuration for a network implementation is shown in fig. 4. Each point on the hexagonal grid is coupled to six equivalent neighbors. Each node includes the resistor apparatus and a set of sample-and-hold circuits for setting the confidence and signal inputs and the output voltages. Both the sample-and-hold circuits and the output buffer are addressed by a scanning mechanism, so the stored variables can be refreshed or updated, and the map of node voltages read out in real time. \n\nFigure 4. Circuit design for a resistive network for interpolating and smoothing noisy and sparsely sampled depth measurements. (a) Circuit, consisting of 8 transistors, implementing a variable nonlinear resistance. (b) If the voltage gradient is below 100 mV, it approximates a linear resistance. The voltage V_T controls the maximum current and thus the slope of the resistance, which can vary between 1 MΩ and 1 GΩ [3]. This cMOS circuit contains 20 by 20 grid points on a hexagonal lattice. The individual resistive elements with a variable slope controlled by V_T correspond to the term governing the smoothness, λ. At those locations where a depth measurement d_ij is present, the battery is set to this value (V_in = d_ij) and the value of the conductance G is set to some fixed value. If no depth data are present at that node, G is set to zero.
The voltage at each node corresponds to the discrete values of the smoothed surface fitted through the noisy and sparse measurements [7]. \n\nA 48 by 48 silicon retina has been constructed that uses the hexagonal network of fig. 4 as a model for the horizontal cell layer in the vertebrate retina [10]. In this application, the input potentials were the outputs of logarithmic photoreceptors, implemented via phototransistors, and the potential difference across the conductance T formed an excellent approximation to the Laplacian operator. \n\nDISCUSSION \n\nWe have demonstrated in this study that the introduction of binary motion discontinuities into the algorithm of Horn and Schunck [1] leads to a dramatically improved performance of their method, in particular for the optical flow in the presence of a number of moving non-rigid objects. Moreover, we have shown that the appropriate computations map onto simple resistive networks. We are now implementing these resistive networks in VLSI circuits, using subthreshold cMOS technology. This approach is of general interest, because a great number of problems in early vision can be formulated in terms of similar non-convex energy functions that need to be minimized, such as binocular stereo, edge detection, surface interpolation, and structure from motion [2, 6, 8]. \n\nThese networks share several features with biological neural networks. Specifically, they do not require a system-wide clock, they rely on many connections between simple computational nodes, they converge rapidly, within several time constants, and they are quite robust to hardware errors. Another interesting feature is that our networks consume only very moderate amounts of power; the entire retina chip requires about 100 µW [10]. \n\nAcknowledgments: An early version of this model was developed and implemented in collaboration with A. L. Yuille [8]. M.
Avalos and A. Hsu wrote the code for the Imaging Technology system, and E. Staats wrote the code for the NCUBE. C.K. is supported by an ONR Research Young Investigator Award and by the Sloan and the Powell Foundations. C.M. is supported by ONR and by the System Development Foundation. A portion of this research was carried out at the Jet Propulsion Laboratory and was sponsored by NSF grant No. EET-8714710 and by NASA. \n\nREFERENCES \n\n1. Horn, B. K. P. and Schunck, B. G. Artif. Intell. 17, 185-203 (1981). \n2. Poggio, T., Torre, V. and Koch, C. Nature 317, 314-319 (1985). \n3. Mead, C. Analog VLSI and Neural Systems. Addison-Wesley: Reading, MA (1988). \n4. Gamble, E. and Poggio, T. Artif. Intell. Lab. Memo No. 970, MIT, Cambridge, MA (1987). \n5. Geman, S. and Geman, D. IEEE Trans. PAMI 6, 721-741 (1984). \n6. Marroquin, J., Mitter, S. and Poggio, T. J. Am. Stat. Assoc. 82, 76-89 (1987). \n7. Koch, C., Marroquin, J. and Yuille, A. Proc. Natl. Acad. Sci. USA 83, 4263-4267 (1986). \n8. Yuille, A. L. Artif. Intell. Lab. Memo No. 987, MIT, Cambridge, MA (1987). \n9. Marr, D. and Hildreth, E. C. Proc. R. Soc. Lond. B 207, 187-217 (1980). \n10. Sivilotti, M. A., Mahowald, M. A. and Mead, C. A. In: 1987 Stanford VLSI Conference, ed. P. Losleben, pp. 295-312 (1987). \n", "award": [], "sourceid": 25, "authors": [{"given_name": "Christof", "family_name": "Koch", "institution": null}, {"given_name": "Jin", "family_name": "Luo", "institution": null}, {"given_name": "Carver", "family_name": "Mead", "institution": null}, {"given_name": "James", "family_name": "Hutchinson", "institution": null}]}