{"title": "Classifying Patterns of Visual Motion - a Neuromorphic Approach", "book": "Advances in Neural Information Processing Systems", "page_first": 1147, "page_last": 1154, "abstract": null, "full_text": "Classifying Patterns of Visual Motion -\n\na Neuromorphic Approach\n\nJakob Heinzle and Alan Stocker\nInstitute of Neuroinformatics\nUniversity and ETH Z\u00a8urich\n\nWinterthurerstr. 190, 8057 Z\u00a8urich, Switzerland\n\u0001 jakob,alan\n\n@ini.phys.ethz.ch\n\nAbstract\n\nWe report a system that classi\ufb01es and can learn to classify patterns of\nvisual motion on-line. The complete system is described by the dynam-\nics of its physical network architectures. The combination of the fol-\nlowing properties makes the system novel: Firstly, the front-end of the\nsystem consists of an aVLSI optical \ufb02ow chip that collectively computes\n2-D global visual motion in real-time [1]. Secondly, the complexity of\nthe classi\ufb01cation task is signi\ufb01cantly reduced by mapping the continu-\nous motion trajectories to sequences of \u2019motion events\u2019. And thirdly, all\nthe network structures are simple and with the exception of the optical\n\ufb02ow chip based on a Winner-Take-All (WTA) architecture. We demon-\nstrate the application of the proposed generic system for a contactless\nman-machine interface that allows to write letters by visual motion. Re-\ngarding the low complexity of the system, its robustness and the already\nexisting front-end, a complete aVLSI system-on-chip implementation is\nrealistic, allowing various applications in mobile electronic devices.\n\n1 Introduction\n\nThe classi\ufb01cation of continuous temporal patterns is possible using Hop\ufb01eld networks with\nasymmetric weights [2], but classi\ufb01cation is restricted to periodic trajectories with a well-\nknown start and end point. Also purely feed-forward network architectures were proposed\n[3]. 
However, such networks become unfeasibly large for practical applications.

We simplify the task by first mapping the continuous visual motion patterns to sequences of motion events. A motion event is characterized by the occurrence of visual motion in one out of a pre-defined set of directions. Known approaches to sequence classification fall into two major categories. The first group typically applies standard Hopfield networks with time-dependent weight matrices [4, 5]; these networks are relatively inefficient in storage capacity, using many units per stored pattern. The second approach relies on time-delay elements and some form of coincidence detector that responds dominantly to the correctly time-shifted events of a known sequence [6, 7]. These approaches allow a compact network architecture and, furthermore, require neither knowledge of the start and end point of a sequence nor a reset of internal states. The sequence classification network of our proposed system is based on the work of Tank and Hopfield [6], but is extended to be time-continuous and to show increased robustness. Finally, we modify the network architecture to allow the system to learn arbitrary sequences of a particular length.

* corresponding author; www.ini.unizh.ch/~alan

2 System architecture

Figure 1: The complete classification system.
The input to the system is a real-world moving visual stimulus and the output is the activity of units representing particular trajectory classes.

The system contains three major stages of processing, as shown in Figure 1: the optical flow chip estimates global visual motion, the direction selective network (DSN) maps the estimate to motion events, and the sequence classification network (SCN) finally classifies the sequences of these events. The architecture reflects the separation of the task into classification in motion space (DSN) and, consecutively, classification in time (SCN). Classification in both cases relies on identical WTA networks that differ only in their inputs. The outputs of the DSN and the SCN are 'quasi-discrete': both signals are continuous in time but, due to the non-linear amplification of the WTA, represent discrete information.

2.1 The optical flow chip

The front-end of the classification system consists of the optical flow chip [1, 8], which estimates 2-D visual motion. Due to adaptive circuitry, the estimate of visual motion is fairly independent of illumination conditions. The estimation of visual motion requires the integration of visual information within the image space in order to resolve inherent visual ambiguities. For the classification system presented here, the integration of visual information is set to take place over the complete image space; the resulting estimate thus represents the perceived global visual motion. The output signals of the chip are two analog voltages m_x and m_y that represent, at any instant, the two components of the actual global motion vector. The output signals are linear in the perceived motion within a fixed voltage range. The resolvable speed range is 1-3500 pix/sec and thus spans more than three orders of magnitude.
The continuous-time voltage trajectory m(t) = (m_x(t), m_y(t)) is the input to the direction selective network.

2.2 The direction selective network (DSN)

The second stage transforms the trajectory m(t) into a sequence of motion events, where an event means that the motion vector points into a particular region of motion space. Motion space is divided into a set of regions, each represented by a unit of the DSN (see Figure 2a). Each direction selective unit (DSU) receives highest input when m(t) is within the corresponding region. In the following we choose four motion directions, referred to as north (N), east (E), south (S) and west (W), and a central region for zero motion.

The WTA behavior of the DSN can be described by minimizing a cost function of the Hopfield type [9],

$$E = -\frac{1}{2}\sum_{i,j} T_{ij} V_i V_j - \sum_i I_i V_i + \frac{1}{R}\sum_i \int_0^{V_i} g^{-1}(V)\,dV, \qquad (1)$$

where T_ij contains the excitatory and inhibitory weights between the DSU [8]. The units have a sigmoidal activation function V_i = g(u_i). Following gradient descent, the dynamics of the units are described by

$$C\,\dot u_i = -\frac{u_i}{R} + \sum_j T_{ij} V_j + I_i, \qquad (2)$$

where C and R are the capacitance and resistance of the units. The preferred direction of the i-th DSU is given by the angle phi_i. The input to the DSU is

$$I_i = \begin{cases} |m|\cos(\phi - \phi_i) & \text{if } |\phi - \phi_i| \le \pi/2 \\ 0 & \text{otherwise,} \end{cases} \qquad (3)$$

where m = (|m|, phi) is the motion estimate in polar coordinates.
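As a minimal numeric sketch of Eqs. (1)-(3), the following simulates a five-unit DSN (four direction units plus the zero-motion unit) with forward-Euler integration. All parameter values, the sigmoid gain, and the zero-motion threshold are illustrative assumptions, not the circuit constants of the paper.

```python
import numpy as np

# Preferred directions of the N, E, S, W units (radians).
PREFERRED = np.array([np.pi / 2, 0.0, -np.pi / 2, np.pi])
M_THRESH = 0.05  # assumed zero-motion threshold [V]

def dsu_input(mx, my):
    """Inputs to the four DSU and the zero-motion unit, as in Eq. (3)."""
    speed = np.hypot(mx, my)
    phi = np.arctan2(my, mx)
    diff = np.angle(np.exp(1j * (phi - PREFERRED)))  # wrap to (-pi, pi]
    drive = np.where(np.abs(diff) <= np.pi / 2, speed * np.cos(diff), 0.0)
    return np.append(drive, M_THRESH - speed)        # last entry: zero unit

def wta_step(u, inp, dt=1e-3, C=1.0, R=1.0, w_exc=2.0, w_inh=1.0, beta=8.0):
    """One forward-Euler step of the WTA dynamics of Eq. (2)."""
    v = 1.0 / (1.0 + np.exp(-beta * u))              # sigmoidal activation
    recur = w_exc * v - w_inh * (v.sum() - v)        # self-excitation, mutual inhibition
    du = (-u / R + recur + inp) / C
    return u + dt * du

# Constant eastward motion should make the E unit (index 1) win.
u = np.zeros(5)
for _ in range(5000):
    u = wta_step(u, dsu_input(0.2, 0.0))
winner = int(np.argmax(u))
```

After convergence only the winning unit is strongly active, which is the 'quasi-discrete' behavior of the DSN described above.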
The input to the zero motion unit is I_0 = m_thresh - |m|.

Figure 2: The direction selective network. a) The WTA architecture of the DSN. Filled connections are excitatory, empty ones are inhibitory. Dotted lines show the regions in motion space where the different units win. b) The response of the N-DSU to constant input is shown as a surface plot, while the responses of the same unit to dynamic motion trajectories (circles and straight lines) are plotted as lines. Differences between constant and dynamic inputs are marginal. c) The output of the zero motion unit to constant input.

In Figure 2b we compare the outputs of a DSU to constant and varying input. The dynamic response is close to the steady state as long as the time-constant of the DSN is smaller than the typical time-scale of m(t).

2.3 The sequence classification network (SCN)

The classification of the temporal structure of the DSN output is the task of the SCN. The network uses time-delays to 'concentrate information in time' [6] (see Figure 3b).
Analogously to the regions in motion space, these time-delays form 'regions' in time.

The number of units (SCU) of the SCN is equal to the number of trajectory classes the system is able to classify. We use k time-delays, where k is the number of events of the longest sequence to be classified. The time interval T_delay between two maxima of the time-delay functions is the characteristic time-scale of the sequence classification. Again, the SCN is a WTA network with a cost function equivalent to (1), except that an additional term I_c is introduced to provide a constant input. The SCU have a sigmoidal activation function and follow the dynamics

$$C\,\dot u_i = -\frac{u_i}{R} + \sum_j T_{ij} V_j + \sum_{j,k} W_{ijk}\, V^{\text{del}}_{jk}(t) - I_c, \qquad (4)$$

where W_ijk are the weights of the connections between the DSN and the SCN and V^del_jk(t) is the output of DSU j delayed by k T_delay. The sum over the delayed inputs plays the role of the input term I_i in (2). The time-delay functions are the same as in [6]^1. Note that I_c is the only additional term compared to the dynamics in (2); it allows a detection threshold to be set for the sequence classification.

Figure 3a shows an outline of the SCN and its connectivity. For example, if the sequence N-W-E has to be classified, the inputs from the E-DSU delayed by T_delay, from the W-DSU by 2 T_delay and from the N-DSU by 3 T_delay are excitatory, while all the others are inhibitory. All excitatory as well as all inhibitory weights are equal, with excitation being twice as strong as inhibition. The additional time-delay is always inhibitory.
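The 'concentration of information in time' can be illustrated with a small numeric sketch: each event is passed through kernels peaked at multiples of T_delay, so a matching sequence delivers all of its excitation simultaneously. The kernel shape, the weight values, and the omission of the additional inhibitory delay are simplifying assumptions; the paper uses the kernels of Tank and Hopfield [6].

```python
import numpy as np

T_DELAY = 1.0

def delay_kernel(k, t):
    """Smooth kernel with unit peak at t = k*T_DELAY (assumed alpha shape)."""
    tau = k * T_DELAY
    return np.where(t > 0, (t / tau) * np.exp(1.0 - t / tau), 0.0)

def scu_drive(sequence, events, t):
    """Summed delayed input to the unit tuned to `sequence`.

    `events` is a list of (direction, onset_time) motion events. The last
    event of the sequence gets delay T_DELAY, the first k*T_DELAY, so the
    events of the correct sequence coincide at the unit's input.
    """
    drive = 0.0
    for d, onset in events:
        for k, seq_d in enumerate(reversed(sequence), start=1):
            w = 2.0 if d == seq_d else -1.0  # excitation twice as strong
            drive += w * delay_kernel(k, t - onset)
    return drive

# The unit tuned to N-W-E responds most to the matching event train.
events = [("N", 0.0), ("W", T_DELAY), ("E", 2 * T_DELAY)]
t = 3 * T_DELAY  # the moment all delayed events coincide
match = scu_drive(["N", "W", "E"], events, t)
other = scu_drive(["N", "E", "W"], events, t)
```

The matching unit receives a clearly larger drive than a unit tuned to a permuted sequence, which is what the WTA stage then amplifies into a discrete decision.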
This additional inhibitory delay prevents the first motion event from overruling the rest of the sequence and is crucial for the exact classification of short sequences.

Figure 3: The sequence classification network. a) Outline of its WTA structure (shown within the dashed line) and its input stage (k=3). The time-delays between the DSU and the SCU are numbered in units of T_delay. Filled dots are excitatory connections while empty ones are inhibitory. The additional inhibitory delay is not shown. The marked unit recognizes the sequence N-W-E. b) A sequence is classified by delaying consecutive motion events such that they provide a simultaneous excitatory input.

^1 Each time-delay function is a smooth kernel of the form used in [6], normalized to a unit maximum at its characteristic delay k T_delay.

3 Performance of the system

We measure the performance of the system in two different ways. Firstly, we analyze the robustness to time warping. Knowing the response properties of the optical flow chip [8], we simulate its output in order to systematically analyze the two other stages of the system. Secondly, we test the complete system, including the optical flow chip, under real conditions.
Here, only a qualitative assessment can be given.

3.1 Robustness to time warping

We simulate the visual motion trajectories as a sum of Gaussians in time. The important parameters are the width sigma of the Gaussians and the time difference Delta-t between the centers of two neighboring Gaussians. Three schemes are tested: changes of sigma only, changes of Delta-t only, and a linear stretch in time, i.e. a change of both parameters. Time is always measured in units of the characteristic time-delay T_delay.

For fixed Delta-t, sigma can be decreased substantially before classification fails, with sequences of length two tolerating a larger reduction than longer sequences. Fixing sigma, classification is still guaranteed for varying Delta-t according to Figure 4a; the tolerated increase of Delta-t shrinks with the number of events, three and four events being shown as gray and white bars in Figure 4. Linear time stretches change the total input to the system, which causes the asymmetry seen in Figure 4b. Short sequences are relatively more robust to any change in Delta-t than longer sequences^2.

Figure 4: Time warping. The histograms show the maximal acceptable time warping. The results are shown for three different trajectory lengths (black: two motion events, gray: three events, white: four events) and three different input strengths (maximal output voltages of the optical flow chip). a) sigma is held fixed while Delta-t is changed. b) Time is stretched linearly and therefore the duration of the events is proportional to Delta-t.
No\n\nis stretched linearly and therefore the duration of the events is proportional to\nclassi\ufb01cation is possible for sequences of length four at very low input levels.\n\nvoltages of the optical \ufb02ow chip). a)N\n\nis held at\u001f\n\n\u0001 delay while\n\n\u0011\u0004\u0001\n\nThe system cannot distinguish between the sequences e.g. N-W-E-W and N-W-W-W. In\nthis case, the sum of the weighted integrals of the delay functions of both sequences leads\nto an equivalent input to the SCN. However, if two adjacent events are not allowed to be\nthe same, this problem does not occur.\n\n2Imagine the time warp being C A\u0016\u0015\nbecomes larger than' delay for some of the events, which leads to inhibition instead of excitation.\n\n. For a sequence with \ufb01ve events and more, the time shift\n\n\u0012\n\nU\n\u0003\n\u0010\n\u000b\n6\n\u0010\n\u001a\n\"\n\u0005\n?\n2\n\u0010\n/\n\u000b\n?\n6\n\u0010\n\u0004\n\f\n\u001f\n\u0018\n\u0005\n\u0004\n\f\n\u0005\n\u0018\n\u001f\n\u0011\n\u0001\n\u0011\n\u0001\n\u0012\n\t\nD\n\u000b\n\u0011\n\u0001\nU\n\u0012\n\u0011\n\u0001\n\u0012\n\u0011\n\u0001\n\u000e\nD\n\u000b\n\u0011\n\u0001\n\f3.2 Real world application - writing letters with patterns of hand movements\n\nThe complete system was applied to classify visual motion patterns elicited by hand move-\nments in front of the optical \ufb02ow chip. Using sequences of three events we are able to\nclassify 36 valid sequences and therefore encode the alphabet. 
Figure 5 shows a typical visual motion pattern (assigned to the letter 'H') and the corresponding signals at all stages of processing.

Figure 5: Tracking a signal through all stages. a) The output of the optical flow chip to a moving hand in a N-S vs. E-W motion plot. The marks on the trajectory show different time stamps. b) The same trajectory, including the time stamps, in a motion vs. time plot (N-S motion: solid line, E-W motion: dashed line). Time is given in units of T_delay. c) The output of the DSN showing the classification in motion space (N: solid line, E: dashed, W: dotted). d) The output of the SCN. Here, the unit that recognizes the trajectory class 'H' is shown by the solid line. The detection threshold is set at 0.8 of the maximal activity.

The system runs on a 166 MHz Pentium PC using Matlab (The MathWorks Inc.). The signal of the optical flow chip is read into the computer using an AD card. All simulations are done with simple forward integration of the differential equations.

4 Learning motion trajectories

We expanded the system to be able to learn visual motion patterns. We model each set of four synapses connecting the four DSU to a single SCU with the same time-delay by a competitive network of four synapse units (see Figure 6) with very slow time constants. We impose on the outputs of the four units that their sum equals 1.
The cost function\n\n\u0005\n\n\u0010\n\fa\n\nN E S W\nt 3\nt 3\n\nt 3\n\nt 3\n\nx\n\nx\n\nx\n\nx\n\n-1\n\n+\n\nb\n\nc\n\n1\n\n0.5\n\ny\nt\ni\nv\ni\nt\nc\na\n\n0\n\n0 \n\n5\n\n10\n\n15\n\n20\n\nwexc\n\ns\nt\nh\ng\ni\ne\nw\n\n0\n\n-winh\n\n0 \n\n5\n\n10\n15\ntime [sec]\n\n20\n\nof synapses. The dashed line shows one synapse: the synaptic weight \u000f\nsynapse unit and its output\n\nFigure 6: Learning trajectory classes. a) Schematics of the competitive network of a set\n, the input to the\n. Multiplication by the output signal of the SCU is indicated\nby the \u201cx\u201d in the small square, the linear mapping by the bold line from the synapse output\nto the weight. b) Output of the SCU during the repetitive presentation of a particular\ntrajectory. Initial weights were random. c) Learning the synaptic weights associated with\none particular time-delay.\n\nis given by\n\n\u0005\u000b\u0010\n\n)\u0005\u0004\u0007\u0006\n\n-\t\b\n\n\u001a \u001f\u0003\u0002\n\f\f\u000b\n\n\u001f\u0002\u001a\n\n\u0010&\u0007\n\n(5)\n\n and\n\n(6)\n\n\u0001 ,\n\n6\u000e\n\n\f\r\n\n\u0010 and \u0011\n\u0010&\u0007\n\nSince the activity of the synapse units\n\nis always between 0 and 1 a linear mapping to\n\nwhere the synapse units have an sigmoidal activation function\n\n\u0005 are de\ufb01ned as in (2) and (4). The synaptic dynamics are given by\n\f\f\u000b\n\u001f\u001c\u001a\n\u0017\u0019\u0018\n\u0017\u0019\u0018\n\n\u000e ,\u0013\n\u001a \u001f\n\u0007\u001c\b\n\u0017\u0019\u0018 . To allow\nthe actual synaptic weights is performed: \u000f\n\u0001\u001d\t\nactivation of the SCU with unlearned synapses we choose\u0007\n\u0017\u0019\u0018\u0010\u000f\nwhere\u0007\n\u0017\u0019\u0018\u0010\u000f\nare all slightly positive before learning.\u0007\n\u0017\u0019\u0018\nthe input weight ($ ), the delayed input to the\n\u000e ) and the output of the SCU (\u0007\nsynapse (\u0011\n\u0005 ) (see Figure 6a). The term \f\n\nis\nincluded to enable learning only if the sequence is completed. 
The weight of a particular synapse is increased if both the input to the synapse and the activity of the target SCU are high. The reduction of the other weights is due to the competitive network behavior. The learning mechanism was tested using simulated and real-world inputs. Under the restriction that trajectories must differ by more than one event, the system is able to learn sequences of length three. Sequences that differ by only one event are learned by the same SCU; subsequent sequences thus overwrite previously learned ones. Figure 6b,c shows the learning process for one particular trajectory class of three events. This trajectory is part of a set of six trajectories that were learned during one simulation cycle, where each input trajectory was consecutively presented five times.

5 Conclusions and outlook

We have shown a strikingly simple^3 network system that reliably classifies distinct visual motion patterns.
Clearly, the application of the optical flow chip substantially reduces the remaining computational load and allows real-time processing.

A remarkable feature of our system is that, with the exception of the visual motion front-end but including the learning rule, all networks have competitive dynamics and are based on the classical Winner-Take-All architecture. WTA networks can be implemented compactly in aVLSI [10]. Thus, given also the small network size, a complete aVLSI system-on-chip integration (not considering the learning mechanism) seems very likely. Such a single-chip system would represent a very efficient computational device, requiring minimal space, weight and power. The 'quasi-discretization' in visual motion space that emerges from the non-linear amplification in the direction selective network could be refined to include not only more directions but also different speed levels. That way, richer sets of trajectories could be classified. Many applications are imaginable in mobile electronic devices that require (or desire) a touchless interface. Commercial applications in people control and surveillance seem feasible and are already considered.

Acknowledgments

This work is supported by the Human Frontiers Science Project grant no. RG00133/2000-B and ETHZ Forschungskredit no. 0-23819-01.

References

[1] A. Stocker and R. J. Douglas. Computation of smooth optical flow in a feedback connected analog network. Advances in Neural Information Processing Systems, 11:706-712, 1999.

[2] L. G. Sotelino, M. Saerens, and H. Bersini. Classification of temporal trajectories by continuous-time recurrent nets. Neural Networks, 7(5):767-776, 1994.

[3] D. T. Lin, J. E. Dayhoff, and P. A. Ligomenides. Trajectory recognition with a time-delay neural network. International Joint Conference on Neural Networks, Baltimore, III:197-202, 1992.

[4] H.
Gutfreund and M. Mezard. Processing of temporal sequences in neural networks. Phys. Rev. Letters, 61(2):235-238, July 1988.

[5] D.-L. Lee. Pattern sequence recognition using a time-varying Hopfield network. IEEE Trans. on Neural Networks, 13(2):330-342, March 2002.

[6] D. W. Tank and J. J. Hopfield. Neural computation by concentrating information in time. Proc. Natl. Acad. Sci. USA, 84:1896-1900, April 1987.

[7] J. J. Hopfield and C. D. Brody. What is a moment? Transient synchrony as a collective mechanism for spatiotemporal integration. Proc. Natl. Acad. Sci. USA, 98:1282-1287, January 2001.

[8] A. Stocker. Constraint optimization networks for visual motion perception - analysis and synthesis. PhD thesis, ETH Zürich, No. 14360, 2001.

[9] J. J. Hopfield. Neurons with graded response have collective computational properties like those of two-state neurons. Proc. Natl. Acad. Sci. USA, 81:3088-3092, May 1984.

[10] R. Hahnloser, R. Sarpeshkar, M. Mahowald, R. Douglas, and S. Seung. Digital selection and analogue amplification coexist in a cortex-inspired silicon circuit. Nature, 405:947-951, June 2000.

^3 e.g. the presented man-machine interface consists of only 31 units and 4x4 time-delays, not counting the network elements in the optical flow chip.", "award": [], "sourceid": 2311, "authors": [{"given_name": "Jakob", "family_name": "Heinzle", "institution": null}, {"given_name": "Alan", "family_name": "Stocker", "institution": null}]}