{"title": "A Method for the Associative Storage of Analog Vectors", "book": "Advances in Neural Information Processing Systems", "page_first": 590, "page_last": 595, "abstract": null, "full_text": "590 \n\nAtiya and Abu-Mostafa \n\nA Method for the Associative Storage of Analog Vectors \n\nAmir Atiya (*) and Yaser Abu-Mostafa (**) \n\n(*) Department of Electrical Engineering \n\n(**) Departments of Electrical Engineering and Computer Science \n\nCalifornia Institute of Technology \n\nPasadena, CA 91125 \n\nABSTRACT \n\nA method for storing analog vectors in Hopfield's continuous feedback model is proposed. By analog vectors we mean vectors whose components are real-valued. The vectors to be stored are set as equilibria of the network. The network model consists of one layer of visible neurons and one layer of hidden neurons. We propose a learning algorithm which adjusts the positions of the equilibria and guarantees their stability. Simulation results confirm the effectiveness of the method. \n\n1 INTRODUCTION \n\nThe associative storage of binary vectors using discrete feedback neural nets was demonstrated by Hopfield (1982). This has attracted a lot of attention, and a number of alternative techniques, also using the discrete feedback model, have appeared. However, the problem of the distributed associative storage of analog vectors has received little attention in the literature. By analog vectors we mean vectors whose components are real-valued. This problem is important because in a variety of applications of associative memories, such as pattern recognition and vector quantization, the patterns are originally in analog form; storing them directly avoids the costly quantization step and the accompanying increase in the dimension of the vectors. In dealing with analog vectors, we consider feedback networks of the continuous-time graded-output variety, e.g. 
Hopfield's model (1984): \n\ndu/dt = -u + W f(u) + a, x = f(u), (1) \n\nwhere u = (u_1, ..., u_N)^T is the vector of neuron potentials, x = (x_1, ..., x_N)^T is the vector of firing rates, W is the weight matrix, a is the threshold vector, and f(u) denotes the vector (f(u_1), ..., f(u_N))^T, where f is a sigmoid-shaped function. \n\nThe vectors to be stored are set as equilibria of the network. Given a noisy version of any of the stored vectors as the initial state, the network state must eventually reach the equilibrium state corresponding to the correct vector. An important requirement is that these equilibria be asymptotically stable; otherwise attraction to them is not guaranteed. Indeed, without enforcing this requirement, our numerical simulations show mostly unstable equilibria. \n\n2 THE MODEL \n\nIt can be shown that there are strong limitations on the set of memory vectors which can be stored using Hopfield's continuous model (Atiya and Abu-Mostafa 1990). To relieve these limitations, we use an architecture consisting of both visible and hidden units. The outputs of the visible units correspond to the components of the stored vector. The proposed architecture is close to the continuous version of the BAM (Kosko 1988). The model consists of one layer of visible units and another layer of hidden units (see Figure 1). The output of each layer is fed as an input to the other layer. No connections exist within either layer. Let y and x be the output vectors of the hidden layer and the visible layer respectively. 
Then, in our model, \n\ndu/dt = -u + W f(z) + a = e, y = f(u) (2a) \n\ndz/dt = -z + V f(u) + b = h, x = f(z) (2b) \n\nwhere W = [W_ij] and V = [V_ij] are the weight matrices, a and b are the threshold vectors, and f is a monotonically increasing sigmoid function with range (-1, 1), for example \n\nf(u) = tanh(u). \n\nFigure 1: The model (a hidden layer and a visible layer, each feeding its output to the other) \n\nAs we mentioned before, for a basin of attraction to exist around a given memory vector, the corresponding equilibrium has to be asymptotically stable. For the proposed architecture a condition for stability is given by the following theorem. \n\nTheorem: An equilibrium point (u*, z*) satisfying \n\nf'^{1/2}(u_i*) Σ_j |W_ij| f'^{1/2}(z_j*) < 1 (3a) \n\nf'^{1/2}(z_i*) Σ_j |V_ij| f'^{1/2}(u_j*) < 1 (3b) \n\nfor all i is asymptotically stable. \n\nProof: We linearize (2a), (2b) around the equilibrium and get \n\ndq/dt = J q, \n\nwhere q_i = u_i - u_i* if i = 1, ..., N_1 and q_i = z_{i-N_1} - z_{i-N_1}* if i = N_1 + 1, ..., N_1 + N_2; N_1 and N_2 are the numbers of units in the hidden layer and the visible layer respectively; and J is the Jacobian matrix of (e, h) with respect to (u, z), with the partial derivatives evaluated at the equilibrium point. Let A_1 and A_2 be respectively the N_1 x N_1 and N_2 x N_2 diagonal matrices whose ith diagonal elements are respectively f'(u_i*) and f'(z_i*). Furthermore, let Λ = diag(A_1, A_2). The Jacobian evaluates to \n\nJ = ( -I_{N_1}, W A_2 ; V A_1, -I_{N_2} ), \n\nwhere I_L means the L x L identity matrix. Let \n\nA = ( -A_1^{-1}, W ; V, -A_2^{-1} ). \n\nThen, \n\nJ = A Λ. \n\nThe eigenvalues of A Λ are identical to the eigenvalues of Λ^{1/2} A Λ^{1/2}, because if λ 
is an eigenvalue of A Λ corresponding to eigenvector v, then \n\nA Λ v = λ v, \n\nand hence \n\nΛ^{1/2} A Λ^{1/2} (Λ^{1/2} v) = λ (Λ^{1/2} v). \n\nNow, we have \n\nΛ^{1/2} A Λ^{1/2} = ( -I_{N_1}, A_1^{1/2} W A_2^{1/2} ; A_2^{1/2} V A_1^{1/2}, -I_{N_2} ). \n\nBy Gershgorin's Theorem (Franklin 1968), an eigenvalue λ of this matrix (and hence of J) has to satisfy at least one of the inequalities: \n\n|λ + 1| ≤ f'^{1/2}(u_i*) Σ_j |W_ij| f'^{1/2}(z_j*), i = 1, ..., N_1 \n\n|λ + 1| ≤ f'^{1/2}(z_i*) Σ_j |V_ij| f'^{1/2}(u_j*), i = 1, ..., N_2. \n\nIt follows that under conditions (3a), (3b) the eigenvalues of J have negative real parts, and hence the equilibrium of the original system (2a), (2b) is asymptotically stable. \n\nThus, if the hidden unit values are driven far enough into the saturation region (i.e. with values close to 1 or -1), then the corresponding equilibrium will be stable, because then f'(u_i*) will be very small, causing Inequalities (3) to be satisfied. Although there is nothing to rule out the existence of spurious equilibria and limit cycles, if they occur they will be far away from the memory vectors, because each memory vector has a basin of attraction around it. In our simulations we have never encountered limit cycles. \n\n3 TRAINING ALGORITHM \n\nLet x^m, m = 1, ..., M be the vectors to be stored. Each x^m should correspond to the visible layer component of one of the asymptotically stable equilibria. We design the network such that the hidden layer component of the equilibrium corresponding to x^m is far into the saturation region. The target hidden layer component y^m can be taken as a vector of 1's and -1's, chosen arbitrarily, for example by generating the components randomly. Then, the weights have to satisfy \n\ny_j^m = f(Σ_l W_jl x_l^m + a_j), \n\nx_i^m = f[Σ_j V_ij f(Σ_l W_jl x_l^m + a_j) + b_i]. \n\nTraining is performed in two steps. In the first step we train the weights of the hidden layer. 
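As a rough numerical illustration of this two-step procedure, the following NumPy sketch is our own reconstruction, not the authors' code; the learning rate, iteration counts, and the saturation gain are illustrative assumptions, and f = tanh as in the model:

```python
# Sketch of the two-step training scheme for the hidden/visible two-layer
# feedback memory (our reconstruction; hyperparameters are assumptions).
import numpy as np

rng = np.random.default_rng(0)

def train(X, n_hidden, lr=0.01, steps=5000, gain=10.0):
    """X: (M, N) array of memory vectors with components in (-1, 1)."""
    M, N = X.shape
    # Arbitrary +/-1 hidden-layer targets y^m, one per memory vector.
    Y = np.where(rng.random((M, n_hidden)) < 0.5, -1.0, 1.0)
    W = 0.1 * rng.standard_normal((n_hidden, N))
    a = np.zeros(n_hidden)
    # Step 1: steepest descent on E1, stopping within 0.2 of the targets.
    for _ in range(steps):
        Yhat = np.tanh(X @ W.T + a)
        if np.max(np.abs(Yhat - Y)) < 0.2:
            break
        G = 2.0 * (Yhat - Y) * (1.0 - Yhat ** 2)   # dE1/d(pre-activation)
        W -= lr * G.T @ X
        a -= lr * G.sum(axis=0)
    # Drive the hidden units into saturation by scaling weights/thresholds.
    W *= gain
    a *= gain
    # Step 2: steepest descent on E2 for the visible-layer weights.
    H = np.tanh(X @ W.T + a)                       # ~ +/-1 hidden outputs
    V = 0.1 * rng.standard_normal((N, n_hidden))
    b = np.zeros(N)
    for _ in range(steps):
        Xhat = np.tanh(H @ V.T + b)
        G = 2.0 * (Xhat - X) * (1.0 - Xhat ** 2)
        V -= lr * G.T @ H
        b -= lr * G.sum(axis=0)
    return W, a, V, b
```

Recall would then run the continuous dynamics (2a), (2b) with the trained weights from a noisy visible-layer state until it settles at the nearest equilibrium.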
We use steepest descent on the error function \n\nE_1 = Σ_{m,j} [y_j^m - f(Σ_l W_jl x_l^m + a_j)]^2. \n\nIn the second step we train the weights of the visible layer, using steepest descent on the error function \n\nE_2 = Σ_{m,i} [x_i^m - f(Σ_j V_ij f(Σ_l W_jl x_l^m + a_j) + b_i)]^2. \n\nWe remark that in the first step convergence might be slow, since the targets are 1 or -1. A way to achieve fast convergence is to stop when the outputs are within some constant (say 0.2) of the targets. Then we multiply the weights and the thresholds of the hidden layer by a large positive constant, so as to force the outputs of the hidden layer to be close to 1 or -1. \n\n4 IMPLEMENTATION \n\nWe consider a network with 10 visible and 10 hidden units. The memory vectors are randomly generated (the components are drawn from -0.8 to 0.8 rather than the full range, to obtain faster convergence). Five memory vectors are considered. After learning, the memory is tested by presenting memory vectors plus noise (100 vectors for a given variance). Figure 2 shows the percentage of correct recall in terms of the signal-to-noise ratio. Although we found that we could store up to 10 vectors, working close to the full capacity is not recommended, as the recall accuracy deteriorates. \n\nFigure 2: Recall accuracy versus signal-to-noise ratio \n\nAcknowledgement \n\nThis work is supported by the Air Force Office of Scientific Research under grant AFOSR-88-0231. \n\nReferences \n\nJ. Hopfield (1982), \"Neural networks and physical systems with emergent collective computational abilities\", Proc. Nat. Acad. Sci. USA, vol. 79, pp. 2554-2558. \n\nJ. 
Hopfield (1984), \"Neurons with graded response have collective computational properties like those of two-state neurons\", Proc. Nat. Acad. Sci. USA, vol. 81, pp. 3088-3092. \n\nA. Atiya and Y. Abu-Mostafa (1990), \"An analog feedback associative memory\", to be submitted. \n\nB. Kosko (1988), \"Bidirectional associative memories\", IEEE Trans. Syst. Man Cybern., vol. SMC-18, no. 1, pp. 49-60. \n\nJ. Franklin (1968), Matrix Theory, Prentice-Hall, Englewood Cliffs, New Jersey. \n", "award": [], "sourceid": 206, "authors": [{"given_name": "Amir", "family_name": "Atiya", "institution": null}, {"given_name": "Yaser", "family_name": "Abu-Mostafa", "institution": null}]}