{"title": "Exploiting Chaos to Control the Future", "book": "Advances in Neural Information Processing Systems", "page_first": 647, "page_last": 654, "abstract": null, "full_text": "Exploiting Chaos to Control the Future\n\nGary W. Flake*\n\nGuo-Zhen Sun\u2020\n\nYee-Chun Lee\u2020\n\nHsing-Hen Chen\u2020\n\nInstitute for Advanced Computer Studies\n\nUniversity of Maryland\nCollege Park, MD 20742\n\n*Department of Computer Science, peyote@umiacs.umd.edu\n\u2020Laboratory for Plasma Research\n\nAbstract\n\nRecently, Ott, Grebogi and Yorke (OGY) [6] found an effective method to control chaotic systems to unstable fixed points by using only small control forces; however, OGY's method is based on and limited to a linear theory and requires considerable knowledge of the dynamics of the system to be controlled. In this paper we use two radial basis function networks: one as a model of an unknown plant and the other as the controller. The controller is trained with a recurrent learning algorithm to minimize a novel objective function such that the controller can locate an unstable fixed point and drive the system into the fixed point with no a priori knowledge of the system dynamics. Our results indicate that the neural controller offers many advantages over OGY's technique.\n\n1 Introduction\n\nRecently, Ott, Grebogi and Yorke (OGY) [6] proposed a simple but powerful idea. Since any small perturbation can cause a large change in a chaotic trajectory, it is possible to use a very small control force to achieve a large trajectory modification. Moreover, due to the ergodicity of chaotic motion, any state in a chaotic attractor can be reached by a small control force. Since OGY published their work, several experiments and simulations have demonstrated the usefulness of OGY's method. 
\nOne prominent application of OGY's method is the prospect of controlling cardiac chaos [1].\n\nWe note that there are several unfavorable constraints on OGY's method. First, it requires a priori knowledge of the system dynamics, that is, the location of fixed points. Second, due to the limitation of linear theory, it will not work in the presence of large noise or when the control force is large enough to push the system beyond the linear region from which the control law was constructed. Third, although ergodic theory guarantees that any state that moves away from the desired fixed point will eventually return to its linear vicinity, it may take a very long time for this to happen, especially for a high-dimensional chaotic attractor.\n\nIn this paper we demonstrate how a neural network (NN) can control a chaotic system with only a small control force and be trained with only examples from the state space. To solve this problem, we introduce a novel objective function which measures the distance between the current state and the average of its previous states. By minimizing this objective function, the NN can automatically locate the fixed point. As a preliminary step, a training set is used to train a forward model for the chaotic dynamics. The work of Jordan and Rumelhart [4] has shown that control problems can be mapped into supervised learning problems by coupling the outputs of a controller NN (the control signals) to the inputs of a forward model of a plant to form a multilayer network that is indirectly recurrent. A recurrent learning algorithm is used to train the controller NN. To facilitate learning we use an extended radial basis function (RBF) network for both the forward model and the controller. To benchmark against OGY's result, the H\u00e9non map is used as a numerical example. The numerical results show the preliminary success of the proposed scheme. Details are given in the following sections. 
\n\nIn the next section we give our methodology and describe the general form of the recurrent learning algorithm used in our experiments. In Section 3, we discuss RBF networks and reintroduce a more powerful version. In Section 4, the numerical results are presented in detail. Finally, in Section 5, we give our conclusions.\n\n2 Recurrent Learning for Control\n\nLet $k(\\cdot)$ denote a NN whose output, $\\mathbf{u}_t$, is composed through a plant, $f(\\cdot)$, with unknown dynamics. The output of the unknown plant (the state), $\\mathbf{x}_{t+1}$, forms part of the input for the NN at the next time step, hence the recurrency. At each time step the state is also passed to an output function, $g(\\cdot)$, which computes the sensation, $\\mathbf{y}_{t+1}$. The time evolution of this system is more accurately described by\n\n$$\\mathbf{u}_t = k(\\mathbf{x}_t, \\mathbf{y}^*_{t+1}, \\mathbf{w}), \\qquad \\mathbf{x}_{t+1} = f(\\mathbf{x}_t, \\mathbf{u}_t), \\qquad \\mathbf{y}_{t+1} = g(\\mathbf{x}_{t+1}),$$\n\nwhere $\\mathbf{y}^*_{t+1}$ is the desired sensation for time $t + 1$ and $\\mathbf{w}$ represents the trainable weights for the network. Additionally, we define the temporally local and global error functionals\n\n$$J_t = \\frac{1}{2} \\| \\mathbf{y}^*_t - \\mathbf{y}_t \\|^2 \\quad \\text{and} \\quad E = \\sum_{i=1}^{N} J_i,$$\n\nwhere $N$ is the final time step for the system.\n\nThe real-time recurrent learning (RTRL) algorithm [9] for training the network weights to minimize $E$ is based on the fair assumption that minimizing the local error functionals with a small learning rate at each time step will correspond to minimizing the global error. To derive the learning algorithm, we can imagine the system consisting of the plant, controller, and error functionals as being unfolded in time. From this perspective we can view each instance of the controller NN as a separate NN and thus differentiate the error functionals with respect to the network weights at different times. Hence, we now add a time index to $\\mathbf{w}_t$ to represent this fact. However, when we use $\\mathbf{w}$ without the time index, the term should be understood to be time invariant. 
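The closed-loop system and the error functionals defined in this section can be sketched in a few lines. This is only an illustrative sketch: the controller, plant, and output function below (toy_controller, toy_plant, identity_output) are hypothetical stand-ins chosen so that the loop runs, not the networks or the plant used in the paper.

```python
def local_error(y_star, y):
    # J_t = 0.5 * ||y*_t - y_t||^2
    return 0.5 * sum((a - b) ** 2 for a, b in zip(y_star, y))

def rollout(k, f, g, x0, targets, w):
    # Closed loop: u_t = k(x_t, y*_{t+1}, w), x_{t+1} = f(x_t, u_t), y_{t+1} = g(x_{t+1})
    x = list(x0)
    errors = []
    for y_star in targets:          # y_star plays the role of y*_{t+1}
        u = k(x, y_star, w)         # controller output (control signal)
        x = f(x, u)                 # plant produces the next state
        y = g(x)                    # output function produces the sensation
        errors.append(local_error(y_star, y))
    return errors, sum(errors)      # local errors J_t and global error E

# Hypothetical stand-ins so the sketch runs; none of them come from the paper
def toy_controller(x, y_star, w):
    return [w * (ys - xi) for ys, xi in zip(y_star, x)]

def toy_plant(x, u):
    return [0.5 * xi + ui for xi, ui in zip(x, u)]

def identity_output(x):
    return list(x)

J, E = rollout(toy_controller, toy_plant, identity_output, [0.0], [[1.0]] * 3, w=0.5)
```

Here J holds one local error per time step and E is their sum. A learning rule would adjust w to reduce E, which is what the gradient equations of this section provide.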
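\n\nWe can now define the matrix\n\n$$\\Gamma_t = \\sum_{i=0}^{t-1} \\frac{\\partial \\mathbf{x}_t}{\\partial \\mathbf{w}_i} = \\frac{\\partial \\mathbf{x}_t}{\\partial \\mathbf{u}_{t-1}} \\frac{\\partial \\mathbf{u}_{t-1}}{\\partial \\mathbf{w}_{t-1}} + \\left( \\frac{\\partial \\mathbf{x}_t}{\\partial \\mathbf{u}_{t-1}} \\frac{\\partial \\mathbf{u}_{t-1}}{\\partial \\mathbf{x}_{t-1}} + \\frac{\\partial \\mathbf{x}_t}{\\partial \\mathbf{x}_{t-1}} \\right) \\Gamma_{t-1}, \\qquad (1)$$\n\nwhich further allows us to define\n\n$$\\frac{\\partial J_t}{\\partial \\mathbf{w}} = \\frac{\\partial J_t}{\\partial \\mathbf{y}_t} \\frac{\\partial \\mathbf{y}_t}{\\partial \\mathbf{x}_t} \\Gamma_t, \\qquad (2)$$\n\n$$\\frac{\\partial E}{\\partial \\mathbf{w}} = \\sum_{t=1}^{N} \\frac{\\partial J_t}{\\partial \\mathbf{w}}. \\qquad (3)$$\n\nEquation 2 is the gradient equation for the RTRL algorithm while Equation 3 is for the backpropagation through time (BPTT) learning algorithm [7]. The gradients defined by these equations are usually used with gradient descent on a multilayer perceptron (MLP). We will use them on RBF networks.\n\n3 The CNLS Network\n\nThe Connectionist Normalized Local Spline (CNLS) network [3] is an extension of the more familiar radial basis function network of Moody and Darken [5]. The forward operation of the network is defined by\n\n$$\\phi(\\mathbf{x}) = \\sum_{i=1}^{n} q_i(\\mathbf{x}) \\, p_i(\\mathbf{x}), \\qquad (4)$$\n\nwhere\n\n$$q_i(\\mathbf{x}) = f_i + \\mathbf{d}_i \\cdot (\\mathbf{x} - \\mathbf{a}_i), \\qquad p_i(\\mathbf{x}) = \\frac{\\exp(-\\beta_i \\| \\mathbf{x} - \\mathbf{a}_i \\|^2)}{\\sum_{j=1}^{n} \\exp(-\\beta_j \\| \\mathbf{x} - \\mathbf{a}_j \\|^2)}. \\qquad (5)$$\n\nAll of the equations in this section assume a single output. Generalizing them for multiple outputs merely adds another index to the terms. For all of our simulations, we choose to distribute the centers, $\\mathbf{a}_i$, based on a sample of the input space. Additionally, the basis widths, $\\beta_i$, are set to an experimentally determined constant. Because the output, $\\phi$, is linear in the terms $f_i$ and $\\mathbf{d}_i$, training them is very fast. To train the CNLS network on a prediction problem, we can use a quadratic error function of the form $E = \\frac{1}{2}(y(\\mathbf{x}) - \\phi(\\mathbf{x}))^2$, where $y(\\mathbf{x})$ is the target function that we wish to approximate. We use a one-dimensional Newton-like method [8] which yields the update equations\n\n[Update equations for $f_i$ and $\\mathbf{d}_i$, not recoverable in full; the right-most rules have the form $f_i \\leftarrow f_i + \\eta \\, (y(\\mathbf{x}) - \\phi(\\mathbf{x})) \\, (\\cdots)$ and $\\mathbf{d}_i \\leftarrow \\mathbf{d}_i + \\eta \\, (y(\\mathbf{x}) - \\phi(\\mathbf{x})) \\, (\\cdots)$.]\n\nThe right-most update rules form the learning algorithm when using the CNLS network for prediction, where $\\eta$ is a learning rate that should be set below 1.0. 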
The left-most update rules describe a more general learning algorithm that can be used when a target output is unknown.\n\nWhen using the CNLS network architecture as part of a recurrent learning algorithm we must be able to differentiate the network outputs with respect to the inputs. Note that in Equations 1 and 2 each of the terms $\\partial \\mathbf{x}_t / \\partial \\mathbf{u}_{t-1}$, $\\partial \\mathbf{u}_{t-1} / \\partial \\mathbf{x}_{t-1}$, $\\partial \\mathbf{x}_t / \\partial \\mathbf{x}_{t-1}$, and $\\partial \\mathbf{y}_t / \\partial \\mathbf{x}_t$ can either be exactly solved or approximated by differentiating a CNLS network. Since the CNLS output is highly nonlinear in its inputs, computing these partial derivatives is not quite as elegant as it would be in an MLP. Nevertheless, it can be done. We skip the details and just show the end result:\n\n$$\\frac{\\partial \\phi}{\\partial \\mathbf{x}} = \\sum_{i=1}^{n} \\mathbf{d}_i \\, p_i(\\mathbf{x}) + 2 \\sum_{j=1}^{n} p_j(\\mathbf{x}) \\, q_j \\, \\beta_j (\\mathbf{a}_j - \\mathbf{x}) - 2 \\phi(\\mathbf{x}) \\sum_{j=1}^{n} p_j(\\mathbf{x}) \\, \\beta_j (\\mathbf{a}_j - \\mathbf{x}). \\qquad (6)$$\n\n4 Adaptive Control\n\nBy combining the equations from the last two sections, we can construct a recurrent learning scheme for RBF networks in a similar fashion to what has been done with MLP networks. To demonstrate the utility of our technique, we have chosen a well-studied nonlinear plant that has been successfully modeled and controlled by using non-neural techniques. Specifically, we will use the H\u00e9non map as a plant, which has been the focus of much of the research of OGY [6]. We also adopt some of their notation and experimental constraints.\n\n4.1 The H\u00e9non Map\n\nThe H\u00e9non map [2] is described by the equations\n\n$$x_{n+1} = A - x_n^2 + B y_n, \\qquad (7)$$\n$$y_{n+1} = x_n, \\qquad (8)$$\n\nwhere $A = A_0 + p$ and $p$ is a control parameter that may be modified at each time step to coerce the plant into a desirable state. For all simulations we set $A_0 = 1.29$ and $B = 0.3$, which gives the above equations a chaotic attractor that also contains an unstable fixed point. 
Our goal is to train a CNLS network that can locate and drive the map into the unstable fixed point and keep it there with only a minimal amount of information about the plant and by using only small values of $p$.\n\nThe unstable fixed point $(x_F, y_F)$ in Equations 7 and 8 can be easily calculated as $x_F = y_F \\approx 0.838486$. Forcing the H\u00e9non map to the fixed point is trivial if the controller is given unlimited control of the parameter. To make the problem more realistic we define $p^*$ as the maximum magnitude that $p$ can take and use the rule below on the left\n\n$$p_n = \\begin{cases} p & \\text{if } |p| < p^* \\\\ p^* & \\text{if } p > p^* \\\\ -p^* & \\text{if } p < -p^* \\end{cases} \\qquad\\qquad p_n = \\begin{cases} p & \\text{if } |p| < p^* \\\\ 0 & \\text{if } |p| > p^* \\end{cases}$$\n\nwhile OGY use the rule on the right. The reason we avoid the second rule is that it cannot be modeled by a CNLS network with any precision since it is step-like.\n\nThe next task is to define what it means to \"control\" the H\u00e9non map. Having analytical knowledge of the fixed point in the attractor would make the job of the controller much easier, but this is unrealistic in the case where the dynamics of the plant to control are unknown. Instead, we use an error function that simply compares the current state of the plant with an average of previous states:\n\n$$e_t = \\frac{1}{2} \\left[ (x_t - \\langle x \\rangle_\\tau)^2 + (y_t - \\langle y \\rangle_\\tau)^2 \\right], \\qquad (9)$$\n\nwhere $\\langle \\cdot \\rangle_\\tau$ is the average of the last $\\tau$ values of its argument. This function approaches zero when the map is in a fixed point for time length greater than $\\tau$. This function requires no special knowledge about the dynamics of the plant, yet it still enforces our constraint of driving the map into a fixed point.\n\nThe learning algorithm also requires the partial derivatives of the error function with respect to the plant state variables, which are $\\partial e_t / \\partial x_t = x_t - \\langle x \\rangle_\\tau$ and $\\partial e_t / \\partial y_t = y_t - \\langle y \\rangle_\\tau$. These two equations and the objective function are the only special purpose equations used for this problem. 
All other equations generalize from the derivation of the algorithm. Additionally, since the \"output\" representation (as discussed earlier) is identical to the state representation, training on a distinct output function is not strictly necessary in this case. Thus, we simplify the problem by only using a single additional model for the unknown next-state function of the H\u00e9non map.\n\n4.2 Simulation\n\nTo facilitate comparison between alternate control techniques, we now introduce the term $\\epsilon \\delta_t$, where $\\delta_t$ is a random variable and $\\epsilon$ is a small constant which specifies the intensity of the noise. We use a Gaussian distribution for $\\delta_t$ such that the distribution has a zero mean, is independent, and has a variance of one. In keeping with [6], we discard any values of $\\delta_t$ which are greater in magnitude than 10. For training we set $\\epsilon = 0.038$. However, for tests on the real controller, we will show results for several values of $\\epsilon$.\n\nFigure 1: Experimental results from training a neural controller to drive the H\u00e9non map into a fixed point. From (a) to (f), the values of $\\epsilon$ are 0.035, 0.036, 0.038, 0.04, 0.05, and 0.06, respectively. The top row corresponds to identical experiments performed in [6].\n\nWe add the noise in two places. First, when training the model, we add noise to the target output of the model (the next state). Second, when testing the controller on the real H\u00e9non map, we add the noise to the input of the plant (the previous state). In the second case, we consider the noise to be an artifact of our fictional measurements; that is, the plant evolves from the previous noise-free state. 
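The plant, the saturating control rule, the truncated noise, and the moving-average error described in Sections 4.1 and 4.2 can be sketched as follows. This is a hedged illustration, not the authors' code. It assumes the map in the form x' = A - x^2 + B*y, y' = x with A = A0 + p, which is the form that reproduces the quoted fixed point value of about 0.838486. The function names, the window length tau = 5, and the random seed are hypothetical choices made only for this sketch.

```python
import math
import random
from collections import deque

A0, B = 1.29, 0.3          # parameter values quoted in the text

def henon_step(x, y, p=0.0):
    # Assumed form of the map: x' = A - x^2 + B*y, y' = x, with A = A0 + p
    return (A0 + p) - x * x + B * y, x

def clip_control(p, p_max):
    # Saturating rule: keep p inside [-p_max, p_max]
    return max(-p_max, min(p_max, p))

def truncated_noise(rng, bound=10.0):
    # Zero-mean, unit-variance Gaussian sample; redraw anything beyond the bound
    while True:
        d = rng.gauss(0.0, 1.0)
        if abs(d) <= bound:
            return d

def moving_average_error(history, x, y):
    # e_t = 0.5 * [(x - <x>)^2 + (y - <y>)^2], averages taken over the stored states
    mean_x = sum(h[0] for h in history) / len(history)
    mean_y = sum(h[1] for h in history) / len(history)
    return 0.5 * ((x - mean_x) ** 2 + (y - mean_y) ** 2)

# Fixed point with x = y: positive root of x^2 + (1 - B) x - A0 = 0
x_fixed = (-(1.0 - B) + math.sqrt((1.0 - B) ** 2 + 4.0 * A0)) / 2.0

rng = random.Random(0)               # seed is arbitrary
eps = 0.038                          # noise intensity used for training in the text
history = deque(maxlen=5)            # tau = 5 is an arbitrary illustrative choice
x, y = x_fixed, x_fixed
for _ in range(5):
    history.append((x, y))
    x, y = henon_step(x, y)          # noise-free evolution with p = 0
error_at_fixed_point = moving_average_error(history, x, y)
noisy_observation = x + eps * truncated_noise(rng)   # noise enters only the measurement
```

Starting exactly on the fixed point, the error stays numerically at zero for these few steps, which is the property the objective function relies on. Away from the fixed point the same function measures how far the current state is from its recent average, without using the location of the fixed point.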
\n\nTraining the controller is done in two stages: an off-line stage to tune the model and an on-line stage to tune the controller. To train the model we randomly pick a starting state within the region (-1.5, 1.5) for the two state variables. We then iterate the map for one hundred cycles with $p = 0$ so that the points will converge onto the chaotic attractor. Next, we randomly pick a value for $p$ in the range of $(-p^*, p^*)$. The last state from the iteration is combined with this control parameter to compute a target state. We then add the noise to the new state values. Thus, the model input consists of a clean previous state and a control parameter and the target values consist of the noisy next state. We compute 100 training patterns in this manner. Using the prediction learning algorithm for the CNLS network we train the model network on each of the 100 patterns (in random order) for 30 epochs. The model quickly converges to a low average error.\n\nFigure 2: Experimental results from [6]. From left to right, the values of $\\epsilon$ are 0.035, 0.036, and 0.038, respectively.\n\nIn the next stage, we use the model network to train the controller network in two ways. First, the model acts as the plant for the purposes of computing a next state. Second, we differentiate the model for values needed for the RTRL algorithm. We train the controller for 30 epochs, where each epoch consists of 50 cycles. At the beginning of each epoch we initialize the plant state to some random values (not necessarily on the chaotic attractor) and set the recurrent history matrix, $\\Gamma_t$, to zero. Then, for each cycle, we feed the previous state into the controller as input. 
This produces a control parameter which is fed along with the previous state as input into the model network, which in turn produces the next state. This next state is fed into the error function to produce the error signal. At this point we compute all of the necessary values to train the controller for that cycle while maintaining the history matrix.\n\nIn this way, we train both the model and control networks with only 100 data points, since the controller never sees any of the real values from the H\u00e9non map but only estimates from the model. For this experiment both the control and model RBF networks consist of 40 basis functions.\n\n4.3 Summary\n\nOur results are summarized by Figure 1. As can be seen, the controller is able to drive the H\u00e9non map into the fixed point very rapidly and it is capable of keeping it there for an extended period of time without transients. As the level of noise is increased, it can be seen that the plant remains under control for quite some time. The first visible spike can be observed when $\\epsilon = 0.04$.\n\nThese results are an improvement over the results generated from the best non-neural technique available for two reasons. First, the neural controller that we have trained is capable of driving the H\u00e9non map into a fixed point with far fewer transients than other techniques. Specifically, alternate techniques, as illustrated in Figure 2, experience numerous spikes in the map for values of $\\epsilon$ for which our controller is spike-free (0.035 - 0.038). Second, our training technique has smaller data requirements and uses less special purpose information. For example, the RBF controller was trained with only 100 data points compared to 500 for the non-neural technique. Additionally, non-neural techniques will typically estimate the location of the fixed point with an initial data set. 
In the case of [6] it was assumed that the fixed point could be easily discovered by some technique, and as a result all of their experiments rely on the true (hard-coded) fixed point. This, of course, could be discovered by searching the input space on the RBF model, but we have instead allowed the controller to discover this feature on its own.\n\n5 Conclusion and Future Directions\n\nA crucial component of the success of our approach is the objective function that measures the distance between the current state and the nearest time average. The reason why this objective function works is that during the control stage the learning algorithm is minimizing only a small distance between the current point and the \"moving target.\" This is in contrast to minimizing the large distance between the current point and the target point, which usually causes unstable long time correlation in chaotic systems and ruins the learning. The carefully designed recurrent learning algorithm and the extended RBF network also contribute to the success of this approach. Our results seem to indicate that RBF networks hold great promise in recurrent systems. However, further study must be done to understand why and how NNs could provide more useful schemes to control real world chaos.\n\nAcknowledgements\n\nWe gratefully acknowledge helpful comments from and discussions with Chris Barnes, Lee Giles, Roger Jones, Ed Ott, and James Reggia. This research was supported in part by AFOSR grant number F49620-92-J-0519.\n\nReferences\n\n[1] A. Garfinkel, M.L. Spano, and W.L. Ditto. Controlling cardiac chaos. Science, 257(5074):1230, August 1992.\n\n[2] M. H\u00e9non. A two-dimensional mapping with a strange attractor. Communications in Mathematical Physics, 50:69-77, 1976.\n\n[3] R.D. Jones, Y.C. Lee, C.W. Barnes, G.W. Flake, K. Lee, P.S. Lewis, and S. Qian. 
\nFunction approximation and time series prediction with neural network. In Proceedings of the International Joint Conference on Neural Networks, 1990.\n\n[4] M.I. Jordan and D.E. Rumelhart. Forward models: Supervised learning with a distal teacher. Technical Report Occasional Paper #40, MIT Center for Cognitive Science, 1990.\n\n[5] J. Moody and C. Darken. Fast learning in networks of locally-tuned processing units. Neural Computation, 1:281-294, 1989.\n\n[6] E. Ott, C. Grebogi, and J.A. Yorke. Controlling chaotic dynamical systems. In D.K. Campbell, editor, CHAOS: Soviet-American Perspectives on Nonlinear Science, pages 153-172. American Institute of Physics, New York, 1990.\n\n[7] F.J. Pineda. Generalization of back-propagation to recurrent neural networks. Physical Review Letters, 59:2229-2232, 1987.\n\n[8] W.H. Press, B.P. Flannery, S.A. Teukolsky, and W.T. Vetterling. Numerical Recipes. Cambridge University Press, Cambridge, 1986.\n\n[9] R.J. Williams and D. Zipser. Experimental analysis of the real-time recurrent learning algorithm. Connection Science, 1:87-111, 1989.\n\n", "award": [], "sourceid": 872, "authors": [{"given_name": "Gary", "family_name": "Flake", "institution": null}, {"given_name": "Guo-Zhen", "family_name": "Sun", "institution": null}, {"given_name": "Yee-Chun", "family_name": "Lee", "institution": null}]}