{"title": "Neural Networks Structured for Control Application to Aircraft Landing", "book": "Advances in Neural Information Processing Systems", "page_first": 415, "page_last": 421, "abstract": null, "full_text": "Neural Networks Structured for Control \n\nApplication to Aircraft Landing \n\nCharles Schley, Yves Chauvin, Van Henkle, Richard Golden \n\nThomson-CSP, Inc., Palo Alto Research Operations \n\n630 Hansen Way, Suite 250 \n\nPalo Alto, CA 94306 \n\nAbstract \n\nWe present a generic neural network architecture capable of con(cid:173)\ntrolling non-linear plants. The network is composed of dynamic. \nparallel, linear maps gated by non-linear switches. Using a recur(cid:173)\nrent form of the back-propagation algorithm, control is achieved \nby optimizing the control gains and task-adapted switch parame(cid:173)\nters. A mean quadratic cost function computed across a nominal \nplant trajectory is minimized along with performance constraint \npenalties. The approach is demonstrated for a control task con(cid:173)\nsisting of landing a commercial aircraft in difficult wind conditions. \nWe show that the network yields excellent performance while re(cid:173)\nmaining within acceptable damping response constraints. \n\n1 INTRODUCTION \nThis paper illustrates how a recurrent back-propagation neural network algorithm \n(Rumelhart, Hinton & Williams, 1986) may be exploited as a procedure for con(cid:173)\ntrolling complex systems. In particular. a simplified mathematical model of an \naircraft landing in the presence of severe wind gusts was developed and simulated. \nA recurrent back-propagation neural network architecture was then designed to \nnumerically estimate the parameters of an optimal non-linear control law for land(cid:173)\ning the aircraft. The performance of the network was then evaluated. \n\n1.1 A TYPICAL CONTROL SYSTEM \nA typical control system consists of a controller and a process to be controlled. 
The controller's function is to accept task inputs along with process outputs and to determine control signals tailored to the response characteristics of the process. The physical process to be controlled can be electro-mechanical, aerodynamic, etc., and generally has well defined behavior. It may be subjected to disturbances from its external environment. \n\n1.2 CONTROLLER DESIGN \n\nMany variations of both classical and modern methods to design control systems are described in the literature. Classical methods use linear approximations of the plant to be controlled and some loosely defined response specifications such as bandwidth (speed of response) and phase margin (degree of stability). Classical methods are widely used in practice, even for sophisticated control problems. Modern methods are more universal and generally assume that a performance index for the process is specified. Controllers are then designed to optimize the performance index. Our approach relates more to modern methods. \n\nNarendra and Parthasarathy (1990) and others have noted that recurrent back-propagation networks can implement gradient descent algorithms that may be used to optimize the performance of a system. The essence of such methods is first to propagate performance errors back through the process and then back through the controller to give error signals for updating the controller parameters. Figure 1 provides an overview of the interaction of a neural control law with a complex system and possible performance indices for evaluating various control laws. The components needed to train the controller are shown within the shaded box of Figure 1. \n\n[Figure 1: Neural Network Controller Design] \n\nThe objective performance measure contains factors that are written mathematically and usually represent terms such as weighted square error or other quantifiable measures. The performance constraints are often more subjective in nature and can be formulated as reward or penalty functions on categories of performance (e.g., \"good\" or \"bad\"). \n\n2 A GENERIC NON-LINEAR CONTROL ARCHITECTURE \n\nMany complex systems are in fact non-linear or \"multi-modal.\" That is, their behavior changes in fundamental ways as a function of their position in the state space. In practice, controllers are often designed for such systems by treating them as a collection of systems linearized about a \"setpoint\" in state space. A linear controller can then be determined separately for each of these system \"modes.\" These observations suggest that a reasonable approach for controlling non-linear or \"multi-modal\" systems would be to design a \"multi-modal\" control law. \n\n2.1 THE SWITCHING PRINCIPLE \n\nThe architecture of our proposed general non-linear control law for \"multi-modal\" plants is shown in Figure 2. Task inputs and process outputs are entered into multiple basic controller blocks (shown within the shaded box of Figure 2). Each basic controller block first determines a weighted sum of the task inputs and process outputs (multiplication by weights W). Then, the degree to which the weighted sum passes through the block is modified by means of a saturating switch and multiplier. 
The input to the switch is itself another weighted sum of the task inputs and process outputs (multiplication by weights V). If the input to the saturating switch is large, its output is unity and the weighted sum (weighted by W) is passed through unchanged. At the other extreme, if the saturating switch has zero output, the weighted sum of task inputs and process outputs does not appear in the output. When these basic controller blocks are replicated and their outputs are added, control signals consist of weighted sums of the controller inputs that can be selected and/or blended by the saturating switches. The overall effect is a prototypical feed-forward and feedback controller with selectable gains and multiple pathways where the overall equivalent gains are a function of the task inputs and process outputs. The resulting architecture yields a sigma-pi processing unit in the final controller (Rumelhart, Hinton & Williams, 1986). \n\n[Figure 2: Neural Network Controller Architecture] \n\n2.2 MODELLING DYNAMIC MAPPINGS \n\nWeights shown in Figure 2 may be constant and represent a static relationship between input and control. However, further controller functionality is obtained by considering the weights V and W as dynamic mappings. For example, proportional plus integral plus derivative (PID) feedback may be used to ensure that process outputs follow task inputs with adequate steady-state error and transient damping. Thus, the weights can express parameters of various generally useful control functions. These functions, when combined with the switching principle, yield rich capabilities that can be adapted to the task at hand. \n\n3 AIRCRAFT LANDING \n\nThe generic neural network architecture of Figure 2 and the associated neural network techniques were tested with a \"real-world\" application: automatic landing of an aircraft. 
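As a minimal sketch (not the paper's implementation, and with weight values of our own choosing), the basic controller block of Section 2.1 — a linear map W·x gated by a saturating switch on V·x, summed over replicated blocks into a sigma-pi unit — can be written as:

```python
import math

def sigmoid(z):
    # Saturating switch: output near 0 blocks the linear map,
    # output near 1 passes it through unchanged.
    return 1.0 / (1.0 + math.exp(-z))

def gated_control(x, blocks):
    """Sigma-pi control signal from replicated basic controller blocks.

    x      : list of task inputs and process outputs
    blocks : list of (v, w) weight-vector pairs, one pair per block
    """
    u = 0.0
    for v, w in blocks:
        gate = sigmoid(sum(vi * xi for vi, xi in zip(v, x)))  # switch on V.x
        linear = sum(wi * xi for wi, xi in zip(w, x))         # weighted sum W.x
        u += gate * linear                                    # product, then sum
    return u
```

With two blocks whose switches saturate in opposite directions, the control output is a blend of the two linear gain sets, selected by the switch inputs — the "selectable gains" behavior described above.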
Here, we describe the aircraft and environment model during landing. \n\n3.1 GLIDESLOPE AND FLARE \n\nDuring aircraft landing, the final two phases of a landing trajectory consist of a \"glideslope\" phase and a \"flare\" phase. Figure 3 shows these two phases. Flare occurs at about 45 feet. Glideslope is characterized by a linear downward slope; flare by an exponential shaped curve. When the aircraft begins flare, its response characteristics are changed to make it more sensitive to the pilot's actions, making the process \"multi-modal\" or non-linear over the whole trajectory. \n\n[Figure 3: Glideslope and Flare Geometry. At flare initiation: altitude h = h_f, altitude rate hdot = hdot_f, speed v = V_f, glideslope angle gamma_gs. At touchdown: altitude h_TD = 0, altitude rate 0 < hdot_min <= hdot_TD <= hdot_max, position x_min <= x_TD <= x_max, pitch angle theta_min <= theta_TD <= theta_max.] \n\n3.2 STATE EQUATIONS \n\nLinearized equations of motion were used for the aircraft during each phase. They are adequate during the short period of time spent during glideslope and flare. A pitch stability augmentation system and an auto-throttle were added to the aircraft state equations to damp the bare airframe oscillatory behavior and provide speed control. The function of the autoland controller is to transform information about desired and actual trajectories into the aircraft pitch command. This is input to the pitch stability augmentation system to develop the aircraft elevator angle that in turn controls the aircraft's actual pitch angle. Simplifications retain the overall quality of system response (i.e., high frequency dynamics were neglected). 
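A commanded trajectory of the shape described in Section 3.1 — a linear glideslope joined to an exponential flare at about 45 feet — can be sketched as follows; the glideslope angle, flare altitude, and decay constant here are illustrative values, not the paper's:

```python
import math

def reference_altitude(x, h_flare=45.0, gamma_deg=2.5, x_flare=0.0, tau_x=200.0):
    """Commanded altitude (ft) vs. horizontal position x (ft).

    Glideslope: linear descent at angle gamma toward the flare point x_flare.
    Flare:      exponential decay from h_flare down toward the runway.
    All parameter values are illustrative, not taken from the paper.
    """
    slope = math.tan(math.radians(gamma_deg))
    if x < x_flare:
        # upstream of the flare point: straight-line descent
        return h_flare + slope * (x_flare - x)
    # past the flare point: exponential-shaped flare curve
    return h_flare * math.exp(-(x - x_flare) / tau_x)
```

The two pieces match in altitude at the flare point, so the commanded trajectory is continuous; the change in curvature there is what makes the process "multi-modal" over the whole trajectory.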
\n\n3.3 WIND MODEL \n\nThe environment influences the process through wind disturbances represented by constant velocity and turbulence components. The magnitude of the constant velocity component is a function of altitude (wind shear). Turbulence is a stochastic process whose mean and variance are functions of altitude. For the horizontal and vertical wind turbulence velocities, the so-called Dryden spectra for spatial turbulence distribution are assumed. These are amenable to simulation and show reasonable agreement with measured data (Neuman & Foster, 1970). The generation of turbulence is effected by applying Gaussian white noise to coloring filters. \n\n4 NEURAL NETWORK LEARNING IMPLEMENTATION \n\nAs previously noted, modern control theory suggests that a performance index for evaluating control laws should first be constructed, and then the control law should be computed to optimize the performance index. Generally, numerical methods are used for estimating the parameters of a control law. Neural network algorithms can actually be seen as constituting such numerical methods (Narendra and Parthasarathy, 1990; Bryson and Ho, 1969; Le Cun, 1989). We present here an implementation of a neural network algorithm to address the aircraft landing problem. \n\n4.1 DIFFERENCE EQUATIONS \n\nThe state of the aircraft (including stability augmentation and autothrottle) can be represented by a vector X_t containing variables representing speed, angle of attack, pitch rate, pitch angle, altitude rate and altitude. The difference equations describing the dynamics of the controlled plant can be written as shown in equation 1. \n\nX_{t+1} = A X_t + B U_t + C D_t + N_t    (1) \n\nThe matrix A represents the plant dynamics and B represents the aircraft response to the control U. 
D is the desired state and N is the additive noise computed from the wind model. The switching controller can be written as in equation 2 below. Referring to Figure 2, the weight matrix V in the sigmoidal switch links actual altitude to each switch unit. The weight matrix W in the linear controller links altitude error, altitude rate error and altitude integral error to each linear unit output. \n\nU_t = P_t L_t, where P_t = sigmoidal switch output and L_t = linear controller output    (2) \n\nFigure 4 shows a recurrent network implementation of the entire system. Actual and desired states at time t+1 are fed back to the input layers. Thus, with recurrent connections between output and input, the network generates entire trajectories and is seen as a recurrent back-propagation network (Rumelhart, Hinton & Williams, 1986; Jordan & Jacobs, 1990). The network is trained using the back-propagation algorithm with given wind distributions. For the controller, we initially chose two basic PID controller blocks (see Figure 2) to represent glideslope and flare. The task of the network is then to learn the state dependent PID controller gains that optimize the cost function. \n\n[Figure 4: Recurrent Neural Network Architecture, including the additional \"damping judge\" units.] \n\n4.2 PERFORMANCE INDEX OPTIMIZATION \n\nThe basic performance measure selected for this problem was the squared trajectory error accumulated over the duration of the landing. Trajectory error corresponds to a weighted combination of altitude and altitude rate errors. 
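Equations 1 and 2 can be exercised in a toy rollout. The sketch below is not the paper's aircraft model: it stands in a double integrator for the plant, two gated PD blocks (glideslope and flare, switched on altitude as in the paper) for the controller, and a simple first-order coloring filter on Gaussian white noise for the wind term; all gains and constants are illustrative.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def rollout(T=300, seed=0):
    """Toy rollout of X_{t+1} = A X_t + B U_t + N_t with U_t = P_t L_t.

    State is (altitude above touchdown reference, altitude rate); the plant
    is a double integrator, not the linearized aircraft of the paper.
    """
    rng = random.Random(seed)
    h, hdot = 100.0, 0.0   # initial altitude (ft) and altitude rate
    n = 0.0                # colored-noise state
    dt = 0.1
    traj = []
    for _ in range(T):
        # white noise through a first-order coloring filter (Dryden-like idea)
        n = 0.9 * n + 0.1 * rng.gauss(0.0, 1.0)
        p_gs = sigmoid(0.5 * (h - 45.0))       # glideslope switch (altitude-keyed)
        p_fl = 1.0 - p_gs                      # flare switch
        l_gs = -(1.0 * h + 2.0 * hdot)         # glideslope linear block L_t
        l_fl = -(2.0 * h + 3.0 * hdot)         # flare linear block L_t
        u = p_gs * l_gs + p_fl * l_fl          # U_t: sum of gated blocks
        hdot += dt * (u + n)                   # plant update (equation 1)
        h += dt * hdot
        traj.append(h)
    return traj
```

Because the switch blends two stabilizing gain sets, the rollout descends smoothly toward zero altitude error, with the effective gains shifting as altitude crosses the 45 ft flare point — the same selection/blending mechanism the network learns.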
\nSince minimizing only the trajectory error can lead to undesirable responses (e.g., oscillatory aircraft motions), we include relative stability constraints in the performance index. Aircraft transient responses depend on the value of a damping factor. An analysis was performed by probing the plant with a range of values for controller weight parameters. The data was used to train a \"damping judge\" net to categorize \"good\" and \"bad\" damping. This net was used to construct a penalty function on \"bad\" damping. As seen in Figure 4, additional units were added for this purpose. \n\nThe main optimization problem is now stated. Given an initial state, minimize the expected value over the environmental disturbances of performance index J. \n\nJ = J_E + J_P = Trajectory Error + Performance Constraint Penalty    (3) \n\nJ_E = sum_{t=1}^{T} a_h [h_cmd,t - h_t]^2 + a_hdot [hdot_cmd,t - hdot_t]^2, where we used a_h = a_hdot = 1. \n\nJ_P = sum_{t=1}^{T} max(0, e_judge - e*_judge)(e_judge - e*_judge). Note that when e_judge <= e*_judge, there is no penalty. Otherwise, it is quadratic. \n\n5 SIMULATION EXPERIMENTS \n\nWe now describe our simulations. First, we introduce our training procedure. We then present statistical results of flight simulations for a variety of wind conditions. \n\n5.1 TRAINING PROCEDURE \n\nNetworks were initialized in various ways and trained with a random wind distribution where the constant sheared speed varied from 10 ft/sec tailwind to 40 ft/sec headwind (a strong wind). Several learning strategies were used to change the way the network was exposed to various response characteristics of the plant. The exact form of the resulting V switch weights varied, but not the equivalent gain schedules. \n\n5.2 STATISTICAL RESULTS \n\nAfter training, the performance of the network controller was tested for different wind conditions. 
Table 1 shows means and standard deviations of performance variables computed over 1000 landings for five different wind conditions. \n\n[Table 1: Landing statistics (standard deviations in parentheses), for constant wind speeds ranging from 10 ft/sec tailwind to 40 ft/sec headwind: overall performance (quadratic cost J per time step, landing time T), mean squared glideslope and flare errors on altitude and altitude rate, and touchdown performance (position x_TD, pitch angle theta_TD, altitude rate hdot_TD).] \n\nShown are values for overall performance (quadratic cost J per time step, landing time T), trajectory performance (quadratic cost J on altitude and altitude rate), and landing performance (touchdown position, pitch angle, altitude rate). 
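The costs summarized in Table 1 are accumulations of the performance index of equation 3. As a minimal sketch under our reading of that equation (function and argument names are ours):

```python
def performance_index(h_cmd, h, hdot_cmd, hdot, e_judge, e_star,
                      a_h=1.0, a_hdot=1.0):
    """J = J_E + J_P from equation 3.

    J_E: weighted squared altitude and altitude-rate tracking error over the
         trajectory (the paper uses a_h = a_hdot = 1).
    J_P: one-sided quadratic penalty, zero whenever the damping-judge output
         e_judge stays at or below its threshold e_star.
    """
    J_E = sum(a_h * (hc - hv) ** 2 + a_hdot * (hdc - hdv) ** 2
              for hc, hv, hdc, hdv in zip(h_cmd, h, hdot_cmd, hdot))
    J_P = sum(max(0.0, e - e_star) * (e - e_star) for e in e_judge)
    return J_E + J_P
```

The one-sided form of J_P is what keeps well-damped trajectories unpenalized while pushing the learned gains away from the "bad" damping region.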
\n\n5.3 CONTROL LAWS OBTAINED BY LEARNING \n\nBy examining network weights, equation 2 yields the gains of an equivalent controller over the entire trajectory (gain schedules). These gain schedules represent optimality with respect to a given performance index. Results show that the switch builds a smooth transition between glideslope and flare and provides the network controller with a non-linear distributed control law for the whole trajectory. \n\n6 DISCUSSION \n\nThe architecture we propose integrates a priori knowledge of real plants within the structure of the neural network. The knowledge of the physics of the system and its representation in the network are part of the solution. Such a priori knowledge structures are not only useful for finding control solutions, but also allow interpretations of network dynamics in terms of standard control theory. By observing the weights learned by the network, we can compute gain schedules and understand how the network controls the plant. \n\nThe augmented architecture also allows us to control damping. In general, integrating optimal control performance indices with constraints on plant response characteristics is not an easy task. The neural network approach and back-propagation learning represent an interesting and elegant solution to this problem. Other constraints on states or response characteristics can also be implemented with similar architectures. In the present case, the control gains are obtained to minimize the objective performance index while the plant remains within a desired stability region. This approach provides good damping and control gain schedules that make the plant robust to disturbances. \n\nAcknowledgements \n\nThis research was supported by the Boeing High Technology Center. 
Particular thanks are extended to Gerald Cohen of Boeing. We would also like to thank Anil Phatak for his decisive help and Yoshiro Miyata for the use of his XNet simulator. \n\nReferences \n\nBryson, A. & Ho, Y. C. (1969). Applied Optimal Control. Blaisdell Publishing Co. \n\nJordan, M. I. & Jacobs, R. A. (1990). Learning to control an unstable system with forward modeling. In D. S. Touretzky (Ed.), Neural Information Processing Systems 2. Morgan Kaufmann: San Mateo, CA. \n\nLe Cun, Y. (1989). A theoretical framework for back-propagation. In D. Touretzky, G. Hinton and T. Sejnowski (Eds.), Proceedings of the 1988 Connectionist Models Summer School. Morgan Kaufmann: San Mateo, CA. \n\nNarendra, K. & Parthasarathy, K. (1990). Identification and control of dynamical systems using neural networks. IEEE Transactions on Neural Networks, 1, 4-26. \n\nNeuman, F. & Foster, J. D. (1970). Investigation of a digital automatic aircraft landing system in turbulence. NASA Technical Note TN D-6066. NASA-Ames Research Center, Moffett Field, CA. \n\nRumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning internal representations by error propagation. In D. E. Rumelhart & J. L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition (Vol. 1). Cambridge, MA: MIT Press. \n", "award": [], "sourceid": 384, "authors": [{"given_name": "Charles", "family_name": "Schley", "institution": null}, {"given_name": "Yves", "family_name": "Chauvin", "institution": null}, {"given_name": "Van", "family_name": "Henkle", "institution": null}, {"given_name": "Richard", "family_name": "Golden", "institution": null}]}