{"title": "Neural Network Based Model Predictive Control", "book": "Advances in Neural Information Processing Systems", "page_first": 1029, "page_last": 1035, "abstract": null, "full_text": "Neural Network Based Model Predictive Control \n\nStephen Piche \nPavilion Technologies \nAustin, TX 78758 \nspiche@pav.com \n\nJim Keeler \nPavilion Technologies \nAustin, TX 78758 \njkeeler@pav.com \n\nGreg Martin \nPavilion Technologies \nAustin, TX 78758 \ngmartin@pav.com \n\nGene Boe \nPavilion Technologies \nAustin, TX 78758 \ngboe@pav.com \n\nDoug Johnson \nPavilion Technologies \nAustin, TX 78758 \ndjohnson@pav.com \n\nMark Gerules \nPavilion Technologies \nAustin, TX 78758 \nmgerules@pav.com \n\nAbstract \n\nModel Predictive Control (MPC), a control algorithm which uses an optimizer to solve for the optimal control moves over a future time horizon based upon a model of the process, has become a standard control technique in the process industries over the past two decades. In most industrial applications, a linear dynamic model developed using empirical data is used even though the process itself is often nonlinear. Linear models have been used because of the difficulty in developing a generic nonlinear model from empirical data and the computational expense often involved in using nonlinear models. In this paper, we present a generic neural network based technique for developing nonlinear dynamic models from empirical data and show that these models can be efficiently used in a model predictive control framework. This nonlinear MPC based approach has been successfully implemented in a number of industrial applications in the refining, petrochemical, paper and food industries. Performance of the controller on a nonlinear industrial process, a polyethylene reactor, is presented. 
\n\n1 Introduction \n\nModel predictive control has become the standard technique for supervisory control in the process industries, with over 2,000 applications in the refining, petrochemicals, chemicals, pulp and paper, and food processing industries [1]. Model predictive control was developed in the late 1970s and came into widespread use, particularly in the refining industry, in the 1980s. The economic benefit of this approach to control has been documented [1,2]. \n\nSeveral factors have contributed to the widespread use of MPC in the process industries: \n\n1. Multivariate Control: Industrial processes are typically coupled multiple-input multiple-output (MIMO) systems. MIMO control can be implemented using MPC. \n\n2. Constraints: Constraints on the inputs and outputs of a process due to safety considerations are common in the process industries. These constraints can be integrated into the control calculation using MPC. \n\n3. Sampling Period: Unlike systems in other industries such as automotive or aerospace, the open-loop settling times for many processes are on the order of hours rather than milliseconds. These slow settling times translate to sampling periods on the order of minutes. Because the sampling period is sufficiently long, the complex optimization calculations that are required to implement MPC can be solved at each sampling period. \n\n4. Commercial Tools: Commercial tools that facilitate model development and controller implementation have allowed proliferation of MPC in the process industries. \n\nUntil recently, industrial applications of MPC have relied upon linear dynamic models even though most processes are nonlinear. MPC based upon linear models is acceptable when the process operates at a single setpoint and the primary use of the controller is the rejection of disturbances. 
However, many chemical processes, including polymer reactors, do not operate at a single setpoint. These processes are often required to operate at different setpoints depending upon the grade of the product that is to be produced. Because these processes operate over the nonlinear range of the system, linear MPC often results in poor performance. To properly control these processes, a nonlinear model is needed in the MPC algorithm. \n\nThis need for nonlinear models in MPC is well recognized. A number of researchers and commercial companies have developed both simulation and industrial applications using a variety of different technologies, including both first principles and empirical approaches such as neural networks [3,4]. Although a variety of different models have been developed, they have not been practical for wide scale industrial application. On one hand, nonlinear models built using first principles techniques are expensive to develop and are specific to a process. On the other hand, many empirically based nonlinear models are not appropriate for wide scale use because they require costly plant tests in multiple operating regions or because they are too computationally expensive to use in a real-time environment. \n\nThis paper presents a nonlinear model that has been developed for wide scale industrial use. It is an empirical model based upon a neural network which is developed using plant test data from a single operating region and historical data from all regions. This is in contrast to the usual approach of using plant test data from multiple regions. This model has been used on over 50 industrial applications and was recognized in a recent survey paper on nonlinear MPC as the most widely used nonlinear MPC controller in the process industries [1]. 
\n\nAfter providing a brief overview of model predictive control in the next section, we present details on the formulation of the nonlinear model. After describing the model, an industrial application is presented that validates the usefulness of the nonlinear model in an MPC algorithm. \n\n2 Model Predictive Control \n\nModel predictive control is based upon solving an optimization problem for the control actions at each sampling interval. Using MPC, an optimizer computes future control actions that minimize the difference between the output predicted by a model of the process and the desired performance over a time horizon (typically the time horizon is greater than the open-loop settling time of the process). For example, given a linear model of the process, \n\n(1) \n\nwhere u(t) represents the input to the process, the optimizer may be used to minimize an objective function at time t, \n\n$$J = \\sum_{i=1}^{T} \\left( (y_{t+i} - \\bar{y}_{t+i})^2 + (u_{t+i} - u_{t+i-1})^2 \\right) \\qquad (2)$$ \n\nwhere $\\bar{y}_t$ is the desired setpoint for the output and $T$ is the length of the time horizon. In addition to minimizing an objective function, the optimizer must satisfy a set of constraints. For example, it is common to place upper and lower bounds on the inputs as well as bounds on the rate of change of the input, \n\n$$u_{upper} \\geq u_{t+i} \\geq u_{lower} \\quad \\forall \\; 1 \\leq i \\leq T \\qquad (3)$$ \n\n$$\\Delta u_{upper} \\geq u_{t+i} - u_{t+i-1} \\geq \\Delta u_{lower} \\quad \\forall \\; 1 \\leq i \\leq T \\qquad (4)$$ \n\nwhere $u_{upper}$ and $u_{lower}$ are the upper and lower input bounds while $\\Delta u_{upper}$ and $\\Delta u_{lower}$ are the upper and lower rate of change bounds. After the trajectory of future control actions is computed, only the first value in the trajectory is sent as a setpoint to the actuators. The optimization calculation is re-run at each sampling interval using a model which has been updated using feedback. 
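
The receding-horizon procedure described above can be sketched in a few lines of Python. The sketch is illustrative only: the first-order plant model, its coefficients, the horizon, the bounds and the crude coordinate search are all assumptions made for this example, and the search merely stands in for a proper nonlinear programming solver. It evaluates the objective of (2), keeps candidate moves inside the bounds of (3) and (4) by clipping, and applies only the first move of each plan.

```python
# Receding-horizon MPC sketch for objective (2) with constraints (3) and (4).
# Model, coefficients, horizon, bounds and search method are hypothetical.

def simulate(y0, u_seq, a=0.8, b=0.2):
    '''Assumed first-order linear model: y[t+1] = a*y[t] + b*u[t].'''
    ys, y = [], y0
    for u in u_seq:
        y = a * y + b * u
        ys.append(y)
    return ys

def cost(y0, u_prev, u_seq, setpoint):
    '''Objective (2): squared tracking error plus squared input moves.'''
    j, last = 0.0, u_prev
    for y, u in zip(simulate(y0, u_seq), u_seq):
        j += (y - setpoint) ** 2 + (u - last) ** 2
        last = u
    return j

def make_feasible(u_prev, u_seq, u_lo, u_hi, du_lo, du_hi):
    '''Clip the moves in order so that bounds (3) and rate limits (4) hold.'''
    out, last = [], u_prev
    for u in u_seq:
        lo = max(u_lo, last + du_lo)
        hi = min(u_hi, last + du_hi)
        last = min(max(u, lo), hi)
        out.append(last)
    return out

def optimize(y0, u_prev, setpoint, horizon, bounds):
    '''Crude coordinate search over the input trajectory (stand-in solver).'''
    u = make_feasible(u_prev, [u_prev] * horizon, *bounds)
    best = cost(y0, u_prev, u, setpoint)
    step = 1.0
    while step > 1e-4:
        improved = False
        for k in range(horizon):
            for delta in (step, -step):
                trial = list(u)
                trial[k] += delta
                trial = make_feasible(u_prev, trial, *bounds)
                j = cost(y0, u_prev, trial, setpoint)
                if j < best - 1e-12:
                    u, best, improved = trial, j, True
        if not improved:
            step *= 0.5
    return u

def run_closed_loop(steps=30):
    '''Re-optimize at every sample and apply only the first planned move.'''
    y, u_prev, setpoint = 0.0, 0.0, 1.0
    bounds = (0.0, 2.0, -0.5, 0.5)  # u_lower, u_upper, du_lower, du_upper
    history = []
    for _ in range(steps):
        plan = optimize(y, u_prev, setpoint, horizon=10, bounds=bounds)
        u_prev = plan[0]
        y = simulate(y, [u_prev])[0]  # plant assumed identical to the model
        history.append((u_prev, y))
    return history
```

Calling run_closed_loop() returns the applied inputs and the resulting outputs. In this sketch the plant is assumed to match the model exactly, so no feedback correction of the model is shown.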
\n\nThe form of the model, the objective function, the constraints and the type of optimizer have been active areas of research over the past two decades. A number of excellent survey papers on MPC cover these topics [1,2,4]. As discussed above, we have selected a MIMO nonlinear model which is presented in the next section. Although the objective function given above contains two terms (desired output and input move suppression), the objective function used in our implementation contains thirteen separate terms. (The details of the objective function are beyond the scope of this paper.) Our implementation uses the constraints given above in (3) and (4). Because we use nonlinear models, a nonlinear programming technique must be used to solve the optimization problem. We use LS-GRG, which is a reduced gradient solver [5]. \n\n3 A Generic and Parsimonious Nonlinear Model \n\nFor a nonlinear model to achieve widespread industrial use, the model must be parsimonious so that it can be efficiently used in an optimization problem. Furthermore, it must be developed from limited process data. As discussed below, the nonlinear model we use is composed of a combination of a nonlinear steady state model and a linear dynamic model which can be derived from available data. The method of combining the models results in a parsimonious nonlinear model. \n\n3.1 Process data and component models \n\nThe quantity and quality of available data ultimately determine the structure of an empirical model. In developing our models, the available data dictated the type of model that could be created. In the process industries, two types of data are available: \n\n1. Historical data: The values of the inputs and outputs of most processes are saved at regular intervals to a database. 
Furthermore, most processing companies retain historical data associated with their plant for several years. \n\n2. Plant tests: Open-loop testing is a well accepted practice for determining the process dynamics for implementation of MPC. However, open-loop testing in multiple operating regions is not well accepted and is impractical in most cases even if it were accepted. \n\nMost MPC practitioners have used plant test data and ignored historical data. Practitioners have ignored the historical data in the past because it was difficult to extract and preprocess the data and to build models from it. Historical data was also viewed as not useful because it was collected in closed-loop and therefore process dynamics could not be extracted in many cases. Using only the plant test data, the practitioner is limited to linear dynamic models. \n\nWe chose to use the historical data because it can be used to create nonlinear steady state models of processes that operate at multiple setpoints. Combining the nonlinear steady state model with linear dynamic models from the plant test data provides a generic approach to developing nonlinear models. \n\nTo facilitate the development of nonlinear models, a suite of tools has been developed for data extraction and preprocessing as well as model training. The nonlinear steady state models, \n\n$$y_{ss} = NN_{ss}(u) \\qquad (5)$$ \n\nare implemented by a feedforward neural network and trained using variants of the backpropagation algorithm [6]. The developer has a great deal of flexibility in determining the architecture of the network, including the ability to select which inputs affect which outputs. Finally, an algorithm for specifying bounds on the gain (Jacobian) of the model has recently been implemented [7]. 
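
The steady state model of (5) and its gain can be illustrated with a small sketch. The network below (one input, three tanh hidden units, one output) and its weights are hypothetical values chosen for this example, not a trained plant model. It shows how the gain, the derivative of the steady state output with respect to the input, follows analytically from the network and changes with the operating point.

```python
import math

# Hypothetical weights for a 1-input, 3-hidden-unit, 1-output network.
W1 = [0.5, -1.2, 0.8]   # input-to-hidden weights
B1 = [0.0, 0.3, -0.5]   # hidden biases
W2 = [1.0, -0.7, 0.4]   # hidden-to-output weights
B2 = 0.1                # output bias

def nn_ss(u):
    '''Steady state model y_ss = NN_ss(u) with a tanh hidden layer.'''
    return B2 + sum(w2 * math.tanh(w1 * u + b1)
                    for w1, b1, w2 in zip(W1, B1, W2))

def gain(u):
    '''Analytic gain d y_ss / d u at the operating point u.'''
    return sum(w2 * w1 * (1.0 - math.tanh(w1 * u + b1) ** 2)
               for w1, b1, w2 in zip(W1, B1, W2))

def gain_fd(u, h=1e-6):
    '''Central finite-difference estimate, used only as a sanity check.'''
    return (nn_ss(u + h) - nn_ss(u - h)) / (2.0 * h)

if __name__ == '__main__':
    for u in (0.0, 1.0, 2.0):
        print(u, nn_ss(u), gain(u), gain_fd(u))
```

With these assumed weights the gain at u = 0 is larger than the gain at u = 2, which is the kind of operating-point dependence a single linear model cannot represent.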
\n\nBecause of limited plant test data, the dynamic models are restricted to second order models with input time delay, \n\n$$y_t = -a_1 y_{t-1} - a_2 y_{t-2} + b_1 u_{t-d-1} + b_2 u_{t-d-2} \\qquad (6)$$ \n\nThe parameters of (6) are identified by minimizing the squared error between the model and the plant test data. To prevent a biased estimate of the parameters, the identification problem is solved using an optimizer because of the correlation in the model inputs [8]. Tools for selecting the identification regions and viewing the results are provided. \n\n3.2 Combining the nonlinear steady state and dynamic models \n\nA variety of techniques exist for combining nonlinear steady state and linear dynamic models. The dynamic models can be used to either preprocess the inputs or postprocess the outputs of the steady state model. These models, referred to as Wiener and Hammerstein models respectively [8], contain a large number of parameters and are computationally expensive in an optimization problem when the model has many inputs and outputs. These models, when based upon neural networks, also extrapolate poorly. \n\nGain scheduling is often used to combine nonlinear steady state models and linear dynamic models. Using a neural network steady state model, the gain at the current operating point, $u_i$, \n\n$$g_i = \\left. \\frac{\\partial y_{ss}}{\\partial u} \\right|_{u=u_i} \\qquad (7)$$ \n\nis used to update the gain of the linear dynamic model of (6), \n\n$$\\Delta y_t = -a_1 \\Delta y_{t-1} - a_2 \\Delta y_{t-2} + v_1 \\Delta u_{t-d-1} + v_2 \\Delta u_{t-d-2} \\qquad (8)$$ \n\nwhere \n\n$$v_1 = b_1 g_i \\frac{1 + a_1 + a_2}{b_1 + b_2} \\qquad (9)$$ \n\n$$v_2 = b_2 g_i \\frac{1 + a_1 + a_2}{b_1 + b_2} \\qquad (10)$$ \n\nThe difference equation is linearized about the point $u_i$ and $y_i = NN_{ss}(u_i)$; thus, $\\Delta y = y - y_i$ and $\\Delta u = u - u_i$. To simplify the equations above, a single-input single-output (SISO) system is used. 
Gain scheduling results in a parsimonious model that is efficient to use in the MPC optimization problem; however, because this model does not incorporate information about the gain over the entire trajectory, its use leads to suboptimal performance in the MPC algorithm. \n\nOur nonlinear model approach remedies this problem. By solving a steady state optimization problem whenever a setpoint change is made, it is possible to compute the final steady state values of the inputs, $u_f$. Given the final steady state input values, the gain associated with the final steady state can be computed. For a SISO system, this gain is given by \n\n$$g_f = \\left. \\frac{\\partial y_{ss}}{\\partial u} \\right|_{u=u_f} \\qquad (11)$$ \n\nUsing the initial and final gain associated with a setpoint change, the gain structure over the entire trajectory can be approximated. This two-point gain scheduling overcomes the limitations of regular gain scheduling in MPC algorithms. \n\nCombining the initial and final gain with the linear dynamic model, a quadratic difference equation is derived for the overall nonlinear model, \n\n$$\\Delta y_t = -a_1 \\Delta y_{t-1} - a_2 \\Delta y_{t-2} + v_1 \\Delta u_{t-d-1} + v_2 \\Delta u_{t-d-2} + w_1 \\Delta u_{t-d-1}^2 + w_2 \\Delta u_{t-d-2}^2 \\qquad (12)$$ \n\nwhere \n\n$$w_1 = \\frac{b_1 (1 + a_1 + a_2)(g_f - g_i)}{(b_1 + b_2)(u_f - u_i)} \\qquad (13)$$ \n\n$$w_2 = \\frac{b_2 (1 + a_1 + a_2)(g_f - g_i)}{(b_1 + b_2)(u_f - u_i)} \\qquad (14)$$ \n\nand $v_1$ and $v_2$ are given by (9) and (10). Use of the gain at the final steady state introduces the last two terms of (12). This model allows the incorporation of gain information over the entire trajectory in the MPC algorithm. The gain of (12) at $u_i$ is $g_i$, while at $u_f$ it is $g_f$. Between the two points, the gain is a linear combination of $g_i$ and $g_f$. For processes with large gain changes, such as polymer reactors, this can lead to dramatic improvements in MPC controller performance. \n\nAn additional benefit of using the model of (12) is that we allow the user to bound the initial and final gain and thus control the amount of nonlinearity used in the model. 
For practitioners who are used to implementing MPC with linear models, using gain bounds allows them to transition from linear to nonlinear models. This ability to control the amount of nonlinearity used in the model has been important for acceptance of this new model in many applications. Finally, bounding the gains can be used to guarantee extrapolation performance of the model. \n\nThe nonlinear model of (12) fits the criteria needed in order to allow widespread use of nonlinear models for MPC. The model is based upon readily available data and has a parsimonious representation, allowing models with many inputs and outputs to be efficiently used in the optimizer. Furthermore, it addresses the primary nonlinearity found in processes, that being the significant change in gain over the operating region. \n\n4 Polymer Application \n\nThe nonlinear model described above has been used in a wide variety of industrial applications including Kamyr digesters (pulp and paper), milk evaporators and dryers (food processing), toluene diamine purification (chemicals), polyethylene and polypropylene reactors (polymers) and a fluid catalytic cracking unit (refining). Highlights of one such application are given below. \n\nAn MPC controller that uses the model described above has been applied to a Gas Phase High Density Polyethylene reactor at Chevron Chemical Co. in Cedar Bayou, Texas [9]. The process produces homopolymer and copolymer grades over a wide range of melt indices. Its average production rate is 230,000 tons per year. \n\nOptimal control of the process is difficult to achieve because the reactor is a highly coupled nonlinear MIMO system (7 inputs and 5 outputs). For example, a number of input-output pairs exhibit gains that vary by a factor of 10 or more over the operating region. In addition, grade changes are made every few days. During these transitions nonprime polymer is produced. Prior to commissioning these controllers, 
Prior to commissioning these controllers, \n\n\fNeural Network Based Model Predictive Control \n\n1035 \n\nthese transitions took several hours to complete. Linear and gain scheduling based \ncontroller have been tried on similar reactors and have delivered limited success. \n\nThe nonlinear model was constructed using only historical data. The nonlinear \nsteady state model was trained upon historical data from a two year period. This \ndata contained examples of all the products produced by the reactor. Accurate dy(cid:173)\nnamic models were derived both from historical data and knowledge of the process, \nthus, no step tests were conducted on the process. \n\nExcellent performance of this controller has been reported [9]. A two-fold decrease \nin the variance of the primary quality variable (melt index) has been achieved. In \naddition, the average transition time has been decreased by 50%. Unscheduled \nshutdowns which occurred previously have been eliminated. Finally, the controller, \nwhich has been on-line for two years, has gained high operator acceptance. \n\n5 Conclusion \n\nA generic and parsimonious nonlinear model which can be used in an MPC algo(cid:173)\nrithm has been presented. The model is created by combining a nonlinear steady \nstate model with a linear dynamic models. They are combined using a two-point \ngain scheduling technique. This nonlinear model has been used for control of a \nnonlinear MIMO polyethylene reactor at Chevron Chemical Co. The controller has \nalso been used in 50 other applications in the refining, chemicals, food processing \nand pulp and paper industries. \n\nReferences \n\n[1] Qin, S.J. & Badgwell, T.A. (1997) An overview of industrial model predictive control \ntechnology. In J. Kantor, C. Garcia and B. Carnahan (eds.), Chemical Process Control -\nAIChE Symposium Series, pp. 232-256. NY: AIChB. \n\n[2] Seborg, D.E. (1999) A perspective on advanced strategies for Process Control (Revis(cid:173)\nited). 
To appear in Proc. of European Control Conf. Karlsruhe, Germany. \n\n[3] Qin, S.J. & Badgwell, T.A. (1998) An overview of nonlinear model predictive control applications. Proc. IFAC Workshop on Nonlinear Model Predictive Control - Assessment and Future Directions, Ascona, Switzerland, June 3-5. \n\n[4] Meadows, E.S. & Rawlings, J.B. (1997) Model predictive control. In M. Henson and D. Seborg (eds.), Nonlinear Model Predictive Control, pp. 233-310. NJ: Prentice Hall. \n\n[5] Nash, S. & Sofer, A. (1996) Linear and Nonlinear Programming. NY: McGraw-Hill. \n\n[6] Rumelhart, D.E., Hinton, G.E. & Williams, R.J. (1986) Learning internal representations by error propagation. In D. Rumelhart and J. McClelland (eds.), Parallel Distributed Processing, pp. 318-362. Cambridge, MA: MIT Press. \n\n[7] Hartman, E. (2000) Training feedforward neural networks with gain constraints. To appear in Neural Computation. \n\n[8] Ljung, L. (1987) System Identification. NJ: Prentice Hall. \n\n[9] Goff, S., Johnson, D. & Gerules, M. (1998) Nonlinear control and optimization of a high density polyethylene reactor. Proc. Chemical Engineering Expo, Houston, June. \n\n", "award": [], "sourceid": 1788, "authors": [{"given_name": "Stephen", "family_name": "Piche", "institution": null}, {"given_name": "James", "family_name": "Keeler", "institution": null}, {"given_name": "Greg", "family_name": "Martin", "institution": null}, {"given_name": "Gene", "family_name": "Boe", "institution": null}, {"given_name": "Doug", "family_name": "Johnson", "institution": null}, {"given_name": "Mark", "family_name": "Gerules", "institution": null}]}