{"title": "Learning Trajectory and Force Control of an Artificial Muscle Arm by Parallel-hierarchical Neural Network Model", "book": "Advances in Neural Information Processing Systems", "page_first": 436, "page_last": 442, "abstract": null, "full_text": "Learning Trajectory and Force Control \n\nof an Artificial Muscle Arm \n\nby Parallel-hierarchical Neural Network Model \n\nMasazumi Katayama \n\nMitsuo Kawato \n\nCognitive Processes Department \n\nATR Auditory and Visual Perception Research Laboratories \n\nSeika-cho, Soraku-gun, Kyoto 619-02, JAPAN \n\nAbstract \n\nWe propose a new parallel-hierarchical neural network model to enable motor \nlearning for simultaneous control of both trajectory and force, by integrating \nHogan's control method and our previous neural network control model using a \nfeedback-error-learning scheme. Furthermore, two hierarchical control laws \nwhich apply to the model are derived by using the Moore-Penrose pseudo-inverse \nmatrix. One is related to the minimum muscle-tension-change trajectory \nand the other is related to the minimum motor-command-change trajectory. The \nhuman arm is redundant at the dynamics level since joint torque is generated by \nagonist and antagonist muscles. Therefore, acquisition of the inverse model is \nan ill-posed problem. However, the combination of these control laws and \nfeedback-error-learning resolves the ill-posed problem. Finally, the efficiency of \nthe parallel-hierarchical neural network model is shown by learning experiments \nusing an artificial muscle arm and computer simulations. \n\n1 INTRODUCTION \nFor humans to properly interact with the environment using their arms, both arm posture \nand exerted force must be skillfully controlled. The hierarchical neural network model \nwhich we previously proposed was successfully applied to trajectory control of an \nindustrial manipulator (Kawato et al., 1987). However, this model could not directly be \napplied to force control, 
because the manipulator mechanism was essentially different \nfrom the musculo-skeletal system of a human arm. Hogan proposed a biologically \nmotivated control method which specifies both the virtual trajectory and the mechanical \nimpedance of a musculo-skeletal system (Hogan, 1984, 1985). One of its advantages is \nthat both trajectory and force can be simultaneously controlled. However, this control \nmethod does not explain motor learning. \n\nIn this paper, by integrating these two previous studies, we propose a new Parallel-Hierarchical \nNeural network Model (PHNM) using a feedback-error-learning scheme we \npreviously proposed (Kawato et al., 1987), as shown in Fig.1. PHNM explains the \nbiological motor learning for simultaneous control of both trajectory and force. Arm \nmovement depends on the static and dynamic properties of a musculo-skeletal system. \nFrom this viewpoint, its inverse model, which computes a motor command from a desired \ntrajectory and force, consists of two parallel inverse models: the Inverse Statics Model \n(ISM) and the Inverse Dynamics Model (IDM) (see Fig.1). \nThe human arm is redundant at the dynamics level since joint torque is generated by \nagonist and antagonist muscles. Therefore, acquisition of the inverse model is an ill-posed \nproblem in the sense that the muscle tensions cannot be uniquely determined from \nthe prescribed trajectory and force. The central nervous system can resolve the ill-posed \nproblem by applying suitable constraints. Based on behavioral data of human multi-joint \narm movement, Uno et al. (1989) found that the trajectory was generated on the criterion \nthat the time integral of the squared sum of the rate of change of muscle tension is \nminimized. From this point of view, we assume that the central nervous system controls \nthe arm by using two hierarchical objective functions. 
One objective function is related \nto the minimum muscle-tension-change trajectory. The other objective function is related \nto the minimum motor-command-change trajectory. From this viewpoint, we propose \ntwo hierarchical control laws which apply to the feedback controller shown in Fig.1. \nThese control laws are calculated with the Moore-Penrose pseudo-inverse matrix of the \nJacobian matrix from muscle tensions or motor commands to joint torque. The \ncombination of these control laws and the feedback-error-learning resolves the ill-posed \nproblem. As a result, the inverse model related to hierarchical objective functions can be \nacquired by PHNM. We ascertained the efficiency of PHNM by performing experiments \nin learning control using an artificial-muscle arm with agonist and antagonist muscle-like \nrubber actuators as shown in Fig.2 (Katayama et al., 1990). \n\n2 PARALLEL-HIERARCHICAL NEURAL NETWORK MODEL \nIn a simple case, the dynamics equation of a human multi-joint arm is described as \nfollows: \n\n$$R(\\theta)\\ddot{\\theta} + B(\\theta, \\dot{\\theta})\\dot{\\theta} = \\tau + G(\\theta), \\tag{1a}$$ \n\n$$\\tau = a_f(\\theta) T_f(M_f, \\theta, \\dot{\\theta}) - a_e(\\theta) T_e(M_e, \\theta, \\dot{\\theta}). \\tag{1b}$$ \n\nHere, $R(\\theta)$ is the inertia matrix, $B(\\theta, \\dot{\\theta})$ expresses a matrix of centrifugal, Coriolis and \nfriction forces and $G(\\theta)$ is the vector of joint torque due to gravity. $M_f$ and $M_e$ are agonist \nand antagonist motor commands, $T_f$ and $T_e$ are agonist and antagonist muscle tensions, $\\theta$ \nis the joint angle, $\\tau$ is joint torque generated from the tensions of a pair of muscles and \n$a_f(\\theta)$ and $a_e(\\theta)$ are moment arms. \n\nIf the arm is static ($\\dot{\\theta} = \\ddot{\\theta} = 0$), (1a) and (1b) are reduced to the following: \n\n$$0 = a_f(\\theta) T_f(M_f, \\theta, 0) - a_e(\\theta) T_e(M_e, \\theta, 0) + G(\\theta). \\tag{2}$$ \n\nTherefore, (2) is a statics equation. The problem of calculating the motor commands \nfrom joint angles based on (2) is called the inverse statics. There are two difficulties: \nfirst, (2), which includes nonlinear functions ($a_f$, $a_e$, $T_f$, $T_e$ and $G$), must be solved. 
Second, \nthe inverse statics is an ill-posed problem as mentioned above. These difficulties are \nresolved by the ISM. The problem of computing dynamic torque other than (2) is called \nthe inverse dynamics and it is resolved by the IDM. \n\nFigure 1: Parallel-Hierarchical Neural Network Model (block diagram; labels: Desired Trajectory and Force, $M_{ism}$, $M_{idm}$, $M_{fc}$, Motor Command, Realized Trajectory and Force $\\theta_r$, $F_r$) \n\nThe main role of the ISM is to \ncontrol the equilibrium posture and mechanical stiffness (Hogan, 1984), and that of the \nIDM is to compensate for dynamic properties of the arm in fast movements. PHNM, in \naddition to a feedback controller, hierarchically arranges these parallel inverse models. \nThe motor command is the sum of three outputs ($M_{ism}$, $M_{idm}$ and $M_{fc}$) calculated by the \nISM, the IDM and the feedback controller, respectively, as shown in Fig.1. The outputs \nfrom the ISM and IDM are calculated by feedforward neural networks with synaptic \nweights $w$ from desired trajectory $\\theta_d$ and desired force $F_d$. These neural network models \ncan be described as the mapping from inputs $\\theta_d$ and $F_d$ to motor commands. In order to \nacquire the parallel inverse model, synaptic weights change according to the following \nfeedback-error-learning algorithm. \n\n$$\\frac{dw}{dt} = \\left(\\frac{\\partial \\Psi}{\\partial w}\\right)^{T} M_{fc} \\tag{3}$$ \n\nThe ISM learns when the arm is static and the IDM learns when it is moving. The \nfeedback motor command $M_{fc}$ is fed only to the ISM when $\\dot{\\theta} = 0$ and only to the IDM \nwhen $\\dot{\\theta} \\neq 0$ as an error signal for synaptic modification. The arm is mainly controlled by \nthe feedback controller before learning, and the feedforward control is basically performed \nonly by the parallel inverse model after learning because the output $M_{fc}$ of the feedback \ncontroller is minimized after learning. 
Two control laws which apply to the feedback \ncontroller are derived below. \n\n3 HIERARCHICAL CONTROL MECHANISM \nIn order to acquire the parallel inverse models related to hierarchical objective functions, \nwe propose two control laws reducing the redundancy at the dynamics level, which apply \nto a feedback controller in the PHNM. \n3.1 MATHEMATICAL MUSCLE MODEL \nTensions ($T_f$, $T_e$) of agonist and antagonist muscles are generally modeled as follows: \n\n$$T_f = K(M_f)\\{\\theta_{0,f}(M_f) - \\theta\\} - B(M_f)\\dot{\\theta}, \\tag{4a}$$ \n\n$$T_e = K(M_e)\\{\\theta - \\theta_{0,e}(M_e)\\} + B(M_e)\\dot{\\theta}. \\tag{4b}$$ \n\nHere, $M$ consists of $M_f$ and $M_e$ for agonist and antagonist muscles, respectively. The \nmechanical impedance of a human arm can be manipulated by the stiffness $K(M)$ and \nviscosity $B(M)$ of the muscle itself, depending on their motor commands. $\\theta_{0,f}(M_f)$ and \n$\\theta_{0,e}(M_e)$ are joint angles at equilibrium position. $K(M)$, $B(M)$, $\\theta_{0,f}(M_f)$ and $\\theta_{0,e}(M_e)$ are \napproximately given as $K(M) = k_0 + kM$, $B(M) = b_0 + bM$, $\\theta_{0,f}(M_f) = \\theta_0 + cM_f$ and \n$\\theta_{0,e}(M_e) = -\\theta_0 - cM_e$, respectively. $k$ and $b$ are coefficients which, respectively, \ndetermine elasticity and viscosity. $k_0$ and $b_0$ are intrinsic elasticity and viscosity, \nrespectively. $\\theta_0$ is the intrinsic equilibrium angle and $c$ is a constant. Small changes in \njoint torque are expressed by using the Jacobian matrix $A$ from small changes in motor \ncommand to small changes in joint torque. Therefore, by using the Moore-Penrose \npseudo-inverse matrix $A^{\\#}$, small changes in motor command are calculated as follows: \n\n$$\\begin{pmatrix} \\Delta M_f \\\\ \\Delta M_e \\end{pmatrix} = A^{\\#} \\Delta\\tau = \\frac{1}{a_f(\\theta)^2 (C + g_f)^2 + a_e(\\theta)^2 (C - g_e)^2} \\begin{pmatrix} a_f(\\theta)(C + g_f) \\\\ a_e(\\theta)(C - g_e) \\end{pmatrix} \\Delta\\tau, \\tag{5}$$ \n\nwhere $C = -(k\\theta + b\\dot{\\theta})$ and $A^{\\#} = A^{T}(A A^{T})^{-1}$. \n\n3.2 HIERARCHICAL CONTROL LAWS \nTwo feedback control laws are explained below, which apply to the feedback controller \nshown in Fig.1. 
Firstly, $\\Delta T_f = \\Delta M_f$ and $\\Delta T_e = \\Delta M_e$ are given from (4a) and (4b) by \nassuming $k = b = 0$, $c \\neq 0$, $a_f(\\theta) = a_e(\\theta) = a$ and $g_f = g_e = 1$ in the simplest case. The solution \n$A^{\\#}\\Delta\\tau$, in which the norm $(\\Delta T_f^2 + \\Delta T_e^2)^{1/2}$ of vector $\\Delta T$ is minimized by using the \npseudo-inverse matrix $A^{\\#}$, is selected. Therefore, the control law related to the minimum \nmuscle-tension-change trajectory is derived from (5). Then the feedback control law is \nacquired by using $\\Delta\\tau = K_p(\\theta_d - \\theta_r) + K_d(\\dot{\\theta}_d - \\dot{\\theta}_r) + K_f(F_d - F_r)$. Here, $K_p$, $K_d$ and $K_f$ \nare feedback gains. Learning is performed by applying the motor commands calculated by \nthis feedback control law to the learning algorithm of (3). As a result, the inverse model \nis acquired by the PHNM after learning. Only when $a_f(\\theta) = a_e(\\theta) = a$ does the inverse model \nstrictly give the optimal solution based on the minimum muscle-tension-change \ntrajectory during the movement, where $a$ is a constant moment arm. \nNext, another control law is derived from (5) in a similar way by assuming $k, b \\neq 0$, $c = 0$, $a_f(\\theta) = a_e(\\theta) = a$ and \n$g_f = g_e = 1$. In this case, the control law is related to the minimum motor-command-change \ntrajectory, because the norm $(\\Delta M_f^2 + \\Delta M_e^2)^{1/2}$ of vector $\\Delta M$ is \nminimized by using the pseudo-inverse matrix $A^{\\#}$. Then the control law explains the \nbehavioral data of rapid arm movement, during which the mechanical impedance is \nincreased by coactivation of agonist and antagonist muscles (Kurauchi et al., 1980). The \nmechanical impedance of the muscles increases when $C$ increases. Therefore, $C$ explains \nthe coactivation because $C$ increases when the arm moves rapidly. Thus, rapid arm \nmovement can be stably executed by such coactivation. It is noted that the control law \ndirectly takes account of the variable stiffness and viscosity of the muscle itself. Learning \nis performed by the same algorithm above. 
As a result, the inverse model acquired by the \nPHNM gives the approximate solution related to the minimum motor-command-change \ntrajectory, because $A^{\\#}$ depends on the joint angle in this case. Furthermore, stiffness and \nvirtual trajectory are uniquely determined from a mathematical muscle model using the outputs \nof the trained inverse models. \n\nFigure 2: Artificial Muscle Arm \n\n4 EFFICIENCY OF PHNM \nThe efficiency of the PHNM is shown by the experimental results using two hierarchical \ncontrol laws. \n4.1 ARTIFICIAL MUSCLE ARM \nThe artificial muscle arm used in our experiments is the rubber-actuator-arm (5 \ndegrees of freedom, 16 rubber actuators, made by Bridgestone Co.), as shown in Fig.2, which \nis a manipulator with agonist and antagonist muscle-like actuators. The actuators are made \nof rubber and driven by air. In our experiment, the motor command is air-pressure. \nThe mechanical structure of the artificial arm is basically the same as that of the human arm. \nMoreover, properties of the actuator are also similar to those of muscle. The actuator has \na variable mechanical impedance which consists of stiffness and viscosity. The \nstiffness, which is mechanically realized, expresses the spring-like behavior of muscle. \nThis property acts as a simple mechanical feedback system whose time delay is \"zero\". \nFurthermore, the ratio of the output torque to the weight of the arm is extremely high. \nTherefore, we hope it will be easy to control the force and trajectory at the end-effector or \njoint. However, it is difficult to control the trajectory of the arm because the artificial \narm, like the human arm, is a very nonlinear system. We note that feedforward control \nusing the trained ISM and IDM is necessary to control the arm. 
\n4.2 TRAJECTORY CONTROL OF ARTIFICIAL MUSCLE ARM \nLearning control experiments using an artificial muscle arm are performed with the \nfeedback control law related to the minimum muscle-tension-change trajectory. The ISM \nand IDM use a 3-layer perceptron. The results shown in Fig.3 indicate that the \nconventional feedback control method cannot realize accurate trajectory control, because the \nrealized trajectory lagged behind the desired trajectory. In contrast, the results shown in Fig.4a \nindicate that accurate and smooth trajectory control of a slow movement can be realized \nby feedforward control alone, using the trained ISM and IDM after learning, because the \nrealized trajectory fits the desired trajectory. Moreover, the result indicates that the PHNM \ncan resolve the ill-posed inverse problem. The results shown in Fig.4b indicate that \nlearning of the ISM and IDM is finished after about 2,000 iterations, because the output of \nthe feedback controller is minimized. Note also that the output of the ISM is greater \nthan the other outputs. Furthermore, we confirmed that by using an untrained trajectory, \nthe generalization capability of the trained parallel inverse models is good. \n\nFigure 3: Feedback Control Using Conventional Feedback Controller (desired and realized trajectories plotted against time in sec.) \n\nFigure 4: Trajectory Control Using Control Law Related to Minimum Muscle-Tension-Change Criterion (in slow movement using artificial muscle arm). (a) Feedforward Control Using Trained ISM and IDM (after learning): desired and realized trajectories; (b) Output to Agonist Actuator (after learning): outputs of the ISM, IDM and feedback controller \n\nFigure 5: Feedback Trajectory Control Using Control Law Related to Minimum Motor-command-change Criterion (in fast movement using computer simulation). (a) Trajectory: desired and realized trajectories; (b) Motor Command: agonist muscle $M_f$ and antagonist muscle $M_e$ \n\n4.3 TRAJECTORY CONTROL IN FAST MOVEMENT \nOne of the advantages of the control law related to the minimum motor-command-change \ncriterion is shown by a trajectory control experiment in fast movement. We confirmed \nthat the feedback control law allowed stable trajectory control in fast movement. Control \nexperiments were performed by computer simulation. The results shown in Fig.5a \nindicate that PHNM applying this feedback control law realizes stable trajectory control in \nrapid movement, because no oscillation can be found when the arm reaches \nthe desired position. This is because the mechanical impedance of the joint increases \nwhen a pair of muscles are coactivated (see Fig.5b). Moreover, the results also explain \nbehavioral data in fast arm movement (Kurauchi et al., 1980). 
\n\n4.4 FORCE CONTROL \nWe confirmed that the feedback control law related to the minimum motor-command-change \ncriterion succeeded in accurate force control. The results shown in Fig.6 \nindicate that accurate force control can be performed by combining the trained IDM \nand ISM with PHNM using this feedback control law. \n\nFigure 6: Force Control Using Trained ISM and IDM With Control Law Related to Minimum Motor-command-change Criterion (desired and realized force plotted against time in sec.) \n\n5 DISCUSSION \nThe ISM we proposed in this paper has two \nadvantages. The first is that it is easy to \ntrain the inverse model of the controlled \nobject because the inverse model is \nseparated into the ISM and IDM. The \nsecond is that control using the ISM explains Bizzi's experimental results with a \ndeafferented rhesus monkey (Bizzi et al., 1984). Furthermore, the control using the ISM \nrelates to Hogan's control method using the virtual trajectory (Hogan, 1984, 1985). \nThe Parallel-Hierarchical Neural network Model proposed in this paper integrates Hogan's \nimpedance control and our previous model, and hence can explain motor learning for \nsimultaneous control of both trajectory and force. There is an infinite number of possible \ncombinations of mechanical impedance and virtual trajectory that can produce the same \ntorque and force. Thus, the problem of determining the impedance and the virtual \ntrajectory was ill-posed in Hogan's framework. In the present paper, they were uniquely \ndetermined from (5). \n\nReferences \n[1] Bizzi, E., Accornero, N., Chapple, W. & Hogan, N. (1984) Posture Control and \nTrajectory Formation During Arm Movement. The Journal of Neuroscience, 4, 11, \n2738-2744. 
\n\n[2] Hogan, N. (1984) An Organizing Principle for a Class of Voluntary Movements. \nThe Journal of Neuroscience, 4, 11, 2745-2754. \n\n[3] Hogan, N. (1985) Impedance Control: An Approach to Manipulation, Parts I, II, III. \nJournal of Dynamic Systems, Measurement, and Control, 107, 1-24. \n\n[4] Katayama, M. & Kawato, M. (1990) Parallel-Hierarchical Neural Network Model for \nMotor Control of Musculo-Skeletal System. The Transactions of The Institute of \nElectronics, Information and Communication Engineers, J73-D-II, 8, 1328-1335. \nin Japanese. \n\n[5] Kawato, M., Furukawa, K. & Suzuki, R. (1987) A Hierarchical Neural-Network \nModel for Control and Learning of Voluntary Movement. Biological Cybernetics, \n57, 169-185. \n\n[6] Kurauchi, S., Mishima, K. & Kurokawa, T. (1980) Characteristics of Rapid \nPositional Movements of Forearm. The Japanese Journal of Ergonomics, 16, 5, \n263-270. in Japanese. \n\n[7] Uno, Y., Suzuki, R. & Kawato, M. (1989) Minimum Muscle-Tension-Change \nModel which Reproduces Human Arm Movement. Proceedings of the 4th \nSymposium on Biological and Physiological Engineering, 299-302. in Japanese. \n", "award": [], "sourceid": 316, "authors": [{"given_name": "Masazumi", "family_name": "Katayama", "institution": null}, {"given_name": "Mitsuo", "family_name": "Kawato", "institution": null}]}