{"title": "Evidence for a Forward Dynamics Model in Human Adaptive Motor Control", "book": "Advances in Neural Information Processing Systems", "page_first": 3, "page_last": 9, "abstract": null, "full_text": "Evidence for a Forward Dynamics Model in Human Adaptive Motor Control\n\nNikhil Bhushan and Reza Shadmehr\n\nDept. of Biomedical Engineering\nJohns Hopkins University, Baltimore, MD 21205\nEmail: nbhushan@bme.jhu.edu, reza@bme.jhu.edu\n\nAbstract\n\nBased on computational principles, the concept of an internal model for adaptive control has been divided into a forward and an inverse model. However, there is as yet little evidence that learning control by the CNS is through adaptation of one or the other. Here we examine two adaptive control architectures, one based only on the inverse model and the other based on a combination of forward and inverse models. We then show that for reaching movements of the hand in novel force fields, only the learning of the forward model results in key characteristics of performance that match the kinematics of human subjects. In contrast, the adaptive control system that relies only on the inverse model fails to produce the kinematic patterns observed in the subjects, despite the fact that it is more stable. Our results provide evidence that learning control of novel dynamics is via formation of a forward model.\n\n1 Introduction\n\nThe concept of an internal model, a system for predicting behavior of a controlled process, is central to the current theories of motor control (Wolpert et al. 1995) and learning (Shadmehr and Mussa-Ivaldi 1994). 
Theoretical studies have proposed that internal models may be divided into two varieties: forward models, which simulate the causal flow of a process by predicting its state transition given a motor command, and inverse models, which estimate motor commands appropriate for a desired state transition (Miall and Wolpert, 1996). This classification is relevant for adaptive control because based on computational principles, it has been proposed that learning control of a nonlinear system might be facilitated if a forward model of the plant is learned initially, and then during an off-line period is used to train an inverse model (Jordan and Rumelhart, 1992). While there is no experimental evidence for this idea in the central nervous system, there is substantial evidence that learning control of arm movements involves formation of an internal model. For example, practicing arm movements while holding a novel dynamical system initiates an adaptation process which results in the formation of an internal model: upon sudden removal of the force field, after-effects are observed which match the expected behavior of a system that has learned to predict and compensate for the dynamics of the imposed field (Shadmehr and Brashers-Krug, 1997). However, the computational nature of this internal model, whether it be a forward or an inverse model, or a combination of both, is not known.\n\nHere we use a computational approach to examine two adaptive control architectures: adaptive inverse model feedforward control and adaptive forward-inverse model feedback control. We show that the two systems predict different behaviors when applied to control of arm movements. 
While adaptation to a force field is possible with either approach, the second system, with feedback control through an adaptive forward model, is far less stable and is accompanied by distinct kinematic signatures, termed \"near path-discontinuities\". We observe remarkably similar instability and near path-discontinuities in the kinematics of 16 subjects that learned force fields. This is behavioral evidence that learning control of novel dynamics is accomplished with an adaptive forward model of the system.\n\n2 Adaptive Control using Internal Models\n\nAdaptive control of a nonlinear system which has large sensory feedback delays, such as the human arm, can be accomplished by using two different internal model architectures. The first method uses only an adaptive inverse dynamics model to control the system (Shadmehr and Mussa-Ivaldi, 1994). The adaptive controller is feedforward in nature and ignores delayed feedback during the movement. The control system is stable because it relies on the equilibrium properties of the muscle and the spinal reflexes to correct for any deviations from the desired trajectory. The second method uses a rapidly adapting forward dynamics model and delayed sensory feedback in addition to an inverse dynamics model to control arm movements (Miall and Wolpert, 1996). In this case, the corrections to deviations from the desired trajectory are a result of a combination of supraspinal feedback as well as spinal/muscular feedback. Since the two methods rely on different internal model and feedback structures, they are expected to behave differently when the dynamics of the system are altered. 
\n\nThe Mechanical Model of the Human Arm\n\nFor the purpose of simulating arm movements with the two different control architectures, a reasonably accurate model of the human arm is required. We model the arm as a two-joint revolute arm attached to six muscles that act in pairs around the two joints. The three muscle pairs correspond to elbow joint, shoulder joint and two-joint muscles and are assumed to have constant moment arms. Each muscle is modeled using a Hill parametric model with nonlinear stiffness and viscosity (Soechting and Flanders, 1997). The dynamics of the muscle can be represented by a nonlinear state function $f_M$, such that\n\n$$F_t = f_M(N, x_m, \\dot{x}_m) \\qquad (1)$$\n\nwhere $F_t$ is the force developed by the muscle, $N$ is the neural activation to the muscle, and $x_m$, $\\dot{x}_m$ are the muscle length and velocity. The passive dynamics related to the mechanics of the two-joint revolute arm can be represented by $f_D$, such that\n\n$$\\ddot{x} = f_D(T, x, \\dot{x}) = D^{-1}(x)[T - C(x, \\dot{x})\\dot{x} + J^T F_x] \\qquad (2)$$\n\nwhere $\\ddot{x}$ is the hand acceleration, $T$ is the joint torque generated by the muscles, $x$, $\\dot{x}$ are the hand position and velocity, $D$ and $C$ are the inertia and the Coriolis matrices of the arm, $J$ is the Jacobian for hand position and joint angle, and $F_x$ is the external dynamic interaction force on the hand.\n\nUnder the force field environment, the external force $F_x$ acting on the hand is equal to $B\\dot{x}$, where $B$ is a 2x2 rotational viscosity matrix. The effect of the force field is to push the hand perpendicular to the direction of movement with a force proportional to the speed of the hand. 
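As a quick numerical illustration of the force field just described, the sketch below applies the two viscosity matrices quoted later in the text (B=[0 13; -13 0] and B=[0 -13; 13 0]) to hand velocities in eight directions and checks that the resulting force is perpendicular to the movement and proportional to speed. The function names (field_force, turn_direction, check_field), the assumed units (N s/m), the test speed, and the axis convention used to label clockwise are illustrative assumptions, not part of the original study.

```python
import math

# Viscosity matrices for the two fields named in the text.
# Units are assumed to be N s/m; hand velocity is in m/s.
B1 = ((0.0, 13.0), (-13.0, 0.0))
B2 = ((0.0, -13.0), (13.0, 0.0))


def field_force(B, velocity):
    '''External force F_x = B * xdot for a 2-D hand velocity (vx, vy).'''
    vx, vy = velocity
    return (B[0][0] * vx + B[0][1] * vy,
            B[1][0] * vx + B[1][1] * vy)


def turn_direction(velocity, force):
    '''Sign of the 2-D cross product: negative means the force points
    clockwise from the velocity (x to the right, y up).'''
    cross = velocity[0] * force[1] - velocity[1] * force[0]
    return 'clockwise' if cross < 0 else 'anticlockwise'


def check_field(B, speed=0.2, n_directions=8):
    '''Return (dot product, force magnitude, turn) for equally spaced directions.'''
    rows = []
    for k in range(n_directions):
        angle = 2.0 * math.pi * k / n_directions
        v = (speed * math.cos(angle), speed * math.sin(angle))
        f = field_force(B, v)
        dot = v[0] * f[0] + v[1] * f[1]  # zero when force is perpendicular
        rows.append((dot, math.hypot(f[0], f[1]), turn_direction(v, f)))
    return rows


if __name__ == '__main__':
    for name, B in (('B1', B1), ('B2', B2)):
        for dot, magnitude, turn in check_field(B):
            print(name, round(dot, 12), round(magnitude, 3), turn)
```

Under these assumptions, the first matrix pushes clockwise relative to the direction of motion and the second pushes anticlockwise, in line with the description of the two fields in the experimental section.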
The overall forward plant dynamics of the arm is a combination of $f_M$ and $f_D$ and can be represented by the function $f_p$,\n\n$$\\ddot{x} = f_p(N, x, \\dot{x}) \\qquad (3)$$\n\nAdaptive Inverse Model Feedforward Control\n\nThe first control architecture uses a feedforward controller with only an adaptive inverse model. The inverse model computes the neural activation to the muscles for achieving a desired acceleration, velocity and position of the hand. It can be represented as the estimated inverse, $\\hat{f}_p^{-1}$, of the forward plant dynamics, and maps the desired position $x_d$, velocity $\\dot{x}_d$, and acceleration $\\ddot{x}_d$ of the hand into descending neural commands $N_c$:\n\n$$N_c = \\hat{f}_p^{-1}(x_d, \\dot{x}_d, \\ddot{x}_d) \\qquad (4)$$\n\nAdaptation to novel external dynamics occurs by learning a new inverse model of the altered external environment. The error between desired and actual hand trajectory can be used for training the inverse model. When the inverse model is an exact inverse of the forward plant dynamics, the gain of the feedforward path is unity and the arm exactly tracks the desired trajectory. Deviations from the desired trajectory occur when the inverse model does not exactly model the external dynamics. In that situation, the spinal reflex corrects for errors between the desired ($x_{md}$, $\\dot{x}_{md}$) and actual ($x_m$, $\\dot{x}_m$) muscle state by producing a corrective neural signal $N_R$ based on a linear feedback controller with constants $K_1$ and $K_2$:\n\n$$N_R = K_1(x_{md} - x_m) + K_2(\\dot{x}_{md} - \\dot{x}_m) \\qquad (5)$$\n\nAdaptive Forward-Inverse Model Feedback Control\n\nThe second architecture provides feedback control of arm movements in addition to the feedforward control described above. Delays in feedback cause instability; therefore, the system relies on a forward model to generate updated state estimates of the arm. 
An estimated error in hand trajectory is given by the difference between the desired and estimated state, and can be used by the brain to issue corrective neural signals to the muscles while a movement is being made. The forward model, written as $\\hat{f}_p$, mimics the forward dynamics of the plant and predicts hand acceleration $\\hat{\\ddot{x}}$ from the neural signal $N_c$ and an estimate of hand state $\\hat{x}$, $\\hat{\\dot{x}}$:\n\n$$\\hat{\\ddot{x}} = \\hat{f}_p(N_c, \\hat{x}, \\hat{\\dot{x}}) \\qquad (6)$$\n\nUsing this equation, one can solve for $\\hat{x}$, $\\hat{\\dot{x}}$ at time $t$ when given the estimated state at some earlier time $t - \\tau$ and the descending neural commands $N_c$ from time $t - \\tau$ to $t$. If $t$ is the current time and $\\tau$ is the time delay in the feedback loop, then sensory feedback gives the hand state $x$, $\\dot{x}$ at $t - \\tau$. The current estimate of the hand position and velocity can be computed by assuming initial conditions $\\hat{x}(t - \\tau) = x(t - \\tau)$ and $\\hat{\\dot{x}}(t - \\tau) = \\dot{x}(t - \\tau)$, and then solving Eq. 6. For the simulations, $\\tau$ has a value of 200 msec, and is composed of 120 msec feedback delay, 60 msec descending neural path delay, and 20 msec muscle activation delay.\n\nFigure 1: The adaptive inverse model feedforward control system.\n\nFigure 2: A control system that provides feedback control with the use of a forward and an inverse model. 
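The delay-compensation step described above (replaying the descending commands through the forward model, starting from the delayed sensory state) can be sketched as follows. This is a toy illustration under stated assumptions, not the simulation used in the study: the arm is replaced by a hypothetical one-dimensional mass with a viscous load, Euler integration with a 1 ms step is assumed, and the names make_model, integrate, and estimate_current_state are invented for the example. Only the delay values (120, 60, and 20 msec) come from the text.

```python
# Delay components quoted in the text, in milliseconds.
FEEDBACK_DELAY_MS = 120
DESCENDING_DELAY_MS = 60
ACTIVATION_DELAY_MS = 20
TOTAL_DELAY_MS = FEEDBACK_DELAY_MS + DESCENDING_DELAY_MS + ACTIVATION_DELAY_MS


def make_model(mass, viscosity):
    '''Hypothetical 1-D stand-in for the forward model of Eq. 6:
    acceleration = (command - viscosity * velocity) / mass.'''
    def model(command, position, velocity):
        return (command - viscosity * velocity) / mass
    return model


def integrate(model, state, commands, dt):
    '''Euler-integrate a model from state = (position, velocity)
    through a sequence of commands, one command per time step.'''
    position, velocity = state
    for command in commands:
        acceleration = model(command, position, velocity)
        position += velocity * dt
        velocity += acceleration * dt
    return position, velocity


def estimate_current_state(forward_model, delayed_state, command_history, dt):
    '''Start from the sensed state at t - delay and replay the commands
    issued since then through the forward model.'''
    return integrate(forward_model, delayed_state, command_history, dt)


if __name__ == '__main__':
    dt = 0.001                                   # assumed 1 ms step
    history = [1.0] * TOTAL_DELAY_MS             # commands sent during the delay
    plant = make_model(mass=1.0, viscosity=2.0)  # the true arm in this toy
    true_state = integrate(plant, (0.0, 0.0), history, dt)

    matched = estimate_current_state(plant, (0.0, 0.0), history, dt)
    wrong = estimate_current_state(make_model(1.0, -2.0), (0.0, 0.0), history, dt)
    print('true   ', true_state)
    print('matched', matched)
    print('wrong  ', wrong)
```

In this toy setting, a forward model that matches the plant reproduces the true current state, while a model that expects a viscosity of the opposite sign drifts away from it within the delay window, which is the kind of misestimate the text associates with inappropriate feedback action.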
\n\nBased on the current state estimate and the estimated error in trajectory, the desired acceleration is corrected using a linear feedback controller with constants $K_p$ and $K_v$. The inverse model maps the corrected hand acceleration to the appropriate neural signal for the muscles, $N_c$. The spinal reflex provides additional corrective feedback $N_R$ when there is an error between the estimated and actual muscle state.\n\n$$\\ddot{x}_{new} = \\ddot{x}_d + \\ddot{x}_c = \\ddot{x}_d + K_p(x_d - \\hat{x}) + K_v(\\dot{x}_d - \\hat{\\dot{x}}) \\qquad (7)$$\n\n$$N_c = \\hat{f}_p^{-1}(\\ddot{x}_{new}, \\hat{x}, \\hat{\\dot{x}}) \\qquad (8)$$\n\n$$N_R = K_1(x_{md} - x_m) + K_2(\\dot{x}_{md} - \\dot{x}_m) \\qquad (9)$$\n\nWhen the forward model is an exact copy of the forward plant dynamics, $\\hat{f}_p = f_p$, and the inverse model is correct, $\\hat{f}_p^{-1} = f_p^{-1}$, the hand exactly tracks the desired trajectory. Errors due to an incorrect inverse model are corrected through the feedback loop. However, errors in the forward model cause deviations from the desired behavior and instability in the system due to inappropriate feedback action.\n\n3 Simulation results and comparison to human behavior\n\nTo test the two control architectures, we compared simulations of arm movements for the two methods to experimental human results under a novel force field environment. Sixteen human subjects were trained to make rapid point-to-point reaching 
movements with their hand while an external force field, $F_x = B\\dot{x}$, pushed on the hand. The task was to move the hand to a target position 10 cm away in 0.5 sec. The movement could be directed in any of eight equally spaced directions. The subjects made straight-path minimum-jerk movements to the targets in the absence of any force fields. The subjects were initially trained in force field $B_1$ with B=[0 13; -13 0], until they had completely adapted to this field and converged to the straight-path minimum-jerk movement observed before the force field was applied. Subsequently, the force field was switched to $B_2$ with B=[0 -13; 13 0] (the new field pushed anticlockwise, instead of clockwise), and the first three movements in each direction were used for data analysis. The movements of the subjects in field $B_2$ showed large deviations from the desired straight-path behavior because the subjects expected the clockwise force field $B_1$. The hand trajectories for the first movement in each of the eight directions are shown for a typical subject in Fig. 3 (middle column).\n\nFigure 3: Performance in field $B_2$ after a typical subject (middle column) and each of the controllers (left and right columns) had adapted to field $B_1$. (1) hand paths for 8 movement directions, (2-5) hand velocity, speed, derivative of velocity direction, and segmented hand path for the -90\u00b0 downward movement. The segmentation in hand trajectory that is observed in our subjects is almost precisely reproduced by the controller that uses a forward model.\n\nSimulations were performed for the two methods under the same conditions as the human experiment. 
The movements were made in force field $B_2$, while the internal models were assumed to be adapted to field $B_1$. Complete adaptation to the force field $B_1$ was found to occur for the two methods only when both the inverse and forward models expected field $B_1$.\n\nFigure 4: The mean and standard deviation for segmentation parameters for each type of controller as compared to the data from our subjects. Parameters are defined in Fig. 3: $A_i$ is the angle about a segmentation point, $d_i$ is the distance to the i-th segmentation point, $t_i$ is the time to reach the i-th segmentation point, $C_J$ is the cumulative squared jerk for the entire movement, and $N_s$ is the number of segmentation points in the movement. Up until the first segmentation point ($A_1$ and $d_1$), the behavior of the controllers is similar and both agree with the performance of our subjects. However, as the movement progresses, only the controller that utilizes a forward model continues to agree with the movement characteristics of the subjects.\n\nFig. 3 (left column) shows the simulation of the adaptive inverse model feedforward control for movements in field $B_2$ with the inverse model incorrectly expecting $B_1$. Fig. 3 (right column) shows the simulation of the adaptive forward-inverse model feedback control for movements in field $B_2$ with both the forward and the inverse model incorrectly expecting $B_1$. Simulations with the two methods show clear differences in stability and corrective behavior for all eight directions of movement. 
The simulations with the inverse model feedforward control seem to be stable, and converge to the target along a straight line after the initial deviation. The simulations with the forward-inverse model feedback control are less stable and have a curious kinematic pattern with discontinuities in the hand path. This is especially marked for the downward movement. The subject's hand paths show the same kinematic pattern of near discontinuities and segmentation of movement as found with the forward-inverse model feedback control.\n\nTo quantify the segmentation pattern in the hand path, we identified the \"near path-discontinuities\" as points on the trajectory where there was a sudden change in both the derivative of hand speed and the direction of hand velocity. The hand path was segmented on the basis of these near discontinuities. Based on the first three segments in the hand trajectory we defined the following parameters: $A_1$, angle between the first segment and the straight path to the target; $d_1$, the distance covered during the first segment; $A_2$, angle between the second segment and the straight path to the target from the first segmentation point; $t_2$, time duration of the second segment; $A_3$, angle between the second and third segments; $N_s$, the number of segmentation points in the movement. We also calculated the cumulative jerk $C_J$ in the movements to get a measure of the instability in the system.\n\nThe results of the movement segmentation are presented in Fig. 4 for 16 human subjects, 25 simulations of the inverse model and 20 simulations of the forward model control for three movement directions: (a) -90\u00b0 downward, (b) 90\u00b0 upward and (c) 135\u00b0 upward. 
We performed the different simulations for the two methods by systematically varying various model parameters over a reasonable physiological range. This was done because the parameters are only approximately known and also vary from subject to subject. The parameters of the second and third segment, as represented by $A_2$, $t_2$ and $A_3$, clearly show that the forward model feedback control performs very differently from the inverse model feedforward control, and that the behavior of human subjects is very well predicted by the former. Furthermore, this characteristic behavior could be produced by the forward-inverse model feedback control only when the forward model expected field $B_1$. This could be accomplished only by adaptation of the forward model during initial practice in field $B_1$. This provides evidence for an adaptive forward model in the control of human arm movements in novel dynamic environments.\n\nWe further tried to fit adaptation curves of simulated movement parameters (using forward-inverse model feedback control) to real data as subjects trained in field $B_1$. We found that the best fit was obtained for a rapidly adapting forward and inverse model (Bhushan and Shadmehr, 1999). This eliminated the possibility that the inverse model was trained offline after practice. The data, however, suggested that during learning of a force field, the rate of learning of the forward model was faster than that of the inverse model. This finding could be particularly relevant if it is proven that a forward model is easier to learn than an inverse model (Narendra, 1990), and could provide a computational rationale for the existence of a forward model in adaptive motor control. 
\n\nReferences\n\nBhushan N, Shadmehr R (1999) Computational architecture of the adaptive controller during learning of reaching movements in force fields. Biol Cybern, in press.\nJordan MI, Flash T, Arnon Y (1994) A model of learning arm trajectories from spatial deviations. Journal of Cog Neur 6:359-376.\nJordan MI, Rumelhart DE (1992) Forward models: supervised learning with a distal teacher. Cog Sc 16:307-354.\nMiall RC, Wolpert DM (1996) Forward models for physiological motor control. Neural Networks 9:1265-1279.\nNarendra KS (1990) Identification and control of dynamical systems using neural networks. Neural Networks 1:4-27.\nShadmehr R, Brashers-Krug T (1997) Functional stages in the formation of human long-term memory. J Neurosci 17:409-19.\nShadmehr R, Mussa-Ivaldi FA (1994) Adaptive representation of dynamics during learning of a motor task. The Journal of Neuroscience 14:3208-3224.\nSoechting JF, Flanders M (1997) Evaluating an integrated musculoskeletal model of the human arm. J Biomech Eng 9:93-102.\nWolpert DM, Ghahramani Z, Jordan MI (1995) An internal model for sensorimotor integration. Science 269:1880-82.\n", "award": [], "sourceid": 1486, "authors": [{"given_name": "Nikhil", "family_name": "Bhushan", "institution": null}, {"given_name": "Reza", "family_name": "Shadmehr", "institution": null}]}