{"title": "A Connectionist Model of the Owl's Sound Localization System", "book": "Advances in Neural Information Processing Systems", "page_first": 606, "page_last": 613, "abstract": null, "full_text": "A Connectionist Model of the Owl's Sound Localization System\n\nDaniel J. Rosen*\nDepartment of Psychology\nStanford University\nStanford, CA 94305\n\nDavid E. Rumelhart\nDepartment of Psychology\nStanford University\nStanford, CA 94305\n\nEric I. Knudsen\nDepartment of Neurobiology\nStanford University\nStanford, CA 94305\n\n*Current address: Keck Center for Integrative Neuroscience, UCSF, 513 Parnassus Ave., San Francisco, CA 94143-0444.\n\nAbstract\n\nWe do not have a good understanding of how theoretical principles of learning are realized in neural systems. To address this problem we built a computational model of development in the owl's sound localization system. The structure of the model is drawn from known experimental data while the learning principles come from recent work in the field of brain-style computation. The model accounts for numerous properties of the owl's sound localization system, makes specific and testable predictions for future experiments, and provides a theory of the developmental process.\n\n1 INTRODUCTION\n\nThe barn owl, Tyto alba, has a remarkable ability to localize sounds in space. In complete darkness it catches mice with nearly flawless precision. The owl depends upon this skill for survival, for it is a nocturnal hunter that uses audition to guide its search for prey (Payne, 1970; Knudsen, Blasdel and Konishi, 1979). 
Central to the owl's localization system are the precise auditory maps of space found in the owl's optic tectum and in the external nucleus of the inferior colliculus (ICx).\n\nThe development of these sensory maps poses a difficult problem for the nervous system, for their accuracy depends upon changing relationships between the animal and its environment. The owl encodes information about the location of a sound source by the phase and amplitude differences with which the sound reaches the owl's two ears. Yet these differences change dramatically as the animal matures and its head grows. The genome cannot \"know\" in advance precisely how the animal's head will develop - many environmental factors affect this process - so it cannot encode the precise development of the auditory system. Rather, the genome must design the auditory system to adapt to its environment, letting it learn the precise interpretation of auditory cues appropriate for its head and ears.\n\nIn order to understand the nature of this developmental process, we built a connectionist model of the owl's sound localization system, using both theoretical principles of learning and knowledge of owl neurophysiology and neuroanatomy.\n\n2 THE ESSENTIAL SYSTEM TO BE MODELED\n\nThe owl calculates the horizontal component of a sound source location by measuring the interaural time difference (ITD) of a sound as it reaches the two ears (Knudsen and Konishi, 1979). It computes the vertical component of the signal by determining the interaural level difference (ILD) of that same sound (Knudsen and Konishi, 1979). The animal processes these signals through numerous sub-cortical nuclei to form ordered auditory maps of space in both the ICx and the optic tectum. Figure 1 shows a diagram of this neural circuit. 
\nNeurons in both the ICx and the optic tectum are spatially tuned to auditory stimuli. Cells in these nuclei respond to sound signals originating from a restricted region of space in relation to the owl (Knudsen, 1984). Neurons in the ICx respond exclusively to auditory signals. Cells in the optic tectum, on the other hand, encode both auditory and visual sensory maps, and drive the motor system to orient to the location of an auditory or visual signal.\n\nResearchers study the owl's development by systematically altering the animal's sensory experience, usually in one of two ways. They may fit the animal with a sound-attenuating earplug, altering its auditory experience, or they may fit the owl with displacing prisms, altering its visual experience.\n\nDisturbance of either auditory or visual cues, during a period when the owl is developing to maturity, causes neural and behavioral changes that bring the auditory map of space back into alignment with the visual map, and/or tune the auditory system to be sensitive to the appropriate range of binaural sound signals. The earplug-induced changes take place at the level of the VLVp, where ILD is first computed (Mogdans and Knudsen, 1992). The visually induced adjustment of the auditory maps of space seems to take place at the level of the ICx (Brainard and Knudsen, 1993b). The ability of the owl to adjust to altered sensory signals diminishes over time, and is greatly restricted in adulthood (Knudsen and Knudsen, 1990).\n\n[Figure 1 diagram: Overview of the Barn Owl's Sound Localization System]\n\nFigure 1: A chart describing the flow of auditory information in the owl's sound localization system. For simplicity, only the connections leading to one of the bilateral optic tecta are shown. Nuclei labeled with an asterisk (*) are included in the model. Nuclei that process ILD and/or ITD information are so labeled.\n\n3 THE NETWORK MODEL\n\nThe model has two major components: a network architecture based on the neurobiology of the owl's localization system, as shown in Figure 1, and a learning rule derived from computational learning theory. The elements of the model are standard connectionist units whose output activations are sigmoidal functions of their weighted inputs. The learning rule we use to train the model is not standard. In the following section we describe how and why we derived this rule.\n\n3.1 DEFINING THE GOAL OF THE NETWORK\n\nThe goal of the network, and presumably the owl, is to accurately map sound signals to sound source locations. The network must discover a model of the world which best captures the relationship between sound signals and sound source locations. Recent work in connectionist learning theory has shown us ways to design networks that search for the model that best fits the data at hand (Buntine and Weigend, 1991; MacKay, 1992; Rumelhart, Durbin, Golden and Chauvin, in press). In this section we apply such an analysis to the localization network.\n\nTable 1: A table showing the mathematical terms used in the analysis. 
\nTERM                      MEANING\n$M$                       The model\n$D$                       The data\n$P(M \\mid D)$             Probability of the model given the data\n$\\langle x, y \\rangle_i$  The set of i input/target training pairs\n$x_i$                     The input vector for training trial i\n$y_i$                     The target vector for training trial i\n$\\hat{y}_i$               The output vector for training trial i\n$\\hat{y}_{ij}$            The value of output unit j on training trial i\n$w_{ij}$                  The weight from unit j to unit i\n$\\eta_j$                  The net input to unit j\n$F(\\eta_j)$               The activation function of unit j evaluated at its net input\n$C$                       The term to be maximized by the network\n\n3.2 DERIVING THE FUNCTION TO BE MAXIMIZED\n\nThe network should maximize the probability of the model given the data. Using Bayes' rule we write this probability as:\n\n$$P(M \\mid D) = \\frac{P(D \\mid M)\\,P(M)}{P(D)}.$$\n\nHere $M$ represents the model (the units, weights and associated biases) and $D$ represents the data. We define the data as a set of ordered pairs, $[\\langle \\text{sound-signal}, \\text{location-signal} \\rangle_i]$, which represent the cues and targets normally used to train a connectionist network. In the owl's case the cues are the auditory signals, and the target information is provided by the visual system. (Table 1 lists the mathematical terms we use in this section.)\n\nWe simplify this equation by taking the natural logarithm of each side, giving:\n\n$$\\ln P(M \\mid D) = \\ln P(D \\mid M) + \\ln P(M) - \\ln P(D).$$\n\nSince the natural logarithm is a monotonic transformation, if the network maximizes the second equation it will also maximize the first.\n\nThe final term in the equation, $\\ln P(D)$, represents the probability of the ordered pairs the network observes. Regardless of which model the network settles upon, this term remains the same - the data are a constant during training. Therefore we can ignore it when choosing a model. 
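As a small numerical illustration of the argument above, the following Python sketch scores a few hypothetical candidate models by ln P(D|M) + ln P(M) and checks that the winner is the same model that maximizes the full posterior P(M|D). The model names, likelihoods and priors are made-up numbers chosen for illustration; they are assumptions, not quantities from the owl model.

```python
import math

# Hypothetical likelihoods P(D|M) and priors P(M) for three candidate models.
# These numbers are illustrative assumptions, not values from the paper.
likelihood = {'A': 0.020, 'B': 0.050, 'C': 0.010}
prior = {'A': 0.5, 'B': 0.3, 'C': 0.2}

# P(D) is the same for every model, so it only rescales the posterior.
evidence = sum(likelihood[m] * prior[m] for m in likelihood)

def posterior(m):
    # Bayes' rule: P(M|D) = P(D|M) P(M) / P(D)
    return likelihood[m] * prior[m] / evidence

def log_score(m):
    # ln P(D|M) + ln P(M); the constant term -ln P(D) is dropped
    return math.log(likelihood[m]) + math.log(prior[m])

best_by_posterior = max(likelihood, key=posterior)
best_by_log_score = max(likelihood, key=log_score)
print(best_by_posterior, best_by_log_score)
```

Because the logarithm is monotonic and ln P(D) does not depend on the model, both rankings pick the same candidate.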
\nThe second term in the equation, $\\ln P(M)$, represents the probability of the model. This is the prior term in Bayesian analysis and is our estimation of how likely it is that a particular model is true, regardless of the data. We will discuss it below. For now we will concentrate on maximizing $\\ln P(D \\mid M)$.\n\n3.3 ASSUMPTIONS ABOUT THE NETWORK'S ENVIRONMENT\n\nWe assume that the training data - pairs of stylized auditory and visual signals - are independent of one another and re-write the previous term as:\n\n$$\\ln P(D \\mid M) = \\sum_i \\ln P(\\langle x, y \\rangle_i \\mid M).$$\n\nThe $i$ subscript denotes the particular data, or training, pair. We further expand this term to:\n\n$$\\ln P(D \\mid M) = \\sum_i \\ln P(y_i \\mid x_i \\wedge M) + \\sum_i \\ln P(x_i).$$\n\nWe ignore the last term, since the sound signals are not dependent on the model. We are left, then, with the task of maximizing $\\sum_i \\ln P(y_i \\mid x_i \\wedge M)$. It is important to note that $y_i$ represents a visual signal, not a localization decision. The network attempts to predict its visual experience given its auditory experience. It does not predict the probability of making an accurate localization decision. If we assume that visual signals provide the target values for the network, then this analysis shows that the auditory map will always follow the visual map, regardless of whether this leads to accurate localization behavior or not. Our assumption is supported by experiments showing that, in the owl, vision does guide the formation of auditory spatial maps (Knudsen and Knudsen, 1985; Knudsen, 1988).\n\nNext, we must clarify the relationship between the inputs, $x_i$, and the targets, $y_i$. We know that the real world is probabilistic - that for a given input there exists some distribution of possible target values. 
We need to estimate the shape of this distribution. In this case we assume that the target values are binomially distributed - that given a particular sound signal, the visual system did or did not detect a sound source at each point in owl-centered space.\n\nHaving made this assumption, we can clarify our interpretation of the network output array, $\\hat{y}_i$. Each element, $\\hat{y}_{ij}$, of this vector represents the activity of output unit $j$ on training trial $i$. We assume that the output activation of each of these units represents the expected value of its corresponding target, $y_{ij}$. In this case the expected value is the mean of a binomial distribution. So the value of each output unit $\\hat{y}_{ij}$ represents the probability that a sound signal originated from that particular location. We now write the probability of the data given the model as:\n\n$$P(y_i \\mid x_i \\wedge M) = \\prod_j \\hat{y}_{ij}^{\\,y_{ij}} (1 - \\hat{y}_{ij})^{1 - y_{ij}}.$$\n\nTaking the natural log of the probability and summing over all data pairs we get:\n\n$$C = \\sum_i \\sum_j \\left[ y_{ij} \\ln \\hat{y}_{ij} + (1 - y_{ij}) \\ln(1 - \\hat{y}_{ij}) \\right]$$\n\nwhere $C$ is the term we want to maximize. This is the standard cross-entropy term.\n\n3.4 DERIVING THE LEARNING RULE\n\nHaving defined our goal we derive a learning rule appropriate to achieving that goal. To determine this rule we compute $\\partial C / \\partial \\eta_j$, where $\\eta_j$ is the net input to a unit. (In these equations we have dropped the $i$ subscript, which denotes the particular training trial, since this analysis is identical for all trials.) We write this as:\n\n$$\\frac{\\partial C}{\\partial \\eta_j} = \\frac{y_j - \\hat{y}_j}{\\hat{y}_j (1 - \\hat{y}_j)} F'(\\eta_j)$$\n\nwhere $F'(\\eta_j)$ is the derivative of a unit's activation function evaluated at its net input.\n\nNext we choose an appropriate activation function for the output units. The logistic function, $F(\\eta_j) = 1 / (1 + e^{-\\eta_j})$, is a good choice for two reasons. First, it is bounded by zero and one. This makes sense since we assume that the probability that a sound signal originated at any one point in space is bounded by zero and one. Second, when we compute the derivative of the logistic function we get the following result:\n\n$$F'(\\eta_j) = F(\\eta_j)(1 - F(\\eta_j)) = \\hat{y}_j (1 - \\hat{y}_j).$$\n\nThis term is the variance of a binomial distribution and when we return to the derivative of our cost function, we see that this variance term is canceled by the denominator. The final derivative we use to compute the weight changes at the output units is therefore:\n\n$$\\frac{\\partial C}{\\partial \\eta_j} \\propto (y_j - \\hat{y}_j).$$\n\nThe weights to other units in the network are updated according to the standard backpropagation learning algorithm.\n\n3.5 SPECIFYING MODEL PRIORS\n\nThere are two types of priors in this model. First is the architectural one. We design a fixed network architecture, described in the previous section, based upon our knowledge of the nuclei involved in the owl's localization system. This is equivalent to setting the prior probability of this architecture to 1, and all others to 0.\n\nWe also use a weight elimination prior. This and similar priors may be interpreted as ways to reduce the complexity of a network (Weigend, Huberman and Rumelhart, 1990). The network, therefore, maximizes an expression which is a function of both its error and complexity.\n\n3.6 TRAINING\n\nWe train the model by presenting it with input to the core of the inferior colliculus (ICc), which encodes interaural phase and time differences (IPD/ITD), and the angular nuclei, which encode sound level. The outputs of the network are then compared to target values, presumed to come from the visual system. The weights are adjusted in order to minimize this difference. 
We mimic plug training by varying the average difference between the two angular input values. We mimic prism training by systematically changing the target values associated with an input.\n\nFigure 2: The activity level of ICx units in response to a particular auditory input immediately after simulated prism training was begun (left), midway through training (middle) and after training was completed (right).\n\n4 RESULTS AND DISCUSSION\n\nThe trained network localizes accurately, shows appropriate auditory tuning curves in each of the modeled nuclei, and responds appropriately to manipulations that mimic experiments such as blocking inhibition at the level of the ICx. The network also shows appropriate responses to changing average binaural intensity at the level of the VLVp, the lateral shell and the ICx.\n\nFurthermore, the network exhibits many properties found in the developing owl. The model appropriately adjusts its auditory localization behavior in simulated earplug experiments and this plasticity takes place at the level of the VLVp. As earplug simulations are begun progressively later in training, the network's ability to adapt to plug training gradually diminishes, following a time course of plasticity qualitatively similar to the sensitive and critical periods described in the owl.\n\nThe network adapts appropriately in simulated prism studies and the changes in response to these simulations primarily take place along the lateral shell to ICx connections. As with the plug studies, the network's ability to adapt to prisms diminishes over time. However, unlike the mature owl, a highly trained network retains the ability to adapt in a simulated prism experiment. 
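The output-unit learning rule of Section 3.4 can be compared numerically with a squared-error rule. The sketch below is a minimal illustration, not the authors' simulation code: the function names are hypothetical, and it assumes that a standard back-propagation network uses a squared-error cost on a logistic output unit, which the text does not state explicitly. Under that assumption the squared-error signal keeps an extra output(1 - output) factor that the cross-entropy rule cancels.

```python
import math

def logistic(net):
    # Logistic activation, bounded by zero and one
    return 1.0 / (1.0 + math.exp(-net))

def cross_entropy_delta(target, net):
    # dC/d(net) for the cross-entropy term C with a logistic output unit:
    # the variance factor out*(1-out) cancels, leaving (target - output)
    return target - logistic(net)

def squared_error_delta(target, net):
    # Assumed comparison case: descent direction for 0.5*(target - output)^2
    # with a logistic unit, which keeps the out*(1-out) factor
    out = logistic(net)
    return (target - out) * out * (1.0 - out)

# A unit that is strongly active although its target is 0
net = 6.0
print(cross_entropy_delta(0.0, net))   # close to -1
print(squared_error_delta(0.0, net))   # much smaller in magnitude
```

In this toy case the cross-entropy signal stays large for a confidently wrong unit, while the squared-error signal shrinks as the unit saturates.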
\nWe also discovered that the learning rule we derived in a principled fashion better models intermediate stages of prism adjustment than does a standard back-propagation network. Brainard and Knudsen (1993a) report observing two peaks of activity across the tectum in response to an auditory stimulus during prism training - one corresponding to the pre-training response and one corresponding to the newly learned response. Over time the pre-training response diminishes while the newly learned one grows. As shown in Figure 2, the network exhibits this same pattern of learning. Networks we trained under a standard back-propagation learning algorithm do not. Such a result lends support to the idea that the owl's localization system is computing a function similar to the one the network was designed to learn.\n\nIn addition to accounting for known data, the network predicts results of experiments it was not designed to mimic. Specifically, the network accurately predicted that removal of the animal's facial ruff, which causes ILD to vary with azimuth instead of elevation, would have no effect on the animal's response to varying ILD.\n\nThe network accomplishes the goals for which it was designed. It accounts for much, though not all, of the developmental data, it makes testable predictions for future experiments, and since we derived the learning rule in a principled fashion, the network provides us with a specific theory of the owl's sound localization system.\n\nReferences\n\nBrainard, M. S., & Knudsen, E. I. (1993a). Dynamics of the visual calibration of the map of interaural time difference in the barn owl's optic tectum. Society for Neuroscience Abstracts, 19, 369.8.\n\nBrainard, M. S., & Knudsen, E. I. 
(1993b). Experience-dependent plasticity in the inferior colliculus: a site for visual calibration of the neural representation of auditory space in the barn owl. The Journal of Neuroscience, 13, 4589-4608.\n\nBuntine, W. L., & Weigend, A. S. (1991). Bayesian back-propagation. Complex Systems, 5, 603-612.\n\nKnudsen, E. (1984). Auditory properties of space-tuned units in owl's optic tectum. Journal of Neurophysiology, 52(4), 709-723.\n\nKnudsen, E. (1988). Early blindness results in a degraded auditory map of space in the optic tectum of the barn owl. Proceedings of the National Academy of Sciences, U.S.A., 85, 6211-6214.\n\nKnudsen, E., Blasdel, G., & Konishi, M. (1979). Sound localization by the barn owl (Tyto alba) measured with the search coil technique. The Journal of Comparative Physiology A, 133, 1-11.\n\nKnudsen, E., & Knudsen, P. (1985). Vision guides the adjustment of auditory localization in young barn owls. Science, 230, 545-548.\n\nKnudsen, E., & Knudsen, P. (1990). Sensitive and critical periods for visual calibration of sound localization by barn owls. The Journal of Neuroscience, 10(1), 222-232.\n\nMacKay, D. J. (1992). Bayesian Methods for Adaptive Models. Unpublished doctoral dissertation, California Institute of Technology, Pasadena, California.\n\nMogdans, J., & Knudsen, E. I. (1992). Adaptive adjustment of unit tuning to sound localization cues in response to monaural occlusion in developing owl optic tectum. The Journal of Neuroscience, 12, 3473-3484.\n\nPayne, R. S. (1970). Acoustic location of prey by barn owls (Tyto alba). The Journal of Experimental Biology, 54, 535-573.\n\nRumelhart, D. E., Durbin, R., Golden, R., & Chauvin, Y. (in press). Backpropagation: The theory. In Y. Chauvin & D. E. 
Rumelhart (Eds.), Backpropagation: Theory, Architectures and Applications. Hillsdale, N.J.: Lawrence Erlbaum Associates.\n\nWeigend, A. S., Huberman, B. A., & Rumelhart, D. E. (1990). Predicting the future: A connectionist approach. International Journal of Neural Systems, 1, 193-209.\n", "award": [], "sourceid": 790, "authors": [{"given_name": "Daniel", "family_name": "Rosen", "institution": null}, {"given_name": "David", "family_name": "Rumelhart", "institution": null}, {"given_name": "Eric", "family_name": "Knudsen", "institution": null}]}