{"title": "Processing of Time Series by Neural Circuits with Biologically Realistic Synaptic Dynamics", "book": "Advances in Neural Information Processing Systems", "page_first": 145, "page_last": 151, "abstract": null, "full_text": "Processing of Time Series by Neural Circuits \nwith Biologically Realistic Synaptic Dynamics \n\nThomas NatschIager &  Wolfgang Maass \nInstitute for Theoretical Computer Science \n\nTechnische Universitat Graz, Austria \n\n{tna t schl,maass }@i g i.tu -gra z. ac . a t \n\nEduardo D. Sontag \nDept.  of Mathematics \n\nRutgers University \n\nAnthony Zador \n\nCold Spring Harbor Laboratory \n\n1 Bungtown Rd \n\nNew Brunswick, NJ 08903, USA \n\nsont ag@h ilbert. r ut ge rs . e du \n\nCold Spring Harbor, NY  11724 \n\nzad or@cshl. org \n\nAbstract \n\nExperimental data show that biological synapses behave quite differently \nfrom the symbolic synapses in common artificial neural network models. \nBiological synapses are dynamic, i.e., their \"weight\" changes on  a short \ntime  scale  by  several  hundred percent in  dependence of the  past  input \nto  the  synapse.  In  this  article  we  explore  the  consequences  that  these \nsynaptic  dynamics  entail  for  the  computational  power  of feedforward \nneural  networks. We  show that gradient descent suffices to  approximate \na given  (quadratic) filter  by  a rather small  neural  system  with  dynamic \nsynapses.  We  also  compare our network model to  artificial  neural net(cid:173)\nworks  designed  for  time  series  processing.  Our  numerical  results  are \ncomplemented by  theoretical analysis which  show  that even with just a \nsingle hidden layer such networks can approximate a surprisingly large \nlarge  class  of nonlinear filters:  all  filters  that  can  be  characterized  by \nVolterra series. This result is robust with regard to various changes in the \nmodel for synaptic dynamics. \n\n1  Introduction \n\nMore than two decades of research on artificial neural networks has emphasized the central \nrole of synapses in neural computation. In a conventional artificial neural network, all units \n(\"neurons\") are assumed to be identical, so that the computation is completely specified by \nthe synaptic \"weights,\" i. e. by the strengths of the connections between the units.  Synapses \nin common artificial  neural network models are static:  the value Wij  of a synaptic weight \nis  assumed to  change only during \"learning\".  In contrast to  that, the  \"weight\" Wij (t)  of \na biological synapse at time t  is  known to  be  strongly dependent on the inputs Xj(t  - T) \nthat this  synapse has  received from  the presynaptic neuron i  at previous time  steps  t  - T, \nsee e.g.  [1].  We  will focus in this article on mean-field models for populations of neurons \nconnected by dynamic synapses. \n\n\fA  1 \n\nB  1 \n\nC  1 \n\n0.5 \n\npure facilitation \n\n0.5 \n\n\\ \n\npure depression \n\n0.5 \n\nfacilitation  and \ndepression \n\n00 \n\n100 \ntime \n\n200 \n\n00 \n\n100 \ntime \n\n200 \n\n00 \n\n100 \ntime \n\n200 \n\nFigure  1:  A dynamic synapse can produce quite different outputs for the same input.  The \nresponse of a  single  synapse to  a  step  increase  in  input activity  applied  at time  step 0  is \ncompared for three different parameter settings. \n\nSeveral models for single synapses have been proposed for the dynamic changes in synaptic \nefficacy.  
In [2] the model of [3] is extended to populations of neurons, where the current synaptic efficacy w_ij(t) between a population j and a population i at time t is modeled as the product of a facilitation term f_ij(t) and a depression term d_ij(t), scaled by the factor W_ij. We consider a time-discrete version of this model defined as follows:

    w_{ij}(t) = W_{ij} \cdot \hat{f}_{ij}(t) \cdot d_{ij}(t)                                             (1)
    f_{ij}(t+1) = f_{ij}(t) - f_{ij}(t)/F_{ij} + U_{ij} \cdot (1 - f_{ij}(t)) \cdot x_j(t)               (2)
    d_{ij}(t+1) = d_{ij}(t) + (1 - d_{ij}(t))/D_{ij} - \hat{f}_{ij}(t) \cdot d_{ij}(t) \cdot x_j(t)      (3)
    \hat{f}_{ij}(t) = f_{ij}(t) \cdot (1 - U_{ij}) + U_{ij}                                              (4)

with d_ij(0) = 1 and f_ij(0) = 0. Equation (2) models facilitation (with time constant F_ij), whereas equation (3) models the combined effects of synaptic depression (with time constant D_ij) and facilitation. Depending on the values of the characteristic parameters U_ij, D_ij, F_ij, a synaptic connection (ij) maps an input function x_j(t) into the corresponding time-varying synaptic output w_ij(t) · x_j(t). The same input x_j(t) can yield markedly different outputs w_ij(t) · x_j(t) for different values of the characteristic parameters U_ij, D_ij, F_ij. Fig. 1 compares the output for three different sets of values for the parameters U_ij, D_ij, F_ij. These examples illustrate just three of the range of input-output behaviors that a single synapse can achieve.

In this article we will consider feedforward networks coupled by dynamic synapses. One should think of the computational units in such a network as populations of spiking neurons. We refer to such networks as "dynamic networks"; see Fig. 2 for details.

[Figure 2: schematic of a feedforward network of units ("hidden units") connected by dynamic synapses.]

Figure 2: The dynamic network model. The output x_i(t) of the i-th unit is given by x_i(t) = σ(\sum_j w_{ij}(t) \cdot x_j(t)), where σ is either the sigmoid function σ(u) = 1/(1 + exp(-u)) (in the hidden layers) or just the identity function σ(u) = u (in the output layer), and w_ij(t) is modeled according to Equations (1) to (4).

In Sections 2 and 3 we demonstrate (by employing gradient descent to find appropriate values for the parameters U_ij, D_ij, F_ij and W_ij) that even small dynamic networks can compute complex quadratic filters. In Section 4 we address the question which synaptic parameters are important for a dynamic network to learn a given filter. In Section 5 we give a precise mathematical characterization of the computational power of such dynamic networks.

2  Learning Arbitrary Quadratic Filters by Dynamic Networks

In order to analyze which filters can be approximated by small dynamic networks we investigate the task of learning a quadratic filter Q randomly chosen from a class Q_m. The class Q_m consists of all quadratic filters Q whose output (Qx)(t) in response to the input time series x(t) is defined by some symmetric m x m matrix H^Q = [h_kl] of filter coefficients h_kl in R, k = 1, ..., m, l = 1, ..., m, through the equation

    (Qx)(t) = \sum_{k=1}^{m} \sum_{l=1}^{m} h_{kl} \, x(t-k) \, x(t-l) .

An example of the input and output for one choice of quadratic parameters (m = 10) is shown in Figs. 3B and 3C, respectively.
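
For concreteness, the following Python sketch shows how the synaptic model of Equations (1) to (4) and the quadratic target filter Q can be simulated directly from the definitions above. It is a minimal illustration only, not the code used for the experiments; all function and variable names are ours.

    import numpy as np

    def dynamic_synapse(x, W, U, D, F):
        """Simulate one dynamic synapse, Equations (1)-(4).

        x          : 1-D array with the presynaptic input x_j(t), t = 0, 1, ...
        W, U, D, F : scalar synaptic parameters W_ij, U_ij, D_ij, F_ij.
        Returns the synaptic output w_ij(t) * x_j(t) for every time step.
        """
        f, d = 0.0, 1.0                       # initial conditions f_ij(0) = 0, d_ij(0) = 1
        out = np.zeros(len(x))
        for t, xt in enumerate(x):
            f_hat = f * (1.0 - U) + U         # Eq. (4)
            out[t] = W * f_hat * d * xt       # Eq. (1): w_ij(t) * x_j(t)
            f_next = f - f / F + U * (1.0 - f) * xt         # Eq. (2): facilitation
            d_next = d + (1.0 - d) / D - f_hat * d * xt     # Eq. (3): depression
            f, d = f_next, d_next
        return out

    def quadratic_filter(x, H):
        """Output (Qx)(t) of the quadratic filter given by the symmetric m x m matrix H."""
        x = np.asarray(x, dtype=float)
        m = H.shape[0]
        z = np.zeros(len(x))
        for t in range(m, len(x)):
            past = x[t - m:t][::-1]           # (x(t-1), ..., x(t-m))
            z[t] = past @ H @ past            # sum_{k,l} h_kl x(t-k) x(t-l)
        return z

For a step input, different settings of (U, D, F) in dynamic_synapse qualitatively reproduce the facilitating and depressing responses sketched in Fig. 1; the exact parameter values used there are not given in the text.
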
We view such a filter Q as an example of the kinds of complex transformations that are important to an organism's survival, such as those required for motor control and the processing of time-varying sensory inputs. For example, the spectrotemporal receptive field of a neuron in the auditory cortex [4] reflects some complex transformation of sound pressure to neuronal activity. The transformations actually required may be very complex, but the simple filter Q provides a useful starting point for assessing the capacity of this architecture to transform one time-varying signal into another.

Can a network of units coupled by dynamic synapses implement the filter Q? We tested the approximation capabilities of a rather small dynamic network with just 10 hidden units (5 excitatory and 5 inhibitory ones) and one output (Fig. 3A). The dynamics of inhibitory synapses is described by the same model as that for excitatory synapses. For any particular temporal pattern applied at the input and any particular choice of the synaptic parameters, this network generates a temporal pattern as output. This output can be thought of, for example, as the activity of a particular population of neurons in the cortex, and the target function as the time series generated for the same input by some unknown quadratic filter Q. The synaptic parameters W_ij, D_ij, F_ij and U_ij are chosen so that, for each input in the training set, the network minimized the mean-square error

    E[z, z_Q] = \frac{1}{T} \sum_{t=0}^{T-1} (z(t) - z_Q(t))^2

between its output z(t) and the desired output z_Q(t) specified by the filter Q. To achieve this minimization, we used a conjugate gradient algorithm.^1 The training inputs were random signals, an example of which is shown in Fig. 3B. The test inputs were drawn from the same random distribution as the training inputs, but were not actually used during training. This test of generalization ensured that the observed performance represented more than simple "memorization" of the training set. Fig. 3C compares the network performance before and after training. Prior to training, the output is nearly flat, while after training the network output tracks the filter output closely (E[z, z_Q] = 0.0032).

Fig. 3D shows the performance after training for different randomly chosen quadratic filters Q in Q_m for m = 4, ..., 16. Even for larger values of m the relatively small network with 10 hidden units performs rather well. Note that a quadratic filter of dimension m has m(m+1)/2 free parameters, whereas the dynamic network has a constant number of 80 adjustable parameters. This shows clearly that dynamic synapses enable a small network to mimic a wide range of possible quadratic target filters.

^1 In order to apply such a conjugate gradient algorithm one has to calculate the partial derivatives \partial E[z, z_Q]/\partial W_{ij}, \partial E[z, z_Q]/\partial U_{ij}, \partial E[z, z_Q]/\partial D_{ij} and \partial E[z, z_Q]/\partial F_{ij} for all synapses (ij) in the network. For more details about conjugate gradient algorithms see e.g. [5].
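
As an illustration of this optimization step (not the authors' actual code), the sketch below fits the parameters of a single dynamic synapse to a target time series with SciPy's conjugate-gradient routine. Gradients are estimated numerically here, whereas the paper computes the partial derivatives listed in footnote 1 analytically; the function dynamic_synapse is the sketch given after Equations (1) to (4).

    import numpy as np
    from scipy.optimize import minimize

    def mse(z, z_target):
        # Mean-square error E[z, z_Q] between output z and target z_Q, as in the text.
        return np.mean((z - z_target) ** 2)

    def fit_single_synapse(x, z_target, theta0=(1.0, 0.5, 10.0, 10.0)):
        """Fit (W, U, D, F) of one dynamic synapse to a target time series.

        Uses scipy's conjugate-gradient optimizer with finite-difference gradients.
        Note: CG is unconstrained; in practice one would keep U in (0, 1] and D, F > 0,
        e.g. by optimizing their logarithms.
        """
        def loss(theta):
            W, U, D, F = theta
            return mse(dynamic_synapse(x, W, U, D, F), z_target)

        result = minimize(loss, np.array(theta0), method="CG")
        return result.x, result.fun
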
[Figure 3: panel A network architecture; panel B example input pattern; panel C network output before and after training together with the target filter output (time steps 0-200); panel D average test error as a function of the filter dimension m = 4, ..., 16.]

Figure 3: A network with units coupled by dynamic synapses can approximate randomly drawn quadratic filters. A Network architecture. The network had one input unit, 10 hidden units (5 excitatory, 5 inhibitory), and one output unit; see Fig. 2 for details. B One of the input patterns used in the training ensemble. For clarity, only a portion of the actual input is shown. C Output of the network prior to training, with random initialization of the parameters, and the output of the dynamic network after learning. The target was the output of a quadratic filter Q in Q_10. The filter coefficients h_kl (1 <= k, l <= 10) were generated randomly by subtracting μ/2 from a random number generated from an exponential distribution with mean μ = 3. D Performance after network training. For different sizes of H^Q (H^Q is a symmetric m x m matrix) we plotted the average performance (mse measured on a test set) over 20 different filters Q, i.e., 20 randomly generated matrices H^Q.

3  Comparison with the Model of Back and Tsoi

Our dynamic network model is not the first to incorporate temporal dynamics via dynamic synapses. Perhaps the earliest suggestion of a role for synaptic dynamics in network computation was by [7]. More recently, a number of networks have been proposed in which synapses implement linear filters; in particular [6].

To assess the performance of our network model in relation to the model proposed in [6], we have analyzed the performance of our dynamic network model on the same system identification task that was employed as a benchmark task in [6]. The goal of this task is to learn a filter F with (Fx)(t) = sin(u(t)), where u(t) is the output of a linear filter applied to the input time series x(t).^2

The result is summarized in Fig. 4. It can clearly be seen that our network model (see Fig. 3A for the network architecture) is able to learn this particular filter. The mean square error (mse) on the test data is 0.0010, which is slightly smaller than the mse of 0.0013 reported in [6]. Note that the network Back and Tsoi used to learn the task had 130 adjustable parameters (13 parameters per IIR synapse, 10 hidden units), whereas our network model had only 80 adjustable parameters (all parameters U_ij, F_ij, D_ij and W_ij were adjusted during learning).

^2 u(t) is the solution of the difference equation u(t) - 1.99 u(t-1) + 1.572 u(t-2) - 0.4583 u(t-3) = 0.0154 x(t) + 0.0462 x(t-1) + 0.0462 x(t-2) + 0.0154 x(t-3). Hence, u(t) is the output of a linear filter applied to the input x(t).

[Figure 4: panel A example input pattern; panel B network output after learning together with the target (time steps 0-200); panels C and D bar plots comparing BT and DN.]

Figure 4: Performance of our model on the system identification task used in [6]. The network architecture is the same as in Fig. 3. A One of the input patterns used in the training ensemble. B Output of the network after learning and the target.
C Comparison of the mean square error (in units of 10^-3) achieved on test data by the model of Back and Tsoi (BT) and by the dynamic network (DN). D Comparison of the number of adjustable parameters. The network model of Back and Tsoi (BT) utilizes slightly more adjustable parameters than the dynamic network (DN).

[Figure 5: four matrix plots over the parameters W, U, D, F. Panels A and C show the impact of single parameters (diagonal) and of parameter pairs (off-diagonal); panels B and D show the impact of subsets of size three, labeled by the parameter that is left out.]

Figure 5: Impact of different synaptic parameters on the learning capabilities of a dynamic network. The size of a square (the "impact") is proportional to the inverse of the mean squared error averaged over N trials. A In each trial (N = 100) a different quadratic filter matrix H^Q (m = 6) was randomly generated as described in Fig. 3. Along the diagonal one can see the impact of a single parameter, whereas the off-diagonal elements (which are symmetric) represent the impact of changing pairs of parameters. B The impact of subsets of size three is shown, where the labels indicate which parameter is not included. C Same interpretation as for panel A, but the results shown (N = 20) are for the filter used in [6]. D Same interpretation as for panel B, but the results shown (N = 20) are for the same filter as in panel C.

This shows that a very simple feedforward network with biologically realistic synaptic dynamics yields performance comparable to that of artificial networks that were previously designed to yield good performance in the time series domain without any claims of biological realism.

4  Which Parameters Matter?

It remains an open experimental question which synaptic parameters are subject to use-dependent plasticity, and under what conditions. For example, long term potentiation appears to change synaptic dynamics between pairs of layer 5 cortical neurons [8] but not in the hippocampus [9]. We therefore wondered whether plasticity in the synaptic dynamics is essential for a dynamic network to be able to learn a particular target filter. To address this question, we compared network performance when different parameter subsets were optimized using the conjugate gradient algorithm, while the other parameters were held fixed. In all experiments, the fixed parameters were chosen to ensure heterogeneity in presynaptic dynamics.

Fig. 5 shows that changing only the postsynaptic parameter W has comparable impact to changing only the presynaptic parameters U or D, whereas changing only F has little impact on the dynamics of these networks (see the diagonals of Fig. 5A and Fig. 5C). However, to achieve good performance one has to change at least two different types of parameters, such as {W, U} or {W, D} (all other pairs yield worse performance). Hence, neither plasticity in the presynaptic dynamics (U, D, F) alone nor plasticity of the postsynaptic efficacy (W) alone was sufficient to achieve good performance in this model.
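
The parameter-subset comparison can be sketched as follows for a single synapse (a simplified stand-in for the full network; names and default values are ours, and dynamic_synapse is the sketch from Section 1):

    import numpy as np
    from scipy.optimize import minimize

    def fit_subset(x, z_target, subset, theta_fixed):
        """Optimize only the parameters named in `subset` (e.g. {"W", "U"}),
        holding the remaining entries of theta_fixed = {"W": ..., "U": ..., "D": ..., "F": ...} fixed.
        Returns the mean squared error achieved with this subset.
        """
        names = [n for n in ("W", "U", "D", "F") if n in subset]

        def loss(free):
            theta = dict(theta_fixed)
            theta.update(zip(names, free))
            z = dynamic_synapse(x, theta["W"], theta["U"], theta["D"], theta["F"])
            return np.mean((z - z_target) ** 2)

        res = minimize(loss, np.array([theta_fixed[n] for n in names]), method="CG")
        return res.fun

    # The "impact" plotted in Fig. 5 is proportional to the inverse of this error,
    # averaged over several randomly generated target filters.
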
5  A Universal Approximation Theorem for Dynamic Networks

In the preceding sections we presented empirical evidence for the approximation capabilities of our dynamic network model for computations in the time series domain. This raises the question of what the theoretical limits of their approximation capabilities are. The rigorous theoretical result presented in this section shows that basically there are no significant a priori limits. Furthermore, in spite of the rather complicated system of equations that defines dynamic networks, one can give a precise mathematical characterization of the class of filters that can be approximated by them. This characterization involves the following basic concepts. An arbitrary filter F is called time invariant if a shift of the input functions by a constant t_0 just causes a shift of the output function by the same constant t_0. Another essential property of filters is fading memory. A filter F has fading memory if and only if the value of (F\underline{x})(0) can be approximated arbitrarily closely by the value of (F\underline{v})(0) for functions \underline{v} that approximate the functions \underline{x} on sufficiently long bounded intervals [-T, 0]. Interesting examples of linear and nonlinear time invariant filters with fading memory can be generated with the help of representations of the form

    (Fx)(t) = \int_0^{\infty} \cdots \int_0^{\infty} x(t - \tau_1) \cdots x(t - \tau_k) \, h(\tau_1, \ldots, \tau_k) \, d\tau_1 \cdots d\tau_k

for measurable and essentially bounded functions x : R -> R (with h in L^1). One refers to such an integral as a Volterra term of order k. Note that for k = 1 it yields the usual representation of a linear time invariant filter. The class of filters that can be represented by Volterra series, i.e., by finite or infinite sums of Volterra terms of arbitrary order, has been investigated for quite some time in neurobiology and engineering.

Theorem 1  Assume that X is the class of functions from R into [B_0, B_1] which satisfy |x(t) - x(s)| <= B_2 |t - s| for all t, s in R, where B_0, B_1, B_2 are arbitrary real-valued constants with 0 < B_0 < B_1 and 0 < B_2. Let F be an arbitrary filter that maps vectors of functions \underline{x} = (x_1, ..., x_n) in X^n into functions from R into R. Then the following are equivalent:

(a) F can be approximated by dynamic networks N defined in Fig. 2 (i.e., for any epsilon > 0 there exists such a network N such that |(F\underline{x})(t) - (N\underline{x})(t)| < epsilon for all \underline{x} in X^n and all t in R)

(b) F can be approximated by dynamic networks (see Fig. 2) with just a single layer of sigmoidal neurons

(c) F is time invariant and has fading memory

(d) F can be approximated by a sequence of (finite or infinite) Volterra series.

The proof of Theorem 1 relies on the Stone-Weierstrass Theorem, and is contained as the proof of Theorem 3.4 in [10].

The universal approximation result contained in Theorem 1 turns out to be rather robust with regard to changes in the definition of a dynamic network. Dynamic networks with just one layer of dynamic synapses and one subsequent layer of sigmoidal gates can approximate the same class of filters as dynamic networks with an arbitrary number of layers of dynamic synapses and sigmoidal neurons.
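
Before turning to further robustness properties of Theorem 1, here is a crude numerical illustration (not part of the theorem or its proof) of the fading-memory condition (c) for the single-synapse filter of Equations (1) to (4): two inputs that differ in the distant past but agree on the last T steps yield nearly identical outputs once T is large. The parameter values below are arbitrary choices of ours, and dynamic_synapse is the earlier sketch.

    import numpy as np

    def fading_memory_gap(T, steps=400, seed=0):
        """Output difference at the final step for two inputs that agree on the last T steps."""
        assert 0 < T < steps
        rng = np.random.default_rng(seed)
        common_tail = rng.uniform(0.0, 1.0, size=T)
        x1 = np.concatenate([rng.uniform(0.0, 1.0, size=steps - T), common_tail])
        x2 = np.concatenate([rng.uniform(0.0, 1.0, size=steps - T), common_tail])
        out1 = dynamic_synapse(x1, W=1.0, U=0.2, D=30.0, F=50.0)
        out2 = dynamic_synapse(x2, W=1.0, U=0.2, D=30.0, F=50.0)
        return abs(out1[-1] - out2[-1])   # shrinks as T grows

    # e.g. fading_memory_gap(10) is typically much larger than fading_memory_gap(200).
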
It can also be shown that Theorem 1 remains valid if one considers networks with depressing synapses only, or if one uses the model for synaptic dynamics proposed in [1].

6  Discussion

Our central hypothesis is that rapid changes in synaptic strength, mediated by mechanisms such as facilitation and depression, are an integral part of neural processing. We have analyzed the computational power of such dynamic networks, which represent a new paradigm for neural computation on time series that is based on biologically realistic models for synaptic dynamics [11].

Our analytical results show that the class of nonlinear filters that can be approximated by dynamic networks, even with just a single hidden layer of sigmoidal neurons, is remarkably rich. It contains every time invariant filter with fading memory, hence arguably every filter that is potentially useful for a biological organism.

The computer simulations we performed show that rather small dynamic networks are not only able to perform interesting computations on time series, but that their performance is comparable to that of previously considered artificial neural networks that were designed for the purpose of efficient processing of temporal signals. We have tested dynamic networks on tasks such as the learning of a randomly chosen quadratic filter, as well as on the learning task used in [6], to illustrate the potential of this architecture.

References

[1] J. A. Varela, K. Sen, J. Gibson, J. Fost, L. F. Abbott, and S. B. Nelson. A quantitative description of short-term plasticity at excitatory synapses in layer 2/3 of rat primary visual cortex. J. Neurosci., 17:220-4, 1997.

[2] M. V. Tsodyks, K. Pawelzik, and H. Markram. Neural networks with dynamic synapses. Neural Computation, 10:821-835, 1998.

[3] H. Markram, Y. Wang, and M. Tsodyks. Differential signaling via the same axon of neocortical pyramidal neurons. PNAS, 95:5323-5328, 1998.

[4] R. C. deCharms and M. M. Merzenich. Optimizing sound features for cortical neurons. Science, 280:1439-43, 1998.

[5] J. Hertz, A. Krogh, and R. Palmer. Introduction to the Theory of Neural Computation. Addison-Wesley, 1991.

[6] A. D. Back and A. C. Tsoi. A simplified gradient algorithm for IIR synapse multilayer perceptrons. Neural Computation, 5:456-462, 1993.

[7] W. A. Little and G. L. Shaw. A statistical theory of short and long term memory. Behavioural Biology, 14:115-33, 1975.

[8] H. Markram and M. Tsodyks. Redistribution of synaptic efficacy between neocortical pyramidal neurons. Nature, 382:807-10, 1996.

[9] D. K. Selig, R. A. Nicoll, and R. C. Malenka. Hippocampal long-term potentiation preserves the fidelity of postsynaptic responses to presynaptic bursts. J. Neurosci., 19:1236-46, 1999.

[10] W. Maass and E. D. Sontag. Neural systems as nonlinear filters. Neural Computation, 12(8):1743-1772, 2000.

[11] A. M. Zador. The basic unit of computation. Nature Neuroscience, 3(Supp):1167, 2000.
\n\n\f", "award": [], "sourceid": 1903, "authors": [{"given_name": "Thomas", "family_name": "Natschl\u00e4ger", "institution": null}, {"given_name": "Wolfgang", "family_name": "Maass", "institution": null}, {"given_name": "Eduardo", "family_name": "Sontag", "institution": null}, {"given_name": "Anthony", "family_name": "Zador", "institution": null}]}