{"title": "An Adaptive WTA using Floating Gate Technology", "book": "Advances in Neural Information Processing Systems", "page_first": 720, "page_last": 726, "abstract": null, "full_text": "An Adaptive WTA using Floating Gate Technology\n\nW. Fritz Kruger, Paul Hasler, Bradley A. Minch, and Christof Koch\n\nCalifornia Institute of Technology\n\nPasadena, CA 91125\n\n(818) 395-2812\n\nstretch@klab.caltech.edu\n\nAbstract\n\nWe have designed, fabricated, and tested an adaptive Winner-Take-All (WTA) circuit based upon the classic WTA of Lazzaro et al. [1]. We have added a time dimension (adaptation) to this circuit to make the input derivative an important factor in winner selection. To accomplish this, we have modified the classic WTA circuit by adding floating gate transistors which slowly null their inputs over time. We present a simplified analysis and experimental data of this adaptive WTA fabricated in a standard 2 \u03bcm CMOS process.\n\n1 Winner-Take-All Circuits\n\nIn a WTA network, each cell has one input and one output. For any set of inputs, the outputs will all be at zero except for the one from the cell with the maximum input. One way to accomplish this is by a global nonlinear inhibition coupled with a self-excitation term [2]. Each cell inhibits all others while exciting itself; thus a cell with even a slightly greater input than the others will excite itself up to its maximal state and inhibit the others down to their minimal states. The WTA function is important for many classical neural nets that involve competitive learning, vector quantization, and feature mapping. The classic WTA network characterized by Lazzaro et al. [1] is an elegant, simple circuit that shares just one common line among all cells of the network to propagate the inhibition. 
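The idealized input-output behavior described in this section can be sketched in a few lines of Python. This is only an illustration of the WTA function, not a model of the circuit dynamics; the function name ideal_wta, the normalized winner output of 1.0, and the rule that ties go to the first cell are assumptions made for the example.

```python
def ideal_wta(inputs):
    '''Idealized winner-take-all (illustrative sketch only).

    Every output is zero except the one belonging to the cell with the
    maximum input. The winner output of 1.0 and the first-cell tie rule
    are assumptions for this example, not properties of the circuit.
    '''
    if not inputs:
        return []
    # Index of the largest input; max() returns the first one on ties.
    winner = max(range(len(inputs)), key=lambda i: inputs[i])
    return [1.0 if i == winner else 0.0 for i in range(len(inputs))]


if __name__ == '__main__':
    # The second cell has the largest input, so only its output is nonzero.
    print(ideal_wta([0.2, 0.9, 0.5]))
```

Because the output depends only on which input is largest, a constant input pattern keeps the same winner indefinitely in this idealization.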
\n\nOur motivation to add adaptation comes from the idea of saliency maps. Picture a saliency map as a large number of cells, each of which encodes an analog value reflecting some measure of the importance (saliency) of its input. We would like to pay attention to the most salient cell, so we employ a WTA function to tell us where to look. But if the input doesn't change, we never look away from that one cell. We would like to introduce some concept of fatigue and refraction to each cell such that after winning for some time, it tires, allowing other cells to win, and then it must wait some time before it can win again. We call this circuit an adaptive WTA.\n\nFigure 1: The circuit diagram of a two input winner-take-all circuit.\n\nIn this paper, we present an adaptive WTA based upon the classic WTA; Figure 1 shows a two-input, adaptive WTA circuit. The difference between the classic and adaptive WTA is that M4 and M5 are pFET single transistor synapses. A single transistor synapse [3] is either an nFET or pFET transistor with a floating gate and a tunneling junction. This enhancement results in the ability of each transistor to adapt to its input bias current. The adaptation is a result of the electron tunneling and hot-electron injection modifying the charge on the floating gate; equilibrium is established when the tunneling current equals the injection current. The circuit is devised in such a way that these are negative feedback mechanisms; consequently the output voltage will always return to the same steady state voltage determined by its bias current regardless of the DC input level. 
Like the autozeroing amplifier [4], the adaptive WTA is an example of a circuit where the adaptation occurs as a natural part of the circuit operation.\n\n2 pFET hot-electron injection and electron tunneling\n\nBefore considering the behavior of the adaptive WTA, we will review the processes of electron tunneling and hot-electron injection in pFETs. In subthreshold operation, we can describe the channel current of a pFET ($I_p$) for a differential change in gate voltage, $\\Delta V_g$, around a fixed bias current $I_{so}$, as $I_p = I_{so} \\exp(-\\kappa_p \\Delta V_g / U_T)$, where $\\kappa_p$ is the amount by which $\\Delta V_g$ affects the surface potential of the pFET, and $U_T$ is $kT/q$. We will assume for this paper that all transistors are identical.\n\nFigure 2: pFET hot-electron injection. (a) Band diagram of a subthreshold pFET transistor under conditions favorable for hot-electron injection. (b) Measured data of pFET injection efficiency versus the drain to channel voltage for four source currents. Injection efficiency is the ratio of injection current to source current. At $\\Phi_{dc}$ equal to 8.2 V, the injection efficiency increases by a factor of e for an increase in $\\Phi_{dc}$ of 250 mV.\n\nFirst, we consider electron tunneling. We start with the classic model of electron tunneling through a silicon-SiO2 system [5]. 
As in the autozeroing amplifier [4], the tunneling current will be only a weak function of the voltage swing on the floating gate through the region of subthreshold currents; therefore we will approximate the tunneling junction as a current source supplying a current $I_{tun0}$ to the floating gate.\n\nSecond, we derive a simple model of pFET hot-electron injection. Figure 2a shows the band diagram of a pFET operating at bias conditions which are favorable for hot-electron injection. Hot-hole impact ionization creates electrons at the drain edge of the depletion region. These secondary electrons travel back into the channel region, gaining energy as they go. When their energy exceeds that of the SiO2 barrier, they can be injected through the oxide to the floating gate. The hole impact ionization current is proportional to the source current, and is an exponential function of the voltage drop from channel to drain ($\\Phi_{dc}$). The injection current is proportional to the hole impact ionization current and is an exponential function of the voltage drop from channel to drain. We will neglect the dependence on the floating-gate voltage for a given source current and $\\Phi_{dc}$, as we did in [4]. Figure 2b shows measured injection efficiency for several source currents, where injection efficiency is the ratio of the injection current to source current. The injection efficiency is independent of source current and is approximately linear over a 1 to 2 V swing in $\\Phi_{dc}$; therefore we model the injection efficiency as proportional to $\\exp(-\\Delta\\Phi_{dc} / V_{inj})$ within that 1 to 2 V swing, where $V_{inj}$ is a measured device parameter which for our process is 250 mV at a bias $\\Phi_{dc}$ = 8.2 V, and $\\Delta\\Phi_{dc}$ is the change in $\\Phi_{dc}$ from the bias level. 
An increasing voltage input will increase the pFET surface potential by capacitive coupling to the floating gate. Increasing the pFET surface potential will increase the source current, thereby decreasing $\\Phi_{dc}$ for a fixed output voltage and lowering the injection efficiency.\n\nFigure 3: Illustration of the dynamics for the winning and losing input voltages. (a) Measured V1 versus time (s) due to an upgoing and a downgoing input current step. The initial input voltage change due to the input step is much smaller than the voltage change due to the adaptation. (b) Adaptation time of a losing input voltage versus input current step (% of bias current) for several tunneling voltages. The adaptation time is the time from the start of the input current step to the time the input voltage is within 10% of its steady state voltage. A larger tunneling voltage decreases the adaptation time by increasing the tunneling current supplied to the floating gate.\n\n3 Two input Adaptive WTA\n\nWe will outline the general procedure to derive the equations that describe the two input WTA shown in Fig. 1. We first observe that transistors M1, M2, and M3 make up a differential pair. 
Regardless of any adaptation, the middle V node and output currents are set by the input voltages (V1 and V2), which are set by the input currents, as in the classic WTA [1]. The dynamics for high frequency operation are also similar to the classic WTA circuit. Next, we can write the two Kirchhoff Current Law (KCL) equations at V1 and V2, which relate the changes in V1 and V2 to the two input currents and the floating gate voltages. Finally, we can write the two KCL equations at the two floating gates Vfg1 and Vfg2, which relate the changes in the floating gate voltages to V1 and V2. This procedure is directly extendable to multiple inputs. A full analysis of these equations is very difficult and will be described in another paper.\n\nFor this discussion, we present a simplified analysis to develop the intuition of the circuit operation. At sufficiently high frequencies, the tunneling and injection currents do not adapt the floating gate voltages sufficiently fast to keep the input voltages at their steady state levels. At these frequencies, the adaptive WTA acts like the classic WTA circuit with one small difference. A change in the input voltages, V1 or V2, is linearly related to V by the capacitive coupling ($\\Delta V_1 = -\\frac{C_1}{C_2} \\Delta V$), whereas this relationship is exponential in the classic WTA. There is always some capacitance C2, even if not explicitly drawn, due to the overlap capacitance from the floating gate to drain. This property gives the designer the added freedom to modify the gain. We will assume the circuit operates in its intended operating regime where the floating gate transistors settle sufficiently fast such that their channel 
current equals the input currents\n\n$$I_i = I_{so} \\exp\\left(-\\frac{\\kappa \\Delta V_{fg,i}}{U_T}\\right) \\quad \\rightarrow \\quad \\frac{dI_i}{dt} = -I_i \\frac{\\kappa}{U_T} \\frac{dV_{fg,i}}{dt} \\qquad (1)$$\n\nfor all inputs indexed by i, but not necessarily fast enough for the floating gates to settle to their final steady state levels.\n\nFigure 4: Measured change in steady state input voltages as a function of bias current. (a) Change in the two steady state output voltages as a function of the bias current of the second input. The bias current of the first input was held fixed at 8.14 nA. (b) Change in the RMS noise of the two output voltages as a function of the bias current of the second input. The RMS noise is much higher for the losing input than for the winning input. Note that the point where the two bias currents cross roughly corresponds to the location where the RMS noise on the two input voltages is equal.\n\nTo develop some initial intuition, we shall begin by considering one half of the two input WTA: transistors M1, M2, and M4 of Figure 1. First, we notice that Iout1 is equal to Ib (the current through transistor M1); note that this is not true for the multiple input case. By equating these two currents we get an equation for V as $V = \\kappa V_1 - \\kappa V_b$, where we will assume that Vb is a fixed bias voltage. Assuming the input current equals the current through M4, V1 obeys the equation\n\n$$(\\kappa C_1 + C_2) \\frac{dV_1}{dt} = -\\frac{C_T U_T}{\\kappa I_1} \\frac{dI_1}{dt} + I_{tun0} \\left( \\frac{I_1}{I_{so}} \\exp\\left(-\\frac{\\Delta V_1}{V_{inj}}\\right) - 1 \\right) \\qquad (2)$$\n\nwhere $C_T$ is the total capacitance connected to the floating gate. The steady state of (2) is\n\n$$\\Delta V_i = V_{inj} \\ln\\left(\\frac{I_i}{I_{so}}\\right) \\qquad (3)$$\n\nwhich is exactly the same expression for each input in a multiple input WTA. 
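The adaptation toward steady state can be checked numerically with a short Python sketch. It assumes that, for a constant input current, equation (2) reduces to (kappa C1 + C2) dV1/dt = Itun0 ((I1/Iso) exp(-dV1/Vinj) - 1), and that the substitution X = exp(dV1/Vinj) turns this into the linear equation tau dX/dt = I1/Iso - X with tau = (kappa C1 + C2) Vinj / Itun0. The function names, the tunneling current, the capacitance, and the current ratio are hypothetical values chosen for illustration; only Vinj = 250 mV is taken from Section 2.

```python
import math


def steady_state_shift(i_ratio, v_inj):
    # Fixed point for constant current: (I1/Iso) * exp(-dV1/Vinj) = 1.
    return v_inj * math.log(i_ratio)


def closed_form_shift(t, i_ratio, v_inj, i_tun0, c_eff, dv0=0.0):
    # With X = exp(dV1/Vinj) the assumed equation is tau * dX/dt = i_ratio - X.
    tau = c_eff * v_inj / i_tun0
    x0 = math.exp(dv0 / v_inj)
    x = i_ratio + (x0 - i_ratio) * math.exp(-t / tau)
    return v_inj * math.log(x)


def euler_shift(steps, dt, i_ratio, v_inj, i_tun0, c_eff, dv0=0.0):
    # Forward-Euler integration of the nonlinear equation in dV1 itself.
    dv = dv0
    for _ in range(steps):
        dv += dt * (i_tun0 / c_eff) * (i_ratio * math.exp(-dv / v_inj) - 1.0)
    return dv


if __name__ == '__main__':
    # Hypothetical values: I1/Iso = 1.3, Itun0 = 1 fA, kappa*C1 + C2 = 100 fF.
    ratio, v_inj, i_tun0, c_eff = 1.3, 0.25, 1e-15, 1e-13
    print(euler_shift(50000, 0.01, ratio, v_inj, i_tun0, c_eff))
    print(steady_state_shift(ratio, v_inj))
```

With these assumed values the time constant is 25 s, so integrating for 500 s brings the Euler result to within numerical error of the logarithmic steady state.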
We get a linear differential equation by making the substitution $X = \\exp(\\Delta V_1 / V_{inj})$ [4], and we get similar solutions to the behavior of the autozeroing amplifier. Figure 3a shows measured data for an upgoing and a downgoing current step. The input current change results in an initial fast change in the input voltage, and the input voltage then adapts to its steady state voltage, which is a much greater voltage change. From the voltage difference between the steady states, we get that $V_{inj}$ is roughly 500 mV.\n\nFigure 5: Experimental time trace measurements of the output current and voltage for small differential input current steps; the horizontal axis in both panels is time (s). (a) Time traces for small differential current steps around nearly identical bias currents of 8.6 nA. (b) Time traces for small differential current steps around two different bias currents of 8.7 nA and 0.88 nA. In the classic WTA, the output currents would show no response to the input current steps.\n\nReturning to the two input case, we get two floating gate equations by assuming that the currents through M4 and M5 are equal to their respective input currents and writing the KCL equations at each floating gate. If V1 and V2 do not cross each other in the circuit operation, then one can easily solve these KCL equations. Assume without loss of generality that V1 is the winning voltage, which implies that $\\Delta V = \\kappa \\Delta V_1$. 
The initial input voltage change before the floating gate adaptation due to a step in the two input currents of $I_1 \\rightarrow I_1^+$ and $I_2 \\rightarrow I_2^+$ is\n\n$$\\Delta V_1 = \\frac{C_T}{\\kappa C_1} \\ln\\left(\\frac{I_1^+}{I_1}\\right), \\qquad \\Delta V_2 \\approx \\frac{C_T}{C_2} \\ln\\left(\\frac{I_1 I_2^+}{I_1^+ I_2}\\right) \\qquad (4)$$\n\nfor C2 much less than $\\kappa C_1$. In this case, V1 moves on the order of the floating gate voltage change, but V2 moves on the order of the floating gate voltage change amplified by $C_1/C_2$. The response of $\\Delta V_1$ is governed by an equation identical to (2) of the earlier half-analysis, and therefore results in a small change in V1. Also, any perturbation of V is only slightly amplified at V1 due to the feedback; therefore any noise at V will only be slightly amplified into V1. The restoration of V2 is much quicker than that of the V1 node if C2 is much less than $\\kappa C_1$; therefore after the initial input step, one can safely assume that V is nearly constant. The voltage at V is amplified by $-C_1/C_2$ at V2; therefore any noise at V is amplified at the losing voltage, but not at the winning voltage, as the data in Fig. 4b show. The losing dynamics are identical to the step response of an autozeroing amplifier [4]. Figure 3b shows the variation of the adaptation time versus the percent input current change for several values of tunneling voltage.\n\nThe main difficulty in exactly solving these KCL equations is the point in the dynamics where V1 crosses V2, since the behavior changes when the signals move through the crossover point. 
If we get more than a sufficient V1 decrease to reach the starting V2 equilibrium, then the rest of the input change is manifested by an increase in V2. If the voltage V2 crosses the voltage V1, then V will be set by the new steady state, and V1 is governed by losing dynamics until $V_1 \\approx V_2$. At this point V1 is nearly constant and V2 is governed by losing dynamics. This analysis is directly extendable to an arbitrary number of inputs.\n\nFigure 5 shows some characteristic traces from the two-input circuit. Recall that the winning node is that with the lowest voltage, which is reflected in its corresponding high output current. In Fig. 5a, we see that as an input step is applied, the output current jumps and then begins to adapt to a steady state value. When the inputs are nearly equal, the steady state outputs are nearly equal; but when the inputs are different, the steady state output is greater for the cell with the lesser input. In general, the input current change that is the largest after reaching the previous equilibrium becomes the new equilibrium. This additional decrease in V1 would lead to an amplified increase in the other voltage, since the losing stage roughly looks like an autozeroing amplifier with the common node as the input terminal. The extent to which the inputs do not equal this largest input is manifested as a proportionally larger input voltage. The other voltage would return to equilibrium by slowly, linearly decreasing in voltage due to the tunneling current. This process will continue until V1 equals V2. Note in general that the inputs with lower bias currents have a slight starting advantage over the inputs with higher bias currents.\n\nFigure 5b illustrates the advantage of the adaptive WTA over the classic WTA. 
In the classic WTA, the output voltage and current would not change throughout the experiment, but the adaptive WTA responds to changes in the input. The second input step does not evoke a response because there was not enough time to adapt to steady state after the previous step; but the next step immediately causes it to win. Also note in both of these traces that the noise is very large in the losing node and small in the winner because of the gain differences (see Figure 4b).\n\nReferences\n\n[1] J. Lazzaro, S. Ryckebusch, M. A. Mahowald, and C. A. Mead, \"Winner-Take-All Networks of O(N) Complexity\", NIPS 1, Morgan Kaufmann Publishers, San Mateo, CA, 1989, pp. 703-711.\n\n[2] S. Grossberg, \"Adaptive Pattern Classification and Universal Recoding: I. Parallel Development and Coding of Neural Feature Detectors\", Biological Cybernetics, vol. 23, pp. 121-134, 1988.\n\n[3] P. Hasler, C. Diorio, B. A. Minch, and C. Mead, \"Single Transistor Learning Synapses\", NIPS 7, MIT Press, 1995, pp. 817-824. Also at http://www.pcmp.caitech.edu/anaprose/paul.\n\n[4] P. Hasler, B. A. Minch, C. Diorio, and C. Mead, \"An autozeroing amplifier using pFET Hot-Electron Injection\", ISCAS, Atlanta, 1996, pp. III-325 - III-328. Also at http://www.pcmp.caitech.edu/anaprose/paul.\n\n[5] M. Lenzlinger and E. H. Snow, \"Fowler-Nordheim tunneling into thermally grown SiO2\", J. Appl. Phys., vol. 40, pp. 278-283, 1969.\n", "award": [], "sourceid": 1205, "authors": [{"given_name": "W.", "family_name": "Kruger", "institution": null}, {"given_name": "Paul", "family_name": "Hasler", "institution": null}, {"given_name": "Bradley", "family_name": "Minch", "institution": null}, {"given_name": "Christof", "family_name": "Koch", "institution": null}]}