{"title": "Backpropagation and Its Application to Handwritten Signature Verification", "book": "Advances in Neural Information Processing Systems", "page_first": 340, "page_last": 347, "abstract": null, "full_text": "BACKPROPAGATION AND ITS APPLICATION TO HANDWRITTEN SIGNATURE VERIFICATION\n\nDorothy A. Mighell\nElectrical Eng. Dept.\nInfo. Systems Lab\nStanford University\nStanford, CA 94305\n\nTimothy S. Wilkinson\nElectrical Eng. Dept.\nInfo. Systems Lab\nStanford University\nStanford, CA 94305\n\nJoseph W. Goodman\nElectrical Eng. Dept.\nInfo. Systems Lab\nStanford University\nStanford, CA 94305\n\nABSTRACT\n\nA pool of handwritten signatures is used to train a neural network for the task of deciding whether or not a given signature is a forgery. The network is a feedforward net, with a binary image as input. There is a hidden layer, with a single-unit output layer. The weights are adjusted according to the backpropagation algorithm. The signatures are entered into a C software program through the use of a Datacopy Electronic Digitizing Camera. The binary signatures are normalized and centered. The performance is examined as a function of the training set and network structure. The best scores are on the order of 2% true signature rejection with 2-4% false signature acceptance.\n\nINTRODUCTION\n\nSignatures are used every day to authorize the transfer of funds for millions of people. We use our signature as a form of identity, consent, and authorization. Bank checks, credit cards, legal documents and waivers all require the ever-changing personalized signature. Forgeries on such transactions amount to millions of dollars lost each year. A trained eye can spot most forgeries, but it is not cost-effective to hand-check all signatures because of the massive number of daily transactions. 
Consequently, only disputed claims and checks written for large amounts are verified. The consumer would certainly benefit from the added protection of automated verification. Neural networks lend themselves very well to signature verification. Previously, they have proven applicable to other signal processing tasks, such as character recognition {Fukushima, 1986} {Jackel, 1988}, sonar target classification {Gorman, 1986}, and control, as in the broom balancer {Tolat, 1988}.\n\nHANDWRITING ANALYSIS\n\nSignature verification is only one aspect of the study of handwriting analysis. Recognition is the objective, whether it be of the writer or the characters. Writer recognition can be further broken down into identification and verification. Identification selects the author of a sample from among a group of writers. Verification confirms or rejects a written sample for a single author. In both cases, it is the style of writing that is important.\n\nDeciphering written text is the basis of character recognition. In this task, linguistic information such as the individual characters or words is extracted from the text. Style must be eliminated to get at the content. A very important application of character recognition is automated reading of ZIP codes in the post office {Jackel, 1988}.\n\nData for handwriting analysis may be either dynamic or static. Dynamic data requires special devices for capturing the temporal characteristics of the sample. Features such as pressure, velocity, and position are examined in the dynamic framework. Such analysis is usually performed on-line in real time.\n\nStatic analysis uses the final trace of the writing, as it appears on paper. 
Static analysis does not require any special processing devices while the signature is being produced. Centralized verification becomes possible, and the processing may be done off-line.\n\nWork has been done in both static and dynamic analysis {Sato, 1982} {Nemcek, 1974}. Generally, signature verification efforts have been more successful using the dynamic information. It would be extremely useful, though, to perform the verification using only the written signature. This would eliminate the need for costly machinery at every place of business. Personal checks may also be verified through a static signature analysis.\n\nTASK\n\nThe handwriting analysis task with which this paper is concerned is that of signature verification using an off-line method to detect casual forgeries. Casual forgeries are non-professional forgeries, in which the writer does not practice reproducing the signature. The writer may not even have a copy of the true signature. Casual forgeries are very important to detect. They are far more abundant, and involve greater monetary losses than professional forgeries. This signature verification task falls into the writer recognition category, in which the style of writing is the important variable. The off-line analysis allows centralized verification at a lower cost and broader use.\n\nHANDWRITTEN SIGNATURES\n\nThe signatures for this project were gathered from individuals to produce a pool of 80 true signatures and 66 forgeries. These are signatures, true and false, for one person. There is a further collection of signatures, both true and false, for other persons, but the majority of the results presented will be for the one individual. It will be clear when other individuals are included in the demonstration. 
\n\nThe signatures are collected on 3x5 index cards which have a small blue box as a guideline. The cards are scanned with a CCD array camera from Datacopy, and thresholded to produce binary images. These binary images are centered and normalized to fit into a 128x64 matrix. Either the entire 128x64 image is presented as input, or a 90x64 image of the three initials alone is presented. It is also possible to present preprocessed inputs to the network.\n\nSOFTWARE SIMULATION\n\nThe learning algorithm employed is backpropagation. Both dwell and momentum are included. Dwell is the type of scheduling employed, in which an image is presented to the network, and the network is allowed to \"dwell\" on that input for a few iterations while updating its weights. C. Rosenberg and T. Sejnowski have done a few studies on the effects of scheduling on learning {Rosenberg, 1986}. Momentum is a term included in the change of weights equation to speed up learning {Rumelhart, 1986}.\n\nThe software is written in Microsoft C, and run on an IBM PC/AT with an 80287 math co-processor chip.\n\nIncluded in the simulation is a piece-wise linear approximation to the sigmoid transfer function as shown in Figure 1. This greatly improves the speed of calculation, because an exponential is not calculated. The non-linearity is kept to allow for layering of the network. Most of the details of initialization and update are the same as those reported for NETtalk {Sejnowski, 1986}.\n\nFigure 1. Piece-wise linear transfer function. (Axes: IN, OUT.)\n\nMany different nets were trained in this signature verification project, all of which were feed-forward. The output layer most often consisted of a single output neuron, but 5 output neurons have been used as well. 
If a hidden layer was used, then the number of hidden units ranged from 2 to 53. The networks were both fully-connected and partially-connected.\n\nSAMPLE RUN\n\nThe simplest network is that of a single neuron taking all 128x64 pixels as input, plus one bias. Each pixel has a weight associated with it, so that the total number of weights is 128x64 + 1 = 8193. Each white pixel is assigned an input value of +1; each black pixel has a value of -1. The training set consists of 10 true signatures with 10 forgeries. Figure 2a depicts the network structure of this sample run.\n\nFigure 2. Sample run.\n\na) Network = one output neuron, one weight per pixel, fully connected. Training set = 10 true signatures + 10 forgeries.\n\nb) ROC plot for the sample run. (Probability of false acceptance vs probability of true detection). Test set = 70 true signatures + 56 forgeries.\n\nc) Clipped picture of the weights for the sample run. White = positive weight, black = negative weight.\n\nd) Cumulative distribution function for the true signatures (+) and for the forgeries (0) of the sample run.\n\nThe network is trained on these 20 signatures until all signatures are classified correctly. The trained network is then tested on the remaining 70 true signatures and 56 forgeries.\n\nThe results are depicted in Figures 2b and 2d. Figure 2b is a receiver operating characteristic curve, or ROC plot for short. 
In this presentation of data, the probability of detecting a true signature is plotted against the probability of accepting a forgery. ROC plots have been used for some time in the radar sciences as a means for visualizing performance {Marcum, 1960}. A perfect ROC plot has a right angle in the upper left-hand corner, which would show perfect separation of true signatures from forgeries. The curve is plotted by varying the threshold for classification. Everything above the threshold is labeled a true signature; everything below the threshold is labeled a forgery. The ROC plot in Figure 2b is close to perfect, but there is some overlap in the output values of the true signatures and forgeries. The overlap can be seen in the cumulative distribution functions (cdfs) for the true and false signatures as shown in Figure 2d. As seen in the cdfs, there is fairly good separation of the output values. For a given threshold of 0.5, the network produces 1% rejection of true signatures as false, with 4% acceptance of forgeries as being true. If one lowers the threshold for classification down to 0.43, the true rejection becomes nil, with a false acceptance of 7%. A simplified picture of the weights is shown in Figure 2c, with white pixels designating positive weights, and black pixels negative weights.\n\nOTHER NETWORKS\n\nThe sample run above was expanded to include 2 and 3 hidden neurons with the single output neuron. The results were similar to the single unit network, implying that the separation is linear.\n\nThe 128x64 input image was also divided into regions, with each region feeding into a single neuron. In one network structure, the input was sectioned into 32 equally sized regions of 16x16 pixels. The hidden layer thus has 32 neurons, each neuron receiving 16x16 + 1 inputs. 
The output neuron had 33 inputs. Likewise, the input image was divided into 53 regions of 16x16 pixels, this time overlapping.\n\nFinally, only the initials were presented to the network. (Handwriting experts have noted that leading strokes and separate capital letters are very significant in classification {Osborn, 1929}.) In this case, two types of networks were devised. The first had a single output neuron; the second had three hidden neurons plus one output neuron. Each of the hidden neurons received inputs from only one initial, rather than from all three. The network with the single output neuron produced the best results of all, with 2% true rejection and 2% false acceptance.\n\nIMPORTANCE OF FORGERIES IN THE TRAINING SET\n\nIn all cases, the networks performed much better when forgeries were included in the training set. When an all-white image is presented as the only forgery, performance deteriorates significantly. When no forgeries are present, the network decides that all signatures are true signatures. It is therefore desirable to include actual forgeries in the training set, yet they may be impractical to obtain. One possibility for avoiding the collection of forgeries is to use computer generated forgeries. Another is to distort the true signatures. A third is to use true signatures of other people as forgeries for the person in question. The attraction of this last option is that the masquerading forgeries are already available for use.\n\nNETWORK WITHOUT FORGERIES\n\nTo test the use of true signatures of other people for forgeries, the following network is devised. Once again, the input is the 128x64 pixel image. The output layer consists of five output neurons fully connected to the input image. 
The function of each output neuron is to be active when presented with a particular person's signature. When a forgery is present, the output is to be low. Figure 3a depicts this network. The training set has 50 true signatures, ten for each of five people. Each signature has a desired output of true for one neuron, and false for the remaining four neurons. Once the network is trained, it is tested on 210 true signatures and 150 forgeries. Figures 3b and 3c record the results. At a threshold of 0.5, the true rejection is 3% and the false acceptance is 14%. Decreasing the threshold down to 0.41 gives 0% true rejection and 28% false acceptance. These results are similar to the sample run, though not as good. This is a simple demonstration of the use of other true signatures as forgeries. More sophisticated techniques could improve the discrimination. For instance, selecting names with similar lengths or spelling should improve the classification.\n\nCONCLUSION\n\nAutomated signature verification systems would be extremely important in the business world for verifying monetary transactions. Countless dollars are lost each day to instances of casual forgeries. An artificial neural network employing the backpropagation learning algorithm has been trained on both true and false signatures for classification. The results have been very good: 2% rejection of genuine signatures with 2% acceptance of forgeries. The analysis requires only the static picture of the signature, thereby offering widespread use through centralized verification. True signatures of other people may substitute for the forgeries in the training set, eliminating the need for collecting non-genuine signatures.\n\n[Figure 3, panel (a) output neuron labels: JWG, JTH, TSW, LDK, ABH]\n\n
Figure 3. Network without forgeries for 5 individuals.\n\na) Network = 5 output neurons, one for each individual, as indicated by the initials. Training set = 10 true signatures for each individual.\n\nb) ROC plot for the network without forgeries. Test set = 210 true signatures + 150 forgeries.\n\nc) Cumulative distribution function for the true signatures (+) and for the forgeries (0) of the network without forgeries.\n\nReferences\n\nK. Fukushima and S. Miyake, \"Neocognitron: A biocybernetic approach to visual pattern recognition\", in NHK Laboratories Note, Vol. 336, Sep 1986 (NHK Science and Technical Research Laboratories, Tokyo).\n\nP. Gorman and T. J. Sejnowski, \"Learned classification of sonar targets using a massively parallel network\", in the proceedings of the IEEE ASSP Oct 21, 1986 DSP Workshop, Chatham, MA.\n\nL. D. Jackel, H. P. Graf, W. Hubbard, J. S. Denker, and D. Henderson, \"An application of neural net chips: handwritten digit recognition\", in IEEE International Conference on Neural Networks 1988, II 107-115.\n\nJ. T. Marcum, \"A statistical theory of target detection by pulsed radar\", in IRE Transactions in Information Theory, Vol. IT-6 (Apr.), pp 145-267, 1960.\n\nW. F. Nemcek and W. C. Lin, \"Experimental investigation of automatic signature verification\" in IEEE Transactions on Systems, Man, and Cybernetics, Jan. 1974, pp 121-126.\n\nA. S. Osborn, Questioned Documents, 2nd edition (Boyd Printing Co, Albany NY) 1929.\n\nC. 
R. Rosenberg and T. J. Sejnowski, \"The spacing effect on NETtalk, a massively parallel network\", in Proceedings of the Eighth Annual Conference of the Cognitive Science Society, (Hillsdale, New Jersey: Lawrence Erlbaum Associates, 1986) 72-89.\n\nD. E. Rumelhart, G. E. Hinton, and R. J. Williams, \"Learning internal representations by error propagation\", in Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Vol. 1: Foundations, edited by D. E. Rumelhart & J. L. McClelland, (MIT Press, 1986).\n\nY. Sato and K. Kogure, \"Online signature verification based on shape, motion, and writing pressure\", in Proceedings of the 6th International Conference on Pattern Recognition, Vol. 2, pp 823-826 (IEEE NY) 1982.\n\nT. J. Sejnowski and C. R. Rosenberg, \"NETtalk: A Parallel Network that Learns to Read Aloud\", Johns Hopkins University Department of Electrical Engineering and Computer Science Technical Report JHU/EECS-86/01, (1986).\n\nV. V. Tolat and B. Widrow, \"An adaptive 'broom balancer' with visual inputs\", in IEEE International Conference on Neural Networks 1988, II 641-647.", "award": [], "sourceid": 105, "authors": [{"given_name": "Timothy", "family_name": "Wilkinson", "institution": null}, {"given_name": "Dorothy", "family_name": "Mighell", "institution": null}, {"given_name": "Joseph", "family_name": "Goodman", "institution": null}]}