{"title": "Unsupervised Parallel Feature Extraction from First Principles", "book": "Advances in Neural Information Processing Systems", "page_first": 136, "page_last": 143, "abstract": null, "full_text": "Unsupervised Parallel Feature Extraction from First Principles\n\nMats \u00d6sterberg\nImage Processing Laboratory\nDept. EE., Link\u00f6ping University\nS-58183 Link\u00f6ping Sweden\n\nReiner Lenz\nImage Processing Laboratory\nDept. EE., Link\u00f6ping University\nS-58183 Link\u00f6ping Sweden\n\nAbstract\n\nWe describe a number of learning rules that can be used to train unsupervised parallel feature extraction systems. The learning rules are derived using gradient ascent of a quality function. We consider a number of quality functions that are rational functions of higher order moments of the extracted feature values. We show that one system learns the principal components of the correlation matrix. Principal component analysis systems are usually not optimal feature extractors for classification. Therefore we design quality functions which produce feature vectors that support unsupervised classification. The properties of the different systems are compared with the help of different artificially designed datasets and a database consisting of all Munsell color spectra.\n\n1 Introduction\n\nThere are a number of unsupervised Hebbian learning algorithms (see Oja, 1992 and references therein) that perform some version of the Karhunen-Loeve expansion. Our approach to unsupervised feature extraction is to identify some desirable properties of the extracted feature vectors and to construct a quality function that measures these properties. The filter functions are then learned from the input patterns by optimizing this selected quality function. 
In comparison to conventional unsupervised Hebbian learning, this approach reduces the amount of communication between the units needed to learn the weights in parallel, since the complexity now lies in the learning rule used.\n\nThe optimal (orthogonal) solutions to two of the proposed quality functions turn out to be related to the Karhunen-Loeve expansion: the first learns an arbitrary rotation of the eigenvectors whereas the latter learns the pure eigenvectors. A common problem with the Karhunen-Loeve expansion is the fact that the first eigenvector is normally the mean vector of the input patterns. In this case one filter function will have a more or less uniform response for a wide range of input patterns, which makes it rather useless for classification. We will show that one quality function leads to a system that tends to learn filter functions which have a large magnitude response for just one class of samples (different for each filter function) and a low magnitude response for samples from all other classes. Thus, it is possible to classify an incoming pattern by simply observing which filter function has the largest magnitude response. Similar to Intrator's Projection Pursuit related network (see Intrator & Cooper, 1992 and references therein), some quality functions use higher order (> 2) statistics of the input process, but in contrast to Intrator's network there is no need to specify the amount of lateral inhibition needed to learn different filter functions.\n\nAll systems considered in this paper are linear, but at the end we will briefly discuss possible non-linear extensions.\n\n2 Quality functions\n\nIn the following we consider linear filter systems. 
These can be described by the equation:\n\n$$O(t) = W(t)P(t) \\qquad (1)$$\n\nwhere $P(t) \\in \\mathbb{R}^{M \\times 1}$ is the input pattern at iteration $t$, $W(t) \\in \\mathbb{R}^{N \\times M}$ is the filter coefficient matrix and $O(t) = (o_1(t), \\ldots, o_N(t))' \\in \\mathbb{R}^{N \\times 1}$ is the extracted feature vector. Usually $M > N$, i.e. the feature extraction process defines a reduction of the dimensionality. Furthermore, we assume that both the input patterns and the filter functions are normed: $\\|P(t)\\| = 1$ and $\\|W_n(t)\\| = 1$, $\\forall t\\ \\forall n$. This implies that $|o_n(t)| \\le 1$, $\\forall t\\ \\forall n$.\n\nOur first decision is to measure the scatter of the extracted feature vectors around the origin by the determinant of the output correlation matrix:\n\n$$Q_{MS}(t) = \\det E_t\\{O(t)O'(t)\\} \\qquad (2)$$\n\n$Q_{MS}(t)$ is the quality function used in the Maximum Scatter Filter System (MS-system). The use of the determinant is motivated by the following two observations: 1. The determinant is equal to the product of the eigenvalues, hence the product of the variances in the principal directions, and is thus a measure of the scattering volume in the feature space. 2. The determinant vanishes if some filter functions are linearly dependent.\n\nIn (Lenz & \u00d6sterberg, 1992) we have shown that the optimal filter functions to $Q_{MS}(t)$ are given by an arbitrary rotation of the $N$ eigenvectors corresponding to the $N$ largest eigenvalues of the input correlation matrix:\n\n$$W_{opt} = R\\,U_{eig} \\qquad (3)$$\n\nwhere $U_{eig}$ contains the largest eigenvectors (or principal components) of the input correlation matrix $E_t\\{P(t)P'(t)\\}$. $R$ is an arbitrary rotation matrix with $\\det(R) = 1$. To differentiate between these solutions we need a second criterion.\n\nOne attempt to define the best rotation is to require that the mean energy $E_t\\{o_n^2(t)\\}$ should be concentrated in as few components $o_n(t)$ of the extracted feature vector as possible. Thus, the mean energy $E_t\\{o_n^2(t)\\}$ of each filter function should be either very high (i.e. near 1) or very low (i.e. near 0). This leads to the following second order concentration measure:\n\n$$Q_2(t) = \\sum_{n=1}^{N} E_t\\{o_n^2(t)\\}\\,(1 - E_t\\{o_n^2(t)\\}) \\qquad (4)$$\n\nwhich has a low non-negative value if the energies are concentrated.\n\nAnother idea is to find a system that produces feature vectors that have unsupervised discrimination power. In this case each learned filter function should respond selectively, i.e. have a large response for some input samples and a low response for others. One formulation of this goal is that each extracted feature vector should be (up to the sign) binary: $o_i(t) = \\pm 1$ and $o_n(t) = 0$, $n \\neq i$, $\\forall t$. This can be measured by the following fourth order expression:\n\n$$Q_4(t) = E_t\\{\\sum_{n=1}^{N} o_n^2(t)\\,(1 - o_n^2(t))\\} = \\sum_{n=1}^{N} (E_t\\{o_n^2(t)\\} - E_t\\{o_n^4(t)\\}) \\qquad (5)$$\n\nwhich has a low non-negative value if the features are binary. Note that it is not sufficient to use $o_n(t)$ instead of $o_n^2(t)$, since $Q_4(t)$ would then have a low value also for feature vectors with components equal in magnitude but with opposite sign. A third criterion can be found as follows: if the filter functions have a selective filter response, then the responses to different input patterns differ in magnitude and thus the variance of the energy $o_n^2(t)$ around its mean $E_t\\{o_n^2(t)\\}$ is large. 
The total variance is measured by:\n\n$$Q_{Var}(t) = \\sum_{n=1}^{N} \\mathrm{Var}\\{o_n^2(t)\\} = \\sum_{n=1}^{N} E_t\\{(o_n^2(t) - E_t\\{o_n^2(t)\\})^2\\} = \\sum_{n=1}^{N} (E_t\\{o_n^4(t)\\} - (E_t\\{o_n^2(t)\\})^2) \\qquad (6)$$\n\nFollowing (Darlington, 1970) it can be shown that the distribution of $o_n^2$ should be bimodal (modes below and above $E_t\\{o_n^2\\}$) to maximize $Q_{Var}(t)$. The main difference between $Q_{Var}(t)$ and the quality function used by Intrator is the use of a fourth order term $E_t\\{o_n^4(t)\\}$ instead of a third order term $E_t\\{o_n^3(t)\\}$. With $E_t\\{o_n^3(t)\\}$ the quality function is a measure of the skewness of the distribution of $o(t)$, and it is maximized when one mode is at zero and one (or several) is above $E_t\\{o_n^2(t)\\}$.\n\nIn this paper we will examine the following non-parametric combinations of the quality functions above:\n\n$$\\frac{Q_{MS}(t)}{Q_2(t)} \\qquad (7)$$\n\n$$\\frac{Q_{MS}(t)}{Q_4(t)} \\qquad (8)$$\n\n$$Q_{Var}(t)\\,Q_{MS}(t) \\qquad (9)$$\n\nWe refer to the corresponding filter systems as the Karhunen-Loeve Filter System (KL-system), the Fourth Order Filter System (FO-system) and the Maximum Variance Filter System (MV-system).\n\nSince each quality function is a combination of two different functions, it is hard to find the globally optimal solution. Instead we use the following strategy to determine a locally optimal solution.\n\nDefinition 1 The optimal orthogonal solution to each quality function is of the form:\n\n$$W_{opt} = R_{opt}\\,U_{eig} \\qquad (10)$$\n\nwhere $R_{opt}$ is the rotation of the largest eigenvectors which minimizes $Q_2(t)$ or $Q_4(t)$, or maximizes $Q_{Var}(t)$.\n\nIn (Lenz & \u00d6sterberg, 1992 and \u00d6sterberg, 1993) we have shown that the optimal orthogonal solution to the KL-system consists of the $N$ pure eigenvectors if the $N$ largest eigenvalues are all distinct (i.e. $R_{opt} = I$). 
If some eigenvalues are equal then the solution is only determined up to an arbitrary rotation of the eigenvectors with equal eigenvalues. The fourth order term $E_t\\{o_n^4(t)\\}$ in $Q_4(t)$ and $Q_{Var}(t)$ makes it difficult to derive a closed form solution. The best we can achieve is a numerical method (in the case of $Q_4(t)$ see \u00d6sterberg, 1993) for the computation of the optimal orthogonal filter functions.\n\n3 Maximization of the quality function\n\nThe partial derivatives of $Q_{MS}(t)$, $Q_2(t)$, $Q_4(t)$ and $Q_{Var}(t)$ with respect to $w_n^m(t)$ (the $m$th weight in the $n$th filter function at iteration $t$) are only functions of the input pattern $P(t)$, the output values $O(t) = (o_1(t), \\ldots, o_N(t))$ and the previous values of the weight coefficients $(w_n^1(t-1), \\ldots, w_n^M(t-1))$ within the filter function (see \u00d6sterberg, 1993). In particular, they are not functions of the internal weights $(w_i^1(t-1), \\ldots, w_i^M(t-1))$, $i \\neq n$, of the other filter functions in the system. This implies that the filter coefficients can be learned in parallel using a system of the structure shown in Figure 1.\n\nIn (\u00d6sterberg, 1993) we used on-line optimization techniques based on gradient ascent. We tried two different methods to select the step length parameter: one rather heuristic, depending on the output $o_n(t)$ of the filter function, and one inversely proportional to the second partial derivative of the quality function with respect to $w_n^m(t)$. In each iteration the length of each filter function was explicitly normalized to one. Currently, we investigate standard unconstrained optimization methods (Dennis & Schnabel, 1983) based on batch learning. 
Now the step length parameter is selected by line search in the search direction $S(t)$:\n\n$$\\max_{\\lambda} Q(W(t) + \\lambda S(t)) \\qquad (11)$$\n\nTypical choices of $S(t)$ include $S(t) = I$ and $S(t) = H^{-1}$. With the identity matrix we get Steepest Ascent and with the inverse Hessian the quasi-Newton algorithm. Using sufficient synchronism the line search can be incorporated in the parallel structure (Figure 1). To incorporate the quasi-Newton algorithm we have to assume that the Hessian matrix is block diagonal, i.e. the second partial derivatives with respect to $w_k^m(t)$ and $w_l^m(t)$, $k \\neq l$, $\\forall m$, are assumed to be zero. In general this is not the case and it is not clear whether a block diagonal approximation is valid. The second partial derivatives can be approximated by secant methods (normally the BFGS method). Furthermore, the condition of normalized filter functions can be achieved by optimizing in hyperspherical polar coordinates. Preliminary experiments (mostly with Steepest Ascent) show that more advanced optimization techniques lead to a more robust convergence of the filter functions.\n\nFigure 1: The architecture of the filter system. [Diagram: the input pattern $P(t)$ is fed to the filter units in parallel; unit $n$ produces the output $o_n(t) = w_n'(t)P(t)$.]\n\n4 Experiments\n\nIn (\u00d6sterberg, 1993) we describe a series of experiments in which we systematically investigate the following properties of the MS-system, the KL-system and the FO-system: convergence speed, dependence on the initial solution $W(0)$, distance between the learned solution and the optimal (orthogonal) solution, supervised classification of the extracted feature vectors using linear regression, and the degree of selective response of the learned filter functions. 
We use training sets with controlled scalar products between the cluster centers of three classes of input patterns embedded in a 32-D space. The results of the experiments can be summarized as follows. In contrast to the MS-system, we noticed that the KL- and FO-system had problems converging to the optimal orthogonal solutions for some initial solutions. All systems learned orthogonal solutions regardless of $W(0)$. The supervised classification power was independent of the filter system used. Only the FO-system produced filter functions which mainly react to patterns from just one class, and only if the similarity (measured by the scalar product) between the classes in the training set was smaller than approximately 0.5. Thus, the FO-system extracts feature vectors which have unsupervised discrimination power.\n\nTable 1: Typical filter response to patterns from (a)-(c) Tset1 and (d) Tset2 using the filter functions learned with (a) the KL-system, (b) the FO-system and (c)-(d) the MV-system. (e)-(f) Output covariance matrix using the filter functions learned with (e) the KL-system and (f) the MV-system. [For panels (a), (b) and (d) the column alignment of the lower entries is not recoverable from the source; they are listed in extraction order.]\n\n(a) KL-system, Tset1. Top row: -0.12, -0.46, 0.73. Remaining entries: (0.92, -0.38), (0.83, 0.32), (0.66, 0.14).\n(b) FO-system, Tset1. Top row: -0.71, -0.99, -0.22. Remaining entries: (0.59, 0.28), (-0.80, 0.50), (-0.08, 0.01).\n(c) MV-system, Tset1. Feature vectors: (0.28, -0.91, 0.44), (0.10, -0.39, 0.95), (0.98, -0.23, 0.11).\n(d) MV-system, Tset2. Top row: -0.50, -0.49, -0.81. Remaining entries: (-0.50, 0.81), (-0.04, 0.97), (-0.49, 0.50).\n(e) KL-system, output covariance matrix:\n 0.0340  0.0001  0.0005\n 0.0001  0.9300  0.0000\n 0.0005  0.0000  0.0353\n(f) MV-system, output covariance matrix:\n 0.3788  0.3463 -0.3473\n 0.3463  0.3760 -0.3467\n-0.3473 -0.3467  0.3814\n\n
Furthermore, we showed that the FO-system can distinguish between data sets having identical correlation matrices (second order statistics) but different fourth order statistics. Recent experiments with more advanced optimization techniques (Steepest Ascent) show better convergence properties for the KL- and FO-system. In particular, the distance between the learned filter functions and the optimal orthogonal ones becomes smaller.\n\nWe will describe some experiments which show that the MV-system is more suitable for tasks requiring unsupervised classification. We use two training sets, Tset1 and Tset2. In the first set the mean scalar product between class one and two is 0.7, between class one and three 0.5 and between class two and three 0.3. In the second set the mean scalar products between all classes are 0.9, i.e. the angle between all cluster centers is $\\arccos(0.9) \\approx 26$\u00b0. In Table 1(a)-(c) we show the filter response of the filter functions learned with the KL-, FO- and MV-system to typical examples of the input patterns in the training set Tset1. For the KL-system we see that the second filter function gives the largest magnitude response for both patterns from class one and patterns from class two. For the FO-system the feature vectors are more binary. Still, the first filter function has the largest magnitude response for patterns from class one and two. For the MV-system we see that each filter function has the largest magnitude response for only one class of input patterns, and thus the extracted feature vectors support unsupervised discrimination. In Table 1(d) (computed from Tset2) we see that this is the case even when the scalar products between the cluster centers are as high as 0.9. The filter functions learned by the MV-system are approximately orthogonal. 
The system thus learns the rotation of the largest eigenvectors which maximizes $Q_{Var}(t)$. Therefore it will not extract uncorrelated features (see Table 1(f)), but the variances (i.e. the diagonal elements of the covariance matrix) of the features are more or less equal. In Table 1(e) we see that the KL-system extracts uncorrelated features with largely different variances. This demonstrates that the KL-system tries to learn the pure eigenvectors.\n\nFigure 2: (a) Examples of normalized reflectance spectra of typical reddish (solid curve), greenish (dotted curve) and bluish (dashed curve) Munsell color chips. (b) The three largest eigenvectors belonging to the correlation matrix of the 1253 different reflectance spectra. (c) The filter functions learned with the MV-system. (d) The non-negative filter functions learned with the MV-system. In all figures the x-axes show the wavelength (nm).
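The contrast between Table 1(e) and Table 1(f) follows from equation (3) and can be checked numerically. The following is a minimal NumPy sketch under stated assumptions, not the authors' code: the synthetic patterns, the sizes M = 8 and N = 3, the rotation angle and all helper names are hypothetical, and it uses the output correlation matrix of equation (2) rather than the covariance matrix of the table. It illustrates that filters equal to the pure eigenvectors give a diagonal output correlation matrix, while a rotated set of the same eigenvectors gives correlated features with an unchanged determinant Q_MS.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: T unit-norm patterns in an M-dimensional space.
M, N, T = 8, 3, 2000
scales = np.array([3.0, 2.0, 1.5, 0.3, 0.3, 0.2, 0.2, 0.1])
P = rng.normal(size=(T, M)) * scales
P /= np.linalg.norm(P, axis=1, keepdims=True)   # enforce ||P(t)|| = 1

# Input correlation matrix E_t{P P'} and its N largest eigenvectors.
C = P.T @ P / T
eigvals, eigvecs = np.linalg.eigh(C)            # eigenvalues in ascending order
U = eigvecs[:, ::-1][:, :N].T                   # rows are filter functions, shape (N, M)


def output_correlation(W):
    # Output correlation matrix E_t{O O'} for the filter matrix W.
    O = P @ W.T                                 # feature vectors, shape (T, N)
    return O.T @ O / T


def rotation(theta):
    # Rotation in the plane of the first two filter functions, det = 1.
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


C_kl = output_correlation(U)                    # pure eigenvectors
C_rot = output_correlation(rotation(0.7) @ U)   # rotated eigenvectors

print(np.round(C_kl, 4))                        # diagonal up to rounding
print(np.round(C_rot, 4))                       # non-zero off-diagonal entries
print(np.linalg.det(C_kl), np.linalg.det(C_rot))
```

Here the rotation is fixed by hand; in the MV-system it would be whichever rotation maximizes Q_Var(t), which is why a second criterion is needed on top of Q_MS(t).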
\n\nRecently, we have applied the MV-system to real-world data. The training set consists of normalized reflectance spectra of the 1253 different color chips in the Munsell color atlas. Figure 2(a) shows one typical example each of a red, a green and a blue color chip, and Figure 2(b) the three largest eigenvectors belonging to the correlation matrix of the training set. We see that the first eigenvector (the solid curve) has a more or less uniform response for all different colors. On the other hand, the MV-system (Figure 2(c)) learns one bluish, one greenish and one reddish filter function. Thus, the filter functions divide the color space according to the primary colors red, green and blue. We notice that the learned filter functions are orthogonal and tend to span the same space as the eigenvectors, since $\\|W_{sol} - R_{opt}U_{eig}\\|_F = 0.0199$ (the Frobenius norm), where $R_{opt}$ maximizes $Q_{Var}(t)$. Figure 2(d) shows one preliminary attempt to include the condition of non-negative filter functions in the optimization process (Steepest Ascent). We see that the learned filter functions are non-negative and divide the color space according to the primary colors. One possible real-world application is optical color analysis, where non-negative filter functions are much easier to realize using optical components. Smoother filter functions can be obtained by incorporating additional constraints into the quality function.\n\n5 Non-linear extensions\n\nThe proposed strategy to extract feature vectors applies to nonlinear filter systems as well. In this case the input-output relation $O(t) = W(t)P(t)$ is replaced by $O(t) = f(W(t)P(t))$, where $f$ describes the desired non-linearity. 
The corresponding learning rule can be derived using gradient based techniques as long as the non-linearity $f(\\cdot)$ is differentiable. The exact form of $f(\\cdot)$ will usually be application oriented. Node nonlinearities of sigmoid type are one type of nonlinearity which has received a lot of attention (see for example Oja & Karhunen, 1993). Typical applications include robust Principal Component Analysis (PCA) (outlier protection, noise suppression and symmetry breaking), sinusoidal signal detection in colored noise and robust curve fitting.\n\nAcknowledgements\n\nThis work was done under TFR-contract TFR-93-00192. The visit of M. \u00d6sterberg at the Dept. of Info. Tech., Lappeenranta University of Technology was supported by a grant from the Nordic Research Network in Computer Vision. The Munsell color experiments were performed during this visit.\n\nReferences\n\nR. B. Darlington. (1970) Is Kurtosis really peakedness? The American Statistician 24(2):19-20.\n\nJ. E. Dennis & R. B. Schnabel. (1983) Numerical Methods for Unconstrained Optimization and Nonlinear Equations. Prentice-Hall.\n\nN. Intrator & L. N. Cooper. (1992) Objective Function Formulation of the BCM Theory of Visual Cortical Plasticity: Statistical Connections, Stability Conditions. Neural Networks 5:3-17.\n\nR. Lenz & M. \u00d6sterberg. (1992) Computing the Karhunen-Loeve expansion with a parallel, unsupervised filter system. Neural Computation 4(3):382-392.\n\nE. Oja. (1992) Principal Components, Minor Components, and Linear Neural Networks. Neural Networks 5:927-935.\n\nE. Oja & J. Karhunen. (1993) Nonlinear PCA: Algorithms and Applications. Technical Report A18, Helsinki University of Technology, Laboratory of Computer and Information Sciences, SF-02150 Espoo, Finland.\n\nM. \u00d6sterberg. 
(1993) Unsupervised Feature Extraction using Parallel Linear Filters. Link\u00f6ping Studies in Science and Technology. Thesis No. 372.", "award": [], "sourceid": 721, "authors": [{"given_name": "Mats", "family_name": "\u00d6sterberg", "institution": null}, {"given_name": "Reiner", "family_name": "Lenz", "institution": null}]}