{"title": "The Bias-Variance Tradeoff and the Randomized GACV", "book": "Advances in Neural Information Processing Systems", "page_first": 620, "page_last": 626, "abstract": null, "full_text": "The Bias-Variance Tradeoff and the Randomized \n\nGACV \n\nGrace Wahba, Xiwu Lin and Fangyu Gao \n\nDept of Statistics \nUniv of Wisconsin \n\n1210 W Dayton Street \nMadison, WI 53706 \n\nwahba,xiwu,fgao@stat.wisc.edu \n\nDong Xiang \n\nSAS Institute, Inc. \nSAS Campus Drive \n\nCary, NC 27513 \n\nsasdxx@unx.sas.com \n\nRonald Klein, MD and Barbara Klein, MD \n\nDept of Ophthalmalogy \n610 North Walnut Street \n\nMadison, WI 53706 \n\nkleinr,kleinb@epi.ophth.wisc.edu \n\nAbstract \n\nWe  propose a new in-sample cross validation based method (randomized \nGACV) for choosing smoothing or bandwidth parameters that govern the \nbias-variance or fit-complexity tradeoff in  'soft' classification.  Soft clas(cid:173)\nsification refers to  a  learning procedure which  estimates the probability \nthat an example with a given attribute vector is  in  class  1 vs  class O.  The \ntarget  for  optimizing  the  the  tradeoff is  the  Kullback-Liebler distance \nbetween  the  estimated  probability  distribution  and  the  'true'  probabil(cid:173)\nity  distribution,  representing  knowledge of an  infinite  population.  The \nmethod uses a randomized estimate of the trace of a Hessian and mimics \ncross validation at the cost of a single relearning with perturbed outcome \ndata. \n\n1 \n\nINTRODUCTION \n\nWe propose and test a new in-sample cross-validation based method for optimizing the bias(cid:173)\nvariance tradeoff in 'soft classification' (Wahba et al1994), called ranG ACV (randomized \nGeneralized Approximate Cross Validation). Summarizing from Wahba et al(l994) we are \ngiven a  training  set consisting of n  examples,  where for  each  example  we  have a  vector \nt  E  T  of attribute values,  and an outcome y, which is  either 0 or 1.  Based on the training \ndata it is desired to estimate the probability p of the outcome 1 for any new examples in the \n\n\fThe Bias-Variance TradeofJand the Randomized GACV \n\n621 \n\nfuture.  In  'soft' classification the estimate p(t) of p(t) is of particular interest, and might be \nused  by a physician to  tell  patients how they might modify their risk p by changing (some \ncomponent of) t,  for example, cholesterol as  a risk factor for heart attack.  Penalized like(cid:173)\nlihood estimates are obtained for p by assuming that the  logit  f(t), t  E  T, which  satisfies \np(t)  =  ef(t) 1(1 + ef(t\u00bb)  is  in  some space 1{ of functions .  Technically 1{ is a reproducing \nkernel  Hilbert space, but you don't need to  know  what that is  to  read on.  Let the training \nset be {Yi, ti, i  =  1,\u00b7\u00b7\u00b7, n}. Letting Ii =  f(td, the negative log likelihood .c{Yi, ti, fd of \nthe observations, given f  is \n\nn \n\n.c{Yi, ti, fd  =  2::[-Ydi + b(li)], \n\ni=1 \n\n(1) \n\nwhere  b(f)  =  log(l  + ef ).  The  penalized  likelihood  estimate of the  function  f  is  the \nsolution to:  Find f  E  1{ to minimize h. (I): \n\nn \n\nh.(f) =  2::[-Ydi + b(ld) + J>.(I), \n\ni =1 \n\n(2) \n\nwhere 1>.(1)  is  a quadratic penalty functional depending on parameter(s) A =  (AI, ... , Aq) \nwhich govern the so called bias-variance tradeoff.  Equivalently the components of A con(cid:173)\ntrol the tradeoff between the complexity of f and the fit to the training data.  In this paper we \nsketch the derivation of the ranG ACV method for choosing A,  and present some prelim(cid:173)\ninary but favorable simulation results, demonstrating its efficacy.  This method is designed \nfor use with  penalized likelihood estimates, but it  is clear that it can be used with  a variety \nof other methods which contain bias-variance parameters to be chosen, and for which mini(cid:173)\nmizing the Kullback-Liebler (K L) distance is the target.  In the work of which this is a part, \nwe are concerned with  A having multiple components.  Thus, it  will  be highly convenient \nto  have an  in-sample method for  selecting A,  if one that  is  accurate and  computationally \nconvenient can be found. \n\nLet P>.  be  the  the estimate and  p  be the  'true'  but  unknown  probability  function  and  let \nPi  =  p(td,p>.i  =  p>.(ti).  For  in-sample tuning,  our criteria  for  a  good  choice of A is \nthe KL distance KL(p,p>.)  = ~ E~I[PilogP7. + (1- pdlogg~::?)]. We  may replace \nK L(p,p>.)  by the comparative K L distance (C K L), which differs from K L by a quantity \nwhich does not depend on A.  Letting hi = h (ti),  the C K L  is given by \nCKL(p,p>.)  ==  CKL(A)  =  ;;, 2:: [-pd>'i  + b(l>.i)). \n\n1  n \n\n(3) \n\ni=) \n\nC K L(A)  depends on the unknown p, and it  is  desired is  to  have a good estimate or proxy \nfor it, which can then be minimized with respect to A. \nIt is  known (Wong 1992) that no exact unbiased estimate of CK L(A) exists in this case, so \nthat only approximate methods are possible.  A number of authors have tackled this prob(cid:173)\nlem, including Utans and M90dy(1993), Liu(l993), Gu(1992). The iterative U BR method \nof Gu(l992) is  included in  GRKPACK  (Wang  1997),  which  implements general smooth(cid:173)\ning spline ANOVA penalized likelihood estimates with multiple smoothing parameters.  It \nhas  been  successfully  used  in  a  number of practical  problems,  see,  for  example,  Wahba \net al  (1994,1995).  The  present  work  represents  an  approach  in  the  spirit  of GRKPACK \nbut which employs several approximations, and may  be used  with  any data set,  no matter \nhow large, provided that an algorithm for solving the penalized likelihood equations, either \nexactly or approximately, can be implemented. \n\n\f622 \n\nG.  Wahba et al. \n\n2  THE GACV ESTIMATE \n\nIn the general penalized likelihood problem the minimizer 1>,(-)  of (2) has a representation \n\nM \n\n1>.(t)  = L  dv<Pv(t)  + L CiQ>.(ti, t) \n\nn \n\n(4) \n\nv=l \n\ni=l \n\nwhere the <Pv  span the  null space of 1>\"  Q>.(8, t)  is  a  reproducing kernel  (positive definite \nfunction)  for the penalized part of 7-1.,  and C  =  (Cl' ... ,Cn)'  satisfies M  linear conditions, \nso  that  there  are  (at most)  n  free  parameters  in  1>..  Typically  the  unpenalized  functions \n<Pv  are  low  degree polynomials.  Examples of Q(ti,')  include radial  basis  functions  and \nvarious kinds of splines; minor modifications include sigmoidal basis functions, tree basis \nfunctions and so on. See, for example Wahba( 1990, 1995), Girosi, Jones and Poggio( 1995). \nIf f>.C)  is  of the  form  (4)  then  1>,(f>.)  is  a  quadratic form  in  c.  Substituting  (4)  into  (2) \nresults  in  h  a  convex  functional  in  C  and d,  and  C  and  d  are obtained  numerically  via  a \nNewton Raphson iteration,  subject to  the conditions on c.  For large n, the second sum on \nthe right of (4)  may be replaced by  L~=1 Cik Q>. (tik , t), where the tik  are chosen via one \nof several principled methods. \n\nTo obtain the CACV we begin with the ordinary leaving-out-one cross validation function \nCV(.\\) for the CKL: \n\nCV  .\\)  -\n\n( \n\n_  1  \"\" \n\n- LJ-yd>.i  + b 1>.i)  , \n] \nn \n\n[-i] \n\n( \n\nn \n\ni=1 \n\n(5) \n\nwhere  fl- i ]  the  solution  to  the  variational  problem of (2)  with  the  ith data  point left out \nand  fti]  is  the  value  of fl- i] at  ti .  Although  f>.C)  is  computed by  solving  for  C  and d \nthe CACV  is  derived  in  terms of the  values  (it\"\", fn)'  of f  at  the ti.  Where  there  is \nno confusion between functions  f(-)  and vectors (it, ... ,fn)' of values of fat tl, ... ,tn, \nwe let  f  = (it, ... \" fn)'.  For any  f(-)  of the form  (4), J>. (f) also has a representation as \na  non-negative definite quadratic form  in  (it, ... , fn)'.  Letting L:>.  be twice the matrix of \nthis quadratic form  we can rewrite (2) as \n\nn \n\nh(f,Y) = L[-Ydi + b(/i)] + 2f'L:>.f. \n\n1 \n\n(6) \n\ni=1 \n\nLet W  =  W(f)  be the n  x  n  diagonal matrix  with (/ii  ==  Pi(l  - Pi)  in  the iith position. \nUsing the  fact  that (/ii  is  the second derivative of b(fi),  we  have that H  =  [W + L:>.] - 1 \nis  the inverse Hessian of the  variational problem (6).  In Xiang and  Wahba (1996), several \nTaylor  series  approximations,  along  with  a  generalization of the  leaving-out-one  lemma \n(see  Wahba  1990)  are  applied  to  (5)  to  obtain  an  approximate  cross  validation  function \nACV(.\\), which  is  a second order approximation to  CV(.\\) . Letting hii be the iith entry \nof H , the result is \n\nCV(.\\) ~ ACV('\\) =  .!. t[-Yd>.i + b(f>.i)] + .!.  t \nn  i=1 \n\nn  i= l \n\nhiiYi(Yi  - P>.i)  . \n\n[1  - hiwii] \n\n(7) \n\nThen the GACV  is  obtained from  the  ACV by  replacing hii  by  ~ L~1 hii  ==  ~tr(H) \nand replacing 1 - hiWii  by  ~tr[I - (Wl/2 HWl/2)], giving \n\nCACV('\\) = ;; t;;[-Yd>.i + b(1).i)  + -n-tr[I _  (Wl/2HWl /2)]  , \n\n] \n\ntr(H)  L~l Yi(Yi  - P>.i) \n\n1 ~ \n\n(8) \n\nwhere W  is  evaluated at 1>..  Numerical results based on an exact calculation of (8) appear \nin Xiang and Wahba (1996). The exact calculation is  limited to small n  however. \n\n\fThe Bias-Variance TradeofJand the Randomized GACV \n\n623 \n\n3  THE RANDOMIZED GACV ESTIMATE \n\nGiven any 'black box' which, given >.,  and a training set {Yi, ti} produces f>. (.)  as the min(cid:173)\nimizer of (2),  and thence f>.  = (fA 1 ,  \" ' ,  f>.n)',  we can produce randomized estimates of \ntrH and tr[! - W 1/ 2 HW 1/ 2 J without having any explicit calculations of these matrices. \nThis is  done by  running the  'black box' on perturbed data  {Vi  + <5i , td.  For the Yi  Gaus(cid:173)\nsian, randomized trace estimates of the Hessian of the variational problem (the  'influence \nmatrix') have been studied extensively and shown to  be essentially as good as exact calcu(cid:173)\nlations for  large n, see for example Girard(1998) .  Randomized trace estimates are  based \non the fact that if A  is any square matrix and <5  is  a zero mean random n-vector with  inde(cid:173)\npendent components with  variance (TJ,  then  E<5' A<5  = ~ tr A.  See Gong et al( 1998) and \nreferences cited there for experimental results with multiple regularization parameters. Re-\nturning to the 0-1  data case, it is easy to see that the minimizer fA(')  of 1;..  is continuous in \nY,  not withstanding the fact that in our training set the Yi  take on only values 0 or 1. Letting \nif =  UA1,\"', f>.n)'  be  the  minimizer of (6)  given  y  =  (Y1,\"', Yn)',  and  if+O  be the \nminimizer given data y+<5  =  (Y1 +<51, ... ,Yn +<5n)' (the ti remain fixed), Xiang and Wahba \n(1997) show, again using Taylor series expansions, that if+O  - ff ,.....,  [WUf) +  ~AJ-1<5. \nThis suggests that ~<5'Uf+O - ff) provides an estimate oftr[W(ff) + ~At1. However, \nif we take the solution ff to the nonlinear system for the original data Y as the initial value \nfor  a Newton-Raphson calculation of ff+O  things  become even simpler.  Applying a one \nstep Newton-Raphson iteration gives \n\nu\" \n\nu\" \n\n(9) \n\nSince  Pjf(ff,y  +  <5)  = \n[  82 h(fY  )J- 1 \n8?7if  A' Y \n[WUf) + EAt 1<5.  The result is  the following ranGACV function: \n\n-<5  +  PjfUf,Y)  =  -<5,  and  [:;~f(ff,Y +  <5)J - 1 \nf y+o,l  -\nfY \n- A \n-\n\nfY \nA +  8?7if  A' Y \n\n[ 8 2 h(fY  )J- 1 J: \n\n,we  ave  A \n\nu  so  t  at  A \n\nf y+o,l \n\nh \n\nh \n\nranGACV(>.)  =  .!. ~[- 'I '+bU  .)J+ \n\nn \n\nn  ~ Yz  At \n\nAt \n\n<5'  (f Y+O,l \nn \n\nA \n\nfY) \n- A \n\n) \n\",n \nwi=l Yi  Yi  - PAi \n\n( \n\n. \n\n[<5'<5  - <5'WUf)Uf+O,l  - ff)J \n(10) \nin  (10),  we  may  draw  R \n'+'  in \nto  obtain  an  R-replicated \n\nthe  variance  in \n\nTo  reduce \nindependent  replicate  vectors  <51,'\"  ,<5R ,  and  replace  the  term  after  the \n(1O)b \n\n1...  \",R  o:(fr Hr .1 -ff) \n\n2:7-1 y.(y.-P>..) \n\nterm  after  the \n\nthe \n\n'+' \n\n[O~Or-O~ W(fn(f~+Ar . l-ff)1 \n\ny  R  wr=l \n\nn \nranGACV(>.)  function. \n\n4  NUMERICAL RESULTS \n\nIn  this  section  we  present simulation  results  which  are  representative of more extensive \nsimulations to  appear elsewhere.  In each case,  K  < < n  was chosen by  a sequential clus(cid:173)\ntering algorithm. In that case, the ti were grouped into K  clusters and one member of each \ncluster selected at random. The model is fit.  Then the number of clusters is doubled and the \nmodel is fit again. This procedure continues until the fit does not change. In the randomized \ntrace estimates the random variates were Gaussian.  Penalty functionals were (multivariate \ngeneralizations of) the cubic spline penalty functional>. fa1 U\" (X))2, and smoothing spline \nANOVA models were fit. \n\n\f624 \n\nG.  Wahba  et at. \n\n4.1  EXPERIMENT 1.  SINGLE SMOOTHING PARAMETER \n\nIn  this  experiment t  E  [0,1], f(t)  =  2sin(10t), ti  =  (i  -\n.5)/500, i  =  1,\u00b7\u00b7\u00b7,500.  A \nrandom number generator produced  'observations' Yi  =  1 with  probability Pi  =  el , /(1 + \neli), to get the training set.  Q A is given in Wahba( 1990) for this cubic spline case, K  = 50. \nSince  the  true  P  is  known,  the  true  CKL  can  be  computed.  Fig. \nl(a)  gives  a  plot  of \nCK L(A)  and  10 replicates of ranGACV(A).  In each  replicate  R  was  taken as  1,  and  J \nwas generated anew as  a Gaussian random vector with  (115  =  .001.  Extensive simulations \nwith  different  (115  showed  that  the  results  were  insensitive  to  (115  from  1.0  to  10-6 \u2022  The \nminimizer of C K L  is  at  the filled-in  circle and  the  10  minimizers of the  10 replicates of \nranGACV are the open circles.  Anyone of these 10 provides a  rather good  estimate of \nthe A that goes  with  the  filled-in  circle.  Fig.  l(b) gives  the  same experiment, except that \nthis time R  =  5.  It can be seen that the minimizers ranGACV become even more reliable \nestimates of the minimizer of C K L, and the C K L  at all  of the ranG ACV estimates are \nactually quite close to its minimum value. \n\n4.2  EXPERIMENT 2.  ADDITIVE MODEL WITH A =  (Al' A2) \n\nHere  t  E  [0,1]  0  [0,1].  n  =  500  values  of ti  were  generated  randomly  according  to \na  uniform  distribution  on  the  unit  square  and  the  Yi  were  generated  according  to  Pi  = \neli j(l + el ,)  with  t  =  (Xl,X2)  and  f(t)  =  5 sin 27rXl  - 3sin27rX2.  An  additive model \nas  a  special case of the smoothing spline ANOVA  model  (see Wahba  et al,  1995), of the \nform  f(t)  =  /-l  + h(xd + h(X2)  with  cubic  spline  penalties  on  hand h  were  used. \n.001, R  =  5.  Figure  l(c)  gives  a  plot of CK L(Al' A2)  and Figure  l(d) \nK  =  50, (115  = \ngives a  plot of ranGACV(Al, A2).  The open circles  mark the  minimizer of ranGACV \nin  both  plots  and  the  filled  in  circle  marks  the  minimizer of C K L.  The  inefficiency,  as \nmeasured  by  CKL()..)/minACKL(A)  is  1.01.  Inefficiencies  near  1  are  typical  of our \nother similar simulations. \n\n4.3  EXPERIMENT 3.  COMPARISON OF ranGACV AND UBR \n\nThis  experiment  used  a  model  similar  to  the  model  fit  by  GRKPACK  for  the  risk  of \nprogression  of diabetic  retinopathy  given  t  = \n(Xl, X2,  X3)  =  (duration,  glycosylated \nhemoglobin,  body  mass  index)  in  Wahba  et al(l995)  as  'truth'.  A  training  set  of 669 \nexamples was generated according to that model,  which  had the structure  f(Xl, X2,  X3)  = \n/-l + fl (xd + h  (X2)  + h  (X3)  + fl,3 (Xl, X3).  This (synthetic) training set was fit  by GRK(cid:173)\nPACK  and  also  using  K  =  50  basis  functions  with  ranG ACV.  Here  there  are P  =  6 \nsmoothing parameters (there are 3 smoothing parameters in f13)  and the ranGACV func(cid:173)\ntion  was searched by a downhill simplex method to  find  its minimizer.  Since the  'truth'  is \nknown,  the  CKL for)\"  and for  the  GRKPACK  fit  using  the  iterative UBR method  were \ncomputed. This was repeated  100 times, and the  100 pairs of C K L  values appears in Fig(cid:173)\nure  l(e).  It can be seen that the U BR and ranGACV give similar C K L  values about 90% \nof the time, while the ranG ACV has lower C K L  for most of the remaining cases. \n\n4.4  DATA ANALYSIS: AN APPLICATION \n\nFigure 1(f) represents part of the results of a study of association at baseline of pigmentary \nabnormalities with various risk factors in 2585 women between the ages of 43 and 86 in the \nBeaver Dam Eye Study, R.  Klein et al( 1995). The attributes are:  Xl  =  age, X2  =body mass \nindex, X3  =  systolic blood pressure, X4  =  cholesterol.  X5  and X6  are indicator variables for \ntaking hormones, and history of drinking. The smoothing spline ANOVA model fitted  was \nf(t)  =  /-l+dlXl +d2X2 + h(X3)+ f4(X4)+ h4(X3, x4)+d5I(x5) +d6I(x6), where I is the \nindicator function.  Figure  l(e) represents a cross section of the fit  for  X5  =  no, X6  =  no, \n\n\fThe Bias- Variance Tradeoff and the Randomized GACV \n\n625 \n\nX2, X3  fixed  at  their medians and Xl  fixed  at  the  75th  percentile.  The dotted  lines are the \nBayesian confidence intervals, see Wahba et al( 1995). There is a suggestion of a borderline \ninverse association of cholesterol. The reason for this association is uncertain. More details \nwill appear elsewhere. \n\nPrincipled soft classification procedures can now  be implemented in much larger data sets \nthan previously possible, and the ranG ACV should be applicable in general learning. \n\nReferences \n\nGirard, D. (1998), 'Asymptotic comparison of (partial) cross-validation, GCV and random(cid:173)\nized GCV in nonparametric regression', Ann.  Statist.  126, 315-334. \nGirosi,  F.,  Jones,  M.  &  Poggio,  T.  (1995),  'Regularization  theory  and  neural  networks \narchitectures', Neural Computatioll 7,219-269. \nGong,  J.,  Wahba,  G.,  Johnson,  D.  &  Tribbia,  J.  (1998),  'Adaptive  tuning  of numerical \nweather prediction models:  simultaneous estimation of weighting, smoothing and physical \nparameters', MOllthly  Weather Review 125, 210-231. \nGu,  C.  (1992),  'Penalized  likelihood  regression:  a  Bayesian  analysis',  Statistica  Sinica \n2,255-264. \nKlein, R., Klein, B.  & Moss, S.  (1995),  'Age-related eye disease and survival. the Beaver \nDam Eye Study', Arch Ophthalmol113, 1995. \nLiu,  Y.  (1993),  Unbiased  estimate  of generalization error and  model  selection  in  neural \nnetwork, manuscript, Department of Physics, Institute of Brain and Neural Systems, Brown \nUniversity. \nUtans,  J.  &  Moody,  J.  (1993),  Selecting  neural  network  architectures  via  the  prediction \nrisk:  application to corporate bond rating prediction, in  'Proc. First Int'I Conf. on Artificial \nIntelligence Applications on Wall Street', IEEE Computer Society Press. \nWahba,  G.  (1990), Spline Models for Observational Data, SIAM.  CBMS-NSF Regional \nConference Series in Applied Mathematics, v.  59. \nWahba,  G.  (1995),  Generalization  and  regularization  in  nonlinear  learning  systems,  in \nM.  Arbib,  ed.,  'Handbook of Brain Theory and  Neural  Networks',  MIT Press,  pp.  426-\n430. \nWahba, G.,  Wang,  Y.,  Gu, c., Klein, R.  & Klein,  B.  (1994), Structured machine learning \nfor  'soft'  classification  with  smoothing  spline  ANOVA  and  stacked  tuning,  testing  and \nevaluation, in J.  Cowan, G.  Tesauro &  J. Alspector, eds,  'Advances in  Neural Information \nProcessing Systems 6', Morgan Kauffman, pp. 415-422. \nWahba, G., Wang, Y., Gu, C., Klein, R.  &  Klein, B. (1995), 'Smoothing spline AN OVA for \nexponential families,  with application to the Wisconsin Epidemiological Study of Diabetic \nRetinopathy' , Ann. Statist. 23,  1865-1895. \nWang, Y.  (1997), 'GRKPACK: Fitting smoothing spline analysis of variance models to data \nfrom exponential families',  Commun. Statist.  Sim.  Compo  26,765-782. \nWong,  W.  (1992), Estimation of the loss  of an estimate,  Technical  Report 356, Dept.  of \nStatistics, University of Chicago, Chicago, II. \nXiang, D.  &  Wahba, G.  (1996), 'A generalized approximate cross validation for smoothing \nsplines  with  non-Gaussian data', Statistica  Sinica  6, 675-692, preprint TR 930 available \nvia www. stat. wise. edu/-wahba - > TRLIST. \nXiang, D. &  Wahba, G.  (1997), Approximate smoothing spline methods for large data sets \nin the binary case, Technical Report 982, Department of Statistics, University of Wisconsin, \nMadison  WI.  To  appear in  the Proceedings of the  1997  ASA Joint Statistical  Meetings, \nBiometrics Section, pp 94-98 (1998). Also in TRLIST as above. \n\n\fG,  Wahba et aI, \n\nCKL \n\nCKL \nranGACV \n\n626 \n\n10 \n(0 \nc:i \n\n0 \n(0 \nc:i \n\n10 \n10 \nc:i \n\n0 \n~ \n0 \n\n.' . \n\n10 \n(0 \nc:i \n\no \n(0 \nc:i \n\n10 \n10 \nc:i \n\no \n10 \nc:i \n\n-8 \n\n-7 \n\n-6 \n-5 \nlog lambda \n\n(a) \n\n-4 \n\n-3 \n\n-8 \n\n-7 \n\n-6 \n-5 \nlog lambda \n\n(b) \n\n-4 \n\n-3 \n\n9.29 \n\nr,,6 \n:0' \n~,  O.  7  O.  9 \n\n-7 \n\nlog lambda1 \n\n-6 \n\n(c) \n\n-5 \n\no \n\n12! \no \n\nC\\I \n(0 \nc:i \n\nco \n10 \nc:i \n\n\"f \n\n... 0,28 \n\n.. \u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7\u00b7r ..... ranGACV \n\n\\~7 \n0\\ \n\n.' \n\n\"  ...... \n\n0:\\13 \n\n': \n\n:.\u00b7 .. O-:!4!7 \n: \n\n0 \n\n.  . \n.  . \n\n. \n. \n\n.:  0'F5  0'F8  0.[32 \n\nlog lambda1 \n\n-6 \n\n(d) \n\n-5 \n\n-4 \n\nO.  4 \n\n-4 \n\n.25  0'.2,4 \n\n-7 \n\n(0 \nc:i \n\n.~ =...,. \n.0  . \nca O \n.0 e a.. \n\nC\\I \nc:i \n\n(0 \n10 \nc:i  ~--------.-------.--------r--~ \n\n0,56 \n\n0,58 \n\n0,60 \nranGACV \n\n(e) \n\n0,62 \n\no \nc:i  ~ __ ~ ____ , -____ . -__ -. ____ . -__ ~ \n\n100 \n\n150 \n\n200 \n300 \nCholesterol (mg/dL) \n\n250 \n\n(f) \n\n350 \n\n400 \n\nFigure  1:  (a)  and (b):  Single smoothing parameter comparison of ranGACV and CK L. \n(c) and (d):  Two smoothing parameter comparison of ranGACV and CK L. (e):  Compar(cid:173)\nison of ranG ACV and U B R.  (f):  Probability estimate from Beaver Dam Study \n\n\f", "award": [], "sourceid": 1483, "authors": [{"given_name": "Grace", "family_name": "Wahba", "institution": null}, {"given_name": "Xiwu", "family_name": "Lin", "institution": null}, {"given_name": "Fangyu", "family_name": "Gao", "institution": null}, {"given_name": "Dong", "family_name": "Xiang", "institution": null}, {"given_name": "Ronald", "family_name": "Klein", "institution": null}, {"given_name": "Barbara", "family_name": "Klein", "institution": null}]}