{"title": "Statistics of Natural Images: Scaling in the Woods", "book": "Advances in Neural Information Processing Systems", "page_first": 551, "page_last": 558, "abstract": null, "full_text": "Statistics of Natural Images: Scaling in the Woods \n\nDaniel L. Ruderman* and William Bialek \n\nNEC Research Institute \n4 Independence Way \nPrinceton, N.J. 08540 \n\nAbstract \n\nIn order to best understand a visual system one should attempt to characterize the natural images it processes. We gather images from the woods and find that these scenes possess an ensemble scale invariance. Further, they are highly non-Gaussian, and this non-Gaussian character cannot be removed through local linear filtering. We find that including a simple \"gain control\" nonlinearity in the filtering process makes the filter output quite Gaussian, meaning information is maximized at fixed channel variance. Finally, we use the measured power spectrum to place an upper bound on the information conveyed about natural scenes by an array of receptors. \n\n1 Introduction \n\nNatural stimuli are playing an increasingly important role in our understanding of sensory processing. This is because a sensory system's ability to perform a task is a statistical quantity which depends on the signal and noise characteristics. Recently several approaches have explored visual processing as it relates to natural images (Atick & Redlich '90, Bialek et al '91, van Hateren '92, Laughlin '81, Srinivasan et al '82). However, a good characterization of natural scenes is sorely lacking. In this paper we analyze images from the woods in an effort to close this gap. We further attempt to understand how a biological visual system should best encode these images. \n\n\u2022 Current address: The Physiological Laboratory, Downing Street, Cambridge CB2 3EG, England. 
\n\n2 The Images \n\nOur images consist of 256 x 256 pixels I(x) which are calibrated against luminance (see Appendix). We define the image contrast logarithmically as \n\nφ(x) = ln(I(x)/I_0), \n\nwhere I_0 is a reference intensity defined for each image. We choose this constant such that Σ_x φ(x) = 0; that is, the average contrast for each image is zero. Our analysis is of the contrast data φ(x). \n\n3 Scaling \n\nRecent measurements (Field '87, Burton & Moorhead '87) suggest that ensembles of natural scenes are scale-invariant. This means that any quantity defined on a given scale has statistics which are invariant to any change in that scale. This seems sensible in light of the fact that the images are composed of objects at all distances, and so no particular angular scale should stand out. (Note that this does not imply that any particular image is fractal! Rather, the ensemble of scenes has statistics which are invariant to scale.) \n\n3.1 Distribution of Contrasts \n\nWe can test this scaling hypothesis directly by seeing how the statistics of various quantities change with scale. We define the contrast averaged over a box of size N x N (pixels) to be \n\nφ_N = (1/N^2) Σ_{i,j=1}^{N} φ(i, j). \n\nWe now ask: \"How does the probability P(φ_N) change with N?\" In the left graph of figure 1 we plot log(P(φ_N/φ_N^RMS)) for N = 1, 2, 4, 8, 16, 32 along with the parabola corresponding to a Gaussian of the same variance. By dividing out the RMS value we simply plot all the graphs on the same contrast scale. The graphs all lie atop one another, which means the contrast scales: the distribution's shape is invariant to a change in angular scale. Note that the probability is far from Gaussian, as the graphs have linear, and not parabolic, tails. Even after averaging nearly 1000 pixels (in the case of 32 x 32), it remains non-Gaussian. 
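The scaling test just described, averaging the contrast over N x N boxes and comparing normalized histograms, can be sketched in a few lines. The lognormal test image and the function names here are illustrative stand-ins, not the calibrated woods data:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_contrast(intensity):
    """Map an intensity image I(x) to zero-mean log-contrast phi(x) = ln(I/I0),
    choosing I0 so that the average contrast of the image is zero."""
    phi = np.log(intensity)
    return phi - phi.mean()

def block_average(phi, n):
    """Average contrast over non-overlapping n x n pixel boxes (phi_N)."""
    h, w = phi.shape
    trimmed = phi[: h - h % n, : w - w % n]
    return trimmed.reshape(h // n, n, w // n, n).mean(axis=(1, 3))

# Illustrative stand-in for a calibrated 256 x 256 image.
image = rng.lognormal(mean=0.0, sigma=0.5, size=(256, 256))
phi = log_contrast(image)

# For a scale-invariant ensemble, the histograms of phi_N / RMS(phi_N)
# collapse onto one curve as N grows.
for n in (1, 2, 4, 8):
    phi_n = block_average(phi, n)
    normalized = phi_n / np.sqrt((phi_n ** 2).mean())
```

For a real test one would histogram `normalized` for each N and overlay the curves on a semi-log plot, as in figure 1.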
This breakdown of the central limit theorem implies that the pixels are correlated over very long distances. This is analogous to the physics of a thermodynamic system at a critical point. \n\n3.2 Distribution of Gradients \n\nAs another example of scaling, we consider the probability distribution of image gradients. We define the magnitude of the gradient by a discrete approximation, \n\n|∇φ(i, j)| = [(φ(i+1, j) - φ(i, j))^2 + (φ(i, j+1) - φ(i, j))^2]^{1/2}. \n\nFigure 1: Left: Semi-log plot of P(φ_N/φ_N^RMS) for N = 1, 2, 4, 8, 16, 32, with the parabola of a Gaussian of the same variance for comparison. \n\nFigure 5: Left: Semi-log plot of histogram of ψ, with Gaussian for comparison (dashed). Right: Semi-log plot of histogram of gradients of ψ, with Rayleigh distribution shown for comparison (dashed). \n\nWe find that for a value N = 5 (ratio of the negative surround to the positive center), the histograms of ψ are the closest to Gaussian (see the left of figure 5). Further, the histogram of gradients of ψ is very nearly Rayleigh (see the right of figure 5). These are both signatures of a Gaussian distribution. Functionally, this \"variance normalization\" procedure is similar to contrast gain control found in the retina and LGN (Benardete et al '92). Could its role be in \"Gaussianizing\" the image statistics? 
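A minimal sketch of the variance-normalization idea: divide each pixel's contrast by a locally estimated RMS contrast, and check that the output is closer to Gaussian. The uniform 5 x 5 neighborhood, the synthetic test field, and the kurtosis check below are illustrative assumptions; the paper's center-surround weighting (surround-to-center ratio N) is not reproduced here:

```python
import numpy as np

def variance_normalize(phi, size=5):
    """Divide contrast by the local RMS contrast -- a simple 'gain control'
    nonlinearity (uniform neighborhood; illustrative, not the paper's
    exact center-surround filter)."""
    h, w = phi.shape
    pad = size // 2
    padded = np.pad(phi, pad, mode="reflect")
    local_ms = np.zeros_like(phi)
    for dy in range(size):
        for dx in range(size):
            local_ms += padded[dy:dy + h, dx:dx + w] ** 2
    local_rms = np.sqrt(local_ms / size ** 2)
    return phi / (local_rms + 1e-12)

def excess_kurtosis(x):
    """Zero for a Gaussian; positive for heavy-tailed distributions."""
    x = x - x.mean()
    return (x ** 4).mean() / ((x ** 2).mean() ** 2) - 3.0

# Heavy-tailed test field: Gaussian noise with slowly varying local variance,
# mimicking the non-Gaussian statistics of filtered natural images.
rng = np.random.default_rng(1)
envelope = np.exp(1.5 * np.sin(2 * np.pi * np.arange(256) / 64))[:, None]
field = rng.standard_normal((256, 256)) * envelope

psi = variance_normalize(field)
# Dividing out the local variance pulls the field toward Gaussian statistics.
```

The point of the check is that the raw field has strongly positive excess kurtosis (heavy tails), while the normalized output `psi` has kurtosis near zero, the signature of Gaussianization described above.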
\n\n5 Information in the Retina \n\nFrom the measured statistics we can place an upper bound on the amount of information an array of photoreceptors conveys about natural images. We make the following assumptions: \n\n\u2022 Images are Gaussian with the measured power spectrum. This places an upper bound on the entropy of natural scenes, and thus an upper bound on the information represented. \n\n\u2022 The receptors sample images in a hexagonal array with diffraction-limited optics. There is no aliasing. \n\n\u2022 Noise is additive, Gaussian, white, and independent of the image. \n\nThe output of the nth receptor is thus given by \n\ny_n = ∫ d^2x φ(x) M(x - x_n) + η_n, \n\nwhere x_n is the location of the receptor, M(x) is the point-spread function of the optics, and η_n is the noise. For diffraction-limited optics, \n\nM(k) ≈ 1 - |k|/k_c, \n\nwhere k_c is the cutoff frequency of 60 cycles/degree. \n\nIn the limit of an infinite lattice, Fourier components are independent, and the total information is the sum of the information in each component: \n\nI = (A_c/4π) ∫_0^{k_c} dk k log[1 + |M(k)|^2 S(k)/σ^2]. \n\nHere I is the information per receptor, A_c is the area of the unit cell in the lattice, and σ^2 is the variance of the noise. \nWe take S(k) = A/k^(2-η), with A and η taking their measured values, and express the noise level in terms of the signal-to-noise ratio in the receptor. In figure 6 we plot the information per receptor as a function of SNR along with the information capacity (per receptor) of the photoreceptor lattice at that SNR, which is \n\nC = (1/2) log[1 + SNR]. \n\nThe information conveyed is less than 2 bits per receptor per image, even at SNR = 1000. The redundancy of this representation is quite high, as seen by the gap between the curves; at least as much of the information capacity is being wasted as is being used. 
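The information bound and channel capacity above can be evaluated numerically. In this sketch the spectrum amplitude A, the exponent eta, the unit-cell area, and the way the noise variance is tied to the SNR are all illustrative assumptions (the paper uses the measured values and the actual lattice geometry), so only relative comparisons are meaningful:

```python
import numpy as np

def trapezoid(y, x):
    """Simple trapezoidal rule (avoids NumPy version differences)."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def info_per_receptor(snr, eta=0.19, kc=60.0, cell_area=1.0, n_k=4000):
    """Evaluate I = (A_c/4*pi) * Int_0^kc dk k log[1 + |M(k)|^2 S(k)/sigma^2]
    in bits, for S(k) = A/k^(2 - eta) and M(k) = 1 - k/kc.
    A = 1, eta = 0.19, cell_area, and the SNR normalization are assumed
    illustrative values, not the paper's measured ones."""
    k = np.linspace(1e-3, kc, n_k)
    M = 1.0 - k / kc
    S = 1.0 / k ** (2.0 - eta)
    # Choose sigma^2 so that total filtered signal power / noise power = snr.
    sigma2 = trapezoid(k * M ** 2 * S, k) / snr
    integrand = k * np.log2(1.0 + M ** 2 * S / sigma2)
    return cell_area / (4.0 * np.pi) * trapezoid(integrand, k)

def capacity_per_receptor(snr):
    """C = (1/2) log2(1 + SNR), bits per receptor."""
    return 0.5 * np.log2(1.0 + snr)
```

Because S(k) concentrates power at low spatial frequencies, the conveyed information grows much more slowly with SNR than the capacity, which is the redundancy seen as the gap between the two curves of figure 6.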
\n\nFigure 6: Information per receptor per image (in bits) as a function of log(SNR) (lower line). Information capacity per receptor (upper line). \n\n6 Conclusions \n\nWe have shown that images from the forest have scale-invariant, highly non-Gaussian statistics. This is evidenced by the scaling of the non-Gaussian histograms and the power-law form of the power spectrum. Local linear filtering produces values with quite exponential probability distributions. In order to \"Gaussianize,\" we must use a nonlinear filter which acts as a gain control. This is analogous to contrast gain control, which is seen in the mammalian retina. Finally, an array of receptors which encodes these natural images conveys at most a few bits per receptor per image of information, even at high SNR. At an image rate of 50 per second, this places an information requirement of less than about 100 bits per second on a foveal ganglion cell. \n\nAppendix \n\nSnapshots were gathered using a Sony Mavica MVC-5500 still video camera equipped with a 9.5-123.5mm zoom lens. The red, green, and blue signals were combined according to the standard CIE formula Y = 0.59 G + 0.30 R + 0.11 B to produce a grayscale value at each pixel. The quantity Y was calibrated against incident luminance to produce the image intensity I(x). The images were cropped to the central 256 x 256 region. \n\nThe dataset consists of 45 images taken at a 15mm focal length (images subtend 15° of visual angle) and 25 images at an 80mm focal length (3° of visual angle). All images were of distant objects to avoid problems of focus. Images were chosen by placing the camera at a random point along a path and rotating the field of view until no nearby objects appeared in the frame. 
The camera was tilted by less than 10° up or down in an effort to avoid sky and ground. The forested environment (woods in New Jersey in springtime) consisted mainly of trees, rocks, hillside, and a stream. \n\nAcknowledgements \n\nWe thank H. B. Barlow, B. Gianulis, A. J. Libchaber, M. Potters, R. R. de Ruyter van Steveninck, and A. Schweitzer. Work was supported in part by a fellowship from the Fannie and John Hertz Foundation (to D.L.R.). \n\nReferences \n\nJ. J. Atick and N. Redlich. Towards a theory of early visual processing. Neural Computation, 2:308, 1990. \n\nE. A. Benardete, E. Kaplan, and B. W. Knight. Contrast gain control in the primate retina: P cells are not X-like, some M-cells are. Vis. Neurosci., 8:483-486, 1992. \n\nW. Bialek, D. L. Ruderman, and A. Zee. The optimal sampling of natural images: a design principle for the visual system?, in Advances in Neural Information Processing Systems, 3, R. P. Lippmann, J. E. Moody and D. S. Touretzky, eds., 1991. \n\nG. J. Burton and I. R. Moorhead. Color and spatial structure in natural scenes. Applied Optics, 26:157-170, 1987. \n\nD. J. Field. Relations between the statistics of natural images and the response properties of cortical cells. J. Opt. Soc. Am. A, 4:2379, 1987. \n\nJ. H. van Hateren. Theoretical predictions of spatiotemporal receptive fields of fly LMCs, and experimental validation. J. Comp. Physiol. A, 171:157-170, 1992. \n\nS. B. Laughlin. A simple coding procedure enhances a neuron's information capacity. Z. Naturforsch., 36c:910-912, 1981. \n\nM. V. Srinivasan, S. B. Laughlin, and A. Dubs. Predictive coding: a fresh view of inhibition in the retina. Proc. R. Soc. Lond. B, 216:427-459, 1982. \n", "award": [], "sourceid": 835, "authors": [{"given_name": "Daniel", "family_name": "Ruderman", "institution": null}, {"given_name": "William", "family_name": "Bialek", "institution": null}]}