{"title": "Kohonen Networks and Clustering: Comparative Performance in Color Clustering", "book": "Advances in Neural Information Processing Systems", "page_first": 984, "page_last": 990, "abstract": null, "full_text": "Kohonen Networks and Clustering: Comparative \n\nPerformance in Color Clustering \n\nWesley Snyder \n\nDepartment of Radiology \nBowman Gray School of \n\nMedicine \n\nWake Forest University \n\nWinston-Salem, NC 27103 \n\nDaniel Nissman, David Van den Bout, \n\nand Grift BUbro \n\nCenter for Communications and Signal Processing \n\nNorth Carolina State University \n\nRaleigh, NC 27695 \n\nAbstract \n\nThe problem of color clustering is defined and shown to be a problem of \nassigning a large number (hundreds of thousands) of 3-vectors to a \nsmall number (256) of clusters. Finding those clusters in such a way that \nthey best represent a full color image using only 256 distinct colors is a \nburdensome computational problem. In this paper, the problem is solved \nusing \"classical\" techniques -- k-means clustering, vector quantization \n(which turns out to be the same thing in this application), competitive \nlearning, and Kohonen self-organizing feature maps. Quality of the \nresult is judged subjectively by how much the pseudo-color result \nresembles the true color image, by RMS quantization error, and by run \ntime. The Kohonen map provides the best solution. \n\n1 INTRODUCTION \n\n\"Clusteringn , \"vector quantization\", and \"unsupervised learning\" are all words which \ndescn'be the same process: assigning a few exemplars to represent a large set of samples. \nPerfonning that process is the subject of a substantial body of literature. In this paper, we \nare concerned with the comparison of various clustering techniques to a particular, practi(cid:173)\ncal application: color clustering. 
\n\nThe color clustering problem is as follows: an image is recorded in full color -- that is, \nthree components, RED, GREEN, and BLUE, each of which has been measured to 8 bits \nof precision. Thus, each pixel is a 24 bit quantity. We must find a representation in which \n2563 possible colors are represented by only 8 bits per pixel. That is, for a problem with \n256000 variables (512 x 512) variables, assign each variable to one of only 256 classes. \n\nThe color clustering problem is currently of major economic interest since millions of dis(cid:173)\nplay systems are sold each year which can only store 8 bits per pixel, but on which users \nwould like to be able to display \"true\" color (or at least as near true color as possible). \n\nIn this study, we have approached the problem using the standard techniques from the lit(cid:173)\nerature (including k-means -- ISODATA clustering[1,3,61, LBG[4]), competitive learning \n(referred to as CL herein) [2] , and Kohonen feature maps [5 ,7 ,9]. The Kohonen feature map \n\n984 \n\n\fKohonen Networks and Clustering \n\n985 \n\n(referred to as KFM herein) was found to win \"hands down\", providing both the best quality image \n(subjectively) and objectively (based on quantization error), as well as the fastest nm times. \n\n2 BACKGROuND-METHODS TESTED \n\nIn almost all clustering algorithms, we begin with some (usually ad-hoc) determination of initial \ncluster centers. The number of such centers generally remains the same, although some algorithms \n(e.g. ISODATA[lO]) allow the number to evolve through the nmning of the algorithm. In this work. \nwe know that we want to find 256 distinct clusters. The basic idea behind most of these methods is \nto update the cluster closest to the current data point by moving it some small increment towards \nthat data point Mter the data has been presented to the algorithm sufficiently often, the clusters \nshould converge to the real cluster means. 
Typically, one has to cycle through the training set several times (sometimes a large number of times) to get an acceptable solution. Each run through the training set is termed an epoch. \n\n2.1 K-MEANS \n\nThe well-known [6] k-means algorithm for clustering is as follows (see [10] for a tutorial explanation). \n1. Begin with an arbitrary assignment of samples to clusters, or begin with an arbitrary set of cluster centers and assign samples to the nearest centers. \n2. Compute the sample mean of each cluster. \n3. Reassign each sample to the cluster with the nearest mean. \n4. If the classification of all samples has not changed, stop; else go to step 2. \n\n2.2 LBG VECTOR QUANTIZATION \n\nIn this method, 256 colors are picked randomly from the scene. These are referred to as the \"codebook\". Each pixel in the image is then assigned to the \"nearest\" entry in the codebook. After assignment of all pixels, the mean of each bin* is calculated. If the difference between the codebook entry and the mean of the corresponding bin is below threshold for all entries, the \"optimal\" codebook has been located. In [4], the algorithm is shown to work for a large variety of distance functions; however, for applications (such as this one) where the Euclidean metric is most appropriate, the algorithm becomes identical to k-means. In [8], results similar to those we found are reported for the color clustering problem. \n\n2.3 KOHONEN MAPS AND COMPETITIVE LEARNING \n\nIn competitive learning algorithms, data examples are presented sequentially to the system. The cluster center most similar to the data example is determined, and that center is moved slightly toward the example. \n\n* That is, all the pixels assigned to that entry in the codebook. 
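The four k-means steps of Section 2.1 (to which LBG reduces when the Euclidean metric is used) can be sketched as follows; this is a minimal batch implementation for illustration, not the code used in the experiments:

```python
import numpy as np

def k_means(data, k, seed=0):
    """Steps 1-4: arbitrary centers, assign to nearest, recompute means, repeat."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].astype(float)  # step 1
    labels = np.full(len(data), -1)
    while True:
        dist2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist2.argmin(axis=1)            # step 3: nearest mean
        if np.array_equal(new_labels, labels):       # step 4: stop when unchanged
            return centers, labels
        labels = new_labels
        for i in range(k):                           # step 2: sample mean per cluster
            members = data[labels == i]
            if len(members):
                centers[i] = members.mean(axis=0)
```

The unchanged-assignment test of step 4 is the same convergence criterion the paper later relaxes to "no center moved more than 0.5 units".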
\n\n\f986 \n\nSnyder, Nissman, Vcm den Bout, and Bilbro \n\nThe update rule for competitive learning can be described as follows: \n\n(EQ 1) \n\nwhere Wi is the weight vector (or mean) corresponding to cluster i and h is the learning parameter \n(typically on the order of 0.01). \n\nIn the case of Kohonen maps, however, the algorithm is slightly more complicated. All clusters are \nconnected to each other according to a topological map. When the closest cluster to a data point \n(the primary cluster) is updated, so are its immediate neighbors (the proximity clusters) in tenns of \nthe topological map. In feature space, it is possible, initially, for the neighbors of the primary clus(cid:173)\nter to not be its topological neighbors. By the nature of the update rule, the neighbors of the primary \ncluster in topological space will become its neighbors in feature space after some period of time. \nThis is very desirable for applications in which a minimum distance between related clusters is \ndesired (the Tmveliog Salesman Problem, for example). \n\nOften, it is the case that a single cluster is chosen much of the time, if not all of the time, because of \nthe order in which data is presented and the manner in which the clusters are initialized. In order to \nmake clustering work in a practical context, one needs to include a tenn in the distance calculation \nwhich reduces the probability of updating an often-used cluster. Such a term is called the con(cid:173)\nscience[2]. Its effect is to increase the effective distance of a cluster from a data point An alterna(cid:173)\ntive approach to the use of a conscience is to increment a counter for each cluster which has been \npassed over for updating and then subtract some multiple of this counter from the calculated dis(cid:173)\ntance. We call this the loneliness term, and used it because the implementation turned out to be \nmore convenient, and the perfonnance similar to that of conscience. 
\n\nFor KFM, the primary cluster is updated as indicated in Eqn. 1. The proximity clusters are updated \nin a similar fashion \n\n(EQ2) \n\nwhere Wj is the weight vector corresponding to the proximity cluster j, dij is the topological distance \nbetween clusters i andj, and F ('1\\. dij) is some decreasing function of the distance between i andj \nwith a maximum at '1\\. \n\n3 Application to Color Clustering \n\nMaking no assumptions concerning the input image, we chose an appropriate topology for the \nKFM algorithm which would easily lend itself to describing a uniform distribution of colors in \nRGB space. Such a distribution is a rectangular solid in the 3-D color space. We chose the dimen(cid:173)\nsions of this block to be 6x7x6 -- corresponding to 252 clusters mther than the 256 allowable -(cid:173)\nunder the assumption that the omission of those four clusters would not make a perceptible differ(cid:173)\nence. The clusters were initialized as a small block positioned at the center of RGB space with the \nlong axis in the green direction. This orientation was chosen because human eyes are most sensitive \nto green wavelengths and, hence, more resolution may be required along this axis. The exact initial \norientation does not matter in the final solution, but was chosen to aid in speed of convergence. \n\n\fKohonen Networks and Clustering \n\n987 \n\nIn an attempt to significantly speed up training, each data point was assigned to one of the eight \nsubcubes of RGB space. and then only a specified subset of clusters was searched for an appropri(cid:173)\nate candidate for updating. The clusters were subdivided, roughly, into eight subcubes as well. The \neffect of this is to decrease training time by approximately a factor of eight. Also, in the interest of \nprocessing time, only the six most immediate topological neighbors (those with a topological dis(cid:173)\ntance of one from the primary cluster) were updated. 
This same heuristic was applied in both the CL and KFM experiments. \n\n4 RESULTS \n\nWe applied all the techniques discussed, in various implementations, to actual color images, including, in particular, pictures of faces. Although also tested on larger images, all times given in this report are against a baseline case of a 128x128 image: three bands of input (red, green, blue -- 8 bits each), and one band (8 bits) of output, plus a lookup table indicating what 24 bit color each 8 bit pattern represented. Given sufficient training, all the techniques produced pseudo-color images which were extremely lifelike. Comparing the images closely on a CRT, a trained observer will note variations in the color rendering, particularly in sparse colors (e.g. blue eyes in a facial scene), and will also observe color contouring. However, these details are subtle, and are not easily reproducible in a conference proceedings. Map files and corresponding images were generated for 5, 10, and 15 epochs using h = 0.05 and proximity h = 0.00625. Direct comparisons were made between Kohonen feature maps, competitive learning, and the results from k-means (and the LBG formulation of k-means). For the training runs using competitive learning, all clusters were initialized to random values within the unit sphere located at the center of RGB space. The conscience concept was used here. \n\nIn this section all timing comparisons are done on a Microvax 2000, although we have also run many of the same programs on a Decstation. The Decstation typically runs 10-15 times as fast as the Microvax. In order to compare techniques fairly, all timing is reported for the same image. \n\n4.1 K-MEANS AND LBG EXPERIMENTS \n\nThe performance of the k-means and LBG algorithms was strongly dependent on how long they were allowed to run. 
After approximately 90 minutes of execution of k-means, the results were as good (subjectively) as those from Kohonen maps. In different experiments, k-means was started from the following initial configurations: \n\n1. 256 points randomly (uniformly) distributed over RGB space \n2. The 256 points on the main diagonal of color space (red = green = blue) \n3. A uniform (3-D) grid spread over RGB space \n4. Points uniformly distributed over the surface of the color cube \n5. Points randomly distributed near the origin \n\nChoice 2 gave the best overall performance, where \"best\" is determined by the time required to converge to a point where the resulting image looked \"equally good\" subjectively. K-means required 87 minutes to reach this standard quality, although it took 9 hours to completely converge (until no cluster center moved more than 0.5 units in one iteration). \n\n4.2 EXPERIMENTS ON KOHONEN AND COMPETITIVE LEARNING \n\nKFM gave an excellent rendering of color images. In particular, blue eyes were rendered extremely well in images of faces. Depending on the value of the conscience parameter, the competitive learning algorithm tended to render blue eyes as brown, since the dominant skin tones in facial images are shades of brown. \n\nSpeed comparisons. All of these runs were done on Microvaxen. \n\nAlgorithm    Total time    Time/epoch \nKohonen      15:42         1:34 \nCompLearn     8:38          :52 \n\nConverting the image: 1:34 for Kohonen, 4:16 for Competitive Learning. \n\nThe subjective judgments of picture quality were made using the 10 epoch case of Kohonen maps as a reference. To quantitatively compare the performance of Kohonen maps and competitive learning, we computed the RMS color error: \n\nE = sqrt( (1/N) sum_i || v_i - c_i ||^2 )   (EQ 3) \n\nwhere v_i is the actual color 3-vector at pixel i, c_i is the color represented by the mean of the cluster to which pixel i is currently assigned, and N is the number of pixels. 
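The RMS color error of Eq. 3 can be computed directly. The sketch below assumes the conventional RMS normalization over the N pixels; the magnitudes plotted in the figure may use a different scaling:

```python
import numpy as np

def rms_color_error(pixels, centers, labels):
    """RMS distance between true colors v_i and their assigned cluster colors c_i.

    pixels:  (N, 3) true RGB vectors; centers: (K, 3) cluster mean colors;
    labels:  (N,) index of the cluster assigned to each pixel.
    """
    diff = pixels - centers[labels]                     # v_i - c_i for every pixel
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))
```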
Plotting E vs. epoch number for both Kohonen and competitive learning, we find the results in the figure below. \n\n[Figure: Quantization error vs. epoch number (0 to 50) for the competitive and Kohonen networks; the vertical axis spans 1e+06 to 1e+07.] \n\nIt is clear from this figure that the KFM network converges more rapidly to a stable solution with much lower error than does the competitive network. Such figures can be deceiving in image processing, however, since RMS error is a notoriously bad quality measure (small regions may have very large errors in order to make the overall average error low). In this case, however, the Kohonen map preserves the accuracy of color rendering in small regions quite well. \n\nTo evaluate the sensitivity to initial cluster center choices, both competitive learning and KFM were applied with different choices of centers. We found that competitive learning often converged to undesirable renderings, whereas KFM always yielded a good solution, even when the initial centers were all at (0,0,0). \n\n5 DISCUSSION \n\nThe quality of rendering attained by these algorithms is due to the nature of facial images. There is a great deal of density in the flesh colored region and a comparatively smaller, but nonetheless sizable, amount in the background colors. The competitive learning algorithm found these high density regions with no problem. Greater difficulty was had with the blue eyes, since there are few examples of blue to be trained on, and hence the algorithm was swamped by the high density regions. If one let the competitive learning algorithm run for a large number of epochs, it eventually found the blue cluster. 
The assignment of clusters to subdivisions of feature space guarantees that no region of the image was particularly emphasized, therefore allowing clusters that were influenced solely by less represented colors. However, this can also \"waste\" clusters in regions where there are few examples. \n\nFurthermore, the topological structure of the Kohonen map allows one to make certain assumptions to speed up the algorithm. \n\nDespite a minor penalty in computational speed per epoch, the Kohonen algorithm produces the image with the least error in the least amount of time. With appropriate choice of parameters, the clustered image becomes indistinguishable from the original in less than ten epochs, for essentially arbitrary initial conditions (as opposed to competitive learning). The other clustering techniques require significantly longer times. \n\n6 REFERENCES \n1. G. H. Ball and D. J. Hall, \"ISODATA, A Novel Method of Data Analysis and Pattern Classification\", SRI Technical Report (NTIS AD699616), Stanford, CA, 1965 \n2. D. DeSieno, \"Adding a Conscience to Competitive Learning\", International Conference on Neural Networks, Vol. 1, pp. 117-124, 1988 \n3. K. Fukunaga, Introduction to Statistical Pattern Recognition, Academic Press, Orlando, FL, 1972 \n4. Y. Linde, A. Buzo, and R. Gray, \"An Algorithm for Vector Quantizer Design\", IEEE Trans. Com., Vol. COM-28, No. 1, pp. 84-95, Jan. 1980 \n5. T. Kohonen, \"Self-Organized Formation of Topologically Correct Feature Maps\", Biological Cybernetics, 43:59-69, 1982 \n6. J. MacQueen, \"Some Methods for Classification and Analysis of Multivariate Observations\", Proc. 5th Berkeley Symposium, 1, pp. 281-297, 1967 \n7. N. Nasrabadi and Y. Feng, \"Vector Quantization of Images Based upon Kohonen Self-organizing Feature Maps\", IEEE International Conference on Neural Networks, Vol. 1, pp. 101-108, 1988 \n8. H. Potlapalli, M. Jaisimha, H. Barad, A. Martinez, M. Lohrenz, J. 
Ryan, and J. Pollard, \"Classification Techniques for Digital Map Compression\", 21st Southeastern Symposium on System Theory, pp. 268-272, Tallahassee, FL, 1989 \n9. H. Ritter and K. Schulten, \"Kohonen Self-organizing Maps: Exploring their Computational Capabilities\", IEEE International Conference on Neural Networks, Vol. 1, pp. 109-116, 1988 \n10. C. W. Therrien, Decision, Estimation, and Classification, Wiley, NY, 1989", "award": [], "sourceid": 352, "authors": [{"given_name": "Wesley", "family_name": "Snyder", "institution": null}, {"given_name": "Daniel", "family_name": "Nissman", "institution": null}, {"given_name": "David", "family_name": "Van den Bout", "institution": null}, {"given_name": "Griff", "family_name": "Bilbro", "institution": null}]}