{"title": "Hyperbolic Self-Organizing Maps for Semantic Navigation", "book": "Advances in Neural Information Processing Systems", "page_first": 1417, "page_last": 1424, "abstract": "", "full_text": "Hyperbolic Self-Organizing Maps for Semantic Navigation\n\nJ\u00f6rg Ontrup\nNeuroinformatics Group\nFaculty of Technology\nBielefeld University\nD-33501 Bielefeld, Germany\njontrup@techfak.uni-bielefeld.de\n\nHelge Ritter\nNeuroinformatics Group\nFaculty of Technology\nBielefeld University\nD-33501 Bielefeld, Germany\nhelge@techfak.uni-bielefeld.de\n\nAbstract\n\nWe introduce a new type of Self-Organizing Map (SOM) to navigate in the Semantic Space of large text collections. We propose a \u201chyperbolic SOM\u201d (HSOM) based on a regular tesselation of the hyperbolic plane, which is a non-euclidean space characterized by constant negative gaussian curvature. The exponentially increasing size of a neighborhood around a point in hyperbolic space provides more freedom to map the complex information space arising from language into spatial relations. We describe experiments showing that the HSOM can successfully be applied to text categorization tasks and yields results comparable to other state-of-the-art methods.\n\n1 Introduction\n\nFor many tasks of exploratory data analysis the Self-Organizing Map (SOM), as introduced by Kohonen more than a decade ago, has become a widely used tool [1, 2]. So far, the overwhelming majority of SOM approaches have taken it for granted to use a flat space as their data model and, motivated by its convenience for visualization, have favored the (suitably discretized) euclidean plane as their chief \u201ccanvas\u201d for the generated mappings.\n\nHowever, even if our thinking is deeply entrenched with euclidean space, an obvious limiting factor is the rather restricted neighborhood that \u201cfits\u201d around a point on a euclidean 2D surface. 
Hyperbolic spaces in contrast offer an interesting loophole. They are characterized by uniform negative curvature, resulting in a geometry such that the size of a neighborhood around a point increases exponentially with its radius $r$. This exponential scaling behavior allows the creation of novel displays of large hierarchical structures that are particularly accessible to visual inspection [3, 4].\n\nConsequently, we suggest using hyperbolic spaces also in conjunction with the SOM. The lattice structure of the resulting hyperbolic SOMs (HSOMs) is based on a tesselation of the hyperbolic space (in 2D or 3D) and the lattice neighborhood reflects the hyperbolic distance metric that is responsible for the non-intuitive properties of hyperbolic spaces.\n\nAfter a brief introduction to the construction of hyperbolic spaces we describe several computer experiments that indicate that the HSOM offers new interesting perspectives in the field of text-mining.\n\n2 Hyperbolic Spaces\n\nHyperbolic and spherical spaces are the only non-euclidean geometries that are homogeneous and have isotropic distance metrics [5, 6]. The geometry of H2 is a standard topic in Riemannian geometry (see, e.g. [7]), and the relationships for the area $A$ and the circumference $C$ of a circle of radius $r$ are given by\n\n$A = 4\\pi \\sinh^2(r/2), \\quad C = 2\\pi \\sinh(r)$ (1)\n\nThese formulae exhibit the highly remarkable property that both quantities grow exponentially with the radius $r$. It is this property that was observed in [3, 4] to make hyperbolic spaces extremely useful for accommodating hierarchical structures.\n\nTo use this potential for the SOM, we must solve two problems: (i) we must find suitable discretization lattices on H2 to which we can \u201cattach\u201d the SOM prototype vectors; (ii) after having constructed the SOM, we must somehow project the (hyperbolic!) lattice into \u201cflat space\u201d in order to be able to inspect the generated maps.\n\n2.1 Projections of Hyperbolic Spaces\n\nTo construct an isometric (i.e., distance preserving) embedding of the hyperbolic plane into a \u201cflat\u201d space, we may use a Minkowski space [8]. In such a space, the squared distance $d^2$ between two points $(x, y, u)$ and $(x', y', u')$ is given by\n\n$d^2 = (x - x')^2 + (y - y')^2 - (u - u')^2$ (2)\n\ni.e., it ceases to be positive definite. Still, this is a space with zero curvature and its somewhat peculiar distance measure allows the construction of an isometric embedding of the hyperbolic plane H2, given by\n\n$x = \\sinh(r)\\cos(\\phi), \\quad y = \\sinh(r)\\sin(\\phi), \\quad u = \\cosh(r)$ (3)\n\nwhere $(r, \\phi)$ are polar coordinates on the H2. Under this embedding, the hyperbolic plane appears as the surface $M$ swept out by rotating the curve $u^2 = 1 + x^2 + y^2$ about the $u$-axis.\n\nFrom this embedding, we can construct two further ones, the so-called Klein model and the Poincar\u00e9 model [5, 9] (we will use the latter to visualize HSOMs below). 
Both achieve a projection of the infinite H2 into the unit disk, however, at the price of distorting distances. The Klein model is obtained by projecting the points of $M$ onto the plane $u = 1$ along rays passing through the origin $O$ (see Fig. 1). Obviously, this projects all points of $M$ into the \u201cflat\u201d unit disk $x^2 + y^2 < 1$ of $\\mathbb{R}^2$ (e.g., $A \\mapsto B$). The Poincar\u00e9 model results if we add two further steps: first a perpendicular projection of the Klein model onto the (\u201cnorthern\u201d) surface of the unit sphere centered at the origin (e.g., $B \\mapsto C$), and then a stereographic projection of the \u201cnorthern\u201d hemisphere onto the unit circle about the origin in the ground plane $u = 0$ (point $D$).\n\nFigure 1: Construction steps underlying Klein and Poincar\u00e9 models of the space H2.\n\nIt turns out that the resulting projection of H2 has a number of pleasant properties, among them the preservation of angles and the mapping of shortest paths onto circular arcs belonging to circles that intersect the unit disk at right angles. Distances in the original H2 are strongly distorted in its Poincar\u00e9 (and also in the Klein) image (cf. Eq. (5)), however, in a rather useful way: the mapping exhibits a strong \u201cfish-eye\u201d effect. The neighborhood of the H2 origin is mapped almost faithfully (up to a linear shrinkage factor of 2), while more distant regions become increasingly \u201csqueezed\u201d. 
Since asymptotically the radial distances and the circumference both grow according to the same exponential law, the squeezing is \u201cconformal\u201d, i.e., (sufficiently small) shapes painted onto H2 are not deformed, only their size shrinks with increasing distance from the origin. By translating the original H2, the fish-eye fovea can be moved to any other part of H2, which makes it possible to selectively zoom in on interesting portions of a map painted on H2 while still keeping a coarser view of its surrounding context.\n\n2.2 Tesselations of the Hyperbolic Plane\n\nTo complete the set-up for a hyperbolic SOM we still need an equivalent of a regular grid in the hyperbolic plane. For the hyperbolic plane there exists an infinite number of tesselations with congruent polygons such that each grid point is surrounded by the same number $n$ of neighbors [9, 10]. Fig. 2 shows two example tesselations (for the minimal value of $n = 7$ and for $n = 10$), using the Poincar\u00e9 model for their visualization. While these tesselations appear non-uniform, this is only due to the fish-eye effect of the Poincar\u00e9 projection. In the original H2, each tesselation triangle has the same size.\n\nOne way to generate these tesselations algorithmically is by repeated application of a suitable set of generators of their symmetry group to a (suitably sized, cf. below) \u201cstarting triangle\u201d; for more details cf. [11].\n\nFigure 2: Regular triangle tesselations of the hyperbolic plane, projected into the unit disk using the Poincar\u00e9 mapping. The left tesselation shows the case where the minimal number ($n = 7$) of equilateral triangles meet at each vertex, the right figure was constructed with $n = 10$. 
In the Poincar\u00e9 projection, only sides passing through the origin appear straight, all other sides appear as circular arcs, although in the original space all triangles are congruent.\n\n3 Hyperbolic SOM Algorithm\n\nWe now have all ingredients required for a \u201chyperbolic SOM\u201d. We organize the nodes of a lattice as described above in \u201crings\u201d around an origin node. The number of nodes of such a lattice grows very rapidly (asymptotically exponentially) with the chosen lattice radius $R$ (its number of rings). For instance, a lattice with $n = 7$, $R = 6$ contains 1625 nodes. Each lattice node $a$ carries a prototype vector $w_a \\in \\mathbb{R}^D$ from some $D$-dimensional feature space (if we wish to make any non-standard assumptions about the metric structure of this space, we would build this into the distance metric that is used for determining the best-match node). The SOM is then formed in the usual way, e.g., in on-line mode by repeatedly determining the winner node $a^*$ and adjusting all nodes $a \\in N(a^*, t)$ in a radial lattice neighborhood $N(a^*, t)$ around $a^*$ according to the familiar rule\n\n$\\Delta w_a = \\eta \\, h_{a a^*} (x - w_a)$ (4)\n\nwith $h_{a a^*} = \\exp(-d_{a a^*}^2 / 2\\sigma^2)$. However, since we now work on a hyperbolic lattice, we have to determine both the neighborhood $N(a^*, t)$ and the (squared) node distance $d_{a a^*}^2$ according to the natural metric that is inherited by the hyperbolic lattice.\n\nThe simplest way to do this is to keep with each node $a$ a complex number $z_a$ to identify its position in the Poincar\u00e9 model. The node distance is then given (using the Poincar\u00e9 model, see e.g. [7]) as\n\n$d_{a a^*} = 2 \\, \\mathrm{arctanh} \\left| \\frac{z_a - z_{a^*}}{1 - \\bar{z}_{a^*} z_a} \\right|$ (5)\n\nThe neighborhood $N(a^*, t)$ can be defined as the subset of nodes within a certain graph distance (which is chosen as a small multiple of the neighborhood radius $\\sigma$) around $a^*$.\n\n4 Experiments\n\nSome introductory experiments, in which several examples illustrate the favorable properties of the HSOM as compared to the \u201cstandard\u201d euclidean SOM, can be found in [11, 12]. A major example of the use of the SOM for text mining is the WEBSOM project [2].\n\n4.1 Text Categorization\n\nIn order to apply the HSOM to natural text categorization, i.e. the assignment of natural language documents to a number of predefined categories, we follow the widely used vector-space-model of Information Retrieval (IR). For each document $d$ we construct a feature vector $\\vec{f}(d)$, where the components $f_i$ are determined by the frequency with which term $t_i$ occurs in that document. Following standard practice [13] we choose a term frequency $\\times$ inverse document frequency weighting scheme:\n\n$f_i = tf(t_i, j) \\log\\left( \\frac{N}{df(t_i)} \\right)$ (6)\n\nwhere the term frequency $tf(t_i, j)$ denotes the number of times term $t_i$ occurs in $d_j$, $N$ the number of documents in the training set and $df(t_i)$ the document frequency of $t_i$, i.e. the number of documents $t_i$ occurs in.\n\nThe HSOM can be utilized for text categorization in the following manner. In a first step, the training set is used to adapt the weight vectors $w_a$ according to (4). During the second step, the training set is mapped onto the HSOM lattice. To this end, for each training example $d_j$ its best match node $a^*$ is determined such that\n\n$|\\vec{f}(d_j) - w_{a^*}| \\le |\\vec{f}(d_j) - w_a| \\quad \\forall a$ (7)\n\nwhere $\\vec{f}(d_j)$ denotes the feature vector of document $d_j$, as described above. After all examples have been presented to the net, each node $a$ is labelled with the union $U_a$ of all categories that belonged to the documents that were mapped to this node. A new, unknown text is then classified into the union $U_{a^*}$ of categories which are associated with its winner node $a^*$ selected in the HSOM.\n\nText Collection. We used the Reuters-21578 data set (as compiled by David Lewis from the AT&T Research Lab in 1987; the data can be found at http://www.research.att.com/~lewis/) since it provides a well known baseline which is also used by other authors to evaluate their approaches, cf. [14, 15]. We have used the \u201cModApte\u201d split, leading to 9603 
training and 3299 test documents. After preprocessing, our training set contained 5561 distinct terms.\n\nPerformance Evaluation. The classification effectiveness is commonly measured in terms of precision $P$ and recall $R$ [16], which can be estimated as\n\n$P_i = \\frac{TP_i}{TP_i + FP_i}, \\quad R_i = \\frac{TP_i}{TP_i + FN_i}$\n\nwhere $TP_i$ and $TN_i$ are the numbers of documents correctly classified, and correctly not classified to category $c_i$, respectively. Analogously, $FP_i$ and $FN_i$ are the corresponding numbers of falsely classified documents.\n\nFor each node $a$ and each category $c_i$ a confidence value $C_{a,i}$ is determined. It describes the number of training documents belonging to class $c_i$ which were mapped to node $a$. When retrieving documents from a given category $c_i$, we compare for each node $a$ its associated $C_{a,i}$ against a threshold $\\theta$. Documents from nodes with $C_{a,i} > \\theta$ are then included in the retrieval set. For nodes $a$ which contain a set of documents $D_a$, the order of the retrieval set is ranked by $\\cos(w_a, \\vec{f}(d_j))$, where $d_j \\in D_a$.\n\nIn this way the number of retrieved documents can be controlled and we obtain the precision-recall diagrams as shown in Fig. 3.\n\nIn order to compare the HSOM\u2019s performance for text categorization, we also evaluated a $k$-nearest neighbor ($k$-NN) classifier with our training set. Apart from boosting methods [16] only support vector machines [14] have shown better performances. The confidence level of a $k$-NN classifier to assign document $d_j$ to class $c_i$ is\n\n$C_i^{\\mathrm{kNN}}(d_j) = \\sum_{d' \\in N_k(d_j)} a_i(d') \\cos(d_j, d')$ (8)\n\nwhere $N_k(d_j)$ is the set of $k$ documents $d'$ for which $\\cos(d_j, d')$ is maximum. The assignment factor $a_i(d')$ is 1 if $d'$ belongs to category $c_i$ and 0 otherwise. According to [14, 17] we have chosen the $k = 30$ nearest neighbors.\n\nText Categorization Results. The results of three experiments are shown in Table 1. We have compared a HSOM with $R = 4$ rings and a tesselation with $n = 9$ neighbors (summing up to 1306 nodes) to a spherical standard euclidean SOM as described in [11] with approx. 1300 nodes, and the $k$-NN classifier. Our results indicate that the HSOM does not perform better than a $k$-NN classifier, but to a certain extent also does not perform significantly worse either. It is noticeable that for less dominant categories the HSOM yields superior results to those of the standard SOM. This is due to the fact that the nodes in H2 cover a much broader space and therefore offer more freedom to map smaller portions of the original dataspace with less distortion as compared to euclidean space.\n\nAs the $k$-NN results suggest, other state-of-the-art techniques like support vector machines will probably lead to better numerical categorization results than the HSOM. However, since the main purpose of the HSOM is the visualization of relationships between texts and text categories, we believe that the observed categorization performance of the HSOM compares sufficiently well with the more specialized (non-visualization) techniques to warrant its efficient use for creating insightful maps of large bodies of document data.\n\nTable 1: Precision-recall breakeven points for the ten most prominent categories.\n\n       earn  acq   mny-fx  crude  grain  trade  interest  wheat  ship  corn\nSOM    90.0  81.2  61.7    70.3   69.4   48.8   61.9      57.1   54.8  50.3\nHSOM   90.2  81.6  68.7    78.8   76.2   56.8   69.3      66.4   61.8  53.6\nk-NN   93.8  83.7  69.3    84.7   81.9   61.9   69.0      71.0   77.5  67.9\n\nFigure 3: Precision-recall curves for the three most frequent categories earn, acq and money-fx, shown for (a) the $k$-NN classifier and (b) the HSOM.\n\n4.2 Text Mining & Semantic Navigation\n\nA major advantage of the HSOM is its remarkable capability to map high-dimensional similarity relationships to a low-dimensional space which can be more easily handled and interpreted by the human observer. This feature and the particular \u201cfish-eye\u201d capability motivate our approach to visualize whole text collections with the HSOM. It can be regarded as an interface capturing the semantic structure of a text database and provides a way to guide the user\u2019s attention. In preliminary experiments we have labelled the nodes with glyphs corresponding to the categories of the documents mapped to that node. In Fig. 4 two HSOM views of the Reuters data set are shown. Note that the major amount of data gets mapped to the outermost region, where the nodes of the HSOM make use of the large space offered by the hyperbolic geometry. 
During the unsupervised training process, the documents\u2019 categories were not presented to the HSOM. Nevertheless, several document clusters can be clearly identified. The two most prominent are the earn and acquisition regions of the map, reflecting the large proportion of these categories in the Reuters-21578 collection. Note that categories which are semantically similar are located beside each other, as can be seen in the corn, wheat, grain, the interest, money-fx or the crude, ship areas of the map. In addition to the category (glyph type) and the number of training documents per node (glyph size), the number of test documents mapped to each node is shown as the height of the symbol above the ground plane. In this way the HSOM can be used as a novelty detector in chronological document streams. For the Reuters-21578 dataset, a particular node stands out. It corresponds to the small glyph tagged with the \u201cship\u201d label in Fig. 4. Only a few documents from the training collection are mapped to that node, as shown by its relatively small glyph size. The large $z$-value on the other hand indicates that it contains a large number of test documents, and is therefore probably semantically connected to a significant, novel event only contained in the test collection. The right image of Fig. 4 shows the same map, but with the focal view now moved in the direction of the conspicuous \u201cship\u201d node, resulting in a magnification of the corresponding area. A closer inspection reveals that the vast majority (35 of 40) of the test documents describe an incident where an Iranian oil rig was attacked in the gulf. 
Although no document of the training set describes this incident (because the text collection is ordered by time and the attack took place \u201cafter\u201d the split into train and test set), the HSOM generalizes well and maps the semantic content of these documents to the proper area of the map, located between the regions for crude and ship.\n\nFigure 4: The left figure shows a central view of the Reuters data. We used a HSOM with $R$ rings and a tesselation with $n$ neighbors. Ten different glyphs were used to visualize the ten most frequent categories. They were manually tagged to indicate the correspondence between category and symbol type. The glyph sizes and the $z$-values (height above ground plane) reflect the number of training and test documents mapped to the corresponding node, respectively.\n\nThe next example illustrates that the HSOM can provide more information about an unknown text than just its category. For this experiment we have taken movie reviews from the rec.arts.movies.reviews newsgroup. Since all the reviews describe a certain movie, we retrieved their associated genres from the Internet Movie Database (http://www.imdb.com) to build a set of category labels for each document. The training set contained 8923 randomly selected reviews (without their genre information) from films released before 2000. We then presented the system with five reviews from the film \u201cAtlantis\u201d, a Disney cartoon released in 2001. The HSOM correctly classified all of the five texts as reviews for an animation movie. In Fig. 5 the projection of the five new documents onto the map with the previously acquired text collection is shown. It can be seen that there exist several clusters related to the animation genre. 
By moving the fovea of the HSOM we can now \u201czoom\u201d into that region which contains the five new texts. In the right of Fig. 5 it can be seen that all of the \u201cAtlantis\u201d reviews were mapped to a node in immediate vicinity of documents describing other Disney animation movies. This example motivates the approach of \u201csemantic navigation\u201d to rapidly visualize the linkage between unknown documents and previously acquired semantic concepts.\n\nFigure 5: A HSOM with lattice radius $R$ and a tesselation with $n$ neighbors was used to map movie reviews from newsgroup channels. In both figures, glyph size and $z$-value indicate the number of texts related to the animation genre mapped to the corresponding node. Nodes exceeding a certain threshold were labelled with the title corresponding to the most frequently occurring movie mapped to that node. The underlined label in the right figure indicates the position of the node to which five new documents were mapped. Node labels visible in the figure: Tarzan, Mulan, The Iron Giant, South Park, Chicken Run, Dinosaur, Antz, A Bug\u2019s Life, The Prince of Egypt, Hercules, Aladin, Atlantis, Beauty and the beast, Anastasia, Pocahontas.\n\n5 Conclusion\n\nEfficient navigation in \u201cSemantic Space\u201d requires addressing two challenges: (i) how to create a low dimensional display of semantic relationships of documents, and (ii) how to obtain these relationships by automated text categorization. Our results show that the HSOM can provide a good solution to both demands simultaneously and within a single framework.\n\nThe HSOM is able to exploit the peculiar geometric properties of hyperbolic space to successfully compress complex semantic relationships between text documents. 
Additionally, the use of hyperbolic lattice topology for the arrangement of the HSOM nodes offers new and attractive features for interactive \u201csemantic navigation\u201d. Large document databases can be inspected at a glance while the HSOM provides additional information which was captured during a previous training step, making it possible, e.g., to rapidly visualize relationships between new documents and previously acquired collections.\n\nFuture work will address more sophisticated visualization strategies based on the new approach, as well as the exploration of other text representations which might take advantage of hyperbolic space properties.\n\nReferences\n\n[1] T. Kohonen. Self-Organizing Maps. Springer Series in Information Sciences. 3rd edition, 2001.\n[2] Teuvo Kohonen, Samuel Kaski, Krista Lagus, Jarkko Saloj\u00e4rvi, Vesa Paatero, and Antti Saarela. Organization of a massive document collection. IEEE Transactions on Neural Networks, Special Issue on Neural Networks for Data Mining and Knowledge Discovery, 11(3):574\u2013585, May 2000.\n[3] John Lamping and Ramana Rao. Laying out and visualizing large trees using a hyperbolic space. In Proceedings of UIST\u201994, pages 13\u201314, 1994.\n[4] T. Munzner. Exploring large graphs in 3D hyperbolic space. IEEE Computer Graphics and Applications, 18(4):18\u201323, July/August 1998.\n[5] H. S. M. Coxeter. Non Euclidean Geometry. Univ. of Toronto Press, Toronto, 1957.\n[6] J. A. Thorpe. Elementary Topics in Differential Geometry. Springer-Verlag, New York, 1979.\n[7] Frank Morgan. Riemannian Geometry: A Beginner\u2019s Guide. Jones and Bartlett Publishers, Boston, London, 1993.\n[8] Charles W. Misner, J. A. Wheeler, and Kip S. Thorne. Gravitation. Freeman, 1973.\n[9] R. Fricke and F. Klein. Vorlesungen \u00fcber die Theorie der automorphen Funktionen, volume 1. Teubner, Leipzig, 1897. Reprinted by Johnson Reprint, New York, 1965.\n[10] W. Magnus. 
Noneuclidean Tesselations and Their Groups. Academic Press, 1974.\n[11] Helge Ritter. Self-organizing maps in non-euclidian spaces. In E. Oja and S. Kaski, editors, Kohonen Maps, pages 97\u2013108. Amer Elsevier, 1999.\n[12] J. Ontrup and H. Ritter. Text categorization and semantic browsing with self-organizing maps on non-euclidean spaces. In Proc. of the PKDD-01, 2001.\n[13] G. Salton and C. Buckley. Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5):513\u2013523, 1988.\n[14] T. Joachims. Text categorization with support vector machines: learning with many relevant features. In Proc. of ECML-98, number 1398, pages 137\u2013142, Chemnitz, DE, 1998.\n[15] Huma Lodhi, John Shawe-Taylor, Nello Cristianini, and Chris Watkins. Text classification using string kernels. In Todd K. Leen, Thomas G. Dietterich, and Volker Tresp, editors, Advances in Neural Information Processing Systems 13, pages 563\u2013569. MIT Press, 2001.\n[16] F. Sebastiani, A. Sperduti, and N. Valdambrini. An improved boosting algorithm and its application to automated text categorization. In Proc. of CIKM-00, pages 78\u201385, 2000.\n[17] Y. Yang. An evaluation of statistical approaches to text categorization. Information Retrieval, 1-2(1):69\u201390, 1999.\n", "award": [], "sourceid": 2029, "authors": [{"given_name": "Jorg", "family_name": "Ontrup", "institution": null}, {"given_name": "Helge", "family_name": "Ritter", "institution": null}]}