{"title": "Parameterising Feature Sensitive Cell Formation in Linsker Networks in the Auditory System", "book": "Advances in Neural Information Processing Systems", "page_first": 1007, "page_last": 1013, "abstract": "", "full_text": "Parameterising Feature Sensitive Cell \nFormation in Linsker Networks in the \n\nAuditory System \n\nLance C. Walton \n\nDavid L. Bisset \n\nUniversity of Kent at Canterbury \n\nUniversity of Kent at Canterbury \n\nCanterbury \n\nCanterbury \n\nKent \n\nEngland \n\nKent \n\nEngland \n\nAbstract \n\nThis paper examines and extends the work of Linsker (1986) on \nself organising feature detectors. Linsker concentrates on the vi(cid:173)\nsual processing system, but infers that the weak assumptions made \nwill allow the model to be used in the processing of other sensory \ninformation. This claim is examined here, with special attention \npaid to the auditory system, where there is much lower connec(cid:173)\ntivity and therefore more statistical variability. On-line training is \nutilised, to obtain an idea of training times. These are then com(cid:173)\npared to the time available to pre-natal mammals for the formation \nof feature sensitive cells. \n\n1 \n\nINTRODUCTION \n\nWithin the last thirty years, a great deal of research has been carried out in an \nattempt to understand the development of cells in the pathways between the sensory \napparatus and the cortex in mammals. For example, theories for the development of \nfeature detectors were forwarded by Nass and Cooper (1975), by Grossberg (1976) \nand more recently Obermayer et al (1990). \n\nHubel and Wiesel (1961) established the existence of several different types of fea(cid:173)\nture sensitive cell in the visual cortex of cats. Various subsequent experiments have \n\n1007 \n\n\f1008 Walton and Bisset \n\nshown that a considerable amount of development takes place before birth (i.e. \nwithout environmental input). 
This must either depend on a genetic predisposition for individual cells to develop in an appropriate way without external influence, or on some set of low-level rules sufficient to create the required cell morphologies in the presence of random action potentials. \n\nAlthough there is a great deal of a priori information concerning axon growth and synapse arborisation (governed by chemical means in the brain), it is difficult to conceive of a biological system that could use genetic information to directly manipulate the spatial relationship between a pre-synaptic target and the axon with which the synapse is made. However, there is considerable random activity in the sensory apparatus that could be used to drive synaptic development. \n\nVarious authors have constructed models that deal with different aspects of self-organisation of this kind, and some have pointed out the value of these types of cells in pattern classification problems (Grossberg 1976); but either the biological plausibility of these models is questionable, or the subject of pre-natal development is not addressed. \n\nIn this paper, the networks of Linsker (1986) will be examined. Although these networks have been analysed quite extensively by Linsker, and also by MacKay and Miller (1990), the biological aspects of parameter ranges and choices have only been touched upon. It is our aim in this paper to add further detail in this area by examining the one-dimensional case, which represents the auditory pathways. \n\n2 LINSKER NETWORKS \n\nThe network is based on a Multi Layer Perceptron, with feed-forward connections in all layers, and lateral connections (inhibition and excitation) in higher layers. The neural outputs are sums of the weighted inputs, and the weights develop according to a constrained Hebbian rule. 
Each layer is lettered for reference starting from A, and subsequent layers are lettered B, C, D etc. The superscript M will be used to refer to an arbitrary layer, and L to refer to the previous layer. Each layer has a set of parameters which are the same for all neurons in that layer. Connectivity is random but is based on a Gaussian density distribution exp(-r^2/r_M^2), where r_M is the arbor radius for layer M. \n\nEach layer is a rectangular array of neurons (or a vector of neurons in the one-dimensional case). The layers are assumed to be large enough that edge effects are unimportant or do not occur. Layers develop one at a time, starting from the B layer. The A layer is an input layer, which is divided into boxes, within each of which activity is uniform. This is biologically realistic, since sensory neurons fan out to a number of cells (an average of 10 in the cochlea), each of which takes input from only one sensory cell. Hence the input layer for the network acts like a layer of tonotopically organised neurons. \n\n3 NETWORK DEVELOPMENT \n\nThe output of a neuron in layer M is given by \n\nF^{M\pi}_n = R_a + R_b \sum_j c_{nj} F^{L\pi}_{pre(nj)} (1) \n\nWhere, \n\n\pi indexes a pattern presentation, \n\nthe subscript n is used to index the M layer neurons, \n\nR_a, R_b are layer parameters, \n\nF^{L\pi}_{pre(nj)} is the output of the L layer neuron which is pre-synaptic to the j'th input of the n'th M layer neuron. \n\nThe synaptic weights develop according to a constrained Hebbian learning rule, \n\n(\Delta c_{ni})_\pi = k_a + k_b (F^{M\pi}_n - F^M_1)(F^{L\pi}_{pre(ni)} - F^L_2) (2) \n\nWhere, \n\n(\Delta c_{ni})_\pi is the change in the i'th weight of neuron n, \n\nk_a, k_b, F^M_1, F^L_2 are layer parameters. \n\nSynaptic weights are constrained to lie within the range (n_em - 1, n_em). 
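In simulation, one pattern presentation amounts to evaluating the output rule and then applying the clipped Hebbian update. The following is a minimal sketch for a single M-layer neuron; apart from n_em = 0.5, the parameter values are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Minimal sketch of equations (1) and (2) for one M-layer neuron.
# Parameter values below are illustrative assumptions (except n_em = 0.5).
rng = np.random.default_rng(0)

N_M = 50                 # synaptic connections to the neuron
R_a, R_b = 0.0, 1.0      # output-rule layer parameters
k_a, k_b = 0.0, 1e-3     # Hebbian-rule layer parameters
F1_M, F2_L = 0.5, 0.5    # the F_1^M and F_2^L layer parameters
n_em = 0.5               # weights lie in (n_em - 1, n_em)

c = rng.uniform(n_em - 1.0, n_em, size=N_M)            # synaptic weights c_ni

for _ in range(1000):
    F_L = rng.integers(0, 2, size=N_M).astype(float)   # binary L-layer activity
    F_M = R_a + R_b * np.dot(c, F_L)                   # equation (1)
    dc = k_a + k_b * (F_M - F1_M) * (F_L - F2_L)       # equation (2)
    c = np.clip(c + dc, n_em - 1.0, n_em)              # hard weight constraint
```

With binary inputs the weights drift under the correlations in the input and are held inside the constraint interval by the clip, which is what eventually saturates them in the morphological regimes discussed below.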
(In this work, n_em = 0.5.) \n\nLinsker (1986a) derives an Ensemble Averaged Development equation which shows how development depends on the parameters, and how correlations develop between spatially proximate neurons in layers beyond the first. In so doing, the number of parameters is reduced from five per layer to two per layer, and the equation is therefore a very useful aid in understanding the self-organising nature of this model. The development equation is \n\n\dot{c}_{ni} = K_1 + \frac{1}{N_M} \sum_j (Q^L_{pre(ni),pre(nj)} + K_2) c_{nj} (3) \n\nWhere, \n\nQ^L_{ij} \equiv \langle (F^{L\pi}_i - \bar{F}^L)(F^{L\pi}_j - \bar{F}^L) \rangle / f_0^2 (4) \n\nN_M is the number of synaptic connections to an M layer neuron, \n\n\bar{F}^L is the average output activity in the L layer, \n\nK_1 = [k_a + k_b (R_a - F^M_1)(\bar{F}^L - F^L_2)] / (N_M k_b R_b f_0^2) (5) \n\nK_2 = \bar{F}^L (\bar{F}^L - F^L_2) / f_0^2 (6) \n\nf_0 is a unit of activity used to normalise the two point correlation function Q^L. In this work, f_0 is chosen to set Q^L_{ii} = 1. Angle brackets denote an average taken over the ensemble of input patterns. \n\n4 MORPHOLOGICAL REGIMES \n\nFrom equation 3, an expression can be found for the average weight value \bar{c} in a layer, and therefore certain properties of the system can be described. Although MacKay and Miller (1990) have described the regimes with the aid of eigenvalues and eigenfunctions, there is a much simpler method which will provide the same information. \n\nFor an all-excitatory (AE) layer, the average weight value is equal to n_em. Since all weights are equal to n_em, the summation in equation 3 can be re-written n_em \sum_j Q^L_{pre(ni),pre(nj)} = n_em N_M \bar{q}, where \bar{q} is the average of Q^L over pairs of synaptic inputs, a quantity fixed by the arbor radii r_B and r_C. \n\nA similar expression can be found for all-inhibitory (AI) layers, and therefore the K_1-K_2 plane can be sub-divided into three regions which will yield AE cells, AI cells, and mixed-mode cells (see figure 1). \n\nThe plane can be divided further for the mixed-mode cell type in the C layer. On-center and off-center cells develop close to the AE and AI boundaries respectively. MacKay and Miller have shown why these cells develop and have placed a theoretical lower bound on \bar{c} which agrees with experimental data. However, in so doing, the effect of the intercept on the K_2 axis was deemed small, due to a large number of synaptic connections. This approximation depends upon the large number of connections between the B and C layers. In the auditory case, the number of connections is smaller, and it is possible that this assumption no longer holds. \n\nFrom equation 3, it can be seen that movement into the On-Centre region from the AE region causes the value of \sum_j Q^L_{pre(ni),pre(nj)} c_{nj} to decrease. This has the effect of moving the intercept of the constant \bar{c} line from K_2 = \bar{q} towards K_2 = 0. K_2 finally reaches 0 when \bar{c} = 0, and then begins to move back towards \bar{q} as the AI regime is approached. \n\nThis has two potentially important effects. Firstly, it means that the tolerance of K_2 varies with K_1; for a particular value of K_1, there are upper and lower limits on the value of K_2 which will allow maturation of on-center cells. This range of values (i.e. the difference between the limits) varies in a linear way with K_1, but the ratio of the range to a value of K_2 which is within the range (i.e. the center value) is not linear with K_1. Here, tolerance is defined as that ratio. Secondly, there is a region of negative K_2 where the nature of the cell morphology which will be produced is unknown. 
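Taking the development equation in the form c-dot_{ni} = K_1 + (1/N_M) sum_j (Q^L + K_2) c_{nj} (the reading assumed here), the AE and AI regimes can be illustrated directly. In the sketch below the layer size, arbor width, step size and K values are illustrative assumptions, with a one-dimensional Gaussian correlation function normalised so that Q_ii = 1.

```python
import numpy as np

# Illustrative 1-D sketch of the ensemble-averaged development
# equation (3): c_dot = K1 + (Q + K2) @ c / N_M, with hard clipping.
# All parameter values are assumptions chosen to show the two regimes.
N_M, n_em = 100, 0.5
r_B = 10.0
x = np.arange(N_M, dtype=float)
Q = np.exp(-((x[:, None] - x[None, :]) ** 2) / r_B**2)   # Q_ii = 1

def develop(K1, K2, steps=500, eta=0.05):
    c = np.zeros(N_M)                        # start from zero weights
    for _ in range(steps):
        c_dot = K1 + (Q + K2) @ c / N_M      # equation (3)
        c = np.clip(c + eta * c_dot, n_em - 1.0, n_em)
    return c

all_excitatory = develop(K1=+1.0, K2=0.0)    # saturates at n_em (AE regime)
all_inhibitory = develop(K1=-1.0, K2=0.0)    # saturates at n_em - 1 (AI regime)
```

A sufficiently positive K_1 drives every weight to the upper bound and a sufficiently negative K_1 to the lower bound, which is the simple observation underlying the three-region division of the K_1-K_2 plane.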
\n\nIt is therefore important that K_2 should be larger in magnitude than this value in order to produce On-Center or Off-Center cells reliably. MacKay and Miller use |K_2| \to \infty in their analysis. Unfortunately, this would require the fundamental network parameter F^L_2 \to \infty from equation 6, and it is therefore an unsuitable choice. It is reasonable to assume that F^L_2 is of the same order as \bar{F}^L, and hence an order of magnitude for K_2 can be established. For a concrete example, assume inputs are binary (giving Q_0 = 0.25) and F^L_2 = \bar{F}^L x 1.2; this will ensure K_2 < 0 (equation 6) while adhering to the assumption made above. Equation 6 now gives |K_2| of the order 0.2. \n\nTo find the value of \bar{q}, which will place a lower bound on |K_2|, a particular system should be chosen. The auditory system is chosen here. \n\nFigure 1: Graph of Morphological Regions for C Layer \n\nThere are approximately 3000 inner hair cells in the cochlea, each of which fans out to an average of 10 neurons (which sets our box size p = 10). These neurons take input from only one hair cell. The anteroventral cochlear nucleus takes input from this layer of cells, with a fan-in N_B \approx 50 (c.f. the value of N_B = 1000 in Linsker (1986a)). The assumption is made that the three sections of the cochlear nucleus each contain approximately the same number of cells. With this smaller number of connections, the correlation function for this layer is somewhat coarser, and does not follow the theoretical curve for the continuum limit so well. \n\nIn addition, the on-center cells found in the posteroventral cochlear nucleus and the dorsal nucleus have centres with a tuning curve response Q of about 2.5, which corresponds to about 2000 B layer cells. 
If it is assumed that the surround of the cell is half the width of the core, then there is a total r_C \approx 3000 neurons. Simulations here use N_C = 100, which is a realistic number of connections in the context of a one-dimensional network. \n\nIn general, the arbor radius increases as layers become closer to the cortex. From Linsker, r_C/r_B = 3; r_B is therefore equal to 1000. This yields an average number of connections to a given B cell from a particular A box of approximately unity, which agrees well with the condition expressed by Linsker. \n\nUsing the expression above, \bar{q} can be calculated as approximately 1.5 x 10^{-3}. This value is certainly insignificant with respect to the value of |K_2| = 0.2 quoted earlier, and therefore any effects due to the summation term in equation 3 can be ignored in the calculation of \bar{c} for this system. This means that the original approximation still holds even in this low connectivity case. \n\n5 SIMULATION RESULTS \n\nA network was trained using the connectivity stated above to give various values of \bar{c} with |K_2| = 0.2. To obtain an idea of the total number of presentations required to train the network, without any artifacts that might be produced as a result of batch training, the original network equations were used. In all of these simulations, R_a = 0 and F^M_1 = 0, so that the value of K_1 could be easily controlled. \n\nThe findings were that the maximum usable value of k_b was about 10^{-3}, which required 2.5 million pattern presentations to mature the network. With this value, on-center cells with an average weight value less than about 0.3 would not mature. However, as the value of k_b was decreased (keeping K_1 constant), the value of \bar{c} could be made lower, at the expense of more pattern presentations. 
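As a rough arithmetic check on what 2.5 million presentations means in pre-natal terms (taking the figure of 10 presentations per second used for the foetal-cat estimate):

```python
# Back-of-envelope check: 2.5 million presentations at an assumed
# rate of 10 presentations per second.
needed = 2.5e6                   # presentations to mature the network
rate = 10.0                      # presentations per second (assumed)
days = needed / rate / 86400.0   # roughly 2.9 days of continuous activity
available = 25e6                 # presentations available to a foetal cat
# needed is a tenth of what is available pre-natally
```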
The figures obtained for the maturation of feature sensitive cells are extremely biologically realistic in the light of the number of pattern presentations available to an average mammal. For example, the foetal cat has sufficient time for about 25 million presentations (assuming 10 presentations per second). \n\n6 CONCLUSION \n\nWe have shown that the class of network developed by Linsker is extendable to the auditory system, where the number and density of synapses is considerably smaller than in the visual case. It has also been shown that the time for layer maturation by this method is sufficiently short even for mammals with a relatively short gestation period, and therefore should also be sufficient in mammals with longer foetal development times. We conclude that the model is therefore a good representation of feature detector development in the pre-natal mammal. \n\nReferences \n\nGrossberg S. (1976) - On the Development of Feature Detectors in the Visual Cortex with Applications to Learning and Reaction Diffusion Systems, Biological Cybernetics, 21, 145 - 159 \n\nGrossberg S. (1976) - Adaptive Pattern Classification and Universal Recoding: 1. Parallel Development and Coding of Neural Feature Detectors, Biological Cybernetics, 23, 121 - 134 \n\nHubel D. H. and Wiesel T. N. (1961) - Receptive Fields, Binocular Interaction and Functional Architecture in the Cat's Visual Cortex, Journal of Physiology, 160, 106 - 154 \n\nKalil R. E. (1989) - Synapse Formation in the Developing Brain, Scientific American, December 1989, 38 - 45 \n\nKlinke R. (1986) - Physiology of Hearing. In Schmidt R. F. (ed.), Fundamentals of Sensory Physiology, 199 - 223 \n\nMacKay D. J. C. and Miller K. D. (1990) - Analysis of Linsker's Simulations of Hebbian Rules, Neural Computation, 2, 173 - 187 \n\nvon der Malsburg C. 
(1979) - Development of Ocularity Domains and Growth Behaviour of Axon Terminals, Biological Cybernetics, 32, 49 - 62 \n\nLinsker R. (1986a) - From Basic Network Principles To Neural Architecture: Emergence of Spatial-Opponent Cells, Proceedings of the National Academy of Sciences (USA), 83, 7508 - 7512 \n\nLinsker R. (1986b) - From Basic Network Principles To Neural Architecture: Emergence of Orientation-Selective Cells, Proceedings of the National Academy of Sciences (USA), 83, 8390 - 8394 \n\nLinsker R. (1986c) - From Basic Network Principles To Neural Architecture: Emergence of Orientation Columns, Proceedings of the National Academy of Sciences (USA), 83, 8779 - 8783 \n\nNass M. M. and Cooper L. N. (1975) - A Theory for the Development of Feature Detecting Cells in the Visual Cortex, Biological Cybernetics, 19, 1 - 18 \n\nObermayer K., Ritter H. and Schulten K. (1990) - Development and Spatial Structure of Cortical Feature Maps: A Model Study, NIPS, 3, 11 - 17 \n\nSloman A. (1989) - On Designing a Visual System (Towards a Gibsonian Computational Model of Vision), Journal of Experimental and Theoretical Artificial Intelligence, 1, 289 - 337 \n\nTanaka S. (1990) - Interaction among Ocularity, Retinotopy and On-Center/Off-Center Pathways During Development, NIPS, 3, 18 - 25 \n", "award": [], "sourceid": 610, "authors": [{"given_name": "Lance", "family_name": "Walton", "institution": null}, {"given_name": "David", "family_name": "Bisset", "institution": null}]}