{"title": "Familiarity Discrimination of Radar Pulses", "book": "Advances in Neural Information Processing Systems", "page_first": 875, "page_last": 881, "abstract": null, "full_text": "Familiarity Discrimination of Radar Pulses \n\n
Eric Granger^1, Stephen Grossberg^2, Mark A. Rubin^2, William W. Streilein^2 \n\n
^1 Department of Electrical and Computer Engineering \nEcole Polytechnique de Montreal \nMontreal, Qc. H3C 3A7 CANADA \n\n
^2 Department of Cognitive and Neural Systems, Boston University \nBoston, MA 02215 USA \n\n
Abstract \n\n
The ARTMAP-FD neural network performs both identification (placing test patterns in classes encountered during training) and familiarity discrimination (judging whether a test pattern belongs to any of the classes encountered during training). The performance of ARTMAP-FD is tested on radar pulse data obtained in the field, and compared to that of the nearest-neighbor-based NEN algorithm and to a k > 1 extension of NEN. \n\n
1 Introduction \n\n
The recognition process involves both identification and familiarity discrimination. Consider, for example, a neural network designed to identify aircraft based on their radar reflections and trained on sample reflections from ten types of aircraft A...J. After training, the network should correctly classify radar reflections belonging to the familiar classes A...J, but it should also abstain from making a meaningless guess when presented with a radar reflection from an object belonging to a different, unfamiliar class. Familiarity discrimination is also referred to as \"novelty detection,\" a \"reject option,\" and \"recognition in partially exposed environments.\" \n\n
ARTMAP-FD, an extension of fuzzy ARTMAP that performs familiarity discrimination, has shown its effectiveness on datasets consisting of simulated radar range profiles from aircraft targets [1, 2]. 
In the present paper we examine the performance of ARTMAP-FD on radar pulse data obtained in the field, and compare it to that of NEN, a nearest-neighbor-based familiarity discrimination algorithm, and to a k > 1 extension of NEN. \n\n
2 Fuzzy ARTMAP \n\n
Fuzzy ARTMAP [3] is a self-organizing neural network for learning, recognition, and prediction. Each input a learns to predict an output class K. During training, the network creates internal recognition categories, with the number of categories determined on-line by predictive success. Components of the vector a are scaled so that each a_i ∈ [0,1] (i = 1...M). Complement coding [4] doubles the number of components in the input vector, which becomes A = (a, a^c), where the ith component of a^c is a^c_i = (1 - a_i). With fast learning, the weight vector w_j records the largest and smallest component values of input vectors placed in the jth category. The 2M-dimensional vector w_j may be visualized as the hyperbox R_j that just encloses all the vectors a that selected category j during training. \n\n
Activation of the coding field F2 is determined by the Weber law choice function T_j(A) = |A ∧ w_j| / (α + |w_j|), where (P ∧ Q)_i = min(P_i, Q_i) and |P| = Σ_i |P_i|. With winner-take-all coding, the F2 node J that receives the largest F1 → F2 input T_j becomes active. Node J remains active if it satisfies the matching criterion: |A ∧ w_J| / |A| = |A ∧ w_J| / M > ρ, where ρ ∈ [0,1] is the dimensionless vigilance parameter. Otherwise, the network resets the active F2 node and searches until J satisfies the matching criterion. 
If node J then makes an incorrect class prediction, a match tracking signal raises vigilance just enough to induce a search, which continues until either some F2 node becomes active for the first time, in which case J learns the correct output class label k(J) = K; or a node J that has previously learned to predict K becomes active. During testing, a pattern a that activates node J is predicted to belong to the class K = k(J). \n\n
3 ARTMAP-FD \n\n
Familiarity measure. During testing, an input pattern a is defined as familiar when a familiarity function φ(A) is greater than a decision threshold γ. Once a category choice has been made by the winner-take-all rule, fuzzy ARTMAP ignores the size of the input T_J. In contrast, ARTMAP-FD uses T_J to define familiarity, taking \n\n
φ(A) = T_J(A) / T_J^MAX = |A ∧ w_J| / |w_J|,   (1) \n\n
where T_J^MAX = |w_J| / (α + |w_J|). This maximal value of T_J is attained by each input a that lies in the hyperbox R_J, since |A ∧ w_J| = |w_J| for these points. An input that chooses category J during testing is then assigned the maximum familiarity value 1 if and only if a lies within R_J. \n\n
Familiarity discrimination algorithm. ARTMAP-FD is identical to fuzzy ARTMAP during training. During testing, φ(A) is computed after fuzzy ARTMAP has yielded a winning node J and a predicted class K = k(J). If φ(A) > γ, ARTMAP-FD predicts class K for the input a. If φ(A) ≤ γ, a is regarded as belonging to an unfamiliar class and the network makes no prediction. \n\n
Note that fuzzy ARTMAP can also abstain from classification, when the baseline vigilance parameter ρ̄ is greater than zero during testing. Typically ρ̄ = 0 during training, to maximize code compression. \n\n
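The test-time familiarity computation can be sketched in code as follows (a minimal NumPy sketch; the weight array W, the label list class_labels, and the function names are hypothetical, and training itself is omitted):

```python
import numpy as np

def complement_code(a):
    # Complement coding: A = (a, 1 - a), so |A| = M for every input a.
    a = np.asarray(a, dtype=float)
    return np.concatenate([a, 1.0 - a])

def familiarity_test(a, W, class_labels, gamma, alpha=0.001):
    # W is an (N, 2M) array whose rows are learned weight vectors w_j.
    A = complement_code(a)
    and_sizes = np.minimum(A, W).sum(axis=1)   # |A ^ w_j|
    T = and_sizes / (alpha + W.sum(axis=1))    # Weber law choice function
    J = int(np.argmax(T))                      # winner-take-all choice
    phi = and_sizes[J] / W[J].sum()            # equation (1): T_J / T_J^MAX
    if phi > gamma:
        return class_labels[J], phi            # familiar: predict k(J)
    return None, phi                           # unfamiliar: abstain
```

For an input a lying inside the winning hyperbox R_J, A ∧ w_J = w_J and the sketch returns φ = 1, as equation (1) requires.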
In radar range profile simulations such as those described below, fuzzy ARTMAP can perform familiarity discrimination when ρ > 0 during both training and testing. However, accurate discrimination requires that ρ be close to 1, which causes category proliferation during training. Range profile simulations have also set ρ̄ = 0 during both training and testing, but with the familiarity measure set equal to the fuzzy ARTMAP match function: \n\n
φ(A) = |A ∧ w_J| / M.   (2) \n\n
This approach is essentially equivalent to taking ρ̄ = 0 during training and ρ̄ > 0 during testing, with ρ̄ = γ. However, for a test set input a ∈ R_J, the function defined by (2) sets φ(A) = |w_J| / M, which may be large or small although a is familiar. Thus this function does not provide as good familiarity discrimination as the one defined by (1), which always sets φ(A) = 1 when a ∈ R_J. Except as noted, all the simulations below employ the function (1), with ρ̄ = 0. \n\n
Sequential evidence accumulation. ART-EMAP (Stage 3) [5] identifies a test set object's class after exposure to a sequence of input patterns, such as differing views, all identified with that one object. Training is identical to that of fuzzy ARTMAP, with winner-take-all coding at F2. ART-EMAP generally employs distributed F2 coding during testing. With winner-take-all coding during testing as well as training, ART-EMAP predicts the object's class to be the one selected by the largest number of inputs in the sequence. Extending this approach, ARTMAP-FD accumulates familiarity measures for each predicted class K as the test set sequence is presented. Once the winning class is determined, the object's familiarity is defined as the average accumulated familiarity measure of the predicted class during the test sequence. 
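The accumulation scheme just described can be sketched as follows (hypothetical function name; each element of views is the (predicted class, φ) pair returned for one input in the test sequence):

```python
from collections import defaultdict

def sequence_familiarity(views):
    # Tally the familiarity values phi by predicted class over the sequence.
    by_class = defaultdict(list)
    for k, phi in views:
        by_class[k].append(phi)
    # Winning class: the one selected by the largest number of inputs.
    winner = max(by_class, key=lambda k: len(by_class[k]))
    # Object familiarity: average phi accumulated for the winning class.
    return winner, sum(by_class[winner]) / len(by_class[winner])
```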
\n\n4 Familiarity discrimination simulations \n\n
Since familiarity discrimination involves placing an input into one of two sets, familiar and unfamiliar, the receiver operating characteristic (ROC) formalism can be used to evaluate the effectiveness of ARTMAP-FD on this task. The hit rate H is the fraction of familiar targets the network correctly identifies as familiar, and the false alarm rate F is the fraction of unfamiliar targets the network incorrectly identifies as familiar. An ROC curve is a plot of H vs. F, parameterized by the threshold γ (i.e., it is equivalent to the two curves F(γ) and H(γ)). The area under the ROC curve is the c-index, a measure of predictive accuracy that is independent of both the fraction of positive (familiar) cases in the test set and the positive-case decision threshold γ. \n\n
An ARTMAP-FD network was trained on simulated radar range profiles from 18 targets out of a 36-target set (Fig. 1a). Simulations tested sequential evidence accumulation performance for 1, 3, and 100 observations, corresponding to 0.05, 0.15, and 5.0 sec. of observation (smooth curves, Fig. 1b). As in the case of identification [6], a combination of multiwavelength range profiles and sequential evidence accumulation produces good familiarity discrimination, with the c-index approaching 1 as the number of sequential observations grows. \n\n
Fig. 1b also demonstrates the importance of the proper choice of familiarity measure. The jagged ROC curve was produced by a familiarity discrimination simulation identical to that which resulted in the 100-sequential-view smooth curve, but using the match function (2) instead of φ as given by (1). 
\n\nFigure 1: (a) 36 simulation targets with 6 wing positions and 6 wing lengths, and 100 scattering centers per target. Boxes indicate randomly selected familiar targets. (b) ROC curves from ARTMAP-FD simulations, with multiwavelength range profiles having 40 center frequencies. Sequential evidence accumulation for 1, 3 and 100 views uses familiarity measure (1) (smooth curves); and for 100 views uses the match function (2) (jagged curve). (c) Training and test curves of miss rate M = (1 - H) and false alarm rate F vs. threshold γ, for 36 targets and one view. Training curves intersect at the point where γ = Γ_p (predicted); and test curves intersect near the point where γ = Γ_a (optimal). The training curves are based on data from the first training epoch, the test curves on data from 3 training epochs. \n\n
5 Familiarity threshold selection \n\n
When a system is placed in operation, one particular decision threshold γ = Γ must be chosen. In a given application, selection of Γ depends upon the relative cost of errors due to missed targets and false alarms. The optimal Γ corresponds to a point on the parameterized ROC curve that is typically close to the upper left-hand corner of the unit square, to maximize correct selection of familiar targets (H) while minimizing incorrect selection of unfamiliar targets (F). \n\n
Validation set method. To determine a predicted threshold Γ_p, the training data is partitioned into a training subset and a validation subset. The network is trained on the training subset, and an ROC curve (F(γ), H(γ)) is calculated for the validation subset. Γ_p is then taken to be the point on the curve that maximizes [H(γ) - F(γ)]. (For ease of computation the symmetry point on the curve, where 1 - H(γ) = F(γ), can yield a good approximation.) For a familiarity discrimination task the validation set must include examples of classes not present in the training set. 
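A minimal sketch of this selection rule (hypothetical names; phis are validation-set familiarity values and familiar flags whether each validation pattern belongs to a training-set class):

```python
def predicted_threshold(phis, familiar):
    fam = [p for p, f in zip(phis, familiar) if f]
    unf = [p for p, f in zip(phis, familiar) if not f]
    best_gamma, best = 0.0, float('-inf')
    # Candidate thresholds: the observed familiarity values (plus 0.0).
    for gamma in sorted(set(phis)) + [0.0]:
        H = sum(p > gamma for p in fam) / len(fam)   # hit rate H(gamma)
        F = sum(p > gamma for p in unf) / len(unf)   # false alarm rate F(gamma)
        if H - F > best:                             # maximize H - F
            best_gamma, best = gamma, H - F
    return best_gamma
```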
Once Γ_p is determined, the training subset and validation subset should be recombined and the network retrained on the complete training set. The retrained network and the predicted threshold Γ_p are then employed for familiarity discrimination on the test set. \n\n
On-line threshold determination. During ARTMAP-FD training, category nodes compete for new patterns as they are presented. When a node J wins the competition, learning expands the category hyperbox R_J enough to enclose the training pattern a. The familiarity measure φ for each training set input then becomes equal to 1. However, before this learning takes place, φ can be less than 1, and the degree to which this initial value of φ is less than 1 reflects the distance from the training pattern to R_J. An event of this type (a training pattern successfully coded by a category node) is taken to be representative of familiar test-set patterns. The corresponding initial values of φ are thus used to generate a training hit rate curve, where H(γ) equals the fraction of training inputs with φ > γ. \n\n
What about false alarms? By definition, all patterns presented during training are familiar. However, a reset event during training (Sec. 2) resembles the arrival of an unfamiliar pattern during testing. Recall that a reset occurs when a category node that predicts class K wins the competition for a pattern that actually belongs to a different class. The corresponding values of φ for these events can thus be used to generate a training false-alarm rate curve, where F(γ) equals the fraction of match-tracking inputs with initial φ > γ. \n\n
Predictive accuracy is improved by use of a reduced set of φ values in the training-set ROC curve construction process. 
Namely, training patterns that fall inside R_J, where φ = 1, are not used because these exemplars tend to distort the miss rate curve. In addition, the first incorrect response to a training input is the best predictor of the network's response to an unfamiliar testing input, since sequential search will not be available during testing. Finally, giving more weight to events occurring later in the training process improves accuracy. This can be accomplished by first computing training curves H(γ) and F(γ) and a preliminary predicted threshold Γ_p using the reduced training set; then recomputing the curves and Γ_p from data presented only after the system has activated the final category node of the training process (Fig. 1c). The final predicted threshold Γ_p averages these values. This calculation can still be made on-line, by taking the \"final\" node to be the last one activated. \n\n
Table 1 shows that applying on-line threshold determination to simulated radar range profile data gives good predictions for the actual hit and false alarm rates, H_A and F_A. Furthermore, the H_A and F_A so obtained are close to optimal, particularly when the ROC curve has a c-index close to one. The method is effective even when testing involves sequential evidence accumulation, despite the fact that the training curves use only single views of each target. \n\n
6 NEN \n\n
Near-enough-neighbor (NEN) [7, 8] is a familiarity discrimination algorithm based on the single nearest neighbor classifier. For each familiar class K, the familiarity threshold Δ_K is the largest distance between any training pattern of class K and its nearest neighbor also of class K. During testing, a test pattern is declared unfamiliar if the distance to its nearest neighbor is greater than the threshold Δ_K corresponding to the class K of that nearest neighbor. 
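The rule just described can be sketched as follows (hypothetical names; Euclidean distance; each class is assumed to contain at least two training patterns so a same-class nearest neighbor exists):

```python
import math

def nen_thresholds(X, y):
    # Delta_K: largest nearest-same-class-neighbor distance within class K.
    delta = {}
    for k in set(y):
        pts = [x for x, lab in zip(X, y) if lab == k]
        delta[k] = max(min(math.dist(p, q) for q in pts if q is not p)
                       for p in pts)
    return delta

def nen_classify(x, X, y, delta):
    # Nearest neighbor decides the class; its Delta_K decides familiarity.
    d, k = min((math.dist(x, p), lab) for p, lab in zip(X, y))
    return k if d <= delta[k] else None   # None means unfamiliar
```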
\n\nWe have extended NEN to k > 1 by retaining the above definition of the Δ_K's, while taking the comparison during testing to be between Δ_K and the distance between the test pattern and the closest of its k nearest neighbors which is of the class K to which the test pattern is deemed to belong. \n\n
7 Radar pulse data \n\n
Identifying the type of emitter from which a radar signal was transmitted is an important task for radar electronic support measures (ESM) systems. Familiarity discrimination is a key component of this task, particularly as the continual proliferation of new emitters outstrips the ability of emitter libraries to document every sort of emitter which may be encountered. \n\n
The data analyzed here, gathered by Defense Research Establishment Ottawa, consist of radar pulses from 12 shipborne navigation radars [9]. Fifty pulses were collected from each radar, with the exception of radars #7 (100 pulses) and #8 (200 pulses). The pulses were preprocessed to yield 800 15-component vectors, with the components taking values between 0 and 1. \n\n
Table 1: Familiarity discrimination, using ARTMAP-FD with on-line threshold prediction, of simulated radar range profile data. Training on half the target classes (boxed \"aircraft\" in Fig. 1a), testing on all target classes. (In the 3x3 case, 4 classes out of 9 total are used for training.) Accuracy equals the fraction of correctly-classified targets out of familiar targets selected by the network as familiar. The results for the 6x6* dataset involve sequential evidence accumulation, with 100 observations (5 sec.) per test target. Radar range profile simulations use 40 center frequencies evenly spaced between 18 GHz and 22 GHz, and wp x wl simulated targets, where wp = number of wing positions and wl = number of wing lengths. The number of range bins (2/3 m per bin) is 60, so each pattern vector has (60 range bins) x (40 center frequencies) = 2400 components. Training patterns are at 21 evenly spaced aspects in a 10\u00b0 angular range and, for each viewing angle, at 15 downrange shifts evenly spaced within a single bin width. Testing patterns are at random aspects and downrange shifts within the angular range and half the total range profile extent of (60 bins) x (2/3 m) = 40 m. \n\n
                      3x3                6x6                6x6* \n
                 optimal  actual    optimal  actual    optimal  actual \n
hit rate           0.86    0.81      0.77    0.77       0.98    0.99 \n
false alarm rate   0.14    0.11      0.23    0.24       0.02    0.06 \n
accuracy           1.00    0.95      1.00    0.93       1.00    1.00 \n\n
Table 2: Familiarity discrimination of radar pulse data set, using ARTMAP-FD and NEN with different metrics and values of k. Figure given for memory is twice number of F2 nodes (due to complement coding) for ARTMAP-FD, number of training patterns for NEN. Training (single epoch) on first three quarters of data in classes 1-9, testing on other quarter of data in classes 1-9 and all data in classes 10-12. (Values given are averages over four cyclic permutations of the 12 classes.) ARTMAP-FD familiarity threshold determined by validation-set method with retraining. \n\n
                 ARTMAP-FD    NEN (city-block metric)    NEN (Euclidean metric) \n
                              k=1     k=5     k=25       k=1     k=5     k=25 \n
hit rate           0.95       0.94    0.94    0.93       0.94    0.93    0.92 \n
false alarm rate   0.02       0.13    0.04    0.02       0.14    0.05    0.02 \n
accuracy           1.00       1.00    1.00    1.00       0.99    1.00    1.00 \n
memory               21                        446 \n\n
8 Results \n\n
From Table 2, ARTMAP-FD is seen to perform effective familiarity discrimination on the radar pulse data. NEN (k = 1) performs comparatively poorly. Extensions of NEN to k > 1 perform well. 
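Under the reading that the class K itself is chosen by majority vote over the k nearest neighbors (an assumption; Sec. 6 does not spell out how K is selected), the k > 1 extension can be sketched as:

```python
import math
from collections import Counter

def knen_classify(x, X, y, delta, k=5):
    # The k nearest training patterns of the test point x.
    neighbors = sorted((math.dist(x, p), lab) for p, lab in zip(X, y))[:k]
    # Class K by majority vote among the k neighbors (assumed reading).
    K = Counter(lab for _, lab in neighbors).most_common(1)[0][0]
    # Compare Delta_K with the distance to the closest of the k
    # neighbors that belongs to class K.
    d = min(dist for dist, lab in neighbors if lab == K)
    return K if d <= delta[K] else None
```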
During fielded operation these would incur the cost of the additional computation required to find the k nearest neighbors of the current test pattern, as well as the cost of higher memory requirements[1] relative to ARTMAP-FD. The combination of low hit rate with low false alarm rate obtained by NEN on the simulated radar range profile datasets (Table 3) suggests that the algorithm performs poorly here because it selects a familiarity threshold which is too high. ARTMAP-FD on-line threshold selection, on the other hand, yields a value for the familiarity threshold which balances the desiderata of high hit rate and low false alarm rate. \n\n
[1] The memory requirements of kNN pattern classifiers can be reduced by editing techniques [8], but how the use of these methods affects performance of kNN-based familiarity discrimination methods is an open question. \n\n
Table 3: Familiarity discrimination of simulated radar range profiles using ARTMAP-FD and NEN with different values of k. Training and testing as in Table 1. ARTMAP-FD familiarity threshold determined by on-line method. City-block metric used with NEN; results with Euclidean metric were slightly poorer. \n\n
method        dataset   hit rate   false alarm rate   accuracy   memory \n
ARTMAP-FD     3x3       0.81       0.11               0.95         12 \n
              6x6       0.77       0.24               0.93         88 \n
NEN, k = 1    3x3       0.14       0.00               1.00       1260 \n
              6x6       0.14       0.00               1.00       5670 \n
NEN, k = 5    3x3       0.11       0.00               1.00       1260 \n
              6x6       0.11       0.00               1.00       5670 \n
NEN, k = 99   6x6       0.11       0.00               1.00       5670 \n\n
This research was supported in part by grants from the Office of Naval Research, ONR N00014-95-1-0657 (S. G.) and ONR N00014-96-1-0659 (M. A. R., W. W. S.)
, and by a grant from the Defense Advanced Research Projects Agency and the Office of Naval Research, ONR N00014-95-1-0409 (S. G., M. A. R., W. W. S.). E. G. was supported in part by the Defense Research Establishment Ottawa and the Natural Sciences and Engineering Research Council of Canada. \n\n
References \n\n
[1] Carpenter, G. A., Rubin, M. A., & Streilein, W. W., ARTMAP-FD: Familiarity discrimination applied to radar target recognition, in ICNN'97: Proceedings of the IEEE International Conference on Neural Networks, Houston, June 1997. \n\n
[2] Carpenter, G. A., Rubin, M. A., & Streilein, W. W., Threshold determination for ARTMAP-FD familiarity discrimination, in C. H. Dagli et al., eds., Intelligent Engineering Systems Through Artificial Neural Networks, 1, 23-28, ASME, New York, 1997. \n\n
[3] Carpenter, G. A., Grossberg, S., Markuzon, N., Reynolds, J. H., & Rosen, D. B., Fuzzy ARTMAP: A neural network architecture for incremental supervised learning of analog multidimensional maps, IEEE Transactions on Neural Networks, 3, 698-713, 1992. \n\n
[4] Carpenter, G. A., Grossberg, S., & Rosen, D. B., Fuzzy ART: Fast stable learning and categorization of analog patterns by an adaptive resonance system, Neural Networks, 4, 759-771, 1991. \n\n
[5] Carpenter, G. A., & Ross, W. D., ART-EMAP: A neural network architecture for object recognition by evidence accumulation, IEEE Transactions on Neural Networks, 6, 805-818, 1995. \n\n
[6] Rubin, M. A., Application of fuzzy ARTMAP and ART-EMAP to automatic target recognition using radar range profiles, Neural Networks, 8, 1109-1116, 1995. \n\n
[7] Dasarathy, B. V., Is your nearest neighbor near enough a neighbor?, in Lainiotis, D. G. and Tzannes, N. S., eds., Applications and Research in Information Systems and Sciences, 1, 114-117, Hemisphere Publishing Corp., Washington, 1977. \n\n
[8] Dasarathy, B. V., ed., Nearest Neighbor (NN) Norms: NN Pattern Classification Techniques, IEEE Computer Society Press, Los Alamitos, CA, 1991. \n\n
[9] Granger, E., Savaria, Y., Lavoie, P., & Cantin, M.-A., A comparison of self-organizing neural networks for fast clustering of radar pulses, Signal Processing, 64, 249-269, 1998. \n", "award": [], "sourceid": 1548, "authors": [{"given_name": "Eric", "family_name": "Granger", "institution": null}, {"given_name": "Stephen", "family_name": "Grossberg", "institution": null}, {"given_name": "Mark", "family_name": "Rubin", "institution": null}, {"given_name": "William", "family_name": "Streilein", "institution": null}]}