{"title": "Multimodular Architecture for Remote Sensing Operations.", "book": "Advances in Neural Information Processing Systems", "page_first": 675, "page_last": 682, "abstract": "", "full_text": "Multimodular Architecture for Remote Sensing \n\nOperations. \n\nSylvie Thiria(1,2) \n\nFouad Badran(1,2) \n\nCarlos Mejia(1) \n\nMichel Crepon(3) \n\n(1) Laboratoire de Recherche en Informatique \n\nUniversite de Paris Sud, B 490 - 91405 ORSAY Cedex France \n\n(2) CEDRIC, Conservatoire National des Arts et Metiers \n\n292 rue Saint Martin - 75003 PARIS \n\n(3) Laboratoire d'Oceanographie et de Climatologie (LODYC) \n\nT14 Universite de PARIS 6 - 75005 PARIS (FRANCE) \n\nAbstract \n\nThis paper deals with an application of Neural Networks (NN) to satellite \nremote sensing observations. Because of the complexity of the \napplication and the large amount of data, the problem cannot be solved \nby a single method. The solution we propose is to build multi-module \nNN architectures in which several NN cooperate. Such \nsystems suffer from generic problems, for which we propose solutions. \nThese solutions make it possible to reach accurate performance for multi-valued function \napproximation and probability estimation. The results are compared \nwith six other methods which have been used for this problem. We \nshow that the methodology we have developed is general and can be \nused for a large variety of applications. \n\n1 INTRODUCTION \n\nNeural Networks have been used for many years to solve hard real-world applications \nwhich involve large amounts of data. Most of the time, these problems cannot be solved \nwith a single technique and involve successive processing of the input data. \nSophisticated NN architectures have thus been designed to provide good performance, e.g. \n[Le Cun et al. 90]. 
However, this approach is limited for several reasons. The design of \nthese architectures is complicated and requires a lot of a priori knowledge about the task. \nSuch NN are difficult to train because of their large size and are dedicated to a specific \nproblem. Moreover, if the task is slightly modified, these NN have to be entirely \nredesigned and retrained. It is our feeling that complex problems cannot be solved \nefficiently with a single NN, however sophisticated it is. A more fruitful approach is to \nuse modular architectures in which several simple NN modules cooperate. This \nmethodology is far more general and makes it easy to build very sophisticated architectures \nable to handle the different processing steps which are necessary, for example, in \nspeech or signal processing. These architectures can be easily modified to incorporate \nadditional knowledge about the problem or changes in its specifications. \n\nWe have used these ideas to build a multi-module NN for a satellite remote sensing \napplication. This is a hard problem which cannot be solved by a single NN. The \ndifferent modules of our architecture are thus dedicated to specific tasks and \nperform successive processing of the data. This approach takes into account, in \nsuccessive steps, different pieces of information about the problem. Furthermore, errors which \noccur at the output of some modules may be corrected by others, which makes it possible to \nreach very good performance. Making these different modules cooperate raises several \nproblems which appear to be generic for these architectures. It is thus interesting to study \ndifferent solutions for their design, their training, and the efficient exchange of information \nbetween modules. 
In the present paper, we first briefly describe the geophysical problem \nand its difficulties; we then present the different modules of our architecture and their \ncooperation; finally, we compare our results to those of several other methods and discuss the \nadvantages of our method. \n\n2 THE GEOPHYSICAL PROBLEM \n\nScatterometers are active microwave radars which accurately measure the power of \nthe transmitted and backscattered signals in order to compute the normalized radar cross \nsection (\u03c30) of the ocean surface. \u03c30 depends on the wind speed v, the incidence angle \u03b8 \n(which is the angle between the radar beam and the vertical at the illuminated cell) and the \nazimuth angle \u03c7 (which is the horizontal angle between the wind and the antenna of the \nradar). An empirical relationship between \u03c30 and the local wind vector can be \nestablished, which leads to the determination of a geophysical model function. \n\nThe model developed by A. Long gives a more precise form to this functional. It has \nbeen shown that, for an angle of incidence \u03b8, the general expression for \u03c30 can be \nsatisfactorily represented by a Fourier series: \n\n\u03c30 = U (1 + b1 cos \u03c7 + b2 cos 2\u03c7) / (1 + b1 + b2) (1) \n\nwith U = A v^\u03b3. \n\nLong's model specifies that A and \u03b3 only depend on the angle of incidence \u03b8, and that b1 \nand b2 are functions of both the wind speed v and the angle of incidence \u03b8 (Figure 1). \n\nFigure 1 : Definition of the different geophysical scales. \n\nFor now, the different parameters b1, b2, A and \u03b3 used in this model are determined \nexperimentally. \n\nConversely, it becomes possible to compute the wind direction by using several antennas \nwith different orientations with respect to the satellite track. The geophysical model \nfunction (1) can then be inverted using the three measurements of \u03c30 given by the three \nantennas, which yields the wind vector (direction and speed). 
Evidence shows that for a given \ntrajectory within the swath (Figure 1), i.e. (\u03b81,\u03b82,\u03b83) fixed, \u03b8i being the incidence angle of \nthe beam linked to antenna i, the functional F is of the form presented in Fig. 2. \n\nIn the absence of noise, the determination of the wind direction would be unique in most \ncases. Noise-free ambiguities arise from the bi-harmonic nature of the model function \nwith respect to \u03c7. The functional F presents singular points. At constant wind speed F \nyields a Lissajous curve; at the singular points the direction is ambiguous with respect \nto the triplet of measurements (\u03c31,\u03c32,\u03c33), as can be seen in Fig. 2. At these points F yields \ntwo directions differing by 160\u00b0. In practice, since the backscattered signal is noisy, the \nnumber and the frequency of ambiguities are increased. \n\nFigure 2 : (a) Representation of the functional F for a given trajectory. (b) Graph \nobtained for a section of (a) at constant wind speed. \n\nThe problem is therefore how to set up an accurate (exact) wind map using the observed \nmeasurements (\u03c31,\u03c32,\u03c33). \n\n3 THE METHOD \n\nWe propose to use multi-layered quasi-linear networks (MLP) to carry out this inversion \nphase. Indeed, these nets are able to approximate complex non-linear functional relations; \nit thus becomes possible, by using a set of measurements, to determine F and to carry out the \ninversion. \n\nThe determination of the wind speed and of the wind direction leads to two problems of different \ncomplexity; each of them is solved using a dedicated multi-modular system. The two \nmodules are then linked together to build a two-level architecture. 
To take into account \nthe strong dependence of the measurements on the trajectory, each module (or \nlevel) consists of n distinct but similar systems, a specific system being dedicated to each \nsatellite trajectory (n being the number of trajectories in a swath (Figure 1)). \n\nThe first level determines the wind speed at every point of the swath. \nThe results obtained are then supplied to the second level as supplementary data \nused to compute the wind direction. Thus, we propose a two-level architecture \nwhich constitutes an automatic method for the computation of wind maps (Figure 3). \nThe computation is performed sequentially between the different levels, each one \nsupplying the next with the parameters needed. \n\nOwing to the spatial variability of the wind, the measurements at a point are closely related \nto those performed in the neighbourhood. Taking this context into account should \ntherefore bring important supplementary information to resolve the ambiguities. At a \npoint, the input data for a given system are therefore the measurements observed at that \npoint and at its eight closest neighbours. \n\nAll the networks used by the different systems are MLP trained with the back-propagation \nalgorithm. The successive weight updates were performed using a second-order stochastic \ngradient, which is an approximation of the Levenberg-Marquardt rule. \n\nFigure 3 : The three systems S1, S2 and S3 for a given trajectory. Level 1: wind speed \ncomputation; level 2: wind direction computation; level 3: ambiguities correction. \n\nEach system is dedicated to its own trajectory. 
As a result, the networks used on the same \nlevel of the global architecture are of the same type; only the numerical values of the learning \nset change from one system to another. The learning set of each network therefore \nconsists of the data measured on its trajectory. We present here the results for the central \ntrajectory; performances for the others are similar. \n\n3.1 THE NETWORK DECODING : FIRST LEVEL \n\nA system (S1) in the first level computes the wind speed (in m/s) along a \ntrajectory. Because the function F1 to be learned (signal \u2192 wind speed) is highly non-linear, \neach system is made of three networks (see Figure 3): R1 decides the \nrange of the wind speed (4 \u2264 v < 12 or 12 \u2264 v < 20); according to the R1 output, an \naccurate value is computed using R2 for the first range and R3 for the other. The first \nlevel is built from 10 of these systems (one for each trajectory). \n\nEach network (R1, R2, R3) consists of four fully connected layers. For a given point, we \nhave introduced the knowledge of the radar measurements at the neighbouring points. When the \nsame experiments were performed without this notion of vicinity, the \nlearning and test performances were reduced by 17%, which shows the advantage of this \napproach. The input layer of each network consists of 27 automata: these 9x3 automata \ncorrespond to the \u03c30 values relative to each antenna for the point considered and its \neight neighbours. \n\nThe R1 output layer has two cells, one for 4 \u2264 v < 12 and the other for 12 \u2264 v < 20, so its \n4 layers are respectively built of 27, 25, 25, 2 automata. \n\nR2 and R3 compute the exact wind speed. The output layer consists of a single \noutput automaton, which codes the wind speed v at the point considered in the interval [-1, +1]. \nThe four layers of each network are respectively formed of 27, 25, 25, 1 automata. 
\n\n3.2 DECODING THE DIRECTION : SECOND LEVEL \n\nNow the function F2 (signal \u2192 wind direction) has to be learned. This level is located \nafter the first one, so the wind speed has already been computed at all points. For each \ntrajectory a system S2 computes the wind direction; it is made of an MLP and a \nDecision Direction Process (we call it D). As for F1, we used contextual \ninformation for each point. Thus, the input layer of the MLP consists of 30 automata: the first 9x3 \ncorrespond to the \u03c30 values for each antenna, and the last three each receive the wind speed \ncomputed by the first level (the value is repeated three times). However, because the original function has major ambiguities, \nit is more convenient to compute, for a given input, several output values with their \nprobabilities. For this reason we have discretized the desired output. It has been coded in \ndegrees and 36 possible classes have been considered, each representing a 10\u00b0 interval \n(between 0\u00b0 and 360\u00b0). So, the MLP is four-layered with respectively 30, 25, 25, 36 \nautomata. It can be shown, according to the coding of the desired output, that the network \napproximates the Bayes discriminant function or the Bayes probability distribution related to the \ndiscretized transfer function F2 [White, 89]. The interpretation of the MLP outputs using \nthe D process makes it possible to compute the required function F2 accurately. The network \noutputs represent the 36 classes corresponding to the 36 intervals of 10\u00b0. For a given input, \nthe computed output is a vector of 36 real components which can be interpreted to predict the \nwind direction in degrees. Each component, which is an approximation of a Bayes discriminant \nfunction, can be used as a coefficient of likelihood for its class. The Decision \nDirection Process D (see Fig. 3) computes real directions using this information. It \nperforms the interpolation of the peaks' curve. 
D gives for each peak its wind direction \nwith its coefficient of likelihood. \n\nFigure 4 : Network's output. The points on the x-axis correspond to the 36 outputs. Each \nrepresents an interval of 10\u00b0 between 0\u00b0 and 360\u00b0. The points on the y-axis give the \ncomputed outputs of the automata. The point indicated by a d corresponds to the desired output angle, ~ is \nthe most likely solution proposed by D and p is the second one. \n\nThe wind speed and the most likely wind direction computed by the first two \nlevels make it possible to build a complete map which still includes errors in the directions. As we \nhave seen in section 2, the physical problem has intrinsic ambiguities, which appear in the \nresults (Table 2). The removal of these errors is done by a third level of NN. \n\n3.3 CORRECTING THE REMAINING ERRORS : THIRD LEVEL \n\nThis problem has been dealt with in [Badran & al 91] and is not discussed here. The \nmethod is related to image processing using an MLP as an optimal filter. The use of different \nfilters taking into account the 5x5 vicinity of the point considered makes it possible to detect the \nerroneous directions and to choose among the alternative proposed solutions. This method \ncorrects up to 99.5% of the errors. \n\n4 RESULTS \n\nAs actual data do not exist yet, we have tested the method on values computed from real \nmeteorological models. The swaths of the ERS 1 scatterometer were simulated by flying \na satellite over wind fields given by the ECMWF forecasting model. The sea roughness \nvalues (\u03c31,\u03c32,\u03c33) given by the three antennas were computed by inverting the Long \nmodel. Noise was then added to the simulated measurements in order to reproduce the \nerrors made by the scatterometer. 
(Gaussian noise of zero mean and of standard \ndeviation 9.5% for both lateral antennas and 8.7% for the central antenna was added to \neach measurement.) Twenty-two maps obtained for the southern Atlantic Ocean were used \nto establish the learning sets. The 22 maps were selected randomly during the 30 days of \nSeptember 1985 and nine remaining maps were used for the tests. \n\n4.1 DECODING THE SPEED : FIRST LEVEL \n\nIn the results presented in Table 1, a predicted measurement is considered correct if it \ndiffers from the desired output by no more than 1 m/s. It should be noted that the oceanographer's \nspecification is 2 m/s; the present results illustrate the precision of the method. \n\nTable 1 : performances on the wind speed \n\nAccuracy 1 m/s | performances | bias \nlearning | 99.3 % | 0.045 m/s \ntest | 98.4 % | 0.038 m/s \n\n4.2 DECODING THE DIRECTION : SECOND LEVEL \n\nIt is found that good performances are obtained after the interpretation of the best two \npeaks only. Compared to usual methods, which propose up to six possible \ndirections, this method appears to be very powerful. Table 2 shows the performances \nusing one or two peaks. The function F and its singularities have been recovered with \ngood accuracy; the noise added during the simulations to reproduce the noise of \nthe measuring devices has been removed. \n\nTable 2 : performances on the wind direction using the complete system \n\nPrecision 20\u00b0 | one peak | two peaks \nlearning | 68.0 % | 99.1 % \ntest | 72.0 % | 99.2 % \n\n5 VALIDATION OF THE RESULTS \n\nIn order to show the power of the NN approach, Table 3 compares our results with six \nclassical methods [Chi & Li 88]. 
\n\nTable 3 shows that the NN results are very good compared to other techniques. Moreover, \nall the classical methods are based on the assumption that a precise analytical function \n((v, \u03c7) \u2192 \u03c3) exists, whereas the NN method is more general and does not depend on such an \nassumption. In addition, the decoding of a point with NN requires approximately 23 ms on \na SUN4 workstation. This time is to be compared with the 0.25 second necessary for \nthe decoding by present methods. \n\nTable 3 : simulation results, Erms (in m/s), for different fixed wind speeds \n\nSpeed | N.N | WLSL | ML | LS | WLS | AWLS | L1 | LWSS \nLow | 0.49 | 0.92 | 0.66 | 0.67 | 0.74 | 0.69 | 0.63 | 1.02 \nMiddle | 0.53 | 0.89 | 0.85 | 1.10 | 1.31 | 0.89 | 0.98 | 0.87 \nHigh | 1.18 | 3.71 | 3.44 | 4.11 | 5.52 | 3.52 | 4.06 | 3.49 \n\nThe wind vector error e is defined as follows: e = V1 - V2, where V1 is the true \nwind vector and V2 is the estimated wind vector, and Erms = E(||e||). \n\n6 CONCLUSION \n\nThe performances reached when processing satellite remote sensing observations show \nthat multi-modular architectures in which simple NN modules cooperate can cope with \nreal-world applications. The methodology we have developed is general and can be used for a \nlarge variety of applications; it provides solutions to generic problems arising when \ndealing with NN cooperation. \n\nReferences \n\nBadran F, Thiria S, Crepon M (1991) : Wind ambiguity removal by the use of neural \nnetwork techniques, Journal of Geophysical Research, vol 96, no C11, p 20521-20529, \nNovember 15. \n\nChong-Yung Chi, Fuk K Li (1988) : A Comparative Study of Several Wind Estimation \nAlgorithms for Spaceborne Scatterometers. IEEE Transactions on Geoscience and Remote \nSensing, vol 26, no 2. \n\nLe Cun Y., Boser B., & al. (1990) : Handwritten Digit Recognition with a Back-Propagation \nNetwork, in D. Touretzky (ed.) 
Advances in Neural Information Processing \nSystems 2, 396-404, Morgan Kaufmann. \n\nWhite H. (1989) : Learning in Artificial Neural Networks: A Statistical Perspective. \nNeural Computation, 1, 425-464. \n\n", "award": [], "sourceid": 579, "authors": [{"given_name": "Sylvie", "family_name": "Thiria", "institution": null}, {"given_name": "Carlos", "family_name": "Mejia", "institution": null}, {"given_name": "Fouad", "family_name": "Badran", "institution": null}, {"given_name": "Michel", "family_name": "Cr\u00e9pon", "institution": null}]}