{"title": "Distribution Learning of a Random Spatial Field with a Location-Unaware Mobile Sensor", "book": "Advances in Neural Information Processing Systems", "page_first": 12479, "page_last": 12487, "abstract": "Measurement of spatial fields is of interest in environment monitoring. Recently mobile sensing has been proposed for spatial field reconstruction, which requires a smaller number of sensors when compared to the traditional paradigm of sensing with static sensors. A challenge in mobile sensing is to overcome the location uncertainty of its sensors. While GPS or other localization methods can reduce this uncertainty, we address a more fundamental question: can a location-unaware mobile sensor, recording samples on a directed non-uniform random walk, learn the statistical distribution (as a function of space) of an underlying random process (spatial field)? The answer is in the affirmative for Lipschitz continuous fields, where the accuracy of our distribution-learning method increases with the number of observed field samples (sampling rate). To validate our distribution-learning method, we have created a dataset with 43 experimental trials by measuring sound-level along a fixed path using a location-unaware mobile sound-level meter.", "full_text": "Distribution Learning of a Random Spatial Field with\n\na Location-Unaware Mobile Sensor\n\nMeera Pai and Animesh Kumar\n\nElectrical Engineering\n\nIndian Institute of Technology Bombay\n\nMumbai 400076 India\n\nmeeravpai,animesh@ee.iitb.ac.in\n\nAbstract\n\nMeasurement of spatial \ufb01elds is of interest in environment monitoring. Recently\nmobile sensing has been proposed for spatial \ufb01eld reconstruction, which requires a\nsmaller number of sensors when compared to the traditional paradigm of sensing\nwith static sensors. A challenge in mobile sensing is to overcome the location\nuncertainty of its sensors. 
While GPS or other localization methods can reduce this uncertainty, we address a more fundamental question: can a location-unaware mobile sensor, recording samples on a directed non-uniform random walk, learn the statistical distribution (as a function of space) of an underlying random process (spatial field)? The answer is in the affirmative for Lipschitz continuous fields, where the accuracy of our distribution-learning method increases with the number of observed field samples (sampling rate). To validate our distribution-learning method, we have created a dataset with 43 experimental trials by measuring sound-level along a fixed path using a location-unaware mobile sound-level meter.\n\n1 Introduction\n\nLearning the statistical distribution of physical fields from observed values is a fundamental task in applications like environmental monitoring and pollution control. Consider a spatio-temporal process X(s, t) along a path, such as in a residential neighborhood or a city boulevard, where s denotes the location and t is the time. It is of interest to learn the statistical distribution of X(s, t) at any point s along the path for environment monitoring. Motivated by this application, we study the distribution-learning of a Lipschitz continuous spatial field at all locations from spatial samples of its realizations.\n\nIn classical environment monitoring done by agencies such as the EPA (http://epa.gov), the sensing locations are assumed to be known. This is especially true when there is a dedicated fixed sensing location with associated equipment. Recently, mobile-sensing has been proposed as a way to increase the spatial sampling density and reduce the cost of hardware Unnikrishnan and Vetterli [2013]. A key challenge in mobile-sensing is to ascertain the exact location of sampling, and it is of interest to work with location-unaware sensing methods Kumar [2017]. 
While it is possible to use GPS or wireless localization methods to estimate the location, these methods incur energy and hardware overhead Che et al. [2009], Hu and Evans [2004]. We ask a more fundamental question: can recently discovered location-unaware sensing methods be used to learn the statistical distribution of X(s, t) as a function of s? The answer is yes, and analytical and experimental results along this theme will be presented.\n\nLet X(s, t) be a spatial field where s ∈ P denotes the location and t ∈ R denotes time. The path P is known, and it can be an open path or a loop. The set P represents the finite path over which the distribution of X(s, t) has to be learned. It is assumed that |X(s, t)| ≤ b everywhere for a finite b > 0 and the field is Lipschitz continuous; that is, |X(s, t) − X(s′, t)| ≤ α|s − s′| for some α > 0. The unknown sampling locations are modeled using an unknown renewal process (directed non-uniform random walk) as in the related literature Kumar [2017]. The sampling locations are S_1, S_2, . . . , S_M along the path P, where M is obtained from the stopping condition S_M ≤ 1, S_{M+1} > 1. A renewal process implies that θ_1 := S_1, θ_2 := S_2 − S_1, . . . are independent and identically distributed. In our setup, the distribution of θ is not known. This model is useful when there is jitter in the mobile sensor's speed or when the sensing time-intervals are programmed according to a renewal process. The mobile-sensing experiment for distribution-learning is designed around N independent trials. It is assumed that N mobile-sensing experiments, with statistically independent sampling locations between the experiments, are conducted on the path P.\n\n33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.\n\n
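The sampling model above is easy to simulate. Below is a minimal sketch of the directed non-uniform random walk, assuming (hypothetically) uniform inter-sample intervals θ ~ Uniform(0, λ/n) with λ = 2, which satisfies 0 < θ ≤ λ/n and E[θ] = 1/n; the paper's results do not depend on this particular choice of distribution.

```python
import random

def renewal_locations(n, lam=2.0, rng=random):
    """Sampling locations S_1 < S_2 < ... < S_M of a location-unaware
    mobile sensor on the path P = [0, 1].

    Inter-sample intervals theta are i.i.d. with 0 < theta <= lam/n and
    E[theta] = 1/n; here theta ~ Uniform(0, lam/n) with lam = 2 is a
    hypothetical stand-in. The stopping rule is S_M <= 1 and S_{M+1} > 1.
    """
    locations, s = [], 0.0
    while True:
        s += rng.uniform(0.0, lam / n)  # one renewal step theta
        if s > 1.0:                     # S_{M+1} > 1: stop and discard
            return locations
        locations.append(s)             # S_1, ..., S_M
```

The sensor reports only the field values at these locations; the locations themselves are never observed, which is what makes the learning problem non-trivial.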
Using these location-unaware samples, it is of interest to learn the statistical distribution of X(s, t) for any point s ∈ P.\n\nOur main results are as follows:\n\n1. Using the classical Glivenko-Cantelli estimate, a distribution-learning method for X(s, t), s ∈ P is presented, where the maximum pointwise error between the cumulative distribution function (CDF) of X(s, t) and its estimate decreases as O(1/(nε²)) + O(ε). Here ε > 0 is a parameter of choice and n is the average number of samples. This result holds in the limit when N → ∞.\n\n2. We have conducted mobile sensing experiments with a sound-level meter. The implications of our distribution-learning method on this custom dataset will be explored, and comparisons between distribution-learning with a fixed and a mobile sound-level meter will be presented.\n\nTo apply our distribution-learning method, we have measured sound-level along a closed path in multiple experiments. Using a portable sound-level meter, which is location-unaware, a dataset with N = 43 trials has been created for the application of the proposed distribution-learning method.\n\nFigure 1: (a) A mobile sensor moving along a fixed 1-D path. The field samples are obtained at unknown locations S_1, S_2, . . . , S_M. (b) N trials are carried out on N different days. Spatial field values are recorded at unknown locations S_{i,1}, S_{i,2}, . . . , S_{i,M_i} on trial i. The number of samples recorded during the ith trial is denoted by M_i.\n\nState of the art: Classical sampling and distributed sampling have been addressed with fixed sampling locations, where the location of the sensor is known A. J. Jerri [1977], Marco et al., Kumar et al. [2011, 2010]. A systematic analysis of spatial sensing with a mobile sensor has been studied in Unnikrishnan and Vetterli [2013]. 
Sensing of temporally fixed parametric spatial fields with a location-unaware mobile sensor was first addressed in Kumar [2016]. With location-unaware sampling, interpolation methods for polynomial shapes have been reported in Pacholska et al. [2017]. With location-unawareness, an algorithm for spatial mapping is presented in Elhami et al. [2018]. Mobile sampling is also studied for crowdsensing applications; Morselli et al. [2018] compared environmental monitoring using a fixed grid of sensors and sensors attached to vehicles. The use of vehicular sensor networks for environmental monitoring has been studied in Atakan [2014], Wang and Chen [2017].\n\n2 Sensing model and spatial field properties\n\nIn this section, the modeling assumptions made on the spatial field and the location-unaware mobile sensor are presented. First, spatial field properties are discussed.\n\nLet P be a bounded-length path and s ∈ P be a point on it. Let t be time. The spatial field of interest is X(s, t), s ∈ P, t ∈ R. The distribution of X(s, t) as a function of s ∈ P has to be 'learned', and this will be termed the distribution-learning problem in this work. The field X(s, t) may be non-stationary as a function of s ∈ P, which makes the distribution-learning problem non-trivial. It is assumed that |X(s, t)| ≤ b everywhere and the field is Lipschitz continuous in s, i.e.,\n\n|X(s, t) − X(s′, t)| ≤ α|s − s′| for all s, s′ ∈ P and all t.\n\nThe boundedness of the spatial derivative indicates that nearby points have similar field values. Without loss of generality, the one-dimensional path will be considered as P = [0, 1]. A location-unaware mobile sensor samples the field X(s, t) from s = 0 to s = 1 at points generated by an unknown renewal process. The sampling points are S_1, S_2, . . .
, S_M, while the inter-sample distances are θ_1 := S_1, θ_2 := S_2 − S_1, . . ., θ_M := S_M − S_{M−1}. The variables θ_1, θ_2, . . . are independent and identically distributed positive random variables. For analysis purposes, it will be assumed that\n\n0 < θ ≤ λ/n and E(θ) = 1/n,    (1)\n\nwhere λ > 1 is finite and represents the maximum speed of the sensor, while the average sampling rate is n samples per meter.\n\nSince θ_1, θ_2, . . . are assumed to be random variables, the number of sample points realized in [0, 1] will be random. Let the random variable M be the number of readings taken in each mobile sensing trial in the interval P = [0, 1]. The variable M is given by the following stopping rule Durrett [2010]: θ_1 + θ_2 + . . . + θ_M ≤ 1 and θ_1 + θ_2 + . . . + θ_{M+1} > 1. As shown in Kumar [2017], the conditional average of θ conditioned on M = n is approximately 1/n. Specifically, it is known that\n\nE[M] ≤ n + λ − 1.    (2)\n\nThe distribution of θ_1 and the values of s_1, . . . , s_M are not required for our distribution-learning algorithm, which makes it a universal learning algorithm under the above assumptions. This is one of the simplest location-unaware mobile sensor models that can be used along a path.\n\nThe entire mobile-sensing experiment is designed around N independent trials. It is assumed that the field samples\n\nf⃗_1 := [X(S_{1,1}, t_{1,1}), X(S_{1,2}, t_{1,2}), . . . , X(S_{1,M_1}, t_{1,M_1})]^T ; . . . , f⃗_N := [X(S_{N,1}, t_{N,1}), X(S_{N,2}, t_{N,2}), . . . , X(S_{N,M_N}, t_{N,M_N})]^T\n\nare available. It is assumed that the observed values in different trials are statistically independent. Using these N different trials, it is of interest to learn the distribution of X(s, t) for any point s ∈ P. The values of the sampling locations S_{i,j} are not known. All of these sampling locations are generated by N independent instances of the same renewal process with inter-sample spacing distribution of θ. Thus, the vectors f⃗_1, . . . , f⃗_N are statistically independent. (Within each vector f⃗_i, the entries are dependent; for example, S_{1,2} = S_{1,1} + θ_{1,2} depends on S_{1,1}.)\n\n3 Spatial field's distribution-learning algorithm\n\nThis section summarizes our distribution-learning method and the analysis results. The values summarized by f⃗_1, . . . , f⃗_N are available. The i-th trial results in the dataset f⃗_i with M_i samples. Since the sample locations are unknown, the error in learning the field distribution at any given location depends on the error in estimating the field value at that location from the samples obtained by the mobile sensor. For any s ∈ [0, 1], the task is to learn the distribution of X(s, t). For notational purposes, in a given trial, let M be the number of recorded samples. Let M_i be the number of samples recorded during trial i and let S_{i,j} denote the location of the jth sample of trial i. From trial i, let X̂_i(s) be the estimate of the field value at the point s (corresponding to the time of trial i). Designing a good estimate X̂_i(s) is a challenge in the location-unaware sensing setup. For the distribution-learning problem, we define an estimate for X(s) from the i-th trial as\n\nX̂_i(s) := X(S_{i,⌊(M_i−1)s⌋+1}, t_{i,⌊(M_i−1)s⌋+1}).    (3)\n\nNote that the dependence on t has been dropped on the left-hand side. This simplified notation will be used, since the main error in distribution-learning is due to the error in the location estimate s. The distribution is assumed to be calculated over all time. 
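The index rule in (3) is simple to implement: within a trial, the sample whose position in the recorded sequence best matches the relative location s is used as a proxy for the field value at s. A minimal sketch (the sample list below is a placeholder):

```python
def estimate_at(samples, s):
    """Location-unaware estimate of the field at location s, as in eq. (3).

    samples: recorded field values [X(S_1), ..., X(S_M)] in travel order,
    with the locations S_j unknown. Returns the sample with (1-indexed)
    index l(M, s) = floor((M - 1) * s) + 1.
    """
    M = len(samples)
    j = int((M - 1) * s) + 1   # l(M, s), for s in [0, 1]; int() floors here
    return samples[j - 1]      # convert to 0-indexed
```

For s = 0 this returns the first sample and for s = 1 the last one; in between, the fraction of the traversal completed selects the sample.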
Let\n\nF_{X(s)}(x) = P(X(s) ≤ x)\n\ndenote the cumulative distribution function (CDF) of field values at the location s, and let F_{X̂(s)}(x) = P(X̂(s) ≤ x) be the CDF of its estimate. Let 1(x ∈ A) be the indicator of set A. The CDF of X̂(s) can be obtained as the following classical Glivenko-Cantelli limit:\n\nF_{X̂(s)}(x) = lim_{N→∞} (1/N) Σ_{i=1}^{N} 1( X̂_i(s) ≤ x ).    (4)\n\nOur first result establishes the error between F_{X̂(s)}(x) and F_{X(s)}(x) under the previously mentioned location-unaware sensing setup. Let f_{X(s)}(x) be the probability density function of X(s). Then,\n\nTheorem 1. Let θ_1, θ_2, . . . , θ_M be inter-sample intervals generated by an unknown renewal process such that E[θ_1] = 1/n and 0 < θ ≤ λ/n. Let M be the random number of samples recorded during a trial. Then for every x ∈ R, s ∈ [0, 1] and for any ε > 0,\n\n|F_{X(s)}(x) − F_{X̂(s)}(x)| ≤ ε · max_x f_{X(s)}(x) + (α²/ε²) ((n + λ − 1) s(1 − s) + C) (λ²/n²).    (5)\n\nProof. This result establishes the closeness of the CDFs of X(s) and X̂(s) for any s ∈ [0, 1]. Using a classical result from [Grimmett and Stirzaker [2001], pg. 311], the following is noted:\n\nF_{X̂(s)}(x) ≤ F_{X(s)}(x + ε) + P( |X̂(s) − X(s)| > ε ).    (6)\n\nWhen f_{X(s)}(x) exists for every x, |F_{X(s)}(x + ε) − F_{X(s)}(x)| = P(x < X(s) ≤ x + ε) ≤ ε · max_x f_{X(s)}(x).¹ Therefore,\n\n|F_{X̂(s)}(x) − F_{X(s)}(x)| ≤ |F_{X(s)}(x + ε) − F_{X(s)}(x)| + P( |X̂(s) − X(s)| > ε )    (7)\n≤ ε · max_x f_{X(s)}(x) + P( |X̂(s) − X(s)| > ε ).    (8)\n\nSince the field is assumed to be Lipschitz continuous,\n\n|X(s) − X̂(s)| ≤ α |S_{⌊(M−1)s⌋+1} − s|,    (9)\n\nwhere α is the Lipschitz constant. Let l(M, s) = ⌊(M − 1)s⌋ + 1. Therefore, the mean-squared error (MSE) in the estimation of spatial field values at location s is bounded as\n\nE[ |X(s) − X̂(s)|² ] ≤ α² E[ |S_{l(M,s)} − s|² ].    (10)\n\nFrom (23) in Appendix A (given in the supplementary document),\n\nE[ |S_{l(M,s)} − s|² ] ≤ (E[M] s(1 − s) + C) (λ²/n²).    (11)\n\nFrom (2), (10), and (11) it follows that\n\nE[ |X(s) − X̂(s)|² ] ≤ α² ((n + λ − 1) s(1 − s) + C) (λ²/n²).    (12)\n\nBy Chebyshev's inequality and X̂(s) = X(S_{l(M,s)}),\n\nP( |X(s) − X(S_{l(M,s)})| > ε ) ≤ (1/ε²) E[ |X(s) − X̂(s)|² ]    (13)\n≤ (α²/ε²) ((n + λ − 1) s(1 − s) + C) (λ²/n²).    (14)\n\nThus from (8) and (14),\n\n|F_{X(s)}(x) − F_{X̂(s)}(x)| ≤ ε · max_x f_{X(s)}(x) + (α²/ε²) ((n + λ − 1) s(1 − s) + C) (λ²/n²).    (15)\n\nThe second term in the upper bound is of the order O(1/(nε²)) while the first term is of the order O(ε). Therefore, as the sampling rate n tends to infinity, F_{X̂(s)}(x) converges to F_{X(s)}(x). This upper bound depends on s and has a maximum at s = 1/2.\n\n¹In case f_{X(s)}(x) does not exist for every x, since F_{X(s)}(x) is a continuous function, for every ε > 0 there exists a δ(ε) > 0 such that |F_{X(s)}(x + ε) − F_{X(s)}(x)| ≤ δ(ε). As ε tends to zero, δ(ε) tends to zero.\n\nSimilar to the above result, our next theorem obtains a uniform bound on the error between the CDFs of X(s) and X̂(s).\n\nTheorem 2. Let θ_1, θ_2, . . . , θ_M be inter-sample intervals generated by an unknown renewal process such that E[θ_1] = 1/n and 0 < θ ≤ λ/n. Let M be the random number of samples recorded during a trial. Then for every x ∈ R, s ∈ [0, 1] and for any ε > 0,\n\nsup_{s∈[0,1]} |F_{X̂(s)}(x) − F_{X(s)}(x)| ≤ ε · max_x f_{X(s)}(x) + (32/β) (α²/ε²) (n + λ − 1) (λ²/n²),    (16)\n\nwhere β tends to 1 as n tends to infinity.\n\nProof. From (9),\n\nsup_{s∈[0,1]} |X(s) − X̂(s)| ≤ α sup_{s∈[0,1]} |S_{l(M,s)} − s|.    (17)\n\nFor any ε > 0, P( sup_{s∈[0,1]} |X̂(s) − X(s)| > ε ) ≤ P( sup_{s∈[0,1]} |S_{l(M,s)} − s| > ε/α ). Let η = ε/α. From (44) in Appendix B (given in the supplementary document),\n\nP( sup_{s∈[0,1]} |S_{l(M,s)} − s| > η ) ≤ (2/β) (16/η²) (λ²/n²) E[M],    (18)\n\nwhere β tends to 1 as n tends to infinity. Therefore from (2) and (18),\n\nP( sup_{s∈[0,1]} |X̂(s) − X(s)| > ε ) ≤ (32/β) (α²/ε²) (n + λ − 1) (λ²/n²).    (19)\n\nFrom (8), |F_{X̂(s)}(x) − F_{X(s)}(x)| ≤ P( |X(s) − X̂(s)| > ε ) + ε · max_x f_{X(s)}(x). The upper bound on the right hand side in (19) is independent of s, so\n\nsup_{s∈[0,1]} |F_{X̂(s)}(x) − F_{X(s)}(x)| ≤ ε · max_x f_{X(s)}(x) + (32/β) (α²/ε²) (n + λ − 1) (λ²/n²).    (20)\n\nSince the upper bound in (19) is of O(1/n), it follows that for any ε > 0,\n\nlim_{n→∞} P( sup_{s∈[0,1]} |X̂(s) − X(s)| > ε ) = 0.\n\nThis implies that as the sampling rate n tends to infinity, F_{X̂(s)}(x) converges uniformly over s ∈ [0, 1] to F_{X(s)}(x).\n\nIn the above result, ε is a parameter and the upper bound can be minimized over it. The result is left in terms of ε for future improvements, if any. 
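The two theorems predict that the empirical CDF of the location-unaware estimate approaches the CDF at the true location as the sampling rate n grows. A self-contained Monte Carlo sketch of this effect, using a hypothetical Lipschitz field X(s, t) = A(t)(1 + sin 2πs) with A(t) ~ Uniform(0, 1) redrawn each trial, and uniform renewal intervals with mean 1/n (both are illustrative stand-ins, not the paper's simulation setup):

```python
import math
import random

def cdf_gap(n, N=1500, s=0.5, seed=0):
    """Max pointwise gap between the empirical CDF of X(s) (fixed sensor)
    and that of the location-unaware estimate X(S_{l(M,s)}) over N trials."""
    rng = random.Random(seed)
    true_vals, est_vals = [], []
    for _ in range(N):
        A = rng.random()                      # fresh field realization per trial
        field = lambda u: A * (1.0 + math.sin(2 * math.pi * u))
        locs, u = [], 0.0
        while True:                           # renewal walk, E[theta] = 1/n
            u += rng.uniform(0.0, 2.0 / n)
            if u > 1.0:
                break
            locs.append(u)
        j = int((len(locs) - 1) * s) + 1      # index rule l(M, s) from eq. (3)
        est_vals.append(field(locs[j - 1]))   # mobile, location-unaware
        true_vals.append(field(s))            # fixed sensor at s
    grid = [0.02 * k for k in range(101)]     # field values lie in [0, 2]
    return max(abs(sum(v <= x for v in true_vals)
                   - sum(v <= x for v in est_vals)) / N
               for x in grid)
```

Raising n shrinks the gap, as the O(1/(nε²)) + O(ε) bound of Theorem 1 suggests: `cdf_gap(1000)` comes out noticeably smaller than `cdf_gap(20)`.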
Simulation results are presented next to validate the above two theorems.\n\n4 Simulations for distribution-learning using location-unaware samples\n\nTo apply and confirm our distribution-learning method, we consider a synthetic spatio-temporally varying sound-level along a path for simulations. The main goal of these simulations is to verify the accuracy of our distribution-learning method with an increase in the number of samples. The sound-level at location s ∈ [0, 1] and time t in the simulated signal is X(s, t), where\n\nX(s, t) = | 1000 + Σ_{r=1}^{10} A_r(t) cos(2π f_r(t) s) |.\n\nIt is a 10-frequency signal, where the frequencies at each sampling time-instant are generated uniformly at random in the audible frequency range 20 Hz to 20 kHz, and where the amplitudes A_r(t) are generated uniformly in the range [−180, 180]; this interval was selected to ensure that the sound-level lies in the usual range of 30-70 dB. Thus, in each trial among a total of N, at every sampling instant an independent realization of the sound-level signal is used. Let t_{i,j} be the sampling instants, where i = 1, 2, . . . , N and j = 1, 2, . . . , m_i. Here, m_i is the number of samples collected in trial i. The sound-levels for different values of t_{i,j} are independent in our simulations. If j = l(m_i, s), then X(s, t_{i,j}) is the true sound-level, which is estimated by our algorithm as X̂(s, t_{i,j}) := X(s_{i,j}, t_{i,j}) (see (3)). Note that the s_{i,j} values are not known in location-unaware sensing; these sampling locations are approximated as (j − 1)/(m_i − 1) for j = 1, 2, . . . , m_i.\n\nThe sampling locations are obtained by randomly generating locations s_{i,1}, s_{i,2}, . . . , s_{i,m_i} on trial i. These locations are generated by adding independent inter-sample intervals θ with a Rayleigh distribution having parameter (1/n)√(2/π). The mean of θ is 1/n. The sound-levels in the simulation are also recorded at s in each trial. These values model the recording of sound-level by a fixed sensor at the point s. The empirical CDFs of the sound-level and of its estimate at location s are given by\n\nF̂_{X(s)}(x) = (1/N) Σ_{i=1}^{N} 1{ X(s, t_{i,j}) ≤ x }  and  F̂_{X̂(s)}(x) = (1/N) Σ_{i=1}^{N} 1{ X̂(s, t_{i,j}) ≤ x },    (21)\n\nwhere 1(x ∈ A) denotes the indicator of set A. Recall that j = l(m_i, s) for the i-th trial, as discussed above. Comparisons of CDFs for various values of n and N are shown in Figure 2, where n indicates the sampling rate and s = 1/2. As the sampling rate n increases, the number of samples recorded during each trial increases, and the error between the estimated CDF of samples obtained by the mobile sensor and the actual CDF of samples obtained by the fixed sensor at location s = 1/2 reduces. When there is a large number of trials, the error in the estimation of empirical CDFs reduces further. Thus, the simulation results validate our distribution-learning method with location-unaware samples.\n\n5 Experiments for sound-level estimation along a path\n\nSound-level is measured along the path shown in the map in Figure 3 using a sound-level meter. It is carried along the path from the starting point 1 back to point 1. A sound-level meter by BAFX products (Model no: BAFX3608) is used for this purpose. Specifications of the sound-level meter are given in Table 1. It is not equipped with GPS or any other localization tool.\n\nTable 1: Specifications of Sound Level Meter\n\nRange: 30-130 dB | Sampling Rate: 1 per sec | Memory: 4700 readings | Accuracy: ±1.5 dB\n\nDatasets: We have created two different datasets by measuring sound-level along the path shown in Figure 3. 
For the first data set, denoted by Dataset1, the path is traversed with a sound-level meter. It begins recording data at the starting point and continues collecting data along the entire path. This acts as a location-unaware mobile sound-level meter. A static sound-level meter is used to measure sound-level at specific locations, marked in the map in Figure 3 with numbers one to nine, during each trial. This acts as a fixed sensor, as the field is measured at known locations. We have performed 43 trials along the same path in Figure 3. For the second dataset, denoted by Dataset2, the path is traversed using the sound-level meter while cycling, where the sampling rate in space is lower as compared to walking. We have performed 43 trials in this case as well along the same path in Figure 3. Since the sound-level meter records samples at the rate of 1 sample per second, the spatial sampling rate for Dataset2 is smaller than the spatial sampling rate for Dataset1. We have also emulated a fixed station at location 9 in Figure 3 using a static sensor for 10 minutes.\n\nFigure 2: Empirical CDF of simulated sound-level at the location s = 0.5, where n is the sampling rate and N is the number of trials: (a) n = 100 and N = 100; (b) n = 100 and N = 500; (c) n = 1000 and N = 100; and (d) n = 1000 and N = 500.\n\nFigure 3: Path along which sound-level is recorded. The locations marked in the map with numbers are used for measurement using a fixed sensor.\n\nFor experimentation, the path in Figure 3 was chosen as there is a large variation in the sound-level along the path. The residential area is expected to be quiet compared to the region near the state highway and the residential market. The box plot for Dataset1 is illustrated in Figure 4. A box plot displays information about the range, median, and quartiles of the data. From Figure 4, the dynamic range of sound-level along the path is observed. 
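The spatial sampling rate n that the theory works with follows directly from the meter's fixed 1 Hz logging rate and the traversal speed: n = (temporal rate)/(speed). The speeds below are hypothetical ballpark walking and cycling figures, included only to illustrate why Dataset2 (cycling) has the lower spatial rate:

```python
def spatial_rate(temporal_rate_hz, speed_m_per_s):
    """Samples per meter laid down by a mobile meter logging at a fixed
    temporal rate while moving at the given speed (assumed values)."""
    return temporal_rate_hz / speed_m_per_s

walking_n = spatial_rate(1.0, 1.4)  # ~0.71 samples/m (assumed walking speed)
cycling_n = spatial_rate(1.0, 5.0)  # 0.20 samples/m (assumed cycling speed)
```

Under these assumed speeds, the 1015 m loop yields roughly 700 samples per walking trial versus about 200 per cycling trial, which is the Dataset1 versus Dataset2 gap discussed above.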
The average sound-level variation is 20 dB (a ratio of 100) while the dynamic range exceeds 30 dB (a ratio of 1000). The main aim is to apply the distribution-learning method on experimental data, and to compare the agreement of the learned distributions between a mobile sensor and a fixed sensor. The empirical distribution of sound-level obtained from the mobile sound-level meter, defined in (21), is compared with the empirical distribution of sound-level obtained from the fixed sensor, also defined in (21), which measures sound-level at the locations marked with numbers 1-9 in Figure 3. Figure 5a shows the comparison of empirical CDFs of experimental data from Dataset1 at location 5 in Figure 3. The error in the empirical distributions computed using samples from the fixed sensor and the mobile sensor in Dataset1 is small, as shown in Figure 5a. This shows that the sound-level distribution at any location on a path can be learned using location-unaware samples.\n\nFigure 4: Box plot for samples obtained from the mobile sound-level device in Dataset1 along the path in Figure 3 of length 1015 meters.\n\nTo check the distribution-learning method at two different sampling rates of the mobile sound-level meter, the empirical CDF of sound-level defined by (21) (at location 9 in Figure 3) using a fixed sensor and the empirical CDF of sound-level obtained by the mobile sensor defined by (21) are compared. This comparison is done at two different sampling rates, obtained from Dataset1 and Dataset2. The CDFs are plotted in Figure 5. From Figure 5(b) and (c), the accuracy in learning the distribution is better for Dataset1 (higher spatial sampling rate) as compared to Dataset2 (lower spatial sampling rate). 
The accuracy of the distribution-learning method increases with the spatial sampling rate. The decrease in the maximum pointwise error of the learned CDF with n is also shown in Theorems 1 and 2.\n\nFigure 5: (a) Comparison of the empirical CDFs of sound-level at location 5 in Figure 3, obtained by the fixed sensor and the mobile sensor. Comparison at location 9 in Figure 3, for two different sampling rates of the mobile sensor: (b) fixed sensor versus mobile sensor for Dataset1 (higher spatial sampling rate); (c) fixed sensor versus mobile sensor for Dataset2 (lower spatial sampling rate).\n\n6 Conclusions\n\nIn this work, we proposed a data-driven method for learning the statistical distribution of a Lipschitz continuous spatial field along a path. The samples used were obtained at unknown locations generated by an unknown renewal process. The accuracy of the proposed distribution-learning method increases with the spatial sampling rate of the mobile sensor. Simulation and experimental results support this claim. A method to learn the variation of the distribution with time needs to be developed if the field is temporally varying in nature. The field was assumed to be one dimensional and a single mobile sensor was used to sample it. The use of multiple location-unaware mobile sensors for sampling 2-D fields can be studied in the future.\n\nReferences\n\nJ. Unnikrishnan and M. Vetterli. Sampling and reconstruction of spatial fields using mobile sensors. IEEE Transactions on Signal Processing, 61(9):2328–2340, May 2013.\n\nA. Kumar. On bandlimited field estimation from samples recorded by a location-unaware mobile sensor. IEEE Transactions on Information Theory, 63(4):2188–2200, April 2017.\n\nX. Che, I. Wells, P. Kear, G. 
Dickers, X. Gong, and M. Rhodes. A static multi-hop underwater wireless\nsensor network using RF electromagnetic communications. In 2009 29th IEEE International\nConference on Distributed Computing Systems Workshops, pages 460\u2013463, June 2009.\n\nLingxuan Hu and David Evans. Localization for mobile sensor networks. In Proceedings of the 10th\nannual international conference on Mobile computing and networking, pages 45\u201357. ACM, 2004.\n\nA. J. Jerri. The Shannon Sampling Theorem \u2013 its Various Extensions and Applications: a Tutorial\n\nPreview. Proceedings of the IEEE, 65:1565\u20131594, Nov. 1977.\n\nD. Marco, E. J. Duarte-Melo, M. Liu, and D. L. Neuhoff. On the many-to-one transport capacity of a\ndense wireless sensor network and the compressibility of its data. In IPSN, Proc. of the 2nd Intl.\nWkshp., Palo Alto, CA, USA, LNCS edited by L. J. Guibas and F. Zhao, Springer, NY, 2003, pages\n1\u201316.\n\nA. Kumar, P. Ishwar, and K. Ramchandran. High-resolution distributed sampling of bandlimited\n\ufb01elds with low-precision sensors. IEEE Trans. on Information Theory, 57(1):476\u2013492, Jan. 2011.\n\nA. Kumar, P. Ishwar, and K. Ramchandran. Dithered A/D conversion of smooth non-bandlimited\n\nsignals. IEEE Transactions on Signal Processing, 58(5):2654\u20132666, May 2010.\n\nA. Kumar. Bandlimited \ufb01eld estimation from samples recorded by a location-unaware mobile sensor.\n\nIn 2016 IEEE International Symposium on Information Theory, pages 1257\u20131261, July 2016.\n\nM. Pacholska, B. B. Haro, A. Schole\ufb01eld, and M. Vetterli. Sampling at unknown locations, with\nan application in surface retrieval. In 2017 International Conference on Sampling Theory and\nApplications (SampTA), pages 364\u2013368, July 2017.\n\nG. Elhami, M. Pacholska, B. B. Haro, M. Vetterli, and A. Schole\ufb01eld. Sampling at unknown locations:\nUniqueness and reconstruction under constraints. IEEE Transactions on Signal Processing, 66\n(22):5862\u20135874, Nov 2018.\n\nF. 
Morselli, F. Zabini, and A. Conti. Environmental monitoring via vehicular crowdsensing. In 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), pages 1382–1387, Sep. 2018.\n\nB. Atakan. On exploiting sampling jitter in vehicular sensor networks. IEEE Transactions on Vehicular Technology, 63(1):403–407, Jan 2014.\n\nY. Wang and G. Chen. Efficient data gathering and estimation for metropolitan air quality monitoring by using vehicular sensor networks. IEEE Transactions on Vehicular Technology, 66(8):7234–7248, Aug 2017.\n\nRick Durrett. Probability: theory and examples. Cambridge University Press, 2010.\n\nGeoffrey Grimmett and David Stirzaker. Probability and random processes. Oxford University Press, 2001.\n", "award": [], "sourceid": 6770, "authors": [{"given_name": "Meera", "family_name": "Pai", "institution": "Indian Institute of Technology Bombay"}, {"given_name": "Animesh", "family_name": "Kumar", "institution": "Indian Institute of Technology Bombay"}]}