
Access Point Significance Measures in WLAN-based Location

Elina Laitinen, Elena Simona Lohan, Jukka Talvitie and Shweta Shrestha

position estimation is done either via fingerprinting or via path-loss models [1], [5], [6]. It is generally believed that some RSSIs are strongly relevant, while others are only weakly relevant for positioning [1], [4]. However, there has been very little work so far on how to measure the ‘significance’ of an Access Point (AP) and on which rules should decide whether an AP is kept in the position estimation [1], [5], [7]. So far, the main criterion for studying the relevance of an AP has been the average RSSI received from that particular AP. We address this issue by dividing the problem into two stages, the significance of an AP at the training phase and the significance of an AP at the estimation phase, and we study several additional ways to select an AP at both phases. In the training phase, the motivation for the AP significance measures is to decrease the amount of data to be transferred to the mobile. In the estimation phase, the motivation is to find out whether the positioning accuracy can be increased by dropping some parts of the data. We address the fingerprinting estimation method and compare all the techniques based on indoor measurement data, collected with a Windows tablet in several buildings in Tampere, Finland (university, shopping centers and supermarket).

Abstract—This paper focuses on WLAN-based indoor location by taking into account the contribution of each hearable Access Point (AP) to the location estimate. Typically, in many indoor scenarios of interest for future location services, such as malls, shopping centers, airports or other transit hubs, the number of hearable APs is huge, and it is important to find out whether some of these APs are redundant for the purpose of location accuracy and may be dropped. Moreover, many APs nowadays are multi-antenna APs or support multiple MAC addresses coming from exactly the same location, so it is likely that they bring little or no benefit if all of them are kept in the positioning stage. The purpose of our paper is to address various significance measures in WLAN-based location and to compare them from the point of view of the accuracy of the location solution. The access point significance is studied both at the training stage and at the estimation stage. Our models are based on real measurement data.

Index Terms—Access Point (AP) selection, fingerprinting, path-loss models, Received Signal Strength Indicator (RSSI), Wireless Local Area Network (WLAN)-based location.

I. INTRODUCTION

WLAN-BASED location is becoming more and more popular in indoor areas, where the traditional Global Navigation Satellite Systems (GNSS) often fail to offer a position estimate [1], [2]. A wide range of location-based services is envisioned for the future, once the barrier of indoor location is crossed [3]. The underlying multiple access schemes for WLANs are both Direct Sequence-Code Division Multiple Access (DS-CDMA) and Orthogonal Frequency Division Multiplexing (OFDM) techniques, and the underlying modulations range from Binary Phase Shift Keying (BPSK) to higher-order Quadrature Amplitude Modulation (QAM). Thus, Time-Of-Arrival (TOA) or Round-Trip-Time (RTT)-based estimation for WLAN location is still not widespread, due to the many different underlying physical layer features of the WLANs on the market. Alternatively, the Received Signal Strength Indicator (RSSI) of the signal can be used for location, either by matching the measured RSSIs with RSSIs collected in advance in a database (fingerprinting method) or by using path-loss models derived from the measured RSSI (path-loss estimation). Both estimation methods rely on an initial training phase, when the RSSIs are continuously collected in various environments, followed by an estimation phase, where the actual mobile

II. POSITIONING PRINCIPLES

Several different mobile positioning techniques based only on RSS or RSSI have been proposed over the years. One possible positioning method is the path-loss approach, where the position of the mobile station (MS) is estimated based on some signal propagation model. In this paper, we will focus only on fingerprinting, since we want to study the maximum achievable performance. Positioning procedures can in general be divided into two phases: the so-called training phase and the estimation phase. In the training phase, models and databases are built based on collected data samples. In the estimation phase, the unknown position of the MS is estimated based on the information saved in the training phase. In the fingerprinting method, the position of the MS is estimated based on a database with some location-sensitive parameters, such as the RSSI measurements. The main idea of this method is to create a database in the training phase using pre-measured samples (i.e., the fingerprints) with known locations, and then to use only this database in the estimation phase. In this paper, when comparing the currently measured RSSI levels of the MS to the RSSI levels of the fingerprints, the difference is calculated as a Euclidean distance. If no averaging over nearest neighbor (NN) points is used, the

All authors are with Tampere University of Technology, Tampere, Finland. Emails: {elina.laitinen, elena-simona.lohan, jukka.talvitie, shweta.shrestha}@tut.fi.

978-1-4673-1439-8/12/$31.00 ©2012 IEEE


fingerprint with the smallest Euclidean distance is selected, and the location of this point is returned as the MS location. When NN averaging is used, the N_neigh fingerprints with the smallest

several MAC-addresses due to multiple SSID MAC support) are recognized simply based on the AP location: if several APs are located within one meter of each other, only one (the one with the maximum average RSSI) is kept. Naturally, there may also be situations where two separate APs are located next to each other. In this case, if the position estimates for these APs determine them to be closely located, only one will be kept. Since the distance estimate is the only measure used to determine closely located APs in our studies, such situations are not taken into account. The total number of MIMO APs depends on the measurement scenario.
- The reasoning behind this selection criterion is to try to remove redundant information offered by similar (closely located) APs.
3. Average RSSI - The most commonly used selection method, where APs are sorted in descending order based on their average RSSI. Only a certain part of the APs is selected: e.g., 75%, 50% or 25% of all APs.
4. Entropy - APs are sorted in descending order based on the so-called entropy of RSSIs per AP. The entropy is calculated and derived by the Authors by analogy with the classical entropy definition [8]. Only a certain part of the APs is selected: e.g., 75%, 50% or 25% of all APs.
5. Variance - APs are sorted in descending order based on the variance of RSSIs per AP. Only a certain part of the APs is selected: e.g., 75%, 50% or 25% of all APs.
6. Maximum RSSI - APs are sorted in descending order based on their maximum RSSI value. Only a certain part of the APs is selected: e.g., 75%, 50% or 25% of all APs.
7. Number-of-points - The target here is to emphasize those APs that have a larger coverage area, by sorting APs in descending order based on the number of fingerprints where an AP is heard. Only a certain part of the APs is selected: e.g., 75%, 50% or 25% of all APs.
8. Random - This criterion is included in the results as another reference, in order to see the effect of selecting the APs randomly among all APs. Only a certain part of the APs is kept: e.g., 75%, 50% or 25% of all APs. The results for the random criterion are calculated over 1000 random iterations.

The results for these different selection criteria will be presented in Section V. Criterion 1 (No selection) and Criterion 8 (Random selection) are kept in the results as a reference. In Criterion 2 (MIMO selection) the target is to reduce possible redundant information offered by similar (closely located) APs. Criteria 3 (Average RSSI), 4 (Entropy), 5 (Variance) and 6 (Maximum RSSI) are closely related to the RSSI values of an AP in the fingerprints, with small differences from one selection criterion to another. Criterion 7 (Number-of-points) is based only on the number of fingerprints where an AP is measured, i.e., an AP that has

Euclidean distances are selected, and the position of the MS is calculated as an average over the corresponding locations. In this paper, we use so-called synthetic grids, where the grid resolution is fixed. The grid points have some pre-defined size (e.g., 5 m x 5 m) and all samples measured in this area are assigned to the same grid point. When a new sample falls into a grid point that already has a sample, all hearable APs are examined. If a new AP has been detected in the incoming sample, the AP is saved to the grid point data. If an AP is detected both in the old and in the incoming sample, the old RSSI value is replaced with the geometric mean of the old and new RSSI values.

The architecture of the positioning system used in this paper is mobile-based. This means that the MS makes the necessary measurements (here, RSSIs of the heard APs) and calculates the position estimate based not only on the measurements but also on the training phase data. The training phase data are saved and continuously maintained and updated in a database (i.e., on a server) and transferred to the MS when requested.

There are two different stages where AP significance is important. The addressed problems are:
- At the training phase (in order to decrease the amount of transferred data): which APs’ training data should be transferred to the MS?
- At the positioning phase (in order to increase the positioning accuracy): which APs should be used in the position estimation?

The restrictions on database size, and especially on the amount of data to be transferred to the MS, lead to the problem of selecting only parts of all available data. Since the number of APs can be huge, it is really important to know whether it is enough to transfer only some parts of the training data to the MS without decreasing the positioning performance. Another important question is which APs to use in the track estimation phase, especially if no selection is used in the training phase.
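The grid construction and the nearest-neighbor estimate described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the placeholder value `MISSING_RSSI_DBM` for APs that are not heard, snapping samples to cell centres, and taking the geometric mean over RSSI magnitudes (so that negative dBm values stay negative) are all choices made here for the example.

```python
import math

GRID_SIZE_M = 5.0          # grid resolution used in the paper (5 m x 5 m)
MISSING_RSSI_DBM = -100.0  # assumption: value used for APs that are not heard


def grid_point(x, y, size=GRID_SIZE_M):
    """Snap a coordinate to the centre of its grid cell (assumed convention)."""
    return ((math.floor(x / size) + 0.5) * size,
            (math.floor(y / size) + 0.5) * size)


def merge_rssi(old, new):
    """Geometric mean of two RSSI values.

    Assumption: RSSIs are negative dBm values, so the mean is taken over
    the magnitudes and the negative sign is restored.
    """
    return -math.sqrt(abs(old) * abs(new))


def build_grid(samples):
    """Build {grid_point: {ap_id: rssi}} from (x, y, {ap_id: rssi}) samples."""
    grid = {}
    for x, y, scan in samples:
        cell = grid.setdefault(grid_point(x, y), {})
        for ap, rssi in scan.items():
            # New AP: store it. Known AP: replace by the geometric mean.
            cell[ap] = merge_rssi(cell[ap], rssi) if ap in cell else rssi
    return grid


def estimate_position(scan, grid, n_neigh=1):
    """Average the locations of the n_neigh fingerprints closest in RSSI space."""
    def distance(fingerprint):
        aps = set(scan) | set(fingerprint)
        return math.sqrt(sum(
            (scan.get(ap, MISSING_RSSI_DBM)
             - fingerprint.get(ap, MISSING_RSSI_DBM)) ** 2
            for ap in aps))

    nearest = sorted(grid, key=lambda p: distance(grid[p]))[:n_neigh]
    return (sum(p[0] for p in nearest) / len(nearest),
            sum(p[1] for p in nearest) / len(nearest))


samples = [
    (1.0, 1.0, {"a": -40.0, "b": -70.0}),
    (2.0, 2.0, {"a": -40.0}),               # same 5 m x 5 m cell as above
    (11.0, 1.0, {"a": -80.0, "b": -50.0}),
]
grid = build_grid(samples)
position = estimate_position({"a": -42.0, "b": -69.0}, grid, n_neigh=1)
```

With `n_neigh=1` the location of the single closest grid point is returned; larger values average over several grid points, as in the NN averaging described in the text.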
There can be tens of APs measured in each scan result and, as will be illustrated later on, choosing all APs the MS “hears” may not be the optimal solution. We will show that restricting the use of APs also in the estimation phase may increase the positioning performance. We remark that if an AP is removed in the training phase (i.e., the fingerprint data for that AP are not transferred to the MS), that AP cannot be used in the estimation phase either.

III. AP SELECTION IN WLAN TRAINING PHASE

In this paper, several different AP selection criteria in the training phase were studied. The criteria are:
1. No selection - Here, all heard APs are kept. This criterion is included in the results as a reference, in order to see the effect of limiting the number of APs.
2. MIMO selection - When MIMO selection is chosen, unknown AP locations are first estimated as an average over the positions of the fingerprints (synthetic grids) with the highest RSSI. The MIMO APs (or other APs with

A. Measurement-based observations

In [5], it is stated that if an AP is “heard” in more measurement locations (i.e., fingerprints), the average RSSI from that AP is higher and that AP is deemed more significant. This finding was also examined in our studies, but the results are somewhat different. Fig. 2 presents the dependence between the number of fingerprints where an AP is “heard” and the average RSSI from that AP in the Univ. building 1.1 case. As can be seen in Fig. 2, strong RSSIs do not necessarily correspond to the “goodness” of an AP (measured as the number of fingerprints where that AP is heard), as is usually assumed. We remark that we obtained similar results with the other measurement scenarios, but they could not be presented here due to the limited number of pages.
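To make the comparison between the number-of-points and average-RSSI measures concrete, the per-AP statistics and a ranking-based selection can be sketched as below. This is an illustrative sketch only: the function names, the dictionary layout of the fingerprints, the use of the population variance, and rounding the number of kept APs upwards are assumptions, and the authors' entropy measure is left out because its exact definition is not given in the text.

```python
import math
from statistics import mean, pvariance


def ap_statistics(fingerprints):
    """Per-AP statistics over a list of fingerprints {ap_id: rssi_dbm}."""
    per_ap = {}
    for fingerprint in fingerprints:
        for ap, rssi in fingerprint.items():
            per_ap.setdefault(ap, []).append(rssi)
    return {
        ap: {
            "average_rssi": mean(values),
            "maximum_rssi": max(values),
            "variance": pvariance(values),      # assumption: population variance
            "number_of_points": len(values),    # fingerprints where AP is heard
        }
        for ap, values in per_ap.items()
    }


def select_aps(stats, criterion, keep_fraction):
    """Sort APs in descending order of one criterion and keep a fraction."""
    ranked = sorted(stats, key=lambda ap: stats[ap][criterion], reverse=True)
    # Assumption: the number of kept APs is rounded up, with at least one kept.
    n_keep = max(1, math.ceil(keep_fraction * len(ranked)))
    return ranked[:n_keep]


fingerprints = [
    {"a": -40.0, "b": -80.0},
    {"a": -50.0, "b": -82.0},
    {"b": -84.0, "c": -30.0},
]
stats = ap_statistics(fingerprints)
by_points = select_aps(stats, "number_of_points", 0.5)
by_average = select_aps(stats, "average_rssi", 0.5)
```

In this toy data the AP heard in the most fingerprints (`b`) has the weakest average RSSI, so the two criteria keep different APs, which is the kind of disagreement the observation above points to.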

larger coverage area is deemed more significant.

IV. AP SELECTION IN WLAN ESTIMATION PHASE

The results in this section and in Section V are based on real measurement data collected with a Windows tablet running a software tool that converted the map location into Cartesian coordinates. The positions on the map were added manually during the data collection phase. The data were collected in several different indoor environments, such as a university, shopping centers and a supermarket. The datasets are briefly described in TABLE I. In all measurement scenarios shown here, we have used a fixed grid size of 5 m x 5 m. User tracks in the same environments were collected in the same way as the training data, but as separate and independent measurements. One example of measurement points with a fixed grid size and one user track with true positions is shown in Fig. 1.

TABLE I DATASET DESCRIPTIONS

Dataset | Description | Length of user track (measurements) | Total number of APs
Univ. building 1.1 | University building 1, 1st floor | 15 | 185
Univ. building 1.2 | University building 1, 2nd floor | 30 | 116
Univ. building 2 | University building 2, 2nd floor | 31 | 236
Supermarket | Supermarket, one-floor-only | 283 | 27
Shopping mall 1 | Shopping center 1, 1st floor | 78 | 41
Shopping mall 2 | Shopping center 2, 3rd floor | 117 | 91

Figure 2. Dependence between the number of points and average RSSI. Univ. building 1.1 –scenario.

B. AP selection criteria in the estimation phase

In the estimation phase, the MS can “hear” tens of APs in each scan. APs can be selected in the estimation phase based either on information gathered from the current user measurement or on information gathered from the fingerprint dataset. In this paper, two different procedures were examined: sorting in descending order based on the RSSI of the current user measurement, or sorting according to a particular weight (here, the entropy of RSSIs over the fingerprints) that is calculated on the server and saved in the training data. The number of APs used for calculating the position estimate is from now on denoted as AP-limit.

C. Measurement results

Fig. 3 presents the Root Mean Square Error (RMSE) versus AP-limit for three different measurement scenarios. Both selection criteria are included here: sorting the APs based on the RSSI of the current user measurement, and sorting the APs according to the entropy of RSSIs that is calculated on the server over the fingerprint data. All data are transferred to the MS, i.e., no AP selection is used in the training phase. AP-limit = all means that no limit is used and every AP is taken into account in the user positioning. When examining Fig. 3, it can be noticed that for all three measurement scenarios, the sorting based on the RSSI of the current user measurement offers better results than

Figure 1. Example of measurement positions with 5 m x 5 m fixed grids and real positions of one user track, indoor mall in Tampere, Finland.

entropy-based sorting in terms of positioning accuracy. Indeed, it can be concluded that when using the RSSI-based sorting, limiting the used APs in the estimation phase can improve the positioning accuracy by a few meters, as it is, e.g., in the measurement scenarios Univ. building 2 and Shopping mall 2.
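The AP-limit described above amounts to keeping only the top-ranked APs of each scan before the fingerprint comparison. A minimal sketch, assuming a scan is a dictionary of RSSIs and that any server-side weight (such as the entropy-based one) is supplied as a precomputed dictionary; the function name `limit_scan` and its parameters are hypothetical:

```python
def limit_scan(scan, ap_limit=None, weights=None):
    """Keep at most ap_limit APs of the current scan {ap_id: rssi_dbm}.

    With weights=None the APs are sorted by the RSSI of the current
    measurement; otherwise by a precomputed per-AP weight from the
    training data (APs without a weight are ranked last).
    """
    if ap_limit is None or ap_limit >= len(scan):
        return dict(scan)  # "AP-limit = all": every heard AP is used
    if weights is None:
        ranked = sorted(scan, key=lambda ap: scan[ap], reverse=True)
    else:
        ranked = sorted(scan,
                        key=lambda ap: weights.get(ap, float("-inf")),
                        reverse=True)
    return {ap: scan[ap] for ap in ranked[:ap_limit]}


scan = {"a": -40.0, "b": -70.0, "c": -55.0}
strongest_two = limit_scan(scan, ap_limit=2)
by_weight = limit_scan(scan, ap_limit=1, weights={"a": 0.5, "b": 2.0, "c": 1.0})
```

The reduced scan would then be passed to the fingerprint matching step in place of the full scan.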

(Fig. 6), the positioning performance is decreased for all selection criteria. However, the mean RSSI-based and especially the maximum RSSI-based methods seem to be more sensitive to a “too high” removal percentage. When it comes to the joint limitation in the training and estimation phases, Figs. 4 to 6 also show that limiting in the estimation phase may increase the positioning performance for all criteria even when the APs are already limited in the training phase. This holds when removing 25% or even 50% of all the APs in the training phase. If as much as 75% of the APs are removed, most of the APs the user “hears” are already removed and the number of used APs in the user measurement is much lower. Thus, limiting the user measurements in the estimation phase, e.g., to 10 no longer affects the results, as can be seen in Fig. 6. We remark that similar results are achieved with the other measurement scenarios, but the results for all of those could not be presented here due to the limited number of pages. Indeed, it can be concluded that MIMO removal, combined with limiting the number of used APs in the estimation phase to 5, gives the best results in terms of positioning accuracy for the Univ. building 1.1 scenario. Especially when the APs are selected randomly, it may happen that all APs the user “hears” in some measurement were actually removed in the training phase, and thus no position estimate can be computed for that particular measurement. This situation does not occur with any of the selection criteria in the Univ. building 1.1 case when 25% or even 50% of the APs are removed. However, when 75% of the APs are removed, in less than 0.5% of the cases (out of 1000 iterations and 15 user measurements in every iteration) for the random selection, the user measurement contained only APs that had been removed in the training phase.
The same happened once (out of 15 user measurements) with both the average RSSI-based and the maximum RSSI-based selections. TABLE II presents the results for the different datasets and for all selection criteria, when AP-limit = 7 and only RSSI-based selection is used in the estimation phase. In these results, approximately the same removal percentage is used for the other methods as for MIMO removal, in order to make MIMO removal more comparable with the other selection criteria. The removal percentage varies from one dataset to another due to the different AP structure and is shown in TABLE II in parentheses. It can be clearly seen in TABLE II that MIMO removal offers the best positioning performance in three cases, when compared to the other selection criteria or to the case with no AP selection in the training phase at all. In the other three cases, MIMO removal is on the same level as the other methods. Since the AP locations are not known but only estimated based on the RSSI measurements, and since the limit for MIMO AP recognition is quite strict (1 meter), it is possible that some MIMO APs (or other APs with several MAC-addresses) are left in the data. This may affect the results in some cases.
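The MIMO removal step (estimating unknown AP locations from the strongest fingerprints and keeping one AP per group of closely located APs) can be sketched as follows. This is a hedged illustration rather than the authors' procedure: the number of strongest fingerprints averaged (`n_strongest`), the greedy grouping in descending order of average RSSI, and all function names are assumptions introduced here; only the 1 m threshold and the rule of keeping the AP with the maximum average RSSI come from the text.

```python
import math
from statistics import mean


def estimate_ap_locations(grid, n_strongest=3):
    """Estimate AP locations and average RSSIs from {(x, y): {ap_id: rssi}}.

    Assumption: each AP is placed at the mean position of the n_strongest
    grid points where it is heard with the highest RSSI.
    """
    heard = {}
    for position, fingerprint in grid.items():
        for ap, rssi in fingerprint.items():
            heard.setdefault(ap, []).append((rssi, position))
    locations, average_rssi = {}, {}
    for ap, observations in heard.items():
        top = sorted(observations, reverse=True)[:n_strongest]
        locations[ap] = (mean(p[0] for _, p in top), mean(p[1] for _, p in top))
        average_rssi[ap] = mean(rssi for rssi, _ in observations)
    return locations, average_rssi


def remove_colocated(locations, average_rssi, max_dist_m=1.0):
    """Keep one AP (the one with maximum average RSSI) per co-located group."""
    kept = []
    # Visit APs from strongest to weakest so the strongest of a group survives.
    for ap in sorted(locations, key=lambda a: average_rssi[a], reverse=True):
        if all(math.dist(locations[ap], locations[k]) > max_dist_m for k in kept):
            kept.append(ap)
    return kept


grid = {
    (2.5, 2.5): {"a": -40.0, "a2": -41.0, "b": -80.0},
    (7.5, 2.5): {"a": -60.0, "a2": -62.0, "b": -45.0},
    (12.5, 2.5): {"b": -50.0},
}
locations, average_rssi = estimate_ap_locations(grid, n_strongest=1)
kept = remove_colocated(locations, average_rssi)
```

Here `a` and `a2` stand for two MAC addresses of the same physical AP: both are estimated at the same grid point, so only the one with the higher average RSSI is kept.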

Figure 3. Limiting the number of used APs in the estimation phase only.

V. JOINT LIMITATION IN TRAINING-ESTIMATION PHASE

In this section, the APs are limited both in the training phase and in the estimation phase. All criteria for limiting in the training phase described in Section III are examined. In the estimation phase, only the RSSI-based selection criterion described in Section IV is taken into account. We remark that taking every AP into account in the estimation phase (i.e., no AP-limit is used) corresponds to limiting in the training phase only. The RMSE based on the distance error is used in the comparison to quantify the difference between the true MS location and the estimated one. Figs. 4 to 6 show the results for all criteria when removing 25%, 50% and 75%, respectively, of the APs in the training phase. ‘No selection’ and ‘MIMO selection’ remain the same in all figures. For the MIMO criterion, the percentage of removal was always 60.5%. The results are presented as RMSE versus AP-limit. The dataset is in each case the University building 1.1 set. When examining the figures, it is clearly visible that some APs can indeed be removed in the training phase without any effect on the positioning performance. In Fig. 4, where 25% of the APs are removed, the positioning performance is almost exactly the same regardless of the selection criterion. Even with random removal, the same performance level is achieved to within approximately 1 m. When the percentage of removed APs is increased to 50% (Fig. 5) and 75% (Fig. 6), the differences between the methods become more visible. In Fig. 5, most of the methods are still roughly on the same level, but random removal as well as entropy-based removal start to deteriorate. When the removal percentage is increased to 75%


Figure 4. 25% of APs removed in the training phase. All selection criteria included.



TABLE II RMSE FOR DIFFERENT DATASETS, WITH ALL SELECTION CRITERIA IN THE TRAINING PHASE. AP-LIMIT = 7.

Dataset Univ. building 1.1 (60.5%) Univ. building 1.2 (62.1%) Univ. building 2 (39.4%)

No selection 6.1085

MIMO

Entropy

Variance

Maximum RSSI 5.5177

Number-of-points 6.1085

Random

6.1085

Average RSSI 13.5015

5.6628

7.4568

19.8778

10.0514

19.5186

19.5716

19.1915

19.1602

19.7903

10.6056

9.9060

10.6378

9.8979

10.4976

10.4803

11.1740

9.7505

12.2797


7.2702

Supermarket (22.2%) Shopping mall 1 (22.0%) Shopping mall 2 (17.6%)

15.7954

15.7406

15.8770

16.0197

15.6823

15.7055

16.5260

16.3920

25.7291

19.5709

20.1709

26.5166

26.5468

26.5468

26.9224

27.5973

11.3585

10.5444

11.8177

11.9872

12.5128

12.3553

12.5800

11.4139

VI. CONCLUSION

In this paper, we have shown, via an extensive measurement campaign, that it is possible to remove APs in the training phase up to certain limits. We examined several different removal criteria and noticed that MIMO removal seemed to offer the best results among the studied methods. MIMO APs, or other APs with several MAC-addresses due to multiple SSID MAC support, provide redundant information, since based on our examinations the measured powers for the MAC-addresses coming from the same location are almost identical. The other studied criteria do not take into account the similarity between APs; instead, the removals are based on other measures, such as the maximum RSSI or the number of points where the AP is heard. These values are naturally equivalent for all MAC-addresses coming from the same location, and thus MIMO removal is the only criterion that removes most of the repetitive APs. Of course, there may still be some APs with only one MAC-address left in the data that are also unnecessary from the positioning point of view. However, such APs may be further limited in the estimation phase. Indeed, we have examined two different selection criteria in the estimation phase and have shown that the selection based on the RSSI of the current user measurement gives better results in terms of positioning accuracy. We have also shown that limiting the number of APs in the estimation phase, when the RSSI-based selection criterion is used, may increase the positioning accuracy even when the APs are already limited in the training phase.

ACKNOWLEDGMENTS

The research leading to these results has received funding from the Nokia Corporation, the Academy of Finland and the Tampere Doctoral Programme in Information Science and Engineering (TISE), which are gratefully acknowledged. The Authors would like to express their thanks to Jari Syrjärinne, Lauri Wirola, Mikko Blomqvist and Tommi Laine for their invaluable comments regarding our results.

REFERENCES

[1] S.-H. Fang and T.-N. Lin, “Accurate Indoor Location Estimation by Incorporating the Importance of Access Points in Wireless Local Area Networks,” 2010 IEEE Global Telecommunications Conference (GLOBECOM 2010), pp. 1–5, Dec. 2010.
[2] T. S. Rappaport, J. H. Reed, and D. Woerner, “Position location using wireless communications on highways of the future,” IEEE Communications Magazine, vol. 34, no. 10, pp. 33–41, 1996.
[3] S. Tekinay, “Wireless geolocation systems and services,” IEEE Communications Magazine, vol. 36, no. 4, pp. 28–28, 1998.
[4] A. Kushki, K. Plataniotis, and A. Venetsanopoulos, “Sensor selection for mitigation of RSS-based attacks in wireless local area network positioning,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 2065–2068, 2008.
[5] M. Youssef, A. Agrawala, and A. U. Shankar, “WLAN location determination via clustering and probability distributions,” in Pervasive Computing and Communications, pp. 143–150, 2003.
[6] S. Mazuelas, A. Bahillo, R. M. Lorenzo, P. Fernandez, F. A. Lago, E. Garcia, J. Blas, and E. J. Abril, “Robust Indoor Positioning Provided by Real-Time RSSI Values in Unmodified WLAN Networks,” IEEE Journal of Selected Topics in Signal Processing, vol. 3(5), Oct. 2009.
[7] S. L. Lau, Y. Xu, and K. David, “Novel Indoor Localisation using an Unsupervised Wi-Fi Signal Clustering Method,” Future Network and Mobile Summit 2011, Warsaw, Poland, Jun. 2011.
[8] R. W. Yeung, A First Course in Information Theory. Kluwer Academic/Plenum Publishers, 2002.