Device-Free Passive Identity Identification via WiFi Signals - MDPI

0 downloads 0 Views 3MB Size Report
Nov 2, 2017 - utilizing human's gait based on Channel State Information (CSI) of WiFi signals. ...... New York, NY, USA, 17–19 October 2016; pp. 1–8. 23.
sensors Article

Device-Free Passive Identity Identification via WiFi Signals Jiguang Lv

ID

, Wu Yang * and Dapeng Man

Information Security Research Center, Harbin Engineering University, Harbin 150001, China; [email protected] (J.L.); [email protected] (D.M.) * Correspondence: [email protected]; Tel.: +86-451-8258-9638 Received: 26 September 2017; Accepted: 27 October 2017; Published: 2 November 2017

Abstract: Device-free passive identity identification attracts much attention in recent years, and it is a representative application in sensorless sensing. It can be used in many applications such as intrusion detection and smart building. Previous studies show the sensing potential of WiFi signals in a device-free passive manner. It is confirmed that human’s gait is unique from each other similar to fingerprint and iris. However, the identification accuracy of existing approaches is not satisfactory in practice. In this paper, we present Wii, a device-free WiFi-based Identity Identification approach utilizing human’s gait based on Channel State Information (CSI) of WiFi signals. Principle Component Analysis (PCA) and low pass filter are applied to remove the noises in the signals. We then extract several entities’ gait features from both time and frequency domain, and select the most effective features according to information gain. Based on these features, Wii realizes stranger recognition through Gaussian Mixture Model (GMM) and identity identification through a Support Vector Machine (SVM) with Radial Basis Function (RBF) kernel. It is implemented using commercial WiFi devices and evaluated on a dataset with more than 1500 gait instances collected from eight subjects walking in a room. The results indicate that Wii can effectively recognize strangers and can achieves high identification accuracy with low computational cost. As a result, Wii has the potential to work in typical home security systems. Keywords: WiFi; channel state information; human gait; human identification

1. Introduction Identity identification has been researched for many years and there have been several methods for human identification such as fingerprint-based [1], face recognition-based [2] and iris-based [3] methods. It is widely used in security systems as these biological characteristics are unique among people and can provide high identification accuracy. The characteristics can be utilized in the security systems to make sure whether someone has access to a certain room. However, most identity identification methods either need wearable devices or participating actively in the identification process, which hinders their applicability in some specific scenarios such as intrusion detection. WLAN-based device-free passive sensing attracts much attention in recent years because it does not require the user to be equipped with any sensing devices or behave actively in the sensing process [4]. WiFi has been widely deployed in our daily life that it can provide Internet access for a large number of devices. Numerous researchers find that WiFi signal can be utilized in device-free sensing as WiFi devices can not only be used for communication, but also act as generalized sensors. The movement of human has an impact on signal transmission, and different people’s gait may cause different multipath to signal propagation. It is confirmed that human’s gait is different from each other and it can be seen as his unique characteristic. This motivates our research on WLAN-based device-free identity identification. Compared to traditional security systems, which are based on video

Sensors 2017, 17, 2520; doi:10.3390/s17112520

www.mdpi.com/journal/sensors

Sensors 2017, 17, 2520

2 of 17

cameras or infrared equipment, WLAN-based approaches do not have the disadvantages of privacy leakage [5] and high cost. There have been several pioneering studies on WLAN-based device-free identity identification [6–8]. These approaches leverage Channel State Information (CSI) in physical layer of wireless networks to model human’s gait. However, there are still many challenges in WiFi-based human identification. One of the biggest challenges is how to extract effective gait patterns using CSI dynamics. The received CSI waveform contains the signal reflections of the whole body. As a result, it is difficult to divide the CSI time series into certain segments according to walking steps. In addition, CSI feature extraction is also a challenging task for human identification. In other words, we should find proper features to characterize human’s gait. The common features such as signal variance, median value, and mean value are not effective because they can be easily influenced by environmental changes. Thus, there is a huge gap between the extracted feature and the exact gait character. To deal with these challenges, in this paper, we present Wii, a device-free WiFi-based Identity Identification approach. It is composed of four main modules, which are pre-processing, step segmentation, feature extraction and classification. Classification contains two sub-modules, stranger recognition and identity identification. First, PCA and low pass filter are utilized to eliminate the useless signal components and high frequency noise. Then, walking steps are automatically segmented by Continuous Wavelet Transform (CWT) and wavelet variance. Thirdly, several time and frequency domain features are extracted before a few of them are selected according to information gain to represent human’s gait. Finally, a Gaussian Mixture Model (GMM) is utilized to recognize strangers and a Support Vector Machine (SVM) with Radial Basis Function (RBF) kernel acts as the classifier to identify people. We implement a prototype system in a 5 m × 4 m meeting room with eight volunteers for evaluation and compare the performance with WiWho and FreeSense. The results show that the stranger recognition rate can achieve over 91% and the identity identification accuracy can achieve 90.9% to 98.7% when the size of the candidate user set changes from 8 to 2, which is feasible in common indoor scenarios. In summary, the main contributions of the paper are as follows.



• • •

We propose Wii, a novel device-free WiFi-based identity identification approach, which is capable of recognizing strangers who are outside the training set and has a higher human identification accuracy compared to existing methods. We use PCA and low pass filter to remove the useless components in the WiFi signals and reduce the computational complexity at the same time. We combine CWT and wavelet variance to segment CSI waveforms into single walking steps. It is based on the observation that the wavelet coefficients show obvious periodicity. We extract and select gait features from both time and frequency domain according to information gain that can better represents human’s walking patterns and reduce computational cost. As a result, the identification accuracy becomes higher.

The rest of this paper is organized as follows. Section 2 presents the related work in human identification and device-free sensing. The basic background knowledge of CSI of WiFi network is introduced in Section 3. In Section 4, we present the detail design of Wii, and experimental evaluations are presented in Section 5. Then we discuss the potentials and limitations of Wii in Section 6. Finally, we conclude our work in Section 7. 2. Related Work A broad range of identity identification approaches have been proposed these years that can satisfy different scenarios. Most identity identification approaches leverage biometric characteristics such as fingerprint, iris and even gait. Iris is widely used in user authentication as it is unique among different people

Sensors 2017, 17, 2520

3 of 17

and very stable across different time. An iris recognition system is proposed in [9] including two modules, localizing iris and iris pattern recognition. A digital camera is used to capture images, from which iris is extracted. The iris is then reconstructed into rectangle format and iris pattern is recognized. Park et al. propose an iris recognition approach based on score level fusion, in which two Gabor wavelet filters and SVM are used [3]. Daugman’s Rubber Sheet Model is proposed to extract features from iris including freckles, coronas and stripes [10]. Fingerprint is also widely used to identify a person, as it has high uniqueness. In fingerprint-based user authentication, some special points called minutiae points are extracted from fingerprint images. The ridge thinning algorithm is put forward to extract minutiae features from the fingerprint [11]. Although the recognition accuracy of the above techniques is relative high, they all require specific infrastructure, which limits their applicability. Recently, many researchers pay their attention on passive recognition schemes using wireless sensing, especially WLAN-based frameworks. There are many applications in WLAN-based wireless sensing as it does not require the user to carry any devices and it can even only use WiFi infrastructure to achieve high accuracy sensing such as human detection, activity recognition and identity recognition. Human detection and crowd counting are the fundamental processes of indoor localization and identity recognition. Researchers first implement human detection systems utilizing received signal strength (RSS), as it is easy to obtain from many kinds of wireless devices. Human presence can affect the propagation of wireless signal and hence received signal strength varies as people move [12–15]. However, RSS is a coarse-grained measurement that it has weak stability due to severe multipath effect in indoor scenario. More recently, researchers find channel state information has better properties in physical layer. As illustrated in [16], CSI is a more fine-grained measurement of wireless signal compared with RSS as it is a subcarrier level measurement and contains both amplitude and phase, and it can be extracted from commodity devices with slight firmware modification. As a result, CSI-based passive human detection approaches attract much attention. Most CSI-based methods treat CSI as extended RSS, such as [17]; the variance of CSI amplitude is extracted as the feature to detect human motion. PADS extracts features from both amplitude and phase of CSI to make human detection system more sensitive and it can detect human of different moving speeds [18]. A proper feature extracted only from amplitude of CSI can also be sensitive enough to detect human even if the moving speed is very slow [19]. Wireless signals can be used to provide a rough estimation of the number of people in the monitoring area. Different numbers of people usually generate different signal measurements due to multipath effect in indoor scenarios [20–23]. Device-free passive sensing can also be used in activity recognition. E-eyes can recognize a few daily activities such as sleeping and cooking using CSI histograms as features [24]. CARM is a model-based activity recognition system [25]. It models the relationship between CSI and human activities. WiDraw can track the activity of a person’s hand according to the Angle of Arrival (AoA) of WiFi signal [26]. Smokey can detect if a person I s smoking using WiFi signal even under NLOS environment [27]. WiFinger is presented to recognize the minute variation of signal when the finger moves [28]. It is confirmed that gait is a unique feature of a person, as a result of which it can be used to identify a subject [29]. WiFi-ID extracts representative features of people’s walking style and shows for the first time that WiFi signal can be used to identify a person [8]. WiWho is a framework using WiFi signal to identify a person’s step and walking gait [6]. It shows that identity can be identified based on step and walk analysis. WifiU profiles human movement in which spectrograms are generated from CSI [30]. FreeSense combines PCA, Discrete Wavelet Transform (DWT) and Dynamic Time Warping (DTW) techniques to implement human identification [7]. Different from the approaches above, we for the first time calculate the periodicity from the CSI of WiFi signal using Continuous Wavelet Transform (CWT), and extract time and frequency domain features from a single step as well as a few steps. As a result, the identification accuracy of Wii can be higher than the above approaches. Furthermore, Wii can recognize strangers from a group of people.

Sensors 2017, 17, 2520

4 of 17

3. Preliminary Wii utilizes CSI in the physical layer of WiFi signal. Consequently, in this section, we give a brief introduction of CSI. In a common indoor scenario, the wireless signal propagates through multiple paths to the receiver. There may exist LOS path and several reflection paths, and the received signal is the superposition of signals from different paths. In OFDM system, the wireless channel in the time domain can be descripted by a Channel Impulse Response (CIR) to distinguish different paths. Under the assumption of time-invariant, CIR can be expressed as: N

h(τ ) =

∑ αi e− jθi δ(τ − τi ) + n(τ ),

(1)

i =1

where αi , θi , and τi denote the amplitude, phase and time delay of the signal from ith path, respectively; N is the total number of paths; n(τ ) is complex Gaussian white noise; and δ(τ ) is the Dirac delta function. However, precise CIR cannot be extracted from ordinary commodity infrastructures which is inapplicable in home environment. To overcome this limitation, in the frequency domain, Channel Frequency Response (CFR) can model the transmitting channel, which is composed of amplitude-frequency response and phase-frequency response. Given infinite bandwidth, CIR is equivalent to CFR, and CFR can be derived by taking the Fast Fourier Transform (FFT) of CIR: H = FFT (h(τ )).

(2)

The Linux kernel driver of Intel 5300 NIC can be slightly modified to make it convenient to obtain CFRs for N = 30 subcarriers in the format of CSI: H = [ H ( f 1 ), H ( f 2 ), . . . , H ( f N )].

(3)

The amplitude and phase of a subcarrier can be described by CSI: H ( f k ) = k H ( f k )ke j sin(∠ H ) ,

(4)

where H ( f k ) is the CSI of the subcarrier of which the central frequency is f k , and ∠ H denotes its phase. Therefore a group of CSIs H ( f k ), (k = 1, . . . , K), reveals K sampled CFRs at the granularity of subcarrier level. 4. System Design In this section, the overview of Wii is presented followed by the design details of each module. 4.1. System Overview The framework of Wii is provided in Figure 1; it has four main components: pre-processing, step segmentation, feature extraction and classification. In the classification procedure, it contains stranger recognition and identity identification before post-processing. Noise removing is conducted in preprocessing as there are more than one kind of noise included in the wireless signal. Step segmentation process can divide continuous CSI data into discrete segments according to walking steps in order to extract per-step features. In feature extraction module, several time and frequency domain features are extracted and the features that generate the largest information gain are selected. Finally, stranger is recognized and identity is identified in classification module. The system works in a typical indoor environment with a pair of commodity WiFi devices deployed. A wireless router that supports IEEE 802.11n protocol acts as the TX and a laptop equipped with some certain model of wireless network interface card (such as Atheros 9390 and Intel 5300) acts

Sensors 2017, 17, 2520

5 of 17

as the RX as well as the server. They keep transmitting data to collect CSIs when a person walks in the monitoring area. The collected CSIs are stored in the laptop, and will be processed in the Sensors 2017, 17, x 5 of 18 subsequent procedures.

CSI

Pre-processing

Step Segmentation

Feature Extraction

Stranger Recognition Post-processing Identity Identification Classification

Laptop

Wireless router

Figure framework. Figure1. 1. System System framework.

Pre-Processing 4.2.4.2. Pre-Processing During ourexperiments, experiments,we we find find that that the the number number of same as as thethe During our of collected collectedCSIs CSIsis isnot notthethe same number of transmitted ICMP packets we set. Thus, we first conduct the linear interpolation in the number of transmitted ICMP packets we set. Thus, we first conduct the linear interpolation in CSIs to calibrate the sampling frequency. As a result, collected have aCSIs unified theraw rawcollected collected CSIs to calibrate the sampling frequency. As the a result, theCSIs collected have sampling frequency that we can then extract frequency domain features after interpolation. a unified sampling frequency that we can then extract frequency domain features after interpolation. According to the theory of Orthogonal Frequency Division Multiplexing (OFDM), the subcarriers of According to the theory of Orthogonal Frequency Division Multiplexing (OFDM), the subcarriers of the channel are independent and have different frequency which causes frequency selective fading the channel are independent and have different frequency which causes frequency selective fading [31]. [31]. However, actually, the CSI data of adjacent subcarriers have some relationships. For each However, actually, the CSI data of adjacent subcarriers have some relationships. For each ICMP packet, ICMP packet, a CSI matrix of 3 ´ 30 can be extracted. Consequently, we use PCA to obtain the a CSI matrix of 3 × 30 can be extracted. Consequently, we use PCA to obtain the independent data. independent data. PCA can automatically combine the correlated CSI streams to extract more PCA can automatically combine theCSI correlated CSI streamsinto to extract more representative components. representative components. The matrix is reshaped a 1 ´ 90 vector. It consists of a n ´ 90 Thematrix CSI matrix is reshaped into a 1 × 90 vector. It consists of a n × 90 matrix for n collected CSIofvectors for n collected CSI vectors in a time interval. We calculate the principle components the in amatrix time interval. the principle components ofthe thefirst matrix and find that, in is most cases, and find We that,calculate in most cases, the contribution rate of principle component higher thethan contribution than 85%, thus we use data. the first principle 85%, thusrate we of usethe thefirst firstprinciple principlecomponent component is ashigher the representation of walking component as thethere representation walking data.reserved in the first principle component that have However, are severalof kinds of noises negative effect on are theseveral identification The oneinthat most significant However, there kinds ofaccuracy. noises reserved the has firstthe principle componentnegative that have influence is high frequency noise due to other movements. Thethe signal reflected by negative effect on the identification accuracy. The one that has mostcomponents significant that negative influence torso, arms andnoise legs are in identifying identity. The frequency of these components is lower is high frequency dueuseful to other movements. The signal components that reflected by torso, arms than 10 Hz according to the intuition and our observation. Consequently, we utilize a low pass filter and legs are useful in identifying identity. The frequency of these components is lower than 10 Hz with thetocutoff frequency 10 Hz to eliminate the high frequency noise in the walking data.the After according the intuition andofour observation. Consequently, we utilize a low pass filter with cutoff pre-processing, most noise in the raw CSIs has been removed. frequency of 10 Hz to eliminate the high frequency noise in the walking data. After pre-processing,

most noise in the raw CSIs has been removed. 4.3. Step Segmentation

4.3. StepOur Segmentation identity identification system is based on human’s gait information, so it is necessary to find the start and end point of each step from the collected data to divide the data sequence into Our identity identification system is based on human’s gait information, so it is necessary to find segments according to their walking steps. the start and end point of each step from the collected data to divide the data sequence into segments Generally, a person’s walking speed is constant in a short time and therefore, there is an according to their walking steps. obvious periodicity during the walking period. However, it is challenging using wireless signal to Generally, a person’s walking speed is constant in a short time and therefore, there is an obvious determine the periodicity of walking. Unlike the device-based methods such as inertial periodicity during the walking period. However, it is challenging using to determine sensor-based systems, in which the sensor is attached to human body andwireless it is easysignal to determine the theperiodicity. periodicityWe of walking. Unlike the device-based methods such as inertial sensor-based systems, cannot observe an obvious periodicity directly from the CSI waveform. As a result, we need to dig deep down into different frequency components using time−frequency analysis techniques to find the periodicity pattern. Fortunately, the combination of Continuous Wavelet

Sensors 2017, 17, 2520

6 of 17

in which the sensor is attached to human body and it is easy to determine the periodicity. We cannot observe obvious periodicity directly from the CSI waveform. As a result, we need to dig deep Sensorsan 2017, 17, x 6 ofdown 18 into different frequency components using time-frequency analysis techniques to find the periodicity Transform (CWT) and wavelet variance can effectively discover periodicity in a and multi-scale manner pattern. Fortunately, the combination of Continuous Wavelet Transform (CWT) wavelet variance the waveform. canfrom effectively discover periodicity in a multi-scale manner from the waveform. First, calculatethe thewavelet waveletcoefficients coefficients of of the waveform First, wewe calculate the first first PCA PCAcomponent componentofofthe theCSI CSI waveform after low-pass filtering (cpl) in multiple scales using CWT. According to our experimental results of of after low-pass filtering (cpl) in multiple scales using CWT. According to our experimental results trying different waveletfunctions, functions,we wechoose choose DB6 DB6 (Daubechies) (Daubechies) wavelet AsAs shown in Figure 2, 2, trying different wavelet wavelet[32]. [32]. shown in Figure we can see clearly the periodicity of the wavelet coefficients exists under some scales. However, the we can see clearly the periodicity of the wavelet coefficients exists under some scales. However, exact periodicity cannot be confirmed intuitively from the wavelet coefficients. The wavelet the exact periodicity cannot be confirmed intuitively from the wavelet coefficients. The wavelet variance reflects the distribution of the power of the wavelet coefficients in different scales, and it variance reflects the distribution of the power of the wavelet coefficients in different scales, and it can can be used to estimate the main periodicity of a time series [33]. Thus, we then calculate the be used to estimate the main periodicity of a time series [33]. Thus, we then calculate the wavelet wavelet variance as Equation (5). variance as Equation (5). Z +∞+¥ 22 a ) = ò W Wf((a, a,b ) db (5) (5) var(var( a) = b ) db,, f -¥ −∞

2 W (a,b 2) is the power of the wavelet coefficient of scale a at time b . where where W f ( a,f b) is the power of the wavelet coefficient of scale a at time b.

Figure 2. Wavelet coefficient of multiple scales of channel state information. Figure 2. Wavelet coefficient of multiple scales of channel state information.

Second, as indicated in Figure 3, we can see that the maximum wavelet variance locates at the Second, as indicated in Figure 3, we can see that the maximum wavelet variance locates at the scale of about 190, which means the wavelet coefficients have the most obvious periodicity at this scale. scale of about 190, which means the wavelet coefficients have the most obvious periodicity at this Based on Based the observation of our experiments, this periodicity is mainly caused by the movement scale. on the observation of our experiments, this periodicity is mainly caused by the of torso, as the torso affects thetorso signal transmission significantly. movement of torso, as the affects the signalmost transmission most significantly. It is noteworthy that the location of the maximum wavelet variance varies among different people and even different time of the same person as the walking speed is not the same among different people and not constant during walking. Fortunately, Wii is a window-based approach, and it can calculate the periodicity dynamically when people walk at the definition of a certain time window. As a result, Wii has the ability to adapt to different walking speeds of a person. Third, we can use this scale to extract the corresponding cpl that is relative to walking. When collecting CSI data, we also use a digital camera to take video records of walking. Then, we compare the waveforms of cpl of this scale (cpl s ) with video records and find that the waveform between two zero crossing points can represent one step as shown in Figure 4. However, some noises in the waveform exists; that is, the time interval between two adjacent zero crossing points is too short or too long. Thus, we set the lower bound tlb as 0.5 s and upper bound tub as 1.2 s intuitively as the time interval of most steps are between these two bounds.

Figure 3. Wavelet variance.

Figure 2. Wavelet coefficient of multiple scales of channel state information.

Second, as indicated in Figure 3, we can see that the maximum wavelet variance locates at the scale of about 190, which means the wavelet coefficients have the most obvious periodicity at this Sensors 2017,Based 17, 2520on the observation of our experiments, this periodicity is mainly caused by 7the of 17 scale. movement of torso, as the torso affects the signal transmission most significantly.

Sensors 2017, 17, x

7 of 18

It is noteworthy that the location of the maximum wavelet variance varies among different people and even different time of the same person as the walking speed is not the same among different people and not constant during walking. Fortunately, Wii is a window-based approach, and it can calculate the periodicity dynamically when people walk at the definition of a certain time window. As a result, Wii has the ability to adapt to different walking speeds of a person. Third, we can use this scale to extract the corresponding cpl that is relative to walking. When collecting CSI data, we also use a digital camera to take video records of walking. Then, we compare the waveforms of cpl of this scale ( cpl s ) with video records and find that the waveform between two zero crossing points can represent one step as shown in Figure 4. However, some noises in the waveform exists; that is, the time interval between two adjacent zero crossing points is too short or too long. Thus, we set the lower bound tlb3.3.as 0.5 s and upper bound tub as 1.2 s intuitively as the Figure Wavelet variance. Figure Wavelet variance. time interval of most steps are between these two bounds. Finally, thes cpl can be divided into step segments according to locations the locations of these points Finally, the cpl cans be divided into step segments according to the of these points with

the time tlb and tub with interval the time between interval between tlb . and tub .

Figure Wavelet coefficientwaveform waveformof of steps. steps. Figure 4. 4. Wavelet coefficient

4.4. Feature Extraction

4.4. Feature Extraction

During our experiments, we have extracted many features such as the maximum, minimum,

During experiments, we features have extracted many features suchthat as the minimum, mean andour some other statistic of the waveform, and find bothmaximum, time and frequency mean and some other statistic features of the waveform, and find that both time and frequency domain domain features can be used to represent the characteristics of human walking. In addition, features can be used to represent the characteristics of human walking. In addition, per-step features per-step features and walking features can be both extracted in identity identification. However, and these walking features can be extracted identitywindow. identification. However, these should features should beboth calculated in in a certain Consequently, we use features two types of be calculated a certain Consequently, two types windows, step windows, in step windowwindow. and walking window. Awe stepuse window is theoftime length of onewindow step andand a walking window severalisstep The of candidate features in this paper are walking window. Acontains step window thewindows. time length one step and a calculated walking window contains several step windows. candidate features this areoflisted listed in Table 1 withThe Equations (6)−(16). In thecalculated equations,in n is thepaper number cpl s in in Table a step,1mwith is s s s scpl in a step, m is the s of steps Equations (6)–(16). In the equations, n is the number of number the number of steps in a walking window, cpli is ith cpl in the step, and cpl j is the cpl of jth in a walking window, cplis is ith cpl s ins the step, and cpl s j is the cpl s of jth step. In Equation (16), step. In Equation (16), we divide cpl values in a step into 10 segments and p is the ratio of the we divide cpl s values in a step into 10 segments and pk is the ratio of the number ofk cpl s of kth segment of kth to the total numberHowever, of cpl s intoo themany walking window. number of cpl s of to the total number cpl ssegment in the walking window. features willHowever, lead to a too high many features will lead to a high computational complexity and even overfitting that the computational complexity and even overfitting that the identification accuracy decreases. To select the accuracy To select the most effective features, we calculate normalized mostidentification effective features, wedecreases. calculate the normalized information gain of each feature the to quantify their information gain of each feature to quantify their impact on classification as shown in Figure 5, and impact on classification as shown in Figure 5, and select the Top 5 features [34]. A higher information select the Top 5 features [34]. A higher information gain value means a higher capability of gain value means a higher capability of representing the characteristic. As a result, we choose the representing the characteristic. As a result, we choose the maximum (max), minimum (min) and standard deviation (std) of cpl s as the time-domain features in step windows, average step time

Sensors 2017, 17, 2520

8 of 17

Sensors 2017, 17, x

8 of 18

(ast) as the time-domain walking windows, and(std) entropy assthe frequency-domain feature in maximum (max), minimumfeature (min) in and standard deviation of cpl as the time-domain features in step windows. step windows, average step time (ast) as the time-domain feature in walking windows, and entropy as the frequency-domain feature in step windows. Table 1. Candidate features ID

Feature Name Description Table 1. Candidate features.

1

ID 2 1 2 3

3

max = max(cpl s ) (6) min = Description min(cpl s ) (7)

Maximum (max)

Feature Name Minimum (min)

n max = (cpl s ) (6) 1 max mean = å cpl s i s (8) min = (cpl ) (7) n min i =1

Maximum (max) Mean

Minimum (min)

4

n median = median( cpl s ) (9) mean = n1 ∑ cpl s i (8) i =1 1 n s std = median cplmedian mean (= )2s ) (10) å (cpl (9) i nsi =1

Median

Mean

4 5

StandardMedian deviation (std)

5 6

Standard deviation Skewness (sk) (std)

6 7

n

n

i =1

i =1 i

2 1 s sk = ∑ (s cpl - mean )4 ) (11) i − mean ån(cpl

Skewness (sk) 7

Kurtosis (ku)

ku =

(12)

n

4 s 4 cpl std (n∑- (1) i − mean ) (12) ku = mi=1 (n−1)std4 1 ast = å mlen(cpl s j ) (13) Average step time (ast) ast m = jm1=1 ∑ len(cpl s j ) (13) Average step time (ast) j =1 m 1 (len(cpl s j ) - ast )2 (14) vst = å m Variance of step time (vst) vst m = jm1=1 ∑ (len(cpl s j ) − ast)2 (14) Variance of step time (vst)

Kurtosis (ku)

8

8 9

9 10

n 1 n1∑ s( cpl s − mean )2 (10) skstd = = ån(cpl - mean )2 (11) i n i =1 i=1i

j =1

10

n

Energy

energy = å (ncplis )2 s (15) energy = (cpli )2 (15) i =1 ∑

Entropy

entropy = -å p10k log pk (16) entropy =k − pk log pk (16) =1 ∑

Energy

10

11

11

Entropy

i =1

k =1

0.16

Normalized Information Gain

0.14 0.12 0.1 0.08 0.06 0.04 0.02 0

1

2

3

4

5

6

7

8

9

10

11

Features

Figure 5.5.Normalized of the thecandidate candidatefeatures. features. Figure Normalizedinformation information gain gain of

In the experiments, weset setthe thewalking walking window window length step is is In the experiments, we length from from22toto10, 10,while whilethe thewindow window step 1 walking step, and the evaluation results will be illustrated in Section 5. 1 walking step, and the evaluation results will be illustrated in Section 5. Training and Classification 4.5.4.5 Training and Classification a person’s walkingpath pathhas hasaagreat great impact impact on on the because of of As As a person’s walking the features featuresofofwireless wirelesssignal signal because multipath effect, usually the identity identification system is placed near the entrance of the room or multipath effect, usually the identity identification system is placed near the entrance of the room

or a corridor that people in the area walk in a relative fixed path. Thus, scenarios of training and classification are the same, which ensures the effectiveness of training samples.

Sensors 2017, 17, 2520

9 of 17

After extracting the per-step and walking features of human walking, we integrate the features of several steps to build people’s gait profiles. Wii first utilizes Gaussian Mixture Model (GMM) to distinguish between authenticated people and strangers and then uses the multi-class SVM with Radial Basis Function (RBF) kernel as the classifier to identify people based on the extracted features. The inputs of GMM are the vectors of selected feature values of strangers and authenticated people, and the outputs of GMM are probabilities whether the steps belong to a stranger or an authenticated person. The inputs of SVM are the vectors of selected feature values of the authenticated people, and the outputs of SVM are probabilities that each step’s owner. It is difficult for a classifier to recognize a subject outside the training set. As a result, we collect several samples of strangers and construct a stranger profile. The samples are constructed by the five features with the highest information gain. The authenticated people can be seen as a Gaussian model while strangers can be seen as another Gaussian model. Consequently, we use GMM to recognize strangers to authenticate people in the monitoring area. The strangers used in the training data are only some samples of the assumed strangers. However, when a genuine stranger comes, his similarity is higher to the samples of strangers than the authenticated people. As a result, a genuine stranger can also be recognized. After the authenticated people are recognized, the identity will be identified. In the identification phase, Wii extracts the same features as the training phase and uses the SVM to identify people. We use the LIBSVM toolbox proposed by Chih-Jen Lin [35] which is widely used in machine learning. In the end of classification, we have a post-processing procedure, which can further improve the identification accuracy. In the post-processing, we assume that the successive steps in a walking window belong to the same person. It is unlikely to happen in real scenarios that Person A suddenly disappears and Person B shows up to walk for a single step or two steps instead of Person A, and Person A appears again to continue walking. For example, if the identification result sequence in a walking window is AABAA, we can replace B in the middle with A in the identification result in this situation. The only cost of post-processing is the delay in presenting the identification result, but the recognition can be more accurate. 5. Evaluation 5.1. Experiment Setup To evaluate the performance of Wii, we conduct real experiments using holdout cross validation in a typical meeting room with the size of 5 m × 4 m as shown in Figure 6. A part of data is randomly selected as the training data, while the other part is used as test data. This procedure is repeated five times in each evaluation. Each result is the mean value of the five validations. The meeting room is occupied with a meeting table and chairs. We use a TP-Link 802.11n wireless router with a single antenna as the transmitter and a Lenovo laptop equipped with a three-antenna Intel WiFi Link 5300 (iwl 5300) NIC running Ubuntu 10.04 OS as the receiver. Specifically, the antennas of the laptop are modified using three 6 dbi gain antennas, and the firmware of the NIC is modified to extract CSIs from data packets using the CSI tools. The transmitter and receiver are placed about 0.75 m above the floor and 3 m away from each other. During data collection period, the laptop is configured to continuously send ICMP packets to the wireless router and it will receive corresponding response packets. To collect CSI data to obtain precise information of people’s walking activity, the sampling rate should be as high as possible. However, the processing ability of the wireless router is limited, we adjust the sampling rate to be 200 Hz in consequence. The NIC records a CSI sample with CFR of 3 × 30 subcarriers from each respond packet. We recruit eight healthy volunteers in our experiment and the basic information of the volunteers is shown in Table 2. The volunteers are asked to walk naturally on the path that crosses the LOS of the transceivers without the constraints of walking speed or style. The data collection takes five days and

Sensors 2017, 17, 2520

10 of 17

Sensors 2017, 17, x

10 of 18

collectionwalks takes five days anyone and a volunteer walks without anyone else inthe thenoise. meeting room toextraction, reduce a volunteer without else in the meeting room to reduce After step step extraction, we get about 150 steps for each volunteer. wethe getnoise. aboutAfter 150 steps for each volunteer.

Tx Walking Path Rx

Figure 6. 6. Experimental Experimental scenario. Figure scenario. Table2.2.Basic Basicinformation information of Table of volunteers volunteers.

Volunteers Volunteers 1 1 2 2 33 44 55 66 77 8 8

Gender Gender male male male male male male male male male male female female male male female female

Height Height(cm) (cm) 174 174 175 175 175 175 170 170 171 171 164 164 183 183 161 161

Weight Weight(kg) (kg) 63 63 70 70 66 66 62 62 60 60 50 50 74 74 50 50

Age Age 29

29 26 26 27 27 25 25 25 24 24 24 24

24

5.2.5.2. Performance Evaluation Performance Evaluation 5.2.1. Stranger Recognition 5.2.1. Stranger Recognition Stranger recognition identificationininsecurity securitysystem, system, a crucial Stranger recognitionisisthe thefirst firststep step of of human human identification so so it isit aiscrucial procedure Wii. Beforeidentifying identifyingthe the person’s person’s identity, whether thethe person procedure of of Wii. Before identity,Wii Wiihas hastotorecognize recognize whether person entering monitoringarea areaisisan anauthenticated authenticated person only five walking steps, entering thethe monitoring personor oraastranger. stranger.After After only five walking steps, Wii can determine whether the person is a stranger. When an authenticated person is recognized, Wii can determine whether the person is a stranger. When an authenticated person is recognized, identity identification procedure is activated. The accuracy of stranger recognition thethe identity identification procedure is activated. The accuracy of stranger recognition under under different different number of strangers in the training set is shown in Figure 7. The strangers are randomly number of strangers in the training set is shown in Figure 7. The strangers are randomly selected selected from the volunteers and modeled using the same features as the authenticated people. The from the volunteers and modeled using the same features as the authenticated people. The selection selection is performed several times and the final result is the average accuracy of different is performed several times and the final result is the average accuracy of different combination. combination. As can be seen, the overall trend of the accuracy decreases as the number of strangers As can be seen, the overall trend of the accuracy decreases as the number of strangers in the training in the training set becomes larger. This is because, as the number of strangers in the training set set becomes larger. This is because, as the number of strangers in the training set increases, the features increases, the features of strangers become more and more generally representative that the features of strangers become more and more generally representative that the features of authenticated people of authenticated people may be submerged in those of strangers. Consequently, the number of may be submerged in those strangers. the number of strangers in the training set strangers in the training set of should not be Consequently, too large. should not be too large.

Sensors 2017, 17,xx Sensors 17, Sensors 2017,2017, 17, 2520

Accuracy Accuracy (%) (%)

1111 ofof 1817 11 of 18

Figure 7. 7. Performance ofof Figure 7.Performance Performance ofstranger strangerrecognition. recognition. Figure stranger recognition.

5.2.2. Evaluation ofStep Step Segmentation 5.2.2. Evaluation of Segmentation 5.2.2. Evaluation of Step Segmentation Step segmentation critical module of gait gait analysis analysis in the theTosystem. system. To validate its its segmentation isis aa critical module in StepStep segmentation is a critical module of gait of analysis in the system. validateTo its validate effectiveness, effectiveness, we we compare compare Wii Wii with with WiWho WiWho fairly fairly using using the the same same features features and and classifier classifier as as that that of of effectiveness, we compare Wii with WiWho fairly using the same features and classifier as that of WiWho as well as WiWhoas aswell wellas asthe theexperimental experimentalsetup. setup.Concretely, Concretely,the thefeatures featuresinclude includethe thefeature featureID IDof of1,1,2,2,5,5,8,8, WiWho the experimental setup. Concretely, the features include the feature ID of 1, 2, 5, 8, and 11 in Table 1 and 11 11 in in Table Table 11 and and the the classifier classifier isis decision decision tree. tree. There There are are eight eight volunteers volunteers in in the the training training set. set. and and the classifier is decision tree. There are eight volunteers in the training set. The size of the training Thesize sizeof ofthe thetraining trainingset setvaries variesfrom from10 10to to50 50steps. steps.The Thesize sizeof ofwalking walkingwindow windowisisfive fivesteps. steps.The The The set varies from 10 to 50 steps. The size of walking window is five steps. The identification accuracy identification accuracy accuracy of of the the two two approaches approaches isis shown shown in in Figure Figure 8.8. ItIt can can be be seen seen that that the the identification of the two approaches is shown Figure 8. rises It canas seen the identification accuracy of both identification accuracy of both bothin approaches rises asbe the sizethat of training training set grows. grows. The The accuracy accuracy of identification accuracy of approaches the size of set of approaches rises as the of training set that grows. The accuracy of Wii grows faster and keeps higher Wii grows grows faster andsize keeps higher than than that of of WiWho WiWho as the the size size of training training set becomes becomes larger. Wii faster and keeps higher as of set larger. ItIt thanindicates that of WiWho thesegmentation size of training set becomes larger. It more indicates that the segmentation indicates thatthe theas step segmentation module of Wii Wii can can extract more effective effective stepstep information and that step module of extract step information and module of Wii can extract more effective step information and has a better performance. has a better performance. has a better performance.

Figure Step segmentation. Figure 8.8.8. Step segmentation. Figure Step segmentation.

5.2.3. Identity Identification with Different Numbersofof ofFeatures Features 5.2.3. Identity Identification with Different Numbers Features 5.2.3. Identity Identification with Different Numbers The number of features features has impact on both thecomputation computationcomplexity complexity and and the identification number of has impact both the computation complexity TheThe number of features has impact onon both the andthe theidentification identification accuracy.We Weselect selectthe thefirst firstnnfeatures featuresthat thathave have the thehighest highestinformation informationgain, gain,and andeight eightvolunteers volunteers accuracy. accuracy. We select the first n features that have the highest information gain, and eight volunteers are are included included in in this this evaluation. evaluation. The The evaluation evaluation result result isis depicted depicted in in Figure Figure 9.9. The The identification identification are included in this evaluation. The evaluation result is depicted in Figure 9. The identification accuracy accuracy first first rises rises with with the the number number of of features. features. However, However, the the accuracy accuracy isis not not changing changing accuracy first rises with the number of features. However, the accuracy is not changing monotonically with the monotonically with with the the number number of of features. features. ItIt starts starts to to decrease decrease when when the the number number of of features features monotonically number of features. It starts to decrease when the number of features becomes larger than 6 because becomeslarger largerthan than66because becausean anexcess excessnumber numberof offeatures featuresleads leadsto tooverfitting overfittingto tothe thesystem. system.As Asaa becomes

an excess number of features leads to overfitting to the system. As a result, the five features with the highest information gain give a good trade-off between computation complexity and accuracy.

Sensors 2017, 17, x

12 of 18

result, the five features with the highest information gain give a good trade-off between 12 of 17 computation complexity and accuracy.

Accuracy (%)

Sensors 2017, 17, 2520

Figure 9. 9.Impact features. Figure Impactofofdifferent differentnumbers numbers of of features.

5.2.4. Identity Identification with Different 5.2.4. Identity Identification with DifferentSizes SizesofofTraining TrainingSets Sets In this section, present performance Wii when thesize sizeofoftraining trainingset setvaries. varies.The Thesize size of In this section, wewe present thethe performance ofof Wii when the of training set from varies10from to 50 for one and subject, eight are subjects are in included in theset. training set varies to 5010 steps forsteps one subject, eightand subjects included the training training set. The size of the walking window is five steps. In addition, to study the advancement of TheSensors size of the walking window is five steps. In addition, to study the advancement of Wii, we also 2017, 17, x 13 of 18 Wii, we also compare the performance with WiWho and FreeSense. The design scenario of WiWho compare the performance with WiWho and FreeSense. The design scenario of WiWho and FreeSense and FreeSense as work Wii that all oforthem home or small are almost the sameare as almost Wii thatthe all same of them in home smallwork officeinenvironment. As aoffice result, environment. As a result, we implement WiWho and FreeSense using the same experimental setup we implement WiWho and FreeSense using the same experimental setup as Wii. According to Figure 11, as Wii. According to Figure 10, the accuracy of Wii keeps increasing from 75.8% to 91.4% as the size the accuracy of Wii keeps increasing from 75.8% to 91.4% as the size of training set grows. As indicated of training set grows. As indicated in the figure, the accuracy of Wii increases faster when the size in the figure, the accuracy of Wii increases faster when the size of training set grows from 10 to 30, of training set grows from 10 to 30, while slows down as it changes to 40 and 50. This means the while slows down as it changes to 40 and 50. This means the performance of Wii becomes stable when performance of Wii becomes stable when the size is 40 or larger. The paired t-test is used to analyze the size is 40 or larger. paired t-test Wii is used themethods. differences theresults results Wii the differences of theThe results between and to theanalyze other two Theof test arebetween shown in andTable the other two methods. The test results are shown in Table 3, where P1 is the possibility of t-test 3, where P1 is the possibility of t-test between Wii and WiWho, while P2 is the possibility of between and WiWho, while P2 isAs the possibility of reject t-test the between Wii and FreeSense. As can t-test Wii between Wii and FreeSense. can be seen, all null hypothesis, which means thatbe seen,the allresults reject the null hypothesis, which means that the results are independent from sample selection. are independent from sample selection. In addition, we use the Kappa’s index to In addition, the Kappa’s index to[36]. evaluate the classification agreement [36]. The Kappa’s index evaluate we the use classification agreement The Kappa’s index of the three methods under different of the three methods different size11. of training is shown in Figure Wii hasto a higher Kappa’s size of training setunder is shown in Figure Wii has aset higher Kappa’s index 10. compared the other two methods, which means has a higher agreement. In one Wiiagreement. performs better index compared to the otherthat twoWii methods, which means that Wii hasword, a higher In onethan word, WiWho and FreeSense. It is because the features selected theoretically contain more information, Wii performs better than WiWho and FreeSense. It is because the features selected theoretically contain Figure 10. Impact theclassifier size of training set. set. and they are more to more the classifier and a smaller training more information, andsuitable they are suitable toofneed the and need a smaller training set. Table 3. P values of the paired t-test of different sizes of training set. 0.95 Size of Training Set (Step) 0.9 P1 0.85 P2

10 0.007 0.007

20 0.007 0.010

30 0.005 0.016

40 0.005 0.014

50 0.005 0.018

0.8 0.75 0.7 0.65 Wii WiWho FreeSense

0.6 0.55

10

20

30

40

50

Size of Training Set (step)

Figure 10.11.Kappa’s sizes of oftraining trainingset. set. Figure Kappa’sindex indexunder under different different sizes

5.2.5. Identity Identification with Different Group Sizes We then evaluate Wii with different group sizes. The size of the group increases from two to eight persons, and the size of training set is 40. We fix the size of training set to 40 steps because the performance rises to a stable level when the size of training set is larger than 30 according to Figure 10. For the cases that the size of the group is 2−7, we randomly select 2−7 persons from the eight volunteers, and Wii is trained before classification.

Sensors 2017, 17, 2520

13 of 17

Table 3. P values of the paired t-test of different sizes of training set.

Sensors 2017, 17, x

Size of Training Set (Step)

10

20

30

40

50

P1 P2

0.007 0.007

0.007 0.010

0.005 0.016

0.005 0.014

0.005 0.018

13 of 18

Figure size of oftraining trainingset. set. Figure11. 10.Impact Impact of of the the size

5.2.5. Identity Identification with0.95Different Group Sizes We then evaluate Wii with 0.9different group sizes. The size of the group increases from two to eight persons, and the size of training set is 40. We fix the size of training set to 40 steps because the 0.85 performance rises to a stable level when the size of training set is larger than 30 according to Figure 11. 0.8 For the cases that the size of the group is 2–7, we randomly select 2–7 persons from the eight volunteers, 0.75 and Wii is trained before classification. As shown in Figure 12, the0.7accuracy of Wii decreases from 98.7% to 90.9% as the group size gets larger. It is because there are similar features among different people, and when the group size 0.65 gets larger, the similar features add more confusion into the system, which makes it more difficult to Wii WiWho 0.6 FreeSense identify a person. The same trend also happens in WiWho and FreeSense, but the accuracy decreases quickly when the group size gets0.55larger than 4.20 Consequently, the 10 30 40 50 comparison result indicates that Wii Sensors 2017, 17, x 14 of 18 Size of Training Set (step) performs more stably when the group size gets larger. Figure 11. Kappa’s index under different sizes of training set.

5.2.5. Identity Identification with Different Group Sizes We then evaluate Wii with different group sizes. The size of the group increases from two to eight persons, and the size of training set is 40. We fix the size of training set to 40 steps because the performance rises to a stable level when the size of training set is larger than 30 according to Figure 10. For the cases that the size of the group is 2−7, we randomly select 2−7 persons from the eight volunteers, and Wii is trained before classification. As shown in Figure 12, the accuracy of Wii decreases from 98.7% to 90.9% as the group size gets larger. It is because there are similar features among different people, and when the group size gets larger, the similar features add more confusion into the system, which makes it more difficult to identify a person. The same trend also happens in WiWho and FreeSense, but the accuracy decreases quickly when the group size gets larger than 4. Consequently, the comparison result indicates that Wii performs more stably when the group size gets larger. Figure 12.12. Performance sizes. Figure Performancewith withdifferent different group group sizes.

5.2.6. Identification Accuracy of Different People The identification result of the eight volunteers in the above experiment is shown in the confusion matrix in Figure 13. It shows the details of the identification of different people. It is obvious that different people have different identification accuracy and it varies from 83% to 95%. According to Table 2, the result that Volunteer 6 and 8 have the highest accuracy as these two volunteers are the only female volunteers in the test group, which indicates that the gait pattern of female has less similarity with male. We can also find that, if two people have similar height and

Sensors 2017, 17, 2520

14 of 17

Figure 12. Performance with different group sizes.

5.2.6. Identification Accuracy of Different People

5.2.6. Identification Accuracy of Different People

The identification result of the eight volunteers in the above experiment is shown in the confusion The identification result of the eight volunteers in the above experiment is shown in the matrix in Figure 13. in It shows the Itdetails thedetails identification of different of people. It ispeople. obvious that confusion matrix Figure 13. showsofthe of the identification different It is different people have different andaccuracy it varies and fromit 83% tofrom 95%.83% According obvious that different people identification have different accuracy identification varies to 95%. to Table 2, the result that Volunteer 6 and 8 have the highest accuracy as these two volunteers are only According to Table 2, the result that Volunteer 6 and 8 have the highest accuracy as thesethe two female volunteers in the test group, which indicates that the gait pattern of female has less similarity volunteers are the only female volunteers in the test group, which indicates that the gait pattern of with male.has Weless cansimilarity also find with that, male. if twoWe people havefind similar weight, they areheight moreand likely female can also that,height if two and people have similar to have difficulties identified. Especially, the identification accuracy of Volunteer weight, they arebeing morecorrectly likely to have difficulties being correctly identified. Especially, the 7 identification accuracy of Volunteer 7 is the male than volunteers, as he has volunteers. a higher is the highest among the male volunteers, as highest he has aamong higherthe height the other male heightwords, than the otherand male volunteers. other words, height have a of significant In other height weight have aInsignificant impact onand the weight transmission wireless impact signal.

Actual person

on the transmission of wireless signal. Classified as

1

2

3

4

5

6

7

8

1

93

0

4

1

1

0

0

1

2

0

89

3

2

5

0

1

0

3

3

2

85

4

4

1

0

1

4

3

3

2

83

6

2

0

1

5

1

2

2

4

90

1

0

0

6

0

0

0

0

2

95

0

3

7

3

1

2

0

0

0

94

0

8

0

0

0

2

0

3

0

95

Figure 13. The confusion matrix of human identification with eight people. Figure 13. The confusion matrix of human identification with eight people.

5.2.7. Impact of Walking Window Size

5.2.7. Impact of Walking Window Size

There is also another important factor in evaluation of Wii that can affect the identification There is also another important factor in evaluation of Wii that can affect the identification accuracy. As FreeSense is not a window-based approach, it is excluded in the evaluation of different accuracy. As FreeSense is not a window-based approach, it is excluded in the evaluation of different walking window sizes. We further compare the performance of Wii with WiWho under different walking window sizes. We further compare the performance of Wii with WiWho under different walking window sizes when the size of the training set is 40 and the group size is 8. walking window sizes when the size of the training set is 40 and the group size is 8. As As cancan be be seen in in Figure 14, windowfrom from2 2toto10. 10.The The smallest seen Figure 14,we wechange changethe thesize size of of walking walking window smallest window sizesize is 2 is is because it is the size larger than athan single the window size decreases window 2 is because it issmallest the smallest size larger a step. singleIfstep. If the window size to 1, the walking features decline to per-step features, which makes no sense. We think it is still practical for the system that the identification result comes out after the user walks for 10 steps if we can get a higher accuracy. From the evaluation result we can see that, when the size of the walking window is smaller than five steps, the identification accuracy of Wii keeps growing as the window size gets larger, which indicates that our walking features play a crucial role in improving the performance of identity identification. However, the accuracy becomes stable when the window size gets larger than five steps, which means that it is appropriate to set the size of walking window to 5. The accuracy of 90.9% can satisfy common smart home applications. Obviously, Wii can be better adopted in a larger group size.

steps if we can get a higher accuracy. From the evaluation result we can see that, when the size of the walking window is smaller than five steps, the identification accuracy of Wii keeps growing as the window size gets larger, which indicates that our walking features play a crucial role in improving the performance of identity identification. However, the accuracy becomes stable when the window size gets larger than five steps, which means that it is appropriate to set the size of Sensors 2017, 17,window 2520 15 of 17 walking to 5. The accuracy of 90.9% can satisfy common smart home applications. Obviously, Wii can be better adopted in a larger group size. 100 Wii WiWho

95

Accuracy (%)

90 85 80 75 70 65 60

2

3

4

5

6

7

8

9

10

Walking Window Size

Figure Impactofofwalking walking window window size. Figure 14.14.Impact size.

6. Discussion 6. Discussion We did several evaluations in this work and showed the feasibility of identity identification We did several evaluations in this work and showed the feasibility of identity identification using using WiFi signals. However, there still exist some limitations in Wii. In this section, we will discuss WiFi signals. However, there still exist some limitations in Wii. In this section, we will discuss the the limitations and potentials of Wii, which give the direction of our future work. limitations and potentials of Wii, which give the direction of our future work. Although Wii can achieve a relatively high identification accuracy, it can also be affected by Although Wii can achieve a relatively high identification accuracy, it can also be affected by many factors. many factors. First, we have collected walking data in several locations and different paths. At first, we First, wetohave collected data in several locations and different paths. At first, we planned planned identify peoplewalking even if he changed his path. However, the extracted features change a lot to identify people even if he his path. However, the extracted features change lot training when the when the person walks in changed another path. We cannot successfully identify a person whenathe person in another We cannot successfully identify a we person thethe training data and datawalks and test data arepath. collected at different paths. Specifically, also when find that identification test accuracy data are collected at different paths.walks Specifically, we line-of-sight also find that accuracy is higher when the person across the of the theidentification transceivers than other is paths. As the a result, we walks constraint thethe walking path in of ourthe experiments. higher when person across line-of-sight transceivers than other paths. As a result, In addition, people’s walking indeed has a great impact on the identification result. Thus, we constraint the walking path in ourgait experiments. during data collection, the volunteers are asked keepimpact their walking gait and walkresult. naturally. In addition, people’s walking gait indeed has atogreat on the identification Thus, WLAN-based human identification has some similarities with radar techniques. Thus, during data collection, the volunteers are asked to keep their walking gait and walkdifferent naturally. poses may human cause different fingerprints for the same person, theThus, accuracy. WLAN-based identification has some similarities withwhich radarinfluences techniques. different poses Furthermore, WiFi signals can be easily affected by environmental changes. Thus, may cause different fingerprints for the same person, which influences the accuracy. Wii can only be used when there is only one person walking in the monitoring area without anyone else. Furthermore, WiFi signals can be easily affected by environmental changes. Thus, Wii can only be Despite these limitations, wireless signal-based identity identification technique has much used when there is only one person walking in the monitoring area without anyone else. potential. Besides intrusion detection, it can also be used as a critical function in behavior analysis Despite these limitations, wireless signal-based identity identification technique has much systems as well as to provide personalized services in smart spaces. potential. Besides intrusion detection, it can also be used as a critical function in behavior analysis In our future work, we plan to explore new features that can represent people’s gait patterns systems well as toinprovide services in smart spaces. moreas accurately order topersonalized make the identity identification system have the ability to be adaptively In our future work, we plan to explore new features that can represent people’s gait patterns more used even when the user walks in different paths.

accurately in order to make the identity identification system have the ability to be adaptively used even when the user walks in different paths. 7. Conclusions In this paper, we propose an effective identity identification approach Wii only based on WiFi signals which brings much convenience in smart home applications. It utilizes existing WLAN infrastructures and is based on fine-grained CSI from physical layer of wireless network. Wii divides time series of walking into segments by steps and extracts both time and frequency domain features from per-step and walking segment perspective. The extracted feature can be properly used in representing human’s gait patterns. It can effectively recognize strangers. To evaluate the performance of Wii, we implement a set of experiments from several perspectives. The results show that Wii achieves an average identification accuracy of 98.7% when there are two people, and 90.9% when there

Sensors 2017, 17, 2520

16 of 17

are eight people, which is effective and can satisfy typical home security. The limitations and potentials of WiFi signal-based identity identification systems are also discussed in this work. Acknowledgments: This research is supported by the National Natural Science Foundation of China (Grant Nos. 61300206, 61472098 and 6177010612). Author Contributions: Jiguang Lv and Wu Yang conceived and designed the study; Jiguang Lv and Dapeng Man conceived and designed the experiments; Jiguang Lv performed the experiments; Dapeng Man analyzed the data; Jiguang Lv wrote the paper; and Wu Yang and Dapeng Man reviewed and edited the manuscript. All authors read and approved the manuscript. Conflicts of Interest: The authors declare no conflict of interest.

References 1. 2. 3. 4.

5.

6.

7.

8.

9.

10. 11. 12.

13.

14. 15. 16. 17.

Karu, K.; Jain, A.K. Fingerprint Classification. Pattern Recognit. 1996, 29, 389–404. [CrossRef] Chellappa, R.; Wilson, C.L.; Sirohey, S. Human and Machine Recognition of Faces: A Survey. Proc. IEEE 1995, 83, 705–741. [CrossRef] Park, H.-A.; Park, K.R. Iris Recognition Based on Score Level Fusion by Using SVM. Pattern Recognit. Lett. 2007, 28, 2019–2028. [CrossRef] Niu, J.; Wang, B.; Cheng, L.; Rodrigues, J.J.P.C. WicLoc: An Indoor Localization System based on WiFi Fingerprints and Crowdsourcing. In Proceedings of the IEEE International Conference on Communications (ICC), London, UK, 8–12 June 2015; pp. 3008–3013. Youssef, M.; Mah, M.; Agrawala, A. Challenges: Device-free passive localization for wireless environments. In Proceedings of the 13th Annual ACM International Conference on Mobile Computing and Networking, Montreal, QC, Canada, 9–14 September 2007; pp. 222–229. Zeng, Y.; Pathak, P.H.; Mohapatra, P. WiWho: WiFi-based Person Identification in Smart Spaces. In Proceedings of the 15th International Conference on Information Processing in Sensor Networks, Vienna, Austria, 11–14 April 2016; pp. 1–12. Xin, T.; Guo, B.; Wang, Z.; Li, M.; Yu, Z.; Zhou, X. FreeSense: Indoor Human Identification with Wi-Fi Signals. In Proceedings of the IEEE Global Communications Conference (GLOBECOM), Washington, DC, USA, 4–8 December 2016; pp. 1–7. Zhang, J.; Wei, B.; Hu, W.; Kanhere, S.S. WiFi-ID: Human Identification Using WiFi Signal. In Proceedings of the International Conference on Distributed Computing in Sensor Systems (DCOSS), Washington, DC, USA, 26–28 May 2016; pp. 75–82. Liam, L.W.; Chekima, A.; Fan, L.C.; Dargham, J.A. Iris Recognition Using Self-Organizing Neural Network. In Proceedings of the Student Conference on Research and Development, Shah Alam, Malaysia, 17 July 2002; pp. 169–172. Bhattacharyya, D.; Das, P.; Bandyopadhyay, S.K.; Kim, T. Iris Texture Analysis and Feature Extraction for Biometric Pattern Recognition. Int. J. Database Theory Appl. 2008, 1, 53–60. Wang, Y.; Hu, J.; Han, F. Enhanced Gradient-Based Algorithm for the Estimation of Fingerprint Orientation Fields. Appl. Math. Comput. 2007, 185, 823–833. [CrossRef] Yang, J.; Ge, Y.; Xiong, H.; Chen, Y.; Liu, H. Performing Joint Learning for Passive Intrusion Detection in Pervasive Wireless Environments. In Proceedings of the 2010 Proceedings IEEE INFOCOM, San Diego, CA, USA, 14–19 March 2010; pp. 1–9. Kosba, A.E.; Saeed, A.; Youssef, M. RASID: A Robust WLAN Device-Free Passive Motion Detection System. In Proceedings of the IEEE International Conference on Pervasive Computing and Communications, Lugano, Switzerland, 19–23 March 2012; pp. 180–189. Zhang, D.; Liu, Y.; Guo, X.; Ni, L.M. RASS: A Real-Time, Accurate, and Scalable System for Tracking Transceiver-Free Objects. IEEE Trans. Parallel Distrib. Syst. 2013, 24, 996–1008. [CrossRef] Depatla, S.; Muralidharan, A.; Mostofi, Y. Occupancy Estimation Using Only WiFi Power Measurements. IEEE J. Sel. Areas Commun. 2015, 33, 1381–1393. [CrossRef] Halperin, D.; Hu, W.; Sheth, A.; Wetherall, D. Tool Release: Gathering 802.11 n Traces with Channel State Information. ACM SIGCOMM Comput. Commun. Rev. 2011, 41, 53. [CrossRef] Liu, W.; Gao, X.; Wang, L.; Wang, D. Bfp: Behavior-Free Passive Motion Detection Using PHY Information. Wirel. Pers. Commun. 2015, 83, 1035–1055. [CrossRef]

Sensors 2017, 17, 2520

18.

19.

20.

21. 22.

23.

24.

25.

26.

27.

28.

29. 30.

31. 32. 33. 34. 35. 36.

17 of 17

Qian, K.; Wu, C.; Yang, Z.; Liu, Y.; Zhou, Z. PADS: Passive Detection of Moving Targets with Dynamic Speed Using PHY Layer Information. In Proceedings of the 20th IEEE International Conference on Parallel and Distributed Systems (ICPADS), Hsinchu, Taiwan, 16–19 December 2014; pp. 1–8. Lv, J.; Yang, W.; Gong, L.; Man, D.; Du, X. Robust WLAN-Based Indoor Fine-Grained Intrusion Detection. In Proceedings of the IEEE Global Communications Conference (GLOBECOM), Washington, DC, USA, 4–8 December 2016; pp. 1–6. Domenico, S.D.; Sanctis, M.D.; Cianca, E.; Bianchi, G. A Trained-once Crowd Counting Method Using Differential WiFi Channel State Information. In Proceedings of the 3rd International on Workshop on Physical Analytics, Singapore, 26 June 2016; pp. 37–42. Yuan, Y.; Zhao, J.; Qiu, C.; Xi, W. Estimating Crowd Density in an RF-Based Dynamic Environment. IEEE Sens. J. 2013, 13, 3837–3845. [CrossRef] Domenico, S.D.; Pecoraro, G.; Cianca, E.; Sanctis, M.D. Trained-Once Device-Free Crowd Counting and Occupancy Estimation Using WiFi: A Doppler Spectrum Based Approach. In Proceedings of the 12th IEEE International Conference on Wireless and Mobile Computing, Networking and Communications (WiMob), New York, NY, USA, 17–19 October 2016; pp. 1–8. Xi, W.; Zhao, J.; Li, X.Y.; Zhao, K.; Tang, S.; Liu, X.; Jiang, Z. Electronic Frog Eye: Counting Crowd Using WiFi. In Proceedings of the IEEE Conference on Computer Communications, Toronto, ON, Canada, 27 April–2 May 2014; pp. 361–369. Wang, Y.; Liu, J.; Chen, Y.; Gruteser, M.; Yang, J.; Liu, H. E-eyes: Device-Free Location-Oriented Activity Identification Using Fine-Grained WiFi Signatures. In Proceedings of the 20th Annual International Conference on Mobile Computing and Networking, Maui, HI, USA, 7–11 September 2014; pp. 617–628. Wang, W.; Liu, A.X.; Shahzad, M.; Ling, K.; Lu, S. Understanding and Modeling of WiFi Signal Based Human Activity Recognition. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking, Paris, France, 7–11 September 2015; pp. 65–76. Sun, L.; Sen, S.; Koutsonikolas, D.; Kim, K.-H. WiDraw: Enabling Hands-free Drawing in the Air on Commodity WiFi Devices. In Proceedings of the 21st Annual International Conference on Mobile Computing and Networking, Paris, France, 7–11 September 2015; pp. 77–89. Zheng, X.; Wang, J.; Shangguan, L.; Zhou, Z.; Liu, Y. Smokey: Ubiquitous Smoking Detection with Commercial WiFi Infrastructures. In Proceedings of the 35th Annual IEEE International Conference on Computer Communications, San Francisco, CA, USA, 10–14 April 2016; pp. 1–9. Tan, S.; Yang, J. WiFinger: Leveraging commodity WiFi for fine-grained finger gesture recognition. In Proceedings of the 17th ACM International Symposium on Mobile Ad Hoc Networking and Computing, Paderborn, Germany, 5–8 July 2016; pp. 201–210. Unar, J.A.; Seng, W.C.; Abbasi, A. A review of Biometric Technology Along with Trends and Prospects. Pattern Recognit. 2014, 47, 2673–2688. [CrossRef] Wang, W.; Liu, A.X.; Shahzad, M. Gait Recognition Using WiFi Signals. In Proceedings of the ACM International Joint Conference on Pervasive and Ubiquitous Computing, Heidelberg, Germany, 12–16 September 2016; pp. 363–373. Yang, Z.; Zhou, Z.; Liu, Y. From RSSI to CSI: Indoor Localization via Channel Response. ACM Comput. Surv. 2013, 46, 1–32. [CrossRef] Daubechies, I.; Bates, B.J. Ten Lectures on Wavelets. J. Acoust. Soc. Am. 1993, 93, 1671. [CrossRef] Carey, C.C.; Hanson, P.C.; Lathrop, R.C.; St. Amand, A.L. Using Wavelet Analyses to Examine Variability in Phytoplankton Seasonal Succession and Annual Periodicity. J. Plankton Res. 2016, 38, 27–40. [CrossRef] Witten, I.H.; Frank, E.; Hall, M.A. Data Mining: Practical Machine Learning Tools and Techniques; Morgan Kaufmann Publishers Inc.: San Francisco, CA, USA, 2011; p. 664. Chang, C.-C.; Lin, C.-J. LIBSVM: A Library for Support Vector Machines. ACM Trans. Intell. Syst. Technol. 2011, 2, 1–27. [CrossRef] Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Meas. 1960, 20, 37–46. [CrossRef] © 2017 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).