
SEQUENTIAL NONPARAMETRIC DENSITY ESTIMATION

H. I. Davies and Edward J. Wegman

Department of Statistics
University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series No. 884
August, 1973


SEQUENTIAL NONPARAMETRIC DENSITY ESTIMATION

by H. I. Davies(1) and Edward J. Wegman

1. Introduction:

In this paper, we shall discuss a sequential approach to probability density estimation. For the most part we shall confine our attention to estimators of the form

(1.1)    $\hat{f}_n(x) = \frac{1}{n h_n} \sum_{j=1}^{n} K\!\left(\frac{x - X_j}{h_n}\right),$

first introduced by Rosenblatt (1956) and discussed in greater detail by Parzen (1962). Here, of course, $X_1, X_2, \ldots, X_n$ are i.i.d. random variables chosen according to some density, $f$. In this paper, the function $K$, the so-called kernel, is assumed to be a bounded density on the real line satisfying

(1.2)    $\lim_{u \to \pm\infty} |u|\, K(u) = 0.$

Moreover, the sequence $h_n$ is assumed to be a sequence of positive real numbers satisfying

(1.3)    $\lim_{n \to \infty} h_n = 0, \qquad \lim_{n \to \infty} n h_n = \infty, \qquad \text{and} \qquad \lim_{n \to \infty} \frac{h_{n+1}}{h_n} = 1.$

(1) The work of this author was supported by a C.S.I.R.O. postgraduate studentship.
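For concreteness, (1.1)-(1.3) admit the following minimal Python sketch, assuming a Gaussian kernel and the bandwidth sequence $h_n = n^{-1/2}$; both are illustrative choices satisfying (1.2) and (1.3), not choices made by the paper.

    import math

    def gaussian_kernel(u):
        # A bounded density on the real line; |u| K(u) -> 0 as u -> +/- infinity,
        # so condition (1.2) holds.
        return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

    def bandwidth(n):
        # h_n = n^(-1/2): h_n -> 0, n h_n = n^(1/2) -> infinity, and
        # h_(n+1)/h_n -> 1, so the conditions in (1.3) hold.
        return n ** -0.5

    def f_hat(x, sample, kernel=gaussian_kernel):
        # The Rosenblatt-Parzen estimator (1.1), evaluated at the point x.
        n = len(sample)
        h = bandwidth(n)
        return sum(kernel((x - xj) / h) for xj in sample) / (n * h)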



We shall principally focus our attention on a naive stopping rule defined by the following procedure: choose successive random samples of size $M$ and form the differences

(1.4)    $V_n(x) = \hat{f}_{nM}(x) - \hat{f}_{(n-1)M}(x),$

where $\hat{f}_{nM}(x)$ and $\hat{f}_{(n-1)M}(x)$ are the density estimators based on sample sizes $nM$ and $(n-1)M$ respectively. The stopping rule is

(1.5)    $N(\epsilon, M) = \begin{cases} \text{first } nM \text{ such that } |V_n(x)| \le \epsilon, \\ \infty \text{ if no such } n \text{ exists.} \end{cases}$
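The procedure (1.4)-(1.5) can then be sketched as follows, reusing f_hat from the sketch above; the batch size M, the tolerance eps, and the finite horizon max_batches are illustrative parameters rather than part of the rule as stated.

    import random

    def sequential_estimate(x, sample_batch, M=50, eps=0.01, max_batches=1000):
        # sample_batch(m) must return m fresh i.i.d. observations from f.
        # Stops at the first nM with |V_n(x)| <= eps, per (1.4)-(1.5).
        sample = list(sample_batch(M))       # sample size M
        prev = f_hat(x, sample)              # f_hat_M(x)
        cur = prev
        for n in range(2, max_batches + 1):
            sample.extend(sample_batch(M))   # sample size is now nM
            cur = f_hat(x, sample)           # f_hat_{nM}(x)
            if abs(cur - prev) <= eps:       # |V_n(x)| <= eps, so N = nM
                return n * M, cur
            prev = cur
        return float("inf"), cur             # no such n within the horizon

    # Example: estimate the standard normal density at x = 0.
    draws = lambda m: [random.gauss(0.0, 1.0) for _ in range(m)]
    N, estimate = sequential_estimate(0.0, draws, M=50, eps=0.005)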

In Section 2, we investigate the asymptotic structure of $V_n(x)$. In Section 3, we investigate properties of the stopping variable, $N(\epsilon, M)$. Finally, Section 4 is a concluding section.

2. Asymptotic Structure of $V_n(x)$:

Theorem 2.1: If $K$ and $h_n$ satisfy (1.2) and (1.3) respectively, then

i. $|V_n(x)| \to 0$ in probability for every $x \in C(f)$, the continuity points of $f$, and

ii. $\sup_x |V_n(x)| \to 0$ in probability if $f$ is uniformly continuous.

If, in addition, for some $\alpha > 0$,

iii. $\sup_{|u| \le a} \{K(cu) - K(u)\}^2$ is locally Lipschitz of order $\alpha$ at $c = 1$ for some $a > 0$,

iv. $\int_{-\infty}^{\infty} \{K(cu) - K(u)\}^2\, du$ is locally Lipschitz of order $\alpha$ at $c = 1$,

and finally

v. $\sum_{n=1}^{\infty} \frac{1}{h_n^{1-\theta}\, h_{n+1}^{\theta}} \cdots < \infty$,

then $\hat{f}_n(x) \to f(x)$ with probability one for every $x \in C(f)$.
Proof of (A): For $\delta > 0$,

$A_n \le \max_{|y| \le \delta} |g(x-y) - g(x)| \cdot \frac{1}{h_n} \int_{|y| \le \delta} K\!\left(\frac{y}{h_n}\right) K\!\left(\frac{y}{h_{n-1}}\right) dy \;+\; \frac{1}{h_n} \int_{|y| > \delta} |g(x-y)|\, K\!\left(\frac{y}{h_n}\right) K\!\left(\frac{y}{h_{n-1}}\right) dy \;+\; |g(x)|\, \frac{1}{h_n} \int_{|y| > \delta} K\!\left(\frac{y}{h_n}\right) K\!\left(\frac{y}{h_{n-1}}\right) dy.$

In the first and third terms of the R.H.S., we make the transformation $z = y/h_n$, so that

$A_n \le \max_{|y| \le \delta} |g(x-y) - g(x)| \int_{|z| \le \delta/h_n} K(z)\, K\!\left(\frac{h_n z}{h_{n-1}}\right) dz \;+\; \frac{1}{h_n} \int_{|y| > \delta} |g(x-y)|\, K\!\left(\frac{y}{h_n}\right) K\!\left(\frac{y}{h_{n-1}}\right) dy \;+\; |g(x)| \int_{|z| > \delta/h_n} K(z)\, K\!\left(\frac{h_n z}{h_{n-1}}\right) dz.$

Since $K$ is bounded, the first term can be made arbitrarily small by choosing $\delta$ arbitrarily small. The second term is bounded by

$\frac{1}{\delta}\, \sup_{|z| \ge \delta/h_n} \left| z K(z)\, K\!\left(\frac{h_n z}{h_{n-1}}\right) \right| \int_{-\infty}^{\infty} |g(y)|\, dy,$

which along with the third term can be made arbitrarily small for choice of $n$ sufficiently large. $\square$

Proof of (B): Letting $z = y/h_n$,

$\cdots = h_n \left| \int_{-\infty}^{\infty} \left\{ K\!\left(\frac{h_n z}{h_{n-1}}\right) - K(z) \right\} K(z)\, dz \right| \le h_n \int_{-\infty}^{\infty} \left| K\!\left(\frac{h_n z}{h_{n-1}}\right) - K(z) \right| K(z)\, dz.$

Clearly this last integrand is bounded by $[2 \sup_y K(y)] \cdot K(z)$ and hence appealing to the Lebesgue Dominated Convergence theorem completes the result. $\square$

Theorem 2.3: Let $K$ and $h_n$ satisfy (1.2) and (1.3) respectively, and also let

$\lim_{n \to \infty} n \left\{ \frac{h_{(n-1)M}}{h_{nM}} - 1 \right\} = v, \qquad v \ge 0;$

then

$\lim_{n \to \infty} n^2 h_{nM}\, \mathrm{var}(V_n(x)) = v\, f(x) \int_{-\infty}^{\infty} K^2(u)\, du.$

Proof:

(2.3)    $(n-1)M\; P\!\left[\, \left| A_{n1}(x) - E A_{n1}(x) \right| \ge \epsilon\, [(n-1)M]^{1/2}\, [\mathrm{var}\, A_{n1}(x)]^{1/2} \right] \to 0.$

A sufficient condition (Liapounov's condition) for (2.3) is that for some $\delta > 0$,

$\frac{E\left| A_{n1}(x) - E[A_{n1}(x)] \right|^{2+\delta}}{(nM)^{\delta/2}\, \sigma^{2+\delta}[A_{n1}(x)]} \to 0 \qquad \text{as } n \to \infty,$

where $\sigma^2[A_{n1}(x)] = \mathrm{var}[A_{n1}(x)]$. We let $\delta = 1$. Then, using the inequality $(a+b)^3 \le 4(a^3 + b^3)$, we obtain

$E\left| A_{n1}(x) - E[A_{n1}(x)] \right|^3 \le 8\, E|A_{n1}(x)|^3,$

so that

$\frac{E\left| A_{n1}(x) - E[A_{n1}(x)] \right|^3}{(nM)^{3/2}\, \sigma^3[A_{n1}(x)]} \le \frac{8\, E|A_{n1}(x)|^3}{(nM)^{3/2}\, \sigma^3[A_{n1}(x)]}.$

By Lemmas 2.4 and 2.6,

$\frac{8\, E|A_{n1}(x)|^3}{(nM)^{3/2}\, \sigma^3[A_{n1}(x)]} = O\!\left( \left[ \frac{1}{n^2 (n h_{nM})} \right]^{1/2} \right),$

which completes the result. $\square$
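As an informal numerical check of the normalization in Theorem 2.3, the following Monte Carlo sketch (reusing f_hat and bandwidth from the Introduction's sketch; the standard normal target is an arbitrary choice) estimates $n^2 h_{nM}\, \mathrm{var}(V_n(x))$ for increasing $n$. With $h_n = n^{-1/2}$ the hypothesis of the theorem holds with $v = 1/2$, since $h_{(n-1)M}/h_{nM} = (n/(n-1))^{1/2} = 1 + 1/(2n) + O(n^{-2})$.

    import random

    def V_n(x, n, M):
        # One realization of V_n(x) = f_hat_{nM}(x) - f_hat_{(n-1)M}(x), eq. (1.4).
        sample = [random.gauss(0.0, 1.0) for _ in range(n * M)]
        return f_hat(x, sample) - f_hat(x, sample[:(n - 1) * M])

    def scaled_var(x, n, M, reps=2000):
        # Monte Carlo estimate of the normalized variance n^2 h_nM var(V_n(x)).
        vals = [V_n(x, n, M) for _ in range(reps)]
        mean = sum(vals) / reps
        var = sum((u - mean) ** 2 for u in vals) / (reps - 1)
        return n * n * bandwidth(n * M) * var

    # The normalized values should stabilize as n grows.
    for n in (5, 10, 20):
        print(n, scaled_var(0.0, n, M=50))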



3. The Stopping Variable $N(\epsilon, M)$:

In this section, we shall generally suppress in the notation the explicit dependence of $N(\epsilon, M)$ on $\epsilon$ and $M$. Hence we write $N(\epsilon, M)$ simply as $N$. Noting that $[N > nM] = \bigcap_{k=2}^{n} [\,|V_k(x)| > \epsilon\,]$, it is clear that the probabilistic structure of $N$ is closely related to that of $V_n(x)$. Inasmuch as the structure of $V_n(x)$ depends on $f(x)$, we will be, in general, unable to give the exact asymptotic structure of $N$. In this section, we demonstrate the finiteness of the moments of $N$, the closure of $N$, and the divergence of $N$ as $\epsilon \to 0$.

Lemma 3.1: For arbitrary $t > 0$ and given $\epsilon > 0$,

(3.1)    $P[V_n(x) > \epsilon] \le e^{-nM h_{nM} \epsilon t}\, E\, e^{S_n(x) t},$

and

(3.2)    $P[V_n(x) < -\epsilon] \le e^{-nM h_{nM} \epsilon t}\, E\, e^{-S_n(x) t},$

where

$S_n(x) = \sum_{j=1}^{nM} K\!\left(\frac{x - X_j}{h_{nM}}\right) - \frac{n\, h_{nM}}{(n-1)\, h_{(n-1)M}} \sum_{j=1}^{(n-1)M} K\!\left(\frac{x - X_j}{h_{(n-1)M}}\right).$

Proof: Define $T(x)$ to be the indicator of $[S_n(x) > nM h_{nM} \epsilon]$. Then for arbitrary $t > 0$, $T(x) \le e^{t(S_n(x) - nM h_{nM} \epsilon)}$, so that $P[S_n(x) > nM h_{nM} \epsilon] = E\, T(x) \le e^{-nM h_{nM} \epsilon t}\, E\, e^{S_n(x) t}$. Noting that $[V_n(x) > \epsilon] = [S_n(x) > nM h_{nM} \epsilon]$ completes the proof of (3.1). Equation (3.2) follows by similar arguments. $\square$

Now,

$P[N > nM] = P\left[ \bigcap_{k=2}^{n} \{ |V_k(x)| > \epsilon \} \right] \le P[\,|V_n(x)| > \epsilon\,].$



By Lemma 3.1, for arbitrary $t > 0$,

$P[N > nM] \le e^{-nM h_{nM} \epsilon t} \left[ E\, e^{S_n(x) t} + E\, e^{-S_n(x) t} \right].$

We next examine $E\, e^{S_n(x) t}$ and $E\, e^{-S_n(x) t}$. Let us decompose

$S_n(x) = \sum_{k=1}^{(n-1)M} A^*_{nk}(x) + \sum_{k=(n-1)M+1}^{nM} B^*_{nk}(x),$

where

$A^*_{nk}(x) = K\!\left(\frac{x - X_k}{h_{nM}}\right) - \frac{n\, h_{nM}}{(n-1)\, h_{(n-1)M}}\, K\!\left(\frac{x - X_k}{h_{(n-1)M}}\right), \qquad k = 1, 2, \ldots, (n-1)M,$

and

$B^*_{nk}(x) = K\!\left(\frac{x - X_k}{h_{nM}}\right), \qquad k = (n-1)M + 1, \ldots, nM.$

Notice that $A^*_{n1}(x), \ldots, A^*_{n,(n-1)M}(x)$ are i.i.d., that $B^*_{n,(n-1)M+1}(x), \ldots, B^*_{n,nM}(x)$ are i.i.d., and that all of these random variables are mutually independent. Thus

$E\, e^{S_n(x)} = \left[ E\, e^{A^*_{n1}(x)} \right]^{(n-1)M} \left[ E\, e^{B^*_{n,nM}(x)} \right]^{M}.$

But if $L = \sup_x K(x)$,

$E\left[ e^{B^*_{n,nM}(x)} \right] = \int_{-\infty}^{\infty} e^{K\left(\frac{x-u}{h_{nM}}\right)} f(u)\, du \le e^L.$

Lemma 3.2: Let $a_n = E\left[ |A^*_{n1}(x)|\, e^{|A^*_{n1}(x)|} \right]$. If $a_n$ satisfies

$\lim_{n \to \infty} \frac{(n-1)\, a_n}{\log n} = \gamma \qquad \text{for some } \gamma \ge 0,$

then

$E\left[ e^{S_n(x)} \right] \le n^{M\gamma}\, e^{ML}.$

Proof: For $n$ sufficiently large, since $a_n \to 0$, $a_n > \log(1 + a_n)$. Combining this with the inequality on $a_n$, we have for $n$ sufficiently large,

$(n-1)M \log(1 + a_n) \le M\gamma \log n.$

Exponentiating both sides, $(1 + a_n)^{(n-1)M} \le n^{M\gamma}$. Now

$E\, e^{S_n(x)} \le \left[ E\, e^{A^*_{n1}(x)} \right]^{(n-1)M} e^{ML}.$

This, together with the observation that

$e^{A^*_{n1}(x)} \le 1 + |A^*_{n1}(x)|\, e^{|A^*_{n1}(x)|},$

so that $E\, e^{A^*_{n1}(x)} \le 1 + a_n$, completes the proof. $\square$

Under the hypotheses of Lemmas 3.1 and 3.2, we have for arbitrary $t > 0$,

(3.3)    $P[N > nM] \le 2\, n^{M\gamma}\, e^{ML}\, e^{-nM h_{nM} \epsilon t}.$

Notice that in general $n h_{nM} \to \infty$, so that $e^{-nM h_{nM} \epsilon} \to 0$. Since $n^{M\gamma} \to \infty$, however, we will usually want to choose $h_n$ in such a way that for any $\delta > 0$ and for $n$ sufficiently large,

(3.4)    $e^{-nM h_{nM} \epsilon} \le n^{-\delta}.$

We note here that the usual choice $h_n = Bn^{-\alpha}$, $0 < \alpha < 1$, is sufficient to guarantee (3.4).
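A short verification of this claim, filled in here under only the stated choice of $h_n$: with $h_n = Bn^{-\alpha}$ we have $h_{nM} = B(nM)^{-\alpha}$, so that

$nM h_{nM} \epsilon = B\epsilon\, (nM)^{1-\alpha},$

which grows like a positive power of $n$ since $\alpha < 1$. Hence for any fixed $\delta > 0$,

$\frac{e^{-nM h_{nM} \epsilon}}{n^{-\delta}} = \exp\left\{ \delta \log n - B\epsilon\, (nM)^{1-\alpha} \right\} \to 0,$

because $(nM)^{1-\alpha}$ eventually dominates $\delta \log n$; in particular $e^{-nM h_{nM} \epsilon} \le n^{-\delta}$ for all $n$ sufficiently large, which is (3.4).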



Theorem 3.3: Under the hypotheses of Lemmas 3.1 and 3.2 and assuming (3.4), we have $EN^r < \infty$ for every $r \ge 0$.

Proof:

$EN^r = \sum_{n=0}^{\infty} (nM)^r\, P[N = nM] \le \sum_{n=1}^{\infty} [(n+1)M]^r\, P[N \ge nM].$

Using (3.3) with $t = 1$,

$EN^r \le 2 M^r e^{ML} \sum_{n=1}^{\infty} (n+1)^{r + M\gamma}\, e^{-(n+1)M h_{(n+1)M} \epsilon}.$

Reindexing and applying (3.4) with $\delta > r + M\gamma + 1$, the terms are bounded, up to a constant, by $(n+1)^{r + M\gamma - \delta}$, so the series converges. $\square$

Now, for $\omega \in [\,|V_n(x)| \to 0$ as $n \to \infty$ and $|V_n(x)| > 0$ for every $n\,]$ and for any $n_0$, let

$\epsilon^* = \tfrac{1}{2} \min_{1 \le j \le n_0} |V_j(x)|,$

so that for $\epsilon < \epsilon^*$, $|V_j(x)| > \epsilon^*$ for all $j \le n_0$. This is a contradiction to $N < n_0$ for all such $\epsilon$. Hence $N \ge n_0$ for $\epsilon$ sufficiently small. That is to say, $N$ is greater than any finite number for $\epsilon$ sufficiently small and for $\omega \in [\,|V_n(x)| \to 0$ as $n \to \infty$ and $|V_n(x)| > 0$ for every $n\,]$. Thus

$P[N \to \infty \text{ as } \epsilon \to 0] = 1. \qquad \square$



We are now able to state a convergence theorem based on Theorem 3.6.

Theorem 3.7: Suppose $\hat{f}_n(x) \to f(x)$ as $n \to \infty$ with probability one and $N \to \infty$ as $\epsilon \to 0$ with probability one; then $\hat{f}_N(x) \to f(x)$ as $\epsilon \to 0$ with probability one.

Proof: Let $A$ be the set of probability one for which $N \to \infty$. Let $B$ be the set of probability one for which $\hat{f}_n(x) \to f(x)$. Clearly on $A \cap B$, $\hat{f}_N(x) \to f(x)$, and $P(A \cap B) = 1$. $\square$

Sufficient conditions for $N \to \infty$ appear in Theorem 3.6. Sufficient conditions for $\hat{f}_n(x) \to f(x)$ appear in Theorem 2.1.

We close this section by noting that a slightly revised stopping rule $N'$, given by

$N' = \begin{cases} \text{first } nM \text{ such that } |V_n(x)| \le \epsilon \text{ but } |V_n(x)| > 0, \\ \infty \text{ if no such } n \text{ exists,} \end{cases}$

obviates the need to consider the class $[\,|V_n(x)| > 0$ for every $n\,]$; the theory of this section holds for $N'$ as well.