SEQUENTIAL NONPARAMETRIC DENSITY ESTIMATION

H. I. Davies and Edward J. Wegman

Department of Statistics
University of North Carolina at Chapel Hill

Institute of Statistics Mimeo Series No. 884

August, 1973
SEQUENTIAL NONPARAMETRIC DENSITY ESTIMATION

by H. I. Davies¹ and Edward J. Wegman

1. Introduction: In this paper, we shall discuss a sequential approach to probability density estimation. For the most part we shall confine our attention to estimators of the form

(1.1)    \hat{f}_n(x) = \frac{1}{n h_n} \sum_{j=1}^{n} K\left(\frac{x - X_j}{h_n}\right)

first introduced by Rosenblatt (1956) and discussed in greater detail by Parzen (1962).
Here, of course, X_1, X_2, \ldots, X_n are i.i.d. random variables chosen according to some density, f. In this paper, the function K, the so-called kernel, is assumed to be a bounded density on the real line satisfying

(1.2)    \lim_{u \to \pm\infty} |u| K(u) = 0 .
Moreover, the sequence h_n is assumed to be a sequence of positive real numbers satisfying

(1.3)    \lim_{n\to\infty} h_n = 0, \qquad \lim_{n\to\infty} n h_n = \infty, \qquad \text{and} \qquad \lim_{n\to\infty} \frac{h_{n+1}}{h_n} = 1 .

¹The work of this author was supported by a C.S.I.R.O. postgraduate studentship.
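As a concrete illustration (not part of the original mimeo), the estimator (1.1) takes only a few lines of code. The Gaussian kernel and the bandwidth h_n = n^{-1/5} used below are assumed, illustrative choices satisfying (1.2) and (1.3); the paper itself does not prescribe them.

```python
import math
import random

def gaussian_kernel(u):
    """A bounded density on the real line with |u|K(u) -> 0, i.e. satisfying (1.2)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def f_hat(x, sample, kernel=gaussian_kernel, bandwidth=None):
    """Rosenblatt-Parzen estimate (1.1): (1/(n h_n)) * sum_j K((x - X_j)/h_n)."""
    n = len(sample)
    # Illustrative default h_n = n^(-1/5); any sequence satisfying (1.3) would do.
    h = bandwidth if bandwidth is not None else n ** (-0.2)
    return sum(kernel((x - xj) / h) for xj in sample) / (n * h)

if __name__ == "__main__":
    random.seed(1)
    data = [random.gauss(0.0, 1.0) for _ in range(2000)]
    # The standard normal density at 0 is about 0.3989; the estimate is close.
    print(round(f_hat(0.0, data), 3))
```

With a single observation and unit bandwidth the estimate reduces to the kernel itself, which is a convenient sanity check on the normalization in (1.1).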
We shall principally focus our attention on a naive stopping rule defined by the following procedure: Choose successive random samples of size M and form the differences

(1.4)    V_n(x) = \hat{f}_{nM}(x) - \hat{f}_{(n-1)M}(x)

where \hat{f}_{nM}(x) and \hat{f}_{(n-1)M}(x) are the density estimators based on sample sizes nM and (n-1)M respectively. The stopping rule is

(1.5)    N(\epsilon, M) = \text{first } n \text{ such that } |V_n(x)| \le \epsilon, \quad \text{and } N(\epsilon, M) = \infty \text{ if no such } n \text{ exists.}
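The procedure in (1.4) and (1.5) can be sketched directly. The sketch below assumes the estimator (1.1) with an illustrative Gaussian kernel and h_n = n^{-1/5}; the cap `n_max` is an artifact of the sketch, guarding against the case N(ε, M) = ∞.

```python
import math
import random

def kernel(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def f_hat(x, sample):
    n = len(sample)
    h = n ** (-0.2)                         # illustrative h_n satisfying (1.3)
    return sum(kernel((x - xj) / h) for xj in sample) / (n * h)

def stop_N(x, draw, M, eps, n_max=10_000):
    """Return (n, f_hat_nM(x)): the first n with |V_n(x)| <= eps, as in (1.5).

    draw() yields one observation from f; math.inf is returned for n
    when no n <= n_max satisfies the stopping criterion.
    """
    sample = [draw() for _ in range(M)]
    prev = f_hat(x, sample)
    for n in range(2, n_max + 1):
        sample.extend(draw() for _ in range(M))   # next block of size M
        cur = f_hat(x, sample)
        if abs(cur - prev) <= eps:                # |V_n(x)| of (1.4)
            return n, cur
        prev = cur
    return math.inf, prev

if __name__ == "__main__":
    random.seed(2)
    n_stop, est = stop_N(0.0, lambda: random.gauss(0.0, 1.0), M=50, eps=0.01)
    print(n_stop, round(est, 3))
```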
In Section 2, we investigate the asymptotic structure of V_n(x). In Section 3, we investigate properties of the stopping variable, N(\epsilon, M). Finally, Section 4 is a concluding section.

2.
Asymptotic Structure of V_n(x):

Theorem 2.1: If K and h_n satisfy (1.2) and (1.3) respectively, then

i. |V_n(x)| \to 0 in probability for every x \in C(f), the continuity points of f, and

ii. \sup_x |V_n(x)| \to 0 in probability if f is uniformly continuous.

If, in addition, for some a > 0,

iii. \sup_{|u| \ge a} |u| \{K(cu) - K(u)\}^2 is locally Lipschitz of order \alpha at c = 1 for some \alpha > 0,
iv. \int_{-\infty}^{\infty} \{K(cu) - K(u)\}^2 \, du is locally Lipschitz of order \alpha at c = 1,

and finally

v. \sum_{n=1}^{\infty} \frac{1}{h_n^{1-\alpha}} \left| \frac{h_n}{h_{n+1}} - 1 \right|^{\alpha} < \infty,

then |V_n(x)| \to 0 with probability one for every x \in C(f).
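The kernel and bandwidth conditions above are easy to check numerically for concrete choices. The Gaussian kernel and h_n = n^{-1/5} used below are illustrative assumptions, not mandated by the text.

```python
import math

def K(u):
    # Gaussian kernel: a bounded density on the real line
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def h(n, alpha=0.2):
    # h_n = n^(-alpha) with 0 < alpha < 1
    return n ** (-alpha)

# (1.2): |u| K(u) -> 0 as u -> +-infinity
print([abs(u) * K(u) for u in (5.0, 10.0, 20.0)])   # decreasing toward 0

# (1.3): h_n -> 0, n h_n -> infinity, h_{n+1}/h_n -> 1
n = 10**6
print(h(n), n * h(n), h(n + 1) / h(n))
```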
Lemma 2.2: Let g \in L_1 and let x be a continuity point of g. If K and h_n satisfy (1.2) and (1.3), then

(A)    \lim_{n\to\infty} \frac{1}{h_{n-1}} \int_{-\infty}^{\infty} K\left(\frac{y}{h_{n-1}}\right) K\left(\frac{y}{h_n}\right) g(x-y)\,dy = g(x) \int_{-\infty}^{\infty} K^2(z)\,dz

and

(B)    \lim_{n\to\infty} \int_{-\infty}^{\infty} \left\{K\left(\frac{h_n}{h_{n-1}}\,z\right) - K(z)\right\} K(z)\,dz = 0 .

Proof of (A): Let

A_n = \left|\frac{1}{h_{n-1}} \int_{-\infty}^{\infty} K\left(\frac{y}{h_{n-1}}\right) K\left(\frac{y}{h_n}\right) g(x-y)\,dy - g(x)\,\frac{h_n}{h_{n-1}} \int_{-\infty}^{\infty} K(z)\, K\left(\frac{h_n}{h_{n-1}}\,z\right) dz\right| .

Then

A_n \le \max_{|y|\le\delta} |g(x-y)-g(x)| \cdot \frac{1}{h_{n-1}} \int_{|y|\le\delta} K\left(\frac{y}{h_{n-1}}\right) K\left(\frac{y}{h_n}\right) dy
     + \frac{1}{h_{n-1}} \int_{|y|>\delta} |g(x-y)|\, K\left(\frac{y}{h_{n-1}}\right) K\left(\frac{y}{h_n}\right) dy
     + |g(x)|\,\frac{1}{h_{n-1}} \int_{|y|>\delta} K\left(\frac{y}{h_{n-1}}\right) K\left(\frac{y}{h_n}\right) dy .

In the first and third terms of the R.H.S., we make the transformation z = y/h_n, so that

A_n \le \max_{|y|\le\delta} |g(x-y)-g(x)| \cdot \frac{h_n}{h_{n-1}} \int_{|z|\le\delta/h_n} K(z)\, K\left(\frac{h_n}{h_{n-1}}\,z\right) dz
     + \frac{1}{h_{n-1}} \int_{|y|>\delta} |g(x-y)|\, K\left(\frac{y}{h_{n-1}}\right) K\left(\frac{y}{h_n}\right) dy
     + |g(x)|\,\frac{h_n}{h_{n-1}} \int_{|z|>\delta/h_n} K(z)\, K\left(\frac{h_n}{h_{n-1}}\,z\right) dz .

Since K is bounded, the first term can be made arbitrarily small by choosing \delta arbitrarily small. The second term is bounded by

\frac{h_n}{\delta\, h_{n-1}} \sup_{|z|\ge\delta/h_n} \left| z\,K(z)\, K\left(\frac{h_n}{h_{n-1}}\,z\right) \right| \int_{-\infty}^{\infty} |g(y)|\,dy

which, by (1.2), along with the third term can be made arbitrarily small for choice of n sufficiently large.

Proof of (B): Letting z = y/h_n, we obtain
\frac{h_n}{h_{n-1}} \left| \int_{-\infty}^{\infty} \left\{ K\left(\frac{h_n}{h_{n-1}}\,z\right) - K(z) \right\} K(z)\,dz \right| \le \frac{h_n}{h_{n-1}} \int_{-\infty}^{\infty} \left| K\left(\frac{h_n}{h_{n-1}}\,z\right) - K(z) \right| K(z)\,dz .

Clearly this last integrand is bounded by [2 \sup_y K(y)] \cdot K(z) and hence appealing to the Lebesgue Dominated Convergence theorem completes the result. □

Theorem 2.3:
Let K and h_n satisfy (1.2) and (1.3) respectively, and also let

\lim_{n\to\infty} n \left\{ \frac{h_{(n-1)M}}{h_{nM}} - 1 \right\} = v - 1, \qquad v \ge 0 ;

then

\lim_{n\to\infty} n^2 M h_{nM} \, \mathrm{var}(V_n(x)) = v\, f(x) \int_{-\infty}^{\infty} K^2(z)\,dz .

Proof:
(2.3)    (n-1)M \; P\left[\, |A_{n1}(x) - E A_{n1}(x)| \ge \epsilon \left[(n-1)M\right]^{1/2} \left[\mathrm{var}\, A_{n1}(x)\right]^{1/2} \right] \to 0 .

A sufficient condition (Liapounov's condition) for (2.3) is that for some \delta > 0,

\frac{E\left| A_{n1}(x) - E[A_{n1}(x)] \right|^{2+\delta}}{(nM)^{\delta/2}\, \sigma^{2+\delta}[A_{n1}(x)]} \to 0 \quad \text{as } n \to \infty,

where \sigma^2[A_{n1}(x)] = \mathrm{var}[A_{n1}(x)]. We let \delta = 1. Then using the inequality (a+b)^3 \le 4(a^3 + b^3), we obtain

E\left| A_{n1}(x) - E[A_{n1}(x)] \right|^3 \le 8\, E|A_{n1}(x)|^3 .

By Lemmas 2.4 and 2.6,

E|A_{n1}(x)|^3 = O\left(\frac{1}{n^2 h_{nM}^2}\right) \qquad \text{and} \qquad \sigma^2[A_{n1}(x)] = O\left(\frac{1}{n h_{nM}}\right),

so that

\frac{8\, E|A_{n1}(x)|^3}{(nM)^{1/2}\, \sigma^3[A_{n1}(x)]} = O\left(\frac{1}{n^{1/2} (n M h_{nM})^{1/2}}\right) \to 0

which completes the result. □
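The role of the ratio h_n/h_{n-1} \to 1 in part (B) of Lemma 2.2, and in the Lipschitz-in-c conditions of Theorem 2.1, can be seen numerically. The Riemann-sum check below uses the Gaussian kernel as an illustrative choice; the grid limits and step count are arbitrary.

```python
import math

def K(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def D(c, lo=-10.0, hi=10.0, steps=20000):
    """Midpoint Riemann sum for the integral in Lemma 2.2(B):
    int {K(cz) - K(z)} K(z) dz."""
    dz = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        z = lo + (i + 0.5) * dz
        total += (K(c * z) - K(z)) * K(z)
    return total * dz

# As c -> 1 (i.e. h_n/h_{n-1} -> 1), the integral vanishes:
print([round(abs(D(c)), 6) for c in (1.5, 1.1, 1.01, 1.0)])
```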
3. The Stopping Variable N(\epsilon, M): In this section, we shall generally suppress in the notation the explicit dependence of N(\epsilon, M) on \epsilon and M. Hence we write N(\epsilon, M) simply as N. Noting that

[N \le n] = [\, |V_n(x)| \le \epsilon \,],

it is clear that the probabilistic structure of N is closely related to that of V_n(x). Inasmuch as the structure of V_n(x) depends on f(x), we will be, in general, unable to give the exact asymptotic structure of N. In this section, we demonstrate the finiteness of the moments of N, the closure of N, and the divergence of N as \epsilon \to 0.

Lemma 3.1:
For arbitrary t > 0 and given \epsilon > 0,

(3.1)    P[V_n(x) > \epsilon] \le e^{-nMh_{nM}\epsilon t} \, E e^{S_n(x) t}

and

(3.2)    P[V_n(x) < -\epsilon] \le e^{-nMh_{nM}\epsilon t} \, E e^{-S_n(x) t}

where

S_n(x) = \sum_{j=1}^{nM} K\left(\frac{x - X_j}{h_{nM}}\right) - \frac{n}{n-1} \cdot \frac{h_{nM}}{h_{(n-1)M}} \sum_{j=1}^{(n-1)M} K\left(\frac{x - X_j}{h_{(n-1)M}}\right) .

Proof: Note that nMh_{nM} V_n(x) = S_n(x), so that [V_n(x) > \epsilon] = [S_n(x) > nMh_{nM}\epsilon]. Define T(x) to be the indicator of [S_n(x) > nMh_{nM}\epsilon]. Then for arbitrary t > 0, T(x) \le e^{t(S_n(x) - nMh_{nM}\epsilon)}, so that taking expectations of both sides completes the proof of (3.1). Equation (3.2) follows by similar arguments. □

Now,
P[N > n] = P\left[ \bigcap_{k=2}^{n} \{\, |V_k(x)| > \epsilon \,\} \right] \le P[\, |V_n(x)| > \epsilon \,] .
By Lemma 3.1, for arbitrary t > 0,

P[N > n] \le e^{-nMh_{nM}\epsilon t} \left[ E e^{S_n(x) t} + E e^{-S_n(x) t} \right] .

We next examine E e^{S_n(x)t} and E e^{-S_n(x)t}. Let us decompose

S_n(x) = \sum_{k=1}^{(n-1)M} A^*_{nk}(x) + \sum_{k=(n-1)M+1}^{nM} B^*_{nk}(x)

where

A^*_{nk}(x) = K\left(\frac{x - X_k}{h_{nM}}\right) - \frac{n}{n-1} \cdot \frac{h_{nM}}{h_{(n-1)M}} K\left(\frac{x - X_k}{h_{(n-1)M}}\right), \qquad k = 1, 2, \ldots, (n-1)M

and

B^*_{nk}(x) = K\left(\frac{x - X_k}{h_{nM}}\right), \qquad k = (n-1)M + 1, \ldots, nM .

Notice that A^*_{n1}(x), \ldots, A^*_{n,(n-1)M}(x) are i.i.d., that B^*_{n,(n-1)M+1}(x), \ldots, B^*_{n,nM}(x) are i.i.d., and that all of these random variables are mutually independent. Thus

E e^{S_n(x)} = \left[ E e^{A^*_{n1}(x)} \right]^{(n-1)M} \left[ E e^{B^*_{n,nM}(x)} \right]^{M} .

But if L = \sup_x K(x),

E\left[ e^{B^*_{n,nM}(x)} \right] = \int_{-\infty}^{\infty} e^{K((x-u)/h_{nM})} f(u)\,du \le e^{L} .
Lemma 3.2: Let a_n = E\left[ |A^*_{n1}(x)| \, e^{|A^*_{n1}(x)|} \right]. If a_n satisfies

\lim_{n\to\infty} \frac{(n-1)\, a_n}{\log n} = \gamma \qquad \text{for some } \gamma \ge 0,

then for n sufficiently large

E\left[ e^{S_n(x)} \right] \le n^{M\gamma} e^{ML} .

Proof: For n sufficiently large, since a_n \ge 0, a_n \ge \log(1 + a_n). Combining this with the hypothesis on a_n, we have for n sufficiently large,

(n-1)M \log(1 + a_n) \le M\gamma \log n .

Exponentiating both sides,

(1 + a_n)^{(n-1)M} \le n^{M\gamma} .

Now

E e^{S_n(x)} \le \left[ E e^{A^*_{n1}(x)} \right]^{(n-1)M} e^{ML} .

This, together with the observation that

e^{A^*_{n1}(x)} \le 1 + |A^*_{n1}(x)|\, e^{|A^*_{n1}(x)|},

so that E e^{A^*_{n1}(x)} \le 1 + a_n, completes the proof. □
Under the hypotheses of Lemmas 3.1 and 3.2 we have, taking t = 1,

(3.3)    P[N > n] \le 2\, n^{M\gamma} e^{ML} e^{-nMh_{nM}\epsilon} .

Notice that in general nh_{nM} \to \infty, so that e^{-nMh_{nM}\epsilon} \to 0. Since n^{M\gamma} \to \infty, however, we will usually want to choose h_n in such a way that for any \delta > 0 and for n sufficiently large

(3.4)    e^{-nMh_{nM}\epsilon} \le n^{-\delta} .

We note here that the usual choice h_n = B n^{-\alpha}, 0 < \alpha < 1, is sufficient to guarantee (3.4).
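With h_m = B m^{-\alpha} we have nMh_{nM} = B(nM)^{1-\alpha}, so the left side of (3.4) is exp(-\epsilon B (nM)^{1-\alpha}), which eventually drops below any power n^{-\delta}. The constants below (B = 1, \alpha = 1/2, M = 10, \epsilon = 0.1, \delta = 5) are illustrative only.

```python
import math

B, alpha, M, eps, delta = 1.0, 0.5, 10, 0.1, 5.0   # illustrative constants

def lhs(n):
    # exp(-nM h_{nM} eps) with h_m = B m^(-alpha), so nM h_{nM} = B (nM)^(1-alpha)
    return math.exp(-eps * B * (n * M) ** (1.0 - alpha))

def rhs(n):
    return n ** (-delta)

# (3.4): the exponential side eventually drops below n^(-delta)
for n in (10**4, 10**5, 10**6):
    print(n, lhs(n) < rhs(n))
```

For small n the polynomial side may still be smaller; the point of (3.4) is only that the inequality holds for all n sufficiently large.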
Theorem 3.3: Under the hypotheses of Lemmas 3.1 and 3.2 and assuming (3.4), we have E N^r < \infty for every r \ge 0.

Proof:

E N^r = \sum_{n=0}^{\infty} n^r P[N = n] \le \sum_{n=0}^{\infty} (n+1)^r P[N > n] .

Using (3.3) with t = 1,

E N^r \le 2 e^{ML} \sum_{n=1}^{\infty} (n+1)^{r + M\gamma} e^{-(n+1)M h_{(n+1)M} \epsilon} .

Reindexing and applying (3.4) with \delta > r + M\gamma + 2 shows that the series converges, which completes the proof. □
Theorem 3.6: If, with probability one, |V_n(x)| \to 0 as n \to \infty and |V_n(x)| > 0 for every n, then P[N \to \infty \text{ as } \epsilon \to 0] = 1.

Proof: Fix n_0 and let \epsilon^* = \frac{1}{2} \min_{1 \le j \le n_0} |V_j(x)|, so that for \epsilon < \epsilon^*, |V_j(x)| > \epsilon for all j \le n_0. This is a contradiction to N \le n_0. Hence N \ge n_0 for \epsilon sufficiently small. That is to say, N is greater than any finite number for \epsilon sufficiently small, for every \omega \in [\, |V_n(x)| \to 0 \text{ as } n \to \infty \text{ and } |V_n(x)| > 0 \text{ for every } n \,]. Thus P[N \to \infty \text{ as } \epsilon \to 0] = 1. □
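The divergence of the stopping variable as the tolerance shrinks is visible in simulation. The sketch below reuses the naive rule (1.4)-(1.5) with an illustrative Gaussian kernel and h_n = n^{-1/5}; it is a demonstration, not part of the original development. Note that for a fixed random seed the path of V_n is fixed, so the stopping time is necessarily nondecreasing as eps decreases.

```python
import math
import random

def K(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def f_hat(x, sample):
    n = len(sample)
    h = n ** (-0.2)
    return sum(K((x - xj) / h) for xj in sample) / (n * h)

def stopping_time(x, eps, M=25, n_max=5000, seed=0):
    """Smallest n >= 2 with |f_hat_{nM}(x) - f_hat_{(n-1)M}(x)| <= eps."""
    rng = random.Random(seed)
    sample = [rng.gauss(0.0, 1.0) for _ in range(M)]
    prev = f_hat(x, sample)
    for n in range(2, n_max + 1):
        sample.extend(rng.gauss(0.0, 1.0) for _ in range(M))
        cur = f_hat(x, sample)
        if abs(cur - prev) <= eps:
            return n
        prev = cur
    return n_max

Ns = [stopping_time(0.0, eps) for eps in (0.05, 0.01, 0.002)]
print(Ns)   # nondecreasing as eps shrinks
```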
We are now able to state a convergence theorem based on Theorem 3.6.

Theorem 3.7: Suppose N \to \infty as \epsilon \to 0 with probability one and \hat{f}_n(x) \to f(x) as n \to \infty with probability one; then \hat{f}_N(x) \to f(x) as \epsilon \to 0 with probability one.

Proof: Let A be the set of probability one on which N \to \infty and let B be the set of probability one on which \hat{f}_n(x) \to f(x). Clearly on A \cap B, \hat{f}_N(x) \to f(x), and P(A \cap B) = 1. □

Sufficient conditions for N \to \infty appear in Theorem 3.6. Sufficient conditions for \hat{f}_n(x) \to f(x) appear in Theorem 2.1.
We close this section by noting that a slightly revised stopping rule N', given by

N' = \text{first } n \text{ such that } |V_n(x)| \le \epsilon \text{ but } |V_n(x)| > 0, \quad \text{and } N' = \infty \text{ if no such } n \text{ exists,}

obviates the need to consider the class [\, |V_n(x)| > 0 \text{ for every } n \,]; each result of this section holds for N' as well.