Rate of convergence for computing expectations of ... - Semantic Scholar

1 downloads 0 Views 799KB Size Report
Rate of convergence for computing expectations of stopping functionals of an -mixing process. Mohamed Ben ALAYA. Gilles PAG S y. Abstract. The shif t method ...
Rate of convergence for computing expectations of stopping functionals of an -mixing process Mohamed Ben ALAYA

Gilles PAGÈSy

Abstract

The shift method consists in computing the expectation of an integrable functional F dened on the probability space ((Rd )N ; B(Rd ) N ;  N ) ( is a probability measure on Rd ) using the Birkho's Pointwise Ergodic Theorem : n1 kn=01 F  k ! E (F ) a:s: as n ! +1, where  denotes the canonical shift operator. When F lies in L2 (FT ;  N ) for some integrable enough stopping time T , several weak (CLT) or strong (Gàl-Koksma

P

Theorem or LIL) converging rates hold. The method successfully competes with Monte Carlo. The aim of this paper is to extend these results to more general probability distributions P on ((Rd )N ; B(Rd ) N ), namely when the canonical process (Xn )n2N is X  P stationary, -mixing and fullls Ibragimov's assumption 2+ (n) < +1 for n0

some  > 0 . One application is the computation of the expectation of functionals of an -mixing Markov Chain under its stationary distribution P . It may provide both a better accuracy and save the random number generator compared to the usual Monte Carlo or to the shift method on independent innovations.

1991 MSC classication : 60F05, 60F15, 60F17. Key words : mixing process, Monte Carlo method, Rate of convergence.

1 Introduction and mathematical framework The origin of the problem is motivated by the computation of the expectation of a functional F dened on the canonical space ((R d )N ; B(R d ) N ;  N ) using the Birkho's Pointwise Ergodic Theorem. Several contributions (Bouleau [4], [5] and Ben Alaya [1],[2]) have established some rates of convergence for a wide class of functionals, namely FTX -measurable for some integrable enough stopping time T and square integrable. As a matter-of-fact, both strong (Gàl & Koksma Theorem, Law of the Iterated Logarithm) and weak (Central Limit Theorem) convergence rates hold in the Birkho's Theorem. To be more specic, one considers the canonical space ((R d )N ; B(R d ) N ) endowed with the product measure  N , the canonical projections Xk ; k  0, dened for every ! := (!k )k 2 0

LAGA URA 742, Institut Galilée Université Paris 13. Avenue J.B. Clément 93430 Villetaneuse & CERMICS, Ecole Nationale des Ponts et Chaussées. La courtine 93160 Noisy-le-Grand. Mail: [email protected]. y Univ. Paris 12, Fac. Sciences, 61 av. Général de Gaulle, F-94010 Créteil Cedex & Labo. Probabilités, URA 224, Univ. P.& M. Curie, Tour 56, F-75252 Cedex 05. Mail: [email protected]. 

1

(R d )N by Xk (!) := !k and the (left) shift operator on (R d )N (X ; X ;   ) := (X ; X ;   ). It is widely known (see e:g: [13]) that the dynamical system ((Rd )N ; B(R d )N ;  N ; ) being ergodic, the Birkho Pointwise Ergodic Theorem implies that 0

8F 2 L

1

((R d )N ;  N );

1

1

2

Z n 1X k

N n k F   ! E (F ) = Fd : 1

 N -a:s:

=0

Similarly, one can dene the right shift (this time on (R d )Z) by setting Xn   = Xn . Then identifying L ((R d )N ;  N ) to a subspace of L ((R d )Z;  Z) 1

8F 2 L

1

1

1

((R d )N ;  N );

 N -a:s:

Z n 1X  k F  ( ) ! E (F ) = Fd N : nk 1

=0

The Shift on Independent Innovations Method(s) (SIIM) simply is/are the data-processing of these convergence results. The expectation E (F ) is then computed by averaging some dependent paths while the usual Monte Carlo method (MCM) requires some independent paths. The main theoretical results concerning the -shift method are summed up below (see [2]). Let T be a FnX -stopping time(1) and F 2 L (R N ; B(R ) N ) be an FTX -measurable functional(2) where FnX := (X ;    ; Xn) denotes the natural ltration of the canonical process (Xn)n2N . Then if T 2 L  for some > 0, 2

2+

0

  (F ):=Var(F ) + 2 2

+1 X

k=1

Cov(F; F  k ) is absolutely convergent, which in turn implies

that :  The Gàl-Koksma Theorem holds : n X 1 8 " > 0; n F   k k 1

E (F )



= o n 21 (log(n)) 32

"

+



 N -a:s:

=0

 The CLT theorem holds, that is, whenever (F ) 6= 0, n 1p X k F    (F ) n k 1

E (F )

 L

! N (0; 1);

=0

where N (0; 1) denotes the standard normal distribution and L! the convergence in distribution.  Moreover, if the stopping time T has nite polynomial moments (this assumption is slightly relaxable), the LIL holds n 1 X

F  k

E (F )

lim sup k p 2n log log n n!1 =0

1

n 1 X



=  (F )

and

F  k

lim inf k p n!1 =0

 E (F )

2n log log n

A N -valued random variable is a FnX -stopping time if fT  ng 2 FnX for every n 2 N . A 2 F1X = A \ fT  ng 2 FnX g for every n 2 N T

2 F X := f

2

=  (F ):

Similar results are obtained in the case of (see [5]). The computational performances of the shift method SIIM lie in the use of a storage box that partially avoids to uselessly re-simulate all the innovations Xi 's when passing from a path to another while this is necessary in the usual MCM. Hence, for the same number of iterations, we observed on true simulations that the SIIM runs faster than the classical MCM (see [2]). The time savings are on the expenses of the data storage (dynamical or not) which is typical for the antagonism between time complexity and storage complexity. On the other hand, the SIIM also calls the random number generator less often than the MCM does. This may be crucial for large scale simulations. However, when  (F ) > Var(F ) the required number of iterations is higher. Unfortunately no satisfactory estimate of V ar(FF ) is known to us and it is likely that, for most naturally encountered functionals F , this ratio is greater than 1. The balance between these two eects depends on the choice of F . The aim of the paper is to extend these results to more general stationary probability distributions P : whenever the dynamical ((R d )N ; B(R d ) N ; P; ) is ergodic, the Birkho's Theorem directly applied on the shifted paths of a P-integrable functional F yields 2

2

(

)

n 1X k n k F   ! E (F ): 1

P-a:s:

(1)

=0

Of course, the plain ergodicity cannot provide a rate of convergence in the Birkho Pointwise Ergodic Theorem without any further assumption (see [13]). That is why we will assume from now on that the canonical process (Xn)n2N on ((R d )N ; B(R d ) N ) shares a strong mixing assumption, namely the Ibragimov -mixing assumption, under the probability P. This notion turns out to be the natural extension of the former of i.i.d. random variable setting in terms of Limit Theorems for our stopping functionals. The -mixing Markovian setting is a natural domain of application for the techniques. In fact, let (Xn)n2N be an homogeneous Markov chain on R d with transition P (x; dy) and a starting distribution  . A commonly encountered problem of Numerical Probability is to compute (an approximate of) E  (f (X ;  ; X` )) where  denotes the invariant probability measure  assumed to be unique  of the transition P . When the chain is positively recurrent (resp. stable), the natural method is to apply the Law of Large Numbers along the available paths of the chain that is, for every x 2 R d , for every f : (R d )` ! R bounded Borel (resp. continuous) function 0

0

8 x 2 Rd ;

n 1X n k f (Xk ;  ; Xk

1

1

`

+

) n!!1 E  (f (X ;  ; X` )) +

1

0

1

Px -a:s:

(2)

=0

The rate of convergence in (2) is ruled by several classical theorems like the Central Limit Theorem or the Law of the Iterated Logarithm under some standard assumptions (see e:g: [9]). When f is no longer a function of nitely many Xn's but is a functional F dened on the whole canonical space ((R d ) N ; B(R d ) N ) of the chain, the computation of E  (F ) either by simulation or from statistical data cannot be carried out as easily. The rst natural idea is to implement the usual Monte Carlo Method (MCM). However this approach turns out to be costly in terms of C.P.U. time. Starting from experimental facts on the Shift on Independent 3

Innovation Method (SIIM), one can try using equation (1) under P and shift on the chain itself. We will call this method the Shift Process Method (SPM). The Markov assumption on (Xn)n2N will be dropped in the theoretical part of the paper. The theoretical results will then be applied to the -mixing case. Then the three methods (MCM, SIIM, SPM) will be compared numerically. They will be applied to two -mixing Markovian models Xn = h(Xn; Yn) where the underlying innovations Yn are independent. The paper is organized as follows. Section 2 is devoted to some background on the main tools used in the rest of the paper : the denition of an -mixing process is recalled along with the Ibragimov Central Limit Theorem for -mixing sequences satisfying the Ibragimov assumption (subsection 2.1). The Gál-Koksma Theorem in the L -stationary setting is recalled at subsection 2.2. This will be our basic result when dealing with the a:s: rate of convergence (except for the Law of the Iterated Logarithm investigated in section 5). Section 3 deals with the a:s: rate of convergence of the shift method for stopping functionals. This result essentially relies on the niteness of a pseudo-variance, denoted  (F ). In section 4, a Central Limit Theorem is established under the same hypothesis. In section 5, after recalling Philipp and Stout's Theorem, a Law of the Iterated Logarithm is established, only for a subclass of stopping functionals having nite polynomial moments. Section 6 is dedicated to the Markov setting. Some standard -mixing criteria for (stationary) Markov chains are recalled (subsection 6.1) and the simulation framework is presented (subsection 6.2). Some numerical simulations are processed in section 7 on three mixing Markov processes satisfying the Ibragimov assumption. A simple Metropolis like algorithm (subsection 7.1) is considered in two dierent settings so that the invariant distribution  is alternately explicitly known (subsubsection 7.1.1) and not explicitly known (subsubsection 7.1.2). The third example, a Vector Quantization algorithm, will illustrate some possible false convergence phenomenon (subsection 7.2) when  is not explicitly known. Throughout the text Lp( ; A; P) will denote 1the set of A-measurable real-valued func+1

2

2

Z

tionals F whose norm jjF jjp :=  will be the canonical shift on (R d )N .

Lp

jF jpdP



p

is nite. From now on the shift operator

2 Some background

2.1 -mixing sequences and Central Limit Theorem

We are going to recall some results on -mixing processes (see e:g: Doukhan [7]). Let be a sequence ( (n); n 2 N ) of real numbers, satisfying limn!1 (n) = 0, and let (Xn)n2N be a R d -valued process dened on a probability space ( ; A; P). (Xn )n2N is -mixing if for every k; n 2 N , n  1

8A 2 F k ; 8B 2 Fk1 n; 0

+

jP(A \ B )

P( A ) P( B )

j  (n):

Intuitively if (n) is small then B and A are essentially independent, hence for an -mixing process the future is asymptotically independent from the present and the past. One nds in the literature various notions of mixing that quantify the dependence between the past 4

and the future. Just for comparison, in the '-mixing for example we measure the quantity P(B )j. The notion of -mixing is therefore weaker. It is in fact the weakest when compared to all usual notions of strong mixing (see [7]). However, this assumption on the sequence (Xn)n2N turns out to be quite adequate. Furthermore, we will say that an mixing process (Xn)n2N satises the Ibragimov assumption if

jP(B=A)

X

n0

2+ (n) < +1 for some  > 0: 

(3)

Historically, the Central Limit Theorem for -mixing processes is due to Ibragimov (1962) (see [7] or Hall & Heyde [12]). It essentially holds under the above assumption (3). Theorem 1 1Suppose that (Xn)n2N is a centered real valued strictly stationary(3 ) -mixing X  process with (n) 2+ < +1 and E jX j  < +1 for some  > 0. Then the sequence 0

n=0

2+

 := V ar(X ) + 2 2

0

1 X k=1

Cov(X ; Xk ) 0

is absolutely convergent. Furthermore, if  > 0, then X0 + X1 +p   + Xn 1 L! N (0; 1) as n ! +1;

 n

where N (0; 1) denotes the standard normal distribution and L! is for the convergence in law.



Except for the fundamental underlying Central Limit Theorem for the martingale increments, this result mainly relies on the covariance inequality below (see [7] p.9). Proposition 1 Let (Xn)n2N be a strictly stationary -mixing process. Then: 8 r; p; q  1 with r + p + q = 1, 8 F 2 Lp(F k ) and 8 G 2 Lq (Fk1 n) 1

1

1

0

+

jCov(F; G)j  8 1 (n) jjF jjp jjGjjq : r

Application to cylindrical functions: Let us go back to the framework described in the

introduction i:e: the canonical projections (Xn)n2N are -mixing on the canonical dynamical space ((R d )N ; B(R d ) N ; P) with a rate := ( (n))n2N . Then, the real valued measurable functions F on ((R d )N ; B(R d ) N ; P) that actually depend on nitely many components behave like the sequence (Xn)n2N itself in the following sense: if F only depends on the rst N components then XnF;N := F  n; n 2 N , is an N -mixing process with rate  N N (n) = 1 (n N ) ifif nn  >N : (

)

A Rd -valued process (Xn )n2N is strictly stationary if for every k 2 N , (Xn+k )n2N and (Xn )n2N have the same distribution that is, if P denotes the distribution of (Xn )n2N on the canonical space ((Rd )N; B(Rd ) N ), P   = P with the notations of section 1. 3

5

Note that

1 X n=0

1 ()

2+ (n) < + 

1 X n=0



N2+ (n) < +1:

So, one straightforwardly derives the following Proposition 2 Let1 (Xn)n2N be a strictly stationary R d -valued -mixing process. If there is X   > 0 satisfying 2+ (n) < +1, then for every cylindrical function F 2 L  (P) with 2+

n=0

E (F )=0,

(a) the sequence  2 (F ) := V ar(F ) + 2

1 X k=1

Cov(F  k ; F ) is absolutely convergent,

n X 1 k L! N (0; 1) as n ! +1: p (b) Furthermore, if  (F ) > 0, then: F    (F ) n k 1

=0

2.2 Rate of almost sure convergence

As a rst step we recall the Gál and Koksma Theorem, established in their article  Sur l'ordre de grandeur des fonctions sommables([11]). We will restrict to the L -stationary process setting (see [1] for a probabilistic proof in a full general setting). Theorem 2 Let ( ; A; P) be a probability space and let (Xn)n2N be a L -stationary sequence of random variables such that E jX + X +    + Xn j = O(n). Then 2

2

1

2

2



X + X +    + Xn = o n 21 (log(n)) 32

8">0

1

2

"

+



P-a:s:



Coming back to the canonical dynamical system ((R d )N ; B(R d ) N ; P; ), we derive from the previous theorem a strong ergodic result. i:e: a speed of a:s: convergence in Birkho's pointwise ergodic Theorem. Proposition 3 Let F12 L ((Rd )N ; B(R d ) N ); P) such that E (F )=0. If X  (F ):= V ar(F ) + 2 Cov(F  k ; F ) converges, then 2

2

k=1

n  1X k = o n 12 (log(n)) 23 F   nk 1

8 " > 0;

"

+



P-a:s::

=0

Proof: Using the convergence of the series  (F ) and the fact that  preserves the measure 2

P,

we rst prove that (see e:g: [1]) E

n 1 X k=0

F  k

!2

= n (F ) 2 2

n X k=1

kCov(F  k ; F ) 2n 6

1 X k=n+1

Cov(F  k ; F ):

The niteness of  (F ) along with the Kronecker lemma yield 2

Z n X 1 F  k j dP =  (F ): lim j n!1 n Rd N k 1

2

(

)

2

=0

The Gál and Koksma's Theorem completes the proof. 

By its very construction, a functional F that can be simulated on a computer naturally appears as a stopping functional with respect to its (a:s: nite) stopping rule T . So from now on, we will focus on such FT -measurable functionals.

3 An

a:s:

rate of convergence for stopping functionals

3.1 A class of FT -measurable functionals with nite  (F ) 2

Set Fmn := (Xm;   ;Xn) and Fm1 := (Xk ; k  m). T will denote a F n-stopping time and F a FT -measurable functional. Finally [x] will denote the integral part of x. Theorem 3 below provides a bound for the covariance Cov(F  k ; F ) from which the 1 X absolute convergence of the series  (F ) = V ar(F ) + 2 Cov(F  k ; F ) follows. It is the k key result of this work. Theorem 3 Let 1(Xn)n2N be a R d -valued stationary -mixing process. Assume there is some X   > 0 such that 2+ (n) < +1. If T is a stopping time and T 2 Lp((R d )N ; P) for some 0

2

=1

p>

 

2+ 1+

n=0

then, for every F 2 L2+ ((R d )N ; FT ; P) with E (F ) = 0, we have 1+

(E (T p )) 2+ : [k=2]) + jjF jj 2+  jjF jj  1+ 1+ [k=2]p 2+ Proof: To establish inequality (4), rst notice that Cov (F

 k ; F )  16 jjF jj2

2+ (k 2+

Cov (F





2+







(4)



 k ; F )  Cov(F  k ; F  1fT  k= g) + Cov(F  k ; F  1fT> k= g) : (5) Now F  k is Fk1-measurable and F  1fT  k= g is F k= -measurable. By applying Proposition [

[

2]

1 with r = 1 +  and p = q = 2 + , we obtain 2

Cov (F



2] [ 0

[

2]



 k ; F  1fT  k= g)  8 2+ (k [k=2]) F  1fT  k=  8 2+ (k [k=2]) jjF jj  : [



2]





2]

[



g 2+ jjF jj2+

2]

2 2+



(6)

For the second term on the right of inequality (5), the standard H older inequality with +  rst provides p = 2 +  and q = 12 +  Cov (F





 k ; F  1fT> k= g)  F  k  1fT> k= [

[

2]

7

2]



2+ g 1+ 

jjF jj  : 2+

(7)

It is straightforward that E



jF j 1+2+  k  1fT> k=  

[

2]

g









 Cov(jF j 1+2+  k ; 1fT> k= g) + E jF j 1+2+  

[

 

2]



P( T

> [k=2])

At this stage, we observe that F  k is Fk1-measurable and fT > [k=2]g belongs to F k= . Still applying Proposition 1 but this time with r = 1 + 1 ; p = 1 +  and q = +1 yields [ 0

E



jF j

2+ 1+

 k  1

fT>[k=2]g





8 1+ (k 

[k=2])

jF j

E

  1+1 

2+



+ E jF j

2+  1+ P(T

> [k=2]);

that is F

 k  1fT> k= [

2]g 2+ 1+



 8 1+ (k [k=2]) 



E

  1+1 

jF j

2+

2+ 



+ E jF j 1+

P( T

2]

  1+ 2+

> [k=2])

:

Plugging this bound in inequality (7) and using inequality (x + y)  x + y ; 0 < < 1; x; y  0 leads to

Cov (F



 k ; F  1fT> k= g)  8 2+1+ 2+ (k [k=2]) jjF jj  + jjF jj 1+2+ jjF jj  (P(T  [k=2])) 1+2+  8 2+ (k [k=2]) jjF jj  + jjF jj 2+1+ jjF jj  (P(T  [k=2])) 1+2+ : (8) [

 

2]





2 2+



2 2+



 

 

 

2+

 

2+

(T ) . Hence, collecting inequalities (5), (6) As T 2 Lp((R d )N ; B(R d ) N ; P), P(T > [k=2])  E[k= 2]p and (8) nally yield: p

Cov (F

 k ; F )  16 jjF jj2

1+

( E (T p )) 2+ 2+   jjF jj  1+ ;  (k [k=2]) + jjF jj 2+ 1+ [k=2]p 2+ 

2+

2+

which completes the proof. 

Remarks and improvements: (a) A careful reading of the above proof (namely equation (8)) shows that the assumption T 2 Lp for some p >  can be slightly improved into 2+ 1+

1 X k=1

P( T

1+

> k) 2+ < +1:

(b) As  < 2, the moment assumption on T is always fullled as soon as that: T 2 L ((Rd )N ; B(R d ) N ; P). (c) If the functional F is bounded, then we can simply assume that T is integrable. Indeed, P1 if T is integrable then k P(T > [k=2]) < +1 and the proof can be modied  in fact simplied!  in this setting (which formally corresponds to  = +1). (d) Our assumptions on the process (Xn)n2N and the functional F are satisfactory in the following sense: 2

2+ 1+

=1

8

1 X

2+ (n) < 1 and F 2 L  do not dier from those of the n original Ibragimov Central Limit Theorem which studies functions only depending on one variable (i:e: F (x ;    ; xn;   ) := f (x )).  When (n)=0, n  1, we nd again the results of [2] obtained in the independent setting. (e) The FT -measurability of the functional F for some stopping time T is crucial. In fact, we cannot obtain this result as a consequence of some results on functionals that can be approximated by a sequence (Fk )k2N of F k -measurable cylindrical functions so as P 1 jjF F jj < +1. By such a simple approach (setting F := F:1 k k fT kg ), we get k the result under the more stringent assumptions:

 Both conditions



2+

=0

0

0

0

2

=1

F 2L

2+



and T has a moment of order p >



4(2+ )



.

3.2 An a:s rate of convergence

As it has been emphasized in paragraph 2.2 on the a:s: convergence rate, the condition  (F ) < +1 is the basic assumption to apply Gál and Koksma Theorem (Theorem 2). Therefore, we derive from the previous theorem the following a:s: convergence result. 2

Theorem 4 Under the assumptions of Theorem 3, one has: n  1 3 1X k 2 (log(n)) 2 F   = o n nk 1

8 " > 0;

"

+



P-a:s::

=0



4 A Central Limit Theorem for stopping functionals

Theorem 5 Let 1(Xn)n2N be a R d -valued stationary -mixing process. Assume there is some X   > 0 such that 2+ (n) < +1. If T is a stopping time and T 2 Lp((R d )N ; P) for some n p>  then, for every F 2 L  ((R d )N ; FT ; P) with E (F ) = 0, we have: =0

2+ 1+

2+

n X  (F ) > 0 =) (F1)pn F  k L! N (0; 1) as n ! +1; n 1

2

(9)

=0

where N (0; 1) denotes the standard normal distribution.

To establish the Central Limit Theorem we compute the limits of  (F:1fT `g) and  (F:1fT>`g) when ` tends to +1. Indeed, if for every ` 2 N we set: 2

2

` :=  (F:1fT `g) = V ar(F:1fT `g) + 2 2

2

9

1 X k=1

Cov(F:1fT `g  k ; F:1fT `g)

(10)

and

` :=  (F:1fT>`g ) = V ar(F:1fT>`g ) + 2 2

2

then we have the following results. Lemma 1 Under the assumptions of Theorem 5, lim  `!1 `

1 X k=1

Cov(F:1fT>`g  k ; F:1fT>`g);

=  and `lim  = 0: (11) !1 `  Proof : Following Theorem 3, if we replace the function F by F:1fT `g E F:1fT `g (still FT -measurable and centered), inequality (4) yields an upper bound for Cov (F:1fT `g   k ; F:1fT `g ) , namely  Cov (F:1fT `g   k ; F:1fT `g )  16 F:1fT `g E  (F:1fT `g ) 2+  (k [k=2]) 2

2

2

2

2+

+ F:1fT `g

E (F:1fT `g ) 2+ F:1fT `g 1+

Now for every functional G 2 Lp; p > 1; jjG



8 ` 2 N ; Cov(F:1fT `g  k ; F:1fT `g) 

E (F:1fT `g )

1+

2+

(E (T p )) 2+ : 1+  [k=2]p 2+

jj  jj jj

E (G) p 2 G p, therefore,  72 F:1fT `g 2 2+ (k [k=2])



2+

1+ ( E (T p )) 2+ +4 F:1fT `g 1+ 2+ F:1fT `g 2+ 1+ ;  [k=2]p 2+

which in turn implies that, for every ` 2 N Cov (F:1fT `g



 k ; F:1fT `g)  72 jjF jj  2+ (k [k=2]) 2 2+





1+

( E (T p )) 2+ +4 jjF jj 1+ 2+ jjF jj  1+ :  [k=2]p 2+ 2+

Hence equation (10) implies that ` is dened by an absolutely convergent sequence, uniformly, with respect to `. As each term of the series converges towards Cov(F  k ; F ), one nally has `lim  = . !1 `  If we note that F:1fT>`g E F:1fT>`g is FT -measurable and centered we obtain in the same way: 2

2

2

Cov (F:1fT>`g



 k ; F:1fT>`g)  72 jjF jj  2+ (k [k=2]) 2 2+





1+

(E (T p )) 2+ : +4 jjF jj 1+ 2+ jjF jj  1+  [k=2]p 2+ 2+

Hence ` is also dened by an absolutely convergent sequence uniformly with respect to `. Since each term of the series converges towards 0, `lim  = 0. !1 ` 2

2



10

Let us prove now the Central Limit Theorem.

Proof: Let F 2 L ((R d )N ; FT ; P). For every ` 2 N we write 2

F = F:1fT `g

 E (F:1fT `g ) +

F:1fT>`g

E (F:1fT>`g )



:

Then n n n X X X p1n F  k = p1n F:1fT `g E (F:1fT `g )  k + p1n F:1fT>`g E (F:1fT>`g )  k : k k k F:1fT `g E (F:1fT `g ) is a cylindrical function only depending on the rst ` variables. According to proposition 2, the rst term on the right of the equality converges in distribution towards N (0; ` ), for every ` 2 N , with 1

1

1

=0

=0

=0

2

` =  (F:1fT `g) = V ar(F:1fT `g) + 2 2

2

1 X k=1

Cov(F:1fT `g  k ; F:1fT `g):

From Lemma 1, one derives the convergence in distribution of N (0; ` ) towards N (0;  ). Consequently, it amounts to prove that, for every " > 0, ! n X  1 k lim lim sup P p `!1 n!1 n k F:1fT>`g E (F:1fT>`g )    " = 0: Then Bienaymé-Tchebichev's inequality yields ! n X  1 F:1fT>`g E (F:1fT>`g )  k  " P p n k 2

2

1

=0

1

=0

 n"1

Z

2

n 1 X (Rd )N

k=0

F:1fT>`g



E (F:1fT>`g )



2 k dP:

2 n 1 X  k F: 1 E ( F: 1 )  dP towards `2 fT>`g fT>`g (Rd )N k=0 ! n 1 2 X   1 ` k lim sup P F:1fT>`g E (F:1fT>`g )  " n "2 : n!1

The convergence of n1

Z



p

Lemma 1 completes the proof.



k=0



yields:





Remark : This CLT is satisfactory since it holds under the same Ibragimov assumption

that rules the standard CLT for mixing processes. However some recent work by Doukhan, Massart and Rio [8] shows that the (functional) CLT holds for a stationary mixing processes (Xn)n2N whenever Z (t)Q (t)dt < +1 (12) 1

1

2

0

where t 7 ! (t) denotes the canonical inverse of the monotonic function t 7 ! ([t]) and Q denotes the quantile function of X . 1

0

11

5 The law of the Iterated Logarithm The a:s: estimates for the convergence rate obtained in paragraph 3.2 using the Gál and Koksma Theorem (see Theorem 4) are obviously weaker than those of the standard Law of the Iterated Logarithm (LIL) property. The usefulness of these results is to provide an estimate close to the iterated logarithm but under weak and natural assumptions in simulation. However it is possible to prove the true LIL under more stringent assumptions on the functional F and the stopping time T . Several results are available in the literature on the asymptotic behavior of the partial sums n 1 X

n 1 X

=0

=0

Xk of a weakly dependent (Xk )k2N process or on the partial sums F (Xk ; Xk ;   ) k k of a functional F depending on a weakly dependent process (see [3],[17]). Thus, W. Philipp and W. Stout in [17] provide several invariance principles for the partial sums of weakly dependent random variable sequences. Among them some are related to the sum of the functional of a delayed -mixing process. +1

Philipp and Stout's Theorem: For the sake of simplicity we state Philipp and Stout's

Theorem in the -mixing stationary case still using the same notations. We go back to the canonical dynamic system ((R d )N ; B(R d ) N ; P; ). Theorem 6 (W. Philipp and W. Stout [17]) Let F be a centered function 2 L  ((R d )N ; P) for some 0 <   2, and (Fk )k2N an approximating sequence of F k -measurable functions. We assume that: (i) There is some constant C satisfying: 8 n 2 N jjF Fnjj   C 7 : (13) n (ii) ! 2+

0

2+

E

n 1 X k=0

F

 k

2

= n + O (n

1

2+

30 ) 

as n ! 1:

(14)

(iii) (Xn)n2N is a R d -valued stationary -mixing sequence with: 

(n) = o n

2 

168(1+  )

Then the Law of the Iterated Logarithm holds, that is: n 1 X

=1 lim sup pk 2n log log n n!1 =0



F

 k

:

(15)

P-a:s:

n 1 X

and

F  k

lim inf pk = 1: n!1 2n log log n =0

The proof of this theorem is available in chapter 8 of [17]. We will apply now this theorem to FT -measurable functionals. 12

Application to stopping functionals of an -mixing process: We study now some classes of functions depending on a stopping time. Hence we consider a (F n)n2N -stopping time T , and a FT -measurable functional F . Theorem 7 Let (Xn)n2N be a R d -valued stationary -mixing sequence and  2 (0; 2]. As2  sume that (n) = o n . If T is a stopping time and T 2 Lp  ((R d )N ; P) for some p() > 2(2 + )(1 + )(2 + 7) (4 ) then, for every F 2 L  ((Rd )N ; FT ; P) with E (F ) = 0 0

168(1+ )

( )

2+2

2

and  2 (F ) > 0, the Law of the Iterated Logarithm is satised, that is: n 1 X

n 1 X

F  k

lim sup pk =  (F ) 2n log log n n!1

P-a:s:

lim inf pk =  (F ): n!1 2n log log n

and

=0

F  k

=0

Remark: Note that '():= 2(2 + )(1 + )(2 + 7) is a decreasing function on (0; 2] so '()  '(2)=66. For any practical implementation, such a requirement amounts to assuming 2

that the stopping time T has moments of every order.

Proof: W.l.g., one may assume that  (F ) = 1 and (n) is a non increasing sequence. We 2

will now show that the assumptions of Theorem 6 are fullled. According to the proof of Proposition 3 one has: n 1 X

E

Let p 2

k=0

F  k

=n 2



 ; p(  ) 

2+ 1+

!2

n X k=1

kCov(F  k ; F ) 2n

1 X k=n+1

Cov(F  k ; F ):

(16)

. So, one has, following Theorem 3:

8F 2 L  ((R d )N ; FT ; P) 2+

Cov (F



 k ; F )