SIAM J. CONTROL AND OPTIMIZATION, Vol. 33, No. 3, pp. 937-959, May 1995

© 1995 Society for Industrial and Applied Mathematics

SINGULAR OPTIMAL STOCHASTIC CONTROLS II: DYNAMIC PROGRAMMING*

ULRICH G. HAUSSMANN† AND WULIN SUO‡

Abstract. The dynamic programming principle for a multidimensional singular stochastic control problem is established in this paper. Assuming Lipschitz continuity of the data, it is shown that the value function is continuous and is the unique viscosity solution of the corresponding Hamilton-Jacobi-Bellman equation.

* Received by the editors June 15, 1993; accepted for publication (in revised form) January 12, 1994. This work was supported by Natural Sciences and Engineering Research Council of Canada grant 88051.
† Department of Mathematics, University of British Columbia, Vancouver, British Columbia, Canada V6T 1Z2.
‡ Faculty of Management, University of Toronto, Toronto, Ontario, Canada M5S 1V4.

Key words. singular controls, control rules, value function, dynamic programming principle, Hamilton-Jacobi-Bellman equation, viscosity solution

AMS subject classifications. 49J30, 49A55, 60G44, 93E20

1. Introduction. In [8] we applied a direct method to study the existence of optimal controls for the stochastic control problem in which the state is governed by the stochastic differential equation

x_t = x + \int_s^t b(\theta, x_\theta, u_\theta)\,d\theta + \int_s^t \sigma(\theta, x_\theta, u_\theta)\,dB_\theta + \int_s^t g(\theta)\,dv_\theta

on some filtered probability space (\Omega, \mathcal{F}, \mathcal{F}_t, P), where b(\cdot,\cdot,\cdot), \sigma(\cdot,\cdot,\cdot), g(\cdot) are given deterministic functions, (B_t, t \ge 0) is a d-dimensional Brownian motion (in fact, B need not be d-dimensional), x is the initial state at time s, and the controls are u_t \in U and v : [0,T] \to \mathbb{R}_+^k with v nondecreasing componentwise. The expected cost has the form

J(\alpha) = E^P \left[ \int_s^T f(t, x_t, u_t)\,dt + \int_{[s,T)} c(t) \cdot dv_t \right],

where f : [0,T] \times \mathbb{R}^d \times U \to \mathbb{R} and c : [0,T] \to \mathbb{R}^k are given. We assume that the cost of applying the singular control is positive, i.e., c^i(\cdot) > 0, i = 1, \ldots, k. For this type of problem, the reader may consult the paper by Haussmann and Suo [8] and the list of references therein.
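To fix ideas, the following minimal simulation sketch (not from the paper; the coefficients b, sigma, g, the costs f and c, the barrier, and the control policy are all hypothetical illustrations) discretizes the state equation by the Euler-Maruyama scheme, realizes the singular control v as the minimal nondecreasing push keeping the state below a barrier, and estimates the expected cost J by Monte Carlo.

import numpy as np

# Toy instance of the singular control problem (all coefficients hypothetical):
# d = k = 1, dynamics dx = b dt + sigma dB + g dv, where v is the minimal
# nondecreasing process keeping x <= barrier (a reflection-type policy).

rng = np.random.default_rng(0)

T, n_steps, n_paths = 1.0, 200, 5000
dt = T / n_steps
x0, u = 0.0, 0.5            # initial state, constant classical control
barrier = 1.0               # reflect the state at this level (toy policy)

def b(t, x, u):     return u - x            # toy drift
def sigma(t, x, u): return 0.3              # toy volatility
g = -1.0                                    # constant g: pushing v decreases x
def f(t, x, u):     return x**2 + 0.1*u**2  # toy running cost
def c(t):           return 0.5              # cost rate of the singular control

x = np.full(n_paths, x0)
cost = np.zeros(n_paths)
for i in range(n_steps):
    t = i * dt
    dB = rng.normal(0.0, np.sqrt(dt), n_paths)
    cost += f(t, x, u) * dt                 # running cost of classical control
    x = x + b(t, x, u) * dt + sigma(t, x, u) * dB
    dv = np.maximum(x - barrier, 0.0)       # minimal push keeping x <= barrier
    x += g * dv                             # g = -1, so this resets x to barrier
    cost += c(t) * dv                       # proportional cost c(t) . dv

print("Monte Carlo estimate of J:", cost.mean())

Note that the singular control enters the cost through the proportional term c(t) · dv, in contrast to the absolutely continuous running cost of the classical control.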

This paper is a continuation of Haussmann and Suo [8]. As is well known for the classical stochastic control problem, the dynamic programming principle is satisfied and, if the value function has appropriate regularity, it satisfies a second-order nonlinear partial differential equation, the Hamilton-Jacobi-Bellman equation (cf. Fleming and Rishel [4] and Lions [10], among others). This is still the case for singular stochastic control, where the Hamilton-Jacobi-Bellman equation is a second-order variational inequality (see Fleming and Soner [5] and the list of references in Haussmann and Suo [8]). In this paper, in §3 we adopt a probabilistic approach used in Haussmann [6], Haussmann and Lepeltier [7], and El Karoui, Nguyen, and Jeanblanc-Picqué [3] to establish the dynamic programming principle under very mild conditions on the data. Then, in §4, assuming Lipschitz continuity of the coefficients, we prove that the value function is continuous. In §5 the Hamilton-Jacobi-Bellman equation is derived heuristically, and the value function is shown to be the unique viscosity solution of this equation. For the reader's convenience, the main results of Haussmann and Suo [8] are recalled in §2, along with the formulation of the problem.

We list some notation that will be used throughout this paper:

- \mathbb{R}^d and \mathbb{R} denote the d-dimensional Euclidean space and the real line, respectively; \mathbb{R}_+ = \{x \in \mathbb{R} : x \ge 0\}, and \mathbb{R}_- is defined similarly. For x = (x^i), y = (y^i) \in \mathbb{R}^d, x \cdot y = \sum_i x^i y^i. T > 0 is the fixed horizon, and E = [0,T] \times \mathbb{R}^d.

- \mathcal{D}^d[0,T] denotes the collection of \mathbb{R}^d-valued functions defined on [0,T] which are left continuous and have right limits.

- \mathcal{A}^k[0,T] denotes the collection of functions a : [0,T] \to \mathbb{R}_+^k such that a = (a^i) \in \mathcal{D}^k[0,T] and each a^i is nondecreasing with a^i(0) = 0, i = 1, \ldots, k.

- \mathcal{S}^{d \times k} is the space of d \times k matrices with the (d \times k)-dimensional Euclidean norm.

- If Y is a metric space, \mathcal{B}(Y) denotes the corresponding Borel \sigma-field, and f \in \mathcal{B}(Y) means that f is a \mathcal{B}(Y)-measurable real-valued function. We denote by \mathcal{M}_1(Y) (\mathcal{M}_+(Y), respectively) the space of probabilities (nonnegative Radon measures, respectively) on Y with the weak convergence topology.

- U, called the control set, is a compact metric space. It is well known that \mathcal{M}_1(U) is then also a compact metrizable space. If \phi : U \to \mathbb{R} is a bounded measurable function, we can extend \phi to \mathcal{M}_1(U) by letting

\phi(\mu) \equiv \int_U \phi(u)\,\mu(du)

(a worked instance follows this list).

- Define \mathcal{U} = \{\mu : [0,T] \to \mathcal{M}_1(U) : \mu \text{ is Borel measurable}\}.

- If X is a random variable on a probability space (\Omega, \mathcal{F}, P), the expectation of X will be denoted by E^P(X). \mathcal{M} is the family of continuous square integrable martingales on some given probability space (\Omega, \mathcal{F}, P) with a filtration (\mathcal{F}_t).

- C stands for a constant, but not necessarily the same one from line to line.
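As a simple worked instance of the extension of \phi to \mathcal{M}_1(U) above (an illustration; the measure below is hypothetical): if \mu = \sum_{j=1}^m \lambda_j \delta_{u_j} with \lambda_j \ge 0 and \sum_{j=1}^m \lambda_j = 1, then

\phi(\mu) = \int_U \phi(u)\,\mu(du) = \sum_{j=1}^m \lambda_j\,\phi(u_j),

so a relaxed control at each instant selects a convex mixture of ordinary controls rather than a single point of U.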

2. Formulation of the problem. We consider the following optimal control problem in which we allow both classical and singular controls to act at the same time; i.e., the dynamics have the form

(2.1) x_t = x + \int_s^t b(\theta, x_\theta, \mu_\theta)\,d\theta + \int_s^t \sigma(\theta, x_\theta, \mu_\theta)\,dB_\theta + \int_s^t g(\theta)\,dv_\theta \quad \text{a.s.}, \quad s \le t \le T,

for (s, x) \in E.
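For orientation, in the standard martingale-problem formulation of (2.1) (a sketch; see [8] for the precise definition of the control rules \mathcal{R}_{s,x}), the processes M^\phi used below take the form: for \phi \in C_b^{1,2}(E),

M_t^\phi = \phi(t, x_t) - \phi(s, x) - \int_s^t \mathcal{L}^{\mu_\theta} \phi(\theta, x_\theta)\,d\theta - \int_s^t \nabla_x \phi(\theta, x_\theta) \cdot g(\theta)\,dv_\theta^c - \sum_{s \le \theta \le t} \big[\phi(\theta, x_{\theta+}) - \phi(\theta, x_\theta)\big],

where v^c denotes the continuous part of v and

\mathcal{L}^u \phi(t, x) = \partial_t \phi(t, x) + b(t, x, u) \cdot \nabla_x \phi(t, x) + \tfrac{1}{2} \operatorname{tr}\big(\sigma\sigma^*(t, x, u)\,D_x^2 \phi(t, x)\big);

a control rule is, roughly, a probability P on the canonical space under which every M^\phi is a martingale.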

These properties hold for any bounded \mathbb{R}^k-valued Borel-measurable function h(\cdot) and will be used repeatedly in the rest of this section.

LEMMA 3.5. Assume that \tau is an \mathcal{F}_t-stopping time, 0 \le \tau \le T, and let (P_\omega) be a regular conditional probability distribution of P given \mathcal{F}_\tau. If (M_t, \mathcal{F}_t, P) is a martingale, then there is a P-null set N such that for \omega \notin N, (M_t, \mathcal{F}_t, P_\omega) is a martingale for t \ge \tau(\omega).

Proof. The proof is comparable to that of Stroock and Varadhan [11, Thm. 1.2.10].

COROLLARY 3.6. If P \in \mathcal{R}_{s,x} and \tau is an \mathcal{F}_t-stopping time, then there is a P-null set N such that for \omega \notin N, (M_t^\phi, \mathcal{F}_t, P_\omega \circ \theta_\tau^{-1}) is a martingale for t \ge \tau(\omega).

The proof is obvious from Lemmas 3.4 and 3.5.

The next two results are important for the rest of the paper. The first one states that a control rule remains a control rule for the problem starting at a later time from the point reached at that time. The second one says that if we take a control rule and at some later time switch to another control rule, then this concatenated object is still a control rule.

PROPOSITION 3.7 (closure under conditioning). If P \in \mathcal{R}_{s,x} and \tau is a stopping time, s \le \tau \le T, then there is a P-null set N such that for \omega \notin N, P_\omega \in \mathcal{R}_{\tau(\omega), x_{\tau(\omega)}(\omega)}.

Proof. Certainly (M_{t \wedge \tau}^\phi, \mathcal{F}_t, P_\omega) is a martingale. By Lemma 3.5, for each m there is a P-null set N_m such that (M_t^{\phi_m}, \mathcal{F}_t, P_\omega) is a martingale for \omega \notin N_m. Let N = \bigcup_m N_m; then P(N) = 0, and for \omega \notin N, (M_t^\phi, \mathcal{F}_t, P_\omega) is a martingale. It is obvious from the definition that, under P_\omega, the state starts from x_{\tau(\omega)}(\omega) at time \tau(\omega) with the singular control restarted at v = 0, so that P_\omega \in \mathcal{R}_{\tau(\omega), x_{\tau(\omega)}(\omega)} for \omega \notin N.

so that J(\tau, P_\cdot) \ge W(\tau, x_\tau) P-a.s. On the other hand, by (b),

(3.10)
W(s, x) \le E^P \Gamma(\tau, W) = E^P \left[ \int_s^{\tau} f(\theta, x_\theta, \mu_\theta)\,d\theta + \int_{[s,\tau)} c(\theta) \cdot dv_\theta \right] + E^P W(\tau, x_\tau)
\le E^P \left[ \int_s^{\tau} f(\theta, x_\theta, \mu_\theta)\,d\theta + \int_{[s,\tau)} c(\theta) \cdot dv_\theta \right] + E^P J(\tau, P_\cdot)
= J(s, P),

and thus (c) is proved if we take the infimum over P \in \mathcal{R}_{s,x} on the right-hand side.

(d) If \Gamma(t, W) is a P-martingale, then

W(s, x) = E^P \Gamma(s, W) = E^P \Gamma(T, W) = E^P \Gamma(T, 0) = J(s, P)

because, from our assumptions, W(T, \cdot) = 0. So P is optimal. If we assume that P is optimal, then by (3.10), Proposition 3.7, and Corollary 3.6,

W(s, x) \le E^P \Gamma(t, W) \le J(s, P) = W(s, x).

Therefore (\Gamma(t, W), \mathcal{F}_t, P) is a submartingale with constant mean value, so it is indeed a martingale.
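In summary, writing

\Gamma(t, W) = \int_s^t f(\theta, x_\theta, \mu_\theta)\,d\theta + \int_{[s,t)} c(\theta) \cdot dv_\theta + W(t, x_t),

the martingale formulation of the dynamic programming principle obtained above reads: (\Gamma(t, W), \mathcal{F}_t, P) is a submartingale for every control rule P \in \mathcal{R}_{s,x}, and P is optimal if and only if it is a martingale.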

4. Continuity of the value function. In the rest of the paper we add the following assumptions: c(\cdot) is Lipschitz continuous, f(\cdot,\cdot,\cdot) is bounded, g = (g^{ij}) is a constant d \times k matrix, and f(\cdot,\cdot,\cdot), b(\cdot,\cdot,\cdot), \sigma(\cdot,\cdot,\cdot) satisfy the following condition:

(4.1) |f(t,x,u) - f(s,y,u)| + |b(t,x,u) - b(s,y,u)| + |\sigma(t,x,u) - \sigma(s,y,u)| \le C\big(|t-s| + |x-y|\big)

uniformly for u \in U. We will prove that under these conditions the value function W(\cdot,\cdot) is uniformly continuous on E. In fact, there exists a constant C \ge 0 such that

|W(t,x) - W(s,y)| \le C\big(|t-s|^{1/2} + |x-y|\big), \quad 0 \le s, t \le T,\ x, y \in \mathbb{R}^d.

Note that the constancy of g is only required in the proof of Theorem 4.2.

THEOREM 4.1. The value function W(s, x) is uniformly Lipschitz continuous in the state variable x; i.e., there exists a constant C > 0 such that

(4.2) |W(s, x') - W(s, x)| \le C|x' - x|, \quad 0 \le s \le T,\ x, x' \in \mathbb{R}^d.
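Before the proof, it may help to see the mechanism behind such estimates: two copies of the state driven by the same Brownian increments and the same control stay within a multiple of their initial distance when b and sigma are Lipschitz in x, by Gronwall's inequality. A minimal numerical illustration of this coupling, with hypothetical coefficients (not from the paper):

import numpy as np

# Coupling illustration for the Lipschitz estimate: two solutions of the same
# SDE driven by the SAME Brownian increments, started at x and x'.  With b and
# sigma Lipschitz in x (condition (4.1)), Gronwall's inequality gives
# E|x_t - x'_t| <= e^{Ct} |x - x'| uniformly on [0, T].

rng = np.random.default_rng(1)

T, n_steps, n_paths = 1.0, 400, 2000
dt = T / n_steps
u = 0.5                                      # same constant control for both

def b(t, x, u):     return np.sin(x) + u     # Lipschitz in x (toy choice)
def sigma(t, x, u): return 0.2 * np.cos(x)   # Lipschitz in x (toy choice)

x  = np.zeros(n_paths)                       # started at x0  = 0
xp = np.full(n_paths, 0.1)                   # started at x0' = 0.1
for i in range(n_steps):
    t = i * dt
    dB = rng.normal(0.0, np.sqrt(dt), n_paths)   # same noise for both copies
    x  += b(t, x,  u) * dt + sigma(t, x,  u) * dB
    xp += b(t, xp, u) * dt + sigma(t, xp, u) * dB

print("E|x_T - x'_T| =", np.abs(xp - x).mean(), " (initial gap 0.1)")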


Proof. In the following we use the same notation C to denote constants, which may change from line to line.

If (5.13) holds at (t_0, x_0), then for a control rule P with v \equiv 0 and a suitable stopping time \tau,

W(t_0, x_0) > E^P \left[ \int_{t_0}^{\tau} f(\theta, x_\theta, \mu_\theta)\,d\theta + W(\tau, x_\tau) \right] = E^P \left[ \int_{t_0}^{\tau} f(\theta, x_\theta, \mu_\theta)\,d\theta + \int_{[t_0,\tau)} c(\theta) \cdot dv_\theta + W(\tau, x_\tau) \right].

This contradicts the dynamic programming principle (cf. (3.9)).

Next, if (5.14) holds at (t_0, x_0) for some i, then we can take h^i > 0 small enough that

\phi(t_0, x_0 + g^i h^i) - \phi(t_0, x_0) < -c^i(t_0)\,h^i,

where g^i denotes the ith column of the d \times k matrix g. Therefore, by (5.12), we have

W(t_0, x_0 + g^i h^i) - W(t_0, x_0) < -c^i(t_0)\,h^i

and, therefore,

W(t_0, x_0) > W(t_0, x_0 + g h) + c(t_0) \cdot h,

where h = (0, \ldots, h^i, \ldots, 0). This contradicts Theorem 5.1, and thus we have shown that W is a subsolution of (5.9).

Now we show that W is also a supersolution of (5.9). If \phi \in C^{1,2}(E) is such that W - \phi has a local minimum point at (t_0, x_0) \in \operatorname{int}(E), then there exists a neighborhood O_1(t_0, x_0) of (t_0, x_0) satisfying

(5.16) W(t, x) - W(t_0, x_0) \ge \phi(t, x) - \phi(t_0, x_0), \quad (t, x) \in O_1(t_0, x_0).


If (5.11) fails, then

\inf_{u \in U} \big(\mathcal{L}^u \phi + f\big) > 0 \quad \text{and} \quad (g^i)^* \nabla_x \phi + c^i > 0

at (t_0, x_0) for i = 1, \ldots, k. From assumption (4.1) and the fact that \phi \in C^{1,2}(E), we can find a neighborhood O_2(t_0, x_0) of (t_0, x_0) such that for some \epsilon > 0,

\inf_{u \in U} \big(\mathcal{L}^u \phi + f\big) > \epsilon \quad \text{and} \quad (g^i)^* \nabla_x \phi + c^i > 0

on O_2(t_0, x_0) for i = 1, \ldots, k. Let O(t_0, x_0) = O_1(t_0, x_0) \cap O_2(t_0, x_0); then for (t, x) \in O(t_0, x_0) and small h we have

\phi(t, x + g h) - \phi(t, x) > -c(t) \cdot h.

Therefore, by (5.16),

W(t, x + g h) - W(t, x) > -c(t) \cdot h

for (t, x) \in O(t_0, x_0) and small h. Hence for P \in \mathcal{R}_{t_0, x_0},

(5.17) P(x_{t_0+} = x_{t_0}) = 1

by Theorem 5.1. Define

\tau = \inf\{t \ge t_0 : (t, x_t) \notin O(t_0, x_0)\};

then from (5.17) we see that P(\tau > t_0) = 1 for P \in \mathcal{R}_{t_0, x_0}, and it can be seen that (t, x_t) \in O(t_0, x_0) for t_0 < t < \tau. Hence

E^P \left[ \int_{t_0}^{\tau} \big(\mathcal{L}^{\mu_\theta} \phi + f\big)(\theta, x_\theta, \mu_\theta)\,d\theta \right] \ge \epsilon\, E^P(\tau - t_0).

Applying Itô's formula and noting that the state process x_\cdot is continuous a.s. for t_0 < t < \tau, we have

E^P \phi(\tau, x_\tau) = \phi(t_0, x_0) + E^P \left[ \int_{t_0}^{\tau} \mathcal{L}^{\mu_\theta} \phi(\theta, x_\theta)\,d\theta + \int_{t_0}^{\tau} \nabla_x \phi(\theta, x_\theta) \cdot g\,dv_\theta \right],

which may be rewritten as

E^P \big[\phi(\tau, x_\tau) - \phi(t_0, x_0)\big] \ge E^P \left[ -\int_{t_0}^{\tau} f(\theta, x_\theta, \mu_\theta)\,d\theta - \int_{t_0}^{\tau} c(\theta) \cdot dv_\theta \right] + \epsilon\, E^P(\tau - t_0).

By (5.16) and the fact that P(\tau > t_0) > 0, we have

E^P \big[W(\tau, x_\tau) - W(t_0, x_0)\big] > E^P \left[ -\int_{t_0}^{\tau} f(\theta, x_\theta, \mu_\theta)\,d\theta - \int_{t_0}^{\tau} c(\theta) \cdot dv_\theta \right]
or

W(t_0, x_0) < E^P \left[ \int_{t_0}^{\tau} f(\theta, x_\theta, \mu_\theta)\,d\theta + \int_{[t_0, \tau)} c(\theta) \cdot dv_\theta + W(\tau, x_\tau) \right],

which contradicts the dynamic programming principle. The proof of this theorem is therefore complete.

Let us define the function space

C(E) = \{W(\cdot,\cdot) : W \in C(E; \mathbb{R}),\ W \text{ bounded},\ |W(t,x) - W(t,y)| \le C|x - y| \text{ for some } C > 0\}.

By Theorem 4.4 we know that the value function W \in C(E). The proof of the next theorem is a modification of the methods used in Fleming and Soner [5]. For details see Suo [12].

THEOREM 5.5. There exists a unique viscosity solution in C(E) to the dynamic programming equation (5.9) with the boundary condition W(T, x) = 0, x \in \mathbb{R}^d, which can be identified as the value function.
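Written out in the notation of this section, the dynamic programming equation (5.9) referred to in Theorem 5.5 is the variational inequality (stated here in the form consistent with the sub- and supersolution cases treated above)

\min\Big\{ \inf_{u \in U} \big[\mathcal{L}^u W(t, x) + f(t, x, u)\big],\ \min_{1 \le i \le k} \big[(g^i)^* \nabla_x W(t, x) + c^i(t)\big] \Big\} = 0, \quad (t, x) \in \operatorname{int}(E),

with the boundary condition W(T, x) = 0, x \in \mathbb{R}^d.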

REFERENCES

[1] P. BILLINGSLEY, Convergence of Probability Measures, John Wiley, New York, 1968.
[2] M. G. CRANDALL, H. ISHII, AND P.-L. LIONS, A user's guide to viscosity solutions, Bull. Amer. Math. Soc. (N.S.), 27 (1992), pp. 1-67.
[3] N. EL KAROUI, HUU NGUYEN, AND M. JEANBLANC-PICQUÉ, Compactification methods in the control of degenerate diffusions: Existence of an optimal control, Stochastics Stochastics Rep., 20 (1987), pp. 169-219.
[4] W. H. FLEMING AND R. W. RISHEL, Deterministic and Stochastic Optimal Control, Springer-Verlag, New York, 1975.
[5] W. H. FLEMING AND H. M. SONER, Controlled Markov Processes and Viscosity Solutions, Springer-Verlag, New York, 1993.
[6] U. G. HAUSSMANN, Existence of optimal Markovian controls for degenerate diffusions, Lecture Notes in Control and Inform. Sci., 78 (1986), pp. 171-186.
[7] U. G. HAUSSMANN AND J. P. LEPELTIER, On the existence of optimal controls, SIAM J. Control Optim., 28 (1990), pp. 851-902.
[8] U. G. HAUSSMANN AND W. SUO, Singular optimal stochastic controls I: Existence, SIAM J. Control Optim., 33 (1995), pp. 916-936.
[9] N. KRYLOV, Controlled Diffusion Processes, Springer-Verlag, New York, 1980.
[10] P.-L. LIONS, Optimal control of diffusion processes and Hamilton-Jacobi-Bellman equations. Part 1: The dynamic programming principle and applications; Part 2: Viscosity solutions and uniqueness, Comm. Partial Differential Equations, 8 (1983), pp. 1101-1174 and pp. 1229-1276.
[11] D. W. STROOCK AND S. R. S. VARADHAN, Multidimensional Diffusion Processes, Springer-Verlag, New York, 1979.
[12] W. SUO, The Existence of Singular Optimal Controls for Stochastic Differential Equations, Ph.D. thesis, University of British Columbia, Vancouver, British Columbia, 1994.