JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 119, No. 1, pp. 1–18, October 2003 (© 2003)

Some Finance Problems Solved with Nonsmooth Optimization Techniques

R. B. VINTER¹ AND H. ZHENG²

Communicated by D. Q. Mayne

Abstract. The purpose of this paper is to draw the attention of the nonsmooth analysis and mathematical finance communities to the scope for applications of nonsmooth optimization to finance by studying in detail two illustrative examples. The first concerns the maximization of a terminal utility function in an investment problem with transaction costs. The second concerns the calculation of the duration of a bond for general term structures of interest rates. The emphasis is on methodology.

Key Words. Nonsmooth optimization, utility maximization, transaction costs, bond duration, general term structure changes.

¹ Professor, Department of Electrical and Electronic Engineering, Imperial College, London, UK.
² Lecturer, Department of Mathematics, Imperial College, London, UK.

0022-3239/03/1000-0001/0 © 2003 Plenum Publishing Corporation

1. Introduction

Optimization has become an indispensable tool in modern finance theory since Markowitz's seminal work on quadratic programming and mean-variance portfolio theory (Ref. 1). As financial markets become increasingly competitive and volatile, portfolio managers face ever more demanding tasks of improving performance while reducing exposure to risks. A range of financial instruments (bonds, shares and derivatives) are available to achieve these objectives. The use of utility functions and optimization is a simple, systematic means of combining these instruments to design investment portfolios. Such well-established optimization techniques as mathematical programming (linear, quadratic, stochastic, etc.) and control theory are now routinely applied in financial engineering and risk management (Ref. 2).

Since the 1970s, a new branch of optimization theory, nonsmooth optimization, has been the subject of intensive study. (For expository accounts, see Ref. 3 or 4.) This provides analytical tools for studying optimization problems involving functions that are not differentiable in the usual sense. Nonsmooth constraint and objective functions arise in various areas of applied optimization. Examples are encountered in optimization problems relating to nonlinear electrical circuits that incorporate devices such as ideal diodes and saturating amplifiers with nonsmooth characteristics. Other examples are to be found in mathematical economics, when account is taken of the banding of interest rates or of the difference between borrowing and lending rates, both features that give rise to a nonsmooth characteristic relating investment and returns.

However, nonsmooth analysis has an important role in optimization, even in situations when the original data functions are differentiable. This is because it is often convenient to study optimization problems by examining the properties of auxiliary functions that are nonsmooth. These include exact penalty functions, which are used to replace constraints by nonsmooth cost terms, as well as value functions, which describe the dependence of the minimum cost on the problem data and are typically nonsmooth for constrained problems. Nonsmooth analysis has been applied in fields as varied as production-inventory control in operations research (Ref. 5), resource economics (Ref. 6) and structure design (Ref. 7).

Nonetheless, the application of nonsmooth optimization to finance has until now been largely unexplored. The purpose of this paper is to draw the attention of the nonsmooth analysis and mathematical finance communities to the scope for applications of this nature, by studying in detail two illustrative examples. The first concerns the maximization of a terminal utility function in an investment problem with transaction costs. This is an extension of a problem studied by Sethi and Thompson (Ref. 8).
The second concerns the calculation of the duration of a bond for general term structures of interest rates. The solution to the first problem is new. The duration formulas have previously been derived by Zheng et al. (Ref. 9), but the simple nonsmooth derivation of this paper appears for the first time. The emphasis is on methodology. Indeed, we could have posed and solved a more complicated version of the investment problem to allow for short selling and additional constraints, but have chosen not to do so, in order not to obscure with other details the role of nonsmooth analysis in the calculations. Sections 3 and 4 covering the two examples are preceded by Section 2 providing a summary of relevant nonsmooth techniques.

2. Nonsmooth Optimization Techniques

Nonsmooth optimization provides analytical tools for studying minimizers of optimization problems involving nondifferentiable functions and constraint sets which fail to have smooth boundaries. The key constructs involved are generalizations of traditional notions of gradient of a function and normal vector to a set, to allow the function to be nondifferentiable and the set to have nonsmooth boundary. These are underpinned by an extensive set of calculus rules for calculating (or at least estimating) extended-sense gradients of composite functions, as well as general principles governing minimizers of nonsmooth optimization problems (generalized Lagrange multiplier rules, sensitivity relationships, etc.). Recent expository accounts of nonsmooth optimization are Refs. 4, 10, 11. An influential early text was Ref. 3.

As far as the applications of this paper are concerned, it is necessary only to define an extended-sense gradient of a nondifferentiable function and to invoke some of its elementary properties. There are a number of possible approaches. We follow one of the earliest, due to Clarke, for its simplicity and because it provides the required precision to permit a complete analysis of our examples. For other examples, the use of different nonsmooth constructs may give fuller information about minimizers.

Recall that, given a subset $A \subset \mathbb{R}^k$ and a function $f : \mathbb{R}^k \to \mathbb{R}$, the function $f$ is said to be Lipschitz continuous on $A$ if there exists a constant $c > 0$ such that

$$|f(y) - f(z)| \le c\,|y - z|, \quad \text{for all } y, z \in A.$$

In many applications of nonsmooth optimization, the nonsmooth functions involved are Lipschitz continuous functions. Indeed, the most commonly encountered nonsmooth functions in engineering applications are perhaps piecewise linear functions, which are special cases of Lipschitz continuous functions. Lipschitz continuous functions therefore provide a suitable framework for extending the notion of derivative. The starting point is the observation that a Lipschitz continuous function on an open set is (Fréchet) differentiable on a subset of the open set having full Lebesgue measure. This is the Rademacher theorem. When attempting to define gradients of a Lipschitz continuous function at a specified point, then, it makes sense to look at limits of gradients at neighboring points where the function is differentiable. This is the idea behind the following definition.

Definition 2.1. Generalized Gradient. Given an open set $A \subset \mathbb{R}^k$, a point $\bar x \in A$ and a Lipschitz continuous function $f : A \to \mathbb{R}$, the generalized gradient of $f$ at $\bar x$ is the set

$$\partial f(\bar x) = \mathrm{co}\,\{\xi \in \mathbb{R}^k : \exists \text{ a sequence } x_i \to \bar x \text{ such that } f \text{ is differentiable at each } x_i \text{ and } \nabla f(x_i) \to \xi\}.$$

Here, co denotes the operation of taking the convex hull. To construct the generalized gradient, then, we generate a subset of $\mathbb{R}^k$ by considering all possible limit points of gradients evaluated along sequences of points converging to $\bar x$ and at which the function is differentiable. The generalized gradient is the convex hull of this set. A key feature of the generalized gradient is that it is possibly set-valued. There are alternative ways of defining the generalized gradient (the original definition was in terms of the polar of the generalized directional derivative), but the limits of gradients definition above is usually the most convenient for calculations.

Example 2.1.

Take $f : \mathbb{R} \to \mathbb{R}$ to be $f(x) = |x|$. Then, $\partial f(0) = [-1, +1]$.

We gather together some properties of the generalized gradient that link it with optimality properties and that are useful in computations.

Properties 2.1. Take a set $A \subset \mathbb{R}^k$, Lipschitz continuous functions $f, g : A \to \mathbb{R}$ and a point $x \in \text{interior } A$.

(i) If $f$ is continuously differentiable on a neighborhood of $x$, then $\partial f(x) = \{\nabla f(x)\}$.
(ii) The inclusion $0 \in \partial f(x)$ is a necessary condition for $x$ to minimize $f$ over $A$. If $A$ is a convex set and $f$ is a convex function, then this condition is also sufficient.
(iii) Sum Rule. $\partial(f + g)(x) \subset \partial f(x) + \partial g(x)$ and, if $g$ is continuously differentiable on a neighborhood of $x$, then $\partial(f + g)(x) = \partial f(x) + \{\nabla g(x)\}$.
(iv) Positive Homogeneity. $\partial(\alpha f)(x) = \alpha\,\partial f(x)$.
(v) Max Rule. $\partial \max\{f, g\}(x) \subset \{\lambda \eta + (1 - \lambda)\xi : \lambda \in [0, 1],\ \xi \in \partial f(x) \text{ and } \eta \in \partial g(x)\}$.

The first example of a mathematical finance problem that we shall study in this paper will have the form of a nonsmooth optimal control problem. We conclude this section by briefly summarizing relevant optimality conditions.
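Before turning to the optimal control setting, the limits-of-gradients construction of Definition 2.1 can be illustrated numerically in one dimension. The following is only a rough sketch, not part of the paper: the function name `sampled_generalized_gradient_1d`, the sampling radius and the finite-difference step are all illustrative assumptions. It samples derivatives at points near the base point and returns the smallest interval containing them, which in one dimension is the convex hull of the sampled values.

```python
def sampled_generalized_gradient_1d(f, x_bar, radius=1e-3, n_samples=2000, h=1e-8):
    """Crude estimate of the generalized gradient of a scalar Lipschitz
    function f at x_bar (hypothetical helper, for illustration only).

    Derivatives are approximated by central differences at sample points
    near x_bar; the interval [min, max] of the sampled values is the convex
    hull of those values in one dimension.
    """
    step = 2.0 * radius / (n_samples - 1)
    grads = []
    for k in range(n_samples):
        # An even number of samples keeps the grid away from x_bar itself.
        x = x_bar - radius + k * step
        grads.append((f(x + h) - f(x - h)) / (2.0 * h))
    return min(grads), max(grads)


if __name__ == "__main__":
    # Example 2.1: f(x) = |x| at 0; the estimate should be close to [-1, +1].
    print(sampled_generalized_gradient_1d(abs, 0.0))
    # A smooth function: the interval collapses towards the ordinary derivative.
    print(sampled_generalized_gradient_1d(lambda x: x * x, 1.0))
```

For a smooth function the interval shrinks to the ordinary derivative as the radius decreases, in line with Property (i); for $|x|$ at the origin it stays close to $[-1, +1]$ however small the radius is taken.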


Consider the optimal control problem:

minimize $g(x(T))$, over Lebesgue measurable functions $u : [0, T] \to \mathbb{R}^m$ and absolutely continuous functions $x : [0, T] \to \mathbb{R}^n$ satisfying

$$\dot x(t) = f(x(t), u(t)), \quad \text{a.e. } t \in [0, T],$$
$$u(t) \in U, \quad \text{a.e. } t \in [0, T],$$
$$x(0) = x_0,$$

the data for which comprise functions $g : \mathbb{R}^n \to \mathbb{R}$ and $f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$, a subset $U \subset \mathbb{R}^m$ and an $n$-vector $x_0$. The domain of the optimal control problem comprises control/state trajectory pairs $(u, x)$, in which the control $u$ is a Lebesgue measurable function satisfying $u(t) \in U$ a.e. and the state trajectory $x$ is a solution to the differential equation $\dot x = f(x, u)$ corresponding to the choice of the control $u$ and the initial condition $x(0) = x_0$. In the usual circumstances, when there is a unique such state trajectory, the choice variable is effectively the control $u$.

We state a simple form of the nonsmooth maximum principle, relating to nonsmooth optimal control problems with free right endpoints. Proofs of the necessary condition are to be found for example in Ref. 3, Chapter 5 or Ref. 4, Chapter 6. A proof of sufficiency, under the stronger hypotheses, appears for example in Ref. 12. The nonsmooth maximum principle is a generalization to problems with nondifferentiable data of the classic maximum principle of Pontryagin, a simple proof of which appears for example in Ref. 13.

Define the Hamiltonian function

$$H(x, p, u) = p \cdot f(x, u).$$

The expression $\partial_x H(x, p, u)$, used below, is to be interpreted as the generalized gradient of $H(\cdot, p, u)$ for fixed $p$, $u$ (the partial generalized gradient). The notation $B$ refers to the closed unit Euclidean ball in $\mathbb{R}^n$,

$$B = \{x \in \mathbb{R}^n : |x| \le 1\}.$$

Theorem 2.1. Take a control/state trajectory pair $(\bar u, \bar x)$ satisfying the constraints of the optimal control problem. Assume that, for some $\varepsilon > 0$, we have:

(H1) $f$ is continuous and there exists $K > 0$ such that $|f(x, u) - f(y, u)| \le K|x - y|$, for all $x, y \in \bar x(t) + \varepsilon B$, $u \in U$;
(H2) $g$ is Lipschitz continuous on $\bar x(T) + \varepsilon B$;
(H3) $U$ is a Borel measurable set.


Then, the existence of an arc (absolutely continuous function) $p : [0, T] \to \mathbb{R}^n$ satisfying

(a) adjoint inclusion, $-\dot p(t) \in \partial_x H(\bar x(t), p(t), \bar u(t))$, for $t \in [0, T]$,
(b) transversality condition, $-p(T) \in \partial g(\bar x(T))$,
(c) maximization of the Hamiltonian condition, $H(\bar x(t), p(t), \bar u(t)) \ge H(\bar x(t), p(t), u)$, for all $u \in U$,

is a necessary condition for $(\bar u, \bar x)$ to be a minimizer. If the following additional hypothesis is satisfied:

(d) $f$ is an affine function, $g$ is a convex function and $U$ is a convex subset,

then existence of an arc $p$ satisfying conditions (a)–(c) is also a sufficient condition for $(\bar u, \bar x)$ to be a minimizer.

3. Utility Maximization with Transaction Costs

A central problem in finance is to determine how an investment should be allocated over time between different types of assets. Suppose that an investor wants to maximize the utility of the terminal wealth subject to various constraints. To simplify the discussion, we assume that there are only two types of investments: cash bond and stock index. In the case that the terminal wealth is the sum of the cash bond and the book value of the stock index, the problem has been studied by Elton and Gruber (Ref. 14) and by Sethi and Thompson (Ref. 8). The optimal strategy is to apply bang-bang control: buy or sell the security at the maximum permissible rate, or do nothing, according to rules which are governed by the current level of investments and the horizon time. The optimal policy depends on the interest rate, the security return rate, and the broker's commission rate, but not on the initial number of shares.

If the objective is to maximize the terminal utility, which is a function of the cash value of the portfolio, then the problem becomes more complicated. The investor cannot simply maximize the book value of the terminal portfolio, because account must be taken of the transaction costs involved in converting it into cash value. Instead, it is necessary to maximize the cash value of the terminal portfolio. Such a problem cannot be solved by means of the standard maximum principle of optimal


control theory, applied by Sethi and Thompson to a simpler problem, since the objective function is no longer differentiable. On the other hand, the dynamic programming approach, followed by Elton and Gruber, which involves consideration of all possible outcomes, is too complex to yield useful insights here. We solve this problem by means of nonsmooth control theory.

Formulation of the utility maximization problem with transaction costs requires the following positive numbers to be specified:

$T$, the time horizon (years),
$r$, the interest rate,
$a$, $0 < a < 1$, the broker's commission rate,
$b$, $0 < b < 1$, the broker's commission rate at termination,
$\mu$, the security return rate,
$x_0$, the initial cash investment in the bond (dollars),
$y_0$, the initial number of shares purchased,
$M$, the upper bound on the rate of buying and selling shares (years$^{-1}$),
$s_0$, the initial price of the security.

Here, it is assumed that $\mu > r$. The price of the security at time $t$ is given by the formula

$$s(t) = e^{\mu t} s_0.$$

The amount of cash $x(t)$ and the number of shares $y(t)$, which we shall interpret as state variable components, evolve according to the following differential equations:

$$\dot x = r x - (1 + a)s(t)u + (1 - a)s(t)v,$$
$$\dot y = u - v,$$

with initial conditions

$$x(0) = x_0, \quad y(0) = y_0.$$

Here, $u(t)$, $v(t)$ are the rates of buying and selling the security, respectively, at time $t$. We shall interpret them as control variable components. They are constrained to satisfy

$$0 \le u \le M, \quad 0 \le v \le M.$$

The investment problem is to choose the controls $u(t)$, $v(t)$, $0 \le t \le T$, to maximize the terminal wealth in cash terms, namely,

$$w(T) = x(T) + y(T)s(T) - b\,|y(T)|\,s(T).$$

This is a nonsmooth optimization problem, owing to the presence of the nondifferentiable term $+\,b\,|y(T)|\,s(T)$ in the cost. Notice that we allow $y(T)$ to be negative (short selling). The above formulation is a deterministic analogue of that used by Davis et al. (Ref. 15) in a somewhat different setting. If $b = 0$, the problem reduces


to the optimal cash balance problem studied in Sethi and Thompson (Ref. 8). In this case, the nonsmooth term disappears, and the problem can be studied with the help of the standard maximum principle.

The solution that we provide is expressed by the following functions of the scalar variable $\eta$, $-1 \le \eta \le +1$:

$$t_1(\eta) := T - \frac{1}{\mu - r}\,\log_e\!\left(\frac{1 + a}{1 - b\eta}\right),$$
$$t_2(\eta) := T - \frac{1}{\mu - r}\,\log_e\!\left(\frac{1 - a}{1 - b\eta}\right),$$

and

$$t^B(\eta) = \begin{cases} 0, & \text{if } t_1(\eta) < 0, \\ t_1(\eta), & \text{if } 0 \le t_1(\eta) \le T, \\ T, & \text{if } T \le t_1(\eta), \end{cases} \qquad t^S(\eta) = \begin{cases} 0, & \text{if } t_2(\eta) < 0, \\ t_2(\eta), & \text{if } 0 \le t_2(\eta) \le T, \\ T, & \text{if } T \le t_2(\eta). \end{cases}$$

Notice that $t_1(\eta) \le t_2(\eta)$ and $t^B(\eta)$, $t^S(\eta)$ are the projections onto the interval $[0, T]$ of $t_1(\eta)$, $t_2(\eta)$ respectively. Accordingly,

$$0 \le t^B(\eta) \le t^S(\eta) \le T.$$

Define also

$$h(\eta) = T - t^S(\eta) - t^B(\eta). \tag{1}$$
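The switching-time functions above are straightforward to evaluate. The sketch below is illustrative only: the function names and the numerical data are hypothetical assumptions (they are not the data used in the paper), and `mu`, `eta` stand for the return rate and the scalar variable of the formulas.

```python
import math


def clip_to_horizon(t, T):
    """Project t onto the interval [0, T]."""
    return min(max(t, 0.0), T)


def switching_times(eta, T, r, mu, a, b):
    """Return (t_B, t_S) for eta in [-1, 1]; assumes mu > r and 0 < a, b < 1."""
    t1 = T - math.log((1.0 + a) / (1.0 - b * eta)) / (mu - r)
    t2 = T - math.log((1.0 - a) / (1.0 - b * eta)) / (mu - r)
    return clip_to_horizon(t1, T), clip_to_horizon(t2, T)


def h(eta, T, r, mu, a, b):
    """The function h of equation (1): T - t_S(eta) - t_B(eta)."""
    t_B, t_S = switching_times(eta, T, r, mu, a, b)
    return T - t_S - t_B


# Hypothetical problem data, chosen only for illustration.
DATA = dict(T=5.0, r=0.05, mu=0.06, a=0.02, b=0.06)

if __name__ == "__main__":
    for eta in (-1.0, 0.0, 1.0):
        t_B, t_S = switching_times(eta, **DATA)
        print(f"eta={eta:+.1f}  t_B={t_B:.4f}  t_S={t_S:.4f}  h={h(eta, **DATA):.4f}")
```

With these made-up data, $h$ is nondecreasing in $\eta$ and changes sign on $[-1, +1]$.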

Theorem 3.1. The utility maximization problem with transaction costs has a solution $\bar u(t)$, $\bar v(t)$, $0 \le t \le T$, unique in an almost everywhere sense, given by the formula

$(\bar u(t), \bar v(t)) = (M, 0)$, if $0 \le t$ [...] $(1 + a)s(t)$. (3)

Here, $s(t) = s(T)\exp(-\mu(T - t))$. Consider the benefits of buying an additional share at time $t$. The amount of cash needed to pay for the share at time $t$ is simply the right side of (3). On the other hand, as we discuss presently, the left side of (3) can be interpreted as the discounted cash value of the share at time $t$. It follows that one should buy shares at the maximum rate, since a net profit per share, namely the difference of the two sides, can thereby be achieved. This course of action is consistent with the claimed optimal strategy when $0 \le t \le t^B(\bar\eta)$.

In justification of the above interpretation, we observe that the book value of the extra share is $s(T)$. Consider either of the cases $y(T) > 0$ or


$y(T) < 0$. According to the theorem statement, the corresponding values of $\bar\eta$ are $+1$ or $-1$, respectively. It follows that $(1 - b\bar\eta)s(T)$ is its cash value, when we take account of the final commission and the possibility that $\bar\eta$ takes values $+1$ or $-1$, according to whether $y(T) > 0$ or $y(T) < 0$, respectively. We deduce that the left side of (3) is indeed the discounted value at time $t$ in these cases. If finally $y(T) = 0$, we can think of $\bar\eta$ as a scaling factor which takes account of the enhanced cash value of the extra share, discounted at time $t$, that can be achieved by combining it with the other shares and adopting an optimal buying/selling strategy. Of course, the appropriate value of $\bar\eta$ here is not obvious, and must be determined by an analysis, such as that carried out in this paper.

Remark 3.2. Incorporation of a Utility Function. The strategy $(\bar u(\cdot), \bar v(\cdot))$ above remains optimal when the cost is replaced by

$$g(x, y) = U\big(x(T) + y(T)s(T) - b\,|y(T)|\,s(T)\big),$$

where $U$ is any monotone increasing utility function. Of course, the optimal cost changes. This is because maximizing a function $g$ is equivalent to maximizing its composition $U \circ g$ with any monotone increasing function $U(\cdot)$.

Remark 3.3. Essential Nonsmoothness of This Problem. The optimal investment problem can be solved by means of the traditional (smooth) maximum principle if a priori information is available assuring that

$$\bar y(T) \ne 0. \tag{4}$$

This is because the smoothness requirements on the data regarding the terminal cost function $g(x, y)$ are merely that $g$ is continuously differentiable on a neighborhood of the optimal endpoint $(\bar x(T), \bar y(T))$, and this condition is satisfied provided (4) is true. With regard to the relevance of nonsmooth analysis in this application, it is then natural to ask whether the case $\bar y(T) = 0$, in which nonsmoothness arises, is an anomalous one, requiring a special choice of problem data, or whether it is in some sense generic.

Consider again the data (2). The optimal number of shares at termination $\bar y(T)$ depends on the size of the initial share holding $y_0$ in relation to the maximum permissible rate of share buying/selling. For the above data, we find that $\bar y(T) = 0$ for all values of $y_0$ and $M$ satisfying

$$0 \le y_0/M \le 4.1242.$$

We have illustrated by example that the case $\bar y(T) = 0$ is not exceptional. This is not surprising. The nonsmooth feature of this problem is the presence of an absolute value function with a corner, i.e. discontinuous derivative at the


origin. It is well known that minimization involving such functions typically results in the minimizer being located at the point of discontinuity of the derivative, as here. This is the phenomenon of exact penalization.

Proof of Theorem 3.1. We formulate the optimal investment problem as the following optimal control problem:

minimize $g(x(T), y(T))$,
subject to
$(\dot x(t), \dot y(t)) = f(t, x(t), y(t), u(t), v(t))$,
$(x(0), y(0)) = (x_0, y_0)$,
$(u(t), v(t)) \in W$,

in which

$$g(x, y) = -\big(x + y\,s(T) - b\,|y|\,s(T)\big),$$
$$f(t, x, y, u, v) = \big(r x - (1 + a)s(t)u + (1 - a)s(t)v,\ u - v\big),$$
$$W = \{(u, v) : 0 \le u \le M,\ 0 \le v \le M\}.$$

The dynamic equations are linear in control and state, the control constraint set $W$ is closed and convex, the initial state is fixed and the terminal cost is a convex function of the state variables $(x, y)$. In these circumstances, existence of a minimizer over the class of measurable controls is assured. See Ref. 13. It is also known that the hypotheses are valid under which the nonsmooth maximum principle applies and is a necessary and sufficient condition for optimality.

The maximum principle applied to the investment problem asserts the existence of costate arcs $(p(t), q(t))$, $0 \le t \le T$, associated with the $x$ and $y$ state variables respectively, such that

$$-\dot p(t) = r\,p(t), \quad 0 \le t \le T, \tag{5a}$$
$$-\dot q(t) = 0, \quad 0 \le t \le T, \tag{5b}$$
$$-(p(T), q(T)) \in \partial g(\bar x(T), \bar y(T)), \tag{6}$$
$$H(t, \bar x(t), \bar y(t), \bar u(t), \bar v(t)) = \max_{(u, v) \in W} H(t, \bar x(t), \bar y(t), u, v).$$

Here,

$$H(t, x, y, u, v) = p\,\big(r x - (1 + a)s(t)u + (1 - a)s(t)v\big) + q\,(u - v).$$

Condition (6) can be equivalently expressed as

$$-p(T) = -1, \quad -q(T) = s(T)(-1 + \eta b), \tag{7}$$


and

$$\eta = \begin{cases} +1, & \text{if } \bar y(T) > 0, \\ -1, & \text{if } \bar y(T) < 0, \end{cases}$$

with $\eta \in [-1, +1]$ if $\bar y(T) = 0$ (compare Example 2.1). We deduce from (5) that

$$p(t) = e^{r(T - t)}.$$

On the other hand, from (7) and the maximization condition,

$$\bar u = \begin{cases} M, & \text{if } -(1 + a) + m(t) > 0, \\ 0, & \text{if } -(1 + a) + m(t) < 0, \end{cases} \tag{8}$$
$$\bar v = \begin{cases} M, & \text{if } (1 - a) - m(t) > 0, \\ 0, & \text{if } (1 - a) - m(t) < 0, \end{cases} \tag{9}$$

where

$$m(t) = \frac{q(t)}{p(t)s(t)} = (1 - \eta b)\,e^{(\mu - r)(T - t)}.$$

The switching conditions in (8) and (9) can be expressed equivalently, for $0 \le t \le T$, as

$$-(1 + a) + m(t) > 0 \iff t < t^B(\eta), \qquad (1 - a) - m(t) > 0 \iff t > t^S(\eta).$$

Notice that

$$y(T) = y_0 + \int_0^T (\bar u - \bar v)\,dt = y_0 + M\big(t^B(\eta) - (T - t^S(\eta))\big) = y_0 - M h(\eta),$$

where $h(\cdot)$ is the function defined by (1). Since $h$ is a monotone increasing function, one of the following three cases must occur.

(C1) $y_0 > M h(+1)$. In this case, the pair $(\bar u, \bar v)$ satisfies the maximum principle if and only if it is given by (8), (9) with $\bar\eta = +1$.
(C2) $M h(-1) \le y_0 \le M h(+1)$. In this case, the pair $(\bar u, \bar v)$ satisfies the maximum principle if and only if it is given by (8), (9), for some $\bar\eta \in [-1, +1]$ satisfying $y_0 = M h(\bar\eta)$.
(C3) $y_0 < M h(-1)$. In this case, the pair $(\bar u, \bar v)$ satisfies the maximum principle if and only if it is given by (8), (9) with $\bar\eta = -1$.

We see that, in all cases, the pair $(\bar u, \bar v)$ given by (8), (9), where $\bar\eta$ is chosen in the manner indicated in the theorem statement, satisfies the conditions of the


maximum principle. Since the maximum principle is necessary and sufficient for optimality, and since the optimization problem has a solution, it follows that $(\bar u, \bar v)$ is an optimal strategy.

It remains to establish uniqueness. Here, we have merely to observe that the control $(\bar u, \bar v)$ satisfying the maximum principle is completely specified (to within a null set) by $\bar\eta$. But since $h(\cdot)$ is a strictly monotone increasing function, there is only one $\bar\eta \in [-1, +1]$ satisfying

$y_0 < M h(-1)$ and $\bar\eta = -1$,
or $y_0 = M h(\bar\eta)$,
or $y_0 > M h(+1)$ and $\bar\eta = +1$.

It follows that the optimal strategy is unique. □
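The case analysis (C1)–(C3) in the proof suggests a simple numerical recipe: select $\bar\eta$ by comparing $y_0$ with $M h(\pm 1)$, solving $y_0 = M h(\bar\eta)$ by bisection in the intermediate case, and then apply the bang-bang controls (8), (9). The sketch below is an illustration under stated assumptions, not the authors' computation: all names and numerical data are hypothetical, and the forward-Euler integrator is a crude stand-in for an exact solution of the state equations.

```python
import math


def switching_times(eta, T, r, mu, a, b):
    """Projected switching times (t_B, t_S) from the formulas preceding (1)."""
    clip = lambda t: min(max(t, 0.0), T)
    t1 = T - math.log((1.0 + a) / (1.0 - b * eta)) / (mu - r)
    t2 = T - math.log((1.0 - a) / (1.0 - b * eta)) / (mu - r)
    return clip(t1), clip(t2)


def h(eta, T, r, mu, a, b):
    t_B, t_S = switching_times(eta, T, r, mu, a, b)
    return T - t_S - t_B


def solve_eta_bar(y0, M, T, r, mu, a, b, tol=1e-12):
    """Choose eta_bar following cases (C1)-(C3); bisection handles (C2)."""
    if y0 > M * h(1.0, T, r, mu, a, b):
        return 1.0
    if y0 < M * h(-1.0, T, r, mu, a, b):
        return -1.0
    lo, hi = -1.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if M * h(mid, T, r, mu, a, b) < y0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate(policy, x0, y0, s0, T, r, mu, a, n_steps=20000):
    """Forward-Euler integration of the cash/share dynamics under policy(t) -> (u, v)."""
    dt = T / n_steps
    x, y = x0, y0
    for k in range(n_steps):
        t = k * dt
        u, v = policy(t)
        s = s0 * math.exp(mu * t)
        x += dt * (r * x - (1.0 + a) * s * u + (1.0 - a) * s * v)
        y += dt * (u - v)
    return x, y


def cash_wealth(x_T, y_T, s_T, b):
    """Terminal wealth in cash terms, w(T)."""
    return x_T + y_T * s_T - b * abs(y_T) * s_T


# Hypothetical data, for illustration only.
P = dict(T=5.0, r=0.05, mu=0.06, a=0.02, b=0.06)
x0, y0, s0, M = 100.0, 2.0, 1.0, 1.0

eta_bar = solve_eta_bar(y0, M, **P)
t_B, t_S = switching_times(eta_bar, **P)
bang_bang = lambda t: (M if t < t_B else 0.0, M if t > t_S else 0.0)
idle = lambda t: (0.0, 0.0)

s_T = s0 * math.exp(P["mu"] * P["T"])
w_opt = cash_wealth(*simulate(bang_bang, x0, y0, s0, P["T"], P["r"], P["mu"], P["a"]), s_T, P["b"])
w_idle = cash_wealth(*simulate(idle, x0, y0, s0, P["T"], P["r"], P["mu"], P["a"]), s_T, P["b"])

if __name__ == "__main__":
    print(f"eta_bar={eta_bar:.6f}, t_B={t_B:.4f}, t_S={t_S:.4f}")
    print(f"wealth (bang-bang)={w_opt:.4f}, wealth (do nothing)={w_idle:.4f}")
```

With these made-up data the intermediate case (C2) applies, so the terminal holding $y_0 - M h(\bar\eta)$ vanishes up to the bisection tolerance, and the bang-bang strategy yields a higher terminal cash wealth than leaving the portfolio untouched.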

4. Calculation of Bond Duration

Suppose that the instantaneous forward interest rate at time $t$ in a bond market is $f(t)$. Consider a bond with positive income streams $c(t_i)$ at the positive times $t_i$, $i = 1, 2, \ldots, N$, respectively, and terminating at time $t_N$. The price of the bond is given by the formula

$$V(f(\cdot)) = \sum_{i=1}^{N} c(t_i)\exp\left\{-\int_0^{t_i} f(s)\,ds\right\}. \tag{10}$$

Inclusion of the argument $f(\cdot)$ serves to emphasize the dependence of the price on the interest rate. The Macaulay duration $D_M$ of the bond (in years) describes the proportional decrease in price of the bond, due to a uniform (or parallel) increase in interest rate over the lifetime of the bond,

$$D_M = -\frac{1}{V(f(\cdot))}\,\frac{d}{dr}V(f(\cdot) + r)\Big|_{r=0}.$$

Its relevance to bond analysis is that it furnishes an approximation to the effects on the price of a uniform change of interest rates $r$, namely,

$$V(f(\cdot) + r) \approx V(f(\cdot))\,(1 - D_M\,r).$$

A simple calculation permits us to deduce from (10) the following explicit formula for the Macaulay duration:

$$D_M = \sum_{i=1}^{N} w_i t_i, \tag{11}$$


in which the $w_i$ are the collection of positive numbers summing to one, given by

$$w_i = \frac{c(t_i)\exp\left\{-\int_0^{t_i} f(s)\,ds\right\}}{\sum_{j=1}^{N} c(t_j)\exp\left\{-\int_0^{t_j} f(s)\,ds\right\}}. \tag{12}$$

In the case that the bond is a zero-coupon bond, with face value $c_0$ and time to maturity $D$, we see from (11) that the Macaulay duration is precisely $D$.

A shortcoming of the Macaulay duration, as regards price sensitivity calculations for bonds, is that it is based on the assumption of parallel changes to term structures, i.e. that the size and direction of rate changes for short term and long term bonds are the same. These conditions are in general not satisfied in the bond market. Consequently, durations based on these assumptions can be misleading and cause mismatch in asset liability management. There has been extensive research on formulating more robust interest rate sensitivity parameters, such as key rate duration (Ref. 16), partial duration (Ref. 17) and, most recently, approximate duration (Ref. 9).

We focus on the approximate duration concept. The idea is to approximate the price $V(f(\cdot))$ of the bond by the price $V_0(f(\cdot))$ of a zero-coupon bond

$$V_0(f(\cdot)) = c_0\exp\left\{-\int_0^{D} f(s)\,ds\right\}, \tag{13}$$

with suitably chosen parameters $c_0$ (the face value) and $D$ (the time to maturity) and then to define the duration of the bond to be $D$, i.e., the duration of the matched zero-coupon bond. To follow this idea through, we need to specify the nature of the approximation.

To get away from the parallel rate change assumption underlying the Macaulay duration, we consider all possible profiles describing the interest rate change variation over the lifetime of the bond, with a uniform upper bound, namely the functions

$$G = \{g(\cdot) : [0, \infty) \to \mathbb{R} : g(\cdot) \text{ is measurable and } |g(t)| \le 1 \text{ for almost all } t\}.$$

Of course, these include $g(\cdot) \equiv 1$, corresponding to a uniform rate change. For the given forward rate $f(\cdot)$ and an arbitrary rate change variation $g(\cdot)$, define

$$DV(f(\cdot); g(\cdot)) := \frac{d}{dr}V(f(\cdot) + r\,g(\cdot))\Big|_{r=0}.$$

Likewise define $DV_0(f(\cdot); g(\cdot))$. As the notation suggests, $DV$ and $DV_0$ are, in fact, directional derivatives, but it is not required here to specify the underlying function spaces and make these identifications precise.


Matching the price sensitivities of the bond and the approximating zero-coupon bond for all interest change profiles would require us to find $c_0$ and $D$ such that

$$V(f(\cdot)) = V_0(f(\cdot)) \quad \text{and} \quad DV(f(\cdot); g(\cdot)) = DV_0(f(\cdot); g(\cdot)),$$

for all $g(\cdot) \in G$. This is not possible, unless the original bond is itself a zero-coupon bond because, otherwise, there are too many independent relations to be satisfied by choice of only two parameters $c_0$ and $D$. We can however choose $c_0$ and $D$ to minimize the worst-case deviation of the price sensitivities, over all possible rate change profiles. In this manner, we define the approximate duration.

Definition 4.1. $D_{\text{approx}}$ is an approximate duration for the bond if, for some $c_0$, the pair $(c_0, D_{\text{approx}})$ minimizes

$$\max_{g(\cdot) \in G} |DV(f(\cdot); g(\cdot)) - DV_0(f(\cdot); g(\cdot))|$$

over elements $(c_0, D)$ satisfying $V(f(\cdot)) = V_0(f(\cdot))$. Here $V(\cdot)$ and $V_0(\cdot)$ are defined according to (10) and (13).

A simple characterization of approximate duration is provided by the following theorem.

Theorem 4.1. Characterization of Approximate Duration. Let $w_1, w_2, \ldots, w_N$ be defined according to (12). Suppose that there exists an integer $i_0 \in \{1, 2, \ldots, N\}$ satisfying

$$\sum_{i < i_0} w_i < \sum_{i \ge i_0} w_i \quad \text{and} \quad \sum_{i \le i_0} w_i > \sum_{i > i_0} w_i. \tag{14}$$

Then, the approximate duration $D_{\text{approx}}$ is unique and is

$$D_{\text{approx}} = t_{i_0}.$$

Otherwise, there exists an integer $i_0 \in \{1, 2, \ldots, N - 1\}$ satisfying

$$\sum_{i \le i_0} w_i = \sum_{i > i_0} w_i. \tag{15}$$

In this case, the approximate duration is not unique and the set of approximate durations comprises the numbers $D_{\text{approx}}$ satisfying

$$t_{i_0} \le D_{\text{approx}} \le t_{i_0 + 1}.$$

Remark 4.1. Zheng et al. (Ref. 9) provide a detailed discussion of the approximate duration and also numerical studies, based on historical interest rate profiles, demonstrating the advantages of using the approximate duration in predicting price sensitivities to interest rate changes, as


compared with other duration concepts. We mention merely that, while the concept of approximate duration was devised to match a bond to a zero-coupon bond, approximate duration has a role also in providing a simplified description of a fixed-income portfolio involving multiple cash flows, where perhaps these cash flows span a long period, and which cannot satisfactorily be matched to a single zero-coupon bond. For such products, the cash flows can be split into several groups according to their timing, such as short term, medium term, and long term, and the approximate duration can be used to approximate each group by a single zero-coupon bond.

Remark 4.2. The theorem tells us that the approximate duration is the median time of the cash flows of the underlying coupon bond. This result was first obtained by Zheng et al. (Ref. 9) by means of discrete approximation and linear programming. We provide a simple alternative proof based on analyzing the set of generalized gradients of the function $H(\cdot)$, defined in the proof below, at $D_{\text{approx}}$, to illustrate the application of nonsmooth analysis in this context. The problem of characterizing the approximate duration is essentially nonsmooth, to the extent that $H(\cdot)$ is not differentiable at $D_{\text{approx}}$, outside the case (15), which is rather exceptional.

Proof of Theorem 4.1. Denote by

$$\chi_{[a, b]}(s) = \begin{cases} 1, & \text{if } s \in [a, b], \\ 0, & \text{if } s \notin [a, b], \end{cases} \qquad \operatorname{sign}(s) = \begin{cases} 1, & \text{if } s > 0, \\ -1, & \text{if } s \le 0, \end{cases}$$

the indicator function of the interval $[a, b]$ and the sign function respectively. Take any nonnegative numbers $c_0$, $D$ such that $V(f) = V_0(f)$. A simple calculation yields the formula

$$DV(f; g) - DV_0(f; g) = -V(f)\int_0^\infty \left[\sum_{i=1}^{N} w_i \chi_{[0, t_i]}(s) - \chi_{[0, D]}(s)\right] g(s)\,ds,$$

for $g(\cdot) \in G$. Since the maximum of the absolute value of the expression on the right side over $g(\cdot) \in G$ is achieved at

$$g(\cdot) = \operatorname{sign}\left[\sum_{i=1}^{N} w_i \chi_{[0, t_i]}(\cdot) - \chi_{[0, D]}(\cdot)\right],$$

we have

$$\max_{g(\cdot) \in G} |DV(f(\cdot); g(\cdot)) - DV_0(f(\cdot); g(\cdot))| = V(f)\,H(D), \tag{16}$$


where

$$H(D) := \int_0^\infty \left|\sum_{i=1}^{N} w_i \chi_{[0, t_i]}(s) - \chi_{[0, D]}(s)\right| ds.$$

Now, for any $D \ge 0$, we can always find $c_0 \ge 0$ to satisfy $V(f) = V_0(f)$. Furthermore, the right side of (16) depends only on $D$ and not on $c_0$. Since $V(f)$ is positive and does not depend on $c_0$ or $D$, it follows that $D^*$ is an approximate duration if and only if $D^*$ minimizes $H(D)$ over $D \in [0, \infty)$. Further calculations give

$$\begin{aligned}
H(D) &= \int_0^D \Big|\sum_i w_i \chi_{[0, t_i]}(s) - 1\Big|\,ds + \int_D^\infty \Big|\sum_i w_i \chi_{[0, t_i]}(s)\Big|\,ds \\
&= \int_0^D \Big(1 - \sum_i w_i \chi_{[0, t_i]}(s)\Big)\,ds + \int_D^\infty \sum_i w_i \chi_{[0, t_i]}(s)\,ds \\
&= D - \sum_i w_i \int_0^D \chi_{[0, t_i]}(s)\,ds + \sum_i w_i \int_D^\infty \chi_{[0, t_i]}(s)\,ds \\
&= D - \sum_i w_i \min(D, t_i) + \sum_i w_i \max(t_i - D, 0) \\
&= \sum_i w_i\,[D - \min(D, t_i) + \max(t_i - D, 0)] \\
&= \sum_i w_i\,|D - t_i|.
\end{aligned}$$

Since the $t_i$ are positive, we see from the last expression that $D = 0$ is not a minimizer. It follows that $D_{\text{approx}} = 0$ cannot be an approximate duration. Since $H(\cdot)$ is continuous, bounded below and $\lim_{D \to \infty} H(D) = \infty$, $H(\cdot)$ achieves a minimum over $[0, \infty)$. But $H(\cdot)$ is a convex, Lipschitz continuous function and $D = 0$ is not a minimizer. It follows from the results of Section 2 that $D^*$ is a minimizer of $H(\cdot)$ if and only if

$$D^* > 0 \quad \text{and} \quad 0 \in \partial H(D^*). \tag{17}$$

Now, by the nonsmooth calculus rules of Section 2,

$$\partial H(D^*) = \begin{cases} \displaystyle\sum_{i < i_0} w_i - \sum_{i > i_0} w_i + w_{i_0}[-1, 1], & \text{if } D^* = t_{i_0} \text{ for some } i_0, \\[2ex] \displaystyle\Big\{\sum_{i \le i_0} w_i - \sum_{i > i_0} w_i\Big\}, & \text{if } t_{i_0} < D^* < t_{i_0 + 1} \text{ for some } i_0. \end{cases}$$

If there exists $i_0$ such that (14) is true, we deduce from (17) that $D^* = t_{i_0}$. If, on the other hand, there exists $i_0$ such that (15) is true (in this case, we must have $i_0 < N$), then condition (17) is satisfied precisely by those $D^*$ satisfying $t_{i_0} \le D^* \le t_{i_0 + 1}$. This is what the theorem asserts. □
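The characterization in Theorem 4.1 amounts to a weighted-median computation, which the following sketch illustrates alongside the Macaulay duration (11). It is an illustration under assumptions, not the authors' code: the function names, the trapezoidal quadrature, the flat 5% forward rate and the sample bond are all hypothetical.

```python
import math


def discount_integral(forward_rate, t, n=1000):
    """Trapezoidal approximation of the integral of forward_rate over [0, t]."""
    step = t / n
    total = 0.5 * (forward_rate(0.0) + forward_rate(t))
    total += sum(forward_rate(k * step) for k in range(1, n))
    return total * step


def cash_flow_weights(times, cash_flows, forward_rate):
    """Weights w_i of (12): discounted cash flows normalized to sum to one."""
    pv = [c * math.exp(-discount_integral(forward_rate, t))
          for t, c in zip(times, cash_flows)]
    total = sum(pv)
    return [p / total for p in pv]


def macaulay_duration(times, w):
    """Formula (11): weighted mean of the payment times."""
    return sum(wi * ti for wi, ti in zip(w, times))


def approximate_duration(times, w, tol=1e-12):
    """Weighted median of the payment times, returned as an interval (lo, hi).

    Assumes times are sorted increasingly. lo == hi in the unique case (14);
    lo < hi in the tie case (15).
    """
    cum = 0.0
    for i, (ti, wi) in enumerate(zip(times, w)):
        cum += wi
        if abs(cum - 0.5) <= tol and i + 1 < len(times):
            return ti, times[i + 1]
        if cum > 0.5:
            return ti, ti
    return times[-1], times[-1]


def H(D, times, w):
    """The function H(D) = sum_i w_i |D - t_i| from the proof."""
    return sum(wi * abs(D - ti) for wi, ti in zip(w, times))


# Hypothetical 5-year bond with annual coupons of 5 and face value 100,
# priced under a flat 5% forward rate.
times = [1.0, 2.0, 3.0, 4.0, 5.0]
flows = [5.0, 5.0, 5.0, 5.0, 105.0]
w = cash_flow_weights(times, flows, lambda s: 0.05)

if __name__ == "__main__":
    print("Macaulay duration:", round(macaulay_duration(times, w), 4))
    print("Approximate duration interval:", approximate_duration(times, w))
```

For this made-up bond the final payment carries more than half of the total weight, so the weighted median sits at the maturity date, while the Macaulay duration, being a weighted mean, is somewhat shorter.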


References

1. MARKOWITZ, H. M., Portfolio Selection, Journal of Finance, Vol. 7, pp. 77–91, 1952.
2. ZENIOS, S. A., Financial Optimization, Cambridge University Press, Cambridge, UK, 1993.
3. CLARKE, F. H., Optimization and Nonsmooth Analysis, Wiley, New York, NY, 1983.
4. VINTER, R. B., Optimal Control, Birkhäuser, Boston, Massachusetts, 2000.
5. HARTL, R. F., and SETHI, S. P., Optimal Control Problems with Differential Inclusions: Sufficiency Conditions and an Application to a Production-Inventory Model, Optimal Control Applications and Methods, Vol. 5, pp. 289–307, 1984.
6. CLARK, C. W., CLARKE, F. H., and MONRO, G. R., The Optimal Exploitation of Renewable Resource Stocks, Econometrica, Vol. 47, pp. 25–47, 1979.
7. COX, S. J., and OVERTON, M., On the Optimal Design of Columns against Buckling, SIAM Journal on Mathematical Analysis, Vol. 23, pp. 287–325, 1992.
8. SETHI, S. P., and THOMPSON, G. L., Optimal Control Theory: Applications to Management Science and Economics, Martinus Nijhoff, Boston, Massachusetts, 1981.
9. ZHENG, H., THOMAS, L. T., and ALLEN, D. E., The Duration Derby: A Comparison of Duration-Based Strategies in Asset Liability Management, Journal of Bond Trading and Management, Vol. 1, pp. 371–380, 2003.
10. CLARKE, F. H., LEDYAEV, Y. S., STERN, R. J., and WOLENSKI, P. R., Nonsmooth Analysis and Control Theory, Graduate Texts in Mathematics, Springer Verlag, New York, NY, Vol. 178, 1998.
11. ROCKAFELLAR, R. T., and WETS, R. J. B., Variational Analysis, Grundlehren der Mathematischen Wissenschaften, Springer Verlag, Berlin, Germany, Vol. 317, 1998.
12. BENSOUSSAN, A., Perturbational Methods in Optimal Control, Wiley/Gauthier-Villars, Paris, France, 1991.
13. FLEMING, W. H., and RISHEL, R. W., Deterministic and Stochastic Optimal Control, Springer Verlag, New York, NY, 1975.
14. ELTON, E. J., and GRUBER, M. J., Finance as a Dynamic Process, Prentice-Hall, Englewood Cliffs, New Jersey, 1975.
15. DAVIS, M. H., PANAS, V. G., and ZARIPHOPOULOU, T., European Option Pricing with Transaction Costs, SIAM Journal on Control and Optimization, Vol. 31, pp. 470–493, 1993.
16. HO, T., Key Rate Durations: Measures of Interest Rate Risks, Journal of Fixed Income, Vol. 2, pp. 29–44, 1992.
17. COOPER, I. A., Asset Values, Interest Rate Changes, and Duration, Journal of Financial and Quantitative Analysis, Vol. 14, pp. 343–349, 1977.