ISSN 1064-2307, Journal of Computer and Systems Sciences International, 2010, Vol. 49, No. 5, pp. 710–718. © Pleiades Publishing, Ltd., 2010. Original Russian Text © E.Yu. Ignashchenko, A.R. Pankov, K.V. Semenikhin, 2010, published in Izvestiya Akademii Nauk. Teoriya i Sistemy Upravleniya, 2010, No. 5, pp. 32–40.

CONTROL IN STOCHASTIC SYSTEMS AND UNDER UNCERTAINTY CONDITIONS

A Statistical Minimax Approach to Optimizing Linear Models under A Priori Uncertainty Conditions

E. Yu. Ignashchenko, A. R. Pankov, and K. V. Semenikhin

Moscow Institute of Aviation (Technical University), Volokolamskoe sh. 4, Moscow, 125993 Russia

Received March 29, 2010

Abstract—A statistical minimax method is proposed for optimizing linear models whose parameters are known only up to membership in some uncertainty sets. Statistical methods for constructing the uncertainty sets as confidence regions with a given reliability level are presented. A numerical method for finding a minimax strategy is proposed for arbitrary uncertainty sets satisfying convexity and compactness conditions. A number of examples admitting an analytical solution to the optimization problem are considered. Results of numerical simulation are given.

DOI: 10.1134/S1064230710050059

0. INTRODUCTION

Recent years have seen extensive research activity on topical problems of optimizing estimation and control strategies in linear stochastic systems under incomplete information on the system parameters and on the characteristics of the acting disturbances [1–14]. Many constructive results in this area were obtained with the minimax approach, which optimizes a strategy against the worst combination of the uncertain model parameters within some a priori given sets of admissible values (uncertainty sets).

In this work, we study a conditional minimization problem for the variance of the linear stochastic functional $l(x) = \langle \xi, x \rangle$:

$x^0 \in \arg\min_{x \in X} \mathrm{D}[\langle \xi, x \rangle]$,   (0.1)

where $x \in \mathbb{R}^p$ is the vector of strategies to be optimized and $\xi \in \mathbb{R}^p$ is the vector of random parameters and disturbances of the model, with mathematical expectation $\mathrm{M}[\xi] = \mu$ and covariance matrix $\mathrm{cov}[\xi, \xi] = V$. The set of admissible strategies $X$ is given by a system of linear equalities and inequalities of the form

$X = \{x \in \mathbb{R}^p : Ax = a, \; Bx \le b\}$,   (0.2)

with the matrix parameters $A \in \mathbb{R}^{l \times p}$, $a \in \mathbb{R}^l$, $B \in \mathbb{R}^{k \times p}$, and $b \in \mathbb{R}^k$ assumed to be given. One can easily see that optimization problem (0.1), (0.2) can be represented as

$x^0 \in \arg\min_{x \in X} J(x, V)$,

where

$J(x, V) = \langle Vx, x \rangle$.   (0.3)

The quadratic programming problem given by relations (0.1)–(0.3) is of great theoretical and applied

value. In particular, problems of statistical identification of the parameters of linear systems [1–3, 5, 7, 13], discrete linear stochastic filtering [1, 3–5, 9, 14], risk management [8, 10–12], and various other problems of information processing and decision making under stochastic uncertainty can be reduced to the stated problem.

Quadratic programming theory is well elaborated for the case when all model parameters are assumed to be known exactly. In this case, the solution is easily found by, for instance, second-order cone programming (SOCP) algorithms [6, 11, 15]. In practice, however, unknown parameter values are generally replaced by estimates constructed from statistical data, and the solution to the optimization problem then depends significantly on the accuracy of those estimates. For instance, the assumption that the covariance matrix $V$ entering optimization criterion (0.3) is known exactly is not realistic [1–4, 11]. To take this into account and make the solution less sensitive to uncertain initial data, it is reasonable to modify the original optimization problem as

$\hat{x} \in \arg\min_{x \in X} \max_{V \in \mathcal{V}} J(x, V)$,   (0.4)

where $\mathcal{V}$ is a set of nonnegative definite symmetric matrices, which we call an uncertainty set for the covariance matrix $V$.

We can apply the results of duality theory [1, 2, 8, 12, 13] or SOCP algorithms for minimax problems [6, 9–11] to solve problem (0.4) for different types of uncertainty sets $\mathcal{V}$. Yet, the solution to the stated problem remains incomplete unless we develop an efficient way to construct the uncertainty set $\mathcal{V}$ itself. One way is to use, as the uncertainty set $\mathcal{V}$, a confidence region of acceptable reliability constructed by processing statistical information on the vector $\xi$. The idea to combine the minimax approach and statistical estimation methods was proposed in [10] for optimizing a portfolio of financial instruments, with the results of SOCP theory used to construct the solution.

In this work, we propose to combine the statistical approach to forming uncertainty sets with the dual methods of minimax optimization studied in [8, 12, 13]. This approach yields a simple and efficient numerical method for finding minimax strategies for an arbitrary confidence set and allows solving the minimax optimization problem analytically for some particular types of sets.

In this work, we use the following notation: $\mathrm{M}[\xi]$ and $\mathrm{cov}[\xi, \xi]$ are the mathematical expectation and covariance matrix of the random vector $\xi$; $\xi_n \xrightarrow{D} L$ denotes the convergence in distribution of the sequence $\{\xi_n\}$ of random elements to an element with distribution $L$; $\mathrm{col}[x, y]$ is the block column vector composed of $x$ and $y$; $\|\cdot\|$ and $\langle \cdot, \cdot \rangle$ are the Euclidean norm and scalar product in $\mathbb{R}^p$; $I$ is the unit matrix of the appropriate dimension; $A^*$, $A^+$, and $\mathrm{tr}[A]$ are the transpose, pseudoinverse, and trace of the matrix $A$;

$\|A\| = \max_{\|x\| \le 1} \|Ax\|$, $\|A\|_2^2 = \mathrm{tr}[AA^*]$, $\|A\|_\infty = \max_{i,j} |a_{ij}|$

are the spectral, Frobenius, and uniform (Chebyshev) norms of the matrix $A$, respectively; $\mathbb{R}_+^{p \times p}$ is the set of all symmetric nonnegative definite matrices of dimension $p \times p$; $V \succeq W$ means that $V - W \in \mathbb{R}_+^{p \times p}$; $V \succ 0$ means that the matrix $V$ is symmetric and positive definite; $\arg\min_{x \in X} f(x)$ and $\arg\max_{x \in X} f(x)$ are the sets of points of the global minimum and maximum of the function $f(x)$ on the set $X$, respectively.

1. SOLVING THE PROBLEM OF MINIMAX QUADRATIC PROGRAMMING

We consider a general method, based on duality theory, for solving the minimax quadratic programming problem of the form

$\hat{x} \in \arg\min_{x \in X} \sup_{V \in \mathcal{V}} J(x, V)$.   (1.1)

In what follows, we assume that the set of admissible strategies $X$ is given by (0.2) and that the set $\mathcal{V}$ is a convex compact in $\mathbb{R}_+^{p \times p}$. Instead of (1.1), we consider the equivalent problem

$\hat{x} \in \arg\min_{x \in X_0} \sup_{V \in \mathcal{V},\, \lambda \ge 0} L(x, V, \lambda)$,   (1.2)

where the Lagrange function takes the form

$L(x, V, \lambda) = \langle Vx, x \rangle + 2\langle Bx - b, \lambda \rangle$,   (1.3)

and $X_0 = \{x \in \mathbb{R}^p : Ax = a\}$ is the set of admissible strategies given only by the system of linear equations. For convenience, we state the problem dual to (1.2):

$(\hat{V}, \hat{\lambda}) \in \arg\max_{V \in \mathcal{V},\, \lambda \in \Lambda} \bar{L}(V, \lambda)$,   (1.4)

$\bar{L}(V, \lambda) = \inf_{x \in X_0} L(x, V, \lambda)$,   (1.5)

where $\Lambda$ is the set of admissible values of the Lagrange multipliers and $\bar{L}(V, \lambda)$ is the dual functional.

To construct a minimax strategy, we need to solve the following subproblems:

(1) find analytically the optimal solution

$x(V, \lambda) \in \arg\min_{x \in X_0} L(x, V, \lambda)$

for arbitrary $V \in \mathcal{V}$ and $\lambda \in \Lambda$;

(2) find the optimal values $\hat{V}$ and $\hat{\lambda}$ of the dual variables

$(\hat{V}, \hat{\lambda}) \in \arg\max_{V \in \mathcal{V},\, \lambda \in \Lambda} \bar{L}(V, \lambda)$,

where $\bar{L}(V, \lambda)$ is given in (1.5);

(3) calculate the minimax strategy $\hat{x} = x(\hat{V}, \hat{\lambda})$ and the respective guaranteed value of the criterion

$\hat{J} = \min_{x \in X_0} \sup_{V \in \mathcal{V},\, \lambda \ge 0} L(x, V, \lambda) = J(\hat{x}, \hat{V}) = L(\hat{x}, \hat{V}, \hat{\lambda})$.

The following proposition justifies this approach.

T h e o r e m 1. Suppose

(1) the set of strategies $X$ satisfies the Slater condition: $\exists \bar{x} : A\bar{x} = a$, $B\bar{x} < b$;

(2) the set $\Lambda$ has the form $\Lambda = \{\lambda \in \mathbb{R}^k : 0 \le \lambda \le \lambda^{\max}\}$, where $\lambda^{\max} = [\lambda_1^{\max}, \ldots, \lambda_k^{\max}]^*$,

$\lambda_i^{\max} = \dfrac{\max_{V \in \mathcal{V}} J(\bar{x}, V)}{2(b - B\bar{x})_i}$, $i = 1, \ldots, k$,

and $(b - B\bar{x})_i$ is the $i$th component of the vector $b - B\bar{x}$.

Then,

(1) minimax strategy $\hat{x}$ (1.1) and the solution $(\hat{V}, \hat{\lambda})$ to dual problem (1.4) exist;

(2) dual functional (1.5) can be represented analytically as

$\bar{L}(V, \lambda) = \langle [V - V(QVQ)^+V] x_0, x_0 \rangle - \langle B(QVQ)^+B^*\lambda, \lambda \rangle + 2\langle B[x_0 - (QVQ)^+Vx_0] - b, \lambda \rangle$   (1.6)

if $[I - (QVQ)^+V] QB^*\lambda = 0$, and $\bar{L}(V, \lambda) = -\infty$ otherwise (here, $Q = I - A^+A$ and $x_0 = A^+a$);

(3) if $\hat{V} \succ 0$, the minimax strategy can be represented analytically as

$\hat{x} = x(\hat{V}, \hat{\lambda}) = [I - (Q\hat{V}Q)^+\hat{V}] x_0 - (Q\hat{V}Q)^+ B^* \hat{\lambda} = \hat{V}^{-1}A^*(A\hat{V}^{-1}A^*)^+ (a + A\hat{V}^{-1}B^*\hat{\lambda}) - \hat{V}^{-1}B^*\hat{\lambda}$;   (1.7)

(4) if the set $\mathcal{V}$ includes only positive definite matrices, the minimax solution is unique.

The proof of the theorem follows from the duality theory for quadratic programming problems and the results obtained in [8].

Note that solving dual problem (1.4) is critical to constructing the minimax strategy. In the next section, we describe an efficient numerical method for solving it and give examples of uncertainty sets for which problem (1.4) can be solved analytically.
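To make subproblem (1) of the above scheme concrete, here is a minimal numerical sketch of the map $x(V, \lambda)$ from formula (1.7) for positive definite $V$; the function name and the use of NumPy are ours, not the authors'.

```python
import numpy as np

def primal_strategy(V, lam, A, a, B):
    """Minimizer x(V, lam) of the Lagrangian <Vx, x> + 2<Bx - b, lam>
    over the affine set {x : Ax = a}, per formula (1.7) with V > 0."""
    Vi_A = np.linalg.solve(V, A.T)            # V^{-1} A*
    Vi_B = np.linalg.solve(V, B.T)            # V^{-1} B*
    M = np.linalg.pinv(A @ Vi_A)              # (A V^{-1} A*)^+
    return Vi_A @ (M @ (a + A @ (Vi_B @ lam))) - Vi_B @ lam
```

A quick consistency check is that $A\,x(V, \lambda) = a$ holds for any $\lambda$, since the first term projects back onto the affine constraint set.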

2. METHODS FOR SOLVING DUAL OPTIMIZATION PROBLEMS

We consider an iterative algorithm for finding the solution to the dual problem, assuming that all matrices $V \in \mathcal{V}$ are positive definite.

A l g o r i t h m 1. Suppose $s$ is the iteration number.

S t e p 1. Put $s = 0$, $\lambda^0 = 0$ and choose $V^0 \in \mathcal{V}$ arbitrarily.

S t e p 2. Use formula (1.7) to calculate the current approximation of the strategy $x^s = x(V^s, \lambda^s)$.

S t e p 3. Solve the linear programming problem

$\bar{V}^s \in \arg\max_{V \in \mathcal{V}} J(x^s, V)$   (2.1)

and put

$\bar{\lambda}_i^s = \begin{cases} 0, & (Bx^s - b)_i \le 0, \\ \lambda_i^{\max}, & (Bx^s - b)_i > 0, \end{cases}$ for $i = 1, \ldots, k$.

S t e p 4. Use (1.3) to calculate $d_s = L(x^s, \bar{V}^s, \bar{\lambda}^s) - L(x^s, V^s, \lambda^s) \ge 0$.

S t e p 5. If $d_s > 0$, go to step 6; otherwise, put $\hat{x} = x^s$, $\hat{V} = V^s$, $\hat{\lambda} = \lambda^s$ and finish the iterations.

S t e p 6. Solve the one-dimensional optimization problem

$\gamma_s \in \arg\max_{\gamma \in [0, 1]} \bar{L}((1 - \gamma)V^s + \gamma\bar{V}^s, (1 - \gamma)\lambda^s + \gamma\bar{\lambda}^s)$.

S t e p 7. Find the next approximations of the dual variables by the formulas

$V^{s+1} = (1 - \gamma_s)V^s + \gamma_s\bar{V}^s$, $\lambda^{s+1} = (1 - \gamma_s)\lambda^s + \gamma_s\bar{\lambda}^s$,

increase the iteration number $s$ by unity, and go to step 2 of the algorithm.

The following theorem states that Algorithm 1 is convergent.

T h e o r e m 2. Suppose the set $\mathcal{V}$ includes only positive definite matrices and the sequence $\{(x^s, V^s, \lambda^s)\}$ is obtained by Algorithm 1. Then, the sequence $\{x^s\}$ converges to the unique minimax strategy $\hat{x}$, and the sequence $\{(V^s, \lambda^s)\}$ converges to the set of solutions of the dual problem:

$\|x^s - \hat{x}\| \to 0$, where $\hat{x} = \arg\min_{x \in X} \sup_{V \in \mathcal{V}} J(x, V)$,

$\inf\{\|V^s - \hat{V}\| + \|\lambda^s - \hat{\lambda}\| : (\hat{V}, \hat{\lambda}) \in \arg\max_{V \in \mathcal{V},\, \lambda \in \Lambda} \bar{L}(V, \lambda)\} \to 0$

as $s \to \infty$.

The stated result is a special case of Theorem 4 from [8] (see also [12, 13]). Moreover, these propositions can be derived from one common result, given here to justify that the conditional gradient method can be applied to the minimax optimization problem

$\hat{Y} = \arg\max_{y \in Y} \varphi(y)$, $\varphi(y) = \inf_{x \in X} f(x, y)$,   (2.2)

where $X$ and $Y$ are given sets and $f(x, y)$ is a known numerical function. Following [16], the sequence of approximations $\{y^s\}$ of the conditional gradient method is given by the relations

$\bar{y}^s \in \arg\max_{y \in Y} \langle \nabla\varphi(y^s), y - y^s \rangle$,   (2.3)

$d_s = \langle \nabla\varphi(y^s), \bar{y}^s - y^s \rangle$, $\gamma_s \in \arg\max_{\gamma \in [0, 1]} \varphi(y^s + \gamma(\bar{y}^s - y^s))$,   (2.4)

$y^{s+1} = y^s + \gamma_s(\bar{y}^s - y^s) \in Y$, $s = 1, 2, \ldots$,   (2.5)

and the algorithm terminates at the $s$th iteration if $d_s = 0$. Below, we state the result on the convergence of the described sequence of approximations.

T h e o r e m 3. Suppose

(a) $X$ is a closed subset of a finite-dimensional space;

(b) $Y$ is a convex compact in a finite-dimensional space $\Psi$;

(c) for each $x \in X$, the function $f(x, y)$ is concave with respect to $y \in \Psi$;

(d) the function $f(x, y)$ and its gradient with respect to the second variable are continuous on $X \times \Psi$;

(e) for each $y \in Y$, the solution to the minimization problem $x(y) = \arg\min_{x \in X} f(x, y)$ exists and is unique;

(f) the set $\{x(y) : y \in Y\}$ is bounded.

Then, the mapping $x : Y \to X$ is continuous and the function $\varphi : Y \to \mathbb{R}$ is concave and continuously differentiable, with its gradient calculated by the rule

$\nabla\varphi(y) = \nabla_\upsilon f(x(y), \upsilon)|_{\upsilon = y}$ $\forall y \in Y$;   (2.6)

the sequence $\{y^s\}$ described by conditions (2.3)–(2.5) converges to the set of solutions $\hat{Y}$ of problem (2.2) and is maximizing, i.e.,

$\inf_{\hat{y} \in \hat{Y}} \|y^s - \hat{y}\| \to 0$, $\varphi(y^{s+1}) \to \max_{y \in Y} \varphi(y)$, $s \to \infty$,

with either $\varphi(y^{s+1}) > \varphi(y^s)$ or $y^s \in \hat{Y}$ at each iteration.

It now follows from (2.6) that Algorithm 1 is the result of procedure (2.3)–(2.5) applied to dual problem (1.4). We emphasize that Algorithm 1 is efficient since there is an explicit expression for the gradient (see (2.1) and (2.6)).
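Putting steps 1–7 together, a compact sketch of Algorithm 1 might look as follows. Here `argmax_J(x)` stands for a user-supplied oracle for inner problem (2.1) (analytical for the examples of Section 3), `primal_strategy` is the sketch of formula (1.7) given earlier, and the stopping tolerance `tol` replaces the exact test $d_s > 0$; all names are our assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def algorithm1(A, a, B, b, V0, lam_max, argmax_J, tol=1e-10, max_iter=500):
    """Conditional-gradient iteration of Algorithm 1 under the assumptions
    of Theorem 2 (every V in the uncertainty set is positive definite)."""
    def L(x, V, lam):                               # Lagrange function (1.3)
        return x @ V @ x + 2 * lam @ (B @ x - b)

    def L_bar(V, lam):                              # dual functional (1.5)
        return L(primal_strategy(V, lam, A, a, B), V, lam)

    V, lam = V0, np.zeros(len(b))                   # step 1
    for _ in range(max_iter):
        x = primal_strategy(V, lam, A, a, B)        # step 2
        V_bar = argmax_J(x)                         # step 3, problem (2.1)
        lam_bar = np.where(B @ x - b > 0, lam_max, 0.0)
        d = L(x, V_bar, lam_bar) - L(x, V, lam)     # step 4
        if d <= tol:                                # step 5: converged
            break
        g = minimize_scalar(                        # step 6: line search
            lambda t: -L_bar((1 - t) * V + t * V_bar,
                             (1 - t) * lam + t * lam_bar),
            bounds=(0.0, 1.0), method="bounded").x
        V = (1 - g) * V + g * V_bar                 # step 7
        lam = (1 - g) * lam + g * lam_bar
    return x, V, lam
```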

3. SOLVING THE DUAL PROBLEM FOR PARTICULAR UNCERTAINTY SETS

In the practical implementation of Algorithm 1, the most difficult part is generally solving optimization problem (2.1) for a linear function on a convex compact set of nonnegative definite matrices. Below, we consider several particular types of uncertainty sets for which problem (2.1) becomes significantly easier to solve.

E x a m p l e 1. We consider the uncertainty set given by the elementwise constraints

$\mathcal{V}_\infty = \{V \in \mathbb{R}_+^{p \times p} : V_{ij}^- \le V_{ij} \le V_{ij}^+, \; i, j = 1, \ldots, p\}$,   (3.1)

where $V^- = \{V_{ij}^-\}$ and $V^+ = \{V_{ij}^+\}$ are symmetric matrices such that $V^c = \{V_{ij}^c\} = (V^+ + V^-)/2$ and $\Delta = \{\Delta_{ij}\} = (V^+ - V^-)/2$ are nonnegative definite. In this case, the implementation of Algorithm 1 becomes significantly easier, since problem (2.1) has the analytical solution

$\bar{V}^s = \{\bar{V}_{ij}^s\} = \{V_{ij}^c + \Delta_{ij}\,\mathrm{sign}(x_i^s)\,\mathrm{sign}(x_j^s)\}$,   (3.2)

and $\bar{V}^s$ is also a nonnegative definite matrix [17].

In what follows, we will need set (3.1) modified as follows:

$\tilde{\mathcal{V}}_\infty = \{V \in \mathbb{R}_+^{p \times p} : \|I - \tilde{V}^{-1/2} V \tilde{V}^{-1/2}\|_\infty \le \delta\}$,

where $\delta$ is a known constant and $\tilde{V}$ is a given positive definite matrix. In this case, we write (3.2) as

$\bar{V}^s = \tilde{V}^{1/2} (I + \delta\,\mathrm{sgn}[\tilde{V}^{1/2} x^s]\,\mathrm{sgn}[\tilde{V}^{1/2} x^s]^*) \tilde{V}^{1/2} \succeq 0$   (3.3)

and

$J(x^s, \bar{V}^s) = \langle \bar{V}^s x^s, x^s \rangle = \langle \tilde{V} x^s, x^s \rangle + \delta \|\tilde{V}^{1/2} x^s\|_1^2$,

where $\mathrm{sgn}[x] = \mathrm{col}[\mathrm{sgn}(x_1), \ldots, \mathrm{sgn}(x_p)]$ and $\|x\|_1 = |x_1| + \ldots + |x_p|$.

E x a m p l e 2. For the uncertainty set

$\mathcal{V}_2 = \{V \in \mathbb{R}_+^{p \times p} : \|V - V^c\|_2 \le \delta\}$,

given by the Frobenius norm, results similar to (3.2) hold, since the transformation $\langle Vx, x \rangle = \mathrm{tr}[Vxx^*]$ and the inequality $\mathrm{tr}[UW] \le \|U\|_2 \|W\|_2$ yield

$\max_{V \in \mathcal{V}_2} J(x^s, V) = \langle (V^c + \delta I) x^s, x^s \rangle = \langle \bar{V}^s x^s, x^s \rangle$,

and the matrix $\bar{V}^s$ worst for the strategy $x^s$ has the form

$\bar{V}^s = V^c + \dfrac{\delta}{\langle x^s, x^s \rangle} x^s (x^s)^* \in \mathcal{V}_2$.

Thus, problem (2.1) has an analytical solution in this case as well.
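As a quick numerical sanity check of Example 2 (our own illustration, with arbitrary test data), one can verify that the rank-one update attains the value $\langle (V^c + \delta I)x, x \rangle$ while staying within the Frobenius ball:

```python
import numpy as np

rng = np.random.default_rng(0)
p, delta = 4, 0.3
G = rng.standard_normal((p, p))
Vc = G @ G.T + np.eye(p)                        # a positive definite center V^c
x = rng.standard_normal(p)

V_bar = Vc + delta / (x @ x) * np.outer(x, x)   # worst matrix of Example 2

assert np.isclose(x @ V_bar @ x, x @ (Vc + delta * np.eye(p)) @ x)
assert np.linalg.norm(V_bar - Vc, "fro") <= delta + 1e-12
```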

E x a m p l e 3. Now, we consider the set that is the intersection of the ball of radius $\delta$ centered at $V^c \in \mathbb{R}_+^{p \times p}$ in the spectral norm with the cone of nonnegative definite symmetric matrices:

$\mathcal{V}_{sp} = \{V \in \mathbb{R}_+^{p \times p} : \|V - V^c\| \le \delta\}$.   (3.4)

It follows from (3.4) that $\langle Vx, x \rangle = \langle V^c x, x \rangle + \langle (V - V^c)x, x \rangle \le \langle V^{\max} x, x \rangle$, where $V^{\max} = V^c + \delta I \in \mathcal{V}_{sp}$. This means that the matrix $V^{\max} = V^c + \delta I$ is the worst on $\mathcal{V}_{sp}$ for all strategies $x \in X$ simultaneously and, hence, is the solution to the dual problem:

$V^{\max} \in \arg\max_{V \in \mathcal{V}_{sp}} \min_{x \in X} \langle Vx, x \rangle$.

Thus, for the uncertainty set of form (3.4), finding the minimax strategy is a common quadratic programming problem with the known matrix $V^{\max} = V^c + \delta I$:

$\hat{x} \in \arg\min_{x \in X} \langle V^{\max} x, x \rangle$.   (3.5)

Note that problem (3.5) has an analytical solution in the special case $X = X_0$:

$\hat{x} = x(V^{\max}, 0) = (V^{\max})^{-1} A^* \{A (V^{\max})^{-1} A^*\}^+ a$,

which follows from (1.7). At the same time, from (1.6) we have

$\hat{J} = J(\hat{x}, V^{\max}) = \langle \{A (V^{\max})^{-1} A^*\}^+ a, a \rangle$.

E x a m p l e 4. In Section 4, we consider statistical procedures for constructing uncertainty sets as confidence regions of the form

$\tilde{\mathcal{V}}_{sp} = \{V \in \mathbb{R}_+^{p \times p} : \|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\| \le \delta\}$,   (3.6)

$\tilde{\mathcal{V}}_2 = \{V \in \mathbb{R}_+^{p \times p} : \|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\|_2 \le \delta\}$,   (3.7)

where $\tilde{V}_n$ is a sample estimate of the unknown covariance matrix $V$ constructed from the results of $n$ measurements of the vector of random parameters $\xi$ in model (0.1). Similarly to the previous examples, set (3.6) has the largest element $V^{\max} = (1 + \delta)\tilde{V}_n$, i.e., $V^{\max} \succeq V$ $\forall V \in \tilde{\mathcal{V}}_{sp}$, while for set (3.7) and the strategy $x$ there exists the worst matrix

$\bar{V} = \tilde{V}_n + \dfrac{\delta}{\langle \tilde{V}_n x, x \rangle} \tilde{V}_n x x^* \tilde{V}_n$.

In both cases, the value of the criterion maximal on the uncertainty set is

$\max_{V \in \tilde{\mathcal{V}}_{sp}} \langle Vx, x \rangle = \max_{V \in \tilde{\mathcal{V}}_2} \langle Vx, x \rangle = (1 + \delta) \langle \tilde{V}_n x, x \rangle$.   (3.8)

Now, (3.8) immediately yields that the minimax strategy coincides with the optimal strategy computed under the assumption that the true value of the covariance matrix is $\tilde{V}_n$. Thus, for uncertainty sets (3.6) and (3.7), we can find the minimax strategy from the solution of the quadratic optimization problem

$\hat{x} = \tilde{x} = \arg\min_{x \in X} \langle \tilde{V}_n x, x \rangle$.

4. STATISTICAL METHODS FOR CONSTRUCTING CONFIDENCE SETS

We consider minimax problem (0.4) given the following a priori information:

(1) the vector of random disturbances $\xi \in \mathbb{R}^p$ in (0.1) has a Gaussian distribution with unknown mathematical expectation $\mu$ and nondegenerate covariance matrix $V$;

(2) the observable realizations $\{\xi^{(1)}, \ldots, \xi^{(n)}\}$ of the vector $\xi$ are independent and identically distributed by the normal law $N(\mu, V)$.

We set an admissible reliability level $\beta$ ($0 < \beta < 1$) of the confidence region for the unknown covariance matrix $V$. We use the observations $\{\xi^{(1)}, \ldots, \xi^{(n)}\}$ to find the sample estimate of the covariance matrix

$\tilde{V}_n = \dfrac{1}{n-1} \sum_{i=1}^n (\xi^{(i)} - \bar{\xi}_n)(\xi^{(i)} - \bar{\xi}_n)^*$,   (4.1)

where $\bar{\xi}_n$ is the sample mean. Note that the matrix $\tilde{V}_n$ is nondegenerate with probability 1. Using the results obtained in [18] on estimating the covariance matrix from a multidimensional Gaussian sample, we can show that

$(I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}) \sqrt{(n-1)/2} \xrightarrow{D} N(O, \mathbb{I}_p)$ for $n \to \infty$,   (4.2)

where $\mathbb{I}_p$ is the unit operator in the space of symmetric matrices of dimension $p \times p$.

From (4.2), we can find the asymptotic distribution of the spectral norm:

$\|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\| \sqrt{(n-1)/2} \xrightarrow{D} \Lambda(p)$,   (4.3)

where $\Lambda(p)$ is the distribution of the greatest-in-absolute-value eigenvalue of a random symmetric matrix of dimension $p \times p$ distributed by the law $N(O, \mathbb{I}_p)$. The exact form of the distribution law $\Lambda(p)$ is known [18, 19]. Similarly, for the distribution of the squared Frobenius norm we have

$\|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\|_2^2\,(n-1)/2 \xrightarrow{D} \chi^2(r)$ for $n \to \infty$,   (4.4)

where $\chi^2(r)$ is the chi-square distribution with $r = p(p+1)/2$ degrees of freedom.

Finally, the asymptotic distribution of the uniform norm, i.e., the limit distribution of the statistic

$\|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\|_\infty \sqrt{(n-1)/2}$,   (4.5)

coincides with the distribution of the maximum of the absolute values of $r = p(p+1)/2$ independent standard Gaussian random variables. The respective quantile of level $\beta$ of this distribution is $\Phi^{-1}\bigl((1 + \sqrt[r]{\beta})/2\bigr)$, where $\Phi^{-1}(\cdot)$ is the function inverse to the distribution function of the standard normal variable (i.e., the quantile function).

With the asymptotic distributions of random variables (4.3)–(4.5), we can find the characteristic size of the confidence sets constructed with the spectral, Frobenius, and uniform norms, respectively:

$\tilde{\mathcal{V}}_{sp}^{(n)} = \{V \in \mathbb{R}_+^{p \times p} : \|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\| \le \delta_{sp}^{(n)}\}$,   (4.6)

$\tilde{\mathcal{V}}_2^{(n)} = \{V \in \mathbb{R}_+^{p \times p} : \|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\|_2 \le \delta_2^{(n)}\}$,   (4.7)

$\tilde{\mathcal{V}}_\infty^{(n)} = \{V \in \mathbb{R}_+^{p \times p} : \|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\|_\infty \le \delta_\infty^{(n)}\}$.   (4.8)

Here,

$\delta_{sp}^{(n)} = \Lambda_\beta(p) \sqrt{\dfrac{2}{n-1}}$,   (4.9)

$\delta_2^{(n)} = \sqrt{\dfrac{2\chi_\beta^2(r)}{n-1}}$,   (4.10)

$\delta_\infty^{(n)} = \Phi^{-1}\left(\dfrac{1 + \sqrt[r]{\beta}}{2}\right) \sqrt{\dfrac{2}{n-1}}$,   (4.11)

where $\Lambda_\beta(p)$ is the quantile of level $\beta$ of the distribution $\Lambda(p)$ and $\chi_\beta^2(r)$ is the quantile of level $\beta$ of the chi-square distribution with $r$ degrees of freedom.

Since sets (4.6)–(4.8) are constructed from statistical data, we call them statistical uncertainty sets.
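A minimal sketch of radii (4.9)–(4.11) in Python: the chi-square and normal quantiles come from scipy.stats, while the quantile $\Lambda_\beta(p)$, whose closed form we do not reproduce here, is approximated by Monte Carlo. The sampling scheme for $N(O, \mathbb{I}_p)$ (unit variance on the diagonal, variance 1/2 off the diagonal, chosen so that the squared Frobenius norm is $\chi^2(r)$ as in (4.4)) is our assumption, not a formula from the paper.

```python
import numpy as np
from scipy.stats import chi2, norm

def radii(p, n, beta, mc=100_000, seed=0):
    """Radii (4.9)-(4.11) of statistical uncertainty sets (4.6)-(4.8).
    Lambda_beta(p) is approximated by Monte Carlo over symmetric Gaussian
    matrices with entry variances 1 (diagonal) and 1/2 (off-diagonal)."""
    r = p * (p + 1) // 2
    scale = np.sqrt(2.0 / (n - 1))

    rng = np.random.default_rng(seed)
    G = rng.standard_normal((mc, p, p))
    S = (G + np.swapaxes(G, 1, 2)) / 2.0            # samples from N(O, I_p)
    lam_beta = np.quantile(np.abs(np.linalg.eigvalsh(S)).max(axis=1), beta)

    delta_sp = lam_beta * scale                                # (4.9)
    delta_2 = np.sqrt(2.0 * chi2.ppf(beta, r) / (n - 1))       # (4.10)
    delta_inf = norm.ppf((1 + beta ** (1.0 / r)) / 2) * scale  # (4.11)
    return delta_sp, delta_2, delta_inf

# e.g. radii(3, 80, 0.95) gives delta_2 close to the 0.56 used in Section 5
```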

All statistical uncertainty sets constructed above are asymptotic confidence regions of reliability $\beta$ for the matrix $V$, i.e.,

$\lim_{n \to \infty} \mathrm{P}\{V \in \mathcal{V}^{(n)}\} = \beta$,

where $\mathcal{V}^{(n)}$ is any of the regions given by relations (4.6)–(4.11).

Using the properties of minimax strategies considered in Section 1 and the properties of statistical uncertainty sets, we can show that the following proposition holds.

T h e o r e m 4. Suppose $\mathcal{V}^{(n)}$ coincides with set (4.6) or (4.7) and $\delta^{(n)}$ is given by formula (4.9) or (4.10), respectively. Then,

(1) the minimax solution

$\hat{x}_n \in \arg\min_{x \in X} \max_{V \in \mathcal{V}^{(n)}} J(x, V)$   (4.12)

coincides with the solution to the quadratic optimization problem

$\tilde{x}_n \in \arg\min_{x \in X} J(x, \tilde{V}_n)$,   (4.13)

where $\tilde{V}_n$ is sample estimate (4.1) of the covariance matrix;

(2) the guaranteed value of the criterion is

$\hat{J}_n = \min_{x \in X} \max_{V \in \mathcal{V}^{(n)}} J(x, V) = (1 + \delta^{(n)}) J(\hat{x}_n, \tilde{V}_n)$;   (4.14)

(3) the strategy $\hat{x}_n$ is $\varepsilon_n$-optimal with probability $\beta$, i.e.,

$\liminf_{n \to \infty} \mathrm{P}\{J(\hat{x}_n, V) \le J(x^0, V) + \varepsilon_n\} \ge \beta$,

where $\varepsilon_n = 2\delta^{(n)} J(\hat{x}_n, \tilde{V}_n)$ and $x^0$ is optimal strategy (0.1).

It follows from Theorem 4 that if the uncertainty set is given by the spectral or Frobenius norm, it suffices to solve quadratic programming problem (4.13) numerically to find minimax strategy $\hat{x}_n$ (4.12). If the set of admissible strategies is given only by equality constraints, $X = \{x : Ax = a\}$, the minimax strategy is found analytically:

$\hat{x}_n = \tilde{V}_n^{-1} A^* (A \tilde{V}_n^{-1} A^*)^+ a$.
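For the equality-constrained case, Theorem 4 thus reduces the whole minimax construction to a few matrix operations; a sketch for the Frobenius set (4.7), going from raw observations to strategy (4.13) and guaranteed value (4.14), and reusing the hypothetical `radii` helper from the previous sketch:

```python
import numpy as np

def minimax_statistical(xi, A, a, beta=0.95):
    """Equality-constrained case of Theorem 4 with the Frobenius set (4.7).
    `xi` is an n-by-p array of observed realizations of the random vector."""
    n, p = xi.shape
    V_n = np.cov(xi, rowvar=False)            # sample estimate (4.1)
    _, delta_2, _ = radii(p, n, beta)         # radius (4.10)
    Vi_A = np.linalg.solve(V_n, A.T)
    x_hat = Vi_A @ np.linalg.pinv(A @ Vi_A) @ a        # strategy (4.13)
    J_hat = (1 + delta_2) * (x_hat @ V_n @ x_hat)      # ensured value (4.14)
    return x_hat, J_hat
```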

If the uncertainty set is given by the uniform norm, i.e., in form (4.8), (4.11), then to find the minimax strategy we need to solve the problem

$\hat{x}_n \in \arg\min_{x \in X} \max_{V \in \tilde{\mathcal{V}}_\infty^{(n)}} J(x, V)$.

The problem can be solved by Algorithm 1 described in Section 2. For the case $X = X_0 = \{x : Ax = a\}$, taking into account that dual functional (1.6) in this case has the form

$\min_{x \in X_0} J(x, V) = \langle (A V^{-1} A^*)^+ a, a \rangle$,

we obtain the following iterative procedure.

A l g o r i t h m 2. Suppose $s$ is the iteration number.

S t e p 1. Put $s = 0$ and choose $V^0 = \tilde{V}_n$.

S t e p 2. Calculate $x^s = (V^s)^{-1} A^* [A (V^s)^{-1} A^*]^+ a$.

S t e p 3. Find

$\bar{V}^s = \tilde{V}_n^{1/2} (I + \delta_\infty^{(n)}\,\mathrm{sgn}[\tilde{V}_n^{1/2} x^s]\,\mathrm{sgn}[\tilde{V}_n^{1/2} x^s]^*) \tilde{V}_n^{1/2}$.

S t e p 4. Calculate $d_s = \langle \bar{V}^s x^s, x^s \rangle - \langle V^s x^s, x^s \rangle$.

S t e p 5. If $d_s > 0$, go to step 6; otherwise, put $\hat{x}_n = x^s$, $\hat{V}_n = V^s$, $\hat{J}_n = \langle V^s x^s, x^s \rangle$ and finish the iterations.

S t e p 6. Find

$\gamma_s \in \arg\max_{\gamma \in [0, 1]} \langle [A ((1-\gamma)V^s + \gamma\bar{V}^s)^{-1} A^*]^+ a, a \rangle$.

S t e p 7. Put $V^{s+1} = (1 - \gamma_s)V^s + \gamma_s\bar{V}^s$, increase $s$ by unity, and go to step 2 of the algorithm.

Obviously, Algorithm 2 is a particular case of Algorithm 1 in which internal linear optimization problem (2.1) has an analytical solution, which makes it significantly easier to apply in practice.
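A direct transcription of Algorithm 2 into Python might look as follows (a sketch: the matrix square root is taken via the symmetric eigendecomposition, and the tolerance `tol` replaces the exact test $d_s > 0$):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def algorithm2(V_n, A, a, delta, tol=1e-12, max_iter=200):
    """Algorithm 2: minimax strategy over the uniform-norm set (4.8)
    for the equality-constrained case X = {x : Ax = a}."""
    p = V_n.shape[0]
    w, U = np.linalg.eigh(V_n)
    V_sqrt = U @ np.diag(np.sqrt(w)) @ U.T      # matrix square root of V_n

    def x_of(V):                                # step 2: strategy for given V
        Vi_A = np.linalg.solve(V, A.T)
        return Vi_A @ np.linalg.pinv(A @ Vi_A) @ a

    def dual(V):                                # <[A V^{-1} A*]^+ a, a>
        return a @ np.linalg.pinv(A @ np.linalg.solve(V, A.T)) @ a

    V = V_n.copy()                              # step 1: V^0 = V_n
    for _ in range(max_iter):
        x = x_of(V)
        s = np.sign(V_sqrt @ x)                 # step 3: worst matrix (3.3)
        V_bar = V_sqrt @ (np.eye(p) + delta * np.outer(s, s)) @ V_sqrt
        d = x @ V_bar @ x - x @ V @ x           # step 4
        if d <= tol:                            # step 5: stop
            break
        g = minimize_scalar(                    # step 6: line search
            lambda t: -dual((1 - t) * V + t * V_bar),
            bounds=(0.0, 1.0), method="bounded").x
        V = (1 - g) * V + g * V_bar             # step 7
    x = x_of(V)
    return x, V, x @ V @ x                      # strategy, worst V, J_hat
```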

5. NUMERICAL EXAMPLE

To illustrate the results of this work, we solve the optimization problem

$\hat{x}_n \in \arg\min_{x \in X} \max_{V \in \mathcal{V}^{(n)}} J(x, V)$,   (5.1)

where $X = \{x \in \mathbb{R}^p : Ax = a, \; Bx \le b\}$, $J(x, V) = \langle Vx, x \rangle$, and $\mathcal{V}^{(n)}$ is a statistical uncertainty set. Suppose $p = 3$, and the parameters of problem (5.1) are given as

$A = [1 \;\; 1 \;\; 1]$, $a = 1$, $B = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \\ -2 & -5 & -7 \end{bmatrix}$, $b = \begin{bmatrix} 0 \\ 0 \\ 0 \\ -5 \end{bmatrix}$,

$V_0 = \begin{bmatrix} 2.00 & 1.73 & 2.24 \\ 1.73 & 4.00 & 3.87 \\ 2.24 & 3.87 & 6.00 \end{bmatrix}$.

The set of admissible strategies is

$X = \{x \in \mathbb{R}^3 : x_1 + x_2 + x_3 = 1, \; 2x_1 + 5x_2 + 7x_3 \ge 5, \; x_1 \ge 0, \; x_2 \ge 0, \; x_3 \ge 0\}$.   (5.2)

To analyze the results of the algorithm operation, we solve problem (5.1) in the following four cases.

(1) Suppose the true value $V_0$ of the covariance matrix $V$ is known. Then, the uncertainty set consists of the single element $V_0$. Hence,

$x^0 \in \arg\min_{x \in X} J(x, V_0)$,   (5.3)

$J^0 = J(x^0, V_0)$.   (5.4)
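A short script for case (1), solving (5.3) with an off-the-shelf solver (scipy's SLSQP, which is our choice, not the authors'); it should reproduce the first row of Table 1 up to rounding:

```python
import numpy as np
from scipy.optimize import minimize

V0 = np.array([[2.00, 1.73, 2.24],
               [1.73, 4.00, 3.87],
               [2.24, 3.87, 6.00]])

cons = [{"type": "eq",   "fun": lambda x: x.sum() - 1.0},
        {"type": "ineq", "fun": lambda x: 2*x[0] + 5*x[1] + 7*x[2] - 5.0}]

res = minimize(lambda x: x @ V0 @ x, x0=np.ones(3) / 3,
               bounds=[(0, None)] * 3, constraints=cons, method="SLSQP")
print(res.x.round(2), round(res.fun, 2))   # approx. [0.26 0.35 0.39] and 3.36
```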

Table 1. Optimization results for strategies with different types of uncertainty sets

| Solution | Uncertainty set | Strategy | Value of the criterion | The worst covariance matrix |
|---|---|---|---|---|
| Optimal | $\mathcal{V} = \{V_0\}$ | (0.26, 0.35, 0.39) | 3.36 | [2.00 1.73 2.24; 1.73 4.00 3.87; 2.24 3.87 6.00] |
| Heuristic | $\mathcal{V} = \{V^{heur}\}$ | (0.25, 0.38, 0.37) | 1.35 | [1.73 0 0; 0 3.52 0; 0 0 5.33] |
| Adaptive | $\mathcal{V} = \{\tilde{V}_n\}$ | (0.25, 0.37, 0.38) | 2.90 | [1.73 1.59 1.90; 1.59 3.52 3.16; 1.90 3.16 5.33] |
| Minimax statistical | $\mathcal{V} = \tilde{\mathcal{V}}_2^{(n)}$ | (0.25, 0.37, 0.38) | 4.52 | [2.33 2.57 3.15; 2.57 5.16 5.23; 3.15 5.23 7.95] |

We call strategy (5.3) and value of the criterion (5.4) optimal.

(2) We replace the true covariance matrix $V_0$ by a matrix $V^{heur}$, which we call heuristic in what follows. We construct it as

$V^{heur} = \mathrm{diag}[\hat{\sigma}_1^2, \hat{\sigma}_2^2, \hat{\sigma}_3^2]$,   (5.5)

where $\hat{\sigma}_1^2, \hat{\sigma}_2^2, \hat{\sigma}_3^2$ are sample estimates of the variances found from the a priori statistical data. We define the strategy and the value of the criterion

$x^{heur} \in \arg\min_{x \in X} J(x, V^{heur})$,   (5.6)

$J^{heur} = J(x^{heur}, V^{heur})$,   (5.7)

which we also call heuristic.

(3) Suppose $\tilde{V}_n$ is sample estimate (4.1) of the covariance matrix, which we use in the optimization problems instead of the unknown covariance matrix. In this case, we call the respective strategy and value of the criterion,

$x^{ad} \in \arg\min_{x \in X} J(x, \tilde{V}_n)$,   (5.8)

$J^{ad} = J(x^{ad}, \tilde{V}_n)$,   (5.9)

adaptive.

(4) Using the a priori information, we construct the uncertainty set $\mathcal{V}^{(n)}$ as a ball given by the Frobenius norm:

$\tilde{\mathcal{V}}_2^{(n)} = \{V \in \mathbb{R}_+^{3 \times 3} : \|I - \tilde{V}_n^{-1/2} V \tilde{V}_n^{-1/2}\|_2 \le \delta_2^{(n)}\}$,   (5.10)

with its radius found by formula (4.10):

$\delta_2^{(n)} = \sqrt{\dfrac{2\chi_\beta^2(r)}{n-1}}$,   (5.11)

where $r = p(p+1)/2 = 6$ and the quantile $\chi_\beta^2(r) = 12.59$ for the reliability level $\beta = 0.95$. We define the minimax statistical strategy $\hat{x}_n$ and the respective guaranteed value of the criterion $\hat{J}_n$ as

$\hat{x}_n \in \arg\min_{x \in X} \max_{V \in \tilde{\mathcal{V}}_2^{(n)}} J(x, V) = \arg\min_{x \in X} J(x, \tilde{V}_n)$,   (5.12)

$\hat{J}_n = \max_{V \in \tilde{\mathcal{V}}_2^{(n)}} J(\hat{x}_n, V) = (1 + \delta_2^{(n)}) J(\hat{x}_n, \tilde{V}_n) = J(\hat{x}_n, \hat{V}_n)$,   (5.13)

where $\hat{V}_n$ is the solution to the dual problem and, simultaneously, the covariance matrix worst for $\hat{x}_n$:

$\hat{V}_n = \bar{V}_n = \tilde{V}_n + \dfrac{\delta_2^{(n)}}{\langle \tilde{V}_n \hat{x}_n, \hat{x}_n \rangle} \tilde{V}_n \hat{x}_n (\hat{x}_n)^* \tilde{V}_n$.   (5.14)

In this example, the a priori information is given as a sample of size $n = 80$ used to find the sample estimate of the true covariance matrix $V_0$:

$\tilde{V}_n = \begin{bmatrix} 1.73 & 1.59 & 1.90 \\ 1.59 & 3.52 & 3.16 \\ 1.90 & 3.16 & 5.33 \end{bmatrix}$.   (5.15)

Using matrix (5.15), we calculated the optimal, heuristic, adaptive, and minimax strategies given by relations (5.3), (5.6), (5.8), and (5.12) and the respective values of optimization criteria (5.4), (5.7), (5.9), and (5.13) for problem (5.1), (5.2). The calculation results are given in Table 1. Note that the minimax statistical strategy was constructed analytically according to Theorem 4.


Table 2. Solutions depending on the sample size n

| Solution | Uncertainty set | Strategy (n = 500) | Strategy (n = 1000) | Strategy (n = 5000) | Criterion (n = 500) | Criterion (n = 1000) | Criterion (n = 5000) |
|---|---|---|---|---|---|---|---|
| Optimal | $\mathcal{V} = \{V_0\}$ | (0.26, 0.35, 0.39) | (0.26, 0.35, 0.39) | (0.26, 0.35, 0.39) | 3.36 | 3.36 | 3.36 |
| Heuristic | $\mathcal{V} = \{V^{heur}\}$ | (0.24, 0.39, 0.37) | (0.25, 0.38, 0.37) | (0.24, 0.39, 0.37) | 1.60 | 1.53 | 1.55 |
| Minimax statistical | $\mathcal{V} = \tilde{\mathcal{V}}_2^{(n)}$ | (0.25, 0.37, 0.38) | (0.27, 0.33, 0.40) | (0.26, 0.36, 0.38) | 4.29 | 3.89 | 3.58 |

To be on the safe side, we also constructed the same strategy numerically using Algorithm 1, which led to the same result.

To assess in practice how the accuracy of the solution depends on the volume of the a priori statistical information, we constructed the heuristic and minimax statistical strategies for different sample sizes $n$ used to construct the uncertainty set $\tilde{\mathcal{V}}_2^{(n)}$ (see Table 2).

From the obtained results, we can conclude the following:

- all of the found strategies turned out to be close to the optimal strategy constructed under complete a priori information;

- the values of the optimization criterion for the heuristic and adaptive strategies turned out to be unreasonably understated as compared with the optimal value, which means that the heuristic and adaptive estimates of the optimal value of the criterion are not reliable;

- for a small volume of statistical data, the guaranteed value of the criterion is substantially greater than the optimal value, which results from the fairly large radius of the confidence region when its reliability is close to unity;

- the covariance matrix worst on the statistical uncertainty set may differ significantly from the true (unknown) covariance matrix;

- the statistical uncertainty set "shrinks" as the sample size increases, making the minimax strategies and the respective guaranteed value of the criterion converge to their optimal values; the adaptive strategy possesses a similar property, unlike the heuristic strategy;

- Algorithm 1 efficiently synthesizes minimax strategies, finds the worst parameters of the model, and calculates the guaranteed value of the criterion.

CONCLUSIONS

In this work, we proposed and justified a technique for solving the quadratic programming problem with an inexactly given weight matrix of the criterion, namely, the covariance matrix of the random parameters of the optimized linear stochastic model. Assuming that the unknown covariance matrix belongs to an a priori given convex compact set of nonnegative definite matrices, we constructed the strategy optimal with respect to the minimax criterion and calculated the guaranteed value of the optimization criterion. We proposed a convergent iterative algorithm for solving the dual optimization problem that can be used to calculate both the value of the covariance matrix worst on the uncertainty set and the respective minimax strategy. We showed that the minimax strategy can be calculated analytically under sufficiently general assumptions. We proposed a statistical technique, employing preliminary empirical data on the random parameters of the model, for synthesizing uncertainty sets as confidence regions in the space of matrices and the respective minimax statistical strategies.

The results can be used to solve optimization, estimation, and decision-making problems under a priori statistical uncertainty in the description of the random parameters of the stochastic system involved.

ACKNOWLEDGMENTS

This study was financially supported by the Russian Foundation for Basic Research (grant no. 09-08-00369), Federal Target Program Campaign 1.1 (state contract no. 02.740.11.0471 of September 30, 2009), Federal Target Program Campaign 1.2.1 (state contract no. P889 of August 18, 2009), and Federal Target Program Campaign 1.3.2 (state contract no. P674 of August 10, 2009).


REFERENCES

1. S. Verdu and H. V. Poor, "On Minimax Robustness: A General Approach and Applications," IEEE Trans. Inf. Theory IT-30 (2) (1984).
2. V. N. Solov'ev, "Dual Extremal Problems and Their Application to Minimax Estimation Problems," Usp. Mat. Nauk 52 (4) (1997).
3. A. I. Matasov, Estimators for Uncertain Dynamic Systems (Kluwer, Dordrecht, 1998).
4. B. I. Anan'ev, "Minimax Linear Filtering of Multi-Step Processes with Uncertain Disturbance Distributions," Avtom. Telemekh., No. 10 (1993).
5. Robustness in Identification and Control, Ed. by A. Garulli, A. Tesi, and V. Vicino (Springer, New York, 1999).
6. A. Ben-Tal, L. El Ghaoui, and A. Nemirovski, Robust Semidefinite Programming. Handbook on Semidefinite Programming (Kluwer, New York, 2000).
7. G. Calafiore and L. El Ghaoui, "Worst-Case Maximum Likelihood Estimation in the Linear Model," Automatica 37 (4) (2001).
8. A. R. Pankov, E. N. Platonov, and K. V. Semenikhin, "Minimax Quadratic Optimization and Its Application to Investment Planning," Avtom. Telemekh., No. 12 (2001) [Automat. Remote Control 62 (12), 1978–1995 (2001)].
9. L. Li, Z.-Q. Luo, T. Davidson, et al., "Robust Filtering via Semidefinite Programming with Applications to Target Tracking," SIAM J. Optimization 12 (3) (2002).
10. D. Goldfarb and G. Iyengar, "Robust Portfolio Selection Problems," Math. Oper. Res. 28 (1) (2003).
11. L. El Ghaoui, M. Oks, and F. Oustry, "Worst-Case Value-At-Risk and Robust Optimization: A Conic Programming Approach," Oper. Res. 51 (4) (2003).
12. A. R. Pankov and E. N. Platonov, "Guaranteeing Solutions of the Quadratic Programming Problem with Inexactly Assigned Parameters and Their Applications in Investment," Izv. Ross. Akad. Nauk, Teor. Sist. Upr., No. 1 (2003) [Comp. Syst. Sci. 42 (1), 154–166 (2003)].
13. K. V. Semenikhin, "Minimax Estimation of Random Elements by the Root-Mean-Square Criterion," Izv. Ross. Akad. Nauk, Teor. Sist. Upr., No. 5 (2003) [Comp. Syst. Sci. 42 (5), 670–682 (2003)].
14. K. Zhou, J. Doyle, and K. Glover, Robust and Optimal Control (Prentice Hall, New York, 1996).
15. M. S. Lobo, L. Vandenberghe, S. Boyd, and H. Lebret, "Applications of Second-Order Cone Programming," Linear Algebra and Its Applications 284 (1–3) (1998).
16. B. T. Polyak, Introduction to Optimization Theory (Nauka, Moscow, 1980) [in Russian].
17. A. Pankov, K. Siemenikhin, and E. Ignastchenko, "Sample-Based Minimax Linear-Quadratic Optimization," in Proceedings of the European Control Conference ECC'2009, Budapest, Hungary, 2009.
18. T. Anderson, An Introduction to Multivariate Statistical Analysis (Wiley, New York, 1958; Fizmatgiz, Moscow, 1963).
19. V. L. Girko, "Spectral Theory of Random Matrices," Usp. Mat. Nauk 40 (1) (1985).
