IEEE TRANSACTIONS ON SIGNAL PROCESSING, VOL. 40, NO. 6, JUNE 1992
Two Methods for Toeplitz-plus-Hankel Approximation to a Data Covariance Matrix

Wen-Hsien Fang and Andrew E. Yagle, Member, IEEE
Abstract: Recently, fast algorithms have been developed for computing the optimal linear least squares prediction filters for nonstationary random processes (fields) whose covariances have (block) Toeplitz-plus-Hankel form. If the covariance of the random process (field) must be estimated from the data itself, we have the following problem: Given a data covariance matrix, computed from the available data, find the Toeplitz-plus-Hankel matrix closest to this matrix in some sense. This paper gives two procedures for computing the Toeplitz-plus-Hankel matrix that minimizes the Hilbert-Schmidt norm of the difference between the two matrices. The first approach projects the data covariance matrix onto the subspace of Toeplitz-plus-Hankel matrices, for which basis functions can be computed using a Gram-Schmidt orthonormalization. The second approach projects onto the subspace of symmetric Toeplitz plus skew-persymmetric Hankel matrices, resulting in a much simpler algorithm. The extension to block Toeplitz-plus-Hankel data covariance matrix approximation is also addressed.
I. INTRODUCTION

Some fast algorithms have recently been developed for computing the optimal linear least squares prediction filters for nonstationary processes (fields) whose covariances have (block) Toeplitz-plus-Hankel form [1]-[3]. Often the covariance function is not given explicitly, but must be estimated from the data itself. To utilize these fast algorithms, the estimated covariance function must have Toeplitz-plus-Hankel structure. The problem can be posed as follows: Given a data covariance matrix, computed from a data sequence, find a Toeplitz-plus-Hankel matrix that is closest to the data matrix in some sense.

Several common random processes (fields) have (block) Toeplitz-plus-Hankel covariance functions. For example, the first-order Gauss-Markov process

$$x_n = a x_{n-1} + w_n, \quad n \ge 1, \quad x_0 = 0, \quad |a| < 1 \tag{1}$$
Manuscript received July 24, 1990; revised April 11, 1991. This work was supported by the Air Force Office of Scientific Research under Grant AFOSR-89-0017. The authors are with the Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109-2122. IEEE Log Number 9107646.
where $w_n$ is discrete white noise with variance $\sigma^2$, has the Toeplitz-plus-Hankel covariance function

$$K(i, j) \triangleq E[x_i x_j] = \frac{\sigma^2}{1 - a^2} \left( a^{|i-j|} - a^{i+j} \right), \quad i, j \ge 0. \tag{2}$$

The two-dimensional circularly symmetric Markovian random field on a polar raster

$$x_{i,N} = \frac{a}{M} \sum_{n=1}^{M} x_{i-1,n} + w_{i,N}, \quad x_{0,N} = 0, \quad i \ge 1, \quad 1 \le N \le M, \quad |a| < 1 \tag{3}$$
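As a quick numerical check (a NumPy sketch, not part of the paper; the variable names are ours), the covariance (2) splits explicitly into a Toeplitz part, which depends only on i - j, and a Hankel part, which depends only on i + j:

```python
import numpy as np

# Verify that the Gauss-Markov covariance (2) is Toeplitz plus Hankel.
n, a, sigma2 = 6, 0.5, 1.0
c = sigma2 / (1 - a**2)

i, j = np.indices((n, n))
K = c * (a**np.abs(i - j) - a**(i + j))   # covariance (2)

T = c * a**np.abs(i - j)                  # Toeplitz part: function of i - j
H = -c * a**(i + j)                       # Hankel part: function of i + j

assert np.allclose(K, T + H)
# T is constant along diagonals, H along antidiagonals
assert all(np.ptp(np.diag(T, k)) < 1e-12 for k in range(-n + 1, n))
assert all(np.ptp(np.diag(np.fliplr(H), k)) < 1e-12 for k in range(-n + 1, n))
```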
where $(i, 2\pi(N/M))$, $1 \le N \le M$, are polar coordinates and $w_{i,N}$ is two-dimensional white noise on the polar raster with variance $\sigma^2/M$, also has the Toeplitz-plus-Hankel covariance (2). Also, in image processing a two-dimensional isotropic random field is often modeled [4] as having the covariance function

$$E[X(i, 2\pi(N_1/M)) X(j, 2\pi(N_2/M))] = \rho^{\,i^2 + j^2 - 2ij \cos(2\pi(N_1 - N_2)/M)} = \rho^{\,\frac{1}{2}\left([(i+j)^2 + (i-j)^2] - [(i+j)^2 - (i-j)^2] \cos(2\pi(N_1 - N_2)/M)\right)}$$
$$\approx 1 + \frac{1}{2}\left([(i+j)^2 + (i-j)^2] - [(i+j)^2 - (i-j)^2] \cos(2\pi(N_1 - N_2)/M)\right) \ln \rho \tag{4}$$
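The block structure of (4) can be illustrated numerically (a NumPy sketch of our own, not the paper's code): each block depends on N1 and N2 only through N1 - N2, and each block splits into a function of i - j plus a function of i + j.

```python
import numpy as np

# Check that the small-ln(rho) approximation in (4) is block
# Toeplitz-plus-Hankel: each (N1 - N2)-block is Toeplitz plus Hankel.
M, n, rho = 8, 4, 0.99
lr = np.log(rho)
i, j = np.indices((n, n))
s2, d2 = (i + j) ** 2.0, (i - j) ** 2.0   # (i+j)^2 and (i-j)^2

def block(dN):
    """The (N1 - N2) = dN block of the covariance approximation (4)."""
    c = np.cos(2 * np.pi * dN / M)
    return 1 + 0.5 * ((s2 + d2) - (s2 - d2) * c) * lr

for dN in range(M):
    c = np.cos(2 * np.pi * dN / M)
    T = 1 + 0.5 * d2 * (1 + c) * lr      # Toeplitz part: function of i - j
    H = 0.5 * s2 * (1 - c) * lr          # Hankel part: function of i + j
    assert np.allclose(block(dN), T + H)
```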
when $\rho \approx 1$, which has a block Toeplitz-plus-Hankel structure. Clearly, for these and similar random processes (fields), a Toeplitz-plus-Hankel structured covariance estimate will be much more accurate than a Toeplitz estimate.

For the special case of a wide-sense stationary random process, the estimated covariance matrix is symmetric Toeplitz. The matrix minimizing the Hilbert-Schmidt norm of the difference between this matrix and the data covariance matrix is found by averaging the diagonals of the data covariance matrix, replacing each element being averaged by the average [5]. This is the result of projecting the data covariance matrix onto the vector space of all symmetric Toeplitz matrices, where the projection is defined using the Hilbert-Schmidt inner product.

In this paper we extend this approach to the more general case of Toeplitz-plus-Hankel matrices, following which the algorithms of [1]-[3] may be applied. Since the subspace of symmetric Toeplitz matrices is a subset of the subspace of symmetric Toeplitz-plus-Hankel matrices, the error (in the Hilbert-Schmidt norm sense) will never be larger than the error using only the Toeplitz approximation. Unfortunately, the method is more complicated than simply averaging along diagonals, as in Toeplitz approximation. The basis elements of the subspace need to be computed using a Gram-Schmidt orthogonalization, and there seems to be no simple closed-form expression for an arbitrary element. However, if we restrict ourselves to the subspace of symmetric Toeplitz plus skew-persymmetric Hankel matrices, the optimal approximation can be easily derived by simply averaging along diagonals and antidiagonals. Both methods are developed in this paper. The extension to approximation for block data covariance matrices is also included. We do not specifically address other constraints such as positive definiteness, although such constraints can be incorporated into one of the methods of Section IV, if needed.

This paper is organized as follows. In Section II, we specify the problem, the criterion used, and the approach employed. In Section III, the optimal Toeplitz-plus-Hankel approximation using basis elements derived from a Gram-Schmidt orthogonalization is derived. In Section IV, the optimal symmetric Toeplitz plus skew-persymmetric Hankel approximation using averaging along the diagonals and antidiagonals is derived. Some examples are also given to demonstrate the procedures. In Section V, the results are extended to block data covariance matrix approximation. Section VI concludes with a summary.

II. PROBLEM FORMULATION

A. Hilbert-Schmidt Norm

For any two square real $n \times n$ matrices $A$ and $B$, the Hilbert-Schmidt inner product and norm are defined as

$$\langle A, B \rangle = \operatorname{tr}[A B^T]; \qquad \|A\|^2 = \langle A, A \rangle = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2. \tag{5}$$

The problem we will deal with can be posed as follows: Given a data covariance matrix $R$, find the Toeplitz-plus-Hankel matrix $\hat{R}$ such that $\|R - \hat{R}\|$ is minimized. The solution to this problem can be easily derived by projecting $R$ onto the subspace of Toeplitz-plus-Hankel matrices. A set of matrices spanning this subspace is

$$\{Q_k : (Q_k)_{ij} = 1 \text{ if } i - j = k, \text{ and } 0 \text{ otherwise}\}, \quad k = -(n-1), \ldots, n-1 \tag{6}$$

$$\{Q_k : (Q_k)_{ij} = 1 \text{ if } i + j = k, \text{ and } 0 \text{ otherwise}\}, \quad k = 0, \ldots, 2n-2 \tag{7}$$

with $i, j = 0, \ldots, n-1$; each matrix in (6) has ones along a single diagonal, and each matrix in (7) has ones along a single antidiagonal. The $2n - 1$ basis matrices in (6) span the Toeplitz matrices, and the $2n - 1$ matrices in (7) span the Hankel matrices.
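The spanning sets (6) and (7) and the inner product (5) can be sketched in NumPy as follows (an illustration of our own; the names `T_basis` and `H_basis` are not from the paper). Note that each family is orthogonal on its own, but their union is not, a point taken up in Section III:

```python
import numpy as np

# Build the 2n-1 Toeplitz spanners of (6) and the 2n-1 Hankel spanners
# of (7), and test Hilbert-Schmidt orthogonality <A,B> = tr(A B^T) of (5).
n = 4
hs = lambda A, B: np.trace(A @ B.T)       # Hilbert-Schmidt inner product (5)

i, j = np.indices((n, n))
T_basis = [(i - j == k).astype(float) for k in range(-(n - 1), n)]  # (6)
H_basis = [(i + j == k).astype(float) for k in range(2 * n - 1)]    # (7)

assert len(T_basis) == 2 * n - 1 and len(H_basis) == 2 * n - 1

# Each family is orthogonal in the Hilbert-Schmidt inner product...
m = 2 * n - 1
assert all(hs(T_basis[p], T_basis[q]) == 0 for p in range(m) for q in range(m) if p != q)
assert all(hs(H_basis[p], H_basis[q]) == 0 for p in range(m) for q in range(m) if p != q)
# ...but the union is not: some Q in (6) and Q' in (7) share an entry.
assert any(hs(Tk, Hl) != 0 for Tk in T_basis for Hl in H_basis)
```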
B. Projection Approach

If we are given a set of orthogonal matrices $\{Q_i\}_{i=1}^{k}$, then the minimum distance (norm) between a matrix $R$ and a matrix $\hat{R}$ in the subspace spanned by $\{Q_i\}_{i=1}^{k}$ is attained by the projection of $R$ on this subspace, i.e., if $\|R - \hat{R}\|$ is minimum, then

$$\hat{R} = \sum_{i=1}^{k} \frac{\langle R, Q_i \rangle}{\langle Q_i, Q_i \rangle} Q_i. \tag{8}$$

Consider the special case where the $\{Q_i\}_{i=1}^{2n-1}$ are the matrices in (6). Then, since the $\{Q_i\}_{i=1}^{2n-1}$ span the subspace of Toeplitz matrices and are orthogonal in the inner product (5), the optimal Toeplitz approximation for any matrix is obtained by projecting the matrix on this subspace, and this leads to averaging along diagonals [5]. If we extend the basis $\{Q_i\}$ to include basis elements for Hankel matrices as well, the error metric $\|R - \hat{R}\|$ will clearly be less than the error for Toeplitz-only approximation. Let $R_T$ be the optimal Toeplitz matrix approximation to $R$, and let $R_{TH}$ be the optimal Toeplitz-plus-Hankel approximation to $R$.
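The equivalence between projection onto the matrices of (6) and diagonal averaging [5] can be verified with a short NumPy sketch (illustrative code of our own, not the authors'):

```python
import numpy as np

# The optimal Toeplitz approximation R_T of R is its projection onto the
# orthogonal matrices of (6); each projection coefficient <R,Q>/||Q||^2
# is exactly the average of the corresponding diagonal of R.
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
R = A @ A.T                                    # a data covariance matrix

i, j = np.indices((n, n))
R_T = np.zeros((n, n))
for k in range(-(n - 1), n):
    Q = (j - i == k).astype(float)             # ones on the k-th diagonal
    coef = np.trace(R @ Q.T) / np.trace(Q @ Q.T)   # <R, Q>/||Q||^2 from (5)
    assert np.isclose(coef, np.diag(R, k).mean())  # = diagonal average
    R_T += coef * Q

# R_T is constant along each diagonal, i.e., Toeplitz (and symmetric here).
assert all(np.ptp(np.diag(R_T, k)) < 1e-12 for k in range(-n + 1, n))
```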
Then the improvement in the error metric is

$$\|R - R_{TH}\|^2 = \|R - R_T\|^2 - \|R_H\|^2 \tag{9}$$
where $R_H$ is the projection of $R$ on the extension of the basis $\{Q_i\}_{i=1}^{2n-1}$ to include Hankel matrices. We now discuss this basis extension.

III. OPTIMAL TOEPLITZ-PLUS-HANKEL APPROXIMATION

A. Gram-Schmidt Orthogonalization
Unfortunately, while the matrices in (6) are orthogonal, and those in (7) are orthogonal, the union of (6) and (7) is not orthogonal in the sense of the Hilbert-Schmidt inner product defined in (5). So while (6) and (7) span the subspace of Toeplitz-plus-Hankel matrices, they do not form an orthogonal basis. Hence the projection of R cannot be computed by averaging along the diagonals and antidiagonals. To use the projection method, the matrices in (7) must be Gram-Schmidt orthogonalized, extending the orthogonal basis in (6). If we represent the matrices in (6) and (7) as $\{Q_i\}_{i=1}^{2n-1}$ and $\{Q_i\}_{i=
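The Gram-Schmidt extension just described, together with the improvement identity (9), can be sketched numerically (an illustrative NumPy implementation under our own naming, treating n x n matrices as n^2-vectors; this is not the authors' code):

```python
import numpy as np

# Gram-Schmidt-orthogonalize the Hankel spanners of (7) against the
# orthogonal Toeplitz basis of (6), project R, and check identity (9):
# ||R - R_TH||^2 = ||R - R_T||^2 - ||R_H||^2.
rng = np.random.default_rng(1)
n = 5
A = rng.standard_normal((n, n))
R = A @ A.T

i, j = np.indices((n, n))
basis = [(i - j == k).astype(float).ravel() for k in range(-(n - 1), n)]  # (6)
hankel = [(i + j == k).astype(float).ravel() for k in range(2 * n - 1)]   # (7)

# Classical Gram-Schmidt extension of the (already orthogonal) basis (6).
for h in hankel:
    v = h.copy()
    for q in basis:
        v -= (v @ q) / (q @ q) * q
    if np.linalg.norm(v) > 1e-10:        # drop linearly dependent directions
        basis.append(v)

r = R.ravel()
proj = lambda vecs: sum((r @ q) / (q @ q) * q for q in vecs)
r_T = proj(basis[:2 * n - 1])            # projection on Toeplitz subspace
r_TH = proj(basis)                       # projection on Toeplitz+Hankel
r_H = r_TH - r_T                         # component added by the extension

lhs = np.sum((r - r_TH) ** 2)
rhs = np.sum((r - r_T) ** 2) - np.sum(r_H ** 2)
assert np.isclose(lhs, rhs)              # the improvement identity (9)
```

The Toeplitz-plus-Hankel error is smaller than the Toeplitz-only error by exactly the energy of the orthogonalized Hankel component, as (9) states.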