Asymptotically equivalent sequences of matrices and

0 downloads 0 Views 249KB Size Report
Sep 1, 2008 - For the engineering community, Gray's tutorial monograph on Toeplitz .... j,k=1 with δj,k being the Kronecker delta and Mj ∈ CN×N satisfying. ... On the other hand, by the Stone-Weierstrass approximation theorem (see e.g. ...
1

Asymptotically equivalent sequences of matrices and Hermitian block Toeplitz matrices with continuous symbols: Applications to MIMO systems Jes´us Guti´errez-Guti´errez and Pedro M. Crespo, Senior Member, IEEE

Abstract For the engineering community, Gray’s tutorial monograph on Toeplitz and circulant matrices has been, and remains, the best elementary introduction to the Szeg¨o theory on large Toeplitz matrices. In the present paper the most important results of the cited monograph are generalized to block Toeplitz matrices by maintaining the same mathematical tools used by Gray, that is, by using asymptotically equivalent sequences of matrices. As applications of these results, the geometric MMSE for both an infinite-length multivariate linear predictor and an infinite-length DFE for MIMO channels, are obtained as a limit of the corresponding finite-length cases. Similarly, a short derivation of the well known capacity of a time-invariant MIMO Gaussian channel with ISI and fixed input covariance matrix is also presented. Index Terms Block circulant matrices; block Toeplitz matrices; channel capacity; Frobenius norm; functions of matrices; Hermitian matrices; MIMO MMSE-DFE; MMSE linear predictor; multivariate stationary processes; spectral norm.

I. I NTRODUCTION The most famous theorem on large Toeplitz matrices is the Szeg¨o theorem [1]. This result deals with the asymptotic behavior of the eigenvalues of an n × n Hermitian Toeplitz matrix Tn = (tj−k )nj,k=1 , where the complex numbers tk with k ∈ Z, are the Fourier coefficients of a bounded function called the symbol of the sequence {Tn }. Unfortunately, the level of mathematical sophistication for understanding [1] is often beyond the level provided This work was partially supported by the Spanish Ministry of Education and Science, by the European Social Fund and by the European Regional Development Fund through the Torres-Quevedo program and the MultiMIMO project (no. TEC2007-68020-C04-03). Both authors are with CEIT and Tecnun (University of Navarra), Manuel de Lardiz´abal 15, 20018, San Sebasti´an, Spain. Tel.: +34 943212800; fax: +34 943213076. Email addresses: [email protected] (J. Guti´errez-Guti´errez), [email protected] (P. M. Crespo)

September 1, 2008

DRAFT

2

by a typical engineering background. For this reason, Gray simplified the Szeg¨o theory in [2], by assuming more P restrictive conditions, namely, Toeplitz matrices with symbols in the Wiener algebra (i.e., k∈Z |tk | < ∞). Since his approach is purely matrix-analytic (mainly based on his concept of asymptotically equivalent sequences of matrices), he succeeded in conveying the main ideas of the Szeg¨o theory to a wider audience. In the paper of Gazzah, Regalia and Delmas [3], the Szeg¨o theorem for Toeplitz matrices with symbols in the Wiener algebra is generalized to block Toeplitz matrices by using asymptotically equivalent sequences of matrices (i.e., Gray’s tools) and the concept of vec-permutation matrix. In the present paper we go a step further. By using again the concept of asymptotically equivalent sequences of matrices, we extend the most important results (the Szeg¨o theorem among others) of the Gray tutorial [2] to block Toeplitz matrices with continuous symbols (i.e., the tk with k ∈ Z, are given by the Fourier coefficients of a continuous N × N matrix function), which is a less restrictive condition than being in the Wiener algebra. In [4], Bose and Boo also derived asymptotic results of block Toeplitz matrices based on Gray’s tools. However, they look at a completely different problem to the one considered in our paper. Here, Tn is an n × n block Toeplitz (BT) matrix with N × N blocks, and we look, among other things, at the asymptotic behavior of the eigenvalues of Tn as n goes to infinity while N remains fixed (specifically, we study the Szeg¨o theorem for continuous matrixvalued symbols). On the contrary, in [4], Tn is an n × n block Toeplitz matrix with Toeplitz blocks (BTTB) of order N × N , and they look at the asymptotic behavior of the eigenvalues of Tn when both n and N go to infinity (specifically, they study the Szeg¨o theorem for certain 2-variate symbols). It should be pointed out that their derivation can not be particularized to the case where N is kept fixed. Finally, it should be mentioned that, based on matrix analysis but without using the Gray concept of asymptotically equivalent sequences of matrices, Tilli [5] and Serra-Capizzano [6] derive results of Szeg¨o type for Lebesgue integrable multivariate matrix-valued symbols, that are stronger than the results given by Bose and Boo [4] and the ones presented here. Such generality is based on introducing another concept of approximation of matrix sequences, namely the notion of approximating class of sequences (a.c.s.), which is more general than that of asymptotically equivalent sequences. In the Appendix we report some analysis based on this new approach, since it has the potential of becoming a further tool available in the engineering community. Therefore, the main contribution of our paper is to present a plain generalization to block Toeplitz matrices of Gray’s famous tutorial [2] by maintaining the Gray tools. The theory on block Toeplitz matrices is in general useful in signal processing, communications and information theory, since this kind of matrices arise, for instance, as covariance matrices of multivariate stationary processes or as matrix representations of linear time-invariant discrete-time multivariate filters. As applications of this theory to MIMO systems, the geometric MMSE for both an infinite-length multivariate linear predictor and an infinite-length DFE for MIMO channels, are obtained as a limit of the corresponding finite-length cases. Similarly, a short derivation of the well known capacity of a time-invariant MIMO Gaussian channel with ISI and fixed input covariance matrix is also presented. The remainder of the paper is organized as follows. The next section states preliminary mathematical considerations. Section III derives a new result on matrix sequences that will be used in Section IV. Section IV is the September 1, 2008

DRAFT

3

main part of the paper and derives several fundamental theorems on large Hermitian block Toeplitz matrices. We conclude with Section V where the furnished theoretical results are applied to three problems in the area of signal processing and communications. II. P RELIMINARY MATHEMATICAL DEFINITIONS AND RESULTS A. Asymptotically equivalent sequences of matrices If A is an n × n diagonalizable matrix then we will indicate by λk (A), 1 ≤ k ≤ n, all the eigenvalues of A counted with their multiplicities. The following well known matrix norms (see e.g. [7] or [8]) will be used: Definition 1: The Frobenius norm kAkF and the spectral norm kAk2 of an n × n matrix A = (aj,k )nj,k=1 are defined as kAkF

=

  12 n X n X p tr(A∗ A) =  |aj,k |2  j=1 k=1

µ kAk2

= max x6=0

x∗ A∗ Ax x∗ x

¶ 12

µ =

¶ 12 max λk (A A) ∗

1≤k≤n

where tr denotes trace, ∗ denotes conjugate transpose and x is a column vector. The Frobenius norm and the spectral norm are special instances of unitarily invariant norms (see e.g. [7] or [8]). We recall that a norm k · k is unitarily invariant if kU AV k = kAk for every n × n matrix A and for every n × n unitary matrices U and V . The next definition is due to Gray (see [2] or [9]). Definition 2: Let An and Bn be n × n matrices for all n ∈ N. We say that the sequences {An } and {Bn } are asymptotically equivalent, and write {An } ∼ {Bn }, if ∃M ≥ 0 ;

kAn k2 , kBn k2 ≤ M

∀n ∈ N

(1)

and kAn − Bn kF √ = 0. (2) n In the following lemma the basic properties of this kind of sequences are reviewed, most of them are directly from lim

n→∞

Gray’s monograph [2]. Lemma 1:

1) If {An } ∼ {Bn } then {Bn } ∼ {An }.

2) If {An } ∼ {Bn } and {Bn } ∼ {Cn }, then {An } ∼ {Cn }. 3) If {An } ∼ {Bn } then {αAn } ∼ {αBn } for every α ∈ C. 4) If {An } ∼ {Bn } and {Cn } ∼ {Dn }, then {An + Cn } ∼ {Bn + Dn } and {An Cn } ∼ {Bn Dn }. Notice that ∼ is not reflexive. B. Functions of matrices We first review the concept of polynomial function of a matrix.

September 1, 2008

DRAFT

4

Definition 3: If p(x) = αq xq + . . . + α1 x + α0 is a given polynomial with coefficients in C, then for every n × n matrix A we define p(A) = αq Aq + . . . + α1 A + α0 In , where In denotes the n × n identity matrix. We now introduce the concept of function of a matrix. For a thorough treatment we refer the reader to [10] or [11]. We also recommend the text [12] for a treatment of functions of matrices, developed in a signal processing context. Definition 4: We consider an n × n diagonalizable matrix A = U diag(λ1 (A), . . . , λn (A))U −1 . If g is a complex function on the set {λ1 (A), . . . , λn (A)}, we define the n × n matrix g(A) = U diag(g(λ1 (A)), . . . , g(λn (A)))U −1 . Observe that the last definition agrees with Definition 3. Furthermore, if g is a complex function on the set {λ1 (A), . . . , λn (A)} of the eigenvalues of an n × n diagonalizable matrix A, there exists a polynomial of degree at most n − 1 such that g(A) = p(A). This is a direct consequence of the fact that the interpolation problem of determining a polynomial p(x) of degree at most n − 1 satisfying p(λk (A)) = g(λk (A)), 1 ≤ k ≤ n, has solution (see e.g. [8]). Therefore, Definition 4 is independent of the chosen eigenvalue decomposition of the matrix A. C. Block circulant matrices We now review a special type of block Toeplitz matrices. Definition 5: An n × n block circulant matrix with N × N blocks is one having the form   C0 C1 C2 ··· Cn−1     .. ..  Cn−1  . C0 C1 .     .. C= . C0 C2   Cn−2 Cn−1    ..   .. .. ..  . . . . C1    C1 ··· Cn−2 Cn−1 C0 where Ck ∈ CN ×N , 0 ≤ k ≤ n − 1, and CN ×N is the set of all N × N complex matrices. The next well known lemma characterizes block circulant matrices. Lemma 2: C = (Cj,k )nj,k=1 , with Cj,k ∈ CN ×N , is an n × n block circulant matrix with N × N blocks if and only if C = (Vn ⊗ IN )diag(M1 , . . . , Mn )(Vn ⊗ IN )∗ (n)

where ⊗ is the Kronecker product, Vn = (vj,k )nj,k=1 is the n × n Fourier unitary matrix: 2π(j−1)(k−1) 1 (n) i n vj,k = √ e− , n

and diag(M1 , . . . , Mn ) = (δj,k Mj )nj,k=1 with δj,k being the Kronecker delta and Mj ∈ CN ×N satisfying     M1 C1,1         C  M2  √   2,1   = n(Vn ⊗ IN )∗    ..   ..  .  .   .      Mn Cn,1 If C = (Cj,k )nj,k=1 is an n × n block circulant matrix, it is customary to write C = circ(C1,1 , . . . , Cn,1 ). September 1, 2008

DRAFT

5

D. Two properties of partitioned matrices Finally, we review two well known results regarding the determinant and the inverse of a partitioned matrix (see e.g. [8]), which are based on block two-sided Gaussian elimination steps. Lemma 3: Let A be a square matrix partitioned as  A=

 A11

A12

A21

A22



where A11 and A22 are also square. If A11 and A22 are nonsingular then −1 det(A) = det(A11 ) det(A22 − A21 A−1 11 A12 ) = det(A22 ) det(A11 − A12 A22 A21 ).

Lemma 4: Let A be a square matrix partitioned as in Lemma 3. If A, A11 and A22 are nonsingular then the corresponding partitioned presentation of A−1 is  −1 (A11 − A12 A−1 22 A21 )  −1 (A21 A11 A12 − A22 )−1 A21 A−1 11

−1 −1 A−1 11 A12 (A21 A11 A12 − A22 ) −1 (A22 − A21 A−1 11 A12 )

 .

III. A NEW RESULT ON ASYMPTOTICALLY EQUIVALENT SEQUENCES OF MATRICES Theorem 1: Consider two asymptotically equivalent sequences of Hermitian matrices {An } and {Bn }. Let [a, b] be a closed interval containing the eigenvalues of the matrices of these sequences. Then {g(An )} ∼ {g(Bn )}

∀g ∈ C[a, b],

where C[a, b] denotes the set of all continuous complex functions on [a, b]. Proof: Since {An } and {Bn } are sequences of Hermitian matrices, (1) implies that there is M ≥ 0 such that −M ≤ λk (An ), λk (Bn ) ≤ M , n ∈ N, 1 ≤ k ≤ n. Therefore, there exists such an interval [a, b], for instance [−M, M ]. From Lemma 1 the theorem holds for any polynomial g. Now, let g be an arbitrary continuous function on [a, b]. Since the spectral norm is unitarily invariant, we deduce that kg(An )k2 = kdiag(g(λ1 (An )), . . . , g(λn (An )))k2 = max |g(λk (An ))| ≤ K 1≤k≤n

for all n ∈ N, where K = maxa≤x≤b |g(x)|. Analogously for the case of Bn . On the other hand, by the Stone-Weierstrass approximation theorem (see e.g. [13]), given ² > 0 there is a polynomial p such that |p(x) − g(x)| < ², x ∈ [a, b]. Furthermore, from the fact that {p(An )} ∼ {p(Bn )}, there exists n0 ≥ 1 such that

September 1, 2008

kp(An ) − p(Bn )kF √

where xk = (xk , . . . , xk The variables September 1, 2008

(j) xk ,

)

denotes the input process and the filter taps H0 , H1 , . . . , Hm are No × Ni matrices.

with k ∈ Z and 1 ≤ j ≤ Ni , are uncorrelated random variables with zero mean and variance DRAFT

12

(1)

(No ) >

1. The observation noise, nk = (nk , . . . , nk

) , is uncorrelated with the input, i.e., E[xk1 n∗k2 ] = 0Ni ×No for (j)

all k1 , k2 ∈ Z; here 0Ni ×No is the Ni × No zero matrix. Furthermore, the noise variables nk , with k ∈ Z and 1 ≤ j ≤ No , are uncorrelated random variables with zero mean and variance σ 2 > 0. Let Hn be the n × (n + m) block Toeplitz matrix with No × Ni blocks  H0 H1 · · · Hm 0No ×Ni ···   0 ··· Hm 0No ×Ni  No ×Ni H0 H1  . .. .. .. ..  .. . . . .    .. .. .. ..  . . . .  0No ×Ni

···

···

0No ×Ni

H0

H1

given by ···

0No ×Ni

··· .. . .. .

0No ×Ni .. .

···

Hm

0No ×Ni

      .    

(13)

We denote by Tn the n × n banded Hermitian block Toeplitz matrix with No × No blocks defined by Tn = Hn Hn∗ . Pm −kωi This matrix can be written as Tn = Tn (F ), with F (ω) = H(ω)H(ω)∗ where H(ω) = is the k=0 Hk e transfer function of the MA filter. bk = We linearly estimate yk from the previous n − 1 vectors yk−1 , yk−2 , . . . , yk−n+1 . Let y

Pn−1 j=1

Wj yk−j

denote such an estimate. Figure 1 shows this forward linear predictor. The No × No weight matrices Wj are chosen to minimize the mean square error (MSE) defined by the trace1 of the error auto-correlation matrix Ree (n − 1) = bk . By reasoning as in [21] for the univariate case of the forward linear predictor, it is E[ek e∗k ] with ek = yk − y not difficult to prove that the resulting error auto-correlation matrix can be expressed as Ree,min (n − 1) = ([g(Tn )]1,1 )−1 where g(x) = 1/(σ 2 + x) and [g(Tn )]1,1 ∈ CNo ×No is the 1, 1 block entry of g(Tn ). Thus, the corresponding geometric MMSE is now computed as MMSEn = det(([g(Tn )]1,1 )−1 ) = The matrix Tn can be partitioned as

 Tn = 

and consequently

1 . det([g(Tn )]1,1 ) 

[Tn ]1,1

A12

A21

Tn−1



 σ 2 InNo + Tn = 

 σ 2 INo + [Tn ]1,1

A12

A21

σ 2 I(n−1)No + Tn−1

.

Combining Lemmas 3 and 4 we deduce that det([g(Tn )]1,1 ) = det([(σ 2 InNo + Tn )−1 ]1,1 ) = 1 Since

det(σ 2 I(n−1)No + Tn−1 ) , det(σ 2 InNo + Tn )

we are dealing with multi-dimensional error random processes either the trace or the determinant of Ree (n − 1) can be used as a mean

square error measure. It turns out that by using either criterion, the obtained optimal system parameters and the associated error auto-correlation matrix, Ree,min (n − 1), are the same.

September 1, 2008

DRAFT

13

xk

{Hl }

nk sk

yk−1

yk−2

W1

yk−n+1

W2

Wn−2 Wn−1 bk y

Fig. 1.

Multivariate forward linear predictor with n − 1 taps.

and therefore, MMSEn = det(g(Tn−1 ))/ det(g(Tn )). Since det(g(Tn )) > 0, using a well known property regarding limits of sequences of positive numbers, yields det(g(Tn−1 )) 1 = . n→∞ det(g(Tn )) limn→∞ (det(g(Tn )))1/n

lim MMSEn = lim

n→∞

Finally by Theorem 6, the geometric MMSE for an infinite-length multivariate forward linear predictor can be written as

à MMSE =

exp µ

=

exp

1 2π 1 2π

Z

No 2π X 0

Z

!

k=1



0

¡ ¢ ln σ 2 + λk (H(ω)H(ω)∗ ) dω

¶ ¡ ¢ ln det σ 2 INo + H(ω)H(ω)∗ dω .

B. Infinite-length MIMO MMSE-DFE In the second example, we evaluate the geometric MMSE for an infinite-length decision feedback equalizer (DFE) in multiple-input multiple-output (MIMO) channels, as a limit of the corresponding finite-length cases. We consider the communication system shown in Figure 2 where the MIMO channel has Ni inputs and No outputs. Its discrete channel impulse response is given by H = (H0 , . . . , Hm ), where the Hl are No × Ni matrices. The channel output, yk , can be written as yk =

m X

Hl xk−l + nk

l=0 (1)

(Ni ) >

where xk = (xk , . . . , xk components symbols,

(j) xk ,

)

(1)

(No ) >

and nk = (nk , . . . , nk

)

denote the input and noise respectively. The input (j)

(j)

are discrete uncorrelated random variables with E[xk ] = 0 and E[|xk |2 ] = 1. The (j)

observation noise is uncorrelated with the input, and its samples, nk , are uncorrelated random variables with zero mean and variance σ 2 > 0. The equalizer block is a standard DFE that comprises an n tapped delay line feedforward filter (FFF), and an infinite tapped delay line feedback filter (FBF). We linearly estimate the input vector xk−4 , where 0 ≤ 4 ≤ n + m − 1 is the decision delay, from the channel outputs yk , yk−1 , . . . , yk−n+1 and from the inputs ek−4 has the xk−4−1 , xk−4−2 , . . . , xk−n−m+1 (assuming correct previous decisions). Thus, such an estimate x

September 1, 2008

DRAFT

14

Equalizer

nk xk

yk

{Hl }

ek−4 x

n-taps FFF

bk−4 x

Detector

inf. FBF

Fig. 2.

MIMO communication system: discrete time channel and equalizer.

form ek−4 = x

n−1 X

Wj yk−j +

n+m−4−1 X

j=0

Bl xk−4−l ,

l=1

with the Wj and Bl being Ni × No matrices and Ni × Ni matrices, respectively. These matrix taps Wj and Bl are chosen to minimize the MSE defined by the trace2 of the error auto-correlation matrix Ree (n) = E[ek−4 e∗k−4 ] ek−4 . From [22] it can be easily shown that the resulting error auto-correlation matrix can with ek−4 = xk−4 − x be expressed as e4+1 )]4+1,4+1 Ree,min (n) = [g(A e4+1 is the matrix obtained by deleting the last (n + m − 4 − 1)Ni rows and columns of the matrix Hn∗ Hn , where A with Hn being the channel matrix (13); the function g is now g(x) =

σ2 , +x

σ2

e4+1 )]4+1,4+1 ∈ CNi ×Ni is the 4 + 1, 4 + 1 block entry of g(A e4+1 ). Thus, the corresponding geometric and [g(A MMSE can be computed as e4+1 )]4+1,4+1 ). MMSEn (4 + 1) = det([g(A Let now F (ω) be the Ni ×Ni matrix function H(ω)∗ H(ω) where H(ω) =

Pm k=0

Hk e−kωi is the transfer function

of the channel. Although the (n + m)Ni × (n + m)Ni Hermitian matrix Hn∗ Hn is in general not block Toeplitz, it is equal to the banded Hermitian block Toeplitz matrix Tn+m (F ) except perhaps for the mNi × mNi upper-left and lower-right corners. Furthermore, the entries of these corners do not depend on the order. Consequently, if we define An as the nNi × nNi matrix obtained by deleting the last mNi rows and columns of Hn∗ Hn , then An is equal to Tn (F ) except perhaps for the mNi × mNi upper-left corner and therefore, {An } ∼ {Tn (F )}. The matrix An can be partitioned as

 An = 

2 As

 An−1

A12

A21

[An ]n,n

.

in the first application, the trace or the determinant of Ree (n) can be used as a mean square error measure. In [22] it was shown that

the obtained optimal system parameters and the associated error auto-correlation matrix, Ree,min (n), are the same by using either criterion.

September 1, 2008

DRAFT

15

Thus,

 InNi + σ −2 An = 

 I(n−1)Ni + σ σ

−2

−2

An−1

σ

A21

−2

IN i + σ

A12

−2

[An ]n,n

.

Combining Lemmas 3 and 4 we deduce that det([g(An )]n,n ) = det([(InNi + σ −2 An )−1 ]n,n ) =

det(I(n−1)Ni + σ −2 An−1 ) det(g(An )) = . −2 det(InNi + σ An ) det(g(An−1 ))

(14)

Without any loss of generality, suppose that 4 + 1 = ϕ(n + m) where ϕ : N → N is a function such that 1 ≤ ϕ(n) ≤ n, ϕ(n) → ∞, and m ≤ limn→∞ n − ϕ(n) (for example, ϕ(n + m) = n). Consequently, for e4+1 is equal to Aϕ(n+m) and by using (14) we have sufficiently large n, the matrix A lim MMSEn (ϕ(n + m))

n→∞

= =

lim det([g(Aϕ(n+m) )]ϕ(n+m),ϕ(n+m) )

n→∞

lim det([g(An )]n,n ) = lim

n→∞

n→∞

det(g(An )) = lim (det(g(An )))1/n . det(g(An−1 )) n→∞

Since {An } ∼ {Tn (F )}, from Theorem 6 the geometric MMSE for an infinite-length MIMO DFE can be written as

Ã

MMSE =

lim (det(g(Tn (F ))))1/n

n→∞

1 = exp − 2π

Z 0

Ni 2π X

¡

¢ ln 1 + σ −2 λk (H(ω)∗ H(ω)) dω

!

k=1

¶ µ Z 2π ¡ ¢ 1 −2 ∗ ln det INi + σ H(ω) H(ω) dω . = exp − 2π 0 C. Capacity of a time-invariant MIMO Gaussian channel with ISI and fixed input covariance matrix Finally, we derive the capacity of a time-invariant MIMO Gaussian channel with impulse response {Hj }m j=0 subject to an input constraint, namely, a flat input power spectral density. The input-output relation is given by yk =

m X

Hj xk−j + nk

j=0

where xk−m ∈ CNi and yk ∈ CNo , k ∈ N, are the input and output of the channel, respectively. The channel noise is independent of the input, and the noise random vectors nk ∈ CNo are assumed to be i.i.d. (independent identically distributed) and circularly symmetric Gaussian with zero mean and covariance σ 2 INo . For every n ∈ N, the output of the channel up to time n can be written in matrix notation as y(n) = Hn x(n+m) + n(n) , where Hn > > > > and n(n) = (n> is given by expression (13), y(n) = (y1> , . . . , yn> )> , x(n+m) = (x> 1−m , . . . , xn ) 1 , . . . , nn ) .

Assuming we restrict the covariance of the input random vector, x(n+m) , to be Kx = NPi I(n+m)Ni , the maximum ³ ´ of the normalized mutual information n1 I(x(n+m) ; y(n) ) is n1 log2 det InNo + σ2PNi Hn Hn∗ bits per channel use, and it is obtained when the input process is Gaussian, with zero mean [23]. By denoting SNR =

P σ2

the signal to

noise ratio at the input of the MIMO channel, the channel capacity is now given by the limiting expression µ ¶ ¶ µ nN nN 1 SNR 1 Xo 1 Xo SNR C = lim log2 det InNo + Tn (F ) = lim λk (Tn (F )) = lim log2 1 + g(λk (Tn (F ))) n→∞ n n→∞ n n→∞ n Ni Ni k=1

September 1, 2008

k=1

DRAFT

16

where F (ω) is the No ×No matrix function H(ω)H(ω)∗ , with H(ω) =

³ −kωi H e , and g(x) = log k 2 1+ k=0

Pm

´

SNR Ni x

Now, by applying Theorem 5 we obtain the well known expression for the desired capacity [24] ¶ µ ¶ µ Z 2π X Z 2π No 1 1 SNR SNR C= log2 1 + λk (H(ω)H(ω)∗ ) dω = log2 det INo + H(ω)H(ω)∗ dω. 2π 0 Ni 2π 0 Ni k=1

A PPENDIX OTHER CONCEPTS OF APPROXIMATION OF MATRIX SEQUENCES In this appendix we present the notions of asymptotically p-equivalent sequences of matrices and that of approximating class of sequences (a.c.s.). Both are concepts of approximation of matrix sequences that generalize Gray’s concept of asymptotically equivalent sequences. We present these two notions to show that Theorem 1 can be stated in a more general way, by proving that this theorem can be considered as a special instance of certain known results, namely [25, Theorems 2.1-2.2]. We begin with the definition of asymptotically p-equivalent sequences of matrices that can be obtained from the definition of asymptotically equivalent sequences of matrices by substituting the Frobenius norm for any other Schatten p-norm. We first review these well known norms (see e.g. [7] or [8]). Definition 6: The Schatten p-norm kAkS,p of an n × n matrix A, p ∈ [1, ∞], is the lp norm on Cn of the vector of the singular values of A, i.e., kAkS,p

  (Pn σ (A)p )1/p k=1 k =  max σ (A) 1≤k≤n

k

if p ∈ [1, ∞), if p = ∞,

where σk (A), with 1 ≤ k ≤ n, are the singular values of A counted with their multiplicities. Notice that kAkS,∞ = kAk2 (the spectral radius if A is Hermitian) and kAkS,2 = kAkF (the l2 norm on Cn of the vector containing the eigenvalues of A when A is Hermitian). All the Schatten norms are special instances of unitarily invariant norms (see e.g. [7] or [8]). Using these norms, the definition of asymptotically p-equivalent sequences of matrices is as follows. Definition 7: Consider a strictly increasing sequence of natural numbers {dn }. Let An and Bn be dn × dn matrices for all n ∈ N. We say that the sequences {An } and {Bn } are asymptotically p-equivalent, p ∈ [1, ∞), and write {An } ∼p {Bn }, if ∃M ≥ 0 ;

kAn k2 , kBn k2 ≤ M

∀n ∈ N

(15)

and lim

kAn − Bn kS,p

= 0. (16) 1/p dn The notion of asymptotically p-equivalent sequences is a concept of approximation in norm of matrix sequences. n→∞

Observe that for p = 2 and dn = n, {An } ∼p {Bn } is the same as {An } ∼ {Bn } (the standard notion introduced by Gray). The second and last concept is the notion of a.c.s. due to Serra-Capizzano [26]. It deals with approximation in norm and in rank.

September 1, 2008

DRAFT

.

17

Definition 8: Consider a strictly increasing sequence of natural numbers {dn }. Let An be a dn × dn matrix for all n ∈ N. We say that {{Bn,m }n }m is an a.c.s. for {An } if, for all sufficiently large m ∈ N, the following splitting holds: An = Bn,m + Rn,m + Nn,m

∀n > nm ,

with rank(Rn,m ) ≤ dn c(m),

kNn,m k2 ≤ ω(m),

where nm , c(m) and ω(m) depend only on m and, moreover, lim c(m) = lim ω(m) = 0.

m→∞

m→∞

The a.c.s notion is more general than that of asymptotically p-equivalent sequences. In fact, we show next that if {Bn } ∼p {An } then {{Bn } : m} is an a.c.s for {An }. Moreover, the requirement of uniform boundedness in (15) is not necessary. Lemma 6: Consider p ∈ [1, ∞) and a strictly increasing sequence of natural numbers {dn }. Let An and Bn be dn × dn matrices for all n ∈ N. If expression (16) holds then {{Bn } : m} is an a.c.s for {An }. Proof: Let Vn diag(σ1,n , . . . , σdn ,n )Wn∗ be a singular value decomposition of the matrix An − Bn with σ1,n ≥ σ2,n ≥ . . . ≥ σdn ,n . We define

½

φ(n, m) =

min j ∈ {1, . . . , dn } : σj,n

1 < m

¾ ,

Rn,m

=

Vn diag(σ1,n , . . . , σφ(n,m)−1,n , 0, . . . , 0)Wn∗ ,

Nn,m

=

Vn diag(0, . . . , 0, σφ(n,m),n , . . . , σdn ,n )Wn∗ .

Obviously An = Bn + Rn,m + Nn,m , since Rn,m + Nn,m = An − Bn . On the one hand, kNn,m k2 = σφ(n,m),n
nm .

September 1, 2008

DRAFT

18

Finally, it should be mentioned that by using the previous lemma (with p = 2 and dn = n), Theorem 1 can be obtained from [25, Theorems 2.1-2.2]. In fact, Theorem 1 can be stated in a more general way: i) If {An } ∼p {Bn }, with some p ∈ [1, ∞), then {g(An )} ∼p {g(Bn )}, ii) If {{Bn } : m} a.c.s for {An }, then {{g(Bn )} : m} a.c.s for {g(An )}, for all continuous function g with bounded support. Theorem 2 also holds if ∼ is replaced by ∼p for every p ∈ [1, ∞) (see [26], [27]), even for bounded symbols. If the sequence {Cn (F )} is substituted by {Cˆn (F )}, where the matrix Cˆn (F ) is obtained by minimizing kTn (F ) − XkF over all the block circulant matrices X (see [28, Definition 4.1]), then Lemma 5 also holds when ∼ is replaced by ∼p for all p ∈ [1, ∞), even for bounded symbols. In fact, it holds that (see [17], [27] and [28, Proposition 4.3]) i) limn→∞

ˆn (F )kS,p kTn (F )−C n1/p

= 0 ∀F ∈ Lp ,

ii) kCˆn (F )k ≤ kTn (F )k for every F ∈ L1 and for every unitarily invariant norm k · k, iii) kTn (F )k2 is uniformly bounded for all F ∈ L∞ . Moreover, the matrix Cˆn (F ), unlike Cn (F ), has a simple explicit expression (see the formulae (18) in [28]) in terms of the Fourier coefficients of F . ACKNOWLEDGMENT The authors would like to thank the anonymous referees for their thorough review. R EFERENCES [1] U. Grenander and G. Szeg¨o, Toeplitz Forms and Their Applications. Berkeley and Los Angeles: University of California Press, 1958. [2] R. M. Gray, “Toeplitz and circulant matrices: A review,” Foundations and Trends in Communications and Information Theory, vol. 2, no. 3, pp. 155–239, 2006, [Original version: Tech. Report 6502-1, Stanford Electronic Laboratory, June 1971]. [3] H. Gazzah, P. A. Regalia, and J. P. Delmas, “Asymptotic eigenvalue distribution of block Toeplitz matrices and application to blind SIMO channel identification,” IEEE Trans. Inform. Theory, vol. 47, no. 3, pp. 1243–1251, Mar. 2001. [4] N. K. Bose and K. J. Boo, “Asymptotic eigenvalue distribution of block-Toeplitz matrices,” IEEE Trans. Inform. Theory, vol. 44, no. 2, pp. 858–861, Mar. 1998. [5] P. Tilli, “A note on the spectral distribution of Toeplitz matrices,” Linear and Multilinear Algebra, vol. 45, pp. 147–159, 1998. [6] S. Serra-Capizzano, “The GLT class as a generalized Fourier analysis and applications,” Linear Algebra Appl., vol. 419, p. 180–233, 2006. [7] R. Bhatia, Matrix Analysis.

New York: Springer-Verlag, 1997.

[8] R. A. Horn and C. R. Johnson, Matrix Analysis.

New York: Cambridge University Press, 1990.

[9] R. M. Gray, “On the asymptotic eigenvalue distribution of Toeplitz matrices,” IEEE Trans. Inform. Theory, vol. 18, no. 6, pp. 725–730, Nov. 1972. [10] F. R. Gantmacher, The Theory of Matrices. New York: Chelsea Publishing, 1977, vol. 1. [11] R. A. Horn and C. R. Johnson, Topics in Matrix Analysis.

New York: Cambridge University Press, 1994.

[12] R. A. Roberts and C. T. Mullis, Digital Signal Processing. Addison-Wesley, 1987. [13] W. Rudin, Principles of Mathematical Analysis.

McGraw-Hill, 1976.

[14] M. Miranda and P. Tilli, “Asymptotic spectra of Hermitian block Toeplitz matrices and preconditioning results,” SIAM J. Matrix Anal. Appl., vol. 21, no. 3, pp. 867–881, 2000. [15] A. B¨ottcher and B. Silbermann, Introduction to Large Truncated Toeplitz Matrices. New York: Springer-Verlag, 1999. [16] H. Widom, “Asymptotic behavior of block Toeplitz matrices and determinants. II,” Advances in Math., vol. 21, pp. 1–29, 1976.

September 1, 2008

DRAFT

19

[17] S. Serra-Capizzano, “Test functions, growth conditions and Toeplitz matrices,” Rend. Circolo Mat. Palermo, Serie II, vol. 68, pp. 791–795, 2002. [18] A. B¨ottcher, J. Guti´errez-Guti´errez, and P. M. Crespo, “Mass concentration in quasicommutators of Toeplitz matrices,” J. Comput. Appl. Math., vol. 205, pp. 129–148, 2007. [19] P. M. Crespo and J. Guti´errez-Guti´errez, “On the elementwise convergence of continuous functions of Hermitian banded Toeplitz matrices,” IEEE Trans. Inform. Theory, vol. 53, no. 3, pp. 1168–1176, Mar. 2007. [20] J. Guti´errez-Guti´errez, P. M. Crespo, and A. B¨ottcher, “Functions of banded Hermitian block Toeplitz matrices in signal processing,” Linear Algebra Appl., vol. 422, pp. 788–807, 2007. [21] S. Haykin, Adaptive Filter Theory.

Prentice-Hall, 1986.

[22] N. Al-Dhahir and A. H. Sayed, “The finite-length multi-input multi-output MMSE-DFE,” IEEE Trans. Signal Processing, vol. 48, no. 10, pp. 2921–2936, Oct. 2000. [23] T. M. Cover and J. A. Thomas, Elements of Information Theory.

New Jersey: Wiley, 2005.

[24] L. H. Brandenburg and A. D. Wyner, “Capacity of the Gaussian channel with memory: The multivariate case,” Bell System Technical J., vol. 53, no. 5, pp. 745–778, May-June 1974. [25] S. Serra-Capizzano and D. Sesana, “Approximating classes of sequences: the Hermitian case,” Heinig’s Memorial Volume, 2008, D. Bini, V. Olshevsky, E. Tyrtyshnikov Eds, Birkh¨auser. [26] S. Serra-Capizzano, “Distribution results on the algebra generated by Toeplitz sequences: a finite dimensional approach,” Linear Algebra Appl., vol. 328, pp. 121–130, 2001. [27] S. Serra-Capizzano and P. Tilli, “On unitarily invariant norms of matrix valued linear positive operators,” J. Inequalities Appl., vol. 7, no. 3, pp. 309–330, 2002. [28] S. Serra-Capizzano, “The spectral approximation of multiplication operators via asymptotic (structured) linear algebra,” Linear Algebra Appl., vol. 424, pp. 154–176, 2007.

´ Guti´errez-Guti´errez was born in Granada, Spain. He received his degree in Mathematics from the University Jesus of Granada, Spain, in 1999 and his Ph.D. degree in Electronics and Communications from the University of Navarra, PLACE

Spain, in 2004. He is currently with the Electronics and Communications Department at the R&D center CEIT, San

PHOTO

Sebasti´an, Spain. He is also an associate professor at the Engineering School of the University of Navarra (Tecnun). His

HERE

current research interests include matrix analysis, random matrix theory, spectral theory and orthogonal polynomials applied to problems in communications.

September 1, 2008

DRAFT

20

Pedro M. Crespo (S’80-M’84-SM’91) was born in Barcelona, Spain. In 1978, he received the engineering degree in Telecommunications from Universidad Polit´ecnica de Barcelona. He received the M.Sc. in Applied Mathematics and PLACE

Ph.D. in Electrical Engineering from the University of Southern California (USC), in 1983 and 1984, respectively.

PHOTO

From September 1984 to April 1991, he was a member of the technical staff in the Signal Processing Research group

HERE

at Bell Communications Research, New Jersey, USA, where he worked in the areas of digital communication and signal processing. He actively contributed in the definition and development of the first prototypes of xDSL (Digital Subscriber Lines transceivers). From May 1991 to August 1999 he was a district manager at Telef´onica Investigaci´on

y Desarrollo, Madrid, Spain. From 1999 to 2002 he was the technical director of the Spanish telecommunication operator Jazztel. At present he is the head of the Electronics and Communications Department at the R&D center CEIT, San Sebasti´an, Spain. He is also a full professor at the Engineering School of the University of Navarra (Tecnun). Dr. Crespo is a Recipient of the Bell Communication Research Award of excellence. He holds seven patents in the areas of digital subscriber lines and wireless communications. His research interests include the general areas of digital communications, signal processing and information theory.

September 1, 2008

DRAFT