
ISIT 2008, Toronto, Canada, July 6 - 11, 2008

Distortion Lower Bounds for Finite Dimensional Joint Source-Channel Coding

Amir Ingber, Itai Leibowitz, Ram Zamir and Meir Feder
Department of Electrical Engineering - Systems, Tel-Aviv University, Tel-Aviv 69978, ISRAEL
Email: {ingber, leibo, zamir, meir}@eng.tau.ac.il

Abstract: In this work we consider joint source-channel coding (JSCC) schemes that are limited to work in blocks of finite length. We focus on the high resolution and high signal-to-noise ratio (SNR) regime, and derive new lower bounds for the distortion of JSCC schemes over $r$th-moment constrained additive noise channels. These new bounds are based on the method of Ziv and Zakai [11], combined with the Rényi information measure, as was recently proposed by Leibowitz and Zamir [5]. Numerical results are presented for the case of a Gaussian source and channel, and it is shown that the new bounds improve upon Shannon's original bound in several cases, including bandwidth expansion and reduction.

I. INTRODUCTION

We consider the problem of transmitting a real valued source over an additive noise channel, when the goal is to reproduce the source at the receiver with minimum distortion. We assume that there is a limitation on the delay of the communication system. The limit on the delay is represented by the model of a system that transmits a vector $S$ of $k$ source samples over $n$ channel uses, denoted by the vector $X = f(S)$. We consider a general additive noise channel, whose output $Y$ is given by

$$Y = X + Z, \tag{1}$$

where $Z$ is the additive noise. At the receiver side the reproduction $\hat{S} = g(Y)$ is constructed, and the performance of the communication system is measured by the average distortion per source sample

$$D = \frac{1}{k} E[d(S, \hat{S})]. \tag{2}$$

In Shannon's famous paper [9], he already provided a lower bound for the performance in this setting. This bound has the well known form of

$$R(D) \le C, \tag{3}$$

where $R(D)$ is the rate-distortion function (RDF) of the $k$-dimensional source, and $C$ is the capacity of the $n$-dimensional channel. The RDF is given by

$$R(D) = \inf I(S; \hat{S}), \tag{4}$$

where the infimum is over all the conditional distributions of $\hat{S}$ given $S$ that satisfy the distortion constraint. The capacity is given by

$$C = \sup I(X; Y), \tag{5}$$

where the supremum is taken over all the channel inputs $X$, and sometimes only over channel inputs that satisfy some constraint (e.g. a power constraint). Since $R(D)$ is monotone non-increasing with $D$, (3) gives the bound

$$D \ge R^{-1}(C). \tag{6}$$

978-1-4244-2571-6/08/$25.00 ©2008 IEEE

Shannon's bound (3) is shown directly using the data processing theorem (see, e.g., [2], Theorem 2.8.1), which states that if the random variables $A$, $B$ and $C$ form a Markov chain (denoted $A \to B \to C$), then

$$I(A; B) \ge I(A; C), \qquad I(B; C) \ge I(A; C). \tag{7}$$

For our system, it is clear that $S \to X \to Y \to \hat{S}$, and an immediate corollary is that

$$I(S; \hat{S}) \le I(S; Y) \le I(X; Y), \qquad I(S; \hat{S}) \le I(X; \hat{S}) \le I(X; Y). \tag{8}$$
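The data processing theorem (7) can be checked numerically on a small discrete example. The sketch below is only an illustration and is not taken from the paper: it assumes a uniform binary $A$ passed through two cascaded binary symmetric channels (crossover probabilities 0.1 and 0.2, chosen arbitrarily), so that $A \to B \to C$ is a Markov chain. The function names are hypothetical.

```python
import math

def mutual_information(joint):
    """Shannon mutual information (in nats) of a joint pmf given as a 2-D list."""
    px = [sum(row) for row in joint]
    py = [sum(col) for col in zip(*joint)]
    return sum(
        p * math.log(p / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0
    )

def bsc_joint(crossover):
    """Joint pmf of a uniform input bit and its output through a BSC."""
    return [[0.5 * (1 - crossover), 0.5 * crossover],
            [0.5 * crossover, 0.5 * (1 - crossover)]]

p1, p2 = 0.1, 0.2                          # A -> B uses BSC(p1), B -> C uses BSC(p2)
p_cascade = p1 * (1 - p2) + (1 - p1) * p2  # end-to-end crossover of the cascade

i_ab = mutual_information(bsc_joint(p1))         # I(A; B)
i_bc = mutual_information(bsc_joint(p2))         # I(B; C); B is uniform as well
i_ac = mutual_information(bsc_joint(p_cascade))  # I(A; C)

print(f"I(A;B) = {i_ab:.4f}, I(B;C) = {i_bc:.4f}, I(A;C) = {i_ac:.4f} nats")
```

In this toy chain $I(A;C)$ comes out smaller than both $I(A;B)$ and $I(B;C)$, as (7) requires.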

If the block lengths ($k$ and $n$) are unlimited, then Shannon's well known separation theorem ([9], Theorem 21) states that the bound (3) can be asymptotically achieved. Moreover, it can be achieved using separate source coding and channel coding operations. However, when $k$ and/or $n$ are finite, Shannon's result no longer holds, and it is generally not true that (3) can be achieved, whether by separate or by joint source-channel coding. Therefore, stronger bounds on the performance under limited block lengths are needed in order to better understand the limitations of the communication system.

Ziv and Zakai, in their 1973 paper [11], noticed that using functionals other than Shannon's mutual information can also result in bounds of the form (3), as long as the new functionals satisfy the data processing theorem (Eq. (7)). By defining the analogs of the capacity and the RDF and carefully selecting the functionals that replace the mutual information, stronger bounds can be derived.

Recently, Ziv and Zakai's approach was utilized in order to get additional bounds ([5], [4]). By selecting the Rényi information measure as the alternative functional, new bounds were derived for the distortion in fixed-rate vector quantization, and in transmission through low-dimensional modulo-lattice additive noise channels.

In this paper we extend the work done in [5] and [4] and find stronger bounds for general additive noise channels. For the high-SNR and high resolution case, we develop the Rényi capacity for an arbitrary additive noise channel subject to an $r_c$th moment input constraint, for some positive $r_c$. Combined with the RDF for a general source vector derived in [4], we provide a general bound on the lowest achievable distortion for transmission in finite blocks over an additive noise channel.

The paper is organized as follows. In Section II we formulate the problem and review the Ziv-Zakai method and its usage combined with the Rényi information. In Section III we describe our main result: the new bound for the distortion over general additive noise channels. In Section IV we consider the special case of a Gaussian source with square distortion measure and the power constrained Gaussian channel. We show numerical results and demonstrate the improvement of our bound in the case of bandwidth reduction and expansion. Conclusions and further work follow in Section V.

II. PRELIMINARIES

A. Problem formulation

Let $S$ be a $k$-dimensional source with probability density function (pdf) $f_S(s)$, to be transmitted over an $n$-dimensional channel. We consider general additive noise channels, with a constraint on the $r_c$th moment of the channel input, which generalizes the usual power constraint ($r_c = 2$). The constraint is given by

$$\frac{1}{n} E\left[ \|X\|_{r_c}^{r_c} \right] \le 1, \tag{9}$$

where $E[\cdot]$ denotes expectation and $\|x\|_r \triangleq \left( \sum_i |x_i|^r \right)^{1/r}$. The output and additive noise shall be denoted by $Y$ and $Z$ respectively. The noise pdf is given by $f_Z(z)$. At the receiver side the reconstructed source shall be denoted $\hat{S}$, and the distortion measure shall be the $r_d$th power:

$$d(S, \hat{S}) = \|S - \hat{S}\|_{r_d}^{r_d}, \tag{10}$$

which generalizes the common square error distortion measure ($r_d = 2$).

B. The Ziv-Zakai method

Shannon's information measure is given by

$$I(X; Y) = -\int f_{XY}(x, y) \log \frac{f_X(x) f_Y(y)}{f_{XY}(x, y)} \, dx \, dy, \tag{11}$$

where log denotes the natural logarithm. Ziv and Zakai noticed that replacing the $-\log(\cdot)$ by a different convex function $\Phi(\cdot)$ that satisfies

$$\lim_{t \to 0} t \cdot \Phi(1/t) = 0, \tag{12}$$

results in an alternative information measure that still satisfies the data processing inequality as in (7). The generalized mutual information relative to the function $\Phi(\cdot)$ is given by

$$I_\Phi(X; Y) = \int f_{XY}(x, y) \, \Phi\left( \frac{f_X(x) f_Y(y)}{f_{XY}(x, y)} \right) dx \, dy. \tag{13}$$

The data processing inequality for the generalized mutual information gives

$$I_\Phi(S; \hat{S}) \le I_\Phi(X; Y). \tag{14}$$

The generalized RDF of the source $S$ is defined by

$$R_\Phi(D) \triangleq \inf I_\Phi(S; \hat{S}), \tag{15}$$

where the infimum is taken over all the conditional distributions of $\hat{S}$ given $S$ that satisfy the distortion constraint. The generalized capacity is defined similarly:

$$C_\Phi \triangleq \sup I_\Phi(X; Y), \tag{16}$$

where the supremum is taken over all input distributions that satisfy the input constraint (9). Combining (14)-(16) we get

$$R_\Phi(D) \le C_\Phi, \tag{17}$$

and the resulting lower bound on the distortion. We also note that for a memoryless source and channel, Shannon's information is additive in the block length, i.e.

$$I(X_1, X_2; Y_1, Y_2) = I(X_1; Y_1) + I(X_2; Y_2). \tag{18}$$

This property does not hold for the generalized mutual information, which makes the bound (17) depend on the block length. Therefore, by a clever selection of the generalized information measure, stronger bounds can be derived that are specific to the block length.

C. Rényi entropy and information

In 1961, Alfréd Rényi introduced a generalization of Shannon's information measure [8]. Rényi's entropy of order $\alpha$ is given by

$$H_\alpha(X) \triangleq \frac{1}{1-\alpha} \log \int f_X(x)^\alpha \, dx. \tag{19}$$

Rényi's information of order $\alpha$ is given by

$$I_\alpha(X; Y) \triangleq \frac{1}{\alpha-1} \log \int f_{XY}(x, y) \left( \frac{f_{XY}(x, y)}{f_X(x) f_Y(y)} \right)^{\alpha-1} dx \, dy. \tag{20}$$

In both definitions $\alpha > 0$ and $\alpha \ne 1$, where in the limit of $\alpha \to 1$ both the entropy and the information are identical to Shannon's original definitions.

The Rényi information indeed satisfies the data processing inequality, and can be used in order to provide bounds of the form (17). This can be shown directly, using the convexity property of the Rényi information measure, or more elegantly, following [5]: we define the Rényi information power $IP_\alpha(X; Y) = \exp(|\alpha - 1| I_\alpha(X; Y))$. The information power is indeed of the form (13), with

$$\Phi(t) = \begin{cases} t^{1-\alpha}, & \alpha > 1; \\ -t^{1-\alpha}, & \alpha < 1. \end{cases} \tag{21}$$

In both cases it is easy to see that (12) is satisfied. Therefore the Rényi information power satisfies the data processing inequality. Since the Rényi information is related to the information power by a monotone increasing function, we get that the data processing inequality holds for the Rényi information as well.

D. Rényi rate-distortion function at high resolution

In [5], the Rényi information measure was used with the Ziv-Zakai method in order to provide new bounds for the distortion in fixed-rate vector quantization and for transmission through low-dimensional modulo-lattice channels. For the case of high resolution, the Rényi RDF was calculated for an arbitrary source vector with $r$th-power distortion.

The derivation of the Rényi rate-distortion function involves the usage of the Rényi moment-power inequality [6]:

Theorem 1: For $r > 0$, $\alpha > \frac{k}{k+r}$, and a random vector $X \in \mathbb{R}^k$ with finite Rényi entropy of order $\alpha$ and $r$th moment,

$$\frac{E\{\|X\|_r^r\}^{1/r}}{\exp\left( \frac{1}{k} H_\alpha(X) \right)} \ge c_{k,r,\alpha}, \tag{22}$$

where

$$c_{k,r,\alpha} \triangleq a_{k,r,\alpha}^{1/k} \left[ \alpha \left( 1 + \frac{r}{k} \right) - 1 \right]^{-\frac{1}{r}} b_{k,r,\alpha}, \tag{23}$$

(23)



$$b_{k,r,\alpha} \triangleq \begin{cases} \left( 1 - \dfrac{k(1-\alpha)}{r\alpha} \right)^{\frac{1}{k(1-\alpha)}}, & \alpha \ne 1; \\[2ex] \exp\left( -\dfrac{1}{r} \right), & \alpha = 1, \end{cases} \tag{24}$$

and

$$a_{k,r,\alpha} \triangleq \begin{cases} \dfrac{(1-\alpha)^{k/r+1} \, \Gamma\left( \frac{k}{2}+1 \right)}{\pi^{k/2} \, \beta\left( \frac{k}{r}+1, \, \frac{1}{1-\alpha} - \frac{k}{r} \right)}, & \alpha < 1; \\[3ex] \dfrac{\Gamma\left( \frac{k}{2}+1 \right)}{\pi^{k/2} \, \Gamma\left( \frac{k}{r}+1 \right)}, & \alpha = 1; \\[3ex] \dfrac{(\alpha-1)^{k/r+1} \, \Gamma\left( \frac{k}{2}+1 \right)}{\pi^{k/2} \, \beta\left( \frac{k}{r}+1, \, \frac{1}{\alpha-1} \right)}, & \alpha > 1. \end{cases} \tag{25}$$

The above constants are simply the normalizing coefficients of the generalized Gaussian distribution, which achieves the Rényi moment-power inequality with equality (see [6]). The resulting Rényi RDF is given as follows [5]:

Theorem 2: Let $S$ be a $k$-dimensional source vector with pdf $f_S(s)$, and let $r_d > 0$ and $\alpha > \frac{k}{k+r_d}$. For the $r_d$th-power distortion measure (10) and high resolution, the Rényi RDF is approximately given by

$$R_\alpha(D) \cong \frac{k}{r_d} \log\left( \frac{1}{k} \, c_{k,r_d,\alpha}^{r_d} \cdot \frac{1}{D} \right) + H_\mu(S), \tag{26}$$

where

$$\mu \triangleq \frac{(2-\alpha) r_d + k(\alpha-1)}{k(\alpha-1) + r_d}. \tag{27}$$

The approximation is in the sense that the difference between $R_\alpha(D)$ and the RHS of (26) vanishes as $D \to 0$.

III. MAIN RESULT

A. Rényi capacity of order α

In order to provide a bound of the form (17) for the additive noise channel we now calculate the Rényi capacity for that channel:

$$C_\alpha = \sup I_\alpha(X; Y), \tag{28}$$

where the maximization is over all legal pdfs for $X$ that satisfy

$$\frac{1}{n} E\left\{ \|X\|_{r_c}^{r_c} \right\} \le 1. \tag{29}$$

Since the RDF result (26) refers to the case of high resolution, we make high-SNR simplifications for the capacity as well. This enables us to get analytical bounds for the high-SNR and high-resolution regime.

Theorem 3: Let $r_c > 0$ and $0 < \alpha < 2 - \frac{n}{n+r_c}$. For the high-SNR $n$-dimensional additive noise channel with an $r_c$th moment input constraint, the Rényi capacity of order $\alpha$ is approximately given by

$$C_\alpha \cong \frac{n}{r_c} \log\left( n \cdot c_{n,r_c,2-\alpha}^{-r_c} \right) - H_\alpha(Z), \tag{30}$$

where $c_{n,r_c,2-\alpha}$ is defined in (23). The approximation is in the same sense as in Theorem 2.

Proof: We follow the asymptotic assumptions in [5], and assume that for all $w$, $f_X(w) \approx f_Y(w)$, which essentially means that the input and the output of the channel have similar distributions. Another assumption is that the conditional pdf $Q(y|x)$ is positive only in a small neighborhood of $x$, so that in the high SNR limit $Q(y|x) > 0$ shall mean that $x \approx y$. Applying these approximations, we get:

$$\begin{aligned} \int\!\!\int f_{XY}(x, y) \left( \frac{f_{XY}(x, y)}{f_X(x) f_Y(y)} \right)^{\alpha-1} dx \, dy &= \int\!\!\int Q(y|x) f_X(x) \left( \frac{Q(y|x)}{f_Y(y)} \right)^{\alpha-1} dx \, dy \\ &\overset{(a)}{\cong} \int\!\!\int Q(y|x) f_X(x) \left( \frac{Q(y|x)}{f_X(x)} \right)^{\alpha-1} dy \, dx \\ &\overset{(b)}{=} \int\!\!\int f_Z(y - x)^\alpha \cdot f_X(x)^{2-\alpha} \, dy \, dx \\ &= \int f_Z(z)^\alpha \, dz \cdot \int f_X(x)^{2-\alpha} \, dx, \end{aligned} \tag{31}$$

where (a) is due to the high SNR approximations, and (b) is due to the fact that the noise is additive. Plugging (31) into the Rényi information:

$$I_\alpha(X; Y) \cong \frac{1}{\alpha-1} \log \int f_X(x)^{2-\alpha} \, dx - H_\alpha(Z) = H_{2-\alpha}(X) - H_\alpha(Z). \tag{32}$$

Since $H_\alpha(Z)$ does not depend on $X$, the Rényi capacity is given by

$$C_\alpha = \sup I_\alpha(X; Y) \cong \sup\{ H_{2-\alpha}(X) \} - H_\alpha(Z). \tag{33}$$

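In order to bound the maximum of $H_{2-\alpha}(X)$ we utilize the Rényi moment-power inequality. According to (22),

$$\frac{E\left\{ \|X\|_{r_c}^{r_c} \right\}^{1/r_c}}{\exp\left( \frac{1}{n} H_{2-\alpha}(X) \right)} \ge c_{n,r_c,2-\alpha}. \tag{34}$$

Rearranging terms, we get an upper bound on the Rényi entropy:

$$H_{2-\alpha}(X) \le \frac{n}{r_c} \log\left( E\left\{ \|X\|_{r_c}^{r_c} \right\} c_{n,r_c,2-\alpha}^{-r_c} \right). \tag{35}$$

Adding the $r_c$th-power input constraint (29):

$$H_{2-\alpha}(X) \le \frac{n}{r_c} \log\left( n \cdot c_{n,r_c,2-\alpha}^{-r_c} \right). \tag{36}$$

Plugging into (33), we get

$$C_\alpha \overset{(a)}{\le} \frac{n}{r_c} \log\left( n \cdot c_{n,r_c,2-\alpha}^{-r_c} \right) - H_\alpha(Z). \tag{37}$$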

Inequality (a) is in fact an equality, since the moment-power inequality (22) holds with equality when the input follows a generalized Gaussian distribution (see [6]). Note that even the inequality is sufficient in order to bound the distortion.

B. The SNR effect on the Rényi capacity

For an $r_c$th-moment channel input constraint, we define the channel SNR as the ratio between the $r_c$th moments of the channel input $X$ and the noise $Z$. We now write the Rényi capacity in terms of the SNR.

Lemma 1: Let $r > 0$, $\lambda > 0$ and $a > 0$, and let $W$ be an $n$-dimensional random vector with unit average $r$th moment and Rényi entropy $H_\lambda(W)$. Then the random vector $U = a \cdot W$ has average $r$th moment $a^r$ and Rényi entropy

$$H_\lambda(U) = H_\lambda(W) + n \cdot \log(a). \tag{38}$$

Proof: If the pdf of $W$ is $f_W(w)$, then the pdf of $U$ is given by $f_U(u) = \frac{1}{a^n} f_W(u/a)$. The average $r$th moment of $U$ is given by

$$\frac{1}{n} E\{\|U\|_r^r\} = a^r \frac{1}{n} E\{\|W\|_r^r\} = a^r. \tag{39}$$

The Rényi entropy of order $\lambda$ of $U$ is given by

$$\begin{aligned} H_\lambda(U) &= \frac{1}{1-\lambda} \log \int f_U(u)^\lambda \, du \\ &= \frac{1}{1-\lambda} \log\left( \frac{1}{a^{\lambda n}} \cdot a^n \cdot \int f_W(w)^\lambda \, dw \right) \\ &= H_\lambda(W) + \frac{1}{1-\lambda} \log a^{n(1-\lambda)} \\ &= H_\lambda(W) + n \cdot \log(a). \end{aligned} \tag{40}$$

For an additive noise $Z$, we define the normalized noise $Z_0 = \frac{1}{a} Z$, such that $Z_0$ has unit average $r_c$th moment. Since the channel input is constrained to have unit average $r_c$th moment, the channel SNR is given by $1/a^{r_c}$. Combined with Lemma 1 we get that

$$H_\alpha(Z) = H_\alpha(Z_0) - \frac{n}{r_c} \log(\mathrm{SNR}). \tag{41}$$

Combining with (30), we get

$$C_\alpha \cong \frac{n}{r_c} \log\left( n \cdot c_{n,r_c,2-\alpha}^{-r_c} \right) - H_\alpha(Z_0) + \frac{n}{r_c} \log(\mathrm{SNR}). \tag{42}$$

C. Combining the Rényi capacity and RDF

Since the Rényi information satisfies the data processing inequality, we may utilize the method of Ziv and Zakai, and bound the distortion:

$$D \ge R_\alpha^{-1}(C_\alpha), \tag{43}$$

where $R_\alpha^{-1}(\cdot)$ is the inverse of the Rényi RDF (26), and $C_\alpha$ is the Rényi capacity (30). Equation (43) holds for $\frac{k}{k+r_d} < \alpha < 2 - \frac{n}{n+r_c}$. Therefore the strongest bound can be attained by optimizing w.r.t. $\alpha$:

$$D \ge \max_{\alpha \in \left( \frac{k}{k+r_d}, \; 2 - \frac{n}{n+r_c} \right)} R_\alpha^{-1}(C_\alpha). \tag{44}$$
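As a sanity check on (43), the worked specialization below sets $\alpha = 1$, where the Rényi quantities reduce to Shannon's. It is an illustration under added assumptions, not part of the original derivation: the source is taken to be i.i.d. unit-variance Gaussian, the normalized noise $Z_0$ i.i.d. unit-variance Gaussian, and $r_d = r_c = 2$. The constants come from (23)-(25), the RDF from (26)-(27) and the capacity from (42).

```latex
\begin{align*}
% alpha = 1, r = 2:  a_{k,2,1} = \pi^{-k/2},  b_{k,2,1} = e^{-1/2}
c_{k,2,1}^{2} &= \pi^{-1} \cdot \frac{k}{2} \cdot e^{-1} = \frac{k}{2\pi e},
  \qquad \mu = 1, \qquad H_1(S) = \frac{k}{2}\log(2\pi e), \\
R_1(D) &\cong \frac{k}{2}\log\Big(\frac{1}{k}\cdot\frac{k}{2\pi e}\cdot\frac{1}{D}\Big)
  + \frac{k}{2}\log(2\pi e) = \frac{k}{2}\log\frac{1}{D}, \\
C_1 &\cong \frac{n}{2}\log\Big(n\cdot\frac{2\pi e}{n}\Big) - \frac{n}{2}\log(2\pi e)
  + \frac{n}{2}\log(\mathrm{SNR}) = \frac{n}{2}\log(\mathrm{SNR}), \\
R_1(D) \le C_1 \;&\Longrightarrow\; D \ge \mathrm{SNR}^{-n/k}.
\end{align*}
```

Under these assumptions the $\alpha = 1$ case reproduces the high-SNR form of Shannon's bound for the Gaussian source and channel.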

Note that the optimized bound is guaranteed to be no weaker than Shannon's bound (6), which can be viewed as the special case of $\alpha = 1$. In the next section we demonstrate several cases where the new bound is stronger than Shannon's bound.

IV. EXAMPLES

A simple and practical case is the Gaussian source with square distortion measure and a power constrained additive white Gaussian noise (AWGN) channel. In this section we provide the optimization results for the new bound (Eq. (44)) for this case.

For the case of scalar source and channel ($k = n = 1$) it is well known that Shannon's bound is achievable, by simply sending a scaled version of the source directly into the channel. This approach, also known as uncoded transmission [3], also applies to the case of $k = n$. It is therefore clear that in these cases the new bound (44) is equal to Shannon's bound.

The problem of sending a $k$-dimensional source over an $n$-dimensional noisy channel when $n \ne k$ has been studied by several authors, starting with Shannon in 1949 [10] and more recently in [7], [1] and others. These works focused on the low dimensional case, e.g. 2 : 1, 1 : 2 and 3 : 2, and the methods considered were mainly space-filling curves. None of the schemes that were studied (for $k \ne n$) achieved Shannon's bound, and our new bounds indeed explain a part of the gap to Shannon's bound.

A. Shannon's bound

The Shannon capacity of the $n$-dimensional AWGN channel is well known to be $C = \frac{n}{2} \log(1 + \mathrm{SNR})$. The RDF for a unit variance $k$-dimensional Gaussian vector is also known, and is given by $R(D) = \frac{k}{2} \log \frac{1}{D}$. Therefore, Shannon's bound on the distortion as a function of the channel SNR is given by

$$D \ge (1 + \mathrm{SNR})^{-n/k} \approx \mathrm{SNR}^{-n/k}. \tag{45}$$

The same bound can be viewed as a lower bound on the SNR required in order to achieve a given distortion $D$:

$$\mathrm{SNR} \ge D^{-k/n} - 1 \approx D^{-k/n}. \tag{46}$$

(Note that for high SNR / low distortion the addition of 1 to the SNR can be neglected.) The improvement of the new bound (44) can be measured by its gain w.r.t. either (45) or (46).

B. 2 : 1 bandwidth reduction

For the case of 2 : 1 (2D source and a scalar channel) the best method known so far was proposed by Chung [1]. This method includes a non-uniform space-filling spiral that is optimized for the specific channel SNR. The distortion achieved by this method was only about 0.9 dB from the Shannon bound ($R(D) = C$). Calculating our new bound (44) for this case ($k = 2$, $n = 1$, $r_d = r_c = 2$) gives a distortion bound about 0.25 dB better than Shannon's, which reduces the gap between the best known achievable performance and the strongest bound from 0.9 dB to about 0.65 dB, and in fact eliminates a significant part of the gap.

C. k : 1 bandwidth reduction

We have calculated the new bound for the case of $k$ : 1 bandwidth reduction for $k$ ranging from 1 to 40. In Figure 1 the improvement in the distortion bound is plotted. The same improvement from an SNR point of view is shown in Figure 2. Note that as $k$ grows, the distortion bound improvement diminishes, while the SNR bound improvement grows. This is easily explained: at high SNR, Shannon's bound (on the graph of the distortion in dB as a function of the SNR in dB) is a decreasing linear function with slope $-1/k$. Therefore an improvement of $x$ dB in the distortion amounts to an improvement of $k \cdot x$ dB in the SNR.

D. 1 : n bandwidth expansion

For the 1 : $n$ bandwidth expansion case we have also calculated the new bound for $n$ ranging from 1 to 40. In Figure 3 the improvement in the distortion bound is plotted, and the improvement in the SNR bound is shown in Figure 4. Note the exceptionally high gains (of tens of dB) in Fig. 3. These are indeed the real improvements of the bound. For example, for an SNR of 20 dB and 1:10 bandwidth expansion, Shannon's distortion bound (in terms of signal-to-distortion ratio) becomes 200 dB (!). The improved bound is 19 dB lower, at only 181 dB.
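The optimization in (44) for the Gaussian case can be sketched numerically as follows. This is a minimal sketch under stated assumptions, not the authors' code: it assumes an i.i.d. unit-variance Gaussian source, unit-variance Gaussian normalized noise, $r_d = r_c = 2$, the closed-form Rényi entropy of a Gaussian vector (a standard result that the paper does not spell out), and a plain grid search over $\alpha$ in place of a proper optimizer. All function names are hypothetical. Writing $R_\alpha(D) \le C_\alpha$ with (26) and (42) as $\log(1/D) \le \frac{n}{k}\log(\mathrm{SNR}) + G(\alpha)$, the gain over Shannon's high-SNR bound is determined by the smallest $G(\alpha)$.

```python
import math

R = 2  # squared-error distortion and power constraint (r_d = r_c = 2)

def log_beta(x, y):
    """Natural log of the beta function."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def log_c(k, alpha, r=R):
    """log of c_{k,r,alpha}, following (23)-(25)."""
    # log of Gamma(k/2 + 1) / pi^(k/2), common to all three cases of (25)
    log_ball = math.lgamma(k / 2 + 1) - (k / 2) * math.log(math.pi)
    if abs(alpha - 1) < 1e-9:
        log_a = log_ball - math.lgamma(k / r + 1)
        log_b = -1 / r
    else:
        if alpha < 1:
            log_a = ((k / r + 1) * math.log(1 - alpha) + log_ball
                     - log_beta(k / r + 1, 1 / (1 - alpha) - k / r))
        else:
            log_a = ((k / r + 1) * math.log(alpha - 1) + log_ball
                     - log_beta(k / r + 1, 1 / (alpha - 1)))
        log_b = math.log(1 - k * (1 - alpha) / (r * alpha)) / (k * (1 - alpha))
    return log_a / k - math.log(alpha * (1 + r / k) - 1) / r + log_b

def gaussian_renyi_entropy(dim, order):
    """Renyi entropy (nats) of a dim-dimensional i.i.d. N(0, 1) vector."""
    tail = 1.0 if abs(order - 1) < 1e-9 else math.log(order) / (order - 1)
    return dim / 2 * (math.log(2 * math.pi) + tail)

def log_offset(k, n, alpha):
    """G(alpha) in log(1/D) <= (n/k) log(SNR) + G(alpha); G(1) = 0 is Shannon."""
    mu = ((2 - alpha) * R + k * (alpha - 1)) / (k * (alpha - 1) + R)  # (27)
    capacity_const = ((n / R) * (math.log(n) - R * log_c(n, 2 - alpha))
                      - gaussian_renyi_entropy(n, alpha))             # (42) without the SNR term
    rdf_const = ((k / R) * (R * log_c(k, alpha) - math.log(k))
                 + gaussian_renyi_entropy(k, mu))                     # (26) without the log(1/D) term
    return (R / k) * (capacity_const - rdf_const)

def distortion_gain_db(k, n, grid=2000):
    """Improvement of (44) over Shannon's high-SNR bound, in dB of distortion."""
    lo, hi = k / (k + R), 2 - n / (n + R)   # open interval of valid alpha
    alphas = [lo + (hi - lo) * i / grid for i in range(1, grid)]
    best = min(log_offset(k, n, a) for a in alphas + [1.0])
    return -10 * best / math.log(10)

def shannon_sdr_db(snr_db, k, n):
    """High-SNR form of (45), as a signal-to-distortion ratio in dB."""
    return (n / k) * snr_db

print(f"2:1 reduction, gain over Shannon: {distortion_gain_db(2, 1):.2f} dB")
print(f"Shannon SDR at 20 dB SNR, 1:10 expansion: {shannon_sdr_db(20, 1, 10):.0f} dB")
```

Because $\alpha = 1$ is always included in the search, the computed gain cannot be negative. For $k = 2$, $n = 1$ the sketch should land near the quarter-dB improvement quoted in Section IV-B, and in this high-SNR approximation the gain does not depend on the SNR itself.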



Fig. 1. Improvement in the distortion bound for k : 1 bandwidth reduction. [Plot: distortion bound improvement [dB] versus k (source dimension).]

Fig. 2. Improvement in the SNR bound for k : 1 bandwidth reduction. [Plot: SNR bound improvement [dB] versus k (source dimension).]

Fig. 3. Improvement in the distortion bound for 1 : n bandwidth expansion. [Plot: distortion bound improvement [dB] versus n (channel dimension).]

Fig. 4. Improvement in the SNR bound for 1 : n bandwidth expansion. [Plot: SNR bound improvement [dB] versus n (channel dimension).]

V. CONCLUSION AND FURTHER WORK

The Ziv-Zakai approach is a promising technique for developing outer bounds (either on the distortion or on the channel SNR) for communication using finite block lengths. Combined with the Rényi information measure, new bounds were found for the high resolution / high SNR regime. These new bounds were shown to be sharper than Shannon's famous bound of $R(D) \le C$ for several cases, including the Gaussian case with bandwidth reduction and expansion.

The Rényi information measure was selected mainly for its algebraic tractability, and we do not claim that the bounds provided here are tight. Even though the bounds improve upon Shannon's original bound, we believe that using different information measures could lead to even stronger bounds. This is currently left for further work.

REFERENCES

[1] Sae-Young Chung. On the construction of some capacity-approaching coding schemes. PhD thesis, Massachusetts Institute of Technology, September 2000.
[2] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. John Wiley & Sons, 1991.
[3] T. J. Goblick. Theoretical limitations on the transmission of data from analog sources. IEEE Trans. on Inform. Theory, IT-11(4):558–567, 1965.
[4] I. Leibowitz. The Ziv-Zakai bound at high resolution, analog matching, and companding. Master's thesis, Tel Aviv University, 2007.
[5] Itai Leibowitz and Rami Zamir. A Ziv-Zakai-Rényi lower bound on distortion at high resolution. 2008. Submitted to the 2008 IEEE Information Theory Workshop, available online at www.eng.tau.ac.il/~zamir.
[6] Erwin Lutwak, Deane Yang, and Gaoyong Zhang. Moment-entropy inequalities. Ann. Probab., 32:757–774, 2004.
[7] Tor A. Ramstad. Shannon mappings for robust communication. Telektronikk, 98(1):114–128, 2002. Information theory and its applications.
[8] Alfréd Rényi. On measures of entropy and information. In Proc. 4th Berk. Symp. Math., Stat. and Prob., pages 547–561. Univ. of Calif. Press, 1961.
[9] C. E. Shannon. A mathematical theory of communication. The Bell System Technical Journal, 27:379–423, 1948.
[10] C. E. Shannon. Communications in the presence of noise. In Proc. of the IRE, volume 37, pages 10–21, 1949.
[11] J. Ziv and M. Zakai. On functionals satisfying a data-processing theorem. IEEE Trans. on Inform. Theory, IT-19(3):275–283, 1973.