2003 Conference on Information Sciences and Systems, The Johns Hopkins University, March 12–14, 2003

On the Capacity of 'Cheap' Relay Networks

Mohammad Ali Khojastepour, Ashutosh Sabharwal, and Behnaam Aazhang
Department of Electrical and Computer Engineering, Rice University
6100 Main St., MS-366, Houston, TX 77005
e-mail: {amir, ashu, aaz}@rice.edu

Abstract — We consider the communication problem in a multi-hop relay network in which the intermediate relay nodes cannot transmit and receive at the same time. The motivation for this assumption comes from the fact that current radios operate in TDD mode when the transmit and receive frequencies are the same. We label such a node's radio a cheap radio and the corresponding node of the network a cheap node. In this paper we derive the capacities of the degraded cheap relay channel and of the multi-hop network with cheap nodes. The achievability parts of the coding theorems are proved using jointly typical sequences, while the converses follow from a direct application of the upper bounds derived in [7].

I. Introduction

In sensor networks, most radios operate in TDD mode when the transmit and receive frequencies are the same. Although it might be possible to build RF radios that are able to send and receive in the same frequency band, the design of such radios requires precise and expensive components. We take an information-theoretic approach to quantify the capacity of a network under this practical constraint. Thus, we assume that a node of the network can either send or receive, but not both, at any given use of the network. We label such a node's radio a cheap radio and the corresponding node of the network a cheap node. In this paper we find the capacity of the cheap relay channel (Figure 1) and of the multi-hop network (Figure 2) in which the intermediate relay nodes are cheap nodes.

The relay channel was first defined and studied by van der Meulen [8] for the case of three terminals (or nodes) and was later considered by Cover [1]. Cover's results on the relay channel are still the most comprehensive in terms of achievability and capacity bounds (converses) for the discrete memoryless relay channel. Recent interest in sensor and ad-hoc networks has sparked new research into the relay channel. Gupta and Kumar [6] presented an extension of the relay channel and showed that more sophisticated multiuser coding schemes can provide sizable gains in transport capacity. A follow-up paper by Xie and Kumar [10] established an explicit achievable rate expression for the degraded Gaussian channel with multiple relays and characterized bounds on transport capacity. Reznik et al. [9] also considered Gaussian physically degraded relay channels and extended the results to multiple relay stages with a total average power constraint. Gastpar and Vetterli [4] derived a lower bound on the capacity of the relay channel by allowing arbitrarily complex network coding. They also considered upper bounds from the cut-set theorem [3] and showed that these

upper and lower bounds meet asymptotically as the number of relay nodes goes to infinity.

Figure 1: Discrete Memoryless Relay Channel

In all of the previous work it has been assumed that transmission and reception can be performed at the same time in the same frequency band; this is the key difference between the previous work on the relay channel and the results of this paper. The rest of the paper is organized in two sections: in Section II, we consider the problem of the cheap relay channel; in Section III, the problem of the multi-hop network with cheap relays is considered.

II. Cheap Relay Channel

Consider the discrete memoryless relay channel of Figure 1, in which the source node S wishes to transmit information to the destination node D by using the direct link between the node pair (S, D) as well as the help of a relay node R (if it improves the achievable rate of transmission) through the link pairs (S, R) and (R, D). Furthermore, assume that the relay node R is a cheap node and thus cannot transmit and receive at the same time.

A cheap relay channel, Figure 1, consists of an input x'_1, a relay output y'_1, a relay sender x'_2 (which depends only on the past values of y'_1), and a channel output y'. The channel is assumed to be memoryless. With the assumption of a cheap relay node, there are two possible states of operation for the cheap relay channel. In state m_1, the relay node R acts as a receiver and the channel probability function is given by p(y', y'_1 | x'_1 | m_1),¹ while in state m_2 the relay node functions as a transmitter and the channel probability function is given by p(y' | x'_1, x'_2 | m_2). Let

X_0 ≜ X'_1|m_1, X_1 ≜ X'_1|m_2, X_2 ≜ X'_2|m_2,
Y_0 ≜ Y'|m_1, Y_1 ≜ Y'_1|m_1, Y ≜ Y'|m_2.

¹ Remark: (i) Throughout this paper we drop the subscripts of probability mass functions, e.g. p(u|v) = p_{U|V}(u|v), since they can be inferred by inspection of the arguments of the function. We also use the notation X ∼ p(x) to indicate that p(x) is the probability distribution of the random variable X, i.e. that X is drawn according to the probability mass function p(x). (ii) The notation p(y_1, y_2, ..., y_{N_1} | x_1, x_2, ..., x_{N_2} | m) follows the notation of [11] and [7] for the channel probability function (c.p.f.) of multi-state channels.

Using these new random variables,

the discrete memoryless cheap relay channel can be denoted by the two 3-tuples (X_0, p(y_0, y_1|x_0), Y_0 × Y_1) and (X_1 × X_2, p(y|x_1, x_2), Y), in modes m_1 and m_2 respectively, which consist of six finite sets X_0, X_1, X_2, Y, Y_0, Y_1 and a collection of probability distributions p(·,·|x_0), p(·|x_1, x_2) on Y_0 × Y_1 and Y, one for each x_0 and (x_1, x_2) respectively. The interpretation is that there are two modes m_1 and m_2. In mode m_1, x_0 is the input to the channel, y_0 is the output at the destination node D, and y_1 is the output at the relay node R. In mode m_2, x_2 is the input symbol chosen by the relay node R and x_1 is the source input. Suppose that we can arbitrarily choose the modes m_1, m_2 in order to maximize the rate of information transfer between the source node S and the destination node D. The problem is to find the capacity of the channel between the sender S and the receiver D.

Let m_1 = 0, m_2 = 1, S = {m_1, m_2}. An (M, n) code for a discrete memoryless cheap relay channel consists of a set of integers

M = {1, 2, ..., M} ≜ [1, M],

a set of encoding functions²

s^n : M → S^n,
x'^n_1 : M → (X_{s_1} × X_{s_2} × ... × X_{s_n}),

where s_i is the state of the network at the i-th network use, a set of relay functions {f_i}_{i=1}^n such that

x'_{2i} = f_i(y'_{11}, y'_{12}, ..., y'_{1(i-1)}) for 1 ≤ i ≤ n,

and a decoding function

g : (Y'^n, S^n) → M.

For generality, the encoding functions x'_i(·), f_i(·) and the decoding function g(·,·) are allowed to be stochastic functions. Note that the structure of the encoding functions at both the source and the relay nodes follows directly from the definition of the cheap relay channel. At the source node, encoding is based on choosing the mode of operation of the network and then choosing the symbols in the corresponding mode. Also, because of the non-anticipatory relay condition, the relay input x'_{2i} is allowed to depend only on the past y'^{(i-1)}_1 = (y'_{11}, y'_{12}, ..., y'_{1(i-1)}) [1].

In [7], an upper bound on the information transfer rate R from the source node S to the destination node D is shown to be

R ≤ sup_{t, 0≤t≤1} min{ t I(X'_1; Y', Y'_1|m_1) + (1 − t) I(X'_1; Y'|X'_2, m_2), t I(X'_1; Y'|m_1) + (1 − t) I(X'_1, X'_2; Y'|m_2) }.    (1)

With the new definition of the input and output random variables we have

R ≤ sup_{t, 0≤t≤1} min{ t I(X_0; Y_0, Y_1) + (1 − t) I(X_1; Y|X_2), t I(X_0; Y_0) + (1 − t) I(X_1, X_2; Y) }.    (2)

In Section II.1 it will be shown that for every t, 0 ≤ t ≤ 1, the rate R* is achievable, where R* is given by

R* ≜ sup_{t, p(x_0,x_1,x_2)} min{ t I(X_0; Y_1) + (1 − t) I(X_1; Y|X_2), t I(X_0; Y_0) + (1 − t) I(X_1, X_2; Y) }.    (3)

Thus, if in state m_1 the received signal y_0 at the destination node D is a degraded form of the received signal y_1 at the relay node, then the bound of (2) coincides with this achievable rate for some value of t. Hence the capacity of the degraded cheap relay channel is given by C = R* as defined in (3). Thus we have:

Theorem 1. The capacity of the degraded cheap relay channel is given by

C = sup min{ t I(X_0; Y_1) + (1 − t) I(X_1; Y|X_2), t I(X_0; Y_0) + (1 − t) I(X_1, X_2; Y) },    (4)

where the supremum is taken over t, 0 ≤ t ≤ 1, and all joint distributions p(x_0, x_1, x_2) on X_0 × X_1 × X_2.

² We use the superscript notation v^n to indicate an n-tuple (v_1, v_2, ..., v_n).

II.1 Achievability of C in Theorem 1

In this section, we show the achievability of the rate R* in (3) for any t, 0 ≤ t ≤ 1. We first begin with a brief outline of the proof. We consider a source U = (W, V) which consists of two independent message sources W and V to be transmitted from the source node S to the destination node D. First, the network is used t_1 times in mode m_1, in which the source node S transmits v from the message u = (w, v) to the relay node R. We assume that this rate is possibly too high for the destination node D to allow reliable decoding. Then, the network is used t_2 times in mode m_2, in which the source node S and the relay node R cooperate to resolve the uncertainty about the message v at the receiver of the destination node D. The source node S also transmits the other part of the message u = (w, v), i.e. the message w ∈ W, to the destination during the uses of the channel in this mode. In other words, we consider the message source pair (U, V), where the input of the relay node R is determined by the message source V and the input of the source node S is determined by the message source U. The destination node D collects all of the information received in the t_1 + t_2 uses of the network about the message u = (w, v) and then performs the decoding.

Throughout this paper we use the notion of ε-typical sequences and the asymptotic equipartition property as defined in [2] and [3]. For simplicity of notation we use the following definitions:

Definition 1 (n-th extension of a memoryless source): The n-th extension of a random variable X with distribution X ∼ p(x) is a sequence of independent and identically distributed (i.i.d.) random variables X_i, 1 ≤ i ≤ n, each with the same distribution p(x_i) = p(x), and is denoted by X^n.

Definition 2 (n-th extension of a discrete memoryless channel): The n-th extension of a discrete memoryless channel with channel probability function p(y_1, y_2, ..., y_{N_1} | x_1, x_2, ..., x_{N_2}) is defined as the channel with channel probability function

p(y_1^n, y_2^n, ..., y_{N_1}^n | x_1^n, x_2^n, ..., x_{N_2}^n) = ∏_{i=1}^n p(y_{1i}, y_{2i}, ..., y_{N_1 i} | x_{1i}, x_{2i}, ..., x_{N_2 i}).
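The typicality arguments that follow rest on the asymptotic equipartition property. As a quick numerical sanity check (a sketch of our own, with arbitrarily chosen parameters, not part of the paper's development), for an i.i.d. Bernoulli(p) source the normalized log-likelihood −(1/n) log₂ p(X^n) concentrates around the binary entropy H(p) as n grows:

```python
import math
import random

def h2(p):
    """Binary entropy H(p) in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def empirical_rate(n, p, rng):
    """-(1/n) log2 p(x^n) for one sampled i.i.d. Bernoulli(p) sequence."""
    ones = sum(1 for _ in range(n) if rng.random() < p)
    return -(ones * math.log2(p) + (n - ones) * math.log2(1 - p)) / n

rng = random.Random(0)
p, n = 0.2, 50000
rate = empirical_rate(n, p, rng)
# By the AEP, the empirical rate should be close to H(0.2) (about 0.72 bits);
# for this n the deviation is a few standard deviations below 0.02.
assert abs(rate - h2(p)) < 0.02
```

Sequences whose empirical rate falls within ε of H(p) are exactly the ε-typical sequences used throughout the proof.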

A block code for the channel consists of three integers t_1, t_2, m and three encoding functions. In mode m_1 the encoding function x_0^{t_1 m} : V^m → X_0^{t_1 m} assigns codewords to the source outputs. In mode m_2, the encoding functions x_1^{t_2 m} : U^m → X_1^{t_2 m} and x_2^{t_2 m} : V^m → X_2^{t_2 m} assign codewords to the source outputs, and finally the decoding function d^{(t_1+t_2)m} : Y'^{(t_1+t_2)m} → U^m decodes the transmitted message.

Let X̃_0 = X_0^{t_1}, Ỹ_0 = Y_0^{t_1}, and Ỹ_1 = Y_1^{t_1} be the t_1-th extensions of the random variables X_0, Y_0, Y_1, and let X̃_1 = X_1^{t_2}, X̃_2 = X_2^{t_2}, and Ỹ = Y^{t_2} be the t_2-th extensions of the random variables X_1, X_2, and Y, respectively. Also consider the t_1-th extension of the channel in mode m_1 as a channel with c.p.f. p(ỹ_0, ỹ_1 | x̃_0) = p(y_0^{t_1}, y_1^{t_1} | x_0^{t_1}), and similarly the t_2-th extension of the channel in mode m_2 as a channel with c.p.f. p(ỹ | x̃_1, x̃_2) = p(y^{t_2} | x_1^{t_2}, x_2^{t_2}). In the following subsections we consider m-sequences of the random variables u, v, w, x̃_0, x̃_1, x̃_2, ỹ, ỹ_0, ỹ_1, denoted as u, v, w, x̃_0, x̃_1, x̃_2, ỹ, ỹ_0, ỹ_1.

II.2 Encoding and Decoding

Generating random codes: Fix the conditional probability mass functions p(x_0|v), p(x_2|v), and p(x_1|u, x_2), where p(u, v, x_0, x_1, x_2, y, y_0, y_1) = p(u)p(v)p(x_0|v)p(x_2|v)p(x_1|u, x_2)p(y|x_1, x_2)p(y_0, y_1|x_0).

(i) For each v ∈ V^m generate one x̃_2 sequence and one x̃_0 sequence drawn according to the distributions ∏_{i=1}^m p(x_{2i}|v_i) and ∏_{i=1}^m p(x_{0i}|v_i), and label them as x̃_2(v) and x̃_0(v) respectively.

(ii) For each u ∈ U^m generate one x̃_1 sequence drawn according to the distribution ∏_{i=1}^m p(x_{1i}|x_{2i}, v_i) and label it as x̃_1(u|x̃_2).

Encoding: The source node S uses the network in mode m_1 and transmits x̃_0(v). Provided that the transmitted v in mode m_1 is decodable at the relay node R (but not necessarily at the destination node D), the source node S transmits x̃_1(u|x̃_2) and the relay node R transmits x̃_2(v) in mode m_2.

Decoding at the relay node: The relay node R performs decoding upon receiving ỹ_1 using jointly ε-typical sequences. In order to find the transmitted message v, the decoder at the relay node R finds the unique message v such that (v, x̃_0(v), ỹ_1) ∈ A_ε, where A_ε is the appropriate set of jointly ε-typical sequences. If there is no such message v, or there exists more than one such message, the decoder declares an error. Thus, if the source has transmitted the message v_0, then an error occurs if

(i) (v_0, x̃_0(v_0), ỹ_1) ∉ A_ε, or
(ii) there exists some v ≠ v_0 such that (v, x̃_0(v), ỹ_1) ∈ A_ε.

Decoding at the destination node: The destination node D performs decoding upon receiving ỹ_0, ỹ in modes m_1 and m_2 by using jointly ε-typical sequences. In order to find the transmitted message, the decoder at the destination node D finds the unique pair (u, v) such that (u, v, x̃_0(v), x̃_1(u|x̃_2), x̃_2(v), ỹ, ỹ_0) ∈ A_ε, where A_ε is the appropriate set of jointly ε-typical sequences. If there is no such pair, or there exists more than one such pair, the decoder declares an error. Thus, if the source has transmitted the pair (u_0, v_0), then an error occurs if

(i) (u_0, v_0, x̃_0(v_0), x̃_1(u_0|x̃_2), x̃_2(v_0), ỹ, ỹ_0) ∉ A_ε, or
(ii) there exists some (u, v) ≠ (u_0, v_0) such that (u, v, x̃_0(v), x̃_1(u|x̃_2), x̃_2(v), ỹ, ỹ_0) ∈ A_ε.

II.3 Analysis of the Error Probability

Analysis of the probability of error of the decoder at the destination node: With the above definition of error declaration at the decoder of the destination node D, the average probability of error P̄_{m,D} can be written as

P̄_{m,D} = Σ_{(u,v)∈U^m×V^m} p(u, v) P{error made at the decoder of node D | (u, v) is the transmitted message from S}.    (5)

It can be bounded from above by

P̄_{m,D} ≤ Σ_{(u,v)∈A_ε} p(u, v) P{error made at the decoder of node D | (u, v) is the transmitted message from S} + Σ_{(u,v)∉A_ε} p(u, v).    (6)

From the asymptotic equipartition property (AEP), for sufficiently large m, Σ_{(u,v)∉A_ε} p(u, v) ≤ ε; thus

P̄_{m,D} ≤ Σ_{(u,v)∈A_ε} p(u, v) P{error made at the decoder of node D | (u, v) is the transmitted message from S} + ε.    (7)

Now, we consider the terms in the summation and find an upper bound which is independent of (u, v). To this end we assume that (u_0, v_0) ∈ A_ε is the transmitted message from the source node S, and let F denote this event. Thus, we are interested in an upper bound on P{error made at the decoder | F}. The error event at the decoder, E = E_1 ∪ E_2, is the union of two events E_1 and E_2, where

E_1: the event that (u_0, v_0, X̃_0(v_0), X̃_1(u_0|X̃_2), X̃_2(v_0), Ỹ, Ỹ_0) ∉ A_ε, and
E_2: the event that there exists some (u, v) ≠ (u_0, v_0) such that (u, v, X̃_0(v), X̃_1(u|X̃_2), X̃_2(v), Ỹ, Ỹ_0) ∈ A_ε.

Note that since we have generated the code randomly and we are averaging the probability of error over all coding schemes generated this way, the only random variables in the event E are X̃_0, X̃_1, X̃_2, Ỹ, and Ỹ_0. Following from the asymptotic equipartition property (AEP), it is always possible to choose m large enough to make the probability of the error event E_1|F as small as desired,

P{E_1|F} ≤ ε.    (8)

The event E_2 = E_21 ∪ E_22 can itself be decomposed into the union of two events E_21 and E_22, where

E_21: the event that there exists a u ≠ u_0 such that (u, v_0, X̃_0(v_0), X̃_1(u|X̃_2), X̃_2(v_0), Ỹ, Ỹ_0) ∈ A_ε, and
E_22: the event that there exist a u ≠ u_0 and a v ≠ v_0 such that (u, v, X̃_0(v), X̃_1(u|X̃_2), X̃_2(v), Ỹ, Ỹ_0) ∈ A_ε.

Now we find bounds on P{E_2i|F} for i = 1, 2.

Bound for P{E_21|F}: We have

P{E_21|F} = Pr{∃ u ≠ u_0 : (u, v_0, X̃_0(v_0), X̃_1(u|X̃_2), X̃_2(v_0), Ỹ, Ỹ_0) ∈ A_ε | F}.    (9)

Therefore,

P{E_21|F} ≤ Σ_{u≠u_0, (u,v_0)∈A_ε} P{(u, v_0, X̃_0(v_0), X̃_1(u|X̃_2), X̃_2(v_0), Ỹ, Ỹ_0) ∈ A_ε | F}.    (10)

From the properties of jointly typical sequences, it can be shown that

P{(u, v_0, X̃_0(v_0), X̃_1(u|X̃_2), X̃_2(v_0), Ỹ, Ỹ_0) ∈ A_ε | F} ≤ 2^{−m[I(X̃_1; Ỹ | X̃_2) − 8ε]}    (11)

for (u, v_0) ∈ A_ε. Note that this bound is independent of u; hence by substituting (11) in (10) we have

P{E_21|F} ≤ Σ_{u≠u_0, (u,v_0)∈A_ε} 2^{−m[I(X̃_1; Ỹ | X̃_2) − 8ε]}.    (12)

Thus

P{E_21|F} ≤ ||{u : (u, v_0) ∈ A_ε}|| · 2^{−m[I(X̃_1; Ỹ | X̃_2) − 8ε]},    (13)

and from the joint typicality of (u, v) we have

||{u : (u, v_0) ∈ A_ε}|| ≤ 2^{m[H(U|V) + 2ε]}.    (14)

Combining Equations (13) and (14), and using the fact that u = (w, v) and that w and v are independent (so that H(U|V) = H(W)), we have

P{E_21|F} ≤ 2^{m[H(W) − I(X̃_1; Ỹ | X̃_2) + 10ε]}.    (15)

Thus, if

H(W) ≤ I(X̃_1; Ỹ | X̃_2) − 10ε,    (16)

then for large enough m we have

P{E_21|F} ≤ ε.    (17)

Bound for P{E_22|F}: We have

P{E_22|F} = Pr{∃ u ≠ u_0, ∃ v ≠ v_0 : (u, v, X̃_0(v), X̃_1(u|X̃_2), X̃_2(v), Ỹ, Ỹ_0) ∈ A_ε | F},    (18)

for which we have

P{E_22|F} ≤ Σ_{u≠u_0, v≠v_0, (u,v)∈A_ε} P{(u, v, X̃_0(v), X̃_1(u|X̃_2), X̃_2(v), Ỹ, Ỹ_0) ∈ A_ε | F}.    (19)

From the properties of jointly typical sequences, it can be shown that

P{(u, v, X̃_0(v), X̃_1(u|X̃_2), X̃_2(v), Ỹ, Ỹ_0) ∈ A_ε | F} ≤ 2^{−m[I(X̃_1, X̃_2; Ỹ) + I(X̃_0; Ỹ_0) − 8ε]}    (20)

for (u, v) ∈ A_ε. By substituting the bound of (20) into (19), and noting that this bound is independent of (u, v), we have

P{E_22|F} ≤ Σ_{u≠u_0, v≠v_0, (u,v)∈A_ε} 2^{−m[I(X̃_1, X̃_2; Ỹ) + I(X̃_0; Ỹ_0) − 8ε]},    (21)

or

P{E_22|F} ≤ ||{(u, v) : (u, v) ∈ A_ε}|| × 2^{−m[I(X̃_1, X̃_2; Ỹ) + I(X̃_0; Ỹ_0) − 8ε]},    (22)

and from the joint typicality of (u, v) we have

||{(u, v) : (u, v) ∈ A_ε}|| ≤ 2^{m[H(U,V) + 2ε]}.    (23)

Combining Equations (22) and (23), and using the fact that u = (w, v) (so that H(U, V) = H(U)), we have

P{E_22|F} ≤ 2^{m[H(U) − I(X̃_1, X̃_2; Ỹ) − I(X̃_0; Ỹ_0) + 10ε]}.    (24)

Thus, if

H(U) ≤ I(X̃_1, X̃_2; Ỹ) + I(X̃_0; Ỹ_0) − 10ε,    (25)

then for large enough m we have

P{E_22|F} ≤ ε.    (26)

Combining the bounds on P{E_2i|F} for i = 1, 2 from Equations (17) and (26), and using the union bound on P{E_2|F}, results in

P{E_2|F} ≤ P{E_21|F} + P{E_22|F} ≤ 2ε.    (27)

Considering the union bound for P{E} and using the bounds on P{E_1|F} and P{E_2|F} in Equations (8) and (27), we have

P{E|F} ≤ P{E_1|F} + P{E_2|F} ≤ 3ε.    (28)

Using the definition of the event E (which is bounded by (28)) in the error probability P̄_{m,D} and substituting in Equation (7), we have

P̄_{m,D} ≤ P{E|F} + ε ≤ 4ε.    (29)
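The random-coding argument above can be illustrated by simulation: at a rate below capacity, a randomly generated codebook almost never errs. The toy sketch below (our own illustration, not from the paper) uses a binary symmetric channel and minimum Hamming distance decoding as a stand-in for joint-typicality decoding; all parameter values are hypothetical choices:

```python
import random

def simulate_bsc_random_code(n=20, k=5, p=0.05, trials=200, seed=1):
    """Random codebook of 2^k length-n binary codewords over a BSC(p).

    The rate k/n = 0.25 is well below the BSC(0.05) capacity of roughly
    0.71 bits per use, so the observed error rate should be near zero.
    """
    rng = random.Random(seed)
    book = [[rng.randint(0, 1) for _ in range(n)] for _ in range(2 ** k)]
    errors = 0
    for _ in range(trials):
        msg = rng.randrange(2 ** k)
        # Pass the codeword through the BSC: flip each bit with prob. p.
        received = [b ^ (rng.random() < p) for b in book[msg]]
        # Minimum-distance decoding (a proxy for typical-set decoding).
        decoded = min(range(2 ** k),
                      key=lambda j: sum(a != b
                                        for a, b in zip(book[j], received)))
        errors += (decoded != msg)
    return errors / trials
```

As in the proof, the error probability is averaged over both the channel noise and the random codebook; increasing n while keeping k/n fixed below capacity drives it toward zero.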

Thus, the average probability of error at the decoder of the destination node D can be made arbitrarily small (as in Equation (29)) if we use the above encoding scheme for large enough m and the conditions of Equations (16) and (25) are satisfied.

Analysis of the probability of error of the decoder at the relay node: Similarly to the decoder at the destination node D, the average probability of error can be made arbitrarily small for large enough m. The relay node R performs decoding upon receiving ỹ_1 by finding the unique message v such that (v, x̃_0(v), ỹ_1) ∈ A_ε, where A_ε is the appropriate set of jointly ε-typical sequences. Thus, the average probability of error at the decoder of the relay node R, P̄_{m,R}, can be written as

P̄_{m,R} = Σ_{v∈V^m} p(v) P{error made at the decoder of node R | v is the transmitted message from S}.    (30)

Using the same line of proof as for the average error probability at the decoder of the destination node, we can show that

P̄_{m,R} ≤ 3ε    (31)

for large enough m, if the following inequality is satisfied:

H(V) ≤ I(X̃_0; Ỹ_1) − 5ε.    (32)

Thus, from the above analysis of the probability of error and from Equations (16), (32), and (25), the source U = (W, V) can be transmitted with arbitrarily small probability of error for large enough m provided that the following conditions hold:

H(W) ≤ I(X̃_1; Ỹ | X̃_2),    (33)
H(V) ≤ I(X̃_0; Ỹ_1),    (34)
H(U) ≤ I(X̃_1, X̃_2; Ỹ) + I(X̃_0; Ỹ_0).    (35)

Since the message source U = (W, V) is composed of two independent message sources W and V, by combining Equations (33) and (34) we have

H(U) = H(W) + H(V) ≤ I(X̃_1; Ỹ | X̃_2) + I(X̃_0; Ỹ_1).    (36)

On the other hand, if for the message source U the conditions of Equations (35) and (36) are satisfied, then U can be decomposed into two independent message sources W and V such that the rates of W and V satisfy the conditions of Equations (33) and (34). Thus, the source U can be reliably transmitted if

H(U) = min{ I(X̃_1; Ỹ | X̃_2) + I(X̃_0; Ỹ_1), I(X̃_1, X̃_2; Ỹ) + I(X̃_0; Ỹ_0) }.    (37)

Now, assume that the average rate of the source U per channel use is R, so that H(U) = (t_1 + t_2)R. Then, by using the definitions of X̃_0, X̃_1, X̃_2, Ỹ, Ỹ_0, and Ỹ_1, we have

(t_1 + t_2)R = min{ t_1 I(X_0; Y_1) + t_2 I(X_1; Y|X_2), t_1 I(X_0; Y_0) + t_2 I(X_1, X_2; Y) }.    (38)

By defining t ≜ t_1/(t_1 + t_2), we conclude the achievability of the rate R defined below for any t, 0 ≤ t ≤ 1, which is the same as the rate R* in Equation (3):

R = min{ t I(X_0; Y_1) + (1 − t) I(X_1; Y|X_2), t I(X_0; Y_0) + (1 − t) I(X_1, X_2; Y) }.    (39)
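For any fixed joint input distribution p(x_0, x_1, x_2), the four mutual informations in (39) are constants, and optimizing over the time-sharing parameter t in (3) amounts to maximizing the minimum of two linear functions of t on [0, 1]; the maximum sits either at an endpoint or where the two lines cross. A minimal sketch of this step (the arguments a, b, c, d are hypothetical stand-ins for I(X_0; Y_1), I(X_1; Y|X_2), I(X_0; Y_0), and I(X_1, X_2; Y)):

```python
def best_time_share(a, b, c, d):
    """Maximize min{t*a + (1-t)*b, t*c + (1-t)*d} over t in [0, 1].

    Here a, b, c, d stand in for I(X0;Y1), I(X1;Y|X2), I(X0;Y0), and
    I(X1,X2;Y) for a fixed joint input distribution (values assumed
    given). The maximum of a min of two lines is attained either at an
    endpoint of [0, 1] or where the two lines cross.
    """
    candidates = [0.0, 1.0]
    denom = (a - b) - (c - d)
    if denom != 0:
        t_cross = (d - b) / denom
        if 0.0 <= t_cross <= 1.0:
            candidates.append(t_cross)

    def rate(t):
        return min(t * a + (1 - t) * b, t * c + (1 - t) * d)

    t_star = max(candidates, key=rate)
    return t_star, rate(t_star)

# Symmetric example: the lines cross at t = 0.5 with rate 0.75.
assert best_time_share(1.0, 0.5, 0.5, 1.0) == (0.5, 0.75)
```

When the crossing lies inside [0, 1], the maximizing t balances the two min-terms so that neither expression is the bottleneck.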

Figure 2: Cascaded Network

III. Cascaded Network with Cheap Nodes

Consider L discrete memoryless channels in cascade, as shown in Figure 2. We index the channels from left to right as i = 1, 2, ..., L, and the nodes from left to right as 0, 1, 2, ..., L. We are interested in transmitting information from node 0 to node L. Thus, node i receives Y_i, the output of channel i, and transmits its information X_{i+1} via channel i + 1, of which it is the input. Since we have assumed that the channels are cascaded and there are no connections between the nodes other than those stated, the channel output Y_i depends only on the input X_i and on no other transmitted signals. For each channel i, where i ∈ {1, 2, ..., L}, define the capacity of the individual link as R_i ≜ max_{p(x)} I(X_i; Y_i), where the maximization is taken over all possible distributions of X_i.

III.1 Capacity of the Cheap Cascaded Network

It is known [3] that the capacity of such a cascaded system without the mentioned practical limitation on simultaneous transmission and reception is the minimum of the individual rates of the channels, C_1 = min{R_1, R_2, ..., R_L}. Since each channel can transmit information at least at the rate C_1 without any restriction on receiving data from the previous node, achievability of this minimum rate is immediate. Also, the known cut-set bound [3] of network information theory states that no higher rate is achievable. On the other hand, imposing the mentioned practical limitation decreases the achievable rate in this cascaded channel, and the known cut-set bound is no longer tight. In [7], an upper bound on the information transfer rate R from the source node 0 to the destination node L is shown to be

R ≜ R^{(0L)} ≤ sup_{t_m} min_i { (Σ_{m=1}^M t_m δ_{im}) R_i },    (40)

where the minimization is taken over i, i ∈ {1, 2, ..., L}, and the supremum is over all non-negative t_m subject to Σ_{m=1}^M t_m = 1. In the above expression, δ_{im} = 1 if and only if link i is used in state m of the network, and otherwise δ_{im} = 0.

It is also possible to prove that the above bound is actually achievable. The proof is based on the fact that for any given set {t_1, t_2, ..., t_M} associated with the states 1, 2, ..., M and satisfying Σ_{m=1}^M t_m = 1, the rate min_i {(Σ_{m=1}^M t_m δ_{im}) R_i} is achievable with arbitrarily small probability of error. Thus, the rate sup_{t_m} min_i {(Σ_{m=1}^M t_m δ_{im}) R_i} is the capacity of the multi-hop network with cheap nodes, and in Section III.2 we prove the following capacity expression.

Theorem 2. The capacity of the cascaded network of Figure 2 is given by

C = min{ R_1 R_2 / (R_1 + R_2), R_2 R_3 / (R_2 + R_3), ..., R_{L−1} R_L / (R_{L−1} + R_L) },    (41)

where R_i ≜ max_{p(x)} I(X_i; Y_i) is the capacity of link i.

III.2 Sketch of the Proof for Theorem 2

Consider the cascaded network of Figure 2. First we prove that the rate R defined in Equation (40) cannot be higher than C defined in Equation (41). To show this, we consider the two cut-sets C_i and C_{i+1} for each i ∈ {1, 2, ..., L − 1}. Let T_i ≜ Σ_{m=1}^M t_m δ_{im}. Since for each i and m we have δ_{im} δ_{(i+1)m} = 0, it is easily verified that T_i + T_{i+1} ≤ 1. Thus,

R ≤ min{T_i R_i, T_{i+1} R_{i+1}} ≤ R_i R_{i+1} / (R_i + R_{i+1}).    (42)

Therefore, considering Equation (42) for all values of i ∈ {1, 2, ..., L − 1}, we conclude that R ≤ C.

On the other hand, it is possible to show that states 1, 2, ..., M and an associated set {t_1, t_2, ..., t_M} exist such that the rate C defined in Equation (41) is achievable. Assume that in an L-hop network the minimum of {R_1 R_2/(R_1 + R_2), R_2 R_3/(R_2 + R_3), ..., R_{L−1} R_L/(R_{L−1} + R_L)} occurs at node i, i.e.

R_i R_{i+1} / (R_i + R_{i+1}) = min{ R_1 R_2/(R_1 + R_2), R_2 R_3/(R_2 + R_3), ..., R_{L−1} R_L/(R_{L−1} + R_L) };    (43)

then it is easy to see that T_i = R_{i+1}/(R_i + R_{i+1}) and T_{i+1} = R_i/(R_i + R_{i+1}).

We use strong induction on the number of hops to prove that C in Equation (41) is achievable. For L = 2 it is fairly easy to see that two states 1, 2 with the associated timing set {t_1, t_2} are enough, where T_1 = t_1 = R_2/(R_1 + R_2) and T_2 = t_2 = R_1/(R_1 + R_2). Now suppose that for all values of L, L ∈ {2, 3, ..., K − 1}, it is possible to find states 1, 2, ..., M and an associated timing set {t_1, t_2, ..., t_M} for any L-hop network such that the rate C defined in Equation (41) is achievable. We will show that the rate C is achievable for any K-hop network as well, and show how to choose the corresponding states and the associated timing for each state.

Assume that for the K-hop network the minimum of the expression {R_1 R_2/(R_1 + R_2), R_2 R_3/(R_2 + R_3), ..., R_{K−1} R_K/(R_{K−1} + R_K)} occurs at node i. If i ∈ {2, ..., K − 2}, then we can consider the left and right networks defined by the sets of nodes {0, 1, 2, ..., i + 1} and {i − 1, i, i + 1, ..., K}, respectively. From the induction hypothesis it is possible to find two different sets of modes and their associated timing sets for the left and right networks such that the rate R_i R_{i+1}/(R_i + R_{i+1}) is achievable in both networks. Since for both the left and right networks the minimum occurs at node i, for both of the mentioned sets of modes we have T_i = R_{i+1}/(R_i + R_{i+1}) and T_{i+1} = R_i/(R_i + R_{i+1}). Therefore, it is possible to compose the sets of modes and timing sets for the left and right networks in order to find a set of modes and an associated timing set for the K-hop network.

Now we consider the case that the minimum occurs at node i = 1, i.e. C = R_1 R_2/(R_1 + R_2) (the other case, when the minimum occurs at node i = K − 1, is similar). In this case we choose another R'_3 such that C = R_1 R_2/(R_1 + R_2) = R'_3 R_2/(R'_3 + R_2) if R_2 ≤ R_4, or C = R_1 R_2/(R_1 + R_2) = R'_3 R_4/(R'_3 + R_4) if R_4 ≤ R_2, and we define a new K-hop network in which the capacity of link 3 is given by R'_3 instead of R_3. Note that with this definition the minimum still occurs at node 1, as well as at node 2 if R_2 ≤ R_4, or at node 3 if R_4 ≤ R_2.

In the first case, when R_2 ≤ R_4, we decompose the network into the left and right networks defined by the sets of nodes {0, 1, 2, 3} and {1, 2, ..., K}, respectively. From the induction hypothesis it is possible to find a set of modes and an associated timing set for the right network such that the rate R'_3 R_2/(R'_3 + R_2) is achievable in this network. Since in this case R'_3 = R_1, it is sufficient to assign all the modes in which link 2 is 'off' to link 1. In the other case, when R_4 ≤ R_2, we decompose the network into the left and right networks defined by the sets of nodes {0, 1, ..., 4} and {2, 3, ..., K}, respectively. From the induction hypothesis it is possible to find sets of modes and associated timing sets for both the left and right networks such that the rate R'_3 R_4/(R'_3 + R_4) is achievable in both networks. Note that in both of the above cases, restoring the original R_3 does not decrease the achievable rates, since by the way we defined R'_3 the original R_3 is always greater than or equal to R'_3. This completes the inductive proof.
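Theorem 2 and its L = 2 base case are easy to check numerically. The sketch below (link rates are made-up example values) evaluates Equation (41) and verifies that the base-case time fractions T_1 = R_2/(R_1 + R_2) and T_2 = R_1/(R_1 + R_2) load both links at exactly C:

```python
def cheap_cascade_capacity(rates):
    """Capacity of a cascade of cheap nodes, Equation (41).

    rates[i-1] is R_i, the individual capacity of link i. Adjacent links
    share the time of the cheap node between them, so each adjacent pair
    contributes the bottleneck term R_i * R_{i+1} / (R_i + R_{i+1}).
    """
    if len(rates) < 2:
        raise ValueError("need at least two links")
    return min(a * b / (a + b) for a, b in zip(rates, rates[1:]))

def two_hop_schedule(r1, r2):
    """Time fractions for the L = 2 base case: t1 + t2 = 1 because the
    middle cheap node cannot transmit and receive at the same time."""
    return r2 / (r1 + r2), r1 / (r1 + r2)

# Two equal links: each is active half the time, so C = R/2.
assert cheap_cascade_capacity([1.0, 1.0]) == 0.5

# Example cascade with link rates 2, 3, 6: the bottleneck pair is (2, 3).
assert abs(cheap_cascade_capacity([2.0, 3.0, 6.0]) - 1.2) < 1e-12

# Base case: both links sustain the same throughput, equal to C.
r1, r2 = 3.0, 6.0
t1, t2 = two_hop_schedule(r1, r2)
c = cheap_cascade_capacity([r1, r2])
assert abs(t1 * r1 - c) < 1e-12 and abs(t2 * r2 - c) < 1e-12
```

The equal-throughput property of the two-hop schedule is exactly the condition T_i R_i = T_{i+1} R_{i+1} used at the bottleneck pair in the converse.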

IV. Concluding Remarks

We derived the capacity of the degraded cheap relay channel and of the multi-hop network with cheap nodes, as stated in Theorems 1 and 2. The assumption of cheap relay nodes in the network is important for the design of practical systems, and the results of this paper characterize the limits of information transfer in such networks.

References

[1] Thomas M. Cover, Abbas El Gamal: Capacity Theorems for the Relay Channel. IEEE Transactions on Information Theory, vol. 25, no. 5, pp. 572–584, September 1979.
[2] Thomas M. Cover, Abbas El Gamal, Masoud Salehi: Multiple Access Channels with Arbitrarily Correlated Sources. IEEE Transactions on Information Theory, vol. 26, no. 6, pp. 648–657, November 1980.
[3] Thomas M. Cover, Joy A. Thomas: Elements of Information Theory. John Wiley and Sons, Inc., New York, 1991.
[4] Michael Gastpar, Martin Vetterli: On the Capacity of Wireless Networks: The Relay Case. 2002.
[5] Piyush Gupta, P. R. Kumar: The Capacity of Wireless Networks. IEEE Transactions on Information Theory, vol. 46, no. 2, pp. 388–404, March 2000.
[6] P. Gupta, P. R. Kumar: Towards an Information Theory of Large Networks: An Achievable Rate Region. Submitted to the IEEE Transactions on Information Theory, September 2001; revised November 2002.
[7] Mohammad A. Khojastepour, Ashutosh Sabharwal, Behnaam Aazhang: Bounds on Achievable Rates for General Multi-terminal Networks with Practical Constraints. Submitted to the 2nd International Workshop on Information Processing in Sensor Networks (IPSN '03), Palo Alto, California, USA, April 2003.
[8] E. C. van der Meulen: Three-terminal Communication Channels. Advances in Applied Probability, vol. 3, pp. 120–154, 1971.
[9] A. Reznik, S. R. Kulkarni, S. Verdú: Capacity and Optimal Resource Allocation in the Degraded Gaussian Relay Channel with Multiple Relays. 40th Allerton Conference on Communication, Control, and Computing, September 2002.
[10] L. Xie, P. R. Kumar: A Network Information Theory for Wireless Communication: Scaling Laws and Optimal Operation. Submitted to the IEEE Transactions on Information Theory, April 2002.
[11] Jacob Wolfowitz: Coding Theorems of Information Theory. 3rd edn., Springer-Verlag, Berlin Heidelberg New York, 1978.