Non-Systematic LDPC Codes via Scrambling and Splitting

Amira Alloum, Joseph J. Boutros, Gil I. Shamir, and Li Wang∗

Abstract

We consider the design of non-systematic low-density parity-check (LDPC) codes for channel decoding of redundant sequences. We demonstrate that, in the presence of source redundancy in channel coded sequences, there may be a significant advantage to well designed non-systematic channel encoding over systematic encoding. In particular, we study methods we recently proposed for designing non-systematic LDPC codes by scrambling or splitting redundant data bits into coded bits. These methods consist of cascading a sparse matrix or an inverse of a sparse matrix with an LDPC code. We propose a method to perform density evolution on splitting based LDPC codes, and show that splitting based LDPC codes achieve better gains in the presence of redundancy than other known codes, including MacKay-Neal (MN) codes, without significant loss even if the data contains no redundancy. Using density evolution, we show that for independent identically distributed (i.i.d.) nonuniform (redundant) sequences, splitting based non-systematic codes can theoretically achieve performance close to the information theoretic limits.

1 Introduction and Notation

Consider a binary i.i.d. source. Let Hs = H2(π1) denote the source entropy, where π1 ∈ [0, 1/2] is the probability that a source symbol equals 1. The source output is encoded by a binary C(N, K) channel code of length N and dimension K. Codewords of C are transmitted over a noisy channel, which in the sequel is either a binary symmetric channel (BSC) or an additive white Gaussian noise (AWGN) channel. The source distribution π1 is known to the decoder. The system model is illustrated in Figure 1. Systems in which redundant sequences are channel coded are common where compression is not performed, or is performed suboptimally due to lack of error resilience of source codes or due to the varying rate of compressed sequences. Many prior works (see, e.g., [5], [7]-[10], [14]-[16]) considered this setting. In particular, Hagenauer [7] demonstrated that sometimes, in practical systems, it is better to carry redundancy over to the channel decoder rather than use compression prior to channel encoding. The setting in this paper assumes that source redundancy was left in the data, and the goal is to design a code structure that benefits the most from such redundancy. While prior work showed that one can benefit from utilizing such redundancy in the decoder, the benefit can increase significantly if a non-systematic channel code is used.

Amira Alloum is with France Telecom R&D, 92130 Issy-Les-Moulineaux, France, e-mail: [email protected]. Joseph J. Boutros is with Communications and Electronics Department, ENST, 46 Rue Barrault, Paris, France, e-mail: [email protected]. Gil I. Shamir and Li Wang are with ECE Department, University of Utah, Salt Lake City, UT 84112, U.S.A., e-mails: [email protected], [email protected]. The work of G. Shamir and L. Wang was supported by NSF Grant CCF-0347969.
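The binary entropy function H2(·) appears throughout the paper; for reference, it can be computed with a small helper (an illustrative sketch of ours, not part of the paper):

```python
import math

def h2(p):
    """Binary entropy H2(p) in bits, with H2(0) = H2(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)
```

For example, h2(0.1) ≈ 0.469 bits, matching the Hs ≈ 0.47 source used in the simulations of Section 3.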

Figure 1: Source-controlled channel decoding on a noisy channel.

The reason is that, in the presence of source redundancy, a well designed non-systematic code yields a channel input distribution closer to the capacity-achieving one than a systematic code does. This point was illustrated in recent work by Zhu et al. [16], in which a search for code generators for non-systematic turbo codes was performed for nonuniform sequences of different non-uniformities. The MacKay-Neal (MN) codes [9], [10] are also non-systematic codes on graphs that were proposed for coding redundant data. Shamir and Boutros [14] recently proposed a general method to construct non-systematic LDPC encoders and decoders for channel decoding of redundant data sequences. In this paper, we continue the study of these codes, with specific focus on splitting based LDPC codes. The paper is organized as follows: In Section 2, we describe the theoretical limits for systematic and non-systematic codes. Then, Section 3 describes MN codes and the general code structure proposed in [14] for non-systematic LDPC codes. Finally, in Section 4, a method for density evolution of splitter based LDPC codes is described.

2 Capacity with Nonuniform Sources

First, consider the AWGN channel with a real Gaussian code book of rate Rc = K/N for channel encoding. The information rate R transmitted by the encoder is R = Hs × Rc (bits per real dimension). For a given source distribution and a fixed coding rate, we want to determine the minimal achievable signal-to-noise ratio per bit, Ebr/N0, where Ebr denotes the energy per redundant source symbol. Hence, we solve the equation where the information rate equals the channel capacity, R = (1/2) log2(1 + 2Rc Ebr/N0). We obtain the well known expression [5]

Ebr/N0 = (2^(2 Hs Rc) − 1) / (2 Rc).    (1)
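Expression (1) is straightforward to evaluate numerically; the following sketch (helper names are ours) returns the limit in linear scale:

```python
import math

def ebr_n0_limit_gaussian(hs, rc):
    """Minimal Ebr/N0 (linear scale) for a real Gaussian code book, Eq. (1):
    (2^(2*Hs*Rc) - 1) / (2*Rc)."""
    return (2.0 ** (2.0 * hs * rc) - 1.0) / (2.0 * rc)

def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)

# For a uniform source (Hs = 1) at Rc = 1/2 this recovers the usual
# Shannon limit of 0 dB; lower source entropy pushes the limit below 0 dB.
```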

Now, consider a noisy channel with a binary input. Let q ∈ [0, 1/2] denote the probability of one channel input value. We must distinguish two cases, systematic and non-systematic codes. In the latter case, it is assumed that non-systematic encoding is ideal: the code book is randomly constructed so as to generate independent uniformly distributed bits out of the source bits, i.e., q = 1/2. Then, we obtain the minimal achievable signal-to-noise ratio from R = I, where I is the average mutual information between channel input and channel output. On the BSC with channel transition probability pBSC, the upper limit on pBSC for non-systematic codes satisfies

Hs Rc = 1 − H2(pBSC).    (2)

If binary phase shift-keying (BPSK) with hard decisions at the detector is used, then pBSC = Q(sqrt(2 Rc Ebr/N0)), where Q(·) is the Gaussian tail function. Using this relation, we can obtain the lower limit on Ebr/N0. On the Gaussian channel with BPSK input and real continuous output, the limit on Ebr/N0 for non-systematic codes satisfies

Hs Rc = 1 − E[log2(1 + exp(−2X/N0))],   X is N(1, N0),    (3)
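Equation (2) has no closed-form solution for pBSC, but since H2 is increasing on [0, 1/2] a simple bisection suffices (an illustrative sketch with names of ours):

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def pbsc_limit(hs, rc, tol=1e-12):
    """Solve Hs*Rc = 1 - H2(p) for p in [0, 1/2] by bisection (Eq. (2))."""
    target = 1.0 - hs * rc          # required value of H2(pBSC)
    lo, hi = 0.0, 0.5               # H2 is increasing on [0, 1/2]
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if h2(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For Hs = 1 and Rc = 1/2 this gives pBSC ≈ 0.11, the familiar uniform-source limit; lowering Hs raises the tolerable crossover probability, as the left graph of Figure 2 shows.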

where mathematical expectation is denoted by E[·], N(1, N0) denotes a Gaussian random variable with mean 1 and variance N0, and the BPSK amplitude A = sqrt(2 Rc Ebr) is normalized to 1. When encoding with a systematic code, we assume that an ideal random interleaver mixes the Rc-fraction of source bits and the (1 − Rc)-fraction of parity bits before transmission on the noisy channel. Then q = Rc π1 + (1 − Rc)/2. On the binary symmetric channel, the limit on pBSC for systematic codes satisfies

Hs Rc = H2(q(1 − pBSC) + (1 − q)pBSC) − H2(pBSC).    (4)

On the Gaussian channel with BPSK input, the limit on Ebr/N0 for systematic codes satisfies

Hs Rc = −q E[log2(q + (1 − q) exp(−2X/N0))] − (1 − q) E[log2(1 − q + q exp(2Y/N0))],    (5)

where X is N(1, N0) and Y is N(−1, N0). Figure 2 illustrates the information theoretical limits given by (1)-(5). An additional curve on the right graph in Figure 2 shows the density evolution curve for a splitter based code, as discussed in Section 4. As can be observed in both graphs, gains can be obtained with non-systematic codes. In particular, in the AWGN case, gains in signal-to-noise ratio of about 1 dB and slightly more can be observed at source entropy 0.4 or less.


Figure 2: Maximum achievable BSC transition probability pBSC (left) and minimum achievable Ebr /N0 for AWGN channel (right) versus source entropy Hs for coding rate Rc = 1/2.

3 Non-Systematic LDPC Codes

Before we describe non-systematic codes built by cascading sparse and dense scrambling and splitting matrices with LDPC codes, we briefly recall the structure of MN codes.

3.1 MacKay-Neal Codes

MacKay-Neal codes [9], [10], also called 'MN' codes, are non-systematic codes built as follows. Construct a low-density binary matrix C1 of dimensions N × K and a full-rank low-density binary square matrix C2 of dimensions N × N. Both C1 and C2 can be found by applying Gaussian elimination to an N × (K + N) LDPC matrix H of column weight db and row weight dc (see the construction proposed in [10]). After applying column permutations to guarantee full rank, H is written as H = [C1 | C2]. A source vector s consisting of K information bits is encoded by the low-density generator-matrix (LDGM) code [6] defined by C1, followed by a scrambling of the LDGM codewords defined by the inverse of C2. If t is an MN codeword, the encoding process is t = C2^(-1) C1 s. On a BSC, the binary channel output is z = t + n, where n is the binary noise vector of length N. Decoding should solve C2 z = C1 s + C2 n. In general, for any kind of channel, a Tanner graph representation of MN codes is defined by C2 t = C1 s. Both representations, the one suitable for the BSC and the general one, are shown in Figure 3.

Figure 3: MN graph representation. Check nodes are denoted by β. Edges between β and source bits are defined by C1. For the BSC (left), edges between β and noise are defined by C2. On a general DMC channel (right), edges between β and coded bits t are also defined by C2.

In the above description, MN codes are regular, since the column weight of C1 and C2 equals db. Improved irregular MN codes were proposed by Kanter and Saad in [8], and the stability of irregular MN codes was improved in [12]. MN codes can be described as multi-edge codes [12] with a simple cascade structure. In the sequel, the scrambling matrix C2^(-1) will be called a splitter, because its action can be viewed as splitting an incoming bit into several code bits; MN codes are thus an LDGM code followed by a splitter. We next describe non-systematic multi-edge codes obtained by applying a scrambler (equivalent to C2) or a splitter (equivalent to C2^(-1)) to the source symbols, followed by LDPC encoding. Such codes were recently proposed by Shamir and Boutros in [14]. The term scrambler is used since the operation of a sparse square matrix on incoming bits can be viewed as scrambling bits together to generate each output bit.
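The MN encoding t = C2^(-1) C1 s can be sketched over GF(2); rather than inverting C2 explicitly, one solves C2 t = C1 s by Gaussian elimination (a toy dense implementation with helper names of ours; practical MN codes use large sparse matrices):

```python
def gf2_matvec(m, v):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in m]

def gf2_solve(a, b):
    """Solve A x = b over GF(2) for a square, full-rank A (Gauss-Jordan)."""
    n = len(a)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]   # augmented matrix [A | b]
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r][col])
        aug[col], aug[piv] = aug[piv], aug[col]      # bring pivot into place
        for r in range(n):
            if r != col and aug[r][col]:
                aug[r] = [x ^ y for x, y in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

def mn_encode(c1, c2, s):
    """MN codeword t = C2^{-1} C1 s, computed by solving C2 t = C1 s."""
    return gf2_solve(c2, gf2_matvec(c1, s))
```

On the decoder side the same relation C2 t = C1 s defines the Tanner graph, so no explicit inverse is ever formed.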

3.2 Scrambling and Splitting Cascaded with LDPC Codes

In [14], a general approach was proposed for non-systematic LDPC codes based on either pre-coder or post-coder scrambler or splitter, combined with either LDPC or LDGM codes. Here, we focus on pre-coding with either a scrambler or a splitter combined with an LDPC code. We refer to a pre-coding scrambler system as a scramble-LDPC code, and to the pre-coding splitter as a split-LDPC code. (Post-coding scrambler and splitter LDPC codes will be referred to as LDPC-scramble and LDPC-split codes, respectively.)

Let A be a sparse matrix of dimensions K × K. For a regular scrambler or splitter, A has row and column weight ds. A scramble-LDPC encoder first encodes (scrambles) the source vector s into u by u = As. Then, the vector u is encoded by a systematic LDPC generator matrix G of dimensions N × K to the code vector [uT | ϑT]T = Gu, where the superscript T denotes the transpose operator. The vector ϑ is the parity vector generated by the systematic LDPC generator matrix. For a regular LDPC code, the parity-check matrix H has column weight db and row weight dc. A split-LDPC encoder is very similar, except that the scrambling operation is replaced by splitting, performed by u = A^(-1) s. In LDPC-scramble and LDPC-split encoders, the LDPC encoding and the scrambling or splitting operations, respectively, are performed in the opposite order, where the matrix A has dimensions N × N.

We note that splitting resembles the quick-look-in (QLI) property of turbo codes [1], in which two (or more) parity bits sum up to the original source bit. The QLI property has recently been used [15] to aid in the design of non-systematic turbo codes for decoding redundant sequences with unknown statistics. Decoding graphs for scramble-LDPC and split-LDPC codes that combine parity checks of the LDPC codes, denoted by β, with those obtained from the scrambler or splitter, denoted by α (see [14] for more details on the decoding graph), are shown in Figure 4.

For redundant sequences, a split-LDPC code has an advantage over a scramble-LDPC code in generating a channel distribution closer to the uniform capacity achieving one. This is because splitting results in an even split into 1 and 0 bits, while scrambling, although it brings the output distribution closer to uniform, still leaves it nonuniform. Hence, the limiting curves for scramble-LDPC codes can be shown to lie between those for systematic and non-systematic codes in Figure 2.
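A pre-coding scramble-LDPC encoder thus amounts to two matrix operations over GF(2); the sketch below (toy matrices and names of ours) emits the systematic codeword [u | parity]:

```python
def gf2_matvec(m, v):
    """Matrix-vector product over GF(2)."""
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in m]

def scramble_ldpc_encode(a, g_parity, s):
    """Pre-coding scrambler: u = A s, then systematic LDPC codeword
    [u | parity], where g_parity holds the parity rows of the systematic
    generator matrix G. A split-LDPC encoder would instead obtain u by
    solving A u = s over GF(2), i.e. u = A^{-1} s."""
    u = gf2_matvec(a, s)
    return u + gf2_matvec(g_parity, u)

# Toy example: K = 3 scrambler and a 3-bit parity part (not a real
# (db, dc)-regular code, just an illustration of the data flow).
A = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
Gp = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]
codeword = scramble_ldpc_encode(A, Gp, [1, 0, 1])
```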
Figure 5 shows bit error rate performance for an MN code, and regular scramble-LDPC and split-LDPC codes for a nonuniform source with π1 = 0.1 (entropy Hs ≈ 0.47). While the curves for both the MN code and the scramble-LDPC code improve on those obtained for systematic codes that utilize the source redundancy (see, e.g., [14]), the split-LDPC code exhibits far superior performance to the other two codes.

Figure 4: Simple cascade of a regular ds -degree scrambler (left) and ds -degree splitter (right) and a regular (db , dc ) binary LDPC code. Similar graph representations are valid for scrambler/splitter cascaded with an irregular (λ(x), ρ(x)) LDPC code. The number of nodes is indicated as a fraction of code length. In the splitter cascade, source nodes are all of degree 1, and can be deleted from the graph by embedding them into the splitter check nodes α.

Figure 5: Bit error probability vs. signal-to-noise ratio for rate 1/2 MN, scramble-LDPC, and split-LDPC codes, dimension K = 1000, and a nonuniform source distribution with π1 = 0.1.

4 Density Evolution for Split-LDPC with Nonuniform Sources

As shown in the previous section, a split-LDPC code is superior to both MN and scramble-LDPC codes. In this section, we describe how density evolution can be performed on a split-LDPC code with a nonuniform source, in order to determine the code threshold under iterative decoding on a binary-input additive white Gaussian noise channel. Let us introduce some notation. The logarithmic ratio (LR) of a bit is defined as log(P(0)/P(1)). Following the notations in [3], we define the R-operator and the related ρ-transform as

R(a, b) = 2 tanh^(-1)( tanh(a/2) tanh(b/2) ),    ρ(p) = Σ_{j=2}^{dc} ρ_j R^(j-1) p.
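The R-operator acts on pairs of LR messages; a direct implementation (our sketch, stable for moderate message magnitudes) folds it over a check node's inputs:

```python
import math
from functools import reduce

def R(a, b):
    """R(a, b) = 2 * atanh(tanh(a/2) * tanh(b/2)), the check-node
    combination of two LR messages."""
    return 2.0 * math.atanh(math.tanh(0.5 * a) * math.tanh(0.5 * b))

def check_extrinsic(msgs):
    """Extrinsic LR at a check node: fold R over the dc - 1 incoming messages."""
    return reduce(R, msgs)
```

R is symmetric and R(a, 0) = 0: a single completely uncertain incoming message erases the check's output, which is why the ρ-transform weights the iterated operator R^(j-1) by the edge-degree distribution.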

Messages propagating on graph edges are of LR-type, and will be characterized by their probability density functions.

• The probability distribution of LR-messages going from u nodes to α nodes is p1(x). These are type-1 messages. A type-1 message includes ds − 1 incoming extrinsics from check nodes α and db incoming extrinsics from check nodes β. The tree representation of type-1 messages is illustrated in Figure 6. Source nodes s are hidden inside α check nodes.

• The probability distribution of LR-messages going from u nodes to β nodes is p2(x). These are type-2 messages. A type-2 message includes ds incoming extrinsics from check nodes α and db − 1 incoming extrinsics from check nodes β. The tree representation of type-2 messages is illustrated in Figure 6.

• The probability distribution of LR-messages going from ϑ nodes to β nodes is p3(x). These are type-3 messages. A type-3 message includes db − 1 incoming extrinsics from check nodes β. The tree representation of type-3 messages is illustrated in Figure 6.

• The probability distribution of LR-messages propagating from α check nodes to u variable nodes is denoted pα(x). Finally, in a similar fashion, the probability distribution of messages generated by β is pβ(x).
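Density evolution of this kind is often run in sampled ("population dynamics") form. The sketch below tracks a plain regular (db, dc) LDPC ensemble on BPSK-AWGN as a simplified stand-in; the split-LDPC evolution additionally tracks the α-node messages p1, p2 and the source prior, as described above. All names and defaults are ours:

```python
import math
import random

def sampled_de(db, dc, ebn0_db, rc=0.5, pop=2000, iters=10, seed=0):
    """Sampled density evolution for a regular (db, dc) LDPC code on
    BPSK-AWGN (all-zero codeword, unit amplitude). Returns the fraction
    of wrong-sign variable-to-check messages after `iters` iterations."""
    rng = random.Random(seed)
    n0 = 1.0 / (rc * 10.0 ** (ebn0_db / 10.0))   # noise variance N0
    sigma = math.sqrt(n0)
    # Channel LRs: 2X/N0 with X ~ N(1, N0), as in Eq. (3).
    ch = [2.0 * (1.0 + sigma * rng.gauss(0.0, 1.0)) / n0 for _ in range(pop)]
    v2c = ch[:]
    for _ in range(iters):
        # Check-node update: combine dc - 1 randomly sampled messages.
        c2v = []
        for _ in range(pop):
            t = 1.0
            for _ in range(dc - 1):
                t *= math.tanh(0.5 * rng.choice(v2c))
            t = max(min(t, 1.0 - 1e-12), -1.0 + 1e-12)  # keep atanh finite
            c2v.append(2.0 * math.atanh(t))
        # Variable-node update: channel LR plus db - 1 check messages.
        v2c = [rng.choice(ch) + sum(rng.choice(c2v) for _ in range(db - 1))
               for _ in range(pop)]
    return sum(1 for m in v2c if m < 0.0) / pop
```

Sweeping ebn0_db for the (3, 6) ensemble and locating the smallest value at which the error fraction collapses gives the familiar threshold near 1.1 dB; the split-LDPC thresholds plotted in Figure 2 are obtained analogously with the extended message types.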

Figure 6: Tree representations of type-1, type-2, and type-3 messages.