Differential Turbo Coded Modulation with APP Channel Estimation

Sheryl L. Howard and Christian Schlegel
Dept. of Electrical & Computer Engineering, University of Alberta, Edmonton, AB, Canada T6G 2V4
Email: sheryl,[email protected]

Abstract— A serially concatenated coding system which can operate without channel state information (CSI) with use of a simple channel estimation technique is presented. This channel estimation technique utilizes the inner decoder’s a posteriori probability (APP) information about the transmitted symbols to form a channel estimate for each symbol interval, and is termed APP channel estimation. The serially concatenated code (SCC) is comprised of an outer rate 2/3 binary error control code, separated by a bit interleaver from an inner code consisting of an 8-PSK bit mapping and differential 8-PSK modulation. Coherent decoding provides bit error rate (BER) performance 0.6 dB from 8-PSK capacity for large interleaver sizes. APP channel estimation decoding without initial CSI over constant and random walk phase models shows near-coherent results, with fractions of a dB performance loss for random walk and linear phase models.

Keywords: Differential modulation, serial concatenated codes, channel phase estimation, iterative decoding.

I. INTRODUCTION

With the advent of turbo codes [1], [2] and iterative decoding of serially concatenated codes [3], the push for near-capacity performance of error control codes has become a reality. Use of these near-capacity codes with iterative decoding now allows receiver operation in very high noise/low signal-to-noise ratio (SNR) environments.

Turbo trellis coded modulation (TTCM) has shown performance gains with higher order modulation similar to those of binary turbo coding. Both parallel concatenated TCM with bit interleaving [4] and symbol interleaving [5] using 8-PSK, 16-QAM and 64-QAM modulation and serial concatenated TCM for 8-PSK [6], [7] have been examined.

However, at the low SNR values achievable with turbo codes, issues such as phase synchronization become critical, especially for higher order modulations. Conventional phase synchronization utilizes a PLL (phase-locked loop) or Costas loop [8], resulting in phase ambiguities for PSK constellations. In addition, the squaring loss for higher order PSK modulation becomes significant, effectively prohibiting the use of these mechanisms for phase synchronization at the low SNRs at which turbo codes can operate. For 8-PSK suppressed-carrier signalling, the squaring loss of an eighth-power-law device at $E_s/N_0 = 9$ dB is upper-bounded by 10 dB [9] with respect to PLL operation on an unmodulated carrier. Typical loop SNRs must be at least 6 dB to achieve synchronization [10], [11]; thus the eighth-power squaring device must see a minimum loop SNR of 16 dB to achieve lock. This requires narrow loop bandwidth, which is incompatible with fast tracking and acquisition as needed in wireless packet systems. Effective phase synchronization for iteratively decoded systems using higher order PSK modulation becomes highly problematic.

The classical method of eliminating phase synchronization is differential M-PSK encoding with differential demodulation; however, a 3 dB loss in SNR vs bit error rate (BER) occurs for M-PSK, with M > 2 [12]. This also applies to turbo coding. Differential BPSK modulation resulted in a 2.7 dB loss in SNR [13] for a rate 1/2 turbo code using differential detection. Such significant loss is counter to the near-capacity performance expected of turbo codes.

Various techniques to mitigate the penalty incurred by differential detection have been used. Multiple symbol differential detection [14], [15], [16] has been applied to iteratively decoded serially concatenated codes with differential modulation [17], using linear prediction to obtain channel estimates. This technique results in an exponential expansion of the decoding trellis. An expanded-state decoding trellis is also presented in [18] for serial concatenation of a rate 1/2 convolutional code with differential M-PSK modulation. The channel phase is discretized into N states, resulting in a linear trellis expansion of $MN$ states. Similarly, in [19] iterative decoding of turbo codes with QPSK modulation incorporates channel estimation for fading channels by using quantized phase in an expanded "supertrellis".

Instead of expanding the already complex APP decoding trellis, channel estimation of PSAM (pilot-symbol-assisted modulation) BPSK turbo codes over fading channels in [20] is accomplished in the iterative decoding block but outside the APP decoder to still allow for iteratively improving channel estimates. A BER of $10^{-4}$ within 0.5 dB of coherent detection for slow fading is achieved.
Turbo-embedded estimation (TEE) is an alternate approach that has been investigated for BPSK in [21] and extended to QPSK and 8-PSK [22]. This technique uses the most probable state during the forward recursion of the APP decoder to obtain a symbol estimate, which is fed to a simple tracking loop to compute an updated phase estimate. No state expansion of the APP trellis, and corresponding increase in complexity, occurs. However, TEE requires an initial phase estimate to begin decoding; this phase estimate is obtained from a known preamble whose length is approximately 1.5% of the total packet length. Results over the AWGN channel, with phase noise according to a recent DVB-S2 proposal [23], for rate 1/3 8-PSK modulation are 0.3 dB from coherent decoding at a BER of $10^{-5}$, increasing with rate 1/2 8-PSK modulation to 0.7 dB from coherent decoding at a BER of $2 \times 10^{-3}$.

From a coding perspective, differential modulation may be viewed as a recursive convolutional code [24] and, as such, can be used as a viable inner code in a serially concatenated coding (SCC) system which will exhibit interleaver gain [3]. This idea has been used for binary SCCs [24], [25], and shown to provide good performance when coherently decoded.

In this paper, we approach the construction of an SCC for higher order modulations from a code design perspective. We also show how use of an inner differential M-PSK code leads in a natural way to a decoder which functions without any prior channel phase information, providing performance nearly indistinguishable from that of an ideal receiver with complete phase knowledge. This decoder relies on the ability of the APP algorithm for the differential inner modulation code to provide useful soft information on the information symbols without any prior phase knowledge, even in the presence of significant time-varying channel phase offset, which is confirmed via EXIT analysis [26], [27]. This soft information is used by a low-complexity phase estimator to iteratively improve the channel metrics. Through iterations, decoding performance close to the complete phase knowledge scenario is achieved with integrated APP channel phase estimation. This method uses APP soft information from the differential decoder without expanding trellis complexity. Neither external phase estimation, such as a training sequence preamble or non-data-aided (NDA) phase estimation, nor differential demodulation is needed to begin the decoding process.
An EXIT analysis approach [26], [27] is used to find matched outer codes that work optimally with the inner modulation code. The (3,2,2) parity check code is such a code, providing turbo cliff results only 0.6 dB away [28] from the 8-PSK capacity at a system rate of 2 bits/symbol [29], [30]. An 8-PSK mapping specifically designed to optimize this system's error floor [31] is presented. The performance of our serially concatenated system compares favorably with two coherently-decoded serially concatenated systems using 8-PSK modulation at a system rate of 2 bits/symbol. These are 1) serial concatenated trellis coded modulation (SCTCM) [6], [7] and 2) soft-decision iteratively decoded bit-interleaved coded modulation (BICM-ID) [32].

The paper is organized as follows: Section II describes the serially concatenated binary error control code/differential 8-PSK system, examining the encoder, simulated channel, and decoder. Section III discusses analysis techniques used to optimize the system; distance spectrum analysis allows improvement of the error floor through a new mapping, while EXIT analysis predicts turbo cliff behavior and completes the error analysis. Section IV presents our method of obtaining channel estimates when decoding without CSI, which we term APP channel estimation. Section V presents simulation results. Conclusions are discussed in Section VI.

II. SYSTEM DESCRIPTION

A. Encoding System

The encoder for our proposed serially concatenated transmission system is shown in Figure 1. A sequence of information bits $u = [u_0, \ldots, u_{L-1}]$ of length $L$ is encoded by an outer rate 2/3 binary error control code, such as the (3,2,2) parity code, into a sequence of coded bits $v = [v_0, \ldots, v_{3L/2-1}]$ of length $3L/2$. The coded bits $v$ are interleaved bitwise through a random interleaver [2], [3]. These interleaved coded bits $v'$ are mapped to 8-PSK symbols $w$, $w_k = \sqrt{E_s}\, e^{j\theta_k}$; the symbol energy is normalized to $E_s = 1$, and the 8-PSK symbols have unit magnitude and phase angle $\theta_k \in \{0, \pi/4, \ldots, 7\pi/4\}$ rads. Initially, we assume natural mapping, which labels the bit combinations 000 to 111 increasing consecutively counterclockwise from 0 rads.
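The encoding chain described above (outer parity code, bit interleaver, natural 8-PSK mapping) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the parity bit is appended as the third bit of each codeword, that the 3-bit label is read most significant bit first, and that the interleaver is a seeded pseudo-random permutation. The function names and the seed are hypothetical.

```python
import cmath
import math
import random


def parity_encode(bits):
    """(3,2,2) parity encoding: each pair of bits gets their XOR appended.

    Assumption: systematic form with the parity bit in third position.
    """
    if len(bits) % 2:
        raise ValueError("information length L must be even")
    coded = []
    for i in range(0, len(bits), 2):
        a, b = bits[i], bits[i + 1]
        coded += [a, b, a ^ b]
    return coded


def interleave(bits, perm):
    """Bitwise interleaving: output position i carries input bit perm[i]."""
    return [bits[p] for p in perm]


def map_natural_8psk(bits):
    """Natural mapping with E_s = 1: label l (0..7) maps to exp(j*l*pi/4).

    Assumption: the first bit of each triple is the most significant bit.
    """
    symbols = []
    for i in range(0, len(bits), 3):
        label = 4 * bits[i] + 2 * bits[i + 1] + bits[i + 2]
        symbols.append(cmath.exp(1j * math.pi * label / 4))
    return symbols


u = [1, 0, 1, 1]                 # L = 4 information bits
v = parity_encode(u)             # 3L/2 = 6 coded bits
rng = random.Random(0)           # hypothetical seed for a random interleaver
perm = list(range(len(v)))
rng.shuffle(perm)
v_prime = interleave(v, perm)    # interleaved coded bits v'
w = map_natural_8psk(v_prime)    # L/2 = 2 unit-energy 8-PSK symbols
```

Every codeword produced by `parity_encode` has even Hamming weight, which is the constraint exploited later when the 8-PSK mapping is matched to the outer code.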

[Figure: block diagram. u → (3,2,2) Parity Encoder → v → Interleaver → v' → 8-PSK mapper (natural labels 000 to 111) → Differential Modulator with delay D → x]

Fig. 1. Serial turbo encoder for differential turbo coded modulation.

The 8-PSK symbols are then differentially encoded, so that the current transmitted differential symbol is $x_k = w_k x_{k-1}$. This differential 8-PSK encoding serves as the inner code of the serially concatenated system, and is viewed as a recursive non-systematic convolutional code [24]. A recursive inner code is necessary to provide interleaving gain in a serially concatenated system [3]. This inner code has a regular, fully-connected 8-state trellis, in which the trellis states correspond to the current transmitted differential symbol. Transmission of a block of encoded symbols is initialized with $x_0 = w_0$, and terminated with a reference symbol $x_{L/2} = \sqrt{E_s}\, e^{j0}$ to end in the zero state.

A simplified three-stage differential 8-PSK trellis section (corresponding to three symbol time intervals) is shown in Figure 2. The actual trellis is fully connected but for clarity, only the first stage shows all connections. Two decoding paths are depicted in the trellis section. The solid bold path indicates the correct path, with the dashed bold path above it resulting from a phase rotation of $\pi/2$ radians. Both the original 8-PSK symbols $w$ and the transmitted D8-PSK symbols $x$ are shown as $w, x$ above each corresponding trellis branch. Notice that while the $\pi/2$ phase rotation affects the transmitted D8-PSK symbols $x$, the information 8-PSK symbols $w$ are identical on both paths. The trellis and code are rotationally invariant. This rotational invariance will be significant when considering system decoding without phase synchronization.

B. Channel Model

We use a complex AWGN channel model with noise variance $N_0/2$ in each dimension. Initially, the channel phase is assumed known at the receiver. The received symbols $y$ consist of the transmitted symbols plus complex noise of variance $N_0$, i.e., $y = x + n$.
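The differential encoding rule and the rotational invariance noted above can be checked with a short sketch that works on phase indices (symbol $e^{jl\pi/4}$ is represented by the integer $l$), so that the product $w_k x_{k-1}$ becomes addition modulo 8. The helper names are hypothetical; the example sequence is the one labeled in Figure 2.

```python
import cmath
import math


def to_symbol(index):
    """Unit-energy 8-PSK symbol for phase index 0..7."""
    return cmath.exp(1j * math.pi * index / 4)


def diff_encode(w_idx, x_prev=0):
    """Differential 8-PSK encoding on phase indices.

    x_k = w_k * x_{k-1} corresponds to index addition modulo 8.
    With x_prev = 0 (reference symbol 1) the first output equals w_0.
    """
    x_idx = []
    for w in w_idx:
        x_prev = (x_prev + w) % 8
        x_idx.append(x_prev)
    return x_idx


def phase_increments(x_idx, x_prev=0):
    """Recover the information indices as successive phase differences."""
    out = []
    for x in x_idx:
        out.append((x - x_prev) % 8)
        x_prev = x
    return out


w = [2, 1, 2]                          # information symbols from Fig. 2
x = diff_encode(w)                     # transmitted D8-PSK indices
x_rot = [(xi + 2) % 8 for xi in x]     # pi/2 rotation = 2 index steps

# The rotated path carries the same information symbols once the
# reference is rotated by the same amount.
w_from_rot = phase_increments(x_rot, x_prev=2)
```

Here `x` reproduces the correct path of Figure 2 and `x_rot` the rotated path, while `w_from_rot` equals `w`, which is the invariance the decoder relies on when the channel phase is unknown.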

[Figure: three-stage trellis section over time intervals k−1, k, k+1 with states 0 to 7; branches are labeled $w_k, x_k$. Correct phase: w = [2 1 2], x = [2 3 5]. Rotated phase: w = [2 1 2], x = [4 5 7].]

Fig. 2. 8-PSK differential trellis section.

Next, the assumption of phase knowledge is dropped, and we consider an unknown phase offset at the receiver. The channel now consists of complex AWGN $n$ plus a possibly time-varying symbol phase offset $\phi$: $y_k = x_k e^{j\phi_k} + n_k$. Three different phase models are considered:

• Constant but unknown phase offset $\phi_k = \Phi, \forall k$. This could model short packet transmission systems or frequency hopping, where the phase could be assumed approximately constant over one transmission block.
• Gaussian random walk model. The unknown phase evolves as $\phi_k = \phi_{k-1} + \Delta\Phi_k$, where the phase process $\Delta\Phi_k$ is given by zero-mean i.i.d. Gaussian random variables with variance $\sigma_\Phi^2$. This could model phase jitter from oscillator instability.
• Constant frequency offset $\Delta f$. Phase offset equals frequency offset multiplied by time; thus a constant frequency offset results in a linear phase offset of slope $2\pi\Delta f$ rads with respect to time. For simulation purposes, we assume the phase offset constant during one symbol interval $T$; the phase increases by a discrete amount of $2\pi\Delta f T$ rads for each consecutive symbol interval as $\phi_k = \phi_{k-1} + 2\pi\Delta f T$. This could model a Doppler scenario.

C. Iterative Decoding

Iterative decoding of this serially concatenated system proceeds according to turbo decoding principles [2], [3]. Figure 3 displays a block diagram of the decoding process. The APP channel estimation block is shown in the dashed rectangle, and is discussed in detail in Section IV. Note that differential demodulation is never used at the receiver for detection, with or without phase knowledge, when APP channel estimation is employed.

Assuming detection with known phase for now, received channel symbols $y$ are converted to channel metrics $p(y_k|x_k) = (\pi N_0)^{-1} e^{-|y_k - x_k|^2/N_0}$, which are fed to the inner APP decoder for the differential code, along with a priori information $P_a(w)$.
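The three phase models of Section II-B and the coherent channel metric can be sketched as below. This is an illustrative simulation aid under stated assumptions: each phase sequence starts at a caller-supplied `phi0`, the sequence length is at least one, and the function names are hypothetical.

```python
import cmath
import math
import random


def constant_phase(n, phi):
    """Constant but unknown offset: phi_k = Phi for all k."""
    return [phi] * n


def random_walk_phase(n, sigma, rng, phi0=0.0):
    """Gaussian random walk: phi_k = phi_{k-1} + Delta_k, Delta_k ~ N(0, sigma^2)."""
    phases = [phi0]
    for _ in range(n - 1):
        phases.append(phases[-1] + rng.gauss(0.0, sigma))
    return phases


def linear_phase(n, delta_f_T, phi0=0.0):
    """Constant frequency offset: phi_k = phi_{k-1} + 2*pi*delta_f*T."""
    return [phi0 + 2 * math.pi * delta_f_T * k for k in range(n)]


def channel(x, phases, n0, rng):
    """y_k = x_k * exp(j*phi_k) + n_k with noise variance N0/2 per dimension."""
    s = math.sqrt(n0 / 2)
    return [
        xk * cmath.exp(1j * p) + complex(rng.gauss(0.0, s), rng.gauss(0.0, s))
        for xk, p in zip(x, phases)
    ]


def channel_metric(y, x, n0):
    """Coherent metric p(y|x) = exp(-|y - x|^2 / N0) / (pi * N0)."""
    return math.exp(-abs(y - x) ** 2 / n0) / (math.pi * n0)


rng = random.Random(1)                      # hypothetical seed
x = [cmath.exp(1j * math.pi * l / 4) for l in (2, 3, 5)]
y = channel(x, linear_phase(len(x), 0.01), n0=0.5, rng=rng)
metrics = [[channel_metric(yk, cmath.exp(1j * math.pi * l / 4), 0.5)
            for l in range(8)] for yk in y]  # one row of 8 metrics per symbol
```

Each row of `metrics` is what the inner APP decoder would receive for one symbol interval when the phase is assumed known.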
No a priori information is available for the inner APP decoder during the first iteration, thus uniform a priori probabilities are used; $P_a(w_k = e^{j2l\pi/8}) = 1/8, \forall l = 0, \ldots, 7$. The inner APP decoder uses the BCJR [33] or forward-backward algorithm operating on the 8-state trellis of the differential 8-PSK code, depicted in Figure 2, to calculate conditional symbol probabilities on both the 8-PSK symbols $w$ and the transmitted D8-PSK symbols $x$.

Fig. 3. Serial turbo decoder, D8-PSK inner decoder and (3,2,2) outer parity decoder with APP channel estimation.

The APP 8-PSK symbol probabilities $P_{APP}(w)$ are converted first to APP bit probabilities $P_{APP}(v')$ through marginalization, for example,

$$P_e(v'_i = 0) = \sum_{w_k : v'_i = 0} P_e\big(w_k(v'_i = 0, v'_{i+1}, v'_{i+2})\big) \qquad (1)$$
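The marginalization in (1) can be illustrated with a small sketch. It assumes natural mapping with the most significant label bit first, and it takes the eight symbol probabilities for one symbol interval as a list indexed by label; the function name is hypothetical.

```python
def symbol_to_bit_probs(p_sym):
    """Marginalize 8-PSK symbol probabilities to bit probabilities.

    p_sym[l] is the probability of the symbol with natural label l (0..7).
    Returns three (P(bit = 0), P(bit = 1)) pairs, most significant bit first.
    Each bit probability is the sum over all symbols whose label carries
    that bit value, as in equation (1).
    """
    pairs = []
    for pos in range(3):
        shift = 2 - pos
        p0 = sum(p for l, p in enumerate(p_sym) if ((l >> shift) & 1) == 0)
        p1 = sum(p for l, p in enumerate(p_sym) if ((l >> shift) & 1) == 1)
        pairs.append((p0, p1))
    return pairs


uniform = symbol_to_bit_probs([1 / 8] * 8)
certain_101 = symbol_to_bit_probs([0, 0, 0, 0, 0, 1.0, 0, 0])
```

A uniform symbol distribution gives equiprobable bits, and a distribution concentrated on label 5 (bits 101) gives certain bit values.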

The APP bit probabilities are converted to APP log-likelihood ratios (LLRs) $\lambda_{APP}(v'_i) = \log\big[P_{APP}(v'_i = 1)/P_{APP}(v'_i = 0)\big]$. Extrinsic output LLRs $\lambda_e(v')$ are obtained as usual by subtracting the a priori LLRs $\lambda_a(v')$ from the APP LLRs $\lambda_{APP}(v')$. Deinterleaving the extrinsic LLRs $\lambda_e(v')$ provides a priori LLRs $\lambda_a(v)$ as input to the outer APP decoder.

The outer APP decoder for a simple binary code may be implemented with low complexity; the (3,2,2) parity code examined herein is simple enough, with only 3 bits and 4 codewords, to implement the 6 APP equations explicitly. The APP probabilities that express the parity constraints of the code are given by

$$P_{APP}(v_1 = 0) \propto P_a(v_1 = 0)\,\big(P_a(v_2 = 0)P_a(v_3 = 0) + P_a(v_2 = 1)P_a(v_3 = 1)\big) \qquad (2)$$

$$P_{APP}(v_1 = 1) \propto P_a(v_1 = 1)\,\big(P_a(v_2 = 0)P_a(v_3 = 1) + P_a(v_2 = 1)P_a(v_3 = 0)\big), \qquad (3)$$

and analogously for $v_2$ and $v_3$. APP LLRs $\lambda_{APP}(v)$ are computed from these, and extrinsic LLRs $\lambda_e(v)$ obtained by subtracting off the a priori LLRs $\lambda_a(v)$.

These extrinsic LLRs $\lambda_e(v)$ are then interleaved to obtain new a priori LLRs $\lambda_a(v')$. These must be converted first to bit probabilities $P_a(v')$, then to a priori symbol probabilities $P_a(w)$ for use in the next iteration of decoding in the inner APP symbol decoder. Symbol probabilities are calculated as the product of their component bit probabilities, that is,

$$P_a\big(w(v_1 = V_1, v_2 = V_2, v_3 = V_3)\big) = \prod_{i=1}^{3} P_a(v_i = V_i) \qquad (4)$$
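Equations (2) to (4) are simple enough to write out directly. The sketch below assumes the LLR convention $\lambda = \log[P(1)/P(0)]$, natural labeling with the most significant bit first for (4), and strictly positive probabilities; the function names are hypothetical.

```python
import math


def llr(p0, p1):
    """LLR with the convention log(P(bit = 1) / P(bit = 0))."""
    return math.log(p1 / p0)


def llr_to_probs(value):
    """Inverse of llr(): returns (P(bit = 0), P(bit = 1))."""
    p1 = 1.0 / (1.0 + math.exp(-value))
    return (1.0 - p1, p1)


def parity_app(pa):
    """Normalized APP pairs for one (3,2,2) codeword, per (2) and (3).

    pa holds three (P0, P1) a priori pairs, one per coded bit.
    """
    app = []
    for i in range(3):
        (q0, q1), (r0, r1) = [pa[j] for j in range(3) if j != i]
        a0 = pa[i][0] * (q0 * r0 + q1 * r1)   # other two bits agree
        a1 = pa[i][1] * (q0 * r1 + q1 * r0)   # other two bits differ
        app.append((a0 / (a0 + a1), a1 / (a0 + a1)))
    return app


def extrinsic_llrs(pa):
    """Extrinsic LLR = APP LLR minus a priori LLR, for each coded bit."""
    return [llr(*post) - llr(*prior) for post, prior in zip(parity_app(pa), pa)]


def bit_to_symbol_probs(pb):
    """Equation (4): symbol probability as the product of its bit probabilities."""
    return [pb[0][(l >> 2) & 1] * pb[1][(l >> 1) & 1] * pb[2][l & 1]
            for l in range(8)]


pa = [(0.5, 0.5), (0.9, 0.1), (0.8, 0.2)]
ext = extrinsic_llrs(pa)
p_sym = bit_to_symbol_probs(pa)
```

Because the a priori term cancels in the subtraction, the extrinsic LLR of each bit depends only on the other two bits of its codeword.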

In this fashion, with inner and outer APP decoders exchanging extrinsic information each iteration, iterative decoding continues until convergence or a fixed number of iterations is reached, at which time a hard decision on the APP information bit LLRs $\lambda_{APP}(u)$ from the outer binary APP decoder determines the estimated information sequence $\hat{u}$.

III. CODE PROPERTIES AND PERFORMANCE ANALYSIS

Turbo code performance may be divided into three regions. In the first, the low SNR/high BER region, the turbo code does not perform well and iterative decoding has minimal effect. The turbo cliff or waterfall region follows, where the BER performance drops sharply to low values in only fractions of a dB SNR increase. Finally, at high SNR, there may be an error floor or flare where the BER curve flattens out due to the predominance of low-weight error events.

The latter two regions, the turbo cliff and error floor regions, require separate methodologies to analyze concatenated code performance. Extrinsic mutual information transfer, or EXIT, analysis [26], [27] is used to accurately predict turbo cliff onset SNR. Mutual information serves as a reliability measure of the soft information into and out of each component decoder. The second method is the minimum distance asymptote approximation [34] of the error performance in the high SNR error floor region, where performance of turbo coded systems flattens out by following the error curve of the most likely, minimum-weight sequence error event. Both methods are used to demonstrate the superior behavior of serially concatenated coded modulation and to optimize system performance. We first examine our system via minimum distance analysis.

A. Minimum Distance Analysis

The trellis of a differential 8-PSK encoder is fully connected, so that the shortest error event is always only 2 branches long. With respect to the all-zeros sequence, the 7 possible two-branch error events are listed below; the first symbol in each branch is the original 8-PSK symbol and the second symbol is the differential 8-PSK symbol, as shown in Figure 2.

(1) 1−1 7−0
(2) 2−2 6−0
(3) 3−3 5−0
(4) 4−4 4−0
(5) 5−5 3−0
(6) 6−6 2−0
(7) 7−7 1−0

Noting that merging branches all carry the same output symbol, only the diverging branch contributes to the minimum squared Euclidean distance (MSED). The two-branch error event MSED is simply that of naturally mapped 8-PSK, 0.586, found in (1) and (7). Without an outer code, the input sequences to the D8-PSK encoder are unconstrained, and the D8-PSK MSED is 0.586.
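As a worked check of the distances quoted in this section, the squared Euclidean distance between two unit-energy 8-PSK symbols separated by $m$ constellation steps follows from elementary geometry. The symbol $d^2(m)$ is introduced here only for illustration.

```latex
d^2(m) = \left| 1 - e^{j m \pi/4} \right|^2 = 2 - 2\cos\!\left(\frac{m\pi}{4}\right),
\qquad
d^2(1) = 2 - \sqrt{2} \approx 0.586,
\quad
d^2(2) = 2,
\quad
d^2(4) = 4 .
```

Events (1) and (7) diverge by one step ($m = 1$ or, equivalently, $m = 7$), which gives the MSED of 0.586, while event (4) diverges to the antipodal symbol and has a squared distance of 4.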

The 8-PSK mapping is not regular, i.e., all symbols separated by a given squared Euclidean distance (SED) do not have the same Hamming distance between them. For example, rotating 45° from one symbol to the next gives an SED of 0.586, but Hamming distance $d_H$ varies from 1 (between 000 at 0 rads and 001 at $\pi/4$ rads) to 3 (between 000 at 0 rads and 111 at $7\pi/4$ rads). The distance between the all-zeros sequence and an error sequence is not representative of the distance between all correct and incorrect paths when the symbol mapping is not regular. Thus we must consider all sequences, not just the all-zeros sequence, recognizing that parallel branches of the differential trellis carry identical information symbols due to the rotationally invariant trellis; all parallel paths are equivalent. However, consideration of the all-zeros sequence is sufficient to show the MSED of the differential code.

TABLE I
BIT SEQUENCES OF TWO-BRANCH ERROR EVENTS FOR NATURAL AND IMPROVED 8-PSK MAPPINGS.

8-PSK Natural Map   even  |  Improved 8-PSK Mapping  |  8-PSK Improved Map   even
001 111              x    |  001 → 111               |  111 101
010 110                   |  010 → 001               |  001 110
011 101              x    |  011 → 100               |  100 011
100 100              x    |  100 → 010               |  010 010              x
101 011              x    |  101 → 011               |  011 100
110 010                   |  110 → 110               |  110 001
111 001              x    |  111 → 101               |  101 111

Calculation of the minimum distance of turbo coded systems with random interleavers is significantly more involved than considering the minimum distance of the component codes [3], [34], [35]. The input sequences to the inner decoder for our serially concatenated system are constrained to be interleaved codewords of the outer (3,2,2) parity-check code, whose nonzero codewords all have Hamming weight $d_H = 2$, and thus the entire interleaved sequence is constrained to even Hamming weight. This system is expected to manifest a rather flat error floor due to the very low MSED of 0.586; simulation results presented later will demonstrate such an error floor.

We now consider the MSED error events, with the goal of better matching the 8-PSK mapping to the parity code sequence constraints to reduce the most likely error events. The minimum length detour of the inner code is 6 bits long for a two-branch error event. If natural 8-PSK mapping is used, the seven possible bit sequences for the two-branch error events with the all-zeros sequence as reference are given in the left side of Table I. Five out of seven of the two-branch detours in Table I have even weight, and are permissible sequences. An 8-PSK signal mapping that generates primarily odd-weight two-branch error events would lower the probability of choosing an MSED sequence, and be better matched to the outer parity check code.

Such an improved mapping, presented in [31], is given in the center section of Table I. The input bit sequences for the two-branch error events using the improved mapping with respect to the all-zeros sequence are shown on the right. The improved mapping has only one even-weight sequence, 010-010, which generates a two-branch error event with a squared Euclidean distance (SED) of 4.0. The sequences 111-101 and 101-111 both have SED of 0.586, but are not eligible as two-branch error events because they are of odd weight.
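The even-weight counts quoted above can be reproduced by enumerating the seven two-branch events for both mappings. The sketch below takes the improved mapping as read from the center section of Table I (with label 000 assumed to stay on symbol 0) and uses the all-zeros sequence as reference; the names are hypothetical.

```python
import cmath
import math

# Bit label of each 8-PSK symbol index 0..7.
NATURAL = {l: format(l, "03b") for l in range(8)}
# Improved mapping as read from Table I; symbol 0 is assumed to keep label 000.
IMPROVED = {0: "000", 1: "111", 2: "001", 3: "100",
            4: "010", 5: "011", 6: "110", 7: "101"}


def sed(steps):
    """Squared Euclidean distance between unit 8-PSK symbols `steps` apart."""
    return abs(1 - cmath.exp(1j * math.pi * steps / 4)) ** 2


def two_branch_events(mapping):
    """List (bit sequence, Hamming weight, SED) for the 7 two-branch events.

    Event l diverges with information symbol l and merges with symbol 8 - l;
    only the diverging branch contributes to the SED.
    """
    events = []
    for l in range(1, 8):
        first, second = mapping[l], mapping[8 - l]
        weight = (first + second).count("1")
        events.append((first + "-" + second, weight, sed(l)))
    return events


def even_events(mapping):
    """Events whose 6-bit sequence has even weight (allowed by the parity code)."""
    return [(seq, d) for seq, weight, d in two_branch_events(mapping)
            if weight % 2 == 0]


natural_even = even_events(NATURAL)
improved_even = even_events(IMPROVED)
```

Under these assumptions `natural_even` holds five sequences and `improved_even` holds only 010-010, matching the counts stated in the text.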
When we consider all other reference sequences besides the all-zeros sequence, there is only one two-branch error event resulting in MSED=0.586 and minimum $d_H = 2$. This MSED error event occurs with the interleaved coded sequence pair $v', \hat{v}'$ = 010-011, 011-010. This improved mapping has the same number of even-weight MSED two-branch error events, but far fewer (1 vs. 16) $d_H = 2$ MSED error events. It can be shown [36] that a random interleaver is far more likely to contain a permutation allowing a $d_H = 2$ two-branch error event for both these mappings, with probability independent of interleaver length, than one for $d_H = 4$ or 6, which decreases as $O(N^{-1})$. The improved mapping significantly reduces the number of $d_H = 2$ MSED two-branch error events compared to natural mapping. This reduction of MSED multiplicity lowers the error floor as will be seen in Section III-C. It does not increase the MSED of the code, which remains at 0.586 with high likelihood.

Use of a spread interleaver lowers the error floor further. An S-random interleaver [37] with spreading S ≥ 6 will prevent the occurrence of a single two-branch error event, though it cannot prevent the occurrence of two two-branch error events. The MSED of the concatenated code with an S-random spread interleaver of S ≥ 6 is 1.172.

B. EXIT Analysis

EXIT (extrinsic information transfer) analysis [26], [27] is a valuable technique for evaluating concatenated system performance in the turbo cliff, or waterfall, region. The mutual information $I(X; E)$ between symbols $X$ and the extrinsic soft information $E$ with regards to $X$ is used as a measure of the reliability of $E$ generated by each component decoder. Likewise, $I(X; A)$ measures the reliability of the a priori soft information $A$ into the component decoder, with regards to $X$. Input a priori LLRs $A$ are generated assuming a Gaussian distribution for $p(A)$, shown to be a very good approximation, especially with increasing iterations [38]. Given $A$, with an associated $I(X; A) = I_A$, the component APP decoder will produce $E$, with associated $I(X; E) = I_E$. The inner decoder also requires channel metrics on the transmitted symbols, and thus is dependent on the channel SNR. The outer decoder of a serially concatenated system never sees the channel information and is independent of SNR. In this manner, an EXIT chart displaying $I_A$, ranging from 0 to 1, versus $I_E$ for the APP decoder is produced.

These component EXIT charts are used to study the convergence behavior of concatenated iteratively decoded systems. The interleaving process does not alter mutual information; while scrambling the soft information, interleaving does not change its distribution. In addition, interleaving destroys any correlation between successive symbols. This separation and independence between the two decoders allows the component decoder EXIT charts to be combined into a single EXIT graph depicting the iterative behavior of the turbo decoding process. Each component decoder is simulated individually, without the need for implementation and simulation of the actual concatenated system. EXIT analysis allows the component codes to be chosen for optimization of concatenated system performance, without lengthy simulation of each code combination.

For our concatenated system, the outer parity code produces binary soft information which can be processed as LLRs $A_o(v)$ and $E_o(v)$, with associated $I(A_o; v) = I_{A_o}$ and $I(E_o; v) = I_{E_o}$. As the interleaver operates bitwise, the extrinsic interleaved bit LLRs will be passed on to the inner D8-PSK APP decoder. However, as the inner decoder operates on symbols, these interleaved LLRs $A_i(v')$ (or their corresponding bit probabilities $P_a(v')$) must be converted to symbol probabilities $P_a(w)$ as discussed in Section II. Conversely, the inner APP extrinsic information will be symbol probabilities $P_e(w)$, which must be converted to bit probabilities $P_e(v')$, or their corresponding LLRs $E_i(v')$, for deinterleaving. $I(A_i; v') = I_{A_i}$ and $I(E_i; v') = I_{E_i}$ are calculated for the inner LLR values.

Figure 4 displays the EXIT chart for our serially concatenated system with the (3,2,2) parity check code as outer code ($I_{E_o}$ on the horizontal axis, and $I_{A_o}$ on the vertical axis) and differential 8-PSK as the inner code (with axes swapped). The inner decoder EXIT curves depend on SNR, and are shown for 3.4 and 3.6 dB. The improved 8-PSK mapping discussed in Section III-A is used. Natural 8-PSK mapping allows for earlier turbo cliff onset at SNR 3.3 dB, but produces a higher error floor. An EXIT trajectory for the improved mapping at SNR 3.6 dB is shown also. All curves are simulated using a 180000-bit interleaver. Notice the close fit between the outer (3,2,2) parity check and inner differential 8-PSK EXIT curves at SNR 3.4 dB. The two codes are well-matched, in the sense that the combined codes minimize turbo cliff onset compared to a set of less well-matched codes.

[Figure: EXIT chart; horizontal axis $I_{A_i}, I_{E_o}$ and vertical axis $I_{E_i}, I_{A_o}$, both from 0 to 1. Curves: outer [3,2,2] parity code; inner D8PSK natural map at SNR 3.4 dB; inner D8PSK new map at SNR 3.4 dB and 3.6 dB; outer rate 2/3 16-state convolutional code; inner D8PSK natural map at SNR 5.0 dB; EXIT trajectory at SNR 3.6 dB. Regions labeled "Parity Coded System" and "Convolutional Coded System".]

Fig. 4. EXIT chart for outer (3,2,2) parity code and inner differential 8-PSK modulation, at SNR=3.4 and 3.6 dB, with improved mapping; decoding trajectory shown for SNR=3.6 dB. EXIT curve for outer rate 2/3 16-state convolutional code and differential 8-PSK at SNR=5 dB also shown.

An EXIT curve for an outer rate 2/3 16-state maximal free distance recursive systematic convolutional code is also shown in Figure 4, together with an inner differential 8-PSK EXIT curve for SNR 5 dB. Natural mapping is used. The free distance of this convolutional code is 5, compared to a free distance of 2 for the (3,2,2) parity check code. The increase in free distance of the outer code increases the minimum distance of the concatenated code and significantly lowers the error floor which exists with the parity check code. However, reduction of the error floor comes at the cost of a large increase in turbo cliff onset SNR. As shown in Figure 4, pinchoff for the convolutional code and differential 8-PSK modulation occurs at SNR 5 dB, 1.7 dB past that of the concatenated system with the outer (3,2,2) parity code. It is clear from the EXIT curve of the outer rate 2/3 convolutional code that the differential EXIT curve must be raised significantly higher by increasing SNR to clear the outer EXIT curve, and provide a channel for iterative convergence. This increase in turbo cliff onset is due to the poor match, as shown by the EXIT chart, between component decoders.
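One point on the outer (3,2,2) parity code's EXIT curve can be estimated along the lines described in Section III-B. The sketch below is an illustration under several assumptions rather than the authors' procedure: a priori LLRs are drawn from the usual consistent Gaussian model (mean $\pm\sigma^2/2$, variance $\sigma^2$), mutual information is estimated by a sample average, the LLR convention is $\log[P(1)/P(0)]$, and the function names and seed are hypothetical.

```python
import math
import random


def gaussian_llrs(bits, sigma, rng):
    """Consistent Gaussian a priori LLRs: mean +/- sigma^2/2, variance sigma^2."""
    mu = sigma ** 2 / 2
    return [(mu if b else -mu) + rng.gauss(0.0, sigma) for b in bits]


def mutual_information(bits, llrs):
    """Sample-average estimate I = 1 - E[log2(1 + exp(-s*L))], s = +1 for bit 1."""
    total = 0.0
    for b, value in zip(bits, llrs):
        z = -value if b else value
        # Numerically stable evaluation of log2(1 + e^z).
        total += (max(z, 0.0) + math.log1p(math.exp(-abs(z)))) / math.log(2)
    return 1.0 - total / len(bits)


def parity_extrinsic(l_a, l_b):
    """Extrinsic LLR of a bit equal to the XOR of two bits with LLRs l_a, l_b."""
    t = -math.tanh(l_a / 2) * math.tanh(l_b / 2)
    t = max(min(t, 1 - 1e-12), -1 + 1e-12)   # keep atanh finite
    return 2 * math.atanh(t)


def parity_exit_point(sigma, n_codewords, rng):
    """Estimate (I_A, I_E) of the (3,2,2) parity code for one a priori sigma."""
    bits, a_llrs, e_llrs = [], [], []
    for _ in range(n_codewords):
        a, b = rng.randint(0, 1), rng.randint(0, 1)
        cw = [a, b, a ^ b]
        la = gaussian_llrs(cw, sigma, rng)
        for i in range(3):
            others = [la[j] for j in range(3) if j != i]
            e_llrs.append(parity_extrinsic(*others))
        bits += cw
        a_llrs += la
    return mutual_information(bits, a_llrs), mutual_information(bits, e_llrs)


rng = random.Random(7)                      # hypothetical seed
i_a, i_e = parity_exit_point(2.0, 5000, rng)
```

Sweeping `sigma` from near zero to large values would trace out the outer curve; the inner D8-PSK curve needs an APP decoder on the 8-state trellis and is not covered by this sketch.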

C. Code Performance

Figure 5 shows the performance of our proposed serially concatenated coded modulation system with both natural and improved 8-PSK mapping and random interleaving. Results are shown for interleaver sizes of 15000 bits and 180000 bits. Notice the lowered error floor of the improved mapping. Also shown is the improved mapping with a fixed S-random spread interleaver of S=9 and 15000 bits. The spread interleaver lowers the error floor further.

At a rate of 2 bits/symbol, 8-PSK capacity is at $E_b/N_0 = 2.9$ dB [29], [30]. Our concatenated system provides BER performance 0.6-0.8 dB away from capacity for large interleaver size. The BER performance of our concatenated system underscores the effectiveness of simple design techniques, combined with turbo coding analysis techniques, in designing and optimizing systems for excellent performance. Not only are the two component decoders very simple to implement (an 8-state trellis decoder for the inner code and a lookup table for the outer code) but taken individually, their error control potential is very limited. The (3,2,2) parity check code is very weak, and differential 8-PSK modulation alone is used for its independence from phase synchronization, rather than any error-correcting ability. Together, however, they unfold the full potential of turbo coding, outperforming even large 8-PSK Ungerböck trellis codes [39] by 1 dB. A 64-state 8-PSK TCM code achieves an asymptotic coding gain over uncoded QPSK of 5 dB at a rate of 2 bits/symbol. As shown in Figure 5, our SCC system provides performance results 6 dB better than uncoded QPSK at a BER of $10^{-5}$.

Along the turbo cliff, 50 decoding iterations are required for convergence. EXIT analysis predicted that a large number of iterations along the turbo cliff would be required for convergence, due to the well-matched EXIT curves of the component codes, which result in a low SNR turbo cliff onset but provide only a narrow tunnel for iterative improvement.

[Figure: BER (bit error rate, $10^{0}$ down to $10^{-7}$) versus SNR = $E_b/N_0$ in dB (3 to 6). Curves for D8PSK: natural mapping with 180k interleaver; new mapping with 180k interleaver; natural mapping with 15k interleaver; new mapping with 15k interleaver; new mapping with 15k S=9 spread interleaver.]

Fig. 5. Performance of the serially concatenated D8-PSK system with outer (3,2,2) parity code.

Convergence above 4.5 dB occurs in 10 iterations or less. A minimum of 50 errors per data point was collected. As predicted by EXIT analysis, the larger interleaver size shows turbo cliff onset at SNR=3.3 dB for natural mapping and 3.5 dB for the improved mapping. Natural mapping provides a 0.2 dB advantage in turbo cliff onset, at the cost of a higher error floor. For the smaller interleaver size, along the turbo cliff, convergence requires 50 iterations; at SNR 4 dB, 30 iterations are required for convergence, decreasing to 10 iterations at SNR 5 dB and above. EXIT analysis also predicted these rates of convergence.

Our coherently decoded SCC achieves a BER of $2 \times 10^{-6}$ at an SNR of 3.9 dB, with an interleaver size of 15,000 bits using the improved 8-PSK mapping. In comparison, the serially concatenated TTCM system presented in [6], which assumes phase synchronization, provides a BER of $2 \times 10^{-5}$ at SNR=3.7 dB, the largest SNR value simulated, using an

interleaver size of 16,385 bits and 8-PSK modulation at a rate of 2 bits/symbol over the AWGN channel. Bit-interleaved coded modulation using iterative decoding (BICM-ID) has been examined for 8-PSK modulation [32]. At a rate of 2 bits/symbol over an AWGN channel with phase synchronization, BICM-ID achieves a BER of 3 × 10−5 at SNR=4.2 dB with an interleaver of 6000 bits. Our coherently decoded SCC provides comparable performance to the SCTTCM system, and superior performance to BICM-ID, and in addition, offers the potential for decoding without CSI. We now examine system performance in the presence of phase noise, without CSI, making use of our APP channel estimator. IV. D ECODING W ITHOUT C HANNEL I NFORMATION Accurate phase acquisition and tracking on physical channels at low SNR to achieve coherent decoding is no easy task. Implementation of the required algorithms often consumes more VLSI area than the decoder itself. Thus, we now consider the case when the received channel phase is unknown; decoding must be performed without a priori CSI, i.e., ”noncoherently”. A key observation is that the rotationally invariant property of the inner code allows us to extract a posteriori information on the information symbols wk , even without knowledge of the carrier phase. This soft information is provided as symbol probabilities from the inner APP decoder. Figure 6 shows the EXIT chart for the differential 8-PSK decoder with various phase offsets at SNR=4.5 dB. Neither differential detection nor pilot symbols were used, and external channel information is not used, i.e., phase synchronization is not assumed. The D8-PSK APP decoder provides some extrinsic information in the presence of phase offset, even when no a priori information is available (along the vertical axis, corresponding to the initial iteration, when IAi = 0). Even the worst-case offset between two symbols, π/8 rads, still provides some extrinsic information initially. 
The presence of extrinsic information without any a priori information is significant; no external method of generating initial phase information will be necessary, such as a training sequence or non-data-aided (NDA) phase estimator, nor will differential demodulation be needed. This initial extrinsic information is insufficient, however, to complete convergence; as seen in Figure 6, the component EXIT curves intersect, and decoding will be stuck there at a high bit error rate for all except φ = 0°. Hence we propose to use this soft information from the inner decoder to estimate and track the channel phase through successive iterations. This leads to a decoder which can achieve convergence even in the presence of significant channel phase offset.

[Figure 6: EXIT chart. Horizontal axis: $I_A^i$, $I_E^o$ (0 to 1); vertical axis: $I_E^i$, $I_A^o$ (0 to 1). Curves: D8PSK with 0°, 9°, 11.25°, 15° and 22.5° phase offset; [3,2,2] parity outer code.]

Fig. 6. EXIT chart of the differential 8-PSK decoder using the new mapping, for various constant phase offsets, SNR=4.5 dB, and (3,2,2) parity check decoder.

The inner APP decoder generates both 8-PSK symbol probabilities $P(w)$ and D8-PSK symbol probabilities $P(x)$; the latter will be used as input to a channel estimator for subsequent iterations. Low complexity is important, given the emphasis on implementation simplicity leading to our choice of component codes. The channel estimator complexity must not overshadow that of the decoding system, and thus an optimal linear estimator such as the minimum mean square error (MMSE) estimator is not considered. We choose a simpler filtering estimator, presented in [28], [31], [40].

The channel model used is AWGN with complex noise variance $N_0$ and a complex time-varying channel gain $h$. In the case of unknown channel phase, $h$ is a unit-length time-varying rotation. The received signal $y$ is given as

$$y_k = h_k x_k + n_k; \quad h_k = e^{j\phi_k}, \quad n_k \sim \mathcal{N}(0, N_0) \qquad (5)$$

From the first moment equation, $E[y_k] = h_k E[x_k]$, where $E[x_k]$ is the expectation over the a posteriori symbol probabilities $P(x_k)$ at time $k$ from the inner APP decoder, an instantaneous channel estimate $\hat{h}_k$ may be found as

$$\hat{h}_k = y_k \hat{x}_k^* \qquad (6)$$

where $\hat{x}_k$ is the mean $E[x_k]$ modified to unit modulus, i.e., $\hat{x}_k = E[x_k]/|E[x_k]|$.

At each iteration, the inner APP decoder sends soft probability estimates of the channel symbols $x_k$ to the channel estimator, which calculates the instantaneous channel estimates according to Equation 6. These channel estimates are then filtered through a lowpass digital filter $f_k$, whose bandwidth allows tracking of the phase noise process, to generate the filtered channel estimates $\tilde{h}_k = \hat{h}_k \star f_k$. For a constant phase offset, $\tilde{h}_k = (1/N) \sum_{j=1}^{N} \hat{h}_j$, where $N$ is the frame length, is simply the average of the instantaneous channel estimates. For a time-varying Markov phase process such as the random walk or linear phase processes, the channel estimates are filtered through an exponentially decaying moving average filter as $\tilde{h}_k = (1 - \alpha) \sum_{j=1}^{k} \alpha^{k-j} \hat{h}_j$, where $\alpha$, the exponential decay parameter, is typically close to 1. These filtered channel estimates are used to generate coherent decoder branch metrics $P(y_k|x_k) \propto \exp(-|y_k - \tilde{h}_k x_k|^2/N_0)$ to be used by the inner APP decoder.

No channel information is yet available during the initial iteration, so $\tilde{h}_k = 1$ is chosen for the initial branch metrics. As we will show, each iteration improves the a posteriori values $P(x)$ from the inner decoder, and thus an improved channel estimate $\tilde{h}$ can be recalculated, as long as the SNR is above the turbo cliff of the coherent decoder.

As mentioned previously, the differential trellis is rotationally invariant to integer multiples of π/4 rads phase shifts. If, however, in decoding the trellis, the beginning and final trellis states are assumed fixed to state 0 as is commonly done, endpoint errors will occur with a phase shift. The rest of the trellis will shift to a rotated sequence, but the endpoints cannot shift and remain fixed, causing errors. To prevent this, we use a "floating" decoding trellis for the inner APP decoder, with both beginning and final states assumed unknown and set to uniform probabilities.

[Figure 7: Phase in rads (−0.1 to 0.1) versus 8-PSK symbols (0 to 5000). Random walk phase: solid; estimated phase: dashed.]

Fig. 7. Random walk channel phase model and estimated phase with π/4 phase slips, decoded without errors.

Figure 7 illustrates the rotational invariance, with a sample random walk phase process at top and the final APP phase estimate beneath. Twice, the phase estimate “slips” to a phase rotated −π/4 rads from the random walk phase. However, no decoding errors occur in these phase slips; the rotationally invariant inner trellis ensures no decoding errors if the phase estimator converges to a rotated phase, and the outer decoder cancels bit errors due to phase jumps, so the phase estimator seamlessly slips between constellation symmetry angles.
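The estimator of Equations 5 and 6, the two filters, and the branch metric update described in this section can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' implementation: the function names, the `PSK8` constellation array, and the convention that the posteriors arrive as an N-by-8 array are all assumptions made here, and the inner APP decoder that would supply those posteriors is not modelled. The last two helpers illustrate, under the assumption of a unit reference symbol, why a rotation by π/4 leaves the differentially decoded symbols unchanged.

```python
import numpy as np

# Unit-energy 8-PSK points (assumed labelling: index m -> exp(j*2*pi*m/8)).
PSK8 = np.exp(1j * 2 * np.pi * np.arange(8) / 8)


def instantaneous_estimates(y, post):
    """Equation (6): h_hat_k = y_k * conj(x_hat_k), x_hat_k = E[x_k]/|E[x_k]|.

    y    : complex received samples, shape (N,)
    post : a posteriori symbol probabilities P(x_k), shape (N, 8)
    """
    mean = post @ PSK8                      # E[x_k] under P(x_k)
    mag = np.abs(mean)
    # A (near-)uniform posterior carries no phase information: use 0 there.
    x_hat = np.where(mag > 1e-12, mean / np.maximum(mag, 1e-12), 0)
    return y * np.conj(x_hat)


def filter_constant(h_hat):
    """Constant-phase case: one frame-wide average of the estimates."""
    return np.full_like(h_hat, h_hat.mean())


def filter_exponential(h_hat, alpha=0.99):
    """h_tilde_k = (1 - alpha) * sum_{j<=k} alpha^(k-j) * h_hat_j, recursively."""
    h_tilde = np.empty_like(h_hat)
    acc = 0j
    for k, h in enumerate(h_hat):
        acc = alpha * acc + (1 - alpha) * h
        h_tilde[k] = acc
    return h_tilde


def branch_metrics(y, h_tilde, n0):
    """P(y_k | x_k) up to a constant: exp(-|y_k - h_tilde_k x_k|^2 / N0)."""
    d = y[:, None] - h_tilde[:, None] * PSK8[None, :]
    return np.exp(-np.abs(d) ** 2 / n0)


def differential_encode(w):
    """x_k = x_{k-1} * w_k with an assumed reference symbol x_0 = 1."""
    return np.cumprod(w)


def differential_decode(x):
    """w_k = x_k * conj(x_{k-1}); a common rotation of all x_k cancels."""
    return x[1:] * np.conj(x[:-1])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, n0 = 5000, 0.1
    idx = rng.integers(0, 8, n)
    noise = np.sqrt(n0 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    y = np.exp(1j * np.pi / 16) * PSK8[idx] + noise
    post = np.eye(8)[idx]                   # idealised, fully reliable posteriors
    h_tilde = filter_constant(instantaneous_estimates(y, post))
    print("estimated phase (rad):", np.angle(h_tilde[0]))
```

In an actual receiver the posteriors would be soft and would be refreshed by the inner APP decoder at every iteration; the idealised one-hot posteriors in the demo only show that the averaging step recovers the phase offset when the symbol estimates are reliable.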

V. CHANNEL ESTIMATION SIMULATION RESULTS

We now examine system performance in the presence of phase noise, without CSI, making use of our APP channel estimator. Three different channel phase processes are simulated:
• Constant Phase Offset: $\phi_k = \phi$ for $\phi = \pi/16$ rads.
• Random Walk Phase Process: $\phi_k = \phi_{k-1} + \Phi_k$, $\Phi_k \sim \mathcal{N}(0, \sigma_\Phi^2)$ for $\sigma_\Phi^2 = 0.05$ deg$^2$.
• Linear Phase Process: $\phi_k = \phi_{k-1} + 2\pi \Delta f T$.

All channel models include AWGN. Decoding without channel state information is achieved using our integrated APP channel phase estimation algorithm. All channel phase estimation results use the same S-random interleaver with minimal spreading S = 3, size 15,000 bits, and are compared with coherent results for the same interleaver.

Figure 8 compares the BER performance of decoding with and without CSI over an AWGN channel with a constant channel phase offset of π/16, with near-coherent results. Fifty decoding iterations are used. The constant phase channel estimate $\tilde{h}$ for the entire frame is an average of the instantaneous channel estimates $\hat{h}$ as described in Section IV. Performance degrades somewhat as the phase offset approaches π/8. This is a metastable point, as π/8 is halfway between two valid differential sequences. The instantaneous channel estimates $\hat{h}$ will oscillate to either side of the π/8 boundary, and convergence to the correct phase estimate is very slow for a phase offset of exactly π/8 rads.

APP channel phase estimation results for a random walk phase process with $\sigma_\Phi^2 = 0.05$ deg$^2$ are compared to coherent decoding results in Figure 9. Fifty iterations are used. The channel phase estimation filter is given by $\tilde{h}_k = (1 - \alpha) \sum_{j=1}^{k} \alpha^{k-j} \hat{h}_j$ with exponential decay parameter $\alpha = 0.99$. APP channel estimation for the random walk phase process gives results 0.25 dB from coherent decoding along the turbo cliff, where 50 iterations are insufficient for convergence.

A constant frequency/linear phase offset may model oscillator drift or a mobile Doppler scenario. A carrier frequency $f_c$ of 1 GHz and oscillator drift of 0.1 ppm from a high quality oscillator gives $\Delta f$ = 100 Hz. A symbol rate of $10^6$ symbols/sec corresponds to $\Delta f T = 10^{-4}$. A lower carrier frequency of 450 MHz reduces $\Delta f T$ to $4.5 \times 10^{-5}$.

In a mobile Doppler scenario, a frequency offset at the receiver occurs when the transmitter is moving relative to the receiver. This frequency offset, termed the Doppler shift, is found as $\Delta f = v f_c / c$, where $v$ is the transmitter velocity, and $c$ is the speed of light. A typical highway velocity of 110 km/h and a carrier frequency of 1 GHz correspond to a Doppler shift $\Delta f$ of about 100 Hz. Increasing $f_c$ to 2.1 GHz, a 3G/CDMA/GSM allocated downlink frequency, gives a Doppler shift of about 210 Hz, and $\Delta f T = 2.1 \times 10^{-4}$.

Three values of constant frequency/linear phase offset are considered, $\Delta f T = 4.5 \times 10^{-5}$, $10^{-4}$, and $2.1 \times 10^{-4}$, with larger $\Delta f T$ expected to negatively impact performance. Simulation results for a linear phase process with these three values are also shown in Figure 9, for 100 iterations. The APP channel estimation of the linear phase process uses filter parameter $\alpha_0$ when the mean a priori LLR magnitude $\mu_{|\lambda_a(v')|} < 10$, and $\alpha_1$ when $\mu_{|\lambda_a(v')|} \geq 10$. For $\Delta f T = 4.5 \times 10^{-5}$ and $10^{-4}$, $\alpha_0 = 0.975$, $\alpha_1 = 0.99$; for $\Delta f T = 2.1 \times 10^{-4}$, $\alpha_0 = 0.95$, $\alpha_1 = 0.99$.

Phase estimation results are approximately 0.5 dB from coherent decoding for $\Delta f T = 2.1 \times 10^{-4}$, with 0.4 dB loss for $\Delta f T = 10^{-4}$, and near-coherent performance for $\Delta f T = 4.5 \times 10^{-5}$. The largest frequency offset has a slightly raised error floor, but smaller offsets converge to the coherent decoding error floor.

[Figure 8: BER (bit error rate, $10^{0}$ down to $10^{-7}$) versus SNR = $E_b/N_0$ in dB (3 to 5.5 dB), constant phase offset π/16 rads. Curves: coherent decoding; APP channel estimation.]

Fig. 8. BER results without CSI, phase offset=π/16 rads, using APP channel estimation, compared to coherent decoding; interleaver size=15k bits, S=3 interleaver, new 8-PSK mapping.

[Figure 9: BER (bit error rate, $10^{0}$ down to $10^{-7}$) versus SNR = $E_b/N_0$ in dB (3 to 6 dB). Curves: coherent decoding; random walk phase; linear phase with $\Delta f T = 4.5 \times 10^{-5}$; linear phase with $\Delta f T = 10^{-4}$; linear phase with $\Delta f T = 2.1 \times 10^{-4}$.]

Fig. 9. BER results without CSI, random walk and linear phase processes using APP channel estimation compared to coherent decoding; interleaver size=15k bits, S=3 interleaver, new 8-PSK mapping.
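As a rough check of the frequency-offset arithmetic and an illustration of the three simulated phase processes in this section, the following NumPy sketch may be useful. The function names, the rounded speed of light, and the zero initial phase $\phi_0 = 0$ for the random walk and linear processes are assumptions made for this example and are not taken from the paper. With these assumptions the Doppler formula gives roughly 102 Hz at 1 GHz and 214 Hz at 2.1 GHz for 110 km/h, which the text rounds to 100 Hz and 210 Hz.

```python
import numpy as np

C_LIGHT = 3.0e8  # speed of light in m/s (rounded; an assumption of this sketch)


def drift_offset(fc_hz, ppm):
    """Frequency offset in Hz caused by an oscillator drift given in ppm."""
    return fc_hz * ppm * 1e-6


def doppler_shift(v_kmh, fc_hz):
    """Doppler shift delta_f = v * fc / c, with the velocity given in km/h."""
    return (v_kmh / 3.6) * fc_hz / C_LIGHT


def normalized_offset(delta_f_hz, symbol_rate=1e6):
    """delta_f * T for a symbol period T = 1 / symbol_rate."""
    return delta_f_hz / symbol_rate


def constant_phase(n, phi=np.pi / 16):
    """phi_k = phi for every symbol."""
    return np.full(n, phi)


def random_walk_phase(n, var_deg2=0.05, rng=None):
    """phi_k = phi_{k-1} + Phi_k, Phi_k ~ N(0, var), assuming phi_0 = 0.

    The step variance is given in deg^2 and converted to radians.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma_rad = np.deg2rad(np.sqrt(var_deg2))
    return np.cumsum(rng.normal(0.0, sigma_rad, n))


def linear_phase(n, df_t=1e-4):
    """phi_k = phi_{k-1} + 2*pi*delta_f*T, assuming phi_0 = 0."""
    return 2 * np.pi * df_t * np.arange(1, n + 1)


if __name__ == "__main__":
    print("drift, 1 GHz, 0.1 ppm:", drift_offset(1e9, 0.1), "Hz")
    print("drift, 450 MHz:", normalized_offset(drift_offset(450e6, 0.1)))
    print("Doppler, 110 km/h, 1 GHz:", doppler_shift(110, 1e9), "Hz")
    print("Doppler, 110 km/h, 2.1 GHz:", doppler_shift(110, 2.1e9), "Hz")
    # Received samples for a 5000-symbol frame would then be h_k * x_k + n_k
    # with h_k = exp(j * phi_k), as in Equation 5.
    h = np.exp(1j * random_walk_phase(5000, rng=np.random.default_rng(0)))
    print("frame length:", h.size)
```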

VI. CONCLUSIONS

We have shown that a simple serially concatenated system combining an outer (3,2,2) parity check code with an inner differential 8-PSK modulation code offers very good results with iterative decoding according to turbo principles. The rotationally invariant property of the inner code aids in channel phase estimation, and supplies sufficient soft information from the inner APP decoder to allow initial operation under unknown channel phase rotations.

A simple channel estimation procedure using this soft information from the inner APP decoder achieves near-coherent performance without channel phase information, under both constant and time-varying simulated phase processes. Neither pilot symbols nor differential demodulation are used or needed. The random walk phase process channel estimation results in a turbo cliff shift of about 0.25 dB to the right, while the linear phase process results range from near-coherent performance for $\Delta f T = 4.5 \times 10^{-5}$ to a 0.5 dB loss in turbo cliff for the more severe $\Delta f T = 2.1 \times 10^{-4}$.

Both encoding and decoding portions of this system may be implemented with low complexity, and could be used in conjunction with packet transmission, where short messages increase the need for phase offset immunity. Due to a very low MSED, this simple code has a significant error floor. An improved 8-PSK mapping lowers the error floor at a slight 0.2 dB SNR penalty in the turbo cliff onset region. Use of a spread interleaver lowers the error floor further over random interleaving.

REFERENCES

[1] C. Berrou, A. Glavieux and P. Thitimajshima, "Near Shannon limit error-correcting coding and decoding: Turbo codes", Proc. IEEE Intl. Conf. Commun. (ICC) 1993, Geneva, Switzerland, 1993, pp. 1064-1070.
[2] C. Berrou, A. Glavieux and P. Thitimajshima, "Near optimum error correcting coding and decoding: turbo-codes", IEEE Trans. Commun., vol. COM-44, no. 10, pp. 1261-1271, Oct. 1996.
[3] S. Benedetto, D. Divsalar, G. Montorsi and F. Pollara, "Serial concatenation of interleaved codes: Performance analysis, design, and iterative decoding", IEEE Trans. Inform. Theory, 44(3), May 1998, pp. 909-926.
[4] S. Benedetto, D. Divsalar, G. Montorsi, and F. Pollara, "Parallel concatenated trellis coded modulation", Proc. IEEE Intl. Conf. Commun. (ICC) 1996, Dallas, TX, pp. 974-978, June 23-27, 1996.
[5] P. Robertson and T. Wörz, "Bandwidth-Efficient Turbo Trellis-Coded Modulation Using Punctured Component Codes", IEEE J. Sel. Areas Commun., vol. JSAC-16, no. 2, pp. 206-218, Feb. 1998.
[6] D. Divsalar and F. Pollara, "Turbo Trellis Coded Modulation with Iterative Decoding for Mobile Satellite Communications", IMSC 97, June 97.
[7] D. Divsalar and F. Pollara, "Serial and Hybrid Concatenated Codes with Applications", Proc. 1st Intl. Sym. on Turbo Codes, Brest, France, Sept. 3-5, 1997, pp. 80-87.
[8] J.P. Costas, "Synchronous Communications", Proc. IRE, vol. 44, Dec. 1956, pp. 1713-1718.
[9] S.A. Butman, J.R. Lesh, "The Effects of Bandpass Limiters on n-Phase Tracking Systems", IEEE Trans. Inform. Theory, vol. 25, June 1977, pp. 569-576.
[10] H. Meyr and G. Ascheid, Synchronization in Digital Communications, Vol. 1, Wiley Series in Telecommunications, 1990.
[11] F.M. Gardner, Phaselock Techniques, 2nd edition, John Wiley & Sons, 1979.
[12] J.G. Proakis, Digital Communications, 4th edition, McGraw-Hill, 2000.
[13] E.K. Hall and S.G. Wilson, "Turbo codes for noncoherent channels", Proc. IEEE GLOBECOM'97, Nov. 1997, pp. 66-70.
[14] D. Divsalar and M.K. Simon, "Multiple symbol differential detection of MPSK", IEEE Trans. Commun., vol. 38, pp. 300-308, Mar. 1990.
[15] D. Makrakis and K. Feher, "Optimal noncoherent detection of PSK signals", Electron. Lett., vol. 26, pp. 398-400, Mar. 1990.
[16] M. Peleg and S. Shamai, "Iterative decoding of coded and interleaved noncoherent multiple symbol detected DPSK", Electron. Lett., vol. 33, no. 12, pp. 1018-1020, June 1997.
[17] P. Hoeher and J. Lodge, "'Turbo DPSK': Iterative differential PSK demodulation and channel decoding", IEEE Trans. Commun., 47(6), June 1999, pp. 837-843.
[18] M. Peleg, S. Shamai and S. Galán, "Iterative decoding for coded noncoherent MPSK communications over phase-noisy AWGN channel", IEE Proc. Commun., vol. 147, Apr. 2000, pp. 87-95.
[19] C. Komninakis and R. Wesel, "Joint Iterative Channel Estimation and Decoding in Flat Correlated Rayleigh Fading", IEEE J. Sel. Areas Commun., vol. 19, no. 9, Sept. 2001, pp. 1706-1717.
[20] M.C. Valenti and B.D. Woerner, "Iterative channel estimation and decoding of pilot symbol assisted turbo codes over flat-fading channels", IEEE J. Sel. Areas Commun., vol. 19, Sept. 2001, pp. 1691-1706.
[21] S. Cioni, G. E. Corazza, and A. Vanelli-Coralli, "Turbo Embedded Estimation with imperfect Phase/Frequency Recovery", Proc. IEEE Intl. Conf. Commun. (ICC) 2003, Anchorage, AK, May 2003.
[22] S. Cioni, G. E. Corazza, and A. Vanelli-Coralli, "Turbo Embedded Estimation for High Order Modulation", Proc. 3rd Intl. Sym. on Turbo Codes & Related Topics, Brest, France, Sept. 1-5, 2003, pp. 447-450.
[23] ESA DVB-S2 contribution, "DVB-S2 Phase Jitter synthesis", Jan. 2003.
[24] K. R. Narayanan and G. L. Stüber, "A Serial Concatenation Approach to Iterative Demodulation and Decoding", IEEE Trans. Commun., vol. COM-47, no. 7, pp. 956-961, July 1999.
[25] M. Peleg, I. Sason, S. Shamai and A. Elia, "On interleaved, differentially encoded convolutional codes", IEEE Trans. Inform. Theory, 45(7), Nov. 1999, pp. 2572-2582.
[26] S. ten Brink, "Design of serially concatenated codes based on iterative decoding convergence", in 2nd International Symposium on Turbo Codes and Related Topics, Brest, France, 2000.
[27] S. ten Brink, "Convergence behavior of iteratively decoded parallel concatenated codes", IEEE Trans. Commun., 49(10), Oct. 2001, pp. 1727-1737.
[28] S. Howard, C. Schlegel, "Differentially-Encoded Turbo Coded Modulation with APP Channel Estimation", Proc. IEEE GLOBECOM 2003, San Francisco, CA, December 1-5, 2003.
[29] G. Ungerboeck, "Channel Coding with Multilevel/Phase Signals", IEEE Trans. Inform. Theory, vol. IT-28, Jan. 1982, pp. 55-67.
[30] R.G. Gallager, Information Theory and Reliable Communications, New York, Wiley, 1968.
[31] S. Howard, C. Schlegel, L. Pérez, F. Jiang, "Differential Turbo Coded Modulation over Unsynchronized Channels", Proc. of IASTED 3rd Intl. Conf. on Wireless and Optical Commun. (WOC) 2002, Banff, Alberta, Canada, 2002, pp. 96-101.
[32] X. Li, A. Chindapol and J.A. Ritcey, "Bit-Interleaved Coded Modulation With Iterative Decoding and 8PSK Signaling", IEEE Trans. Commun., vol. 50, August 2002, pp. 1250-57.
[33] L.R. Bahl, J. Cocke, F. Jelinek and J. Raviv, "Optimal Decoding of Linear Codes for Minimizing Symbol Error Rate", IEEE Trans. Inform. Theory, vol. 20, Mar. 1974, pp. 284-287.
[34] L.C. Pérez, J. Seghers and D.J. Costello, Jr., "A distance spectrum interpretation of turbo codes", IEEE Trans. Inform. Theory, vol. 42, pp. 1698-1709, Part I, Nov. 1996.
[35] S. Benedetto and G. Montorsi, "Unveiling Turbo-Codes: Some Results on Parallel Concatenated Coding Schemes", IEEE Trans. Inform. Theory, vol. 42, no. 2, March 1996, pp. 409-428.
[36] S. Howard, "Probability of Random Interleaver Containing $d_H$ = 2, 4 Two-Branch Error Patterns for the Differential 8PSK/[3,2,2] Parity SCC", unpublished report, HCDC Laboratory, Dept. of Electrical and Computer Engineering, University of Alberta, December 2003.
[37] D. Divsalar and F. Pollara, "Turbo Codes for PCS Applications", Proc. IEEE Intl. Conf. Commun. (ICC) 1995, Seattle, WA, June 1995.
[38] S. ten Brink, "Design of Concatenated Coding Schemes based on Iterative Decoding Convergence", Ph.D. dissertation, Universität Stuttgart, Shaker Verlag, Aachen 2002.
[39] G. Ungerboeck, "Trellis-coded modulation with redundant signal sets, Part I: Introduction", IEEE Commun. Mag., vol. 25, no. 2, pp. 5-11, February 1987.
[40] C. Schlegel and A. Grant, "Differential space-time codes", IEEE Trans. Inform. Theory, vol. 49, no. 9, Sept. 2003, pp. 2298-2306.

Sheryl Howard (S'00) received the B.S.E.E. in 1984 from the University of Utah, Salt Lake City, Utah, and the M.E.E.E. in 1988, also from the University of Utah. She is currently working towards the Ph.D. degree in electrical engineering at the University of Alberta, Edmonton, AB, Canada. Her research interests include iterative error-control decoding and coding techniques.

Christian Schlegel received the Dipl. El. Ing. ETH degree from the Federal Institute of Technology, Zurich, in 1984, and the M.S. and Ph.D. degrees in electrical engineering from the University of Notre Dame, Notre Dame, IN, in 1986 and 1989. He held academic positions at the University of South Australia, University of Texas and University of Utah, Salt Lake City. In 2001 he was named iCORE Professor for High-Capacity Digital Communications at the University of Alberta, Canada, a 3-million-dollar research program in leading-edge digital communications. His interests are in error control coding and applications, multiple access communications, digital communications, and analog and digital implementations of communications systems. He is the author of "Trellis Coding" and "Trellis and Turbo Coding" by IEEE/Wiley, and "Coordinated Multiple User Communications", co-authored with Professor Alex Grant. Dr. Schlegel received a 1997 Career Award, and a Canada Research Chair in 2001. Dr. Schlegel is associate editor for coding theory and techniques for IEEE Transactions on Communications, and a guest editor of the IEEE Proceedings on Turbo Coding. He served as technical program co-chair of ITW 2001 and ISIT'05, and general chair of CTW'05, as well as on numerous technical conference program committees.