Generalized Normalized Gradient Descent Algorithm Based on Estimated a Posteriori Error

Minsung Hur, Jin Yong Choi, Jong-Seob Baek, and JongSoo Seo
School of Electrical and Electronic Engineering, Yonsei University
134 Sinchon-dong, Seodaemun-gu, Seoul, Korea
[email protected]

Abstract— The Normalized Least-Mean-Square (NLMS) algorithm adapts to an unknown system by minimizing the error between the desired signal and the output of the adaptive filter. A small positive constant is added to the normalization term to keep the algorithm from diverging when the input power is small. This constant can also be made time-varying, and the Generalized Normalized Gradient Descent (GNGD) algorithm updates it based on the a priori error. The Steady-State Mean Square Error (SSMSE) performance of the algorithm is expected to improve when the a posteriori error is used instead of the a priori error. In this paper, a GNGD algorithm based on the estimated a posteriori error is derived for the coefficient update of the NLMS algorithm. Simulation results show that using the a posteriori error to update this constant within the channel equalizer coefficient update process reduces the sensitivity of the algorithm's SSMSE performance to the initial step size.
Keywords— A Posteriori Error, Generalized Normalized Gradient Descent Algorithm, Normalized Least Mean Square Algorithm

I. INTRODUCTION

One of the most popular algorithms in adaptive signal processing is the Least-Mean-Square (LMS) algorithm [1]. Adaptive algorithms based on the LMS algorithm use a stochastic gradient method, which is widely applied due to its low complexity and ease of implementation [2]. In the LMS algorithm, the adjustment is directly proportional to the tap-input vector. As a result, when the tap-input vector becomes large, the LMS algorithm suffers from a gradient noise amplification problem. To overcome this difficulty, a modification of LMS, referred to as the Normalized LMS (NLMS) algorithm, is used, in which the adjustment applied to the tap-weight vector at iteration n+1 is normalized with respect to the squared Euclidean norm of the tap-input vector at iteration n. Another popular approach is to employ a time-varying step size in the standard LMS weight update recursion [8], [11]. This is based on using large step size values when the algorithm is far from the optimal solution, thus speeding up the convergence rate. When the algorithm is near the optimum, small step size values are used to achieve a low level of misadjustment, thus achieving better overall performance. This can be obtained by adjusting the step size value in accordance with some criterion that provides an approximate measure of the adaptation process state.

This research was supported by the Korean Broadcasting System (KBS). This work was also supported by the Yonsei University Institute of TMS Information Technology, a Brain Korea 21 program, Korea.


Fig. 1. Block diagram of adaptive transversal filter: the transversal filter ŵ(n) produces the output y(n) from the input vector u(n); the error e(n) = d(n) − y(n) drives the adaptive weight-control mechanism.

Several criteria have been used: the squared instantaneous error [8], sign changes of successive samples of the gradient [10], attempting to reduce the squared error at each instant [9], or the cross-correlation of input and error [11]. Furthermore, the conventional NLMS algorithm can be modified to ameliorate its divergent behavior during intervals when the input vector is small by introducing a small constant value in the NLMS update equation. This value can also be updated using the Generalized Normalized Gradient Descent (GNGD) algorithm. This paper seeks to achieve better performance by using the a posteriori error, instead of the a priori error, in the GNGD algorithm for the NLMS update, so as to minimize the trade-off between Steady-State Mean Square Error (SSMSE) and tracking capability as the varying step size is changed adaptively. The proposed technique is expected to improve the performance compared with that of the conventional GNGD algorithm.

In Section II, the conventional algorithms are reviewed. The proposed algorithm is derived in Section III. The algorithm's performance is compared with that of the conventional GNGD algorithm in Section IV. Some concluding remarks are given in Section V.

II. CONVENTIONAL ALGORITHMS

A. The Normalized Least-Mean-Square Algorithm

The structure of the normalized LMS algorithm is shown in Fig. 1. The algorithm is basically the same as the standard LMS algorithm; they use the same weight-update mechanism, as can be seen in the following equation:


$$\hat{w}(n+1) = \hat{w}(n) + \bar{\mu}(n)\,u(n)\,e^{*}(n), \tag{1}$$

where the LMS algorithm uses a fixed $\bar{\mu}(n) = \mu_0$. It can be seen from the equation that when the power of the input vector $u(n)$ is large, the LMS algorithm suffers from a severe amplification of the gradient. To ameliorate this effect, the normalized LMS algorithm uses the normalized version of the input vector, $u(n)/\|u(n)\|^2$. Thus, modifying (1), a new weight update equation is derived:

$$\hat{w}(n+1) = \hat{w}(n) + \frac{\tilde{\mu}}{\|u(n)\|^2}\,u(n)\,e^{*}(n). \tag{2}$$

Here, in correspondence with (1), the step size for the normalized LMS algorithm can be defined as

$$\bar{\mu}(n) = \frac{\tilde{\mu}}{\|u(n)\|^2}. \tag{3}$$

Also, to overcome the gradient noise amplification problem associated with the LMS filter, which is caused by small values of the squared norm $\|u(n)\|^2$, (2) can be modified to produce

$$\hat{w}(n+1) = \hat{w}(n) + \frac{\tilde{\mu}}{\|u(n)\|^2 + \delta}\,u(n)\,e^{*}(n), \tag{4}$$

where $\delta > 0$. By defining $\bar{\mu}(n)$ as

$$\bar{\mu}(n) = \frac{\tilde{\mu}}{\|u(n)\|^2 + \delta}, \tag{5}$$

it can be viewed as an LMS filter with a time-varying step size parameter.

B. Variable Step Size Algorithm

In the standard LMS weight update recursion, one popular approach is to employ a time-varying step size [8]. This is based on using large step size values when the algorithm is far from the optimal solution, thus speeding up the convergence rate. When the algorithm is near the optimum, small step size values are used to achieve a low level of misadjustment, thus achieving better overall performance. This can be obtained by adjusting the step size value in accordance with some criterion that provides an approximate measure of the adaptation process state. The adaptation step size is adjusted using the energy of the instantaneous error $e(n)$ in (1). The step size update is expressed as

$$\bar{\mu}(n+1) = \alpha\bar{\mu}(n) + \gamma e^{2}(n), \tag{6}$$

where $0 < \alpha < 1$, $\gamma > 0$, and $\bar{\mu}(n+1)$ is set to $\mu_{\min}$ or $\mu_{\max}$ when it falls below or above these lower and upper bounds, respectively. The constant $\mu_{\max}$ is normally selected near the point of instability of the conventional LMS algorithm to provide the maximum possible convergence speed. The value of $\mu_{\min}$ is chosen as a compromise between the desired level of steady-state misadjustment and the required tracking capability of the algorithm. The parameter $\gamma$ controls the convergence time as well as the level of misadjustment of the algorithm. The algorithm performs better than the fixed step size LMS algorithm: at early stages of adaptation, the error is large, causing the step size to increase and thus providing a faster convergence speed; when the error decreases, the step size decreases as well, yielding a smaller misadjustment near the optimum.

C. Generalized Normalized Gradient Descent Algorithm

The Generalized Normalized Gradient Descent (GNGD) algorithm uses the weight update equation (4), but (5) is modified as follows:

$$\bar{\mu}(n) = \frac{\mu_0}{\|u(n)\|^2 + \delta(n)}. \tag{7}$$

The update of $\delta(n)$ using a stochastic gradient method based on the a priori error is given by

$$\delta(n+1) = \delta(n) - \rho\,\nabla_{\delta(n-1)}E(n), \tag{8}$$

where $E(n)$ is defined as

$$E(n) = e^{2}(n). \tag{9}$$

Here, by using the chain rule, $\nabla_{\delta(n-1)}E(n)$ is expressed as

$$\frac{\partial E(n)}{\partial \delta(n-1)} = \frac{\partial E(n)}{\partial e(n)}\,\frac{\partial e(n)}{\partial y(n)}\,\frac{\partial y(n)}{\partial \hat{w}(n)}\,\frac{\partial \hat{w}(n)}{\partial \bar{\mu}(n-1)}\,\frac{\partial \bar{\mu}(n-1)}{\partial \delta(n-1)} = \frac{e(n)\,e(n-1)\,u^{H}(n)u(n-1)}{\left(\|u(n-1)\|^{2} + \delta(n-1)\right)^{2}}, \tag{10}$$

where $y(n) = u^{H}(n)\hat{w}(n)$. The adaptation algorithm of $\delta(n)$ is obtained by applying (10) to (8):

$$\delta(n+1) = \delta(n) - \rho\mu_{0}\,\frac{e(n)\,e(n-1)\,u^{H}(n)u(n-1)}{\left(\|u(n-1)\|^{2} + \delta(n-1)\right)^{2}}. \tag{11}$$
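To make the conventional GNGD recursion concrete, the sketch below implements the weight update (1) with the time-varying step size (7) and the δ(n) update (11) in NumPy. This is a minimal illustration assuming real-valued signals (so the Hermitian transpose reduces to an ordinary inner product); the function name, the default parameter values, and the absence of any lower bound on δ(n) are illustrative choices, not taken from the paper.

```python
import numpy as np

def gngd_nlms(u, d, M, mu0=0.1, rho=8e-5, delta0=1e-3):
    """Conventional GNGD-regularized NLMS sketch, following (1), (7), (10), and (11).

    u  : input signal samples, shape (N,)
    d  : desired response samples, shape (N,)
    M  : number of filter taps
    Returns the a priori error sequence e(n) and the final tap weights.
    """
    N = len(u)
    w = np.zeros(M)           # tap-weight vector w_hat(n)
    delta = delta0            # regularization term delta(n)
    delta_prev = delta0       # delta(n-1)
    u_prev = np.zeros(M)      # tap-input vector u(n-1)
    e_prev = 0.0              # a priori error e(n-1)
    e = np.zeros(N)

    for n in range(M - 1, N):
        un = u[n - M + 1:n + 1][::-1]        # tap-input vector u(n), newest sample first
        e[n] = d[n] - np.dot(un, w)          # a priori error e(n) = d(n) - y(n)

        mu_bar = mu0 / (np.dot(un, un) + delta)   # time-varying step size of (7)
        w = w + mu_bar * un * e[n]                # weight update of (1), real-valued case

        # delta(n+1) from (11); no bound on delta is enforced in this sketch
        grad = e[n] * e_prev * np.dot(un, u_prev) / (np.dot(u_prev, u_prev) + delta_prev) ** 2
        delta_next = delta - rho * mu0 * grad

        u_prev, e_prev = un, e[n]
        delta_prev, delta = delta, delta_next

    return e, w
```

For complex-valued signals, the inner products would become Hermitian inner products and the conjugate error e*(n) would be used in the weight update, as in (1).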

III. PROPOSED GNGD ALGORITHM BASED ON A POSTERIORI ERROR

The GNGD algorithm based on estimated a posteriori error simply uses the a posteriori error instead of the a priori error in the conventional GNGD algorithm. Now, the coefficients of the adaptive filter are updated as follows:

$$\hat{w}(n+1) = \hat{w}(n) + \bar{\mu}(n)\,e_{p}(n)\,u(n). \tag{12}$$

The a posteriori error $e_{p}(n)$ can be estimated using its relationship to the a priori error $e(n)$. In order to obtain this relationship, we pre-multiply both sides of (12) by $u^{H}(n)$ and subtract them from $d(n)$ to obtain

$$e_{p}(n) = e(n) - \bar{\mu}(n)\,e_{p}(n)\,u^{H}(n)u(n), \tag{13}$$

where $e_{p}(n) = d(n) - u^{H}(n)\hat{w}(n+1)$. From (13), the estimated a posteriori error $e_{p}(n)$ can be obtained as

$$e_{p}(n) = \frac{e(n)}{1 + \bar{\mu}(n)\|u(n)\|^{2}}. \tag{14}$$
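As a small sketch of (12)-(14), the estimated a posteriori error can be formed directly from the a priori error and then used in the coefficient update, without first computing ŵ(n+1). Real-valued signals and illustrative names are assumed.

```python
import numpy as np

def posteriori_weight_update(w, u, d, mu_bar):
    """One coefficient update of (12) using the estimated a posteriori error of (14)."""
    e = d - np.dot(u, w)                       # a priori error e(n)
    e_p = e / (1.0 + mu_bar * np.dot(u, u))    # estimated a posteriori error, (14)
    w_next = w + mu_bar * e_p * u              # weight update, (12)
    return w_next, e, e_p
```

One can verify that d(n) − uᵀ(n)w_next equals e_p(n), i.e., the estimate in (14) is consistent with the definition of the a posteriori error given below (13).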

Now, (9) is modified as follows:

$$E(n) = e_{p}^{2}(n). \tag{15}$$

Fig. 2. Block diagram of adaptive equalizer (data passed through a raised cosine channel with additive white Gaussian noise; the equalizer output y(n) is compared with the desired data d(n) to form the error e(n)).

Fig. 3. Steady-state MSE versus initial step size µ0 for the a priori and a posteriori error based algorithms (W = 2.9 and W = 3.5).

Fig. 4. MSE learning curves for W = 2.9 (µ0 = 0.1 and µ0 = 0.9).

Thus, (10) is changed to

$$\frac{\partial E(n)}{\partial \delta(n-1)} = \frac{\partial E(n)}{\partial e_{p}(n)}\,\frac{\partial e_{p}(n)}{\partial y(n)}\,\frac{\partial y(n)}{\partial \hat{w}(n)}\,\frac{\partial \hat{w}(n)}{\partial \bar{\mu}(n-1)}\,\frac{\partial \bar{\mu}(n-1)}{\partial \delta(n-1)} = \frac{e_{p}(n)\,e_{p}(n-1)\,u^{H}(n)u(n-1)}{\left(\|u(n-1)\|^{2} + \delta(n-1)\right)^{2}\left(1 + \bar{\mu}(n)\|u(n)\|^{2}\right)}. \tag{16}$$

The adaptation algorithm of $\delta(n)$ is obtained by applying (16) to (8):

$$\delta(n+1) = \delta(n) - \rho\mu_{0}\,\frac{e_{p}(n)\,e_{p}(n-1)\,u^{H}(n)u(n-1)}{\left(\|u(n-1)\|^{2} + \delta(n-1)\right)^{2}\left(1 + \bar{\mu}(n)\|u(n)\|^{2}\right)}. \tag{17}$$

IV. SIMULATION RESULTS

In order to compare the performance of the conventional GNGD algorithm with that of the GNGD algorithm based on estimated a posteriori error, a raised cosine channel was used. Fig. 2 shows the block diagram of the adaptive equalizer to which the NLMS algorithm using the GNGD update is applied. The input signal is a sequence of zero-mean, unit-variance binary random symbols, and the zero-mean white noise has variance $\sigma_{N}^{2} = 0.001$ (30 dB SNR). The raised cosine channel is given by

$$h(n) = \begin{cases} \dfrac{1}{2}\left[1 + \cos\left(\dfrac{2\pi}{W}(n-2)\right)\right], & n = 1, 2, 3,\\[4pt] 0, & \text{otherwise,} \end{cases} \tag{18}$$

where $n$ is the sample number in the time domain and $W$ is a variable that controls the magnitude of the channel distortion. For the simulation, 7000 input symbols and 11 equalizer taps were used. White Gaussian noise was added so as to attain an SNR of 30 dB. The MSE performance is obtained from an ensemble average over 100 independent experiments. The SSMSE performance is also evaluated with varying initial step size $\mu_{0}$ for the two channel parameters W = 2.9 and W = 3.5, where 0.001 was used for the initial value of $\delta(n)$, and $\rho$ = 0.00008 and 0.000001 for W = 2.9 and W = 3.5, respectively.

Fig. 5. MSE learning curves for W = 3.5 (µ0 = 0.1 and µ0 = 0.9).

As can be seen in Fig. 3, the GNGD algorithm based on estimated a posteriori error has better SSMSE performance than the conventional GNGD algorithm. From the figure, it can also be verified that the GNGD algorithm based on estimated a posteriori error is less sensitive to the varying step size value, which implies that it experiences less SSMSE variation. Fig. 4 and Fig. 5 show the MSE learning curves of the adaptive equalizer in raised cosine channels with W = 2.9 and W = 3.5, respectively. The GNGD algorithm based on estimated a posteriori error achieves a lower SSMSE but exhibits a slower convergence speed than the conventional algorithm. Note that as the initial step size $\mu_{0}$ is increased, the difference in both SSMSE performance and convergence speed increases.
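A minimal sketch of the simulation setup described above, using the raised cosine channel (18) and the proposed a posteriori error based updates (12), (14), and (17), is given below. The equalizer decision delay, the random seed, and the default µ0 are illustrative assumptions not specified in the paper, and only a single run is produced (the curves in Fig. 4 and Fig. 5 average 100 such runs).

```python
import numpy as np

def raised_cosine_channel(W):
    """Channel impulse response of (18): h(n) = 0.5*[1 + cos(2*pi/W*(n-2))], n = 1, 2, 3."""
    n = np.arange(1, 4)
    return 0.5 * (1.0 + np.cos(2.0 * np.pi / W * (n - 2)))

def run_proposed_gngd(W=2.9, N=7000, M=11, mu0=0.5, rho=8e-5, delta0=1e-3,
                      noise_var=0.001, delay=7, seed=0):
    rng = np.random.default_rng(seed)
    s = rng.choice([-1.0, 1.0], size=N)                  # zero-mean, unit-variance binary symbols
    x = np.convolve(s, raised_cosine_channel(W))[:N]     # channel output
    x += np.sqrt(noise_var) * rng.standard_normal(N)     # additive white Gaussian noise

    w = np.zeros(M)                                      # equalizer tap weights
    delta, delta_prev = delta0, delta0                   # delta(n), delta(n-1)
    u_prev = np.zeros(M)                                 # u(n-1)
    ep_prev = 0.0                                        # e_p(n-1)
    mse = np.zeros(N)

    for n in range(M, N):
        un = x[n - M + 1:n + 1][::-1]                    # equalizer input vector u(n)
        d = s[n - delay]                                 # delayed training symbol d(n)
        e = d - np.dot(un, w)                            # a priori error e(n)

        mu_bar = mu0 / (np.dot(un, un) + delta)          # step size, (7)
        e_p = e / (1.0 + mu_bar * np.dot(un, un))        # estimated a posteriori error, (14)
        w = w + mu_bar * e_p * un                        # weight update, (12)

        # delta(n) update of (17), driven by the a posteriori error
        denom = (np.dot(u_prev, u_prev) + delta_prev) ** 2 * (1.0 + mu_bar * np.dot(un, un))
        delta_new = delta - rho * mu0 * e_p * ep_prev * np.dot(un, u_prev) / denom

        u_prev, ep_prev = un, e_p
        delta_prev, delta = delta, delta_new
        mse[n] = e ** 2                                  # instantaneous squared a priori error

    return mse, w
```

Averaging the returned `mse` sequence over many independent runs (and converting to dB) would reproduce learning curves of the kind shown in Fig. 4 and Fig. 5.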


V. CONCLUSION

The GNGD algorithm based on estimated a posteriori error, which is intended to minimize the trade-off between SSMSE performance and tracking capability, uses the a posteriori error to update the tap coefficients. Simulation results show that it has better SSMSE performance than the conventional GNGD algorithm. However, it was also shown that the estimated a posteriori error based GNGD algorithm exhibits a slower convergence speed. Nonetheless, with the a posteriori error based GNGD algorithm, the sensitivity to different initial step sizes µ0 is improved compared with that of the conventional GNGD algorithm.

REFERENCES
[1] B. Widrow, J. M. McCool, M. G. Larimore, and C. R. Johnson, Jr., "Stationary and nonstationary learning characteristics of the LMS adaptive filter," Proceedings of the IEEE, vol. 64, pp. 1151-1162, Aug. 1976.
[2] B. Widrow and S. D. Stearns, Adaptive Signal Processing, Englewood Cliffs, NJ: Prentice-Hall, 1985.
[3] S. Haykin, Adaptive Filter Theory, 4th ed., Prentice-Hall Inc., New Jersey, 2002.
[4] D. T. M. Slock, "On the convergence behavior of the LMS and normalized LMS algorithms," IEEE Trans. on Signal Processing, vol. 41, no. 9, pp. 2811-2825, Sept. 1993.
[5] R. H. Kwong and E. W. Johnston, "A variable step size LMS algorithm," IEEE Trans. on Signal Processing, vol. 40, pp. 1633-1642, July 1992.
[6] D. P. Mandic, "A generalized normalized gradient descent algorithm," IEEE Signal Processing Letters, vol. 11, Feb. 2004.
[7] S. C. Douglas and M. Rupp, "A posteriori updates for adaptive filters," Conference Record of the Thirty-First Asilomar Conference on Signals, Systems and Computers, vol. 5, pp. 1641-1645, Nov. 1997.
[8] R. H. Kwong et al., "A variable step size LMS algorithm," IEEE Trans. Signal Processing, vol. 40, pp. 1633-1642, July 1992.
[9] V. J. Mathews and Z. Xie, "A stochastic gradient adaptive filter with gradient adaptive step size," IEEE Trans. Signal Processing, vol. 41, pp. 2075-2087, June 1993.
[10] R. Harris et al., "A variable step size (VSS) algorithm," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-34, pp. 499-510, June 1986.
[11] T. J. Shan and T. Kailath, "Adaptive algorithms with an automatic gain control feature," IEEE Trans. Acoust., Speech, Signal Processing, vol. 35, pp. 122-127, Jan. 1988.
