SUBMITTED TO THE IEEE TRANSACTIONS ON INFORMATION THEORY


General Classes of Performance Lower Bounds for Parameter Estimation - Part I: Non-Bayesian Bounds for Unbiased Estimators

Koby Todros and Joseph Tabrikian
Dept. of ECE, Ben-Gurion University of the Negev, Beer-Sheva 84105, Israel
Email: [email protected], [email protected]

Abstract

In this paper, a new class of lower bounds on the mean-square-error (MSE) of unbiased estimators of deterministic parameters is proposed. Derivation of the proposed class is performed by projecting each entry of the vector of estimation error on a closed Hilbert subspace of L2. This Hilbert subspace contains linear transformations of elements in the domain of an integral transform of the likelihood-ratio function. The integral transform generalizes the traditional derivative and sampling operators, which are applied to the likelihood-ratio function for computation of performance lower bounds, such as the Cramér-Rao, Bhattacharyya and McAulay-Seidman bounds. It is shown that some well known lower bounds on the MSE of unbiased estimators can be derived from this class by modifying the kernel of the integral transform. A new lower bound is derived from the proposed class using the kernel of the Fourier transform. In comparison with other existing bounds, the proposed bound is computationally manageable and provides better prediction of the threshold region of the maximum-likelihood estimator in the problem of single tone estimation.

Index Terms: Parameter estimation, mean-square-error bounds, threshold SNR, non-Bayesian estimation, uniformly unbiased estimators, performance lower bounds.

January 14, 2010

DRAFT


I. INTRODUCTION

Lower bounds on the mean-square-error (MSE) of estimators enable optimal performance prediction and constitute a benchmark for performance evaluation. There are three main categories of lower bounds on the MSE of estimators: (1) non-Bayesian bounds for cases where the parameters of interest are deterministic, (2) Bayesian bounds for cases where the unknown parameters are random, and (3) hybrid bounds for cases where the observation model contains both deterministic and random parameters. In this and the succeeding paper, new classes of non-Bayesian and Bayesian lower bounds on the MSE of estimators are proposed. In Part I, a new class of non-Bayesian bounds is derived for the case of unbiased estimators by projecting each entry of the vector of estimation error on a closed Hilbert subspace of L2. This Hilbert subspace contains linear transformations of elements in the domain of an integral transform of the likelihood-ratio (LR) function. In Part II, a new class of Bayesian bounds is derived by projecting each entry of the vector of estimation error on a closed Hilbert subspace of L2, which contains linear transformations of elements in the domain of an integral transform of particular functions that are orthogonal to any function of the observations.

Historically, the first lower bound on the MSE of any unbiased estimator of deterministic parameters was the Cramér-Rao bound (CRB) [1], [2]. The CRB is widely used for two main reasons. First, it is simple to calculate, and it is usually possible to obtain closed-form expressions, which are useful for system analysis and design. Second, under some mild conditions, the maximum-likelihood estimator (MLE) attains the bound asymptotically. The main disadvantage of the CRB is that it is not tight for “large” estimation errors, since it is derived using local statistical information of the observations in the vicinity of the true parameters.
Another disadvantage is that some regularity conditions on the likelihood function and on the function of the parameters to be estimated are required. A tighter lower bound was proposed by Bhattacharyya [3]. Similarly to the CRB, the Bhattacharyya bound (BHB) is not tight for “large” estimation errors, since it is also based on local statistical information. Regularity conditions on the likelihood function and the function of the parameters to be estimated are required as well. Furthermore, derivation of the BHB is cumbersome due to the high-order derivatives of the likelihood function and of the function of the parameters to be estimated. Under the assumption of uniform unbiasedness over an indexed parameter space, the tightest lower bound on the MSE of uniformly unbiased estimators was derived by Barankin [4]. Unfortunately, in many cases the Barankin bound (BB) is practically incomputable. Therefore, numerous works have been devoted to deriving computationally manageable approximations of the BB, such as the Hammersley-Chapman-Robbins (HCR) [5], McAulay-Seidman (MS) [6], McAulay-Hofstetter (MH) [7] and Abel [8] bounds. In [9] it was shown that all the bounds derived in these works, including the CRB and the BHB, may be unified under one general class in which the BB is approximated via piecewise Taylor series expansions of the likelihood function and the function of the parameters to be estimated. Using this approach, a new computationally manageable and tighter BB approximation was derived in [9]. Lower bounds on the MSE of unbiased estimators have been computed for several estimation problems, such as source localization [7], [10]-[12], direction-of-arrival (DOA) estimation [13], Doppler and range estimation [14] and time delay estimation [15]. Computation of these bounds for other applications may be found in [16]-[18]. In [6], [19]-[25], lower bounds on the MSE of unbiased estimators have been applied for signal-to-noise-ratio (SNR) threshold analysis in bearing estimation, frequency estimation, source localization, data-aided carrier synchronization and time delay estimation. One may notice that all BB approximations are based on cross-correlating the estimation error with derivatives and/or samples of the likelihood-ratio (LR) function and using the Cauchy-Schwarz inequality. The derivative operator imposes regularity conditions on the likelihood function and yields bounds in which the function of the parameters to be estimated is required to be differentiable. Regarding the sampling operator, increasing the number of samples, or test-points, may improve the tightness of the bounds. However, this improvement comes at the expense of computational complexity. Although there is some intuitive link between the ambiguity diagram and the optimal test-points, as shown in [26], there is no explicit analytical procedure for optimal selection of test-points. Therefore, numerical search methods are usually applied for this task.
These methods become computationally cumbersome as the number of test-points and the dimensionality of the parameters increase. We note that the derivative and sampling operators may be viewed as integral transforms with specific kernels. In this paper, we show that other lower bounds can be obtained by choosing other kernels of the integral transform.

Under the assumption of uniform unbiasedness over a parameter space with finite Lebesgue measure, a new class of lower bounds on the MSE of uniformly unbiased estimators is proposed in this paper. We begin by showing that lower bounds on the MSE of uniformly unbiased estimators can be obtained via projections of the estimation errors on some Hilbert subspaces of L2. Hence, a Hilbert subspace of L2, denoted by Hφ, which contains linear transformations of the LR function, is constructed. It is shown that projection of each entry of the vector of estimation error on any closed subspace of Hφ, denoted by Hϕ, yields an estimator-independent lower bound on the MSE of any uniformly unbiased estimator. As a special case, if Hϕ = Hφ, the integral form of the BB, also presented in [9], [17] and [27], is obtained. We note that this is an extension of the contributions in [28] and [29], in which it was shown that in the case of estimating a scalar function of the unknown parameters, the BB is the squared norm of the projection of the estimation error on a Hilbert subspace of L2, spanned by a countable set of LR functions.

The proposed class is based on projecting each entry of the vector of estimation error on Hϕ^(h) ⊆ Hφ, where Hϕ^(h) contains linear transformations of elements in the domain of an integral transform of the LR function. The use of the integral transform generalizes the traditional derivative and sampling operators used for computation of the bounds in [2]-[9]. Hence, it is shown that some well known bounds in the literature can be derived from the proposed class via specific choices of the integral transform kernel. It is also shown that for any invertible integral transform, the integral form of the BB is obtained from the proposed class. Moreover, new lower bounds can be derived from the proposed class via modification of the kernel of the integral transform. Hence, by choosing an integral transform such that the significant information in the LR function is “compressed” into few elements in the domain of the transform, tight and computationally manageable bounds can be obtained by sampling the domain of the integral transform. In searching for this kind of “compressing” integral transform, we note that in cases where the spectrum of the LR function is concentrated in a small subset of the frequency domain, the significant information in the LR function in the parameter space can be “compressed” into a few frequency components via the Fourier transform. Hence, a new lower bound is derived from the proposed class using the kernel of the Fourier transform. In comparison with other existing bounds, it is shown that the proposed bound is computationally manageable and provides better prediction of the signal-to-noise-ratio (SNR) threshold region exhibited by the MLE, in the problem of single tone estimation. We mention that the basic idea behind this paper was first presented by us in the conference paper [30].

The paper is organized as follows. In Section II, we show that lower bounds on the MSE of uniformly unbiased estimators can be obtained via projection of each entry of the vector of estimation error on some Hilbert subspaces of L2. In Section III, a new class of lower bounds on the MSE of uniformly unbiased estimators is derived.
The relation of this class to the integral form of the BB is also discussed. In Section IV, some well known bounds are derived from the proposed class by modifying the integral transform kernel. In Section V, a new bound is derived from the proposed class using the kernel of the Fourier transform. In Section VII, the proposed bound of Section V is compared with some other known bounds in terms of threshold SNR prediction in the problem of single tone estimation. Section VIII summarizes the main points of this contribution.


II. MSE LOWER BOUNDS VIA PROJECTIONS IN SOME HILBERT SUBSPACES OF L2

In this section, it is shown that lower bounds on the MSE of uniformly unbiased estimators can be obtained via projection of each entry of the vector of estimation error on some Hilbert subspaces of L2. We begin by stating some definitions and assumptions, which will be used throughout the paper. Afterwards, a Hilbert subspace of L2, denoted by Hφ, which contains linear transformations of the LR function, is constructed. It is then shown that projection of each entry of the vector of estimation error on any closed subspace of Hφ, denoted by Hϕ, yields an estimator-independent lower bound on the MSE of any uniformly unbiased estimator. As a special case, if Hϕ = Hφ, it is shown that the integral form of the BB, also presented in [9], [17] and [27], is obtained.

A. Definitions and assumptions

1) Parameter space: We assume a measurable parameter space with finite Lebesgue measure, denoted by Θ ⊆ R^M.
2) Function to be estimated: We consider the estimation of g(θt), where g : Θ → R^L is bounded, deterministic and known, and θt ∈ Θ is deterministic and unknown. We note that all the functions used in this paper are assumed to be measurable [31].
3) Observation space: An observation space of points x is denoted by X.
4) Probability measure and probability density function: Let Pθ denote a family of probability measures on X, parameterized by θ ∈ Θ. It is assumed that Pθ is absolutely continuous w.r.t. the σ-finite positive measure µ on X, such that the Radon-Nikodym derivative [31]

    f(x; θ) = dPθ(x)/dµ(x), ∀θ ∈ Θ,    (1)

exists. The function f(x; θ) is also termed the likelihood function of θ given the observation vector x ∈ X.
5) The Hilbert space L2(X): The Hilbert space of absolutely square-integrable functions ζ : X → C w.r.t. Pθt is denoted by L2(X). The inner product of two elements ζ(x), ζ′(x) ∈ L2(X) is defined by

    ⟨ζ(x), ζ′(x)⟩_{L2(X)} ≜ ∫_X ζ(x) ζ′*(x) dPθt(x) = E_{x;θt}[ζ(x) ζ′*(x)],    (2)


where (·)* and E_{x;θt}[·] denote the complex conjugate and the expectation w.r.t. f(x; θt), respectively. Hence, ζ(x) ∈ L2(X) if and only if the squared norm

    ‖ζ(x)‖²_{L2(X)} ≜ ⟨ζ(x), ζ(x)⟩_{L2(X)}    (3)

is finite.
6) Estimation error and MSE: Let ĝ(x) ∈ L2^L(X) denote an estimator of g(θt), where L2^L(X) ≜ L2(X) × ··· × L2(X) (L times). The vector of estimation error and the MSE matrix at θt are given by

    e(ĝ(x)) ≜ ĝ(x) − g(θt)    (4)

and

    MSE(ĝ(x)) ≜ E_{x;θt}[e(ĝ(x)) e^H(ĝ(x))],    (5)

respectively.
7) The class of uniformly unbiased estimators with finite L2(X)-norms: The class of uniformly unbiased estimators of g(θt), with finite L2(X)-norms, is given by

    M ≜ {g̃(x) ∈ L2^L(X) : E_{x;θ}[g̃(x)] = g(θ), ∀θ ∈ Θ}.    (6)

8) The likelihood-ratio function: Let

    ν(x, θ) ≜ { f(x;θ)/f(x;θt) = dPθ(x)/dPθt(x),  f(x; θt) > 0
              { 0,                                 otherwise    (7)

denote the LR function. In this paper, it is assumed that ν(x, θ) ∈ L2(X) ∀θ ∈ Θ. In Appendix A, it is shown that this assumption implies that ν(x, θ) ∈ L1(Θ) for a.e. x ∈ X, where L1(Θ) denotes the space of absolutely integrable functions in Θ.

B. Construction of the Hilbert subspace Hφ ⊂ L2(X)

In this subsection, the following Hilbert subspace of L2(X) is constructed:

    Hφ ≜ { φ_q(x) ≜ ∫_Θ q(θ) ν(x, θ) dθ : q(θ) ∈ L1(Θ) },    (8)

where q : Θ → C.
Proposition 1: Let Hφ denote the space defined in (8); then Hφ ⊂ L2(X).


Proof: According to (2), (3) and (8), for any φ_q(x) ∈ Hφ,

    ‖φ_q(x)‖²_{L2(X)} = ∫_X |∫_Θ q(θ) ν(x, θ) dθ|² dPθt(x).    (9)

Hence, by the properties of the Lebesgue integral,

    ‖φ_q(x)‖²_{L2(X)} ≤ ∫_X (∫_Θ |q(θ)| ν(x, θ) dθ)(∫_Θ |q(θ′)| ν(x, θ′) dθ′) dPθt(x).    (10)

Since |q(θ′)| ∈ L⁺(Θ) and ν(x, θ) ∈ L⁺(X × Θ), it follows that |q(θ)| |q(θ′)| ν(x, θ) ν(x, θ′) ∈ L⁺(Θ × X × Θ). Therefore, by the Tonelli theorem [31], the integration order in (10) can be interchanged, and hence

    ‖φ_q(x)‖²_{L2(X)} ≤ ∫_Θ ∫_Θ |q(θ)| (∫_X ν(x, θ) ν(x, θ′) dPθt(x)) |q(θ′)| dθ′ dθ.    (11)

One can notice that since ν(x, θ) ∈ L2(X) ∀θ ∈ Θ, then by the Cauchy-Schwarz inequality [31] there exists a finite positive constant c ∈ R such that ∀θ, θ′ ∈ Θ

    ∫_X ν(x, θ) ν(x, θ′) dPθt(x) = ⟨ν(x, θ), ν(x, θ′)⟩_{L2(X)} ≤ ‖ν(x, θ)‖_{L2(X)} ‖ν(x, θ′)‖_{L2(X)} < c.    (12)

Therefore, according to (11) and (12),

    ‖φ_q(x)‖²_{L2(X)} < c (∫_Θ |q(θ)| dθ)² < ∞,    (13)

where the last inequality in (13) stems from the fact that q(θ) ∈ L1(Θ). Hence, under the assumption that Hφ is complete, i.e. any Cauchy sequence in Hφ converges to a limit in Hφ, Hφ is a closed subspace of L2(X).

C. MSE lower bounds via projections in closed subspaces of Hφ

Let ĝ(x) ∈ M denote a uniformly unbiased estimator of g(θt). In this subsection, the following properties are shown:
1) The projections of each entry of e(ĝ(x)) on any closed subspace of Hφ form a vector, termed the “projections-vector”, which is ĝ(x)-independent.
2) The MSE matrix of ĝ(x) is lower bounded by the autocorrelation matrix of the “projections-vector”.
These two properties are formulated in the following theorem.


Theorem 1: Let ĝ(x) ∈ M, and let

    p_J(e(ĝ(x))|Hϕ) ≜ [p_J([e(ĝ(x))]_0|Hϕ), ..., p_J([e(ĝ(x))]_{L−1}|Hϕ)]^T,    (14)

where p_J([e(ĝ(x))]_l|Hϕ) is the projection of [e(ĝ(x))]_l on Hϕ ⊆ Hφ, and Hϕ is closed. Then:
1) p_J(e(ĝ(x))|Hϕ) is ĝ(x)-independent.
2)

    MSE(ĝ(x)) ⪰ C_{Hϕ} ≜ E_{x;θt}[p_J(e(ĝ(x))|Hϕ) p_J^H(e(ĝ(x))|Hϕ)].    (15)

Proof:
1) The Hilbert projection theorem, stated in Appendix B, implies that ∀l = 0, ..., L−1,

    p_J([e(ĝ(x))]_l|Hϕ) = arg min_{φ_q(x)∈Hϕ} ‖[e(ĝ(x))]_l − φ_q(x)‖²_{L2(X)},    (16)

where according to (3)

    ‖[e(ĝ(x))]_l − φ_q(x)‖²_{L2(X)} = ‖[e(ĝ(x))]_l‖²_{L2(X)} − 2Re(⟨φ_q(x), [e(ĝ(x))]_l⟩_{L2(X)}) + ‖φ_q(x)‖²_{L2(X)},    (17)

where Re(·) denotes the real part. Substituting (17) into (16) and using the fact that [e(ĝ(x))]_l is independent of φ_q(x) yields

    p_J([e(ĝ(x))]_l|Hϕ) = arg min_{φ_q(x)∈Hϕ} {‖φ_q(x)‖²_{L2(X)} − 2Re(⟨φ_q(x), [e(ĝ(x))]_l⟩_{L2(X)})}.    (18)

Since ĝ(x) ∈ M, then according to Proposition 3 in Appendix C,

    ⟨φ_q(x), [e(ĝ(x))]_l⟩_{L2(X)} = ∫_Θ q(θ) [ξ(θ)]_l dθ,    (19)

where

    ξ(θ) ≜ g(θ) − g(θt).    (20)

Hence, the inner product ⟨φ_q(x), [e(ĝ(x))]_l⟩_{L2(X)} is ĝ(x)-independent. Since φ_q(x) is ĝ(x)-independent as well, the projections p_J([e(ĝ(x))]_l|Hϕ), l = 0, ..., L−1, are independent of ĝ(x). Thus, the vector p_J(e(ĝ(x))|Hϕ), defined in (14), is ĝ(x)-independent.
2) Let

    u(e(ĝ(x))|Hϕ) ≜ e(ĝ(x)) − p_J(e(ĝ(x))|Hϕ)    (21)

denote the projection error. According to the Hilbert projection theorem, stated in Appendix B, [u(e(ĝ(x))|Hϕ)]_l ⊥ Hϕ, ∀l = 0, ..., L−1. Thus, due to the fact that [p_J(e(ĝ(x))|Hϕ)]_l ∈ Hϕ, ∀l = 0, ..., L−1, it follows that r^H u(e(ĝ(x))|Hϕ) ⊥ r^H p_J(e(ĝ(x))|Hϕ), ∀r ∈ C^L. Hence, according to (21) and the Pythagorean theorem [31],

    ‖r^H e(ĝ(x))‖²_{L2(X)} = ‖r^H p_J(e(ĝ(x))|Hϕ)‖²_{L2(X)} + ‖r^H u(e(ĝ(x))|Hϕ)‖²_{L2(X)}, ∀r ∈ C^L.    (22)

Therefore, due to the fact that 0 ≤ ‖r^H u(e(ĝ(x))|Hϕ)‖²_{L2(X)} < ∞,

    ‖r^H e(ĝ(x))‖²_{L2(X)} ≥ ‖r^H p_J(e(ĝ(x))|Hϕ)‖²_{L2(X)}, ∀r ∈ C^L.    (23)

Hence, according to (2), ∀r ∈ C^L,

    r^H E_{x;θt}[e(ĝ(x)) e^H(ĝ(x))] r ≥ r^H E_{x;θt}[p_J(e(ĝ(x))|Hϕ) p_J^H(e(ĝ(x))|Hϕ)] r.    (24)

Thus, since E_{x;θt}[e(ĝ(x)) e^H(ĝ(x))] and E_{x;θt}[p_J(e(ĝ(x))|Hϕ) p_J^H(e(ĝ(x))|Hϕ)] are Hermitian matrices, it is implied by (5) and (24) that the semi-inequality in (15) holds.

In conclusion, Theorem 1 implies that the bound C_{Hϕ}, defined in (15), is estimator-independent and

    MSE(ĝ(x)) ⪰ C_{Hϕ}, ∀ĝ(x) ∈ M.    (25)

By modifying Hϕ ⊆ Hφ, a variety of bounds can be derived.

D. Integral form of the Barankin bound

In this subsection, it is shown that projection of each entry of the vector of estimation error on Hϕ = Hφ yields the integral form of the BB, also presented in [9], [17] and [27]. We note that this is an extension of the contributions in [28] and [29], in which it was shown that in the case of estimating a scalar function of the unknown parameters, the BB is the squared norm of the projection of the estimation error on a Hilbert subspace of L2, spanned by a countable set of LR functions. In Appendix D, it is shown that

    C_{Hφ} = ∫_Θ ∫_Θ q̃^H(θ) K(θ, θ′) q̃(θ′) dθ′ dθ,    (26)

where

    K(θ, θ′) = ⟨ν(x, θ), ν(x, θ′)⟩_{L2(X)} = E_{x;θt}[ν(x, θ) ν(x, θ′)]    (27)

is the autocorrelation kernel of the LR function and q̃ : Θ → C^L is the solution of an integral equation, given by

    ∫_Θ K(θ, θ′) q̃(θ′) dθ′ = ξ(θ).    (28)

A geometric interpretation of C_{Hφ} is depicted in Fig. 1 for the one-dimensional case, i.e. L = 1. The bound in (26) is an integral form of the BB. According to [17] and [27], C_{Hφ} = min_{ĝ(x)∈M} MSE(ĝ(x)), i.e. the BB is the MSE of the uniformly unbiased estimator with minimum MSE at θt. This estimator is termed the locally-best-uniformly-unbiased (LBU) estimator. Therefore, C_{Hφ} is the tightest lower bound on the MSE of uniformly unbiased estimators. Calculation of C_{Hφ} in (26) involves the solution of the integral equation in (28). In many cases this task is analytically impossible, and consequently C_{Hφ} is practically incomputable. In the following, a new class of bounds, from which computationally manageable bounds can be obtained, is derived.

III. A NEW CLASS OF NON-BAYESIAN LOWER BOUNDS FOR UNBIASED ESTIMATORS

In this section, a new class of non-Bayesian lower bounds on the MSE of uniformly unbiased estimators is derived by projecting each entry of e(ĝ(x)) on a closed subspace of Hφ, which contains linear transformations of elements in the domain of an integral transform of the LR function. Derivation of the proposed class is carried out via the following steps. First, a closed Hilbert subspace of Hφ is constructed. Second, the result of Theorem 1 is applied in order to derive the proposed class. The relation of the proposed class to the integral form of the BB in (26) is discussed as well.

A. Construction of a closed subspace of Hφ

In this subsection, a closed subspace of Hφ is constructed in the following manner. Let h : Λ × Θ → C^P, where Λ ⊆ R^M is a measurable space with finite Lebesgue measure. An integral transform of ν(x, θ) is given by

    η_h(x, τ) ≜ (Th(ν))(x, τ) = ∫_Θ h(τ, θ) ν(x, θ) dθ,    (29)

where τ ∈ Λ and h(τ, θ) is the kernel of Th. Hence, given h(·, ·), the following space is constructed:

    Hϕ^(h) ≜ { ϕ_{a,h}(x) ≜ ∫_Λ a^H(τ) η_h(x, τ) dτ : a(τ) ∈ L1^P(Λ) },    (30)

where a : Λ → C^P and L1^P(Λ) ≜ L1(Λ) × ··· × L1(Λ) (P times). A sufficient condition under which Hϕ^(h) ⊆ Hφ is given in the following theorem.

Theorem 2: If h(τ, θ) ∈ L1^P(Θ) for a.e. τ ∈ Λ and h(τ, θ) ∈ L1^P(Λ) for a.e. θ ∈ Θ, then Hϕ^(h) ⊆ Hφ.


Proof: According to (29) and (30),

    ϕ_{a,h}(x) = ∫_Λ a^H(τ) (∫_Θ h(τ, θ) ν(x, θ) dθ) dτ.    (31)

Since ν(x, θ) ∈ L1(Θ) for a.e. x ∈ X, h(τ, θ) ∈ L1^P(Θ) for a.e. τ ∈ Λ, and h(τ, θ) ∈ L1^P(Λ) for a.e. θ ∈ Θ, then by Theorem 4 in Appendix E it is implied that h(τ, θ) ν(x, θ) ∈ L1^P(Θ) for a.e. τ ∈ Λ and a.e. x ∈ X. Therefore, since a(τ) ∈ L1^P(Λ), it can be shown using the Tonelli theorem [31] that a^H(τ) h(τ, θ) ν(x, θ) ∈ L1(Λ × Θ) for a.e. x ∈ X. Hence, by the Fubini theorem [31] it is implied that

    ϕ_{a,h}(x) = ∫_Θ (∫_Λ a^H(τ) h(τ, θ) dτ) ν(x, θ) dθ = ∫_Θ q′(θ) ν(x, θ) dθ    (32)

for a.e. x ∈ X, where q′(θ) ≜ ∫_Λ a^H(τ) h(τ, θ) dτ. Since h(τ, θ) ∈ L1^P(Θ) for a.e. τ ∈ Λ and a(τ) ∈ L1^P(Λ), it can be shown using the Tonelli theorem [31] that q′(θ) ∈ L1(Θ). Therefore, according to (8) and (32), ϕ_{a,h}(x) ∈ Hφ.

Hence, assuming that Theorem 2 holds and that Hϕ^(h) is complete, i.e. any Cauchy sequence in Hϕ^(h) converges to a limit in Hϕ^(h), Hϕ^(h) is a closed subspace of Hφ.

B. Derivation of the proposed class

In this subsection, the proposed class of bounds is derived. According to (15), it is implied that the proposed class is given by

    C_{Hϕ^(h)} ≜ E_{x;θt}[p_J(e(ĝ(x))|Hϕ^(h)) p_J^H(e(ĝ(x))|Hϕ^(h))].    (33)

Under the assumptions of Theorem 2, it is shown in Appendix F that

    C_{Hϕ^(h)} = ∫_Λ ∫_Λ Ã^H(τ) K_h(τ, τ′) Ã(τ′) dτ′ dτ,    (34)

where

    K_h(τ, τ′) = ∫_Θ ∫_Θ h(τ, θ) K(θ, θ′) h^H(τ′, θ′) dθ′ dθ    (35)

and Ã : Λ → C^{P×L} is the solution of the following integral equation:

    ∫_Λ K_h(τ, τ′) Ã(τ′) dτ′ = ∫_Θ h(τ, θ) ξ^T(θ) dθ.    (36)


A geometric interpretation of C_{Hϕ^(h)} is depicted in Fig. 1 for the one-dimensional case, i.e. L = 1. The bound in (34) constitutes a new family of lower bounds on the MSE of any uniformly unbiased estimator. By modifying h(·, ·), the subspace Hϕ^(h) is modified and a variety of bounds can be obtained from the proposed class. Let Hϕ and Hϕ′ denote closed subspaces of Hφ. According to Theorem 7 in Appendix L, if Hϕ ⊃ Hϕ′ then C_{Hϕ} ⪰ C_{Hϕ′}. Therefore, an order relation between any two bounds, C_{Hϕ^(h1)} and C_{Hϕ^(h2)}, can be obtained by comparing the Hilbert subspaces Hϕ^(h1) and Hϕ^(h2). Moreover, since Hφ ⊇ Hϕ^(h), it is concluded that C_{Hφ} ⪰ C_{Hϕ^(h)}. In Appendix G, it is shown that for any invertible integral transform Th, which preserves the information in ν(x, θ), C_{Hφ} = C_{Hϕ^(h)}. Non-invertible integral transforms may produce less tight bounds than C_{Hφ}. However, as discussed in Section II, in many cases C_{Hφ} is practically incomputable. Therefore, in order to obtain computationally manageable and tight bounds, we look for kernels which yield non-invertible integral transforms, such that the integral equation in (36) is solvable and the significant information in ν(x, θ) is compressed into few elements in the domain of the integral transform. In the following section, it is shown that some well known bounds on the MSE of unbiased estimators can be derived from the proposed class by specific selections of h(·, ·).
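The compression rationale above can be illustrated numerically. The sketch below is our own illustration, not material from the paper: the test function and grid sizes are assumed. It shows the property that motivates a Fourier kernel, namely that when a function over the parameter space has a concentrated spectrum, a handful of frequency-domain samples retain nearly all of its energy, so sampling the transform domain is far more economical than sampling the parameter space itself.

```python
import numpy as np

# A smooth, concentrated-spectrum stand-in for a function over the parameter
# space (our own choice; the paper's LR function depends on the observation
# model and is not reproduced here).
theta = np.linspace(0.0, 1.0, 1024, endpoint=False)
nu = np.exp(-0.5 * ((theta - 0.5) / 0.05) ** 2)

spectrum = np.fft.fft(nu)            # Fourier-domain representation
energy = np.abs(spectrum) ** 2

# Fraction of total energy captured by the 16 strongest frequency components.
ratio = np.sort(energy)[::-1][:16].sum() / energy.sum()
print(ratio)  # close to 1: a few components carry nearly all the energy
```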

IV. DERIVATION OF EXISTING BOUNDS VIA CERTAIN SELECTIONS OF THE KERNEL OF THE INTEGRAL TRANSFORM

The integral transform generalizes the derivative and sampling operators used for computation of some well known lower bounds on the MSE of unbiased estimators. Hence, in this section it is shown that some well known bounds can be derived from the proposed class in (34) by specific choices of the kernel h(·, ·). We begin with the derivation of a new subclass of bounds from the proposed class in (34), using a family of kernel functions with a specific form. It is then shown that some well known bounds are the limits of convergent sequences of bounds obtained from the proposed subclass.

A. Subclass of lower bounds using structured kernel functions

In this subsection, a new subclass of lower bounds on the MSE of uniformly unbiased estimators is derived from the proposed class in (34) for the case where h(·, ·) is of the form

    h(τ, θ) = v(θ), ∀τ ∈ Λ,    (37)

where v(θ) ∈ L1^P(Θ). Hence, according to (34), (35) and (37),

    C_v ≜ C_{Hϕ^(v)} = (∫_Λ Ã^H(τ) dτ) K_v (∫_Λ Ã(τ′) dτ′),    (38)


Fig. 1. Geometric interpretation of the integral form of the BB in (26) and of the proposed class of bounds in (34), for the one-dimensional case, i.e. L = 1. The spaces L2(X), Hφ and Hϕ^(h) are illustrated by the spheroid, the plane, and the white axis, respectively. The terms p_J(e(ĝ(x))|Hφ) and p_J(e(ĝ(x))|Hϕ^(h)) (marked by thick dashed arrows) denote the projections of e(ĝ(x)) (marked by a thick solid arrow) on Hφ and Hϕ^(h) ⊂ Hφ, respectively. The terms C_{Hφ}^{1/2} and C_{Hϕ^(h)}^{1/2} are the norms of p_J(e(ĝ(x))|Hφ) and p_J(e(ĝ(x))|Hϕ^(h)) in L2(X), respectively.

where

    K_v ≜ ∫_Θ ∫_Θ v(θ) K(θ, θ′) v^H(θ′) dθ′ dθ.    (39)

Using (35), (36) and (37), it is implied that

    ∫_Λ Ã(τ) dτ = K_v^{-1} Γ_v^H,    (40)

where

    Γ_v ≜ ∫_Θ ξ(θ) v^H(θ) dθ,    (41)

and it is assumed that K_v is nonsingular. Therefore, substituting (40) into (38) yields the following subclass:

    C_v = Γ_v K_v^{-1} Γ_v^H.    (42)

By modifying v(·), a variety of bounds can be derived from (42).
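As a concrete illustration of the subclass in (42), the following sketch (our own, not from the paper) evaluates C_v = Γ_v K_v^{-1} Γ_v^H with a delta-like test-point kernel, for which K_v reduces to the matrix of LR autocorrelations K(θ_i, θ_j) and Γ_v to the vector of ξ(θ_i), i.e. a McAulay-Seidman-type bound. The observation model, x ~ N(θ, σ²) with g(θ) = θ, and the helper name ms_bound are our assumptions; for this model K(θ, θ′) = exp((θ − θ_t)(θ′ − θ_t)/σ²) in closed form.

```python
import numpy as np

# Test-point evaluation of C_v = Γ_v K_v^{-1} Γ_v^H from (42) under an assumed
# scalar Gaussian model: x ~ N(θ, σ²), g(θ) = θ, so ξ(θ_i) = θ_i − θ_t and
# K(θ_i, θ_j) = exp((θ_i − θ_t)(θ_j − θ_t)/σ²).
def ms_bound(theta_t, test_points, sigma2):
    d = np.asarray(test_points, dtype=float) - theta_t
    K = np.exp(np.outer(d, d) / sigma2)       # K_v: LR autocorrelation matrix
    gamma = d                                 # Γ_v: ξ(θ_i) for g(θ) = θ
    return gamma @ np.linalg.solve(K, gamma)  # C_v = Γ_v K_v^{-1} Γ_v^H

sigma2 = 1.0
b_crb = ms_bound(0.0, [1e-3, -1e-3], sigma2)  # test points near θ_t: bound ≈ CRB = σ²
b_hcr = ms_bound(0.0, [0.0, 2.0], sigma2)     # {θ_t, θ_t + 2}: reproduces the HCR form 4/(e⁴ − 1)
print(b_crb, b_hcr)
```

With test points clustered at θ_t the bound approaches the CRB, while including θ_t together with a distant test point recovers Hammersley-Chapman-Robbins-type behavior; this mirrors the limit arguments developed in the remainder of Section IV.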


B. Derivation of existing bounds from the subclass C_v

In this subsection, it is shown that some well known bounds on the MSE of unbiased estimators are the limits of convergent sequences of bounds obtained from the proposed subclass in (42). In more detail, derivation of each bound is carried out via the following procedure:
1) A sequence of kernel functions {v_k(·)} is constructed using the following sequence of auxiliary “test-functions”. Let ψ : R^M → R denote an infinitely differentiable and compactly supported “test-function”, such that ∫_{R^M} ψ(y) dy = 1, y ∈ R^M. A sequence of “test-functions” is given by

    {ψ_k(y)} ≜ {k^M ψ(ky)}, k = 1, 2, ....    (43)

We note that lim_{k→∞} ψ_k(y) = δ(y), where δ(·) is the Dirac delta function. For example, we can choose

    ψ(y) = ∏_{m=0}^{M−1} ψ′(y_m),    (44)

where y_m ∈ R, m = 0, ..., M−1, denote the entries of y, such that

    ψ′(y) = ψ′′(y) / ∫_R ψ′′(y) dy    (45)

and

    ψ′′(y) ≜ { exp(−1/(1 − y²)),  |y| < 1
             { 0,                 otherwise.    (46)
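A short numerical sketch (ours, for M = 1, not from the paper) of the test-function sequence in (43)-(46): each ψ_k integrates to one and acts as an approximate identity, so convolving it with a smooth function recovers the function's value as k → ∞. This is the mechanism behind the limits taken in the derivations that follow.

```python
import numpy as np

# The bump function ψ'' of (46) for M = 1, vectorized over a grid.
def psi2(y):
    out = np.zeros_like(y)
    inside = np.abs(y) < 1
    out[inside] = np.exp(-1.0 / (1.0 - y[inside] ** 2))
    return out

# Normalize as in (45) so that ψ integrates to one (simple Riemann sum;
# the integrand vanishes smoothly at the endpoints).
y = np.linspace(-1.0, 1.0, 200001)
norm = np.sum(psi2(y)) * (y[1] - y[0])
psi = lambda t: psi2(t) / norm

# ψ_k(y) = k ψ(ky) of (43): the support shrinks to [-1/k, 1/k], the mass stays
# one, and (ψ_k * f)(0) → f(0) for smooth f; here f = cos with f(0) = 1.
for k in [1, 10, 100]:
    t = np.linspace(-1.0 / k, 1.0 / k, 20001)
    dt = t[1] - t[0]
    psik = k * psi(k * t)
    mass = np.sum(psik) * dt
    val = np.sum(psik * np.cos(t)) * dt
    print(k, mass, val)  # mass ≈ 1; val approaches f(0) = 1 as k grows
```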

For each bound, {v_k(·)} is constructed using {ψ_k(·)} in a different manner, as will be detailed in the sequel.
2) Using {v_k(·)}, a sequence of bounds {C_{v_k}} is derived from (42). The desired bound is given by

    C ≜ lim_{k→∞} C_{v_k} = (lim_{k→∞} Γ_{v_k}) (lim_{k→∞} K_{v_k})^{-1} (lim_{k→∞} Γ_{v_k}^H),    (47)

where it is assumed that {Γ_{v_k}} and {K_{v_k}} converge as k → ∞, and the matrix lim_{k→∞} K_{v_k} is nonsingular. The second equality in (47) can be verified using basic properties of limits of convergent sequences. The term lim_{k→∞} K_{v_k} is calculated using the following identity, which is proved in Appendix H:

    lim_{k→∞} K_{v_k} = E_{x;θt}[γ(x) γ^H(x)],    (48)

where

    γ(x) ≜ lim_{k→∞} η_{v_k}(x)    (49)

and

    η_{v_k}(x) ≜ ∫_Θ v_k(θ) ν(x, θ) dθ.    (50)

We note that the limit notation in (49) means that η_{v_k}(x) → γ(x) for a.e. x ∈ X, as k → ∞.
By applying the approach described above, we show that some well known bounds can be obtained from (42). We note that in cases where the Qth-order derivative operator is used, the following regularity conditions on the function to be estimated and on the likelihood function are assumed:
1) The Qth-order derivative of each entry of g(θ) w.r.t. each entry of θ, denoted by θ_i, i = 1, ..., M, exists and defines a continuous function on Θ.
2) The Qth-order derivative of f(x; θ) w.r.t. each entry of θ exists and defines a continuous function on Θ.
3) For all θ ∈ Θ and s ≤ Q, the integrals

    ∫_X (∂^s f(x; θ) / (∂θ_{i1} ∂θ_{i2} ··· ∂θ_{is}))² / f(x; θt) dµ(x),  i_s = 1, ..., M,

converge and define continuous functions on Θ.

1) The Cramér-Rao bound [1], [2]: Under the regularity conditions stated above, for Q = 1, the CRB is obtained via the following steps:
a) Construction of {v_{CRB,k}(·)}:
The kth member of {v_{CRB,k}(·)} is given by

    v_{CRB,k}(θ) = −ψ_k^(1)(θt − θ),    (51)

where

    ψ_k^(Q)(θ) ≜ (∂^Q ψ_k(θ) / ∂θ^⊗Q)^T, Q ∈ N,    (52)

and ∂^Q/∂θ^⊗Q denotes the row vector of derivatives ∂^Q/(∂θ_{i1} ··· ∂θ_{iQ}), i_q = 1, ..., M.
b) Calculating the limit of C_{v_{CRB,k}} in (47):

i) Calculation of lim_{k→∞} Γ_{v_{CRB,k}}:
According to (41), (51),

    Γ_{v_{CRB,k}} ≜ −∫_Θ ξ(θ) ψ_k^(1)T(θt − θ) dθ.    (53)

Hence, assuming that supp{ψ(θt − θ)} ⊂ Θ, where supp{·} denotes the support set, applying integration by parts on the r.h.s. of (53) yields

    Γ_{v_{CRB,k}} = −∫_{∂Θ} ξ(θ) ψ_k(θt − θ) (∂θ/∂λ)^T dS + ∫_Θ ψ_k(θt − θ) (∂ξ(θ)/∂θ) dθ
                 = ∫_Θ ψ_k(θt − θ) (∂ξ(θ)/∂θ) dθ
                 = (ψ_k(θ) ∗ ∂ξ(θ)/∂θ)(θt).    (54)

The first equality stems from formula (1.2.5) in [33], where ∂Θ denotes the boundary set of Θ, ∂θ/∂λ ≜ [∂θ_1/∂λ, ..., ∂θ_M/∂λ]^T, ∂θ_m/∂λ is the outer normal derivative of θ_m on the surface ∂Θ, and dS is an area element of ∂Θ. The second equality stems from the fact that supp{ψ(θt − θ)} ⊂ Θ, which implies that ψ_k(θt − θ) = 0 ∀θ ∈ ∂Θ and ∀k ≥ 1. In the third equality, (· ∗ ·)(·) denotes an evaluation point of the convolution integral. Therefore, according to Theorem 6 in Appendix J, regarding the limit of the convolution integral, one obtains

    lim_{k→∞} Γ_{v_{CRB,k}} = ∂ξ(θ)/∂θ |_{θ=θt} = ∂g(θ)/∂θ |_{θ=θt},    (55)

where the last equality in (55) stems from the definition of ξ(θ) in (20).
ii) Calculation of lim_{k→∞} K_{v_{CRB,k}}:

According to (50) and (51) Z

(1)

−ν (x, θ) ψ k (θ t − θ)dθ.

η vCRB,k (x) =

(56)

Θ

Hence, assuming that supp {ψ (θ t − θ)} ⊂ Θ, then applying integration by parts on the r.h.s. of (56) yields   Z ∂ν (x, θ) T η vCRB,k (x) = ψk (θ t − θ) dθ = ∂θ

 ψk (θ) ∗

∂ν (x, θ) ∂θ

T ! (θ t ) (57)

Θ

According to Theorem 6 in Appendix J regarding the limit of the convolution integral, one obtains γ CRB (x) , lim η vCRB,k (x) = k→∞

!T ∂ν (x, θ) = ∂θ θ=θt

!T ∂logf (x; θ) , (58) ∂θ θ =θ t

where the last equality in (58) stems from the definition of ν (x, θ) in (7). Thus, substituting (58) in (48) implies that   IFIM , lim KvCRB,k = Ex,θt γ CRB (x) γ TCRB (x) , k→∞

January 14, 2010

(59)

DRAFT

where $\mathbf{I}_{\mathrm{FIM}}$ is the Fisher information matrix.
iii) Calculating $\lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{CRB},k}}$: Substitution of (55) and (59) into (47) yields
$$\mathbf{C}_{\mathrm{CRB}} \triangleq \lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{CRB},k}} = \mathbf{Z}^{(1)} (\boldsymbol{\theta}_t) \, \mathbf{I}_{\mathrm{FIM}}^{-1} \, \mathbf{Z}^{(1)T} (\boldsymbol{\theta}_t), \tag{60}$$
where
$$\mathbf{Z}^{(Q)} (\boldsymbol{\theta}_t) \triangleq \left. \left[ \frac{\partial \mathbf{g} (\boldsymbol{\theta})}{\partial \boldsymbol{\theta}}, \ldots, \frac{\partial^{Q} \mathbf{g} (\boldsymbol{\theta})}{\partial \boldsymbol{\theta}^{\otimes Q}} \right] \right|_{\boldsymbol{\theta} = \boldsymbol{\theta}_t}. \tag{61}$$
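The identity $\mathbf{I}_{\mathrm{FIM}} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} [\boldsymbol{\gamma}_{\mathrm{CRB}} \boldsymbol{\gamma}_{\mathrm{CRB}}^{T}]$ in (59) can be checked numerically. The following sketch (our own toy example, not part of the paper's derivation) estimates the expectation by Monte Carlo for a scalar Gaussian mean model, whose score function is available in closed form:

```python
import numpy as np

# Toy Monte Carlo check (ours) of (59): I_FIM = E[gamma_CRB gamma_CRB^T],
# for a scalar model x ~ N(theta, sigma^2) whose score is (x - theta)/sigma^2.
rng = np.random.default_rng(0)
theta_t, sigma, n_samples = 1.0, 2.0, 200_000

x = rng.normal(theta_t, sigma, size=n_samples)
score = (x - theta_t) / sigma**2          # gamma_CRB(x) = d log f / d theta
fim_mc = np.mean(score**2)                # Monte Carlo estimate of E[gamma^2]
fim_analytic = 1.0 / sigma**2             # known FIM of the Gaussian mean

assert abs(fim_mc - fim_analytic) < 1e-2
```

For this model the sample-mean estimator attains the resulting CRB, $\sigma^2$, exactly.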

2) The Bhattacharyya bound [3]: The $Q$th-order BHB is obtained via the following steps.
a) Construction of $\{ \mathbf{v}_{\mathrm{BHB},k}^{(Q)} (\cdot) \}$: The $k$th member of $\{ \mathbf{v}_{\mathrm{BHB},k}^{(Q)} (\cdot) \}$ is given by
$$\mathbf{v}_{\mathrm{BHB},k}^{(Q)} (\boldsymbol{\theta}) = \left[ -\boldsymbol{\psi}_k^{(1)T} (\boldsymbol{\theta}_t - \boldsymbol{\theta}), \ldots, (-1)^{Q} \boldsymbol{\psi}_k^{(Q)T} (\boldsymbol{\theta}_t - \boldsymbol{\theta}) \right]^{T}, \tag{62}$$
where it is assumed that $\mathrm{supp} \{ \psi (\boldsymbol{\theta}_t - \boldsymbol{\theta}) \} \subset \Theta$.
b) Calculating the limit of $\mathbf{C}_{\mathbf{v}_{\mathrm{BHB},k}^{(Q)}}$ in (47):
i) Calculation of $\lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{BHB},k}^{(Q)}}$: Using the same techniques described in (53)-(55), it can be shown that
$$\lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{BHB},k}^{(Q)}} = \mathbf{Z}^{(Q)} (\boldsymbol{\theta}_t). \tag{63}$$
ii) Calculation of $\lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{BHB},k}^{(Q)}}$: Using the same techniques described in (56)-(59), it can be shown that
$$\mathbf{K}_{\mathrm{BHB}} \triangleq \lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{BHB},k}^{(Q)}} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} \left[ \boldsymbol{\gamma}_{\mathrm{BHB}} (\mathbf{x}) \, \boldsymbol{\gamma}_{\mathrm{BHB}}^{T} (\mathbf{x}) \right], \tag{64}$$
where
$$\boldsymbol{\gamma}_{\mathrm{BHB}} (\mathbf{x}) \triangleq \left. \left[ \frac{\partial \nu (\mathbf{x}, \boldsymbol{\theta})}{\partial \boldsymbol{\theta}}, \ldots, \frac{\partial^{Q} \nu (\mathbf{x}, \boldsymbol{\theta})}{\partial \boldsymbol{\theta}^{\otimes Q}} \right]^{T} \right|_{\boldsymbol{\theta} = \boldsymbol{\theta}_t}. \tag{65}$$
iii) Calculating $\lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{BHB},k}^{(Q)}}$: Substitution of (63) and (64) into (47) yields
$$\mathbf{C}_{\mathrm{BHB}}^{(Q)} \triangleq \lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{BHB},k}^{(Q)}} = \mathbf{Z}^{(Q)} (\boldsymbol{\theta}_t) \, \mathbf{K}_{\mathrm{BHB}}^{-1} \, \mathbf{Z}^{(Q)T} (\boldsymbol{\theta}_t). \tag{66}$$
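As a concrete instance of (66) (a classical textbook case, not taken from the paper), the second-order Bhattacharyya bound for estimating $g(\theta) = \theta^2$ from one sample $x \sim \mathcal{N}(\theta, 1)$ can be assembled by hand, since both $\mathbf{Z}^{(2)}$ and $\mathbf{K}_{\mathrm{BHB}}$ are available in closed form:

```python
import numpy as np

# Classic textbook case (not from the paper): 2nd-order Bhattacharyya bound
# for g(theta) = theta^2 from one sample x ~ N(theta, 1). Here
# gamma_BHB = [(x - th), (x - th)^2 - 1]^T at th = theta_t, so K_BHB is diagonal.
theta_t = 1.3
Z2 = np.array([[2.0 * theta_t, 2.0]])        # Z^(2) = [dg/dth, d^2 g/dth^2]
K_BHB = np.diag([1.0, 2.0])                  # E[(x-th)^2] = 1, E[((x-th)^2-1)^2] = 2
C_BHB = Z2 @ np.linalg.solve(K_BHB, Z2.T)    # equals 4*theta_t^2 + 2

assert abs(C_BHB[0, 0] - (4 * theta_t**2 + 2)) < 1e-9
```

The second derivative term adds the constant 2 on top of the CRB value $4\theta_t^2$, which is the well-known gain of the second-order bound for this problem.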

3) The McAulay-Seidman bound [6]: The $N$th-order MS bound is obtained via the following steps.
a) Construction of $\{ \mathbf{v}_{\mathrm{MS},k}^{(N)} (\cdot) \}$: The $k$th member of $\{ \mathbf{v}_{\mathrm{MS},k}^{(N)} (\cdot) \}$ is given by
$$\mathbf{v}_{\mathrm{MS},k}^{(N)} (\boldsymbol{\theta}) = \left[ \psi_k (\boldsymbol{\theta}_0 - \boldsymbol{\theta}), \ldots, \psi_k (\boldsymbol{\theta}_{N-1} - \boldsymbol{\theta}) \right]^{T}, \tag{67}$$
where $\boldsymbol{\theta}_n \in \Theta$, $n = 0, \ldots, N - 1$, denote test-points.
b) Calculating the limit of $\mathbf{C}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}}$ in (47):
i) Calculation of $\lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}}$: According to (41) and (67),
$$\boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}} = \int_{\Theta} \boldsymbol{\xi} (\boldsymbol{\theta}) \left[ \psi_k (\boldsymbol{\theta}_0 - \boldsymbol{\theta}), \ldots, \psi_k (\boldsymbol{\theta}_{N-1} - \boldsymbol{\theta}) \right] d\boldsymbol{\theta} = \left[ (\psi_k (\boldsymbol{\theta}) \ast \boldsymbol{\xi} (\boldsymbol{\theta})) (\boldsymbol{\theta}_0), \ldots, (\psi_k (\boldsymbol{\theta}) \ast \boldsymbol{\xi} (\boldsymbol{\theta})) (\boldsymbol{\theta}_{N-1}) \right]. \tag{68}$$
Therefore, according to Theorem 6 in Appendix J, regarding the limit of the convolution integral, one obtains
$$\boldsymbol{\Phi} \triangleq \lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}} = \left[ \boldsymbol{\xi} (\boldsymbol{\theta}_0), \ldots, \boldsymbol{\xi} (\boldsymbol{\theta}_{N-1}) \right]. \tag{69}$$
ii) Calculation of $\lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}}$: According to (50) and (67),
$$\boldsymbol{\eta}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}} (\mathbf{x}) = \int_{\Theta} \nu (\mathbf{x}, \boldsymbol{\theta}) \left[ \psi_k (\boldsymbol{\theta}_0 - \boldsymbol{\theta}), \ldots, \psi_k (\boldsymbol{\theta}_{N-1} - \boldsymbol{\theta}) \right]^{T} d\boldsymbol{\theta} = \left[ (\psi_k (\boldsymbol{\theta}) \ast \nu (\mathbf{x}, \boldsymbol{\theta})) (\boldsymbol{\theta}_0), \ldots, (\psi_k (\boldsymbol{\theta}) \ast \nu (\mathbf{x}, \boldsymbol{\theta})) (\boldsymbol{\theta}_{N-1}) \right]^{T}. \tag{70}$$
Therefore, according to Theorem 6 in Appendix J, regarding the limit of the convolution integral, one obtains
$$\boldsymbol{\gamma}_{\mathrm{MS}} (\mathbf{x}) \triangleq \lim_{k \to \infty} \boldsymbol{\eta}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}} (\mathbf{x}) = \left[ \nu (\mathbf{x}, \boldsymbol{\theta}_0), \ldots, \nu (\mathbf{x}, \boldsymbol{\theta}_{N-1}) \right]^{T}, \tag{71}$$
where the last equality in (71) stems from the definition of $\nu (\mathbf{x}, \boldsymbol{\theta})$ in (7). Thus, substituting (71) into (48) implies that
$$\mathbf{K}_{\mathrm{MS}} \triangleq \lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} \left[ \boldsymbol{\gamma}_{\mathrm{MS}} (\mathbf{x}) \, \boldsymbol{\gamma}_{\mathrm{MS}}^{T} (\mathbf{x}) \right]. \tag{72}$$
iii) Calculating $\lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}}$: Substitution of (69) and (72) into (47) yields
$$\mathbf{C}_{\mathrm{MS}}^{(N)} \triangleq \lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{MS},k}^{(N)}} = \boldsymbol{\Phi} \, \mathbf{K}_{\mathrm{MS}}^{-1} \, \boldsymbol{\Phi}^{T}. \tag{73}$$
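To make the recipe concrete, here is a small numerical sketch of the MS bound $\boldsymbol{\Phi} \mathbf{K}_{\mathrm{MS}}^{-1} \boldsymbol{\Phi}^{T}$ for a toy model of our own (not the paper's example): a single sample $x \sim \mathcal{N}(\theta, \sigma^2)$, for which the kernel has the closed form $[\mathbf{K}_{\mathrm{MS}}]_{m,n} = \exp\{(\theta_m - \theta_t)(\theta_n - \theta_t)/\sigma^2\}$:

```python
import numpy as np

# Toy sketch (ours) of the McAulay-Seidman bound C_MS = Phi K_MS^{-1} Phi^T
# for a single sample x ~ N(theta, sigma^2); test points chosen arbitrarily.
theta_t, sigma = 0.0, 1.0
test_points = np.array([0.5, -0.5, 1.0])

d = test_points - theta_t                      # xi(theta_n) = theta_n - theta_t
Phi = d[None, :]                               # 1 x N matrix of (69)
K_MS = np.exp(np.outer(d, d) / sigma**2)       # E[nu(x, th_m) nu(x, th_n)]
C_MS = (Phi @ np.linalg.solve(K_MS, Phi.T))[0, 0]

# The bound cannot exceed the MSE of the efficient unbiased estimator (sigma^2):
assert 0.0 < C_MS <= sigma**2 + 1e-9
```

Moving the test points closer to $\theta_t$ drives this value toward the CRB of the model, $\sigma^2$.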

4) The Hammersley-Chapman-Robbins bound [5]: The $N$th-order HCR bound is obtained via the following steps.
a) Construction of $\{ \mathbf{v}_{\mathrm{HCR},k}^{(N)} (\cdot) \}$: The $k$th member of $\{ \mathbf{v}_{\mathrm{HCR},k}^{(N)} (\cdot) \}$ is given by
$$\mathbf{v}_{\mathrm{HCR},k}^{(N)} (\boldsymbol{\theta}) = \left[ \psi_k (\boldsymbol{\theta}_t - \boldsymbol{\theta}), \psi_k (\boldsymbol{\theta}_0 - \boldsymbol{\theta}), \ldots, \psi_k (\boldsymbol{\theta}_{N-1} - \boldsymbol{\theta}) \right]^{T}. \tag{74}$$
b) Calculating the limit of $\mathbf{C}_{\mathbf{v}_{\mathrm{HCR},k}^{(N)}}$ in (47):
i) Calculation of $\lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{HCR},k}^{(N)}}$: Using the same techniques described in (68) and (69), it can be shown that
$$\mathbf{V}_{\mathrm{HCR}} \triangleq \lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{HCR},k}^{(N)}} = \left[ \boldsymbol{\xi} (\boldsymbol{\theta}_t) \;\; \boldsymbol{\Phi} \right] = \left[ \mathbf{0} \;\; \boldsymbol{\Phi} \right]. \tag{75}$$
The last equality in (75) stems from the fact that, by (20), $\boldsymbol{\xi} (\boldsymbol{\theta}_t) = \mathbf{0}$, where $\mathbf{0}$ denotes an $L \times 1$ vector of zeros.
ii) Calculation of $\lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{HCR},k}^{(N)}}$: Using the same techniques described in (70)-(72), it can be shown that
$$\mathbf{K}_{\mathrm{HCR}} \triangleq \lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{HCR},k}^{(N)}} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} \left[ \boldsymbol{\gamma}_{\mathrm{HCR}} (\mathbf{x}) \, \boldsymbol{\gamma}_{\mathrm{HCR}}^{T} (\mathbf{x}) \right], \tag{76}$$
where
$$\boldsymbol{\gamma}_{\mathrm{HCR}} (\mathbf{x}) \triangleq \left[ \nu (\mathbf{x}, \boldsymbol{\theta}_t), \nu (\mathbf{x}, \boldsymbol{\theta}_0), \ldots, \nu (\mathbf{x}, \boldsymbol{\theta}_{N-1}) \right]^{T} = \left[ 1 \;\; \boldsymbol{\gamma}_{\mathrm{MS}}^{T} (\mathbf{x}) \right]^{T}. \tag{77}$$
The last equality in (77) stems from the fact that, by (7), $\nu (\mathbf{x}, \boldsymbol{\theta}_t) = 1$, and from the definition of $\boldsymbol{\gamma}_{\mathrm{MS}} (\mathbf{x})$ in (71).
iii) Calculating $\lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{HCR},k}^{(N)}}$: Substitution of (75) and (76) into (47) yields
$$\mathbf{C}_{\mathrm{HCR}}^{(N)} \triangleq \lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{HCR},k}^{(N)}} = \left[ \mathbf{0} \;\; \boldsymbol{\Phi} \right] \begin{bmatrix} 1 & \mathbf{1}^{T} \\ \mathbf{1} & \mathbf{K}_{\mathrm{MS}} \end{bmatrix}^{-1} \begin{bmatrix} \mathbf{0}^{T} \\ \boldsymbol{\Phi}^{T} \end{bmatrix} = \boldsymbol{\Phi} \left( \mathbf{K}_{\mathrm{MS}} - \mathbf{1} \mathbf{1}^{T} \right)^{-1} \boldsymbol{\Phi}^{T}, \tag{78}$$
where $\mathbf{1}$ denotes an $N \times 1$ vector of ones. The block-matrix inversion in (78) was performed using formula (7.7.5) in [35].
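The block-matrix inversion step in (78) can be confirmed numerically; the sketch below (random toy matrices of our own) checks that picking the lower-right Schur complement of the partitioned matrix reproduces $\boldsymbol{\Phi} (\mathbf{K}_{\mathrm{MS}} - \mathbf{1}\mathbf{1}^{T})^{-1} \boldsymbol{\Phi}^{T}$:

```python
import numpy as np

# Numeric check (random toy data, ours) of the block inversion in (78):
# [0 Phi] [[1, 1^T],[1, K_MS]]^{-1} [0; Phi^T] = Phi (K_MS - 1 1^T)^{-1} Phi^T.
rng = np.random.default_rng(1)
N, L = 4, 2
A = rng.normal(size=(N, N))
K_MS = A @ A.T + N * np.eye(N) + 1.0      # ensures K_MS - 1 1^T is pos. def.
Phi = rng.normal(size=(L, N))
ones = np.ones((N, 1))

K_HCR = np.block([[np.ones((1, 1)), ones.T],
                  [ones,            K_MS]])
lhs = np.hstack([np.zeros((L, 1)), Phi]) @ np.linalg.inv(K_HCR) \
      @ np.vstack([np.zeros((1, L)), Phi.T])
rhs = Phi @ np.linalg.inv(K_MS - ones @ ones.T) @ Phi.T

assert np.allclose(lhs, rhs)
```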

5) The McAulay-Hofstetter bound [7]: The $N$th-order MH bound is obtained via the following steps:
a) Construction of $\{ \mathbf{v}_{\mathrm{MH},k}^{(N)} (\cdot) \}$: The $k$th member of $\{ \mathbf{v}_{\mathrm{MH},k}^{(N)} (\cdot) \}$ is given by
$$\mathbf{v}_{\mathrm{MH},k}^{(N)} (\boldsymbol{\theta}) = \left[ \mathbf{v}_{\mathrm{CRB},k}^{T} (\boldsymbol{\theta}), \mathbf{v}_{\mathrm{MS},k}^{(N)T} (\boldsymbol{\theta}) \right]^{T}, \tag{79}$$
where it is assumed that $\mathrm{supp} \{ \psi (\boldsymbol{\theta}_t - \boldsymbol{\theta}) \} \subset \Theta$.
b) Calculating the limit of $\mathbf{C}_{\mathbf{v}_{\mathrm{MH},k}^{(N)}}$ in (47):
i) Calculation of $\lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{MH},k}^{(N)}}$: Using the same techniques described in (53)-(55) and in (68), (69), it can be shown that
$$\mathbf{V}_{\mathrm{MH}} \triangleq \lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{MH},k}^{(N)}} = \left[ \mathbf{Z}^{(1)} (\boldsymbol{\theta}_t) \;\; \boldsymbol{\Phi} \right]. \tag{80}$$
ii) Calculation of $\lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{MH},k}^{(N)}}$: Using the same techniques described in (56)-(59) and in (70)-(72), it can be shown that
$$\mathbf{K}_{\mathrm{MH}} \triangleq \lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{MH},k}^{(N)}} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} \left[ \boldsymbol{\gamma}_{\mathrm{MH}} (\mathbf{x}) \, \boldsymbol{\gamma}_{\mathrm{MH}}^{T} (\mathbf{x}) \right], \tag{81}$$
where
$$\boldsymbol{\gamma}_{\mathrm{MH}} (\mathbf{x}) \triangleq \left[ \left. \frac{\partial \log f (\mathbf{x}; \boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \right|_{\boldsymbol{\theta} = \boldsymbol{\theta}_t} \;\; \boldsymbol{\gamma}_{\mathrm{MS}}^{T} (\mathbf{x}) \right]^{T}. \tag{82}$$
iii) Calculating $\lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{MH},k}^{(N)}}$: Substitution of (80) and (81) into (47) yields
$$\mathbf{C}_{\mathrm{MH}}^{(N)} \triangleq \lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{MH},k}^{(N)}} = \mathbf{V}_{\mathrm{MH}} \, \mathbf{K}_{\mathrm{MH}}^{-1} \, \mathbf{V}_{\mathrm{MH}}^{T}. \tag{83}$$

6) The Abel bound [8]: The $(Q, N)$th-order Abel bound is obtained via the following steps:
a) Construction of $\{ \mathbf{v}_{\mathrm{Abel},k}^{(Q,N)} (\cdot) \}$: The $k$th member of $\{ \mathbf{v}_{\mathrm{Abel},k}^{(Q,N)} (\cdot) \}$ is given by
$$\mathbf{v}_{\mathrm{Abel},k}^{(Q,N)} (\boldsymbol{\theta}) = \left[ \mathbf{v}_{\mathrm{BHB},k}^{(Q)T} (\boldsymbol{\theta}), \mathbf{v}_{\mathrm{HCR},k}^{(N)T} (\boldsymbol{\theta}) \right]^{T}. \tag{84}$$
b) Calculating the limit of $\mathbf{C}_{\mathbf{v}_{\mathrm{Abel},k}^{(Q,N)}}$ in (47):
i) Calculation of $\lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{Abel},k}^{(Q,N)}}$: Using the same techniques described in (53)-(55) and in (68), (69), it can be shown that
$$\mathbf{V}_{\mathrm{Abel}} \triangleq \lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{Abel},k}^{(Q,N)}} = \left[ \mathbf{Z}^{(Q)} (\boldsymbol{\theta}_t) \;\; \mathbf{V}_{\mathrm{HCR}} \right], \tag{85}$$
where $\mathbf{V}_{\mathrm{HCR}}$ is defined in (75).
ii) Calculation of $\lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{Abel},k}^{(Q,N)}}$: Using the same techniques described in (56)-(59) and in (70)-(72), it can be shown that
$$\mathbf{K}_{\mathrm{Abel}} \triangleq \lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{Abel},k}^{(Q,N)}} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} \left[ \boldsymbol{\gamma}_{\mathrm{Abel}} (\mathbf{x}) \, \boldsymbol{\gamma}_{\mathrm{Abel}}^{T} (\mathbf{x}) \right], \tag{86}$$
where
$$\boldsymbol{\gamma}_{\mathrm{Abel}} (\mathbf{x}) \triangleq \left[ \boldsymbol{\gamma}_{\mathrm{BHB}}^{T} (\mathbf{x}) \;\; \boldsymbol{\gamma}_{\mathrm{HCR}}^{T} (\mathbf{x}) \right]^{T}. \tag{87}$$
iii) Calculating $\lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{Abel},k}^{(Q,N)}}$: Substitution of (85) and (86) into (47) yields
$$\mathbf{C}_{\mathrm{Abel}}^{(Q,N)} \triangleq \lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{Abel},k}^{(Q,N)}} = \mathbf{V}_{\mathrm{Abel}} \, \mathbf{K}_{\mathrm{Abel}}^{-1} \, \mathbf{V}_{\mathrm{Abel}}^{T}. \tag{88}$$

7) The QCL bound [9]: The bound proposed by Quinlan et al. in [9] is obtained via the following steps:
a) Construction of $\{ \mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})} (\cdot) \}$: The $k$th member of $\{ \mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})} (\cdot) \}$ is given by
$$\mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})} (\boldsymbol{\theta}) = \left[ \left[ \tilde{\mathbf{v}}_{0,k}^{T} (\boldsymbol{\theta}), \ldots, \tilde{\mathbf{v}}_{N-1,k}^{T} (\boldsymbol{\theta}) \right], \mathbf{v}_{\mathrm{MS},k}^{(N)T} (\boldsymbol{\theta}) \right]^{T}, \tag{89}$$
where
$$\tilde{\mathbf{v}}_{n,k} (\boldsymbol{\theta}) \triangleq \left[ -\boldsymbol{\psi}_k^{(1)T} (\boldsymbol{\theta}_n - \boldsymbol{\theta}), \ldots, (-1)^{Q'_n} \boldsymbol{\psi}_k^{(Q'_n)T} (\boldsymbol{\theta}_n - \boldsymbol{\theta}) \right]^{T} \tag{90}$$
and $Q'_n \leq Q$, $\forall n = 0, \ldots, N - 1$.
b) Calculating the limit of $\mathbf{C}_{\mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})}}$ in (47):
i) Calculation of $\lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})}}$: Using the same techniques described in (53)-(55) and in (68), (69), it can be shown that
$$\mathbf{V}_{\mathrm{QCL}} \triangleq \lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})}} = \left[ \left[ \mathbf{Z}^{(Q'_0)} (\boldsymbol{\theta}_0), \ldots, \mathbf{Z}^{(Q'_{N-1})} (\boldsymbol{\theta}_{N-1}) \right] \;\; \boldsymbol{\Phi} \right]. \tag{91}$$
ii) Calculation of $\lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})}}$: Using the same techniques described in (56)-(59) and in (70)-(72), it can be shown that
$$\mathbf{K}_{\mathrm{QCL}} \triangleq \lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})}} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} \left[ \boldsymbol{\gamma}_{\mathrm{QCL}} (\mathbf{x}) \, \boldsymbol{\gamma}_{\mathrm{QCL}}^{T} (\mathbf{x}) \right], \tag{92}$$
where
$$\boldsymbol{\gamma}_{\mathrm{QCL}} (\mathbf{x}) \triangleq \left[ \left[ \mathbf{r}_0^{T} (\mathbf{x}), \ldots, \mathbf{r}_{N-1}^{T} (\mathbf{x}) \right] \;\; \boldsymbol{\gamma}_{\mathrm{MS}}^{T} (\mathbf{x}) \right]^{T} \tag{93}$$
and
$$\mathbf{r}_n (\mathbf{x}) \triangleq \left. \left[ \frac{\partial \nu (\mathbf{x}, \boldsymbol{\theta})}{\partial \boldsymbol{\theta}}, \ldots, \frac{\partial^{Q'_n} \nu (\mathbf{x}, \boldsymbol{\theta})}{\partial \boldsymbol{\theta}^{\otimes Q'_n}} \right]^{T} \right|_{\boldsymbol{\theta} = \boldsymbol{\theta}_n}. \tag{94}$$
iii) Calculating $\lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})}}$: Substitution of (91) and (92) into (47) yields
$$\mathbf{C}_{\mathrm{QCL}}^{(Q'_0, \ldots, Q'_{N-1})} \triangleq \lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{QCL},k}^{(Q'_0, \ldots, Q'_{N-1})}} = \mathbf{V}_{\mathrm{QCL}} \, \mathbf{K}_{\mathrm{QCL}}^{-1} \, \mathbf{V}_{\mathrm{QCL}}^{T}. \tag{95}$$

We note that in practice, it was suggested in [9] to use $Q'_n = 1$, $\forall n = 0, \ldots, N - 1$.

One can notice that any two bounds $\mathbf{C}_1$ and $\mathbf{C}_2$ obtained via the procedure described above constitute the limits of two convergent sequences of bounds, $\{ \mathbf{C}_{H_\varphi}^{(\mathbf{v}_{1,k})} \}$ and $\{ \mathbf{C}_{H_\varphi}^{(\mathbf{v}_{2,k})} \}$, where $\{ \mathbf{v}_{1,k} (\cdot) \}$ and $\{ \mathbf{v}_{2,k} (\cdot) \}$ are two sequences of kernel functions in $L_1^{P} (\Theta)$. Hence, using the result of Theorem 7 in Appendix L, it is shown in [37] that an order relation between $\mathbf{C}_1$ and $\mathbf{C}_2$ can be obtained by comparing the limits of the Hilbert subspace sequences $\{ H_\varphi^{(\mathbf{v}_{1,k})} \}$ and $\{ H_\varphi^{(\mathbf{v}_{2,k})} \}$; i.e., if $\lim_{k \to \infty} H_\varphi^{(\mathbf{v}_{1,k})} \supset \lim_{k \to \infty} H_\varphi^{(\mathbf{v}_{2,k})}$, then $\mathbf{C}_1 \succeq \mathbf{C}_2$.

V. A NEW NON-BAYESIAN BOUND BASED ON THE FOURIER TRANSFORM

In this section, a new lower bound is derived from (42) using the kernel function of the Fourier transform. We show that the proposed bound is computed by applying the discrete Fourier transform (DFT) to the sequence $\{ \nu (\mathbf{x}, \boldsymbol{\theta}_n) \}_{n=0}^{N-1}$. In cases where the spectrum of this sequence is concentrated in a few frequency components, a computationally manageable bound, which exploits all the information in $\{ \nu (\mathbf{x}, \boldsymbol{\theta}_n) \}_{n=0}^{N-1}$, is obtained. The proposed bound is derived via the following steps:
1) Construction of $\{ \mathbf{v}_{\mathrm{CRFB},k}^{(J,N)} (\cdot) \}$: The $k$th member of $\{ \mathbf{v}_{\mathrm{CRFB},k}^{(J,N)} (\cdot) \}$ is given by
$$\mathbf{v}_{\mathrm{CRFB},k}^{(J,N)} (\boldsymbol{\theta}) = \left[ -\boldsymbol{\psi}_k^{(1)T} (\boldsymbol{\theta}_t - \boldsymbol{\theta}), \left[ \psi_k (\boldsymbol{\theta}_0 - \boldsymbol{\theta}), \ldots, \psi_k (\boldsymbol{\theta}_{N-1} - \boldsymbol{\theta}) \right] \mathbf{W}^{T} \right]^{T}, \tag{96}$$
where it is assumed that $\mathrm{supp} \{ \psi (\boldsymbol{\theta}_t - \boldsymbol{\theta}) \} \subset \Theta$. The set $\{ \boldsymbol{\theta}_n \}_{n=0}^{N-1}$ contains $N$ equally spaced test-points in $\Theta$, where $\boldsymbol{\theta}_n = [n_1 \Delta\theta, \ldots, n_M \Delta\theta]^{T}$, $\Delta\theta$ is a sampling interval, $n_m \in \{0, \ldots, N_m - 1\}$ denotes a test-point index in the $m$th dimension of $\Theta$, the index $n$ is a unique combination of $\{n_1, \ldots, n_M\}$, i.e., $\{n_1, \ldots, n_M\} \leftrightarrow n$, and $N = \prod_{m=1}^{M} N_m$. The $M$-dimensional DFT matrix with $J < N$ frequency components is denoted by $\mathbf{W} \in \mathbb{C}^{J \times N}$, such that
$$[\mathbf{W}]_{j,n} = \exp \left( -i \boldsymbol{\Omega}_j^{T} \boldsymbol{\theta}_n \right), \quad j = 0, \ldots, J - 1, \; n = 0, \ldots, N - 1, \tag{97}$$
where $\boldsymbol{\Omega}_j = \left[ \frac{2\pi j_1}{\Delta\theta N_1}, \ldots, \frac{2\pi j_M}{\Delta\theta N_M} \right]^{T}$ is a frequency test-bin, $j_m \in \{0, \ldots, N_m - 1\}$ denotes a test-bin index in the $m$th dimension of the frequency domain, $\Lambda$, and the index $j$ is a unique combination of $\{j_1, \ldots, j_M\}$, i.e., $\{j_1, \ldots, j_M\} \leftrightarrow j$.
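For the one-dimensional case ($M = 1$, $\boldsymbol{\theta}_n = n \Delta\theta$), the matrix in (97) reduces to the first $J$ rows of the standard $N$-point DFT matrix; the short sketch below (our construction) verifies this:

```python
import numpy as np

# For M = 1, theta_n = n * dtheta and Omega_j = 2*pi*j/(dtheta*N), so (97)
# gives the first J rows of the N-point DFT matrix (our construction).
N, J, dtheta = 8, 3, 0.1
theta = np.arange(N) * dtheta
Omega = 2 * np.pi * np.arange(J) / (dtheta * N)
W = np.exp(-1j * np.outer(Omega, theta))                       # J x N

F = np.exp(-2j * np.pi * np.outer(np.arange(J), np.arange(N)) / N)
assert np.allclose(W, F)
```

Note that the sampling interval $\Delta\theta$ cancels in the exponent, so $\mathbf{W}$ does not depend on it.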

2) Calculating the limit of $\mathbf{C}_{\mathbf{v}_{\mathrm{CRFB},k}^{(J,N)}}$ in (47):
a) Calculation of $\lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{CRFB},k}^{(J,N)}}$: Using the same techniques described in (53)-(55) and in (68), (69), it can be shown that
$$\mathbf{V}_{\mathrm{CRFB}} \triangleq \lim_{k \to \infty} \boldsymbol{\Gamma}_{\mathbf{v}_{\mathrm{CRFB},k}^{(J,N)}} = \left[ \mathbf{Z}^{(1)} (\boldsymbol{\theta}_t) \;\; \boldsymbol{\Phi} \mathbf{W}^{H} \right] = \left[ \mathbf{Z}^{(1)} (\boldsymbol{\theta}_t) \;\; \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t}^{*} \left[ \mathbf{e} (\hat{\mathbf{g}} (\mathbf{x})) \left[ \nu (\mathbf{x}, \boldsymbol{\theta}_0), \ldots, \nu (\mathbf{x}, \boldsymbol{\theta}_{N-1}) \right] \mathbf{W}^{T} \right] \right]. \tag{98}$$
The last equality in (98) stems from the fact that since $\hat{\mathbf{g}} (\mathbf{x})$ is uniformly unbiased, then by (4), (6), (7), (20) and (69), $\mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t}^{*} \left[ \mathbf{e} (\hat{\mathbf{g}} (\mathbf{x})) \left[ \nu (\mathbf{x}, \boldsymbol{\theta}_0), \ldots, \nu (\mathbf{x}, \boldsymbol{\theta}_{N-1}) \right] \mathbf{W}^{T} \right] = \boldsymbol{\Phi} \mathbf{W}^{H}$.
b) Calculation of $\lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{CRFB},k}}$: Using the same techniques described in (56)-(59) and in (70)-(72), it can be shown that
$$\mathbf{K}_{\mathrm{CRFB}} \triangleq \lim_{k \to \infty} \mathbf{K}_{\mathbf{v}_{\mathrm{CRFB},k}} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} \left[ \boldsymbol{\gamma}_{\mathrm{CRFB}} (\mathbf{x}) \, \boldsymbol{\gamma}_{\mathrm{CRFB}}^{H} (\mathbf{x}) \right], \tag{99}$$
where
$$\boldsymbol{\gamma}_{\mathrm{CRFB}} (\mathbf{x}) \triangleq \left[ \left. \frac{\partial \log f (\mathbf{x}; \boldsymbol{\theta})}{\partial \boldsymbol{\theta}} \right|_{\boldsymbol{\theta} = \boldsymbol{\theta}_t} \;\; \left[ \nu (\mathbf{x}, \boldsymbol{\theta}_0), \ldots, \nu (\mathbf{x}, \boldsymbol{\theta}_{N-1}) \right] \mathbf{W}^{T} \right]^{T}. \tag{100}$$
c) Calculating $\lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{CRFB},k}^{(J,N)}}$: Substitution of (98) and (99) into (47) yields
$$\mathbf{C}_{\mathrm{CRFB}}^{(J,N)} \triangleq \lim_{k \to \infty} \mathbf{C}_{\mathbf{v}_{\mathrm{CRFB},k}^{(J,N)}} = \mathbf{V}_{\mathrm{CRFB}} \, \mathbf{K}_{\mathrm{CRFB}}^{-1} \, \mathbf{V}_{\mathrm{CRFB}}^{H}. \tag{101}$$

In Appendix K, it is shown that the proposed bound in (101) can be rewritten as
$$\mathbf{C}_{\mathrm{CRFB}}^{(J,N)} = \mathbf{Z}^{(1)} (\boldsymbol{\theta}_t) \, \mathbf{I}_{\mathrm{FIM}}^{-1} \, \mathbf{Z}^{(1)T} (\boldsymbol{\theta}_t) + \mathbf{Q} \mathbf{W}^{H} \left( \mathbf{W} \mathbf{R} \mathbf{W}^{H} \right)^{-1} \mathbf{W} \mathbf{Q}^{T}, \tag{102}$$
where $\mathbf{I}_{\mathrm{FIM}}$ and $\mathbf{Z}^{(1)} (\boldsymbol{\theta}_t)$ are defined in (59) and (61), respectively,
$$\mathbf{Q} \triangleq \mathbf{Z}^{(1)} (\boldsymbol{\theta}_t) \, \mathbf{I}_{\mathrm{FIM}}^{-1} \mathbf{D} - \boldsymbol{\Phi}, \tag{103}$$
$$\mathbf{D} \triangleq \left[ \mathbf{d} (\boldsymbol{\theta}_0), \ldots, \mathbf{d} (\boldsymbol{\theta}_{N-1}) \right], \tag{104}$$
the matrix $\boldsymbol{\Phi}$ is defined in (69),
$$\mathbf{d} (\boldsymbol{\theta}_n) \triangleq -\left( \left. \frac{\partial \, \mathrm{KLD} \left[ f (\mathbf{x}; \boldsymbol{\theta}_n) \, \| \, f (\mathbf{x}; \boldsymbol{\theta}) \right]}{\partial \boldsymbol{\theta}} \right|_{\boldsymbol{\theta} = \boldsymbol{\theta}_t} \right)^{T}, \tag{105}$$
the term $\mathrm{KLD} \left[ f (\mathbf{x}; \boldsymbol{\theta}_n) \, \| \, f (\mathbf{x}; \boldsymbol{\theta}) \right]$ is the Kullback-Leibler divergence [34] of $f (\mathbf{x}; \boldsymbol{\theta})$ from $f (\mathbf{x}; \boldsymbol{\theta}_n)$,
$$\mathbf{R} \triangleq \boldsymbol{\Psi} - \mathbf{D}^{T} \mathbf{I}_{\mathrm{FIM}}^{-1} \mathbf{D}, \tag{106}$$
and the elements of $\boldsymbol{\Psi}$ are given by
$$[\boldsymbol{\Psi}]_{m,n} \triangleq K (\boldsymbol{\theta}_m, \boldsymbol{\theta}_n), \quad m, n = 0, \ldots, N - 1. \tag{107}$$
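The structure of (102), the CRB plus a supplemental term, can be probed numerically. The following sketch (random toy matrices of our own; it does not evaluate the actual $\mathbf{Q}$ and $\mathbf{R}$ of a model) checks that $\mathbf{Q} \mathbf{W}^{H} (\mathbf{W} \mathbf{R} \mathbf{W}^{H})^{-1} \mathbf{W} \mathbf{Q}^{T}$ is positive semidefinite whenever $\mathbf{R}$ is positive definite:

```python
import numpy as np

# Random toy check (ours) that the supplemental term in (102),
# Q W^H (W R W^H)^{-1} W Q^T, is positive semidefinite when R is pos. def.
rng = np.random.default_rng(2)
L, N, J = 2, 8, 3
Q = rng.normal(size=(L, N))
B = rng.normal(size=(N, N))
R = B @ B.T + np.eye(N)                                        # pos. def.
W = np.exp(-2j * np.pi * np.outer(np.arange(J), np.arange(N)) / N)

S = Q @ W.conj().T @ np.linalg.inv(W @ R @ W.conj().T) @ W @ Q.T
eigs = np.linalg.eigvalsh((S + S.conj().T) / 2)                # Hermitian part
assert np.all(eigs >= -1e-10)
```

The claim holds structurally: $\mathbf{S} = (\mathbf{W}\mathbf{Q}^{T})^{H} \mathbf{M}^{-1} (\mathbf{W}\mathbf{Q}^{T})$ with $\mathbf{M} = \mathbf{W}\mathbf{R}\mathbf{W}^{H}$ Hermitian positive definite.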

One can notice that $\mathbf{C}_{\mathrm{CRFB}}^{(J,N)}$ is composed of the CRB supplemented by a positive-definite term. The positive-definiteness of the supplemental term is shown in (158) in Appendix K. Thus, the regularity conditions imposed by the CRB are also required for this bound. In cases where these conditions are not satisfied, we may choose $\mathbf{v}_{\mathrm{FB},k}^{(J,N)} (\boldsymbol{\theta}) = \mathbf{W} \left[ \psi_k (\boldsymbol{\theta}_0 - \boldsymbol{\theta}), \ldots, \psi_k (\boldsymbol{\theta}_{N-1} - \boldsymbol{\theta}) \right]^{T}$, and the bound in (102) becomes $\mathbf{C}_{\mathrm{FB}}^{(J,N)} = \boldsymbol{\Phi} \mathbf{W}^{H} \left( \mathbf{W} \boldsymbol{\Psi} \mathbf{W}^{H} \right)^{-1} \mathbf{W} \boldsymbol{\Phi}^{T}$.

The bound in (102) is computed using $N$ equally spaced test-points in $\Theta$ and $J < N$ frequency test-bins in $\Lambda$. For a given scenario, the frequency test-bins are selected such that the bound is maximized. According to (97), (98) and (100), one can notice that the transform $\mathbf{W} \left[ \nu (\mathbf{x}, \boldsymbol{\theta}_0), \ldots, \nu (\mathbf{x}, \boldsymbol{\theta}_{N-1}) \right]^{T}$ is actually the $M$-dimensional DFT of the sequence $\{ \nu (\mathbf{x}, \boldsymbol{\theta}_n) \}_{n=0}^{N-1}$, evaluated at $J < N$ frequency test-bins. Hence, in cases where the spectrum of this sequence is concentrated in $J \ll N$ frequency test-bins, a computationally manageable bound, which exploits all the information in $\{ \nu (\mathbf{x}, \boldsymbol{\theta}_n) \}_{n=0}^{N-1}$, is obtained. Therefore, in these cases the computational complexity of the proposed bound is significantly lower in comparison to bounds such as the MS, MH and Abel bounds, as well as the bound proposed in [9], in which maximization w.r.t. $N > J$ test-points in $\Theta$ is required in order to obtain tight bounds.

Order relations between the proposed bound and some other existing bounds can be obtained by comparing the limits of the Hilbert subspace sequences induced by the kernel sequences used in the proposed bound and in the compared bounds. In [37], it is shown that if 1) the same set of test-points, $\{ \boldsymbol{\theta}_n \}_{n=0}^{N-1}$, is used in the derivations of the proposed bound in (102) and the Abel bound in (88), such that $\boldsymbol{\theta}_0 = \boldsymbol{\theta}_t$, and 2) the spectrum of $\{ \nu (\mathbf{x}, \boldsymbol{\theta}_n) \}_{n=0}^{N-1}$ is concentrated in $J < N$ frequency test-bins, then $\lim_{k \to \infty} H_\varphi^{(\mathbf{v}_{\mathrm{CRFB},k}^{(J,N)})} = \lim_{k \to \infty} H_\varphi^{(\mathbf{v}_{\mathrm{Abel},k}^{(1,N)})}$ and hence $\mathbf{C}_{\mathrm{CRFB}}^{(J,N)} = \mathbf{C}_{\mathrm{Abel}}^{(1,N)}$. Therefore, it is concluded that $\mathbf{C}_{\mathrm{CRFB}}^{(J,N)} \succeq \mathbf{C}_{\mathrm{Abel}}^{(1,N')}$, $\forall N' < N$. This conclusion implies that in cases where the spectrum of $\{ \nu (\mathbf{x}, \boldsymbol{\theta}_n) \}_{n=0}^{N-1}$ is concentrated in $J < N$ frequency test-bins and the computation of the $(1, N)$-order Abel bound is analytically cumbersome, for example due to a large matrix inversion, it is preferable to use the proposed bound instead of decreasing the number of test-points in the Abel bound.
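As an illustration of the fallback bound $\mathbf{C}_{\mathrm{FB}}^{(J,N)} = \boldsymbol{\Phi} \mathbf{W}^{H} (\mathbf{W} \boldsymbol{\Psi} \mathbf{W}^{H})^{-1} \mathbf{W} \boldsymbol{\Phi}^{T}$, the sketch below evaluates it on a toy scalar model of our own (a single sample $x \sim \mathcal{N}(\theta, \sigma^2)$, for which $[\boldsymbol{\Psi}]_{m,n} = \exp\{(\theta_m - \theta_t)(\theta_n - \theta_t)/\sigma^2\}$):

```python
import numpy as np

# Toy sketch (ours) of the fallback bound C_FB with J partial-DFT rows.
N, J, sigma, theta_t = 16, 4, 1.0, 0.0
theta = np.arange(N) / N - 0.5                   # equally spaced test points
d = theta - theta_t
Phi = d[None, :].astype(complex)                 # 1 x N
Psi = np.exp(np.outer(d, d) / sigma**2)          # kernel of the LR function
W = np.exp(-2j * np.pi * np.outer(np.arange(J), np.arange(N)) / N)

WP = W @ Phi.T                                   # J x 1
C_FB = np.real(WP.conj().T @ np.linalg.solve(W @ Psi @ W.conj().T, WP))[0, 0]

# C_FB lower-bounds the MSE; for this model the efficient MSE is sigma^2.
assert 0.0 < C_FB <= sigma**2 + 1e-9
```

Since the $J$ DFT rows span a subspace of the span of the full test-point kernel, this value can never exceed the corresponding $N$-point MS bound.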

VI. EXAMPLE

The proposed bound in (102) is compared to other existing bounds in the literature [2], [5]-[9] in the problem of single tone estimation in Gaussian noise. The comparison criterion is prediction of the SNR threshold region exhibited by the MLE. The observation model is given by
$$\mathbf{x} = s \mathbf{b} (\theta_t) + \mathbf{n}, \tag{108}$$
where $\mathbf{x}$ denotes an $L \times 1$ observation vector and $s \in \mathbb{C}$ is the known signal amplitude. The $l$th element of $\mathbf{b} (\theta_t)$ is given by $[\mathbf{b} (\theta_t)]_l \triangleq \exp (i 2\pi l \theta_t)$, $l = 0, \ldots, L - 1$, where $\theta_t \in \Theta = [-\frac{1}{2}, \frac{1}{2})$ is the true tone to be estimated, and $\mathbf{n}$ denotes an $L \times 1$ circular complex Gaussian noise vector with zero mean and known covariance $\mathbf{C}_n = \sigma^2 \mathbf{I}_L$. Under the observation model in (108), $\nu (\mathbf{x}, \theta) = \exp \left( \frac{2}{\sigma^2} \mathrm{Re} \left\{ s \mathbf{x}^{H} \left( \mathbf{b} (\theta) - \mathbf{b} (\theta_t) \right) \right\} \right)$. Hence, one can easily verify that $\nu (\mathbf{x}, \theta) \in L_2 (\mathcal{X})$ $\forall \theta \in \Theta$. It can also be easily verified that the regularity conditions on the function to be estimated and on the likelihood function, described below (50), hold. The values of $\theta_t$ and $L$ were set to 0 and 10, respectively. The parameter space, $\Theta$, was sampled uniformly with a sampling interval of $\Delta\theta = \frac{1}{N}$, where $N = 2^9$. Therefore, $\theta_n = \frac{n}{N} - \frac{1}{2}$, $n = 0, \ldots, N - 1$.

Hence, the terms composing (102) are given by:
1) $I_{\mathrm{FIM}} = 2 \, \mathrm{SNR} \, \| \dot{\mathbf{b}} (\theta_t) \|^2$, where $\dot{\mathbf{b}} (\theta_t) \triangleq \left. \frac{d \mathbf{b} (\theta)}{d \theta} \right|_{\theta = \theta_t}$ and $\mathrm{SNR} \triangleq \frac{|s|^2}{\sigma^2}$.
2) $Z^{(1)} (\theta_t) = 1$.
3) $d (\theta_n) = 2 \, \mathrm{SNR} \cdot \mathrm{Re} \left\{ \dot{\mathbf{b}}^{H} (\theta_t) \left[ \mathbf{b} (\theta_n) - \mathbf{b} (\theta_t) \right] \right\}$.
4) $\boldsymbol{\Phi} = \left[ (\theta_0 - \theta_t), \ldots, (\theta_{N-1} - \theta_t) \right]$.
5) $K (\theta_m, \theta_n) = \exp \left( 2 \, \mathrm{SNR} \cdot \mathrm{Re} \left\{ \left[ \mathbf{b} (\theta_n) - \mathbf{b} (\theta_t) \right]^{H} \left[ \mathbf{b} (\theta_m) - \mathbf{b} (\theta_t) \right] \right\} \right)$.
6) $[\mathbf{W}]_{j,n} = \exp (-i \omega_j n)$, where $\omega_j = \Omega_j \Delta\theta \in [0, 2\pi)$.
We used $J = 1$ frequency test-bin, and for each SNR the bound in (102) was computed as a supremum over the possible values of $\omega = \frac{2\pi k}{N}$, $k = 0, \ldots, N - 1$. All other bounds, except the CRB and the HCR bound, were computed as a supremum over the possible values of $\{\theta_n\}_{n=1}^{3} = [0, m \Delta\theta, -m \Delta\theta]$, $m = 1, \ldots, 2^8$, as described in [9].

Let $\hat{\nu} (\mathbf{x}, \omega) = \sum_{n=0}^{N-1} \nu (\mathbf{x}, \theta_n) \exp (-i \omega n)$ denote the spectrum of $\{ \nu (\mathbf{x}, \theta_n) \}_{n=0}^{N-1}$. Fig. 2 depicts $\| \hat{\nu} (\mathbf{x}, \omega) \|_{L_2 (\mathcal{X})}^{2}$ for an SNR of 0 dB, obtained by evaluating the two-dimensional DFT of $\{ K (\theta_m, \theta_n) \}_{m,n=0}^{N-1}$ at $(\omega, -\omega)$. One can notice that the squared $L_2 (\mathcal{X})$-norm of the spectrum is concentrated at low frequencies. Therefore, the sequence $\{ \nu (\mathbf{x}, \theta_n) \}_{n=0}^{N-1}$ can be "compressed" into a few low-frequency components, and the use of the proposed bound is suitable for this scenario.

[Fig. 2 appears here: plot of $\| \hat{\nu} (\mathbf{x}, \omega) \|^2$ in dB versus $\omega \in [-\pi, \pi]$ rad.]

Fig. 2. The squared $L_2 (\mathcal{X})$-norm of the spectrum of $\{ \nu (\mathbf{x}, \theta_n) \}_{n=0}^{N-1}$, where SNR = 0 dB, $N = 2^{10}$, $\theta_t = 0$ and $L = 10$.

Fig. 3 depicts the compared bounds on the root MSE (RMSE) as a function of SNR. The RMSE of the MLE is depicted as well, in order to compare the threshold behavior of the bounds. It can be observed that the proposed bound allows better prediction of the SNR threshold. The proposed bound exceeds the RMSE of the MLE at low SNR, due to the fact that in this region the MLE is biased.
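A compact sketch of items 1)-6) and the Fig. 2 computation (our implementation; a smaller $N$ is used for speed): the single-tone kernel $K(\theta_m, \theta_n)$ is tabulated on the test-point grid, and $\| \hat{\nu} (\mathbf{x}, \omega_k) \|^2_{L_2(\mathcal{X})}$ is read off the two-dimensional DFT of this matrix at $(\omega_k, -\omega_k)$:

```python
import numpy as np

# Our implementation of the example's quantities; N reduced for speed.
L_obs, snr, theta_t, N = 10, 1.0, 0.0, 64
l = np.arange(L_obs)
b = lambda th: np.exp(2j * np.pi * l * th)       # steering vector b(theta)

b_dot = 2j * np.pi * l * b(theta_t)              # db/dtheta at theta_t
I_FIM = 2 * snr * np.sum(np.abs(b_dot) ** 2)     # item 1); CRB = 1 / I_FIM

theta = np.arange(N) / N - 0.5                   # equally spaced test points
B = np.exp(2j * np.pi * np.outer(l, theta))      # columns are b(theta_n)
D = B - b(theta_t)[:, None]
K = np.exp(2 * snr * np.real(D.conj().T @ D))    # item 5), N x N kernel matrix

# ||nu_hat(x, omega_k)||^2 = 2-D DFT of K evaluated at (omega_k, -omega_k):
K2 = np.fft.fft2(K)
spec = np.real(np.array([K2[(-k) % N, k] for k in range(N)]))

assert I_FIM > 0 and spec[0] >= spec[N // 2]     # spectrum peaks at low freq.
```

Since all entries of $K$ are positive, the DC component of this spectrum dominates every other bin, consistent with the low-frequency concentration seen in Fig. 2.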

[Fig. 3 appears here: RMSE (log scale) versus SNR [dB] for the CRFB (1 test-bin), the QCL, Abel, MH and MS bounds (3 test-points each), the CRB, the HCR bound (1 test-point), and the MLE.]

Fig. 3. Comparison of RMSE lower bounds versus SNR, where the comparison criterion is prediction of the SNR threshold region exhibited by the MLE.

VII. CONCLUSION

In this paper, a new class of lower bounds on the MSE of unbiased estimators was derived by projecting each entry of the vector of estimation error on a Hilbert subspace of $L_2$. This Hilbert subspace contains linear transformations of elements in the domain of an integral transform of the LR function. The integral transform generalizes the traditional derivative and sampling operators applied for computation of existing performance lower bounds. Hence, it was shown that some well known lower bounds on the MSE of unbiased estimators can be derived from this class via specific choices of the kernel of the integral transform. Moreover, it was shown that for any invertible integral transform, the integral form of the BB is derived from the proposed class. A new lower bound was derived from the proposed class using the kernel of the Fourier transform. The bound was shown to be computationally manageable and, in comparison with other existing bounds, to provide better prediction of the SNR threshold region exhibited by the ML estimator in the problem of single tone estimation. Finding other integral transforms for which new computationally manageable and tight lower bounds can be derived from the proposed class is a topic for future research, as are the derivation of estimators which attain these bounds and the extension of the proposed class to the case of biased estimation. Extension of the proposed class of lower bounds to the case of Bayesian estimation is presented in Part II.

APPENDIX A

In this appendix, the following proposition is proved.
Proposition 2: If $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L_2 (\mathcal{X})$, $\forall \boldsymbol{\theta} \in \Theta$, then $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L_1 (\Theta)$ for a.e. $\mathbf{x} \in \mathcal{X}$, where $L_1 (\Theta)$ denotes the space of absolutely integrable functions on $\Theta$.
Proof: Since $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L_2 (\mathcal{X})$, $\forall \boldsymbol{\theta} \in \Theta$, then by Hölder's inequality [31],
$$\int_{\mathcal{X}} |\nu (\mathbf{x}, \boldsymbol{\theta})| \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}) < \infty, \quad \forall \boldsymbol{\theta} \in \Theta. \tag{109}$$

Therefore, since $\Theta$ has finite Lebesgue measure,
$$\int_{\Theta} \int_{\mathcal{X}} |\nu (\mathbf{x}, \boldsymbol{\theta})| \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}) \, d\boldsymbol{\theta} < \infty. \tag{110}$$
Let $L^{+} (\mathcal{X} \times \Theta)$ denote the space of positive functions defined on $\mathcal{X} \times \Theta$. Due to the fact that $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L^{+} (\mathcal{X} \times \Theta)$, by the Tonelli theorem [31],
$$\int_{\Theta} \int_{\mathcal{X}} |\nu (\mathbf{x}, \boldsymbol{\theta})| \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}) \, d\boldsymbol{\theta} = \int_{\mathcal{X} \times \Theta} |\nu (\mathbf{x}, \boldsymbol{\theta})| \, d \left( P_{\boldsymbol{\theta}_t} \times \boldsymbol{\theta} \right). \tag{111}$$

Hence, according to (110) and (111), it is concluded that $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L_1 (\mathcal{X} \times \Theta)$. Thus, by the Fubini theorem [31], it is implied that $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L_1 (\Theta)$ for a.e. $\mathbf{x} \in \mathcal{X}$.

APPENDIX B

In this appendix, the Hilbert projection theorem is stated.
Theorem 3: Let $V$ denote an abstract Hilbert space, let $U$ be a closed subspace of $V$, and let $v$ be an element in $V$. Then there exists a unique vector in $U$, denoted by $p_J (v | U)$ and termed the projection of $v$ on $U$, which satisfies the following equivalent conditions:
$$\| v - p_J (v | U) \|_{V} = \min_{u \in U} \| v - u \|_{V}, \tag{112}$$
$$\langle v - p_J (v | U), u \rangle_{V} = 0 \quad \forall u \in U. \tag{113}$$
The proof can be found in [28].

APPENDIX C

In this appendix, the following proposition is proved.
Proposition 3: Let $\hat{\mathbf{g}} (\mathbf{x}) \in \mathcal{M}$. Then $\forall \phi_q (\mathbf{x}) \in H_\phi$ and $\forall l = 0, \ldots, L - 1$,
$$\langle \phi_q (\mathbf{x}), [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l \rangle_{L_2 (\mathcal{X})} = \int_{\Theta} q (\boldsymbol{\theta}) \, [\boldsymbol{\xi} (\boldsymbol{\theta})]_l \, d\boldsymbol{\theta}, \tag{114}$$

where
$$\boldsymbol{\xi} (\boldsymbol{\theta}) \triangleq \mathbf{g} (\boldsymbol{\theta}) - \mathbf{g} (\boldsymbol{\theta}_t). \tag{115}$$
Proof: According to (2), (4), and (8), it is implied that
$$\langle \phi_q (\mathbf{x}), [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l \rangle_{L_2 (\mathcal{X})} = \int_{\mathcal{X}} \left( \int_{\Theta} q (\boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}) \, d\boldsymbol{\theta} \right) [\hat{\mathbf{g}} (\mathbf{x}) - \mathbf{g} (\boldsymbol{\theta}_t)]_l \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}). \tag{116}$$
Since $q (\boldsymbol{\theta}) \in L_1 (\Theta)$, $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L_2 (\mathcal{X})$, $\forall \boldsymbol{\theta} \in \Theta$, and $\hat{\mathbf{g}} (\mathbf{x}) \in L_2^{L} (\mathcal{X})$, it can be shown using the Tonelli theorem and the Cauchy-Schwarz inequality [31] that the term $\{ q (\boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}) \, [\hat{\mathbf{g}} (\mathbf{x}) - \mathbf{g} (\boldsymbol{\theta}_t)]_l \} \in L_1 (\Theta \times \mathcal{X})$. Therefore, it is implied by the Fubini theorem [31] that the integration order on the r.h.s. of (116) can be interchanged. Hence,
$$\begin{aligned} \langle \phi_q (\mathbf{x}), [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l \rangle_{L_2 (\mathcal{X})} &= \int_{\Theta} q (\boldsymbol{\theta}) \left( \int_{\mathcal{X}} \nu (\mathbf{x}, \boldsymbol{\theta}) \, [\hat{\mathbf{g}} (\mathbf{x}) - \mathbf{g} (\boldsymbol{\theta}_t)]_l \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}) \right) d\boldsymbol{\theta} \\ &= \int_{\Theta} q (\boldsymbol{\theta}) \, \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}} \left[ [\hat{\mathbf{g}} (\mathbf{x}) - \mathbf{g} (\boldsymbol{\theta}_t)]_l \right] d\boldsymbol{\theta} \\ &= \int_{\Theta} q (\boldsymbol{\theta}) \, [\mathbf{g} (\boldsymbol{\theta}) - \mathbf{g} (\boldsymbol{\theta}_t)]_l \, d\boldsymbol{\theta} \\ &= \int_{\Theta} q (\boldsymbol{\theta}) \, [\boldsymbol{\xi} (\boldsymbol{\theta})]_l \, d\boldsymbol{\theta}, \end{aligned} \tag{117}$$
where the second equality in (117) stems from (2) and (7), the third equality stems from the fact that $\hat{\mathbf{g}} (\mathbf{x}) \in \mathcal{M}$, and the last one stems from the definition of $\boldsymbol{\xi} (\boldsymbol{\theta})$ in (115).

APPENDIX D

In this appendix, the closed form expression of $\mathbf{C}_{H_\phi}$ in (26) is derived. Let
$$\phi_{\tilde{q}_l} (\mathbf{x}) \triangleq p_J \left( [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l \, | \, H_\phi \right) = \int_{\Theta} \tilde{q}_l (\boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}) \, d\boldsymbol{\theta}, \tag{118}$$
where $p_J \left( [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l \, | \, H_\phi \right)$ denotes the projection of $[\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l$, $l = 0, \ldots, L - 1$, on $H_\phi$, and the last equality in (118) stems from the fact that $\phi_{\tilde{q}_l} (\mathbf{x}) \in H_\phi$ and from the definition of $H_\phi$ in (8). It is implied by (2), (14), (15), and (118) that
$$\left[ \mathbf{C}_{H_\phi} \right]_{k,l} = \langle \phi_{\tilde{q}_k} (\mathbf{x}), \phi_{\tilde{q}_l} (\mathbf{x}) \rangle_{L_2 (\mathcal{X})} = \int_{\mathcal{X}} \int_{\Theta} \int_{\Theta} \tilde{q}_k (\boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}) \, \tilde{q}_l^{*} (\boldsymbol{\theta}') \, \nu (\mathbf{x}, \boldsymbol{\theta}') \, d\boldsymbol{\theta}' \, d\boldsymbol{\theta} \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}). \tag{119}$$
Due to the fact that $\tilde{q}_k (\boldsymbol{\theta}) \in L_1 (\Theta)$ $\forall k = 0, \ldots, L - 1$ and $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L_2 (\mathcal{X})$ $\forall \boldsymbol{\theta} \in \Theta$, it can be shown using the Tonelli theorem and the Cauchy-Schwarz inequality [31] that the term $\left\{ \tilde{q}_k (\boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}') \, \tilde{q}_l^{*} (\boldsymbol{\theta}') \right\} \in L_1 (\mathcal{X} \times \Theta \times \Theta)$ $\forall k, l = 0, \ldots, L - 1$. Therefore, by the Fubini theorem [31], the integration order on the r.h.s. of (119) can be interchanged, and thus
$$\begin{aligned} \left[ \mathbf{C}_{H_\phi} \right]_{k,l} &= \int_{\Theta} \tilde{q}_k (\boldsymbol{\theta}) \left( \int_{\Theta} \left( \int_{\mathcal{X}} \nu (\mathbf{x}, \boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}') \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}) \right) \tilde{q}_l^{*} (\boldsymbol{\theta}') \, d\boldsymbol{\theta}' \right) d\boldsymbol{\theta} \\ &= \int_{\Theta} \int_{\Theta} \tilde{q}_k (\boldsymbol{\theta}) \, \langle \nu (\mathbf{x}, \boldsymbol{\theta}), \nu (\mathbf{x}, \boldsymbol{\theta}') \rangle_{L_2 (\mathcal{X})} \, \tilde{q}_l^{*} (\boldsymbol{\theta}') \, d\boldsymbol{\theta}' \, d\boldsymbol{\theta} \\ &= \int_{\Theta} \int_{\Theta} \tilde{q}_k (\boldsymbol{\theta}) \, K (\boldsymbol{\theta}, \boldsymbol{\theta}') \, \tilde{q}_l^{*} (\boldsymbol{\theta}') \, d\boldsymbol{\theta}' \, d\boldsymbol{\theta}, \end{aligned} \tag{120}$$
where
$$K (\boldsymbol{\theta}, \boldsymbol{\theta}') \triangleq \langle \nu (\mathbf{x}, \boldsymbol{\theta}), \nu (\mathbf{x}, \boldsymbol{\theta}') \rangle_{L_2 (\mathcal{X})} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t} \left[ \nu (\mathbf{x}, \boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}') \right] \tag{121}$$
denotes the autocorrelation kernel of the LR function. By the Hilbert projection theorem, stated in Appendix B, $\phi_{\tilde{q}_k} (\mathbf{x})$ is the unique solution of the following set of equations:
$$\langle \phi_q (\mathbf{x}), \phi_{\tilde{q}_k} (\mathbf{x}) \rangle_{L_2 (\mathcal{X})} = \langle \phi_q (\mathbf{x}), [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_k \rangle_{L_2 (\mathcal{X})} \quad \forall \phi_q (\mathbf{x}) \in H_\phi. \tag{122}$$
Similarly to the derivation of (120), it can be shown that
$$\langle \phi_q (\mathbf{x}), \phi_{\tilde{q}_k} (\mathbf{x}) \rangle_{L_2 (\mathcal{X})} = \int_{\Theta} \int_{\Theta} q (\boldsymbol{\theta}) \, K (\boldsymbol{\theta}, \boldsymbol{\theta}') \, \tilde{q}_k^{*} (\boldsymbol{\theta}') \, d\boldsymbol{\theta}' \, d\boldsymbol{\theta}. \tag{123}$$
Moreover, according to Proposition 3 in Appendix C, since $\hat{\mathbf{g}} (\mathbf{x}) \in \mathcal{M}$,
$$\langle \phi_q (\mathbf{x}), [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_k \rangle_{L_2 (\mathcal{X})} = \int_{\Theta} q (\boldsymbol{\theta}) \, [\boldsymbol{\xi} (\boldsymbol{\theta})]_k \, d\boldsymbol{\theta}. \tag{124}$$
Hence, by substitution of (123) and (124) into (122), one obtains
$$\int_{\Theta} q (\boldsymbol{\theta}) \left( \int_{\Theta} K (\boldsymbol{\theta}, \boldsymbol{\theta}') \, \tilde{q}_k^{*} (\boldsymbol{\theta}') \, d\boldsymbol{\theta}' \right) d\boldsymbol{\theta} = \int_{\Theta} q (\boldsymbol{\theta}) \, [\boldsymbol{\xi} (\boldsymbol{\theta})]_k \, d\boldsymbol{\theta} \quad \forall q (\boldsymbol{\theta}) \in L_1 (\Theta), \tag{125}$$
and thus $\tilde{q}_k (\boldsymbol{\theta})$ is the solution of the following integral equation:
$$\int_{\Theta} K (\boldsymbol{\theta}, \boldsymbol{\theta}') \, \tilde{q}_k (\boldsymbol{\theta}') \, d\boldsymbol{\theta}' = [\boldsymbol{\xi} (\boldsymbol{\theta})]_k. \tag{126}$$
Finally, the closed form expression of $\mathbf{C}_{H_\phi}$ in (26) is obtained by rewriting (120) and (126) in matrix form, where $\tilde{\mathbf{q}} (\boldsymbol{\theta}) \triangleq [\tilde{q}_0 (\boldsymbol{\theta}), \ldots, \tilde{q}_{L-1} (\boldsymbol{\theta})]^{T}$.

APPENDIX E

Theorem 4: (Theorem 6.18 in [31]) Let $(\Lambda, \mathcal{M}, \mu)$ and $(\Theta, \mathcal{F}, \nu)$ be $\sigma$-finite measure spaces, and let $h$ be an $\mathcal{M} \otimes \mathcal{F}$-measurable function on $\Lambda \times \Theta$. Suppose that $h (\boldsymbol{\tau}, \boldsymbol{\theta}) \in L_1 (\Lambda)$ for a.e. $\boldsymbol{\theta} \in \Theta$ and $h (\boldsymbol{\tau}, \boldsymbol{\theta}) \in L_1 (\Theta)$ for a.e. $\boldsymbol{\tau} \in \Lambda$. If $\phi \in L_p (\Theta)$ $(1 \leq p < \infty)$, the integral $\int_{\Theta} h (\boldsymbol{\tau}, \boldsymbol{\theta}) \, \phi (\boldsymbol{\theta}) \, d\nu (\boldsymbol{\theta})$ converges absolutely for a.e. $\boldsymbol{\tau} \in \Lambda$. The proof can be found in [31].

APPENDIX F

In this appendix, the closed form expression of $\mathbf{C}_{H_\varphi^{(h)}}$ in (34) is derived under the conditions stated in Theorem 2. Let
$$\varphi_{\tilde{\mathbf{a}}_l, h} (\mathbf{x}) \triangleq p_J \left( [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l \, | \, H_\varphi^{(h)} \right) = \int_{\Lambda} \tilde{\mathbf{a}}_l^{H} (\boldsymbol{\tau}) \int_{\Theta} \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}) \, d\boldsymbol{\theta} \, d\boldsymbol{\tau}, \tag{127}$$
where $p_J \left( [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l \, | \, H_\varphi^{(h)} \right)$ denotes the projection of $[\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_l$, $l = 0, \ldots, L - 1$, on $H_\varphi^{(h)}$, and the last equality in (127) stems from the fact that $\varphi_{\tilde{\mathbf{a}}_l, h} (\mathbf{x}) \in H_\varphi^{(h)}$ and the definition of $H_\varphi^{(h)}$ in (30). It is implied by (2), (14), (15), and (127) that
$$\left[ \mathbf{C}_{H_\varphi^{(h)}} \right]_{k,l} = \langle \varphi_{\tilde{\mathbf{a}}_k, h} (\mathbf{x}), \varphi_{\tilde{\mathbf{a}}_l, h} (\mathbf{x}) \rangle_{L_2 (\mathcal{X})} = \int_{\mathcal{X}} \int_{\Lambda} \int_{\Theta} \int_{\Lambda} \int_{\Theta} \tilde{\mathbf{a}}_k^{H} (\boldsymbol{\tau}) \, \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}') \, \mathbf{h}^{H} (\boldsymbol{\tau}', \boldsymbol{\theta}') \, \tilde{\mathbf{a}}_l (\boldsymbol{\tau}') \, d\boldsymbol{\theta}' \, d\boldsymbol{\tau}' \, d\boldsymbol{\theta} \, d\boldsymbol{\tau} \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}). \tag{128}$$
Due to the fact that $\nu (\mathbf{x}, \boldsymbol{\theta}) \in L_1 (\Theta)$ for a.e. $\mathbf{x} \in \mathcal{X}$, $\mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \in L_1^{P} (\Theta)$ and $\mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \in L_1^{P} (\Lambda)$ for a.e. $\boldsymbol{\tau} \in \Lambda$ and a.e. $\boldsymbol{\theta} \in \Theta$, respectively, and $\tilde{\mathbf{a}}_k (\boldsymbol{\tau}) \in L_1^{P} (\Lambda)$ $\forall k = 0, \ldots, L - 1$, then, as shown in Theorem 2, $\alpha_k (\boldsymbol{\tau}, \boldsymbol{\theta}, \mathbf{x}) \triangleq \tilde{\mathbf{a}}_k^{H} (\boldsymbol{\tau}) \, \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}) \in L_1 (\Lambda \times \Theta)$ for a.e. $\mathbf{x} \in \mathcal{X}$, $\forall k = 0, \ldots, L - 1$. Therefore, using the Tonelli theorem [31], it can be shown that $\alpha_k (\boldsymbol{\tau}, \boldsymbol{\theta}, \mathbf{x}) \, \alpha_l^{*} (\boldsymbol{\tau}', \boldsymbol{\theta}', \mathbf{x}) \in L_1 (\Lambda \times \Theta \times \Lambda \times \Theta \times \mathcal{X})$ $\forall k, l = 0, \ldots, L - 1$. Thus, by the Fubini theorem [31], the integration order on the r.h.s. of (128) can be interchanged, and hence
$$\left[ \mathbf{C}_{H_\varphi^{(h)}} \right]_{k,l} = \int_{\Lambda} \int_{\Lambda} \tilde{\mathbf{a}}_k^{H} (\boldsymbol{\tau}) \left( \int_{\Theta} \int_{\Theta} \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \left( \int_{\mathcal{X}} \nu (\mathbf{x}, \boldsymbol{\theta}) \, \nu (\mathbf{x}, \boldsymbol{\theta}') \, dP_{\boldsymbol{\theta}_t} (\mathbf{x}) \right) \mathbf{h}^{H} (\boldsymbol{\tau}', \boldsymbol{\theta}') \, d\boldsymbol{\theta}' \, d\boldsymbol{\theta} \right) \tilde{\mathbf{a}}_l (\boldsymbol{\tau}') \, d\boldsymbol{\tau}' \, d\boldsymbol{\tau} = \int_{\Lambda} \int_{\Lambda} \tilde{\mathbf{a}}_k^{H} (\boldsymbol{\tau}) \left( \int_{\Theta} \int_{\Theta} \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, K (\boldsymbol{\theta}, \boldsymbol{\theta}') \, \mathbf{h}^{H} (\boldsymbol{\tau}', \boldsymbol{\theta}') \, d\boldsymbol{\theta}' \, d\boldsymbol{\theta} \right) \tilde{\mathbf{a}}_l (\boldsymbol{\tau}') \, d\boldsymbol{\tau}' \, d\boldsymbol{\tau}, \tag{129}$$
where $K (\boldsymbol{\theta}, \boldsymbol{\theta}')$ is defined in (121). Let
$$\mathbf{K}_h (\boldsymbol{\tau}, \boldsymbol{\tau}') \triangleq \int_{\Theta} \int_{\Theta} \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, K (\boldsymbol{\theta}, \boldsymbol{\theta}') \, \mathbf{h}^{H} (\boldsymbol{\tau}', \boldsymbol{\theta}') \, d\boldsymbol{\theta}' \, d\boldsymbol{\theta} \tag{130}$$
denote the transformed autocorrelation kernel of the LR function. Then, according to (129) and (130), $\left[ \mathbf{C}_{H_\varphi^{(h)}} \right]_{k,l}$ can be expressed as
$$\left[ \mathbf{C}_{H_\varphi^{(h)}} \right]_{k,l} = \int_{\Lambda} \int_{\Lambda} \tilde{\mathbf{a}}_k^{H} (\boldsymbol{\tau}) \, \mathbf{K}_h (\boldsymbol{\tau}, \boldsymbol{\tau}') \, \tilde{\mathbf{a}}_l (\boldsymbol{\tau}') \, d\boldsymbol{\tau}' \, d\boldsymbol{\tau}. \tag{131}$$

By the Hilbert projection theorem, stated in Appendix B, $\varphi_{\tilde{\mathbf{a}}_k, h} (\mathbf{x})$ is the unique solution of the following system of equations:
$$\langle \varphi_{\mathbf{a}, h} (\mathbf{x}), \varphi_{\tilde{\mathbf{a}}_k, h} (\mathbf{x}) \rangle_{L_2 (\mathcal{X})} = \langle \varphi_{\mathbf{a}, h} (\mathbf{x}), [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_k \rangle_{L_2 (\mathcal{X})} \quad \forall \varphi_{\mathbf{a}, h} (\mathbf{x}) \in H_\varphi^{(h)}. \tag{132}$$
Similarly to the derivation of (131), it can be shown that
$$\langle \varphi_{\mathbf{a}, h} (\mathbf{x}), \varphi_{\tilde{\mathbf{a}}_k, h} (\mathbf{x}) \rangle_{L_2 (\mathcal{X})} = \int_{\Lambda} \int_{\Lambda} \mathbf{a}^{H} (\boldsymbol{\tau}) \, \mathbf{K}_h (\boldsymbol{\tau}, \boldsymbol{\tau}') \, \tilde{\mathbf{a}}_k (\boldsymbol{\tau}') \, d\boldsymbol{\tau}' \, d\boldsymbol{\tau}. \tag{133}$$
Since $\varphi_{\mathbf{a}, h} (\mathbf{x}) \in H_\phi$ and $\hat{\mathbf{g}} (\mathbf{x}) \in \mathcal{M}$, it is implied by (32) and Proposition 3 in Appendix C that
$$\langle \varphi_{\mathbf{a}, h} (\mathbf{x}), [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_k \rangle_{L_2 (\mathcal{X})} = \int_{\Theta} \left( \int_{\Lambda} \mathbf{a}^{H} (\boldsymbol{\tau}) \, \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, d\boldsymbol{\tau} \right) [\boldsymbol{\xi} (\boldsymbol{\theta})]_k \, d\boldsymbol{\theta}. \tag{134}$$
Since $\mathbf{a} (\boldsymbol{\tau}) \in L_1^{P} (\Lambda)$, $\mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \in L_1^{P} (\Theta)$ for a.e. $\boldsymbol{\tau} \in \Lambda$, and $[\boldsymbol{\xi} (\boldsymbol{\theta})]_k$ is bounded on $\Theta$, $\forall k = 0, \ldots, L - 1$, it can be shown using the Tonelli theorem [31] that $\mathbf{a} (\boldsymbol{\tau}) \, \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, [\boldsymbol{\xi} (\boldsymbol{\theta})]_k \in L_1 (\Lambda \times \Theta)$, $\forall k = 0, \ldots, L - 1$. Therefore, by the Fubini theorem [31], it is implied that the integration order on the r.h.s. of (134) can be interchanged, and hence
$$\langle \varphi_{\mathbf{a}, h} (\mathbf{x}), [\mathbf{e} (\hat{\mathbf{g}} (\mathbf{x}))]_k \rangle_{L_2 (\mathcal{X})} = \int_{\Lambda} \mathbf{a}^{H} (\boldsymbol{\tau}) \int_{\Theta} \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, [\boldsymbol{\xi} (\boldsymbol{\theta})]_k \, d\boldsymbol{\theta} \, d\boldsymbol{\tau}. \tag{135}$$
Hence, substitution of (133) and (135) into (132) implies that
$$\int_{\Lambda} \mathbf{a}^{H} (\boldsymbol{\tau}) \left( \int_{\Lambda} \mathbf{K}_h (\boldsymbol{\tau}, \boldsymbol{\tau}') \, \tilde{\mathbf{a}}_k (\boldsymbol{\tau}') \, d\boldsymbol{\tau}' \right) d\boldsymbol{\tau} = \int_{\Lambda} \mathbf{a}^{H} (\boldsymbol{\tau}) \int_{\Theta} \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, [\boldsymbol{\xi} (\boldsymbol{\theta})]_k \, d\boldsymbol{\theta} \, d\boldsymbol{\tau} \quad \forall \mathbf{a} (\boldsymbol{\tau}) \in L_1^{P} (\Lambda), \tag{136}$$
and thus $\tilde{\mathbf{a}}_k (\boldsymbol{\tau})$ is the solution of the following integral equation:
$$\int_{\Lambda} \mathbf{K}_h (\boldsymbol{\tau}, \boldsymbol{\tau}') \, \tilde{\mathbf{a}}_k (\boldsymbol{\tau}') \, d\boldsymbol{\tau}' = \int_{\Theta} \mathbf{h} (\boldsymbol{\tau}, \boldsymbol{\theta}) \, [\boldsymbol{\xi} (\boldsymbol{\theta})]_k \, d\boldsymbol{\theta}. \tag{137}$$
Finally, the closed form expression of $\mathbf{C}_{H_\varphi^{(h)}}$ in (34) is obtained by rewriting (131) and (137) in matrix form, where $\tilde{\mathbf{A}} (\boldsymbol{\tau}) \triangleq [\tilde{\mathbf{a}}_0 (\boldsymbol{\tau}), \ldots, \tilde{\mathbf{a}}_{L-1} (\boldsymbol{\tau})]$.

APPENDIX G

In this appendix, it is shown that for any invertible integral transform, $T_h$, $\mathbf{C}_{\mathcal{H}_{\varphi}(h)} = \mathbf{C}_{\mathcal{H}_{\phi}}$.

Proposition 4: Let $T_h$ denote an invertible integral transform with kernel $h(\cdot,\cdot)$, which satisfies the conditions of Theorem 2. Then $\mathbf{C}_{\mathcal{H}_{\varphi}(h)} = \mathbf{C}_{\mathcal{H}_{\phi}}$.

Proof: Substitution of (35) into (34) yields
$$
\mathbf{C}_{\mathcal{H}_{\varphi}(h)} = \int_{\Lambda}\int_{\Theta}\int_{\Theta}\int_{\Lambda} \tilde{\mathbf{A}}^{H}(\boldsymbol{\tau})\, h(\boldsymbol{\tau},\boldsymbol{\theta})\, K(\boldsymbol{\theta},\boldsymbol{\theta}')\, h^{H}(\boldsymbol{\tau}',\boldsymbol{\theta}')\, \tilde{\mathbf{A}}(\boldsymbol{\tau}')\, d\boldsymbol{\tau}'\, d\boldsymbol{\theta}'\, d\boldsymbol{\theta}\, d\boldsymbol{\tau}. \tag{138}
$$

Observing (27), one can notice that since $\nu(\mathbf{x},\boldsymbol{\theta}) \in L_2(\mathcal{X})$, $\forall\boldsymbol{\theta}\in\Theta$, the Cauchy-Schwarz inequality implies that $K(\boldsymbol{\theta},\boldsymbol{\theta}')$ is bounded on $\Theta\times\Theta$. Therefore, since $\tilde{\mathbf{A}}(\boldsymbol{\tau}) \in L_1^{P\times L}(\Lambda)$ and $h(\boldsymbol{\tau},\boldsymbol{\theta}) \in L_1^{P}(\Theta)$ for a.e. $\boldsymbol{\tau}\in\Lambda$, it can be shown using the Tonelli theorem [31] that the matrix function $\tilde{\mathbf{A}}^{H}(\boldsymbol{\tau})\, h(\boldsymbol{\tau},\boldsymbol{\theta})\, K(\boldsymbol{\theta},\boldsymbol{\theta}')\, h^{H}(\boldsymbol{\tau}',\boldsymbol{\theta}')\, \tilde{\mathbf{A}}(\boldsymbol{\tau}') \in L_1^{L\times L}(\Lambda\times\Theta\times\Lambda\times\Theta)$. Hence, by the Fubini theorem [31], the integration order in the r.h.s. of (138) can be interchanged and thus
$$
\mathbf{C}_{\mathcal{H}_{\varphi}(h)} = \int_{\Theta}\int_{\Theta}\left[\int_{\Lambda}\tilde{\mathbf{A}}^{H}(\boldsymbol{\tau})\, h(\boldsymbol{\tau},\boldsymbol{\theta})\, d\boldsymbol{\tau}\right] K(\boldsymbol{\theta},\boldsymbol{\theta}') \left[\int_{\Lambda} h^{H}(\boldsymbol{\tau}',\boldsymbol{\theta}')\, \tilde{\mathbf{A}}(\boldsymbol{\tau}')\, d\boldsymbol{\tau}'\right] d\boldsymbol{\theta}'\, d\boldsymbol{\theta}
= \int_{\Theta}\int_{\Theta} \tilde{\mathbf{q}}(\boldsymbol{\theta})\, K(\boldsymbol{\theta},\boldsymbol{\theta}')\, \tilde{\mathbf{q}}^{H}(\boldsymbol{\theta}')\, d\boldsymbol{\theta}'\, d\boldsymbol{\theta}, \tag{139}
$$
where
$$
\tilde{\mathbf{q}}(\boldsymbol{\theta}) = \int_{\Lambda}\tilde{\mathbf{A}}^{H}(\boldsymbol{\tau})\, h(\boldsymbol{\tau},\boldsymbol{\theta})\, d\boldsymbol{\tau}. \tag{140}
$$

Substitution of (35) into (36) yields
$$
\int_{\Theta}\int_{\Theta}\int_{\Lambda} h(\boldsymbol{\tau},\boldsymbol{\theta})\, K(\boldsymbol{\theta},\boldsymbol{\theta}')\, h^{H}(\boldsymbol{\tau}',\boldsymbol{\theta}')\, \tilde{\mathbf{A}}(\boldsymbol{\tau}')\, d\boldsymbol{\tau}'\, d\boldsymbol{\theta}'\, d\boldsymbol{\theta} = \int_{\Theta} h(\boldsymbol{\tau},\boldsymbol{\theta})\, \boldsymbol{\xi}^{T}(\boldsymbol{\theta})\, d\boldsymbol{\theta}. \tag{141}
$$

Again, since $K(\boldsymbol{\theta},\boldsymbol{\theta}')$ is bounded on $\Theta\times\Theta$, $\tilde{\mathbf{A}}(\boldsymbol{\tau}) \in L_1^{P\times L}(\Lambda)$ and $h(\boldsymbol{\tau},\boldsymbol{\theta}) \in L_1^{P}(\Theta)$ for a.e. $\boldsymbol{\tau}\in\Lambda$, it can be shown using the Tonelli theorem [31] that the matrix function $h(\boldsymbol{\tau},\boldsymbol{\theta})\, K(\boldsymbol{\theta},\boldsymbol{\theta}')\, h^{H}(\boldsymbol{\tau}',\boldsymbol{\theta}')\, \tilde{\mathbf{A}}(\boldsymbol{\tau}') \in L_1^{P\times L}(\Theta\times\Lambda\times\Theta)$ for a.e. $\boldsymbol{\tau}\in\Lambda$. Therefore, by the Fubini theorem [31], the integration order in the l.h.s. of (141) can be interchanged. Thus, using (140), (141) can be rewritten as
$$
\int_{\Theta} h(\boldsymbol{\tau},\boldsymbol{\theta}) \left[\int_{\Theta} K(\boldsymbol{\theta},\boldsymbol{\theta}')\, \tilde{\mathbf{q}}^{H}(\boldsymbol{\theta}')\, d\boldsymbol{\theta}'\right] d\boldsymbol{\theta} = \int_{\Theta} h(\boldsymbol{\tau},\boldsymbol{\theta})\, \boldsymbol{\xi}^{T}(\boldsymbol{\theta})\, d\boldsymbol{\theta}. \tag{142}
$$

By the definition of the integral transform in (29), (142) becomes
$$
T_h\!\left[\int_{\Theta} K(\boldsymbol{\theta},\boldsymbol{\theta}')\, \tilde{\mathbf{q}}^{H}(\boldsymbol{\theta}')\, d\boldsymbol{\theta}'\right](\boldsymbol{\tau}) = T_h\!\left[\boldsymbol{\xi}^{T}(\boldsymbol{\theta})\right](\boldsymbol{\tau}). \tag{143}
$$

Therefore, applying the inverse transform on both sides of (143) implies that $\tilde{\mathbf{q}}(\boldsymbol{\theta})$ is the solution of an integral equation, given by
$$
\int_{\Theta} K(\boldsymbol{\theta},\boldsymbol{\theta}')\, \tilde{\mathbf{q}}(\boldsymbol{\theta}')\, d\boldsymbol{\theta}' = \boldsymbol{\xi}(\boldsymbol{\theta}). \tag{144}
$$

Finally, according to (26), (28), (139) and (144), it is concluded that $\mathbf{C}_{\mathcal{H}_{\varphi}(h)} = \mathbf{C}_{\mathcal{H}_{\phi}}$.
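Proposition 4 has a simple finite-dimensional analogue: if the kernel $K(\boldsymbol{\theta},\boldsymbol{\theta}')$ is discretized into a Hermitian positive-definite matrix and the cross-term $\boldsymbol{\xi}$ into a vector, the quadratic form $\boldsymbol{\xi}^H \mathbf{K}^{-1}\boldsymbol{\xi}$ underlying the bound is invariant under any invertible linear map (the discrete stand-in for $T_h$). The NumPy sketch below checks this numerically; all variable names and sizes are illustrative choices, not quantities from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 6  # illustrative grid size

# Discrete stand-ins: K for the kernel K(theta, theta'), xi for the
# cross-correlation vector, H for the invertible transform T_h.
A = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))
K = A @ A.conj().T + N * np.eye(N)  # Hermitian positive-definite
xi = rng.standard_normal(N) + 1j * rng.standard_normal(N)
H = rng.standard_normal((N, N)) + 1j * rng.standard_normal((N, N))  # invertible a.s.

# Quadratic-form bound in the original domain ...
bound_theta = xi.conj() @ np.linalg.solve(K, xi)

# ... and after pushing K and xi through the invertible transform:
# (H xi)^H (H K H^H)^{-1} (H xi) = xi^H K^{-1} xi.
bound_tau = (H @ xi).conj() @ np.linalg.solve(H @ K @ H.conj().T, H @ xi)

print(np.allclose(bound_theta, bound_tau))
```

The cancellation $(H\boldsymbol{\xi})^H (H\mathbf{K}H^H)^{-1} (H\boldsymbol{\xi}) = \boldsymbol{\xi}^H H^H H^{-H} \mathbf{K}^{-1} H^{-1} H \boldsymbol{\xi}$ is the matrix counterpart of applying $T_h^{-1}$ in (143)-(144).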


APPENDIX H

In this appendix, the identity in (48) is proved. According to (2), (27) and (39),
$$
[\mathbf{K}_{\mathbf{v}_k}]_{m,n} = \int_{\Theta}\int_{\Theta} [\mathbf{v}_k(\boldsymbol{\theta})]_m \left[\int_{\mathcal{X}} \nu(\mathbf{x},\boldsymbol{\theta})\, \nu^{*}(\mathbf{x},\boldsymbol{\theta}')\, dP_{\boldsymbol{\theta}_t}(\mathbf{x})\right] [\mathbf{v}_k(\boldsymbol{\theta}')]_n^{*}\, d\boldsymbol{\theta}'\, d\boldsymbol{\theta}. \tag{145}
$$

Due to the fact that $\mathbf{v}_k(\boldsymbol{\theta}) \in L_1^{N}(\Theta)$ $\forall k\in\mathbb{N}$ and $\nu(\mathbf{x},\boldsymbol{\theta}) \in L_2(\mathcal{X})$ $\forall\boldsymbol{\theta}\in\Theta$, it can be shown, using the Tonelli theorem and the Cauchy-Schwarz inequality [31], that $[\mathbf{v}_k(\boldsymbol{\theta})]_m\, \nu(\mathbf{x},\boldsymbol{\theta})\, \nu^{*}(\mathbf{x},\boldsymbol{\theta}')\, [\mathbf{v}_k(\boldsymbol{\theta}')]_n^{*} \in L_1(\mathcal{X}\times\Theta\times\Theta)$ $\forall m,n = 0,\ldots,N-1$ and $\forall k\in\mathbb{N}$. Therefore, by the Fubini theorem [31], the integration order in the r.h.s. of (145) can be interchanged, and thus, according to (2), (145) can be rewritten in the following manner:
$$
[\mathbf{K}_{\mathbf{v}_k}]_{m,n} = \left\langle \int_{\Theta} [\mathbf{v}_k(\boldsymbol{\theta})]_m\, \nu(\mathbf{x},\boldsymbol{\theta})\, d\boldsymbol{\theta},\ \int_{\Theta} \nu(\mathbf{x},\boldsymbol{\theta}')\, [\mathbf{v}_k(\boldsymbol{\theta}')]_n\, d\boldsymbol{\theta}' \right\rangle_{L_2(\mathcal{X})}. \tag{146}
$$

Under the assumption that $\int_{\Theta} [\mathbf{v}_k(\boldsymbol{\theta})]_m\, \nu(\mathbf{x},\boldsymbol{\theta})\, d\boldsymbol{\theta} \to \gamma_m(\mathbf{x})$ for a.e. $\mathbf{x}\in\mathcal{X}$, as $k\to\infty$, it is concluded from Theorem 5 in Appendix I that $\gamma_m(\mathbf{x}) \in L_2(\mathcal{X})$ and $\int_{\Theta} [\mathbf{v}_k(\boldsymbol{\theta})]_m\, \nu(\mathbf{x},\boldsymbol{\theta})\, d\boldsymbol{\theta} \to \gamma_m(\mathbf{x})$ in the $L_2(\mathcal{X})$-norm, as $k\to\infty$. Therefore, according to Proposition 5.21 in [31], regarding the continuity of the inner-product operator, it is implied that
$$
\lim_{k\to\infty} [\mathbf{K}_{\mathbf{v}_k}]_{m,n}
= \left\langle \lim_{k\to\infty}\int_{\Theta} [\mathbf{v}_k(\boldsymbol{\theta})]_m\, \nu(\mathbf{x},\boldsymbol{\theta})\, d\boldsymbol{\theta},\ \lim_{k\to\infty}\int_{\Theta} \nu(\mathbf{x},\boldsymbol{\theta}')\, [\mathbf{v}_k(\boldsymbol{\theta}')]_n\, d\boldsymbol{\theta}' \right\rangle_{L_2(\mathcal{X})}
= \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t}\!\left[ \lim_{k\to\infty}\int_{\Theta} [\mathbf{v}_k(\boldsymbol{\theta})]_m\, \nu(\mathbf{x},\boldsymbol{\theta})\, d\boldsymbol{\theta} \cdot \left(\lim_{k\to\infty}\int_{\Theta} \nu(\mathbf{x},\boldsymbol{\theta}')\, [\mathbf{v}_k(\boldsymbol{\theta}')]_n\, d\boldsymbol{\theta}'\right)^{\!*} \right], \tag{147}
$$
where the last equality stems from (2). Therefore, by rewriting (147) in a matrix form, the equality in (48) is obtained.

APPENDIX I

Theorem 5: Let $\{\phi_k(\mathbf{x})\}$ denote a sequence in $\mathcal{H}_{\phi}$, such that $\phi_k(\mathbf{x}) \to \phi(\mathbf{x})$ for a.e. $\mathbf{x}\in\mathcal{X}$, as $k\to\infty$. Then $\phi(\mathbf{x}) \in \mathcal{H}_{\phi}$ and $\phi_k(\mathbf{x}) \to \phi(\mathbf{x})$ in the $L_2(\mathcal{X})$-norm, as $k\to\infty$.

Proof: First, we find a function in $L_2(\mathcal{X})$ which dominates each element of $\{\phi_k(\mathbf{x})\}$. According to (8),
$$
\phi_k(\mathbf{x}) = \int_{\Theta} q_k(\boldsymbol{\theta})\, \nu(\mathbf{x},\boldsymbol{\theta})\, d\boldsymbol{\theta}, \quad k\in\mathbb{N}, \tag{148}
$$
where $q_k(\boldsymbol{\theta}) \in L_1(\Theta)$ $\forall k\in\mathbb{N}$. Therefore,
$$
|\phi_k(\mathbf{x})| = \left|\int_{\Theta} q_k(\boldsymbol{\theta})\, \nu(\mathbf{x},\boldsymbol{\theta})\, d\boldsymbol{\theta}\right|
\le \int_{\Theta} |q_k(\boldsymbol{\theta})|\, \nu(\mathbf{x},\boldsymbol{\theta})\, d\boldsymbol{\theta}
\le \left[\int_{\Theta} |q_k(\boldsymbol{\theta})|\, d\boldsymbol{\theta}\right] \max_{\Theta}\{\nu(\mathbf{x},\boldsymbol{\theta})\}, \quad \forall\mathbf{x}\in\mathcal{X}, \tag{149}
$$
where the first inequality stems from the properties of the Lebesgue integral and from the fact that $\nu(\mathbf{x},\boldsymbol{\theta}) \in L^{+}(\mathcal{X}\times\Theta)$, and the second inequality stems from the fact that $\nu(\mathbf{x},\boldsymbol{\theta}) \le \max_{\Theta}\{\nu(\mathbf{x},\boldsymbol{\theta})\}$ $\forall\mathbf{x}\in\mathcal{X}$. Since $q_k(\boldsymbol{\theta}) \in L_1(\Theta)$ $\forall k\in\mathbb{N}$, there exists a positive constant $c$ such that $\int_{\Theta} |q_k(\boldsymbol{\theta})|\, d\boldsymbol{\theta} \le c$ $\forall k\in\mathbb{N}$. Hence, by (149),
$$
|\phi_k(\mathbf{x})| \le c\cdot\max_{\Theta}\{\nu(\mathbf{x},\boldsymbol{\theta})\}, \quad \forall\mathbf{x}\in\mathcal{X}. \tag{150}
$$
Since $\nu(\mathbf{x},\boldsymbol{\theta}) \in L_2(\mathcal{X})$ $\forall\boldsymbol{\theta}\in\Theta$, then $c\cdot\max_{\Theta}\nu(\mathbf{x},\boldsymbol{\theta}) \in L_2(\mathcal{X})$ as well.

Therefore, by the dominated convergence theorem in $L_p$ spaces, stated in Theorem 5.2.2 in [32], it is concluded that $\phi(\mathbf{x}) \in L_2(\mathcal{X})$ and $\phi_k(\mathbf{x}) \to \phi(\mathbf{x})$ in the $L_2(\mathcal{X})$-norm, as $k\to\infty$. Since $\mathcal{H}_{\phi}$ is closed, the limit of any convergent sequence in $\mathcal{H}_{\phi}$ is contained in $\mathcal{H}_{\phi}$. Thus, $\phi(\mathbf{x}) \in \mathcal{H}_{\phi}$.

APPENDIX J

Theorem 6 (Theorem 8.15 in [31]): Suppose $\psi(\mathbf{y}) \in L_1(\mathbb{R}^M)$, $\int_{\mathbb{R}^M} \psi(\mathbf{y})\, d\mathbf{y} = 1$, and $\{\psi_k(\mathbf{y})\} \triangleq \{k^M \psi(k\mathbf{y})\}$, $k = 1,2,\ldots$. If $\phi \in L_p(\mathbb{R}^M)$ $(1 \le p < \infty)$, then $\lim_{k\to\infty} (\psi_k(\mathbf{y}) * \phi(\mathbf{y}))(\mathbf{y}_0) = \phi(\mathbf{y}_0)$ for every $\mathbf{y}_0$ in the Lebesgue set of $\phi(\cdot)$ -- in particular, for almost every $\mathbf{y}_0 \in \mathbb{R}^M$, and for every $\mathbf{y}_0 \in \mathbb{R}^M$ at which $\phi(\cdot)$ is continuous. The proof can be found in [31].

APPENDIX K

In this appendix, it is shown via the following steps that the proposed bound in (101) can be written in the form presented in (102).

1) Calculation of $\mathbf{K}_{\mathrm{CRFB}}$: According to (99) and (100),
$$
\mathbf{K}_{\mathrm{CRFB}} \triangleq
\begin{bmatrix}
\mathbf{I}_{\mathrm{FIM}} & \mathbf{D}\mathbf{W}^{H} \\
\mathbf{W}\mathbf{D}^{T} & \mathbf{W}\boldsymbol{\Psi}\mathbf{W}^{H}
\end{bmatrix}, \tag{151}
$$
where $\mathbf{I}_{\mathrm{FIM}}$ is defined in (59),
$$
\mathbf{D} \triangleq \left[\mathbf{d}(\boldsymbol{\theta}_0), \ldots, \mathbf{d}(\boldsymbol{\theta}_{N-1})\right], \tag{152}
$$
$$
\mathbf{d}^{T}(\boldsymbol{\theta}_n) = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t}\!\left[ \left.\frac{\partial \log f(\mathbf{x};\boldsymbol{\theta})}{\partial\boldsymbol{\theta}}\right|_{\boldsymbol{\theta}=\boldsymbol{\theta}_t} \nu(\mathbf{x},\boldsymbol{\theta}_n) \right]
= \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_n}\!\left[ \left.\frac{\partial \log f(\mathbf{x};\boldsymbol{\theta})}{\partial\boldsymbol{\theta}}\right|_{\boldsymbol{\theta}=\boldsymbol{\theta}_t} \right]
= -\left.\frac{\partial}{\partial\boldsymbol{\theta}} \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_n}\!\left[ \log\frac{f(\mathbf{x};\boldsymbol{\theta}_n)}{f(\mathbf{x};\boldsymbol{\theta})} \right]\right|_{\boldsymbol{\theta}=\boldsymbol{\theta}_t}
= -\left.\frac{\partial\, \mathrm{KLD}\!\left[f(\mathbf{x};\boldsymbol{\theta}_n)\,\|\,f(\mathbf{x};\boldsymbol{\theta})\right]}{\partial\boldsymbol{\theta}}\right|_{\boldsymbol{\theta}=\boldsymbol{\theta}_t}, \tag{153}
$$
the term $\mathrm{KLD}\!\left[f(\mathbf{x};\boldsymbol{\theta}')\,\|\,f(\mathbf{x};\boldsymbol{\theta})\right]$ denotes the Kullback-Leibler divergence [34] of $f(\mathbf{x};\boldsymbol{\theta})$ from $f(\mathbf{x};\boldsymbol{\theta}')$, and according to (27) and (107) the entries of $\boldsymbol{\Psi}$ are given by
$$
[\boldsymbol{\Psi}]_{m,n} = \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t}\!\left[\nu(\mathbf{x},\boldsymbol{\theta}_m)\, \nu^{*}(\mathbf{x},\boldsymbol{\theta}_n)\right] \triangleq K(\boldsymbol{\theta}_m,\boldsymbol{\theta}_n), \quad m,n = 0,\ldots,N-1. \tag{154}
$$
We note that the third equality in (153) holds under the condition that the first-order derivative of $f(\mathbf{x};\boldsymbol{\theta})$ w.r.t. each entry of $\boldsymbol{\theta}$ exists and defines a continuous function on $\Theta$.

2) Calculation of $\mathbf{K}_{\mathrm{CRFB}}^{-1}$: Assuming that $\mathbf{K}_{\mathrm{CRFB}}$ is positive-definite, then according to (151) and formula (7.7.5) in [35] it is implied that
$$
\left[\mathbf{K}_{\mathrm{CRFB}}^{-1}\right]_{1,1}
= \left( \mathbf{I}_{\mathrm{FIM}} - \mathbf{D}\mathbf{W}^{H} \left(\mathbf{W}\boldsymbol{\Psi}\mathbf{W}^{H}\right)^{-1} \mathbf{W}\mathbf{D}^{T} \right)^{-1}
= \mathbf{I}_{\mathrm{FIM}}^{-1} + \mathbf{I}_{\mathrm{FIM}}^{-1} \mathbf{D}\mathbf{W}^{H} \left( \mathbf{W}\!\left(\boldsymbol{\Psi} - \mathbf{D}^{T}\mathbf{I}_{\mathrm{FIM}}^{-1}\mathbf{D}\right)\mathbf{W}^{H} \right)^{-1} \mathbf{W}\mathbf{D}^{T} \mathbf{I}_{\mathrm{FIM}}^{-1}, \tag{155}
$$
where the last equality in (155) stems from the Sherman-Morrison-Woodbury formula [36],
$$
\left[\mathbf{K}_{\mathrm{CRFB}}^{-1}\right]_{1,2} = -\mathbf{I}_{\mathrm{FIM}}^{-1} \mathbf{D}\mathbf{W}^{H} \left( \mathbf{W}\!\left(\boldsymbol{\Psi} - \mathbf{D}^{T}\mathbf{I}_{\mathrm{FIM}}^{-1}\mathbf{D}\right)\mathbf{W}^{H} \right)^{-1}, \tag{156}
$$
$$
\left[\mathbf{K}_{\mathrm{CRFB}}^{-1}\right]_{2,1} = \left(\left[\mathbf{K}_{\mathrm{CRFB}}^{-1}\right]_{1,2}\right)^{H}, \tag{157}
$$
$$
\left[\mathbf{K}_{\mathrm{CRFB}}^{-1}\right]_{2,2} = \left( \mathbf{W}\!\left(\boldsymbol{\Psi} - \mathbf{D}^{T}\mathbf{I}_{\mathrm{FIM}}^{-1}\mathbf{D}\right)\mathbf{W}^{H} \right)^{-1}. \tag{158}
$$
We note that by Theorem 7.7.6 in [35], (158) is positive-definite.

3) Calculation of the bound: Substitution of (98) and (155)-(158) into (101) yields
$$
\mathbf{C}_{\mathrm{CRFB}}^{(J,N)} = \mathbf{Z}^{(1)}(\boldsymbol{\theta}_t)\, \mathbf{I}_{\mathrm{FIM}}^{-1}\, \mathbf{Z}^{(1)T}(\boldsymbol{\theta}_t) + \mathbf{Q}\mathbf{W}^{H} \left(\mathbf{W}\mathbf{R}\mathbf{W}^{H}\right)^{-1} \mathbf{W}\mathbf{Q}^{T}, \tag{159}
$$
where $\mathbf{R} \triangleq \boldsymbol{\Psi} - \mathbf{D}^{T}\mathbf{I}_{\mathrm{FIM}}^{-1}\mathbf{D}$ and $\mathbf{Q} \triangleq \mathbf{Z}^{(1)}(\boldsymbol{\theta}_t)\, \mathbf{I}_{\mathrm{FIM}}^{-1}\mathbf{D} - \boldsymbol{\Phi}$.
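The block-inversion identities (155)-(158) can be verified numerically. The NumPy sketch below builds a matrix with the block structure of (151) from random real ingredients (all sizes and the random construction are illustrative assumptions, not the paper's statistical quantities; real matrices are used so that transpose and Hermitian transpose coincide) and checks that the assembled blocks reproduce the full inverse.

```python
import numpy as np

rng = np.random.default_rng(1)
P, N, J = 3, 6, 4  # illustrative sizes: I_FIM is PxP, Psi is NxN, W is JxN

# Build [[I_FIM, D], [D^T, Psi]] positive-definite, then form K_CRFB as in (151).
R0 = rng.standard_normal((P + N, P + N))
M = R0 @ R0.T + (P + N) * np.eye(P + N)
I_FIM, D, Psi = M[:P, :P], M[:P, P:], M[P:, P:]
W = rng.standard_normal((J, N))  # full row rank with probability 1

K_CRFB = np.block([[I_FIM, D @ W.T],
                   [W @ D.T, W @ Psi @ W.T]])

# Assemble the inverse block-by-block via (155)-(158).
I_inv = np.linalg.inv(I_FIM)
S = W @ (Psi - D.T @ I_inv @ D) @ W.T  # Schur complement of I_FIM in K_CRFB
S_inv = np.linalg.inv(S)
K11 = I_inv + I_inv @ D @ W.T @ S_inv @ W @ D.T @ I_inv   # (155)
K12 = -I_inv @ D @ W.T @ S_inv                            # (156)
K_inv_blocks = np.block([[K11, K12],
                         [K12.T, S_inv]])                 # (157), (158)

print(np.allclose(K_inv_blocks, np.linalg.inv(K_CRFB)))
```

The check exercises only the block-inversion algebra; the positive-definiteness of `K_CRFB` follows here from the positive-definiteness of `M` and the full row rank of `W`, matching the assumption made before (155).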


APPENDIX L

Theorem 7: Let $\mathcal{H}_{\varphi}$ and $\mathcal{H}_{\varphi'}$ denote closed subspaces of $\mathcal{H}_{\phi}$. If $\mathcal{H}_{\varphi} \supset \mathcal{H}_{\varphi'}$, then $\mathbf{C}_{\mathcal{H}_{\varphi}} \succeq \mathbf{C}_{\mathcal{H}_{\varphi'}}$.

Proof: Let
$$
\mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \triangleq \mathbf{e}(\hat{\mathbf{g}}(\mathbf{x})) - \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \tag{160}
$$
and
$$
\mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) \triangleq \mathbf{e}(\hat{\mathbf{g}}(\mathbf{x})) - \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) \tag{161}
$$
denote the projection errors for $\mathcal{H}_{\varphi}$ and $\mathcal{H}_{\varphi'}$, respectively, where $\mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right)$ is defined in (14). According to (160) and (161),
$$
\mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) = \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) + \mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) - \mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right). \tag{162}
$$
By the Hilbert projection theorem, stated in Appendix B, it is implied that
$$
\left[\mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right)\right]_l \perp \mathcal{H}_{\varphi}, \quad \forall l = 0,\ldots,L-1, \tag{163}
$$
and
$$
\left[\mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right)\right]_l \perp \mathcal{H}_{\varphi'}, \quad \forall l = 0,\ldots,L-1. \tag{164}
$$
Therefore, since $\mathcal{H}_{\varphi} \supset \mathcal{H}_{\varphi'}$, it is implied by (163) that
$$
\left[\mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right)\right]_l \perp \mathcal{H}_{\varphi'}, \quad \forall l = 0,\ldots,L-1. \tag{165}
$$
Thus, due to the fact that $\left[\mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right)\right]_l \in \mathcal{H}_{\varphi'}$, $\forall l = 0,\ldots,L-1$, then by (164) and (165),
$$
\mathbf{r}^{H}\!\left( \mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) - \mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \right) \perp \mathbf{r}^{H} \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right), \quad \forall\mathbf{r}\in\mathbb{C}^{L}. \tag{166}
$$
Hence, according to (162) and the Pythagorean theorem [31],
$$
\left\| \mathbf{r}^{H} \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \right\|_{L_2(\mathcal{X})}^{2}
= \left\| \mathbf{r}^{H} \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) \right\|_{L_2(\mathcal{X})}^{2}
+ \left\| \mathbf{r}^{H}\!\left( \mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) - \mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \right) \right\|_{L_2(\mathcal{X})}^{2}. \tag{167}
$$
Therefore, due to the fact that $0 \le \left\| \mathbf{r}^{H}\!\left( \mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) - \mathbf{u}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \right) \right\|_{L_2(\mathcal{X})}^{2} < \infty$, then
$$
\left\| \mathbf{r}^{H} \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \right\|_{L_2(\mathcal{X})}^{2} \ge \left\| \mathbf{r}^{H} \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) \right\|_{L_2(\mathcal{X})}^{2}, \quad \forall\mathbf{r}\in\mathbb{C}^{L}. \tag{168}
$$
Hence, according to (2) and the definition of $\mathbf{C}_{\mathcal{H}_{\varphi}}$ in (15), $\forall\mathbf{r}\in\mathbb{C}^{L}$,
$$
\mathbf{r}^{H} \mathbf{C}_{\mathcal{H}_{\varphi}} \mathbf{r} = \mathbf{r}^{H} \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t}\!\left[ \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \mathbf{p}_J^{H}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi}\right) \right] \mathbf{r}
\ge \mathbf{r}^{H} \mathrm{E}_{\mathbf{x};\boldsymbol{\theta}_t}\!\left[ \mathbf{p}_J\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) \mathbf{p}_J^{H}\!\left(\mathbf{e}(\hat{\mathbf{g}}(\mathbf{x}))\,|\,\mathcal{H}_{\varphi'}\right) \right] \mathbf{r}
= \mathbf{r}^{H} \mathbf{C}_{\mathcal{H}_{\varphi'}} \mathbf{r}. \tag{169}
$$
Thus, since $\mathbf{C}_{\mathcal{H}_{\varphi}}$ and $\mathbf{C}_{\mathcal{H}_{\varphi'}}$ are Hermitian matrices, it is implied by (169) that $\mathbf{C}_{\mathcal{H}_{\varphi}} \succeq \mathbf{C}_{\mathcal{H}_{\varphi'}}$.
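The mechanism of Theorem 7 has a direct finite-dimensional analogue: projecting a vector onto nested subspaces of $\mathbb{R}^n$ can only gain energy as the subspace grows, and the energy gap obeys the Pythagorean decomposition mirrored in (167)-(168). A minimal NumPy sketch, in which the ambient dimension and the spanning vectors are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8

# Span the smaller subspace H' by 2 vectors and the larger subspace H by
# those 2 vectors plus 2 more, so that H' is contained in H (as in Theorem 7).
B_small = rng.standard_normal((n, 2))
B_big = np.hstack([B_small, rng.standard_normal((n, 2))])

def proj(B):
    """Orthogonal projector onto the column space of B."""
    return B @ np.linalg.pinv(B)

e = rng.standard_normal(n)  # stand-in for the estimation-error vector
p_big = proj(B_big) @ e
p_small = proj(B_small) @ e

# Larger subspace => the projection captures at least as much energy, cf. (168).
print(p_big @ p_big >= p_small @ p_small)
```

Because the smaller column space is contained in the larger one, `p_big - p_small` is orthogonal to `p_small`, so the two projection energies differ exactly by the squared norm of their difference, the finite-dimensional counterpart of (167).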


REFERENCES

[1] C. R. Rao, "Information and the accuracy attainable in the estimation of statistical parameters," Bull. Calcutta Math. Soc., vol. 37, pp. 81-91, 1945.
[2] H. Cramér, "A contribution to the theory of statistical estimation," Skand. Akt. Tidskr., vol. 29, pp. 85-94, 1946.
[3] A. Bhattacharyya, "On some analogues of the amount of information and their use in statistical estimation," Sankhyā, vol. 8, no. 1, pp. 1-14, 201-218, 1946.
[4] E. W. Barankin, "Locally best unbiased estimates," Ann. Math. Stat., vol. 20, pp. 477-501, 1949.
[5] D. G. Chapman and H. Robbins, "Minimum variance estimation without regularity assumptions," Ann. Math. Stat., vol. 22, pp. 581-586, 1951.
[6] R. J. McAulay and L. P. Seidman, "A useful form of the Barankin lower bound and its application to PPM threshold analysis," IEEE Trans. on Information Theory, vol. 15, pp. 273-279, 1969.
[7] R. J. McAulay and E. M. Hofstetter, "Barankin bound on parameter estimation," IEEE Trans. on Information Theory, vol. 17, no. 6, pp. 669-676, 1971.
[8] J. S. Abel, "A bound on mean-square-estimate error," IEEE Trans. on Information Theory, vol. 39, pp. 1675-1680, 1993.
[9] A. Quinlan, E. Chaumette, and P. Larzabal, "A direct method to generate approximations of the Barankin bound," Proc. of the ICASSP 2006, vol. 3, pp. 808-811, 2006.
[10] J. Tabrikian and J. L. Krolik, "Barankin bound for source localization in shallow water," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Munich, Apr. 1997.
[11] J. Tabrikian and J. Krolik, "Barankin bounds for source localization in an uncertain ocean environment," IEEE Trans. Signal Processing, vol. 47, pp. 2917-2927, Nov. 1999.
[12] J. Tabrikian, "Barankin bounds for target localization by MIMO radars," Fourth IEEE Workshop on Sensor Array and Multichannel Process., pp. 278-281, July 2006.
[13] H. Lai and K. Bell, "Cramér-Rao lower bound for DOA estimation using vector and higher-order sensor arrays," Forty-First Asilomar Conference on Signals, Systems and Computers, pp. 1262-1266, Nov. 2007.
[14] A. Pinkus and J. Tabrikian, "Barankin bounds for range and Doppler estimation using orthogonal signal transmission," IEEE Conf. on Radar, pp. 94-99, Apr. 2006.
[15] A. Zeira and P. M. Schultheiss, "Realizable lower bounds for time delay estimation," IEEE Trans. Signal Processing, vol. 41, no. 11, pp. 3102-3113, Nov. 1993.
[16] A. B. Baggeroer, "Barankin bound on the variance of estimates of Gaussian random process," Technical Report, MIT Lincoln Laboratory, Lexington, Massachusetts, Jan. 1969.
[17] T. L. Marzetta, "Computing the Barankin bound by solving an unconstrained quadratic optimization problem," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, Munich, Apr. 1997.
[18] A. Ferrari and J. Y. Tourneret, "Barankin lower bound for change points in independent sequences," in Proc. of IEEE Statistical Signal Processing Workshop, St. Louis, Missouri, Sep. 2003.
[19] I. Reuven and H. Messer, "The use of the Barankin bound for determining the threshold SNR in estimating the bearing of a source in the presence of another," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, vol. 3, Detroit, USA, May 1995, pp. 1645-1648.
[20] I. Reuven and H. Messer, "On the effect of nuisance parameters on the threshold SNR value of the Barankin bound," IEEE Trans. Signal Processing, vol. 47, no. 2, pp. 523-527, Feb. 1999.
[21] L. Knockaert, "The Barankin bound and threshold behavior in frequency estimation," IEEE Trans. Signal Processing, vol. 45, pp. 2398-2401, Sep. 1997.
[22] L. Atallah, J. P. Barbot, and P. Larzabal, "SNR threshold indicator in data aided frequency synchronization," IEEE Signal Processing Letters, vol. 11, pp. 652-654, Aug. 2004.
[23] L. Atallah, J. P. Barbot, and P. Larzabal, "From Chapman-Robbins bound towards Barankin bound in threshold behaviour prediction," Electronics Letters, vol. 40, pp. 279-280, Feb. 2004.
[24] A. Renaux, L. N. Atallah, P. Forster, and P. Larzabal, "A useful form of the Abel bound and its application to estimator threshold prediction," IEEE Trans. Signal Processing, vol. 55, no. 5, pp. 2365-2369, May 2007.
[25] A. Zeira and P. M. Schultheiss, "Realizable lower bounds for time delay estimation. 2. Threshold phenomena," IEEE Trans. Signal Processing, vol. 42, no. 5, pp. 1001-1007, May 1994.
[26] W. Xu, "Performance bounds on matched-field methods for source localization and estimation of ocean environmental parameters," Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA, June 2001.
[27] P. Forster and P. Larzabal, "On lower bounds for deterministic parameter estimation," Proc. of the ICASSP 2002, vol. 2, pp. 808-811.
[28] E. Parzen, Time Series Analysis Papers, Holden-Day, pp. 251-382, 1967.
[29] J. Albuquerque, "The Barankin bound: A geometric interpretation," IEEE Trans. on Information Theory, vol. 19, pp. 559-561, 1973.
[30] K. Todros and J. Tabrikian, "A new lower bound on the mean-square error of unbiased estimators," Proc. of ICASSP 2008, pp. 3913-3916.
[31] G. B. Folland, Real Analysis, John Wiley and Sons, pp. 65, 185, 235, 1984.
[32] M. Simonnet, Measures and Probabilities, Springer Verlag, p. 100, 1996.
[33] T. X. He, Dimensionality Reducing Expansion of Multivariate Integration, Birkhäuser, Boston, p. 6, 2001.
[34] S. Kullback and R. A. Leibler, "On information and sufficiency," Ann. Math. Stat., vol. 22, pp. 79-86, 1951.
[35] R. A. Horn and C. R. Johnson, Matrix Analysis, Cambridge University Press, p. 472, 1985.
[36] G. H. Golub and C. F. Van Loan, Matrix Computations, 3rd ed., Baltimore, MD: Johns Hopkins, p. 50, 1996.
[37] K. Todros and J. Tabrikian, "On order-relations between lower bounds on the mean-square-error of unbiased estimators," Technical report: http://www.ee.bgu.ac.il/~spl/publications/technical reports.

January 14, 2010

DRAFT