Ratio tests for variance change in nonparametric regression

Statistics, 2014 Vol. 48, No. 1, 1–16, http://dx.doi.org/10.1080/02331888.2012.708030

Zhanshou Chen^{a,b,*} and Zheng Tian^c

^a Department of Mathematics, Qinghai Normal University, Xining, Qinghai 810008, People's Republic of China; ^b Key Laboratory of Tibetan Information Processing of Ministry of Education, Qinghai Normal University, Xining 810008, People's Republic of China; ^c Department of Applied Mathematics, Northwestern Polytechnical University, Xi'an, Shaanxi 710129, People's Republic of China

(Received 7 July 2010; final version received 27 June 2012)

In this paper, we propose a ratio test to detect a variance change in nonparametric regression models under both the fixed and the random design cases. The asymptotic validity of the detection procedure is derived, and its finite sample performance is evaluated via a simulation study. Compared with the existing cumulative sums (CUSUM) test in the literature, our ratio test does not need to estimate any scale parameter. In particular, the ratio test performs significantly better than existing tests when the variance shifts from a large value to a small value. Finally, we illustrate our method in practice by analysing a set of China stock data and a set of light detection and ranging (LIDAR) data.

Keywords: ratio test; nonparametric regression; variance change

AMS Subject Classifications: 62F03; 62G08

1. Introduction

Testing for structural change in time series and other models remains an active research area. The presence of a variance change can easily mislead conventional time series analysis procedures, resulting in erroneous conclusions. Thus, it is extremely important to detect and locate such change points. Since the seminal work of Hsu [1] on testing for a variance change in time series, a large number of papers have considered this problem, including [2–8] among many others. The present paper concentrates on the following nonparametric regression model:
\[
Y_t = g(X_t) + \varepsilon_t, \quad t = 1, \ldots, n, \tag{1}
\]
where $n$ denotes the sample size, $g(\cdot)$ is a regression function, and $\{\varepsilon_t\}$ is an innovation sequence with zero mean satisfying some additional assumptions specified below. Since inference based on a nonparametric model is robust against misspecification of the underlying regression function, the nonparametric model effectively avoids the misspecification problem found in parametric approaches, and it has become one of the most popular nonlinear regression models. Many authors have studied this model. For example, Hall and Hart [9], Fan and Gijbels [10],

*Corresponding author. Email: [email protected] © 2012 Taylor & Francis


and Lee et al. [11] considered the estimation of the regression function $g(\cdot)$. Müller [12] and Delgado and Hidalgo [13] used one-sided kernel methods to detect and estimate change points in the nonparametric regression function. Horváth and Kokoszka [14] and Prieur [15] did similar studies by local polynomial fitting. Gijbels and Goderniaux [16] and Su and Xiao [17] applied bootstrap methods to detect change points in the regression function. Wang [18], Raimondo and Tajvidi [19], and Chen et al. [20] tested for change points in the regression function by wavelets. Lee et al. [6] proposed a cumulative sums (CUSUM) of squares detection procedure to test for a variance change in model (1) under the fixed design case, namely, $X_t = x_t = t/n$, $t = 1, \ldots, n$. Their procedure is based on the test statistic
\[
\Lambda_n^{*} = \max_{[nh]+1 \le k \le n-[nh]} \frac{1}{\hat\varphi_n} \left| \hat S_k - \frac{k - [nh]}{n - 2[nh]} \hat S_{n-[nh]} \right|,
\]
where $\hat S_n$ is the estimator of $S_n = \sum_{t=1}^{n} \varepsilon_t^2$ and $\hat\varphi_n^2 = n\{\hat\gamma(0) + 2\sum_{k=1}^{l_n} \hat\gamma(k)\}$ is the estimator of $\varphi^2 = \mathrm{Var}(S_n)$, in which $\hat\gamma(k)$ is the estimator of $\gamma(k) = E(\varepsilon_1^2 - \sigma^2)(\varepsilon_{1+k}^2 - \sigma^2)$ with $\sigma^2 = E\varepsilon_1^2$. Large values of $\Lambda_n^{*}$ indicate the existence of a variance change. Lee et al. [6] showed that the limiting distribution of $\Lambda_n^{*}$ is the supremum of a Brownian bridge when $\hat\varphi \to_p \varphi$. However, the assertion $\hat\varphi \to_p \varphi$ may not hold (especially under the alternative hypothesis), and the testing results depend on the choice of the bandwidth $l_n$ in practice. In particular, this estimation dramatically influences the testing power, because different change-point locations, as well as different directions of change (namely, whether the variance shifts from a large value to a small value or vice versa), give different estimates. For the random design case, a similar problem exists [21,22]. In order to avoid estimating these nuisance scale parameters, we propose a ratio test. Our new test is a modified version of the ratio test suggested by Horváth et al. [23] for the mean change-point problem.
The new ratio test is based on the squares of kernel-type regression residuals. Extensive simulations in Section 4 indicate that the new ratio test performs well. A more attractive finding is that our ratio test performs significantly better than the available tests in the literature when the variance shifts from a large value to a small value. The rest of the paper is organized as follows. In Section 2, we consider the detection problem under the fixed design case. The random design case is studied in Section 3. Section 4 provides the simulation results and illustrates the method using a set of stock closing price data from the China stock market and a set of light detection and ranging (LIDAR) data. Section 5 concludes the paper. All proofs of the main results are gathered in Section 6.

2. The fixed design case

In this section, we develop the ratio test procedure for a variance change in model (1) under the fixed design case. Our goal is to test the following hypothesis:
\[
H_0: E\varepsilon_t^2 = \sigma_0^2 \quad \text{for all } t = 1, \ldots, n,
\]
versus
\[
H_1: \text{there is } 1 < k^* < n \text{ such that } E\varepsilon_1^2 = \cdots = E\varepsilon_{k^*}^2 = \sigma_0^2 \neq E\varepsilon_{k^*+1}^2 = \cdots = E\varepsilon_n^2 = \sigma_1^2.
\]
The parameters $\sigma_0^2$, $\sigma_1^2$, and $k^*$ are unknown. Let $K(\cdot)$ be a kernel function and $h = h_n$ be a bandwidth. Given observations $Y_1, \ldots, Y_n$, we can estimate the regression function $g(\cdot)$ by
\[
g_n(x) = \frac{1}{n} \sum_{t=1}^{n} Y_t K_h(x - x_t), \quad 0 \le x \le 1,
\]
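For readers who wish to experiment, the estimator $g_n$ can be coded in a few lines. The sketch below is our own Python illustration, not the authors' code; the Epanechnikov kernel (later used in Section 4) is a concrete choice of $K$, and the function names are ours.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 * (1 - u^2) on [-1, 1], zero outside."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

def g_n(x, y, h):
    """Kernel estimator g_n(x) = (1/n) * sum_t Y_t K_h(x - x_t), with
    K_h(.) = K(./h)/h, on the fixed design x_t = t/n."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.arange(1, n + 1) / n
    x = np.atleast_1d(np.asarray(x, dtype=float))
    return np.array([np.mean(y * epanechnikov((xi - design) / h) / h) for xi in x])

def residuals(y, h):
    """Estimation residuals eps_hat_t = Y_t - g_n(x_t)."""
    n = len(y)
    design = np.arange(1, n + 1) / n
    return np.asarray(y, dtype=float) - g_n(design, y, h)
```

Near the interval endpoints the estimate is biased because part of the kernel mass falls outside $[0, 1]$; this is the boundary effect mentioned in the simulation section.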


where $K_h(\cdot) = K(\cdot/h)/h$, and the estimation residuals are
\[
\hat\varepsilon_t = Y_t - g_n(x_t), \quad t = 1, \ldots, n.
\]
We define the following two test statistics:
\[
T_{n,1} = \max_{n\delta \le k \le n - n\delta} \frac{\max_{1 \le i \le k} \bigl| \sum_{j=1}^{i} (\hat\varepsilon_j^2 - \bar\varepsilon_k^2) \bigr|}{\max_{k \le i \le n} \bigl| \sum_{j=i}^{n} (\hat\varepsilon_j^2 - \tilde\varepsilon_k^2) \bigr|}, \tag{2}
\]
\[
T_{n,2} = \max_{n\delta \le k \le n - n\delta} \frac{\max_{1 \le i \le k} \sum_{j=1}^{i} (\hat\varepsilon_j^2 - \bar\varepsilon_k^2) - \min_{1 \le i \le k} \sum_{j=1}^{i} (\hat\varepsilon_j^2 - \bar\varepsilon_k^2)}{\max_{k \le i \le n} \sum_{j=i}^{n} (\hat\varepsilon_j^2 - \tilde\varepsilon_k^2) - \min_{k \le i \le n} \sum_{j=i}^{n} (\hat\varepsilon_j^2 - \tilde\varepsilon_k^2)}, \tag{3}
\]
where $0 < \delta < \tfrac{1}{2}$, $\bar\varepsilon_k^2 = (1/k)\sum_{t=1}^{k} \hat\varepsilon_t^2$, and $\tilde\varepsilon_k^2 = (1/(n-k))\sum_{t=k+1}^{n} \hat\varepsilon_t^2$. To derive the limiting distributions of the statistics (2) and (3), we adopt the following assumptions.

(A1) $\{\varepsilon_t\}$ is a geometrically strong mixing sequence with strong mixing coefficients $\alpha_k$ satisfying $\alpha_k \le C e^{-\rho k}$ for some constants $C > 0$ and $\rho > 0$, and $E|\varepsilon_1^2 - \sigma_0^2|^r < \infty$ for some $r > 2$.
(A2) The regression function $g(\cdot)$ satisfies the Lipschitz condition, that is,
\[
|g(x) - g(y)| \le D_1 |x - y|, \quad 0 \le x, y \le 1,
\]
for some constant $0 < D_1 < \infty$.
(A3) The kernel function $K(\cdot)$ vanishes outside $[-1, 1]$ and satisfies the Lipschitz condition on $[-1, 1]$, that is,
\[
|K(x) - K(y)| \le D_2 |x - y|, \quad -1 \le x, y \le 1,
\]

for some constant $0 < D_2 < \infty$, and $K(\cdot)$ satisfies $\int K(x)\,dx = 1$.
(A4) The bandwidth $h = h_n$ satisfies $nh_n^2 \to \infty$ and $nh_n^4 \to 0$ as $n \to \infty$.

All the above assumptions are similar to those of Lee et al. [6]. Our first result gives the asymptotic distributions of $T_{n,1}$ and $T_{n,2}$ under the null hypothesis $H_0$.

Theorem 2.1 If Assumptions (A1)–(A4) hold, then under the null hypothesis $H_0$,
\[
T_{n,1} \xrightarrow{d} \sup_{\delta \le t \le 1-\delta} \frac{\eta_{1,1}(t)}{\eta_{1,2}(t)}, \tag{4}
\]
\[
T_{n,2} \xrightarrow{d} \sup_{\delta \le t \le 1-\delta} \frac{\eta_{2,1}(t)}{\eta_{2,2}(t)}, \tag{5}
\]
where
\[
\eta_{1,1}(t) = \sup_{0 \le s \le t} \left| W(s) - \frac{s}{t} W(t) \right|, \qquad
\eta_{1,2}(t) = \sup_{t \le s \le 1} \left| B(s) - \frac{1-s}{1-t} B(t) \right|,
\]
\[
\eta_{2,1}(t) = \sup_{0 \le s \le t} \left[ W(s) - \frac{s}{t} W(t) \right] - \inf_{0 \le s \le t} \left[ W(s) - \frac{s}{t} W(t) \right],
\]
\[
\eta_{2,2}(t) = \sup_{t \le s \le 1} \left[ B(s) - \frac{1-s}{1-t} B(t) \right] - \inf_{t \le s \le 1} \left[ B(s) - \frac{1-s}{1-t} B(t) \right],
\]
in which $W(t)$, $0 \le t \le 1$, denotes a Wiener process and $B(t) = W(1) - W(t)$.
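As a concrete illustration of how the ratio statistics are computed in practice, the following sketch implements equations (2) and (3) directly from the squared residuals. It is a minimal Python rendering under our own naming, not the authors' code; the boundary index $i = k$ of the denominator sums is omitted for brevity.

```python
import numpy as np

def ratio_statistics(eps2, delta=0.2):
    """T_{n,1} and T_{n,2} of equations (2) and (3), computed from the
    squared residuals eps2 = (eps_hat_1^2, ..., eps_hat_n^2).

    For each candidate k, the numerator uses a forward CUSUM of the first
    k squared residuals centred at their mean; the denominator uses a
    backward CUSUM of the last n - k squared residuals centred at theirs.
    """
    eps2 = np.asarray(eps2, dtype=float)
    n = len(eps2)
    lo, hi = int(np.ceil(n * delta)), int(np.floor(n * (1 - delta)))
    t1 = t2 = 0.0
    for k in range(lo, hi + 1):
        head = eps2[:k] - eps2[:k].mean()   # eps_j^2 - bar(eps)_k^2
        tail = eps2[k:] - eps2[k:].mean()   # eps_j^2 - tilde(eps)_k^2
        c1 = np.cumsum(head)                # partial sums over j = 1..i
        c2 = np.cumsum(tail[::-1])          # partial sums over j = i..n
        t1 = max(t1, np.abs(c1).max() / np.abs(c2).max())
        t2 = max(t2, (c1.max() - c1.min()) / (c2.max() - c2.min()))
    return t1, t2
```

Large values of either statistic indicate a variance change; Theorem 2.1 supplies the null limiting laws.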


If a variance change occurs at $k^* = [n\theta]$ with some $0 < \theta < 1$, let $\Delta_n = \sigma_0^2 - \sigma_1^2$ be the change size. The following theorem gives the consistency of our ratio test procedures.

Theorem 2.2 If $n^{1/2}|\Delta_n| \to \infty$ and $\delta < \theta < 1 - \delta$ hold, then under Assumptions (A1)–(A4) and the alternative hypothesis $H_1$, we have
\[
T_{n,1} \xrightarrow{p} \infty \quad \text{and} \quad T_{n,2} \xrightarrow{p} \infty.
\]
According to Theorems 2.1 and 2.2, we can reject the null hypothesis $H_0$ for large values of the statistics $T_{n,1}$ and $T_{n,2}$. However, extensive preliminary simulations indicate that $T_{n,1}$ and $T_{n,2}$ are very sensitive to a decreased variance change (namely, $\sigma_0 > \sigma_1$) but insensitive (especially for small sample sizes) to an increased variance change (namely, $\sigma_0 < \sigma_1$); the reciprocal statistics $T_{n,1}^{-1}$ and $T_{n,2}^{-1}$ behave in the opposite way. Therefore, we apply the statistics
\[
\tilde T_{n,i} = \max\{T_{n,i}, T_{n,i}^{-1}\}, \quad i = 1, 2,
\]
to test $H_0$ against $H_1$. Following the proofs of Theorems 2.1 and 2.2, one can easily verify that under $H_0$,
\[
\tilde T_{n,i} \xrightarrow{d} \max\left\{ \sup_{\delta \le t \le 1-\delta} \frac{\eta_{i,1}(t)}{\eta_{i,2}(t)}, \; \sup_{\delta \le t \le 1-\delta} \frac{\eta_{i,2}(t)}{\eta_{i,1}(t)} \right\}, \quad i = 1, 2,
\]
and under the alternative hypothesis $H_1$,
\[
\tilde T_{n,i} \xrightarrow{p} \infty, \quad i = 1, 2.
\]
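Because the limiting law of $\tilde T_{n,i}$ is free of nuisance parameters, its critical values can be approximated by direct simulation on i.i.d. N(0, 1) sequences, which is how the critical values in Section 4 are obtained. The small-scale sketch below is our own code; the sample size and repetition count are illustrative assumptions, far below the 50,000 repetitions used in the paper.

```python
import numpy as np

def tilde_t1(eps2, delta=0.2):
    """T~_{n,1} = max(T_{n,1}, 1/T_{n,1}) computed from squared residuals."""
    eps2 = np.asarray(eps2, dtype=float)
    n = len(eps2)
    best = 0.0
    for k in range(int(n * delta), int(n * (1 - delta)) + 1):
        c1 = np.cumsum(eps2[:k] - eps2[:k].mean())
        c2 = np.cumsum((eps2[k:] - eps2[k:].mean())[::-1])
        r = np.abs(c1).max() / np.abs(c2).max()
        best = max(best, r, 1.0 / r)
    return best

def critical_value(n=200, reps=1000, delta=0.2, level=0.05, seed=42):
    """Monte Carlo critical value of T~_{n,1} under H_0, drawing i.i.d.
    N(0, 1) errors and taking the (1 - level)-quantile of the statistic."""
    rng = np.random.default_rng(seed)
    draws = [tilde_t1(rng.standard_normal(n) ** 2, delta) for _ in range(reps)]
    return float(np.quantile(draws, 1.0 - level))
```

The test then rejects $H_0$ at level 5% when the observed $\tilde T_{n,1}$ exceeds the simulated quantile.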

3. The random design case

In this section, we consider the random design case. Let $\{(X_t, Y_t), t = 1, 2, \ldots\}$ in model (1) be a sequence of random variables with $X_t \in [a, b]$ having density function $f(x)$. We are interested in testing the null hypothesis
\[
H_0^r: E\varepsilon_t^2 = \sigma_0^2 \quad \text{for all } t = 1, \ldots, n,
\]
against the alternative hypothesis $H_1^r$: there is $1 < k^* < n$ such that $E\varepsilon_1^2 = \cdots = E\varepsilon_{k^*}^2 = \sigma_0^2$ but $E\varepsilon_{k^*+1}^2 = \cdots = E\varepsilon_n^2 = \sigma_1^2$ with $\sigma_0^2 \neq \sigma_1^2$, where $\sigma_0^2$ and $\sigma_1^2$ are unknown constants and the change point $k^*$ is also unknown. Let
\[
\hat g_n(x) = \frac{\sum_{t=1}^{n} K_{n,h}(X_t - x) Y_t}{\sum_{t=1}^{n} K_{n,h}(X_t - x)}
\]
be the kernel estimator of the nonparametric regression function $g(x)$, in which
\[
K_{n,h}(X_t - x) = \frac{K((X_t - x)/h)}{h},
\]
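The estimator $\hat g_n$ above is the classical Nadaraya–Watson smoother. A minimal Python sketch follows (our own illustration; the Epanechnikov kernel is a concrete choice of $K$, and the function name is ours):

```python
import numpy as np

def nadaraya_watson(x0, x, y, h):
    """Nadaraya-Watson estimator
        g_hat(x0) = sum_t K((X_t - x0)/h) * Y_t / sum_t K((X_t - x0)/h),
    here with the Epanechnikov kernel."""
    x0 = np.atleast_1d(np.asarray(x0, dtype=float))
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    out = np.empty(len(x0))
    for i, xi in enumerate(x0):
        u = (x - xi) / h
        w = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
        s = w.sum()
        out[i] = np.dot(w, y) / s if s > 0 else np.nan  # no data in window
    return out
```

The residuals $Z_t = Y_t - \hat g_n(X_t)$ then feed into the ratio statistics (6) and (7) below.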


where $K(\cdot)$ is a kernel function and $h = h_n$ is a bandwidth. Then, we can define the following ratio statistics:
\[
V_{n,1} = \max_{n\delta \le k \le n - n\delta} \frac{\max_{1 \le i \le k} \bigl| \sum_{1 \le j \le i} (Z_j^2 - \bar Z_k^2) \bigr|}{\max_{k \le i \le n} \bigl| \sum_{i \le j \le n} (Z_j^2 - \tilde Z_k^2) \bigr|}, \tag{6}
\]
\[
V_{n,2} = \max_{n\delta \le k \le n - n\delta} \frac{\max_{1 \le i \le k} \sum_{1 \le j \le i} (Z_j^2 - \bar Z_k^2) - \min_{1 \le i \le k} \sum_{1 \le j \le i} (Z_j^2 - \bar Z_k^2)}{\max_{k \le i \le n} \sum_{i \le j \le n} (Z_j^2 - \tilde Z_k^2) - \min_{k \le i \le n} \sum_{i \le j \le n} (Z_j^2 - \tilde Z_k^2)}, \tag{7}
\]
in which $Z_t = Y_t - \hat g(X_t)$, $\bar Z_k^2 = (1/k)\sum_{t=1}^{k} Z_t^2$, and $\tilde Z_k^2 = (1/(n-k))\sum_{t=k+1}^{n} Z_t^2$. To derive the limiting distributions of the statistics (6) and (7), the following assumptions are adopted.

(B1) The regression function $g(x)$ is twice continuously differentiable on $[a, b]$.
(B2) $f(x)$ is a bounded function with $M_1 < f(x) < M_2$ for some positive constants $M_1$ and $M_2$, and has continuous second-order derivatives on $(a, b)$.
(B3) The kernel function $K(\cdot)$ is a symmetric probability density function supported on the interval $[-c_0, c_0]$ with a bounded derivative, and the Fourier transform of $K(\cdot)$ is absolutely integrable.
(B4) $h = h_n$ is a sequence of bandwidths such that $nh^4 = o(1)$ and $nh^{3+d} \to \infty$ for some $d > 0$ as $n \to \infty$.
(B5) $\{\varepsilon_i\}$ is a strictly stationary and strong mixing sequence with mixing coefficients satisfying $\sum_{n=1}^{\infty} (\alpha(n))^{\delta/(2+\delta)} < \infty$ and $E|\varepsilon_1|^{4+\delta} < \infty$ for some $\delta > 0$.
(B6) $\{(X_i, \varepsilon_i), i = 1, 2, \ldots\}$ is a strictly stationary and strong mixing sequence with coefficients $\alpha(n) = O(c^n)$ for some $0 < c < 1$.

Remark Assumptions (B1)–(B4), similar to those of Neumeyer and Van Keilegom [21], are standard assumptions for deriving the convergence of the kernel estimator. Assumption (B5) guarantees the weak invariance principle for partial sums of the sequence $\{\varepsilon_i\}$. Assumption (B6) allows for moving average (MA) and autoregressive (AR) processes among the regressors. These two assumptions are similar to those of Chen et al. [22]; for more detailed discussion of these assumptions, we refer the reader to these two papers and the references therein.

Theorem 3.1 If Assumptions (B1)–(B6) hold, then under the null hypothesis $H_0^r$,
\[
V_{n,1} \xrightarrow{d} \sup_{\delta \le t \le 1-\delta} \frac{\eta_{1,1}(t)}{\eta_{1,2}(t)}, \tag{8}
\]
\[
V_{n,2} \xrightarrow{d} \sup_{\delta \le t \le 1-\delta} \frac{\eta_{2,1}(t)}{\eta_{2,2}(t)}, \tag{9}
\]
where $\eta_{1,1}(t)$, $\eta_{1,2}(t)$, $\eta_{2,1}(t)$, and $\eta_{2,2}(t)$ are defined as in Theorem 2.1.

If there is a change in volatility at $k^* = [n\theta]$ with some $0 < \theta < 1$, let $\Delta_n = \sigma_1^2 - \sigma_0^2$ be the change size. The following theorem shows the consistency of our ratio test.


Theorem 3.2 If $n^{1/2}|\Delta_n| \to \infty$ and $\delta < \theta < 1 - \delta$ hold, then under Assumptions (B1)–(B6) and the alternative hypothesis $H_1^r$, we have
\[
V_{n,1} \xrightarrow{p} \infty \quad \text{and} \quad V_{n,2} \xrightarrow{p} \infty.
\]
Similar to the fixed design case, we apply the statistics
\[
\tilde V_{n,i} = \max\{V_{n,i}, V_{n,i}^{-1}\}, \quad i = 1, 2,
\]
to test $H_0^r$ against $H_1^r$, and we reject the null hypothesis $H_0^r$ for large values of $\tilde V_{n,1}$ and $\tilde V_{n,2}$.

4. Simulations and empirical applications

4.1. Simulations

In this section, we evaluate the finite sample performance of our tests by the Monte Carlo method. To save space, we only report results for the statistics $\tilde T_{n,1}$ and $\tilde V_{n,1}$; unreported simulations show that $\tilde T_{n,2}$ and $\tilde V_{n,2}$ perform similarly. All critical values used in this section are obtained via direct simulation based on i.i.d. N(0, 1) sequences with 50,000 repetitions. The empirical sizes and powers are calculated as the percentage of rejections of the null hypothesis out of 2500 repetitions at the 5% nominal level. Similar to Horváth et al. [23], we select $\delta = 0.2$.

To carry out the simulation, we consider the following nonparametric regression model:
\[
Y_t = g(X_t) + \sigma_t \varepsilon_t, \quad t = 1, \ldots, n, \tag{10}
\]
where $g(X_t) = 25X_t^3 - 45X_t^2 + 24X_t - 3.6$, $X_t = x_t = t/n$ under the fixed design case, and $X_t$ follows a uniform distribution on $[0, 1]$ when the design is random. The innovation sequence $\varepsilon_t$ follows an AR(1) model, that is,
\[
\varepsilon_t = \beta \varepsilon_{t-1} + e_t, \quad |\beta| < 1, \; t = 1, \ldots, n, \tag{11}
\]
in which $e_t$ follows the i.i.d. N(0, 1) distribution. Similar to Lee et al. [6], we use the Epanechnikov kernel function
\[
K(x) = \frac{3}{4}(1 - x^2), \quad |x| \le 1,
\]
to estimate the regression function, and we set the bandwidth $h = h_n = n^{-1/3}/3$. Note that this choice, which was also used in Lee et al. [6], may not be ideal; to simplify our simulation study, we retain it. For a comprehensive account of this problem, we refer the reader to [24] and the references therein.

First, we compare the performance of the statistic $\tilde T_{n,1}$ with the statistic $\Lambda_n^*$ of Lee et al. [6] under the fixed design case. In this case, we test the following hypothesis:
\[
H_0: \sigma_t^2 = 1 \text{ for } t = 1, \ldots, n \quad \text{versus} \quad H_1: \sigma_t^2 = 1 \text{ for } t = 1, \ldots, k^* \text{ and } \sigma_t^2 = \Delta \text{ for } t = k^* + 1, \ldots, n,
\]
where $\Delta$ takes the values 2, 4, 9, 1/2, 1/4 and 1/9, and the change point $k^*$ takes three different values, $k^* = 0.25n$, $0.5n$, and $0.75n$. We also employ the sample sizes $n = 100, 200, 500$ and $\beta = 0, 0.2, 0.5, 0.8$ to see the correlation effect. As in Lee et al. [6], we select the bandwidth $l_n = [n^{1/4}]$ to estimate the parameter $\varphi$ in the statistic $\Lambda_n^*$.
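The fixed-design simulation pipeline is easy to reproduce in outline. The sketch below is our own code (the burn-in length and parameter values are illustrative assumptions): it generates data from model (10) with AR(1) innovations (11), estimates $g$ with the Epanechnikov kernel and bandwidth $h = n^{-1/3}/3$, and evaluates $\tilde T_{n,1}$.

```python
import numpy as np

def simulate_model(n, beta, k_star=None, Delta=1.0, rng=None):
    """One sample from model (10): Y_t = g(t/n) + sigma_t * eps_t with AR(1)
    innovations (11).  sigma_t^2 = 1 for t <= k_star and Delta afterwards."""
    rng = rng or np.random.default_rng()
    x = np.arange(1, n + 1) / n
    g = 25 * x ** 3 - 45 * x ** 2 + 24 * x - 3.6
    burn = 100                               # burn-in toward stationarity
    e = rng.standard_normal(n + burn)
    eps = np.empty(n + burn)
    eps[0] = e[0]
    for t in range(1, n + burn):
        eps[t] = beta * eps[t - 1] + e[t]
    eps = eps[burn:]
    sigma = np.ones(n)
    if k_star is not None:
        sigma[k_star:] = np.sqrt(Delta)
    return x, g + sigma * eps

def tilde_t1_from_y(y, delta=0.2):
    """Kernel residuals (Epanechnikov, h = n^{-1/3}/3), then
    T~_{n,1} = max(T_{n,1}, 1/T_{n,1})."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    h = n ** (-1.0 / 3.0) / 3.0
    design = np.arange(1, n + 1) / n
    kern = lambda u: np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)
    ghat = np.array([np.mean(y * kern((xi - design) / h) / h) for xi in design])
    z2 = (y - ghat) ** 2
    best = 0.0
    for k in range(int(n * delta), int(n * (1 - delta)) + 1):
        c1 = np.cumsum(z2[:k] - z2[:k].mean())
        c2 = np.cumsum((z2[k:] - z2[k:].mean())[::-1])
        r = np.abs(c1).max() / np.abs(c2).max()
        best = max(best, r, 1.0 / r)
    return best
```

Comparing such draws against critical values simulated under $H_0$ yields empirical sizes and powers of the kind reported in Tables 1–3.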

Table 1. Empirical sizes and powers under the fixed design case with sample size n = 100.

                           T̃_{n,1}                            Λ*_n
 k*     Δ      β=0    β=0.2  β=0.5  β=0.8    β=0    β=0.2  β=0.5  β=0.8
        0      5.64   6.84   8.24   10.08    5.44   5.02   5.52   5.88
 0.25   2      6.84   7.44   11.08  15.80    7.68   7.68   8.44   7.76
        4      32.36  33.60  38.84  44.32    11.44  11.92  11.72  11.20
        9      81.36  79.24  80.60  83.88    16.40  16.28  16.44  13.96
        1/2    43.52  44.20  43.32  37.84    12.44  12.52  14.52  14.00
        1/4    90.84  90.52  86.36  79.68    37.40  38.92  38.08  35.92
        1/9    99.88  99.84  99.56  97.56    59.60  60.24  59.76  58.88
 0.5    2      10.24  10.96  14.52  19.80    26.64  27.56  25.96  23.28
        4      53.16  51.56  53.92  57.84    66.64  66.24  62.08  56.60
        9      96.44  95.96  95.68  95.40    85.44  84.60  82.28  77.16
        1/2    45.20  47.00  46.32  43.24    25.68  25.76  23.96  23.20
        1/4    91.96  91.32  88.56  81.40    64.96  65.28  61.44  56.16
        1/9    99.92  99.96  99.76  98.00    84.08  83.96  80.72  76.52
 0.75   2      6.04   6.24   8.00   12.48    12.60  12.84  14.48  14.56
        4      40.96  41.16  43.36  46.68    37.80  38.36  37.28  36.28
        9      90.64  89.16  88.56  89.28    58.68  59.84  59.76  57.80
        1/2    33.72  35.40  38.20  37.28    8.16   7.68   8.80   8.16
        1/4    75.96  75.84  75.28  71.52    12.92  13.20  13.80  11.76
        1/9    97.52  97.36  96.28  93.40    14.84  15.68  14.80  13.40

Table 1 reports the empirical sizes (i.e. the $\Delta = 0$ case) and powers of the statistics $\tilde T_{n,1}$ and $\Lambda_n^*$ based on samples of size $n = 100$. Tables 2 and 3 contain the results for sample sizes $n = 200$ and $n = 500$, respectively. It can be seen that both statistics have a slight size distortion when $\beta$ is near 0. We conjecture that this slight size distortion is induced by the boundary effect in nonparametric kernel estimation. However, our statistic $\tilde T_{n,1}$ suffers a more serious size distortion than $\Lambda_n^*$ when $\beta$ is near 1. This finding mirrors that of Horváth et al. [23] and is a shortcoming of the ratio test.

Turning to the empirical powers of the two tests in Tables 1–3, a number of comments are in order. First, the empirical power increases as the sample size or the change size increases, which is unsurprising since both tests are consistent. Our statistic $\tilde T_{n,1}$ has significantly better power than that of Lee et al. [6] when the sample size is not too large, but slightly worse power at the large sample size $n = 500$; in other words, our ratio test has better small sample performance. Secondly, change points close to the middle of the sample are detected more easily, and the power of $\Lambda_n^*$ is influenced more strongly by the location of the change point than that of $\tilde T_{n,1}$. Again this is unsurprising: the nuisance parameter $\varphi$ is estimated from the full sample, so its estimate varies with the change-point location, which dramatically influences the power of $\Lambda_n^*$; our test does not need to estimate this parameter. It is interesting to note that the change point is detected more easily for $k^* = 0.25n$ than for $k^* = 0.75n$ when the variance shifts from a large value to a small value, but more easily for $k^* = 0.75n$ than for $k^* = 0.25n$ when the variance shifts from a small

Table 2. Empirical sizes and powers under the fixed design case with sample size n = 200.

                           T̃_{n,1}                            Λ*_n
 k*     Δ      β=0    β=0.2  β=0.5  β=0.8    β=0    β=0.2  β=0.5  β=0.8
        0      6.68   7.26   10.48  16.08    6.20   6.92   8.24   8.44
 0.25   2      10.68  12.20  19.28  31.04    22.68  22.84  21.00  17.60
        4      47.36  50.16  58.56  68.92    59.96  59.60  50.96  34.80
        9      94.46  93.28  94.52  95.72    82.00  51.12  70.44  48.88
        1/2    71.04  69.08  61.38  50.08    49.64  51.00  45.72  37.12
        1/4    99.28  99.08  97.72  92.60    96.32  95.92  93.04  83.92
        1/9    99.80  99.88  99.96  99.52    98.60  100.0  99.88  99.48
 0.5    2      20.52  22.26  28.84  37.44    71.36  71.40  63.72  49.88
        4      87.00  86.26  86.32  87.52    99.64  99.48  98.80  94.80
        9      100.0  99.96  99.88  99.80    100.0  99.96  99.92  99.72
        1/2    71.44  70.96  64.24  55.72    72.12  71.92  65.00  49.48
        1/4    99.66  99.60  99.08  91.92    99.84  99.88  98.84  94.72
        1/9    100.0  100.0  100.0  100.0    100.0  100.0  100.0  100.0
 0.75   2      11.56  12.96  17.84  25.64    47.64  47.44  43.40  34.64
        4      71.84  80.28  71.40  74.24    96.76  96.40  92.64  84.56
        9      99.76  99.32  98.76  98.48    99.92  99.88  99.88  99.32
        1/2    43.60  45.00  45.20  46.88    22.72  22.64  20.84  16.92
        1/4    87.84  87.64  85.24  81.08    58.20  58.20  49.40  33.68
        1/9    99.84  99.84  99.32  99.28    83.40  82.64  72.96  50.68

value to a large value. Thirdly, similar to the finding of Lee et al. [6], the power decreases as $\beta$ approaches 1. Finally, our test statistic $\tilde T_{n,1}$ performs significantly better than the statistic $\Lambda_n^*$ when the variance changes from a large value to a small value, and slightly worse than $\Lambda_n^*$ when the variance shifts from a small value to a large value.

Next, we compare the performance of the statistic $\tilde V_{n,1}$ with the statistic $\tilde V_n^v$ of Chen et al. [22] under the random design case. The statistic of Chen et al. [22] is
\[
\tilde V_n^v = \max_{n\delta < k < n - n\delta}
\]

Proof of Theorem 2.2 Suppose $k > k^*$. Then, if $1 \le i \le k^*$,
\[
\sum_{j=1}^{i} \varepsilon_j^2 - \frac{i}{k} \sum_{j=1}^{k} \varepsilon_j^2 = \sum_{j=1}^{i} (\varepsilon_j^2 - \sigma_0^2) - \frac{i}{k} \left[ \sum_{j=1}^{k^*} (\varepsilon_j^2 - \sigma_0^2) + \sum_{j=k^*+1}^{k} (\varepsilon_j^2 - \sigma_1^2) \right] + \frac{(k - k^*) i}{k} \Delta_n,
\]
and if $i > k^*$,
\[
\sum_{j=1}^{i} \varepsilon_j^2 - \frac{i}{k} \sum_{j=1}^{k} \varepsilon_j^2 = \sum_{j=1}^{k^*} (\varepsilon_j^2 - \sigma_0^2) + \sum_{j=k^*+1}^{i} (\varepsilon_j^2 - \sigma_1^2) - \frac{i}{k} \left[ \sum_{j=1}^{k^*} (\varepsilon_j^2 - \sigma_0^2) + \sum_{j=k^*+1}^{k} (\varepsilon_j^2 - \sigma_1^2) \right] - (i - k^*) \Delta_n + \frac{(k - k^*) i}{k} \Delta_n.
\]

Since there is no change in the nonparametric regression function $g(\cdot)$, by the proof of Theorem 2.1, if $k = [n\tau]$ with some $\theta < \tau < 1 - \delta$, we get
\[
\max_{1 \le i \le k} \frac{1}{\sqrt{n}} \left| \sum_{j=1}^{i} (\hat\varepsilon_j^2 - \bar\varepsilon_k^2) \right|
= \max_{1 \le i \le k} \frac{1}{\sqrt{n}} \left| \sum_{j=1}^{i} \varepsilon_j^2 - \frac{i}{k} \sum_{t=1}^{k} \varepsilon_t^2 \right| + o_p(1)
\ge \frac{1}{\sqrt{n}} \left| \sum_{j=1}^{[n\theta]} \varepsilon_j^2 - \frac{[n\theta]}{[n\tau]} \sum_{j=1}^{[n\tau]} \varepsilon_j^2 \right|
= O_p(1) + \frac{[n\theta]([n\tau] - [n\theta])}{[n\tau]\sqrt{n}} |\Delta_n| = O_p(\sqrt{n}\,|\Delta_n|). \tag{16}
\]
Combining Equations (15) and (16) and the condition $\sqrt{n}|\Delta_n| \to \infty$, we conclude that $T_{n,1} \to_p \infty$. Similar arguments yield $T_{n,2} \to_p \infty$.

Proof of Theorem 3.1 Let $k = [nt]$ and $i = [ns]$. Then, from Assumption (B5) and the invariance principle, we have
\[
\max_{1 \le i \le k} \frac{1}{\sqrt{n}} \left| \sum_{j=1}^{i} \varepsilon_j^2 - \frac{i}{k} \sum_{t=1}^{k} \varepsilon_t^2 \right| \xrightarrow{D[0,1]} \sup_{0 \le s \le t} \kappa_w \left| W(s) - \frac{s}{t} W(t) \right|, \tag{17}
\]
where $\kappa_w$ is some positive value such that
\[
\mathrm{Var}\left( \frac{1}{\sqrt{n}} \sum_{j=1}^{n} [\varepsilon_j^2 - E(\varepsilon_j^2)] \right) \to \kappa_w^2, \quad \text{as } n \to \infty.
\]


Chen et al. [22] proved (cf. Lemmas A.7 and A.8 therein) that
\[
\frac{1}{\sqrt{k}} \sum_{t=1}^{k} [g(X_t) - \hat g(X_t)]^2 \xrightarrow{p} 0, \qquad
\frac{1}{\sqrt{k}} \sum_{t=1}^{k} [\hat g(X_t) - g(X_t)] \varepsilon_t \xrightarrow{p} 0.
\]
Thus,
\[
\max_{1 \le i \le k} \frac{1}{\sqrt{n}} \left| \sum_{j=1}^{i} [g(X_j) - \hat g(X_j)]^2 - \frac{i}{k} \sum_{t=1}^{k} [g(X_t) - \hat g(X_t)]^2 \right| = o_p(1), \tag{18}
\]
\[
\max_{1 \le i \le k} \frac{2}{\sqrt{n}} \left| \sum_{j=1}^{i} [\hat g(X_j) - g(X_j)] \varepsilon_j - \frac{i}{k} \sum_{t=1}^{k} [\hat g(X_t) - g(X_t)] \varepsilon_t \right| = o_p(1). \tag{19}
\]
By Equations (12) and (17)–(19), we have
\[
\max_{1 \le i \le k} \frac{1}{\sqrt{n}} \left| \sum_{j=1}^{i} (Z_j^2 - \bar Z_k^2) \right| \xrightarrow{D[0,1]} \sup_{0 \le s \le t} \kappa_w \left| W(s) - \frac{s}{t} W(t) \right|.
\]
The rest of the proof follows exactly as in the proof of Theorem 2.1.

Proof of Theorem 3.2 This theorem can be derived similarly, using the arguments in the proof of Theorem 2.2 and the results of Theorem 3.1; we omit the details.

Acknowledgements
We are grateful to the editors and referees for useful suggestions and helpful comments that improved the paper. We also thank Professor Ruppert, who provided the LIDAR data used in Section 4.2. This work was supported by the National Natural Science Foundation of China (No. 60972150) and the Science and Technology Innovation Foundation of Qinghai Normal University (2012).

References
[1] D.A. Hsu, Tests for variance shift at an unknown time point, Appl. Stat. 26 (1977), pp. 279–284.
[2] C. Inclán and G.C. Tiao, Use of cumulative sums of squares for retrospective detection of changes of variance, J. Am. Statist. Assoc. 89 (1994), pp. 913–923.
[3] E. Gombay, L. Horváth, and M. Hušková, Estimators and tests for change in variances, Stat. Decis. 14 (1996), pp. 145–159.
[4] M. Csörgő and L. Horváth, Limit Theorems in Change-Point Analysis, Wiley, Chichester, 1997.
[5] S. Lee and S. Park, The cusum of squares test for scale changes in infinite order moving average processes, Scand. J. Stat. 28 (2001), pp. 625–644.
[6] S. Lee, O. Na, and S. Na, On the cusum of squares test for variance change in nonstationary and nonparametric time series models, Ann. Inst. Statist. Math. 55 (2003), pp. 467–485.
[7] Z. Chen and Z. Tian, Modified procedures for change point monitoring in linear models, Math. Comput. Simul. 81 (2010), pp. 62–75.
[8] Z. Chen, Z. Tian, and R. Qin, Monitoring variance change in infinite order moving average processes and nonstationary autoregressive processes, Comm. Stat. Theory Methods 40 (2011), pp. 1254–1270.
[9] P. Hall and J.D. Hart, Nonparametric regression with long range dependence, Stochastic Process. Appl. 36 (1990), pp. 339–351.
[10] J. Fan and I. Gijbels, Local Polynomial Modelling and Its Applications, Chapman and Hall, London, 1996.
[11] Y.K. Lee, E. Mammen, and B.U. Park, Bandwidth selection for kernel regression with correlated errors, Statistics 44 (2010), pp. 327–340.
[12] H.G. Müller, Change-points in nonparametric regression analysis, Ann. Stat. 24 (1992), pp. 1667–1678.
[13] M.A. Delgado and J. Hidalgo, Nonparametric inference on structural breaks, J. Econom. 96 (2000), pp. 113–144.
[14] L. Horváth and P. Kokoszka, Change point detection with nonparametric regression, Statistics 36 (2002), pp. 9–31.
[15] C. Prieur, Change point estimation by local linear smoothing under a weak dependence condition, Math. Methods Stat. 16 (2007), pp. 25–41.
[16] I. Gijbels and A.C. Goderniaux, Bootstrap test for change-points in nonparametric regression, J. Nonparametr. Stat. 16 (2004), pp. 591–611.
[17] L. Su and Z. Xiao, Testing structural change in time-series nonparametric regression models, Stat. Interface 1 (2008), pp. 347–366.
[18] Y. Wang, Jump and sharp cusp detection by wavelets, Biometrika 82 (1995), pp. 385–397.
[19] M. Raimondo and N. Tajvidi, A peaks over threshold model for change-point detection by wavelets, Statist. Sinica 14 (2004), pp. 395–412.
[20] G. Chen, Y.K. Choi, and Y. Zhou, Detection of changes in return by a wavelet smoother with conditional heteroscedastic volatility, J. Econom. 143 (2008), pp. 227–262.
[21] N. Neumeyer and I. Van Keilegom, Change-point tests for the error distribution in nonparametric regression, Scand. J. Stat. 36 (2009), pp. 518–541.
[22] G. Chen, Y.K. Choi, and Y. Zhou, Nonparametric estimation of structural change points in volatility models for time series, J. Econom. 126 (2005), pp. 79–114.
[23] L. Horváth, Z. Horváth, and M. Hušková, Ratio tests for change point detection, Inst. Math. Stat. Collect. 1 (2008), pp. 293–304.
[24] J. Opsomer, Y. Wang, and Y. Yang, Nonparametric regression with correlated errors, Statist. Sci. 16 (2001), pp. 134–153.
[25] D. Ruppert, M.P. Wand, and R.J. Carroll, Semiparametric Regression, Cambridge University Press, Cambridge, 2003.
[26] P. Doukhan, Mixing: Properties and Examples, Lecture Notes in Statistics, Springer-Verlag, New York, 1994.