
Unit Root Test in a Threshold Autoregression: Asymptotic Theory and Residual-based Block Bootstrap

Myunghwan Seo∗†
Department of Economics
London School of Economics

Abstract

There is a growing literature on unit root testing in threshold autoregressive models. This paper makes two contributions to the literature. First, an asymptotic theory is developed for unit root testing in a threshold autoregression, in which the errors are allowed to be dependent and heterogeneous, and the lagged level of the dependent variable is employed as the threshold variable. The asymptotic distribution of the proposed Wald test is non-standard and depends on nuisance parameters. Second, the consistency of the proposed residual-based block bootstrap is established based on a newly developed asymptotic theory for this bootstrap. A set of Monte Carlo simulations demonstrates that the Wald test exhibits considerable power gains over the ADF test that neglects threshold effects. The law of one price hypothesis is investigated among used car markets in the US.

JEL classification: C12; C15; C22

Keywords: threshold autoregression, unit root test, threshold cointegration, residual-based block bootstrap, used car

∗† E-mail: m.seo@lse.ac.uk; I am indebted to Bruce Hansen for his invaluable guidance and time.

1 INTRODUCTION

There is a growing literature on unit root testing in threshold autoregressive (TAR) models. As nonstationarity is brought into a TAR model, it is of great econometric importance whether its threshold variable is a lagged difference or the lagged level of the dependent variable. Under the null of a unit root, the lagged level is nonstationary, while the lagged difference is stationary. Each case calls for a completely different asymptotic distribution theory. For difference-based models, Caner and Hansen (2001) provide a rigorous treatment of statistical tests. However, there has been no formal econometric literature concerning level-based models.¹

The level-based TAR models have exhibited a great deal of empirical relevance. A leading example is the threshold cointegration model introduced by Balke and Fomby (1997), in which deviation from a long-run equilibrium follows a threshold autoregression, and the lagged level of the deviation is the threshold variable. See Lo and Zivot (2001) for a review of this model's wide application in the analysis of the purchasing power parity hypothesis or the law of one price hypothesis. Two testing issues arise: one is testing for linearity, and the other is testing for nonstationarity, that is, testing for a unit root in a level-based TAR model. Econometric theory for the first issue is well established and built on the stationarity of the threshold variable, as in Hansen (1996), among others.

This paper provides a formal distribution theory for unit root testing in a level-based TAR model. Specifically, a TAR(1) model allowing for dependent heterogeneous innovations is considered. As in conventional unit root testing, however, the parameters of interest are estimated, including some lagged difference terms, and the Wald test is constructed based on these estimators. In our model, threshold parameters are not identified under the null of a unit root, and the Wald test is the typical sup-Wald type test popularized by Davies (1987).
We find that the Wald test has a non-standard asymptotic distribution; it is biased and depends on nuisance parameters in a complicated manner, and thus cannot be tabulated. The asymptotic distribution is derived based on newly developed asymptotic tools. The key development is a convergence of a stochastic integral for dependent heterogeneous arrays (specifically, the convergence of the sample mean of the product of nonlinear transformations of an integrated process and a strong mixing process). This paper extends Hansen (1992) by showing uniform convergence over nonlinear/discontinuous transformations indexed by a set of parameters. Another closely related work is that of Park and Phillips (2001), which develops an asymptotic theory applicable to a broad range of transformations of a nonstationary variable, including the ones considered in this paper, but is limited to pointwise convergence and only allows for martingale arrays.²

A residual-based block bootstrap is proposed to approximate the sampling distribution of the Wald statistic. The nonparametric nature of the block bootstrap is appropriate to the weak assumption on the dependence structure of our model. The first order consistency of the bootstrap is shown. In the literature, it is quite a recent development to derive an invariance principle for the integrated process of resampled innovation processes, and to show the consistency of various bootstrap unit root tests (see, for example, Park (2002) and Chang and Park (2003) for a difference-based sieve bootstrap, Paparoditis and Politis (2003) for a residual-based block bootstrap, and Seo (2003) for a residual-based parametric bootstrap, especially in a threshold vector error correction model). This paper contributes to this literature by establishing the uniform convergence of the stochastic integral mentioned above in the residual-based block bootstrap context. Unlike in linear models, this is not a straightforward extension of the invariance principle.

The finite sample performance of the proposed bootstrap is examined and compared to the conventional augmented Dickey Fuller (ADF) test through Monte Carlo simulations. It is well documented in the literature that the ADF test loses power dramatically for some classes of threshold alternatives (see, for example, Pippenger and Goering (1993)). We find that the power gain from explicitly considering the threshold alternative is substantial for various data generating processes. The finding does not depend on the block length selection.

The newly developed testing strategy is illustrated through an investigation of the law of one price (LOP) hypothesis amongst used car markets in the US. An important feature of used car markets is the presence of uncertainty of quality, which has generated a large debate on the efficiency of these markets. An implication for the study of the LOP is that this uncertainty (in addition to the transaction costs) will obstruct arbitrage for at least a short period of time, and that the assumption of linear adjustment in conventional cointegration models is unlikely to hold. In fact, we find that conventional cointegration tests, such as the ADF test and the test of Horvath and Watson (1994), fail to reject the null of no cointegration, while the Wald test of this paper and the supW test by Seo (2003) provide much stronger evidence for the LOP.

The remainder of this paper is organized as follows: Section 2 introduces the model and the Wald test, and develops the asymptotic theory for the test. The residual-based block bootstrap is introduced, and its asymptotic validity is established, in Section 3. Section 4 presents simulation evidence for reasonable finite sample performance of the bootstrap. Section 5 illustrates the testing strategy through the study of the LOP in used car markets in the US. All proofs are collected in the appendix.

¹ I found that there is some ongoing work in this area, independent of mine. Research is being conducted by Bec, Guay, and Guerre (2002), and Kapetanios and Shin (2003). However, I allow a more general innovation process, which calls for more involved asymptotic theory. Furthermore, the consistency of a residual-based block bootstrap is shown in this paper for the first time in the literature.

² I recently found that an extension of Park and Phillips (2001) is underway by De Jong (2003) to allow more general innovation processes. However, it does not include the transformations considered in this paper.

2 TESTING FOR UNIT ROOT IN TAR

Consider a threshold autoregressive model:

\[ \Delta y_t = \alpha_1 y_{t-1} 1\{y_{t-1} \le \gamma_1\} + \alpha_2 y_{t-1} 1\{y_{t-1} > \gamma_2\} + u_t, \tag{1} \]

$t = 1, \ldots, n$, where $1\{A\}$ is the indicator function that has value 1 if $A$ is true, and value 0 otherwise. The threshold parameter $\gamma = (\gamma_1, \gamma_2)$, $\gamma_1 \le \gamma_2$, is unknown, and both $\gamma_1$ and $\gamma_2$ take values from a compact set $S$ including zero. The error $u_t$ is allowed to be a general stationary process such as a strong mixing process, which is more general than the finite order autoregressive process with iid innovations assumed in Bec, Guay, and Guerre (2002) or Kapetanios and Shin (2003). This model, often called a band-TAR model, is a special case of a three-regime TAR model, and it becomes a two-regime TAR model when $\gamma_1 = \gamma_2$.

We want to test the null of a unit root against the alternative of a stationary TAR. Specifically, the null hypothesis of interest is that there is a unit root and no threshold:

\[ H_0 : \alpha_1 = \alpha_2 = 0. \tag{2} \]

Unfortunately, however, our understanding is not complete as to the stationarity conditions for general TAR processes. When the errors are independent, Chan, Petruccelli, Tong, and Woolford (1985) provide a necessary and sufficient condition of stationarity:

\[ \alpha_1 < 0, \quad \alpha_2 < 0, \quad \text{and} \quad (\alpha_1 + 1)(\alpha_2 + 1) < 1, \tag{3} \]

which suggests that the natural alternative to $H_0$ should be

\[ H_1 : \alpha_1 < 0 \text{ and } \alpha_2 < 0. \tag{4} \]

An important feature of the model (1) is that the threshold variable is not a lagged difference $y_{t-1} - y_{t-m}$ for some $m > 1$ but the lagged level $y_{t-1}$. The lagged difference is stationary, and Caner and Hansen (2001) develop a distribution theory for unit root testing in this case. Unlike the lagged difference, the lagged level $y_{t-1}$ is nonstationary under the null of a unit root, so that statistical inference for the test relies on a totally different distribution theory from that of Caner and Hansen (2001).

2.1 TEST STATISTICS

The standard Wald statistic is considered to test the null (2). In particular, a constant and some lagged first difference terms are included in the estimation, as is common in conventional unit root testing. That is, we compute the Wald statistic based on the following Dickey Fuller type regression:

\[ \Delta y_t = \hat\alpha_1(\gamma) y_{t-1} 1\{y_{t-1} \le \gamma_1\} + \hat\alpha_2(\gamma) y_{t-1} 1\{y_{t-1} > \gamma_2\} + \hat\mu(\gamma) + \hat\rho_1(\gamma) \Delta y_{t-1} + \cdots + \hat\rho_p(\gamma) \Delta y_{t-p} + \hat e_t(\gamma), \tag{5} \]

for each $\gamma \in S$. Here we write $\gamma \in S$ to denote that each element of $\gamma$ lies in $S$. We define $\hat\sigma^2(\gamma)$ as the residual variance from OLS estimation of (5), and $\hat\sigma_0^2$ as that of the null model. Then, we obtain the least squares (LS) estimators:

\[ \hat\gamma = \arg\min_{\gamma \in S} \hat\sigma^2(\gamma), \quad \hat\sigma^2 = \hat\sigma^2(\hat\gamma), \quad \hat\alpha_i = \hat\alpha_i(\hat\gamma), \text{ etc.}, \tag{6} \]

and the Wald statistic:

\[ W_n = n\left(\frac{\hat\sigma_0^2}{\hat\sigma^2} - 1\right) = \sup_{\gamma \in S} n\left(\frac{\hat\sigma_0^2}{\hat\sigma^2(\gamma)} - 1\right) = \sup_{\gamma \in S} W_n(\gamma), \tag{7} \]

where $W_n(\gamma)$ is the standard Wald statistic for a given $\gamma \in S$. The above equality follows, since $\hat\sigma_0^2$ is independent of $\gamma$, and $W_n(\gamma)$ is a decreasing function of $\hat\sigma^2(\gamma)$. Thus, the statistic $W_n$ is the well-known "sup-Wald" statistic advocated by Davies (1987).

As will be shown shortly, the OLS estimator $\hat\alpha_i(\gamma)$ of the regression (5) is a consistent estimator for any $p$. In practice, we expect that including the lagged difference terms makes $\hat\alpha_i(\gamma)$ less biased than the OLS estimator from (1). However, we do not attempt to increase the lag order $p$ as the sample size increases, as in Said and Dickey (1984), whose approach approximates an ARMA process by an infinite order AR process. Note that this approximation does not make sense in our case, since the error process $u_t$ is not confined to an ARMA process.
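The computation in (5) to (7) can be sketched as a grid search. The following is a minimal illustration only, assuming NumPy and a user-supplied finite grid of candidate thresholds standing in for the set $S$; the function names `ols_resid_var` and `sup_wald` are hypothetical and are not taken from the paper.

```python
import numpy as np

def ols_resid_var(y, X):
    """Residual variance (SSR / sample size) from an OLS regression of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid / len(y)

def sup_wald(y, p, grid):
    """Sup-Wald statistic (7) over pairs (g1, g2) from `grid` with g1 <= g2.

    y    : observed series (1-d array)
    p    : number of lagged differences in regression (5)
    grid : candidate threshold values (an assumed finite stand-in for S)
    """
    dy = np.diff(y)                       # dy[i] = y[i+1] - y[i]
    lhs = dy[p:]                          # Delta y_t
    ylag = y[p:-1]                        # y_{t-1}, aligned with lhs
    # constant plus Delta y_{t-1}, ..., Delta y_{t-p}
    lags = [dy[p - j:len(dy) - j] for j in range(1, p + 1)]
    Z = np.column_stack([np.ones(len(lhs))] + lags)
    s0 = ols_resid_var(lhs, Z)            # null model: alpha_1 = alpha_2 = 0
    best_w, best_g = -np.inf, None
    for g1 in grid:
        for g2 in grid:
            if g1 > g2:
                continue
            lower = ylag * (ylag <= g1)   # y_{t-1} 1{y_{t-1} <= gamma_1}
            upper = ylag * (ylag > g2)    # y_{t-1} 1{y_{t-1} >  gamma_2}
            s = ols_resid_var(lhs, np.column_stack([lower, upper, Z]))
            w = len(lhs) * (s0 / s - 1.0)
            if w > best_w:
                best_w, best_g = w, (g1, g2)
    return best_w, best_g
```

Because $W_n(\gamma)$ is decreasing in $\hat\sigma^2(\gamma)$, the pair returned here also minimizes the residual variance over the grid, which is the estimator $\hat\gamma$ in (6) restricted to that grid.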

2.2 ASYMPTOTIC DISTRIBUTION

We assume the following:

Assumption 1 Let $y_t = y_0 + \sum_{s=1}^{t} u_s$ and let $\{u_t\}$ be strictly stationary with mean zero and with $E|u_t|^{2+\delta} < \infty$ for some $\delta > 0$, ergodic and strong mixing with mixing coefficients $\alpha_m$ satisfying $\sum_{m=1}^{\infty} \alpha_m^{1/2 - 1/r} < \infty$ for some $r > 2$.

Assumption 1 is much more general than the finite order linear process that is assumed in most of the literature on TAR with nonstationarity, and allows conditional heteroskedasticity, which is commonly observed in economic data. The infinite order linear process is also strong mixing, provided that the innovation process satisfies certain smoothness conditions (see section 14.4 of Davidson (1994)). It can be shown, however, that the asymptotic theory developed here also holds under the assumption of a linear process.

The development of the asymptotic theory is based on the martingale approximation of Hansen (1992), which is based on the following representation of $\{u_t\}$. Define

\[ \varepsilon_t = \sum_{s=0}^{\infty} \left(E_t u_{t+s} - E_{t-1} u_{t+s}\right), \quad \zeta_t = \sum_{s=1}^{\infty} E_t u_{t+s}, \]

where $E_t X = E(X | F_t)$ and $F_t$ is the typical natural filtration. Then,

\[ u_t = \varepsilon_t - \left(\zeta_t - \zeta_{t-1}\right), \tag{8} \]

and $\{\varepsilon_t\}$ is a martingale difference sequence, and $\{u_t \zeta_t - E u_t \zeta_t\}$ is a uniformly integrable $L^1$-mixingale (see Hansen (1992), p. 499). The representation (8) yields the useful decomposition of a stochastic integral process:

\[ \frac{1}{\kappa_n} \sum_{t=2}^{n} f(y_{t-1}, \theta) u_t = \frac{1}{\kappa_n} \sum_{t=2}^{n} f(y_{t-1}, \theta) \varepsilon_t + D_n, \tag{9} \]

where

\[ D_n = \frac{1}{\kappa_n} \sum_{t=2}^{n} \left(f(y_t, \theta) - f(y_{t-1}, \theta)\right) \zeta_t - \frac{1}{\kappa_n} f(y_{n-1}, \theta) \zeta_n. \]

The convergence of the first term on the right hand side of (9) and the convergence rate $\kappa_n$ are studied by Park and Phillips (2001) for various classes of functions $f(\cdot, \theta)$. Despite its generality, their theory is restrictive in that the convergence is not uniform in $\theta$. Hansen (1992) provides a limit of the bias $D_n$ when $f(x) = x$ or $x^2$. Recently, De Jong (2003) studies the limit of $D_n$ for a class of functions considered by Park and Phillips (2001), yet it is not applicable to our case. In the special case of $f(x, \theta) = x^k 1\{x \le \theta\}$, where $\theta \in S$ and $k > 0$, the uniform convergence of the first term on a compact parameter space is shown in Seo (2003), and that of the bias $D_n$ is explored below.

We introduce some notation. Let "$[x]$" denote the integer part of $x$, and "$\Rightarrow$" the weak convergence with respect to the uniform metric on the parameter space. Next, define the autocovariance function $r(k) = E u_t u_{t+k}$ and let $\sigma^2 = r(0)$, $\lambda = \sum_{s=1}^{\infty} r(s)$ and the long-run variance $\omega^2 = \sum_{s=-\infty}^{\infty} r(s) = \sigma^2 + 2\lambda$. Assume $y_0$ is zero for the sake of simplicity. It is well known that under Assumption 1, $\frac{1}{\sqrt{n}} y_{[nr]} = \frac{1}{\sqrt{n}} \sum_{t=1}^{[nr]} u_t \Rightarrow B(r)$, where $B$ is a Brownian motion with variance $\omega^2$. The following theorem is stated more generally than necessary for our specific case, but its generality may prove useful in other cases.

Theorem 1 Suppose the sequence $\{w_t - E w_t\}$ is a uniformly integrable $L^1$-mixingale and Assumption 1 holds. Then, for $k \ge 0$,

\[ \frac{1}{n^{1+k/2}} \sum_t y_t^k 1\{y_t \le \gamma\} w_t \Rightarrow (E w_t) \int B^k 1\{B \le 0\} \]

uniformly in $\gamma$ on the set $S$.

We can easily note from Theorem 1 that

\[ \frac{1}{n} \sum_t \left(\frac{y_t 1\{y_t \le \gamma\}}{\sqrt{n}}\right)^k w_t \overset{\text{Asymptotically}}{=} \frac{1}{n} \sum_t \left(y_t 1\{y_t \le \gamma\} / \sqrt{n}\right)^k \cdot \frac{1}{n} \sum_t w_t. \]

In other words, $y_t 1\{y_t \le \gamma\} / \sqrt{n}$ and $w_t$ are asymptotically uncorrelated. This asymptotic uncorrelatedness between a stationary process and a nonstationary process is useful for deriving the limit behavior of the bias term $D_n$, as in Theorem 3.3 of Hansen (1992). Theorem 3 of Caner and Hansen (2001) provides an extension of Hansen (1992) to a two-parameter process. Theorem 1 is distinguished from previous works in that the convergence is uniform on the parameter space $S$, and in that it allows for the limit sample path of the nonstationary part to be discontinuous. Note that the case of particular interest is when $k = 0$. In this case, $1\{B \le 0\}$ is discontinuous at zero, and Theorem 1 states that

\[ \frac{1}{n} \sum_t 1\{y_t \le \gamma\} w_t \Rightarrow (E w_t) \int 1\{B \le 0\}. \]

The limit is a random variable that relies on the proportion of each realization of the limit Brownian motion which is less than zero. Built on this asymptotic uncorrelatedness, the uniform convergence of the stochastic integral follows as below.

Theorem 2 Under Assumption 1,

\[ \frac{1}{n} \sum_t y_{t-1} 1\{y_{t-1} \le \gamma\} u_t \Rightarrow \int B \cdot 1\{B \le 0\} \, dB + \lambda \int 1\{B \le 0\} \]

uniformly in $\gamma$ on the set $S$.

Now we turn to the convergence of the parameter estimators and the Wald statistic. To ease the exposition of the main theorem, we introduce some notation:

\[ \bar B_L = B 1\{B \le 0\} - \int B 1\{B \le 0\}, \quad \bar B_U = B 1\{B > 0\} - \int B 1\{B > 0\}, \]

\[ G_p = \begin{pmatrix} r(0) & r(1) & \cdots & r(p-1) \\ r(1) & r(0) & \cdots & r(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r(p-1) & r(p-2) & \cdots & r(0) \end{pmatrix}, \quad \tilde r_p = \begin{pmatrix} r(0) \\ r(0) + r(1) \\ \vdots \\ r(0) + \cdots + r(p-1) \end{pmatrix}, \]

and $g_p = (r(1), \cdots, r(p))'$. Let $\iota_p$ be a $p$-dimensional vector of ones. The following theorem provides the limit distribution of $\hat\alpha_i$, $i = 1, 2$, and that of $W_n$.

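Theorem 3 Suppose that Assumption 1 holds. Then, as $n \to \infty$,

\[ \text{(i)} \quad \begin{pmatrix} n \hat\alpha_1(\gamma) \\ n \hat\alpha_2(\gamma) \end{pmatrix} \Rightarrow \begin{pmatrix} \left(\int \bar B_L^2\right)^{-1} A_{p,L} \\ \left(\int \bar B_U^2\right)^{-1} A_{p,U} \end{pmatrix}, \]

\[ \text{(ii)} \quad W_n \Rightarrow \left(\sigma^2 - g_p' G_p^{-1} g_p\right)^{-1} \left[ \left(\int \bar B_L^2\right)^{-1} A_{p,L}^2 + \left(\int \bar B_U^2\right)^{-1} A_{p,U}^2 \right], \]

where

\[ A_{p,L} = \left(1 - g_p' G_p^{-1} \iota_p\right) \left( \int \bar B_L \, dB + \lambda \int 1\{B \le 0\} \right) - g_p' G_p^{-1} \tilde r_p \int 1\{B \le 0\}, \]

\[ A_{p,U} = \left(1 - g_p' G_p^{-1} \iota_p\right) \left( \int \bar B_U \, dB + \lambda \int 1\{B > 0\} \right) - g_p' G_p^{-1} \tilde r_p \int 1\{B > 0\}. \]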

Due to the recurrence property of Brownian motion, the limit distributions above are well-defined, even though they are nonstandard and nonconventional. They depend on nuisance parameters, such as $\omega^2$, $\lambda$, $r(0), \ldots, r(p)$. The dependence on the data structure is quite complicated, and thus critical values cannot be tabulated. In the next section, we turn to a bootstrap procedure to approximate the sampling distribution of $W_n$.

3 BOOTSTRAP TESTING

We discuss a bootstrap approximation of the sampling distribution of the statistic (7). Under nonstationarity we cannot resample the data $\{y_t\}$ directly. Instead, we apply conventional bootstrap methods developed for dependent data to the first differences of $y_t$ or to regression residuals using consistent estimators $\hat\alpha_i$'s, and then integrate the resampled ones. Specifically, we apply the overlapping block bootstrap of Künsch (1989) to the appropriately centered residuals and call it a residual-based block bootstrap (RBB). In the context of conventional unit root testing, Paparoditis and Politis (2003) introduce the idea of RBB and establish its asymptotic validity. Although our RBB is similar to theirs, it should be noted that our procedure is distinguished in the manner of constructing centered residuals. Furthermore, the convergence of our bootstrapped Wald statistic is more complicated due to the nonlinearity involved.

Compared to a difference-based bootstrap, an RBB is believed to have a power advantage, because it does not impose the null hypothesis to obtain the residuals. Paparoditis and Politis (2003) show analytically that this is true for the block bootstrap scheme in the context of conventional unit root testing. Expecting a similar power advantage, we focus on RBB. Following the convention in the literature, we denote the bootstrap quantities such as data, probability measure, expectation, variance, etc., with an asterisk $*$.

3.1 RESIDUAL-BASED BLOCK BOOTSTRAP

The RBB resampling is conducted through the following algorithm:

First, we define the residuals using the estimators $\hat\alpha_i$, $i = 1, 2$, in (6) as

\[ \hat u_t = \Delta y_t - \hat\alpha_1 y_{t-1} 1\{y_{t-1} \le \hat\gamma_1\} - \hat\alpha_2 y_{t-1} 1\{y_{t-1} > \hat\gamma_2\}, \quad t = 2, \ldots, n, \tag{10} \]

and then calculate centered residuals

\[ \tilde u_t = \hat u_t - \frac{1}{n-b} \sum_{i=1}^{n-b} \frac{1}{b} \sum_{j=1}^{b} \hat u_{i+j}, \quad t = 2, \ldots, n. \tag{11} \]

The centering here appears crucial to having a valid bootstrap approximation, since $\hat u_t$ in (10) does not have mean zero while $u_t$ has mean zero. However, the centering in (11) is different from that of Paparoditis and Politis (2003), who adopt a simple demeaning.

Second, we resample $\tilde u_t$ by the overlapping block technique and integrate the resampled $\tilde u_t$'s to get an integrated process. Specifically, for a positive integer $b$ $(< n)$, $k = [(n-1)/b]$ and $l = kb + 1$, a bootstrap pseudo-series $y_1^* (= y_1), \ldots, y_l^*$ is constructed as follows:

\[ y_t^* = y_{t-1}^* + \tilde u_{i_m + s}, \quad t = 2, \ldots, l, \]

where $m = [(t-2)/b]$, $s = t - mb - 1$, and $i_0, i_1, \ldots, i_{k-1}$ are random variables drawn i.i.d. uniform on the set $\{1, 2, \ldots, n-b\}$.

Third, we compute the pseudo-statistic $W_l^*$ using the generated bootstrap sample $y_1^*, \ldots, y_l^*$ as defined in (7). Repeating the second and third steps sufficiently many times, we can obtain an empirical distribution for the bootstrap statistic $W_l^*$, which in turn yields a consistent estimator for the sampling distribution of $W_n$.

The centering (11) is commonly observed in the standard block bootstrap literature because the block bootstrapped series is not stationary, even in the first moment. It ensures that the expectation (based on the bootstrap distribution) of the sample mean of $\Delta y_t^*$, not of $\Delta y_t^*$ itself, is zero. Hall, Horowitz, and Jing (1995) show that the centering (11) accelerates the convergence of the estimation error of the block bootstrap in the case of a sample mean statistic with stationary data. This higher order property is, however, unclear in our case, in which an integrated process is involved. Instead, we find that the centering (11) facilitates the proof of the bootstrap version of Theorem 2, especially the part concerning the convergence of the bias term. I am not sure whether the bias term converges to the correct limit when the simple demeaning is adopted for the centering instead of (11).

It is well known that the finite sample performance of the conventional ADF test relies heavily on the lag order. This is natural in the sense that the same critical values are used regardless of the lag order chosen. On the other hand, the truncation effect due to the lag order selection can be reduced in bootstraps. Paparoditis and Politis (2003) argue that the results based on RBB testing procedures are less sensitive to the choice of the lag order $p$, since the truncation effect is also replicated in the bootstrap.
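The centering (11) and the second resampling step can be sketched in a few lines. This is a minimal illustration only, assuming NumPy; the function name `rbb_path` and its argument layout are hypothetical, and the residuals $\hat u_2, \ldots, \hat u_n$ from (10) are taken as given rather than estimated here.

```python
import numpy as np

def rbb_path(y, u_hat, b, rng):
    """Generate one RBB pseudo-series y*_1, ..., y*_l with l = k*b + 1.

    y     : observed series y_1, ..., y_n (length n)
    u_hat : residuals u_hat_2, ..., u_hat_n from (10) (length n - 1)
    b     : block length, 0 < b < n
    rng   : numpy.random.Generator
    """
    n = len(y)
    # u[t] stores u_hat_t for t = 2, ..., n; positions 0 and 1 are unused
    u = np.full(n + 1, np.nan)
    u[2:] = u_hat
    # all overlapping blocks (u_hat_{i+1}, ..., u_hat_{i+b}), i = 1, ..., n-b
    starts = np.arange(1, n - b + 1)
    offsets = np.arange(1, b + 1)
    blocks = u[starts[:, None] + offsets[None, :]]   # shape (n-b, b)
    # centering (11): subtract the average over all blocks and positions
    centered = blocks - blocks.mean()
    k = (n - 1) // b
    draws = rng.integers(0, n - b, size=k)           # i_m - 1, uniform
    steps = centered[draws].ravel()                  # k*b resampled increments
    # integrate the resampled increments, starting from y*_1 = y_1
    return np.concatenate(([y[0]], y[0] + np.cumsum(steps)))
```

Calling `rbb_path` repeatedly and computing the statistic (7) on each pseudo-series gives the empirical distribution described in the third step.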

3.2 INVARIANCE PRINCIPLE

In this subsection, we show that the bootstrap statistic $W_l^*$ converges to the proper limit distribution defined in Theorem 3. Since the distribution of the statistic $W_l^*$ depends on each realization of $\{y_t\}$, we define the following convergence notion: $T_n^* \Rightarrow T$ in P, meaning that the distance between the law of a statistic $T_n^*$ based on the bootstrap sample and that of a random measure $T$ tends to zero in probability for any distance metrizing weak convergence (see Paparoditis and Politis (2003)).

First, we establish an invariance principle for our bootstrap. Define a standardized partial sum process $\{S_l^*(r), 0 \le r \le 1\}$ by

\[ S_l^*(r) = \begin{cases} \dfrac{1}{\sqrt{l}\,\omega^*} \sum_{t=1}^{j-1} u_t^* & \text{for } \dfrac{j-1}{l} \le r < \dfrac{j}{l} \quad (j = 2, \ldots, l), \\[2ex] \dfrac{1}{\sqrt{l}\,\omega^*} \sum_{t=1}^{l} u_t^* & \text{for } r = 1, \end{cases} \]

where $u_1^* = y_1^*$, $u_t^* = y_t^* - y_{t-1}^*$ for $t = 2, 3, \ldots, l$, and $\omega^{*2} = \mathrm{var}^*\left(l^{-1/2} \sum_{j=2}^{l} u_j^*\right)$. Due to the fast rate of convergence of the $\hat\alpha_i$'s, it can be shown that the partial sum process of the resampled centered residuals (11) is asymptotically equivalent to that of the resampled $\{u_t\}$. The invariance principle for the latter is derived in Paparoditis and Politis (2003). Let $W$ denote a standard Brownian motion. Then we have the following invariance principle.

Theorem 4 Suppose that Assumption 1 holds. If $b \to \infty$ such that $b / \sqrt{n} \to 0$ as $n \to \infty$, then $S_l^* \Rightarrow W$ in P.

Next, we establish the convergence of the stochastic integral for our RBB, i.e., a bootstrap version of Theorem 2. Without loss of generality, assume $u_t^* = 0$ if $t > l$. Let $u_t^* = \varepsilon_t^* - \left(\zeta_t^* - \zeta_{t-1}^*\right)$, where $\varepsilon_t^* = \sum_{j=0}^{\infty} \left(E_t^* u_{t+j}^* - E_{t-1}^* u_{t+j}^*\right)$ and $\zeta_t^* = \sum_{j=1}^{\infty} E_t^* u_{t+j}^*$. Recalling that $u_t^* = \tilde u_{i_m + s}$, we have

\[ E_t^* u_{t+j}^* = \begin{cases} u_{t+j}^*, & \text{if } j \le b - s, \\ E^* u_{t+j}^*, & \text{o/w}, \end{cases} \]

so that

\[ \varepsilon_t^* = \begin{cases} \sum_{j=0}^{b-1} u_{t+j}^*, & \text{if } s = 1, \\ 0, & \text{o/w}, \end{cases} \quad \text{and} \quad \zeta_t^* = \sum_{j=1}^{b-s} u_{t+j}^*, \]

since $\sum_{j=1}^{b} E^* u_{t+j}^* = 0$ for any $t \ge 2$. Note that $\varepsilon_t^*$ is, in fact, an independent resampling of the sums of blocks. Based on this decomposition, we derive the following convergence of the stochastic integral for our RBB.

Theorem 5 Suppose that Assumption 1 holds. If $b \to \infty$ such that $b / \sqrt{n} \to 0$ as $n \to \infty$, then

\[ \text{(i)} \quad l^{-1-k/2} \sum_{t=2}^{l} y_{t-1}^{*k} 1\{y_{t-1}^* \le \gamma\} \Rightarrow \int B^k 1\{B \le 0\} \text{ in P}, \]

\[ \text{(ii)} \quad l^{-1} \sum_{t=2}^{l} y_{t-1}^* 1\{y_{t-1}^* \le \gamma\} u_t^* \Rightarrow \int B 1\{B \le 0\} \, dB + \lambda \int 1\{B \le 0\} \text{ in P}. \]

It is worthwhile to note that, in the case of linear models, the above convergences are a direct consequence of the invariance principle and the continuous mapping theorem. For example, see Paparoditis and Politis (2003). Now we are ready to state the consistency of the RBB as follows.

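Theorem 6 Suppose that Assumption 1 holds. If $b \to \infty$ such that $b / \sqrt{n} \to 0$ as $n \to \infty$, then

\[ W_n^* \Rightarrow \left(\sigma^2 - g_p' G_p^{-1} g_p\right)^{-1} \left[ \left(\int \bar B_L^2\right)^{-1} A_{p,L}^2 + \left(\int \bar B_U^2\right)^{-1} A_{p,U}^2 \right] \text{ in P}. \]

4 MONTE CARLO SIMULATION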

This section examines the finite sample performance of the RBB of $W_n$, compared to that of the conventional ADF test. For the sake of fair comparison, we also apply RBB to the conventional ADF test as explained in Paparoditis and Politis (2003). Due to the heavy burden of computation, the number of simulation repetitions and of bootstrap iterations is set at 200 in the following computations.

Several details remain to be determined in order to implement the RBB in practice. One is the selection of the block length $b$ and the lag order $p$. In this experiment we try several values of $b$ and $p$ to see how dependent the performance of the RBB is on those choices. We do not attempt a data-dependent method. Another is the matter of how to set the parameter space $S$. Typically, we construct the set $S$ in the form of an interval, such as $[-\bar\gamma, \bar\gamma]$, where $\bar\gamma$ is a sample quantile of $|y_t|$. Since $\bar\gamma$ should be bounded and $\{y_t\}$ is an integrated process, we need to use lower and lower quantiles as the sample size increases. In practice, however, this argument does not answer the question of which quantile is optimal. In this regard, bootstrap-based inference is expected to have some advantage over asymptotics-based inference, since the dependence on the particular choice of quantile is replicated by the bootstrap.

In this experiment, we let $\bar\gamma = \max_{t \le n} |y_{t-1}|$ due to the small sample size, and then compute a bootstrap statistic $W_l^*$, taking the supremum of $W_l^*(\gamma)$ over the set

\[ \left\{ \gamma_1, \gamma_2 \in [-\bar\gamma, \bar\gamma] \;\Big|\; \sum_{t=2}^{n} 1\{y_{t-1}^* \le \gamma_1\} \ge m \text{ and } \sum_{t=2}^{n} 1\{y_{t-1}^* > \gamma_2\} \ge m \right\}. \tag{12} \]

As mentioned above, $\bar\gamma$ could be other quantiles, and the choice of quantile does not matter much in RBB, as long as $\bar\gamma$ is not too low a quantile. The restriction in (12) is to guarantee that the regime-specific parameters $\alpha_i$'s are estimated properly; however, the constraint (12) is not binding in large samples because of the recurrence property of the Brownian motion. The number $m$ is set at 10 in our experiment.
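The restricted threshold set (12) can be built by filtering candidate pairs. The sketch below is illustrative only, assuming NumPy and a finite list of candidate values; the function name `admissible_pairs` is hypothetical, and the ordering $\gamma_1 \le \gamma_2$ from Section 2 is imposed in addition to (12).

```python
import numpy as np

def admissible_pairs(ylag, candidates, m):
    """Threshold pairs (g1, g2) that satisfy the restriction (12).

    ylag       : lagged levels y_{t-1} (or y*_{t-1} for a bootstrap sample)
    candidates : finite list of candidate threshold values (an assumption)
    m          : minimum number of observations in each outer regime
    """
    gbar = np.max(np.abs(ylag))                 # gamma-bar = max |y_{t-1}|
    cand = [g for g in candidates if -gbar <= g <= gbar]
    pairs = []
    for g1 in cand:
        if np.sum(ylag <= g1) < m:              # lower regime too thin
            continue
        for g2 in cand:
            # keep g1 <= g2 and at least m observations above g2
            if g2 >= g1 and np.sum(ylag > g2) >= m:
                pairs.append((g1, g2))
    return pairs
```

The supremum of the Wald statistic would then be taken over the returned pairs only, so that each regime-specific slope is estimated from at least `m` observations.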

Table 1: Size of unit root tests

                      (ρ, θ)
Test   b   p    (0, 0)   (−.5, 0)   (.5, 0)   (0, −.5)   (0, .5)
ADF    6   3     .050      .065      .075       .085      .050
ADF    6   6     .030      .065      .060       .045      .050
ADF    8   3     .065      .065      .060       .080      .050
ADF    8   6     .075      .075      .070       .060      .050
Wn     6   3     .040      .065      .070       .080      .075
Wn     6   6     .055      .055      .090       .040      .060
Wn     8   3     .080      .050      .060       .105      .100
Wn     8   6     .085      .035      .070       .065      .095

Note: n = 100. Nominal size 5%. RBB-based inference.

We generate data from the following process:

\[ \Delta y_t = \alpha y_{t-1} 1\{|y_{t-1}| > \gamma\} + u_t, \tag{13} \]

\[ u_t = \rho u_{t-1} + \varepsilon_t + \theta \varepsilon_{t-1}, \]

where $\{\varepsilon_t\}$ follows iid standard normal distributions. As is common in the conventional unit root testing literature, we consider the following combinations of $(\rho, \theta)$: $(0, 0)$, $(-0.5, 0)$, $(0.5, 0)$, $(0, -0.5)$, and $(0, 0.5)$. When $\alpha$ is not zero, we set the threshold parameter $\gamma$ as 4 or 8. As the parameter $\gamma$ increases, the no-adjustment region becomes larger, which may have an influence on the power of the tests.

We first study the size of nominal 5% tests. The data are simulated from (13) with $\alpha = 0$ and with the sample size $n = 100$. Note that there is no threshold effect if $\alpha = 0$. We choose the block length $b = 6, 8$ and the lag order $p = 3, 6$. The rejection frequencies are reported in Table 1, which shows that both the ADF and $W_n$ tests have reasonable size for most error types, even in this small sample size. One exception may be that $W_n$ slightly over-rejects when the error $u_t$ has an MA component, $b = 8$ and $p = 3$. No systematic dependence of the rejection frequencies on the block length $b$ or the lag order $p$ is apparent.

Next, we examine the power properties of the tests. Let $\alpha = -0.1$ and let $\gamma$ equal either 4 or 8. We also consider two different sample sizes of 100 and 250. The tests are the same as before except that the lag order $p$ is fixed at 3. The block length $b$ is 6 at the sample size 100, and both 6 and 10 at the sample size 250. Table 2 reports the simulation results.

Table 2: Power of unit root tests

                            (ρ, θ)
n     Test   b    γ    (0, 0)   (−.5, 0)   (.5, 0)   (0, −.5)   (0, .5)
100   ADF    6    4     .140      .115      .195       .125      .190
100   ADF    6    8     .110      .080      .145       .110      .105
100   Wn     6    4     .165      .140      .190       .115      .205
100   Wn     6    8     .195      .070      .195       .095      .180
250   ADF    6    4     .380      .270      .755       .305      .745
250   ADF    6    8     .140      .105      .230       .140      .270
250   ADF    10   4     .415      .245      .730       .295      .700
250   ADF    10   8     .200      .125      .260       .185      .225
250   Wn     6    4     .715      .460      .735       .425      .765
250   Wn     6    8     .475      .250      .695       .145      .665
250   Wn     10   4     .725      .490      .645       .460      .740
250   Wn     10   8     .435      .245      .700       .250      .720

Note: Nominal size 5%. p = 3. RBB-based inference.

Across most parametrizations, $W_n$ has better power than ADF, which can be seen more clearly as the sample size $n$ increases from 100 to 250. Especially when $n = 250$ and $\gamma = 8$, the rejection frequencies of $W_n$ are about two or three times higher than those of ADF, regardless of the block length. We also find that an increase in the threshold parameter $\gamma$ generally decreases power for both the ADF and $W_n$ tests. This drop in power is natural in the sense that a higher $\gamma$ means a broader no-adjustment region. Yet, this change in $\gamma$ deteriorates ADF much more than it does $W_n$. For example, see the cases with $(\rho, \theta) = (0, 0)$, $(0.5, 0)$ and $(0, 0.5)$ when $n = 250$. Another feature of the simulation results is that both tests have relatively low power when the error $u_t$ has a negative AR or MA component. This is due to the fact that the proportion of the no-adjustment region is higher for a given $\gamma$ in those cases than in the other cases.
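The data generating process (13) with ARMA(1,1) errors can be simulated as follows. This is a minimal sketch assuming NumPy; the function name `simulate_tar`, the zero starting value $y_0 = 0$, and the burn-in length for the error process are illustrative assumptions, not details reported in the paper.

```python
import numpy as np

def simulate_tar(n, alpha, gamma, rho, theta, rng, burn=50):
    """Simulate y_1, ..., y_n from (13) with u_t = rho*u_{t-1} + e_t + theta*e_{t-1}.

    burn : number of initial error draws discarded (an assumed choice)
    """
    total = n + burn + 1
    eps = rng.standard_normal(total)          # iid N(0, 1) innovations
    u = np.zeros(total)
    for t in range(1, total):
        u[t] = rho * u[t - 1] + eps[t] + theta * eps[t - 1]
    u = u[burn + 1:]                          # keep the last n errors
    y = np.zeros(n + 1)                       # y[0] is the assumed start y_0 = 0
    for t in range(1, n + 1):
        # adjustment only outside the band [-gamma, gamma]
        adj = alpha * y[t - 1] if abs(y[t - 1]) > gamma else 0.0
        y[t] = y[t - 1] + adj + u[t - 1]
    return y[1:]
```

Setting `alpha=0` gives the unit root null used for the size experiment, while `alpha=-0.1` with `gamma` equal to 4 or 8 corresponds to the power experiment described above.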

5 AN EMPIRICAL ILLUSTRATION

We investigate the law of one price (LOP) hypothesis amongst used car markets in the US. US Bureau of Labor Statistics Monthly Consumer Price Indexes of 29 di¤erent locations3 are 3 Cities Abbv. : 1. Anchorage AN, 2. Atlanta AT, 3. Baltimore BT, 4. Boston BO, 5. Bu¤alo BU, 6. Chicago CH, 7. Cleveland CL, 8. Cincinnati CI, 9. Dallas DA, 10. Denver DN, 11. Detroit DT, 12. Honolulu HO, 13. Houston HS, 14. Kansas City KC, 15. Los Angeles LA, 16. Miami MA, 17. Milwaukee MI, 18.

15

used for the period of December 1986-June 1996 (115 observations). This data set is the same as the one used in Lo and Zivot (2001) and in Seo (2003). Like these researchers, 28 bivariate systems of log prices are constructed with a benchmark city, New Orleans. We perform several di¤erent cointegration tests using the cointegrating vector (1; ¡ 1)0

implied by the LOP. We compare the tests developed for threshold cointegration with those developed for conventional cointegration. In the univariate framework, we compare the RBB of the Wald statistic Wn developed in this paper to that of the conventional ADF. In the multivariate framework, the supW test of Seo (2003) is compared with the Wald test by Horvath and Watson (1994) (HW). In order to compute the p-values for the supW statistic, we follow the residual-based bootstrap developed in Seo (2003). All the bootstrap p-values are computed with 500 replications. For the lag order selection, the Schwarz Criterion (BIC) is used based on linear models for comparison. Table 3 reports the results of these four tests. In the case of the conventional methods, HW and ADF, it is very hard to …nd evidence for the LOP. Only two systems are rejected for the LOP by those tests, on the basis of which Lo and Zivot (2001) argue that there is no LOP in the used car markets, since used cars are more heterogeneous than other categories that they consider, such as meats, fresh fruits and vegetables. Yet, it is unclear why used cars are more heterogeneous than meats or vegetables across locations. In contrast to those conventional tests, the RBB of Wn or the supW test provides much more evidence for the LOP in the used car markets. Thirteen systems are rejected for the LOP. These di¤erent testing results indicate that we need be more cautious about the used car markets. An important feature of the used car markets is the presence of uncertainty of quality, which has generated a large empirical debate as to whether the used car markets are e¢ cient or not. An implication of the debate for the study of the LOP is that the uncertainty acts as a kind of transaction barrier in the used car markets. In other words, in the band type threshold cointegration models, a longer no-adjustment period is expected in used car markets than in, for example, new car markets. 
We make a simple comparison between the two markets by applying all four tests to the new car markets. The results are summarized in Table 4.

City codes (continued): Minneapolis MS, 19. New York NY, 20. Philadelphia PH, 21. Pittsburgh PI, 22. Portland PO, 23. San Diego SD, 24. San Francisco SF, 25. Seattle SE, 26. St. Louis SL, 27. Tampa Bay TA, 28. Washington D.C. DC, Benchmark: New Orleans NO

Cities   supW (p-value)   HW (10%: 8.3)   Wn (p-value)   ADF (p-value)
NY       0.009            1.214           0.96           0.9
PH       0.661            1.109           0.89           0.87
CH       0.22             0.285           0.85           0.96
LA       0.372            1.339           0.91           0.88
SF       0.7              1.549           0.62           0.89
Bo       0.293            0.7             0.93           1
Cl       0.432            5.371           0.32           0.56
Ci       0.71             2.017           0.03           1
DC       0.839            0.822           0.67           0.9
Ba       0.061            3.621           0.44           0.43
SL       0.087            0.03            0.43           0.97
MS       0.567            0.062           0.28           0.98
Ma       0.052            0.541           0.29           0.85
SD       0.299            3.789           0.41           0.37
Po       0.515            1.572           0.06           0.93
BU       0.038            0.707           0.77           0.98
DA       0.034            0.264           0.01           0.9
AT       0.064            7.714           0.27           0.5
AN       0.091            1.204           0.47           1
DN       0.044            2.838           0.17           0.1
DT       0.536            0.342           0.89           0.92
MI       0.677            2.86            0.08           0.88
KC       0.299            0.507           0.13           0.92
HS       0.506            0.511           0.85           0.88
HO       0.155            1.061           0.5            0.83
PI       0.888            0.575           0.01           0.95
TA       0.556            2.714           0.57           0.57
SE       0.854            0.416           0.23           0.9

Table 3: Tests for LOP in used car markets from 29 different locations (28 bivariate systems with a benchmark city, New Orleans)

                  SupW   Wn   either SupW or Wn   HW   ADF   either HW or ADF
Used car market   9      5    13                  1    1     2
New car market    14     10   17                  17   14    18

Number of rejections of no cointegration out of 28 systems at 10% size

Table 4: LOP in used car and new car markets

In contrast to the testing results for the used car markets, those for the new car markets do not seem to depend on whether we use the threshold models or the linear models. This may imply that the nominal transaction costs are not significant enough to affect the power of the conventional tests, and that transaction barriers, such as the uncertainty of quality, can be more important in the analysis of the LOP in the used car markets. Arguably, a significant power loss of the conventional linear cointegration tests is likely, and so those tests can be quite misleading.
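The used car counts in Table 4 can be cross-checked against the p-values in Table 3. The sketch below assumes that each column of Table 3 lists the cities in the same order as the city row, and that a test rejects at 10% size when its p-value is at most 0.10; both are assumptions about how the tables were built. The variable names are illustrative, and the HW column is left out because it reports statistics rather than p-values.

```python
cities = ("NY PH CH LA SF Bo Cl Ci DC Ba SL MS Ma SD "
          "Po BU DA AT AN DN DT MI KC HS HO PI TA SE").split()

# p-values transcribed from Table 3 (used car markets).
sup_w = [0.009, 0.661, 0.22, 0.372, 0.7, 0.293, 0.432, 0.71, 0.839, 0.061,
         0.087, 0.567, 0.052, 0.299, 0.515, 0.038, 0.034, 0.064, 0.091, 0.044,
         0.536, 0.677, 0.299, 0.506, 0.155, 0.888, 0.556, 0.854]
w_n = [0.96, 0.89, 0.85, 0.91, 0.62, 0.93, 0.32, 0.03, 0.67, 0.44,
       0.43, 0.28, 0.29, 0.41, 0.06, 0.77, 0.01, 0.27, 0.47, 0.17,
       0.89, 0.08, 0.13, 0.85, 0.5, 0.01, 0.57, 0.23]
adf = [0.9, 0.87, 0.96, 0.88, 0.89, 1, 0.56, 1, 0.9, 0.43,
       0.97, 0.98, 0.85, 0.37, 0.93, 0.98, 0.9, 0.5, 1, 0.1,
       0.92, 0.88, 0.92, 0.88, 0.83, 0.95, 0.57, 0.9]


def rejections(p_values, level=0.10):
    """Cities whose p-value is at or below the nominal level."""
    return {city for city, p in zip(cities, p_values) if p <= level}


rej_sup_w = rejections(sup_w)
rej_w_n = rejections(w_n)
rej_adf = rejections(adf)
rej_either_threshold = rej_sup_w | rej_w_n  # rejected by supW or Wn

print(len(rej_sup_w), len(rej_w_n), len(rej_either_threshold), len(rej_adf))
# prints: 9 5 13 1
```

Under these assumptions the counts agree with the used car row of Table 4 for SupW, Wn, their union, and ADF.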

A Proof of Theorems

We first prove the following lemma, which will be used repeatedly to prove the main theorems. It is more general than required and holds for any bounded intervals.

Lemma 7 Suppose $\{a_t, b_t\}$ is strictly stationary with $E|a_t| < \infty$ and $E|b_t| < \infty$, and $\{w_t\}$ is uniformly integrable. Then,

\[ \frac{1}{n}\sum_t \left| 1\{a_t < y_t < b_t\}\, w_t \right| \xrightarrow{L^1} 0. \]

Proof. Fix $\varepsilon > 0$. First, due to the uniform integrability of $\{w_t\}$, there is $c$ s.t. $\sup_t E\left[|w_t|\, 1\{|w_t| > c\}\right] < \varepsilon/2$. Write

\[ \frac{1}{n}\sum_t E\left| 1\{a_t < y_t < b_t\}\, w_t \right| \le c\,\frac{1}{n}\sum_t E\, 1\{a_t < y_t < b_t\} + \sup_t E\left[|w_t|\, 1\{|w_t| > c\}\right]. \]

Second, we can choose $M(c) > 0$ s.t., for all $t$, $\Pr\{a_t < -M \text{ or } b_t > M\}$ [...] $N(c)$. Since $\varepsilon$ is arbitrary, the proof is complete.

Remark 1 The uniform integrability condition is quite general in that it covers any strictly stationary process with a finite first moment and any weakly stationary process with a finite second moment.
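The content of Lemma 7 can be illustrated in the simplest special case $w_t = 1$ with constant bounds: a unit root process spends a vanishing share of its time inside any fixed bounded interval. The sketch below is an illustration only. It assumes i.i.d. standard normal increments and the band $(-1, 1)$, neither of which comes from the paper, and the function name `band_fraction` is hypothetical.

```python
import random


def band_fraction(n: int, lower: float = -1.0, upper: float = 1.0, seed: int = 0) -> float:
    """Share of t = 1..n with lower < y_t < upper for a Gaussian random walk y_t."""
    rng = random.Random(seed)
    y = 0.0
    hits = 0
    for _ in range(n):
        y += rng.gauss(0.0, 1.0)  # unit root: y_t = y_{t-1} + u_t
        if lower < y < upper:
            hits += 1
    return hits / n


# The occupation share of the fixed band shrinks roughly like n^(-1/2).
for n in (1_000, 10_000, 100_000):
    print(n, band_fraction(n))
```

The printed shares depend on the seed, but they are of order $n^{-1/2}$, which is the mechanism behind the $L^1$ convergence in the lemma.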

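A.1 Proof of Theorem 1

For simplicity of exposition, assume $\gamma \ge 0$. First note that, by Lemma 7, as $n \to \infty$,

\[ \sup_{0 \le \gamma \le \bar\gamma} E\left| \frac{1}{n^{1+k/2}} \sum_t y_t^k\, 1\{0 < y_t \le \gamma\}\, w_t \right| \le \frac{\bar\gamma^k}{n^{k/2}}\, \frac{1}{n}\sum_t E\left| 1\{0 < y_t \le \bar\gamma\}\, w_t \right| \to 0. \]

When $k > 0$, Theorem 3.3 of Hansen (1992) applies directly to give the results. When $k = 0$, the transformation by the indicator function is not continuous and the theorem is not applicable. The proof strategy for the remaining case is as follows. First, consider $X_n^\delta$ s.t. $X_n^\delta \Rightarrow X^\delta$ as $n \to \infty$, and $X^\delta \to_{a.s.} X$ as $\delta \to 0$. Second, show that $X_n - X_n^\delta \to_p 0$ as $n \to \infty$ and $\delta \to 0$. More specifically, fix $\varepsilon > 0$. For every $\eta > 0$ there exist $\delta_0$ and $N_0$ s.t.

\[ \forall\, 0 < \delta < \delta_0,\ \forall\, n > N_0: \quad P\left\{ \left| X_n - X_n^\delta \right| > \eta \right\} < \varepsilon. \]

Then

\begin{align*}
P\{X_n \le y\} &= P\left\{X_n - X_n^\delta + X_n^\delta \le y\right\} \\
&\le P\left\{X_n^\delta \le y + \eta\right\} + \varepsilon, && \forall\, \delta < \delta_0,\ \forall\, n > N_0 \\
&\le P\left\{X^\delta \le y + \eta\right\} + 2\varepsilon, && \forall\, n > N_1 \\
&\le P\{X \le y + \eta\} + 3\varepsilon, && \forall\, \delta < \delta_1 \\
&\le P\{X \le y\} + 4\varepsilon, && \forall\, \eta < \eta_0 \text{ by right continuity of the cdf.}
\end{align*}

Since the choice of $\varepsilon$ is arbitrary and we can choose $\eta$ and $\delta$ to satisfy the above conditions, $X_n \Rightarrow X$.

Now consider the continuous function $1_\delta(y) = 1\{y \le 0\} + \left(1 - \frac{1}{\delta} y\right) 1\{0 < y \le \delta\}$. Then

\[ \left| \int 1_\delta(B) - \int 1\{B \le 0\} \right| = \int \left(1 - \frac{B}{\delta}\right) 1\{0 < B \le \delta\} \to_{a.s.} 0 \]

as $\delta \to 0$ by the monotone convergence theorem, and

\begin{align*}
E\left| \frac{1}{n}\sum_t \left( 1\left\{\frac{y_t}{\sqrt n} \le 0\right\} - 1_\delta\!\left(\frac{y_t}{\sqrt n}\right) \right) w_t \right|
&\le E\left| \frac{1}{n}\sum_t 1\left\{0 < \frac{y_t}{\sqrt n} \le \delta\right\} w_t \right| \\
&\le a \int_0^1 E\, 1\left\{0 < \frac{y_{[nr]}}{\sqrt n} \le \delta\right\} dr + \sup_t E\, |w_t|\, 1\{|w_t| \ge a\} \\
&< 3\varepsilon \quad \text{for every } n > N \text{ and for every } \delta < \bar\delta.
\end{align*}

The last inequality holds as follows. From uniform integrability there exists $a$ s.t. $\sup_t E\, |w_t|\, 1\{|w_t| \ge a\} < \varepsilon$. Fix $a$. Then for each $r$,

\[ E\, 1\left\{0 < \frac{y_{[nr]}}{\sqrt n} \le \delta\right\} \to P\{0 < B(r) \le \delta\}, \]

and then, by the dominated convergence theorem, there exist $N$ and $\bar\delta$ s.t.

\begin{align*}
\int_0^1 E\, 1\left\{0 < \frac{y_{[nr]}}{\sqrt n} \le \delta\right\} dr &< \int_0^1 P\{0 < B(r) \le \delta\}\, dr + \frac{\varepsilon}{a}, && \text{for every } n > N \\
&< \frac{2\varepsilon}{a}, && \text{for every } \delta < \bar\delta.
\end{align*}

Therefore it suffices to show that $\frac{1}{n}\sum_t 1_\delta(y_t/\sqrt n)\, w_t \Rightarrow \mu \int 1_\delta(B)$. Let $v_t = w_t - \mu$; then, by Theorem 3.3 of Hansen (1992), $\left| \frac{1}{n}\sum_t 1_\delta(y_t/\sqrt n)\, v_t \right| \to_p 0$, which completes the proof since $\frac{1}{n}\sum_t 1_\delta(y_t/\sqrt n) \Rightarrow \int 1_\delta(B)$ by the continuous mapping theorem.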
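The smoothing device used in this proof replaces the discontinuous indicator $1\{y \le 0\}$ by the continuous function $1_\delta$, which equals one for $y \le 0$, falls linearly to zero on $(0, \delta]$, and is zero afterwards. A minimal sketch follows; the function names are illustrative rather than taken from the paper.

```python
def indicator(y: float) -> float:
    """The discontinuous indicator 1{y <= 0}."""
    return 1.0 if y <= 0 else 0.0


def smooth_indicator(y: float, delta: float) -> float:
    """Continuous approximation 1_delta(y) = 1{y <= 0} + (1 - y/delta) 1{0 < y <= delta}."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    if y <= 0:
        return 1.0
    if y <= delta:
        return 1.0 - y / delta
    return 0.0


# The two functions differ only on (0, delta), so the gap at a fixed y > 0
# vanishes once delta drops below y.
for delta in (1.0, 0.5, 0.1):
    print(delta, smooth_indicator(0.25, delta) - indicator(0.25))
```

Because the two functions differ only on $(0, \delta)$, bounding the time the normalized process spends in that interval is all that is needed to pass from $1_\delta$ to the indicator.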

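A.2 Proof of Theorem 2

For simplicity assume $\gamma \ge 0$ and that $\bar\gamma$ is the maximum of $S$. Write

\[ \frac{1}{n}\sum_t y_{t-1}\, 1\{y_{t-1} \le \gamma\}\, u_t = \frac{1}{n}\sum_t y_{t-1}\, 1\{y_{t-1} \le \gamma\}\, \varepsilon_t + L_n + R_{1n} + R_{2n}, \]

where

\begin{align*}
L_n &= \frac{1}{n}\sum_t u_t \zeta_t \cdot 1\{y_{t-1} \le \gamma\}\, 1\{y_t \le \gamma\}, \\
R_{1n} &= \frac{1}{n}\, y_{n-1}\, 1\{y_{n-1} \le \gamma\}\, \zeta_n, \\
R_{2n} &= \frac{1}{n}\sum_t \left( y_t\, 1\{y_t \le \gamma < y_{t-1}\} + y_{t-1}\, 1\{y_{t-1} \le \gamma < y_t\} \right) \zeta_t.
\end{align*}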

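It follows from Theorem 1 of Seo (2002) and Hansen (1992) that

\[ \frac{1}{n}\sum_t y_{t-1}\, 1\{y_{t-1} \le \gamma\}\, \varepsilon_t \Rightarrow \int B \cdot 1\{B \le 0\}\, dB \]

uniformly in $\gamma \in S$. Next, since $u_t \zeta_t$ is uniformly integrable, by Lemma 7,

\[ \sup_{0 \le \gamma \le \bar\gamma} \frac{1}{n}\sum_t \left| u_t \zeta_t \cdot 1\{0 < y_{t-1} \le \gamma\}\, 1\{0 < y_t \le \gamma\} \right| \le \frac{1}{n}\sum_t \left| u_t \zeta_t \cdot 1\{0 < y_t \le \bar\gamma\} \right| \to_p 0. \]

Again by Lemma 7 we have

\[ \frac{1}{n}\sum_t u_t \zeta_t \cdot 1\{y_{t-1} \le 0\}\, 1\{y_t \le 0\} - \frac{1}{n}\sum_t u_t \zeta_t \cdot 1\{y_t \le 0\} = -\frac{1}{n}\sum_t u_t \zeta_t \cdot 1\{y_{t-1} + u_t \le 0 < y_{t-1}\} \to_p 0. \]

Since $u_t \zeta_t - \lambda$ is a uniformly integrable $L^1$-mixingale, it follows from Theorem 1 that

\[ L_n \Rightarrow \lambda \int 1\{B \le 0\}. \]

Then we show that $R_{1n}$ and $R_{2n}$ are $o_p(1)$ uniformly in $\gamma$. First, note that

\[ \sup_\gamma \sup_{t \le n} \frac{1}{n} \left| y_t\, 1\{y_t \le \gamma\}\, \zeta_{t+1} \right| \le \sup_{t \le n} \frac{1}{\sqrt n} |y_t| \cdot \sup_{t \le n} \frac{1}{\sqrt n} \left| \zeta_{t+1} \right| = O_p(1)\, o_p(1). \]

Second, we show $R_{2n} = o_p(1)$. By replacing $y_t$ by $y_{t-1} + u_t$, we have

\begin{align*}
& y_t\, 1\{y_t \le \gamma < y_{t-1}\} + y_{t-1}\, 1\{y_{t-1} \le \gamma < y_t\} \\
&\quad = y_{t-1} \left( 1\{\gamma < y_{t-1} \le \gamma - u_t\} + 1\{\gamma - u_t < y_{t-1} \le \gamma\} \right) + u_t\, 1\{\gamma < y_{t-1} \le \gamma - u_t\},
\end{align*}

and then

\begin{align*}
|R_{2n}| &\le \frac{1}{n}\sum_t \left| y_{t-1} \zeta_t \left( 1\{\gamma < y_{t-1} \le \gamma - u_t\} + 1\{\gamma - u_t < y_{t-1} \le \gamma\} \right) \right| + \frac{1}{n}\sum_t \left| u_t \zeta_t\, 1\{\gamma < y_{t-1} \le \gamma - u_t\} \right| \\
&\le \frac{1}{n}\sum_t \left( |y_{t-1} \zeta_t| + |u_t \zeta_t| \right) 1\{\gamma - |u_t| < y_{t-1} \le \gamma + |u_t|\} \\
&\le \frac{1}{n}\sum_t \left( |\bar\gamma| + 2 |u_t \zeta_t| \right) 1\{|y_{t-1}| \le \bar\gamma + |u_t|\}. \tag{16}
\end{align*}

Thus, by Lemma 7, $|R_{2n}| \to_p 0$ uniformly in $\gamma$, since the choice of $\gamma$ is arbitrary.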
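The algebraic identity behind the bound on $R_{2n}$, obtained by substituting $y_t = y_{t-1} + u_t$, can be checked mechanically. The sketch below evaluates both sides on an integer grid so that the arithmetic is exact. The grid and the function names are illustrative choices, not part of the paper.

```python
def ind(condition: bool) -> int:
    """Indicator function returning 1 or 0."""
    return 1 if condition else 0


def lhs(y_prev: int, u: int, gamma: int) -> int:
    """y_t 1{y_t <= g < y_{t-1}} + y_{t-1} 1{y_{t-1} <= g < y_t}, with y_t = y_{t-1} + u_t."""
    y = y_prev + u
    return y * ind(y <= gamma < y_prev) + y_prev * ind(y_prev <= gamma < y)


def rhs(y_prev: int, u: int, gamma: int) -> int:
    """The same quantity written in terms of y_{t-1} and u_t only."""
    crossed_down = ind(gamma < y_prev <= gamma - u)  # y_{t-1} above, y_t at or below gamma
    crossed_up = ind(gamma - u < y_prev <= gamma)    # y_{t-1} at or below, y_t above gamma
    return y_prev * (crossed_down + crossed_up) + u * crossed_down


# Exhaustive check over a small integer grid (exact arithmetic, no rounding issues).
mismatches = [
    (y_prev, u, gamma)
    for y_prev in range(-20, 21)
    for u in range(-5, 6)
    for gamma in range(0, 4)
    if lhs(y_prev, u, gamma) != rhs(y_prev, u, gamma)
]
print(len(mismatches))  # prints: 0
```

Both indicators can be non-zero only when $y_{t-1}$ lies within $|u_t|$ of $\gamma$, which is why the final bound involves $1\{|y_{t-1}| \le \bar\gamma + |u_t|\}$.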

A.3 Proof of Theorem 3

• Notations used only in this section:

\begin{align*}
x_{1t} &= \begin{pmatrix} y_{t-1}\, 1\{y_{t-1} \le \gamma_1\} - \frac{1}{n}\sum_{s=p+2}^{n} y_{s-1}\, 1\{y_{s-1} \le \gamma_1\} \\ y_{t-1}\, 1\{y_{t-1} > \gamma_2\} - \frac{1}{n}\sum_{s=p+2}^{n} y_{s-1}\, 1\{y_{s-1} > \gamma_2\} \end{pmatrix}, \\
x_{2t} &= \left( u_{t-1} - \frac{1}{n}\sum_{s=p+1}^{n-1} u_s,\ \cdots,\ u_{t-p} - \frac{1}{n}\sum_{s=2}^{n-p} u_s \right)', \\
x_t &= \left( x_{1t}',\ x_{2t}' \right)', \\
u_{tp} &= u_{t-1} + \cdots + u_{t-p} = y_{t-1} - y_{t-p-1}.
\end{align*}

A.3.1 Limit distribution of $\hat\alpha_i$

We introduce some notation to ease our exposition. Let $\bar u_t = u_t - \frac{1}{n}\sum_{s=p+2}^{n} u_s$, and let $x_{1t}$, $x_{2t}$ and $x_t$ be as defined at the beginning of this section.

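Here, the dependence of $x_{1t}$ and $x_t$ on $\gamma$ is suppressed. Then, under the null,

\begin{align*}
\begin{pmatrix} n\hat\alpha_1(\gamma) \\ n\hat\alpha_2(\gamma) \end{pmatrix}
&= \left( \frac{1}{n^2}\sum_{t=p+2}^{n} x_{1t} x_{1t}' - \frac{1}{n} \cdot \frac{1}{n}\sum_{t=p+2}^{n} x_{1t} x_{2t}' \left( \frac{1}{n}\sum_{t=p+2}^{n} x_{2t} x_{2t}' \right)^{-1} \frac{1}{n}\sum_{t=p+2}^{n} x_{2t} x_{1t}' \right)^{-1} \\
&\quad \times \left( \frac{1}{n}\sum_{t=p+2}^{n} x_{1t} \bar u_t - \frac{1}{n}\sum_{t=p+2}^{n} x_{1t} x_{2t}' \left( \frac{1}{n}\sum_{t=p+2}^{n} x_{2t} x_{2t}' \right)^{-1} \frac{1}{n}\sum_{t=p+2}^{n} x_{2t} \bar u_t \right).
\end{align*}

We derive the limit distributions term by term. First, by Theorem 2 and the CLT, we have

\[ \frac{1}{n^2}\sum_{t=p+2}^{n} x_{1t} x_{1t}' \Rightarrow \begin{pmatrix} \int \bar B_L^2 & 0 \\ 0 & \int \bar B_U^2 \end{pmatrix}, \tag{17} \]

and

\begin{align*}
\frac{1}{n}\sum_{t=p+2}^{n} x_{1t} \bar u_t
&= \begin{pmatrix} \frac{1}{n}\sum_{t=p+2}^{n} y_{t-1}\, 1\{y_{t-1} \le \gamma_1\}\, u_t \\ \frac{1}{n}\sum_{t=p+2}^{n} y_{t-1}\, 1\{y_{t-1} > \gamma_2\}\, u_t \end{pmatrix}
- \begin{pmatrix} \frac{1}{n\sqrt n}\sum_{t=p+2}^{n} y_{t-1}\, 1\{y_{t-1} \le \gamma_1\} \cdot \frac{1}{\sqrt n}\sum_{t=p+2}^{n} u_t \\ \frac{1}{n\sqrt n}\sum_{t=p+2}^{n} y_{t-1}\, 1\{y_{t-1} > \gamma_2\} \cdot \frac{1}{\sqrt n}\sum_{t=p+2}^{n} u_t \end{pmatrix} \\
&\Rightarrow \begin{pmatrix} \int B\, 1\{B \le 0\}\, dB + \lambda \int 1\{B \le 0\} - B(1) \int B\, 1\{B \le 0\} \\ \int B\, 1\{B > 0\}\, dB + \lambda \int 1\{B > 0\} - B(1) \int B\, 1\{B > 0\} \end{pmatrix} \\
&= \begin{pmatrix} \int \bar B_L\, dB + \lambda \int 1\{B \le 0\} \\ \int \bar B_U\, dB + \lambda \int 1\{B > 0\} \end{pmatrix}, \tag{18}
\end{align*}

where $\bar B_L = B\, 1\{B \le 0\} - \int B\, 1\{B \le 0\}$ and $\bar B_U = B\, 1\{B > 0\} - \int B\, 1\{B > 0\}$. The last equality follows from the fact that $B(1) = \int dB$. Next, we show that

\[ \frac{1}{n}\sum_{t=p+2}^{n} y_{t-1}\, 1\{y_{t-1} \le \gamma_1\}\, u_{t-p} \Rightarrow \int B\, 1\{B \le 0\}\, dB + (\lambda + \bar r_p) \int 1\{B \le 0\}, \tag{19} \]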

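where $\bar r_p = r(0) + r(1) + \cdots + r(p-1)$. Since

\begin{align*}
y_{t-1}\, 1\{y_{t-1} \le \gamma_1\}\, u_{t-p} &= y_{t-p-1}\, 1\{y_{t-p-1} \le \gamma_1\}\, u_{t-p} \\
&\quad + \left( y_{t-1}\, 1\{y_{t-1} \le \gamma_1\} - y_{t-p-1}\, 1\{y_{t-p-1} \le \gamma_1\} \right) u_{t-p}
\end{align*}

and

\[ \frac{1}{n}\sum_{t=p+2}^{n} y_{t-p-1}\, 1\{y_{t-p-1} \le \gamma_1\}\, u_{t-p} \Rightarrow \int B\, 1\{B \le 0\}\, dB + \lambda \int 1\{B \le 0\} \]

from Theorem 2, it remains to show

\[ \frac{1}{n}\sum_{t=p+2}^{n} \left( y_{t-1}\, 1\{y_{t-1} \le \gamma_1\} - y_{t-p-1}\, 1\{y_{t-p-1} \le \gamma_1\} \right) u_{t-p} \Rightarrow \bar r_p \int 1\{B \le 0\}. \]

To do so, note that

\begin{align*}
& y_{t-1}\, 1\{y_{t-1} \le \gamma_1\} - y_{t-p-1}\, 1\{y_{t-p-1} \le \gamma_1\} \\
&\quad = u_{tp}\, 1\{y_{t-1} \le \gamma_1\} + y_{t-p-1} \left( 1\{\gamma_1 < y_{t-p-1} \le \gamma_1 - u_{tp}\} - 1\{\gamma_1 - u_{tp} < y_{t-p-1} \le \gamma_1\} \right),
\end{align*}

where $u_{tp} = u_{t-1} + \cdots + u_{t-p} = y_{t-1} - y_{t-p-1}$. Next, by Lemma 7, we have

\begin{align*}
& \frac{1}{n}\sum_{t=p+2}^{n} \left| y_{t-p-1} \left( 1\{\gamma_1 < y_{t-p-1} \le \gamma_1 - u_{tp}\} - 1\{\gamma_1 - u_{tp} < y_{t-p-1} \le \gamma_1\} \right) u_{t-p} \right| \\
&\quad \le \frac{1}{n}\sum_{t=p+2}^{n} \left( |\bar\gamma| + |u_{tp}| \right) |u_{t-p}|\, 1\{|y_{t-p-1}| \le |\bar\gamma| + |u_{tp}|\} \to_p 0, \tag{20}
\end{align*}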

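and by Theorem 1,

\[ \frac{1}{n}\sum_{t=p+2}^{n} u_{tp}\, u_{t-p}\, 1\{y_{t-1} \le \gamma_1\} \Rightarrow \bar r_p \int 1\{B \le 0\}. \tag{21} \]

Then Theorem 2, (20), and (21) establish the convergence in (19), which in turn yields

\[ \frac{1}{n}\sum_{t=p+2}^{n} x_{1t} x_{2t}' \Rightarrow \begin{pmatrix} \int \bar B_L\, dB + (\lambda + \bar r_1) \int 1\{B \le 0\}, & \cdots, & \int \bar B_L\, dB + (\lambda + \bar r_p) \int 1\{B \le 0\} \\ \int \bar B_U\, dB + (\lambda + \bar r_1) \int 1\{B > 0\}, & \cdots, & \int \bar B_U\, dB + (\lambda + \bar r_p) \int 1\{B > 0\} \end{pmatrix}. \]

Finally, it follows from the LLN that

\[ \frac{1}{n}\sum_{t=p+2}^{n} x_{2t} x_{2t}' \Rightarrow G_p \quad \text{and} \quad \frac{1}{n}\sum_{t=p+2}^{n} x_{2t} \bar u_t \Rightarrow g_p, \tag{22} \]

where

\[ G_p = \begin{pmatrix} r(0) & r(1) & \cdots & r(p-1) \\ r(1) & r(0) & \cdots & r(p-2) \\ \vdots & \vdots & \ddots & \vdots \\ r(p-1) & r(p-2) & \cdots & r(0) \end{pmatrix} \quad \text{and} \quad g_p = \begin{pmatrix} r(1) \\ \vdots \\ r(p) \end{pmatrix}. \]

Let $\iota_p$ be a $p$-dimensional vector of ones and $\tilde r_p = (\bar r_1, \cdots, \bar r_p)'$. Then

\[ \begin{pmatrix} n\hat\alpha_1(\gamma) \\ n\hat\alpha_2(\gamma) \end{pmatrix} \Rightarrow \begin{pmatrix} \left( \int \bar B_L^2 \right)^{-1} A_{p,L} \\ \left( \int \bar B_U^2 \right)^{-1} A_{p,U} \end{pmatrix}, \tag{23} \]

where

\begin{align*}
A_{p,L} &= \left( 1 - g_p' G_p^{-1} \iota_p \right) \left[ \int \bar B_L\, dB + \lambda \int 1\{B \le 0\} \right] - g_p' G_p^{-1} \tilde r_p \int 1\{B \le 0\}, \\
A_{p,U} &= \left( 1 - g_p' G_p^{-1} \iota_p \right) \left[ \int \bar B_U\, dB + \lambda \int 1\{B > 0\} \right] - g_p' G_p^{-1} \tilde r_p \int 1\{B > 0\}.
\end{align*}

Since the convergence above is uniform in $\gamma$ and the limit distribution does not depend on the parameter $\gamma$, the limit distribution of $\hat\alpha_i$ is the one in (23).

A.3.2 Limit distribution of $W_n$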

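We first derive the limit of $\hat\sigma^2(\gamma)$. Define $\kappa_n$ as a $(p+2)$-dimensional diagonal matrix whose first two elements are $n^{-1}$ and whose other elements are $n^{-1/2}$. Then, the convergence results (17), (18), (19) and (22) yield

\begin{align*}
\hat\sigma^2(\gamma) &= \frac{1}{n}\sum_{t=p+2}^{n} \bar u_t^2 - \left( n^{-1/2} \kappa_n \sum_{t=p+2}^{n} x_t \bar u_t \right)' \left( \kappa_n \sum_{t=p+2}^{n} x_t x_t' \kappa_n \right)^{-1} \left( n^{-1/2} \kappa_n \sum_{t=p+2}^{n} x_t \bar u_t \right) \\
&\Rightarrow \sigma^2 - g_p' G_p^{-1} g_p. \tag{24}
\end{align*}

Finally, from (17), (23) and (24), we conclude that

\[ W_n \Rightarrow \left( \sigma^2 - g_p' G_p^{-1} g_p \right)^{-1} \left[ \left( \int \bar B_L^2 \right)^{-1} A_{p,L}^2 + \left( \int \bar B_U^2 \right)^{-1} A_{p,U}^2 \right]. \]

A.4 Proof of Theorem 4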

This proof closely follows that of Theorem 1 of Paparoditis and Politis (2003) (PP), which consists of two steps. First, they show that

\[ \left| \frac{1}{\sqrt l}\sum_{m=0}^{M_r}\sum_{j=1}^{B} \tilde u_{i_m+j} - \frac{1}{\sqrt l}\sum_{m=0}^{M_r}\sum_{j=1}^{B} \left( u_{i_m+j} - E^* u_{i_m+j} \right) \right| = o_p(1), \quad \text{uniformly in } r, \tag{25} \]

where $M_r = [([lr] - 2)/b]$ and $B = \min\{b, [lr] - mb - 1\}$. Note that $\frac{1}{\sqrt l}\sum_{m=0}^{M_r}\sum_{j=1}^{B} \tilde u_{i_m+j}$ is a rewriting of $\frac{1}{\sqrt l}\sum_{t=1}^{[lr]} u_t^*$, and that

\[ \hat u_t = u_t - \hat\alpha\, y_{t-1}, \quad \text{and} \quad \tilde u_t = \hat u_t - (n-1)^{-1}\sum_{j=2}^{n} \hat u_j, \]

where $\hat\alpha = O_p\left(n^{-1}\right)$. Roughly speaking, resampling from the demeaned $\hat u_t$ is asymptotically equivalent to resampling from $u_t$ itself due to the fast convergence rate of $\hat\alpha$. Second, they establish an invariance principle for the block resampling from $u_t$, that is,

\[ \frac{1}{\sqrt l}\sum_{m=0}^{M_r}\sum_{j=1}^{B} \left( u_{i_m+j} - E^* u_{i_m+j} \right) \Rightarrow B \quad \text{in } P. \]

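This second step should be identical in our case, since we have the same assumption on $\{u_t\}$ as that of PP. In our case,

\begin{align*}
\hat u_t &= u_t - \hat\alpha_1\, y_{t-1}\, 1\{y_{t-1} \le \gamma_1\} - \hat\alpha_2\, y_{t-1}\, 1\{y_{t-1} > \gamma_2\}, \\
\tilde u_t &= \hat u_t - \frac{1}{n-b}\sum_{i=1}^{n-b} \frac{1}{b}\sum_{j=1}^{b} \hat u_{i+j}. \tag{26}
\end{align*}

Note that $\hat u_t$ is not centered at the sample mean but at the mean of the block means. However, the $\hat\alpha_i$'s are also $O_p\left(n^{-1}\right)$ by Theorem 3, and $y_{t-1}\, 1\{y_{t-1} \le \gamma_1\}$ has the same asymptotic order as that of $y_{t-1}$. Therefore, (25) with $\hat u_t$ defined in (26) follows readily. In particular, as in the proof of Theorem 3.1 in PP, it is enough to observe that

\begin{align*}
E^* \left[ \sum_{j=1}^{b} y_{i_m+j-1}\, 1\{y_{i_m+j-1} \le \gamma_1\} \right] &= \frac{1}{n-b}\sum_{t=1}^{n-b}\sum_{j=1}^{b} y_{t+j-1}\, 1\{y_{t+j-1} \le \gamma_1\} \\
&\le b \sup_{t \le n} |y_t| = O_p\left( b\sqrt n \right), \tag{27}
\end{align*}

and

\begin{align*}
E^* \left[ \sum_{j=1}^{b} y_{i_m+j-1}\, 1\{y_{i_m+j-1} \le \gamma_1\} \right]^2 &= \frac{1}{n-b}\sum_{t=1}^{n-b} \left[ \sum_{j=1}^{b} y_{t+j-1}\, 1\{y_{t+j-1} \le \gamma_1\} \right]^2 \\
&\le \left[ b \sup_{t \le n} |y_t| \right]^2 = O_p\left( b^2 n \right), \tag{28}
\end{align*}

which in turn establish

\[ E^* \left( \frac{1}{\sqrt l}\sum_{m=0}^{M_r}\sum_{j=1}^{B} \left( y_{i_m+j-1}\, 1\{y_{i_m+j-1} \le \gamma_1\} - \frac{1}{n-b}\sum_{t=1}^{n-b} \frac{1}{b}\sum_{j=1}^{b} y_{t+j-1}\, 1\{y_{t+j-1} \le \gamma_1\} \right) \right)^2 = O_p(bn), \]

uniformly in $\gamma_1 \in S$, and finally

\[ \hat\alpha_1 \frac{1}{\sqrt l}\sum_{m=0}^{M_r}\sum_{j=1}^{B} \left( y_{i_m+j-1}\, 1\{y_{i_m+j-1} \le \gamma_1\} - \frac{1}{n-b}\sum_{t=1}^{n-b} \frac{1}{b}\sum_{j=1}^{b} y_{t+j-1}\, 1\{y_{t+j-1} \le \gamma_1\} \right) = O_{p^*}\left( b^{1/2} n^{-1/2} \right), \]

uniformly in $\gamma_1 \in S$. As in the above proof, the following lemma is straightforward from Lemma 8.1 of PP, (27) and (28):

Lemma 8 Under the assumptions of Theorem 4 and as $n \to \infty$, we have
(i) $l^{-1}\sum_{j=1}^{l} u_j^* \to_p 0$;
(ii) $\omega^{*2} = \mathrm{var}^*\left( l^{-1/2}\sum_{j=2}^{l} u_j^* \right) \to_p \omega^2$;
(iii) $\sigma^{*2} = l^{-1}\sum_{j=1}^{l} u_j^{*2} \to_p \sigma^2$.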
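The centering in (26) and the block resampling it feeds into can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the residuals $\hat u_2, \ldots, \hat u_n$ are stored in a zero-based list, block starting points are drawn uniformly from all admissible positions, and the function names `center_residuals` and `resample_blocks` are hypothetical.

```python
import random
from typing import List, Sequence


def center_residuals(u_hat: Sequence[float], b: int) -> List[float]:
    """Subtract the average of all overlapping length-b block means, in the spirit of (26)."""
    starts = range(len(u_hat) - b + 1)  # every admissible block start
    block_means = [sum(u_hat[k:k + b]) / b for k in starts]
    grand_mean = sum(block_means) / len(block_means)
    return [u - grand_mean for u in u_hat]


def resample_blocks(u_tilde: Sequence[float], b: int, length: int,
                    rng: random.Random) -> List[float]:
    """Concatenate randomly chosen length-b blocks and truncate to the requested length."""
    starts = range(len(u_tilde) - b + 1)
    out: List[float] = []
    while len(out) < length:
        k = rng.choice(starts)
        out.extend(u_tilde[k:k + b])
    return out[:length]


# Toy residuals (made up for illustration): block means are 0, 0, 2, so the
# centering constant is 2/3 rather than the sample mean 1.
toy = [0.0, 0.0, 0.0, 4.0]
centered = center_residuals(toy, b=2)
pseudo = resample_blocks(centered, b=2, length=7, rng=random.Random(0))
print(centered, len(pseudo))
```

Centering at the mean of the block means, rather than at the sample mean, makes the bootstrap expectation of a resampled block sum exactly zero, because end-of-sample residuals appear in fewer blocks than interior ones.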

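A.5 Proof of Theorem 5

For part (i), note that, for any $\gamma \in S$,

\begin{align*}
l^{-2} \left| \sum_{t=2}^{l} y_{t-1}^{*2}\, 1\left\{ y_{t-1}^* \le \gamma \right\} - \sum_{t=2}^{l} y_{t-1}^{*2}\, 1\left\{ y_{t-1}^* \le 0 \right\} \right|
&\le l^{-2} \left| \sum_{t=2}^{l} y_{t-1}^{*2}\, 1\left\{ \left| y_{t-1}^* \right| \le \bar\gamma \right\} \right| \\
&\le l^{-1} \bar\gamma^2 \to 0.
\end{align*}

Then, it follows from the continuous mapping theorem and Theorem 4 that

\[ l^{-2}\sum_{t=2}^{l} y_{t-1}^{*2}\, 1\left\{ y_{t-1}^* \le 0 \right\} \Rightarrow \int B^2\, 1\{B \le 0\}. \]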

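For part (ii), without loss of generality, assume $u_t^* = 0$ if $t > l$. Let $u_t^* = \varepsilon_t^* - \left( \zeta_t^* - \zeta_{t-1}^* \right)$, where $\varepsilon_t^* = \sum_{j=0}^{\infty} \left( E_t^* u_{t+j}^* - E_{t-1}^* u_{t+j}^* \right)$ and $\zeta_t^* = \sum_{j=1}^{\infty} E_t^* u_{t+j}^*$. Recalling that $u_t^* = \tilde u_{i_m+s}$, where $m = [(t-2)/b]$ and $s = t - mb - 1$, we have

\[ E_t^* u_{t+j}^* = \begin{cases} u_{t+j}^*, & \text{if } j \le b - s, \\ E^* u_{t+j}^*, & \text{otherwise,} \end{cases} \]

and thus,

\begin{align*}
\varepsilon_t^* &= \begin{cases} \sum_{j=0}^{b-1} u_{t+j}^*, & \text{if } s = 1, \\ 0, & \text{if } s > 1, \end{cases} \\
\zeta_t^* &= \sum_{j=1}^{b-s} u_{t+j}^* + \sum_{j=b-s+1}^{l-t} E^* u_{t+j}^* = \sum_{j=1}^{b-s} u_{t+j}^*,
\end{align*}

since $\sum_{j=1}^{b} E^* u_{t+j}^* = 0$ for any $t \ge 2$.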
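The closed forms for $\varepsilon_t^*$ and $\zeta_t^*$ can be checked against the decomposition $u_t^* = \varepsilon_t^* - (\zeta_t^* - \zeta_{t-1}^*)$ on a toy series. The sketch below assumes that the series consists of complete blocks of length $b$, that unconditional bootstrap means are zero (as the centering is designed to ensure), and that $\zeta^*$ is zero before the first observation. The series values and the function name are illustrative only.

```python
from typing import List, Sequence, Tuple


def block_decomposition(u_star: Sequence[int], b: int) -> Tuple[List[int], List[int]]:
    """Return (eps, zeta) for a series built from consecutive blocks of length b.

    Position idx corresponds to t = idx + 2, so s = idx % b + 1 is the position
    inside the current block (s = 1 marks the start of a block).
    """
    eps: List[int] = []
    zeta: List[int] = []
    for idx in range(len(u_star)):
        s = idx % b + 1
        block_end = idx + (b - s)  # index of the last element of the current block
        # zeta_t: the not-yet-observed remainder of the current block.
        zeta.append(sum(u_star[idx + 1:block_end + 1]))
        # eps_t: the whole block sum at the first position of a block, zero elsewhere.
        eps.append(sum(u_star[idx:idx + b]) if s == 1 else 0)
    return eps, zeta


u_star = [3, -1, 4, 1, -5, 9, 2, -6, 5]  # three blocks of length 3 (toy values)
eps, zeta = block_decomposition(u_star, b=3)

# Rebuild u*_t from eps_t - (zeta_t - zeta_{t-1}), with zeta before the sample set to 0.
rebuilt = [eps[i] - (zeta[i] - (zeta[i - 1] if i > 0 else 0)) for i in range(len(u_star))]
print(rebuilt == u_star)  # prints: True
```

The innovations $\varepsilon_t^*$ are non-zero only at block starts, which is what later allows the first term in the decomposition to be rewritten as a sum over blocks.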

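Write

\[ \frac{1}{l}\sum_{t=2}^{l} y_{t-1}^*\, 1\left\{ y_{t-1}^* \le \gamma \right\} u_t^* = \frac{1}{l}\sum_{t=2}^{l} y_{t-1}^*\, 1\left\{ y_{t-1}^* \le \gamma \right\} \varepsilon_t^* + L_l^* + R_l^*, \tag{29} \]

where

\begin{align*}
L_l^* &= \frac{1}{l}\sum_{t=2}^{l} u_t^* \zeta_t^* \cdot 1\left\{ y_{t-1}^* \le \gamma \right\} 1\left\{ y_t^* \le \gamma \right\}, \\
R_l^* &= \frac{1}{l}\sum_{t=2}^{l} \left( y_t^*\, 1\left\{ y_t^* \le \gamma < y_{t-1}^* \right\} + y_{t-1}^*\, 1\left\{ y_{t-1}^* \le \gamma < y_t^* \right\} \right) \zeta_t^*.
\end{align*}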

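To show that $R_l^* = o_p(1)$ uniformly in $\gamma \in S$, write

\begin{align*}
u_t^* \zeta_t^* &= \tilde u_{i_m+s} \sum_{j=s+1}^{b} \tilde u_{i_m+j} \\
&= \sum_{j=s+1}^{b} \left\{ \left( u_{i_m+s} - \frac{1}{n-b}\sum_{v=1}^{n-b} \frac{1}{b}\sum_{g=1}^{b} u_{v+g} \right) \left( u_{i_m+j} - \frac{1}{n-b}\sum_{v=1}^{n-b} \frac{1}{b}\sum_{g=1}^{b} u_{v+g} \right) \right\} \tag{30} \\
&\quad + \hat\alpha_1^2 \sum_{j=s+1}^{b} \left\{ \begin{array}{l} \left( y_{i_m+s-1}\, 1\{y_{i_m+s-1} \le \gamma_1\} - \frac{1}{n-b}\sum_{v=1}^{n-b} \frac{1}{b}\sum_{g=1}^{b} y_{v+g}\, 1\{y_{v+g} \le \gamma_1\} \right) \\ \times \left( y_{i_m+j-1}\, 1\{y_{i_m+j-1} \le \gamma_1\} - \frac{1}{n-b}\sum_{v=1}^{n-b} \frac{1}{b}\sum_{g=1}^{b} y_{v+g}\, 1\{y_{v+g} \le \gamma_1\} \right) \end{array} \right\} \tag{31} \\
&\quad + \hat\alpha_2^2 \sum_{j=s+1}^{b} \left\{ \begin{array}{l} \left( y_{i_m+s-1}\, 1\{y_{i_m+s-1} > \gamma_2\} - \frac{1}{n-b}\sum_{v=1}^{n-b} \frac{1}{b}\sum_{g=1}^{b} y_{v+g}\, 1\{y_{v+g} > \gamma_2\} \right) \\ \times \left( y_{i_m+j-1}\, 1\{y_{i_m+j-1} > \gamma_2\} - \frac{1}{n-b}\sum_{v=1}^{n-b} \frac{1}{b}\sum_{g=1}^{b} y_{v+g}\, 1\{y_{v+g} > \gamma_2\} \right) \end{array} \right\}. \tag{32}
\end{align*}

Note that (31) and (32) are $o_p(1)$ uniformly in $t$ and $\gamma$, as in the proof of Theorem 4. Also note that $\frac{1}{n-b}\sum_{v=1}^{n-b} \frac{1}{b}\sum_{g=1}^{b} u_{v+g} = O_p\left( \frac{1}{\sqrt n} \right)$ (see Künsch (1989), p. 1227). Then, like (16), we have

\begin{align*}
|R_l^*| &\le \frac{1}{l}\sum_{t=2}^{l} \left( |\bar\gamma| + 2 |u_t^* \zeta_t^*| \right) 1\left\{ \left| y_{t-1}^* \right| \le \bar\gamma + |u_t^*| \right\} \\
&\le \frac{1}{l}\sum_{t=2}^{l} \left( \begin{array}{l} |\bar\gamma| + 2 \left| u_{i_m+s} \sum_{j=s+1}^{b} u_{i_m+j} \right| + 2(b-s)\, |u_{i_m+s}|\, O_p\left( \frac{1}{\sqrt n} \right) \\ {} + 2 \left| \sum_{j=s+1}^{b} u_{i_m+j} \right| O_p\left( \frac{1}{\sqrt n} \right) + o_p(1) \end{array} \right) \times 1\left\{ \left| y_{t-1}^* \right| \le \bar\gamma + |u_t^*| \right\}. \tag{33}
\end{align*}

Since $\{u_t\}$ is independent of $\{i_m\}$, $\sum_{j=s+1}^{b} u_{i_m+s} u_{i_m+j}$ is uniformly integrable by Theorem 3.2 of Hansen (1992), not to mention $\sum_{j=s+1}^{b} u_{i_m+j}$ and $u_{i_m+s}$. Next, write, for any $M_1 > 0$,

\[ \frac{1}{l}\sum_{t=2}^{l} E\, 1\left\{ \left| y_{t-1}^* \right| \le \bar\gamma + |u_t^*| \right\} \le \frac{1}{l}\sum_{t=2}^{l} E\, 1\left\{ \left| y_{t-1}^* \right| \le \bar\gamma + M_1 \right\} + \frac{1}{l}\sum_{t=2}^{l} E\, 1\{|u_t^*| > M_1\}. \tag{34} \]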

Then it follows from Theorem 4 and (15) in the proof of Lemma 7 that the first term on the right-hand side of (34) is $o(1)$. Next, since

\[ \sup_x \left| P^*\{u_t^* \le x\} - P^*\left\{ u_j^* \le x \right\} \right| \le 2b/n \]

for any $j$ and $t$, we observe that

\[ \frac{1}{l}\sum_{t=2}^{l} E^* 1\{|u_t^*| > M_1\} \le \sup_t E^* 1\{|u_t^*| > M_1\} \le P^*\{u_2^* > M_1\} + 2b/n. \]

Furthermore, since $\{u_t\}$ is strictly stationary and independent of $\{i_m\}$, and $u_2^* = u_{i_1} + o_p(1)$, for any $\varepsilon > 0$ there is $M_1$ satisfying

\[ E\left[ P^*\{u_2^* > M_1\} \right] \le P\{u_{i_1} > M_1 - \varepsilon/2\} + \varepsilon/2 < \varepsilon, \]

which in turn implies that the second term on the right-hand side of (34) is also $o(1)$. Finally, we conclude that $|R_l^*| = o_p(1)$ by the uniform integrability and the observation that (34) is negligible (as in the proof of Lemma 7). Next, we have

\[ E^* u_t^* \zeta_t^* = \sum_{j=1}^{b-s} E^* \tilde u_{i_m+s}\, \tilde u_{i_m+s+j} \to_p \lambda, \]

by an argument similar to that in Lemma 8. Then, by the uniform integrability above, Theorem 4, and the same argument as in the proof of Theorem 1, we can conclude that the limit of $L_l^*$ is $\lambda \int 1\{B \le 0\}$.

Finally, for the convergence of the first term in (29), note that

\begin{align*}
\frac{1}{l}\sum_{t=2}^{l} y_{t-1}^*\, 1\left\{ y_{t-1}^* \le \gamma \right\} \varepsilon_t^* &= \frac{1}{k}\sum_{m=1}^{k-1} Y_{m-1}^*\, 1\left\{ Y_{m-1}^* \le \gamma/\sqrt b \right\} V_m^* \\
&\Rightarrow \int B\, 1\{B \le 0\}\, dB,
\end{align*}

where $V_m^* = \frac{1}{\sqrt b}\sum_{j=1}^{b} u_{mb+1+j}^* = \frac{1}{\sqrt b}\, \varepsilon_{mb+2}^*$ and $Y_m^* = \sum_{s=0}^{m} V_s^* = \frac{1}{\sqrt b}\, y_{mb+1}^*$. To see this, observe that $\{V_m^*\}$ is independent and has mean zero under the bootstrap distribution, since each $V_m^*$ is a normalized sum over one block. The convergence follows from Theorem 4 and the same line of argument as the proof of Theorem 1 (b) of Seo (2003).

A.6 Proof of Theorem 6

Since we already have the invariance principle for our RBB and the bootstrap version of Theorem 2 in hand, the proof is in fact straightforward, following the same line of argument as the proof of Theorem 3 (More details are to be added soon).

References

Balke, N., and T. Fomby (1997): “Threshold cointegration,” International Economic Review, 38, 627–645.

Bec, F., A. Guay, and E. Guerre (2002): “Adaptive Consistent Unit Root Tests Based on Autoregressive Threshold Model,” Working Paper.

Caner, M., and B. E. Hansen (2001): “Threshold autoregression with a unit root,” Econometrica, 69(6), 1555–1596.

Chan, K. S., J. D. Petruccelli, H. Tong, and S. W. Woolford (1985): “A Multiple-threshold AR(1) Model,” Journal of Applied Probability, 22, 267–279.

Chang, Y., and J. Y. Park (2003): “A Sieve Bootstrap for the Test of a Unit Root,” Journal of Time Series Analysis, forthcoming.

Davidson, J. (1994): Stochastic Limit Theory. Oxford University Press.

Davies, R. (1987): “Hypothesis Testing when a Nuisance Parameter is Present Only under the Alternative,” Biometrika, 74, 33–43.

De Jong, R. (2003): “Nonlinear estimators with integrated regressors but without exogeneity,” manuscript.

Hall, P., J. L. Horowitz, and B.-Y. Jing (1995): “On blocking rules for the bootstrap with dependent data,” Biometrika, 82, 561–574.

Hansen, B. (1992): “Convergence to Stochastic Integrals for Dependent Heterogeneous Processes,” Econometric Theory, 8, 489–500.

Hansen, B. (1996): “Inference when a nuisance parameter is not identified under the null hypothesis,” Econometrica, 64, 413–430.

Horvath, M., and M. Watson (1994): “Testing For Cointegration When Some of the Cointegrating Vectors are Known,” Discussion Paper 171, National Bureau of Economic Research.

Kapetanios, G., and Y. Shin (2003): “Unit Root Tests in Three-Regime SETAR Models,” Queen Mary, University of London.

Künsch, H. R. (1989): “The jackknife and the bootstrap for general stationary observations,” The Annals of Statistics, 17, 1217–1241.

Lo, M., and E. Zivot (2001): “Threshold Cointegration and Nonlinear Adjustment to the Law of One Price,” Macroeconomic Dynamics, 5(4), 533–76.

Paparoditis, E., and D. N. Politis (2003): “Residual-based block bootstrap for unit root testing,” Econometrica, 71(3), 813–855.

Park, J., and P. Phillips (2001): “Nonlinear Regressions with Integrated Time Series,” Econometrica, 69, 117–161.

Park, J. Y. (2002): “An invariance principle for sieve bootstrap in time series,” Econometric Theory, 18(2), 469–490.

Pippenger, M., and G. Goering (1993): “A note on the empirical power of unit root tests under threshold processes,” Oxford Bulletin of Economics and Statistics, 55, 473–481.

Said, S. E., and D. A. Dickey (1984): “Testing for unit roots in autoregressive-moving average models of unknown order,” Biometrika, 71, 599–607.