Detecting Structural Change Using SAS®/ETS Procedures

Archie J. Calise, Queensborough College of the City University of New York
Joseph Earley, Loyola Marymount University, Los Angeles

All great natures delight in stability ... Emerson

ABSTRACT

This paper illustrates how the SAS System may be used to test for structural change in a time series. PROC AUTOREG with the CHOW option is used to perform a Chow test for structural stability on airline passenger travel time series data before and after the September 11, 2001 terrorist incident. The paper illustrates the ease of use of this procedure in investigating the structural change of a time series. Also illustrated is the Time Series Forecasting System (TSFS), a modular option found under Solutions > Analysis > Time Series Forecasting System. The TSFS may be used to fit numerous pre-selected and other models based on a user-selected criterion such as R-square or mean square forecast error.

INTRODUCTION

The analysis of time series is an important statistical methodology with principal contributions developed by academic disciplines such as biometrics, economics and sociology. Recent developments in ARIMA modeling, spectral analysis and X12 methodology have added immensely to the usefulness of the methodology. This paper illustrates how a time series may be examined for structural stability using an application of the F-test called the Chow test.

Figure 1 is a time series of monthly total passenger traffic volume at Los Angeles International Airport (LAX) from January 1993 to April 2006. The 105th observation, the monthly passenger traffic for September 2001, shows a marked drop from 6,624,720 to 3,593,455 passengers. This of course may be attributed to the terrorist event of September 11 of that year. This paper uses the SAS/ETS procedure PROC AUTOREG with the CHOW option to test whether or not there was a change in the structural stability of this time series. The following question is addressed: Was the monthly drop at the 105th observation a one-time phenomenon, or did the terrorism event permanently affect the evolution of the time series?
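The observation numbering used above can be checked with a short sketch. It is written in Python rather than SAS purely for illustration, and it assumes one observation per month starting in January 1993, as the text states. The helper names `obs_to_month` and `month_to_obs` are hypothetical and not part of any SAS product.

```python
# Map between 1-based observation numbers and calendar months, assuming the
# series starts in January 1993 with exactly one observation per month.
START_YEAR, START_MONTH = 1993, 1

def obs_to_month(obs):
    """Return (year, month) for the 1-based observation number `obs`."""
    offset = (START_MONTH - 1) + (obs - 1)
    return START_YEAR + offset // 12, offset % 12 + 1

def month_to_obs(year, month):
    """Return the 1-based observation number for a given (year, month)."""
    return (year - START_YEAR) * 12 + (month - START_MONTH) + 1

breakpoint_obs = month_to_obs(2001, 9)   # September 2001
last_obs = month_to_obs(2006, 4)         # April 2006

# Relative size of the drop quoted in the text.
drop_fraction = (6_624_720 - 3_593_455) / 6_624_720

print(breakpoint_obs, last_obs, round(drop_fraction, 3))
```

Under these assumptions September 2001 is observation 105, which leaves 104 observations in the pre-break subset, and the quoted drop is a little under half of the earlier monthly volume.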

Figure 1. Monthly LAX passenger traffic plotted against time (vertical axis: LAX; horizontal axis: time).

REGRESSION ANALYSIS

Regression analysis is the study of the relationship between a dependent variable, Y, and one or more independent variables, X's. The SAS System contains numerous procedures which may be used to estimate regression equations. A linear regression equation may be expressed as:

$$Y_i = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \dots + \beta_k X_k + \mu_i$$

where
$Y_i$ is the dependent variable
$X_i$ are the independent variables
$\beta_i$ are the regression coefficients
$\mu_i$ is the error term or residual

Regression analysis allows the researcher to determine the influence which each respective independent variable has on the dependent variable, ceteris paribus. In addition, a correctly specified regression model allows for the use of numerous statistical tests and summary statistics, such as R-square, which indicate how well the model fits the data. For most of these tests to be statistically correct, there are a number of implicit assumptions imposed on the model which must be satisfied (Gujarati, 2003). These model assumptions should be tested for validity. Depending on the results of these tests, there are a variety of econometric methods which may be used to deal with assumption violations. For this study, the regressor used is the index of time. A simple linear regression model was estimated for LAX passenger traffic data for several time windows: pre-9/11, post-9/11 and the entire data set from January 1993 through April 2006.
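As a rough illustration of a simple linear regression on a time index, the following Python sketch (not SAS, and not the authors' code) fits y = b0 + b1 * t by ordinary least squares using the closed-form formulas. The function name `fit_trend` and the small `demo` series are hypothetical; the demo values are made up and are not the LAX data.

```python
def fit_trend(y):
    """Ordinary least squares fit of y_t = b0 + b1 * t, with t = 1, 2, ..., n.

    Returns (b0, b1, sse, r_square).
    """
    n = len(y)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(y) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, y))
    b1 = sxy / sxx                      # slope on the time index
    b0 = y_bar - b1 * t_bar             # intercept
    sse = sum((yi - (b0 + b1 * ti)) ** 2 for ti, yi in zip(t, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return b0, b1, sse, 1.0 - sse / sst

# Hypothetical illustrative series (NOT the LAX data): a straight line
# with a small alternating disturbance added.
demo = [100 + 5 * t + (1 if t % 2 else -1) for t in range(1, 21)]
b0, b1, sse, r2 = fit_trend(demo)
print(round(b0, 3), round(b1, 3), round(r2, 4))
```

Applying the same function to each time window separately (pre-break, post-break, and the full series) gives the three sets of estimates that the comparison in this paper relies on.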

CHOW TEST FOR STRUCTURAL STABILITY¹

To test the stability of the relationship between a dependent variable and the explanatory regressor (in our example, time), the MODEL statement of PROC AUTOREG includes an option called CHOW, which allows the researcher to select the potential breakpoint of the relationship to be tested. If there is no structural change, we would expect the residual sum of squares from a regression using the entire data set to differ little from the combined residual sums of squares from two regressions, one on each subset of the data. A large difference between the two would indicate that there has been a break in the data, i.e. a structural change has occurred.

From a statistical perspective, the null hypothesis for the Chow test is that the coefficient vectors of the two subset regressions, $\beta_1$ and $\beta_2$, are equal, and thus the subsets can be viewed as one data set. Under the alternative, the intervention has changed the nature of the relationship.

$$H_0: \beta_1 = \beta_2$$

conditioned on the equality of the sample error variances, where the two subset regressions are:

$$y_1 = X_1 \beta_1 + u_1 \quad \text{with } n_1 \text{ observations}$$
$$y_2 = X_2 \beta_2 + u_2 \quad \text{with } n_2 \text{ observations}$$

$$\text{Chow statistic} = \frac{(u'u - u_1'u_1 - u_2'u_2)/k}{(u_1'u_1 + u_2'u_2)/(n_1 + n_2 - 2k)}$$

where $u$ is the residual vector for the entire data set regression, $u_1$ and $u_2$ are the residual vectors for the subset regressions, and $k$ is the number of parameters estimated in each regression (here $k = 2$: the intercept and the coefficient on time). Chow showed that, under the null hypothesis, the above statistic follows an F distribution with $k$ degrees of freedom in the numerator and $(n_1 + n_2 - 2k)$ degrees of freedom in the denominator.

Figure 2 illustrates the trend lines for the two subset regressions. Casual observation indicates that a downward shift has occurred, but whether or not the slopes have remained the same is not readily apparent.
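The Chow statistic defined above can be sketched in a few lines of Python for the simple trend model used in this paper. This is an illustrative sketch, not the PROC AUTOREG implementation. It assumes the first subset ends at the observation just before the breakpoint, which matches the 104 pre-break observations in this study. The names `sse_linear` and `chow_statistic` and the two demo series are hypothetical, and the demo values are made up rather than taken from the LAX data.

```python
def sse_linear(x, y):
    """Residual sum of squares from an OLS fit of y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))

def chow_statistic(y, break_obs, k=2):
    """Chow F statistic for a break at 1-based observation `break_obs`.

    Subset 1 is observations 1..break_obs-1; subset 2 is break_obs..n.
    Returns (F, numerator df, denominator df).
    """
    n = len(y)
    t = list(range(1, n + 1))
    n1 = break_obs - 1
    sse_all = sse_linear(t, y)              # u'u   (entire data set)
    sse_1 = sse_linear(t[:n1], y[:n1])      # u1'u1 (first subset)
    sse_2 = sse_linear(t[n1:], y[n1:])      # u2'u2 (second subset)
    df_den = n - 2 * k                      # n1 + n2 - 2k
    f = ((sse_all - sse_1 - sse_2) / k) / ((sse_1 + sse_2) / df_den)
    return f, k, df_den

# Hypothetical series (NOT the LAX data): same slope throughout, with and
# without a downward level shift starting at observation 31.
noise = [1 if t % 2 else -1 for t in range(1, 61)]
stable = [100 + 2 * t + e for t, e in zip(range(1, 61), noise)]
shifted = [v - 40 if t >= 31 else v for t, v in zip(range(1, 61), stable)]

f_stable, df1, df2 = chow_statistic(stable, 31)
f_shifted, _, _ = chow_statistic(shifted, 31)
print(df1, df2, f_stable < f_shifted)
```

The resulting F value would then be compared with the F distribution on (k, n1 + n2 - 2k) degrees of freedom. The sketch stops short of computing a p-value, since that requires an F distribution function not included here.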

Figure 2. Trend lines for the two subset regressions (vertical axis: LAXmod; horizontal axis: time).

Following are the regression equations and summary statistics for the two subset regressions, obtained using JMP® visualization software from the SAS Institute.

¹ Equations for this section follow the nomenclature of SAS/ETS Chapter 10, The AUTOREG Procedure, p. 441.

First time period: January 1993 through August 2001

LAX = 3909.6486 + 18.831451 time

Summary of Fit
RSquare                       0.618506
RSquare Adj                   0.614765
Root Mean Square Error        448.3274
Mean of Response              4898.3
Observations (or Sum Wgts)    104

Analysis of Variance
Source      DF    Sum of Squares    Mean Square    F Ratio
Model        1          33238864       33238864   165.3696
Error      102          20501738      200997.44
C. Total   103          53740602
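The summary statistics reported for the first time period are linked by standard identities, which can be cross-checked with a short Python sketch using only the numbers printed above. The one assumption, flagged in the code, is that the time index for this subset is coded 1 through 104, so that its mean is 52.5.

```python
import math

# Values reported in the JMP output for the first time period.
n = 104
ss_model, ss_error, ss_total = 33238864, 20501738, 53740602
df_model, df_error = 1, 102

r_square = ss_model / ss_total                          # RSquare
r_square_adj = 1 - (1 - r_square) * (n - 1) / df_error  # RSquare Adj
mean_square_error = ss_error / df_error                 # Error mean square
rmse = math.sqrt(mean_square_error)                     # Root Mean Square Error
f_ratio = (ss_model / df_model) / mean_square_error     # F Ratio

# An OLS line passes through the sample means. ASSUMPTION: time is coded
# 1..104 for this subset, so the mean of time is 52.5.
mean_response = 3909.6486 + 18.831451 * 52.5

print(round(r_square, 6), round(r_square_adj, 6), round(rmse, 4),
      round(f_ratio, 4), round(mean_response, 1))
```

The computed values should agree with the reported RSquare, RSquare Adj, Root Mean Square Error, F Ratio, and Mean of Response up to rounding in the printed output.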

Intercept    1    4470    101.1604    44.18