Chapter 07. Multiple Regression 5: Dummy Variables. 1. Econometrics. 1.
Multiple Regression Analysis y = β0 + β1x1 + β2x2 + . . . + βkxk + u. 5. Dummy ...
Chapter 07
Ch.7 Multiple Regression: Dummy Variables
Multiple Regression Analysis
1. Describing Qualitative Information 2. Single Dummy Independent Variables 3. Using Dummy Variables for Multiple
y = b0 + b1x1 + b2x2 + . . . + bkxk + u
Categories 4. Interactions Involving Dummy Variables 5. A Binary Dependent Variable* 6. More on Policy Analysis & Program evaluation*
5. Dummy Variables
Econometrics
1
7.1 Describing Qualitative Information
The name of dummy variable should indicate the event with the value one.
female (= 1 if are female, 0 otherwise), married (= 1 if are married, 0 otherwise), etc. Econometrics
3
Example of d0 > 0 (also see Fig.7.1)
d=0 slope = b1
d0
}b
Econometrics
4
We can use dummy variables to control for something with multiple categories. Example 7.6
y = (b0 + d0) + b1x
{
If d = 0, then y = b0 + b1x + u. If d = 1, then y = (b0 + d0) + b1x + u. The case of d = 0 is the base group. See (7.10) when dependent variable is ln(y).
7.3 Dummies for Multiple Categories
d=1
y
7.2 A Dummy Independent Variable
Examples (see Table 7.1):
2
Consider a simple model with one continuous variable (x) and one dummy (d) y = b0 + d0d + b 1x + u. This can be interpreted as an intercept shift.
A dummy variable, also called binary variables, is a variable that takes on the value 1 or 0.
Econometrics
y = b 0 + b 1x
Suppose there are 4 groups: married men, married women, single men & single women. To compare “single men” to other groups, include 3 dummy variables. marrmale = 1 if married men only, 0 otherwise;
0
marrfem = 1 if married women, 0 otherwise; and singfem = 1 if single women, 0 otherwise
x Econometrics
Multiple Regression 5: Dummy Variables
5
Econometrics
6
1
Chapter 07
Cont. Dummies
for Multiple Categories
7.4 Interactions Involving Dummies
Any categorical variable can be turned into a set of dummy variables. But if there are n categories, there should be n – 1 dummy variables. Because the base group is represented by the intercept.
Interacting dummy variables is like subdividing the group (e.g. compare (7.11) & (7.14)). Example: female (fem) & married (mar) y = b0 + d1fem + d2mar + d3fem*mar + b1x + u
See “Dummy variable trap.”
If there are a lot of categories, it may make sense to group some together.
Example: top 10 ranking, 11 – 25, etc. (See Example 7.8)
Econometrics
If fem = 0 and mar = 0, y = b0 + b1x + u If fem = 0 and mar = 1, y = b0 + d2 + b1x + u If fem = 1 and mar = 0, y = b0 + d1 + b1x + u If fem = 1 and mar = 1, y = b0 + d1 + d2 + d3 + b1x + u Base group is single male.
7
Econometrics
8
Example of d0 > 0 and d1 < 0
Allowing for Different Slopes
y
We can also consider interacting a dummy variable, d, with a continuous variable, x,
d=0
y = b 0 + b 1x
y = b0 + d1d + b1x + d2d*x + u.
d=1
If d = 0, then y = b0 + b1x + u If d = 1, then y = (b0 + d1) + (b1+ d2) x + u
y = (b0 + d0) + (b1 + d1) x
This is interpreted as a change in the slope. (See fig. 7.2) Econometrics
9
Testing for Differences Across Groups Testing whether a regression function is different for one group versus another can be thought of as simply testing for the joint significance of the dummy and its interactions with all other x variables.
See the example of (7.20) & (7.21).
However, it is inconvenient to do that for the model with many independent variables. Econometrics
Multiple Regression 5: Dummy Variables
11
x
Econometrics
10
The Chow Test The Chow test is really just a simple F test for exclusion restrictions, but we’ve realized that SSRur = SSR1 + SSR2 . If you run the unrestricted model for group one and get SSR1, then for group two and get SSR2. Run the restricted model for all to get SSRp, then F
SSR SSR SSR n 2k 1 p
1
2
SSR1 SSR2
Econometrics
k 1
(7.24)
12
2
Chapter 07
7.5 A Binary Dependent Variable P(y = 1|x) = E(y|x), when y is a binary variable, so we can write our model as P(y = 1|x) = b0 + b1x1 + … + bkxk . (7.27) So, the interpretation of bj is the change in the probability of success when xj changes.
The predicted y is the predicted probability of success.
Potential problem that can be outside [0,1]. Econometrics
A typical use of a dummy variable is when we are looking for a program effect.
For example, we may have individuals that received job training, or welfare, etc. The control group doesn’t participate in the program. The treatment group take part in the program.
We need to remember that usually individuals choose whether to participate in a program, which may lead to a self-selection problem. Econometrics
Multiple Regression 5: Dummy Variables
Even without predictions outside of [0,1], we may estimate effects that imply a change in x changes the probability by more than +1 or –1, so best to use changes near mean. This model will violate assumption of homoskedasticity, so will affect inference. Despite drawbacks, it’s usually a good place to start when y is binary.
13
7.6 Policy Analysis & Program Evaluation
Linear Probability Model (LPM)
15
Econometrics
14
Self-selection Problems If we can control for everything that is correlated with both participation and the outcome of interest, then it’s not a problem. However, there are often unobserved factors that are correlated with participation. In this case, the estimate of the program effect is biased, and it is appropriate to set policy based on it.
See the example of (7.34). Econometrics
16
3