Ann Reg Sci 39:791–810 (2005) DOI 10.1007/s00168-005-0020-z

ORIGINAL PAPER

Clifford A. Lipscomb · Michael C. Farmer

Household diversity and market segmentation within a single neighborhood

Received: 6 July 2004 / Accepted: 15 March 2005 / Published online: 15 November 2005 © Springer-Verlag 2005

Abstract Housing hedonic studies typically assume that individuals or households are similar enough to aggregate into a single demand equation for analysis, usually relying on ordinary least squares (OLS) or some other single-line equation estimator. Diversity itself is managed by non-spherical disturbance corrections, typically for spatial autocorrelation or heteroskedasticity in the single-line OLS estimate. This paper tests whether households in the same neighborhood can be theoretically and empirically treated as a single type or whether households can be assigned to more than one type; multiple types suggest distinguishable local housing sub-markets. Our technique, a combination of the method of principal components and the seemingly unrelated regression (SUR) model, allows for one or more demand curves to represent housing demand and allows several types to compete over a fixed housing stock in a given residential neighborhood. As such, the SUR is the empirical translation of a theory that different household types can coexist in the same neighborhood.

JEL Classification C31 · C51 · D12 · R21

C. A. Lipscomb (*) Department of Marketing and Economics, Langdale College of Business Administration, Valdosta State University, Valdosta, GA 31698-0075, USA E-mail: [email protected] M. C. Farmer Department of Agricultural and Applied Economics, Texas Tech University, Lubbock, TX 79409-2132, USA E-mail: [email protected]


1 Introduction

Several attempts have been made in the housing literature to manage the theoretical and empirical challenges posed by strong household preference heterogeneity for local amenities. The appearance of economically distinct submarkets or “types” in a study area suggests that planners and analysts will have to account for marked differences in preferences in their housing market equilibrium models. This has led to several attempts to model more complex housing market equilibria to account for taste variation among households. Many analysts now depart from the traditional hedonic pricing method where an analyst estimates the impacts of local amenities or structural characteristics on dwelling prices from a hedonic equation using a regression model. Some employ discrete choice methods (Bayer et al. 2002; Quigley 1985; Chattopadhyay 2000), others use locational equilibria (Sieg et al. 2002), and still others explore random bidding models (Ellickson 1981; Lerman and Kern 1983; Chattopadhyay 1998) as alternative devices to capture diversity considered inaccessible from a single hedonic equation. Yet even these alternative works appeal to some degree to a relatively strong form of Tiebout’s (1956) “voting with your feet” model whereby similar households coordinate to supply an optimal bundle of local amenities befitting their type. Even as Tiebout’s (1956) classic model hints at a “one neighborhood, one type” equilibrium, both Tiebout (1956) and Samuelson (1954) already anticipate its violation. The presence of fixed urban space (the premier scarce resource in a real estate market equilibrium) routinely checks the capacity of any single group to install a set of amenities at the desired scale (i.e. minimum average cost) before they begin to compete for space with other groups. In such a market, “voting with your feet” expands to allow several types to coordinate to supply amenities to one area; that is, one neighborhood with several types.
The prima facie constraint on space has enormous implications for modeling housing market equilibria if the resulting preference diversity for local amenities is far greater than entertained in the alternative equilibria models above. Most housing submarkets are identified for analysis by large, spatially delineated areas such as a census tract, a city, or a county wherein households have similar enough preferences to be treated as a single group within that area (Palmquist 1984; Brasington and Hite 2005). Household heterogeneity typically is captured in a single hedonic equation where attribute value falls along a continuum of demographic traits distributed over the market (Leggett and Bockstael 2000; McFadden and Train 2000). These exogenous determinations of submarkets are common in the hedonic price literature. A notable exception to an a priori exogenous definition of a market along the “one area, one type” model is Bourassa et al. (2003), who define real estate appraiser submarkets as a “set of statistically generated submarkets consisting of dwellings that are similar but not necessarily contiguous” (p. 12). In this paper we test whether an endogenous assignment process that uses data on households and the parcels they occupy yields more consistent and efficient estimates of the impacts of amenities or housing structural features on dwelling sales prices. Like Bourassa et al. (2003), we do not mandate parcel contiguity for household types. We consider a small 820-dwelling neighborhood in Atlanta, Georgia, well partitioned away from other residential areas by a major interstate highway to the east, a 137-acre closed steel mill to the north, Georgia Tech to the south, and a commercial corridor to the west. A priori, one might argue that a single household


type, which can be analyzed using single-line regression models, lives in this neighborhood. Using a unique data set that includes publicly available and survey data, we find three distinct hedonic price functions that each meet the statistical conditions for aggregability of demand for local amenities and cannot be efficiently or consistently combined. Equivalently, we discover that several economically distinct household types occupy the same neighborhood and together they access a uniform bundle of amenities across that space. The approach used in this paper is closest in spirit to Ellickson’s (1981) random bidding model (RBM) in that we examine the household type that is most likely to occupy a given residence. One limitation of the Ellickson model, though, is that it defines household types exogenously, using arbitrary partitions of continuous demographic variables to define groups by discrete income or age categories. Subsequent applications distinguish households by income and family size (Lerman and Kern 1983), by resource access and life-cycle differences (Chattopadhyay 1998), by school districts or by common package of local amenities (Sieg et al. 2002), or by assuming that everyone differs by only a single taste parameter so that all households in a market equilibrium order the most preferred to least preferred neighborhood in exactly the same way (Smith et al. 2004). Critically, in each case these a priori methods for market segmentation are adopted without actually testing the assumption. More diversity in a single neighborhood introduces a new market equilibrium burden on the policy analyst. 
To respond to the added burdens of modeling and properly estimating a market equilibrium that entertains this higher incidence of preference diversity, we use more information: (1) the continuous variables are not discretized into arbitrary categories and (2) we gain efficiency by allowing regression residuals for each household type to correlate with each other in a seemingly unrelated regression (SUR) model. Finally, we directly test our equilibrium assumption. Our results pass every efficiency test that we can find (see the Appendix) and the SUR system estimation suggests that the groups cannot be combined consistently. The SUR model results are particularly informative. By defining types empirically based on aggregation conditions that satisfy an estimated demand, this housing market equilibrium accounts for competition across groups for particular housing units that cohabit a single neighborhood. Separate hedonic equations by group are no longer truly independent but are related by their mutual entry and competition for housing units in the same neighborhood. The SUR estimator allows for the joint estimation of several distinct hedonic equations and also locates the household type most likely to occupy a given dwelling, closing the key equilibrium concerns in modeling household diversity in a single contiguous space. The structure of the rest of the paper is as follows: first, the theory of market segmentation is discussed, followed by an iterative assignment procedure that utilizes principal components analysis (PCA) and the SUR method. Next, we report our SUR results. Finally, we draw some conclusions.

2 Market segmentation and assignment

It is important to distinguish between two theoretical motives for segmenting markets. One motive is to recover a marginal willingness to pay schedule for a local amenity or a housing attribute for a single type of person. If the purchasing behavior


of a type differs from city to city because local conditions differ, the reaction of that type in valuing an amenity will allow a more robust estimate of the willingness of that individual to pay for the amenity over a large change in the supply of that amenity. This popular method, attributed to Palmquist (1984), is replicated almost unaltered in works to this day (Brasington and Hite 2005; Zabel and Kiel 2000; Boyle et al. 1999; Palmquist and Israngkura 1999; Bartik 1987). A second motive is to distinguish among different, non-comparable hedonic price functions for each market. Many works limit the distinctions that compel different hedonic price equations to income shifts (or other demographic shifts) of sufficient size to warrant a separate market. This eases considerably the equilibrium challenges since in many cases a shift in the hedonic regression intercept can be employed to demarcate markets, using dummy variables or interaction terms to capture the shift. Train (1998) provides a few examples across the recreation demand and non-market valuation literature, but generally finds the approach problematic when true taste differences over individuals appear. The recent works cited above wrestle with this problem at some level, raising the question as to how the analyst chooses to segment markets. We propose an alternative that segments a local housing market via an endogenous assignment process. Theory from Tiebout (1956) on down to Palmquist (1984) and beyond provides relatively clear guidance on how markets segment but little in the way of strong theory to empirically identify market segments. For example, at what levels of income, age, education, or family size does interpersonal variance define a separate market? True taste differences, where preference rankings differ, defy easy approximation by what are now standard GLS corrections (Bell and Bockstael 2000). 
Even promising developments in random parameters models to date treat unobserved taste differences under a continuous taste parameter distribution (McFadden and Train 2000). Other works that more explicitly treat basic taste differences still strongly bound those differences up front, restricting diversity in preferences to a single taste parameter. This restriction on utility allows a single composite index for all public services priced in a market to be defined so that these bundles will generate consumers who rank amenity bundles in exactly the same way (Smith et al. 2004). Others specify a single social characteristic that captures the desire of a household to live next to households of similar socio-economic and racial composition (Bayer et al. 2002). In this paper, more than one social characteristic is used to capture these household spillovers. These a priori segmentation strategies may be appropriate for a given setting; but we introduce a simple mechanism as a starting point to elicit from individual household data a large set of possible household groupings. Then, we use the theory of market equilibrium and the conditions for market segmentation (or demand aggregation) to isolate empirically distinct classes of households where each type is represented by a separate hedonic price equation. This theory tells us that persons may fall into a single Tiebout-style (1956) division (“one neighborhood, one type”), several discrete divisions, or possibly collapse to almost total disaggregation (“one household, one type”) a la Samuelson (1954). But, these divisions will be defined over many continuous variables such as age and income but also on continuously weighted combinations of personal characteristics such as strong likes and dislikes about the community, social characteristics and habits, even memberships in civic, professional, or religious organizations. 
What distinguishes one type from another may not be a discrete cut-off over one or two demographic dimensions represented


by categorical quintiles (Ellickson 1981; Chattopadhyay 1998), but may be defined by intersections that divide groups over an n-dimensional surface from the n critical characteristics that segment households into types. Theory only guides us on what characteristic data to seek that are likely to embed critical distinctions. For this reason, we seek to sort persons into discrete groups on the basis of a continuous surface delineated over as many dimensions as we have strong demographic and attitudinal variables available. The method of principal components completes this job by locating n-space partition surfaces among n variables at hand, which can be used to define the largest number of potential groupings that the data, fully employed, allow. This sorts households into the group to which they best fit and tests if these partitions are distinct enough to define separate hedonic price functions (using a SUR model that accounts for intergroup competitions for housing space). This flexible approach minimizes the ad hoc and a priori divisions that frequently mis-specify the number of household types since we do not sacrifice information available about the characteristics that distinguish households.

3 Iterative assignment process

Researchers a priori do not know how many market segments exist in a given space, nor do they know what comprises the segments.1 For these reasons, household groupings in this paper are defined endogenously, which is a real point of departure from all previous hedonic works. This responds to a concern raised by Palmquist (2004), who argues that there are serious efficiency losses when one pre-sorts households into a pre-assigned number of discrete income, age and education categories, especially if the analyst relies on pre-assigned income, age, and education levels. 
Exogenous pre-assignment sacrifices much of the demographic information available to the analyst that can be used to distinguish groups as it imposes rather ad hoc divisions of the households under study (Ellickson 1981; Lerman and Kern 1983; Chattopadhyay 1998). In our approach, both the number of groups and the number of dimensions over demographic and attitudinal variables used to demarcate group divisions are endogenously located. Once a set of potential demarcation surfaces is introduced, an iterative assignment procedure using successive SUR system estimates reduces the number of possible groups into, in this case, three separate hedonic price functions. The two steps in the iterative process are the identification of “groups” through principal components analysis (PCA) and “types” through an iterative hedonic regression model. First, 24 candidate variables (demographic and attitudinal variables in Table 1 that account for taste differences between households and allow the threshold that defines one potential group from another to lie anywhere along a single surface in $\mathbb{R}_+^{24}$ space) are used in a PCA, producing 24 factors, eight of which have eigenvalues greater than one (and thus are significantly orthogonal to each

1 Our assignment of households endogenously seems to be an obvious extension of Abraham et al. (1994), who use a k-means clustering algorithm to endogenously locate homogeneous groupings of metropolitan housing markets.

Table 1 Variable list

Demographic and attitudinal variables (from housing survey):
“Are you a renter or owner?”; dichotomous (0=renter, 1=owner)*
Years lived at current residence; interval*
Total Household Income (IV); multinomial*
Race of the respondent (IV); multinomial*
Number of adults in dwelling (IV); interval*
Number of adults that work at least part-time; interval*
Number of children in the household (IV); interval*
Education level of the respondent (IV); multinomial*
Age of the survey respondent (IV); continuous*
Work status of survey respondent [retired (0,1)*, student (0,1)*, full time employed (0,1)*]
Sex of the survey respondent (IV); dichotomous (0=female, 1=male)*
Is the house owner-occupied? (IV); dichotomous (0=no, 1=yes)
Is some part of the house rented? (IV); dichotomous (0=no, 1=yes)
“What particular features attracted you to Home Park?”; Close to Georgia Tech*, Investment Potential*, Tree Cover*
“What particular features attracted you to your dwelling?”; Near Home Park green space*, Near public transportation*, Near work*
Political tendency (conservative, moderate, liberal); dichotomous*
Most important issue besides national security (crime*, environment*, education*, social security*)

Dwelling characteristics (from housing survey and Multiple Listing Service) used in iterative SUR process:
Rental/Selling price of house (DV); continuous in dollars; rental multiplier of 120 used to convert rental prices to sales prices
Square footage of the house (IV); continuous in square feet
Number of bedrooms (IV); discrete
Number of baths (IV); discrete
Year of last sale (IV); discrete
Age of the dwelling (IV); continuous in years
Is the dwelling at or above street level? (IV); dichotomous

Spatial features (from GIS) used in iterative SUR process:
Road network distance to nearest brown industry (IV); continuous in miles
Road network distance to child care facility (IV); continuous in miles
Road network distance to local churches (IV); continuous in miles
Road network distance to most distant parcel from Home Park (IV); continuous in miles
Road network distance to Piedmont Park (IV); continuous in miles
Road network distance to local elementary school (IV); continuous in miles
Road network distance to local convenience and ethnic grocery stores (IV); continuous in miles
Road network distance to local commercial/retail opportunities (IV); continuous in miles
Road network distance to nearest public transportation bus stop (IV); continuous in miles
Road network distance to Georgia Tech (IV); continuous in miles
Road network distance to Feminist Women’s Center (IV); continuous in miles
Distance interaction terms (D200=ln distance to Home Park if dwelling is more than 200 m from the park and within South Home Park [south of 14th Street], 0 otherwise; DNorth=ln distance to Home Park if dwelling is located in North Home Park [north of 14th Street], 0 otherwise)

Adjacency variables (yes or no dummy variables created from GIS) used in iterative SUR process:
“Do you live adjacent to a renter?”
“Do you live adjacent to a homeowner?”
“Do you live adjacent to a student?”
“Do you live adjacent to a college graduate?”
“Do you live adjacent to a household that wishes (to move/not to move) in the next two years?”
“Do you live adjacent to a household of a different race than your own?”
“Do you live adjacent to a household that has made home improvements in the last two years?”
“Do you live adjacent to a household that chose Home Park for its proximity to Georgia Tech?”
“Do you live adjacent to a household that is of the same/different type than you?”


other).2 Then, households are matched to the factors that best describe them based on the maximization of

$$G_i = \frac{\hat{f}_{in} - \bar{f}_i}{\hat{\sigma}_i},$$

where $\hat{f}_{in}$ ($i = 1$ to 8; $n = 1$ to 400) is the factor score for each observation, and $\bar{f}_i$ and $\hat{\sigma}_i$ are the mean and standard deviation of each factor $i$. Now that observations have been aggregated into eight “groups”, each “group” is treated as a separate line in an eight-equation Zellner-like (1962) SUR model that regresses the natural log of sales price against the structural characteristics of the dwellings (S), distances to various amenities such as open spaces (D), and adjacency variables (A). The source data for this iterative PCA and hedonic regression model (see Table 1) include publicly available data (dwelling structure characteristics), a housing survey administered by the author (demographic and attitudinal variables, adjacency variables), and geographic information system analysis (distance variables). In practical terms, the only difference between OLS and SUR is that SUR allows for correlation among the error terms across multiple equations, information that is captured in the error variance–covariance matrix. Eq. 1 accounts for changes in the dwelling attributes that might entice a different household to purchase a particular dwelling (similar to the way that demand conditions may influence investment decisions across firms in different industries) as well as an unequal number of observations in each line of the model.

$$
\begin{bmatrix} \ln SP_i^1 \\ \ln SP_i^2 \\ \vdots \\ \ln SP_i^8 \end{bmatrix}
=
\begin{bmatrix}
S^1, D^1, A^1 & 0 & \cdots & 0 \\
0 & S^2, D^2, A^2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & S^8, D^8, A^8
\end{bmatrix}
\begin{bmatrix} \beta^1, \delta^1, \psi^1 \\ \beta^2, \delta^2, \psi^2 \\ \vdots \\ \beta^8, \delta^8, \psi^8 \end{bmatrix}
+
\begin{bmatrix} \varepsilon_i^1 \\ \varepsilon_i^2 \\ \vdots \\ \varepsilon_i^8 \end{bmatrix}
\tag{1}
$$

Here, $\beta$, $\delta$, and $\psi$ are estimated parameter vectors on the independent variables S, D, and A, respectively. Also, the expected value of the error variance–covariance matrix is calculated by $E(\varepsilon_i \varepsilon_k') = \sigma_{ik} I_T$, where $\varepsilon_i$ is the error term from the ith equation, $\sigma_{ik}$ is the covariance of the error terms of the ith and kth equations, and $I_T$ is an identity matrix of order 400×400. Greene (1993) proves that equation-by-equation OLS can be used to estimate the SUR model when the same independent variables are used for each equation. This suggests that both approaches, SUR or equation-by-equation OLS, are equally valid. Yet the equivalence rests on asymptotic convergence between the estimators, which has come under scrutiny within the SUR context in particular (Dufour and Khalaf 2000). In small samples, a fully efficient GLS correction to each equation will approximate asymptotically the SUR weights. But we have little prior theory (as do others that a priori define neighborhoods by geography, renter/owner status, etc.) on the exact markets that will emerge to guide the specification of non-spherical disturbances for each equation at each iterate. Therefore, we allow the data, with little structure provided by the choice of the SUR estimator in the iterative hedonic process, to determine the number of statistically aggregable “types” in this neighborhood. At their worst, the SUR estimates replicate the OLS estimates (to be discussed later); at their best, they account for the interdependencies between different household types in the estimation of beta coefficients related to threshold positions, where a small change in a housing bundle is likely to induce a different household type with a higher willingness to pay (WTP) for that bundle to offer a higher bid and therefore occupy the dwelling.

Once initial beta coefficient vectors are estimated in the eight-equation model, the sales prices of each dwelling for each of the eight beta coefficient vectors are predicted and compared to the actual sales prices. Then, based on the differences between the predicted and actual sales prices, normalized by the standard errors of the linear predictions,

$$Z_i = \frac{\left| \ln \hat{P}_{in}^{SUR} - \ln P_n \right|}{\hat{\sigma}_i},$$

each observation is allowed to re-sort into the group whose beta coefficient vector best predicts the sales price for that observation; equivalently, each observation is re-sorted into the group whose beta coefficient vector minimizes the $Z_i$ statistic.3 After two iterations, the model converges, meaning that fewer than 1% of households re-sort into a different type. The iteration process sorts all but four households into three final “types”. As a final step, the remaining four households are sorted into one of the three final types. One result of the assignment process is that the groups with the highest PCA eigenvalues (groups 1 and 2), which explain the largest raw variance among the demographic and attitudinal variables, do not become the “types” after the iterative (do-loop) process. This is instructive. Factor 1 favors renters, a very visible market distinction; but an a priori renter/homeowner market segmentation does a poor job tracking price variability (see Table 2). The SUR results for a regression that predetermines the market segments (renters and owners) in Table 2 indicate opposing signs on every independent variable for renters and owners, which suggests that renters and owners do not have the same directional marginal WTP for the same features. This conclusion seems strange and suggests that an a priori assignment of the market based on renter/owner status may not adequately segment this housing market. The typical policy analyst who relies on the most visible demographic distinction to segment markets exogenously (renter or owner) could misspecify the segmentation of the market and bias the subsequent welfare estimates.

2 The Appendix provides more details and properties of the endogenous assignment process, which allows the households to sort into groups that are not predetermined with very little statistical guidance. To check for any bias in terms of starting points, we extend the possible starting points for “groups” to the top 10 factors in terms of eigenvalues to test if the number of “groups” always collapses to three “types”. This held when starting with 10, 9, 8, and 7 factors. Since factor 7 identifies with a final household “type”, starting with 6 factors reduces the number of household “types” down to 2.
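The two steps of the assignment process in this section (initial “groups” from principal components, then re-sorting on normalized prediction errors) can be sketched in Python with NumPy. This is an illustrative sketch only, not the authors' implementation. The function names `pca_groups` and `resort` are hypothetical, equation-by-equation OLS is swapped in for the joint SUR estimate, and each group's residual standard error is assumed as the normalizing standard error.

```python
import numpy as np


def pca_groups(demo):
    """Initial "groups": one per retained principal component.

    demo: (n_households, n_vars) array of demographic/attitudinal variables.
    Returns (labels, eigenvalues), with labels in 0..k-1.
    """
    # PCA on the correlation matrix, i.e. on standardized variables.
    z = (demo - demo.mean(axis=0)) / demo.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(demo, rowvar=False))
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    keep = eigvals > 1.0                       # eigenvalue-greater-than-one rule
    scores = z @ eigvecs[:, keep]              # factor scores
    # Standardized score G = (f - mean) / sd; assign to the largest one.
    # Caveat: eigenvector signs are arbitrary, so this rule is sign-dependent.
    g = (scores - scores.mean(axis=0)) / scores.std(axis=0, ddof=1)
    return g.argmax(axis=1), eigvals


def resort(log_price, x, labels, n_groups, tol=0.01, max_iter=20):
    """Re-sort observations until fewer than `tol` of them switch group."""
    labels = labels.copy()
    n, k = x.shape
    for _ in range(max_iter):
        z = np.full((n, n_groups), np.inf)     # unfitted groups attract no one
        for g in range(n_groups):
            members = labels == g
            if members.sum() <= k:             # too few observations to fit
                continue
            # Assumption: per-group OLS stands in for the joint SUR estimate.
            beta, *_ = np.linalg.lstsq(x[members], log_price[members], rcond=None)
            resid = log_price[members] - x[members] @ beta
            sigma = np.sqrt(resid @ resid / (members.sum() - k))
            # Normalized absolute prediction error for every observation.
            z[:, g] = np.abs(log_price - x @ beta) / sigma
        new_labels = z.argmin(axis=1)
        share_moved = np.mean(new_labels != labels)
        labels = new_labels
        if share_moved < tol:
            break
    return labels
```

Here `x` is assumed to hold an intercept column plus the structural, distance, and adjacency regressors, and `log_price` the natural log of sales price. Groups that lose too many members simply drop out, which loosely mirrors the collapse from eight groups to three types, although a faithful replication would re-estimate the full SUR system at each iterate.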
4 Results

Generally, the results show three distinct household types: Type A households are low-income student renters completing university degrees (corresponds to Factor 3); Type B households are young adults establishing themselves who rent larger

3 It is possible that relatively high values of σi would cause a large number of households to sort into one or two particular groups. Descriptive statistics show that the factors with the highest σi do not have a high percentage of households sort into those groups (i.e. do not become household “types”).

Constant Structural variables Natural log of dwelling square feet Natural log of number of beds Natural log of number of baths Natural log of number of acres Condominium (yes=1) Spatial variables Ln dist away from 14th St. Commercial Site Ln dist away from Piedmont Park Ln dist away from Meineke Service Station Ln dist away from Georgia Tech D200 m DNorth Is dwelling above street level? Located on state street? Above street level * state street Adjacency variables Lives adjacent to a renter Lives adjacent to undergrad student Lives adjacent to a college educated person −0.40 (1.48) 0.44 (0.37) 1.85** (0.74) −1.76*** (0.68) 0.50*** (0.10) 0.11 (0.10) – −1.14** (0.45) 1.22 (0.96) −0.40 (1.40) 1.74*** (0.67) 1.25** (0.56) −0.19 (0.63)

4.57 (3.40) −0.40 (0.45) −2.05* (1.13) 1.79 (1.65) −0.61 (2.01) −0.56 (2.03) −0.15 (1.98) 1.10** (0.55) −1.43 (1.18) 0.19 (1.73) −3.56*** (0.82) −2.27*** (0.69) 1.14 (0.77)

3.25*** (0.77) 2.26*** (0.65) −0.98 (0.73)

−1.28** (0.65) 3.60*** (0.68) −1.18* (0.69) −0.19 (0.34) –

(3.20) (0.42) (1.06) (1.56) (1.89) (1.91) (1.86) (0.52) (1.11) (1.63)

−4.27 0.37 1.92* −1.64 0.49 0.43 0.04 −0.99* 1.47 −0.40

−0.12** (0.76) −0.21 (0.79) −0.86 (0.79) 1.25*** (0.39) – 0.37 (0.81) 0.31 (0.84) 0.93 (0.84) −1.35*** (0.41) –

4.78 (9.90)

−24.09 (18.27)

32.59* (17.21)

−2.29*** (0.79) 1.39** (0.67) −0.82 (0.75)

1.60 (1.63) −1.85*** (0.43) −3.97*** (0.86) 4.00*** (0.78) – 0.40*** (0.11) 0.50*** (0.10) −0.57 (0.53) −6.23*** (1.13) 5.42*** (1.66)

0.56 (0.77) −4.93*** (0.81) −1.17 (0.81) −0.64 (0.40) –

5.50 (11.04)

0.24 (0.74) −2.65*** (0.63) 1.17* (0.71)

−1.78 (1.44) 1.37*** (0.41) 2.17*** (0.79) −2.52*** (0.73) – −0.02 (0.09) – 1.84*** (0.50) 5.00*** (1.07) −5.27*** (1.55)

0.92 (0.72) 1.40* (0.76) 2.45*** (0.77) 0.72* (0.38) −0.17* (0.09)

2.19 (9.60)

Type C (146 obs.) Coefficient (Std. error)

Type A (70 obs.) Coefficient (Std. error)

Owner (142 obs.) Coefficient (Std. error)

Renter (258 obs.) Coefficient (Std. error)

Type B (184 obs.) Coefficient (Std. error)

Endogenous market segmentation (Iterative assignment process)

Exogenous market segmentation

Table 2 SUR estimates (dependent variable: natural log of sales price)


Type B (184 obs.) Coefficient (Std. error)

Type C (146 obs.) Coefficient (Std. error)

Type A (70 obs.) Coefficient (Std. error)

Renter (258 obs.) Coefficient (Std. error)

Owner (142 obs.) Coefficient (Std. error)

Endogenous market segmentation (Iterative assignment process)

Exogenous market segmentation

*=Significant at 0.10 level; **=Significant at 0.05 level; ***=Significant at 0.01 level

Lives adjacent to a person of a different race than them: 1.01* (0.54); −1.07* (0.57); −1.92*** (0.47); −2.69*** (0.55); 4.58*** (0.52)
Lives adjacent to someone that made home improvements: −4.14*** (0.55); 4.44*** (0.59); 0.69 (0.48); −3.98*** (0.57); 3.61*** (0.53)
Lives adjacent to someone who lives in Home Park because it is near Georgia Tech: 1.06 (0.65); −1.12 (0.69); 1.47*** (0.57); 2.61*** (0.67); −4.13*** (0.63)
Other statistics
χ2: 237.07 (p=0.0000); 239.19 (p=0.0000); 112.87 (p=0.0000); 245.56 (p=0.0000); 312.12 (p=0.0000)
Root MSE: 4.33; 4.60; 3.88; 4.59; 4.34
R2: 0.38; 0.38; 0.23; 0.38; 0.43
Breusch–Pagan test of independence: χ2 (3)=398.716, prob.=0.0000; χ2 (3)=305.558, prob.=0.0000

Table 2 (continued)


units or purchase lower-end starter homes (corresponds to Factor 4); and Type C households are more established homeowners with incomes below the average for the immediate downtown area who seek affordable urban housing in this community (corresponds to Factor 7). Condominium owners seem to be an emerging fourth type (as they have above average Z-scores compared to all other Type C households) with the “mindset” or “lifestyle” of owners but the physical dwelling characteristics of Type B residents. Assignment shows that distinct types exist today but also hints of compromised welfare assessments over time if other types emerge, like a “condominium” household type. Segmenting households in this way allows a more parsimonious model that targets theoretically relevant variables for each type. Variables that deviate between types are useful in the assignment procedure but not for predicting sales price variation if that variable is uniform within a type. For example, we tested the introduction of attitudinal variables into the final first-stage hedonic model. As expected, they proved insignificant as they vary little within type; so their exclusion is warranted. That these variables are often significant in the second-stage hedonic perhaps signals that multiple market demands have been folded into a single equation model.4 Table 2 also summarizes the SUR estimates for the three “types” determined endogenously.5 We see across types that the same independent variables often have a significant effect on price but differ by sign as well as by magnitude. 
This is a very strong result that supports our theory of heterogeneous households across a single space (if heterogeneity is defined as having different marginal WTPs for the same variables), as evidenced most strongly by the different coefficient signs on Spatial Variables and Adjacency Variables: Type A students seem to want to live close to the university and to other student renters associated with the university; Type C homeowners seem to want to live near the university but also next to someone who has made home improvements, and they do not want to live near the automotive repair center or next to undergraduate students; and Type B residents seem to value park proximity (as a leisure destination) but appear unable to afford living adjacent to it. A map that identifies type location reveals no clear type clusters despite the different location preferences.

That these differences in attribute value among types operate in a fluid competitive real estate market is reinforced by a highly significant Breusch and Pagan (1980) χ2 statistic at all stages of the assignment process. The statistic tests the diagonality of the error covariance matrix of residuals across equations (i.e. the null hypothesis that all of the off-diagonal matrix elements are zero). The test reveals that the same groupings could not be represented more efficiently as a pooled model that uses “types” as discrete variables. Also, the relative efficiency of the assignment process can be seen particularly in the RMSE and Breusch–Pagan χ2 statistics. The RMSEs decrease through the various stages of the assignment process, suggesting that we obtain more efficient groupings of households as we proceed through additional iterations. Compared to other works that impose market segments a priori, the iterative assignment process presented here is relatively more efficient in its determination of endogenous market segments.

4 New model specifications for each type permit the discrete variable “Condo” to be included in the Type C equation only, since all condo owners sorted into this type. Also, since most Type B and Type C households live more than 200 m from the Home Park green space (hereafter called park), only variables that identify distance from the park outside the 200-m perimeter are included. Also, among those who reside in North Home Park, tests show that the distance from the park for Type A and Type C residents could be eliminated in these two equations. These specification changes were subjected to a joint Wald test; the variables omitted do not significantly alter price predictability in the system, even though many of the variables removed generated significant t-scores in their respective equations.

5 The standard errors are estimated using Eq. 17-9 in Greene (1993, 489); variable significance remains robust even with a small-sample standard error correction.

The SUR results and descriptive statistics by type tell an interesting story. Coefficient estimates on structural features show that lower-income student renters completing university degrees prize dwellings with extra bedrooms. Also, more established homeowners prefer an extra bathroom, more living space, or a larger lot to an additional bedroom. Little within-type structural variation appears among Type B households, who generally rent higher-quality units or purchase lower-end starter homes. Across the neighborhood, Type B structures are nearly all two-bedroom, one-bath dwellings with approximately the same living space. Dwellings occupied by Type B households tend to have less living space, smaller lot size, and fewer improvements than those of Type C homeowners but are somewhat larger than the two-bedroom, one-bath rental dwellings generally occupied by Type A households. The significant and negative, yet counterintuitive, coefficient on bedrooms for Type B residents, in particular, arises from the lack of variation in bedrooms among dwellings occupied by Type B households, not because they are willing to pay less for an additional bedroom at the margin. A Wald test shows that the variable cannot be omitted from this equation.
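The Breusch and Pagan (1980) test of independence referred to above can be illustrated with a minimal sketch. It assumes, for simplicity, that every equation supplies residuals for the same T observations; the type equations in this paper have unequal numbers of observations, which this sketch does not handle. The function name `breusch_pagan_lm` and the toy residuals are hypothetical and are not taken from the paper. The statistic is T times the sum of squared pairwise residual correlations and is compared against a χ2 distribution with M(M−1)/2 degrees of freedom, where M is the number of equations.

```python
from itertools import combinations


def breusch_pagan_lm(residuals):
    """Return (LM statistic, degrees of freedom) for the test of a
    diagonal error covariance matrix across M equations.

    residuals: list of M equal-length lists of equation residuals.
    Assumes each residual series has mean zero, as OLS residuals do
    when the equation includes an intercept.
    """
    m = len(residuals)
    t = len(residuals[0])
    if any(len(e) != t for e in residuals):
        raise ValueError("all residual series must have the same length")

    sum_sq_corr = 0.0
    for i, j in combinations(range(m), 2):
        s_ij = sum(a * b for a, b in zip(residuals[i], residuals[j]))
        s_ii = sum(a * a for a in residuals[i])
        s_jj = sum(b * b for b in residuals[j])
        # Squared correlation between the residuals of equations i and j.
        sum_sq_corr += (s_ij * s_ij) / (s_ii * s_jj)

    return t * sum_sq_corr, m * (m - 1) // 2


# Toy residuals for three hypothetical equations (illustration only).
toy_residuals = [
    [1.0, -1.0, 1.0, -1.0],
    [1.0, -1.0, 1.0, -1.0],
    [1.0, 1.0, -1.0, -1.0],
]
lm_stat, dof = breusch_pagan_lm(toy_residuals)
print(lm_stat, dof)
```

With three equations the test has 3 degrees of freedom, which matches the χ2(3) statistics reported in Table 2. A large statistic rejects the hypothesis that the residuals of the type equations are uncorrelated with one another.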
Another interpretation of a negative coefficient for Type B households on some features may be that a small increase in, say, bedrooms or bathrooms will induce a switch in occupant type, where Type B households switch respectively to students who positively value bedrooms (Type A) or to homeowners who positively value an extra bath (Type C). Local realtors interviewed prior to the administration of the survey predicted that a certain class of residents (Type B) occupies two kinds of “fixer-upper” dwellings: units with an unfinished bathroom or utility room convertible to a bath; and units at or below average condition with space uninhabitable as a bedroom or convertible to a bedroom. Both kinds of dwellings rent to Type B residents at their lower expenditure level, though the MLS lists the units as having more bedrooms or baths, explaining further the negative price. Once a second bath or third bedroom becomes fully available, it appears that a Type C or Type A resident, respectively, occupies the unit thereafter and is willing to pay the highest price for the new space. The SUR joint estimation helps to uncover these “submarkets” whereby valuations of dwelling features among different types are sharpened.6

6 Proximity to a person of a different race is significant for every type and in opposing directions. Among established homeowners, racial diversity seems to be prized; but the few mapped ethnically diverse enclaves of Type C homeowners report a higher incidence of home improvements in the last two years and a desire for investment potential from the home, two features of neighbors that homeowners value. With few variables to indicate overall parcel condition, ethnicity may function as a soft proxy for overall condition.

The signals of regularity in the performance of this market, type by type, are reassuring. While the supply cost of a shared common amenity, such as the park, may be safe to assert in the prior position, the SUR results above help to infer that the costs of dwelling structure characteristics are relatively comparable within each

Table 3 Pooled and type-specific OLS models (Dependent variable: natural log of sales price)

Entries are Coefficient (Std. error).

Variable | Pooled OLS (400 obs.) | Type A (70 obs.) | Type B (184 obs.) | Type C (146 obs.)
Constant | 8.35*** (1.46) | 5.30 (5.97) | 8.65*** (1.46) | 12.33*** (1.53)
Structural variables
Natural log of dwelling square feet | 0.24*** (0.06) | 0.21 (0.18) | 0.39*** (0.08) | −0.07 (0.11)
Natural log of number of beds | 0.09 (0.06) | 0.17 (0.19) | 0.11 (0.09) | −0.03 (0.13)
Natural log of number of baths | 0.08 (0.06) | −0.17 (0.18) | −0.05 (0.10) | 0.13 (0.11)
Natural log of number of acres | −0.12*** (0.03) | 0.28* (0.16) | −0.23*** (0.04) | −0.08 (0.05)
Condominium (yes=1) | −0.16* (0.09) | – | – | −0.37*** (0.13)
Spatial variables
Ln dist away from 14th St. Commercial Site | 0.33 (0.27) | 1.42 (0.97) | 0.03 (0.23) | 0.32* (0.18)
Ln dist away from Piedmont Park | −0.02 (0.03) | −0.44 (0.28) | −0.01 (0.03) | −0.29 (0.18)
Ln dist away from Meineke Service Station | −0.14 (0.09) | −0.79 (0.50) | −0.05 (0.10) | −0.48** (0.19)
Ln dist away from Georgia Tech | 0.17 (0.13) | 0.49 (0.60) | 0.03 (0.08) | 0.49** (0.20)
D200 m (i) | −0.13 (0.16) | 0.005 (0.10) | – | –
D200 m (ii) | −0.14 (0.16) | −0.02 (0.09) | −0.02 (0.03) | −0.02* (0.01)
DNorth | −0.11 (0.15) | – | −0.005 (0.03) | –
Is dwelling above street level? | 0.11** (0.04) | 0.13 (0.14) | 0.07 (0.05) | 0.04 (0.07)
Located on State Street? | 0.04 (0.09) | 0.25 (0.24) | −0.01 (0.15) | −0.43** (0.18)
Above street level * State Street | −0.22 (0.13) | −0.69** (0.32) | −0.28 (0.23) | 0.30 (0.24)
Adjacency variables
Lives adjacent to a renter | −0.31*** (0.06) | 0.19 (0.34) | −0.37*** (0.07) | −0.26** (0.12)
Lives adjacent to undergrad student | −0.02 (0.05) | 0.12 (0.13) | 0.004 (0.07) | −0.14 (0.15)
Lives adjacent to a college educated person | 0.14** (0.06) | 0.01 (0.16) | 0.13* (0.07) | −0.15 (0.17)
Lives adjacent to a person of a different race than you | −0.04 (0.04) | −0.02 (0.16) | −0.12* (0.06) | −0.15* (0.17)
Lives adjacent to a dwelling that made home improvements | 0.31*** (0.04) | 0.38** (0.16) | 0.28*** (0.07) | 0.36*** (0.07)
Lives adjacent to someone who lives in Home Park because it is near Georgia Tech | −0.05 (0.05) | −0.33 (0.24) | 0.06 (0.08) | −0.000 (0.08)
Other statistics
F | 9.32 (p=0.0000) | 1.21 (p=0.2900) | 7.12 (p=0.0000) | 5.33 (p=0.0000)
Root MSE | 0.37 | 0.42 | 0.33 | 0.34
Adjusted R2 | 0.30 | 0.05 | 0.38 | 0.36

*=Significant at 0.10 level; **=Significant at 0.05 level; ***=Significant at 0.01 level

Note: the spatial block reports ten coefficients per model, but only nine row labels are legible in the source. The two rows marked D200 m (i) and (ii) appear to be the park-distance terms within and beyond the 200-m perimeter described in footnote 4; their distinguishing symbols did not survive.

type. The relative homogeneity of dwelling structure characteristics among Type B residents in particular provides empirical support for the inference that the costs of structural improvements are relatively comparable within each segmented group, a condition often treated by prior assertion. Also, the predictably favored structural features of Type A and C residents suggest, arguably, predictable conditions for switching between household types. The process of assignment and then accounting for competition for space among market segments that inhabit the same neighborhood opens the door to greater confidence in isolating an implicit price function for amenities and structural improvements at the second stage.

By way of comparison, Table 3 reports both pooled and type-specific OLS estimates using the same general specification. Here we see very different results in terms of coefficients, standard errors, and explanatory power. To test whether the type specifications make a difference, we include two binary (0,1) variables for Type B and Type C households in the pooled model specification in Table 3. The results show that both variables are significant predictors of sales price (for Type B, t=2.93 [p=0.004]; for Type C, t=2.79 [p=0.006]), which indicates that the treatment of all households as a single market masks the submarkets revealed by the data, particularly by the significant Breusch and Pagan (1980) tests.

5 Conclusion

To conclude, this paper allows for the possibility of more than one “type” of household in a single neighborhood. We gain precision by assigning households into distinct segments through an iterative process that accounts directly for the demographic characteristics and attitudes of the households as well as the simultaneous competition by types for the same space. By segmenting the housing market of a single neighborhood where households share the same set of public amenities, we find variation in implicit prices across these types and regularity in the choices made by those within a relatively similar demographic and attitudinal group. This helps to extend the reach of welfare estimates inferred from hedonic price analysis premised on market segmentation by permitting segmentation to operate beyond geography alone (Palmquist 1984; Bartik 1987).

This work extends the Samuelson (1954) observation that space limits might preclude the ability to accommodate an “efficient” scale of amenity provision for each household or household type. We posit that types can either scale back their independent plans for amenity provision or elect to coordinate their ambitions with other types. If they coordinate, types can agree on a bundle of amenities that will achieve a larger (economies of) scale of overall amenity provision but likely will package those amenities in a way no type would pursue on its own. While some types may still go it alone, others may choose to coordinate; so the appearance of a single set of amenities in a local space does not guarantee that a single occupant type exists in that neighborhood or, by implication, that local dwelling purchase activities can be captured by a single hedonic property value equation. Significantly, these household differences center on differences in taste or preference rankings, not simply differences in resource access among otherwise similar households who may differ by, say, age or income. In a hedonic property value model, these differences in tastes or preference rankings are signaled by different signs on the amenity value estimates.


One of the main results is that the use of a single hedonic price function cannot be sustained statistically, even in this relatively small neighborhood. Policies constructed from a large hedonics exercise that divides household types a priori [by spatial delineations, by demographic characteristics, or by housing tenure (renters vs. owners)] without testing that assumption may introduce unintended consequences that can prove quite serious. For instance, the results suggest the existence of more losers and larger-scale winners than would be expected from an environmental change: more housing units fall in value than a pooled hedonic estimate predicts. Therefore, our results confirm the statement by Palmquist (2004, Section III) that “The researcher must be confident that the observations come from a single market.... Problems arise when separate markets are treated as one.”

The implications of a statistical approach that tests the homogeneity assumption extend to numerous regional science research topics (e.g. research that uses regions, counties, cities, etc. as the unit of analysis). This is fed by a trend across the social sciences to collect much more household-level data in order to map precise attitudes and preference orderings to policy prescriptions. This paper suggests to urban planners that instead of using zoning as a blunt policy tool to shape who wins and who loses (in welfare terms), more precise characteristics of individual residents can be used to articulate more precise zoning designations that simultaneously achieve broader policy objectives and complement the existing real estate market. In this way, the zoning process can minimize intrusion or even enlist existing market forces to transition land use toward an urban design favored by the planner.
Appendix: Iterative assignment process

After running the principal components analysis (PCA), eight orthogonal factors emerge as significant linear combinations of the 24 demographic and attitudinal variables; all eight factors have eigenvalues greater than one. At this point, each household has a value for each of the eight factors. Using the Z statistic concept, we construct a “Z-like” statistic to identify “clusters” of households with similarities in multiple demographic and attitudinal dimensions. We use the statistic $G_i = \left| (f_{in} - \bar{f}_i) / \hat{\sigma}_i \right|$, where each household is sorted into the group that has the highest value of $G_i$. Rows 1 and 2 in Table A show how the households sort into these eight groups after the first and second iteration based on PCA.

Next, using the “Second Initial Sort” distribution of households as the basis for seemingly unrelated regression (SUR) analysis, an eight-line SUR model (Eq. 1) that regresses dwelling sales price against dwelling structure and spatial variables controls for an unequal number of observations in each SUR equation. Then, for each household, the statistic $Z_i = \left| (\ln \hat{P}_{in} - \ln P_n) / \hat{\sigma}_i \right|$ is calculated. Then, each household is sorted once again into one of the eight groups based on the minimization of $Z_i$. In other words, households are re-sorted into the group whose corresponding beta vector from Eq. 1 best describes the sales price of that household’s dwelling. Therefore, it is possible that a household previously sorted into one group may be reassigned to a different group if that household’s dwelling

Table A Household distributions among PCA factors

Row | Sort | Statistic | Factor 1 | Factor 2 | Factor 3 | Factor 4 | Factor 5 | Factor 6 | Factor 7 | Factor 8
1 | Initial PCA sort | Households | 132 | 158 | 56 | 21 | 7 | 11 | 10 | 5
2 | Second Initial sort | Households | 37 | 28 | 64 | 86 | 47 | 52 | 42 | 44
3 | 8-line SUR sort (Breusch–Pagan=189.87) | Households | 0 | 23 | 55 | 159 | 8 | 10 | 137 | 8
3 | | RMSE | 41741 | 49594 | 55794 | 59479 | 40842 | 49827 | 61392 | 49076
3 | | R2 | 0.05 | 0.14 | 0.09 | 0.11 | 0.11 | 0.07 | 0.12 | 0.12
4 | 8-line SUR Re-sort (Breusch–Pagan=111.22) | Households | 0 | 0 | 70 | 184 | 0 | 0 | 146 | 0
4 | | RMSE | 0 | 0 | 36926 | 54986 | 0 | 0 | 58983 | 0
4 | | R2 | 0 | 0 | 0.43 | 0.51 | 0 | 0 | 0.61 | 0
5 | 3-line SUR Parsimonious Sort (Breusch–Pagan=305.55) | Households | 0 | 0 | 70 | 184 | 0 | 0 | 146 | 0
5 | | RMSE | 0 | 0 | 3.88 | 4.59 | 0 | 0 | 4.34 | 0
5 | | R2 | 0 | 0 | 0.23 | 0.38 | 0 | 0 | 0.43 | 0

sales price is better predicted by a different vector of beta coefficients that corresponds to a different group. Row 3 shows the re-distribution of households among the eight groups, where we begin to see the final three household types emerge (Groups 3, 4, and 7, which correspond to Factors 3, 4, and 7, respectively). Then, households are reassigned again based on additional runs of the SUR model (where the beta vectors for each group change and cause additional reassignments of households to groups); the result is the distribution of households in Row 4. Also, the number of households distributed among the types does not change when a parsimonious set of 19 independent variables is used in the final SUR analysis (Row 5) as a basis for assigning households into types. This suggests that the iterative assignment procedure has some relatively robust efficiency properties with respect to model specification. That RMSEs are in constant decline with each successive step of the iterative assignment process suggests that the final three types make some sense statistically. Plus, we are satisfied that the researcher-imposed structure on the iterative process (e.g. setting a convergence criterion that SUR iterations cease when fewer than 1% of the households are reassigned into a different household type) is minimal and does improve upon a priori market segmentation based on geography, housing tenure (renter or owner), or time.

References

Abraham JM, Goetzmann WN, Wachter SM (1994) Homogeneous groupings of metropolitan housing markets. J Hous Econ 3:186–206
Bartik TJ (1987) The estimation of demand parameters in hedonic pricing models. J Polit Econ 95:81–88
Bayer PB, McMillan R, Reuben K (2002) An equilibrium model of sorting in an urban housing market: a study of the causes and consequences of residential segregation. Working Paper, Yale University
Bell KP, Bockstael NE (2000) Applying the generalized method of moments approach to spatial problems involving micro-level data. Rev Econ Stat 82:72–82
Bourassa SC, Hoesli M, Peng VS (2003) Do housing submarkets really matter? J Hous Econ 12:12–28
Boyle KJ, Poor PJ, Taylor LO (1999) Estimating the demand for protecting freshwater lakes from eutrophication. Am J Agric Econ 81:1118–1122
Brasington D, Hite D (2005) Demand for environmental quality: a spatial hedonic analysis. Reg Sci Urban Econ 35:57–82
Breusch TS, Pagan AR (1980) The Lagrange multiplier test and its applications to model specification in econometrics. Rev Econ Stud 47:239–253
Chattopadhyay S (1998) An empirical investigation into the performance of Ellickson’s random bidding model, with an application to air quality valuation. J Urban Econ 43:292–314
Chattopadhyay S (2000) The effectiveness of McFadden’s nested logit model in valuing amenity improvement. Reg Sci Urban Econ 30:23–43
Dufour J-M, Khalaf L (2000) Exact tests for contemporaneous correlations of disturbances in seemingly unrelated regressions. CIRANO Scientific Series 2000s-16, Montreal
Ellickson B (1981) An alternative test of the hedonic theory of housing markets. J Urban Econ 9:56–79
Greene W (1993) Econometric analysis, 2nd edn. Macmillan, New York
Leggett C, Bockstael NE (2000) Evidence of the effects of water quality on residential land prices. J Environ Econ Manage 39:121–144
Lerman SR, Kern CR (1983) Hedonic theory, bid rents, and willingness-to-pay: some extensions of Ellickson’s results. J Urban Econ 13:358–363
McFadden D, Train K (2000) Mixed MNL models for discrete response. J Appl Econ 15:447–470
Palmquist RB (1984) Estimating the demand for the characteristics of housing. Rev Econ Stat 66:394–404
Palmquist RB (2004) Property value models. In: Mäler KG, Vincent J (eds) Handbook of environmental economics, vol 2. North-Holland
Palmquist RB, Israngkura A (1999) Valuing air quality with hedonic and discrete choice models. Am J Agric Econ 81:1128–1133
Quigley JM (1985) Consumer choice of dwelling, neighborhood and public services. Reg Sci Urban Econ 15:41–63
Samuelson PA (1954) The pure theory of public expenditures. Rev Econ Stat 36:387–389
Sieg H, Smith VK, Banzhaf HS, Walsh R (2002) Interjurisdictional housing prices in locational equilibrium. J Urban Econ 52:131–153
Smith VK, Sieg H, Banzhaf HS, Walsh R (2004) General equilibrium benefits for environmental improvements: projected ozone reductions under EPA’s Prospective Analysis for the Los Angeles air basin. J Environ Econ Manage 47:559–584
Tiebout C (1956) A pure theory of local expenditures. J Polit Econ 64:416–424
Train KE (1998) Recreation demand models with taste differences over people. Land Econ 74:230–239
Zabel JE, Kiel KA (2000) Estimating the demand for air quality in four United States cities. Land Econ 76:174–194
Zellner A (1962) An efficient method of estimating seemingly unrelated regressions and tests for aggregation bias. J Am Stat Assoc 57:348–368