Parametric or nonparametric regression approaches to the estimation of marginal costs in dairy production? A comparison of estimation results 1

Christine Wieck2 and Thomas Heckelei3

2 IMPACT Center and School of Economic Sciences, Washington State University

3 Institute for Food and Resource Economics, University of Bonn, Germany

Selected Paper American Agricultural Economics Association Portland, Oregon, July 29 - August 1, 2007

Abstract

This paper compares several nonparametric models for estimating farm-specific marginal cost functions in the dairy sector. Specifically, it applies locally weighted regression approaches that use theory-consistent cost function frameworks as the local polynomials. The approaches are illustrated by comparing average marginal cost levels and the distributions of marginal costs across farms.

Keywords: Dairy production, marginal costs, nonparametric regression

JEL classification: C33, Q12, Q18.

1 Copyright 2007 by Christine Wieck and Thomas Heckelei. All rights reserved. Readers may take verbatim copies of this document for non-commercial purposes by any means, provided that this copyright notice appears on all such copies. This work was carried out as part of the CAPRI modelling project assessing the impacts of the Common Agricultural Policy on the European agricultural sector. Corresponding author: Christine Wieck, [email protected]

1 Introduction

The discussion about the level and distribution of marginal costs in the dairy sector in regions of the EU is of continuing relevance. The potential removal of the EU milk quota system after the year 2013 is likely to increase the demand for analytical efforts and accurate knowledge of the situation in the dairy sector. In most of the empirical work up to this point, the cost minimization framework has been represented by a parametric approach (recently Moro et al. 2005; Cathagne et al. 2006; Wieck and Heckelei 2007). However, this approach has widely acknowledged disadvantages, including the potential estimation bias due to the restrictiveness of the chosen functional form, especially in the context of samples spanning a wide range of dairy production systems. To circumvent these problems, first attempts to use nonparametric, local linear regression models have been made (Wieck and Heckelei, 2006). The problem of restrictiveness is potentially well addressed in a nonparametric approach since the functional form of the relationship between the variables is left unspecified. Among nonparametric techniques, local polynomial regression is a very popular class of estimators, with the Nadaraya-Watson (zero-order) and first-order polynomial estimators being widely used (see Fan and Gijbels, 1996; Mittelhammer et al. 2000). They can be viewed as a constant or linear least-squares approximation, respectively, to the true regression function at the point of evaluation. The regression is local since it is specific to a point of evaluation. Such estimators can be defined as locally weighted least-squares regression estimators in which the weighting scheme, which uses the information available in the neighbourhood of the evaluation point, is based on a kernel function.
Instead of approximating the regression function at the evaluation point by a zero- or first-order polynomial, the use of higher-order polynomials has been suggested by Gozalo and Linton (2000) and shown to reduce the bias of the nonparametric estimator. Wieck and Heckelei (2006) demonstrated that this general idea can be combined with typical cost minimization frameworks, recovering the underlying technology under certain regularity assumptions. This local representation of technology and cost minimization behaviour implies significantly more functional flexibility in the analysis of the underlying behavioural model compared to parametric approaches, but maintains advantages of the latter with respect to the imposition of theoretical constraints. This paper aims to illustrate this new approach in the area of nonparametric density and regression analysis and to compare its estimation results with those of a parametric approach in the context of dairy production in the EU. As a point of comparison, results from a parametric estimation applying a multi-input multi-output Symmetric Generalized McFadden cost function for the same regional unit are available. In the following section, the idea and methodology of locally weighted nonparametric regression are discussed, followed by the data presentation. Section 4 contains the results of the different estimations, and the last section concludes.

2 Methodology

Dairy farming is characterized by cost minimizing behaviour for all variable input factors under given outputs y (J x 1), input prices w (K x 1), quasi-fixed factors z (L x 1), and an underlying time-dependent technology F(.). The cost function is non-decreasing in y and w, concave in w, and can be written as

$$C(y, w, z, t) = \min_{x} \{ w'x : F(y, x, z, t) = 0 \} \qquad (1)$$

where x (K x 1) denotes input quantities and t represents time. In empirical work, the cost minimization framework has traditionally been represented by a parametric approach using various, at most second-order, flexible functional forms2 and different levels of coherence with microeconomic theory3. Examples of empirical studies in the dairy production context are Burrell 1989; Burton 1989; Guyomard et al. 1996; Colman et al. 1998; Colman 2000; Moro et al. 2005; Cathagne et al. 2006; Wieck and Heckelei 2007. However, widely acknowledged disadvantages of this approach include the potential estimation bias due to misspecification of the functional form, as the so-called 'flexible' functional forms are rather restrictive considering the potentially wide range of production systems, and thereby the range of variable values, implied by cross-sectional samples. This problem has been addressed by increasing the flexibility of parametric forms using higher-order degrees of flexibility (Gallant and Golub 1984; Barnett et al. 1991) and by employing nonparametric regression techniques (Pagan and Ullah 1999). Both alternatives typically raise problems with respect to the imposition of microeconomic shape restrictions. Here, we use local estimation of regular input demand systems, which allows combining functional flexibility with the ability to test and impose behavioural restrictions.

2 As for example by translog, generalized translog, normalized quadratic, symmetric normalized quadratic, or symmetric generalized McFadden functional forms.

3 Ranging from ad-hoc specifications to the estimation of globally well-behaved flexible functional forms.

We assume that the underlying local cost function is specific to each of the i = 1,…,N observations in the sample, defined as

$$C_i(y, w, z, t) = \min_{x} \{ w'x : F_i(y, x, z, t) = 0 \} \qquad (2)$$

with the corresponding input demand equations given by

$$\frac{\partial C_i(\cdot)}{\partial w} = x_i(y, w, z, t) \qquad (3)$$

representing the behavioural equations to be estimated. In the nonparametric, local polynomial regression context, we approximate each input demand function k in (3) by a linear-in-parameter polynomial $v\beta_{ik}$, where the vector v (1 x m) is the collection of transformations of the original variables (y, w, z, t) and $\beta_{ik}$ (m x 1) is the corresponding vector of parameters. Which transformations of the original variables are part of v, and its dimension, depend on the polynomial employed to represent the cost function (2). In this paper we apply a Normalized Quadratic (NQ) and a Symmetric Generalized McFadden (SGM) functional form, leading to (normalized) original variables in the NQ case and linear, quadratic and cross product transformations in the SGM case as elements of v. We now estimate the parameter vector $\beta_{ik}$ for each observation using a kernel weighted, restricted Seemingly Unrelated Regression (SUR) estimator combined with a Bayesian procedure to impose curvature. This implies the following sequence of steps:

Step 1. Calculation of the vector of kernel weights for each observation

$$w_i = \hat{f}(v_i; V) = \frac{1}{N \det(H)} K\left( (v_i - V) H^{-1} \right) \quad \forall i$$

where V is the N x (J+K+L+1) matrix of observations on the explanatory variables with row i equal to $v_i = [\, y_i \;\; w_i \;\; z_i \;\; t_i \,]$, K(.) is a multivariate kernel and H is a diagonal bandwidth matrix corresponding to the column dimension of $v_i$. In our application, we use either the multivariate normal ("Gaussian") or the Epanechnikov kernel. The original bandwidth matrix H is defined as in Simonoff 1996, p. 105. However, this bandwidth is often too small in empirical applications, leaving rather empty neighbourhoods around the evaluation point and leading to singularity problems. Froelich (2007) proposes a stepwise enlargement of the bandwidth values until sufficient non-zero weights are calculated. Starting with the Simonoff formula, we increase the bandwidth matrix in steps of 10% until 95% of the observations obtain a kernel weight above 0.01. This has proven to be sufficient for all N estimations of cost functions.
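The bandwidth enlargement of Step 1 can be sketched in Python as follows. This is an illustration only, not the implementation used for the estimations. It assumes a product kernel over the columns of V, a multiplicative 10% enlargement of all diagonal bandwidth elements, and a threshold of 0.01 measured relative to the kernel value at the evaluation point; the function names and the starting bandwidth `h0` (which stands in for the Simonoff formula) are hypothetical.

```python
import numpy as np


def kernel_values(V, v_i, h, kernel="gaussian"):
    """Product kernel K((v_i - V) H^-1) for a diagonal bandwidth matrix H = diag(h)."""
    u = (np.asarray(V, dtype=float) - v_i) / h  # scaled distances, shape (N, d)
    if kernel == "gaussian":
        k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    elif kernel == "epanechnikov":
        k = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    else:
        raise ValueError(f"unknown kernel: {kernel}")
    return k.prod(axis=1)


def kernel_weights(V, v_i, h0, kernel="gaussian", step=0.10, share=0.95,
                   floor=0.01, max_iter=500):
    """Enlarge the bandwidth in 10% steps until `share` of the observations
    have a relative kernel value above `floor` (assumed reading of Step 1)."""
    V = np.asarray(V, dtype=float)
    v_i = np.asarray(v_i, dtype=float)
    h = np.array(h0, dtype=float)  # copy, so the caller's h0 is not modified
    n_obs = V.shape[0]
    for _ in range(max_iter):
        k = kernel_values(V, v_i, h, kernel)
        peak = kernel_values(v_i[None, :], v_i, h, kernel)[0]  # value at distance zero
        if np.mean(k / peak > floor) >= share:
            break
        h *= 1.0 + step
    # weights as in the formula above: K(.) / (N det(H)), with det(H) = prod(h)
    return k / (n_obs * np.prod(h)), h
```

Calling `kernel_weights(V, V[i], h0)` for each row i would give one weight vector per point of evaluation, together with the bandwidth finally used.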

Step 2. Calculate the locally weighted restricted SUR estimator for each observation

First calculate the unrestricted SUR estimate (here the same as OLS applied to each equation) as

$$\hat{\beta}_i^{SUR} = \left( V_S' \Phi_i V_S \right)^{-1} V_S' \Phi_i x_S \quad \forall i$$

where the subscript "S" denotes variable (transformation) matrices organised as a stacked system of input demand equations and $\Phi_i$ is a (KN x KN) diagonal matrix with K stacked kernel weight vectors $w_i$ on the diagonal. Writing the homogeneity and symmetry restrictions on the parameters as $r - R\beta = 0$,4 the restricted SUR estimator and its covariance matrix conditional on the kernel weights are given as

$$\hat{\beta}_i^{RSUR} = \hat{\beta}_i^{SUR} + \left( V_S' \Phi_i V_S \right)^{-1} R' \left( R \left( V_S' \Phi_i V_S \right)^{-1} R' \right)^{-1} \left( r - R \hat{\beta}_i^{SUR} \right) \quad \forall i$$

$$\hat{\Sigma}_i^{RSUR} = \hat{\Sigma}_i^{SUR} - \hat{\Sigma}_i^{SUR} R' \left( R \left( V_S' \Phi_i V_S \right)^{-1} R' \right)^{-1} R \hat{\Sigma}_i^{SUR} \quad \forall i$$

where $\hat{\Sigma}_i^{SUR}$ is the standard estimate of the covariance matrix of $\hat{\beta}_i^{SUR}$ conditional on the kernel weights.

4 Depending on the specific local polynomial, i.e. the second-order demand system employed, homogeneity restrictions are implicitly taken care of by normalisation and do not need to be included here.
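For a single point of evaluation, the point estimates of Step 2 amount to kernel weighted least squares followed by the usual correction for linear restrictions. The following Python sketch illustrates this under simplifying assumptions: `X` stands for the stacked regressor matrix $V_S$, `y` for the stacked input quantities $x_S$, and `w` for the diagonal of $\Phi_i$; the function name is hypothetical and the covariance matrix is not computed.

```python
import numpy as np


def weighted_restricted_ls(X, y, w, R, r):
    """Kernel weighted least squares with linear restrictions R beta = r.

    X : (n, m) stacked regressors, y : (n,) stacked dependent variable,
    w : (n,) kernel weights (diagonal of Phi_i), R : (q, m), r : (q,).
    Returns the unrestricted and the restricted estimate.
    """
    XtW = X.T * w                      # X' Phi_i, using broadcasting over columns
    A_inv = np.linalg.inv(XtW @ X)     # (X' Phi_i X)^-1
    beta_u = A_inv @ (XtW @ y)         # unrestricted weighted estimate
    middle = np.linalg.inv(R @ A_inv @ R.T)
    beta_r = beta_u + A_inv @ R.T @ middle @ (r - R @ beta_u)
    return beta_u, beta_r
```

By construction the restricted estimate satisfies the restrictions exactly, which is a convenient check when symmetry restrictions across equations are coded by hand.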

Step 3. Sample from the posterior distribution of the parameters for all observations, taking the curvature restriction into account

In principle, several algorithms such as Metropolis-Hastings, importance or Gibbs sampling are possible. Because speed is a crucial issue for large sample sizes, given that we obtain a posterior distribution for each point of evaluation (here observation), and because numerical instabilities of importance schemes were troublesome, we use a Gibbs sampler with an accept/reject algorithm. This means we sample from the posterior of the restricted SUR parameters and discard all outcomes that do not satisfy the concavity of the cost function in prices. The resulting samples can be used for further inference in the subsequent step. Please refer to Griffiths (2001) for the details of the approach, which uses the results from step 2 as a starting point. Note again that this implies posterior distributions conditional on the kernel weights.

Step 4. Derive final point estimate and covariance matrix of parameter vector for each observation

Using the sample of the posterior distributions generated in the previous step, point estimates and covariance matrices of βi are obtained by the sample mean and the sample covariance matrix.
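The accept/reject logic of Steps 3 and 4 can be illustrated with a strongly simplified Python sketch. It is not the Gibbs sampler of Griffiths (2001): as a stand-in, candidate parameter vectors are drawn from a multivariate normal centred on the restricted estimate, and only draws whose price Hessian is negative semidefinite are kept. The mapping `hessian_from_beta` from parameters to the Hessian depends on the functional form and is a hypothetical, user-supplied function here, as is `example_hessian`.

```python
import numpy as np


def is_concave(hessian, tol=1e-10):
    """True if the symmetrised price Hessian is negative semidefinite."""
    eig = np.linalg.eigvalsh((hessian + hessian.T) / 2.0)
    return bool(eig.max() <= tol)


def accept_reject(beta_hat, sigma, hessian_from_beta, n_draws=5000, seed=0):
    """Draw candidates, keep those satisfying concavity (Step 3), and return
    the sample mean, sample covariance (Step 4) and the acceptance share."""
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(beta_hat, sigma, size=n_draws)
    kept = np.array([b for b in draws if is_concave(hessian_from_beta(b))])
    if len(kept) < 2:
        raise ValueError("too few accepted draws to compute a covariance matrix")
    return kept.mean(axis=0), np.cov(kept, rowvar=False), len(kept) / n_draws


def example_hessian(beta):
    """Hypothetical mapping: three parameters form a symmetric 2 x 2 Hessian."""
    return np.array([[beta[0], beta[1]], [beta[1], beta[2]]])
```

The acceptance share returned by the sketch corresponds in spirit to the probability of curvature holding that is reported with the results, although the sampling scheme used here is only an approximation of the procedure described above.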

The results of this approach are illustrated in section 4 for various polynomial specifications.

3 Data

The study uses an unbalanced panel data set for the North of England from the European Farm Accountancy Data Network (FADN) covering the period 1989 - 2000. This data set contains information on individual farm records regarding the use of inputs and the generation of outputs as well as farm structure information. Regarding the representativeness of the farms within the member states, two limitations apply: only farms above a certain size are included, and farms must be managed on a professional basis.5 Associated with a given farm of the sample is the number of holdings belonging to the same type and economic size class of farm. It can be used to aggregate the results from the individual to a higher regional level. The consistency of the data was checked and implausible observations were deleted during data preparation.6 The following subset of dairy farms was chosen for the estimation: farms that produced milk during the complete time period the farm is present in the sample7, that had milk yields in a plausible range8, and that kept a minimum number of cows9 (at least 10% of regional average herd size). Farms that market their products through (unknown) channels yielding a very unusual price (farm gate milk price 50% above/below the regional price) were excluded, assuming that they operate in a production niche independently of general sectoral price and cost developments. Because the marginal costs of all dairy farms in the sector are to be determined, no further selection according to the degree of specialization in dairy farming was undertaken. An overview of all variable definitions can be found in Table 1. All indices are calculated as Törnqvist price indices with base year 1995. All farms in one member state of the EU face the same prices.
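As an illustration of the index construction, a Törnqvist price index between a base period and a comparison period, and the implicit quantity obtained as expenditure divided by the price index, can be sketched in Python as follows. The function names are hypothetical, and the sketch uses the textbook bilateral formula with average expenditure shares; the exact data preparation routines are not documented here.

```python
import numpy as np


def tornqvist_index(p0, pt, x0, xt):
    """Bilateral Törnqvist price index of period t relative to base period 0.

    ln P = sum_i 0.5 * (s_i0 + s_it) * ln(p_it / p_i0),
    where s_i0 and s_it are the expenditure shares of item i in both periods.
    """
    p0, pt, x0, xt = (np.asarray(a, dtype=float) for a in (p0, pt, x0, xt))
    s0 = p0 * x0 / np.sum(p0 * x0)   # expenditure shares, base period
    st = pt * xt / np.sum(pt * xt)   # expenditure shares, period t
    return float(np.exp(np.sum(0.5 * (s0 + st) * np.log(pt / p0))))


def implicit_quantity(expenditure, price_index):
    """Quantity index as expenditure deflated by the price index."""
    return expenditure / price_index
```

With base year 1995, the index equals one in 1995, and a uniform doubling of all component prices yields an index of two regardless of the quantity weights.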

5 The European Court of Auditors complained that the economic size threshold of the holdings is not defined in a uniform manner in all member states and that they rely on different indicators to assess the representativeness of the farms in the sample (Court of Auditors, 2004).

6 The following minimum requirements were defined for inclusion of a farm: some type of land use, at least one output with the respective input quantities, positive labor, and non-activity specific variable input quantities.

7 This does not imply that the farm has to be in the FADN sample over the whole data range (1989-2000). This restriction relates to the dairy production activity of the single farm and prevents farms that quit dairy farming during the observed time period from entering the estimation. This is done because the modeling of entry or exit decisions in dairy farming requires a different model formulation and cannot be captured by this approach.

8 Plausible means here genetically possible and, with respect to engineering information, in a sensible range.

9 The opposite would indicate that dairy farming is a leisure activity rather than an economic branch of the farm.

Table 1 Overview of variable definitions

| Variables | Items/Definition | Unit/Base | Source |
|---|---|---|---|
| Inputs | | | |
| Crop specific inputs | Seed, fertilizer, plant protection, other crop specific inputs | | |
| Price index | Törnqvist price index | 1995 | EAA/FADN |
| Quantity | Expenditure / Price index | Euro | FADN |
| Animal specific inputs | Purchased and home-grown feed, other animal specific expenses, dairy cow stock | | |
| Price index | Törnqvist price index | 1995 | EAA/FADN |
| Quantity | Expenditure / Price index | Euro | FADN |
| Other variable inputs | Costs for machinery, buildings, energy, other direct inputs, paid rents | | |
| Price index | Törnqvist price index | 1995 | EAA/FADN |
| Quantity | Expenditure / Price index | Euro | FADN |
| Outputs | | | |
| Crop outputs | All outputs (value) from crop production except fodder from arable or grass land | | |
| Price index | Törnqvist price index | 1995 | EAA/FADN |
| Quantity | Expenditure / Price index | Euro | FADN |
| Animal specific outputs | All outputs (value) from animal production except milk | | |
| Price index | Törnqvist price index | 1995 | EAA/CAPRI10 |
| Quantity | Expenditure / Price index | Euro | FADN |
| Milk | Gross production with standardized fat and protein content | Tons | FADN |
| Price | | Euro/t | EAA |
| Quantity | | Tons | FADN |
| Quasi-fixed factors | | | |
| Arable crop area | All arable crop production activities except fodder from arable land | Hectare | FADN |
| Fodder area | Fodder production on arable land | Hectare | FADN |
| Grass land | Permanent grass land | Hectare | FADN |
| Animal stock | Stock of animals, except dairy cows, in prices of 1995 | Euro | FADN |
| Labor | All paid and unpaid labor on farm | Hours | FADN |
| Depreciation and interests | Depreciation of all fixed capital assets, calculated at replacement rate, interest payments for working capital, land and buildings | Euro | FADN |

Note: EAA = Economic Accounts of Agriculture, CAPRI = Regionalized data base of CAPRI modeling system. FADN calculates all monetary values in € (EUR/ECU). The conversion to EUR/ECU is done for each Member State and FADN accounting year using the average of the monthly exchange rates as provided by EUROSTAT.

Source: Own compilation.

10 Regional production shares stem from the CAPRI data base (see Britz 2005).

Due to the introduction of the milk quota system in 1984, at least one of the outputs of the farm is restricted to the (milk) quota level. The number of dairy cattle is part of the input vector x which implies that intensity adjustments in the production process are represented and the time horizon covered by this cost minimization approach is short to medium term.

4 Results

4.1 Variants of implementation, summary statistics, and average marginal costs

Three variants of the methodology described above are applied here to check the relevance of specific methodological choices in the context of the overall procedure: (1) the SGM functional form as the second-order local polynomial combined with a Gaussian kernel; (2) the NQ functional form as the second-order local polynomial combined with a Gaussian kernel; (3) as variant (2), but with an Epanechnikov kernel. In addition, results are compared to an SGM parametric SUR estimation. Summary statistics on the different nonparametric specifications are presented in Table 2. Equation fit and the number of significant parameters are in a reasonable range for both polynomial specifications. The most notable difference between the two specifications can be found in the probability of curvature holding in the Gibbs procedure, where the SGM specification, with around 54%, provides a much higher percentage of correct curvature than the NQ formulation.

Table 2 Summary statistics of the nonparametric locally weighted regression estimators

Average over observations:

| Specification of local regression approach | Probability of curvature holding in Gibbs iterations (%) | R2 XC | R2 XA | R2 XT | # of sign. parameters XC | # of sign. parameters XA | # of sign. parameters XT |
|---|---|---|---|---|---|---|---|
| SGM with Gaussian | 54.23 | 0.82 | 0.52 | 0.15 | 11 | 11 | 11 |
| NQ with Gaussian | 23.51 | 0.82 | 0.81 | 0.52 | 4 | 5 | 6 |
| NQ with Epanechnikov | 24.34 | 0.84 | 0.83 | 0.60 | 4 | 5 | 7 |

Note: XC = variable crop specific input demand equation; XA = variable animal specific input demand equation; XT = other variable input demand. R2 = measure for goodness-of-fit; significance of parameters measured at the 10% level using a t-test.

Source: Own estimations.

A look at the average marginal costs of the different specifications shows that, as expected, the choice of the kernel weighting function does not introduce a large difference in the level of marginal costs, whereas a change of the polynomial results in slightly different marginal costs. The differences in the marginal cost level are larger for the three-year average around 1991 than for the one around 1999.

Table 3 Summary statistics of the nonparametric locally weighted regression estimators

| Specification of local regression approach | Marginal cost 1991 (EUR/t) | Marginal cost 1999 (EUR/t) |
|---|---|---|
| SGM with Gaussian | 191.5 | 145.5 |
| NQ with Gaussian | 219.3 | 135.3 |
| NQ with Epanechnikov | 210.6 | 128.4 |

Note: The years 1991 and 1999 are three-year averages of the respective estimates, calculated for all farms within the 95% quantile of the respective marginal cost distribution. For the calculation of regional means of all variables and marginal costs, each farm variable is multiplied by its associated FADN weighting in order to get an approximately true representation (weight) of the farm in the regional sector. Values are in €/t and deflated by the national consumer price indices of the respective years (EUROSTAT 2004).

Source: Own calculations.
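The regional averages described in the note to Table 3 can be illustrated by the following Python sketch. It assumes that the farms outside the 95% quantile are identified by trimming the 2.5% lowest and highest farm-level estimates before the FADN weights are applied; the function name and this order of operations are assumptions made for the illustration.

```python
import numpy as np


def trimmed_weighted_mean(mc, weights, coverage=0.95):
    """FADN-weighted mean of marginal costs for farms inside the central
    `coverage` share of the (unweighted) marginal cost distribution."""
    mc = np.asarray(mc, dtype=float)
    weights = np.asarray(weights, dtype=float)
    tail = (1.0 - coverage) / 2.0
    lower, upper = np.quantile(mc, [tail, 1.0 - tail])
    inside = (mc >= lower) & (mc <= upper)   # farms within the quantile bounds
    return float(np.average(mc[inside], weights=weights[inside]))
```

The trimming matters in practice: a single extreme farm-level estimate would otherwise dominate the regional mean.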

These nonparametric locally weighted regression results can be compared with a more traditional iterative SUR estimation procedure for the same data set (see Wieck and Heckelei, 2007; and Wieck, 2005). With the SUR approach, average regional marginal costs were estimated to be 142 €/t for the 1991 average and 139 €/t for the 1999 average. The 1999 estimate is very close to the results achieved by the nonparametric approach, but the development of marginal costs over the time span differs considerably.

4.2 Distribution of marginal costs across farms

In addition to the average marginal costs, we look at the distribution of the estimated marginal costs within the data set. Table 4 presents marginal cost averages for different quantiles of the regional marginal cost distribution for the different estimation procedures. Figure 1 contains the distributions of marginal costs for the different specifications. These distributions are presented as nonparametric kernel density estimates of the farm marginal cost estimates for three-year averages around the years 1991 and 1999. The marginal cost estimate of each farm is multiplied by its associated FADN weight in order to get an approximately true representation of the farm in the regional distribution. A higher farm weight leads to a higher density value, indicating that the farm represents a higher share of the farm population of the sector. All marginal cost estimates are deflated by the national consumer price indices of the respective years (EUROSTAT, 2004). The area under each density curve equals one. Peaks in the curve indicate that a higher number of farms reveal very similar marginal costs and that marginal costs in that region and time period are more balanced. Very concentrated distributions with small tails point to a rather homogeneous farm population with little variation in production technology, whereas flat distributions and large tails highlight differences in economic behavior and productivity in the farm sector. Apart from the different mean values, we see rather similar distributions for the two NQ specifications. The nonparametric SGM specification does not show such a distinct peak in 1999 as the NQ formulations, but its 1991 distribution looks very comparable to those of the NQ approach. This seems to hold even for the distribution obtained by the traditional SUR estimation.
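A weighted kernel density estimate of the kind described above can be sketched in Python as follows. The sketch interprets the FADN weighting as normalised weights attached to each farm's kernel; the Gaussian kernel, the fixed bandwidth and the function name are assumptions made for the illustration, not a description of the routine behind Figure 1.

```python
import numpy as np


def weighted_kde(x, weights, grid, bandwidth):
    """Gaussian kernel density estimate in which each observation contributes
    in proportion to its (normalised) farm weight."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    p = np.asarray(weights, dtype=float)
    p = p / p.sum()                                   # weights sum to one
    u = (grid[:, None] - x[None, :]) / bandwidth      # shape (grid points, farms)
    k = np.exp(-0.5 * u**2) / np.sqrt(2.0 * np.pi)
    return (k * p).sum(axis=1) / bandwidth
```

Because the weights are normalised, the estimated density integrates to one, and farms with a higher weight raise the density around their marginal cost estimate.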

Figure 1 Marginal cost distributions within the region

[Four panels showing kernel density estimates for region 411 (three-year averages, curves for 1991 and 1999): SGM with Gaussian; NQ with Gaussian; NQ with Epanechnikov; Iterative SUR estimation (SGM). Horizontal axis: Marginal costs (Euro/t), 0 to 300. Vertical axis: Density, 0 to 0.07.]

Note: All marginal cost values are deflated. All farms within 95% quantile.

Source: Own presentation.

These descriptive results are supported by the calculation of the spread of marginal costs outside and within the 95% quantile as well as within the 75% quantile (Table 4). For the outlier estimates (outside the 95% quantile), we see huge differences between the nonparametric approaches, with implausibly large estimates in the SGM case in particular. Only the NQ with the Epanechnikov kernel weighting function seems to offer plausible results for the most extreme observations. Moving closer to the mean of the marginal cost distribution, estimates across the different specifications get closer to each other. The SGM and the two NQ specifications still show considerable differences for the year 1991 in the 95% quantile, whereas for both years the marginal cost estimates of farms in the 75% quantile no longer differ markedly. Comparing the nonparametric specifications to the SUR approach, we can see that the nonparametric SGM specification in particular is rather close to the traditional estimation approach.

Table 4 Marginal cost distributions within the region

| | SGM 1991 | SGM 1999 | NQ (Gaussian) 1991 | NQ (Gaussian) 1999 | NQ (Epan.) 1991 | NQ (Epan.) 1999 | SUR estimation 1991 | SUR estimation 1999 |
|---|---|---|---|---|---|---|---|---|
| Outside 95% quantile | | | | | | | | |
| Max | 2228863 | 1449996 | 368 | 208 | 277 | 197 | n.a. | n.a. |
| Min | 116 | 61 | -205 | -191 | 70 | 81 | | |
| Spread | 2228747 | 1449935 | 573 | 399 | 207 | 116 | | |
| 95% quantile | | | | | | | | |
| Max | 315 | 187 | 324 | 158 | 251 | 147 | 175 | 166 |
| Min | 124 | 91 | 176 | 93 | 173 | 91 | 79 | 74 |
| Spread | 191 | 96 | 148 | 65 | 79 | 56 | 96 | 92 |
| 75% quantile | | | | | | | | |
| Max | 218 | 164 | 243 | 148 | 230 | 141 | 158 | 159 |
| Min | 164 | 116 | 196 | 120 | 189 | 108 | 109 | 110 |
| Spread | 53 | 48 | 47 | 28 | 41 | 33 | 49 | 48 |

Note: The 95% (75%) quantile is defined as the interval covering marginal cost values excluding the 2.5% (12.5%) of the farms with the highest and lowest marginal cost values. The spread is calculated as the difference between upper and lower bound of the intervals. All marginal costs are deflated and in €/t.

Source: Own calculations.
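The quantile intervals defined in the note to Table 4 can be computed along the following lines. The Python sketch assumes that the share of farms dropped at each end is rounded down to a whole number of farms; the function name and this rounding rule are assumptions.

```python
import numpy as np


def quantile_spread(mc, coverage):
    """Max, min and spread of marginal costs after dropping the farms with the
    highest and lowest values, (1 - coverage) / 2 of the sample at each end."""
    mc_sorted = np.sort(np.asarray(mc, dtype=float))
    n = len(mc_sorted)
    k = int(np.floor((1.0 - coverage) / 2.0 * n + 1e-9))  # farms dropped per tail
    inside = mc_sorted[k:n - k]
    return inside.max(), inside.min(), inside.max() - inside.min()
```

For example, with 40 farms the 95% quantile drops one farm at each end and the 75% quantile drops five.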

4.3 Differences in the tails of the distributions

Given that the last section shows that the average level of marginal costs and the spread within the midrange of the marginal cost distribution are rather similar across nonparametric specifications, we focus in this section on the differences in the tails of the distributions. Table 5 identifies the number of farms that are located in different quantiles in the different specifications. For example, if we compare the NQ with Gaussian kernel weighting with the SGM specification, we see that five out of 12 farms are located in the 75% quantile in both estimations in the year 1991. However, at the same time, one farm switched from the very low end of the distribution (outside the 95% quantile) to the very high end of the marginal cost distribution. At the upper end for the year 1991, the comparison of the Gaussian NQ with the SGM shows for both categories that one farm moves up one quantile. Similar observations hold for the year 1999. Setting the NQ/SGM comparison against the comparison of the two NQ specifications, it is obvious that the latter results in a more similar ordering of the marginal costs of the different farms. Most farms at the tails of the distribution remain in the same quantile and only very few (three in total) change the quantile.

Table 5 Identification of farms in the tails of the marginal cost distribution

Columns: Number of farms in quantiles | Compare NQ Gaussian with SGM (Same quantile, Change of quantile, Change of tails) | Compare NQ Gaussian with NQ Epanechnikov (Same quantile, Change of quantile, Change of tails)

Rows, for 1991 and for 1999: lower end, outside 95% quantile; lower end, outside 75% quantile; upper end, outside 75% quantile; upper end, outside 95% quantile

Number of farms in quantiles, 1991: 3 12 12 3

Number of farms in quantiles, 1999: 4 12 12 4

Remaining cell entries: 5 4; 3 1; 1 1 1 2 2 2; 2; 2 9 3; 1 1; 2; 2 9 6 2; 1

Source: Own calculations.

The last question to answer is why the farms are estimated to display a specific level of marginal costs. We approach this question by displaying farm characteristics for the different quantile categories. This gives a first indication of how differences in farm endowment and input use may contribute to the specific estimated marginal cost level. Table 6 displays the farm characteristics in the different quantile categories, with the highest value in each row for the 1991 and 1999 columns being highlighted. Comparing the three specifications for the year 1991, it is striking that there seems to be a shift of the largest averages from the upper end of the tail to the lower one between the SGM specification and the two NQ specifications. This would support the finding from Table 5 that one (very large) farm switched tails between the two specifications. In addition, this may support the hypothesis that the SGM specification is not well suited to handle large "outliers" in the data set. Evaluating the two NQ specifications, as expected from Table 5, we observe a very similar pattern of shaded cells, indicating that the same farms are located in the same categories in both specifications.

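Table 6 Farm characteristics at the tails of the distribution

SGM

| Variable | Unit | 1991 lower end, outside 95% q. | 1991 lower end, outside 75% q. | 1991 75% q. | 1991 upper end, outside 75% q. | 1991 upper end, outside 95% q. | 1999 lower end, outside 95% q. | 1999 lower end, outside 75% q. | 1999 75% q. | 1999 upper end, outside 75% q. | 1999 upper end, outside 95% q. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Crop output | Euro | 36,356 | 42,272 | 24,447 | 114,982 | 219,963 | 33,917 | 341,233 | 51,800 | 16,327 | 36,241 |
| Animal output | Euro | 80,794 | 49,864 | 34,416 | 95,939 | 270,439 | 81,879 | 132,191 | 52,082 | 97,280 | 257,610 |
| Milk | tons | 535 | 763 | 334 | 463 | 734 | 1,489 | 962 | 444 | 460 | 685 |
| Crop sp.input | Euro | 17,456 | 21,894 | 10,660 | 38,200 | 42,681 | 26,412 | 50,745 | 12,821 | 13,081 | 13,814 |
| Animal sp.input | Euro | 71,620 | 85,707 | 44,461 | 89,006 | 174,618 | 182,663 | 150,028 | 68,819 | 83,953 | 139,459 |
| Other var.input | Euro | 70,807 | 51,443 | 31,047 | 54,604 | 189,336 | 75,112 | 104,046 | 41,840 | 45,153 | 58,592 |
| Grass land | ha | 330 | 88 | 56 | 54 | 184 | 75 | 123 | 56 | 100 | 84 |
| Labor | hours | 6,195 | 8,081 | 5,262 | 7,790 | 18,047 | 6,879 | 8,549 | 4,540 | 5,392 | 7,107 |
| Stock o.animal | Euro | 101,096 | 70,157 | 51,756 | 142,023 | 268,815 | 90,731 | 110,321 | 50,498 | 110,809 | 131,733 |
| Arable land | ha | 132 | 72 | 32 | 89 | 120 | 87 | 179 | 41 | 51 | 34 |
| Depreciation | Euro | 22,326 | 35,841 | 19,171 | 45,371 | 90,163 | 79,532 | 92,658 | 25,757 | 32,737 | 70,105 |
| Marginal costs | Euro/ton | 118 | 147 | 190 | 245 | 682,582 | 78 | 101 | 147 | 176 | 332,929 |
| No.of farms | | 3 | 12 | 88 | 12 | 3 | 4 | 12 | 91 | 12 | 4 |

NQ (Gaussian)

| Variable | Unit | 1991 lower end, outside 95% q. | 1991 lower end, outside 75% q. | 1991 75% q. | 1991 upper end, outside 75% q. | 1991 upper end, outside 95% q. | 1999 lower end, outside 95% q. | 1999 lower end, outside 75% q. | 1999 75% q. | 1999 upper end, outside 75% q. | 1999 upper end, outside 95% q. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Crop output | Euro | 313,130 | 33,152 | 19,570 | 47,450 | 46,559 | 721,365 | 34,491 | 47,759 | | 110,049 |
| Animal output | Euro | 181,524 | 61,685 | 32,158 | 81,029 | 143,693 | 300,678 | 111,917 | 50,922 | 39,071 | 226,201 |
| Milk | tons | 1,194 | 537 | 324 | 480 | 1,022 | 815 | 881 | 487 | 256 | 921 |
| Crop sp.input | Euro | 175,601 | 18,624 | 9,375 | 17,627 | 43,731 | 105,486 | 17,512 | 13,653 | 6,131 | 48,823 |
| Animal sp.input | Euro | 192,380 | 74,456 | 42,559 | 72,918 | 149,069 | 159,305 | 130,240 | 73,175 | 47,967 | 189,054 |
| Other var.input | Euro | 158,416 | 45,475 | 28,705 | 51,514 | 119,060 | 200,774 | 57,194 | 42,458 | 34,491 | 95,172 |
| Grass land | ha | 31 | 67 | 60 | 64 | 192 | 94 | 89 | 55 | 154 | 23 |
| Labor | hours | 15,433 | 7,498 | 5,139 | 6,039 | 13,351 | 13,738 | 6,526 | 4,633 | 3,737 | 9,278 |
| Stock o.animal | Euro | 183,853 | 66,766 | 50,027 | 142,522 | 228,821 | 130,577 | 87,323 | 56,161 | 59,509 | 197,340 |
| Arable land | ha | 223 | 51 | 31 | 54 | 86 | 303 | 36 | 43 | 43 | 158 |
| Depreciation | Euro | 146,264 | 34,604 | 17,847 | 30,609 | 48,382 | 179,419 | 60,911 | 27,492 | 19,401 | 56,419 |
| Marginal costs | Euro/ton | 43 | 185 | 218 | 262 | 352 | -15 | 108 | 136 | 152 | 186 |
| No.of farms | | 3 | 12 | 86 | 12 | 3 | 4 | 12 | 90 | 12 | 4 |

NQ (Epanechnikov)

| Variable | Unit | 1991 lower end, outside 95% q. | 1991 lower end, outside 75% q. | 1991 75% q. | 1991 upper end, outside 75% q. | 1991 upper end, outside 95% q. | 1999 lower end, outside 95% q. | 1999 lower end, outside 75% q. | 1999 75% q. | 1999 upper end, outside 75% q. | 1999 upper end, outside 95% q. |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Crop output | Euro | 287,476 | 40,558 | 23,977 | 32,095 | 46,559 | 156,664 | 49,796 | 45,060 | 32,101 | 126,716 |
| Animal output | Euro | 338,257 | 68,010 | 35,476 | 38,281 | 141,298 | 332,600 | 126,167 | 54,725 | 28,918 | 104,628 |
| Milk | tons | 1,299 | 550 | 357 | 299 | 490 | 1,015 | 850 | 511 | 250 | 476 |
| Crop sp.input | Euro | 147,601 | 21,576 | 10,970 | 9,622 | 22,513 | 40,998 | 18,831 | 14,149 | 5,911 | 26,295 |
| Animal sp.input | Euro | 257,301 | 77,784 | 46,859 | 41,768 | 120,658 | 190,304 | 126,590 | 78,172 | 38,786 | 96,723 |
| Other var.input | Euro | 281,065 | 45,774 | 32,755 | 29,855 | 49,472 | 96,768 | 68,018 | 42,589 | 32,105 | 61,308 |
| Grass land | ha | 33 | 75 | 62 | 61 | 205 | 57 | 101 | 65 | 70 | 10 |
| Labor | hours | 23,763 | 7,510 | 5,431 | 4,741 | 9,592 | 9,894 | 6,749 | 4,841 | 2,809 | 6,457 |
| Stock o.animal | Euro | 248,979 | 73,917 | 55,576 | 68,394 | 249,205 | 111,962 | 103,345 | 61,341 | 34,181 | 92,940 |
| Arable land | ha | 209 | 47 | 35 | 42 | 76 | 95 | 48 | 44 | 37 | 104 |
| Depreciation | Euro | 164,235 | 40,850 | 19,712 | 13,257 | 57,950 | 83,181 | 62,127 | 29,560 | 10,885 | 30,065 |
| Marginal costs | Euro/ton | 103 | 180 | 209 | 241 | 260 | 84 | 102 | 129 | 143 | 162 |
| No.of farms | | 3 | 12 | 88 | 12 | 3 | 4 | 12 | 90 | 12 | 4 |

Note: In each quantile category, the average among farms is displayed.

Source: Own calculations.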

A further point that can be read from the table is that farms in the middle range of the marginal cost distribution (75% quantile) are apparently of average size with respect to milk output and show average farm endowment, output, and input use. For the year 1999 in particular, farms that are larger with respect to most indicators seem to display the lower marginal costs, given that most highlighted cells are located in the lower range of the marginal cost distribution. This holds even for the SGM specification (contrary to the year 1991).
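The quantile comparison described above can be illustrated with a minimal sketch. It assumes farms are split into three categories at the 25% and 75% quantiles of estimated marginal cost; these cut points, the function name `quantile_group_means`, and the toy numbers are illustrative assumptions and are not taken from the paper's data or code.

```python
import numpy as np

def quantile_group_means(marginal_cost, characteristic, probs=(0.25, 0.75)):
    """Average a farm characteristic within marginal-cost quantile categories.

    The cut points in `probs` are an assumption made for illustration.
    Category 0 holds farms at or below the first cut, the last category
    holds farms above the final cut.
    """
    mc = np.asarray(marginal_cost, dtype=float)
    x = np.asarray(characteristic, dtype=float)
    cuts = np.quantile(mc, probs)
    # right=True assigns a value equal to a cut point to the lower category
    groups = np.digitize(mc, cuts, right=True)
    return {int(g): float(x[groups == g].mean()) for g in np.unique(groups)}

# Toy data: farms with lower marginal costs have the larger milk output.
mc = [1, 2, 3, 4, 5, 6, 7, 8]
milk_output = [80, 70, 60, 50, 40, 30, 20, 10]
print(quantile_group_means(mc, milk_output))
```

In this toy example, average output falls from the lowest to the highest marginal cost category, which is the kind of pattern the highlighted cells in the table are meant to convey.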

5 Conclusions

This contribution illustrated the use of different specifications of local nonparametric regression estimators in the framework of dairy cost estimation. The comparison of results across these specifications shows that very similar average levels of marginal costs are estimated. Both functional forms used in the polynomials support this finding, though the curvature imposition seems to work better with the Symmetric Generalized McFadden (SGM) specification. However, the comparison of the distribution of marginal costs across farms and of the farm-specific marginal costs reveals slight to significant differences across the specifications, in particular for farms with more heterogeneous characteristics. Here, the Normalized Quadratic, in particular in combination with an Epanechnikov weighting scheme, seems to handle data outliers more appropriately than the SGM specification. However, this may come at the cost of a loss of appropriate curvature in the functional form specification.

The comparison with a more traditional iterative SUR framework in the same context and with the same data set does not reveal substantial differences in the average marginal costs or their distribution across farms. However, the ease of imposing curvature in the nonparametric approach, compared to the often less satisfying curvature results in the traditional approach, seems to favour the more flexible nonparametric regression framework. More applications of this promising approach across a broader range of data sets are desirable and may also contribute to further methodological developments and insights in the area of kernel, bandwidth, or polynomial choice.
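To make the role of the Epanechnikov weighting scheme concrete, the following is a minimal sketch of a kernel-weighted local linear regression at a single evaluation point. It is a deliberately simplified, univariate illustration: the paper fits theory-consistent cost functions (Normalized Quadratic, SGM) as local polynomials in several variables, whereas this sketch fits only a local line. The function names, the bandwidth, and the toy data are assumptions chosen for illustration.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel: 0.75 * (1 - u^2) for |u| <= 1, zero otherwise."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def local_linear_fit(x, y, x0, bandwidth):
    """Weighted least-squares fit of a line around the evaluation point x0.

    Returns (level, slope): the estimated regression function at x0 and its
    first derivative there. With cost as y and output as x, the slope plays
    the role of a local marginal cost estimate.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = epanechnikov((x - x0) / bandwidth)
    # Regressors are centred at x0 so the intercept equals the fit at x0.
    X = np.column_stack([np.ones_like(x), x - x0])
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return float(beta[0]), float(beta[1])

# Toy data: an exactly linear "cost" relationship y = 2 + 3x.
x = np.linspace(0.0, 1.0, 21)
y = 2.0 + 3.0 * x
level, slope = local_linear_fit(x, y, x0=0.5, bandwidth=0.3)
print(level, slope)
```

Because the Epanechnikov kernel has bounded support, observations further than one bandwidth from the evaluation point receive zero weight. This is one plausible reason why such a weighting scheme can limit the influence of outlying farms on a farm-specific estimate.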

6 References

Barnett, W.A., J. Geweke, and M. Wolfe. 1991. Seminonparametric Bayesian Estimation of the Asymptotically Ideal Production Model. Journal of Econometrics 49, No. 1/2, pp. 5-50.

Britz, W., ed. 2005. CAPRI Modelling System Documentation, University of Bonn. Available at: http://www.agp.uni-bonn.de/agpo/rsrch/capri/capri-documentation.pdf


Burrell, A. 1989. The Microeconomics of Quota Transfer. In Burrell, A. (Ed.), Milk Quotas in the European Community. CAB International, Wallingford, pp. 100-118.

Burton, M.P. 1989. Changes in Regional Distribution of Milk Production. In Burrell, A. (Ed.), Milk Quotas in the European Community. CAB International, Wallingford, pp. 163-182.

Cathagne, A., H. Guyomard, and F. Levert. 2006. Milk Quotas in the European Union: Distribution of Marginal Costs and Quota Rents. Working Paper 01/2006, European Dairy Industry Model.

Colman, D. 2000. Inefficiencies in the UK Milk Quota System. Food Pol. 25, 1-16.

Colman, D., M. Burton, D. Rigby, and J. Franks. 1998. Economic Evaluation of the UK Milk Quota System. Final Report to the MAFF, Centre for Agricultural, Food and Resource Economics, School of Economic Studies, University of Manchester.

EUROSTAT. 2004. Agricultural Statistics, Luxembourg.

Fan, J., and I. Gijbels. 1996. Local Polynomial Modelling and its Applications. Chapman & Hall, London.

Froelich, M. 2007. Nonparametric regression for binary dependent variables. Forthcoming, Econometrics J.

Gallant, A.R., and G.H. Golub. 1984. Imposing Curvature Restrictions on Flexible Functional Forms. Journal of Econometrics 26, pp. 295-322.

Gozalo, P., and O. Linton. 2000. Local nonlinear least squares: Using parametric information in nonparametric regression. Journal of Econometrics 99: 63-100.

Griffiths, W.E. 2001. Bayesian Inference in the Seemingly Unrelated Regression Model. Working paper. Economics Department, University of Melbourne, Australia.

Guyomard, H., X. Delache, X. Irz, and L.-P. Mahé. 1996. A Microeconometric Analysis of Milk Quota Transfer: Application to French Producers. J. Agric. Econ. 47, 206-223.

Mittelhammer, R.C., G.G. Judge, and D.J. Miller. 2000. Econometric Foundations. Cambridge University Press, UK.

Moro, D., M. Nardella, and P. Sckokai. 2005. Regional Distribution of Short-run, Medium-run, and Long-run Quota Rents across EU-15 Milk Producers. Selected paper at the XIth congress of EAAE, Copenhagen.

Pagan, A., and A. Ullah. 1999. Nonparametric Econometrics. Cambridge University Press, UK.

Simonoff, J.S. 1996. Smoothing methods in statistics. Springer, New York.

Wieck, C. 2005. Determinants, distribution, and development of marginal costs in dairy production: An empirical analysis of dairy production regions in the EU. PhD Dissertation, University of Bonn. Shaker Press: Aachen.


Wieck, C., and T. Heckelei. 2006. The estimation of nonparametric cost functions in dairy production. Unpublished manuscript. Institute for Food and Resource Economics, University of Bonn.

Wieck, C., and T. Heckelei. 2007. Determinants, differentiation, and development of short-term marginal costs in dairy production: an empirical analysis for selected regions of the EU. Agricultural Economics 36: 201-218.
