Solutions to the residual and zero-value problems in energy decomposition

Adrian Muller∗†‡

Abstract: This topical note shows how the zero-value problem and the residual problem in energy decomposition can be resolved. The ultimate cause of these problems lies in the approximation of integrals and derivatives and in ill-defined division by zero. Recognizing this identifies promising approaches to improve decomposition methods.

Keywords: Decomposition, Index numbers, Divisia Index, Energy consumption

JEL: C63, Q41, Q5



∗ Environmental Economics Unit (EEU), Department of Economics, Göteborg University, PO Box 640, 40530 Göteborg, Sweden; phone: 0046-31-773 47 59; fax 0046-31-773 41 54; e-mail: [email protected]
† Center for Corporate Responsibility and Sustainability CCRS, University of Zürich, Künstlergasse 15a, 8001 Zürich, Switzerland; phone: 0041-44-634 40 62; e-mail: [email protected]
‡ Many thanks to Åsa Löfgren for helpful remarks. The usual disclaimer applies.

1 Introduction

Decomposition of energy use, energy intensity and pollution to identify the different drivers of their evolution is an important input for policy makers. An example is industrial SO2 emissions P(t) in a country for a year t. These can trivially be written as

\[ P = \sum_{ij} \frac{P_{ij}}{E_{ij}} \frac{E_{ij}}{E_i} \frac{E_i}{Q_i} \frac{Q_i}{Q} Q, \tag{1} \]

where, each for the year t, $E_i(t)$ is the energy use in sector i, $Q_i(t)$ sector i output, $Q(t)$ total output, $P_{ij}$ stands for emissions from the use of fuel j in sector i, and $E_{ij}$ denotes the use of fuel j in sector i, so that $E_{ij}/E_i$ is the share of fuel j in sector i energy use. Total emissions are thus decomposed into factors referring to sector-wise technological progress (energy use per output), structural change (sector output share in total output) and size effects (total output), as well as sector- and fuel-wise pollution intensity and the sector-wise fuel structure. The main interest lies in the time development of these different factors (or indices) capturing the different driving forces.

The literature on further theoretical development of decomposition and its application to energy use and pollution is still growing (Cole et al., 2005; Boyd and Roop, 2005; Ang and Choi, 2003; for a survey: Ang, 1995). Results often depend on the method applied, and the performance of a method varies with different data sets. Furthermore, the presence of a residual that remains unexplained in most decompositions and the problem with zero or negative values in the data call for improvements. New proposals for decomposition methods address these issues, and the performance of a method is often measured by the value of its residual, ideally bringing this down to zero, and by the method’s ability to handle zero and negative values (Ang 2004, Ang and Liu in press b).

Here, I focus on the fact that the problem of residuals is an aspect of the underlying mathematics of decomposition, which is integral and derivative approximation (Trivedi 1981). Non-zero residuals are thus unavoidable, and forcing them to be zero necessarily leads to erroneous results. The problem with zero-values, on the other hand, stems from careless division by zero in the course of the derivation of the decomposition and can be avoided by taking this into account. Negative values, finally, cause a problem for methods based on taking logarithms of the variables of interest. Again, the best solution is to avoid such ill-defined operations right from the beginning.

The next section presents the general mathematics of decomposition and relates it to the approximation of integrals and derivatives, section 3 links this to decomposition in the energy and pollutant related literature and section 4 discusses the zero-value problem and the residual in particular. Section 5 concludes.
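As a minimal sketch of the identity in eq. (1), the following Python snippet builds the five factors from raw data for a single year and multiplies them back together. All numbers, the two-sector/two-fuel layout and the function names `decompose` and `recompose` are invented for illustration and are not taken from any data set.

```python
# Hypothetical data for one year: 2 sectors (i) x 2 fuels (j).
# All numbers are invented for illustration only.
emissions = [[4.0, 1.0], [2.0, 3.0]]   # P_ij: emissions from fuel j in sector i
fuel_use = [[10.0, 5.0], [4.0, 6.0]]   # E_ij: use of fuel j in sector i
output = [50.0, 20.0]                  # Q_i: sector output


def decompose(emissions, fuel_use, output):
    """Return the five factors of eq. (1) for every (i, j) pair.

    Assumes every E_ij is non-zero; zero values need separate care.
    """
    total_output = sum(output)                       # Q
    factors = {}
    for i, (p_row, e_row) in enumerate(zip(emissions, fuel_use)):
        sector_energy = sum(e_row)                   # E_i
        for j, (p_ij, e_ij) in enumerate(zip(p_row, e_row)):
            factors[(i, j)] = (
                p_ij / e_ij,                 # pollution intensity P_ij / E_ij
                e_ij / sector_energy,        # fuel share E_ij / E_i
                sector_energy / output[i],   # energy intensity E_i / Q_i
                output[i] / total_output,    # output share Q_i / Q
                total_output,                # size effect Q
            )
    return factors


def recompose(factors):
    """Multiply the factors back together and sum over i and j."""
    total = 0.0
    for f in factors.values():
        product = 1.0
        for value in f:
            product *= value
        total += product
    return total


factors = decompose(emissions, fuel_use, output)
total_emissions = recompose(factors)
```

Up to floating-point rounding, `total_emissions` equals the plain sum of the entries of `emissions`, which is all that the word "trivially" in the text asserts.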

2 The mathematics of decomposition

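The general decomposition of an aggregate like industrial SO2 emissions into factors referring to two sets of sub-levels (such as “sectors” and “fuel type”) reads as follows (larger number of levels are treated similarly):

\[ P(t) = \sum_{ij} P_{ij}(t) = \sum_{ij} \prod_k A^k_{ij}(t) \prod_l B^l_i(t)\, C(t), \tag{2} \]

where $P_{ij} = \prod_k A^k_{ij} \prod_l B^l_i C$. The derivative with respect to t is:

\[ \frac{dP}{dt} = \sum_{ij} \Big\{ \sum_{k'} \frac{\partial A^{k'}_{ij}}{\partial t} \prod_{k \neq k'} A^k_{ij} \prod_l B^l_i\, C + \sum_{l'} \prod_k A^k_{ij} \frac{\partial B^{l'}_i}{\partial t} \prod_{l \neq l'} B^l_i\, C + \prod_k A^k_{ij} \prod_l B^l_i \frac{\partial C}{\partial t} \Big\}. \tag{3} \]

Multiplying each summand by $P_{ij}/P_{ij}$ gives

\[ \frac{dP}{dt} = \sum_{ij} P_{ij} \Big\{ \sum_{k'} \frac{\partial A^{k'}_{ij}}{\partial t} \frac{1}{A^{k'}_{ij}} + \sum_{l'} \frac{\partial B^{l'}_i}{\partial t} \frac{1}{B^{l'}_i} + \frac{\partial C}{\partial t} \frac{1}{C} \Big\}. \tag{4} \]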

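The parts in the first sum capture the effect of each variable $A^{k'}_{ij}$ on the total, correspondingly the second and third summands for $B^{l'}_i$ and $C$. The interest lies in the time development of this decomposition. Having only discrete data (on an annual, quarterly or monthly basis, usually), differences have to be investigated instead of the derivative:

\[ \Delta P_{t+1,t} := P_{t+1} - P_t = \int_t^{t+1} \frac{dP}{d\tau}\, d\tau = \sum_{ij} \Big\{ \sum_{k'} \int_t^{t+1} P_{ij} \Big( \frac{\partial A^{k'}_{ij}}{\partial \tau} \frac{1}{A^{k'}_{ij}} \Big) d\tau + \sum_{l'} \int_t^{t+1} P_{ij} \Big( \frac{\partial B^{l'}_i}{\partial \tau} \frac{1}{B^{l'}_i} \Big) d\tau + \int_t^{t+1} P_{ij} \Big( \frac{\partial C}{\partial \tau} \frac{1}{C} \Big) d\tau \Big\}. \tag{5} \]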
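For intuition, the following Python sketch evaluates the integrals of eq. (5) in an idealised case where the factors are known for every τ. It assumes a single (i, j) pair with two factors, P = A · C, and the functional forms of `A` and `C` are invented. The point is only that the exact integral terms add up to ΔP, so that any residual must come from approximation.

```python
import math


def A(t):
    return 2.0 + 0.5 * t          # hypothetical intensity-type factor


def dA(t):
    return 0.5                    # derivative of A


def C(t):
    return math.exp(0.1 * t)      # hypothetical size factor


def dC(t):
    return 0.1 * math.exp(0.1 * t)  # derivative of C


def P(t):
    return A(t) * C(t)


def integrate(f, a, b, steps=10_000):
    """Composite midpoint rule, only usable because f is known everywhere."""
    h = (b - a) / steps
    return h * sum(f(a + (n + 0.5) * h) for n in range(steps))


def exact_terms(t):
    """The two integrals of eq. (5) on [t, t+1] for P = A * C."""
    term_A = integrate(lambda s: P(s) * dA(s) / A(s), t, t + 1)
    term_C = integrate(lambda s: P(s) * dC(s) / C(s), t, t + 1)
    return term_A, term_C


term_A, term_C = exact_terms(0.0)
delta_P = P(1.0) - P(0.0)
gap = delta_P - (term_A + term_C)   # only quadrature and rounding error
```

With real data this computation is not available, since `A` and `C` are observed at integer t only.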

These integrals cannot be calculated as data is only available for the discrete points t ∈ {1, ..., n} and the integrand is unknown for other values t. The challenge of decomposition is therefore to find best approximations for

\[ J := \int_t^{t+1} P(\tau) \frac{\partial F(\tau)}{\partial \tau} \frac{1}{F(\tau)}\, d\tau, \tag{6} \]

where P and F are known at the discrete points t = 1, ..., n only. To relate to the above, ‘P’ stands for ‘$P_{ij}$’, ‘F’ for any of the functions ‘$A^k_{ij}$’, ‘$B^l_i$’ or ‘C’. Besides integrals, the unknown derivatives have to be approximated. Data availability requires all approximations to be based on the values at t = 1, ..., n. The simplest integral approximation is based on Riemann sums, i.e. on replacing the integral by the value of the integrand at the right- or left-hand boundary multiplied by the difference in t between two subsequent data points (which equals 1 and is thus dropped in the following):

\[ J \approx P(t+1) \frac{1}{F(t+1)} \frac{\partial F(\tau)}{\partial \tau}\Big|_{\tau=t+1} \quad \text{and} \quad J \approx P(t) \frac{1}{F(t)} \frac{\partial F(\tau)}{\partial \tau}\Big|_{\tau=t} \tag{7} \]

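A combination of the right/left-hand based approach is also common (the so-called trapezoid method):

\[ J \approx \frac{1}{2} \Big\{ P(t+1) \frac{1}{F(t+1)} \frac{\partial F(\tau)}{\partial \tau}\Big|_{\tau=t+1} + P(t) \frac{1}{F(t)} \frac{\partial F(\tau)}{\partial \tau}\Big|_{\tau=t} \Big\} \tag{8} \]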

In classical approximation with Riemann sums, the limit ∆t → 0 would then be taken and all these approaches are equivalent. Here, however, where data is available for discrete points only, this procedure is impossible. The results from the different approximations thus usually differ, and other approximation methods from numerical analysis could also be employed, leading to potentially different results.

Based on first-order Taylor approximation, the most common approach to approximate derivatives replaces them with the difference of the boundary values divided by the difference in t (equalling 1, cf. above). Approximating from the left/right, respectively, gives

\[ \frac{\partial F}{\partial \tau}\Big|_{\tau=t} \approx [F(t) - F(t-1)] \quad \text{resp.} \quad \approx [F(t+1) - F(t)]. \tag{9} \]

With the trapezoid method, i.e. approximating by a straight line connecting the data, the choice of a right-hand derivative approximation at the lower boundary and a left-hand one at the upper boundary may be motivated:

\[ J \approx \frac{1}{2} \Big\{ \frac{P(t+1)}{F(t+1)} + \frac{P(t)}{F(t)} \Big\} [F(t+1) - F(t)] \tag{10} \]
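The boundary and trapezoid approximations can be sketched in a few lines of Python. The function names are illustrative, and the sketch assumes that the derivative in eq. (7) is replaced by the one-sided difference of eq. (9) that stays inside the interval [t, t+1], as in eq. (10). Arguments ending in `0` are values at t, those ending in `1` are values at t+1.

```python
def j_left(p0, f0, f1):
    """Left boundary of eq. (7) with the right-hand difference of eq. (9)."""
    return p0 / f0 * (f1 - f0)


def j_right(p1, f0, f1):
    """Right boundary of eq. (7) with the left-hand difference of eq. (9)."""
    return p1 / f1 * (f1 - f0)


def j_trapezoid(p0, p1, f0, f1):
    """Eq. (10): mean of both boundary values times the difference in F."""
    return 0.5 * (p1 / f1 + p0 / f0) * (f1 - f0)


# Hypothetical data: P = A * C with A going from 2.0 to 2.5 and C from 1.0 to 1.2.
p0, p1 = 2.0 * 1.0, 2.5 * 1.2
left = j_left(p0, 2.0, 2.5)
right = j_right(p1, 2.0, 2.5)
trap = j_trapezoid(p0, p1, 2.0, 2.5)
```

For these invented numbers the three estimates of the contribution of A differ (roughly 0.5, 0.6 and 0.55), which illustrates that the choice of approximation matters once the limit ∆t → 0 is unavailable.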

For both integral and derivative approximation, restricted data availability makes it impossible to achieve increased accuracy by using higher-order Taylor expansion terms and subsequently decreasing the time differences to reduce the higher-order errors.

Employing $\frac{1}{x(t)} \frac{\partial x(t)}{\partial t} = \frac{\partial \ln x(t)}{\partial t}$ in eq. (4) leads to equivalent and frequently used alternative formulae. A preference for this alternative logarithmic form or for the original form regarding derivatives cannot be motivated without reference to the properties of the underlying functions. Before approximating, the two formulations are equivalent, while after approximation, they likely differ. I refer to the original form in the following, but the same discussion applies to the logarithmic form.
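A small Python sketch can illustrate how the two forms part ways after approximation. The discretisation chosen for the logarithmic form (trapezoid weights on P, a simple difference of ln F) is an assumption made here for illustration, not a formula given in the text.

```python
import math


def j_original(p0, p1, f0, f1):
    """Eq. (10): trapezoid rule applied to P * (dF/dt) / F."""
    return 0.5 * (p1 / f1 + p0 / f0) * (f1 - f0)


def j_logarithmic(p0, p1, f0, f1):
    """Same trapezoid idea applied to P * d(ln F)/dt (assumed discretisation)."""
    return 0.5 * (p1 + p0) * (math.log(f1) - math.log(f0))


# Single-factor case P = F, so the exact value of J is F(t+1) - F(t) = 1.
orig = j_original(1.0, 2.0, 1.0, 2.0)
logf = j_logarithmic(1.0, 2.0, 1.0, 2.0)
```

In this invented single-factor case the original form happens to reproduce the exact value, while the logarithmic form gives 1.5 · ln 2. Other underlying functions could favour the logarithmic form instead, which is why no general preference can be stated.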

3 Decomposition in the energy and pollution related literature

This literature usually does not discuss decomposition in the light of approximations. This has led to some problems and confusion. First, inspired by equations (7) and (8), the problem has come to be seen as one of choosing the right weights for the two known endpoints in approximating the different contributions. The following expression has thus been suggested for the integral, replicating an approximation to the general Divisia index (Boyd et al. 1988):

\[ \Big\{ \frac{P(t)}{F(t)} + \alpha \Big[ \frac{P(t+1)}{F(t+1)} - \frac{P(t)}{F(t)} \Big] \Big\} [F(t+1) - F(t)], \quad \text{with } \alpha \in [0, 1]. \tag{11} \]

Choosing α = 0 or α = 1 gives the left- or right-hand boundary approximation with the mixed approximation for the derivative (replicating the Laspeyres and Paasche indices), while α = 1/2 gives the trapezoid approximation (the Marshall-Edgeworth index). Any value of α can be seen as a certain choice of weights given to the left and right boundaries in the integral approximation. Such a choice lacks a sound basis in integral and derivative approximation theory, though, especially as it is mixed with unweighted approximations for the derivatives, although those are part of the integrand. These formulae replicate the additive techniques common in the literature, as presented in Ang (1995). Similar reasoning applies to the multiplicative approach presented there, which is the exponentiated form of the additive approach. Usually, decomposition is also discussed separately for different combinations of the variables of interest, although the formal basis is the same (Ang 1995).
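The weighting expression of eq. (11) is easy to state in code. The following Python sketch uses an illustrative function name `j_alpha` and invented data values.

```python
def j_alpha(p0, p1, f0, f1, alpha):
    """Eq. (11): weight alpha on the right boundary, 1 - alpha on the left."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    left = p0 / f0      # P(t) / F(t)
    right = p1 / f1     # P(t+1) / F(t+1)
    return (left + alpha * (right - left)) * (f1 - f0)


# Hypothetical values: P goes from 2.0 to 3.0, F from 2.0 to 2.5.
values = {alpha: j_alpha(2.0, 3.0, 2.0, 2.5, alpha) for alpha in (0.0, 0.5, 1.0)}
```

For α = 0, 1/2 and 1 the function returns the left-boundary, trapezoid and right-boundary values respectively. Every intermediate α is an interpolation between the two endpoints, which is exactly why the choice of α looks like a weighting problem rather than an approximation problem.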

Against this background of perceiving the search for an optimal decomposition as a problem of weights rather than of approximations, the development of more flexible, “optimal” weighting schemes based on some basic principles has been a natural path for improvements. The adaptive weighting method (Liu et al., 1992), for example, builds on the claim that the original and the logarithmic approach to decomposition are mathematically equivalent. This motivates setting equal the original (eq. (11)) and the corresponding logarithmic approximations for different choices of the variables P and F, which usually determines α uniquely. The mathematical equivalence claimed here, however, does not hold for these equations, where the derivatives have been replaced by approximations. This method thus lacks a sound theoretical underpinning.

Other flexible weighting approaches are the refined Divisia method of Ang and Choi (1997) or the logarithmic mean Divisia Index (Ang 2004), based on a specific choice of weights given by the so-called “log-mean”. Again, integral and derivative approximation provides no motivation to use these weights. A big advantage, as the authors claim, is that these weights solve the problem of zero-values and residuals (see the next section), and this is the main motivation for proposing this particular weighting scheme. A zero residual is also the main advantage claimed by Boyd and Roop (2005) for their Fisher-index based approach, which likewise forces the residual to be zero by definition rather than by reference to exact approximation.

4 Zero-values and the residual

Zero-values and the residual are seen as two main problems of decomposition in the energy and pollution related literature and also in the context of general price and quantity indices.

The problem of zeroes arises when multiplying the terms in eq. (3) by $P_{ij}/P_{ij}$, where $P_{ij} = \prod_k A^k_{ij} \prod_l B^l_i C$. Thus, if any of the factors $A^k_{ij}$, $B^l_i$ or $C$ equals zero, $P_{ij}$ equals zero and this operation leads to division by zero (or, for the logarithmic form, taking logarithms of zero). The total result nevertheless remains well-defined, due to the corresponding factor $P_{ij}$ in the numerator. Zeroes occur, for example, for different fuel types (cf. eq. (1)) when some sectors phase out or have not yet started to use certain fuels. To deal with this situation in a mathematically correct way, it is necessary to drop from $P_{ij}$ the factors F that are zero and thus not to divide by those either. This does not result in any information loss, as the factor F is still present in the derivative. The integral then reads, cf. (6),

\[ \int_t^{t+1} P_{\not F}(\tau) \frac{\partial F(\tau)}{\partial \tau}\, d\tau, \tag{12} \]

where $P_{\not F}$ stands for P with any factor F = 0 left out, i.e. for the product $\prod_{\hat k} A^{\hat k}_{ij} \prod_{\hat l} B^{\hat l}_i \hat C$, where the “ˆ” indicates that any factor that is zero is dropped. The logarithmic form reads (the term in square brackets is always well defined):

\[ \int_t^{t+1} P_{\not F}(\tau) \Big[ F(\tau) \frac{\partial \ln F(\tau)}{\partial \tau} \Big] d\tau. \tag{13} \]

Another problem in decomposition is the residual: large residuals are seen as a sign of a failed decomposition. The residual is the difference between the exact left-hand side in eq. (5) and the approximated integrals on the right-hand side after employing formulae such as (7) to (10). As each of these summands is an approximation, each contributes to the overall residual. These partial residuals can have positive or negative signs and it is unlikely that they sum to a zero overall residual. Forcing the overall residual to be zero may even impose larger than necessary residuals on the single terms: to cancel out in the end, the single-term residuals have partly to be of opposite sign. This may even increase the over- or underestimation of the relative importance of some of the single effects captured in the different terms. Clearly, results with large overall residuals lack explanatory power. Trying to achieve zero residuals is no viable path, though, as the approximations that lie at the basis of decomposition necessarily lead to some non-zero residual. Most terms may even have small residuals while one term exhibits a larger one, thus leading to a relatively large overall residual while nevertheless estimating most of the partial effects quite exactly. To solve this problem, means to assess the quality of the approximations in the single parts of the decomposition are needed, rather than means to tackle and minimize the overall residual.

Negative values, as they occur in some pollution emission data sets, cause a problem only for certain decomposition methods, where the weights involve operations not defined for negative values. This is the case for the logarithmic mean Divisia Index, for example (Ang and Liu, in press b). As with zero values, the only correct approach is to avoid the problem right from the beginning, referring to the original integral approximation and refraining from the application of ill-defined methods.
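The zero-safe form of eq. (12) can be sketched in Python as follows. The sketch rests on one reading of the text: in the term for a factor F, the weight is the product of all other factors, formed by multiplication so that nothing is ever divided by F. The trapezoid rule is chosen only for illustration, and the function names and the three-factor data are invented.

```python
from math import prod


def product_without(factors, skip):
    """Product of all factors except index `skip`; nothing is divided."""
    return prod(value for m, value in enumerate(factors) if m != skip)


def zero_safe_terms(factors_t0, factors_t1):
    """Trapezoid approximation of eq. (12) for each factor F in turn."""
    terms = []
    for m, (f0, f1) in enumerate(zip(factors_t0, factors_t1)):
        weight = 0.5 * (product_without(factors_t0, m)
                        + product_without(factors_t1, m))
        terms.append(weight * (f1 - f0))
    return terms


# Hypothetical (A, B, C) values; A = 0 at time t, e.g. a fuel not yet in use.
start = (0.0, 1.0, 3.0)
end = (2.0, 2.0, 4.0)

terms = zero_safe_terms(start, end)
residual = (prod(end) - prod(start)) - sum(terms)
```

For these invented numbers every term is well defined although A starts at zero, and the three contributions are 11, 4 and 2 against an exact change of 16, so the overall residual is -1. The residual stems from the trapezoid approximation of each term, not from the zero value.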

5 Conclusion

By recalling that decomposition methods rest on approximation, the problem of zeroes has been solved and the role of the residual has been clarified. This also indicates directions for improvement. Optimal approximation should use all available information on the underlying functions. This might be more than is given by the data points alone. Information on seasonal patterns or macroeconomic shocks may suggest or rule out certain shapes for the functions and their derivatives to be approximated in each interval. A general observation, for example, is that the quantities in the integrands are one-period-back average or sum values. Referring to energy use, the data point E(2005) is the total energy used in the whole year 2005. Similarly, an intensity I(2005) is the average intensity for the same period. The unknown data E(τ) and I(τ) thus refer to the sum or average for [τ − 1, τ]. This special structure makes the integrand somewhat smoother than the underlying function, which might suggest that a certain method is especially adequate for the approximation in the case at hand.

Any improvement of index numbers and decompositions should be based only on properties of approximations of the relevant integrals and derivatives. There is no motivation to address the problem as a weighting problem. Nor is there a motivation to use a zero overall residual as a criterion to discern the optimal decomposition, as the approximation is undertaken in the single terms. The problem with zero-values, finally, should not be addressed by infinitesimal quantities and an assessment of the convergence properties of the different methods when taking limits (Ang and Liu, in press a). It should be avoided from the beginning by avoiding any division by zero or taking of logarithms of zero. The same applies to negative values.

References

Ang, B.W. (1995). “Decomposition methodology in industrial energy demand analysis.” Energy 20(11): 1081-1095.
Ang, B.W. (2004). “Decomposition analysis for policymaking in energy: which is the preferred method?” Energy Policy 32: 1131-1139.
Ang, B.W., K.-H. Choi. (1997). “Decomposition of aggregate energy and gas emission intensities for industry: a refined Divisia approach.” The Energy Journal 18(3): 59-73.
Ang, B.W., N. Liu. (in press a). “Handling zero values in the logarithmic mean Divisia index decomposition approach.” Energy Policy: in press.
Ang, B.W., N. Liu. (in press b). “Negative value problems of the logarithmic mean Divisia index decomposition approach.” Energy Policy: in press.
Boyd, G.A., D.A. Hanson, T. Sterner. (1988). “Decomposition of Changes in Energy Intensity - A Comparison of the Divisia Index and Other Methods.” Energy Economics 10(4): 309-312.
Boyd, G.A., J.M. Roop. (2005). “A note on Fisher ideal index decomposition for structural change in energy intensity.” The Energy Journal 25(1): 87-101.
Liu, X.Q., B.W. Ang, H.L. Ong. (1992). “The application of the Divisia index to the decomposition of changes in industrial energy consumption.” The Energy Journal 13(4): 161-177.
Trivedi, P.K. (1981). “Some discrete approximations to Divisia integral indices.” International Economic Review 22(1): 71-77.
