Bounds on the Estimation Error in the Chain Ladder Method

Mario V. Wüthrich∗, Michael Merz§, Hans Bühlmann∗

May 14, 2007

Abstract

Buchwalder et al. [2] have illustrated that there are different approaches to deriving an estimate of the estimation error in the chain ladder reserving method. In this paper we demonstrate that, for typical parameters, these approaches give estimates that are close to each other. This is done by proving upper and lower bounds.

1 Introduction

In this paper we consider the chain ladder (CL) method for claims reserving in non-life insurance. We assume that the CL model assumptions (Model Assumptions 1.1 below) are satisfied. Then we analyze the uncertainties in the CL predictions. There are two sources of uncertainty: i) the process error, which derives from the fact that we consider stochastic time series; ii) the estimation error, which comes from the fact that we have to estimate the true model parameters. Recent discussions (ASTIN Bulletin 36/2, 2006) have shown that it is very important to clearly define what is meant by the estimation error. Basically, it derives from the fact that the true parameters of the underlying model are not known and need to be estimated from the data in the upper left triangle (observations).

∗ ETH Zurich, Department of Mathematics, CH-8092 Zurich, Switzerland
§ University of Tübingen, Faculty of Economics, D-72074 Tübingen, Germany


The main goal of the reserving actuary is then to quantify the appropriateness of these estimates. There are several stochastic models that support the CL method (for an overview of the models we refer to England-Verrall [3] and Taylor [9]). In the present work we concentrate on the distribution-free CL model (see Mack [6]) and its corresponding time series version (see Murphy [8], Barnett-Zehnwirth [1] and Buchwalder et al. [2]). Even having fixed the CL model, there are still different methodologies for deriving an estimate of the estimation error. These different approaches have led to an extensive discussion about the understanding of the estimation error (see Buchwalder et al. [2], Mack et al. [7], Gisler [5] and Venter [10]), and there is no final answer as to which methodology should be used. The aim of the present paper is to show that on average all the different methodologies are close to each other (for typical non-life insurance data sets). This means that for practical applications one does not really need to worry about the methodology used. From this point of view, the present paper does not give new techniques that can immediately be applied in practice; rather, it provides the background information that the different approaches lead to similar estimates for the estimation error. The main statements contain lower and upper bounds on estimation errors (see Theorems 3.2 and 3.3). Moreover, we show that the different methodologies lie on average within these bounds, and we also prove a conjecture by Mack-Quarg-Braun [7] (see Theorem 3.3) that had been proved for the special case of two factors by the same authors (see [7], Section 6).

1.1 Time series version of the CL method

We define the time series version of the CL model, which uses stronger assumptions than the classical distribution-free CL model considered in Mack [6]. The time series version of the CL model has already been studied in various papers, see e.g. Murphy [8], Barnett-Zehnwirth [1] or Buchwalder et al. [2]. We denote by $C_{i,j}$ the cumulative claims for accident year $i \in \{0, \dots, I\}$ in development period $j \in \{0, \dots, J\}$.


Usually, we have observations
$$D = \{C_{i,j};\; i + j \le I\},\qquad(1.1)$$
and we want to predict $C_{i,j}$ for $i + j > I$. We define for $j \le J$
$$B_j = \{C_{i,l};\; l \le j\},\qquad(1.2)$$
these are the cumulative claims up to development period $j$.

Model Assumptions 1.1 The cumulative claims $C_{i,j}$ satisfy the following assumptions:

• Cumulative claims $C_{i,j}$ of different accident years $i \in \{0, \dots, I\}$ are independent.

• There exist constants $f_j > 0$, $\sigma_j > 0$ and random variables $\varepsilon_{i,j}$ such that for all $i \in \{0, \dots, I\}$ and $j \in \{1, \dots, J\}$ we have
$$C_{i,j} = f_{j-1}\cdot C_{i,j-1} + \sigma_{j-1}\cdot\sqrt{C_{i,j-1}}\cdot\varepsilon_{i,j},\qquad(1.3)$$
where conditionally, given $B_0$, the $\varepsilon_{i,j}$ are independent with $E[\varepsilon_{i,j}\,|\,B_0] = 0$, $E[\varepsilon_{i,j}^2\,|\,B_0] = 1$ and $P[C_{i,j} > 0\,|\,B_0] = 1$ for all $i \in \{0, \dots, I\}$ and $j \in \{1, \dots, J\}$.

Remarks 1.2

• The $f_j$ are called CL factors, link ratios or age-to-age factors. They are the objects of central interest in the CL reserving method.

• Observe that $\varepsilon_{i,j}$ is defined conditionally, given $B_0$, in order to ensure that our cumulative claims $C_{i,j}$ stay positive, $P[\,\cdot\,|\,B_0]$-a.s. This may at first sight look slightly artificial. However, one needs such an assumption in order to obtain a mathematically correct model (see the criticism in Mack et al. [7]).

• In Lemma 2.1 below it is shown that Model Assumptions 1.1 imply the assumptions of the distribution-free CL model of Mack [6].

Organisation of this paper. In Section 2 we provide the CL estimate for the ultimate claim. In Section 3 we define the mean square error of prediction (MSEP) and we give upper and lower bounds for the MSEP in the conditional approach. In Section 4 we compare the bounds to the existing results, provide a numerical example and give the conclusions. Finally, in Appendix A we prove our results.

2 Model properties and claims reserving

Model 1.1 satisfies the assumptions of the distribution-free CL model (Mack [6]):

Lemma 2.1 Under Model Assumptions 1.1, the processes $(C_{i,j})_{j\ge 0}$ are Markov chains with
$$E[C_{i,j}\,|\,B_{j-1}] = E[C_{i,j}\,|\,C_{i,j-1}] = f_{j-1}\cdot C_{i,j-1},\qquad(2.1)$$
$$\mathrm{Var}(C_{i,j}\,|\,B_{j-1}) = \mathrm{Var}(C_{i,j}\,|\,C_{i,j-1}) = \sigma_{j-1}^2\cdot C_{i,j-1}.\qquad(2.2)$$

This easily gives the following conditionally expected ultimate claim $C_{i,J}$, given the information $D$ (see Theorem 1 in Mack [6]):

Lemma 2.2 Under Model Assumptions 1.1 we have
$$E[C_{i,J}\,|\,D] = E[C_{i,J}\,|\,B_{I-i}] = E[C_{i,J}\,|\,C_{i,I-i}] = C_{i,I-i}\cdot\prod_{j=I-i}^{J-1} f_j.\qquad(2.3)$$

Hence, for known CL factors $f_j$ we can easily calculate the conditionally expected ultimate claim $C_{i,J}$, given $D$. However, in almost all practical examples the CL factors $f_j$ are not known and need to be estimated from the information $D$. We therefore consider the individual development factors (random variables)
$$F_{i,j+1} = \frac{C_{i,j+1}}{C_{i,j}}.\qquad(2.4)$$
Observe that $F_{i,j+1}$ is, conditionally given $B_j$, an unbiased estimator for $f_j$ with
$$E[F_{i,j+1}\,|\,B_j] = E[F_{i,j+1}\,|\,C_{i,j}] = f_j \quad\text{and}\quad \mathrm{Var}(F_{i,j+1}\,|\,B_j) = \sigma_j^2 / C_{i,j}.\qquad(2.5)$$

This implies that the classical estimator for the CL factor,
$$\hat f_j = \sum_{i=0}^{I-j-1} \frac{C_{i,j}}{\sum_{i=0}^{I-j-1} C_{i,j}}\; F_{i,j+1} = \frac{\sum_{i=0}^{I-j-1} C_{i,j+1}}{\sum_{i=0}^{I-j-1} C_{i,j}},\qquad(2.6)$$

is, conditionally given $B_k$ ($k \le j$), an unbiased estimator for $f_j$. Moreover, it has minimal conditional variance among all unbiased linear combinations of the $F_{i,j+1}$, $i \le I-j-1$ (see e.g. Proposition 12.1 in Taylor [9]). It is then straightforward to estimate $E[C_{i,J}\,|\,D]$, given the information $D$, by
$$\hat C_{i,J} = C_{i,I-i}\cdot\prod_{j=I-i}^{J-1} \hat f_j.\qquad(2.7)$$
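The estimators (2.6) and (2.7) are straightforward to compute from an observed claims triangle. The following minimal sketch uses a small hypothetical triangle (the numbers are illustrative, not from the paper) and computes the $\hat f_j$ as ratios of column sums together with the resulting ultimate claim estimates $\hat C_{i,J}$:

```python
import numpy as np

# Hypothetical cumulative claims triangle C[i, j] (accident year i, development
# period j); NaN marks the unobserved part i + j > I.  Here I = J = 3.
C = np.array([
    [1000., 1800., 2100., 2200.],
    [1100., 2000., 2300., np.nan],
    [ 900., 1700., np.nan, np.nan],
    [1200., np.nan, np.nan, np.nan],
])
I = C.shape[0] - 1

# CL factor estimates (2.6): ratios of column sums over the accident years
# i = 0, ..., I - j - 1 that are observed in both periods j and j + 1.
f_hat = [C[:I - j, j + 1].sum() / C[:I - j, j].sum() for j in range(I)]

# Ultimate claim estimates (2.7): last observed diagonal entry C_{i,I-i}
# times the product of the remaining estimated CL factors.
C_hat = np.array([C[i, I - i] * np.prod(f_hat[I - i:]) for i in range(I + 1)])
print(np.round(f_hat, 4), np.round(C_hat, 1))
```

For accident year $i = 0$ the product in (2.7) is empty, so $\hat C_{0,J}$ is simply the fully developed observation $C_{0,J}$.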

Next, we would like to state that the estimator in (2.7) is an unbiased estimator for $E[C_{i,J}\,|\,D]$, given the information $B_{I-i}$. For this we need that the estimators $\hat f_j$ are conditionally uncorrelated (see Theorem 2 in Mack [6]):

Lemma 2.3 Choose $k \le j < l \le J$. Then under Model Assumptions 1.1 we have
$$E\big[\hat f_j \cdot \hat f_l\,\big|\,B_k\big] = f_j\cdot f_l = E\big[\hat f_j\,\big|\,B_k\big]\cdot E\big[\hat f_l\,\big|\,B_k\big].\qquad(2.8)$$

Of course, Lemma 2.3 also implies that the $\hat f_j$ are unconditionally uncorrelated. We have the following corollary:

Corollary 2.4 Under Model Assumptions 1.1 we have: conditionally, given $B_{I-i}$, $\hat C_{i,J}$ is an unbiased estimator for $E[C_{i,J}\,|\,D]$, i.e.
$$E\big[\hat C_{i,J}\,\big|\,B_{I-i}\big] = C_{i,I-i}\cdot\prod_{j=I-i}^{J-1} f_j = E[C_{i,J}\,|\,D].\qquad(2.9)$$

Proofs of Section 2. The proof of Lemma 2.1 is straightforward. The proofs of Lemmas 2.2 and 2.3 and of Corollary 2.4 can be found in Mack [6]. □

3 Mean square error of prediction

Corollary 2.4 tells us that we should estimate the conditional expectation of $C_{i,J}$, given $D$, by $\hat C_{i,J}$. This estimator $\hat C_{i,J}$ is also used to predict the random variable $C_{i,J}$ at time $I$. That is, the estimator $\hat C_{i,J}$ is used as a predictor for $C_{i,J}$, given the information $D$. Our goal is to study the error terms of this prediction. We define the (conditional) mean square error of prediction (MSEP) as follows:
$$\mathrm{MSEP}_{C_{i,J}|D}\big(\hat C_{i,J}\big) = E\Big[\big(\hat C_{i,J} - C_{i,J}\big)^2\,\Big|\,D\Big].\qquad(3.1)$$
Since $\hat C_{i,J}$ is $D$-measurable, the cross term vanishes and this conditional MSEP decomposes as follows:
$$\mathrm{MSEP}_{C_{i,J}|D}\big(\hat C_{i,J}\big) = \underbrace{E\Big[\big(C_{i,J} - E[C_{i,J}\,|\,D]\big)^2\,\Big|\,D\Big]}_{\text{process variance}} + \underbrace{\big(\hat C_{i,J} - E[C_{i,J}\,|\,D]\big)^2}_{\text{estimation error}}.\qquad(3.2)$$

The conditional process variance $\mathrm{Var}(C_{i,J}\,|\,D) = E\big[(C_{i,J} - E[C_{i,J}\,|\,D])^2\,\big|\,D\big]$ originates from the stochastic movement of the process $C_{i,J}$. It can be calculated explicitly (see Mack [6], page 218, and Buchwalder et al. [2], formula (4.6)). Hence we concentrate on the estimation error (the second term in (3.2)). The estimation error comes from the fact that we have estimated the true CL factors $f_j$ by $\hat f_j$, given the information $D$; hence
$$\big(\hat C_{i,J} - E[C_{i,J}\,|\,D]\big)^2 = C_{i,I-i}^2\cdot\Bigg(\prod_{j=I-i}^{J-1}\hat f_j - \prod_{j=I-i}^{J-1} f_j\Bigg)^2.\qquad(3.3)$$

In order to determine the estimation error we would like to calculate the right-hand side of (3.3). Since the true CL factors $f_j$ are not known, this term cannot be calculated explicitly. Therefore one tries to estimate the fluctuations of the estimators $\hat f_j$ around $f_j$. There are different methodologies to determine these fluctuations. In general, one resamples the time series and then evaluates the volatility of these resampled values. Since we consider dependent time series, there are different ways to resample observations, conditional and unconditional ones (see e.g. Approaches 1-3 in Buchwalder et al. [2]). In the present paper we study the fluctuations in the unconditional version (which corresponds to Approach 1 in Buchwalder et al. [2]): we calculate the average estimation error, given by (which is then used as an estimate for (3.3))
$$C_{i,I-i}^2\cdot E\Bigg[\Bigg(\prod_{j=I-i}^{J-1}\hat f_j - \prod_{j=I-i}^{J-1} f_j\Bigg)^2\,\Bigg|\,B_{I-i}\Bigg].\qquad(3.4)$$
Using the conditional uncorrelatedness (see Lemma 2.3) and unbiasedness (see (2.6)) of the $\hat f_j$, we find that
$$E\Bigg[\Bigg(\prod_{j=I-i}^{J-1}\hat f_j - \prod_{j=I-i}^{J-1} f_j\Bigg)^2\,\Bigg|\,B_{I-i}\Bigg] = \mathrm{Var}\Bigg(\prod_{j=I-i}^{J-1}\hat f_j\,\Bigg|\,B_{I-i}\Bigg) = E\Bigg[\prod_{j=I-i}^{J-1}\hat f_j^2\,\Bigg|\,B_{I-i}\Bigg] - \prod_{j=I-i}^{J-1} f_j^2.\qquad(3.5)$$
Hence the main difficulty is to calculate the first term on the right-hand side of (3.5), in order to obtain an estimate for (3.3). Unfortunately, this cannot be done explicitly; hence we provide lower and upper bounds in the next subsections.
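The unconditional resampling idea can be illustrated by a small Monte Carlo sketch: holding the first column $B_0$ fixed, one repeatedly regenerates the triangle from the time series (1.3) (here with Gaussian $\varepsilon_{i,j}$, which is an additional assumption beyond Model Assumptions 1.1), recomputes $\prod_j \hat f_j$ on each resampled triangle, and takes the empirical variance as an estimate of the left-hand side of (3.5). All parameter values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (not from the paper): first-column claims C_{i,0},
# CL factors f_j and standard deviation parameters sigma_j, for I = J = 3.
C0    = np.array([1000., 1100., 900., 1200.])
f     = np.array([1.8, 1.15, 1.05])
sigma = np.array([5.0, 3.0, 1.0])
I = len(C0) - 1

def simulate_products(n_sim=10000):
    """Resample full triangles from (1.3) with Gaussian eps, holding the
    first column fixed, and return the resampled values of prod_j f_hat_j
    (the product over all j = 0, ..., J-1, as needed for accident year I)."""
    prods = np.empty(n_sim)
    for s in range(n_sim):
        C = np.empty((I + 1, I + 1))
        C[:, 0] = C0
        for j in range(1, I + 1):
            eps = rng.standard_normal(I + 1)
            C[:, j] = f[j - 1] * C[:, j - 1] + sigma[j - 1] * np.sqrt(C[:, j - 1]) * eps
        prod = 1.0
        for j in range(I):
            rows = slice(0, I - j)   # only the rows that lie in the triangle
            prod *= C[rows, j + 1].sum() / C[rows, j].sum()
        prods[s] = prod
    return prods

prods = simulate_products()
# By Lemma 2.3 the mean of prod_j f_hat_j equals prod_j f_j; the empirical
# variance estimates the left-hand side of (3.5) for i = I.
print(prods.mean(), np.prod(f), prods.var())
```

With the chosen $\sigma_j$ the diffusion term in (1.3) is small relative to $f_{j-1} C_{i,j-1}$, so the simulated claims stay positive for all practical purposes, in the spirit of $P[C_{i,j} > 0\,|\,B_0] = 1$.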

3.1 Bounds for the estimation error (single accident years)

Our goal in this subsection is to approximate the first term on the right-hand side of (3.5), i.e.
$$E\Bigg[\prod_{j=I-i}^{J-1}\hat f_j^2\,\Bigg|\,B_{I-i}\Bigg].\qquad(3.6)$$

In Lemma 2.3 we have proved that the estimates of the CL factors are conditionally uncorrelated; however, these estimates are not independent, as the following corollary shows. Mack et al. [7], Section 5, provide the proof of the following statement.

Corollary 3.1 (Negative correlations) Under Model Assumptions 1.1 we have
$$\mathrm{Cov}\big(\hat f_{j-1}^2,\,\hat f_j^2\,\big|\,B_{j-1}\big) < 0.\qquad(3.7)$$

Proof. Applying Jensen's inequality yields the claim; for more details we refer to [7]. □

Hence successive squared estimators $\hat f_j^2$ are negatively correlated, which implies that the $\hat f_j$, $j \ge 0$, are not independent. Therefore the product in (3.6) cannot be calculated explicitly. Our goal is to give upper and lower bounds for (3.6) and (3.5), respectively. We define for $j \le J$ and $k \le I$
$$S_j^{[k]} = \sum_{i=0}^{k} C_{i,j}.\qquad(3.8)$$

Hence, we can rewrite $\hat f_j$ as follows:
$$\hat f_j = \frac{S_{j+1}^{[I-j-1]}}{S_j^{[I-j-1]}}.\qquad(3.9)$$

Theorem 3.2 (Lower bound) Under Model Assumptions 1.1 we have
$$C_{i,I-i}^2\cdot E\Bigg[\Bigg(\prod_{j=I-i}^{J-1}\hat f_j - \prod_{j=I-i}^{J-1} f_j\Bigg)^2\,\Bigg|\,B_{I-i}\Bigg] \;\ge\; C_{i,I-i}^2\cdot\prod_{j=I-i}^{J-1} f_j^2\cdot\sum_{j=I-i}^{J-1}\frac{\sigma_j^2/f_j^2}{E\big[S_j^{[I-1-j]}\,\big|\,B_{I-i}\big]}.\qquad(3.10)$$

Theorem 3.2 gives a lower bound on (3.3) in closed form. For the upper bound we obtain the result which was conjectured by Mack et al. [7] and proved by the same authors in the special case $(J-1)-(I-i) = 1$, i.e. in the case of two development factors (see Section 6 of [7]).
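When the lower bound (3.10) is evaluated in practice, the unknown quantities are replaced by estimates: $f_j$ by $\hat f_j$, $\sigma_j^2$ by an estimate $\hat\sigma_j^2$, and $E[S_j^{[I-1-j]}\,|\,B_{I-i}]$ by the observed sum $S_j^{[I-1-j]}$ (which lies entirely in the upper triangle). The resulting plug-in expression has the same form as Mack's [6] well-known estimate of the estimation error. A minimal sketch with hypothetical numbers:

```python
import numpy as np

def lower_bound(C_diag, f_hat, sigma2_hat, S):
    """Plug-in evaluation of the lower bound (3.10) for one accident year i.
    C_diag     : C_{i,I-i}, the last observed diagonal value
    f_hat      : estimated CL factors f_hat_j,   j = I-i, ..., J-1
    sigma2_hat : estimated variance parameters,  same index range
    S          : observed sums S_j^{[I-1-j]},    same index range
    (All inputs are hypothetical plug-in estimates, not known parameters.)"""
    f_hat = np.asarray(f_hat)
    sigma2_hat = np.asarray(sigma2_hat)
    S = np.asarray(S)
    return C_diag**2 * np.prod(f_hat**2) * np.sum(sigma2_hat / (f_hat**2 * S))

# Illustrative numbers: two remaining development factors.
print(lower_bound(2000., [1.8, 1.1], [25., 4.], [3000., 5200.]))
```

The sum term shows the typical behaviour of the estimation error: it shrinks as the observed volumes $S_j^{[I-1-j]}$ grow, since larger portfolios pin down the $f_j$ more precisely.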

Theorem 3.3 (Upper bound) Under Model Assumptions 1.1 we have
$$C_{i,I-i}^2\cdot E\Bigg[\Bigg(\prod_{j=I-i}^{J-1}\hat f_j - \prod_{j=I-i}^{J-1} f_j\Bigg)^2\,\Bigg|\,B_{I-i}\Bigg] \;\le\; C_{i,I-i}^2\cdot\Bigg(E\Bigg[\prod_{j=I-i}^{J-1}\Bigg(\frac{\sigma_j^2}{S_j^{[I-1-j]}} + f_j^2\Bigg)\,\Bigg|\,B_{I-i}\Bigg] - \prod_{j=I-i}^{J-1} f_j^2\Bigg).\qquad(3.11)$$

Observe that, in general, this upper bound cannot be calculated explicitly. For a numerical value one needs to apply simulation or bootstrap techniques. The proofs of Theorems 3.2 and 3.3 are provided in Appendix A.
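As the remark notes, the expectation in (3.11) can be approximated by simulation. The sketch below (hypothetical parameters, Gaussian $\varepsilon$, which goes beyond Model Assumptions 1.1) regenerates the columns $j > I-i$ from the last column contained in $B_{I-i}$ via (1.3) and averages the product:

```python
import numpy as np

rng = np.random.default_rng(1)

def expectation_mc(C_col, f, sigma, j0, I, n_sim=10000):
    """Monte Carlo estimate of E[ prod_{j=j0}^{J-1} (sigma_j^2 / S_j^{[I-1-j]}
    + f_j^2) | B_{j0} ] with j0 = I - i.  C_col holds the column-j0 values
    C_{0,j0}, ..., C_{I-j0,j0}; columns j > j0 are simulated from (1.3) with
    Gaussian eps.  f and sigma are indexed j = 0, ..., J-1 with J = len(f)."""
    J = len(f)
    vals = np.empty(n_sim)
    for s in range(n_sim):
        col = np.asarray(C_col, dtype=float)
        # S_{j0}^{[I-1-j0]} sums the first I - j0 entries of column j0.
        prod = sigma[j0]**2 / col[:I - j0].sum() + f[j0]**2
        for j in range(j0 + 1, J):
            eps = rng.standard_normal(col.size)
            col = f[j - 1] * col + sigma[j - 1] * np.sqrt(col) * eps
            prod *= sigma[j]**2 / col[:I - j].sum() + f[j]**2
        vals[s] = prod
    return vals.mean()

# Hypothetical example: I = J = 3, accident year i = 2, so j0 = I - i = 1.
est = expectation_mc([1800., 2000., 1700.], [1.8, 1.15, 1.05], [5., 3., 1.], j0=1, I=3)
# Plugging est into (3.11): upper bound = C_{2,1}^2 * (est - (1.15 * 1.05)**2).
print(est)
```

In applications the unknown $f_j$ and $\sigma_j^2$ would themselves be replaced by estimates before simulating; here they are simply given as assumptions of the sketch.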

3.2 Aggregation of accident years

If we aggregate different accident years, we need to study the conditional MSEP of the estimator
$$\sum_{i=0}^{I}\hat C_{i,J},\qquad(3.12)$$
with $\hat C_{i,J} = C_{i,J}$ for $i + J \le I$. With (4.30)-(4.34) in Buchwalder et al. [2] we find that
$$\mathrm{MSEP}_{\sum_{i=0}^{I} C_{i,J}|D}\Bigg(\sum_{i=0}^{I}\hat C_{i,J}\Bigg) = \sum_{i=I-J+1}^{I}\mathrm{MSEP}_{C_{i,J}|D}\big(\hat C_{i,J}\big)\qquad(3.13)$$
$$+\; 2\cdot\sum_{I-J+1\le k<i\le I}\big(\hat C_{i,J} - E[C_{i,J}\,|\,D]\big)\cdot\big(\hat C_{k,J} - E[C_{k,J}\,|\,D]\big).$$

[...]

$$\frac{\partial}{\partial\varepsilon_{m,I-i+1}}\, C_{m,I-i+1} = \sigma_{I-i}\cdot\sqrt{C_{m,I-i}} > 0,\qquad(A.19)$$

and for $k \ge 2$ we have
$$\frac{\partial}{\partial\varepsilon_{m,I-i+1}}\, C_{m,I-i+k} = \prod_{l=2}^{k} C'_{m,I-i+l}(C_{m,I-i+l-1})\cdot\sigma_{I-i}\cdot\sqrt{C_{m,I-i}}.\qquad(A.20)$$
Moreover, we have for $l \ge 2$ that
$$C'_{m,I-i+l}(x) = \frac{\partial}{\partial x}\Big(f_{I-i+l-1}\cdot x + \sigma_{I-i+l-1}\cdot\sqrt{x}\cdot\varepsilon_{m,I-i+l}\Big) = f_{I-i+l-1} + \sigma_{I-i+l-1}\cdot\frac{1}{2\sqrt{x}}\cdot\varepsilon_{m,I-i+l}$$
$$= \frac{1}{2x}\cdot\Big(f_{I-i+l-1}\cdot x + \sigma_{I-i+l-1}\cdot\sqrt{x}\cdot\varepsilon_{m,I-i+l}\Big) + \frac{f_{I-i+l-1}}{2}.\qquad(A.21)$$
Hence, this implies for $l \ge 2$ that
$$C'_{m,I-i+l}(C_{m,I-i+l-1}) = \frac{1}{2C_{m,I-i+l-1}}\cdot\Big(f_{I-i+l-1}\cdot C_{m,I-i+l-1} + \sigma_{I-i+l-1}\cdot\sqrt{C_{m,I-i+l-1}}\cdot\varepsilon_{m,I-i+l}\Big) + \frac{f_{I-i+l-1}}{2}$$
$$= \frac{C_{m,I-i+l}}{2C_{m,I-i+l-1}} + \frac{f_{I-i+l-1}}{2}.\qquad(A.22)$$

Under Model Assumptions 1.1 we know that $C_{m,I-i+l-1}$ and $C_{m,I-i+l}$ are positive, $P[\,\cdot\,|\,B_0]$-a.s., and that $f_{I-i+l-1} > 0$. This immediately implies that
$$C'_{m,I-i+l}(C_{m,I-i+l-1}) > 0,\qquad P[\,\cdot\,|\,B_0]\text{-a.s.}\qquad(A.23)$$

This proves that $C_{m,I-i+k}$ is a strictly increasing function of $\varepsilon_{m,I-i+1}$ for $k \ge 1$. Consider now for $j \in \{I-i+1, \dots, J-1\}$
$$S_j^{[I-1-j]} = \sum_{n=0}^{I-1-j} C_{n,j}.\qquad(A.24)$$
Hence for $m \le I-1-j$ we have that $S_j^{[I-1-j]}$ is a strictly increasing function of $\varepsilon_{m,I-i+1}$; moreover, $S_j^{[I-1-j]}$ is independent of $\varepsilon_{m',I-i+1}$ if $m' > I-1-j$. But this immediately shows that
$$\prod_{j=I-i+1}^{J-1}\Bigg(\frac{\sigma_j^2}{S_j^{[I-1-j]}} + f_j^2\Bigg)\qquad(A.25)$$
is a decreasing function of $\varepsilon_{m,I-i+1}$ for $m \in \{0, \dots, I\}$. Hence the function $-g(\varepsilon)$ defined in (A.16) is a coordinatewise increasing function. This implies that we can apply the FKG inequality to the right-hand side of (A.14), which gives
$$\frac{1}{\big(S_{I-i}^{[i-1]}\big)^2}\cdot E\Bigg[\big(S_{I-i+1}^{[i-1]}\big)^2\cdot E\Bigg[\prod_{j=I-i+1}^{J-1}\Bigg(\frac{\sigma_j^2}{S_j^{[I-1-j]}} + f_j^2\Bigg)\,\Bigg|\,B_{I-i+1}\Bigg]\,\Bigg|\,B_{I-i}\Bigg]\qquad(A.26)$$
$$\le \frac{1}{\big(S_{I-i}^{[i-1]}\big)^2}\cdot E\Big[\big(S_{I-i+1}^{[i-1]}\big)^2\,\Big|\,B_{I-i}\Big]\cdot E\Bigg[\prod_{j=I-i+1}^{J-1}\Bigg(\frac{\sigma_j^2}{S_j^{[I-1-j]}} + f_j^2\Bigg)\,\Bigg|\,B_{I-i}\Bigg]$$
$$= \Bigg(\frac{\sigma_{I-i}^2}{S_{I-i}^{[i-1]}} + f_{I-i}^2\Bigg)\cdot E\Bigg[\prod_{j=I-i+1}^{J-1}\Bigg(\frac{\sigma_j^2}{S_j^{[I-1-j]}} + f_j^2\Bigg)\,\Bigg|\,B_{I-i}\Bigg].$$

This finishes the proof of Theorem 3.3. □

A.3 Proof of the covariance statement

Proof of Theorem 3.4. Choose $k < i$. Then we have that
$$\mathrm{Cov}\Bigg(\prod_{j=I-i}^{J-1}\hat f_j,\;\prod_{j=I-k}^{J-1}\hat f_j\,\Bigg|\,B_{I-i}\Bigg) = E\Bigg[\prod_{j=I-i}^{I-k-1}\hat f_j\cdot\prod_{j=I-k}^{J-1}\hat f_j^2\,\Bigg|\,B_{I-i}\Bigg] - \prod_{j=I-i}^{J-1} f_j\cdot\prod_{j=I-k}^{J-1} f_j$$
$$= E\Bigg[\prod_{j=I-i}^{I-k-1}\hat f_j\cdot E\Bigg[\prod_{j=I-k}^{J-1}\hat f_j^2\,\Bigg|\,B_{I-k}\Bigg]\,\Bigg|\,B_{I-i}\Bigg] - \prod_{j=I-i}^{I-k-1} f_j\cdot\prod_{j=I-k}^{J-1} f_j^2.\qquad(A.27)$$
Now we can apply Theorems 3.2 and 3.3 to find the following lower and upper bounds, respectively:
$$E\Bigg[\prod_{j=I-i}^{I-k-1}\hat f_j\cdot\prod_{j=I-k}^{J-1} f_j^2\cdot\Bigg(\sum_{j=I-k}^{J-1}\frac{\sigma_j^2/f_j^2}{E\big[S_j^{[I-1-j]}\,\big|\,B_{I-k}\big]} + 1\Bigg)\,\Bigg|\,B_{I-i}\Bigg] - \prod_{j=I-i}^{I-k-1} f_j\cdot\prod_{j=I-k}^{J-1} f_j^2,\qquad(A.28)$$
$$E\Bigg[\prod_{j=I-i}^{I-k-1}\hat f_j\cdot E\Bigg[\prod_{j=I-k}^{J-1}\Bigg(\frac{\sigma_j^2}{S_j^{[I-1-j]}} + f_j^2\Bigg)\,\Bigg|\,B_{I-k}\Bigg]\,\Bigg|\,B_{I-i}\Bigg] - \prod_{j=I-i}^{I-k-1} f_j\cdot\prod_{j=I-k}^{J-1} f_j^2.$$
The proof of the lower bound now follows from Jensen's inequality, similarly to the proof of Theorem 3.2. The proof of the upper bound follows from the FKG inequality, similarly to the proof of Theorem 3.3. This finishes the proof of Theorem 3.4. □

References

[1] Barnett, G., Zehnwirth, B. (2000). Best estimates for reserves. Proc. CAS, Vol. LXXXVII, 245-321.

[2] Buchwalder, M., Bühlmann, H., Merz, M., Wüthrich, M.V. (2006). The mean square error of prediction in the chain ladder reserving method (Mack and Murphy revisited). ASTIN Bulletin 36/2, 521-542.

[3] England, P.D., Verrall, R.J. (2002). Stochastic claims reserving in general insurance. British Act. J. 8/3, 443-518.

[4] Fortuin, C.M., Kasteleyn, P.W., Ginibre, J. (1971). Correlation inequalities on some partially ordered sets. Comm. Math. Phys. 22, 89-103.

[5] Gisler, A. (2006). The estimation error in the chain-ladder reserving method: a Bayesian approach. ASTIN Bulletin 36/2, 554-565.

[6] Mack, T. (1993). Distribution-free calculation of the standard error of chain ladder reserve estimates. ASTIN Bulletin 23/2, 213-225.

[7] Mack, T., Quarg, G., Braun, C. (2006). The mean square error of prediction in the chain ladder reserving method - a comment. ASTIN Bulletin 36/2, 543-552.

[8] Murphy, D.M. (1994). Unbiased loss development factors. Proc. CAS, Vol. LXXXI, 154-222.

[9] Taylor, G. (2000). Loss Reserving: An Actuarial Perspective. Kluwer Academic Publishers.

[10] Venter, G. (2006). Discussion of mean square error of prediction in the chain-ladder reserving method. ASTIN Bulletin 36/2, 566-572.
