Copulas and goodness of fit tests

Pál Rakonczai and András Zempléni

Eötvös Loránd University, Department of Probability Theory and Statistics, Budapest, Hungary
(e-mail: [email protected]; [email protected])

Abstract. More and more copula models have recently been proposed for describing the behavior of multivariate data sets. However, no effective methods are known for checking the validity of these models, especially in higher dimensions. Our approach is based on the multivariate probability integral transformation of the joint distribution, which reduces the multivariate problem to one dimension. We compare these goodness of fit tests to those based on the copula density function. We present the background of the methods as well as simulations of their power.

Keywords: copulas, goodness of fit test, probability integral transformation.

1 Introduction

In recent decades multivariate modeling has become tractable, thanks to the vast amount of recorded data and the powerful computing equipment readily available. However, the methodology has not always kept pace with the available resources: one can easily fit multivariate models with one software package or another¹, but there is not always a suitable method at hand for checking the goodness of the fit. Copulas, simple yet powerful modeling tools that separate marginal modeling from dependence questions, were re-invented in the 1990s and their use has expanded rapidly since then. One natural area of application is financial mathematics, where they are often used to model the dependence structure between assets or losses, stock indices and so on.

The paper is organized as follows. In Section 2 we recall the definition of the Archimedean copula and outline its most relevant properties. In Section 3 we present two completely different approaches for investigating the goodness of fit. In Section 4 we apply the presented methods to real financial data sets and compare their power. We show possible extensions of the well-known bivariate methods to higher dimensions.

¹ In this respect the open-source R software has played a leading role; see http://www.r-project.org/ for its description and the available packages.


2 d-dimensional Archimedean copulas

In the following we present the basic elements of copula theory for the class of d-variate Archimedean copulas, which extends the very popular notion of bivariate Archimedean copulas and thus possesses similarly favorable characteristics. Let us consider a copula generator function φ_θ : [0, 1] → [0, ∞], which is continuous and strictly decreasing with φ_θ(1) = 0. Then a d-variate Archimedean copula function is

C_{\phi_\theta}(u_1, \dots, u_d) = \phi_\theta^{-1}\left( \sum_{i=1}^{d} \phi_\theta(u_i) \right).    (1)
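To make the construction in (1) concrete, the following minimal sketch (our illustration, not code from the paper) evaluates a d-variate Archimedean copula from a generator and its inverse; the Clayton generator appears only as an example and all function names are our own.

```python
import numpy as np

# Clayton generator and its inverse, used here purely as an illustration;
# any continuous, strictly decreasing generator with phi(1) = 0 works.
def phi_clayton(u, theta):
    return u ** (-theta) - 1.0

def phi_clayton_inv(t, theta):
    return (t + 1.0) ** (-1.0 / theta)

def archimedean_copula(u, phi, phi_inv, theta):
    """Evaluate C(u_1, ..., u_d) = phi^{-1}(sum_i phi(u_i)), cf. equation (1)."""
    u = np.asarray(u, dtype=float)
    return phi_inv(np.sum(phi(u, theta)), theta)

# Example: trivariate Clayton copula at (0.3, 0.5, 0.7) with theta = 2
print(archimedean_copula([0.3, 0.5, 0.7], phi_clayton, phi_clayton_inv, 2.0))
```

Any other generator/inverse pair satisfying the above conditions can be plugged in the same way.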

The d-copula inherits the beneficial properties of its bivariate ancestor; however, it has the limitation that for a fixed family φ_θ there are only a few parameters to capture the full dependence structure. Since all the (d − 1)-dimensional margins of an Archimedean copula are identical,

C_{\phi_\theta}(1, u_2, \dots, u_d) = \dots = C_{\phi_\theta}(u_1, \dots, u_{d-1}, 1) = \phi_\theta^{-1}\left( \sum_{i=1}^{d-1} \phi_\theta(u_i) \right),

it assumes a certain symmetry among the coordinates. Still, since the main aim of this paper is to introduce appropriate methods for checking a given model's validity and not to develop involved copula models, the Archimedean families are perfectly suitable. In the course of the next sections we deal mostly with the Clayton copula family, but we emphasize that the presented methods can be adapted to any Archimedean model in the same way. The generator function of the Clayton copula is given by φ_θ(u) = u^{-θ} − 1, hence φ_θ^{-1}(t) = (t + 1)^{-1/θ}. The Clayton d-copula function, also known as Cook and Johnson's family, is given by

C_{Clayton}(u_1, u_2, \dots, u_d) = \left( \sum_{i=1}^{d} u_i^{-\theta} - d + 1 \right)^{-1/\theta}    (2)

with θ > 0. Simulations can be performed by general methods, such as conditional sampling, which can be computed quite easily with the help of the derivatives of the function φ_θ^{-1}(t); for details see [1]. Beyond simulation, another relevant question is parameter estimation. An easy method for the bivariate case is based on the inversion of Kendall's τ, giving θ̂ = 2τ/(1 − τ). In the general, d-dimensional case one may use the so-called maximum pseudo-likelihood method (see [5]), based on the copula density function:

c_{Clayton}(u_1, u_2, \dots, u_d) = \left( \sum_{i=1}^{d} u_i^{-\theta} - d + 1 \right)^{-d - 1/\theta} \prod_{i=1}^{d} u_i^{-\theta - 1} \left[ (i-1)\theta + 1 \right].    (3)
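The estimation steps just described can be sketched as follows (our own illustration rather than the authors' code): clayton_logdensity implements (3), theta_from_tau applies the inversion θ̂ = 2τ/(1 − τ) to the average pairwise Kendall's τ (our simple multivariate extension of the bivariate recipe), and fit_clayton_mpl maximizes the pseudo-likelihood of [5]. For sampling we use the gamma-frailty (Marshall-Olkin) construction instead of the conditional sampling mentioned above; both are standard ways of generating Clayton samples, and the bounds used in the optimizer are our choices.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import kendalltau

def clayton_logdensity(u, theta):
    """Log of the Clayton d-copula density, equation (3); u has shape (n, d)."""
    u = np.asarray(u, dtype=float)
    n, d = u.shape
    s = np.sum(u ** (-theta), axis=1) - d + 1.0
    log_c = -(d + 1.0 / theta) * np.log(s)
    log_c += np.sum(-(theta + 1.0) * np.log(u), axis=1)
    log_c += np.sum(np.log(1.0 + theta * np.arange(d)))   # prod_i [(i-1)*theta + 1]
    return log_c

def theta_from_tau(u):
    """Moment-type estimate 2*tau/(1-tau) using the average pairwise Kendall's tau."""
    n, d = u.shape
    taus = [kendalltau(u[:, i], u[:, j])[0]
            for i in range(d) for j in range(i + 1, d)]
    tau = np.mean(taus)
    return 2.0 * tau / (1.0 - tau)

def fit_clayton_mpl(u):
    """Maximum pseudo-likelihood estimate of theta, cf. [5]; u should hold pseudo-observations."""
    res = minimize_scalar(lambda th: -np.sum(clayton_logdensity(u, th)),
                          bounds=(1e-3, 50.0), method="bounded")
    return res.x

def rclayton(n, d, theta, rng=None):
    """Sample n points from the Clayton d-copula via the gamma-frailty construction."""
    rng = np.random.default_rng(rng)
    gamma = rng.gamma(1.0 / theta, size=(n, 1))
    e = rng.exponential(size=(n, d))
    return (1.0 + e / gamma) ** (-1.0 / theta)

# small demonstration on simulated data
u = rclayton(500, 3, theta=2.0, rng=0)
print(theta_from_tau(u), fit_clayton_mpl(u))
```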


3 Goodness of fit tests

In this section we discuss the goodness of fit statistics in two subsections: first the tests related to the cumulative distribution function, then those based on the probability density function.

3.1 GOF statistics based on PIT

Let a random vector X = (X_1, ..., X_d) possess a continuous d-variate copula model C = (C_θ) with unknown margins F_1, ..., F_d, and let (X_{11}, ..., X_{d1}), ..., (X_{1n}, ..., X_{dn}), n ≥ 2, be a random sample from X. Let the distribution function of the probability integral transformation V = H(X) be denoted by

K(\theta, t) = P(H(X) \le t) = P(C_\theta(F_1(X_1), \dots, F_d(X_d)) \le t).    (4)

In the case of the Archimedean copula family, (4) can be computed as follows:

K(\theta, t) = t + \sum_{i=1}^{d-1} \frac{(-1)^i}{i!} \left[ \phi_\theta(t) \right]^i f_i(\theta, t),    (5)

where f_i(\theta, t) = \frac{d^i}{dx^i} \phi_\theta^{-1}(x) \Big|_{x = \phi_\theta(t)}.
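Since (5) only requires derivatives of φ_θ^{-1}, the function K(θ, t) can be produced symbolically for any Archimedean generator. The sketch below is our illustration (relying on sympy, which the paper does not mention) and specializes to the Clayton generator.

```python
import sympy as sp

t, x, theta = sp.symbols("t x theta", positive=True)

def archimedean_K(phi, phi_inv, d):
    """K(theta, t) of equation (5): phi is the generator (in t), phi_inv its inverse (in x)."""
    K = t
    for i in range(1, d):
        # f_i(theta, t) = d^i/dx^i phi^{-1}(x) evaluated at x = phi(t)
        f_i = sp.diff(phi_inv, x, i).subs(x, phi)
        K += (-1) ** i / sp.factorial(i) * phi ** i * f_i
    return sp.simplify(K)

# Clayton generator as an example
phi_clayton = t ** (-theta) - 1
phi_inv_clayton = (x + 1) ** (-1 / theta)
K3 = archimedean_K(phi_clayton, phi_inv_clayton, d=3)
print(K3)                                            # symbolic K(theta, t) for the trivariate Clayton copula
print(K3.subs({theta: 2, t: sp.Rational(1, 2)}))     # numeric check at theta = 2, t = 0.5
```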

For Clayton copulas this can be given as

K_{Clayton}(\theta, t) = t + t \sum_{i=1}^{d-1} \left( \frac{1 - t^\theta}{\theta} \right)^i \frac{q(\theta, i, 1)}{i!},    (6)

where q(\theta, i, m) = \prod_{j=0}^{i-1} (m + j\theta). Define the empirical version of K as

K_n(t) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}(E_{in} \le t), \quad t \in [0, 1],    (7)

where E_{in} = \frac{1}{n} \sum_{k=1}^{n} \mathbf{1}(X_{1k} \le X_{1i}, \dots, X_{dk} \le X_{di}). The test statistics we propose for checking the goodness of fit are based on the comparison of the parametric estimate K(θ_n, t) of K(θ, t) with its empirical counterpart K_n(t) (for further details see [4]). Known tests for the bivariate case use some continuous functional of Kendall's process \kappa_n(t) = \sqrt{n}\,(K(\theta_n, t) - K_n(t)), such as S_n = \int_0^1 \kappa_n(t)^2\, dt and T_n = \sup_{0 \le t \le 1} |\kappa_n(t)|. With this end in view we will apply the following test statistics:

Deviations:

S_1 = \sum_{t_i \in [\varepsilon, 1-\varepsilon]} |K(\theta_n, t_i) - K_n(t_i)|,
S_2 = \sum_{t_i \in [\varepsilon, 1-\varepsilon]} (K(\theta_n, t_i) - K_n(t_i))^2,

Weighted deviations:

S_3 = \sum_{t_i \in [\varepsilon, 1-\varepsilon]} \frac{(K(\theta_n, t_i) - K_n(t_i))^2}{K(\theta_n, t_i)},
S_4 = \sum_{t_i \in [\varepsilon, 1-\varepsilon]} \frac{(K(\theta_n, t_i) - K_n(t_i))^2}{K(\theta_n, t_i)^2},

where (t_i)_{i=1}^{n} is an appropriately fine division of the interval (0, 1).
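A sketch of the resulting testing procedure (our code, not the authors'): K_clayton implements (6), K_empirical computes K_n of (7) through the quantities E_{in}, and s_statistics evaluates S_1-S_4 on a grid. The choices ε = 0.01 and a grid of 200 points are ours; the paper only asks for an appropriately fine division of [ε, 1 − ε].

```python
import numpy as np

def K_clayton(theta, t, d):
    """K(theta, t) for the Clayton d-copula, cf. equation (6)."""
    t = np.asarray(t, dtype=float)
    total = np.zeros_like(t)
    q, fact = 1.0, 1.0
    for i in range(1, d):
        q *= 1.0 + (i - 1) * theta          # q(theta, i, 1) built up term by term
        fact *= i                            # i!
        total += ((1.0 - t ** theta) / theta) ** i * q / fact
    return t + t * total

def K_empirical(x, t_grid):
    """Empirical K_n of equation (7); x holds the sample (or pseudo-observations), shape (n, d)."""
    n = x.shape[0]
    # E_in: proportion of sample points that are componentwise <= observation i
    e = np.array([np.mean(np.all(x <= x[i], axis=1)) for i in range(n)])
    return np.array([np.mean(e <= t) for t in t_grid])

def s_statistics(x, theta, d, eps=0.01, m=200):
    """Deviation statistics S1-S4 between K(theta_n, .) and K_n on a grid of m points."""
    t = np.linspace(eps, 1.0 - eps, m)
    K = K_clayton(theta, t, d)
    diff = K - K_empirical(x, t)
    return {"S1": np.sum(np.abs(diff)),
            "S2": np.sum(diff ** 2),
            "S3": np.sum(diff ** 2 / K),
            "S4": np.sum(diff ** 2 / K ** 2)}
```

Critical values can then be obtained by recomputing the statistics on samples simulated from the fitted model (e.g. with rclayton from the earlier sketch), which is the spirit of the Monte Carlo summaries reported in Tables 1 and 2 below.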

3.2 Kernel-based GOF statistics

Another prevailing approach is based on the smoothed empirical copula density. For each index i, define the d-dimensional vectors Y_i = (F_1(X_{i,1}), ..., F_d(X_{i,d})) and Y_{n,i} = (F_{n,1}(X_{i,1}), ..., F_{n,d}(X_{i,d})), denoting by F_{n,k} the empirical k-th marginal cdf of X. Obviously, the copula C is the cdf of Y, and its empirical version is

C_n(u) = \frac{1}{n} \sum_{i=1}^{n} \prod_{k=1}^{d} \mathbf{1}(F_{n,k}(X_{i,k}) \le u_k).    (8)
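As a small illustration (ours, not from the paper), the pseudo-observations Y_{n,i} and the empirical copula (8) can be obtained from the componentwise ranks:

```python
import numpy as np
from scipy.stats import rankdata

def pseudo_observations(x):
    """Y_{n,i} = (F_{n,1}(X_{i,1}), ..., F_{n,d}(X_{i,d})) via the marginal ranks."""
    n = x.shape[0]
    # dividing by n + 1 instead of n is a common variant that keeps the values inside (0, 1)
    return rankdata(x, axis=0) / n

def empirical_copula(x, u):
    """C_n(u) of equation (8) at a single point u."""
    y = pseudo_observations(x)
    return np.mean(np.all(y <= np.asarray(u), axis=1))
```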

We assume that Y has a density c, so its kernel estimator at point u is

c_n(u) = \frac{1}{h^d} \int \tau\!\left( \frac{u - v}{h} \right) C_n(dv) = \frac{1}{n h^d} \sum_{i=1}^{n} \tau\!\left( \frac{u - Y_{n,i}}{h} \right),    (9)

where τ is a d-dimensional kernel and h = h(n) is a usual bandwidth sequence. The test statistic is

T = \frac{n h^d}{\int \tau^2} \sum_{k=1}^{m} \frac{(c_n(u_k) - c(u_k, \theta_n))^2}{c(u_k, \theta_n)}.    (10)

Under appropriate conditions (see [2]), if the chosen copula is the true one then T tends to the χ² distribution with m degrees of freedom. Theoretically this is very convenient, because it provides a distribution-free testing method. We have to mention, however, that the conditions required even for fitting a proper non-parametric model are far from minimal (an adequate kernel and bandwidth are needed), and choosing a grid for evaluating the statistic can be critical as well. So in practice we propose the following test statistic:

T = \sum_{k=1}^{m} c(u_k, \theta_n) \, (c_n(u_k) - c(u_k, \theta_n))^2.    (11)

In the modified statistic (11) the normalizing coefficients of the sum were omitted and the squared deviations were weighted by the copula density. This is a logical choice, since in this way we concentrate on points which are more frequent under our model. Critical values can be calculated by Monte Carlo simulations, similarly to the previous subsection.
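The following sketch (our own; the paper does not prescribe a particular kernel, so a product Gaussian kernel is assumed) computes the estimator (9), the modified statistic (11) and a Monte Carlo critical value under the fitted model. The arguments copula_density and sampler are placeholders for, e.g., np.exp(clayton_logdensity(...)) and rclayton from the earlier sketches.

```python
import numpy as np

def kernel_copula_density(y, u_points, h=0.05):
    """Kernel estimate c_n(u) of (9), using a product Gaussian kernel as tau."""
    y = np.asarray(y, dtype=float)                  # pseudo-observations, shape (n, d)
    u_points = np.asarray(u_points, dtype=float)    # evaluation points, shape (m, d)
    diff = (u_points[:, None, :] - y[None, :, :]) / h
    tau = np.exp(-0.5 * diff ** 2) / np.sqrt(2.0 * np.pi)
    return np.mean(np.prod(tau, axis=2), axis=1) / h ** y.shape[1]

def modified_T(y, u_points, theta, copula_density, h=0.05):
    """Modified statistic (11): squared deviations weighted by the model density."""
    c_model = copula_density(u_points, theta)
    c_hat = kernel_copula_density(y, u_points, h)
    return np.sum(c_model * (c_hat - c_model) ** 2)

def mc_critical_value(n, d, theta, u_points, copula_density, sampler,
                      n_rep=1000, level=0.95, h=0.05):
    """Monte Carlo critical value of the modified T under the fitted model."""
    stats = []
    for _ in range(n_rep):
        y_sim = sampler(n, d, theta)                # e.g. rclayton from the earlier sketch
        # for a fully faithful bootstrap one would re-estimate theta from y_sim here
        stats.append(modified_T(y_sim, u_points, theta, copula_density, h))
    return np.quantile(stats, level)
```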

4 Application for financial data sets

This section illustrates the implementation of the described goodness of fit procedures. We investigate three stock indices, namely BUX (Budapest), WIG (Warsaw) and PX (Prague). The data are the daily closing prices recorded from 02/01/1997 to 02/01/2007, i.e. around 2500 observations (as there are observations on working days only). Since our aim is to capture the dependence structure, we transformed the data set into the unit cube with the help of the empirical univariate margins and attempted to fit a 3-dimensional Clayton model. Since the empirical copula has upper tail dependence and the Clayton copula has lower tail dependence, in the first step it is practical to "turn the empirical copula upside down". This can be seen in the upper left panel of Figure 1. Next to the empirical data we see a simulation from the fitted model, and it is clear that the Clayton model cannot mimic the given structure properly. The lower graphs report the deviation between the observed and the estimated data sets. In the first panel the estimated K(θ_n, t) function is given, together with its 95 percent confidence bounds from a Monte Carlo simulation with 1000 repetitions, compared to the empirical K_n(t) function. The second graph emphasizes the deviations between the two curves. We see that the observed data are fairly far from our model. This impression is confirmed by the test procedures, since none of our statistics accepts the model. Indeed, the observed test statistics are 5-20 times higher than the maximum from 1000 simulations (see Table 1). However, if we omit the extremes from our data set by considering, instead of the whole unit cube, only the observations falling into [0.125, 0.86]^3, then none of the proposed tests rejects the fit (Table 2 gives the details).

[Figure 1 about here. Upper panels: the empirical 3-copula of the BUX-WIG-PX data (Kendall's tau = 0.754) and simulations from the fitted Clayton 3-copula (theta = 6.131). Lower panels: the K function of the Clayton 3-copula (6.13) with 95% confidence bounds against the observations, and the deviations between them.]

Fig. 1. Dependence structure for the whole data set
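The analysis above can be reproduced roughly as follows (a sketch built on the helper functions from the previous code blocks; the index data themselves are not distributed with the paper, and the treatment of the restricted cube, in particular whether the subset is re-ranked before refitting, is our assumption). Judging from the reported Kendall's τ values, the τ-inversion estimate appears consistent with the fitted parameters (2·0.754/(1 − 0.754) ≈ 6.13), but this is our inference.

```python
import numpy as np

def analyse(prices):
    """prices: array of shape (n_days, 3) with the BUX, WIG and PX closing prices."""
    # transform to the unit cube via the empirical margins, then "turn the copula
    # upside down" (the data show upper, the Clayton copula lower tail dependence)
    u = 1.0 - pseudo_observations(prices)
    theta_hat = theta_from_tau(u)              # tau = 0.754 gives theta close to 6.13
    full = s_statistics(u, theta_hat, d=3)

    # restricted analysis: keep only the observations falling into [0.125, 0.86]^3;
    # re-ranking the subset before refitting is our assumption
    inside = np.all((u > 0.125) & (u < 0.86), axis=1)
    u_sub = pseudo_observations(u[inside])
    theta_sub = theta_from_tau(u_sub)          # tau = 0.349 gives theta close to 1.07
    sub = s_statistics(u_sub, theta_sub, d=3)
    return (theta_hat, full), (theta_sub, sub)
```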


Simulation summary

          S1       S2       S3       S4
Min.    0.0672  0.00019  0.00058  0.0019
1st Qu. 0.1228  0.00064  0.00157  0.0068
Median  0.1455  0.00092  0.00211  0.0092
Mean    0.1499  0.00104  0.00226  0.0103
3rd Qu. 0.1707  0.00129  0.00275  0.0127
Max.    0.3255  0.00531  0.00829  0.0415
Obs.    1.3787  0.06991  0.16377  0.7183

Table 1. Simulated and observed statistics for the whole data set

[Figure 2 about here. As Figure 1, but for the restricted data set: the empirical 3-copula (Kendall's tau = 0.349) and simulations from the fitted Clayton 3-copula (theta = 1.072), with the K function of the Clayton 3-copula (1.07), its 95% confidence bounds, the observations and the deviations.]

Fig. 2. Dependence structure for that part of the data set, where the extremes were removed

In the case of the kernel-based approach there are more difficulties. The value of the proposed T statistic depends strongly on the kernel estimate of the copula density. In the current case we set the bandwidth to h = 0.05 and evaluated the estimated models on a 100 × 100 grid. This is reported in Figure 3 for the BUX-WIG indices. The left panel shows the estimated kernel density for the observations, the middle panel the estimated kernel density for a data set of the same sample size simulated from the model, and the last panel the "true" values of the model's density.
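For this bivariate analysis the grid and bandwidth can be set up as below (a sketch reusing kernel_copula_density and rclayton from the earlier blocks; the exact grid placement and the simulated stand-in for the BUX-WIG pseudo-observations are our choices).

```python
import numpy as np

g = np.linspace(0.005, 0.995, 100)                   # a 100 x 100 grid on the unit square
grid = np.dstack(np.meshgrid(g, g)).reshape(-1, 2)

y2 = rclayton(500, 2, theta=6.13, rng=1)             # stand-in for the BUX-WIG pseudo-observations
c_hat = kernel_copula_density(y2, grid, h=0.05)      # values behind a contour plot like Figure 3
```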

Simulation summary

          S1       S2       S3      S4
Min.    0.1082  0.00055  0.0011  0.003
1st Qu. 0.2147  0.00221  0.0054  0.0192
Median  0.2736  0.00361  0.0085  0.0341
Mean    0.2907  0.00457  0.0104  0.0418
3rd Qu. 0.3482  0.00571  0.0136  0.0539
Max.    0.8184  0.02902  0.0532  0.2804
Obs.    0.308   0.00443  0.0134  0.062
Sign.   0.638   0.613    0.742   0.8

Table 2. Simulated and observed statistics for that part of the data set, where the extremes were removed

[Figure 3 about here: contour plots of the kernel density of the observations, the kernel density of simulations from the fitted model, and the theoretical density, for the BUX-WIG pair.]

Fig. 3. Contour plots of the densities for the whole data

As in the previous 3-dimensional analysis, we next investigated the subset data (the observations falling into [0.125, 0.86]^2) as well, and noticed a better fit. This can be seen in Figure 4, where more similarity between the observations and the estimated model is detected. We performed a simulation study for both cases and found that the simulated T statistics cannot detect the deviations as effectively as the S statistics do (the observed value corresponds to the 0.858 quantile in the case of the whole data set, and to the 0.324 quantile for the subset). It is also clear, however, that the model fits better to the subsample.

[Figure 4 about here: contour plots of the kernel density of the observations, the kernel density of simulations from the fitted model, and the theoretical density, as in Figure 3, for the restricted data set.]

Fig. 4. Contour plots of the densities for that part of the data set, where the extremes were removed

5 Conclusions

We have shown that the statistics based on the probability integral transform gave reasonable results even for our moderate sample size.


We are about to undertake further investigations into the sensitivity of the statistics and into the proposed weight functions (see [3], [6] and [7]). The presented analysis of the kernel-based methods was more preliminary. A lot of work still has to be done before one gets a clear picture of the properties of these statistics for real, relatively small data sets. A crucial question is the choice of the points u_k, as well as the exact form of the most appropriate statistic for a given type of problem.

References

[1] Cherubini, U., Luciano, E. and Vecchiato, W. (2004) "Copula Methods in Finance", Wiley Finance, West Sussex, England.
[2] Fermanian, J.-D. (2005) "Goodness of fit tests for copulas", J. Multivariate Anal., 95, 119-152.
[3] Fouweather, T., Rakonczai, P. and Zempléni, A. (2007) "Anderson-Darling type goodness-of-fit tests with extreme value applications" (in preparation).
[4] Genest, C., Quessy, J.-F. and Rémillard, B. (2006) "Goodness-of-fit Procedures for Copula Models Based on the Integral Probability Transformation", Scandinavian J. of Statistics, 33, 337-366.
[5] Genest, C. and Favre, A.-C. (2006) "Everything you always wanted to know about copula modeling but were afraid to ask", Journal of Hydrologic Engineering.
[6] Rakonczai, P., Bozsó, D. and Zempléni, A. (2005) "Goodness of fit in extreme value analysis and for copulas", Morgan Stanley Conference on Quantitative and Mathematical Finance, Budapest, Hungary.
[7] Zempléni, A., Bozsó, D. and Rakonczai, P. (2006) "High dimensional copulas for simulating and testing Extreme Value Models", XXVI European Meeting of Statisticians, Torun, Poland.