portable document (.pdf) format - InterStat

RANDOM COEFFICIENT REGRESSION MODEL FOR PEDIGREE ANALYSIS.

By

Nesan Chelliah, School of Health Science, Griffith University, Gold Coast, Southport, Queensland, Australia.

Abstract

In this paper a random coefficient regression model is used to model quantitative traits in pedigree analysis. We review a new estimator of the variance components and derive the distribution of the pedigree statistic used in testing for outlying observations.

Key Words: Generalized least squares estimator, Variance components, Quantitative traits.


1. Model and the Estimation

We consider N pedigrees, each containing n individuals on whom quantitative measurements $Y_i = (y_1, \ldots, y_n)'$ ($i = 1, 2, \ldots, N$) are available. Let the linear regression model be given by

(1.1)   $Y_i = X_i \beta_i + \varepsilon_i$

where $X_i$ is the $n \times p$ design matrix of fixed effects. For example, the trait for any individual may depend on age, sex, height, etc.

$\beta_i$ is the $p \times 1$ vector of unknown parameters for the fixed effects, assumed to vary across pedigrees, and it is modelled by

(1.2)   $\beta_i = \beta + \nu_i$

where $E(\nu_i) = 0$, $\mathrm{Cov}(\nu_i) = \Delta$ (non-diagonal and positive definite) and $E(\nu_i \nu_j') = 0$, $i \neq j$.

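$\varepsilon_i$ is the $n \times 1$ vector of environmental effects such that $E(\varepsilon_i) = 0$, $E(\varepsilon_i \varepsilon_i') = \sigma_{ii} I$, $E(\varepsilon_i \varepsilon_j') = 0$ for $i \neq j$, and $E(\nu_i \varepsilon_i') = 0$, where $I$ is the $n \times n$ identity matrix. Combining the models (1.1) and (1.2) we have

(1.3)   $Y_i = X_i \beta + U_i$,   $i = 1, \ldots, N$,

where $U_i = X_i \nu_i + \varepsilon_i$ and $\mathrm{Cov}(U_i) = \Omega_i = X_i \Delta X_i' + \sigma_{ii} I$.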
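
The covariance structure in (1.3) can be checked numerically. The following is a minimal sketch, not the author's code: the dimensions, the values of $\beta$, $\Delta$ and $\sigma_{ii}$, the design matrix, and the use of normal draws are all assumptions made for illustration. It simulates many replicates of one pedigree and compares the empirical covariance of $U_i$ with $X_i \Delta X_i' + \sigma_{ii} I$.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 5, 2                       # individuals per pedigree, number of fixed effects
beta = np.array([1.0, 0.5])       # hypothetical mean coefficient vector
Delta = np.array([[0.3, 0.1],
                  [0.1, 0.2]])    # Cov(nu_i): non-diagonal, positive definite
sigma_ii = 0.5                    # environmental variance (hypothetical value)

# Hypothetical design: intercept plus one centred covariate (e.g. age).
X_i = np.column_stack([np.ones(n), np.linspace(-1.0, 1.0, n)])

# Covariance implied by (1.3): Omega_i = X_i Delta X_i' + sigma_ii I.
Omega_i = X_i @ Delta @ X_i.T + sigma_ii * np.eye(n)

# Draw R replicates of Y_i = X_i (beta + nu_i) + eps_i for this design.
# Normality is an assumption of this sketch, not of the model.
R = 20000
nu = rng.multivariate_normal(np.zeros(p), Delta, size=R)   # shape (R, p)
eps = rng.normal(scale=np.sqrt(sigma_ii), size=(R, n))     # shape (R, n)
Y = (beta + nu) @ X_i.T + eps                              # shape (R, n)

# Empirical covariance of U_i = Y_i - X_i beta across replicates.
U = Y - X_i @ beta
Omega_emp = U.T @ U / R
max_abs_err = np.max(np.abs(Omega_emp - Omega_i))
print(f"largest entrywise gap: {max_abs_err:.3f}")
```

The gap shrinks as the number of replicates R grows, which is what the moment conditions on $\nu_i$ and $\varepsilon_i$ imply.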

The model (1.1) is known as the random coefficient regression model and has been studied by several researchers; see, for example, Gumpertz & Pantula (1989) and Carter & Yang (1986). In quantitative genetics, maximum likelihood estimation has been used to estimate the variance components under the assumption of multivariate normality, as in Hopper & Mathews (1994) and Lange, Westlake & Spence (1976). Robust approaches to genetic linkage analysis have been studied by several researchers; see, for example, Amos et al. (1996), Commenges (1994) and Huggins (1993). Recently, Chelliah (1998) proposed a new methodology for estimating the variance components. This method does not assume any distribution and always produces positive variance components. The estimator is given by

(1.4)   $\hat\Omega_i = \hat U_i \hat U_i' + \alpha_i I$

where $\hat U_i = Y_i - X_i \beta_{ls}$ and $\beta_{ls} = (1/N) \sum_{i=1}^{N} (X_i' X_i)^{-1} X_i' Y_i$. Here $\alpha_i$ is a positive value which can be estimated from the data according to some optimality criterion. The asymptotic properties of (1.4) have also been studied in Anh & Chelliah (1998).
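
The estimator (1.4) is simple to compute. The sketch below is illustrative only: the function names `beta_ls` and `omega_hat`, the toy data, and the fixed value of `alpha` are hypothetical, and the paper's criterion for choosing $\alpha_i$ is not reproduced here. It shows why the estimate is positive definite: the rank-one term $\hat U_i \hat U_i'$ is positive semidefinite, so adding $\alpha_i I$ with $\alpha_i > 0$ lifts every eigenvalue to at least $\alpha_i$.

```python
import numpy as np

def beta_ls(X_list, Y_list):
    """beta_ls = (1/N) * sum_i (X_i'X_i)^{-1} X_i'Y_i, the average of per-pedigree fits."""
    fits = [np.linalg.solve(X.T @ X, X.T @ Y) for X, Y in zip(X_list, Y_list)]
    return np.mean(fits, axis=0)

def omega_hat(X_i, Y_i, b_ls, alpha_i):
    """Estimator (1.4): U_hat U_hat' + alpha_i I, with U_hat = Y_i - X_i b_ls."""
    u_hat = Y_i - X_i @ b_ls
    return np.outer(u_hat, u_hat) + alpha_i * np.eye(len(Y_i))

# Toy data (all values hypothetical).
rng = np.random.default_rng(1)
N, n, p = 30, 4, 2
X_list = [np.column_stack([np.ones(n), rng.normal(size=n)]) for _ in range(N)]
Y_list = [X @ np.array([1.0, 0.5]) + rng.normal(size=n) for X in X_list]

b_ls = beta_ls(X_list, Y_list)
alpha = 0.1   # placeholder; in the paper alpha_i is estimated from the data
Omega_1 = omega_hat(X_list[0], Y_list[0], b_ls, alpha)

# Eigenvalues: alpha (n - 1 times) and alpha + U_hat'U_hat (once).
eigvals = np.linalg.eigvalsh(Omega_1)
u_hat_1 = Y_list[0] - X_list[0] @ b_ls
print(eigvals.min(), eigvals.max(), alpha + u_hat_1 @ u_hat_1)
```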

2. An Extension and Estimation

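The model (1.3) can be written compactly as

(1.5)   $Y = X\beta + U$

where $U = X\nu + \varepsilon$, and $\nu$, $\varepsilon$, $Y$ and $X$ are the corresponding quantities stacked over all $N$ pedigrees, so that $Y$ and $\varepsilon$ are $nN \times 1$ vectors.

We can extend the model (1.3) discussed in Section 1 by introducing correlation among different pedigrees; that is, for non-diagonal $\Gamma$ and $\Sigma$, the covariance matrix of $Y$ is

$\mathrm{Cov}(Y) = \Omega = X \Gamma X' + \Sigma$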

Our new estimator is given by

(1.6)   $\hat\Omega = \hat U \hat U' + \hat\alpha I$

where $\hat U = Y - X \beta_{ls}$. From (1.5) the generalised least squares (GLS) estimator is $\hat\beta_{GLS} = (X' \Omega^{-1} X)^{-1} X' \Omega^{-1} Y$. Substituting the estimator (1.6) we have the estimated GLS

(1.7)   $\hat\beta_{EGLS} = (X' \hat\Omega^{-1} X)^{-1} X' \hat\Omega^{-1} Y$

The finite sample properties of (1.7) have been studied in Chelliah (1998) under the assumption of an elliptically symmetric distribution of the data.
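
As an illustration of (1.6) and (1.7), the sketch below computes the estimated GLS on toy data. Several choices are assumptions rather than the author's specification: $X$ is taken to be the $nN \times p$ matrix obtained by stacking the $X_i$, the preliminary estimate is the average of per-pedigree least squares fits from Section 1, and `alpha_hat` is a fixed placeholder. The function name `egls` is hypothetical.

```python
import numpy as np

def egls(X, Y, b_prelim, alpha_hat):
    """Estimated GLS (1.7) with Omega_hat = U_hat U_hat' + alpha_hat I from (1.6)."""
    u_hat = Y - X @ b_prelim
    Omega_hat = np.outer(u_hat, u_hat) + alpha_hat * np.eye(len(Y))
    # Solve linear systems rather than forming Omega_hat^{-1} explicitly.
    Oinv_X = np.linalg.solve(Omega_hat, X)
    Oinv_Y = np.linalg.solve(Omega_hat, Y)
    b = np.linalg.solve(X.T @ Oinv_X, X.T @ Oinv_Y)
    return b, Omega_hat

# Toy data with pedigree-specific coefficients (all values hypothetical).
rng = np.random.default_rng(2)
N, n, p = 20, 4, 2
X_blocks = [np.column_stack([np.ones(n), rng.normal(size=n)]) for _ in range(N)]
Y_blocks = [Xi @ (np.array([1.0, 0.5]) + 0.3 * rng.normal(size=p)) + rng.normal(size=n)
            for Xi in X_blocks]
X = np.vstack(X_blocks)          # shape (nN, p)
Y = np.concatenate(Y_blocks)     # shape (nN,)

# Preliminary estimate: average of the per-pedigree least squares fits.
b_ls = np.mean([np.linalg.solve(Xi.T @ Xi, Xi.T @ Yi)
                for Xi, Yi in zip(X_blocks, Y_blocks)], axis=0)

b_egls, Omega_hat = egls(X, Y, b_ls, alpha_hat=0.1)

# The EGLS solution satisfies the weighted normal equations
# X' Omega_hat^{-1} (Y - X b) = 0 up to rounding error.
score = X.T @ np.linalg.solve(Omega_hat, Y - X @ b_egls)
print(b_egls, np.max(np.abs(score)))
```

Solving with `np.linalg.solve` avoids an explicit inverse of the $nN \times nN$ matrix, which is the usual numerical preference.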

3. The Distribution of the Pedigree Statistic

Outlying observations can be detected using several tests based on the approach of Hopper & Mathews (1982). They suggested computing the following statistic for each pedigree:

(1.8)   $Q_i = (Y_i - \hat\mu_i)' \hat\Omega_i^{-1} (Y_i - \hat\mu_i)$

where $\hat\mu_i$ and $\hat\Omega_i$ are maximum likelihood estimators and $E(Y_i) = \mu_i$.

In this case the distribution of (1.8) is chi-squared with n degrees of freedom. Huggins (1993) proposed replacing the parameters by robust estimators. We choose $\mu_i = X_i \beta$ as in (1.3), use our new estimator (1.4), and estimate $\beta$ by the ordinary least squares estimator. Since these estimators are consistent, the distribution of (1.8) does not change.
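
The chi-squared reference distribution for (1.8) can be illustrated by simulation. The sketch below is a simplified check under stated assumptions, not the paper's procedure: it uses normal data, a common design for all pedigrees, and the true $\mu_i$ and $\Omega_i$ in place of their estimators. All parameter values and the function name `pedigree_statistic` are hypothetical. It also shows how a contaminated pedigree is flagged against a Monte Carlo 99% cutoff.

```python
import numpy as np

def pedigree_statistic(Y, mu, Omega):
    """Q_i = (Y_i - mu)' Omega^{-1} (Y_i - mu) for each row Y_i of Y (shape (N, n))."""
    D = Y - mu
    return np.sum(D * np.linalg.solve(Omega, D.T).T, axis=1)

rng = np.random.default_rng(3)
N, n, p = 5000, 5, 2
beta = np.array([1.0, 0.5])
Delta = np.array([[0.3, 0.1],
                  [0.1, 0.2]])
sigma_ii = 0.5
X_i = np.column_stack([np.ones(n), np.linspace(-1.0, 1.0, n)])  # same design for all pedigrees

mu_i = X_i @ beta
Omega_i = X_i @ Delta @ X_i.T + sigma_ii * np.eye(n)

# Normal data generated from model (1.3).
nu = rng.multivariate_normal(np.zeros(p), Delta, size=N)
eps = rng.normal(scale=np.sqrt(sigma_ii), size=(N, n))
Y = (beta + nu) @ X_i.T + eps

Q = pedigree_statistic(Y, mu_i, Omega_i)
print("mean of Q:", Q.mean(), "(chi-squared mean is n =", n, ")")

# Monte Carlo 99% cutoff for a chi-squared variable with n degrees of freedom.
cutoff = np.quantile(rng.chisquare(n, size=200000), 0.99)

# A contaminated pedigree: the first individual's trait is shifted by 20 units.
Y_out = Y[0].copy()
Y_out[0] += 20.0
Q_out = pedigree_statistic(Y_out[None, :], mu_i, Omega_i)[0]
flagged = Q_out > cutoff
print("outlier flagged:", flagged)
```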

References

Amos, C.I., Zhu, D. and Boerwinkle (1996a). Assessing genetic linkage and association with robust components of variance approaches. Annals of Human Genetics, 60, 143-160.

Anh, V.V. and Chelliah, T.N. (1988). Estimated generalized least squares estimator in random coefficient model. Scandinavian J. of Statist. (to appear).

Carter, R.L. and Yang, M.C.K. (1986). Large sample inference in random coefficient regression model. Commun. Statist. - Theory & Methods, 15, 2507-2525.


Chelliah, T.N. (1998). A new covariance estimator in random coefficient regression model. Sankhya B (to appear).

Commenges, D. (1994). Robust genetic linkage analysis based on a score test of homogeneity: the weighted pairwise correlation statistic. Genetic Epidemiol., 11, 189-200.

Gumpertz, M. and Pantula, S.G. (1989). A simple approach to inference in random coefficient models. The Amer. Statist., 43, 203-210.

Hopper, J.L. and Mathews, J.D. (1982). Extensions to multivariate normal models for pedigree analysis. Ann. Hum. Genet., 46, 373-383.

Hopper, J.L. and Mathews, J.D. (1994). A multivariate normal model for pedigree and longitudinal data and the software 'Fisher'. Austral. J. of Statist., 36(2), 153-176.

Huggins, R.M. (1993). On the robust analysis of variance components models for pedigree data. Austral. J. of Statist., 35(1), 43-57.

Lange, K., Westlake, J. and Spence, M.A. (1976). Extensions to pedigree analysis. III. Variance components by the scoring method. Ann. Hum. Genet., 39, 484-491.
