Orthogonal rotation in PCAMIX arXiv:1112.0301v1 [stat.CO] 1 Dec 2011

Orthogonal rotation in PCAMIX∗

Marie Chavent1,2†, Vanessa Kuentz3 and Jérôme Saracco2,4

1 Université de Bordeaux, IMB, CNRS, UMR 5251, France
2 INRIA Bordeaux Sud-Ouest, CQFD team, France
3 CEMAGREF, UR ADBX, France
4 Institut Polytechnique de Bordeaux, France

Abstract

Kiers (1991) considered the orthogonal rotation in PCAMIX, a principal component method for a mixture of qualitative and quantitative variables. PCAMIX includes ordinary principal component analysis (PCA) and multiple correspondence analysis (MCA) as special cases. In this paper, we give a new presentation of PCAMIX where the principal components and the squared loadings are obtained from a Singular Value Decomposition. The loadings of the quantitative variables and the principal coordinates of the categories of the qualitative variables are also obtained directly. In this context, we propose a computationally efficient procedure for varimax rotation in PCAMIX and a direct solution for the optimal angle of rotation. A simulation study shows the good computational behavior of the proposed algorithm. An application on a real data set illustrates the interest of using rotation in MCA. All source codes are available in the R package “PCAmixdata”.

Keywords: mixture of qualitative and quantitative data, principal component analysis, multiple correspondence analysis, rotation.

1 Introduction

Kaiser (1958) introduced the varimax criterion for the attainment of simple structures by orthogonal rotation in Principal Component Analysis (PCA). This criterion aims at maximizing the sum over the columns of the variances of the squared elements of the loading matrix. The loading matrix plays a significant part in the interpretation of the results since it contains the correlations between the variables and the principal components. The idea is to obtain components that are easier to interpret, that is, to rotate the loading matrix and the standardized principal components so that groups of variables appear: variables with high loadings on the same component, moderate ones on a few components and negligible ones on the remaining components. Because the Singular Value Decomposition (SVD) approach in PCA leaves the freedom of an orthogonal rotation, the percentage of variance explained is redistributed along the newly rotated axes, while the variance explained by the solution as a whole is conserved.

∗ Submitted paper, August 2011
† [email protected]

Kiers (1991) extended the varimax criterion for the attainment of simple structures to PCAMIX, a principal component method for a mixture of qualitative and quantitative variables. For qualitative variables, the coefficient used to express the link between a variable and a component is the correlation ratio; this correlation ratio plays the role of a squared loading. The varimax criterion is then expressed with squared loadings defined as correlation ratios for qualitative variables and squared correlations for quantitative variables. Algorithms devised for the determination of an optimal orthogonal rotation in the context of PCA, as proposed for example by Kaiser (1958), Neudecker (1981) or Jennrich (2001), do not apply to this extended varimax criterion. Kiers (1991) therefore proposes a matrix reformulation of this new varimax criterion in order to replace the optimization problem with a problem of simultaneous diagonalization of a set of symmetric matrices (ten Berge, 1984), and suggests the use of the algorithm of de Leeuw and Pruzansky (1978) to solve the latter. To the best of our knowledge, the resulting algorithm has never been presented in a single paper, so we recall, for comparison purposes, the main steps of the matrix reformulation and the simultaneous diagonalization. We shall refer to this algorithm as Kiers’ (1991) original approach to PCAMIX.

In this paper we first present a new formulation of PCAMIX. It is similar to that of Escofier (1979) and Pagès (2004) in the way quantitative and qualitative variables are transformed, but it is presented via an SVD. This gives a direct way to determine the component scores and the squared loadings, as well as the principal coordinates of the categories of the qualitative variables and the loadings of the quantitative variables.

We then search for an optimal rotation for the PCAMIX varimax criterion using the iterative procedure suggested by Kaiser (1958) for PCA: we rotate pairs of dimensions according to an optimal angle θ, iteratively until the process converges. A new direct determination of this angle, specific to PCAMIX, is proposed. We shall refer to the resulting algorithm as the SVD approach to PCAMIX. This algorithm leads to the same final rotation as Kiers’ (1991) original approach; however, a simulation study shows that it is computationally more efficient.

When all the variables are quantitative, the new algorithm reduces to the classical procedure of Kaiser (1958) for orthogonal rotation in PCA, with a new direct expression of the optimal planar angle θ. Notice that Kaiser’s varimax rotation procedure does not always produce an optimal rotation in PCA. ten Berge (1995) made suggestions for addressing this point for PCA. This is an open problem for PCAMIX.

This paper is organized as follows. Section 2 recalls Kiers’ original PCAMIX method and proposes an alternative formulation using the SVD. Section 3 deals with varimax rotation in PCAMIX. The optimization problem is given in Section 3.1. The determination of the optimal angle of rotation with Kiers’ matrix reformulation approach is described in Section 3.2.1, for comparison with the direct solution proposed in Section 3.2.2. The complete procedure for orthogonal rotation in more than two dimensions is given in Section 3.3. In Section 4.1, a simulation study compares the computational time of the proposed rotation procedure with that of the rotation procedure based on Kiers (1991). In Section 4.2, a real data application illustrates the interest of rotation in MCA and shows some of the outputs and graphical representations available in the R package “PCAmixdata” we have developed.

2 The PCAMIX method

Let us first introduce some notation used in the presentation of the PCAMIX method.
• Let $n$ denote the number of observation units, $p_1$ the number of quantitative variables, $p_2$ the number of qualitative variables and $p = p_1 + p_2$ the total number of variables.
• Let $z_j$ be the column vector which contains the standardized scores of the $n$ objects on variable $j$ if the $j$-th variable is quantitative.
• Let $G_j$ be the indicator matrix for the variable $j$ if the $j$-th variable is qualitative and let $D_j$ be the diagonal matrix of frequencies of the categories of this variable.
• Let us denote by $m$ the total number of categories of the $p_2$ qualitative variables.
• Let $G = (G_1 | \cdots | G_j | \cdots | G_{p_2})$ be the $n \times m$ matrix of the indicator variables of the $m$ categories of the $p_2$ qualitative variables and let $D = \mathrm{diag}(D_1, \dots, D_j, \dots, D_{p_2})$ be the $m \times m$ diagonal matrix of frequencies of the $m$ categories.
• Let $J = I_n - \mathbf{1}\mathbf{1}'/n$ be the centering operator, where $I_n$ denotes the $n \times n$ identity matrix and $\mathbf{1}$ the vector of order $n$ with unit entries.

In the two following subsections, we give two formulations of the PCAMIX method and highlight their main differences.
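As an illustration of this notation, the following minimal Python sketch builds the indicator matrix $G_j$ of one qualitative variable and the centered, rescaled columns $J G_j D_j^{-1/2}$. It is only a sketch under stated assumptions: the function names and the toy labels are hypothetical, and $D_j$ is assumed to hold relative frequencies $n_s/n$ (with counts, the columns would differ by a factor $\sqrt{n}$).

```python
import math

def indicator_matrix(labels):
    """Return (categories, G) with G[i][s] = 1 if unit i belongs to category s."""
    categories = sorted(set(labels))
    G = [[1.0 if lab == cat else 0.0 for cat in categories] for lab in labels]
    return categories, G

def recode_qualitative(labels):
    """Columns of J G D^{-1/2} for one qualitative variable.

    Assumption (not fixed by the text): D holds relative frequencies n_s / n.
    """
    n = len(labels)
    categories, G = indicator_matrix(labels)
    n_cat = len(categories)
    freqs = [sum(row[s] for row in G) / n for s in range(n_cat)]
    # Applying J = I - 11'/n subtracts the column mean, which for an
    # indicator column is exactly the relative frequency of the category.
    Z2 = [[(row[s] - freqs[s]) / math.sqrt(freqs[s]) for s in range(n_cat)]
          for row in G]
    return categories, freqs, Z2

# Hypothetical toy variable with three categories observed on six units
labels = ["a", "b", "a", "c", "b", "a"]
categories, freqs, Z2 = recode_qualitative(labels)
col_sums = [sum(row[s] for row in Z2) for s in range(len(categories))]
```

Each recoded column is centered by construction, which is what the operator $J$ guarantees.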

2.1 The original PCAMIX procedure

Suppose $k$ is the number of components required in PCAMIX. In Kiers (1991), the procedure computes the $n \times k$ matrix $X$ of the standardized component scores, the variance of each component and the $p \times k$ matrix $C$ of the squared loadings. The squared loadings are defined as squared correlations for quantitative variables and as correlation ratios for qualitative variables. This procedure is carried out according to the following steps:

1. For $j = 1, \dots, p$: calculate the so-called $n \times n$ quantification matrix $S_j$ with

$$ S_j = \begin{cases} \frac{1}{n} z_j z_j' & \text{if variable } j \text{ is quantitative,} \\ J G_j D_j^{-1} G_j' J & \text{if variable } j \text{ is qualitative.} \end{cases} $$

2. Calculate the $n \times n$ matrix $S = \sum_{j=1}^{p} S_j$.

3. Perform an Eigenvalue Decomposition of $S$. The matrix $X$ of the standardized component scores is given by the first $k$ eigenvectors of $S$ normalized to $n$ (such that $X'X = nI_k$).

4. For $l = 1, \dots, k$: calculate the variance of the $l$-th component, given by $x_l' S x_l$, where $x_l$ denotes the $l$-th column of $X$.

5. Calculate the matrix $C$ of the squared loadings of the $p$ variables on the $k$ components with $c_{jl} = \frac{1}{n} x_l' S_j x_l$. For quantitative (resp. qualitative) variables, $c_{jl}$ is the squared correlation (resp. correlation ratio) between the variable $j$ and the component $l$.

When all the variables are quantitative (resp. qualitative), this procedure is equivalent to PCA (resp. MCA). However, the loadings (the correlations between the variables and the components) and the principal coordinates of the categories (the barycenters of the component scores) are not directly provided and must be calculated afterwards if desired. From a practical point of view, this procedure requires the construction and the storage of $p$ matrices of dimension $n \times n$, which can lead to memory size problems when $n$ and $p$ increase.
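To make step 5 concrete for a qualitative variable, the sketch below checks numerically that $\frac{1}{n} x' S_j x$ coincides with the correlation ratio between a standardized score vector $x$ and the variable. The function names and toy data are hypothetical, and $D_j$ is assumed here to hold category counts (the reading under which $S_j$ is a projection matrix).

```python
import math

def standardize(x):
    """Center x and scale it to unit variance (variance computed with 1/n)."""
    n = len(x)
    mean = sum(x) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    return [(v - mean) / sd for v in x]

def quantification_matrix_qualitative(labels):
    """S_j = J G_j D_j^{-1} G_j' J, assuming D_j holds category counts."""
    n = len(labels)
    counts = {}
    for lab in labels:
        counts[lab] = counts.get(lab, 0) + 1
    # P = G D^{-1} G' has entry 1/n_s when two units share category s, else 0.
    P = [[1.0 / counts[a] if a == b else 0.0 for b in labels] for a in labels]
    # Every row of P sums to one and P is symmetric, so J P J = P - 11'/n.
    return [[P[i][h] - 1.0 / n for h in range(n)] for i in range(n)]

def quadratic_form(x, S):
    n = len(x)
    return sum(x[i] * S[i][h] * x[h] for i in range(n) for h in range(n))

def correlation_ratio(x, labels):
    """Between-category sum of squares divided by total sum of squares."""
    n = len(x)
    mean = sum(x) / n
    total = sum((v - mean) ** 2 for v in x)
    groups = {}
    for v, lab in zip(x, labels):
        groups.setdefault(lab, []).append(v)
    between = sum(len(g) * (sum(g) / len(g) - mean) ** 2 for g in groups.values())
    return between / total

# Hypothetical toy data: one score vector and one three-category variable
labels = ["a", "a", "b", "b", "c", "c", "c"]
x = standardize([1.0, 2.0, 4.0, 5.0, 9.0, 8.0, 7.5])
S_j = quantification_matrix_qualitative(labels)
c_from_S = quadratic_form(x, S_j) / len(x)
c_direct = correlation_ratio(x, labels)
```

The explicit $n \times n$ matrix built here is also a reminder of why this route becomes costly: one such matrix is needed per variable.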

2.2 The SVD based PCAMIX procedure

This procedure is carried out according to the following steps:

1. Determine the $n \times (p_1 + m)$ matrix of interest $Z = \frac{1}{\sqrt{n}} (Z_1 | Z_2)$ where:
• $Z_1 = (z_1 | \cdots | z_j | \cdots | z_{p_1})$ is the $n \times p_1$ matrix of the standardized scores of the $n$ observation units (objects) on the $p_1$ quantitative variables.
• $Z_2$ is the $n \times m$ matrix obtained by recoding $G$ in the following way: $Z_2 = J G D^{-1/2}$.

2. Perform the SVD of $Z$:

$$ Z = U \Lambda V', \tag{1} $$

where $U'U = V'V = I_r$, $\Lambda$ is the diagonal matrix of singular values (in weakly descending order) and $r$ is the rank of $Z$.

3. Calculate the $n \times k$ matrix of the standardized component scores:

$$ X = \sqrt{n}\, U_k \tag{2} $$

where $U_k$ denotes the matrix of the first $k$ columns of $U$.

4. For $l = 1, \dots, k$, the standard deviation of the $l$-th component is given by the $l$-th singular value in $\Lambda$.

5. Calculate the matrix:

$$ A = V_k \Lambda_k, \tag{3} $$

where $V_k$ denotes the matrix of the first $k$ columns of $V$ and $\Lambda_k$ the diagonal matrix of the $k$ largest singular values.

6. Write $A = \begin{pmatrix} A_1 \\ A_2 \end{pmatrix}$, the concatenation of a $p_1 \times k$ matrix $A_1$ and an $m \times k$ matrix $A_2$.
• The matrix $A_1$ contains the loadings of the quantitative variables (the correlations between the quantitative variables and the components).
• The matrix $DA_2$ contains the principal coordinates of the categories of the qualitative variables.
• Calculate the matrix $C$ of the squared loadings of the $p$ variables on the $k$ components. This matrix is obtained from the matrix $A$ as follows:

$$ c_{jl} = \begin{cases} a_{jl}^2 & \text{if variable } j \text{ is quantitative,} \\ \sum_{s \in I_j} a_{sl}^2 & \text{if variable } j \text{ is qualitative,} \end{cases} $$

where $I_j$ is the set of row indices of $A$ associated with the categories of the qualitative variable $j$. To simplify the notation, we write hereafter $c_{jl} = \sum_{s \in I_j} a_{sl}^2$ for both quantitative and qualitative variables, with $I_j = \{j\}$ in the quantitative case.

Note that the matrix $X$ of the standardized component scores is obtained from the SVD of the recoded data matrix $Z$, whereas it was obtained from the Eigenvalue Decomposition of the matrix $S$ (the sum of the quantification matrices $S_j$) in Kiers’ original approach. Also, the matrix $C$ of the squared loadings (squared correlations or correlation ratios between the variables and the components) is calculated here from the single matrix $A$ obtained with the SVD of $Z$, whereas it was calculated from the two matrices $X$ and $S_j$ in Kiers’ original approach.

Contrary to the original PCAMIX approach, this procedure simultaneously provides the loadings of the quantitative variables and the principal coordinates of the categories of the qualitative variables. Moreover, when the data are mixed (quantitative and qualitative), the well-known barycentric property of MCA remains true: the coordinates of the categories are the averages of the standardized component scores of the objects in those categories. The matrices $X$, $A_1$ and $DA_2$ are then used to plot the observation units, the quantitative variables and the categories with the same interpretation rules as in PCA and MCA. Matrix $C$ is used to plot the quantitative and qualitative variables on the same graphic.
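The last bullet of step 6 can be sketched in a few lines of Python. The matrix `A` below is a hypothetical example (it is not the output of an actual SVD), and the helper name `squared_loadings` is an assumption made for illustration; only the rule $c_{jl} = \sum_{s \in I_j} a_{sl}^2$ comes from the text.

```python
def squared_loadings(A, index_sets):
    """Return C with c_jl = sum over s in I_j of a_sl^2.

    A is a list of rows (p1 + m rows, k columns); index_sets[j] lists the
    rows of A attached to variable j (a single row for a quantitative
    variable, one row per category for a qualitative variable).
    """
    k = len(A[0])
    return [[sum(A[s][l] ** 2 for s in I_j) for l in range(k)]
            for I_j in index_sets]

# Hypothetical A: p1 = 2 quantitative rows followed by m = 3 category rows
A = [
    [0.8, 0.1],    # quantitative variable 1
    [0.3, -0.7],   # quantitative variable 2
    [0.5, 0.2],    # category 1 of the qualitative variable
    [-0.4, 0.1],   # category 2
    [-0.1, -0.3],  # category 3
]
index_sets = [[0], [1], [2, 3, 4]]
C = squared_loadings(A, index_sets)
```

Using one index set per variable, with a singleton for quantitative variables, mirrors the unified notation $I_j = \{j\}$ introduced above.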

3 Varimax rotation in PCAMIX

3.1 The optimization problem

Why use rotation? As shown by Eckart and Young (1936), from the SVD in (1) and the definitions of the matrices $X$ and $A$ given in (2) and (3), the matrix $XA'$ is a rank $k$ least squares approximation of $Z$. Let us introduce $T$, an orthonormal rotation matrix: $TT' = T'T = I_k$. Let $\tilde{X} = XT$ and $\tilde{A} = AT$. As $XA' = \tilde{X}\tilde{A}'$, this approximation is not unique over orthogonal rotations. This non-uniqueness can be exploited to improve the interpretability of the original solution. To simplify the interpretation, the matrices $X$ and $A$ are rotated in such a way that, for each variable, few squared loadings are large (close to 1) and as many as possible are close to zero.

The varimax problem. In PCA, since $\tilde{A}$ contains the loadings of the variables after rotation, the varimax rotation problem is formulated as

$$ \max_{T} f(T), \quad \text{s.t. } TT' = T'T = I_k, \tag{4} $$

where

$$ f(T) = \sum_{l=1}^{k} \sum_{j=1}^{p} \left( \tilde{a}_{jl}^2 \right)^2 - \frac{1}{p} \sum_{l=1}^{k} \left( \sum_{j=1}^{p} \tilde{a}_{jl}^2 \right)^2 \tag{5} $$

is the varimax function measuring the simplicity of the components after rotation. In the SVD approach of PCAMIX, the varimax function $f$ is defined by replacing in (5) the terms $\tilde{a}_{jl}^2$ by $\tilde{c}_{jl}$, where the $\tilde{c}_{jl} = \sum_{s \in I_j} \tilde{a}_{sl}^2$ are the squared loadings after rotation:

$$ f(T) = \sum_{l=1}^{k} \sum_{j=1}^{p} \left( \tilde{c}_{jl} \right)^2 - \frac{1}{p} \sum_{l=1}^{k} \left( \sum_{j=1}^{p} \tilde{c}_{jl} \right)^2. \tag{6} $$

Note that the squared loadings after rotation $\tilde{c}_{jl}$ are squared correlations (resp. correlation ratios) between the quantitative (resp. qualitative) variables and the rotated components. For comparison purposes, we recall Kiers’ original expression of the varimax function in PCAMIX: the squared loadings after rotation $\tilde{c}_{jl}$ are given by $\frac{1}{n} \tilde{x}_l' S_j \tilde{x}_l$, where $\tilde{x}_l$ denotes the $l$-th column of $\tilde{X}$. Hence the varimax function (6) becomes:

$$ f(T) = \sum_{l=1}^{k} \sum_{j=1}^{p} \left( \frac{1}{n} \tilde{x}_l' S_j \tilde{x}_l \right)^2 - \frac{1}{p} \sum_{l=1}^{k} \left( \sum_{j=1}^{p} \frac{1}{n} \tilde{x}_l' S_j \tilde{x}_l \right)^2. \tag{7} $$
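A small Python sketch of the criterion (6), evaluated directly on a matrix of squared loadings, may help fix ideas. The function name and the two toy matrices are hypothetical: `simple` has one large squared loading per variable, whereas `diffuse` spreads each variable evenly over the two components.

```python
def varimax_criterion(C):
    """f = sum_l [ sum_j c_jl^2 - (1/p) (sum_j c_jl)^2 ], as in (6).

    C is a list of p rows (variables) and k columns (components).
    """
    p = len(C)
    k = len(C[0])
    total = 0.0
    for l in range(k):
        column = [C[j][l] for j in range(p)]
        total += sum(c * c for c in column) - sum(column) ** 2 / p
    return total

# Hypothetical squared loadings for p = 4 variables and k = 2 components
simple = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
diffuse = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]
f_simple = varimax_criterion(simple)
f_diffuse = varimax_criterion(diffuse)
```

For each component, the bracketed term is $p$ times the variance of the squared loadings in that column, so the simple structure scores higher than the diffuse one.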

The iterative optimization procedure. Because a direct solution for the optimal $T$ is not available, the iterative optimization procedure suggested by Kaiser (1958) for PCA can be used for PCAMIX. The idea is to consider at each iteration a planar rotation, for which the rotation matrix $T$ depends only on an angle θ (see below for details). This procedure rotates pairs of dimensions in the following way: the single-plane rotations are applied to dimensions 1 and 2, 1 and 3, . . ., 1 and k, 2 and 3, . . ., (k − 1) and k, iteratively until the process converges, i.e. until k(k − 1)/2 successive rotations yield an angle of rotation equal to zero. The key point of this rotation procedure is the definition of the single-plane rotation step. We next give details on the calculation of the optimal angle for planar rotation. Then we give the complete iterative procedure for rotation in more than two dimensions.

3.2 Planar rotation

Single planar rotations are obtained with a rotation matrix $T$ defined by

$$ T = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \tag{8} $$

where θ is the angle of rotation. The varimax rotation problem (4) is then rewritten as:

$$ \max_{\theta \in \mathbb{R}} f(\theta). $$

For purposes of comparison, we first recall the solution based on Kiers’ matrix reformulation before giving our direct solution.

3.2.1 Planar rotation using Kiers’ matrix reformulation

Kiers (1991) proposes to use a procedure of simultaneous diagonalization of a set of symmetric matrices (ten Berge, 1984; de Leeuw and Pruzansky, 1978) to solve the global varimax optimization problem (4). For that purpose he gives the following matrix reformulation of the formula (7) giving $f$:

$$ f(T) = p^{-2} \sum_{j=1}^{p} \mathrm{Trace}\left( T' E_j T \, (\mathrm{Diag}\, T' E_j T) \right) \tag{9} $$

where

$$ E_j = p\, X' S_j X - n\Gamma \tag{10} $$

and $\Gamma$ is the diagonal matrix with the first $k$ eigenvalues of $S$ on its diagonal. Careful reading of ten Berge (1984) and de Leeuw and Pruzansky (1978) shows that the procedure for simultaneous diagonalization of the matrices $E_j$ is equivalent to Kaiser’s iterative optimization procedure with the optimal angle θ of single plane rotations defined by the equation:

$$ \tan(4\theta) = \frac{a}{b}, \tag{11} $$

where

$$ a = 4 \sum_{j=1}^{p} e^{j}_{12} \left( e^{j}_{11} - e^{j}_{22} \right) \quad \text{and} \quad b = \sum_{j=1}^{p} \left( e^{j}_{11} - e^{j}_{22} \right)^2 - 4 \sum_{j=1}^{p} \left( e^{j}_{12} \right)^2 \tag{12} $$

and $E_j = \begin{pmatrix} e^{j}_{11} & e^{j}_{12} \\ e^{j}_{21} & e^{j}_{22} \end{pmatrix}$ is defined in (10).

As mentioned by several authors (see for instance Nevels, 1986; ten Berge, 1984; de Leeuw and Pruzansky, 1978 and Kaiser, 1958), equation (11) is only a necessary condition, obtained by setting the first order derivative of the objective function to zero. Both Kaiser (1958) and de Leeuw and Pruzansky (1978) developed a procedure for determining the optimal θ from the sign of the second order derivative of the objective function. These two procedures, expressed in tabular form, give the appropriate solution for every possible combination of signs of $a$ and $b$.

3.2.2 Planar rotation using the SVD approach of PCAMIX

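The varimax function $f(T)$ defined with the SVD approach in (6) is written:

$$ f(\theta) = \sum_{j=1}^{p} \left( \sum_{s \in I_j} \tilde{a}_{s1}^2 \right)^2 + \sum_{j=1}^{p} \left( \sum_{s \in I_j} \tilde{a}_{s2}^2 \right)^2 - \frac{1}{p} \left( \sum_{j=1}^{p} \sum_{s \in I_j} \tilde{a}_{s1}^2 \right)^2 - \frac{1}{p} \left( \sum_{j=1}^{p} \sum_{s \in I_j} \tilde{a}_{s2}^2 \right)^2 \tag{13} $$

with

$$ \tilde{a}_{s1} = a_{s1} \cos(\theta) + a_{s2} \sin(\theta) \quad \text{and} \quad \tilde{a}_{s2} = -a_{s1} \sin(\theta) + a_{s2} \cos(\theta). \tag{14} $$

This function is equal to (see Appendix):

$$ f(\theta) = f(0) + \frac{\rho}{4p} \left( \cos(4\theta - \psi) - \cos\psi \right) \tag{15} $$

where $\rho$ and $\psi$ are defined by:

$$ \rho = (a^2 + b^2)^{1/2}, \quad \cos\psi = b/\rho, \quad \sin\psi = a/\rho \tag{16} $$

with $a$ and $b$ given by:

$$ a = 2p \sum_{j=1}^{p} u_j v_j - 2 \sum_{j=1}^{p} u_j \sum_{j=1}^{p} v_j, \quad b = p \sum_{j=1}^{p} \left( u_j^2 - v_j^2 \right) - \left( \sum_{j=1}^{p} u_j \right)^2 + \left( \sum_{j=1}^{p} v_j \right)^2, \tag{17} $$

where $u_j$ and $v_j$ are defined by:

$$ u_j = \sum_{s \in I_j} \left( a_{s1}^2 - a_{s2}^2 \right) \quad \text{and} \quad v_j = 2 \sum_{s \in I_j} a_{s1} a_{s2}. \tag{18} $$

The function $f$ obtained in (15) is maximum for $\cos(4\theta - \psi) = 1 \Leftrightarrow 4\theta - \psi = 2k\pi$, thus the optimal angles are:

$$ \theta = \frac{\psi}{4} + k\frac{\pi}{2}, \quad k \in \mathbb{Z}. \tag{19} $$

Note that the above expressions of $u_j$ and $v_j$ contain as special cases (take $I_j = \{j\}$) those defined by Kaiser (1958) for the PCA varimax solution. Note also that the classical necessary condition (11) immediately follows by setting the expression (23) of $p f'(\theta)$ given in the Appendix to zero (the coefficients $b$ and $a$ given by (12) on one side, and by (17) and (18) on the other side, are proportional).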
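The direct solution of (15) to (19) can be checked numerically. The Python sketch below computes $u_j$, $v_j$, $a$, $b$ and the angle $\theta = \psi/4$ for a hypothetical two-column matrix `A`, then compares the value of the criterion after rotation with the value predicted by (15). The function names and toy values are assumptions for illustration, and `math.atan2(a, b)` is used as a convenient way to obtain the angle $\psi$ with $\cos\psi = b/\rho$ and $\sin\psi = a/\rho$.

```python
import math

def planar_coefficients(A, index_sets):
    """Coefficients a and b of (17), with u_j and v_j as in (18)."""
    p = len(index_sets)
    u = [sum(A[s][0] ** 2 - A[s][1] ** 2 for s in I_j) for I_j in index_sets]
    v = [2.0 * sum(A[s][0] * A[s][1] for s in I_j) for I_j in index_sets]
    a = 2.0 * p * sum(x * y for x, y in zip(u, v)) - 2.0 * sum(u) * sum(v)
    b = p * sum(x * x - y * y for x, y in zip(u, v)) - sum(u) ** 2 + sum(v) ** 2
    return a, b

def rotate(A, theta):
    """Apply the planar rotation (14) to the two columns of A."""
    c, s = math.cos(theta), math.sin(theta)
    return [[r[0] * c + r[1] * s, -r[0] * s + r[1] * c] for r in A]

def varimax(A, index_sets):
    """Varimax function (13) for a two-column matrix A."""
    p = len(index_sets)
    total = 0.0
    for l in range(2):
        col = [sum(A[s][l] ** 2 for s in I_j) for I_j in index_sets]
        total += sum(c * c for c in col) - sum(col) ** 2 / p
    return total

# Hypothetical A: two quantitative rows, then three categories of one variable
A = [[0.8, 0.3], [0.6, -0.5], [0.5, 0.2], [-0.4, 0.1], [-0.1, -0.3]]
index_sets = [[0], [1], [2, 3, 4]]
p = len(index_sets)

a, b = planar_coefficients(A, index_sets)
rho = math.hypot(a, b)
psi = math.atan2(a, b)          # cos(psi) = b / rho, sin(psi) = a / rho
theta = psi / 4.0               # optimal angle (19) with k = 0

f0 = varimax(A, index_sets)
f_opt = varimax(rotate(A, theta), index_sets)
predicted = f0 + rho / (4.0 * p) * (1.0 - math.cos(psi))  # (15) at 4*theta = psi
```

Since (15) holds for every θ, the same helpers can be used to scan a grid of angles and confirm that none of them improves on `f_opt`.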

3.3 The iterative rotation procedure

We now consider the case where the number $k$ of dimensions in the rotation is greater than two. The iterative rotation procedure gives the matrix $\tilde{X}$ of the rotated standardized component scores and the matrix $\tilde{A}$, which is used to obtain the rotated squared loadings, the rotated loadings (correlations) of the quantitative variables and the rotated principal coordinates of the categories. This procedure is carried out according to the following steps:

1. Initialization: $\tilde{X} = X$ and $\tilde{A} = A$, where the $n \times k$ matrix $X$ and the $(p_1 + m) \times k$ matrix $A$ are given by the SVD based PCAMIX procedure of Section 2.2.

2. For $l = 1, \dots, k - 1$ and $t = (l + 1), \dots, k$, calculate for the pair of dimensions $(l, t)$:
- the angle of rotation $\theta = \psi/4$ with $\psi$ defined in (16). We choose:

$$ \psi = \begin{cases} \arccos\left( \dfrac{b}{\sqrt{a^2 + b^2}} \right) & \text{if } a \geq 0, \\ -\arccos\left( \dfrac{b}{\sqrt{a^2 + b^2}} \right) & \text{if } a \leq 0, \end{cases} \tag{20} $$

where $a$ and $b$ are defined in (17);
- the matrix of rotation $T = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}$;
- the matrices $\tilde{X}$ and $\tilde{A}$ updated by rotation of their $l$-th and $t$-th columns.

3. Repeat the previous step until the $k(k - 1)/2$ angles θ are equal to zero.

4. Calculate:
- the matrix $\tilde{C}$ with $\tilde{c}_{jl} = \sum_{s \in I_j} \tilde{a}_{sl}^2$;
- the matrix $\tilde{A}_1$ of the first $p_1$ rows of $\tilde{A}$, which contains the rotated loadings of the quantitative variables;
- the matrix $\tilde{A}_2$ of the last $m$ rows of $\tilde{A}$ and the matrix $D\tilde{A}_2$, which contains the rotated principal coordinates of the categories of the qualitative variables.

The main differences between this procedure and that constructed with Kiers’ matrix reformulation are the following:
• The expressions of $a$ and $b$ in step 2: in this procedure they are expressed in terms of the matrix $A$ of dimension $(p_1 + m) \times k$, where $p_1$ is the number of quantitative variables and $m$ is the total number of categories. With Kiers’ matrix reformulation, $a$ and $b$ are expressed in terms of the $p$ matrices $S_j$ of dimension $n \times n$, whose calculation and storage may be time and space consuming.
• The direct determination of the optimal angle in step 2. Having an explicit expression for the solution is of theoretical interest and is more straightforward from a computational point of view.
• The outputs: this procedure directly provides the rotated loadings of the quantitative variables and the rotated principal coordinates of the categories, which are used for graphical representations after rotation.
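Steps 1 to 4 can be sketched in plain Python as follows. This is a hedged illustration, not the code of the “PCAmixdata” package: the function names, the tolerance `tol`, the safeguard `max_sweeps` and the toy matrices are all assumptions, and "equal to zero" in step 3 is replaced by "smaller than `tol` in absolute value". The angle is obtained with `math.atan2(a, b) / 4`, which gives an angle $\psi$ satisfying (16).

```python
import math

def optimal_angle(A, index_sets, l, t):
    """Angle theta = psi / 4 for the pair of columns (l, t), from (16)-(18)."""
    p = len(index_sets)
    u = [sum(A[s][l] ** 2 - A[s][t] ** 2 for s in I_j) for I_j in index_sets]
    v = [2.0 * sum(A[s][l] * A[s][t] for s in I_j) for I_j in index_sets]
    a = 2.0 * p * sum(x * y for x, y in zip(u, v)) - 2.0 * sum(u) * sum(v)
    b = p * sum(x * x - y * y for x, y in zip(u, v)) - sum(u) ** 2 + sum(v) ** 2
    return math.atan2(a, b) / 4.0

def rotate_columns(M, l, t, theta):
    """Rotate columns l and t of M in place with the planar matrix T of (8)."""
    c, s = math.cos(theta), math.sin(theta)
    for row in M:
        row[l], row[t] = row[l] * c + row[t] * s, -row[l] * s + row[t] * c

def varimax_rotation(X, A, index_sets, tol=1e-10, max_sweeps=100):
    """Cyclic planar rotations over all pairs of columns (steps 1 to 4)."""
    X = [row[:] for row in X]          # step 1: work on copies of X and A
    A = [row[:] for row in A]
    k = len(A[0])
    for _ in range(max_sweeps):        # max_sweeps is a hypothetical safeguard
        largest = 0.0
        for l in range(k - 1):         # step 2: pairs (l, t) with l < t
            for t in range(l + 1, k):
                theta = optimal_angle(A, index_sets, l, t)
                rotate_columns(A, l, t, theta)
                rotate_columns(X, l, t, theta)
                largest = max(largest, abs(theta))
        if largest < tol:              # step 3: all angles numerically zero
            break
    # step 4: rotated squared loadings
    C = [[sum(A[s][l] ** 2 for s in I_j) for l in range(k)] for I_j in index_sets]
    return X, A, C

# Hypothetical inputs with k = 3: A has 2 quantitative rows and 3 category rows
A = [[0.7, 0.3, 0.2], [0.5, -0.6, 0.1], [0.4, 0.2, -0.3],
     [-0.3, 0.1, 0.4], [-0.1, -0.3, -0.1]]
X = [[1.0, 0.5, -0.2], [-0.5, 1.2, 0.3], [0.3, -0.7, 1.1], [-0.8, -1.0, -1.2]]
index_sets = [[0], [1], [2, 3, 4]]
X_rot, A_rot, C_rot = varimax_rotation(X, A, index_sets)
```

Because $X$ and $A$ are rotated by the same orthogonal matrices, the product $XA'$ and the row sums of the squared loadings are left unchanged; only their distribution over the components changes.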

4 Numerical studies

The procedure proposed in this paper for varimax orthogonal rotation in PCAMIX has been implemented in R. A package called “PCAmixdata” is already available on the CRAN website. In this section, this algorithm is compared on simulated data with Kiers’ rotation procedure. Then an application on a real data example illustrates the possible benefits of using rotation in MCA as a particular case of PCAMIX.

4.1 A simulation study: comparison of computational times

An iterative rotation procedure based on Kiers’ matrix reformulation has also been implemented in R. This procedure is that proposed in Section 3.3 with the following modifications:
• Kiers’ original PCAMIX procedure is used in the initialization step in place of the SVD based PCAMIX procedure.
• All the calculations and outputs based on the matrix $A$ are removed because this matrix is not part of the original PCAMIX procedure.
• The coefficients $a$ and $b$ in step 2 are calculated according to their expressions (12) associated with Kiers’ matrix reformulation. Note that the ratio $a/b$ is the same with the two approaches (SVD and matrix reformulation), so the optimal angle θ is the same.
• In step 4 the squared loadings are calculated with their expression in the original PCAMIX approach.

The computation times of the two rotation procedures (the one based on Kiers’ matrix reformulation and the one based on the SVD approach of PCAMIX) are compared on simulated datasets with varying parameters: the number $p$ of variables ($p/2$ quantitative and $p/2$ qualitative) and the number $n$ of observations. For each set of parameters $(n, p)$, 20 simulations are drawn. More precisely, the datasets are built using the following procedure:
• A dataset with $n$ observations and $p$ variables is drawn from a multivariate normal distribution with covariance matrix $\Sigma = Q'Q$, where $Q$ is a $p \times p$ matrix drawn from a uniform distribution on the interval [0.2; 0.4].
• The last $p/2$ variables are discretized into three equal-count categories.

Each dataset thus consists of $p_1 = p/2$ quantitative variables and $p_2 = p/2$ qualitative variables, and the total number of categories is $m = 3p/2$.

Because the two rotation procedures iterate planar rotations until convergence, we compare their computation times for $k = 2$. The median computation times (over the 20 replications) are given in Table 1 and the ratios between the computation times of the two approaches are given in Table 2. Table 1 shows that the SVD approach is faster than the matrix reformulation approach for all configurations. For configurations where $p = 10$, Table 2 shows that the SVD approach ranges from 3 times faster for $n = 50$ to 214 times faster for $n = 800$. For configurations with greater values of $p$, this ratio is smaller but still increases with $n$. For the configuration where $n$ and $p$ are largest ($n = 800$ and $p = 200$), an error occurs with the rotation procedure based on Kiers’ matrix reformulation. The maximum memory capacity of the computer was reached in that case. This error occurs during the calculation of the $p$ matrices $S_j$ of size $n \times n$. This confirms the computational efficiency of the proposed SVD approach.

n      Method                 p=10    p=50    p=100   p=200
50     Matrix reformulation   0.05    0.12    0.22    0.44
50     SVD                    0.02    0.06    0.12    0.27
100    Matrix reformulation   0.14    0.33    0.56    1.04
100    SVD                    0.02    0.09    0.17    0.34
200    Matrix reformulation   0.55    1.12    1.86    3.38
200    SVD                    0.02    0.11    0.26    0.53
400    Matrix reformulation   2.15    4.32    7.1     12.65
400    SVD                    0.03    0.16    0.37    0.89
800    Matrix reformulation   10.06   19.27   30.54   error
800    SVD                    0.05    0.25    0.58    1.79

Table 1: Median computation time (in seconds) of two PCAMIX rotation procedures: the one based on Kiers’ matrix reformulation and the one based on the SVD approach.

n      p=10    p=50    p=100   p=200
50     2.9     2.0     1.8     1.6
100    8.7     3.8     3.3     3.0
200    23.2    10.3    7.0     6.4
400    69.4    27.7    19.0    14.2
800    214.1   77.4    52.9    error

Table 2: Ratio between the median computation time of the two rotation procedures (Matrix reformulation/SVD).
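The simulation design of this section could be sketched as follows in Python. This is only an illustration of the data-generating procedure, not the R code used for the study: the function names and the `seed` parameter are hypothetical, and the normal distribution is assumed to have zero mean, which the text does not specify.

```python
import random

def cut_equal_count(values, n_categories):
    """Recode a numeric column into categories 0..n_categories-1 of equal
    count (up to rounding), based on the ranks of the values."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * n_categories // n
    return labels

def simulate_mixed_data(n, p, seed=0):
    """Return (quantitative, qualitative): two lists of p/2 columns each."""
    rng = random.Random(seed)
    # Q is p x p with entries uniform on [0.2, 0.4]
    Q = [[rng.uniform(0.2, 0.4) for _ in range(p)] for _ in range(p)]
    rows = []
    for _ in range(n):
        # A row x = z Q with z ~ N(0, I_p) has covariance matrix Q'Q
        z = [rng.gauss(0.0, 1.0) for _ in range(p)]
        rows.append([sum(z[h] * Q[h][j] for h in range(p)) for j in range(p)])
    columns = [[row[j] for row in rows] for j in range(p)]
    half = p // 2
    quantitative = columns[:half]
    # The last p/2 variables are cut into three equal-count categories
    qualitative = [cut_equal_count(col, 3) for col in columns[half:]]
    return quantitative, qualitative

quantitative, qualitative = simulate_mixed_data(n=60, p=10, seed=1)
total_categories = sum(len(set(col)) for col in qualitative)
```

With $p = 10$, this yields $p_1 = 5$ quantitative variables, $p_2 = 5$ qualitative variables and $m = 15$ categories, matching $m = 3p/2$.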

4.2 A real data application

This real data application illustrates the interest of rotation in MCA. A food habits survey¹ was carried out in 1999 on students living in the “Aquitaine” region in the south-west of France. We focus on the answers of 2885 students to 12 binary questions concerning their consumption at breakfast (coffee, cereals, eggs...). The PCAMIX method (equivalent here to MCA) has been applied to this dataset and the first 4 components have been rotated. In Figure 1, the association of the variables with the first two components is clearly easier to read after rotation. This rotation of the first four components leads in Table 3 to clear associations between the binary variables: coffee is associated with milk, eggs with cheese and deli, bread with jam and cereals with pure milk. The effect of the rotation on the objects’ scores and on the categories’ coordinates can also be visualized in Figures 2 and 3. The interpretation rule associated with the barycentric property remains true after rotation.

¹ This survey was realized by the Bordeaux School of Public Health (Institut de Santé Publique, d’Epidémiologie et de Développement - ISPED).

                  Before rotation               After rotation
                  1      2      3      4        1      2      3      4
coffee            0.23   0.22   0.06   0.05     0.49   0.00   0.00   0.07
tea               0.05   0.01   0.06   0.18     0.05   0.02   0.06   0.17
milk              0.15   0.16   0.08   0.00     0.37   0.00   0.01   0.01
milk chocolate    0.43   0.18   0.01   0.06     0.62   0.00   0.01   0.05
pure milk         0.02   0.00   0.05   0.40     0.01   0.00   0.02   0.44
cheese            0.18   0.23   0.01   0.00     0.00   0.42   0.00   0.00
deli              0.20   0.27   0.00   0.05     0.00   0.51   0.00   0.01
eggs              0.20   0.37   0.00   0.01     0.00   0.58   0.00   0.00
jam               0.06   0.02   0.49   0.02     0.00   0.00   0.59   0.00
honey             0.00   0.05   0.14   0.20     0.00   0.02   0.20   0.16
bread             0.11   0.01   0.45   0.00     0.01   0.01   0.53   0.03
cereals           0.01   0.01   0.12   0.22     0.05   0.01   0.04   0.27

Table 3: Correlation ratios (squared loadings) between the variables and the first 4 components before and after rotation.

[Figure 1 (image not reproduced): two panels, “Correlation ratios before rotation” (axes: Dimension 1, Dimension 2) and “Correlation ratios after rotation” (axes: Dimension 1 after rotation, Dimension 2 after rotation), showing the variables cafe, tea, milk, milkchoc, puremilk, cheese, deli, eggs, jam, honey, bread and cereals.]

Figure 1: Plots of the correlation ratios between the variables and the first two components before rotation and after rotation.

Note that for binary variables MCA and PCA lead to equivalent object scores and squared loadings (squared correlations are equal to correlation ratios). Considering the data as quantitative in PCAMIX (equivalent to PCA in that case) therefore gives the same results, except for the plots of the categories, which are not defined in that case.
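The equality between squared correlations and correlation ratios for binary variables can be checked on a small example. In the Python sketch below, the score vector `x` and the binary answers are hypothetical toy data, and the two helper functions are written only for this illustration.

```python
def squared_correlation(x, y):
    """Squared Pearson correlation between two numeric vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def correlation_ratio(x, labels):
    """Between-category sum of squares divided by total sum of squares."""
    n = len(x)
    mean = sum(x) / n
    total = sum((v - mean) ** 2 for v in x)
    groups = {}
    for v, lab in zip(x, labels):
        groups.setdefault(lab, []).append(v)
    between = sum(len(g) * (sum(g) / len(g) - mean) ** 2 for g in groups.values())
    return between / total

# Hypothetical component scores and answers to one binary question
x = [0.3, -1.2, 0.8, 1.5, -0.4, -1.0, 0.9, 0.1]
answers = [1, 0, 1, 1, 0, 0, 1, 0]

r2 = squared_correlation(x, [float(v) for v in answers])   # treated as quantitative
eta2 = correlation_ratio(x, answers)                       # treated as qualitative
```

Both quantities reduce to the same function of the two category means and counts, which is why treating a binary variable as quantitative or qualitative gives the same squared loading.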

5 Conclusion

We have given in this paper an SVD based formulation of the PCAMIX method. This new formulation leads to an efficient procedure for varimax rotation in PCAMIX where a direct solution for the optimal angle of rotation θ has been obtained. The numerical results have shown on simulations that this procedure is computationally more efficient than the

[Figure (image not reproduced): scatter plots of the objects’ scores, with panels “Scores before rotation” and “Scores after rotation”; individual point labels omitted.]
1680 1736 1758 1765 1797 1796 1808 1814 1821 1825 1840 1839 1867 1878 1888 1905 1904 1923 1927 1932 1952 1961 1964 1968 1987 1997 2032 2043 2079 2103 2106 2119 2121 2126 2132 2152 2176 2175 2193 2199 2271 2324 2326 2346 2349 2361 2368 2375 2388 2405 2451 2477 2510 2514 2536 2574 2590 2606 2619 2624 2627 2653 2678 2694 2728 2764 2770 2774 2807 2836 2839 2841 2849 2853 104 103 114 120 167 217 230 244 264 306 362 369 383 385 390 401 413 421 435 439 531 547 555 565 599 601 621 628 639 662 665 703 705 707 720 730 729 743 760 766 778 788 787 786 790 803 811 821 853 884 911 921 929 944 985 36 62 66 89 1223 1538 1537 1422 ●● ● 1017 1066 1065 1075 1094 1099 1159 1164 1191 1190 1256 1310 1309 1335 1374 1381 1380 1384 1411 1419 1438 1528 1527 1558 1612 1633 1760 1769 1788 1791 1823 1822 1837 1850 1883 1925 1933 1970 2066 2162 2166 2207 2215 2214 2220 2223 2239 2263 2272 2280 2286 2307 2345 2356 2359 2502 2511 2516 2523 2542 2602 2609 2621 2658 2669 2683 2703 2706 2717 2722 2782 2800 2848 2850 2861 130 136 155 169 315 399 448 485 499 581 683 750 795 810 840 839 866 882 881 892 901 939 988 23 42 71 1553 1630 1750 1848 1890 2183 2237 2321 2344 23472348 2679 187 1101 1136 1690 1728 1986 2019 2133 2277 2287 2312 2350 2745 2833 854 1275 1274 1005 1023 1115 1114 1221 1352 1682 1711 1921 1939 1943 2013 2015 2085 2107 2172 2180 2240 2301 2333 2390 2457 2478 2488 2522 2647 174 173 209 622 789 819 876 875 955 26 1009 1008 1755 1805 503 878 953 1059 1379 1459 1500 1698 1780 1812 1920 2081 2087 2109 2768 144 445 488 511 640 670 67 1361 1370 1506 1577 1742 2092 2256 2322 2354 2364 2470 2539 479 779780 8 67 613 2393 1134 1285 1771 1934 2212 2402 2708 176 349 384 492 962 2845 ● ● ● 1090 1432 1444 1714 1901 1971 2036 2117 2174 2296 2341 2726 108 188 374 409 491 597 635 785 3 7 6 ● ●● 1021 1020 1097 1096 1212 1211 1371 1638 1686 1696 2014 2033 2148 2273 2332 2403 2828 2840 333 388 671 754 777 946 978 ●1504 340 ●9 1173 1623 769 1554 1562 1652 709 15 381 1012 1045 1056 1055 1088 1100 1119 1133 
1140 1144 1143 1146 1171 1179 1181 1188 1198 1209 1214 1213 1219 1227 1243 1249 1290 1289 1311 1315 1314 1333 1343 1356 1365 1401 1427 1431 1434 1452 1468 1473 1477 1476 1503 1502 1514 1513 1522 1529 1532 1531 1534 1539 1571 1576 1586 1585 1602 1620 1626 1637 1671 1674 1709 1715 1730 1737 1753 1752 1776 1775 1789 1803 1818 1817 1824 1835 1844 1857 1869 1871 1892 1930 1936 1938 1942 1963 2035 2071 2096 2095 2094 2108 2112 2128 2127 2140 2154 2163 2178 2190 2197 2196 2232 2238 2244 2248 2253 2284 2304 2336 2335 2334 2360 2394 2396 2414 2429 2452 2461 2460 2483 2491 2490 2495 2506 2520 2529 2573 2581 2616 2626 2635 2665 2699 2701 2716 2778 2781 2787 2804 2803 2809 2827 2826 2867 2885 112 111 127 143 147 149 156 166 180 179 186 201 207 215 218 234 233 262 296 327 343 347 352 366 370 415 420 451 469 478 506 517 523 530 529 556 572 602 645 668 685 690 692 695 694 711 725 746 749 768 767 772 798 825 832 844 880 895 899 898 919 994 999 14 41 43 78 ● ● 1632 2089 2432 2576 726 ● 1126 1287 1375 1480 1479 1509 1621 1732 1749 1759 1773 1916 2018 2031 2143 2222 2252 2313 2372 2433 2527 2534 2562 2720 2780 2783 106 125 226 311 438 647 669 765 837 923 1057 1269 1440 1744 1820 1833 2056 2073 2308 2463 2550 2579 2821 248 375 816 34 8 ● 1768 1891 2016 2285 2730 2762 276 275 997 1034 1036 1074 1086 1110 1131 1163 1168 1167 1196 1226 1231 1334 1368 1404 1413 1446 1454 1496 1542 1548 1547 1552 1566 1572 1598 1610 1628 1635 1639 1647 1685 1689 1700 1706 1710 1738 1745 1777 1811 1846 1866 1881 1953 1972 1976 1991 1994 2072 2075 2082 2142 2161 2170 2234 2300 2339 2383 2391 2411 2440 2442 2475 2496 2512 2531 2556 2586 2597 2601 2600 2614 2655 2657 2662 2686 2707 2766 2796 2798 2818 2820 2865 116 138 137 172 204 208 210 225 257 285 303 312 351 356 376 382 398 411 414 419 431 440 457 459 464 474 495 532 536 560 575 626 644 677 696 718 724 747 758 784 806 809 830 848 906 912 917 928 959 970 19 38 83 82 81 2 ● 1170 1318 1360 1363 1693 1798 2046 2098 2136 2437 2577 2628 2677 26752676 2721 2731 
2785 2832 142 154 255 273 272 286 410 450 456 455 515 596 595 8 888 918 922 38 1340 2450 525 907 1122 2145 2366 2444 2479 860 995 1029 1174 1297 1344 1613 2358 2409 2554 2588 2837 243 247 429 447 852 908 ● 2042 2164 1259 1351 1350 1677 1687 1734 1751 1834 2434 2566 2651 282 405 569 855 864 10 91 1047 1046 1257 1272 1271 1467 1756 1828 2235 2325 2327 197 335 418 461 590 604 633 861 998 48 1000 1018 1052 1160 1326 1478 1486 1583 1589 1675 1707 1761 1864 1882 2026 2068 2088 2153 2206 2217 2310 2385 2412 2425 2596 2817 324 540 582 609 646 807 851 1533 2051 2168 2184 ● 849 496 522 842 94 1609 1651 1863 2011 2663 2711 133 406 35 59 1079 1305 1332 1962 2352 2417 1073 1091 1246 1463 1627 1731 1874 1885 2395 2513 2599 2623 2637 2729 2751 2810 101 109 119 252 260 292 397 501 520 592 657 676 699 723 820 903 905 909 956 979 46 1757 2038 2258 910 2419 1011 1077 1120 1201 1339 1338 1389 1433 1510 1517 1521 1703 1827 1914 2039 2093 2269 2650 358 442 902 1043 1204 1893 2008 2362 2592 256 475 554 632 1295 1321 1393 1447 1540 1565 1564 1733 1792 2055 2481 2519 2561 2645 2786 2793 107 271 412 434 519 964 18 47 1007 1150 1260 1282 1403 1407 1544 1549 1573 1595 1605 1673 1697 1747 1873 1929 1990 2012 2022 2045 2053 2060 2065 2102 2167 2216 2379 2420 2545 2547 2549 2587 2618 2673 2682 2687 2776 2814 2816 2873 2879 150 240 239 270 294 354 386 477 494 518 548 589 608 637 643 698 697 702 714 794 793 796 856 872 924 957 53 4 1128 1716 64 ●346 1740 618 651 751 1919 2191 865 ● ●● 1501 1782 1879 2010 2069 2157 2363 2526 2819 2113 2572 ●●● ● ●56 ●2328 2765 392 2221 228 227 325 404 1106 1218 1279 1300 2267 2401 2737 524 44 16 1237 1236 ●11 ● ● ● ●● ●●● ● 1616 1725 2120 2557 2610 2704 295 326 387 836 845 ● ●● ● 1203 1217 1341 1624 1793 1804 1877 2158 2584 2661 2691 2749 2823 123 185 253 453 508 966 ● ● 1053 1080 1283 1302 1301 1414 1617 1754 1868 1969 2567 2680 128 140 153 487 563 653 652 763 927 ● ● ● 1199 1615 1641 1660 2131 2281 2318 2415 2441 2497 2541 2605 2640 259 338 342 396 407 799 824 
967 974 13 85 2219 583 951 2218 2790 2825 ● 1169 1387 1543 1584 1643 1830 2564 2582 2670 2695 297 318 317 775 ● 1220 1398 1441 1582 1594 1662 1681 2187 2283 2311 2384 2813 110 221 266 728 735 800 818 20 1149 1525 1587 1656 1966 2874 372 741 783 ● 1669 545 1028 1117 1141 1895 1907 1945 1955 2009 2201 2204 2314 2371 2373 2620 2830 141 164 361 426 528 45 ● ● 1278 2059 557 1900 1526 1802 2438 465 890 77 ● ●●● ● 708 ● 2744 742 ● ●●● 161 ●●● ●● ●● ● ● ●● ● 1348 1718 ●● ● ●● ●● ●● ●●● ●● ● ●● ● ●●● ● ● ● ●● ● ●●● ●● ●

●● ● ●● ●

● ●● ●

−2



●● ●● ● ● ●

−2

1151 1216 2881 355 2603 ● 2552 2632 1481 1784 2435 904 ● 2524 400 1722 2422 ●2299 ● 731 ● ● 1058 ● ●●● 868 ● 2607 ●847● 1364 641 615 1238 1600 2625 2445 194 893 1127 658 52 51 158 2734 ● 1766 2139 2741 539 625 2189 ●●● ● ● 1954 ● ● ● ● 1014 ●● ● 2791 ● ● ● ●

0

Dimension 2 after rotation

4 2 −2

0

Dimension 2

6

90 1474 578 ● ● ● 1611 2598 1456 1116 1499 1498 2467 ● 1151 2472 ● ● ● ● 1216 893 2607 2881 194 158 1551 ● ● 2632 ●● ● 1238 ● ● 355 615 2741 400 1954 ● 2524 1600 ● 539 625 1722 2734 731 ● 1481 1784 2435 2552 904 ● ● ● 2299 ● ● ● ● 847 ●● 1058 2791 1014 ●● 641 1766 2139 868 ●● 658 52 51 2295 ● ● 513 ● ● 1397 2189 2603 2422 ● ● ● 886 2445 ● ● ● 1127 ● 678 ● 1025 87 2195 2471 588 2625 1364 631 ● 611 980 1663 2750 1337 2648 986 1915 2656 246 2735 ●● 2718 2431 5 1977 ●●2585 920 63 323 ● 1767 1590 591 673 ● ● ● ●1650 2746 1735 1251 462 ● 2037 ● ● 2340 ●17 2389 2878 1166 1165 135 883 1865 ● 2342 ● 293 ● 623 ●● ●● 171 585 1215 2447 2674 1329 526 1382 ●2589 2639 672 190 ● ● 2732 ●1038 1950 170 ● 298 ●300 152 395 737 ● 873 ● 367 1200 291 1684 2374 2636 2028 1608 1622 ●2392 2859 1437 1654 ● ● 2664 490 ● ● ●1308 ● 1138 213 31 2857 100 ● 1667 310 567 992 ● 2404 ● ● ● ● ● ● 2855 1569 ●2343 ●● ● 863 ● ●2615 1394 1909 2208 2378 330 538 941 1105 ● ● 279 ● 564 1225 ● 263 630 274 ● 1495 ●1897 ● 467 1691 2275 181 ●2323 ● 593 ● 341 835 1078 1328 2365 ●834 168 ● ● 2503 ● ● 1358 1176 1189 1492 1567 1634 2122 2194 2213 2262 2583 2666 2690 2738 206 265 281 460 562 561 620 661 991 1607 ● 1262 ● 2291 394 ● 603 ● 1068 1118 1197 1461 1678 2186 2288 2399 2421 2862 165 203 224 267 408 480 484 551 573 579 612 627 675 687 913 ●● ● 2788 1472 1224 1515 1579 1836 2052 2062 2135 2173 2376 2553 2714 2748 129 989 30 72 391 ●● 1786 ● ●● ● ● 1872 1142 1035 1450 1449 1695 2430 2631 232 516 1699 1183 1373 1466 1783 2129 2455 2521 219 817 ●1465 ● 1330 2611 2710 ●1353 ● 2006 ● 211 2182 1006 1178 2693 1071 1378 2302 2380 2630 212 353 943 ●2634 ●1980 1886 2104 1937 ● ●2652 1245 1013 1336 2423 319 329 360 753 ● 1252 1386 2231 850 1149 1525 1587 1656 1966 2874 372 741 783 1319 1842 2097 942 1416 316 1034 1036 1074 1086 1110 1131 1163 1168 1167 1196 1226 1231 1334 1368 1404 1413 1446 1454 1496 1542 1548 1547 1552 1566 1572 1598 1610 1628 1635 1639 1647 1685 1689 1700 1706 1710 1738 1745 1777 1811 1846 1866 1881 
1953 1972 1976 1991 1994 2072 2075 2082 2142 2161 2170 2234 2300 2339 2383 2391 2411 2440 2442 2475 2496 2512 2531 2556 2586 2597 2601 2600 2614 2655 2657 2662 2686 2707 2766 2796 27982799 2818 2820 2865 116 138 137 172 204 208 210 225 257 285 303 312 351 356 376 382 398 411 414 419 431 440 457 459 464 474 495 532 536 560 575 626 644 677 696 718 724 747 758 784 806 809 830 848 906 912 917 928 959 970 19 38 83 82 81 2 1346 2330 371 422 682 2767 302 ●98 ● 2105 2672 ● 1292 ● 827 1870 ● 2351 2852 446 1293 1843 2144 2171 2370 2489 831 2697 2113 2572 1940 1059 1379 1459 1500 1698 1780 1812 1920 2081 2087 2109 2768 144 445 488 511 640 670 67 1041 2147 2756 841 987 ●27 1533 2051 2168 2184 346 ●1202 97 1027 1031 1049 1062 1069 1112 1132 1154 1153 1222 1229 1235 1244 1294 1299 1323 1415 1424 1430 1455 1464 1471 1484 1523 1546 1550 1563 1578 1581 1599 1604 1618 1636 1645 1648 1712 1719 1726 1741 1746 1807 1832 1831 1841 1861 1860 1898 1941 1958 1981 1989 1988 1999 2002 2007 2023 2030 2050 2054 2061 2063 2083 2099 2110 2130 2181 2203 2227 2255 2264 2298 2338 2337 2353 2400 2448 2454 2466 2469 2480 2494 2500 2544 2559 2613 2612 2622 2643 2719 2723 2747 2761 2789 2794 2856 118 117 122 121 131 148 151 229 242 241 249 304 307 320 337 339 348 357 416 428 444 449 458 463 471 470 476 489 493 502 535 534 533 594 607 606 634 642 654 680 688 691 701 700 716 722 721 727 762 776 792 801 829 828 858 862 869 874 926 925 965 973 982 981 27 50 68 75 86 92 1 9 610 1089 1053 1080 1283 1302 1301 1414 1617 1754 1868 1969 2567 2680 128 140 153 487 563 653 652 7 9 63 1429 1644 261 2044 2453 1012 1045 1056 1055 1088 1100 1119 1133 1140 1144 1143 1146 1171 1179 1181 1188 1198 1209 1214 1213 1219 1227 1243 1249 1290 1289 1311 1315 1314 1333 1343 1356 1365 1401 1427 1431 1434 1452 1468 1473 1477 1476 1503 1502 1514 1513 1522 1529 1532 1531 1534 1539 1571 1576 1586 1585 1602 1620 1626 1637 1671 1674 1709 1715 1730 1737 1753 1752 1776 1775 1789 1803 1818 1817 1824 1835 1844 1857 1869 1871 1892 1930 1936 
1938 1942 1963 2035 2071 2096 2095 2094 2108 2112 2128 2127 2140 2154 2163 2178 2190 2197 2196 2232 2238 2244 2248 2253 2284 2304 2336 2335 2334 2360 2394 2396 2414 2429 2452 2461 2460 2483 2491 2490 2495 2506 2520 2529 2573 2581 2616 2626 2635 2665 2699 2701 2716 2778 2781 2787 28042805 2803 2809 2827 2826 2867 2885 112 111 127 143 147 149 156 166 180 179 186 201 207 215 218 234 233 262 296 327 343 347 352 366 370 415 420 451 469 478 506 517 523 530 529 556 572 602 645 668 685 690 692 695 694 711 725 746 749 768 767 772 798 825 832 844 880 895 899 898 919 994 999 14 41 43 78 1395 1418 37 ● ● ● 1536 ● ●●●● ● ● 2858 1261 2251 403 402 1265 1491 ●2386 205 368 1010 1396 1692 2249 2367 2698 2866 664 663 958 69 1043 1204 1893 2008 2362 2592 256 475 554 632 ● 1859 1858 541 ● 1015 1022 1024 1030 1064 1103 1108 1129 1184 1254 1253 1372 1383 1390 1399 1406 1410 1425 1428 1445 1475 1516 1591 1597 1614 1640 1642 1649 1661 1680 1736 1758 1765 1797 1796 1808 1814 1821 1825 1840 1839 1867 1878 1888 1905 1904 1923 1927 1932 1952 1961 1964 1968 1987 1997 2032 2043 2079 2103 2106 2119 2121 2126 2132 2152 2176 2175 2193 2199 2271 2324 2326 2346 2349 2361 2368 2375 2388 2405 2451 2477 2510 2514 2536 2574 2590 2606 2619 2624 2627 2653 2678 2694 2728 2764 2770 2774 2807 2836 2839 2841 2849 2853 104 103 114 120 167 217 230 244 264 306 362 369 383 385 390 401 413 421 435 439 531 547 555 565 599 601 621 628 639 662 665 703 705 707 720 730 729 743 760 766 778 788 787 786 790 803 811 821 853 884 911 921 929 944 985 36 62 66 89 ●● 1768 1891 2016 2285 2730 2762 276 275 997 2048 574 717 887 983 ● ● 258 1001 1016 1040 1039 1050 1063 1067 1084 1098 1121 1139 1158 1162 1194 1206 1228 1242 1241 1263 1268 1273 1277 1296 1320 1342 1367 1366 1402 1405 1421 1420 1426 1470 1469 1483 1482 1508 1507 1575 1596 1601 1625 1629 1653 1657 1694 1720 1770 1774 1787 1819 1838 1851 1853 1887 1912 1911 1910 1935 1944 1949 1951 1956 1959 1965 1982 1984 1992 1996 1995 2001 2003 2041 2058 2057 2091 2101 2111 2123 2125 
2151 2160 2179 2185 2192 2198 2209 2228 2230 2236 2261 2266 2297 2309 2316 2319 2331 2377 2387 2408 2407 2443 2459 2464 2476 2484 2530 2535 2540 2551 2555 2568 2604 2617 2633 2646 2659 2688 2733 2752 2755 2763 2775 2797 2801 2831 2835 2847 2854 2864 2880 102 124 134 145 159 183 189 192 195 220 231 236 235 238 251 283 332 334 350 365 389 393 425 427 436 466 500 510 537 544 543 542 549 566 568 598 605 617 624 629 636 650 649 648 656 655 666 693 732 755 761 771 770 774 782 826 857 871 870 889 897 896 900 931 935 937 961 968 976 975 996 12 22 24 32 49 55 57 65 76 79 84 88 1306 2169 ●2883 1422 1123 1033 1284 1391 1409 1408 1485 1570 1739 1978 2159 2406 2492 2668 2877 364 553 659 29 28 ●1439 2508 ●58 ● ● ● ● 1009 1008 1755 1805 503 878 953 ● ● 1003 1019 1032 1048 1051 1060 1076 1083 1085 1087 1172 1187 1192 1205 1233 1307 1324 1354 1357 1359 1412 1423 1443 1442 1448 1451 1493 1511 1535 1592 1631 1679 1688 1701 1708 1723 1800 1799 1813 1855 1854 1880 1889 1906 1924 1973 1975 2020 2040 2047 2070 2090 2150 2224 2233 2241 2260 2270 2274 2289 2303 2357 2456 2458 2474 2504 2515 2525 2533 2578 2649 2692 2705 2713 2725 2724 2739 2771 2777 2806 2808 2843 2860 2869 2872 105 157 184 191 200 216 290 289 299 301 305 331 373 473 482 486 497 552 577 638 706 713 715 734 740 739 745 744 756 759 773 802 805 823 846 936 960 54 60 80 1317 1327 ● ● ●● ● ● ● ● 984 ● ● 1545 ● 1113 328 667 ●613 1026 1061 1325 1347 1400 1555 1762 1847 2177 2242 2292 2382 2473 2509 2712 2773 132 202 280 287 550 684 21 99 1037 1234 1488 1487 1655 1778 2211 2210 2644 2772 198 278 843 945 971 1028 1117 1141 1895 1907 1945 1955 2009 2201 2204 2314 2371 2373 2620 2830 141 164 361 426 528 45 1093 2257 1029 1174 1297 1344 1613 2358 2409 2554 2588 2837 243 247 429 447 852 908 ● ● 441 ● ● ●73 454 559 ●● 2736 ● ● ●● 1276 1281 1666 1668 1790 1826 1957 2254 2279 2 2629 2769 126 686 748 940 498 228 227 325 1145 1388 ● 2845 ●404 1104 1157 1232 1665 1764 1763 1815 1829 1884 2025 2243 2580 2667 2685 250 345 797 954 ●2757 1609 
1651 1863 2011 2663 2711 133 406 35 59 1979 1072 1161 1186 1250 1362 1497 1520 1580 1672 1676 1779 1816 1894 1918 1931 1947 1985 2137 2156 2155 2226 2493 2507 2563 2642 2740 2743 2758 2812 2834 2846 2863 2870 2884 146 175 214 546 704 733 736 916 947 972 977 39 1199 1615 1641 1660 2131 2281 2318 2415 2441 2497 2541 2605 2640 259 338 342 396 407 799 824 967 974 13 85 2259 1126 1287 1375 1480 1479 1509 1621 1732 1749 1759 1773 1916 2018 2031 2143 2222 2252 2313 2372 2433 2527 2534 2562 2720 2780 2783 106 125 226 311 438 647 669 765 837 923 ●2221 2398 2393 757 ● 849 1079 1305 1332 1962 2352 2417 ● ● 2219 583 951 1107 2709 2871 452 2681 2876 719 1349 1519 1560 1559 1727 2124 2439 2822 2838 268 472 674 752 340 2570 443 ● 2074 2595 2868 660 879 1007 1150 1260 1282 1403 1407 1544 1549 1573 1595 1605 1673 1697 1747 1873 1929 1990 2012 2022 2045 2053 2060 2065 2102 2167 2216 2379 2420 2545 2547 2549 2587 2618 2673 2682 2687 2776 2814 2816 2873 2879 150 240 239 270 294 354 386 477 494 518 548 589 608 637 643 698 697 702 714 794 793 796 856 872 924 957 53 4 ● 1017 1066 1065 1075 1094 1099 1159 1164 1191 1190 1256 1310 1309 1335 1374 1381 1380 1384 1411 1419 1438 1528 1527 1558 1612 1633 1760 1769 1788 1791 1823 1822 1837 1850 1883 1925 1933 1970 2000 2064 2066 2084 2100 2162 2166 2207 2215 2214 2220 2223 2239 2263 2272 2280 2286 2307 2345 2356 2359 2449 2502 2501 2511 2516 2523 2542 2602 2609 2621 2658 2669 2683 2703 2706 2717 2722 2782 2800 2848 2850 2861 130 136 155 169 315 399 448 485 499 581 683 750 795 810 840 839 866 882 881 892 901 939 988 23 42 71 2034 2671 ● 1170 1318 1360 1363 1693 1798 2046 2098 2136 2437 2577 2628 2675 2677 2721 2731 2785 2832 142 154 255 273 272 286 410 450 456 455 515 596 595 838 888 918 922 ● ● 1705 1993 2355 1044 1070 1175 1207 1288 1345 1355 1377 1376 1385 1453 1458 1462 1490 1512 1518 1658 1713 1721 1810 1809 1896 1913 1960 2005 2021 2076 2080 2114 2149 2205 2246 2245 2315 2426 2446 2482 2485 2505 2538 2575 2660 2696 2700 2702 2742 2754 2759 
2779 2784 2792 2811 2829 2851 2882 113 115 139 160 199 284 288 437 584 764 812 822 932 40 61 74 ● 2138 33 ●1230 1173 1623 769 1266 498 64 1128 1716 ●1917 ● 1553 1630 1750 1848 1890 2183 2237 2321 2344 2347 2679 187 1102 1748 321 514 1669 545 1340 2450 525 907 1369 1524 1122 2145 2366 2444 2479 860 995 1616 1725 2120 2557 2610 2704 295 326 387 836 845 ●● ● 1280 ● 1361 1370 1506 1577 1742 2092 2256 2322 2354 2364 2470 2539 479 779 867 ● ● 1082 1081 1109 1124 1135 1155 1177 1180 1255 1303 1331 1457 1505 1530 1568 1593 1646 1670 1729 1794 1806 1856 1948 1967 1983 1998 2004 2029 2165 2188 2200 2265 2276 2282 2294 2305 2424 2427 2487 2532 2591 2594 2608 2641 2795 2802 2815 2824 2842 2875 163 177 277 512 521 527 614 679 710 791 808 894 914 952 963 969 990 993 25 96 ● ● ● ● ● 2543 1704 2067 ● 2765 392 ● ● 1134 1285 1771 1934 2212 2402 2708 176 349 384 492 962 ● 93 496 522 842 94 178 ●2202 ● 1286 1772 1974 2320 2428 2499 363 378 1903 162 ● ● 1632 2089 2432 2576 726 ● ● ● ● 1489 2306 ● 949 ● ● ● ● ●1322 ●2654 ● ● ●948 ●● 1130 2118 196 237 934 933 1248 2565 1223 1538 1537 ● ●● ●● ● ● ● ● 1148 ● ● 1208 1902 1195 1392 1561 1717 2317 380 2744 742 1304 1724 2268 193 ● 1348 1011 1077 1120 1201 1339 1338 1389 1433 1510 1517 1521 1703 1827 1914 2039 2093 2269 2650 358 442 902 ● 1169 1387 1543 1584 1643 1830 2564 2582 2670 2695 297 318 317 775 ● ● ● ●15 ● ● 571 ● ● 1270 877 885 ●2651 ● ● 1002 1312 322 1504 709 1156 1460 2689 1919 2191 865 618 651 751 2293 689 781 ● 1005 1023 1115 1114 1221 1352 1682 1711 1921 1939 1943 2013 2015 2085 2107 2172 2180 2240 2301 2333 2390 2457 2478 2488 2522 2647 174 173 209 622 789 819 876 875 955 26 ● 1278 2059 557 1259 1351 1350 1677 1687 1734 1751 1834 2434 2566 282 405 569 855 864 10 91 1047 1046 1257 1272 1271 1467 1756 1828 2235 2325 2327 197 335 418 461 590 604 633 861 998 48 1203 1217 1341 1624 1793 1804 1877 2158 2584 2661 2691 2749 2823 123 185 253 453 508 966 ● ● ●2753 1900 1275 1274 ● 2844 1588 313 ● 1313 1603 1801 2086 314 377 379 509 891 
1106 1218 1279 1300 2267 2401 2737 524 44 ● 708 ● 1090 1432 1444 1714 1901 1971 2036 2117 2174 2296 2341 2348 2726 108 188 374 409 491 597 635 780 785 3 7 6 ● 1757 2038 2258 910 1073 1091 1246 1463 1627 1731 1874 1885 2395 2513 2599 2623 2637 2676 2729 2751 2810 101 109 119 252 260 292 397 501 520 592 657 676 699 723 820 903 905 909 956 979 46 ● ● 11 1004 1042 1111 1125 1182 1264 1291 1298 1436 1435 1494 1574 1606 1664 1683 1702 1785 1795 1849 1852 1862 1876 1875 1899 1922 1946 2017 2049 2078 2077 2116 2229 2247 2278 2381 2410 2418 2462 2468 2518 2517 2537 2560 2638 2760 245 308 359 417 430 468 570 576 587 586 619 813 815 859 915 950 ● ● 930 1057 1269 1440 1744 1820 1833 2056 2073 2308 2463 2550 2579 2821 248 375 816 34 8 ● 16 1193 2419 223 222 2218 2790 2825 ●● ● ● 1554 1562 1652 1054 1092 1095 1137 1147 1152 1185 1210 1239 1247 1258 1267 1316 1417 1541 1557 1556 1619 1659 1743 1781 1845 1926 1928 2024 2027 2134 2141 2146 2250 2290 2329 2369 2397 2413 2416 2436 2465 2486 2528 2546 2548 2558 2569 2571 2593 2715 2727 182 254 269 309 336 344 424 423 433 432 481 483 505 504 507 558 580 600 616 681 712 738 804 833 938 70 95 ● 1101 1136 1690 1728 1986 2019 2133 2277 2287 2312 2350 2745 2833 854 ●● ● ● 1740 ● ● ● ● 2042 1240 2115 ●2164 ● ● ● ●● ● ●2225 ● ●●● ● ●●● ● ●● 1802 2438 465 890 77 ● ● ● ●2793 ● 814 2684 2328 1295 1321 1393 1447 1540 1565 1564 1733 1792 2055 2481 2519 2561 2645 2786 107 271 412 434 519 964 18 47 1718 1220 1398 1441 1582 1594 1662 1681 2187 2283 2311 2384 2813 110 221 266 728 735 800 818 20 56 381 161 1501 1782 1879 2010 2069 2157 2363 2526 2819 1526 1000 1018 1052 1160 1326 1478 1486 1583 1589 1675 1707 1761 1864 1882 2026 2068 2088 2153 2206 2217 2310 2385 2412 2425 2596 324 540 582 609 646 807 851 ●2817 1237 1236 1021 1020 1097 1096 1212 1211 1371 1638 1686 1696 2014 2033 2148 2273 2332 2403 2828 333 388 671 754 777 946 978 ● ● ● ●●2840 ●

6

8

−2

0

2

Dimension 1

4

6

8

Dimension 1 after rotation

Figure 2: Plots of the (standardized) scores of the 2885 students on the first two components before and after rotation.

[Figure: two panels, "Categories before rotation" (axes Dimension 1 / Dimension 2) and "Categories after rotation" (axes Dimension 1 after rotation / Dimension 2 after rotation), with points labelled by category.]

Figure 3: Plots of the category coordinates on the first two components before and after rotation.

procedure based on Kiers' matrix reformulation. The numerical results on a real data application have also shown the interest of this algorithm in the context of MCA, with graphical representations of both variables and categories after rotation. Both the PCAMIX procedure and the rotation procedure have been implemented in the R package "PCAmixdata".


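Appendix

Define the complex numbers:

\[
a_s \overset{\text{def}}{=} a_{s,1} + i\,a_{s,2}, \qquad
\tilde a_s \overset{\text{def}}{=} e^{-i\theta} a_s = \tilde a_{s,1} + i\,\tilde a_{s,2},
\]
\[
t_j \overset{\text{def}}{=} \sum_{s \in I_j} a_s^2 = u_j + i\,v_j, \qquad
\tilde t_j \overset{\text{def}}{=} \sum_{s \in I_j} \tilde a_s^2 = e^{-2i\theta} t_j = \tilde u_j + i\,\tilde v_j,
\]

where $\tilde a_{s,1}$, $\tilde a_{s,2}$ have been defined in (14), $u_j$, $v_j$ in (18), and where $\tilde u_j$, $\tilde v_j$ are given by the same formulas as $u_j$, $v_j$, but with a tilde over $a_{s,1}$, $a_{s,2}$. We now introduce a complex-valued varimax function $F(\theta)$ of the rotation angle $\theta$ by:

\[
F(\theta) \overset{\text{def}}{=} p \sum_{j=1}^{p} \tilde t_j^{\,2} - \Big( \sum_{j=1}^{p} \tilde t_j \Big)^2 = e^{-4i\theta} F(0),
\]

where $F(0)$ is simply obtained by suppressing the tildes in $F(\theta)$. Expanding $F(\theta)$ gives:

\[
F(\theta) = \underbrace{p \sum_{j=1}^{p} (\tilde u_j^2 - \tilde v_j^2) - \Big( \sum_{j=1}^{p} \tilde u_j \Big)^2 + \Big( \sum_{j=1}^{p} \tilde v_j \Big)^2}_{g(\theta)}
+ \underbrace{2i \Big\{ p \sum_{j=1}^{p} \tilde u_j \tilde v_j - \sum_{j=1}^{p} \tilde u_j \sum_{j=1}^{p} \tilde v_j \Big\}}_{i\,h(\theta)}
\tag{21}
\]

Comparison with formulas (16), (17) and (18), which define $b$, $a$, $\rho$ and $\psi$, shows that:

\[
F(0) = g(0) + i\,h(0) = b + i\,a = \rho\, e^{i\psi}.
\]

Hence:

\[
F(\theta) = \rho\, e^{i(\psi - 4\theta)} = \rho \big( \cos(4\theta - \psi) - i \sin(4\theta - \psi) \big).
\]

On the other hand, differentiating the varimax function $f(\theta)$ defined in (13), and using the fact that $a'_{s,1}(\theta) = a_{s,2}(\theta)$ and $a'_{s,2}(\theta) = -a_{s,1}(\theta)$ (where $a_{s,k}(\theta)$ denotes the rotated quantity $\tilde a_{s,k}$), gives:

\begin{align}
p f'(\theta) &= 2 \Big\{ p \sum_{j=1}^{p} \tilde u_j \tilde v_j - \sum_{j=1}^{p} \tilde u_j \sum_{j=1}^{p} \tilde v_j \Big\} = h(\theta) = -\rho \sin(4\theta - \psi), \tag{22} \\
&= a \cos 4\theta - b \sin 4\theta \tag{23}
\end{align}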
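The identities used in this appendix can be checked numerically. The following is a minimal sketch in Python (standard library only), not the implementation of the "PCAmixdata" package: the function name `complex_varimax`, the toy loadings and the index sets `groups` (standing in for the sets $I_j$) are hypothetical and chosen only for illustration. It verifies that $F(\theta) = e^{-4i\theta} F(0)$, that the imaginary part of $F(\theta)$ equals $a \cos 4\theta - b \sin 4\theta$, and that this imaginary part vanishes at $\theta = \psi/4$, where $F$ is real and equal to $\rho$.

```python
import cmath
import math
import random


def complex_varimax(loadings, groups, theta=0.0):
    """F(theta) = p * sum_j t_j^2 - (sum_j t_j)^2, with t_j = sum_{s in I_j} (e^{-i theta} a_s)^2."""
    p = len(groups)
    rot = cmath.exp(-1j * theta)
    # Rotated complex loadings a~_s = e^{-i theta} (a_{s,1} + i a_{s,2}).
    a_tilde = [rot * complex(a1, a2) for a1, a2 in loadings]
    # t~_j sums the squared rotated loadings over the index set I_j.
    t_tilde = [sum(a_tilde[s] ** 2 for s in idx) for idx in groups]
    return p * sum(t ** 2 for t in t_tilde) - sum(t_tilde) ** 2


random.seed(0)
# Hypothetical toy data: 7 rows (a_{s,1}, a_{s,2}) split into p = 3 index sets.
loadings = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(7)]
groups = [[0, 1, 2], [3], [4, 5, 6]]

F0 = complex_varimax(loadings, groups)
b, a = F0.real, F0.imag                 # F(0) = b + i a
rho, psi = abs(F0), cmath.phase(F0)     # F(0) = rho * exp(i psi)

theta = 0.3
F_theta = complex_varimax(loadings, groups, theta)

# Gap in the identity F(theta) = exp(-4 i theta) F(0).
identity_gap = abs(F_theta - cmath.exp(-4j * theta) * F0)
# Gap between h(theta) = Im F(theta) and a cos(4 theta) - b sin(4 theta).
h_gap = abs(F_theta.imag - (a * math.cos(4 * theta) - b * math.sin(4 * theta)))

# At theta = psi / 4 the imaginary part h vanishes and F is real, equal to rho.
theta_star = psi / 4
F_star = complex_varimax(loadings, groups, theta_star)
```

Since $p f'(\theta) = h(\theta)$ by (22), a zero of $h$ at which $h$ decreases is a local maximum of $f$; under that reading, $\theta = \psi/4$ is such a point, because $h'(\psi/4) = -4\rho < 0$ whenever $\rho > 0$.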

Integrating (22) with respect to $\theta$ then proves (15).

References

de Leeuw, J. and Pruzansky, S. (1978), A new computational method to fit the weighted Euclidean distance model, Psychometrika, 43, 479-490.

Escofier, B. (1979), Traitement simultané de variables qualitatives et quantitatives en analyse factorielle [Simultaneous treatment of qualitative and quantitative variables in factor analysis], Cahiers de l'Analyse des Données, 4, 137-146.

Jennrich, R.I. (2001), A simple general procedure for orthogonal rotation, Psychometrika, 66(2), 289-306.

Kaiser, H.F. (1958), The varimax criterion for analytic rotation in factor analysis, Psychometrika, 23(3), 187-200.

Kiers, H.A.L. (1991), Simple structure in component analysis techniques for mixtures of qualitative and quantitative variables, Psychometrika, 56, 197-212.

Neudecker, H. (1981), On the matrix formulation of Kaiser's varimax criterion, Psychometrika, 46, 343-345.

Pagès, J. (2004), Analyse factorielle de données mixtes [Factor analysis for mixed data], Revue de Statistique Appliquée, 52(4), 93-111.

ten Berge, J.M.F. (1984), A joint treatment of varimax rotation and the problem of diagonalizing symmetric matrices simultaneously in the least-squares sense, Psychometrika, 49, 347-358.

ten Berge, J.M.F. (1995), Suppressing permutations or rigid planar rotations: a remedy against nonoptimal varimax rotations, Psychometrika, 60, 437-446.
