Multivariate Statistical Methods for Engineering and Management

Department of Mathematics Section of Probability and Statistics

Master in Industrial Engineering and Management

1st Exam, 2nd Semester, 2009/10

29/06/2010, 3 PM

Duration: 3h

Group I

6.5 points

A company specializing in portraits of children operates in 21 medium-sized cities. The company is considering an expansion into other medium-sized cities and wishes to investigate whether sales ($Y$, in thousands of dollars) in a community can be predicted from the number of persons aged 16 or younger in the community ($X_1$, in thousands of persons) and the per capita disposable personal income in the community ($X_2$, in thousands of dollars). Data on these variables for the 21 cities in which the company operates are summarized by:

$$\sum_{i=1}^{21} x_{i1} = 1302.4, \quad \sum_{i=1}^{21} x_{i2} = 360.0, \quad \sum_{i=1}^{21} x_{i1} x_{i2} = 22609.2,$$

$$\sum_{i=1}^{21} x_{i1}^2 = 87707.9, \quad \sum_{i=1}^{21} x_{i2}^2 = 6190.3, \quad \sum_{i=1}^{21} y_i^2 = 721072.40,$$

$$(X^t X)^{-1} = \begin{pmatrix} 29.57402 & 0.07176 & -1.98197 \\ 0.07176 & 0.00037 & -0.00552 \\ -1.98197 & -0.00552 & 0.13559 \end{pmatrix}, \qquad X^t y = \begin{pmatrix} 3820 \\ 249643 \\ 66073 \end{pmatrix}.$$

1. Fit a multiple regression model to these data. State the estimated regression function.

(1.0)

2. Estimate $\sigma^2$, given that $\hat{\beta}^t X^t y = 718891.4$.

(0.5)

3. Test whether there is a linear association between sales and the two explanatory variables under study, using $\alpha = 0.05$. State the hypotheses, test statistic, decision rule, and conclusion. Which assumptions must hold for this test to be valid?

(2.0)

4. What is the p-value of the test in part (3)? Comment.

(1.0)

5. As part of a possible expansion program, the company would like to predict the sales for a new city, characterized by $x_0 = (1, 65.4, 17.6)^t$. Note that it is known that the city has characteristics that fall well within the pattern of the 21 cities on which the regression analysis is based. Obtain a 90% prediction interval for this new city. Hint: $x_0^t (X^t X)^{-1} x_0 = 0.06311$.

(2.0)
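The computations behind questions 1, 2, 3 and 5 can be sketched numerically as follows. This is only an illustrative sketch using the summary values printed above: the variable names are hypothetical, the t quantile is an assumed table value, and because the printed inverse is rounded to five decimals the resulting coefficients are sensitive to that rounding.

```python
import math

# Summary quantities printed in the exam (n = 21 cities, p = 3 coefficients).
n, p = 21, 3
XtX_inv = [
    [29.57402, 0.07176, -1.98197],
    [0.07176, 0.00037, -0.00552],
    [-1.98197, -0.00552, 0.13559],
]
Xty = [3820.0, 249643.0, 66073.0]   # first entry is the sum of the y_i
yty = 721072.40                     # sum of y_i^2
bXty = 718891.4                     # beta-hat' X'y, as given in question 2

# Question 1: b = (X'X)^{-1} X'y (only as accurate as the rounded inverse).
b = [sum(row[k] * Xty[k] for k in range(p)) for row in XtX_inv]

# Question 2: SSE = y'y - b'X'y and MSE = SSE / (n - p).
sse = yty - bXty
mse = sse / (n - p)

# Question 3: SSR = b'X'y - (sum y)^2 / n, F = MSR / MSE on (p - 1, n - p) df.
ssr = bXty - Xty[0] ** 2 / n
f_stat = (ssr / (p - 1)) / mse

# Question 5: y0_hat +/- t * sqrt(MSE * (1 + x0'(X'X)^{-1} x0)).
# T_95_18 is an assumed table value: 0.95 quantile of Student's t, 18 df.
T_95_18 = 1.734
x0 = [1.0, 65.4, 17.6]
y0_hat = sum(xi * bi for xi, bi in zip(x0, b))
half_width = T_95_18 * math.sqrt(mse * (1 + 0.06311))

print("b =", [round(v, 4) for v in b])
print("SSE =", round(sse, 2), "MSE =", round(mse, 4), "F =", round(f_stat, 2))
print("90% PI:", (round(y0_hat - half_width, 2), round(y0_hat + half_width, 2)))
```

The decision in question 3 still requires comparing the F statistic with the 0.95 quantile of the F distribution with 2 and 18 degrees of freedom, taken from tables or statistical software.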

Group II

6.5 points

A consumer organization studied the effect of the age of an automobile owner on the size of the cash offer for a used car. Twelve persons in each of three age groups (young, middle, elderly) agreed to participate in this study as the owners of a medium-priced, six-year-old car. The "owners" solicited cash offers for this car from 36 dealers selected at random from the dealers in the region. The assignment of an "owner" to a dealer was also made randomly. The offers (in hundreds of dollars) can be summarized by:

$$\bar{y}_{1\cdot} = 21.5, \quad \bar{y}_{2\cdot} = 27.75, \quad \bar{y}_{3\cdot} = 21.42, \quad y_{\cdot\cdot} = 848, \quad \sum_{i=1}^{3} \sum_{j=1}^{12} y_{ij}^2 = 20374,$$

where 1 represents the young, 2 the middle, and 3 the elderly age group. Assume that the one-factor ANOVA model is applicable.

1. Obtain the analysis of variance table.

(1.5)


2. Use the F-test, at a 1% significance level, to decide whether there are differences among the age groups. State the hypotheses, test statistic, decision rule, and conclusions.

(2.0)

3. Find a 95% confidence interval for the mean cash offer associated with the young group.

(2.0)

4. What appears to be the nature of the relationship between age of owner and mean cash offer?

(1.0)
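The sums of squares for the one-factor ANOVA table can be recovered from the summary statistics alone. The sketch below is one possible way to do so, with hypothetical variable names. It uses the deviation form of the treatment sum of squares, and since the group means are printed to two decimals the results carry that rounding. The t quantile is an assumed table value.

```python
import math

# Summary statistics from the exam: 3 age groups, 12 offers per group.
n_i, r = 12, 3
N = n_i * r
means = {"young": 21.5, "middle": 27.75, "elderly": 21.42}
total = 848.0      # sum of all 36 offers
sum_sq = 20374.0   # sum of all squared offers

grand_mean = total / N

# SSTO = sum y_ij^2 - (sum y_ij)^2 / N
ssto = sum_sq - total ** 2 / N
# SSTR = n_i * sum over groups of (group mean - grand mean)^2
sstr = n_i * sum((m - grand_mean) ** 2 for m in means.values())
# SSE by subtraction
sse = ssto - sstr

mstr = sstr / (r - 1)   # r - 1 = 2 degrees of freedom
mse = sse / (N - r)     # N - r = 33 degrees of freedom
f_stat = mstr / mse

# 95% interval half-width for one group mean: t * sqrt(MSE / n_i).
# T_975_33 is an assumed table value: 0.975 quantile of Student's t, 33 df.
T_975_33 = 2.035
half_width = T_975_33 * math.sqrt(mse / n_i)

print(f"{'Source':<10}{'SS':>10}{'df':>5}{'MS':>10}")
print(f"{'Age group':<10}{sstr:>10.2f}{r - 1:>5}{mstr:>10.2f}")
print(f"{'Error':<10}{sse:>10.2f}{N - r:>5}{mse:>10.2f}")
print(f"{'Total':<10}{ssto:>10.2f}{N - 1:>5}")
print("F =", round(f_stat, 2))
print("young:", (round(means['young'] - half_width, 2),
                 round(means['young'] + half_width, 2)))
```

The F statistic is then compared with the 0.99 quantile of the F distribution with 2 and 33 degrees of freedom for the test at the 1% level.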

Group III

4.0 points

Protein consumption in twenty-five European countries was measured for 5 food groups to determine whether there are groups of countries and whether meat consumption is related to that of other foods. The 5 food groups under study are: Red Meat, White Meat, Eggs, Milk, and Fish. The eigenvalues associated with the sample correlation matrix ($R$) are: $\lambda_1 = 2.394$, $\lambda_2 = 1.214$, $\lambda_3 = 0.721$, $\lambda_4 = 0.479$, and $\lambda_5 = 0.191$. The first two eigenvectors ($\gamma_i$, $i = 1, 2$) of $R$ are:

              γ1        γ2
Red Meat     -0.476    -0.236
White Meat   -0.412     0.532
Eggs         -0.592     0.050
Milk         -0.502    -0.225
Fish         -0.029    -0.780

1. Write the first two sample principal components.

(1.0)

2. Find the percentage of the total variability explained by each principal component and interpret the first two principal components.

(1.5)

3. Find the correlation between the first principal component ($Y_1$) and each of the standardized original variables ($X_i$, $i = 1, \dots, 5$). Interpret the first principal component and compare your findings with part (2).

Hint: Recall that $\widehat{\mathrm{Cor}}(\hat{Y}_j, X_i) = \gamma_{ij} \sqrt{\lambda_j} \Big/ \sqrt{\widehat{\mathrm{Var}}(X_i)}$.

(1.5)
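The quantities asked for in questions 2 and 3 follow directly from the eigenvalues and the first eigenvector. The sketch below is illustrative only, with hypothetical variable names. It assumes the variables are standardized, so each has unit variance and the total variability equals the number of variables (the trace of $R$).

```python
import math

foods = ["Red Meat", "White Meat", "Eggs", "Milk", "Fish"]
eigenvalues = [2.394, 1.214, 0.721, 0.479, 0.191]
gamma1 = [-0.476, -0.412, -0.592, -0.502, -0.029]

# Total variability of standardized variables is trace(R) = p.
p = len(foods)
proportion = [lam / p for lam in eigenvalues]

# Running total of the explained proportion.
cumulative = []
running = 0.0
for prop in proportion:
    running += prop
    cumulative.append(running)

# Cor(Y_1, X_i) = gamma_i1 * sqrt(lambda_1) / sqrt(Var(X_i)), with Var(X_i) = 1.
cor1 = {food: g * math.sqrt(eigenvalues[0]) for food, g in zip(foods, gamma1)}

for j, (prop, cum) in enumerate(zip(proportion, cumulative), start=1):
    print(f"PC{j}: {100 * prop:.2f}% (cumulative {100 * cum:.2f}%)")
for food, c in cor1.items():
    print(f"Cor(Y1, {food}) = {c:.3f}")
```

The printed eigenvalues sum to 4.999 rather than exactly 5 because of rounding, so the cumulative percentage stops just short of 100%.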

Group IV

3.0 points

From the previous study, five European countries were selected, and the Euclidean distances between them are as follows:

          Greece   Sweden   UK      Italy   France
Greece    0
Sweden    25.16    0
UK        20.32    10.90    0
Italy     7.98     21.88    17.26   0
France    18.25    14.03    6.99    15.16   0

1. Given the previous matrix, obtain the dendrogram using the complete linkage method.

(2.0)

2. Suggest the appropriate number of clusters and comment on the results obtained.

(1.0)
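The merge sequence needed to draw the dendrogram can be checked with a small agglomerative routine. The following is a minimal sketch of complete linkage, where the distance between two clusters is the largest pairwise distance between their members. The function name `complete_linkage` and the data layout are hypothetical choices for illustration.

```python
from itertools import combinations

names = ["Greece", "Sweden", "UK", "Italy", "France"]

# Pairwise Euclidean distances from the exam's lower-triangular matrix.
pairs = {
    ("Greece", "Sweden"): 25.16,
    ("Greece", "UK"): 20.32,
    ("Greece", "Italy"): 7.98,
    ("Greece", "France"): 18.25,
    ("Sweden", "UK"): 10.90,
    ("Sweden", "Italy"): 21.88,
    ("Sweden", "France"): 14.03,
    ("UK", "Italy"): 17.26,
    ("UK", "France"): 6.99,
    ("Italy", "France"): 15.16,
}
d = {frozenset(k): v for k, v in pairs.items()}


def complete_linkage(names, d):
    """Return the merges as (height, cluster_a, cluster_b) in merge order."""
    clusters = [frozenset([name]) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for a, b in combinations(clusters, 2):
            # Complete linkage: cluster distance is the maximum pairwise distance.
            h = max(d[frozenset((x, y))] for x in a for y in b)
            if best is None or h < best[0]:
                best = (h, a, b)
        h, a, b = best
        # Replace the two closest clusters with their union.
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
        merges.append((h, sorted(a), sorted(b)))
    return merges


merges = complete_linkage(names, d)
for h, a, b in merges:
    print(f"merge {a} and {b} at height {h}")
```

The heights at which the merges occur are the vertical positions of the joins in the dendrogram, and a large jump between consecutive heights is a common informal cue for where to cut the tree.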
