1. Female faculty at Delaware have raised concerns over gender discrimination by the administration. At their request, the Dean pulled salary data for one department and found that females in the department were making on average \$5,000 per year less than their male counterparts. Note that this scenario is fictitious, as are all data to be presented.

a. Formally state the hypotheses that would be reasonable to test for this concern of gender discrimination. Explain your reasoning.

b. If we use a t test to evaluate these hypotheses, should it be an independent samples test, or a paired test? Why?

c. Suppose we conducted the appropriate t test, which produces a p value of 0.017. State in plain English what we might conclude from this test.

d. Suppose the test in question b produced a p value of 0.392. State in plain English what we might conclude from this test.

e. If we fit a regression model to these data using additional variables, and it turned out that there was a significant interaction between gender and years of experience, how should this be interpreted? Please answer in plain English, as if speaking to someone with no statistical background.

f. Suppose that the p value of the test in question b is statistically significant. Explain why this information, by itself, would not prove that the administration is discriminating on the basis of gender.

1. This question relates to central composite designs (CCD).

a. When in the sequential experiments process would we typically run a central composite design? Why?

b. What additional information do we obtain from central composite designs, relative to factorials or high-resolution fractional factorials?

c. From which types of points in the design do we obtain the additional information noted in question b (center points, factorial points, etc.)?

d. For what reasons do we typically include center points in central composite designs? Hint: there is more than one reason.

e. I plan to run a central composite design in 5 variables, and want to save experimental effort. I am considering running a $2^{5-1}$ fractional factorial for the factorial part of the design, instead of a full factorial. What is your advice for me about this – does it make sense to you or not? Assume that I plan to fit a full quadratic model with all main effects, all two-factor interactions, and all quadratic terms. Justify your answer.

f. Suppose I run the $2^{5-1}$ mentioned in question e for the factorial part of the design. Assuming I run the rest of the central composite design using the standard approach, including 4 center points, how many points would be in my final design? Explain your answer.

g. Explain in a few sentences the steps I would take to manually create the factorial part of the design – the $2^{5-1}$ – utilizing Table 8.14 of the Montgomery text.
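
For orientation on parts e through g, the construction of such a design can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a worked solution: it assumes the half fraction is generated with E = ABCD (defining relation I = ABCDE), and it assumes an axial distance of `alpha = 2.0`. The function names are hypothetical.

```python
from itertools import product


def half_fraction_5():
    """2^(5-1) half fraction: full 2^4 in A, B, C, D, with E = ABCD (assumed generator)."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        e = a * b * c * d  # fifth column from the assumed generator E = ABCD
        runs.append((a, b, c, d, e))
    return runs


def central_composite(factorial_runs, k, alpha, n_center):
    """Append 2k axial points and n_center center points to the factorial runs."""
    axial = []
    for i in range(k):
        for sign in (-1, 1):
            point = [0.0] * k
            point[i] = sign * alpha  # one factor at +/- alpha, all others at 0
            axial.append(tuple(point))
    center = [tuple([0.0] * k) for _ in range(n_center)]
    return list(factorial_runs) + axial + center


factorial = half_fraction_5()
design = central_composite(factorial, k=5, alpha=2.0, n_center=4)
print(len(factorial), "factorial runs,", len(design), "runs in total")
```

The run count follows directly from the three kinds of points: the factorial runs, two axial runs per factor, and the center runs.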

1. I have run a full factorial design involving two discrete variables, A at 4 levels, and B at 5 levels. I have replicated the design once, i.e., I have two runs for each combination of these two variables. Further, I blocked the design by replicate; that is, I ran the first replicate in Block 1, and the second replicate in Block 2. I randomized the order of runs within each block.

a. Write down the ANOVA table – just sources of variation and the associated degrees of freedom. Explain your answer.

b. Assume we did have the data, and could calculate the sums of squares, mean squares, and F ratios. Conceptually, what pattern would I expect to see in the set of mean squares if the null hypotheses concerning A, B, & their interaction were all true? That is, how would you expect these mean squares to compare to one another? Why?

c. What pattern would I expect to see in the set of mean squares if the null hypotheses concerning A, B, & their interaction were all false? Why?

d. What does the mean square error represent? Please be specific in your answer; for example, do not say “variation” – this is too general.

e. Suppose the null hypotheses about A, B, AB, etc. are false. How might this affect the mean square error? Please explain your answer.
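
The degrees-of-freedom bookkeeping for part a can be checked mechanically. The sketch below assumes the usual convention that the block-by-treatment interaction is pooled into the error term, so error degrees of freedom are whatever remains after blocks, A, B, and AB are accounted for. The function name is hypothetical.

```python
def blocked_two_factor_df(a_levels, b_levels, replicates):
    """Degrees of freedom for a two-factor factorial blocked by replicate.

    Assumes block-by-treatment interaction is pooled into error.
    """
    df = {
        "Blocks (replicates)": replicates - 1,
        "A": a_levels - 1,
        "B": b_levels - 1,
        "AB": (a_levels - 1) * (b_levels - 1),
    }
    total = a_levels * b_levels * replicates - 1
    df["Error"] = total - sum(df.values())  # remainder after the model terms
    df["Total"] = total
    return df


table = blocked_two_factor_df(a_levels=4, b_levels=5, replicates=2)
for source, dof in table.items():
    print(f"{source:<20} {dof}")
```

A quick sanity check is that the individual degrees of freedom sum to the total, which is one less than the number of runs.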

1. Compare these two types of screening designs: the $2^{k-p}$ fractional factorial designs (n = 4, 8, 16, 32, etc.) and the Plackett-Burman designs that are not a power of two (e.g., n = 12, 20, 24, 28, etc.).

a. What is similar about these two types of designs? (Hint: It involves their balance and orthogonality properties.)

b. What is the main difference between these two types of designs? (Hint: It involves how they are affected by two factor interactions.)
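
The balance and orthogonality properties mentioned in the hint for part a can be verified numerically. The sketch below builds a 12-run Plackett-Burman design by cyclically shifting the commonly tabulated generator row (+ + - + + + - - - + -) and appending a row of all minus signs, then checks the column sums and pairwise column dot products. The function names are hypothetical, and the check addresses main-effect columns only; it says nothing about how two-factor interactions are aliased.

```python
from itertools import combinations

# Commonly tabulated generator row for the 12-run Plackett-Burman design.
GENERATOR_12 = [+1, +1, -1, +1, +1, +1, -1, -1, -1, +1, -1]


def plackett_burman_12():
    """Eleven cyclic shifts of the generator row plus a final row of all -1."""
    n = len(GENERATOR_12)
    rows = [GENERATOR_12[n - shift:] + GENERATOR_12[:n - shift] for shift in range(n)]
    rows.append([-1] * n)
    return rows


def column_sums(design):
    """A balanced column has as many +1 as -1 entries, so its sum is 0."""
    return [sum(col) for col in zip(*design)]


def max_abs_column_dot(design):
    """Largest absolute dot product between two distinct columns (0 means orthogonal)."""
    cols = list(zip(*design))
    return max(abs(sum(x * y for x, y in zip(c1, c2))) for c1, c2 in combinations(cols, 2))


pb12 = plackett_burman_12()
print("column sums:", column_sums(pb12))
print("max |column dot product|:", max_abs_column_dot(pb12))
```

The same two helper checks can be applied to any two-level design matrix coded as -1 and +1, including a regular fractional factorial.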

Sample Solution