Statistics

  1. A refrigerator company is interested in understanding the number of years for which a household keeps the same refrigerator. They obtained a random sample of 64 households to address this issue. In the sample, the average household used the same refrigerator for a duration of 12 years. Assume you do not know the population mean, but you know that the population standard deviation is equal to 6 years. Using a significance level of 0.05, answer the questions below:

 

  • (a) Is there enough evidence to suggest that the average number of years that a household keeps a refrigerator exceeds 10 years? Conduct a hypothesis test using the test statistic method and interpret your result.

 

  • (b) Re-do part (a) using the p-value method.

 

  • (c) What is the definition of a p-value?

 

  • (d) What is the definition of a significance level?
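
The calculations behind parts (a) and (b) can be sketched in a few lines of Python. This is a minimal illustration, assuming a right-tailed z-test (H0: mu <= 10 versus H1: mu > 10) because the population standard deviation is known; the variable names are illustrative and not part of the assignment.

```python
from math import sqrt
from statistics import NormalDist

# Values stated in the problem
n = 64          # sample size
x_bar = 12.0    # sample mean, in years
sigma = 6.0     # known population standard deviation, in years
mu_0 = 10.0     # hypothesized mean under H0 (assumed setup: H0 mu <= 10, H1 mu > 10)
alpha = 0.05    # significance level

# Test statistic method: compare z with the right-tail critical value
standard_error = sigma / sqrt(n)
z_stat = (x_bar - mu_0) / standard_error
std_normal = NormalDist()                 # standard normal, mean 0 and sd 1
z_crit = std_normal.inv_cdf(1 - alpha)

# p-value method: area to the right of the observed z
p_value = 1 - std_normal.cdf(z_stat)

reject_by_statistic = z_stat > z_crit
reject_by_p_value = p_value < alpha

print(f"z = {z_stat:.3f}, critical z = {z_crit:.3f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_by_p_value else "Fail to reject H0")
```

Both methods always lead to the same decision, because z exceeds the critical value exactly when the p-value falls below the significance level.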

 

 

 

  2. Download the data file labeled “Question 2” on Moodle. Suppose you are a marketing manager for Planet Fitness, which is coming to Cape Girardeau in November. The dataset consists of membership sales (SALES) from 25 different metropolitan areas. In each area, Planet Fitness spent money on three different forms of advertising: direct mailing (Direct), newspaper ads (Newspaper), and TV ads (Television). In the data file, all dollar amounts are measured in thousands of dollars.

Using a statistical software program of your choosing, estimate a regression model to predict membership sales (SALES) as a function of Direct, Newspaper, and Television advertisement expenditures. Then answer the following questions:

 

  • (a) Write down the estimated regression equation. Interpret the meaning of the y-intercept and slope coefficients on each of the three independent variables.

  • (b) Predict membership sales if Direct=2 (or $2,000), Newspaper=1, and Television=3.

  • (c) Using the test statistic method, test whether there is a linear relationship between expenditures on direct mailings and membership sales. Show all work for your hypothesis test and use a 5% significance level.

  • (d) Using the p-value method, test whether there is a linear relationship between newspaper advertisements and membership sales. Show all work for your hypothesis test and use a 5% significance level.

  • (e) Write the value of the R-squared term and interpret its meaning in words.

  • (f) Write down the estimated value of the standard deviation of the error term. Does this number appear to be high or low? Explain.
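
For orientation, the quantities requested above (estimated coefficients, a prediction, R-squared, and the standard deviation of the error term) can be sketched in plain Python. This is only an illustration: the numbers below are HYPOTHETICAL placeholders and not the Moodle data, the helper names `solve_linear_system` and `fit_ols` are made up for this sketch, and a statistical package would normally do this work and also report the t-statistics and p-values needed for the hypothesis tests.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(predictors, y):
    """Least-squares fit with an intercept.

    Returns (coefficients, r_squared, std_error_of_regression);
    coefficients[0] is the intercept.
    """
    rows = [[1.0] + list(r) for r in predictors]  # prepend intercept column
    n, k = len(rows), len(rows[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    coef = solve_linear_system(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    y_mean = sum(y) / n
    sst = sum((yi - y_mean) ** 2 for yi in y)
    r_squared = 1 - sse / sst
    std_error = (sse / (n - k)) ** 0.5  # estimated sd of the error term
    return coef, r_squared, std_error


# HYPOTHETICAL data in thousands of dollars: (Direct, Newspaper, Television)
ads = [
    (1.0, 2.0, 1.0), (2.0, 1.0, 3.0), (3.0, 3.0, 2.0), (4.0, 1.0, 4.0),
    (2.0, 4.0, 1.0), (5.0, 2.0, 3.0), (3.0, 5.0, 5.0), (1.0, 3.0, 4.0),
]
sales = [12.0, 19.5, 20.0, 26.5, 15.0, 27.0, 30.5, 21.0]  # HYPOTHETICAL

coef, r_squared, std_error = fit_ols(ads, sales)
b0, b_direct, b_newspaper, b_television = coef

# Prediction at Direct=2, Newspaper=1, Television=3
predicted_sales = b0 + b_direct * 2 + b_newspaper * 1 + b_television * 3

print(f"SALES = {b0:.3f} + {b_direct:.3f}*Direct "
      f"+ {b_newspaper:.3f}*Newspaper + {b_television:.3f}*Television")
print(f"Predicted SALES: {predicted_sales:.3f}")
print(f"R-squared: {r_squared:.3f}, std. error of regression: {std_error:.3f}")
```

The standard error of the regression is in the same units as SALES, so one common way to judge whether it is high or low is to compare it with the mean of the dependent variable.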

  3. Suppose you decide to examine the link between absenteeism and union-management relations using multiple regression analysis. The analysis uses firm-level data.

 

Y = Average # of days absent within a firm in a year

 

Bad_UM_Rel = a dummy variable indicating BAD union-management relations (omitted category = NEUTRAL relations)

 

Good_UM_Rel = a dummy variable indicating GOOD union-management relations (omitted category = NEUTRAL relations)

 

Pct_PT = the percentage of a firm’s employees who are part time.

 

Pct_U = the percentage of a firm’s employees who are members of a labor union.

 

 

 

Questions: Answer parts (a), (b), (c), (d), (e), and (f).

 

  • (a) In words, interpret the meaning of the y-intercept.
  • (b) In words, interpret the R-squared value.
  • (c) Using the p-value method, do a hypothesis test for whether there is a statistically significant difference between companies with BAD relations and companies with NEUTRAL relations. Show all steps to your hypothesis test and use a significance level of 5%.
  • (d) Using the p-value method, do a hypothesis test for whether there is a statistically significant difference between companies with GOOD relations and companies with NEUTRAL relations. Show all steps to your hypothesis test and use a significance level of 5%.
  • (e) In words, interpret the meaning of the unstandardized slope coefficient on Pct_PT.
  • (f) In words, interpret the meaning of the standardized slope coefficient on Pct_U.
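
As a reminder of how the two kinds of slope coefficient relate, a standardized coefficient is the unstandardized one rescaled by the ratio of the sample standard deviations of the predictor and the outcome. The sketch below uses HYPOTHETICAL numbers, not the regression output for this question, and the function name `standardized_slope` is illustrative.

```python
from statistics import stdev


def standardized_slope(unstandardized_slope, x_values, y_values):
    """Rescale a slope to standard-deviation units: b * s_x / s_y."""
    return unstandardized_slope * stdev(x_values) / stdev(y_values)


# HYPOTHETICAL firm-level values, for illustration only
pct_union = [10, 20, 30, 40, 50]   # percent of employees in a union
days_absent = [4, 5, 7, 8, 11]     # average days absent per year
b_unstandardized = 0.1             # HYPOTHETICAL slope: days per percentage point

beta = standardized_slope(b_unstandardized, pct_union, days_absent)
print(f"Standardized slope: {beta:.3f}")
```

Read this way, the unstandardized slope describes the change in days absent per one-percentage-point change in the predictor, while the standardized slope describes the change in standard deviations of days absent per one-standard-deviation change in the predictor, holding the other variables constant.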

 

 

  4. Short answer.

 

  • (a) What is a linear probability model? In addition, provide an example of how it can be used.

 

  • (b) Answer the following:
  1. What is multicollinearity?
  2. If you think multicollinearity exists, how can you identify it?
  3. Why is multicollinearity a problem?

 

  • (c) When evaluating the results from a regression analysis, explain how the interpretation of a standardized slope coefficient differs from the interpretation of an unstandardized slope coefficient.
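
One common diagnostic for the multicollinearity question in part (b) is the variance inflation factor (VIF), defined as 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the other predictors. The sketch below covers only the two-predictor case, where R_j^2 equals the squared correlation between the two predictors; the data are HYPOTHETICAL and the function names are illustrative.

```python
from math import sqrt


def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1 / (1 - r ** 2)


# HYPOTHETICAL predictor values, for illustration only
x1 = [1, 2, 3, 4, 5]
x2 = [2, 4, 5, 4, 5]

vif = vif_two_predictors(x1, x2)
print(f"Correlation: {pearson_r(x1, x2):.3f}, VIF: {vif:.2f}")
```

A VIF near 1 indicates little overlap between predictors. Thresholds such as 5 or 10 are often quoted as rules of thumb for problematic multicollinearity, but they are conventions and not formal tests.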

 

 

 

  5. Application and Extension Proposal

 

Do the following:

(1) The Homework 1 folder on Moodle includes 4 pdf files with articles from the Harvard Business Review. Read all 4 articles.

 

(2) Pick the 2 articles that interest you the most and, for each one, prepare the following:

 

  1. Summarize this article in one paragraph.
  2. Explain how regression analysis can be applied to this topic. List a dependent variable and five independent variables that could be applied to this study.
  3. Propose an extension of this research topic and explain how regression analysis can be applied in your proposed extension. By “extension,” I mean a modification of this topic.