1. Module Overview
We encounter statistics in our daily lives more often than we probably realize, and from many different sources, such as the news. You may be asking yourself, “What is statistics? When and where will I use it?” If you read a newspaper, watch television, or use the Internet, you will see statistical information. There are statistics about crime, sports, education, politics, and real estate. Typically, when you read a newspaper article or watch a television news program, you are given sample information. With this information, you may make a decision about the correctness of a statement, claim, or “fact.” Statistical methods can help business managers make the “best educated guess.” In general, statistics is a field of study concerned with summarizing data, interpreting data, and making decisions based on data.
Once you have collected data, what will you do with it? Data can be described and presented in many different formats. For example, suppose you are interested in buying a house in a particular area. You may have no idea what houses cost there, so you might ask your real estate agent for a sample data set of prices. Looking at every price in the sample is often overwhelming. A better way might be to look at the median price and the variation of prices. The median and variation are just two of the ways you will learn to describe data. Your agent might also provide you with a graph of the data.
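As a minimal sketch of the house-price example above, the following Python snippet summarizes a sample with its median and two measures of variation (range and sample standard deviation). The prices are invented for illustration and do not come from any course data set.

```python
import statistics

# Hypothetical sample of house prices, in thousands of dollars (invented data).
prices = [120, 135, 150, 160, 175, 180, 190, 210, 240, 285]

median_price = statistics.median(prices)   # middle value of the sorted sample
price_range = max(prices) - min(prices)    # simplest measure of variation
price_stdev = statistics.stdev(prices)     # sample standard deviation

print(f"Median price: {median_price}")
print(f"Range: {price_range}")
print(f"Sample standard deviation: {price_stdev:.1f}")
```

In Excel, the same summaries are available through the MEDIAN, MAX, MIN, and STDEV.S worksheet functions.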
In Module 1, we learned about using descriptive statistics and visual displays in data analysis and decision making. In this module, we will focus on creating confidence intervals and conducting hypothesis testing.
A statistical graph is a tool that helps you learn about the shape or distribution of a sample or a population. A graph can be a more effective way of presenting data than a mass of numbers because we can see where data clusters and where there are only a few data values. Newspapers and the Internet use graphs to show trends and to enable readers to compare facts and figures quickly. Statisticians often graph data first to get a picture of the data. Then, more formal tools may be applied. Some of the types of graphs that are used to summarize and organize data are the dot plot, the bar graph, the histogram, the stem-and-leaf plot, the frequency polygon (a type of broken line graph), the pie chart, and the box plot. In this module, our emphasis will be on histograms.
A histogram is a graphic version of a frequency distribution. The graph consists of bars of equal width drawn adjacent to each other. The horizontal scale represents classes of quantitative data values and the vertical scale represents frequencies. The heights of the bars correspond to frequency values. Histograms are typically used for large, continuous, quantitative data sets. A frequency polygon can also be used when graphing large data sets with data points that repeat. In both graphs, the data values go on the x-axis and the frequencies are graphed on the y-axis.
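The frequency distribution behind a histogram can be sketched in a few lines of Python. This is only an illustration: the helper name `frequency_distribution`, the class boundaries, and the price data are all assumptions made for the example, and the text "bars" stand in for a real chart.

```python
def frequency_distribution(values, start, width, n_classes):
    """Count how many values fall in each equal-width class [lower, lower + width)."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)   # which class this value belongs to
        if 0 <= index < n_classes:          # ignore values outside the classes
            counts[index] += 1
    return counts

# Hypothetical house prices, in thousands of dollars (invented data).
prices = [120, 135, 150, 160, 175, 180, 190, 210, 240, 285]
start, width, n_classes = 100, 50, 4

counts = frequency_distribution(prices, start, width, n_classes)

# Classes run along one axis; bar length represents frequency.
for i, count in enumerate(counts):
    lower = start + i * width
    print(f"{lower}-{lower + width}: {'#' * count} ({count})")
```

Each printed row corresponds to one bar of the histogram: the class is the position on the horizontal scale and the count is the bar height.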
In this module, you will be using Microsoft Excel to calculate statistics and produce the graphical displays mentioned above. Microsoft Excel is used widely in the workplace today, so the tools you learn in this module will be very useful.

Statistics are all around you, sometimes used well, sometimes not. We must learn to distinguish the two cases. Appreciating the proper use of statistics is just as important as detecting deceptive use, so you must also learn to recognize statistical evidence that supports a stated conclusion. When a research team tests a new treatment for a disease, statistics allows them to conclude, based on a relatively small trial, that there is good evidence the treatment is effective. For these reasons, it is important to understand statistics. In this course, you will reform your statistical habits. No longer will you blindly accept numbers or findings. Instead, you will begin to think about the numbers, their sources, and, most importantly, the procedures used to generate them. In this way, you can become a more rational decision maker, using analysis of past performance to inform business planning.
The primary resource for this module is Introductory Business Statistics, by Holmes, Illowsky, and Dean.
For Module 4, you should read through the following material in this textbook:
Chapter 13: Linear Regression and Correlation
Sections 13.4, 13.5, and 13.6 only
These sections introduce multivariate or multiple linear regression analysis. These sections also explain some of the problems that can occur in regression analysis.
You are now familiar with several tools in the Analysis ToolPak. Regression analysis is another one of those tools. Please review the following tutorial for help in generating regression estimates in Excel:
https://www.excel-easy.com/examples/regression.html
Optional Sources
Use the IBISWorld database or other databases such as Business Source Complete (EBSCO) and Business Source Complete – Business Searching Interface in our online library.
Check the professional market research reports from the IBISWorld database to conduct the industry analysis. IBISWorld can be accessed in the Trident Online Library.
IBISWorld Overview (n.d.). IBISWorld, Inc., New York, NY.
IBISWorld Forecast (n.d.). IBISWorld, Inc., New York, NY.
IBISWorld Data and Sources (n.d.). IBISWorld, Inc., New York, NY.
IBISWorld Navigation Tips (n.d.). IBISWorld, Inc., New York, NY.
IBISWorld is a proprietary database providing industry research. It is accessible via the Trident Online Library, Additional Library Resources.
Trident Online Website: https://mytlc.trident.edu
Locate: Library Access and click Additional Library Resources

1. Case Assignment
Assume once again that you are a consultant who works for the Diligent Consulting Group. You are continuing to work on the analysis of the customer database from Modules 1 through 3.

Complete the following tasks in the Module 4 SLP assignment template:

1. Compare the coefficients of determination (r-squared values) from the two linear regressions: simple linear regression from Module 3 Case and the multivariate regression from Module 4 Case. Which model had the “best fit”?
2. Calculate the residual for the first observation from the simple linear regression model. Recall that Residual = Observed value – Predicted value, or e = y – ŷ.
3. What happens to the overall distance between the best fit line and the coordinates in the scatterplot when the residuals shrink?
4. What happens to the coefficient of determination when the residuals shrink?
5. Consider the r-squared from the linear regression model and the r-squared from the multivariate regression model. Why did the coefficient of determination change when more variables were added to the model?
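
The quantities named in the tasks above (residuals and the coefficient of determination) can be sketched in Python as follows. This is an illustration under stated assumptions, not the assignment's solution: the data are invented and are not the customer database, the helper names `fit_ols`, `predict`, and `r_squared` are hypothetical, and the fit uses the textbook normal equations rather than Excel's Regression tool. It relies on the standard definition r-squared = 1 – SSE/SST, where SSE is the sum of squared residuals and SST is the total sum of squares around the mean of y.

```python
def fit_ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    X = [[1.0] + list(r) for r in rows]              # prepend an intercept column
    k = len(X[0])
    # Augmented system (X'X | X'y)
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * yi for x, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefs, row):
    """Predicted value y-hat for one observation."""
    return coefs[0] + sum(b * x for b, x in zip(coefs[1:], row))

def r_squared(coefs, rows, y):
    """Coefficient of determination: 1 - SSE / SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - predict(coefs, r)) ** 2 for r, yi in zip(rows, y))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst

# Invented data: y is the response, x1 and x2 are two hypothetical predictors.
x1 = [1, 2, 3, 4, 5, 6]
x2 = [2, 1, 4, 3, 6, 5]
y = [3, 4, 8, 8, 13, 12]

simple_rows = [(a,) for a in x1]          # simple regression: y on x1
multiple_rows = list(zip(x1, x2))         # multiple regression: y on x1 and x2

simple = fit_ols(simple_rows, y)
multiple = fit_ols(multiple_rows, y)

# Residual for the first observation: e = y - y-hat
first_residual = y[0] - predict(simple, simple_rows[0])

r2_simple = r_squared(simple, simple_rows, y)
r2_multiple = r_squared(multiple, multiple_rows, y)

print(f"First residual (simple model): {first_residual:.4f}")
print(f"r-squared, simple model:   {r2_simple:.4f}")
print(f"r-squared, multiple model: {r2_multiple:.4f}")
```

Because SST depends only on y, smaller residuals mean a smaller SSE and therefore a larger r-squared. For least-squares fits on the same data, adding a predictor cannot increase SSE, so r-squared never decreases when a variable is added, whether or not that variable is genuinely useful.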
