QUESTION 1: (15 MARKS)
Electric Co. Ltd buys electrical components in batches of 2000. From time to time a batch is randomly
selected and, for quality control purposes, all the components are inspected. The data gives the
number of defective components found in 40 batches recently bought.
12 16 81 49 60 17 19 48
34 20 25 50 32 72 57 44
76 62 93 43 47 93 86 71
54 66 48 51 27 22 16 53
48 61 33 19 78 49 98 19
(a) Construct a stem and leaf display for the number of defective components. [2 marks]
(b) Using the stem and leaf, evaluate the median, lower quartile, upper quartile and the interquartile
range of the given distribution. [2 marks]
(c) Using graph paper, draw a box and whisker diagram to represent these data and comment on
the shape of the distribution. [2 marks]
(d) Form a grouped frequency distribution for the number of defective components with classes of
equal width of 10 starting with the smallest value. [2 marks]
(e) Using the grouped frequency distribution in (d), evaluate an estimate of the mean, variance,
standard deviation and the coefficient of variation. [5 marks]
(f) Using graph paper, draw a histogram to represent the frequency distribution for the grouped
frequency data obtained in (d). [2 marks]

QUESTION 2: (5 MARKS)
A random sample of 50 professional football players had their heights, x, and their weights, y,
measured. The results are summarized below.
n = 50, ∑x = 3,681.2, ∑y = 9,973.3, ∑x² = 271,137, ∑y² = 2,011,873, ∑xy = 734,756
a) Calculate the correlation coefficient for the data. Hence comment on the result. [1 mark]
b) Calculate the coefficient of determination for the data. Hence comment on the result.
[1 mark]
c) Calculate the equation of the linear regression line of y on x. [3 marks]
QUESTION 3: (10 MARKS)
The table below is based on data collected in the 2022 census in Botswana. The data indicated are for
the residents of Gaborone who were in work.
Respondents were randomly chosen in the survey. They were asked to state their home postcode and
the postcode of their place of work. This information was used to calculate the distance they travelled
to work. Respondents were able to specify that they worked mainly at or from home, or that their
work pattern did not include regular travel to a fixed place of work.
Distances travelled to work      All persons     Males   Females
All categories                       137,978    71,329    66,649
Less than 10 km                       80,753    34,806    45,947
10 km to less than 30 km              26,946    15,554    11,392
30 km and over                         6,524     4,637     1,887
Work mainly at or from home           13,641     8,438     5,203
Other                                 10,114     7,894     2,220
Calculate, for males and females separately, the percentages travelling less than 10 km, travelling
between 10 km and 30 km, and travelling 30 km and over, working mainly at or from home and other.

Sample Answer


Analysis of Defective Components in Batches
Introduction
In this analysis, we will examine the data provided by Electric Co. Ltd, which includes the number of defective components found in 40 randomly selected batches. The purpose of this analysis is to understand the distribution of defective components and to evaluate various statistical measures related to this data.

Stem and Leaf Display
To construct a stem and leaf display for the number of defective components, we separate each number into a stem and a leaf. The stem represents the tens digit, and the leaf represents the units digit. Here is the stem and leaf display for the given data:

1: 2 6 6 7 9 9 9
2: 0 2 5 7
3: 2 3 4
4: 3 4 7 8 8 8 9 9
5: 0 1 3 4 7
6: 0 1 2 6
7: 1 2 6 8
8: 1 6
9: 3 3 8
Key: 1: 2 represents 12 defective components.
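
As a cross-check on the display, the stems and leaves can be rebuilt directly from the 40 values listed in the question. The sketch below is only illustrative: the names DATA and stem_and_leaf are chosen here and are not part of the question.

```python
from collections import defaultdict

# Defect counts for the 40 inspected batches, as listed in the question.
DATA = [
    12, 16, 81, 49, 60, 17, 19, 48,
    34, 20, 25, 50, 32, 72, 57, 44,
    76, 62, 93, 43, 47, 93, 86, 71,
    54, 66, 48, 51, 27, 22, 16, 53,
    48, 61, 33, 19, 78, 49, 98, 19,
]


def stem_and_leaf(values):
    """Map each tens digit (stem) to its sorted units digits (leaves)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 47 -> stem 4, leaf 7
        rows[stem].append(leaf)
    return dict(rows)


if __name__ == "__main__":
    for stem, leaves in stem_and_leaf(DATA).items():
        print(f"{stem}: {' '.join(map(str, leaves))}")
```

Because the values are sorted before being split, the stems come out in increasing order and the leaves within each stem are already ordered.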
Statistical Measures
Using the stem and leaf display, we can calculate various statistical measures for the given distribution:

Median: The median is the middle value of the ordered data. With 40 values, it is the average of the 20th and 21st values, which are 48 and 49. Therefore, the median is (48 + 49) / 2 = 48.5.
Lower Quartile: The lower quartile is the median of the lower half of the data (the 20 smallest values). It is the average of the 10th and 11th values, which are 25 and 27. Therefore, the lower quartile is (25 + 27) / 2 = 26.
Upper Quartile: The upper quartile is the median of the upper half of the data (the 20 largest values). It is the average of the 30th and 31st values, which are 62 and 66. Therefore, the upper quartile is (62 + 66) / 2 = 64.
Interquartile Range: The interquartile range is the difference between the upper quartile and the lower quartile. In this case, the interquartile range is 64 – 26 = 38.
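
The same figures can be checked numerically. The sketch below assumes the "median of each half" convention described above; other quartile conventions exist and can give slightly different values. The function name quartiles is chosen here for illustration.

```python
from statistics import median

# Defect counts for the 40 inspected batches, as listed in the question.
DATA = [
    12, 16, 81, 49, 60, 17, 19, 48,
    34, 20, 25, 50, 32, 72, 57, 44,
    76, 62, 93, 43, 47, 93, 86, 71,
    54, 66, 48, 51, 27, 22, 16, 53,
    48, 61, 33, 19, 78, 49, 98, 19,
]


def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    s = sorted(values)
    half = len(s) // 2
    lower = s[:half]
    # For an odd number of values the middle value belongs to neither half.
    upper = s[half:] if len(s) % 2 == 0 else s[half + 1:]
    return median(lower), median(s), median(upper)


q1, q2, q3 = quartiles(DATA)
print(q1, q2, q3, q3 - q1)  # prints 26.0 48.5 64.0 38.0
```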
Box and Whisker Diagram
To represent the data visually, we can draw a box and whisker diagram. This diagram provides a visual representation of the statistical measures calculated above.

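12     26        48.5     64              98
|------[==========|=======]----------------|

In this box and whisker diagram, the line inside the box represents the median (48.5). The box spans the interquartile range, from the lower quartile (26) to the upper quartile (64). The whiskers extend from the box to the minimum and maximum values in the data set (12 and 98). The upper whisker (64 to 98) is much longer than the lower whisker (12 to 26), so the distribution has a longer right tail and appears slightly skewed to the right, although within the box the median lies somewhat closer to the upper quartile.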

Grouped Frequency Distribution
To create a grouped frequency distribution, we group the data into classes of equal width 10, starting with the smallest value, 12. The first class is therefore 12 – 21, and the last class, 92 – 101, contains the largest value, 98.

Class         Frequency
[12 – 21]     8
[22 – 31]     3
[32 – 41]     3
[42 – 51]     10
[52 – 61]     5
[62 – 71]     3
[72 – 81]     4
[82 – 91]     1
[92 – 101]    3
Total         40
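
The tally for classes of width 10 starting at the smallest value (12) can be reproduced from the raw data. This is a minimal sketch; the name grouped_frequencies and the choice of inclusive integer class limits (12 to 21, 22 to 31, and so on) are assumptions made here for illustration.

```python
# Defect counts for the 40 inspected batches, as listed in the question.
DATA = [
    12, 16, 81, 49, 60, 17, 19, 48,
    34, 20, 25, 50, 32, 72, 57, 44,
    76, 62, 93, 43, 47, 93, 86, 71,
    54, 66, 48, 51, 27, 22, 16, 53,
    48, 61, 33, 19, 78, 49, 98, 19,
]


def grouped_frequencies(values, width=10):
    """Count values per class; the first class starts at min(values)."""
    start, stop = min(values), max(values)
    # Create every class up to the one holding the maximum, so that
    # empty classes would still appear with frequency 0.
    counts = {(lo, lo + width - 1): 0 for lo in range(start, stop + 1, width)}
    for v in values:
        lo = start + ((v - start) // width) * width
        counts[(lo, lo + width - 1)] += 1
    return counts


if __name__ == "__main__":
    for (lo, hi), f in grouped_frequencies(DATA).items():
        print(f"[{lo} - {hi}]: {f}")
```

The frequencies should always add up to the number of observations (40 here), which is a quick way to catch a tallying slip.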

Estimate of Statistical Measures
Using the grouped frequency distribution, we can estimate various statistical measures for the given distribution:

Mean: To estimate the mean, we take the midpoint of each class (16.5, 26.5, ..., 96.5 for the classes 12 – 21, 22 – 31, ..., 92 – 101) and multiply it by the class frequency (8, 3, 3, 10, 5, 3, 4, 1, 3). We then sum these products and divide by the total number of observations. The products sum to 1,950, so the estimated mean is 1,950 / 40 = 48.75.
Variance: To estimate the variance, we calculate the squared difference between each midpoint and the estimated mean, multiply it by the class frequency, sum these products, and divide by the total number of observations. The products sum to 22,897.5, so the estimated variance is 22,897.5 / 40, which is approximately 572.44.
Standard Deviation: The estimated standard deviation is the square root of the estimated variance. For this data set, it is sqrt(572.44), which is approximately 23.93.
Coefficient of Variation: The coefficient of variation is the estimated standard deviation divided by the estimated mean, multiplied by 100. For this data set, it is (23.93 / 48.75) * 100, which is approximately 49.1%.
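
The grouped estimates can be verified with a few lines of code. The sketch below assumes classes of width 10 starting at 12, with frequencies tallied from the 40 values in the question, and divides by n (not n - 1) as in the description above. The names CLASSES and grouped_summary are chosen here for illustration.

```python
from math import sqrt

# (lower limit, upper limit, frequency) for classes of width 10 starting
# at 12, tallied from the 40 defect counts in the question.
CLASSES = [
    (12, 21, 8), (22, 31, 3), (32, 41, 3), (42, 51, 10), (52, 61, 5),
    (62, 71, 3), (72, 81, 4), (82, 91, 1), (92, 101, 3),
]


def grouped_summary(classes):
    """Return (mean, variance, sd, cv%) estimated from class midpoints."""
    n = sum(f for _, _, f in classes)
    mids = [((lo + hi) / 2, f) for lo, hi, f in classes]
    mean = sum(m * f for m, f in mids) / n
    # Variance with divisor n, matching the description in the text.
    variance = sum(f * (m - mean) ** 2 for m, f in mids) / n
    sd = sqrt(variance)
    cv = 100 * sd / mean
    return mean, variance, sd, cv


mean, variance, sd, cv = grouped_summary(CLASSES)
print(f"mean={mean:.2f} variance={variance:.2f} sd={sd:.2f} cv={cv:.1f}%")
```

If the sample variance is wanted instead, the divisor n in the variance line would be replaced by n - 1, which gives a slightly larger value.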
Histogram
To represent the frequency distribution visually, we can draw a histogram. This graph provides a visual representation of how frequently each class occurs in the data set.

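Class         Frequency
[12 – 21]     xxxxxxxx (8)
[22 – 31]     xxx (3)
[32 – 41]     xxx (3)
[42 – 51]     xxxxxxxxxx (10)
[52 – 61]     xxxxx (5)
[62 – 71]     xxx (3)
[72 – 81]     xxxx (4)
[82 – 91]     x (1)
[92 – 101]    xxx (3)

In this sketch, each row of x marks stands for one bar of the histogram. On graph paper the bars are drawn upright, each with a height equal to its class frequency and the same width of 10, and adjacent bars touch at the class boundaries 11.5, 21.5, ..., 101.5.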

Overall, this analysis provides insights into the distribution of defective components in batches for Electric Co. Ltd. The stem and leaf display, statistical measures, box and whisker diagram, grouped frequency distribution, estimated statistical measures, and histogram help us understand various aspects of this data set and draw meaningful conclusions.
