Statistics - 1 Solution
1 Statistical tests provide a clear and objective means of deciding the differences between a model’s
predictions and experimental data. These tests will show if and how the model can be refined even
further.
2 Predictions based on the model are compared with the experimental data. By analysing these, the
model is adjusted and refined. The process is repeated.
© Pearson Education Ltd 2019. Copying permitted for purchasing institution only. This material is not copyright free. 1
Chapter review
1 Cheaper to use
Easier to use
They enable predictions to be made
Help improve the understanding of our world
Help to see how certain changes in variables will affect the outcomes
Help simplify complex situations
2 Advantages:
They are relatively quick and easy to produce.
They help enable predictions to be made.
Disadvantages:
Simplification of a real-world situation may cause errors as the model is too simplistic.
The model may work only in certain conditions.
3 The answer could be, but is not limited to: 'Climate data can sometimes be too large to investigate thoroughly
as it can be too time consuming, too expensive and logistically difficult to investigate. As a result,
mathematical modelling can be used to simplify the model, but still give meaningful results.'
4 1. Some assumptions need to be made to ensure the model is manageable. These include that birth and death
rates.
2. Plan a mathematical model which will include mathematical models and diagrams
3. Use this model to predict the population over a period of years
4. Include and collect new data that match the conditions of the predicted values.
Exercise 2A
1 a Quantitative as it is numerical.
c Quantitative as it is numerical.
d Quantitative as it is numerical.
2 a Not true
b True
c True
d Not true
b (9.0 + 9.9)/2 = 9.45
b (1.3 + 1.4)/2 = 1.35
Exercise 2B
1 a 700 g, as this is the most often occurring.
d It will increase the mean, as 650 > 600 (the old mean).
It will decrease the median. There will now be an even number of values, so we take the middle
pair: 650 and 700. The new median will be half-way between these: 675.
2 a 256.2/6 = 42.7
b It will increase the mean, as the new piece of data (52) is greater than the old mean (42.7).
3 a For May
v̄ = ∑v/n
= 724 000/31
= 23 354.838...
= 23 355 m (to the nearest whole number)
For June
v̄ = ∑v/n
= 632 000/30
= 21 066.666...
= 21 067 m (to the nearest whole number)
v̄total = (∑vMay + ∑vJune)/(nMay + nJune)
= (724 000 + 632 000)/(31 + 30)
= 22 229.508...
= 22 230 m (to the nearest whole number)
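The combined mean above pools the two monthly totals rather than averaging the two monthly means. A minimal Python sketch of that calculation follows; the variable names are illustrative and only the totals and day counts come from the solution.

```python
# Totals and day counts quoted in the worked solution.
may_total, may_days = 724_000, 31
june_total, june_days = 632_000, 30

may_mean = may_total / may_days
june_mean = june_total / june_days

# Pool the totals and the counts: this weights each month by its number of days.
combined_mean = (may_total + june_total) / (may_days + june_days)

# Averaging the two monthly means ignores the different month lengths.
naive_mean = (may_mean + june_mean) / 2

print(round(may_mean), round(june_mean), round(combined_mean))
```

The naive average of the two means differs slightly from the pooled value because May has one more day than June.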
4 a 8 minutes. Everything else occurs only once, but there are two 8’s.
b 102/10 = 10.2 minutes
c 5 6 7 8 8 9 10 11 12 26
The median is 8.5 minutes.
d The median would be reasonable. The mean is affected by the extreme value of 26.
In this case the mode is close to the median, so would be acceptable; but this would not always be
the case.
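The effect of the extreme value in question 4 can be checked with Python's standard statistics module. This is a sketch using the ten times listed in part c; the variable names are illustrative.

```python
import statistics

# The ten ordered times (in minutes) listed in part c.
times = [5, 6, 7, 8, 8, 9, 10, 11, 12, 26]

mode = statistics.mode(times)
mean = statistics.mean(times)
median = statistics.median(times)

# Dropping the extreme value 26 shows how strongly it pulls the mean upwards.
mean_without_extreme = statistics.mean(times[:-1])

print(mode, mean, median, round(mean_without_extreme, 2))
```

The median barely moves when 26 is removed, which is why it is the more reasonable average here.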
5 a 2 breakdowns
c (8 × 0) + (11 × 1) + (12 × 2) + (3 × 3) + (1 × 4) + (1 × 5) = 53
The mean = 53/36 = 1.47 breakdowns
8 + 57 + 29 + 3 + 1 = 98 celandines
The mean = 618/98 = 6.31 petals (to 2 d.p.)
7 The mean = (1 × 7 + 2 × p + 3 × 2)/(7 + p + 2)
1.5 = (7 + 2p + 6)/(p + 9)
= (2p + 13)/(p + 9)
1.5p + 13.5 = 2p + 13
0.5 = 0.5p
p = 1
Exercise 2C
1 a 351 − 400
c There are 65 observations so the median is the 33rd. The 33rd observation will lie in the class
351–400.
b The answer is an estimate because you don't know the exact data values.
3 a 16 < t < 18
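4 Store A
Mean = ((20.5 × 5) + (30.5 × 16) + (40.5 × 14) + (50.5 × 22) + (60.5 × 26) + (70.5 × 14))/97
= 4828.5/97 = 50 years (to the nearest year)
Store B
Mean = ((20.5 × 4) + (30.5 × 12) + (40.5 × 10) + (50.5 × 28) + (60.5 × 25) + (70.5 × 13))/92
= 4696/92 = 51 years (to the nearest year)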
Store B employs older workers but not by a great margin.
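The two estimated means can be reproduced with a short Python sketch. The helper name grouped_mean is hypothetical; the midpoints and frequencies are those used in the working above.

```python
def grouped_mean(midpoints, frequencies):
    """Estimate the mean of grouped data as sum(f * x) / sum(f)."""
    total = sum(x * f for x, f in zip(midpoints, frequencies))
    return total / sum(frequencies)

midpoints = [20.5, 30.5, 40.5, 50.5, 60.5, 70.5]
store_a = grouped_mean(midpoints, [5, 16, 14, 22, 26, 14])
store_b = grouped_mean(midpoints, [4, 12, 10, 28, 25, 13])

print(round(store_a), round(store_b))
```

Both results are estimates, because each age is replaced by the midpoint of its class.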
Exercise 2D
1 a Q2 = (16 + 1)/2 th value = 8.5th value
1009, 1013, 1014, 1017, 1017, 1017, 1018, 1019, 1021, 1022, 1024, 1024, 1025, 1027, 1029, 1031
Q2 = (1019 + 1021)/2 = 1020 hPa
2
b Q1 = 4.5th value
Q1 = 1017 hPa
Q3 = 12.5th value
Q3 = 1024.5 hPa
2 Q2 = (95 + 1)/2 th value = 48th value
Q2 = 37
Q1 = 24th value
Q1 = 37
Q3 = 72nd value
Q3 = 38
(m − 0)/(1.5 − 0) = (13 − 0)/(18 − 0)
so m ≈ 1.08 (3 sf)
4 a Median: 31/2 = 15.5th value
Using interpolation:
(Q2 − 399.5)/(449.5 − 399.5) = (15.5 − 9)/(19 − 9)
Q2 = 432 kg
b Q1: 31/4 = 7.75th value, so Q1 is in class 350–399
(Q1 − 349.5)/(399.5 − 349.5) = (7.75 − 3)/(9 − 3)
(Q1 − 349.5)/50 = 4.75/6
4 c Q3: 3 × 31/4 = 23.25th value, so Q3 is in class 450–499
(Q3 − 449.5)/(499.5 − 449.5) = (23.25 − 19)/(26 − 19)
(Q3 − 449.5)/50 = 4.25/7
b 65th: 65/100 × 49 = 31.85
(P65 − 40)/(50 − 40) = (31.85 − 16)/(34 − 16)
P65 = 48.8
c (P90 − 50)/(60 − 50) = (44.1 − 34)/(47 − 34)
P90 = 57.8
90th percentile = 57.8 minutes, so more than 10% of customers have to wait longer than 57.8
minutes – not 56 minutes as stated by the firm.
6 a (P80 − 2.5)/(3.0 − 2.5) = (80 − 61)/(89 − 61)
80th percentile = 2.84 m, so 80% of condors have a wingspan of less than 2.84 m.
b The 90th percentile is in the 3.0 ≤ w class. There is no upper boundary for this class, so it is not
possible to estimate the 90th percentile.
Exercise 2E
1 a CF = 4 8 10 17 37 61 71
71 slow worms were measured.
b Q1: 71/4 = 17.75th value, so Q1 is in class 185–199
(Q1 − 184.5)/(199.5 − 184.5) = (17.75 − 17)/(37 − 17)
Q1 − 184.5 = 0.5625
Q1 = 185.0625
Q3: 3 × 71/4 = 53.25th value
So Q3 is in class 200–214
(Q3 − 199.5)/(214.5 − 199.5) = (53.25 − 37)/(61 − 37)
Q3 − 199.5 = 243.75/24
Q3 = 209.656
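The interpolation used for Q1 and Q3 follows one pattern, which can be sketched in Python as below. The function name and parameter names are hypothetical; the class boundaries and cumulative frequencies are those in the working above.

```python
def interpolate(position, lower_bound, upper_bound, cf_below, cf_up_to):
    """Linear interpolation inside a class.

    position  : position of the required value (e.g. n/4 for Q1)
    cf_below  : cumulative frequency before the class starts
    cf_up_to  : cumulative frequency at the end of the class
    """
    fraction = (position - cf_below) / (cf_up_to - cf_below)
    return lower_bound + fraction * (upper_bound - lower_bound)

n = 71
q1 = interpolate(n / 4, 184.5, 199.5, 17, 37)
q3 = interpolate(3 * n / 4, 199.5, 214.5, 37, 61)

print(round(q1, 4), round(q3, 3))
```

The same function covers percentiles: pass the percentile position (for example 65/100 × n) instead of n/4.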
c x̄ = ((132 × 4) + (147 × 4) + (162 × 2) + (177 × 7) + (192 × 20) + (207 × 24) + (222 × 10))/71
= 13 707/71
= 193.1 mm (to 1 d.p.)
Using interpolation:
(217.7 − 214.5)/(229.5 − 214.5) = (y − 61)/(71 − 61)
y = 63.13 …
71 – y = 7.87
2 a 34th: 34/100 × 70 = 23.8
P34 = 1086.7
66th: 66/100 × 70 = 46.2
P66 = 1168.57
3 a 5th: 5/100 × 60 = 3
(P5 − 14.5)/(16.5 − 14.5) = (3 − 0)/(5 − 0)
P5 = 15.7
95th: 95/100 × 60 = 57
(P95 − 20.5)/(22.5 − 20.5) = (57 − 50)/(60 − 50)
P95 = 21.9
b 57 − 3 = 54
So 54 data values
4 a Placing the temperatures in order gives:
9.4 10.3 10.3 10.6 10.9 12.1 12.4 12.7 13.2 14.3
Median position = (n + 1)/2
= (10 + 1)/2
= 5.5
The median lies at the midpoint of 10.9 and 12.1.
(10.9 + 12.1)/2 = 11.5
Therefore the median is 11.5°C.
The lower quartile position is found using:
n/4 = 10/4 = 2.5
Round 2.5 up to 3, therefore Q1 is the 3rd value, 10.3.
The upper quartile position is found using:
3n/4 = 3(10)/4 = 7.5
Round up to 8, therefore Q3 is the 8th value, 12.7.
IQR = Q3 – Q1
= 12.7 – 10.3
= 2.4°C
b On average, the temperature was higher in June than in May (higher median). The temperature
was more variable in May than June (higher IQR).
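The position rule used in part a can be sketched in Python. The helper below is hypothetical and assumes the convention that a whole-number position k means the midpoint of the kth and (k + 1)th values, while any other position is rounded up; for the median it uses n/2, which gives the same value here as the (n + 1)/2 position in the working.

```python
import math

def value_at(sorted_data, position):
    """Value at a quartile position, using the round-up rule (1-based positions)."""
    if float(position).is_integer():
        k = int(position)
        # Whole-number position: midpoint of the kth and (k + 1)th values.
        return (sorted_data[k - 1] + sorted_data[k]) / 2
    # Otherwise round the position up to the next whole number.
    return sorted_data[math.ceil(position) - 1]

temps = sorted([9.4, 10.3, 10.3, 10.6, 10.9, 12.1, 12.4, 12.7, 13.2, 14.3])
n = len(temps)

median = value_at(temps, n / 2)
q1 = value_at(temps, n / 4)
q3 = value_at(temps, 3 * n / 4)
iqr = q3 - q1

print(median, q1, q3, round(iqr, 1))
```

Library quartile functions often use different conventions, so their results may not match this rule exactly.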
Exercise 2F
1 a Mean = 24/8 = 3
b Variance = 78/8 − 3² = 0.75
2 Standard deviation = √(5905/10 − (241/10)²) = 3.11 kg
Mean = 1425/8 = 178.125 ≈ 178
Variance ≈ 60
4 ∑x = 50 + 86 = 136
Mean = 136/25 = 5.44
Standard deviation = √(878/25 − (136/25)²) = 2.35
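The standard deviations in this exercise all come from the summary statistics ∑x, ∑x² and n. A minimal Python sketch of that formula follows; the function name is hypothetical and the figures are those quoted in questions 2 and 4.

```python
import math

def sd_from_sums(sum_x, sum_x2, n):
    """Standard deviation as sqrt(sum(x^2)/n - (sum(x)/n)^2)."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2
    return math.sqrt(variance)

sd_q2 = sd_from_sums(241, 5905, 10)   # question 2
sd_q4 = sd_from_sums(136, 878, 25)    # question 4

print(round(sd_q2, 2), round(sd_q4, 2))
```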
5 a Mean = 869/85 = 10.22 Omani Riyals
Standard deviation = √(9039/85 − (869/85)²) = 1.35 Omani Riyals
s = 66.4
85 − 66.4 = 18.6
So 19 students
6 Standard deviation = √(203/54 − (81/54)²) = 1.23
7 Mean = 805/50 = 16.1 hours
Standard deviation = √(14 062.5/50 − (805/50)²) = 4.69 hours
One standard deviation below mean = 16.1 − 4.69 = 11.41 hours.
(11.41 − 10)/(15 − 10) = (p − 5)/(19 − 5)
p = 8.948
50 − p = 41.052
41 parts tested (82%) lasted longer than one standard deviation below the mean.
According to the manufacturers, this should be 45 parts (90%), so the claim is false.
8 a Mean = 243/30 = 8.1 kn
Standard deviation = √(2317/30 − (243/30)²) = 3.41 kn
b 8.1 + 3.41 = 11.51 kn
(11.51 − 4)/(17 − 4) = (d − 0)/(30 − 0)
d = 17.33
30 − d = 12.67
So 12 days
c The windspeeds are equally distributed throughout the range.
Challenge
Exercise 2G
1 a Code using the formula y = x/10 to give coded data: 11, 9, 5, 8, 3, 7, 6
b 11 + 9 + 5 + 8 + 3 + 7 + 6 = 49
Mean = 49/7 = 7
c 7 = x/10 so x = 70
2 a Code using the formula y = (x − 3)/7 to give coded data: 7, 10, 4, 10, 5, 11, 2, 3
b 7 + 10 + 4 + 10 + 5 + 11 + 2 + 3 = 52
Mean = 52/8 = 6.5
c 6.5 = (x − 3)/7 so x = 48.5
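Decoding in question 2 can be checked numerically. The sketch below rebuilds the original values from the coded data with x = 7y + 3 (the inverse of the coding formula) and compares the means and standard deviations; the variable names are illustrative.

```python
import statistics

# Coded data from question 2, where y = (x - 3) / 7.
coded = [7, 10, 4, 10, 5, 11, 2, 3]
coded_mean = statistics.mean(coded)

# Undo the coding on the mean: x = 7y + 3.
original_mean = 7 * coded_mean + 3

# Undo the coding on every value and compare.
originals = [7 * y + 3 for y in coded]
direct_mean = statistics.mean(originals)

# Subtracting 3 does not affect spread, so only the factor 7 rescales the standard deviation.
coded_sd = statistics.pstdev(coded)
original_sd = statistics.pstdev(originals)

print(coded_mean, original_mean, round(original_sd / coded_sd, 6))
```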
5 a
Battery life (b hours)   Frequency (f)   Midpoint (x)   y = (x − 14)/2
11–21                    11              16             1
21–27                    24              24             5
27–31                    27              29             7.5
31–37                    26              34             10
37–43                    12              40             13
6 a Mean = ((1 × 3) + (1.1 × 12) + (1.2 × 40) + (1.3 × 10) + (1.4 × 5))/70
= 84.2/70
= 1.2 hours
b 84.2/70 = (x − 1)/20 so x = 25.1 hours
c Standard deviation of coded data = √(101.82/70 − (84.2/70)²)
= 0.0877845…
7 Standard deviation of coded data = √(176.84/100 − (131/100)²) = 0.229
Standard deviation = 0.229 × 100 = 22.9
8 Standard deviation of coded data = √(147.03/6 − (16.1/6)²) = 4.16
Standard deviation = 4.16/0.01 = 416
Coded standard deviation = √(Scc/n) = √(296.4/30) = 3.1432…
Chapter review 2
1 (8 × 65) + (12 × 72) = 1384
1384/20 = 69.2 marks
b Coded mean = 45/6 = 7.5
3 a Group A:
Mean = 2852.5/45 = 63.39 marks
Group B:
Mean = 2648/44 = 60.18 marks
b The method used to teach group A is best as the mean mark is higher.
(m − 20.5)/(25.5 − 20.5) = (40 − 30)/(75 − 30)
5 CF = 5 15 30 42 50
Q1: 50/4 = 12.5th value, so Q1 is in class 21–40
(Q1 − 20.5)/(40.5 − 20.5) = (12.5 − 5)/(15 − 5)
(Q1 − 20.5)/20 = 7.5/10
Q1 = 35.5
Q3: (3 × 50)/4 = 37.5th value, so Q3 is in class 61–80
(Q3 − 60.5)/(80.5 − 60.5) = (37.5 − 30)/(42 − 30)
Q3 = 73
6 a 30th: 30/100 × 100 = 30
P30 = 20.5
b 70th: 70/100 × 100 = 70
(P70 − 30.5)/(40.5 − 30.5) = (70 − 60)/(84 − 60)
P70 = 34.7
7 a Q1: 80/4 = 20th value, so Q1 is in class 40–49
(Q1 − 39.5)/(49.5 − 39.5) = (20 − 15)/(51 − 15)
(Q1 − 39.5)/10 = 5/36
Q1 = 40.9
Q3: (3 × 80)/4 = 60th value, so Q3 is in class 50–59
(Q3 − 49.5)/(59.5 − 49.5) = (60 − 51)/(71 − 51)
Q3 = 54
b Variance = 183 040/80 − (3740/80)² = 102.4375 ≈ 102
8 CF = 5 15 41 49 50
Q1: 50/4 = 12.5th value, so Q1 is in class 95–100
(Q1 − 95)/(100 − 95) = (12.5 − 5)/(15 − 5)
Q1 = 98.75
b Q3: 3 × 50/4 = 37.5th value, so Q3 is in class 100–105
(Q3 − 100)/(105 − 100) = (37.5 − 15)/(41 − 15)
Q3 = 104.33
c Standard deviation = √(516112.5/50 − (5075/50)²) = 4.47
9 a x̄ = ∑fx/∑f
= (12 × 26 + 14 × 28 + 4 × 30)/30
= 27.4667...
= 27.5°C
σ² = ∑fx²/∑f − (∑fx/∑f)²
= (12 × 26² + 14 × 28² + 4 × 30²)/30 − (27.4667...)²
= 1.8489...
σ = 1.3597...
= 1.36°C
c x̄ + σ = 27.464... + 1.412
= 28.876...
= 28.9
28.9 lies in the interval 27 ≤ t < 29
(28.9 − 27)/(29 − 27) = (y − 12)/(26 − 12)
y = 25.3
30 – 25.3 = 4.7
Therefore 5 days (to the nearest day)
10 a Coded mean = 106/31 = 3.419…
Coded standard deviation = √(80.55/31) = 1.6119…
11 a Mean = 316/20 = 15.8
Standard deviation = √(5078/20 − (316/20)²) = 2.06
c Coded mean = 104/20 = 5.2
Mean = 5.2 × 10 + 5 = 57 cm
Coded standard deviation = √(1.8/20) = 0.3
Standard deviation = 0.3 × 10 = 3
Challenge
Total = 3.1 × 20 = 62
New total = 62 − 2.3 + 3.2 = 62.9
New mean = 62.9/20 = 3.145
σ = 1.4, σ² = 1.96
1.96 = ∑x²/20 − (62/20)²
∑x² = 231.4
New ∑x² = 231.4 − 2.3² + 3.2² = 236.35
New standard deviation = √(236.35/20 − (62.9/20)²) = 1.39
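The Challenge working replaces one data value and updates ∑x and ∑x² instead of recomputing from raw data. A Python sketch of the same steps follows; the variable names are illustrative and the numbers are those in the working above.

```python
import math

n = 20
old_mean, old_sd = 3.1, 1.4
removed, added = 2.3, 3.2   # the value taken out and the value put in

# Recover the sums from the old mean and standard deviation.
sum_x = old_mean * n
sum_x2 = n * (old_sd ** 2 + old_mean ** 2)

# Swap the value in both sums.
new_sum_x = sum_x - removed + added
new_sum_x2 = sum_x2 - removed ** 2 + added ** 2

new_mean = new_sum_x / n
new_sd = math.sqrt(new_sum_x2 / n - new_mean ** 2)

print(round(new_mean, 3), round(new_sd, 2))
```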
Exercise 3A
1 a, b Class widths are all 5.
1 square unit = 90/60 = 1.5 students
So there were 100 × 1.5 = 150 students who took between 40 and 60 seconds.
So there were 246 × 1.5 = 369 students who took 80 seconds or less.
1 square unit = 80/40 = 2 people
5 a i Use extra columns to help, using the frequency densities given in the histogram:
Class (t)      Frequency      Class width   Frequency density
0 ⩽ t < 20     4              20            0.2
20 ⩽ t < 30    10 × 1 = 10    10            1
30 ⩽ t < 35    15             5             3
35 ⩽ t < 40    25             5             5
ii
b 5/10 × 10 + 15 + 3/5 × 25 = 35 passengers.
6 a 12.5 and 14.5 are the class boundaries, as we are dealing with continuous data.
b i The class boundaries for the 15–17 class are 14.5 and 17.5.
This width is 1.5 times the width of the 13–14 class, since 17.5 – 14.5 = 3 = 1.5 × 2.
So the width of the class is 1.5 × 4 = 6 cm.
ii The frequency density for the 13–14 class is 24/2 = 12.
The frequency density of this class is 6, which is 0.5 times the frequency density above: 12.
So the height of the class is 0.5 × 6 = 3 cm.
7 a The 10 ⩽ w < 11 interval is half the width of the 8 ⩽ w < 10 interval therefore it should be
0.5 cm wide.
The 8 ⩽ w < 10 interval has a frequency of 8 and an area of 16, so the 10 ⩽ w < 11 interval should
be 12 cm high.
7 b x̄ = ∑fx/∑f
= (4 × 6 + 8 × 9 + 6 × 10.5 + 7 × 11.5 + 5 × 13.5 + 1 × 15.5)/31
= 10.403...
= €10.40
σ² = ∑fx²/∑f − (∑fx/∑f)²
= (4 × 6² + 8 × 9² + 6 × 10.5² + 7 × 11.5² + 5 × 13.5² + 1 × 15.5²)/31 − (10.403...)²
= 5.668...
σ = 2.380...
= €2.38
c Q1: 31/4 = 7.75, so Q1 is the 8th value
(Q1 − 8)/(10 − 8) = (8 − 4)/(12 − 4)
Q1 = €9
Challenge
Exercise 3B
1 IQR = 68−46 = 22
46 − 1.5 × 22 = 13
68 + 1.5 × 22 = 101
2 a Outliers are < 400 – 180 = 220 or > 580 + 180 = 760. So there are no outliers.
b Outliers are < 260 – 80 = 180 or > 340 + 80 = 420. So 170 g and 440 g are both outliers.
c 760 g
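The outlier limits in questions 1 and 2 follow the same pattern: a multiple of the IQR below Q1 and above Q3. The sketch below is a hypothetical helper with the multiplier k as a parameter: k = 1.5 matches question 1, and k = 1 matches the limits used in question 2.

```python
def outlier_limits(q1, q3, k=1.5):
    """Return (lower, upper) limits: Q1 - k * IQR and Q3 + k * IQR."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Question 1: Q1 = 46, Q3 = 68, multiplier 1.5.
limits_q1 = outlier_limits(46, 68)

# Question 2: the working uses 1 x IQR on each side.
limits_q2a = outlier_limits(400, 580, k=1)
limits_q2b = outlier_limits(260, 340, k=1)

def outliers(values, limits):
    lower, upper = limits
    return [v for v in values if v < lower or v > upper]

print(limits_q1, limits_q2a, limits_q2b, outliers([170, 440], limits_q2b))
```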
3 a Mean = 6.1 kg
Variance = 4.2, so standard deviation = √4.2 = 2.05 (to 3 s.f.)
Mean − 2 × standard deviation = 6.1 − 2 × 2.05 = 2.00 (to 3 s.f.)
Mean + 2 × standard deviation = 6.1 + 2 × 2.05 = 10.2 (to 3 s.f.)
So 11.5 kg is an outlier.
4 a Mean = ∑x/n = 92/9 = 10.2 (to 3 s.f.)
Standard deviation = √(∑x²/n − (∑x/n)²)
= √(1428/9 − (92/9)²)
= 7.36 (to 3 s.f.)
d ∑x − 30 = 92 − 30 = 62
Mean = 62/8 = 7.75
∑x² − 30² = 1428 − 900 = 528
Standard deviation = √(528/8 − (62/8)²) = 2.44 (to 3 s.f.)
Exercise 3C
1
b 38 marks
c IQR = 47 − 32 = 15 marks
d Range 76 − 12 = 64 marks
b It is more likely to be a female. Very few male turtles weigh as little as this, but roughly three-
quarters of the female turtles weigh even less.
4 a Q1 = 22
Q2 = 26
Q3 = 30
Exercise 3D
1 Key: 1|2 means 12 movies
Unordered:
0 | 6 9 3
1 | 2 7 2 8 5 5 5 9 2
2 | 5 9 5 9 2 0 6 7 7 6 5
3 | 4 4 2 2 5
4 | 5 2
Ordered:
0 | 3 6 9
1 | 2 2 2 5 5 5 7 8 9
2 | 0 2 5 5 5 6 6 7 7 9 9
3 | 2 2 4 4 5
4 | 2 5
a The median is the (n + 1)/2 = (30 + 1)/2 = 15.5th piece of data.
Therefore the median lies at the midpoint of the 15th and 16th pieces of data.
(25 + 25)/2 = 25
2 a 49
b 8
c 3
d 37
2 e The median is the (n + 1)/2 = (24 + 1)/2 = 12.5th piece of data.
Therefore the median lies at the midpoint of the 12th and 13th pieces of data.
(34 + 34)/2 = 34
Unordered:
      9 8 | 2 | 6 4 8
    2 4 2 | 3 | 2 4 4 9 3
5 4 8 7 5 | 4 | 7 5 6
7 6 6 4 4 | 5 | 4 2
        0 | 6 |
Key: 2|6 means 26
Ordered:
     Boys |   | Girls
      9 8 | 2 | 4 6 8
    4 2 2 | 3 | 2 3 4 4 9
8 7 5 5 4 | 4 | 5 6 7
7 6 6 4 4 | 5 | 2 4
        0 | 6 |
4 a The median is the (n + 1)/2 = (51 + 1)/2 = 26th piece of data.
So the median is 19.
Exercise 3E
1 Q2 = $36.50, Q3 = $45.75, IQR = $30.50
Q1 = Q3 – IQR
= 45.75 – 30.50
= $15.25
Q2 – Q1 = 36.50 – 15.25 = $21.25
Q3 – Q2 = 45.75 – 36.50 = $9.25
Q2 – Q1 > Q3 – Q2, so negatively skewed.
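The quartile comparison used to decide skewness can be written as a small Python sketch. The function name is hypothetical; the first call uses the quartiles from question 1 and the second uses Q1 = 56, Q2 = 65, Q3 = 81 from question 3.

```python
def skew_from_quartiles(q1, q2, q3):
    """Classify skewness by comparing Q2 - Q1 with Q3 - Q2."""
    lower_gap = q2 - q1
    upper_gap = q3 - q2
    if lower_gap > upper_gap:
        return "negative"
    if lower_gap < upper_gap:
        return "positive"
    return "symmetrical"

q3_value, iqr = 45.75, 30.50
q1_value = q3_value - iqr          # Q1 = Q3 - IQR

question_1 = skew_from_quartiles(q1_value, 36.50, q3_value)
question_3 = skew_from_quartiles(56, 65, 81)

print(q1_value, question_1, question_3)
```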
3 a 64 mm
b The median is the (n + 1)/2 = (67 + 1)/2 = 34th piece of data.
Therefore the median is 65 mm.
To find the lower quartile:
n/4 = 67/4 = 16.75
Since this is not a whole number, round up, so the lower quartile is the 17th piece of data, therefore
Q1 = 56 mm
To find the upper quartile:
3n/4 = 201/4 = 50.25
Since this is not a whole number, round up, so the upper quartile is the 51st piece of data, therefore
Q3 = 81 mm
3 e x̄ = ∑x/∑f
= 4604/67
= 68.716...
= 68.7 mm (3 s.f.)
σ² = ∑x²/∑f − (∑x/∑f)²
= 328 996/67 − (68.716...)²
= 188.499...
σ = 13.729...
= 13.7 mm (3 s.f.)
f Q2 – Q1 = 65 – 56 = 9 mm
Q3 – Q2 = 81 – 65 = 16 mm
Q2 – Q1 < Q3 – Q2
Therefore positive skew.
Challenge
There are (1 × 10) + (3.5 × 10) + (5.5 × 10) + (2 × 10) = 120 small squares.
Therefore 1 small square represents 1 orange.
x̄ = ∑fx/∑f where x is the midpoint of each group.
x̄ = (65 × 10 + 75 × 35 + 85 × 55 + 95 × 20)/120
= 82.083...
= 82.1 mm (3 s.f.)
σ² = (65² × 10 + 75² × 35 + 85² × 55 + 95² × 20)/120 − (82.083...)²
= 70.714...
σ = 8.409...
= 8.41 mm (3 s.f.)
It is an estimate because the data is grouped. There are values above and below 2 standard deviations
and therefore there are probably outliers. The distribution is negatively skewed.
Exercise 3F
1 Median for motorway A = 76 km/h
Median for motorway B = 72 km/h
The median speed is greater on motorway A than on motorway B. The spread of speeds for motorway
B is greater than the spread of speeds for motorway A.
2 Mean for class 2B = Σx/n = 650/20 = 32.5 minutes
Mean for class 2F = Σx/n = 598/22 = 27.2 minutes (to 3 s.f.)
Standard deviation for class 2B = √(Σx²/n − (Σx/n)²)
= √(22000/20 − (650/20)²)
= 6.61 (to 3 s.f.)
Standard deviation for class 2F = √(Σx²/n − (Σx/n)²)
= √(19100/22 − (598/22)²)
= 11.4 (to 3 s.f.)
The mean time for Class 2B is higher than the mean time for Class 2F, showing that Class 2F are
generally faster at completing the puzzle. The standard deviation for Class 2F is bigger than for Class
2B, showing that the times are more spread out.
3 a The median is the (n + 1)/2 = (22 + 1)/2 = 11.5th piece of data.
Therefore the median lies halfway between the 11th and 12th pieces of data.
(26 + 27)/2 = 26.5 years
To find the lower quartile:
n/4 = 22/4 = 5.5
Since this is not a whole number, round up, so the lower quartile is the 6th piece of data, therefore
Q1 = 22 years
To find the upper quartile:
3n/4 = 66/4 = 16.5
Since this is not a whole number, round up, so the upper quartile is the 17th piece of data, therefore
Q3 = 39 years
IQR = Q3 – Q1
= 39 – 22
= 17 years
4 • Median marks for students taking their exam for the first time are lower than for students retaking
their exam.
• There is more variability for students retaking their exam. This is shown by the interquartile range
being smaller for students taking the exam for the first time compared to students retaking.
• The range of marks for students taking the exam for the first time is lower than that for students
retaking the exam.
• Both groups' marks are positively skewed.
Chapter review 3
1 a Q1: 31/4 = 7.75 so we pick the 8th value: 178
Q2: (31 + 1)/2 = 16 so we pick the 16th value: 185
Q3: 3/4 × 31 = 23.25 so we pick the 24th value: 196
So 226 km is an outlier.
2 a 45 minutes
b 60 minutes
d The Runners Club had the highest median, so overall they had the slowest runners.
The interquartile ranges were about the same, with the Runners Club slightly more spread out.
e With the exception of the outlier, the Marathon Club runners were faster in every respect. Their
minimum, Q1, Q2, Q3 and maximum times were all lower than the corresponding times for the
runners from the Runners Club.
2 f Advantages, any one from:
3 a The median is the (n + 1)/2 = (35 + 1)/2 = 18th piece of data.
Therefore the median is 39 years.
To find the lower quartile:
n/4 = 35/4 = 8.75
Since this is not a whole number, round up, so the lower quartile is the 9th piece of data, therefore
Q1 = 31 years
To find the upper quartile:
3n/4 = 105/4 = 26.25
Since this is not a whole number, round up, so the upper quartile is the 27th piece of data, therefore
Q3 = 55 years
c For Zoo 1
Q2 – Q1 = 64 – 44 = 20
Q3 – Q2 = 76 – 64 = 12
Q2 – Q1 > Q3 – Q2 therefore the distribution of Zoo 1 is negatively skewed
For Zoo 2
Q2 – Q1 = 39 – 31 = 8
Q3 – Q2 = 55 – 39 = 16
Q2 – Q1 < Q3 – Q2 therefore the distribution of Zoo 2 is positively skewed
29k = 58 so k = 2
Number of girls who took longer than 56 seconds = 2((4.5 × 2) + (1 × 4)) = 26 girls
4 b Number of girls who took between 52 and 55 seconds = 2((1.5 × 2) + (½ × 2 × 5.5)) = 17 girls
6 a
b Mean = Σfy/n = 988.85/50 = 19.777 kg
Standard deviation = √(Σfy²/n − μ²) = √(19 602.84/50 − 19.777²) = √0.927 = 0.963 (to 3 s.f.)
c Median = 20.0 + 13.5/22 × 0.2 = 20.123 (to 3 d.p.)
7 a 312/14 = 22.286 (to 3 d.p.)
3 0 3 1
4 0 2
3 0 1 3
4 0 2
The median is the (n + 1)/2 = (14 + 1)/2 = 7.5th piece of data.
Therefore the median lies halfway between the 7th and 8th pieces of data.
The median is (20 + 20)/2 = 20
To find the lower quartile:
n/4 = 14/4 = 3.5
Since this is not a whole number, round up, so the lower quartile is the 4th piece of data, therefore
Q1 = 13 bags
To find the upper quartile:
3n/4 = 42/4 = 10.5
Since this is not a whole number, round up, so the upper quartile is the 11th piece of data, therefore
Q3 = 31 bags
13 – 27 = −14
31 + 27 = 58
7 d
e Q2 – Q1 = 20 – 13 = 7
Q3 – Q2 = 31 – 20 = 11
Q2 – Q1 < Q3 – Q2 therefore the distribution is positively skewed
8 a 22
b For Suha
To find the lower quartile:
n/4 = 21/4 = 5.25
Since this is not a whole number, round up, so the lower quartile is the 6th piece of data, therefore
Q1 = 11 bicycles
To find the upper quartile:
3n/4 = 63/4 = 15.75
Since this is not a whole number, round up, so the upper quartile is the 16th piece of data, therefore
Q3 = 22 bicycles
For Jameela
The median is the (n + 1)/2 = (21 + 1)/2 = 11th piece of data.
Therefore the median is 27 bicycles.
So X = 11, Y = 27 and Z = 22
9 a For 1987
x̄ = ∑x/n
= 356.1/30
= 11.87
= 11.9°C (3 s.f.)
For 2015
x̄ = ∑x/n
= 364.1/30
= 12.136...
= 12.1°C (3 s.f.)
b For 1987
σ² = ∑x²/n − (∑x/n)²
= 4408.9/30 − (11.87)²
= 6.066...
σ = 2.463...
= 2.46°C (3 s.f.)
The mean temperature was slightly higher in 2015 than in 1987. The standard deviation of
temperatures was higher in 1987 (2.46°C) than in 2015 showing that the temperatures were more
spread out in 1987.
Challenge
70–89 4 4k 20 x
100–109 20 20k 10
110–139 9 9k 30
140–179 2 2k 40
Area of 70–89 bar = 20x = 4k, so x = k/5
Using substitution
10 × k/5 = 0.4
Height = Area/Class width = 18/30 = 0.6 cm
Exercise 4A
1
P(same) = 2/4 = 1/2
2 a
b i P(X = 24) = 2/36 = 1/18
ii P(X < 5) = 8/36 = 2/9
iii P(X is even) = 27/36 = 3/4
3 a P(m ⩾ 54) = (33 + 21 + 2)/140 = 56/140 = 2/5
b P(48 ⩽ m < 57) = (25 + 42 + 33)/140 = 100/140 = 5/7
3 c Let B = the number of Bullmastiffs with mass less than 53 kg.
Using interpolation:
(53 − 51)/(54 − 51) = (B − 17 − 25)/42
So B = 70
P(m < 53) = 70/140 = 0.5, so half of the Bullmastiffs are estimated to have a mass less than 53 kg.
This probability is lower than the probability of 0.54 for Rottweilers, and so it is less likely.
The assumption made is that the frequency is uniformly distributed throughout the class.
4 a P(female) = (14 + 15 + 32 + 27 + 26)/240 = 114/240 = 19/40
b P(l < 80) = (4 + 14 + 20 + 15 + 24 + 32)/240 = 109/240
c P(male and 75 ⩽ l < 85) = (24 + 47)/240 = 71/240
The assumption is that the distribution of lengths of koalas between 70 and 75 cm is uniform.
5 a P(m > 5) = (1 × 24 + 2 × 4)/70 = 32/70 = 16/35
b Start with the probability that the cat has a mass greater than 6.5.
P(m > 6.5) = (¾ × 2 × 4)/70 = 6/70 = 3/35
So P(m < 6.5) = 1 − 3/35 = 32/35
The fact that we have ignored the case 6.5 is not a problem in this estimate. We are assuming that
the class is continuous when we interpolate, and that the probability of being exactly equal to any
individual value is negligible.
Challenge
But P(Y ⩾ 20) = 1 is impossible, as the product of 2 and 4 is only 8, so x cannot be even.
But P(Y is even) = P(Y ⩾ 20), so there must also be four values where Y ⩾ 20.
Two of them are in the top row: 28 and 20, leaving two in the bottom row.
Given that exactly two of these three values are greater than or equal to 20:
Exercise 4B
1 a
b i P(E) = 14/25
ii P(E ∩ A) = 6/25
iii P(English but not Arabic) = 8/25
iv P(Neither English nor Arabic) = 1/25
2 a
b i P(All three) = 15/125 = 3/25
ii P(Pasta but not cheesecake and not garlic bread) = 10/125 = 2/25
iii P(Garlic bread and pasta but not cheesecake) = 10/125 = 2/25
iv P(None) = 54/125
3 a
b i P(Plays piano) = 89/275
ii P(At least 2) = (64 + 9 + 29 + 1)/275 = 103/275
iii P(Plays exactly one) = (20 + 15 + 35)/275 = 14/55
iv P(Plays none) = 102/275
5
7 P(M) = 0.32 + p
P(P) = p + q + 0.07
p = 0.13, q = 0.25
Challenge
P(B) = p + q + 0.05
P(A) = 0.15 + p
As P(not C) = 0.83
Exercise 4C
1 a
b P(A ∪ B) = 0.7
c P(A′ ∩ B′) = 0.3
2 P(Sum of 4) = 3/36 = 1/12
P(Same number) = 6/36 = 1/6
P(Sum of 4) + P(Same number) = 1/12 + 1/6 = 1/4
P(Sum of 4 or same number) = 8/36 = 2/9
As 1/4 ≠ 2/9, the events are not mutually exclusive.
Alternatively: A roll of 2 followed by another roll of 2 fits both conditions, so the intersection is not
empty, and the events are not mutually exclusive.
5 a The closed curves representing bricks and trains do not overlap and so they are mutually exclusive.
b P(B and F) = 1/(3 + 1 + 4 + 6 + 2 + 5) = 1/21
P(B) × P(F) = (3 + 1)/21 × (1 + 4 + 6)/21 = 4/21 × 11/21 = 44/441
As P(B and F) ≠ P(B) × P(F), 'plays with bricks' and 'plays with action figures' are not independent
events.
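The independence test in part b compares P(B and F) with P(B) × P(F). Exact fractions avoid rounding issues, as in this Python sketch; the region counts are those in the working above and the variable names are illustrative.

```python
from fractions import Fraction

# Total number of children, summing the region counts used above.
total = 3 + 1 + 4 + 6 + 2 + 5

p_b_and_f = Fraction(1, total)           # plays with both bricks and action figures
p_b = Fraction(3 + 1, total)             # plays with bricks
p_f = Fraction(1 + 4 + 6, total)         # plays with action figures

product = p_b * p_f
independent = (p_b_and_f == product)

print(p_b_and_f, product, independent)
```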
x = 0.25
6 b P(A and B) = x = 0.25
As P(A and B) ≠ P(A) × P(B), the two events ‘like pasta’ and ‘like pizza’ are not independent.
ii P(neither S nor T) = 1 – P(S or T) = 1 – (P(S but not T) + P(T)) = 1 – (0.18 + 0.4) = 0.42
P(X) = 1 – (P(W and not X) + P(neither W nor X)) = 1 – (0.25 + 0.3) = 0.45
As P(W and X) ≠ P(W) × P(X), the two events W and X are not independent.
As P(R and F) ≠ P(R) × P(F), the two events R and F are not independent.
10 P(A and B) = p
P(A) × P(B) = (p + 0.42)(p + 0.11)
As A and B are independent, (p + 0.42)(p + 0.11) = p
p² + 0.53p + 0.0462 = p
p² − 0.47p + 0.0462 = 0, a quadratic in p, which we can solve with the quadratic formula
p = 0.33 or 0.14
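The quadratic in question 10 can be solved numerically as a check. This Python sketch applies the quadratic formula to p² − 0.47p + 0.0462 = 0 and then confirms that each root satisfies (p + 0.42)(p + 0.11) = p; the variable names are illustrative.

```python
import math

a, b, c = 1, -0.47, 0.0462

discriminant = b * b - 4 * a * c
root_small = (-b - math.sqrt(discriminant)) / (2 * a)
root_large = (-b + math.sqrt(discriminant)) / (2 * a)

# Each root should make P(A) x P(B) equal to P(A and B) = p.
checks = [abs((p + 0.42) * (p + 0.11) - p) < 1e-9 for p in (root_small, root_large)]

print(round(root_small, 2), round(root_large, 2), checks)
```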
Challenge
P(A and not B) = P(A) − P(A and B) = p − pq, and notice P(not B) = 1 − P(B) = 1 − q
As P(A and not B) = P(A) × P(not B), the events A and 'not B' are independent.
As P(not A and not B) = P(not A) × P(not B), the events 'not A' and 'not B' are independent.
Exercise 4D
1 a This is the set of anything not in set B but in set A. So the shaded region consists of the part of A
which does not intersect with B, i.e. A ∩ B′.
b The shaded region includes all of B and the region outside of A and B, i.e. B ∪ A′.
c There are two regions to describe. The first is the intersection of A and B, i.e. A ∩ B and the
second is everything that is not in either A or B, i.e. A′ ∩ B′. Therefore the shaded region is
(A ∩ B) ∪ (A′ ∩ B′).
f The shaded region is anything that is either in A or B but is not in C. So the shaded region consists
of the part of A ∪ B which does not intersect with C, i.e. (A ∪ B) ∩ C′.
2 a Shade set A. The set B′ consists of the region outside of A and B and the region inside A that does
not intersect B. Therefore A ∪ B′ is the region consisting of both these regions.
b Since this is an intersection, the region must satisfy both conditions. The first is to be in A′. This
consists of two regions: one inside B and not in A ∩ B; and one outside of A and B. The second
condition is to be in B′. Again, this consists of two regions: one inside A and not in A ∩ B; and
one outside of A and B. Therefore A′ ∩ B′ is the region outside of A and B (since this region was
in both A′ and B′). One way to help picture this is to shade the regions A′ and B′ differently
(either with different colours or using a different pattern for each). The intersection is then the
region that includes both colours or patterns.
c In order to describe (A ∩ B)′ it is sensible to first describe A ∩ B. This is the single region which
is included in both A and B. The complement is then everything except this region.
3 a The set (A ∩ B) ∪ C is the union of the sets A ∩ B and C. On the blank diagram, the set A ∩ B
consists of the two regions that are both contained within A and B. The remaining regions within
set C can then be shaded in.
c First describe A ∩ B ∩ C′. Brackets have not been included since for any sets X, Y and Z
(X ∩ Y) ∩ Z = X ∩ (Y ∩ Z). The intersection of A ∩ B and C′ is the region within A ∩ B that
does not intersect C. Therefore (A ∩ B ∩ C′)′ is everything except this region.
4 e C′ is the event ‘the card chosen is not a club’.
P(C′) = 3/4
5 Use the information in the question to draw a Venn diagram that will help in answering each part.
d A ∪ B′ is the region inside set A and the region outside set B, i.e. everything but the region inside
set B that is not also in set A. P(A ∪ B′) = 0.4 + 0.1 + 0.4 = 0.9
6 Use the information in the question to draw a Venn diagram that will help in answering each part.
c P(C) = 0.65
d C′ ∪ D′ is the region outside set C and the region outside set D, i.e. everything but the region that
is in both sets C and D. P(C′ ∪ D′) = 0.85
7 a
iii H ∪ C′ is the region inside set H and the region outside set C, i.e. everything but the region
inside set C that is not also in set H. P(H ∪ C′) = 0.25 + 0.25 + 0.35 = 0.85
8 a Only the possible outcomes of the two events need to be considered, and so the Venn diagram should
consist of two circles, one labelled ‘R’ for red and one labelled ‘E’ for even. They should intersect.
9 a Since A and B are mutually exclusive, P(A ∩ B) = 0 and they need no intersection on the Venn
diagram. From the question, P(A ∩ C) = 0.2 and so this can immediately be added to the diagram.
Since B and C are independent, P(B ∩ C) = P(B) × P(C) = 0.35 × 0.4 = 0.14 and this can also be
added to the diagram. The remaining region in B must be P(B) − P(B ∩ C) = 0.35 − 0.14 = 0.21, the
remaining region for A must be P(A) − P(A ∩ C) = 0.55 − 0.2 = 0.35 and the remaining region for C
must be P(C) − P(A ∩ C) − P(B ∩ C) = 0.4 − 0.2 − 0.14 = 0.06. This means that the region outside
of A, B and C must be 1 − 0.35 − 0.2 − 0.21 − 0.14 − 0.06 = 0.04.
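The region-by-region arithmetic in part a can be checked with a short Python sketch. This is not part of the original solution; the dictionary keys and variable names are illustrative, and the input values are the ones quoted in the working above.

```python
from fractions import Fraction as F

# Values quoted in the working for question 9 a.
p_a, p_b, p_c = F(55, 100), F(35, 100), F(40, 100)
p_a_and_c = F(20, 100)   # given in the question
p_a_and_b = F(0)         # A and B are mutually exclusive
p_b_and_c = p_b * p_c    # B and C are independent

# Each region of the Venn diagram, found by subtraction.
regions = {
    "A only": p_a - p_a_and_c - p_a_and_b,
    "A and C": p_a_and_c,
    "B only": p_b - p_b_and_c - p_a_and_b,
    "B and C": p_b_and_c,
    "C only": p_c - p_a_and_c - p_b_and_c,
}
# Whatever probability is left lies outside A, B and C.
regions["outside"] = 1 - sum(regions.values())

for name, p in regions.items():
    print(f"{name}: {float(p):.2f}")
```

Using exact fractions avoids floating-point rounding when the regions are later added together.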
b i The set A′ ∩ B′ must be outside of A and outside of B. These two regions are labelled 0.06 and
0.04. Therefore P(A′ ∩ B′) = 0.06 + 0.04 = 0.1
ii The region B ∩ C′ is the region inside set B but outside set C. It is labelled 0.21 on the Venn
diagram and is disjoint from A. Therefore P(A ∪ (B ∩ C′)) = P(A) + 0.21 = 0.55 + 0.21 = 0.76
iii Since A ∩ C consists of a single region, (A ∩ C)′ consists of everything in the diagram except
for that region. But B′ includes the region A ∩ C and so (A ∩ C)′ ∪ B′ includes everything in
the diagram, and so P((A ∩ C)′ ∪ B′) = 1
10 a Start with a Venn diagram with all possible intersections. Then find the region A ∩ B ∩ C, which
is at the centre of the diagram, and label it 0.1.
Now, since A and B are independent, P(A ∩ B) = P(A) × P(B) = 0.25 × 0.4 = 0.1, and as B and C
are independent P(B ∩ C) = P(B) × P(C) = 0.4 × 0.45 = 0.18. Use these results to find values for the
other intersections. P(A ∩ B ∩ C′) = P(A ∩ B) − P(A ∩ B ∩ C) = 0.1 − 0.1 = 0;
P(B ∩ C ∩ A′) = P(B ∩ C) − P(A ∩ B ∩ C) = 0.18 − 0.1 = 0.08; and P(A ∩ C ∩ B′) = 0 is given in
the question.
Now find values for the remaining parts of the diagram. For example,
P(A ∩ B′ ∩ C′) = P(A) − P(A ∩ B ∩ C′) − P(A ∩ C ∩ B′) − P(A ∩ B ∩ C) = 0.25 − 0 − 0 − 0.1 = 0.15
Similarly, P(B ∩ A′ ∩ C′) = 0.4 − 0.1 − 0.08 = 0.22 and P(C ∩ A′ ∩ B′) = 0.45 − 0.1 − 0.08 = 0.27
Finally calculate the region outside sets A, B and C,
P(A′ ∩ B′ ∩ C′) = 1 − 0.15 − 0.1 − 0.22 − 0.08 − 0.27 = 0.18
10 b i There are several ways to work out the regions that comprise the set A′ ∩ (B′ ∪ C). One way is
to determine, for each region, whether it lies in A′ and B′ ∪ C. Alternatively, find the regions
within A′ (there are four) and then note that only one of these does not lie in B′ ∪ C. Summing
the three remaining probabilities yields P(A′ ∩ (B′ ∪ C)) = 0.27 + 0.08 + 0.18 = 0.53
ii The required region must be contained within C. Three of the four regions in C also lie in
A ∪ B. Summing the probabilities yields P((A ∪ B) ∩ C) = 0 + 0.1 + 0.08 = 0.18
c P(A′) = 1 − P(A) = 0.75, P(C) = 0.45 and, from the Venn diagram, P(A′ ∩ C) = 0.08 + 0.27 = 0.35.
Since P(A′) × P(C) = 0.75 × 0.45 = 0.3375 ≠ 0.35, the events A′ and C are not independent.
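One way to check parts b and c of question 10 is to store the eight regions of the Venn diagram and sum whichever regions satisfy a condition. The sketch below is an illustration only: the `venn` dictionary and the `prob` helper are hypothetical names, and the region values are those found in part a.

```python
from fractions import Fraction as F

# Each key lists the sets a region belongs to; values come from part a.
venn = {
    frozenset("A"): F(15, 100),
    frozenset("AB"): F(0),
    frozenset("AC"): F(0),
    frozenset("ABC"): F(10, 100),
    frozenset("B"): F(22, 100),
    frozenset("BC"): F(8, 100),
    frozenset("C"): F(27, 100),
    frozenset(): F(18, 100),   # outside A, B and C
}

def prob(condition):
    """Sum the probabilities of the regions that satisfy the condition."""
    return sum(p for region, p in venn.items() if condition(region))

# Part b i: A' and (B' or C).
p_b_i = prob(lambda r: "A" not in r and ("B" not in r or "C" in r))

# Part c: compare P(A') x P(C) with P(A' and C).
p_not_a = prob(lambda r: "A" not in r)
p_c = prob(lambda r: "C" in r)
p_not_a_and_c = prob(lambda r: "A" not in r and "C" in r)
independent = p_not_a * p_c == p_not_a_and_c

print(float(p_b_i), float(p_not_a * p_c), float(p_not_a_and_c), independent)
```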
b i P(M ∪ G) = P(M ∪ G ∪ E) − P(E ∩ M′ ∩ G′) = 1 − 0.4 = 0.6
c P(G′) = 0.6, P(M) = 0.5 and so P(G′) × P(M) = 0.6 × 0.5 = 0.3. Since P(G′ ∩ M) = 0.2, the
events are not independent.
b P(A ∪ B) = P(A) + P(B) − P(A ∩ B) = x + y − xy
Challenge
a Use that the events are independent.
P(A ∩ B ∩ C) = P((A ∩ B) ∩ C)
= P(A ∩ B) × P(C)
= P(A) × P(B) × P(C)
= xyz
(A ∪ B) ∩ C consists of the intersections of C with just A, with just B and with both A and B.
So (A ∪ B) ∩ C = (C ∩ A ∩ B′) ∪ (C ∩ B ∩ A′) ∪ (A ∩ B ∩ C)
Now substitute the result for P((A ∪ B) ∩ C) from equation (2) into equation (1). This gives
P(A ∪ B ∪ C) = x + y + z − xy − xz − yz + xyz
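The final formula can be sanity-checked numerically. The sketch below assumes A, B and C are mutually independent with probabilities x, y and z, enumerates all eight combinations of the events occurring or not, and compares the total with the formula. The function names and the test values 1/2, 1/3 and 1/4 are illustrative choices, not taken from the question.

```python
from fractions import Fraction as F
from itertools import product

def p_union_by_enumeration(x, y, z):
    """Add up the probability of every outcome in which at least one event occurs."""
    total = F(0)
    for a, b, c in product([True, False], repeat=3):
        if a or b or c:
            # Independence: multiply the probability of each event's state.
            total += (x if a else 1 - x) * (y if b else 1 - y) * (z if c else 1 - z)
    return total

def p_union_by_formula(x, y, z):
    """The result derived above for independent events."""
    return x + y + z - x * y - x * z - y * z + x * y * z

x, y, z = F(1, 2), F(1, 3), F(1, 4)   # illustrative values only
print(p_union_by_enumeration(x, y, z), p_union_by_formula(x, y, z))
```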
Challenge
c First understand the region on a Venn diagram. The set A B corresponds to the shaded regions:
Exercise 4E
1 a There are 29 Year 10 students out of a total of 60 students.
P(Year 10) = 29/60
b Restrict the sample space to the 29 Year 10 students; 18 of these prefer curry.
P(Curry | Year 10) = 18/29
c Restrict the sample space to the 35 students that prefer curry; 18 of these are in Year 10.
P(Year 10 | Curry) = 18/35
d Restrict the sample space to the 31 Year 11 students; 14 of these prefer pizza.
P(Pizza | Year 11) = 14/31
2 a By simple subtraction, there are 43 teenage members of the club (75 – 32 = 43). Of these 21 play
badminton (43 – 22 = 21).
            Badminton  Squash  Total
Teenagers      21        22      43
Adults         15        17      32
Total          36        39      75
b i Restrict the sample space to the 39 members that play squash; 22 of these are teenagers.
P(Teenager | Squash) = 22/39
ii Restrict the sample space to the 36 members that play badminton; 15 of these are adults.
P(Adult | Badminton) = 15/36 = 5/12
iii Restrict the sample space to the 32 members that are adults; 17 of these play squash.
P(Squash | Adult) = 17/32
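The 'restrict the sample space' method used in question 2 can be written as a small Python sketch. The `counts` dictionary holds the four inner cells of the two-way table; the helper name `conditional` is illustrative and not part of the original solution.

```python
from fractions import Fraction

# Inner cells of the two-way table from part a, keyed by (age group, sport).
counts = {
    ("Teenager", "Badminton"): 21,
    ("Teenager", "Squash"): 22,
    ("Adult", "Badminton"): 15,
    ("Adult", "Squash"): 17,
}

def conditional(event, given):
    """P(event | given): keep only the cells satisfying `given`, then count `event`."""
    restricted = {cell: n for cell, n in counts.items() if given(cell)}
    favourable = sum(n for cell, n in restricted.items() if event(cell))
    return Fraction(favourable, sum(restricted.values()))

p_teen_given_squash = conditional(lambda c: c[0] == "Teenager", lambda c: c[1] == "Squash")
p_adult_given_badminton = conditional(lambda c: c[0] == "Adult", lambda c: c[1] == "Badminton")
p_squash_given_adult = conditional(lambda c: c[1] == "Squash", lambda c: c[0] == "Adult")

print(p_teen_given_squash, p_adult_given_badminton, p_squash_given_adult)
```

`Fraction` reduces each answer to lowest terms automatically, which matches the simplified forms given above.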
3 a There are 35 boys (80 – 45 = 35), of which 10 like chocolate (35 – 2 – 23 = 10).
Of the girls, 20 like strawberry (45 – 13 – 12 = 20).
             Girls  Boys  Total
Vanilla        13     2     15
Chocolate      12    10     22
Strawberry     20    23     43
Total          45    35     80
b i Restrict the sample space to the 43 children that like strawberry; 23 of these are boys.
P(Boy | Strawberry) = 23/43
ii Restrict the sample space to the 15 children that like vanilla; 13 of these are girls.
P(Girl | Vanilla) = 13/15
iii Restrict the sample space to the 35 boys; 10 of these like chocolate.
P(Chocolate | Boy) = 10/35 = 2/7
4 a
                   Blue spinner
                  1   2   3   4
Red spinner   1   2   3   4   5
              2   3   4   5   6
              3   4   5   6   7
              4   5   6   7   8
ii There are 4 equally likely outcomes where the red spinner is 2, and for one of these X = 3.
P(X = 3 | Red spinner is 2) = 1/4
iii There are 4 equally likely outcomes where X = 5, and for one of these the blue spinner is 3.
P(Blue spinner is 3 | X = 5) = 1/4
5 a
                   Dice 1
              1   2   3   4   5   6
Dice 2   1    1   2   3   4   5   6
         2    2   4   6   8  10  12
         3    3   6   9  12  15  18
         4    4   8  12  16  20  24
         5    5  10  15  20  25  30
         6    6  12  18  24  30  36
b There are 6 outcomes where Dice 1 shows 5, and for one of these the product is 20.
P(Product is 20 | Dice 1 shows a 5) = 1/6
c There are 4 outcomes where the product is 12, and for one of these Dice 2 shows a 6.
P(Dice 2 shows a 6 | Product is 12) = 1/4
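The counts read from the sample-space table in question 5 can be confirmed by listing all 36 equally likely outcomes. This is an illustrative sketch; the helper name `conditional` is not from the original solution.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes as (dice 1, dice 2).
outcomes = list(product(range(1, 7), repeat=2))

def conditional(event, given):
    """P(event | given) by counting outcomes in the restricted sample space."""
    restricted = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in restricted if event(o)), len(restricted))

# Part b: product is 20, given that dice 1 shows a 5.
p_b = conditional(lambda o: o[0] * o[1] == 20, lambda o: o[0] == 5)

# Part c: dice 2 shows a 6, given that the product is 12.
p_c = conditional(lambda o: o[1] == 6, lambda o: o[0] * o[1] == 12)

print(p_b, p_c)
```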
6 P(Ace | Diamond) = P(Ace of diamonds)/P(Diamond) = (1/52)/(13/52) = 1/13
7
              Coin 1
              H    T
Coin 2   H    HH   TH
         T    HT   TT
a Note there are three outcomes where at least one coin lands on a head.
P(HH | H) = P(Head and Head)/P(Head) = (1/4)/(3/4) = 1/3
7 b P(Head and Tail | Head) = P(Head and Tail)/P(Head) = (2/4)/(3/4) = 2/3
Use the fact that of those who watch drama, 18 also watch sport to complete the table.
For example, this means that 38 students who watch sport do not watch drama (56 – 18 = 38), and
59 students who watch drama do not watch sport (77 – 18 = 59).
Given that 43 students do not watch drama, but 38 students who do not watch drama watch sport,
this means 5 students do not watch drama or sport (43 – 38 = 5).
                             Watches      Does not watch   Total
                             drama (D)    drama (D′)
Watches sport (S)               18            38             56
Does not watch sport (S′)       59             5             64
Total                           77            43            120
ii The probability that the student does not watch sport or drama.
P(S′ ∩ D′) = 5/120 = 1/24
iii The probability that the student also watches sport if they watch drama.
P(S | D) = 18/77
iv The probability that the student does not watch drama if they watch sport.
P(D′ | S) = 38/56 = 19/28
9 a
            Female  Male  Total
Stick         26     18     44
No stick      37     29     66
Total         63     47    110
b i P(Uses a stick) = 44/110 = 2/5
iii Restrict the sample space to those who use a stick; 18 of these are men.
P(Male | Uses a stick) = 18/44 = 9/22
10 Build up a table to show the options as follows. First note that since there are 450 female owners,
there are 300 male owners (750 – 450 = 300). Consider those who own cats. 320 owners in total own
a cat. Since no one owns more than one type of pet, this means that 430 owners do not own a cat
(750 – 320 = 430).
175 female owners have a cat. Since there are 450 female owners, this means that 275 female owners
do not own a cat (450 – 175 = 275). 145 male owners own a cat (320 – 175 = 145) and so 155 male
owners do not own a cat (300 – 145 = 155). This gives this table:
Of the 430 owners who do not own a cat, 250 of them own a budgie. Therefore 180 of the owners
own another type of pet (430 – 250 = 180). Since 25 males own another type of pet, this means that
155 women own another type of pet (180 – 25 = 155).
10 Finally, of the 450 women, 175 own a cat and 155 own something other than a cat or a budgie.
Therefore 120 women own a budgie (450 – 175 –155 = 120) and 130 men own a budgie
(300 – 145 – 25 = 130). This information is summarised in this table:
a The probability that the owner does not own a budgie or a cat.
P(B′ ∩ C′) = 180/750 = 6/25
b The probability that a male owner (i.e. not female) owns a budgie.
P(B | F′) = 130/300 = 13/30
d The probability that a female owner does not own a budgie or a cat.
P((B′ ∩ C′) | F) = 155/450 = 31/90
Exercise 4F
1 a The event A ∪ B includes all cases where either event A or event B occurs. So sum the
probabilities for the three regions A ∩ B′, A ∩ B and B ∩ A′.
This gives P(A ∪ B) = 0.3 + 0.12 + 0.28 = 0.7
b The probability that A occurs given that B occurs means that we are only selecting from those
situations where B occurs. So the sample space is restricted to just circle B. The denominator of
the fraction is 0.12 + 0.28 = 0.4. The numerator is when A also occurs, i.e. when both A and B
occur, which is the region A ∩ B.
Therefore P(A | B) = 0.12/0.4 = 0.3
c The sample space is restricted to those instances where A has not occurred, i.e. the regions B ∩ A′
or B′ ∩ A′. This means the denominator will be 0.28 + 0.3 = 0.58. The numerator will consist of
the cases where B has occurred, i.e. B ∩ A′.
Therefore P(B | A′) = 0.28/0.58 = 0.483 (3 s.f.)
d The sample space is restricted to those instances where A or B has occurred, i.e. the region A ∪ B.
From part a this has probability 0.7. The numerator will consist of the cases where B has occurred,
i.e. either B ∩ A′ or B ∩ A.
Therefore P(B | A ∪ B) = (0.28 + 0.12)/0.7 = 0.4/0.7 = 0.571 (3 s.f.)
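Parts b to d of question 1 all follow the same pattern: add up the regions in the restricted sample space for the denominator, then add up the regions that also satisfy the event for the numerator. A minimal Python sketch of that pattern, using the four region probabilities quoted above, is shown here; the names `regions` and `conditional` are illustrative.

```python
from fractions import Fraction as F

# Keys are (in A, in B); values are the region probabilities from the Venn diagram.
regions = {
    (True, False): F(30, 100),   # A only
    (True, True): F(12, 100),    # A and B
    (False, True): F(28, 100),   # B only
    (False, False): F(30, 100),  # neither
}

def conditional(event, given):
    """Return P(event | given) by restricting the sample space to `given`."""
    numerator = sum(p for r, p in regions.items() if event(r) and given(r))
    denominator = sum(p for r, p in regions.items() if given(r))
    return numerator / denominator

in_a = lambda r: r[0]
in_b = lambda r: r[1]

p_a_given_b = conditional(in_a, in_b)                         # part b
p_b_given_not_a = conditional(in_b, lambda r: not r[0])       # part c
p_b_given_a_or_b = conditional(in_b, lambda r: r[0] or r[1])  # part d

print(float(p_a_given_b), round(float(p_b_given_not_a), 3), round(float(p_b_given_a_or_b), 3))
```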
2 a Fill in P(C ∩ D) = 0.25 on the Venn diagram, and then calculate P(C ∩ D′) = 0.8 − 0.25 = 0.55,
P(D ∩ C′) = 0.4 − 0.25 = 0.15 and P((C ∪ D)′) = 1 − 0.25 − 0.55 − 0.15 = 0.05
ii P(C | D) = P(C ∩ D)/P(D) = 0.25/0.4 = 0.625
iii P(D | C) = P(D ∩ C)/P(C) = 0.25/0.8 = 0.3125
iv P(D′ | C′) = P(D′ ∩ C′)/P(C′) = 0.05/(0.15 + 0.05) = 0.25
3 a Since S and T are independent, P(S ∩ T) = P(S) × P(T) = 0.5 × 0.7 = 0.35. Use this result to fill
in the Venn diagram.
ii P(S | T) = P(S ∩ T)/P(T) = 0.35/0.7 = 0.5
iii P(T | S′) = P(T ∩ S′)/P(S′) = 0.35/0.5 = 0.7
4 a First produce a Venn diagram with the numbers of people in each region.
The Venn diagram can now be used to find the required probabilities. From the diagram,
P(A ∩ B′) = 45/120 = 3/8 = 0.375
b P(A | B) = P(A ∩ B)/P(B) = (20/120)/(50/120) = 20/50 = 2/5 = 0.4
c P(B | A′) = P(B ∩ A′)/P(A′) = 30/55 = 6/11 = 0.545 (3 s.f.)
d P(A | (A ∪ B)) = P(A ∩ (A ∪ B))/P(A ∪ B) = P(A)/P(A ∪ B) = 65/95 = 13/19 = 0.684 (3 s.f.)
5 a Note that 12 cats like neither brand of food. So 80 – 12 = 68 cats like Feskers or Whilix or both.
Use this and the other information in the question to calculate P(F ∩ W) as follows:
P(F ∪ W) = P(F) + P(W) − P(F ∩ W)
⇒ P(F ∩ W) = P(F) + P(W) − P(F ∪ W)
So P(F ∩ W) = 45/80 + 32/80 − 68/80 = 9/80
This is a Venn diagram showing the result:
b P(F | W) = P(F ∩ W)/P(W) = (9/80)/(32/80) = 9/32 = 0.281 (3 s.f.)
c P(W | F) = P(F ∩ W)/P(F) = (9/80)/(45/80) = 9/45 = 1/5 = 0.2
d P(W′ | F′) = P(F′ ∩ W′)/P(F′) = (12/80)/((23 + 12)/80) = 12/35 = 0.343 (3 s.f.)
c P((A ∩ B) | C′) = P(A ∩ B ∩ C′)/P(C′) = 0.2/(0.2 + 0.2 + 0.12 + 0.15) = 0.2/0.67 = 0.299 (3 s.f.)
7 a The fact that the student must watch at least one of the TV programmes means that the student is
selected from a region contained in A ∪ B ∪ C. Therefore this question should be interpreted as:
P(C | (A ∪ B ∪ C)) = P(C ∩ (A ∪ B ∪ C))/P(A ∪ B ∪ C) = P(C)/P(A ∪ B ∪ C) = (9/29)/(23/29) = 9/23 = 0.391 (3 s.f.)
The other way to do this is to note that only 23 students watch at least one of the TV programmes,
and of these 9 watch programme C.
7 b The standard method is as follows:
P(exactly two programmes | A ∪ B ∪ C) = [P(A ∩ B) + P(A ∩ C) + P(B ∩ C) − 3P(A ∩ B ∩ C)]/P(A ∪ B ∪ C)
= (2/29 + 0/29 + 1/29 − 3 × 0/29)/(23/29)
= (3/29)/(23/29) = 3/23 = 0.130 (3 s.f.)
An alternative method is to note that 2 + 1 = 3 students watch exactly two of the programmes (they
watch A and B, and B and C, respectively) and so 3 out of the 23 students that watch at least one of
the TV programmes watch exactly two of the programmes.
c P(B) = (2 + 7 + 1)/29 = 10/29 = 0.345 (3 s.f.)
P(B | C) = P(B ∩ C)/P(C) = (1/29)/(9/29) = 1/9 = 0.111 (3 s.f.)
So P(B) ≠ P(B | C) and the events are not independent.
Now it is straightforward to work out remaining regions for the Venn diagram
P(A ∩ C′) = 0.2 − P(A ∩ C) = 0.2 − 0.1 = 0.1
P(C ∩ A′ ∩ B′) = 0.5 − P(A ∩ C) − P(B ∩ C) = 0.5 − 0.1 − 0.3 = 0.1
b i P(A | C) = P(A ∩ C)/P(C) = 0.1/0.5 = 0.2
ii P(B | C′) = P(B ∩ C′)/P(C′) = 0.3/(0.1 + 0.3 + 0.1) = 0.3/0.5 = 0.6
9 a All of the people who have the illness test positive, which means that there are no people in A who
are not in A ∩ B. There are also 10 people who test positive but do not have the illness. These
people lie in B but do not lie in A, i.e. they lie in B ∩ A′. There are 100 − 10 − 5 = 85 people who
do not have the illness and do not test positive, so they lie in A′ ∩ B′. Therefore the Venn diagram
should show:
b P(A | B) = P(A ∩ B)/P(B) = 0.05/0.15 = 1/3 = 0.333 (3 s.f.)
c The test would allow the doctor to find all of the people who have the illness, but only one third of
those who tested positive would actually have the illness. This means that two thirds of the people
who were told they had the illness would actually not have it.
b P(B | A) = P(B ∩ A)/P(A) = 0.42/0.6 = 0.7
11 P(A | B) = P(B′)
⇒ P(A ∩ B)/P(B) = P(B′) = 0.2 + 0.1 = 0.3
⇒ P(A ∩ B) = 0.3 P(B)
⇒ x = 0.3(x + y)
The probabilities must sum to 1, so 0.2 + x + y + 0.1 = 1 ⇒ x + y = 0.7
Substituting for x + y gives
x = 0.3(0.7) = 0.21
y = 0.7 − x = 0.7 − 0.21 = 0.49
12 P(A | B) = P(A′)
⇒ P(A ∩ B)/P(B) = P(A′)
⇒ c/(c + d) = d + 0.2 (1)
The probabilities must sum to 1, so
0.3 + c + d + 0.2 = 1 ⇒ c + d = 0.5 (2)
Substituting for c + d in equation (1) gives
c/0.5 = d + 0.2 ⇒ c = 0.5d + 0.1 (3)
Substituting this expression for c in equation (2) gives
0.5d + 0.1 + d = 0.5 ⇒ 1.5d = 0.4 ⇒ d = 4/15
Finally, using equation (3) gives
c = 0.5 × 4/15 + 0.1 = 4/30 + 1/10 = 4/30 + 3/30 = 7/30
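The values of c and d in question 12 can be verified by substituting them back into equations (1) and (2). The short Python sketch below repeats the substitution with exact fractions; the variable names are illustrative.

```python
from fractions import Fraction as F

total = F(1, 2)   # c + d, from equation (2)

# Equation (1) with c + d = 1/2 becomes c = d/2 + 1/10 (equation (3)).
# Substituting into c + d = 1/2 gives (3/2) d = 1/2 - 1/10.
d = (total - F(1, 10)) / F(3, 2)
c = total - d

# Both sides of equation (1), evaluated at the solution.
lhs = c / (c + d)
rhs = d + F(1, 5)

print(c, d, lhs, rhs)
```

Both sides of equation (1) come out equal, confirming that P(A | B) = P(A′) holds for these values.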
Exercise 4G
1 a Rewrite the addition formula to obtain
P(A ∩ B) = P(A) + P(B) − P(A ∪ B) = 0.4 + 0.5 − 0.6 = 0.3
Use this result to complete a Venn diagram to help answer the remaining parts of the question.
i The required region is the part ‘outside’ of C and D, which can be found since all of the
probabilities must sum to 1.
P(C′ ∩ D′) = 1 − P(C ∪ D) = 1 − 0.8 = 0.2
ii P(C | D) = P(C ∩ D)/P(D) = 0.4/0.65 = 0.615 (3 s.f.)
c From part b ii, it is known that P(C | D) ≠ P(C), so the two events are not independent.
Alternatively, show that P(C) × P(D) ≠ P(C ∩ D).
3 a P(E ∪ F) = P(E) + P(F) − P(E ∩ F) = 0.7 + 0.8 − 0.6 = 0.9
i The required region is within E as well as everything outside F. It includes three of the four
regions in the Venn diagram.
P(E ∪ F′) = 0.1 + 0.6 + 0.1 = 0.8
iii P(E | F′) = P(E ∩ F′)/P(F′) = 0.1/(0.1 + 0.1) = 1/2 = 0.5
5 Let F be the event that a household has a freezer and D be the event that the household has a
dishwasher. The question requires finding P(F ∩ D). Use the addition formula
P(F ∩ D) = P(F) + P(D) − P(F ∪ D) = 0.7 + 0.2 − 0.8 = 0.1
6 a Use the multiplication formula for conditional probability to find P(A ∩ B)
P(A ∩ B) = P(A | B) × P(B) = 0.4 × 0.5 = 0.2
Now use the multiplication formula again to find P(B | A)
P(B | A) = P(B ∩ A)/P(A) = 0.2/0.4 = 1/2 = 0.5
b P(A′ ∩ B) = P(B) − P(A ∩ B) = 1/2 − 3/20 = 7/20 = 0.35
c P(A′ ∩ B′) = 1 − P(A ∪ B) = 1 − 3/5 = 2/5 = 0.4
c P(C) = P(C ∩ D′) + P(C ∩ D) = 3/20 + 1/12 = 9/60 + 5/60 = 14/60 = 7/30 = 0.233 (3 s.f.)
d P(D | C) = P(D ∩ C)/P(C) = (1/12)/(7/30) = 30/84 = 5/14 = 0.357 (3 s.f.)
9 c Since the events A and C are independent, P(A ∩ C) = P(A) × P(C) = 0.42 × 0.3 = 0.126
d Since B and C are mutually exclusive, there is no need to have an intersection between B and C on
the diagram. Work out the probabilities associated with each region as follows:
P(C ∩ A′) = P(C) − P(A ∩ C) = 0.3 − 0.126 = 0.174
P(B ∩ A′) = P(B) − P(A ∩ B) = 0.37 − 0.12 = 0.25
P(A ∩ B′ ∩ C′) = P(A) − P(A ∩ B) − P(A ∩ C) = 0.42 − 0.12 − 0.126 = 0.174
P(A ∪ B ∪ C) = 0.174 + 0.126 + 0.174 + 0.12 + 0.25 = 0.844
P(A′ ∩ B′ ∩ C′) = 1 − P(A ∪ B ∪ C) = 1 − 0.844 = 0.156
e P((A′ ∪ C)′) = 1 − P(A′ ∪ C)
Use the Venn diagram to find P(A′ ∪ C) = 0.174 + 0.126 + 0.25 + 0.156 = 0.706
So P((A′ ∪ C)′) = 1 − 0.706 = 0.294
b Using part a, P(B | C) = P(B ∩ C)/P(C) = 0.28/0.4 = 7/10 = 0.7
d P((B ∩ C) | A′) = P((B ∩ C) ∩ A′)/P(A′) = [P(B ∩ C) − P(A ∩ B ∩ C)]/(1 − P(A))
As A and C are mutually exclusive, P(A ∩ B ∩ C) = 0
So P((B ∩ C) | A′) = P(B ∩ C)/(1 − P(A)) = 0.28/(1 − 0.4) = 0.28/0.6 = 0.467 (3 s.f.)
11 c Test whether the events are independent
P(A) × P(B) = 0.3 × 0.7 = 0.21, P(A ∩ B) = 0.1
So the events are not independent. If Fatima is late, Gayana is less likely to be late and vice versa.
12 a The probability that both José and Cristiana win their matches is P(J ∩ C)
P(J ∩ C) = P(J) + P(C) − P(J ∪ C) = 0.6 + 0.7 − 0.8 = 0.5
c P(C | J) = P(J ∩ C)/P(J) = 0.5/0.6 = 0.833 (3 s.f.)
d P(C | J) = 0.833 (3 s.f.), P(C) = 0.7, so P(C | J) ≠ P(C). So J and C are not independent.
Exercise 4H
1
2 a
3 a
b P(two coins total 20c or less) = P(10c ∩ 10c) + P(10c ∩ 5c) + P(5c ∩ 10c) + P(5c ∩ 5c)
= 3/10 × 5/19 + 3/10 × 4/19 + 1/5 × 6/19 + 1/5 × 3/19
= 9/38
4 a When the first token removed is red, there are 8 tokens remaining in the bag, 4 red and 4 blue.
When the first token removed is blue, there are 8 tokens remaining in the bag, 5 red and 3 blue.
b The answer can be read off from the tree diagram, following the lower branch (first blue) and then
the red branch.
So P(second red | first blue) = 5/8
4 d P(first blue | tokens different colours)
= P(first blue and second red)/[P(first blue and second red) + P(first red and second blue)]
= (4/9 × 5/8)/((4/9 × 5/8) + (5/9 × 4/8))
= (20/72)/(40/72) = 20/40 = 1/2
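The tree-diagram answers in question 4 can be checked by listing every ordered pair of tokens drawn without replacement. The sketch below assumes the bag starts with 5 red and 4 blue tokens, which is inferred from part a (8 tokens remain after the first draw, with 4 red and 4 blue left after a red, or 5 red and 3 blue left after a blue). The variable names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Assumed starting contents, inferred from part a: 5 red and 4 blue tokens.
tokens = ["R"] * 5 + ["B"] * 4

# Every ordered pair of distinct tokens is equally likely (no replacement).
draws = list(permutations(range(len(tokens)), 2))

# Part b: second token red, given the first is blue.
first_blue = [(i, j) for i, j in draws if tokens[i] == "B"]
p_second_red_given_first_blue = Fraction(
    sum(1 for i, j in first_blue if tokens[j] == "R"), len(first_blue)
)

# Part d: first token blue, given the two tokens are different colours.
different = [(i, j) for i, j in draws if tokens[i] != tokens[j]]
p_first_blue_given_different = Fraction(
    sum(1 for i, j in different if tokens[i] == "B"), len(different)
)

print(p_second_red_given_first_blue, p_first_blue_given_different)
```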
5 a
P(B | A) = 0.45 ⇒ P(B′ | A) = 1 − 0.45 = 0.55
P(B | A′) = 0.35 ⇒ P(B′ | A′) = 1 − 0.35 = 0.65
Therefore the completed tree diagram should be:
b i P(A ∩ B) = P(A) × P(B | A) = 0.7 × 0.45 = 0.315
6 a There are 10 dark chocolates in a box of 24, meaning the probability of choosing a dark chocolate
is 10/24 = 5/12. Similarly there are 14 milk chocolates out of the 24, and so the probability of
choosing a milk chocolate is 14/24 = 7/12.
Once Mariana has eaten one chocolate, there are 23 chocolates left in the box. If the first chocolate
she ate was a dark one, the probability of choosing another dark chocolate is 9/23, and the
probability of choosing a milk chocolate is 14/23. If the first chocolate she ate was a milk one, the
probability of a dark chocolate is 10/23, and the probability of choosing another milk chocolate is
13/23.
c The required probability is
(5/12 × 9/23)/((5/12 × 9/23) + (5/12 × 14/23) + (7/12 × 10/23))
= (45/276)/(185/276) = 45/185 = 9/37 = 0.243 (3 s.f.)
7 Use the information in the question to produce a tree diagram covering Chimamanda’s possible travel
arrangements on Tuesday and Wednesday as follows:
8 Represent the information as a tree diagram. The coins are chosen at random, so there is a probability
of 1/2 of choosing each coin.
a P(head) = 1/2 × 1/2 = 1/4 = 0.25
9 a Since the first ball selected is not replaced, there are 10 balls in the bag when the second ball is
selected.
9 b
10 a The probability of the sheet coming from A, B or C is given in the question. In each case, the
probability that a sheet is flawed immediately provides the probability that it is not flawed since
the two probabilities must sum to 1. Therefore the completed tree diagram is:
b i
11 a The reliability of the test depends on whether the person has the condition (C) or not.
11 c P(has condition | tests negative) = P(C ∩ tests negative)/P(tests negative)
d From the data in the question, the test fails to find 10% of the people with the condition (since it
has a 0.1 chance of producing a negative result when a person has the condition).
Consider also false positives, the case of a person who does not have the condition returning a
positive result.
P(does not have the condition | tests positive) = P(C′ ∩ tests positive)/P(tests positive)
= (0.96 × 0.02)/(1 − 0.9448) = 0.348 (3 s.f.)
So over one third of the positive tests are false positives.
This means that if the test was used on the entire population, 10% of the people with the condition
would not be identified and over one third of the people with a positive result would actually not
have the condition.
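The false-positive calculation in part d is an application of the law of total probability followed by the conditional probability formula. The Python sketch below reproduces it with exact fractions. The input values are read off the working above: the prevalence 0.04 is inferred from the factor 0.96, and the variable names are illustrative rather than taken from the question.

```python
from fractions import Fraction as F

p_condition = F(4, 100)              # inferred: 1 - 0.96
p_pos_given_condition = F(9, 10)     # 0.1 chance of a negative result with the condition
p_pos_given_no_condition = F(2, 100)

# Law of total probability over the two branches of the tree diagram.
p_positive = (p_condition * p_pos_given_condition
              + (1 - p_condition) * p_pos_given_no_condition)
p_negative = 1 - p_positive

# P(does not have the condition | tests positive).
p_false_positive = (1 - p_condition) * p_pos_given_no_condition / p_positive

print(float(p_negative), round(float(p_false_positive), 3))
```

The exact value is 8/23, which is a little over one third, in line with the conclusion above.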
12 a Since the probabilities of being late are given, the probabilities for being on time (i.e. not late) for
each type of transport are known, since the probabilities must sum to 1. Therefore the completed
tree diagram should be as follows:
ii To find P(Hussein is late), sum P(Hussein travels by car and is late), P(Hussein travels by bus
and is late) and P(Hussein travels by train and is late).
P(Hussein is late) = 0.1× 0.55 + 0.6 × 0.3 + 0.3 × 0.05 = 0.25
13 a The two counters being drawn from box A can be modelled using a tree diagram. In each case, the
number of counters of each colour in box B is then known, and so the third set of branches can be
labelled to represent the drawing of the counter from box B. Therefore the completed tree diagram
should be:
b P(C) = P(GG) + P(BB) = (4/7 × 1/2) + (3/7 × 1/3) = 2/7 + 1/7 = 3/7
c P(D) = (4/7 × 1/2 × 3/7) + (4/7 × 1/2 × 4/7) + (3/7 × 2/3 × 4/7) + (3/7 × 1/3 × 5/7)
= 6/49 + 8/49 + 8/49 + 5/49 = 27/49
d The calculation will be similar to that for P(D), but with the first and second counters being the
same colour.
P(C ∩ D) = P(GGB) + P(BBB) = (4/7 × 1/2 × 3/7) + (3/7 × 1/3 × 5/7) = 6/49 + 5/49 = 11/49
14 She has not taken into account the fact that after the first jelly bean is selected, there are only 9 jelly
beans left in the box. So if the first jelly bean selected is sweet, the probability that the second bean is
sweet is 6/9 not 7/10.
This is the correct solution.
P(both jelly beans are sweet) = 7/10 × 6/9 = 7/15
P(at least one jelly bean is sweet) = 1 − P(neither jelly bean is sweet) = 1 − (3/10 × 2/9) = 14/15
P(both are sweet given at least one is sweet) = (7/15)/(14/15) = 7/14 = 0.5
The correct answer is therefore 0.5.
Chapter review 4
1 a
b P(both blue) = P(B ∩ B) = 3/7 × 2/6 = 1/7
= (4/7 × 3/6) + (3/7 × 4/6) = 12/21 = 4/7
2 a P(RRB or RRG) = (7/15 × 7/15 × 3/15) + (7/15 × 7/15 × 5/15) = 392/3375
= (7/15 × 3/15 × 5/15) + (7/15 × 5/15 × 3/15) + (3/15 × 5/15 × 7/15) + (3/15 × 7/15 × 5/15) + (5/15 × 3/15 × 7/15) + (5/15 × 7/15 × 3/15)
= 6 × (7 × 3 × 5)/15³ = 630/3375 = 14/75
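Both results in question 2 can be reproduced by enumerating all 27 ordered colour sequences for three draws with replacement. The sketch below assumes the colour probabilities 7/15 (red), 3/15 (blue) and 5/15 (green) that appear in the working, and reads the six-term sum as the probability of drawing one ball of each colour; that reading, and the helper name `prob`, are assumptions made for illustration.

```python
from fractions import Fraction as F
from itertools import product

# Probability of each colour on any single draw (with replacement).
p = {"R": F(7, 15), "B": F(3, 15), "G": F(5, 15)}

def prob(event):
    """Add up the probabilities of every ordered sequence that satisfies the event."""
    total = F(0)
    for seq in product(p, repeat=3):
        if event(seq):
            total += p[seq[0]] * p[seq[1]] * p[seq[2]]
    return total

# Red, red, then either blue or green, in that order.
p_rrb_or_rrg = prob(lambda s: s[:2] == ("R", "R") and s[2] in "BG")

# One of each colour, in any of the 3! = 6 possible orders.
p_one_of_each = prob(lambda s: len(set(s)) == 3)

print(p_rrb_or_rrg, p_one_of_each)
```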
4 a P(Year 11) = (8 + 13 + 19 + 30 + 26 + 32)/250 = 128/250 = 64/125
b P(s < 35) = (7 + 8 + 15 + 13 + 18 + 19)/250 = 80/250 = 8/25
c P(Year 10 with score between 25 and 34) = (15 + 18)/250 = 33/250
d Using interpolation:
Number of students passing = (40 − 37)/(40 − 35) × (25 + 30) + 30 + 26 + 27 + 32
= 3/5 × 55 + 30 + 26 + 27 + 32 = 148
P(pass) = 148/250 = 74/125
The assumption is that the marks between 35 and 40 are uniformly distributed.
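The interpolation step in part d can be expressed as a small helper function. The sketch below assumes, as the working implies, a threshold of 37 inside the class from 35 to 40, which contains 25 + 30 = 55 students; the function name `count_above` is illustrative.

```python
from fractions import Fraction

def count_above(threshold, lower, upper, class_count):
    """Estimate how many of the class_count values between lower and upper
    exceed the threshold, assuming values are spread uniformly across the class."""
    return Fraction(upper - threshold, upper - lower) * class_count

# Students in the 35 to 40 class estimated to be above the threshold of 37.
in_class = count_above(37, 35, 40, 25 + 30)

# Add the students in the higher classes, who all pass.
passing = in_class + 30 + 26 + 27 + 32
p_pass = Fraction(passing, 250)

print(in_class, passing, p_pass)
```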
5 a P(mass > 3) = (0.5 × 50 + 0.5 × 30 + 2 × 2)/(1 × 6 + 0.5 × 50 + 0.5 × 30 + 2 × 2) = 44/50 = 22/25
6 a
b i P(None) = 30/150 = 1/5
ii P(No more than one) = (30 + 40 + 18 + 35)/150 = 123/150 = 41/50
7 a P(A and B) = P(A) + P(B) – P(A or B or both) = 1/3 + 1/4 − 1/2 = 1/12
b P(A and B) = 1/12
P(A) × P(B) = 1/3 × 1/4 = 1/12
b P(C and F) = 13/38
P(C) × P(F) = 21/38 × 22/38 = 462/1444 = 231/722
As P(C and F) ≠ P(C) × P(F), the events 'likes cricket' and 'likes football' are not independent.
9 a
As P(J and K) ≠ P(J) × P(K), the events J and K are not independent.
10 a P(Phone and Tablet) = 0.85 + 0.6 – (1 − 0.05) = 0.5 = 50%
c P(only P) = 0.35
As P(P and T) ≠ P(P) × P(T), the events P and T are not independent.
As P(A and B) ≠ P(A) × P(B), the events A and B are not independent.
12 a
b i P(D1D2D3) = 4/5 × 2/3 × 1/2 = 4/15
P(exactly one diamond) = P(D, D′, D′) + P(D′, D, D′) + P(D′, D′, D)
= (4/5 × 1/3 × 1/2) + (1/5 × 2/3 × 1/2) + (1/5 × 1/3 × 1/2) = 7/30
c P(at least two diamonds) = 1 – P(at most one diamond) = 1 – (P(none) + P(exactly one diamond))
13 a
14 a
c P(B | A) = P(B ∩ A)/P(A) = 0.2/0.4 = 0.5
d P(A′ | B) = P(A′ ∩ B)/P(B) = (P(B) − P(A ∩ B))/P(B) = 0.15/0.35 = 0.429 (3 s.f.)
15 a Work out each region of the Venn diagram from the information provided in the question.
Find the outer region by subtracting the sum of all the other regions from 1
15 b i
ii
iii P(J | K) = P(J ∩ K)/P(K) = 0.1/0.45 = 0.222 (3 s.f.)
iv P(K | J′ ∩ L′) = P(K ∩ (J′ ∩ L′))/P(J′ ∩ L′) = 0.2825/0.6 = 0.471 (3 s.f.)
16 a
d There are 6 students that study just French and wear glasses ( ) and 9 students that
study just Spanish and wear glasses ( ), so the required probability is
e There are 26 students studying one language (from part a). Of these, 15 wear glasses (from part d).
17 a
b i
ii There are two different ways to obtain balls that are different colours:
P(RG) + P(GR) = 6/15 × 9/14 + 9/15 × 6/14 = (2 × 9)/(5 × 7) = 18/35 = 0.514 (3 s.f.)
17 c There are 4 different outcomes:
d The only way for this to occur is to draw a green ball each time. The corresponding probability is:
18 a Either Ty or Chimene must win both sets. Therefore the required probability is:
c The three ways that Ty can win the match are: wins first set, wins second set; wins first set, loses
second set, wins tiebreaker; loses first set, wins second set, wins tiebreaker.
P(Ty wins match) = (0.7 × 0.8) + (0.7 × 0.2 × 0.55) + (0.3 × 0.4 × 0.55)
= 0.56 + 0.077 + 0.066 = 0.703
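The three winning routes in part c can be generated by walking through every branch of the tree diagram. The Python sketch below uses the probabilities that appear in the working (0.7 for the first set, 0.8 or 0.4 for the second set depending on the first, and 0.55 for the tiebreaker); the variable names are illustrative.

```python
from fractions import Fraction as F

p_first = F(7, 10)                             # Ty wins the first set
p_second = {True: F(8, 10), False: F(4, 10)}   # Ty wins the second set, keyed by first-set result
p_tiebreak = F(55, 100)                        # Ty wins a tiebreaker

p_win = F(0)
for won_first in (True, False):
    p1 = p_first if won_first else 1 - p_first
    for won_second in (True, False):
        p2 = p_second[won_first] if won_second else 1 - p_second[won_first]
        if won_first and won_second:
            p_win += p1 * p2                   # wins both sets outright
        elif won_first != won_second:
            p_win += p1 * p2 * p_tiebreak      # one set each, then wins the tiebreaker

print(float(p_win))
```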
19 a There are 20 kittens with neither black nor white paws (75 – 26 – 14 – 15 = 20).
c This is selection without replacement (since the first kitten chosen is not put back).
20 c As A and C are mutually exclusive
Find the outer region by subtracting the sum of all the other regions from 1
d i
ii The required region must be contained within A, and not include B (the condition on C is
irrelevant since A and C are mutually exclusive). Therefore,
21 a It may be that neither team scores in the match, and it is a 0–0 draw.
b
So
Challenge
Given that she is retired, the probability that her husband is also retired is 0.8.
Hence the probability that both are retired is 0.4 × 0.8 = 0.32.
From this data you can deduce the following Venn diagram of the probabilities:
Let H = husband retired, Hʹ = husband not retired, W = wife retired, Wʹ = wife not retired.
The permutations where only one husband and only one wife is retired are:
P(only one husband and only one wife is retired) = (0.38 × 0.08 + 0.32 × 0.22) × 2 = 0.2016
Challenge
2 a Let
As P( A ∩ B ) P ( B ) ⇒ k 0.2
A and B could be mutually exclusive, meaning , so 0 k 2
3 a \(P(X = x) = kx\), \(x = 1, 2, 3, 4, 5\)
x          1    2    3    4    5
P(X = x)   k    2k   3k   4k   5k
The sum of the probabilities is 1, therefore \(15k = 1\), so \(k = \frac{1}{15}\)
b \(P(X = 5 \mid X > 2) = \frac{5k}{12k} = \frac{5}{12}\)
c \(P(X \text{ is odd} \mid X \text{ is prime}) = \frac{P(\text{odd} \cap \text{prime})}{P(\text{prime})} = \frac{8/15}{10/15} = \frac{8}{10} = \frac{4}{5}\)
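The two conditional probabilities can be verified exactly with a small Python sketch. The helper `cond_prob` is a hypothetical name introduced here, not part of the original solution.

```python
from fractions import Fraction

# P(X = x) = kx for x = 1..5, with k = 1/15 as found in part a.
k = Fraction(1, 15)
pmf = {x: k * x for x in range(1, 6)}


def cond_prob(event, given):
    """Return P(event | given) for predicates on the values of X."""
    joint = sum(p for x, p in pmf.items() if event(x) and given(x))
    return joint / sum(p for x, p in pmf.items() if given(x))


part_b = cond_prob(lambda x: x == 5, lambda x: x > 2)
part_c = cond_prob(lambda x: x % 2 == 1, lambda x: x in (2, 3, 5))
print(part_b, part_c)
```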
Exercise 5A
1 a Positive correlation.
2 a There is no correlation.
b The scatter graph does not support the statement that hotter cities have less rainfall.
3 a
b It shows positive correlation. Each student's tendency to guess lower or higher than the mean was
the same in both tests.
4 a
c For example, there may be a third variable that influences both house value and internet
connection, such as the distance from built up areas.
5 a Q1 = 0.1, Q2 =1.8, Q3 = 4.9
IQR = Q3 – Q1
= 4.8
Q3 + 1.5 × IQR = 4.9 + 1.5 × 4.8
=12.1
Since 22.3 > 12.1 it is an outlier.
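The outlier test above can be sketched in Python as follows. The function name `upper_fence` is illustrative, and the 1.5 × IQR rule is the one used in the working.

```python
def upper_fence(q1, q3, k=1.5):
    """Upper outlier boundary: Q3 + k * IQR."""
    return q3 + k * (q3 - q1)


fence = upper_fence(q1=0.1, q3=4.9)
is_outlier = 22.3 > fence
print(round(fence, 1), is_outlier)
```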
d There is no correlation.
e There is a causal relationship between the amount of rainfall and the hours of sunshine.
This is because the amount of rainfall is caused by how many hours of sunshine there are.
If there were sunshine all day, there wouldn’t be any rain. If there were no sunshine, this implies
there are clouds all day, so more chance of rain.
For this small sample there is no correlation, but you would expect for a larger sample
more sunshine = less rain, and less sunshine = more rain.
Exercise 5B
1 a and b
c If the number of items produced is zero, the production costs will be approximately €21 000. If the
number of items produced increases by 1000, the production costs increase by approximately
€980.
d The prediction for 74 000 is within the range of the data (interpolation) so is more likely to be
reliable. The prediction for 95 000 is outside the range of the data (extrapolation) so is less
likely to be reliable.
2 a
First, 10 coats of paint is very far outside our range of given data, and we cannot assume that this
linear relationship continues as we extrapolate, so using the regression line is not necessarily valid.
Second, even if we accept the extrapolation as valid, a gradient of 1.45 means that, for every extra
coat of paint, the protection will increase by 1.45 years. Therefore, if 10 coats of paint are applied,
the protection will be 14.5 years longer than if no paint were applied. Joti has, however, forgotten
to include the constant 2.93 years, which is the weather resistance if no paint were applied. After
10 coats of paint the protection will last approximately 2.93 + 14.5 = 17.43 years.
3 a
b The scatter diagram shows weak negative correlation, therefore the gradient in the regression
equation, given as 0.063, should be negative.
4 This is not a reasonable statement as there are unlikely to be any houses with no bedrooms, so she
is extrapolating outside of the range of data, where the linear relationship is unlikely to continue.
Exercise 5C
1 \(b = \frac{S_{xy}}{S_{xx}} = \frac{90}{15} = 6\)
2 \(b = \frac{S_{xy}}{S_{xx}} = \frac{165}{30} = 5.5\)
Equation is:
3 \(b = \frac{S_{xy}}{S_{xx}} = \frac{80}{40} = 2\)
Equation is:
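4 a \(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 30 - \frac{10 \times 10}{4} = 30 - 25 = 5\)
\(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 140 - \frac{10 \times 48}{4} = 140 - 120 = 20\)
Equation is:
\(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 348 - \frac{29 \times 48}{5} = \frac{1740 - 1392}{5} = \frac{348}{5} = 69.6\)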
5 b \(b = \frac{S_{xy}}{S_{xx}} = \frac{69.6}{40.8} = 1.70588\ldots = 1.71\) (3 s.f.)
Equation is: \(y = -0.294 + 1.71x\)
b The value of 57 is the gradient of the regression line. For every unit increase in someone’s
dexterity score, that person’s productivity rises by 57.
c i It may be a little unreliable to use the equation to work out the productivity of someone with
dexterity of 2, since 2 lies just outside the values in the table. It would involve extrapolation.
ii It may be very unreliable to use the equation to work out the productivity of someone with
dexterity of 14, since 14 lies well outside the range of the values in the table. It would involve
extrapolation.
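7 \(S_{hh} = \sum h^2 - \frac{(\sum h)^2}{n} = 45.04 - \frac{22.09 \times 22.09}{12} = 45.04 - 40.6640 = 4.37599 = 4.376\) (4 s.f.)
\(S_{hg} = \sum hg - \frac{\sum h \sum g}{n} = 97.778 - \frac{22.09 \times 49.7}{12} = 97.778 - 91.48941 = 6.28858 = 6.289\) (4 s.f.)
\(\bar{h} = \frac{\sum h}{n} = \frac{22.09}{12} = 1.841\) (4 s.f.)
\(\bar{g} = \frac{\sum g}{n} = \frac{49.7}{12} = 4.142\) (4 s.f.)
\(b = \frac{S_{hg}}{S_{hh}} = \frac{6.289}{4.376} = 1.43716 = 1.44\) (3 s.f.)
\(a = \bar{g} - b\bar{h} = 4.142 - (1.437 \times 1.841) = 1.4965 = 1.50\) (3 s.f.)
So the equation is: \(g = 1.50 + 1.44h\)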
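8 a \(S_{wp} = \sum wp - \frac{\sum w \sum p}{n} = 6797 - \frac{186 \times 397}{10} = -587.2\)
\(b = \frac{S_{wp}}{S_{ww}} = \frac{-587.2}{426.4} = -1.37711 = -1.38\) (3 s.f.)
\(\bar{p} = \frac{\sum p}{n} = \frac{397}{10} = 39.7\)
\(\bar{w} = \frac{\sum w}{n} = \frac{186}{10} = 18.6\)
\(a = \bar{p} - b\bar{w} = 39.7 + \frac{587.2}{426.4} \times 18.6 = 65.31425 = 65.3\) (3 s.f.)
Hence equation of regression line of p on w is: \(p = 65.3 - 1.38w\)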
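The calculation of a regression line from summary statistics can be sketched in Python. The function `regression_line` and its parameter names are hypothetical. It takes \(S_{xx}\) as given, as the working for question 8 does.

```python
def regression_line(sum_x, sum_y, sum_xy, s_xx, n):
    """Return (a, b) for the least squares line y = a + b*x."""
    s_xy = sum_xy - sum_x * sum_y / n      # S_xy from the raw sums
    b = s_xy / s_xx                        # gradient
    a = sum_y / n - b * (sum_x / n)        # intercept: y-bar minus b * x-bar
    return a, b


# Question 8a: regression of p on w, so w plays the role of x.
a, b = regression_line(sum_x=186, sum_y=397, sum_xy=6797, s_xx=426.4, n=10)
print(round(a, 1), round(b, 2))
```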
b \(p = 65.3142 - 1.3771w \Rightarrow w = \frac{65.3142}{1.3771} - \frac{1}{1.3771}p\)
This gives (to 3 s.f.) the equation: \(w = 47.4 - 0.726p\)
c The w on p regression line is calculated using different summary statistics rather than just the
reciprocal of the summary statistics used for the p on w regression line.
\(b = \frac{S_{xy}}{S_{xx}} = \frac{-3230}{11000} = -0.29363 = -0.294\) (3 s.f.)
\(a = \bar{y} - b\bar{x} = 34 + \frac{3230}{11000} \times 150 = 78.0454 = 78.0\) (3 s.f.)
Hence equation of regression line of y on x is: \(y = 78.0 - 0.294x\)
9 b
c The model is not valid since the data does not follow a linear pattern.
\(b = \frac{S_{np}}{S_{nn}} = \frac{6344}{6486} = 0.97810 = 0.978\) (3 s.f.)
\(a = \bar{p} - b\bar{n} = 65 - (0.9781 \times 45) = 20.98519 = 21.0\) (3 s.f.)
So the equation is: \(p = 21.0 + 0.978n\)
d This estimate is reliable since 40 000 items lies within the range of the data.
\(S_{np} = \sum np - \frac{\sum n \sum p}{10} = 6850 - \frac{112 \times 480}{10} = 1474\)
11 c The increase in cost, in dollars, for every 100 leaflets printed.
b The value b (1.45) is the additional protection given (in number of years) for each additional coat
of paint.
c The model would be unsuitable because 7 years lies outside the range of the data. The equation
would also give a non-integer solution, but it is only possible to apply a whole number of coats.
e i Calculator gives the new equation of the regression line as y = 0.478 + 1.25x (3 s.f.).
iii The original prediction in part d was extrapolated. This result uses interpolation. More data
generally gives a more accurate regression model.
Challenge
Exercise 5D
1 Substituting in the regression equation: \((x + 2) + (y - 3) = 5\)
\(\Rightarrow x + y - 1 = 5\)
\(\Rightarrow y = 6 - x\)
3 Substituting in the regression equation: \(\frac{y}{4} - 2 = 6 - 4 \times \frac{x}{3}\)
\(\Rightarrow 3y - 24 = 72 - 16x\) (multiplying through by 12)
\(\Rightarrow 3y = 96 - 16x\)
So \(y = 32 - \frac{16}{3}x\)
5 a \(b = \frac{S_{xy}}{S_{xx}} = \frac{120}{240} = 0.5\)
\(a = \bar{y} - b\bar{x} = 6 - 0.5 \times 5 = 3.5\)
So \(y = 3.5 + 0.5x\)
b Substituting in the regression equation from part a: \(\frac{d}{10} = 3.5 + 0.5 \times \frac{c}{2}\)
\(\Rightarrow d = 35 + 2.5c\)
\(\bar{x} = \frac{\sum x}{n} = \frac{49}{5} = 9.8\), \(\bar{y} = \frac{\sum y}{n} = \frac{81}{5} = 16.2\)
\(a = \bar{y} - b\bar{x} = 16.2 - \frac{162.2}{190.8} \times 9.8 = 7.86897\ldots = 7.87\) (3 s.f.)
Hence equation of regression line of y on x is: \(y = 7.87 + 0.850x\).
6 b Substituting in the regression equation from part a: \(\frac{c}{5} = 7.869 + 0.8501 \times \frac{a - 8}{2}\)
\(\Rightarrow 2c = 78.69 + 4.2505(a - 8)\) (multiplying through by 10)
\(\Rightarrow 2c = 78.69 + 4.2505a - 34.004\)
\(\Rightarrow 2c = 44.686 + 4.2505a\)
\(\Rightarrow c = 22.3 + 2.13a\) (giving parameters to 3 s.f.)
Note that substituting into the equation from part a with the parameters rounded to 3 s.f., i.e.
\(y = 7.87 + 0.850x\), gives a slightly different result due to rounding:
\(\frac{c}{5} = 7.87 + 0.85 \times \frac{a - 8}{2}\)
\(\Rightarrow 2c = 78.7 + 4.25a - 34\)
\(\Rightarrow 2c = 44.7 + 4.25a\)
\(\Rightarrow c = 22.35 + 2.125a\)
\(\Rightarrow c = 22.4 + 2.13a\) (giving parameters to 3 s.f.)
c Method 1
If \(a = 32\), then \(x = \frac{32 - 8}{2} = 12\)
\(y = 7.8689\ldots + 0.85010 \times 12 = 18.07017\ldots\)
\(c = 5y = 90.3508\ldots\), which is $90.40 (3 s.f.)
Method 2
Using part b,
\(c = 22.343\ldots + 2.12525\ldots \times 32 = 90.351\ldots\), which is $90.40 (3 s.f.)
7 a \(b = \frac{S_{pv}}{S_{vv}} = \frac{15.26}{10.21} = 1.49461\ldots = 1.49\) (3 s.f.)
\(a = \bar{p} - b\bar{v} = 9.88 - \frac{15.26}{10.21} \times 4.58 = 3.03467\ldots = 3.03\) (3 s.f.)
Hence equation of regression line of p on v is \(p = 3.03 + 1.49v\)
b When \(x = 42\), \(v = \frac{42 - 4}{8} = 4.75\)
Hence \(p = 3.03467\ldots + 1.49461 \times 4.75 = 10.134\ldots = 10.1\) tonnes (3 s.f.)
Exercise 5E
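1 \(r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{100}{\sqrt{92 \times 112}} = \frac{100}{101.50862} = 0.985\) (3 s.f.)
2 \(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 33845 - \frac{367 \times 367}{6} = 33845 - 22448.166\ldots = 11396.833\ldots\)
\(S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 12976 - \frac{270 \times 270}{6} = 12976 - 12150 = 826\)
\(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 17135 - \frac{367 \times 270}{6} = 17135 - 16515 = 620\)
\(r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{620}{\sqrt{11396.833 \times 826}} = \frac{620}{3068.189} = 0.202\) (3 s.f.)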
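The product moment correlation coefficient can be computed from the raw sums with a short Python sketch. The function name `pmcc` and its parameters are illustrative choices. The figures are those quoted in question 2.

```python
from math import sqrt


def pmcc(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Product moment correlation coefficient from summary statistics."""
    s_xx = sum_x2 - sum_x ** 2 / n
    s_yy = sum_y2 - sum_y ** 2 / n
    s_xy = sum_xy - sum_x * sum_y / n
    return s_xy / sqrt(s_xx * s_yy)


r = pmcc(n=6, sum_x=367, sum_y=270, sum_x2=33845, sum_y2=12976, sum_xy=17135)
print(round(r, 3))
```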
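3 a \(S_{aa} = \sum a^2 - \frac{(\sum a)^2}{n} = 1899 - \frac{115 \times 115}{7} = 9.7142\ldots = 9.71\) (3 s.f.)
b \(r = \frac{S_{ah}}{\sqrt{S_{aa}S_{hh}}} = \frac{72.1}{\sqrt{9.7142\ldots \times 571.4}} = 0.96774\ldots = 0.968\) (3 s.f.)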
c There is positive correlation. The greater the age of the person, the taller the person.
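\(\sum L = 26.8\), \(\sum L^2 = 150.02\), \(\sum T = 47.4\), \(\sum T^2 = 399.58\), \(\sum LT = 237.07\)
\(S_{LL} = 150.02 - \frac{26.8 \times 26.8}{6} = 150.02 - 119.7066\ldots = 30.3133\ldots = 30.3\) (3 s.f.)
\(S_{TT} = 399.58 - \frac{47.4 \times 47.4}{6} = 399.58 - 374.46 = 25.12\)
\(S_{LT} = 237.07 - \frac{26.8 \times 47.4}{6} = 237.07 - 211.72 = 25.35\)
b \(r = \frac{S_{LT}}{\sqrt{S_{LL}S_{TT}}} = \frac{25.35}{\sqrt{30.3133\ldots \times 25.12}} = \frac{25.35}{27.5947\ldots} = 0.91865\ldots = 0.919\) (3 s.f.)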
c The data in the scatter graph appear to be linear, and the correlation coefficient found in part b is
close to 1. Therefore, a linear regression model is suitable to model the data.
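5 a \(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 120\,123 - \frac{973 \times 973}{8} = 120\,123 - 118\,341.125 = 1781.875\)
\(S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 33\,000 - \frac{490 \times 490}{8} = 33\,000 - 30\,012.5 = 2987.5\)
\(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 61\,595 - \frac{973 \times 490}{8} = 61\,595 - 59\,596.25 = 1998.75\)
\(r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{1998.75}{\sqrt{1781.875 \times 2987.5}} = \frac{1998.75}{2307.2389} = 0.86629\ldots = 0.866\) (3 s.f.)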
b The correlation is positive. The higher the IQ, the higher the mark gained in the general knowledge
test. (Alternatively, the higher the mark gained in the intelligence test, the higher the IQ.)
6 The coding is linear, so the product moment correlation coefficient will be unaffected by the coding.
So the product moment correlation coefficient between x and y is 0.973.
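p   0   5    3    2    1
q   0   17   12   10   6
\(\sum p = 11\), \(\sum p^2 = 39\), \(\sum q = 45\), \(\sum q^2 = 569\), \(\sum pq = 147\)
\(S_{pp} = \sum p^2 - \frac{(\sum p)^2}{n} = 39 - \frac{11 \times 11}{5} = 14.8\)
\(S_{qq} = \sum q^2 - \frac{(\sum q)^2}{n} = 569 - \frac{45 \times 45}{5} = 164\)
\(S_{pq} = \sum pq - \frac{\sum p \sum q}{n} = 147 - \frac{11 \times 45}{5} = 48\)
\(r = \frac{S_{pq}}{\sqrt{S_{pp}S_{qq}}} = \frac{48}{\sqrt{14.8 \times 164}} = 0.97429\ldots = 0.974\) (3 s.f.)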
c The coding is linear. The product moment correlation coefficient is independent of the linear
coding, hence it is 0.974 (3 s.f.).
8 a This is the coded data set:
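p   10   8   11   9   12
t   4    3   5    4   6
\(\sum p = 50\), \(\sum p^2 = 510\), \(\sum t = 22\), \(\sum t^2 = 102\), \(\sum pt = 227\)
\(S_{pp} = \sum p^2 - \frac{(\sum p)^2}{n} = 510 - \frac{50 \times 50}{5} = 10\)
\(S_{tt} = \sum t^2 - \frac{(\sum t)^2}{n} = 102 - \frac{22 \times 22}{5} = 5.2\)
\(S_{pt} = \sum pt - \frac{\sum p \sum t}{n} = 227 - \frac{50 \times 22}{5} = 7\)
b \(r = \frac{S_{pt}}{\sqrt{S_{pp}S_{tt}}} = \frac{7}{\sqrt{10 \times 5.2}} = \frac{7}{7.2111\ldots} = 0.97072\ldots = 0.971\) (3 s.f.)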
c The coding is linear. The product moment correlation coefficient is independent of the linear
coding, hence it is 0.971 (3 s.f.).
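x   15   37   5    0    45   27   20
y   30   13   34   43   20   14   0
\(\sum x = 149\), \(\sum x^2 = 4773\), \(\sum y = 154\), \(\sum y^2 = 4670\), \(\sum xy = 2379\)
\(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 4773 - \frac{149 \times 149}{7} = 1601.4285\ldots = 1601\) (4 s.f.)
\(S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 4670 - \frac{154 \times 154}{7} = 1282\)
\(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 2379 - \frac{149 \times 154}{7} = -899\)
b \(r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{-899}{\sqrt{1601.4285 \times 1282}} = \frac{-899}{1432.84\ldots} = -0.62742\ldots = -0.627\) (3 s.f.)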
c The shopkeeper is not correct. There is negative correlation, so as the newspaper sales go up the
sweet sales go down.
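10 a \(S_{ff} = \sum f^2 - \frac{(\sum f)^2}{n} = \sum (10x)^2 - \frac{(\sum 10x)^2}{8} = 100\left(\sum x^2 - \frac{(\sum x)^2}{8}\right)\)
\(= 100S_{xx} = 100 \times 111.48 = 11148\)
b \(S_{gg} = \sum g^2 - \frac{(\sum g)^2}{n} = 74\,458.75 - \frac{(\sum 5(y + 10))^2}{n} = 74\,458.75 - \frac{(5\sum y + 50 \times n)^2}{n}\)
\(= 74\,458.75 - \frac{(5 \times 70.9 + 50 \times 8)^2}{8} = 3299.97\)
\(r = \frac{S_{fg}}{\sqrt{S_{ff}S_{gg}}} = \frac{5667.5}{\sqrt{11148 \times 3299.97}} = 0.934\) (3 s.f.)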
c The product moment correlation coefficient shows strong linear correlation. However, the scatter
diagram suggests a non-linear fit.
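11 a \(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 22.02 - \frac{12^2}{7} = 1.44857\ldots\)
\(S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 1491.69 - \frac{97.7^2}{7} = 128.077\ldots\)
\(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 180.37 - \frac{12 \times 97.7}{7} = 12.8842\ldots\)
\(r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{12.884\ldots}{\sqrt{1.4485\ldots \times 128.077\ldots}} = 0.946\) (3 s.f.)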
b This table sets out the residuals for each data point:
x     y      ŷ = –1.2905 + 8.8945x   Residual (y – ŷ)
1.1   6.2    8.49345                 –2.29345
1.3   10.5   10.27235                0.22765
1.4   12     11.1618                 0.8382
1.7   15     13.83015                1.16985
1.9   17     15.60905                1.39095
2.1   18     17.38795                0.61205
2.5   19     20.94575                –1.94575
c The linear model might not be a good model for this data, as the residuals do not appear to be
randomly scattered about zero.
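The residuals in the table can be reproduced with the following Python sketch, which also prints their sign pattern. The variable names are illustrative. The line \(y = -1.2905 + 8.8945x\) is the one quoted in the table header.

```python
xs = [1.1, 1.3, 1.4, 1.7, 1.9, 2.1, 2.5]
ys = [6.2, 10.5, 12, 15, 17, 18, 19]
a, b = -1.2905, 8.8945  # intercept and gradient quoted in the table

# Residual = observed y minus the value predicted by the regression line.
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# A run of same-signed residuals suggests the scatter about zero is not random.
signs = "".join("+" if e > 0 else "-" for e in residuals)
print(signs)
```

Negative residuals at both ends with positive residuals in between form the kind of pattern that casts doubt on a linear model.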
Chapter review 5
1 The data shows that the number of serious road accidents in a week strongly correlates with the
number of fast food restaurants. However, it does not show whether the relationship is causal. Both
variables could correlate with a third variable, e.g. the number of roads coming into a town.
2 a
c As mean CO2 concentration in the atmosphere increased, mean temperatures also increased.
b If the number of items increases by 1, the time taken increases by approximately 2.64 minutes.
4 a 15.2 + 2 × 11.4 = 38
b The outlier should be omitted, as it is very unlikely that the average temperature was 50 °C in a
climate where people need to buy gloves, and so this data point is likely an anomaly.
This means that for every increase in temperature of 1 °C, the shop sells 5.2 fewer pairs of
gloves.
5 a and b
c Brand D is overpriced, since its price is much more than you would expect (the data point is far
above the regression line).
d The regression equation should be used to predict a value for y given x, i.e. the price given the
percentage of cocoa solids. So the student’s method is a valid one.
6 a
\(b = \frac{S_{st}}{S_{ss}} = \frac{5885.25}{6193} = 0.95030 = 0.950\) (3 s.f.)
\(a = \bar{t} - b\bar{s} = 45.75 - (0.95030 \times 46.0833) = 1.95672 = 1.96\) (3 s.f.)
Hence equation of regression line of t on s is: \(t = 1.96 + 0.950s\)
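7 a \(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 43622.85 - \frac{467.1^2}{8} = 16350.048 = 16350\) (5 s.f.)
\(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 666\,045 - \frac{467.1 \times 7805}{8} = 210330.56 = 210331\) (6 s.f.)
b \(\bar{x} = \frac{\sum x}{n} = \frac{467.1}{8} = 58.3875\), \(\bar{y} = \frac{\sum y}{n} = \frac{7805}{8} = 975.625\)
\(b = \frac{S_{xy}}{S_{xx}} = \frac{210330.56}{16350.048} = 12.8642 = 12.86\) (4 s.f.)
\(a = \bar{y} - b\bar{x} = 975.625 - (12.8642 \times 58.3875) = 224.5155 = 224.5\) (4 s.f.)
Equation is: \(y = 224.5 + 12.86x\)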
7 d \(3500 = 224.515 + 12.864x\)
\(\Rightarrow\) Energy consumption \((x) = \frac{3500 - 224.515}{12.8642} = 255\) (3 s.f.)
e This answer is likely to be unreliable as it involves extrapolation. The value of 3500 is well outside
the limits of the data set used.
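8 a \(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 84.25 - \frac{25.5 \times 13.5}{6} = 84.25 - 57.375 = 26.875\)
\(\bar{x} = \frac{\sum x}{n} = \frac{25.5}{6} = 4.25\), \(\bar{y} = \frac{\sum y}{n} = \frac{13.5}{6} = 2.25\)
\(b = \frac{S_{xy}}{S_{xx}} = \frac{26.875}{59.88} = 0.44881 = 0.449\) (3 s.f.)
\(a = \bar{y} - b\bar{x} = 2.25 - (0.44881 \times 4.25) = 0.3425 = 0.343\) (3 s.f.)
Equation is: \(y = 0.343 + 0.449x\)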
b \(t - 2 = 0.3425 + 0.4488 \times \frac{m}{2}\)
\(\Rightarrow t = 2.3425 + 0.2244m\)
\(\Rightarrow t = 2.34 + 0.224m\) (rounding the parameters to 3 s.f.)
c Tail length \(= 2.3425 + (0.2244 \times 10) = 4.5865 = 4.6\) cm (2 s.f.)
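x   0   3   12   5   14   6    9
y   7   9   15   9   13   11   13
\(\sum x = 49\), \(\sum x^2 = 491\), \(\sum y = 77\), \(\sum xy = 617\)
\(S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 617 - \frac{49 \times 77}{7} = 617 - 539 = 78\)
\(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 491 - \frac{49^2}{7} = 491 - 343 = 148\)
b \(\bar{x} = \frac{\sum x}{7} = \frac{49}{7} = 7\), \(\bar{y} = \frac{\sum y}{7} = \frac{77}{7} = 11\)
\(b = \frac{S_{xy}}{S_{xx}} = \frac{78}{148} = 0.52702 = 0.5270\) (4 s.f.)
\(a = \bar{y} - b\bar{x} = 11 - (0.52702 \times 7) = 7.3108 = 7.311\ldots\)
Equation is: \(y = 7.31 + 0.527x\) (parameters to 3 s.f.)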
9 c \(\frac{w}{400} = 7.3108 + 0.52702(n - 10)\) (multiply by 400)
\(\Rightarrow w = 816.24 + 210.808n\)
\(\Rightarrow w = 816.2 + 210.8n\) (parameters to 4 s.f.)
10 a The figure of 0.79 is the average amount of food consumed (in kg) in 1 week by 1 hen.
c Food needed
Cost of feed \(= \frac{39.66}{10} \times 12 =\) €47.592 = €47.59
11 a This is a scatter diagram of the data. (The diagram also shows the regression line, found in part e.)
b There appears to be a linear relationship between body length and body mass.
11 c Calculating the summary statistics for l and w gives:
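l   14   15   17   18   18   20   22   22
w   15   18   19   22   24   29   30   31
\(\sum l^2 = 2726\), \(\sum l = 146\), \(\sum w = 188\), \(\sum lw = 3553\)
\(\bar{l} = \frac{\sum l}{n} = \frac{146}{8} = 18.25\), \(\bar{w} = \frac{\sum w}{n} = \frac{188}{8} = 23.5\)
\(S_{ll} = \sum l^2 - \frac{(\sum l)^2}{n} = 2726 - \frac{146 \times 146}{8} = 2726 - 2664.5 = 61.5\)
\(S_{lw} = \sum lw - \frac{\sum l \sum w}{n} = 3553 - \frac{146 \times 188}{8} = 3553 - 3431 = 122\)
\(b = \frac{S_{lw}}{S_{ll}} = \frac{122}{61.5} = 1.9837 = 1.98\) (3 s.f.)
\(a = \bar{w} - b\bar{l} = 23.5 - (1.9837 \times 18.25) = 23.5 - 36.2032 = -12.7032 = -12.7\) (3 s.f.)
Equation is: \(w = -12.7 + 1.98l\)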
d \(\frac{y}{10} = -12.7 + 1.98 \times \frac{x}{10}\) (multiply through by 10)
g Voles B and C are both underweight so were probably removed from the river. Vole A is slightly
overweight so was probably left in the river.
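12 a \(S_{tt} = \sum t^2 - \frac{(\sum t)^2}{n} = 42.33 - \frac{17.7^2}{8} = 3.16875\)
\(S_{ts} = \sum ts - \frac{\sum t \sum s}{n} = 42.16 - \frac{17.7 \times 17.5}{8} = 3.44125\)
b \(b = \frac{S_{ts}}{S_{tt}} = \frac{3.44125}{3.16875} = 1.0859 = 1.09\) (3 s.f.)
\(\bar{t} = \frac{\sum t}{n} = \frac{17.7}{8} = 2.2125\), \(\bar{s} = \frac{\sum s}{n} = \frac{17.5}{8} = 2.1875\)
\(a = \bar{s} - b\bar{t} = 2.1875 - \frac{3.44125}{3.16875} \times 2.2125 = -0.21526 = -0.215\) (3 s.f.)
Hence the equation of the regression line of s on t is: \(s = -0.215 + 1.09t\)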
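13 \(r = \frac{S_{xt}}{\sqrt{S_{xx}S_{tt}}} = \frac{\sum xt - \frac{\sum x \sum t}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum t^2 - \frac{(\sum t)^2}{n}\right)}}\)
\(= \frac{1433.8 - \frac{(90.7)(303)}{20}}{\sqrt{\left(493.77 - \frac{90.7^2}{20}\right)\left(4897 - \frac{303^2}{20}\right)}} = 0.375\) (3 s.f.)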
14 a \(r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{-1.5}{\sqrt{16.1 \times 6.5}} = \frac{-1.5}{10.2298} = -0.1466 = -0.147\) (3 s.f.)
b The coding is linear, so the product moment correlation coefficient will be unaffected by the
coding. So the product moment correlation coefficient between s and a is –0.147.
c This is a weak negative correlation that is close to 0. There is little evidence to suggest that
students in the group who are good at science will also be good at art.
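15 a \(S_{jj} = \sum j^2 - \frac{(\sum j)^2}{n} = 52\,335 - \frac{979 \times 979}{20} = 4412.95\)
\(S_{pp} = \sum p^2 - \frac{(\sum p)^2}{n} = 32\,156 - \frac{735 \times 735}{20} = 5144.75\)
\(S_{jp} = \sum jp - \frac{\sum j \sum p}{n} = 39\,950 - \frac{979 \times 735}{20} = 3971.75\)
b \(r = \frac{S_{jp}}{\sqrt{S_{jj}S_{pp}}} = \frac{3971.75}{\sqrt{4412.95 \times 5144.75}} = \frac{3971.75}{4764.8215} = 0.8335 = 0.834\) (3 s.f.)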
c There is a strong positive correlation between the amount of juice and the cost, as the product
moment correlation coefficient is close to 1. So Nimer is correct.
16 a \(S_{qq} = 77.0375 - \frac{491^2}{400 \times 8} = 1.69968 = 1.70\) (3 s.f.)
\(r = \frac{S_{pq}}{\sqrt{S_{pp}S_{qq}}} = \frac{-11.625}{\sqrt{85.5 \times 1.69968}} = -0.964\) (3 s.f.)
c The coding is linear, so the product moment correlation coefficient will be unaffected by the
coding. So the product moment correlation coefficient between x and y is –0.964.
d The correlation coefficient suggests a strong negative linear correlation, but the scatter diagram
shows a non-linear fit.
Challenge
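a \(\sum x = 104.5\), \(\sum y = 113.6\), \(\sum x^2 = 1954.1\), \(\sum y^2 = 2100.6\)
The regression line of x on y is of the form \(x = a + by\) where
\(b = \frac{S_{xy}}{S_{yy}}\), \(S_{xy} = \sum xy - \frac{\sum x \sum y}{n}\), \(S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n}\) and \(n = 10\)
The gradient of the regression line of x on y is 0.8, therefore,
\(\frac{S_{xy}}{S_{yy}} = 0.8\)
\(\sum xy - \frac{\sum x \sum y}{n} = 0.8\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)\)
\(\sum xy = 0.8\left(\sum y^2 - \frac{(\sum y)^2}{n}\right) + \frac{\sum x \sum y}{n}\)
\(= 0.8\left(2100.6 - \frac{113.6^2}{10}\right) + \frac{104.5 \times 113.6}{10}\)
\(= 1835.203\ldots\)
\(= 1835\) (to the nearest whole number)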
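b \(y = 3.50 + 0.725x\)
The regression line of y on x is of the form \(y = a + bx\) where
\(a = \bar{y} - b\bar{x}\) and \(b = \frac{S_{xy}}{S_{xx}}\)
\(\frac{S_{xy}}{S_{xx}} = 0.725\)
\(S_{xy} = 0.725S_{xx}\)
\(S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 1954.1 - \frac{104.5^2}{10} = 862.075\)
\(S_{yy} = \sum y^2 - \frac{(\sum y)^2}{n} = 2100.6 - \frac{113.6^2}{10} = 810.104\)
\(r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{0.725S_{xx}}{\sqrt{S_{xx}S_{yy}}} = \frac{0.725 \times 862.075}{\sqrt{862.075 \times 810.104}}\)
\(= 0.74789\ldots\)
\(= 0.748\) (3 s.f.)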
Exercise 6A
1 a This is not a discrete random variable, since height is a continuous quantity.
b This is a discrete random variable, since it takes whole number values at random.
c This is not a discrete random variable, since the number of days in a given week is always 7; the
result is predetermined and so not random.
2 {0, 1, 2, 3, 4}
b i
x          4     5     6
P(X = x)   1/4   1/2   1/4
ii
\(P(X = x) = \begin{cases} \frac{1}{4}, & \text{if } x = 4, 6 \\ \frac{1}{2}, & \text{if } x = 5 \\ 0, & \text{otherwise} \end{cases}\)
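4 \(\frac{1}{3} + \frac{1}{3} + k + \frac{1}{4} = 1\)
\(k = 1 - \left(\frac{1}{3} + \frac{1}{3} + \frac{1}{4}\right) = 1 - \frac{11}{12} = \frac{1}{12}\)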
5
x 1 2 3 4
P(X = x) k 2k 3k 4k
k + 2k + 3k + 4k = 1
10k = 1
\(k = \frac{1}{10}\)
6 a
x 1 2 3 4
P(X = x) k k 3k 3k
k + k + 3k + 3k = 1
8k = 1
\(k = \frac{1}{8}\)
6 b The probability distribution is:
x 1 2 3 4
7 a
x −2 −1 0 1 2
x −2 −1 0 1 2
8 1
4 a a 12 a 1
3
4 a 1
a 14
9 a \(P(X = 1) = \frac{1}{50}\)
since each of the 50 individual outcomes is equally likely.
b \(P(X \geqslant 28) = 1 - \frac{27}{50} = \frac{23}{50}\)
c P(X > 3) = 0
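11 a
s   P(S = s)
1   \(\frac{2}{3}\)
2   \(\frac{1}{3} \times \frac{2}{3} = \frac{2}{9}\)
3   \(\frac{1}{3} \times \frac{1}{3} \times \frac{2}{3} = \frac{2}{27}\)
4   \(\frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} \times \frac{2}{3} + \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} \times \frac{1}{3} = \frac{1}{27}\)
b \(P(S > 2) = \frac{2}{27} + \frac{1}{27} = \frac{1}{9}\)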
12 a
x   P(X = x)
0   \(0.6^5 = 0.07776\)
1   \(0.4 \times 0.6^4 \times 5 = 0.2592\)
2   \(0.4^2 \times 0.6^3 \times 10 = 0.3456\)
3   \(0.4^3 \times 0.6^2 \times 10 = 0.2304\)
4   \(0.4^4 \times 0.6 \times 5 = 0.0768\)
5   \(0.4^5 = 0.01024\)
y   P(Y = y)
0   \(0.8^5 = 0.32768\)
1   \(0.2 \times 0.8^4 \times 5 = 0.4096\)
2   \(0.2^2 \times 0.8^3 \times 10 = 0.2048\)
3   \(0.2^3 \times 0.8^2 \times 10 = 0.0512\)
4   \(0.2^4 \times 0.8 \times 5 = 0.0064\)
5   \(0.2^5 = 0.00032\)
z   P(Z = z)
1   \(0.4\)
2   \(0.4 \times 0.6 = 0.24\)
3   \(0.4 \times 0.6^2 = 0.144\)
4   \(0.4 \times 0.6^3 = 0.0864\)
5   \(0.4 \times 0.6^4 + 0.6^5 = 0.1296\)
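The three tables can be regenerated with a brief Python sketch. The helper `binomial_pmf` is an illustrative name. Treating X and Y as counts of successes in 5 independent trials, with success probabilities 0.4 and 0.2, is an assumption read off the table entries. Z follows the table's pattern \(0.4 \times 0.6^{z-1}\), with all remaining probability assigned to z = 5.

```python
from math import comb


def binomial_pmf(n, p):
    """P(k successes in n independent trials), for k = 0..n."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


p_x = binomial_pmf(5, 0.4)
p_y = binomial_pmf(5, 0.2)

# Z: probabilities 0.4 * 0.6**(z - 1) for z = 1..4; z = 5 takes the remainder.
p_z = [0.4 * 0.6 ** (z - 1) for z in range(1, 5)]
p_z.append(1 - sum(p_z))

print([round(p, 5) for p in p_x])
print([round(p, 5) for p in p_y])
print([round(p, 4) for p in p_z])
```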
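13 a
x          2     3     4
P(X = x)   1/2   2/9   1/8
\(\frac{1}{2} + \frac{2}{9} + \frac{1}{8} = \frac{61}{72}\)
x          2     3     4
P(X = x)   k/4   k/9   k/16
\(\frac{k}{4} + \frac{k}{9} + \frac{k}{16} = 1\)
\(\frac{61k}{144} = 1\)
\(k = \frac{144}{61}\)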
Challenge
x          1     2     3     4     5     6     7     8
P(X = x)   1/8   1/8   1/8   1/8   1/8   1/8   1/8   1/8
y          2     3     6
P(Y = y)   1/2   1/3   1/6
P(X > Y) = P(X > 2 and Y = 2) + P(X > 3 and Y = 3) + P(X > 6 and Y = 6)
\(= \frac{6}{8} \times \frac{1}{2} + \frac{5}{8} \times \frac{1}{3} + \frac{2}{8} \times \frac{1}{6} = \frac{5}{8}\)
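The same answer can be reached by summing over every pair of outcomes, as in this Python sketch. It assumes X and Y are independent, which the products in the working above also assume.

```python
from fractions import Fraction

p_x = {x: Fraction(1, 8) for x in range(1, 9)}
p_y = {2: Fraction(1, 2), 3: Fraction(1, 3), 6: Fraction(1, 6)}

# Sum P(X = x) * P(Y = y) over all pairs with x > y (independence assumed).
p_x_greater = sum(p_x[x] * p_y[y] for x in p_x for y in p_y if x > y)
print(p_x_greater)
```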
Exercise 6B
1 a
x 1 2 3 4 5 6
F(x) 0.1 0.2 0.35 0.6 0.9 1.0
b F(5) = 0.9
2 a
x 0 1 2 3 4 5 6
F(x) 0 0.1 0.2 0.45 0.5 0.9 1.0
x 0 1 2 3 4 5 6
P(X = x) 0 0.1 0.1 0.25 0.05 0.4 0.1
3 a \(P(X = x) = \begin{cases} kx, & x = 1, 3, 5 \\ k(x - 1), & x = 2, 4, 6 \end{cases}\)
x          1   2   3    4    5    6
P(X = x)   k   k   3k   3k   5k   5k
Since the sum of the probabilities is 1,
\(18k = 1\)
\(k = \frac{1}{18}\)
b
x          1      2      3      4      5      6
P(X = x)   1/18   1/18   3/18   3/18   5/18   5/18
c \(P(2 \leqslant X < 5) = \frac{1}{18} + \frac{3}{18} + \frac{3}{18} = \frac{7}{18}\)
d \(F(4) = \frac{1}{18} + \frac{1}{18} + \frac{3}{18} + \frac{3}{18} = \frac{4}{9}\)
e \(F(1.6) = F(1) = \frac{1}{18}\)
4 a Since the sum of the probabilities is 1,
\(2(0.1) + 2\alpha + 0.3 = 1\)
α = 0.25
b
x −2 −1 0 1 2
P(X = x) 0.1 0.1 0.25 0.25 0.3
c F(0.3) = F(0)
= 0.1 + 0.1 + 0.25
= 0.45
5 a \(F(x) = \frac{1 + x}{6}\), so \(F(4) = \frac{1 + 4}{6} = \frac{5}{6}\)
c
x          1     2     3     4     5
P(X = x)   2/6   1/6   1/6   1/6   1/6
6 a \(F(x) = \frac{(x + k)^2}{16}\)
\(F(3) = 1\)
Therefore
\(\frac{(3 + k)^2}{16} = 1\)
\((3 + k)^2 = 16\)
\(3 + k = \pm 4\)
\(k = 1\) or \(k = -7\)
When \(k = -7\) and \(x = 1\)
\(P(1) = \frac{(x - 7)^2}{16} = \frac{36}{16}\)
As a probability cannot be greater than 1, \(k \neq -7\)
Therefore \(k = 1\).
6 b \(F(x) = \frac{(x + 1)^2}{16}\)
x          1      2      3
F(x)       4/16   9/16   1
x          1      2      3
P(X = x)   4/16   5/16   7/16
Exercise 6C
1 a The probability distribution for \(X^2\) is:
x              2     4     6     8
P(X = x)       0.3   0.3   0.2   0.2
x²             4     16    36    64
P(X² = x²)     0.3   0.3   0.2   0.2
Note that for this variable \(P(X = x) = P(X^2 = x^2)\) as X only takes positive values.
\(E(X) = \sum xP(X = x) = 2 \times 0.3 + 4 \times 0.3 + 6 \times 0.2 + 8 \times 0.2 = 4.6\)
\(E(X^2) = \sum x^2P(X = x) = 4 \times 0.3 + 16 \times 0.3 + 36 \times 0.2 + 64 \times 0.2 = 26\)
x              –2    –1    1     2
P(X = x)       0.1   0.4   0.1   0.4
x²             4     1     1     4
x²             1     4
P(X² = x²)     0.5   0.5
\(E(X) = \sum xP(X = x) = -2 \times 0.1 + (-1) \times 0.4 + 1 \times 0.1 + 2 \times 0.4 = 0.3\)
\(E(X^2) = \sum x^2P(X = x) = 4 \times 0.1 + 1 \times 0.4 + 1 \times 0.1 + 4 \times 0.4 = 2.5\)
2 E( X ) = ∑ x P( X = x)
= (1× 0.1) + (2 × 0.1) + (3 × 0.1) + (4 × 0.2) + (5 × 0.4) + (6 × 0.1)
= 0.1 + 0.2 + 0.3 + 0.8 + 2.0 + 0.6
=4
E( X 2 ) =∑ x 2 P( X = x)
=(1× 0.1) + (4 × 0.1) + (9 × 0.1) + (16 × 0.2) + (25 × 0.4) + (36 × 0.1)
= 0.1 + 0.4 + 0.9 + 3.2 + 10 + 3.6
= 18.2
x              2     3     6
P(X = x)       1/2   1/3   1/6
x²             4     9     36
P(X² = x²)     1/2   1/3   1/6
b \(E(X) = \sum xP(X = x) = 2 \times \frac{1}{2} + 3 \times \frac{1}{3} + 6 \times \frac{1}{6} = 1 + 1 + 1 = 3\)
\(E(X^2) = \sum x^2P(X = x) = 4 \times \frac{1}{2} + 9 \times \frac{1}{3} + 36 \times \frac{1}{6} = 11\)
c \((E(X))^2 = 3^2 = 9\) and \(E(X^2) = 11\) from part b
So \((E(X))^2\) does not equal \(E(X^2)\)
4 a The probability distribution for X is:
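x          1     2     3     4      5
P(X = x)   1/2   1/4   1/8   1/16   1/16
b \(E(X) = \sum xP(X = x) = 1 \times \frac{1}{2} + 2 \times \frac{1}{4} + 3 \times \frac{1}{8} + 4 \times \frac{1}{16} + 5 \times \frac{1}{16}\)
\(= \frac{1}{2} + \frac{1}{2} + \frac{6}{16} + \frac{9}{16} = \frac{31}{16} = 1.9375\)
\(E(X^2) = \sum x^2P(X = x) = 1^2 \times \frac{1}{2} + 2^2 \times \frac{1}{4} + 3^2 \times \frac{1}{8} + 4^2 \times \frac{1}{16} + 5^2 \times \frac{1}{16}\)
\(= 1 \times \frac{1}{2} + 4 \times \frac{1}{4} + 9 \times \frac{1}{8} + 16 \times \frac{1}{16} + 25 \times \frac{1}{16} = \frac{83}{16} = 5.1875\)
c \((E(X))^2 = (1.9375)^2 = 3.7539\) (4 d.p.)
So \((E(X))^2\) does not equal \(E(X^2)\)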
Multiply (1) by 2:
\(2a + 2b = 1.2\) (3)
6 The probability distribution for X is:
x          –2   –1   0   5
P(X = x)   3a   2a   a   b
\(E(X) = \sum xP(X = x) = 1.2\), so
\(1.2 = -2 \times 3a - 1 \times 2a + 0 \times a + 5 \times b\)
\(1.2 = -6a - 2a + 5b\)
\(1.2 = -8a + 5b\) (1)
\(\sum P(X = x) = 1\), so
\(1 = 3a + 2a + a + b\)
\(1 = 6a + b\) (2)
7 Suppose the probability distribution for X is:
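x          1     2     3     4     5   6
P(X = x)   1/8   1/8   1/8   1/8   a   b
\(E(X) = \sum xP(X = x) = 4.1\), so
\(4.1 = 1 \times \frac{1}{8} + 2 \times \frac{1}{8} + 3 \times \frac{1}{8} + 4 \times \frac{1}{8} + 5 \times a + 6 \times b\)
\(4.1 = \frac{10}{8} + 5a + 6b\)
\(2.85 = 5a + 6b\) (1)
\(\sum P(X = x) = 1\), so
\(1 = \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8} + a + b\)
\(0.5 = a + b\) (2)
x          1     2     3     4     5      6
P(X = x)   1/8   1/8   1/8   1/8   3/20   7/20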
8 P(faulty) = 0.02
Profit on working phone cover is $3.
Loss on faulty phone cover is $8.
Profit per phone \(= \frac{49 \times 3 - 1 \times 8}{50}\), which is $2.78
Challenge
There is only one way of the number 1 being the highest score on the three dice and that is 1, 1, 1.
To achieve the highest score of 2, each dice must be either 1 or 2. So there are 2 × 2 × 2 =8 ways for the
highest score on three dice to be no more than 2. But one of those is 1, 1, 1, which gives a highest score
of 1 so this needs to be subtracted to leave 7 possible ways for a highest score of 2.
To achieve the highest score of 3, each dice must be either 1 or 2 or 3. So there are 3 × 3 × 3 =27 ways
for the highest score on three dice to be no more than 3. But one of those is 1, 1, 1, which gives a highest
score of 1 and there are 7 possible ways for a highest score of 2 so these both need to be subtracted to
give 19 ways of getting a highest score of 3.
Using this approach, this is the number of ways of getting each highest score:
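x          1       2       3        4        5        6
P(X = x)   1/216   7/216   19/216   37/216   61/216   91/216
\(E(X) = \sum xP(X = x)\)
\(= 1 \times \frac{1}{216} + 2 \times \frac{7}{216} + 3 \times \frac{19}{216} + 4 \times \frac{37}{216} + 5 \times \frac{61}{216} + 6 \times \frac{91}{216}\)
\(= \frac{1071}{216} = \frac{119}{24} = 4.9583\) (4 d.p.)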
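The counting argument can be cross-checked by brute force: list all 216 rolls of three fair dice and tally the highest score of each. This Python sketch uses only the standard library, and the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Tally the highest score over every ordered roll of three fair dice.
counts = Counter(max(roll) for roll in product(range(1, 7), repeat=3))

# E(X) = sum of x * P(X = x), with P(X = x) = count / 216.
expected = sum(Fraction(x * c, 216) for x, c in counts.items())
print(sorted(counts.items()), expected)
```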
Exercise 6D
1 a By symmetry \(E(X) = 1\)
Alternatively, use \(E(X) = \sum xP(X = x)\)
\(E(X) = \frac{1}{5}(-1 + 0 + 1 + 2 + 3) = \frac{1}{5} \times 5 = 1\)
b \(E(X^2) = \sum x^2P(X = x)\)
\(E(X^2) = \frac{1}{5}(1 + 0 + 1 + 4 + 9) = \frac{1}{5} \times 15 = 3\)
\(\text{Var}(X) = E(X^2) - (E(X))^2\)
\(= 3 - 1^2 = 2\)
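2 a \(E(X) = \sum xP(X = x) = 1 \times \frac{1}{3} + 2 \times \frac{1}{2} + 3 \times \frac{1}{6}\)
\(= \frac{1}{3} + 1 + \frac{1}{2} = \frac{11}{6} = 1.833\) (3 d.p.)
\(E(X^2) = \sum x^2P(X = x) = 1 \times \frac{1}{3} + 4 \times \frac{1}{2} + 9 \times \frac{1}{6}\)
\(= \frac{1}{3} + 2 + \frac{3}{2} = \frac{23}{6}\)
\(\text{Var}(X) = E(X^2) - (E(X))^2 = \frac{23}{6} - \left(\frac{11}{6}\right)^2 = \frac{138}{36} - \frac{121}{36} = \frac{17}{36} = 0.472\) (3 d.p.)
b \(E(X) = \sum xP(X = x) = -1 \times \frac{1}{4} + 0 \times \frac{1}{2} + 1 \times \frac{1}{4} = 0\) (or derive answer by symmetry)
\(E(X^2) = \sum x^2P(X = x) = 1 \times \frac{1}{4} + 0 \times \frac{1}{2} + 1 \times \frac{1}{4} = \frac{1}{2} = 0.5\)
\(\text{Var}(X) = E(X^2) - (E(X))^2 = 0.5 - 0^2 = 0.5\)
c \(E(X) = \sum xP(X = x) = (-2) \times \frac{1}{3} + (-1) \times \frac{1}{3} + 1 \times \frac{1}{6} + 2 \times \frac{1}{6}\)
\(= -1 + \frac{1}{2} = -\frac{1}{2} = -0.5\)
\(E(X^2) = \sum x^2P(X = x) = 4 \times \frac{1}{3} + 1 \times \frac{1}{3} + 1 \times \frac{1}{6} + 4 \times \frac{1}{6}\)
\(= \frac{5}{3} + \frac{5}{6} = \frac{15}{6} = 2.5\)
\(\text{Var}(X) = E(X^2) - (E(X))^2 = 2.5 - (-0.5)^2 = 2.5 - 0.25 = 2.25\)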
y          1     2     3     4     5     6     7     8
P(Y = y)   1/8   1/8   1/8   1/8   1/8   1/8   1/8   1/8
\(E(Y) = \frac{1}{8}(1 + 2 + 3 + 4 + 5 + 6 + 7 + 8) = \frac{1}{8} \times 36 = 4.5\)
\(E(Y^2) = \frac{1}{8}(1 + 4 + 9 + 16 + 25 + 36 + 49 + 64) = \frac{1}{8} \times 204 = 25.5\)
\(\text{Var}(Y) = E(Y^2) - (E(Y))^2 = 25.5 - (4.5)^2 = 25.5 - 20.25 = 5.25\)
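The pattern used throughout this exercise, \(\text{Var}(X) = E(X^2) - (E(X))^2\), can be sketched as one Python helper. The name `mean_and_variance` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def mean_and_variance(pmf):
    """Return (E(X), Var(X)) for a dict mapping each value to its probability."""
    mean = sum(x * p for x, p in pmf.items())
    mean_of_squares = sum(x * x * p for x, p in pmf.items())
    return mean, mean_of_squares - mean ** 2


# Y takes the values 1..8, each with probability 1/8.
pmf_y = {y: Fraction(1, 8) for y in range(1, 9)}
mean_y, var_y = mean_and_variance(pmf_y)
print(float(mean_y), float(var_y))
```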
4 a This sample space diagram shows the 36 possible outcomes:
+ 1 2 3 4 5 6
1 2 3 4 5 6 7
2 3 4 5 6 7 8
3 4 5 6 7 8 9
4 5 6 7 8 9 10
5 6 7 8 9 10 11
6 7 8 9 10 11 12
s          2      3      4      5      6      7      8      9      10     11     12
P(S = s)   1/36   2/36   3/36   4/36   5/36   6/36   5/36   4/36   3/36   2/36   1/36
5 a This sample space diagram shows the 16 possible outcomes:
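Difference between scores   1   2   3   4
1                           0   1   2   3
2                           1   0   1   2
3                           2   1   0   1
4                           3   2   1   0
d          0      1      2      3
P(D = d)   4/16   6/16   4/16   2/16
d          0      1      2      3
P(D = d)   1/4    3/8    1/4    1/8
b \(E(D) = 0 \times \frac{1}{4} + 1 \times \frac{3}{8} + 2 \times \frac{1}{4} + 3 \times \frac{1}{8} = \frac{10}{8} = \frac{5}{4} = 1.25\)
c \(E(D^2) = 0 \times \frac{1}{4} + 1 \times \frac{3}{8} + 4 \times \frac{1}{4} + 9 \times \frac{1}{8} = \frac{20}{8} = \frac{5}{2} = 2.5\)
\(\text{Var}(D) = E(D^2) - (E(D))^2\)
\(= 2.5 - (1.25)^2 = 2.5 - 1.5625 = 0.9375\)
Alternatively, in fractional form
\(\text{Var}(D) = \frac{5}{2} - \left(\frac{5}{4}\right)^2 = \frac{5}{2} - \frac{25}{16} = \frac{40}{16} - \frac{25}{16} = \frac{15}{16}\)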
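b \(E(T) = 1 \times \frac{1}{2} + 2 \times \frac{1}{4} + 3 \times \frac{1}{4} = \frac{7}{4} = 1.75\)
\(\text{Var}(T) = \frac{15}{4} - \left(\frac{7}{4}\right)^2 = \frac{60}{16} - \frac{49}{16} = \frac{11}{16} = 0.6875\)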
7 a \(E(X) = \sum xP(X = x) = a + 2b + 3a = 4a + 2b\)
7 b \(\sum p(x) = 1\), so \(2a + b = 1\) (1)
As \(E(X) = 4a + 2b = 2(2a + b)\)
\(E(X) = 2\)
\(E(X^2) = a + 4b + 9a = 10a + 4b\)
\(\text{Var}(X) = E(X^2) - (E(X))^2\)
\(= 10a + 4b - 2^2 = 10a + 4b - 4\)
Exercise 6E
1 a The probability distribution for Y where Y = 2X – 3 is:
y –1 1 3 5
P(Y = y) 0.1 0.3 0.2 0.4
b \(E(Y) = \sum yP(Y = y)\)
\(= -1 \times 0.1 + 1 \times 0.3 + 3 \times 0.2 + 5 \times 0.4\)
\(= -0.1 + 0.3 + 0.6 + 2\)
\(= 2.8\)
c \(E(X) = \sum xP(X = x)\)
\(= 1 \times 0.1 + 2 \times 0.3 + 3 \times 0.2 + 4 \times 0.4\)
\(= 0.1 + 0.6 + 0.6 + 1.6\)
\(= 2.9\)
\(E(2X - 3) = E(Y) = 2.8\)
\(2E(X) - 3 = 2 \times 2.9 - 3 = 5.8 - 3 = 2.8\)
So \(E(2X - 3) = 2E(X) - 3\)
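The identity \(E(2X - 3) = 2E(X) - 3\) can be checked numerically with a short Python sketch. The helper `expectation` is an illustrative name. The probabilities are those implied by the table for Y = 2X – 3, so X takes the values 1 to 4.

```python
from fractions import Fraction

# Distribution of X implied by the table for Y = 2X - 3.
pmf_x = {1: Fraction(1, 10), 2: Fraction(3, 10), 3: Fraction(2, 10), 4: Fraction(4, 10)}


def expectation(pmf, g=lambda x: x):
    """E(g(X)) for a dict mapping each value of X to its probability."""
    return sum(g(x) * p for x, p in pmf.items())


lhs = expectation(pmf_x, lambda x: 2 * x - 3)   # E(2X - 3) computed directly
rhs = 2 * expectation(pmf_x) - 3                # 2E(X) - 3
print(float(lhs), float(rhs))
```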
y –8 –1 0 1 8
P(Y = y) 0.1 0.1 0.2 0.4 0.2
b \(E(Y) = \sum yP(Y = y)\)
\(= -8 \times 0.1 + (-1) \times 0.1 + 0 \times 0.2 + 1 \times 0.4 + 8 \times 0.2\)
\(= -0.8 - 0.1 + 0 + 0.4 + 1.6\)
\(= 1.1\)
3 a E(8X) = 8E(X) = 8
b E(X + 3) = E(X) + 3 = 1 + 3 = 4
c Var(X + 3) = Var(X) = 2
d Var(3X) = 3²Var(X) = 3² × 2 = 9 × 2 = 18
4 a E(2X) = 2E(X) = 2 × 3 = 6
b E(3 − 4X) = 3 − 4E(X) = 3 − 4 × 3 = −9
4 c E(X² − 4X) = E(X²) − E(4X)
= E(X²) − 4E(X)
= 10 − 4 × 3 = −2
5 a E(4X) = 4E(X) = 4µ
b E(2X + 2) = 2E(X) + 2 = 2µ + 2
c E(2X − 2) = 2E(X) − 2 = 2µ − 2
d The standard deviation of a random variable is the square root of its variance.
So if the standard deviation is σ, the variance is σ².
Var(2X + 2) = 2² Var(X) = 4σ²
e Var(2X − 2) = 2² Var(X) = 4σ²
6 a E(X) = 1/6 (1 + 2 + 3 + 4 + 5 + 6) = 21/6 = 3.5
b Y = 200 + 100X
7 Assume the pizzas are cylindrical and that the amount of pizza dough is given by the volume of the
cylinder. The volume of a cylinder is πr²h. The volumes of the different sizes of pizza are then:
E(V) = Σ vP(V = v)
= 100π × 3/10 + 225π × 9/20 + 400π × 5/20
= (30 + 101.25 + 100)π = 231.25π
= 726.5 cm³ (1 d.p.)
8 a This sample space diagram shows the 16 possible outcomes:
Difference
1 2 3 4
between scores
1 0 1 2 3
2 1 0 1 2
3 2 1 0 1
4 3 2 1 0
x           0     1     2     3
P(X = x)    4/16  6/16  4/16  2/16
E(X) = Σ xP(X = x)
= 0 × 4/16 + 1 × 6/16 + 2 × 4/16 + 3 × 2/16 = 20/16
= 5/4 = 1.25
E(X²) = Σ x²P(X = x)
= 0² × 4/16 + 1² × 6/16 + 2² × 4/16 + 3² × 2/16 = 40/16 = 5/2 = 2.5
Var(X) = E(X²) − (E(X))²
= 2.5 − (1.25)² = 2.5 − 1.5625 = 0.9375
x           0     1     2     3
y           1     2     4     8
P(Y = y)    1/4   3/8   1/4   1/8
E(Y) = Σ yP(Y = y)
= 1 × 1/4 + 2 × 3/8 + 4 × 1/4 + 8 × 1/8 = 24/8 = 3
The probability distribution for Z where Z = (4X + 1)/2 is:
x           0     1     2     3
z           0.5   2.5   4.5   6.5
P(Z = z)    1/4   3/8   1/4   1/8
E(Z) = Σ zP(Z = z)
= 0.5 × 1/4 + 2.5 × 3/8 + 4.5 × 1/4 + 6.5 × 1/8
= 0.125 + 0.9375 + 1.125 + 0.8125 = 3
8 c E(Z²) = Σ z²P(Z = z)
= (0.5)² × 1/4 + (2.5)² × 3/8 + (4.5)² × 1/4 + (6.5)² × 1/8
= 0.0625 + 2.34375 + 5.0625 + 5.28125 = 12.75
Var(Z) = E(Z²) − (E(Z))² = 12.75 − 3² = 3.75
Alternatively:
Z = (4X + 1)/2 ⇒ Z = 2X + 0.5
Var(Z) = 2² Var(X) = 4Var(X) = 4 × 0.9375 = 3.75
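Both routes to Var(Z) can be compared exactly. The following sketch assumes the distribution of X (the difference between the scores) given earlier in this question; the function name variance is illustrative.

```python
from fractions import Fraction

# Distribution of X, the difference between the two scores.
X = {0: Fraction(1, 4), 1: Fraction(3, 8), 2: Fraction(1, 4), 3: Fraction(1, 8)}

def variance(dist):
    """Var = E(V^2) - (E(V))^2 for a distribution given as {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    mean_sq = sum(v * v * p for v, p in dist.items())
    return mean_sq - mean ** 2

# Z = (4X + 1)/2, built value by value from the table for X.
Z = {Fraction(4 * x + 1, 2): p for x, p in X.items()}

direct = variance(Z)            # from the distribution of Z
via_rule = 2 ** 2 * variance(X)  # Var(2X + 0.5) = 2^2 Var(X)
print(direct, via_rule)
```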
Challenge
Exercise 6F
1 a Rearrange the formula Y = 4X – 6 to get an expression for X in terms of Y:
Y = 4X − 6 gives
X = (Y + 6)/4
X = Y/4 + 3/2
E(X) = E(Y/4 + 3/2) = (1/4)E(Y) + 3/2
= 1/4 × 2 + 3/2 = 1/2 + 3/2 = 2
b Var(X) = Var(Y/4 + 3/2) = (1/4)² Var(Y)
= 1/16 × 32 = 2
b Var(X) = Var(4/3 − 2Y/3) = (−2/3)² Var(Y)
= 4/9 × 9 = 4
3 Rearranging the formula for Y to get an expression for X gives:
X = Y/2 − 3/2
E(X) = E(Y/2 − 3/2) = (1/2)E(Y) − 3/2
= 1/2 × 8 − 3/2 = 4 − 3/2 = 5/2 = 2.5
E(X) = Σ xP(X = x) = 2.5
0.3 + 2a + 3b + 4 × 0.2 = 2.5
2a + 3b + 1.1 = 2.5
2a + 3b = 1.4 (1)
Σ P(X = x) = 1
So 0.3 + a + b + 0.2 = 1
a + b = 0.5 (2)
2 × (2): 2a + 2b = 1 (3)
(1) − (3): b = 0.4
From (2): a + 0.4 = 0.5 ⇒ a = 0.1
Solution:
a = 0.1, b = 0.4
y 1 0 –1
P(Y = y) a b 0.3
As probabilities sum to 1:
a + b + 0.3 = 1 ⇒ b = 1 − 0.3 − 0.5 = 0.2
Exercise 6G
1 For a discrete uniform distribution:
E(X) = (n + 1)/2
= (5 + 1)/2
= 3
Var(X) = (n + 1)(n − 1)/12
= (5 + 1)(5 − 1)/12
= 2
b Var(X) = (n + 1)(n − 1)/12
= (7 + 1)(7 − 1)/12
= 4
= (6 + 1)(6 − 1)/12
= 35/12
3 b Var(X) = 35/12
Therefore
σ = √(35/12)
= 1.707...
x̄ − σ = 1.792... and x̄ + σ = 5.207...
P(1.792… < x < 5.207…) = P(2 ⩽ x ⩽ 5)
= P(2) + P(3) + P(4) + P(5)
= 1/6 + 1/6 + 1/6 + 1/6
= 2/3
E(X) = Σ xP(X = x)
= 2 × 0.1 + 4 × 0.1 + 6 × 0.1 + 8 × 0.1 + 10 × 0.1 + 12 × 0.1 + 14 × 0.1 + 16 × 0.1 + 18 × 0.1 + 20 × 0.1
= 11
Var(X) = E(X²) − (E(X))²
E(X²) = Σ x²P(X = x) = 154
Var(X) = 154 − 121
= 33
Alternatively, note that these values are double the integers from 1 to 10 inclusive.
Taking X ~ U(10):
E(2X) = 2 × (10 + 1)/2
= 11
Var(2X) = 2² × Var(X)
= 2² × (10 + 1)(10 − 1)/12
= 33
5 A discrete uniform distribution is not likely to be a good model for this distribution. The
game depends on the skills of the player. The points are likely to cluster around the
middle.
Chapter review
1 a
x           1     2     3     4     5
P(X = x)    1/15  2/15  3/15  4/15  5/15
b P(3 < X ≤ 5) = 4/15 + 5/15
= 3/5
2 a Σ P(X = x) = 1
Therefore:
0.2 + q + 0.3 + 0.1 + 0.2 + 0.1 = 1
q = 0.1
3 a
x           1     2     3     4
P(X = x)    2/26  5/26  8/26  11/26
b P(2 < X ≤ 4) = P(X = 3) + P(X = 4) = 19/26
4 a For a discrete uniform distribution, the probability of choosing each counter must be equal.
b i P(X = 5) = 1/16
ii P(X is prime) = 6/16 = 3/8
iii P(3 ≤ X < 11) = 8/16 = 1/2
5 a
y           1     2     3     4     5
P(Y = y)    1/k   2/k   3/k   4/k   5/k
1/k + 2/k + 3/k + 4/k + 5/k = 1
15/k = 1, k = 15
y           1     2     3            4     5
P(Y = y)    1/15  2/15  3/15 = 1/5   4/15  5/15 = 1/3
c P(Y > 3) = P(Y = 4) + P(Y = 5) = 4/15 + 5/15 = 9/15 = 3/5
6 a
t           0        1                  2                   3                  4
P(T = t)    0.75⁴    0.25 × 0.75³ × 4   0.25² × 0.75² × 6   0.25³ × 0.75 × 4   0.25⁴
            = 0.316  = 0.422            = 0.211             = 0.0469           = 0.00391
s           1     2             3              4              5
P(S = s)    0.25  0.25 × 0.75   0.25 × 0.75²   0.25 × 0.75³   0.25 × 0.75⁴ + 0.75⁵
                  = 0.188       = 0.141        = 0.105        = 0.316
d P(S > 2) = P(S = 3) + P(S = 4) + P(S = 5) = 0.563 (to 3 s.f. using exact figures)
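The two tables above can be reproduced numerically. This sketch assumes, from the form of the entries, that T is binomial with n = 4 and p = 0.25, and that S is the attempt on which the first success occurs with at most 5 attempts (so S = 5 also covers no success at all); the question itself is not reproduced here, so these readings are assumptions.

```python
from math import comb

p = 0.25  # probability of success on each attempt (taken from the table entries)

def binomial_pmf(n, prob, t):
    """P(T = t) for T ~ B(n, prob)."""
    return comb(n, t) * prob ** t * (1 - prob) ** (n - t)

# T: number of successes in 4 attempts.
T = [binomial_pmf(4, p, t) for t in range(5)]

# S: first success on attempt s for s = 1..4; S = 5 collects everything else,
# since 0.25 * 0.75**4 + 0.75**5 simplifies to 0.75**4.
S = [p * (1 - p) ** (s - 1) for s in range(1, 5)] + [(1 - p) ** 4]

p_s_greater_2 = sum(S[2:])  # P(S = 3) + P(S = 4) + P(S = 5)
print([round(x, 4) for x in T], round(p_s_greater_2, 4))
```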
7 a The probability distribution for X is:
x           1     2     3     4     5     6
P(X = x)    1/21  2/21  3/21  4/21  5/21  6/21
b P(2 < X ≤ 5) = P(X = 3) + P(X = 4) + P(X = 5) = 3/21 + 4/21 + 5/21 = 12/21 = 4/7
c E(X) = 1 × 1/21 + 2 × 2/21 + 3 × 3/21 + 4 × 4/21 + 5 × 5/21 + 6 × 6/21
= 1/21 (1 + 4 + 9 + 16 + 25 + 36) = 91/21 = 13/3
d E(X²) = 1 × 1/21 + 4 × 2/21 + 9 × 3/21 + 16 × 4/21 + 25 × 5/21 + 36 × 6/21
= 1/21 (1 + 8 + 27 + 64 + 125 + 216) = 441/21 = 21
Var(X) = E(X²) − (E(X))²
= 21 − (13/3)² = 21 − 169/9
= 189/9 − 169/9 = 20/9 = 2.22 (2 d.p.)
e Var(3 − 2X) = Var(−2X + 3)
= (−2)² Var(X)
= 4 × 20/9 = 80/9 = 8.89 (2 d.p.)
f E(X³) = Σ x³P(X = x)
= 1³ × 1/21 + 2³ × 2/21 + 3³ × 3/21 + 4³ × 4/21 + 5³ × 5/21 + 6³ × 6/21
= 1/21 (1 + 16 + 81 + 256 + 625 + 1296)
= 2275/21 = 325/3 = 108.33 (2 d.p.)
b P(−1 ≤ X < 2) = P(X = −1) + P(X = 0) + P(X = 1) = 0.2 + 0.3 + 0.2 = 0.7
8 c E(X) = −2 × 0.1 + (−1) × 0.2 + 0 × 0.3 + 1 × 0.2 + 2 × 0.1 + 3 × 0.1
= −0.2 − 0.2 + 0.2 + 0.2 + 0.3 = 0.3
E(2X + 3) = 2E(X) + 3 = (2 × 0.3) + 3 = 3.6
b E(X) = 0 × 1/5 + 1 × 3/10 + 2 × 5/10 = 13/10 = 1.3
c E(X²) = 0 × 1/5 + 1 × 3/10 + 4 × 5/10 = 23/10 = 2.3
Var(X) = E(X²) − (E(X))²
= 2.3 − 1.3² = 2.3 − 1.69 = 0.61
d P(X ≤ 1.5) = P(X = 0) + P(X = 1) = 1/5 + 3/10 = 0.5
10 b The probability distribution for X is:
x           0     1     2     3
P(X = x)    1/4   0     1/4   1/2
E(X) = 0 × 1/4 + 1 × 0 + 2 × 1/4 + 3 × 1/2 = 1/2 + 3/2 = 2
E(X²) = 0 × 1/4 + 1 × 0 + 4 × 1/4 + 9 × 1/2 = 11/2 = 5.5
11 a P(1 < X ≤ 2) = P(X = 2) = 1/8
b E(X) = 0 × 1/4 + 1 × 1/2 + 2 × 1/8 + 3 × 1/8 = 1/2 + 1/4 + 3/8 = 9/8
c E(3X − 1) = 3E(X) − 1 = 27/8 − 1 = 19/8
d E(X²) = 0 × 1/4 + 1 × 1/2 + 4 × 1/8 + 9 × 1/8 = 1/2 + 1/2 + 9/8 = 17/8
Var(X) = E(X²) − (E(X))² = 17/8 − (9/8)² = 136/64 − 81/64 = 55/64
12 c E(X²) = 1 × 0.4 + 4 × 0.2 + 9 × 0.1 + 16 × 0.3 = 6.9
Var(X) = E(X²) − (E(X))²
= 6.9 − (2.3)² = 6.9 − 5.29 = 1.61
d E((3 − X)/2) = 3/2 − (1/2)E(X) = 3/2 − 2.3/2 = 0.7/2 = 0.35
e E(√X) = Σ √x P(X = x)
= √1 × 0.4 + √2 × 0.2 + √3 × 0.1 + √4 × 0.3
= 0.4 + 0.2828 + 0.1732 + 0.6
= 1.4560 (4 d.p.)
f E(2^(−X)) = Σ 2^(−x) P(X = x)
= 2⁻¹ × 0.4 + 2⁻² × 0.2 + 2⁻³ × 0.1 + 2⁻⁴ × 0.3
= 0.5 × 0.4 + 0.25 × 0.2 + 0.125 × 0.1 + 0.0625 × 0.3
= 0.2 + 0.05 + 0.0125 + 0.01875
= 0.28125
14 a The probability distribution for X is:
x 1 2 3 4 5
P(X = x) k 2k k 2k 3k
b E(X) = 1k + 2 × 2k + 3k + 4 × 2k + 5 × 3k
= 31k = 31/9
c E(X²) = 1k + 4 × 2k + 9k + 16 × 2k + 25 × 3k
= 125k = 125/9
Var(X) = E(X²) − (E(X))²
= 125/9 − (31/9)² = 1125/81 − 961/81 = 164/81
= 2.02 (3 s.f.)
d Var(3 − 2X) = (−2)² Var(X) = 4 × 2.02 = 8.1 (1 d.p.)
15 a Probabilities sum to 1, so:
0.1 + 0.3 + a + b = 1
a + b = 0.6 (1)
Rearrange the equation for Y to get X in terms of Y:
3X = Y + 1 ⇒ X = (1/3)Y + 1/3
E(X) = E((1/3)Y + 1/3) = (1/3)E(Y) + 1/3 = 1/3 × 11/10 + 1/3 = 21/30 = 0.7
E(X) = Σ xP(X = x)
= −0.1 + a + 2b = 0.7
So a + 2b = 0.8 (2)
Subtract equation (1) from equation (2), which gives:
b = 0.2
So by substituting the value for b in equation (1):
a + 0.2 = 0.6 ⇒ a = 0.4
b Any distribution where all the probabilities are the same. An example is throwing a fair die.
c There are 5 possible values. So as the variable has a discrete uniform distribution, each value has a
probability of 1/5 (= 0.2). E(X) can be found by symmetry, as the probability distribution is uniform,
or by:
E(X) = 0.2(0 + 1 + 2 + 3 + 4) = 0.2 × 10 = 2
Challenge
E(X) = 1 × 1/n + 2 × 1/n + 3 × 1/n + 4 × 1/n + ....... + n × 1/n
= 1/n (1 + 2 + 3 + 4 + ....... + n)
= 1/n × Σ i (sum from i = 1 to n)
= 1/n × n(n + 1)/2 (using first sum in hint)
= (n + 1)/2
E(X²) = 1/n (1 + 4 + 9 + 16 + ....... + n²)
= 1/n × Σ i² (sum from i = 1 to n)
= 1/n × n(n + 1)(2n + 1)/6 (using second sum in hint)
= (n + 1)(2n + 1)/6
Var(X) = E(X²) − (E(X))²
= (n + 1)(2n + 1)/6 − (n + 1)²/4
= 2(n + 1)(2n + 1)/12 − 3(n + 1)²/12
= (4n² + 6n + 2 − 3n² − 6n − 3)/12 (multiplying out)
= (n² − 1)/12 (simplifying terms)
= (n + 1)(n − 1)/12 (factoring)
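The two formulae derived above can be spot-checked by brute force for small n. This is a sketch under the assumption that X takes the values 1, 2, ..., n with equal probability; the function name uniform_mean_var is illustrative.

```python
from fractions import Fraction

def uniform_mean_var(n):
    """Exact mean and variance of a discrete uniform variable on 1..n."""
    values = range(1, n + 1)
    mean = Fraction(sum(values), n)                    # E(X)
    mean_sq = Fraction(sum(v * v for v in values), n)  # E(X^2)
    return mean, mean_sq - mean ** 2

# Compare the brute-force values with (n + 1)/2 and (n + 1)(n - 1)/12.
for n in range(1, 51):
    mean, var = uniform_mean_var(n)
    assert mean == Fraction(n + 1, 2)
    assert var == Fraction((n + 1) * (n - 1), 12)
```

A finite check like this does not replace the algebraic proof; it only guards against slips in the expansion.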
Exercise 7A
1 a Continuous – lengths can take any value.
2 Since the mean is 35 mm, the distribution should be symmetrical about this value. Since the standard
deviation is 0.4 mm, 68% of the data should lie in the range 34.6 mm to 35.4 mm and 95% of the data
should lie in the range 34.2 mm to 35.8 mm.
3 One of the key features of normal distributions is that they are symmetrical about the mean (which is
equal to the mode and the median). This curve shows that the bank employees’ incomes are not
equally distributed either side of its peak (the modal income), so the normal distribution is not a
suitable model.
b The given interval, 112 cm to 128 cm, includes all of the pupils whose armspans are up to two
standard deviations either side of the mean. Therefore 95% of the pupils can be expected to have
an armspan within this range.
5 If 68% of the snakes have a length between 93 cm (100 cm − 7 cm) and 107 cm (100 cm + 7 cm), then
the standard deviation is 7 cm. Therefore the variance, σ², is 7² = 49.
6 Since 95% of the data should lie within two standard deviations of the mean, 2.5% of the data should
be two standard deviations or more below the mean and 2.5% of the data should be two standard
deviations or more above the mean. Since 2.5% of the dormice weigh more than 70 grams, this means
that 70 grams is two standard deviations above the mean. The standard deviation is 5 grams (the
square root of the variance) so the mean is 60 grams.
7 In a normal distribution, 68% of the data lies within one standard deviation, σ, of the mean, µ, and
32% lies outside of this range. Therefore 16% of the data lies below µ − σ, and 16% lies above
µ + σ. Also, 95% of the data lies between µ − 2σ and µ + 2σ. Therefore 2.5% lies below µ − 2σ
and 2.5% lies above µ + 2σ.
The question states that 84% of the sheep weigh more than 52 kg, so 16% weigh less than 52 kg.
Hence µ − σ = 52. Also, 97.5% of the sheep weigh more than 47.5 kg, so 2.5% weigh less than
47.5 kg and µ − 2σ = 47.5. From these two equations, σ = 52 − 47.5 = 4.5 kg and so
σ² = (4.5)² = 20.25. Hence µ = 52 + σ = 52 + 4.5 = 56.5 kg.
8 a Since the normal distribution is symmetrical and the mean is equal to the median and the mode,
half the scores should be above 45 and half should be below. Therefore P(S > 45) = 0.5.
b Since the data within one standard deviation of the mean should be 68% of the sample,
P(30 < S < 60) = 0.68.
c Since the data within two standard deviations of the mean should be 95% of the sample,
P(15 < S < 75) = 0.95.
d Alexia is incorrect: although P(X > 100) > 0, the value is very small as 100 is more than three
standard deviations from the mean, so the model as a whole is still reasonable.
9 a The mean of the normal distribution is where the highest point on the curve appears. From the
sketch, this is around 36 cm.
b The points of inflection of the normal curve occur at µ − σ and µ + σ. From the sketch, these points lie
somewhere in the intervals [33, 34] and [38, 39]. Since the mean is around 36 cm, this means that
the standard deviation should be somewhere between 2 cm and 3 cm.
Exercise 7B
1 a P(Z > 1.27) = 1 − P(Z < 1.27)
= 1 – 0.8980
= 0.102
2 Use the Normal CD function on your calculator, with µ = 0, σ = 1 and a small value for the lower
limit, e.g. −10.
a P( Z < 2.12)
= 0.98299...
= 0.9830 (4 d.p.)
b P( Z < 1.36)
= 0.91308...
= 0.9131 (4 d.p.)
c P(Z > 0.84)
= 1 − P(Z < 0.84)
= 1 − 0.79954...
= 0.20045...
= 0.2005 (4 d.p.)
d P(Z < −0.38) = 0.35197...
= 0.3520 (4 d.p.)
2 e
f P(Z < −1.63) = 0.05155...
= 0.0516 (4 d.p.)
Exercise 7C
1 a P(Z < a) = 0.3336
P(Z > a) = 0.3336
So P(Z < a) = 1 – 0.3336
= 0.6664
a = 0.43
But since P(Z < a) < 0.5, a is negative, therefore
a = −0.43
a P(Z < a) = 0.9082 ⇒ a = 1.32975...
= 1.3298 (4 d.p.)
b P(Z > a) = 0.0314
⇒ P(Z < a) = 0.9686
⇒ a = 1.86060...
= 1.8606 (4 d.p.)
c P(Z > a) = 0.15
⇒ P(Z < a) = 0.85
⇒ a = 1.03643...
= 1.0364 (4 d.p.)
(Alternatively, use the table of percentage points with p = 0.15 ⇒ a = 1.0364)
d P(Z > a) = 0.95
⇒ P(Z < a) = 0.05
⇒ a = −1.64485... = −1.6449 (4 d.p.)
(Alternatively, use the table of percentage points with p = 0.05 ⇒ −a = 1.6449 ⇒ a = −1.6449)
2 e
2 h
Exercise 7D
1 X ~ N(20, 4²)
a P(X ⩽ 26)
z = (X − µ)/σ
= (26 − 20)/4
= 1.5
P(X ⩽ 26) = P(Z ⩽ 1.5)
Φ(1.5) = 0.9332
P(X ⩽ 26) = 0.9332
2 X ~ N(18, 10)
a P(X > 20) = 1 – P(X ⩽ 20)
z = (X − µ)/σ
= (20 − 18)/√10
= 0.6325
Φ(0.6325) = 0.7365
P(X > 20) = 1 − 0.7365
= 0.2635
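The same probability can be obtained without tables. The sketch below uses Python's standard-library statistics.NormalDist and assumes, as in the working above, that X has mean 18 and variance 10 (so the standard deviation is √10); the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# X ~ N(18, 10): the second parameter is the variance, so sigma = sqrt(10).
X = NormalDist(mu=18, sigma=sqrt(10))

z = (20 - X.mean) / X.stdev       # standardised value, about 0.6325
p_greater_20 = 1 - X.cdf(20)      # P(X > 20) = 1 - P(X <= 20)
print(round(z, 4), round(p_greater_20, 4))
```

Small differences from the table-based answer in the fourth decimal place are to be expected, since the tables round z before looking up Φ(z).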
2 b P(X < 15) = 1 – P(X < 21) (by symmetry about the mean, 18)
z = (X − µ)/σ
= (21 − 18)/√10
= 0.9487
Φ(0.9487) = 0.8286
P(X < 15) = 1 – 0.8286
= 0.1714
3 a X ~ N(24, 3²)
P(X ⩽ 29)
z = (X − µ)/σ
= (29 − 24)/3
= 1.667
Φ(1.667) = 0.9522
P(X ⩽ 29) = 0.9522
4 Y ~ N(30, 5²)
P(Z > (a − 30)/5) = 0.30 ⇒ P(Z < (a − 30)/5) = 0.70
(a − 30)/5 = 0.5244
a = 32.622
5 Y ~ N(15, 3²)
P(Z > (a − 15)/3) = 0.15 ⇒ P(Z < (a − 15)/3) = 0.85
(a − 15)/3 = 1.036
a = 18.108
= 18.1 (to 3 s.f.)
6 Y ~ N(100, 15²)
7 X ~ N(80, 4²)
9 a x = 154 ⇒ z = (x − µ)/σ = (154 − 154)/12 = 0 ⇒ P(X < 154) = Φ(0)
b x = 160 ⇒ z = (x − µ)/σ = (160 − 154)/12 = 0.5 ⇒ P(X < 160) = Φ(0.5)
c x = 151 ⇒ z = (x − µ)/σ = (151 − 154)/12 = −0.25 ⇒ P(X > 151) = 1 − P(X < 151) = 1 − Φ(−0.25)
d x = 140 ⇒ z = (x − µ)/σ = (140 − 154)/12 = −7/6
x = 155 ⇒ z = (x − µ)/σ = (155 − 154)/12 = 1/12
⇒ P(140 < X < 155) = P(X < 155) − P(X < 140) = Φ(1/12) − Φ(−7/6)
10 a P(Z > z) = 0.025 ⇒ p = 0.025
Using the percentage points table, p = 0.025 ⇒ z = 1.96
b Using the formula z = (x − µ)/σ:
1.96 = (x − 80)/4
x − 80 = 4 × 1.96
x = 80 + 7.84
= 87.84
A score of 87.8 (3 s.f.) is needed to get on the programme.
b Using the formula z = (x − µ)/σ:
−1.0364 = (x − 57)/2
x − 57 = 2 × (−1.0364)
x = 57 − 2.0728
= 54.9272
The size of a ‘little’ hat is 54.9 cm (3 s.f.).
b A ‘standard’ light bulb should have a range of life within the above range, but for N(1175, 56²).
Using the formula z = (x − µ)/σ with z = −1.2816:
−1.2816 = (x − 1175)/56
x − 1175 = 56 × (−1.2816)
x = 1175 − 71.7696
= 1103.2304
Similarly, for z = 1.2816, x = 1175 + 71.7696 = 1246.7696.
So the range of life for a ‘standard’ bulb is 1103 to 1247 hours.
Exercise 7E
18
1 P ( X 18) 0.9032 P Z 0.9032
5
2 P(X > 20) = 0.01 ⇒ P(Z > (20 − 11)/σ) = 0.01
25
3 P(Y 25) 0.15 P Z 0.15
40
4 P(Y > 40) = 0.6554 ⇒ P(Z > (40 − 50)/σ) = 0.6554
5 Using the inverse normal function,
P(X < 17) = 0.8159 ⇒ P(Z < (17 − µ)/σ) = 0.8159 ⇒ z1 = 0.8998...
P(X < 25) = 0.9970 ⇒ P(Z < (25 − µ)/σ) = 0.9970 ⇒ z2 = 2.7477...
So 0.8998σ = 17 − µ (1)
and 2.7477σ = 25 − µ (2)
(2) − (1): 1.8479σ = 8
σ = 4.329...
Substituting into (1):
µ = 17 − 0.8998 × 4.329... = 13.104...
So µ = 13.1 and σ = 4.33 (3 s.f.)
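The pair of simultaneous equations can also be solved programmatically. This is a minimal sketch using the standard-library NormalDist.inv_cdf as the inverse normal function; the helper name solve_mu_sigma is an assumption made for illustration, and it expects both conditions in the form P(X < x) = p.

```python
from statistics import NormalDist

def solve_mu_sigma(x1, p1, x2, p2):
    """Given P(X < x1) = p1 and P(X < x2) = p2, return (mu, sigma) for X ~ N(mu, sigma^2)."""
    z1 = NormalDist().inv_cdf(p1)  # z-value with P(Z < z1) = p1
    z2 = NormalDist().inv_cdf(p2)  # z-value with P(Z < z2) = p2
    # z1*sigma = x1 - mu and z2*sigma = x2 - mu; subtracting eliminates mu.
    sigma = (x2 - x1) / (z2 - z1)
    mu = x1 - z1 * sigma
    return mu, sigma

mu, sigma = solve_mu_sigma(17, 0.8159, 25, 0.9970)
print(round(mu, 3), round(sigma, 3))
```

For a condition stated as an upper tail, such as P(X > x) = p, the sketch would need 1 − p passed in instead.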
6 Using the inverse normal function (or the percentage points table),
P(Y < 25) = 0.10 ⇒ P(Z < (25 − µ)/σ) = 0.10 ⇒ z1 = −1.28155...
P(Y > 35) = 0.005 ⇒ P(Z > (35 − µ)/σ) = 0.005 ⇒ z2 = 2.57582...
So −1.2816σ = 25 − µ (1)
and 2.5758σ = 35 − µ (2)
(2) − (1): 3.8574σ = 10
σ = 2.5924...
Substituting into (2):
µ = 35 − 2.5758 × 2.5924... = 28.322...
So µ = 28.3 and σ = 2.59 (3 s.f.)
7
By symmetry, µ = ½(9 + 15) = 12
P(X > 15) = 0.20 ⇒ P(Z > (15 − 12)/σ) = 0.20
Using the inverse normal function (or the percentage points table), z = 0.8416...
so 3/σ = 0.8416
σ = 3/0.8416 = 3.564...
So µ = 12 and σ = 3.56 (3 s.f.)
By symmetry, µ = ½(25 + 45) = 35
P(X > 45) = 0.25 ⇒ P(Z > (45 − 35)/σ) = 0.25
Using the inverse normal function, z = 0.6744...
so 10/σ = 0.6744
σ = 10/0.6744 = 14.82...
So µ = 35 and σ = 14.8 (3 s.f.)
9 µ = 0 (given) so P(X > 4) = 0.2 and P(Z > (4 − 0)/σ) = 0.2
Using the inverse normal function (or the percentage points table), z = 0.8416...
so 4/σ = 0.8416
σ = 4/0.8416 = 4.752...
So σ = 4.75 (3 s.f.)
10
Using the inverse normal function (or the percentage points table),
P(X > 2a) = 0.2 ⇒ P(Z > (2a − 2.68)/σ) = 0.2 ⇒ z1 = 0.8416...
P(X < a) = 0.4 ⇒ P(Z < (a − 2.68)/σ) = 0.4 ⇒ z2 = −0.2533...
So 0.8416σ = 2a − 2.68 (1)
and −0.2533σ = a − 2.68 (2)
(2) × 2: −0.5066σ = 2a − 5.36 (3)
(1) − (3): 1.3482σ = 2.68
σ = 1.9878...
Substituting into (2):
a = 2.68 − 0.2533 × 1.9878... = 2.176...
So σ = 1.99 and a = 2.18 (3 s.f.)
11 a The distribution is D ~ N(µ, 5²).
P(D > 200) = 0.75 ⇒ P(D < 200) = 0.25 ⇒ P(Z < (200 − µ)/5) = 0.25
Using the inverse normal function, z = −0.6744...
so (200 − µ)/5 = −0.6744...
µ = 200 + 5 × 0.6744...
= 203.37... = 203 mm (3 s.f.)
13 a
b P(M < 18) = 0.10 ⇒ P(Z < (18 − µ)/σ) = 0.10 ⇒ z1 = −1.28155...
P(M > 30) = 0.05 ⇒ P(Z > (30 − µ)/σ) = 0.05 ⇒ z2 = 1.64485...
So −1.28155...σ = 18 − µ (1)
and 1.64485...σ = 30 − µ (2)
(2) − (1): 2.92640...σ = 12
σ = 4.10059...
Substituting into (2):
µ = 30 − 1.64485... × 4.10059... = 23.25512...
So µ = 23.26 and σ = 4.101 (4 s.f.)
c P(M > 25) = 1 − P(Z < (25 − 23.25512...)/4.10059...) = 0.33522...
Use the binomial distribution X ~ B(10, 0.33522...)
Using the binomial CD function,
P(X ≥ 4) = 1 − P(X ≤ 3) = 1 − 0.55408... = 0.44591...
So the probability that at least 4 have a mass greater than 25 kg is 0.4459 (4 d.p.)
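Part c chains a normal probability into a binomial one, which is straightforward to reproduce. The sketch below takes the unrounded µ and σ from part b as given and assumes the 10 items are independent; the variable names are illustrative.

```python
from math import comb
from statistics import NormalDist

# Parameters found in part b (unrounded values quoted in the working).
M = NormalDist(mu=23.25512, sigma=4.10059)

p = 1 - M.cdf(25)  # P(M > 25) for a single item

# X ~ B(10, p): number of the 10 items with mass greater than 25 kg.
p_at_most_3 = sum(comb(10, k) * p ** k * (1 - p) ** (10 - k) for k in range(4))
p_at_least_4 = 1 - p_at_most_3
print(round(p, 5), round(p_at_least_4, 4))
```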
14 a P(L < 16) = 0.20 ⇒ P(Z < (16 − µ)/σ) = 0.20 ⇒ z1 = −0.84162...
P(L > 18) = 0.10 ⇒ P(Z > (18 − µ)/σ) = 0.10 ⇒ z2 = 1.28155...
So −0.84162...σ = 16 − µ (1)
and 1.28155...σ = 18 − µ (2)
(2) − (1): 2.12317...σ = 2
σ = 0.94198...
Substituting into (2):
µ = 18 − 1.28155... × 0.94198... = 16.79279...
So µ = 16.79 and σ = 0.9420 (4 s.f.)
b P(L < Q1) = 0.25 ⇒ z1 = −0.67448... ⇒ (Q1 − 16.79279...)/0.94198... = −0.67448... ⇒ Q1 = 16.15743...
P(L < Q3) = 0.75 ⇒ z2 = 0.67448... ⇒ (Q3 − 16.79279...)/0.94198... = 0.67448... ⇒ Q3 = 17.42814...
The interquartile range is Q3 − Q1 = 17.42814... − 16.15743... = 1.2701... = 1.27 (2 d.p.)
Challenge
Chapter review 7
1 H ~ N(178, 4²)
2 W ~ N(32.5, 2.2²)
3 T ~ N(48, 8²)
4 X ~ N(24, σ²)
a P(X > 30) = 0.05 ⇒ P(Z > (30 − µ)/σ) = 0.05
c P(X > d) = 0.01 ⇒ P(Z > (d − µ)/σ) = 0.01
5 L ~ N(120, σ²)
a P(L > 140) = 0.01 ⇒ P(Z > (140 − µ)/σ) = 0.01
c P(L < c) = 0.10 ⇒ P(Z < (c − µ)/σ) = 0.10
6 a P( X < 20) = 0.25 and P( X < 40) = 0.75
Using the inverse normal function (or the percentage points table),
P(X < 20) = 0.25 ⇒ P(Z < (20 − µ)/σ) = 0.25 ⇒ z1 = −0.67448...
P(X < 40) = 0.75 ⇒ P(Z < (40 − µ)/σ) = 0.75 ⇒ z2 = 0.67448...
So −0.6745σ = 20 − µ (1)
and 0.6745σ = 40 − µ (2)
(2) − (1): 1.3489σ = 20
σ = 14.826...
Substituting into (2):
µ = 40 − 0.6745 × 14.826... = 29.99...
So µ = 30 and σ = 14.8 (3 s.f.)
7 P(H > 15) = 0.10 ⇒ P(Z > (15 − µ)/σ) = 0.10 ⇒ z1 = 1.28155...
P(H < 4) = 0.05 ⇒ P(Z < (4 − µ)/σ) = 0.05 ⇒ z2 = −1.64485...
So −1.6449σ = 4 − µ
1.2816σ = 15 − µ
Subtract: 2.9265σ = 11
σ = 3.7587... = 3.76 cm (3 s.f.)
µ = 15 − 1.2816σ = 10.2 cm (3 s.f.)
8 a T ~ N(80, 10²)
Using the normal CD function, P(T > 85) = 0.30853...
= 0.3085 (4 d.p.)
b S ~ N(100, 15²)
Using the normal CD function, P(S > 105) = 0.36944...
= 0.3694 (4 d.p.)
c The student’s score on the first test was better, since fewer of the students got this score or higher.
9 J ~ N(108, σ²)
a P(J < 100) = 0.03 ⇒ P(Z < (100 − µ)/σ) = 0.03
a P(T > 15) = 0.0446 ⇒ P(Z > (15 − µ)/σ) = 0.0446 ⇒ z = 1.70
so 1.70 = (15 − µ)/3.8
µ = 15 − 3.8 × 1.70
= 8.54 minutes (3 s.f.)
b P(T < 5) = P(Z < (5 − 8.54)/3.8)
= P(Z < −0.93...)
= 0.17577...
= 0.1758 (4 d.p.)
11 T ~ N(µ, σ²)
Using the inverse normal function,
P(T < 7) = 0.9861 ⇒ P(Z < (7 − µ)/σ) = 0.9861 ⇒ z1 = 2.20009...
P(T < 5.2) = 0.0102 ⇒ P(Z < (5.2 − µ)/σ) = 0.0102 ⇒ z2 = −2.31890...
So 2.2001σ = 7 − µ (1)
and −2.3189σ = 5.2 − µ (2)
(1) − (2): 4.5190σ = 1.8
σ = 0.3983...
Substituting into (1):
µ = 7 − 2.2001 × 0.3983... = 6.123...
So the mean thickness of the shelving is 6.12 mm and the standard deviation is 0.398 mm (3 s.f.).
Challenge
1 X ~ N(58, 10²)
P(X < 36) = 1 − P(X < 80)
z = (X − µ)/σ
= (80 − 58)/10
= 2.2
Φ(2.2) = 0.9861
P(X < 36) = 1 – 0.9861
= 0.0139
Since 2 000 000 televisions are made,
2 000 000 × 0.0139 = 27 800 televisions may be replaced.
2 a i The mean would remain unchanged at 5.2 hours as the mean of the extra people is also
5.2 hours.
ii The variance decreases, as the average deviation from the mean is less.
Practice paper
1 a
b Let X be the event that Sudeshna passes on the first or second attempt.
P(X) = 0.6 + 0.4 × 0.5
= 0.8
Let Y be the event that Sudeshna gets a certificate.
P(X | Y) = 0.8/0.916
= 0.873 (3 s.f.)
2 a
b Negative correlation
2 c t̄ = 18.63, p̄ = 468.75, S_tp = −5023.75, S_tt = 181.88, S_pp = 179 337.50
p = a + bt where b = S_tp/S_tt and a = p̄ − b t̄
b = S_tp/S_tt
= −5023.75/181.88
= −27.6...
a = 468.75 + 27.6... × 18.63
= 983.33...
Therefore:
p = 983 − 27.6t (3 s.f.)
d when t = 22
p = 983.33... − 27.62... × 22
= 375.66...
Therefore, it should cost £376
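The regression coefficients and the prediction follow directly from the summary statistics. The sketch below takes the quoted values of the means, S_tp and S_tt as given; the function name regression_line is illustrative.

```python
def regression_line(t_bar, p_bar, s_tp, s_tt):
    """Return (a, b) for the least-squares line p = a + b*t from summary statistics."""
    b = s_tp / s_tt          # gradient
    a = p_bar - b * t_bar    # intercept: the line passes through (t_bar, p_bar)
    return a, b

a, b = regression_line(t_bar=18.63, p_bar=468.75, s_tp=-5023.75, s_tt=181.88)
predicted = a + b * 22       # predicted p when t = 22
print(round(a, 2), round(b, 2), round(predicted, 2))
```

Keeping a and b unrounded until the final step avoids the drift that appears if the 3 s.f. equation is used for the prediction.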
b S_go = Σgo − (Σg × Σo)/n
= 34 440 − (324 × 877)/8
= −1078.5
e The ninth year goes against the correlation. As a result, the PMCC will decrease (get
weaker).
4 a Total area of shaded bars is:
5 × 7.5 + 10 × 30 + 5 × 22.5 + 5 × 7.5 + 5 × 15 =562.5 small squares, which represents 450
employees. Therefore 1.25 small squares represents 1 employee.
The number of employees claiming more than 35 hours overtime is:
(5 × 7.5 + 5 × 15)/1.25 = 90
c By linear interpolation:
Q2 = 20 + (195/240) × 10
Q2 = 28.1 (3 s.f.)
5 a
c P(F′ ∪ T′ ∪ B′) = 33/100
d P(F) = 42/100 = 21/50
e P(T ∪ B) = 55/100 = 11/20
5 f P(T | F) = (30/100)/(42/100)
= 5/7
6 a i E(X) = 0.6 and E(X²) = 4
E(X) = −2 × a + 0 × b + 2 × a + 4 × c
4c = 0.6
c = 0.15
ii E(X²) = (−2)² × a + 0 × b + 2² × a + 4² × 0.15
8a + 2.4 = 4
Since 2a + b + c = 1
b = 0.45
c Y = 7 – 4X
E(Y) = E(7 – 4X)
= 7 – 4E(X)
= 7 – 4(0.6)
= 4.6
d Y = 7 – 4X
Var(Y) = Var(7 – 4X)
= 4² Var(X)
= 16(3.64)
= 58.24
e
x −2 0 2 4
y 15 7 −1 −9
P(Y = y) 0.2 0.45 0.2 0.15
P(Y ≥ 0) = 0.8
7 a Let X be the volume of coffee, in ml, delivered by the machine to a cup.
X ~ N(505, 1.8²)
i P(X > 507) = 1 − P(X < 507)
= 1 − P(Z < (507 − 505)/1.8)
= 1 − P(Z < 1.1111)
= 1 – 0.8667
= 0.133 (3 s.f.)
b X ~ N(503, 1.6²)
P(1006 − w < X < w) = P(X < w) − P(X < 1006 − w) = 0.9246
The values w and 1006 − w are symmetric about the mean, 503, so P(X < 1006 − w) = 1 − P(X < w).
2P(X < w) − 1 = 0.9246
P(X < w) = 0.9623
P(Z < (w − 503)/1.6) = 0.9623
(w − 503)/1.6 = 1.778...
w = 503 + 1.6 × 1.778... = 505.8...
w = 506 (3 s.f.)
7 c X ~ N(r, q²)
P(Z < z1) = 0.01 ⇒ z1 = −2.3263
(499 − r)/q = −2.3263 ⇒ q = (r − 499)/2.3263 (1)
P(Z > z2) = 0.05 ⇒ z2 = 1.6449
(505 − r)/q = 1.6449 ⇒ q = (505 − r)/1.6449 (2)
Equating (1) and (2) gives:
(r − 499)/2.3263 = (505 − r)/1.6449
1.6449(r − 499) = 2.3263(505 − r)
3.9712r = 1995.5866
r = 502.5147
q = (505 − 502.5147)/1.6449
= 1.5108
r = 503, q = 1.51 (3 s.f.)
Review exercise 1
1 a Any 2 from:
• Used to simplify or represent a real-world problem.
• Cheaper or quicker (than producing the real situation) or more easily modified.
• To improve understanding of the real-world problem.
• Used to predict outcomes from a real-world problem (idea of predictions).
2 y = (x − 120)/5 therefore:
(x − 120)/5 = 24
x = 240
y = x/5 therefore:
x = 5 × 2.8
= 14
3 y = 1.4x − 20 therefore:
1.4x − 20 = 60.8
x = 404/7
= 57.7 (3 s.f.)
y = 1.4x therefore:
x = 6.60/1.4
= 33/7
= 4.71 (3 s.f.)
4 x = 10s + 1
s = (x − 1)/10
coded mean, x̄ = Σx/n = 947/30 = 31.6
actual mean, s̄ = (31.6 − 1)/10 = 3.06 hours
coded standard deviation, σx = √(33 065.37/30) = 33.2
actual standard deviation, σs = 33.2/10 = 3.32 hours
5 y = (x − 720)/1000 therefore:
(x − 720)/1000 = 18
x = $18 720
6 a t = (m + 12)/1.25
7 a
t 5–10 10–14 14–18 18–25 25–40
Frequency 10 16 24 35 15
b 40
7 c t̄ = (7.5 × 10 + 12 × 16 + 16 × 24 + 21.5 × 35 + 32.5 × 15)/100
= 18.91 minutes
e Median = 18 minutes
Using interpolation:
The lower quartile lies in the 10–14 group
Q1 = 10 + (15/16) × 4
Q1 = 13.75 minutes
The upper quartile lies in the 18–25 group
Q3 = 18 + (25/35) × 7
Q3 = 23 minutes
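Linear interpolation inside a class follows one pattern: lower class boundary plus the fraction of the class needed, times the class width. The sketch below assumes the frequencies 10, 16, 24, 35 and 15 from the table in part a (cumulative totals 10, 26, 50, 85, 100) and quartile positions n/4 = 25 and 3n/4 = 75; the function name interpolate is illustrative.

```python
def interpolate(lower_bound, class_width, class_freq, cum_freq_below, position):
    """Estimate the value at a given cumulative position by linear interpolation."""
    fraction = (position - cum_freq_below) / class_freq  # share of the class needed
    return lower_bound + fraction * class_width

# Q1 is the 25th value: 10 values lie below the 10-14 class, which holds 16.
q1 = interpolate(lower_bound=10, class_width=4, class_freq=16, cum_freq_below=10, position=25)

# Q3 is the 75th value: 50 values lie below the 18-25 class, which holds 35.
q3 = interpolate(lower_bound=18, class_width=7, class_freq=35, cum_freq_below=50, position=75)
print(q1, q3)
```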
b A temperature of 45 °C is very high so it is likely this value was recorded incorrectly. Therefore,
this outlier should be omitted from the data.
9 a Positive skew
Q2 = 19.5 + (31/43) × 10
Q2 = 26.7 km (3 s.f.)
9 c Σfx = 3550 and Σfx² = 138 020
x̄ = Σfx/n
= 3550/120
= 29.6 km (3 s.f.)
σ² = Σfx²/n − (Σfx/n)²
= 138 020/120 − (3550/120)²
= 274.99...
σ = 16.6 (3 s.f.)
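The mean and standard deviation can be recomputed from the two summary totals. This is a minimal sketch using the population form of the variance, as in the working above; the function name mean_and_sd is illustrative.

```python
from math import sqrt

def mean_and_sd(sum_fx, sum_fx2, n):
    """Mean and standard deviation from the totals of fx and fx^2 and the total frequency n."""
    mean = sum_fx / n
    variance = sum_fx2 / n - mean ** 2  # mean of the squares minus square of the mean
    return mean, sqrt(variance)

mean, sd = mean_and_sd(sum_fx=3550, sum_fx2=138020, n=120)
print(round(mean, 1), round(sd, 1))
```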
10 a Mode = 56
b There are 27 pieces of data therefore the median is the 14th piece of data.
Median = 52
Q1 is the 7th piece of data so Q1 = 35
Q3 is the 21st piece of data so Q3 = 60
c Σfx = 1335 and Σfx² = 71 801
x̄ = Σfx/n
= 1335/27
= 49.4 (3 s.f.)
σ² = Σfx²/n − (Σfx/n)²
= 71 801/27 − (1335/27)²
= 214.54...
σ = 14.6 (3 s.f.)
10 d 3(mean − median)/standard deviation = 3(49.44... − 52)/14.64...
= −0.523 (3 s.f.)
b Frequency density = frequency/class width
Class     Frequency   Frequency density
41–45     4           0.8
46–50     19          3.8
51–60     53          5.3
61–70     37          3.7
71–90     15          0.75
91–150    6           0.1
11 d Σfx = 8379.5 and Σfx² = 557 489.75
x̄ = Σfx/n
= 8379.5/134
= 62.5 (3 s.f.)
σ² = Σfx²/n − (Σfx/n)²
= 557 489.75/134 − (8379.5/134)²
= 249.92...
σ = 15.8 (3 s.f.)
c Area on diagram is 7.2 cm² which represents 9 students; therefore 1 student is represented
by 0.8 cm².
d The total area is 24 cm². Therefore the number of students is 24 ÷ 0.8 = 30 students.
13 a
c Many delays are short and passengers should find them acceptable.
14 a 17 males and 15 females
b £48
15 a i 37 minutes
ii upper quartile
b They are outliers. Outliers are values that are much greater or much less than the other values.
d The children from school A generally took less time than those from school B. The median for A is
less than the median for B. A has outliers, but B does not. The interquartile range for A is less than
the interquartile range for B, suggesting that the times for school A are less spread out. However,
the total range for A is greater than the total range for B (although this includes the outliers).
16 Area of 65 to 67 cm class = 26
Frequency density = 26 ÷ 2 = 13
The number of owls with wing length between 63 and 73 cm is given by the shaded area on the
graph.
$P(63 \le l \le 73) = \frac{(2 \times 2) + 26 + 36 + 24 + (3 \times 3)}{10 + 26 + 36 + 24 + 15 + 10} = \frac{99}{121} = 0.82$
17 a P(A or B) = P(A but not B) + P(B but not A) + P(A and B)
0.65 = 0.32 + 0.11 + P(A and B)
P(A and B) = 0.65 − 0.32 − 0.11= 0.22
P(neither A nor B) = 1 – 0.65 = 0.35
18 a Magazines and Television are mutually exclusive preferences as the sets do not overlap.
b $P(M \text{ and } B) = \frac{13}{38} = 0.34$
$P(M) \times P(B) = \frac{21}{38} \times \frac{11}{19} = \frac{231}{722} = 0.32$
19 a Start in the middle of the Venn diagram and work outwards. Remember the rectangle and those
not in any of the circles. Your numbers should total 100.
b $P(G \cap L \cap D) = \frac{10}{100} = \frac{1}{10} = 0.1$
c P ( G L D ) = $\frac{41}{100} = 0.41$
d $P(\text{only two attributes}) = \frac{9 + 7 + 5}{100} = \frac{21}{100} = 0.21$
19 e The word ‘given’ in the question tells you to use conditional probability:
$P(G \mid L \cap D) = \frac{P(G \cap L \cap D)}{P(L \cap D)} = \frac{10/100}{15/100} = \frac{10}{15} = \frac{2}{3} = 0.667$ (3 s.f.)
20 a Let F be the event that a student reads fiction books on a regular basis, and N the event that they
read non-fiction books.
$P(F \cup N) = P(F) + P(N) - P(F \cap N)$
$0.6 = 0.25 + 0.45 - P(F \cap N)$
$P(F \cap N) = 0.1$
b When drawing the Venn diagram remember to draw a rectangle around the circles and add the
probability 0.4.
Remember the total in circle F = 0.25 and the total in circle N = 0.45 .
c The words ‘given that’ in the question tell you to use conditional probability:
$P(F \cap N' \mid F \cup N) = \frac{0.15}{0.6} = \frac{1}{4} = 0.25$
21 a The first two probabilities allow two spaces in the Venn diagram to be filled in.
$P(A \cup B) = P(A \cap B') + P(A \cap B) + P(A' \cap B)$, and this can be rearranged to see that
the remaining region has probability 0.15.
Finally, $P(A \cup B) = 0.62 \Rightarrow P((A \cup B)') = 0.38$. The completed Venn diagram is therefore:
d If A and B are independent, then $P(A) = P(A \mid B) = P(A \mid B')$. From parts b and c, this is not the
case. Therefore they are not independent.
22 a $P(A \cap B) = P(A) \times P(B) \Rightarrow P(A) = P(A \cap B) \div P(B) = 0.15 \div 0.3 = 0.5$
d i $P(A \mid C) = \frac{P(A \cap C)}{P(C)} = \frac{0.1}{0.4} = 0.25$
ii The set $A \cap (B \cup C')$ must be contained within A. First find the set $B \cup C'$: this is made up
from four distinct regions on the above Venn diagram, with labels 0.15, 0.15, 0.25 and 0.05.
Restricting to those regions that are also contained within A leaves those labelled 0.15 and 0.25.
Therefore, $P(A \cap (B \cup C')) = 0.15 + 0.25 = 0.4$
iii From part ii, $P(B \cup C') = 0.15 + 0.15 + 0.25 + 0.05 = 0.6$. Therefore
$P(A \mid (B \cup C')) = \frac{P(A \cap (B \cup C'))}{P(B \cup C')} = \frac{0.4}{0.6} = \frac{2}{3}$
23 a $P(\text{tourism}) = \frac{50}{148} = \frac{25}{74} = 0.338$ (3 s.f.)
b The words ‘given that’ in the question tell you to use conditional probability:
$P(\text{no glasses} \mid \text{tourism}) = \frac{P(G' \cap T)}{P(T)} = \frac{23/148}{50/148} = \frac{23}{50} = 0.46$
d The words ‘given that’ in the question tell you to use conditional probability:
$P(\text{engineering} \mid \text{right-handed}) = \frac{P(E \cap RH)}{P(RH)} = \frac{(30 \times 0.8)/148}{55/74} = \frac{12}{55} = 0.218$ (3 s.f.)
24 a There are two different events going on: ‘Joanna oversleeps’ (O) and ‘Joanna is late for college’
(L). From the context, we cannot assume that these are independent events.
Drawing a Venn diagram, none of the regions can immediately be filled in. We are told that
$P(O) = 0.15$ and so $P(\text{J does not oversleep}) = P(O') = 0.85$. The other two statements can be
interpreted as $\frac{P(L \cap O)}{P(O)} = 0.75$ and $\frac{P(L \cap O')}{P(O')} = 0.1$
Filling in the first one:
$\frac{P(L \cap O)}{P(O)} = 0.75 \Rightarrow \frac{P(L \cap O)}{0.15} = 0.75 \Rightarrow P(L \cap O) = 0.1125$
Also, $\frac{P(L \cap O')}{0.85} = 0.1 \Rightarrow P(L \cap O') = 0.085$
Therefore, $P(L) = P(L \cap O) + P(L \cap O') = 0.1125 + 0.085 = 0.1975$
b $P(O \mid L) = \frac{P(L \cap O)}{P(L)} = \frac{0.1125}{0.1975} = \frac{45}{79} = 0.5696$ (4 s.f.).
25 a
c P(balls are different colours) = P(blue then red) + P(red then blue)
$= \frac{9}{12} \times \frac{3}{11} + \frac{3}{12} \times \frac{9}{11} = \frac{27}{132} + \frac{27}{132} = \frac{54}{132}$
26 a
27 a
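b i There are two different situations where the second counter drawn is blue. These are BB and
RB. Therefore the probability is: $\left(\frac{3}{8} \times \frac{2}{7}\right) + \left(\frac{5}{8} \times \frac{3}{7}\right) = \frac{6 + 15}{56} = \frac{21}{56} = \frac{3}{8} = 0.375$.
ii $P(\text{both blue} \mid \text{2nd blue}) = \frac{P(\text{both blue and 2nd blue})}{P(\text{2nd blue})} = \frac{P(\text{both blue})}{P(\text{2nd blue})} = \frac{\frac{3}{8} \times \frac{2}{7}}{\frac{3}{8}} = \frac{2}{7}$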
Challenge
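1 $P(C) = \frac{z + 7}{50}$
$P(A) = \frac{y + 1}{50}$
$\frac{z + 7}{50} = 3 \times \frac{y + 1}{50}$
z + 7 = 3y + 3
z + 4 = 3y (1)
$P(\text{not } B) = 0.76 = \frac{38}{50}$
$P(\text{not } B) = \frac{y + z + 18}{50}$
So $\frac{y + z + 18}{50} = \frac{38}{50}$
y + z + 18 = 38
y = 20 − z (2)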
Use (2) to substitute for y in (1):
z + 4 = 3(20 − z)
z + 4 = 60 − 3z
4z = 56
z = 14
Substituting this value for z in (2):
y = 20 − 14 = 6
Referring to the diagram:
x = 50 − (6 + 1 + 7 + 14 + 18) = 4
x = 4, y = 6, z = 14
Review exercise 2
1 Diagram A: as x increases, y decreases. There is a negative correlation.
So this corresponds to r = –0.79.
Diagram B: There is no real pattern. There are several values of v for one value of u. There is very
weak or no correlation. So this corresponds to r = 0.08.
b A temperature of 45 °C is very high so it is likely this value was recorded incorrectly. Therefore,
this outlier should be omitted from the data.
c In the regression equation, 2.81 represents the number of additional ice creams (in hundreds) sold
each month for each degree Celsius increase in average temperature.
d A temperature of 2 °C is outside the range of the data so a value calculated using the equation for
the regression line involves extrapolation and is likely to be inaccurate.
3 a
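3 c y = a + bx where
$b = \frac{S_{xy}}{S_{xx}}$ and $a = \bar{y} - b\bar{x}$
$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 8354 - \frac{106 \times 704}{10} = 891.6$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 1352 - \frac{106^2}{10} = 228.4$
$b = \frac{891.6}{228.4} = 3.903\ldots = 3.90$ (3 s.f.)
Since $a = \bar{y} - b\bar{x}$, with $\bar{x} = 10.6$, $\bar{y} = 70.4$ and $b = 3.903\ldots$:
$a = 70.4 - 3.903\ldots \times 10.6 = 29.021\ldots = 29.0$ (3 s.f.)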
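As a check on the working above, the regression coefficients can be computed directly from the summary totals. This is a small sketch under the assumption that n = 10, Σx = 106, Σy = 704, Σxy = 8354 and Σx² = 1352, as quoted in the solution; the function name is illustrative.

```python
def regression_from_summary(n, sum_x, sum_y, sum_xy, sum_x2):
    """Return (a, b) for the least-squares line y = a + bx."""
    s_xy = sum_xy - sum_x * sum_y / n
    s_xx = sum_x2 - sum_x ** 2 / n
    b = s_xy / s_xx                # gradient
    a = sum_y / n - b * sum_x / n  # intercept: y-bar minus b times x-bar
    return a, b

a, b = regression_from_summary(10, 106, 704, 8354, 1352)
print(round(a, 1), round(b, 2))
```

Note that the intercept is computed with the unrounded gradient, which is why the solution obtains 29.021… rather than the value given by using 3.90.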
e i y = 3.90x + 29.0
When x = 19
y = 3.90(19) + 29.0
= 103.1
= 103 ml (3 s.f.)
ii y = 3.90x + 29.0
When x = 35
y = 3.90(35) + 29.0
= 165.5
= 166 ml (3 s.f.)
f The prediction for 19 weeks is likely to be reasonably reliable as it is close to the range
investigated.
The prediction for 35 weeks is likely to be unreliable, since the time is well outside the range of x and
there is no evidence that the model will continue to hold.
4 a This is a scatter diagram of the data. (The diagram also shows the regression line, see part d.)
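b $S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 28\,750 - \frac{315 \times 620}{8} = 4337.5$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 15\,225 - \frac{315^2}{8} = 2821.875$
c $b = \frac{S_{xy}}{S_{xx}} = 1.537\,09 = 1.54$ (3 s.f.)
$a = \bar{y} - b\bar{x} = \frac{620}{8} - b \times \frac{315}{8} = 16.976 = 17.0$ (3 s.f.)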
e i Brand D is overpriced, since this data point is a long way above the regression equation line.
5 a
$b = \frac{S_{xy}}{S_{xx}} = \frac{8100}{20\,487.4} = 0.395\,363 = 0.395$ (3 s.f.)
$a = \bar{y} - b\bar{x} = \frac{48}{8} - \frac{8100}{20\,487.4} \times \frac{130}{8} = -0.424\,68 = -0.425$ (3 s.f.)
So the equation of the regression line is: y = –0.425 + 0.395x
b $f - 100 = -0.424\,68 + 0.395\,36(m - 250)$
$\Rightarrow f = 0.7353 + 0.3953m$
$\Rightarrow f = 0.735 + 0.395m$ (giving the equation parameters to 3 s.f.)
6 a Find the y values by subtracting 2460 from all the l values. The summary data for x and y are:
$\sum x = \sum t = 337.1$, $\sum y = 16.28$
$S_{xy} = \sum xy - \frac{\sum x \sum y}{n} = 757.467 - \frac{337.1 \times 16.28}{8} = 71.4685$
$S_{xx} = \sum x^2 - \frac{(\sum x)^2}{n} = 15\,965.01 - \frac{337.1^2}{8} = 1760.458\,75$
b $b = \frac{S_{xy}}{S_{xx}} = \frac{71.4685}{1760.458\,75} = 0.040\,596\,52 = 0.0406$ (3 s.f.)
$a = \bar{y} - b\bar{x} = \frac{16.28}{8} - 0.040\,596\,52 \times \frac{337.1}{8} = 0.324\,364 = 0.324$ (3 s.f.)
The equation of the regression line is:
y = 0.324 + 0.0406x
7 a $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} = \frac{-808.917}{\sqrt{113\,573 \times 8.657}} = -0.8157 = -0.816$ (3 s.f.)
b There is a negative correlation. The survey suggests that houses are cheaper the further they are
from the railway station.
c To change miles to kilometres, multiply by 1.6. The coding is linear, so the product moment
correlation coefficient will be unaffected by the coding. So the product moment correlation
coefficient will still be –0.816.
8 c $r = \frac{S_{tm}}{\sqrt{S_{tt} S_{mm}}} = \frac{1191.8}{\sqrt{983.6 \times 1728.9}} = 0.913\,92 = 0.914$ (3 s.f.)
d The coding is linear (m = amount spent – 20) so the product moment correlation coefficient will
be unaffected by the coding. So the product moment correlation coefficient will still be 0.914.
e The product moment correlation coefficient of 0.914 suggests that the longer spent shopping the
more money the customer spends. It would suggest a relationship between time spent shopping
and money spent.
The product moment correlation coefficient of 0.178 suggests that there is no relationship between
time spent shopping and money spent.
f The two sets of data might be very different because the data was collected at different times of the
day – or on different days of the week – when shopping behaviour is not the same.
9 a
b The product moment correlation coefficient measures the linear correlation between two variables,
i.e. it is a measure of the strength of the linear link between the variables.
9 c The summary statistics for t and p are:
$S_{tt} = \sum t^2 - \frac{(\sum t)^2}{n} = 141.51 - \frac{33.9^2}{10} = 26.589$
$S_{pp} = \sum p^2 - \frac{(\sum p)^2}{n} = 1081.74 - \frac{96.4^2}{10} = 152.444$
$S_{tp} = \sum tp - \frac{\sum t \sum p}{n} = 386.32 - \frac{33.9 \times 96.4}{10} = 59.524$
d $r = \frac{59.524}{\sqrt{152.444 \times 26.589}} = 0.934\,94 = 0.935$ (3 s.f.)
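The product moment correlation coefficient in parts c and d can be reproduced from the summary totals alone. The sketch below assumes the totals quoted in the solution (n = 10, Σt = 33.9, Σp = 96.4, Σt² = 141.51, Σp² = 1081.74, Σtp = 386.32); the function name is illustrative.

```python
from math import sqrt

def pmcc_from_summary(n, sum_t, sum_p, sum_t2, sum_p2, sum_tp):
    """Return r = S_tp / sqrt(S_tt * S_pp) from summary totals."""
    s_tt = sum_t2 - sum_t ** 2 / n
    s_pp = sum_p2 - sum_p ** 2 / n
    s_tp = sum_tp - sum_t * sum_p / n
    return s_tp / sqrt(s_tt * s_pp)

r = pmcc_from_summary(10, 33.9, 96.4, 141.51, 1081.74, 386.32)
print(round(r, 3))
```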
No. of heads, x   0      1      2      3      4
P(X = x)          1/16   4/16   n/16   4/16   1/16
$\frac{1}{16} + \frac{4}{16} + \frac{n}{16} + \frac{4}{16} + \frac{1}{16} = 1$
10 + n = 16
n = 6
c $P(\text{HHT}) = \frac{1}{2} \times \frac{1}{2} \times \frac{1}{2} = 0.125$
11 a
x          1               2               3               4               5             6
P(X = x)   1/36 = 0.0278   3/36 = 0.0833   5/36 = 0.1389   7/36 = 0.1944   9/36 = 0.25   11/36 = 0.3056
$E(Y) = \frac{1}{4} - \frac{1}{2}E(X) = 0.25 - 0.5E(X)$
So $0.25 - 0.5(-3a - 2b + a + 3c) = -0.05$
$\Rightarrow -2a - 2b + 3c = 0.6$ (2)
x –3 –2 0 1 3
P(X = x) 0.1 0.2 0.2 0.1 0.4
b $-3X < 5Y \Rightarrow -3X < 5\left(\frac{1 - 2X}{4}\right) \Rightarrow -12X < 5 - 10X \Rightarrow X > -\frac{5}{2}$
13 a The probability distribution for X is:
x 1 2 3 4 5 6
P(X = x)
b $P(2 < X \le 5) = P(X = 3) + P(X = 4) + P(X = 5) = \frac{5 + 7 + 9}{36} = \frac{21}{36} = \frac{7}{12} = 0.583$ (3 s.f.)
c $E(X) = \frac{1}{36}(1 + 2 \times 3 + 3 \times 5 + 4 \times 7 + 5 \times 9 + 6 \times 11) = \frac{161}{36}$
d Show all the steps when asked to show that Var(X) = 1.97
$E(X^2) = \sum x^2 P(X = x) = \frac{1}{36}(1 + 2^2 \times 3 + 3^2 \times 5 + 4^2 \times 7 + 5^2 \times 9 + 6^2 \times 11) = \frac{791}{36}$
$\text{Var}(X) = E(X^2) - (E(X))^2 = \frac{791}{36} - \frac{25\,921}{1296} = \frac{28\,476}{1296} - \frac{25\,921}{1296} = \frac{2555}{1296} = 1.97$ (3 s.f.)
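The expectation and variance in question 13 can be verified with exact fractions. This is a minimal sketch, assuming the distribution P(X = x) = (2x − 1)/36 for x = 1, …, 6 implied by the working above; `mean_and_variance` is an illustrative helper name.

```python
from fractions import Fraction

def mean_and_variance(dist):
    """Return (E(X), Var(X)) for a dict mapping each value x to P(X = x)."""
    mean = sum(x * p for x, p in dist.items())
    mean_sq = sum(x * x * p for x, p in dist.items())  # E(X^2)
    return mean, mean_sq - mean ** 2

# Probabilities 1/36, 3/36, 5/36, 7/36, 9/36, 11/36 for x = 1..6
dist = {x: Fraction(2 * x - 1, 36) for x in range(1, 7)}
mean, var = mean_and_variance(dist)
print(mean, var, round(float(var), 2))
```

Working in `Fraction` avoids rounding, so the intermediate values 161/36 and 2555/1296 can be compared directly with the written solution.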
x          1   2    3    4    5
P(X = x)   k   2k   3k   5k   6k
c Show all the steps when asked to show that Var(X) = 1.47
$E(X^2) = \sum x^2 P(X = x) = \frac{1}{17}(1 + 2^2 \times 2 + 3^2 \times 3 + 4^2 \times 5 + 5^2 \times 6) = \frac{266}{17}$
$\text{Var}(X) = E(X^2) - (E(X))^2 = \frac{266}{17} - \frac{4096}{289} = \frac{4522}{289} - \frac{4096}{289} = \frac{426}{289} = 1.47$ (3 s.f.)
15 a Probabilities sum to 1, so:
c $P(4 < X \le 7) = P(X = 5) + P(X = 7)$
16 d
b $E(X^2) = 0.2(-2)^2 + 0.3(-1)^2 + 0.4(1)^2 = 1.5$
$\text{Var}(X) = E(X^2) - (E(X))^2 = 1.5 - (-0.3)^2 = 1.41$
d
So $Y + 1 < X \Rightarrow 3 - 3X < X \Rightarrow X > \frac{3}{4}$
18 a P(X = x) = 0.2
b The spinner has 3 odd numbers and two even numbers so:
P(X = even) = 0.4
P(X = odd) = 0.6
y          1     2                  3                    4
P(Y = y)   0.6   0.4 × 0.6 = 0.24   0.4² × 0.6 = 0.096   0.4³ × 0.6 + 0.4⁴ = 0.064
19 a Drawing a diagram will help you to work out the correct area:
Using $z = \frac{x - \mu}{\sigma}$. As 91 is to the left of 100, your z value should be negative.
$P(X < 91) = P\left(Z < \frac{91 - 100}{15}\right)$
$= P(Z < -0.6)$
= 1 − 0.7257
= 0.2743
(The tables give $P(Z < 0.6) = P(Z > -0.6)$, so you want 1 − this probability.)
b
As 0.2090 is not in the table of percentage points you must work out the larger area:
1 − 0.2090 = 0.7910
Use the first table or calculator to find the z value. It is positive as 100 + k is to the right of 100.
$P(X > 100 + k) = 0.2090$ or $P(X < 100 + k) = 0.791$
$\frac{100 + k - 100}{15} = 0.81$
k = 12
20 a Let H be the random variable ~ height of athletes, so $H \sim N(180, 5.2^2)$
Drawing a diagram will help you to work out the correct area:
Using $z = \frac{x - \mu}{\sigma}$. As 188 is to the right of 180 your z value should be positive. The tables give
P(Z < 1.54) so you want 1 − this probability:
$P(H > 188) = P\left(Z > \frac{188 - 180}{5.2}\right)$
$= P(Z > 1.54)$
= 1 − 0.9382
= 0.0618
Using $z = \frac{x - \mu}{\sigma}$. As 97 is to the right of 85, your z value should be positive.
$P(W < 97) = P\left(Z < \frac{97 - 85}{7.1}\right)$
$= P(Z < 1.69)$
= 0.9545
c $P(W > 97) = 1 - P(W < 97)$, so
$P(H > 188 \text{ and } W > 97) = 0.0618 \times (1 - 0.9545)$
= 0.00281
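The table look-ups in question 20 can be cross-checked with Python's standard library. This sketch assumes the models H ~ N(180, 5.2²) and W ~ N(85, 7.1²) used in the working, and, like the written solution, it multiplies the two tail probabilities, which treats height and weight as independent. The variable names are illustrative. Small differences from the table values are expected because the tables round z to two decimal places.

```python
from statistics import NormalDist

heights = NormalDist(mu=180, sigma=5.2)
weights = NormalDist(mu=85, sigma=7.1)

p_tall = 1 - heights.cdf(188)   # P(H > 188)
p_heavy = 1 - weights.cdf(97)   # P(W > 97)
p_both = p_tall * p_heavy       # assumes H and W are independent

print(round(p_tall, 4), round(p_heavy, 4), round(p_both, 5))
```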
21 a Use the table of percentage points or calculator to find z. You must use at least the four decimal
places given in the table.
P(Z > a) = 0.2
a = 0.8416
P(Z < b) = 0.3
b = −0.5244
−0.5244 is negative since 1.65 is to the left of the centre. 0.8416 is positive as 1.78 is to the right of
the centre.
Using $z = \frac{x - \mu}{\sigma}$:
$\frac{1.78 - \mu}{\sigma} = 0.8416 \Rightarrow 1.78 - \mu = 0.8416\sigma$ (1)
$\frac{1.65 - \mu}{\sigma} = -0.5244 \Rightarrow 1.65 - \mu = -0.5244\sigma$ (2)
Solving simultaneously, (1) − (2):
0.13 = 1.366σ
σ = 0.095 m
Substitute in (1): $1.78 - \mu = 0.8416 \times 0.095$
μ = 1.70 m
Using $z = \frac{x - \mu}{\sigma}$:
$P(\text{height} > 1.74) = P\left(Z > \frac{1.74 - 1.70}{0.095}\right)$
$= P(Z > 0.42)$ (the tables give P(Z < 0.42) so you need 1 − this probability)
= 1 − 0.6628
= 0.3372 (calculator gives 0.3369)
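The simultaneous equations in question 21 can also be solved numerically. The sketch below is one possible check, assuming the two conditions used above (P(X < 1.65) = 0.3 and P(X > 1.78) = 0.2); `fit_normal` is a hypothetical helper, and `NormalDist.inv_cdf` stands in for the table of percentage points.

```python
from statistics import NormalDist

def fit_normal(x_low, p_below_low, x_high, p_above_high):
    """Return (mu, sigma) of a normal model matching two tail probabilities."""
    std = NormalDist()                      # standard normal, Z ~ N(0, 1)
    z_low = std.inv_cdf(p_below_low)        # negative: x_low is left of the mean
    z_high = std.inv_cdf(1 - p_above_high)  # positive: x_high is right of the mean
    # Subtracting the two standardisation equations eliminates mu.
    sigma = (x_high - x_low) / (z_high - z_low)
    mu = x_high - z_high * sigma
    return mu, sigma

mu, sigma = fit_normal(1.65, 0.3, 1.78, 0.2)
print(round(mu, 2), round(sigma, 3))
```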
b P(21 < D < 22.5)= P( D < 22.5) − P( D < 21)= 0.5045 (4 s.f.).
23 $X \sim N(14, 3^2)$
a P(X ≥ 11) = P(X < 17)
$z = \frac{X - \mu}{\sigma} = \frac{17 - 14}{3} = 1$
Φ(1) = 0.8413
P(X ≥ 11) = P(Z < 1)
P(X ≥ 11) = 0.8413
24 a $X \sim N(20, 5^2)$
P(X ⩽ 16) = 1 – P(X ⩽ 24)
$z = \frac{X - \mu}{\sigma} = \frac{24 - 20}{5} = 0.8$
Φ(0.8) = 0.7881
P(X ⩽ 16) = 1 – P(Z ⩽ 0.8)
P(X ⩽ 16) = 0.2119
24 b P(X < d) = 0.95
$P\left(Z < \frac{d - 20}{5}\right) = 0.95$
$\frac{d - 20}{5} = 1.645$
d = 28.225
= 28.2
25 a $S \sim N(850, 50^2)$
P(S < 830) = 1 − P(S < 870)
$z = \frac{X - \mu}{\sigma} = \frac{870 - 850}{50} = 0.4$
Φ(0.4) = 0.6554
P(S < 830) = 1 – P(Z ⩽ 0.4)
P(S < 830) = 1 – 0.6554
= 0.3446
26 $T \sim N(25, 4^2)$
26 b P(|T – 25| < 5) = P(20 < T < 30)
For P(T < 30)
$z = \frac{X - \mu}{\sigma} = \frac{30 - 25}{4} = 1.25$
Φ(1.25) = 0.8944
P(T < 30) = 0.8944
Therefore P(25 < T < 30) = 0.8944 – 0.5
= 0.3944
and
P(20 < T < 25) = P(25 < T < 30) = 0.3944
so
P(20 < T < 30) = 0.3944 × 2
= 0.7888
Challenge
1 a i y = −2.63 + 2.285x
iii $y = 1.1762e^{0.3484x}$
b For y = −2.63 + 2.285x the residuals are:
when x = 1
y = −2.63 + 2.285(1) = −0.345 ⇒ residual = 1.5 − (−0.345) = 1.845
when x = 3
y = −2.63 + 2.285(3) = 4.225 ⇒ residual = 3.3 − 4.225 = −0.925
when x = 4
y = −2.63 + 2.285(4) = 6.510 ⇒ residual = 5.3 – 6.510 = −1.210
when x = 5
y = −2.63 + 2.285(5) = 8.795 ⇒ residual = 7.5 – 8.795 = −1.295
when x = 7
y = −2.63 + 2.285(7) = 13.365 ⇒ residual = 13.8 – 13.365 = 0.435
when x = 8
y = −2.63 + 2.285(8) = 15.65 ⇒ residual = 16.8 – 15.65 = 1.15
For $y = 1.04 + 0.1206x + 0.2353x^2$
when x = 3
$y = 1.04 + 0.1206(3) + 0.2353(3)^2 = 3.5195$ ⇒ residual = 3.3 – 3.5195 = −0.2195
when x = 4
$y = 1.04 + 0.1206(4) + 0.2353(4)^2 = 5.2872$ ⇒ residual = 5.3 – 5.2872 = 0.0128
when x = 5
$y = 1.04 + 0.1206(5) + 0.2353(5)^2 = 7.5255$ ⇒ residual = 7.5 – 7.5255 = −0.0255
when x = 7
$y = 1.04 + 0.1206(7) + 0.2353(7)^2 = 13.4139$ ⇒ residual = 13.8 – 13.4139 = 0.3861
when x = 8
$y = 1.04 + 0.1206(8) + 0.2353(8)^2 = 17.064$ ⇒ residual = 16.8 – 17.064 = −0.264
For $y = 1.1762e^{0.3484x}$
when x = 1
$y = 1.1762e^{0.3484(1)} = 1.6664$ ⇒ residual = 1.5 – 1.6664 = –0.1664
when x = 3
$y = 1.1762e^{0.3484(3)} = 3.3451$ ⇒ residual = 3.3 – 3.3451 = −0.0451
when x = 4
$y = 1.1762e^{0.3484(4)} = 4.7393$ ⇒ residual = 5.3 – 4.7393 = 0.5607
when x = 5
$y = 1.1762e^{0.3484(5)} = 6.7146$ ⇒ residual = 7.5 – 6.7146 = 0.7854
when x = 7
$y = 1.1762e^{0.3484(7)} = 13.4784$ ⇒ residual = 13.8 – 13.4784 = 0.3216
when x = 8
$y = 1.1762e^{0.3484(8)} = 19.0962$ ⇒ residual = 16.8 – 19.0962 = −2.2962
Hence the quadratic model is most suitable as the residuals are smaller and are randomly scattered
around zero.
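The residual comparison can be reproduced in a few lines. This sketch takes the observed values from the residual calculations above (y = 1.5, 3.3, 5.3, 7.5, 13.8, 16.8 at x = 1, 3, 4, 5, 7, 8) and the three fitted models as stated. Ranking the models by the sum of squared residuals is an assumption made here for illustration; the written solution compares the size and scatter of the residuals directly.

```python
from math import exp

xs = [1, 3, 4, 5, 7, 8]
ys = [1.5, 3.3, 5.3, 7.5, 13.8, 16.8]

# The three candidate models from part a.
models = {
    "linear": lambda x: -2.63 + 2.285 * x,
    "quadratic": lambda x: 1.04 + 0.1206 * x + 0.2353 * x ** 2,
    "exponential": lambda x: 1.1762 * exp(0.3484 * x),
}

def residuals(model):
    """Observed value minus predicted value at each x."""
    return [y - model(x) for x, y in zip(xs, ys)]

# Illustrative summary measure: sum of squared residuals for each model.
sum_sq = {name: sum(r * r for r in residuals(f)) for name, f in models.items()}
best = min(sum_sq, key=sum_sq.get)

for name, f in models.items():
    print(name, [round(r, 4) for r in residuals(f)])
print("smallest sum of squares:", best)
```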
2 a P(difference is 0) = P(1,1,1) + P(2,2,2) + P(3,3,3) + P(4,4,4)
$= 4 \times \frac{1}{4} \times \frac{1}{4} \times \frac{1}{4} = \frac{1}{16}$
P(difference is 1) = P(1,1,2) + P(1,2,1) + P(2,1,1) + P(1,2,2) + P(2,1,2) + P(2,2,1)
+ P(2,2,3) + P(2,3,2) + P(3,2,2) + P(2,3,3) + P(3,2,3) + P(3,3,2)
+ P(3,3,4) + P(3,4,3) + P(4,3,3) + P(3,4,4) + P(4,3,4) + P(4,4,3)
$= 18 \times \frac{1}{4} \times \frac{1}{4} \times \frac{1}{4} = \frac{9}{32}$
P(difference is 3) = P(1,1,4) + P(1,4,1) + P(4,1,1) + P(1,4,4) + P(4,1,4) + P(4,4,1)
+ P(1,2,4) + P(1,4,2) + P(4,2,1) + P(2,1,4) + P(2,4,1) + P(4,1,2)
+ P(1,3,4) + P(1,4,3) + P(4,3,1) + P(3,1,4) + P(3,4,1) + P(4,1,3)
$= 18 \times \frac{1}{4} \times \frac{1}{4} \times \frac{1}{4} = \frac{9}{32}$
Since the sum of the probabilities is 1, P(difference is 2) = $\frac{12}{32}$
x          0      1      2       3
P(X = x)   2/32   9/32   12/32   9/32
b $E(X) = \sum xP(X = x) = 0 \times \frac{2}{32} + 1 \times \frac{9}{32} + 2 \times \frac{12}{32} + 3 \times \frac{9}{32} = \frac{15}{8}$ as required
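The listing of outcomes can be checked by brute-force enumeration. This sketch assumes, from the outcomes listed above, that three numbers are each equally likely to be 1, 2, 3 or 4 and that X is the largest minus the smallest; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 4^3 = 64 equally likely ordered triples.
outcomes = list(product(range(1, 5), repeat=3))

# X = largest value minus smallest value (assumed definition of "difference").
counts = Counter(max(t) - min(t) for t in outcomes)
dist = {d: Fraction(c, len(outcomes)) for d, c in counts.items()}
expected = sum(d * p for d, p in dist.items())

print(sorted(dist.items()), expected)
```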