John Doyle: Example write-up
Gender pay discrimination? Controlling for Age, TWC, and Occupation.
1. Introduction.
The data in PERSONclean.xls are part of the evidence brought in a pay discrimination case against
“Company C”. Although, in analysing secondary data, there is no obligation to maintain the original focus of
the data collection, we too will treat it as a question of gender discrimination. We start by establishing that
there is indeed a prima facie case to be answered. We then examine and rebut different counter-arguments that
the defence might have mustered, such as that C’s men are older than women, have been with the company
longer than women, and tend to occupy different jobs to women. While all of these are true, we show that
discrimination still exists when we have controlled for each of these variables in turn.
Most of our analyses use graphical evidence about each claim, followed by more formal hypothesis-testing
procedures that tell us how likely it is that the data we observe arose “by chance”. The exploratory /
descriptive evidence not only guides our expectations, but also helps form our judgments about the most
appropriate ways to analyse the data. We also use 2-tailed tests throughout, first because they are more
conservative, and second because they obviate the need for special case-by-case pleading.
2. Is there a case to be answered?
The left hand panel of Figure 1 compares men’s (n=201) and women’s (n=211) wages, using
boxplots1. At every corresponding point of the two distributions (minimum, maximum, upper and lower
whiskers, Q1, Q3, and Median) men earn more than women. Furthermore, not only does the upper quartile
wage for women (Q3 = £70) lie below the corresponding Q3 for men (= £87), it even lies below Q1 (=£74) for
men2. This means that there is a particular wage, somewhere between £70 and £74, such that at least 75% of men
earn more than it, while at least 75% of women earn less than it!
[Figure 1 image: boxplot panels "Wages by gender" (left) and "Ages by gender" (right), each with categories male and female.]
Figure 1. Comparative boxplots of male versus female wages (left) and ages (right).
1
table(Gender) # how many men and women involved?
par(mfrow=c(1,2)) # two separate side-by-side paired boxplots
boxplot(Wage[Gender==0],Wage[Gender==1],main="Wages by gender",names=c("male","female"),varwidth=TRUE)
boxplot(Age[Gender==0],Age[Gender==1],main="Ages by gender",names=c("male","female"),varwidth=TRUE)
2
summary(Wage[Gender==0])
summary(Wage[Gender==1])
The boxplots suggest no particular skew and no obvious kurtosis, though with rather more dispersed
outliers than we might expect from a normal distribution. Examining the QQnorm plots3, shown in
Figure 2, we again see no very strong skew or kurtosis, though statistics suggest more caution. Moment skew4
is 0.95 for male and -0.68 for female staff. The opposite signs imply that any transformation that improves
skew for men will make it worse for women, and vice versa. Excess Kurtosis is 1.28 and 2.16 for male and
female staff, respectively. One further feature, not apparent from the boxplots, is that Wages are often stepped,
most obviously for the 21 (male) Foremen, all with salaries of £119, which appear misleadingly as a single
point in the boxplot. Overall, there is sufficient concern to use both parametric and non-parametric tests here.
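The moment-based statistics quoted above can be wrapped in a small helper. What follows is a minimal sketch rather than the code actually used: the function name `moment_shape` and the toy vector are our own assumptions, introduced only to check the arithmetic of footnote 4.

```r
# Moment skew and excess kurtosis from central moments, as in footnote 4.
# `moment_shape` is a hypothetical helper name, not part of the original analysis.
moment_shape <- function(x) {
  d  <- x - mean(x)          # deviations from the mean
  m2 <- mean(d^2)            # second central moment
  m3 <- mean(d^3)            # third central moment
  m4 <- mean(d^4)            # fourth central moment
  c(skew = m3 / m2^(3 / 2), excess_kurtosis = m4 / m2^2 - 3)
}

# Toy vector for illustration only (NOT the Company C data).
toy   <- c(1, 2, 3, 4, 5)
shape <- moment_shape(toy)
print(shape)

# With the report's variables available, the per-gender values would come from:
# moment_shape(Wage[Gender == 0]); moment_shape(Wage[Gender == 1])
```

A symmetric vector such as the toy one has zero skew, which gives a quick sanity check before applying the helper to each gender's wages.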
[Figure 2 image: normal QQ-plot panels "Male Wages" and "Female Wages"; x-axis: Gaussian Quantiles (z-scores); y-axis: Quantiles of Wages distribution.]
Figure 2. QQplots of empirical Wage distribution data against quantiles of the theoretical (normal)
distribution.
The most general (null) hypothesis concerning gender pay discrimination is:
H1: Men and women earn different amounts.
H0: Men and women earn the same amount.
The mean Wage for men (£83.47) was significantly greater than for women (£64.59), t(342) = 15.14, p < 2.2 x
10^-16. Thus we reject5 H0 in favour of H1. This rejection of H0 is confirmed by the Wilcoxon Rank Sum test,
which is distribution-free, therefore less influenced by skew and kurtosis: W = 37342, p < 2.2 x 10^-16. In
conclusion, whether looked at descriptively, using parametric tests, or using non-parametric tests, there is a
prima facie case of discrimination: men do earn significantly more than women, both in a practical (men
earned nearly 30% more than women) and in a statistical sense.
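The "nearly 30%" figure follows directly from the two reported means. The sketch below only redoes that arithmetic; the variable names are ours, and the commented-out test call is an assumption about how the comparison might be run (the reported 342 degrees of freedom, rather than 410, are consistent with R's default unequal-variance t-test).

```r
# Reported group means from section 2 (pounds per week).
mean_male   <- 83.47
mean_female <- 64.59

gap_pounds  <- mean_male - mean_female          # absolute gap
gap_percent <- 100 * gap_pounds / mean_female   # gap relative to women's mean

print(round(c(gap_pounds = gap_pounds, gap_percent = gap_percent), 2))

# Assumed form of the tests on the raw data (not shown in the footnotes):
# t.test(Wage[Gender == 0], Wage[Gender == 1])
# wilcox.test(Wage[Gender == 0], Wage[Gender == 1])
```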
3. Are Wage differences really Age differences?
It might be argued that men in Company C tend to be older than women (see right hand panel of
Figure 1), and indeed they are: testing the mean ages between-gender (42.58 versus 34.19 years): t(406) =
6.26, p < 10^-9, and by the Wilcoxon Rank Sum test: W = 28647.5, p < 10^-9. If older people are also paid more, then
that might be the real mechanism behind gender discrimination, making discrimination more apparent than
3
par(mfrow=c(1,2))
qqnorm(Wage[Gender==0],main="Male Wages",xlab="Gaussian Quantiles\n(z-scores)",ylab="Quantiles of Wages distribution")
qqnorm(Wage[Gender==1],main="Female Wages",xlab="Gaussian Quantiles\n(z-scores)",ylab="Quantiles of Wages distribution")
4
m2 = mean((Wage-mean(Wage))**2); m3 = mean((Wage-mean(Wage))**3); m4 = mean((Wage-mean(Wage))**4)
Skew = m3/(m2**(3/2)); Kurtosis = (m4/(m2**2)- 3) # similarly, using Wage[Gender==0] and Wage[Gender==1].
5
2.2 x 10^-16 is the smallest p-value that R prints by default; anything smaller is shown as "< 2.2e-16".
real. The relationship between age and wage is shown in Figure 3 in a scatterplot, with superimposed trend-
line6.
[Figure 3 image: scatterplot; x-axis: Age (years); y-axis: Wage (pounds per week).]
Figure 3. Scatterplot and lowess smooth (larger circles) of Age versus Wage.
Because the trend in Figure 3 is not linear, the correlation is better assessed using Spearman’s
correlation coefficient rather than Pearson’s: Rho(n=412) = .354, p < 10^-12. We therefore reject the null
hypothesis that Wage is not related to Age. However, Figure 3 also shows that there are three parts to the
trend-line. Up to the mid twenties there is a very marked and steep rise in Wage with Age. Then, over the next
30 years Wages are flat with Age. Finally, from mid fifties onwards there is the possibility that Wage
increases slowly with Age. Assessing each part in turn: Rho(n=117) = .743, p < 2.2 x 10^-16; Rho(n=221) = -.038, p >
.5; and Rho(n=74) = .149, p > .2. These correlations7 suggest that, for people older than 25, Age cannot be used
to explain away gender discrimination. However, the under-26s represent a populous group (n=117, or 28% of
the sample) that cannot easily be ignored8. The tight correlation in this group between Age and Wage
threatens to dilute the prosecution’s arguments about discrimination, by having to concede that it is only
evident in certain age groups. Fortunately, there is a second way of analysing the data which is able to confirm
that gender discrimination exists amongst these youngest staff too.
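The three-part assessment of the trend-line can also be written as a single grouped calculation. This is a sketch under stated assumptions: the data frame `demo` is synthetic and invented purely for illustration (it is NOT the Company C data), and only the cut points (25 and under, 26 to 54, 55 and over) are taken from footnote 7.

```r
# Synthetic data for illustration only; NOT the Company C data.
set.seed(42)
n    <- 300
demo <- data.frame(Age = sample(17:64, n, replace = TRUE))
# Hypothetical wage curve: rises up to age 25, then flat, plus noise.
demo$Wage <- 30 + 4 * pmin(demo$Age, 25) + rnorm(n, sd = 8)

# Same age bands as footnote 7: <=25, 26-54, >=55.
demo$Band <- cut(demo$Age, breaks = c(-Inf, 25, 54, Inf),
                 labels = c("<=25", "26-54", ">=55"))

# Spearman's rho within each band; cor() is used instead of cor.test()
# to avoid the warning about exact p-values in the presence of ties.
rho_by_band <- sapply(split(demo, demo$Band), function(d) {
  cor(d$Age, d$Wage, method = "spearman")
})
n_by_band <- table(demo$Band)

print(round(rho_by_band, 3))
print(n_by_band)
```

Using `cut()` with explicit breaks keeps the band boundaries in one place, so the same split can be reused for counts, correlations, or plots.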
6
par(mfrow=c(1,1))
plot(Age,Wage,pch=19,cex=.5,xlab="Age (years)",ylab="Wage (pounds per week)")
points(lowess(Wage~Age,f=1/3),pch=19, cex=1.5)
7
cor.test(Age,Wage,method="spearman")
cor.test(Age[Age<=25],Wage[Age<=25],method="spearman")
cor.test(Age[Age > 25 & Age < 55],Wage[Age>25 & Age < 55],method="spearman")
cor.test(Age[Age >= 55],Wage[Age >= 55],method="spearman")
8
sum(table(Age[Age <= 25]))
        Numbers            Mean Wages
Age     Male    Female     Male     Female
17      0       5          na       33.60
18      1       7          43.00    43.71
19      1       9          54.00    49.67
20      2       9          55.00    57.00
21      5       13         78.60    62.54
22      13      14         77.62    66.64
23      3       9          76.67    66.44
24      5       11         82.20    70.64
25      3       7          84.00    68.71
Table 1. Counts of Male and Female employees at ages 17-25, inclusive; and the mean wages at those ages.
Consider Table 1, which shows the number of Male (and Female) staff aged 17-25, and the mean
wages at those ages9. The means in bold are paired, row-wise, by having the same age. (We must discard the
17 year-olds as there are no males to pair with the females.) Using a paired-sample t-test, the mean wages for
men are significantly greater than those for women10 (68.89 versus 60.67), t(7) = 3.37, p < .02. The same holds
non-parametrically, by the Wilcoxon paired-sample (signed rank) test: V = 33, p < .05. We reject the null hypothesis of
equality of Wages.
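The paired comparison can be checked directly from the rounded means printed in Table 1, without access to the raw data. This is a sketch: the vectors are copied from the table (ages 18 to 25), so the results should agree with the reported statistics only up to rounding.

```r
# Mean wages by age, copied from Table 1 (17-year-olds dropped: no males).
age    <- 18:25
male   <- c(43.00, 54.00, 55.00, 78.60, 77.62, 76.67, 82.20, 84.00)
female <- c(43.71, 49.67, 57.00, 62.54, 66.64, 66.44, 70.64, 68.71)

# Each age contributes one (male, female) pair, so Age is held constant
# within every pair.
tt <- t.test(male, female, paired = TRUE)
wt <- wilcox.test(male, female, paired = TRUE)   # Wilcoxon signed rank test

print(round(c(mean_male = mean(male), mean_female = mean(female)), 2))
print(tt)
print(wt)
```

Note that pairing on the means gives each age equal weight, regardless of how many staff are at that age.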
In conclusion, gender discrimination cannot be reduced to age differences in male versus female staff.
4. Are Wage differences really loyalty differences?
In a benign world, companies should reward staff who are loyal, i.e. stay with the company for many
years. Such employees build up knowledge of organizational practice and ethos. Having loyal staff also
reduces the time and money needed to recruit and train new staff. Figure 4 is the scatterplot and
lowess-smooth of Wage with TWC (time with company). It shows an approximately linear, upward trend. Statistics
confirm what the eye perceives: Pearson’s correlation = .422, t(410) = 9.42, p < 2.2 x 10^-16, and Spearman's
rank correlation rho = .326, p < 10^-10. Therefore loyal staff are paid more. Furthermore, men stay with the
company longer than women (means of 10.62 versus 6.11 years), t(321) = 5.83, p < 10^-7, and by Wilcoxon
Rank Sum test: p < 10^-6, allowing us to reject the H0: TWC(Males) = TWC(Females). As in section 3, these
results, taken at face value, support the counter-claim that the apparent gender discrimination might be
due to loyalty. In short, people who stay longer earn more; men stay longer.
To investigate this possibility, we applied the same paired-sample design used in section 3, where
male and female Wages are paired at the different TWCs. There is a significant difference between Male and
Female wages, t(20) = 11.01, p < 10^-8, or by Wilcoxon Signed Rank: p < 10^-6. That is, when TWC is implicitly
held constant between Males and Females at each of 21 levels (years), there is still a significant difference
between genders11. TWC (loyalty) cannot explain away the gender pay gap.
9
table(Gender[Age <= 25], Age[Age <= 25]) # gives the counts at each age, by gender
x=0; y=0 # and for the means …
for (i in 18:25){
x[i-17]= mean(Wage[Gender==0 & Age == i])
y[i-17]= mean(Wage[Gender==1 & Age == i])
}
10
t.test(x,y,paired=TRUE); wilcox.test(x,y,paired=TRUE)
11
x=0;y=0
for (i in 1:50){
x[i]= mean(Wage[Gender==0 & TWC == i-1])
y[i]= mean(Wage[Gender==1 & TWC == i-1])
}
t.test(x,y,paired=TRUE); wilcox.test(x,y,paired=TRUE) # NaNs won't be included in calculations
[Figure 4 image: scatterplot; x-axis: TWC (time with company, years); y-axis: Wage (pounds per week).]
Figure 4. Scatterplot and lowess-smooth of Wage with TWC.
5. Can the jobs people do explain the gender pay gap?
Another argument is that men and women tend to do different jobs, and that men simply do the
better-paid ones. To argue for this possibility is already to admit that gender discrimination exists, but that
its locus is between-jobs, for instance in job (self?) selection. In this section we show that discrimination
also exists within-job, in that men earn more for doing the same job as women.
                            Numbers            Mean Wages
Occupation                  Male    Female     Male      Female
Clerical Staff              90      96         77.67     61.40
Draughtsmen                 13      3          84.77     84.33
Foremen                     21      0          119.00    na
Inspectors                  15      0          92.47     na
Lab Assistant               4       4          69.50     62.25
Office Machine Operators    19      28         70.58     66.43
Professional and Related    3       1          80.00     62.00
Secretaries Typists         3       56         70.00     64.34
Security Firemen            4       1          80.50     71.00
Technician                  29      22         83.03     74.36
Table 2. Numbers and mean wages by Gender and Occupation.
In Table 2, since there are no female foremen or inspectors (incidentally, the two best paid jobs), we
must discard them from the analysis. Doing so makes for a conservative test of pay discrimination. We
construct paired-sample tests of the gender mean Wages (figures in bold), first using t: t(7) = 4.18, p < .005;
and also using Wilcoxon Signed Rank test, p < .01. We reject the null hypothesis of equality of means. Note
in passing that in all eight occupations for which comparisons can be made, men earn more than women12.
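The within-job comparison can likewise be checked from the rounded means in Table 2. The sketch below is an illustration, not the original code: the data frame `pay` and its column names are our own, NA stands for the "na" cells, and results should match the reported statistics only up to rounding.

```r
# Mean wages by occupation, copied from Table 2; NA marks jobs with no women.
pay <- data.frame(
  occupation = c("Clerical Staff", "Draughtsmen", "Foremen", "Inspectors",
                 "Lab Assistant", "Office Machine Operators",
                 "Professional and Related", "Secretaries Typists",
                 "Security Firemen", "Technician"),
  male   = c(77.67, 84.77, 119.00, 92.47, 69.50, 70.58, 80.00, 70.00, 80.50, 83.03),
  female = c(61.40, 84.33, NA, NA, 62.25, 66.43, 62.00, 64.34, 71.00, 74.36)
)

# Discard occupations that cannot be paired (Foremen, Inspectors).
paired <- pay[complete.cases(pay), ]

tt2 <- t.test(paired$male, paired$female, paired = TRUE)
wt2 <- wilcox.test(paired$male, paired$female, paired = TRUE)  # signed rank test

# In how many comparable occupations do men earn more on average?
n_men_higher <- sum(paired$male > paired$female)

print(tt2)
print(wt2)
print(n_men_higher)
```

Dropping the two all-male occupations removes the two largest male means from the comparison, which is why the test is described as conservative.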
6. Conclusions.
Men are paid more than women for doing the same job. They also exclusively occupy the best paid
jobs. Although age and TWC are positively related to wage, and men are older and stay with the company
longer than women, a gender pay gap continues to exist when these factors are controlled for. Our conclusions
are inescapable: Company C did practise pay discrimination against its female staff.
Word count: 1479, excluding footnotes, tables, figures, and their captions, section headings, and the title.
[I had to work to get it below 1500 words, and I expect you to do so too.]
Flesch Reading Ease: 49.6.
[Make sure your overall Flesch score is above 30, preferably >30 for each and every paragraph.]
Further notes:
1. The figures and tables are small enough that they do not dominate the page, yet can still be read.
2. Work to keep the footnotes together at the bottom of the page where they are referenced, rather than being
split over two pages. Sometimes that means you have to locate the footnote a sentence or two from the
ideal. Sometimes add a blank line, and so on. Please make the effort.
3. Try to avoid putting reference superscripts on anything that might be confusing, such as an equation.
4. I wrote one hypothesis out explicitly, the rest were abbreviated, or even implied. You can do the same (one
explicit, the rest more condensed).
5. I used the "royal we". Modern usage encourages us to use "I", but I'm old-school. Take your pick.
6. Figures and Tables are numbered separately. They each have a brief caption below.
7. Axes and categories in boxplots should be labelled.
8. I have found no need for any Appendices, and I'm done in 6 pages.
9. Pages should be numbered.
10. Work your tables up outside of R. I do them in Excel. Use my kind of template if you like.
11. Do left justify your text.
12. Quote both word count and Flesch at the end of your document.
13. Make up a title that reflects what you have done. "Statistics assignment" is not good enough.
14. It helps to run your commands from an R script. When they work, you can easily copy to footnotes.
15. HOWEVER, this was just one way of writing an assignment, threading a line of argument together. Do
not follow my example slavishly. For instance, I didn't use data transformations, regression, chi-squared,
or 1-tailed tests, etc. They aren't here because I didn't seem to need them. Your analyses may need them,
but be able to do without devices I did use. It all depends... Finally, I would like to think my report is
technically quite proficient, but it's not particularly creative. Once you read the title, you should know
where it's going, though not necessarily how it's going to get there. It's a bit of a me-too effort. May I
encourage you to be more creative than The John.
12
Ox = c(8,9,10,11,12,14,15,16,21,27)
x=0; y=0
for (i in 1:10){
x[i] = mean(Wage[Gender == 0 & Occupation == Ox[i]])
y[i] = mean(Wage[Gender == 1 & Occupation == Ox[i]])
}
t.test(x,y,paired=TRUE); wilcox.test(x,y,paired=TRUE)