
ECON20003 – QUANTITATIVE METHODS 2

TUTORIAL 4

Download the t4e1, t4e2, t4e3, t4e4, t4e5a and t4e5b Excel data files from the subject
website and save them to your computer or USB flash drive. Read this handout and try to
complete the tutorial exercises before your tutorial class, so that you can ask your tutor
for help during the Zoom session if necessary.

After you have completed the tutorial exercises, attempt the “Exercises for assessment”.
You must submit your answers to these exercises in the Tutorial 4 Homework Canvas
Assignment Quiz by the next tutorial in order to get the tutorial mark. For each assessment
exercise, type your answer in the relevant box available in the Quiz or upload your typed
answer separately in PDF format as an attachment. In either case, if the exercise requires
you to use R, save the relevant R/RStudio script and printout and upload them together with
your written answer in PDF format.

Comparing the Central Locations of Two Populations

In certain situations, we might want to know whether a certain treatment has a
significant effect on the central location (measured by the mean or the median) of a
population, while in other situations we might wish to compare the central locations of
two distinct populations. We label the first scenario a paired-sample design (or matched
pairs experiment) and the second an independent measures design. In both cases the focus
is on the difference between two population central locations. In the paired-sample design
we are interested in the difference between the before-treatment and after-treatment
central locations, while in the independent measures design we are interested in the
difference between the central locations of two distinct populations.

To illustrate the paired-sample design, suppose that in order to find out whether some newly
designed golf clubs improve golfers’ performance, we ask a group of golfers to play a round
on a familiar golf course with their own clubs and then another round with the new clubs. Or,
suppose we want to find out whether a particular real estate agency tends to overvalue the
properties of potential vendors in order to secure more business, and we compare a sample
of evaluations by this agency to the evaluations of the same properties by some independent
property valuer.

In both of these examples, there is just one set of experimental units (golfers; properties),
one variable of interest (golfers’ scores on the given course; appraised values of properties),
and a single random sample of pairs of observations (pairs of scores with the old and new
clubs, respectively; pairs of appraised values provided by the real estate agency and the
independent property valuer, respectively). Most importantly, the sample elements (golfers;
properties) are supposed to be selected randomly but the observations in any particular pair
of observations are related to each other.

To illustrate the independent measures design, suppose that we are interested in the
customer satisfaction levels of two competing paid television channels ‘A’ and ‘B’, and ask
a sample of viewers who usually watch channel ‘A’ and another sample of viewers who
usually watch ‘B’ to answer a few questions about their level of satisfaction. Or, suppose we
are interested in the relationship between job tenure and qualification at a company, and
compare the length of time employees with a bachelor’s degree or higher have been working
at the company with that of employees who do not have such a degree.

L. Kónya, 2020, Semester 2 ECON20003 - Tutorial 4

In these examples, there are two different sets of experimental units (viewers of the two
television channels; employees with a bachelor’s degree or higher and employees without
such a degree), one variable of interest (customer satisfaction; job tenure), but two random
samples (samples of the viewers of the two channels; samples of the two types of
employees). Crucially, these random samples are supposed to be independent of each
other.

No matter whether we have a paired-sample design or an independent measures design,
we can use point and interval estimators and hypothesis tests for the difference between
two population central locations, just like previously when we focused on a single population
central location.

Paired-Sample Design

Once the differences between the corresponding observations are calculated, we can apply
the same inferential procedures to the central location of the population of differences as
to the central location of any quantitative population.

These procedures, i.e. the matched-pairs Z / t tests and the corresponding confidence
interval for the difference (μD) between the before and after population means, are based on
the following assumptions:

i. The data is a random sample of pairs of observations (i.e. the before and after samples
are not independent of each other).
ii. The variable of interest is quantitative and continuous.
iii. The measurement scale is interval or ratio.
iv. Either (Z test) the population standard deviation of the differences, σD, is known and
the sample mean of the differences is at least approximately normally distributed, or
(t-test) σD is unknown but the population of the differences is normally distributed (at
least approximately).

Exercise 1 (McClave et al., p. 510, ex. 9.37)

A pupilometer is a device used to observe changes in pupil dilations as the eye is exposed to
different visual stimuli. Since there is a direct correlation between the amount an individual’s
pupil dilates and his or her interest in the stimuli, marketing organizations sometimes use
pupilometers to help them evaluate potential consumer interest in new products, alternative
package designs, and other factors (Optical Engineering, Mar. 1995). The Design and
Market Research Laboratories of the Container Corporation of America used a pupilometer
to evaluate consumer reaction to different silverware patterns for a client. Suppose 15
consumers were chosen at random, and each was shown the same two silverware patterns.
Their pupilometer readings (in millimetres) are saved in the t4e1 Excel file.

a) What are the appropriate null and alternative hypotheses to test whether the mean
amount of pupil dilation differs for the two patterns?

Suppose the researcher shows two silverware patterns one after the other to a consumer
and after each showing measures his/her pupil dilation. Denote these pupilometer
readings as X1 and X2, respectively. These measurements form a pair of matching
observations, and the experiment itself is based on a paired-sample design.

If Di denotes the difference between the two measurements for consumer i, i.e. Di = X1i - X2i,
and μD the mean of population D, then the question implies the following null and
alternative hypotheses:

$$H_0: \mu_D = 0, \qquad H_A: \mu_D \neq 0$$

b) Conduct the test in part (a) using α = 0.05, assuming that the population of D is normally
distributed. Interpret the results.

Launch RStudio, create a new RStudio project and script, import the data from the Excel
file to RStudio and load it into your current project. The pupilometer measurements are
named Pattern1 and Pattern2. Calculate the differences between the corresponding
measurements:

D = Pattern1 - Pattern2

Since the standard deviation of the population of D is unknown, but the population of D
is assumed to be normally distributed, you can now perform a t-test. Do not worry about
doing the calculations manually during the tutorial class; you can do so later.1 Right now
use R as in Exercise 3 of Tutorial 3 and execute the following command:2

t.test(D)

You should get the following printout:

1 To save time, we are going to perform the test with R. Note, however, that you are expected to be able to do
the required calculations with your hand calculator as well. Since this is a simple t-test on the population mean
of the differences, this should not be a problem, granted that you know how to use your calculator efficiently.
2 Recall that by default t.test performs a two-tail test with zero hypothesized population mean at the 5%
significance level. Hence, this time we need to specify only the name of the variable.

The observed test statistic is tobs = 5.7637 and the p-value is practically zero, so H0 can
be rejected at any significance level. Therefore we conclude that the mean amount of
pupil dilation differs for the two patterns.

In order to highlight the equivalence between the t-test for a single population mean and
the paired-sample t-test, we first calculated D and then ran a t-test on it. With R,
however, there is no need for this two-step procedure: we can perform the paired-sample
t-test directly on the original variables Pattern1 and Pattern2 by adding the
paired = TRUE argument to the t.test command:

t.test(Pattern1, Pattern2, paired = TRUE)

The new printout is on the next page. As you can see, the two printouts are formatted
slightly differently, but otherwise they are indeed alike.
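
The equivalence of the two commands can also be checked without the t4e1 data. The following is only a sketch on simulated readings: the vectors x1 and x2, the seed and the distribution parameters are made up for illustration and are not the tutorial data.

```r
# Hypothetical paired readings (NOT the t4e1 data), simulated for illustration only
set.seed(1)
x1 <- rnorm(15, mean = 1.2, sd = 0.3)          # first measurement of each unit
x2 <- x1 - rnorm(15, mean = 0.2, sd = 0.15)    # second measurement of the same unit
d  <- x1 - x2                                  # pairwise differences

# Manual one-sample t statistic on the differences (hypothesized mean is zero)
n     <- length(d)
t_man <- mean(d) / (sd(d) / sqrt(n))

one_sample <- t.test(d)                        # t-test on the differences
paired     <- t.test(x1, x2, paired = TRUE)    # paired-sample t-test on the originals

# All three t statistics coincide
c(manual     = t_man,
  one_sample = unname(one_sample$statistic),
  paired     = unname(paired$statistic))
```

Both t.test calls use n - 1 = 14 degrees of freedom, so their p-values and confidence intervals are identical as well.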

c) Interpret the 95% confidence interval for the difference between the two pupil dilation
population measurements, i.e. for population D.

With 95% confidence, the difference in the mean pupil dilation between pattern 1 and
pattern 2 is somewhere between 0.1503 and 0.3283 millimetres.

d) Is the paired-sample design used for this study preferable to the independent measures
design? For independent samples we could select 30 consumers, divide them into two
groups of 15, and show each group a different pattern. Explain your preference.

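As is often the case, the paired-sample design is preferable to the independent measures
design. Pupil dilation can vary considerably from person to person, and in independent
samples this variation could disguise the variation due to the different patterns shown to
the consumers. In the paired-sample design each consumer sees both patterns, so the
person-to-person variation cancels out in the differences.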

e) In part (b) it was assumed that the population of differences is normally distributed. Since
the sample size is only 15, this assumption is fairly crucial. However, given this small
sample size, the usual diagnostics for normality can be unreliable and misleading. For
this reason perform the appropriate non-parametric test(s) for the median of the
differences between the two pupil dilation measurements. Do you arrive at the same
conclusion as in part (b)?

In last week’s tutorial you used two non-parametric alternatives to the t-test for a
population mean, the one-sample sign test and the one-sample Wilcoxon signed ranks
test for the population median. The same tests can be performed on D, or on Pattern1
and Pattern2.
Let’s start with the sign test. When it is performed on two samples, it has the following
requirements:

i. The data is a random sample of independent pairs of observations (i.e. the before
and after samples are not independent of each other but the selected pairs are).
ii. The variable of interest is qualitative or quantitative.
iii. The measurement scale is at least ordinal.

Since the consumers were selected randomly and each was shown the same two
silverware patterns, and pupilometer reading is a quantitative variable measured on a
ratio scale, all requirements are satisfied.

The null and alternative hypotheses, stated for the population median of the differences (η), are

$$H_0: \eta = 0, \qquad H_A: \eta \neq 0$$

Execute

library(DescTools)
SignTest(D)

You should obtain the following printout:

Alternatively, you can run the test on the original variables instead of D, just like before.

SignTest(Pattern1, Pattern2)

returns the following printout:

This time the test is labelled Dependent-samples Sign-test, but otherwise the two
printouts are equivalent. The test statistic is S = 14 and the p-value is less than 0.001,
so H0 can be rejected at any reasonable significance level. Therefore we conclude that
the median amount of pupil dilation differs for the two patterns.
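
To see where this p-value comes from, the sign test can be reproduced with base R as an exact binomial test. This is a sketch under one assumption that is not stated in the printout: none of the 15 differences is exactly zero, so the test is based on n = 15 signs.

```r
# S = 14 comes from the printout; n = 15 ASSUMES that no difference is exactly zero
s <- 14
n <- 15

# Under H0 the number of positive differences follows a Binomial(n, 0.5) distribution
sign_test <- binom.test(s, n, p = 0.5, alternative = "two.sided")
sign_test$p.value

# The same two-tail p-value from first principles: 2 * P(S >= 14)
p_manual <- 2 * sum(dbinom(s:n, size = n, prob = 0.5))
p_manual
```

Under this assumption the p-value is 32 / 32768, i.e. just below 0.001, which agrees with the conclusion above.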

Let’s move on to the two-sample Wilcoxon signed ranks test. It assumes that

i. The data is a random sample of pairs of observations (i.e. the before and after
samples are not independent of each other but the selected pairs are).
ii. The variable of interest is quantitative and continuous.
iii. The measurement scale is interval or ratio.
iv. The distribution of the differences is symmetric.

As we already saw, the first three requirements are satisfied in this example, so we
need to consider only the fourth requirement.

We can do so like in Exercise 2 of Tutorial 3. Execute the following commands:

hist(D, freq = FALSE, col = "yellow")
lines(seq(-0.1, 0.6, by = 0.01),
      dnorm(seq(-0.1, 0.6, by = 0.01), mean(D), sd(D)),
      col = "red")

qqnorm(D, main = "Normal Q-Q Plot",
       xlab = "Theoretical Quantiles", ylab = "Sample Quantiles",
       col = "forestgreen")
qqline(D, col = "blue")

library(pastecs)
round(stat.desc(D, basic = FALSE, desc = TRUE, norm = TRUE), 3)

The first two commands return a histogram with a normal curve superimposed on it and
the next two commands return a Q-Q plot (see them on the next page). Finally, the last
two commands call the pastecs library and display the following statistics:

Try to evaluate these outputs the way we did last week. You should conclude that they
do not cast any doubt on the assumption of symmetry. It is important to remember
though that this conclusion is not very sound this time due to the small sample size. We
do not need to worry about this uncertainty at this stage because if the sign test and the
Wilcoxon signed ranks test lead to the same conclusion, then it does not really matter
whether the population of D is symmetric or not.

[Figure: Histogram of D (vertical axis: Density) with a normal curve superimposed]

[Figure: Normal Q-Q Plot of D (horizontal axis: Theoretical Quantiles; vertical axis: Sample Quantiles)]

As with the sign test, for the sake of illustration, perform the Wilcoxon signed ranks
test twice by executing the following commands:

library(exactRankTests)
wilcox.exact(D)
wilcox.exact(Pattern1, Pattern2, paired = TRUE)
They return:

and

Again, these two printouts are equivalent. They show that the p-value is less than
0.00015, so H0 can be rejected at any reasonable significance level implying that the
median amount of pupil dilation differs for the two patterns. This is the same conclusion
as the one we arrived at before on the basis of the sign test. Note also that apart from
the fact that in part (b) we tested the population mean while this time the population
median, the conclusions are the same. Therefore, it is not really crucial this time
whether the sampled population is normally distributed (required by the t-test) or is at
least symmetric (required by the Wilcoxon signed ranks test).
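
To see what the signed ranks statistic measures, it can be computed by hand on a small example. The sketch below uses made-up differences (not the t4e1 data) without ties or zeros, and the base R wilcox.test function instead of wilcox.exact; for such data base R also computes an exact p-value.

```r
# Hypothetical differences (NOT the t4e1 data), chosen without ties or zeros
d <- c(0.31, 0.12, -0.05, 0.27, 0.18, 0.40, 0.09, 0.22)

r       <- rank(abs(d))        # ranks of the absolute differences
t_plus  <- sum(r[d > 0])       # rank sum of the positive differences
t_minus <- sum(r[d < 0])       # rank sum of the negative differences

# Check: the two rank sums must add up to n(n + 1)/2
n <- length(d)
t_plus + t_minus == n * (n + 1) / 2

# Base R reports the rank sum of the positive differences as the statistic V
res <- wilcox.test(d, mu = 0)
c(T_plus = t_plus, V = unname(res$statistic), p_value = res$p.value)
```

A rank sum of the positive differences that is close to its maximum, n(n + 1)/2, indicates that the differences are predominantly positive.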

Quit RStudio and save your RData and R files.

Independent Measures Design

Suppose we have two independent random samples of size n1 and n2 drawn from two
quantitative populations that have means μ1 and μ2 and variances σ1² and σ2². The
parameter of interest is the difference between the population means, μ1 - μ2, which can be
estimated with the difference of the sample means. Depending on σ1² and σ2², we
distinguish three possible scenarios.

(1) The population variances are known.

In this undoubtedly unrealistic case the confidence interval estimator of the
difference between the population means is

$$(\bar{x}_1 - \bar{x}_2) \pm z_{\alpha/2}\,\sigma_{\bar{x}_1 - \bar{x}_2} \quad \text{where} \quad \sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

and hypotheses about μ1 - μ2 can be tested with z-tests based on the following test
statistic:

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_{D,0}}{\sigma_{\bar{x}_1 - \bar{x}_2}} \sim N(0,1)$$


(2) The population variances are unknown but equal.

In this case the common population variance can be estimated with the sample
variance of the combined or pooled sample

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

and the standard error of the difference between the two sample means is

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{s_p^2}{n_1} + \frac{s_p^2}{n_2}} = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Assuming that the sampled populations are not extremely non-normal, the
confidence interval estimator of the difference between the population means is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{df,\alpha/2}\, s_{\bar{x}_1 - \bar{x}_2}$$

where the degrees of freedom is $df = n_1 + n_2 - 2$,

and hypotheses about μ1 - μ2 can be tested with t-tests based on the following test
statistic:

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_{D,0}}{s_{\bar{x}_1 - \bar{x}_2}} \sim t_{df}$$

(3) The population variances are unknown and different.

In this case the two population variances have to be estimated separately with the
corresponding sample variances, s1² and s2². Similarly, the variances of the two
sample means have to be estimated separately in the usual way, i.e.

$$s_{\bar{x}_1}^2 = \frac{s_1^2}{n_1}, \qquad s_{\bar{x}_2}^2 = \frac{s_2^2}{n_2}$$

Given these variances, the standard error of the difference between the two sample
means can be estimated with

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Assuming again that the sampled populations are not extremely non-normal, the
confidence interval estimator of the difference between the population means is like
in scenario (2),

$$(\bar{x}_1 - \bar{x}_2) \pm t_{df,\alpha/2}\, s_{\bar{x}_1 - \bar{x}_2}$$

and hypotheses about μ1 - μ2 can be tested with t-tests based on the same test
statistic as in scenario (2), i.e.

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_{D,0}}{s_{\bar{x}_1 - \bar{x}_2}} \sim t_{df}$$

but the degrees of freedom is different,

$$df = \frac{\left(s_{\bar{x}_1 - \bar{x}_2}^2\right)^2}{\dfrac{\left(s_{\bar{x}_1}^2\right)^2}{n_1 - 1} + \dfrac{\left(s_{\bar{x}_2}^2\right)^2}{n_2 - 1}}.$$

The two-independent-sample Z / t test and the corresponding confidence interval estimator
for the difference between two population means are based on the following assumptions:

i. The data consists of two independent random samples of independent observations
(i.e. both the samples and the observations within each sample are independent).
ii. The variable of interest is quantitative and continuous.
iii. The measurement scale is interval or ratio.
iv. Either (Z test) the population standard deviations, σ1 and σ2, are known and the sample
means are at least approximately normally distributed, or (t-test) σ1 and σ2 are
unknown but the sampled populations are normally distributed (at least approximately).

Exercise 2 (McClave et al., p. 497, ex. 9.21)

Marketing strategists would like to predict consumers’ response to new products and their
accompanying promotional schemes. Consequently, studies that examine the differences
between buyers and non-buyers of a product are of interest. One classic study conducted
by Shuchman and Riesz (Journal of Marketing Research, Feb. 1975) was aimed at
characterizing the purchasers and non-purchasers of Crest toothpaste. The researchers
demonstrated that both the mean household size (number of persons) and mean household
income were significantly larger for purchasers than for non-purchasers. A similar study
utilized independent random samples of size 20 on the age of the householder primarily
responsible for buying toothpaste. Householders were categorized as non-purchaser or
purchaser of a particular brand of toothpaste coded as N and P, respectively. The data are
saved in the t4e2 file.

a) Obtain and interpret a 90% confidence interval for the difference between the mean
ages of purchasers and non-purchasers.

Let’s denote the age of non-purchasers as X1 and the age of purchasers as X2. We have
two independent random samples of the same size n1 = n2 = 20 on X1 and X2 and the
experiment is based on an independent measures design. We need to develop a
confidence interval for the difference between the population means, μ1 - μ2. As we have
just discussed, there are three possible scenarios, but since the population variances
are unknown, we can discard the first one. In order to decide whether scenario (2) or (3)
is the more appropriate, we need to compare the sample variances.

For the sake of illustration, we develop the required confidence interval first manually,
but to save time we are going to obtain the sample means and sample standard
deviations with R.

Launch RStudio, create a new RStudio project and script, import the data from the Excel
file to RStudio and load it into your current project.

Last week you learnt that for a single variable these statistics are provided by the mean
and sd commands. This time, however, we need them separately for the two different
types of householders. This can be achieved by the

by(data, byvar, fun)

function, where data is the variable or data frame to be analysed, byvar is the variable that
specifies the groups (i.e. grouping variable), and fun is a function to be applied to the subsets
of data.

Hence, execute

by(Age, Householder, mean)

to obtain the sample means

Householder: N
[1] 47.2

Householder: P
[1] 39.8

and

by(Age, Householder, sd)

to obtain the sample standard deviations

Householder: N
[1] 13.62119

Householder: P
[1] 10.03992
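
If you want to try by() without the t4e2 file, here is a minimal sketch on a small made-up data frame in long format. The data frame toy and its values are hypothetical; only the variable names Age and Householder follow the exercise. The tapply function is shown as an alternative that returns the same group statistics as a named vector.

```r
# Hypothetical long-format data (NOT the t4e2 data)
toy <- data.frame(
  Age         = c(52, 61, 38, 45, 29, 41, 36, 33),
  Householder = c("N", "N", "N", "N", "P", "P", "P", "P")
)

# by(data, byvar, fun): apply fun to Age within each Householder group
by(toy$Age, toy$Householder, mean)
by(toy$Age, toy$Householder, sd)

# tapply returns the same group statistics as named vectors
group_means <- tapply(toy$Age, toy$Householder, mean)
group_sds   <- tapply(toy$Age, toy$Householder, sd)
group_means
```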

The two sample variances are s1² = (13.621)² = 185.532 and s2² = (10.040)² = 100.802.
Their ratio, s1² / s2² = 1.84, seems to be too big to assume that the corresponding
population variances are equal, so let’s follow the third scenario.3 Assume, moreover,
that the sampled populations are not extremely non-normal. Then we can develop the
required confidence interval as follows.

The estimates of the variances of the sample means are

$$s_{\bar{x}_1}^2 = \frac{s_1^2}{n_1} = \frac{185.532}{20} = 9.277, \qquad s_{\bar{x}_2}^2 = \frac{s_2^2}{n_2} = \frac{100.802}{20} = 5.040$$

and the estimate of the standard error of the difference between the two sample means
is

$$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_{\bar{x}_1}^2 + s_{\bar{x}_2}^2} = \sqrt{9.277 + 5.040} = 3.784$$

The degrees of freedom for the t distribution is

$$df = \frac{\left(s_{\bar{x}_1 - \bar{x}_2}^2\right)^2}{\dfrac{\left(s_{\bar{x}_1}^2\right)^2}{n_1 - 1} + \dfrac{\left(s_{\bar{x}_2}^2\right)^2}{n_2 - 1}} = \frac{(3.784^2)^2}{\dfrac{9.277^2}{19} + \dfrac{5.040^2}{19}} = 34.9 \approx 35$$

and the t reliability factor from the t-table is

$$t_{df,\alpha/2} = t_{35,0.05} = 1.690$$

Putting all these together, the 90% confidence interval estimate of the difference
between the mean ages of purchasers and non-purchasers is

$$(\bar{x}_1 - \bar{x}_2) \pm t_{df,\alpha/2}\, s_{\bar{x}_1 - \bar{x}_2} = (47.2 - 39.8) \pm 1.690 \times 3.784 = (1.005;\ 13.795)$$

It means that with 90% confidence the difference between the mean ages of purchasers
and non-purchasers (non-purchasers minus purchasers) is somewhere between 1.0 and 13.8 years.4
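
The manual calculation can be double-checked in R directly from the summary statistics. The sketch below only uses the sample sizes, means and standard deviations reported on the by() printouts; the object names (v1, v2, se, df, t_crit, ci) are chosen here for illustration. Because it uses the unrounded degrees of freedom and qt() instead of the t-table, the result differs slightly from the hand calculation.

```r
# Summary statistics taken from the by() printouts
n1 <- 20; n2 <- 20
xbar1 <- 47.2;     xbar2 <- 39.8        # sample means (N, P)
s1    <- 13.62119; s2    <- 10.03992    # sample standard deviations (N, P)

v1 <- s1^2 / n1                # estimated variance of the first sample mean
v2 <- s2^2 / n2                # estimated variance of the second sample mean
se <- sqrt(v1 + v2)            # standard error of the difference

# Degrees of freedom under scenario (3)
df <- (v1 + v2)^2 / (v1^2 / (n1 - 1) + v2^2 / (n2 - 1))

t_crit <- qt(0.95, df)         # reliability factor for 90% confidence
ci <- (xbar1 - xbar2) + c(-1, 1) * t_crit * se
round(c(se = se, df = df, t_crit = t_crit, lower = ci[1], upper = ci[2]), 4)
```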

b) What assumptions did you make in part (a)? Are they likely satisfied?

The confidence interval in part (a) was based on the assumptions that the data consists
of two independent random samples, the variable of interest is quantitative and
continuous, the measurement scale is interval or ratio, and the population standard
deviations are unknown but the sampled populations are normally distributed (at least
approximately).

3 The ratio of the two sample variances is s1² / s2² = 1.84, while the ratio of the two sample standard deviations
is s1 / s2 = 1.36, i.e. much smaller. In order to decide whether to follow scenario (2) or (3), we compared the
sample variances not the sample standard deviations because the question is whether the unknown population
variances are equal or not.
4 We shall get this confidence interval with R as well a bit later.

We can take the first assumption for granted, as it was explicitly mentioned that the study
was based on “independent random samples”. The variable of interest is the age of the
householder primarily responsible for buying toothpaste. It is a quantitative and
continuous variable. The actual observations are rounded to the nearest year,
so they are discrete values, but since there is a large number of possible values, for the
purpose of hypothesis testing we can still treat this variable as being continuous.

As for normality, although the sample sizes are a bit small, let’s apply stat.desc on the
two samples separately. We can select certain observations with the

subset(x, cond)

function, where x is the object to be subsetted and cond is a logical expression that indicates
which elements of x to keep.

In this case x is Age and cond is based on the Householder grouping variable. Hence,

library(pastecs)
stat.desc(subset(Age, Householder == "N"),
          basic = FALSE, desc = TRUE, norm = TRUE)

returns the descriptive statistics and the Shapiro-Wilk test result for the Age of the
non-purchasers (Householder = “N”),

and

stat.desc(subset(Age, Householder == "P"),
          basic = FALSE, desc = TRUE, norm = TRUE)

returns the same for the Age of the purchasers (Householder = “P”).

As you can see, for both groups, the mean and the median are close to each other5, the
skewness and kurtosis statistics are smaller in absolute value than twice their standard
errors6, and the p-values of the Shapiro-Wilk tests are larger than 0.1. All in all, these
statistics do not cast any doubt on the normality assumption.

Recall that we also assumed that the population variances are unequal, based on a simple
comparison of the sample variances. Next week you will be asked to perform a formal
hypothesis test to see whether there is indeed a significant difference between the two
population variances.

5 For the Householder = “N” group the difference between the mean and the median is 4.8, which is less than
10% of the mean and also smaller than 2 standard errors of the mean (6.092).

c) Do the data present sufficient evidence to conclude that there is a difference in the mean
age of purchasers and non-purchasers? Assume that the populations are normally
distributed and use α = 0.10. Perform the test first manually and then with R.

The question implies the following null and alternative hypotheses

$$H_0: \mu_1 - \mu_2 = 0, \qquad H_A: \mu_1 - \mu_2 \neq 0$$

Recall that confidence interval estimation with a 90% level of confidence is equivalent
to two-tail hypothesis testing at the 10% level. Since the confidence interval in part (a)
does not include zero, we can conclude at the 10% level of significance that there is a
significant difference between the mean ages of purchasers and non-purchasers.

Still, for the sake of illustration, let’s now perform a formal hypothesis test as well. Under
scenario (3) the test statistic is

$$T = \frac{(\bar{X}_1 - \bar{X}_2) - \mu_{D,0}}{s_{\bar{x}_1 - \bar{x}_2}} \sim t_{df}$$

The degrees of freedom is the same as in part (a), i.e. 35, and thus the upper 5%
critical value is the same as the reliability factor in part (a), i.e. 1.690, and the lower
5% critical value is -1.690. The null hypothesis is to be rejected if the observed test statistic
value is smaller than the lower critical value or larger than the upper critical value, or in
brief, if the absolute value of the observed test statistic is larger than the upper critical
value.

Using the details from part (a),

$$t_{obs} = \frac{(\bar{x}_1 - \bar{x}_2) - \mu_{D,0}}{s_{\bar{x}_1 - \bar{x}_2}} = \frac{(47.2 - 39.8) - 0}{3.784} = 1.956$$

Since this observed test statistic value is above the upper critical value, we reject H0 and
conclude at the 10% significance level that the mean ages of purchasers and non-
purchasers differ from each other.
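
The critical-value decision can be complemented with a p-value computed from the same ingredients. The sketch below takes the standard error 3.784 from part (a) and the unrounded degrees of freedom 34.94; the object names are chosen here for illustration.

```r
xbar1 <- 47.2; xbar2 <- 39.8   # sample means (N, P)
se    <- 3.784                 # standard error of the difference from part (a)
df    <- 34.94                 # degrees of freedom under scenario (3), unrounded
alpha <- 0.10

t_obs  <- ((xbar1 - xbar2) - 0) / se                   # observed test statistic
t_crit <- qt(1 - alpha / 2, df)                        # upper critical value
p_two  <- 2 * pt(abs(t_obs), df, lower.tail = FALSE)   # two-tail p-value

c(t_obs = t_obs, t_crit = t_crit, p_value = p_two)
abs(t_obs) > t_crit                                    # TRUE means reject H0
```

Rejecting H0 when the absolute value of t_obs exceeds the critical value is equivalent to rejecting it when the two-tail p-value is below α.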

We can perform this test with R by executing the following command:

t.test(Age ~ Householder, conf.level = 0.90)

where ~ (called tilde) is an operator. In general, it means “as a function of” and it
separates the left-hand side (dependent variable) and right-hand side (independent
variable) in a model formula. In this case, Age ~ Householder implies that the t.test
command is to be executed on the difference between the two means of Age specified
by the two categories of Householder.7

6 Recall (Tutorial 3, p. 13) that on the printout skew.2SE is skewness divided by 2 times its standard error,
kurtosis is actually excess kurtosis, and kurt.2SE is excess kurtosis divided by 2 times its standard error.

It returns the following output:

By default, R assumes that the population variances are different (3rd scenario) and
performs an unequal variances t-test originally developed by Welch. The test statistic is
tobs = 1.9557 and the p-value is 0.05853, so H0 can be rejected even at the 6% level.

This printout also shows the 90% confidence interval for the difference between the
mean ages of purchasers and non-purchasers: (1.006765 ; 13.793235). It is practically
the same as in part (a).

What if there is reason to believe that the population variances are equal? In this case
we need to add the var.equal = TRUE argument to the t.test command. The augmented
command,

t.test(Age ~ Householder, var.equal = TRUE, conf.level = 0.90)

returns the following results:

If you compare the two printouts, you can see that the test statistics are the same, but
the numbers of degrees of freedom are different (34.94 vs. 38) and hence the 90%
confidence interval estimates and the p-values are not the same either. However, the
difference between the p-values is very small, so as far as the t-test is concerned,
this time it does not really matter whether we assume equal or different population
variances.

7 Note that the data in the t4e2.xlsx file is arranged in long format while in t4e1.xlsx it is in wide format. That’s
why in Exercise 1 in the t.test command we specified the variables as Pattern1, Pattern2, while this time we
specified them as Age ~ Householder. You will learn about these two data formats in Tutorial 6.
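
The test statistics coincide because the two samples have the same size: when n1 = n2, the pooled standard error of scenario (2) is algebraically equal to the unpooled standard error of scenario (3), and only the degrees of freedom differ. A short sketch based on the summary statistics from the by() printouts (object names are chosen here for illustration):

```r
# Summary statistics taken from the by() printouts
n1 <- 20; n2 <- 20
xbar1 <- 47.2;     xbar2 <- 39.8
s1    <- 13.62119; s2    <- 10.03992

# Scenario (2): pooled variance and standard error
sp2       <- ((n1 - 1) * s1^2 + (n2 - 1) * s2^2) / (n1 + n2 - 2)
se_pooled <- sqrt(sp2 * (1 / n1 + 1 / n2))
df_pooled <- n1 + n2 - 2

# Scenario (3): unpooled standard error
se_welch <- sqrt(s1^2 / n1 + s2^2 / n2)

t_pooled <- (xbar1 - xbar2) / se_pooled
c(se_pooled = se_pooled, se_welch = se_welch, t = t_pooled, df_pooled = df_pooled)
```

With unequal sample sizes the two standard errors, and hence the two test statistics, would generally differ.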

d) Although the original question implies a two-tail test, for the sake of illustration, perform
a left-tail and a right-tail t-test as well and compare the printouts to each other.

By default the t.test command assumes that the t-test is a two-tail test, so when the
hypotheses are

$$H_0: \mu_1 - \mu_2 = 0, \qquad H_A: \mu_1 - \mu_2 < 0$$

it has to be augmented with the alternative = "less" argument.

The augmented command,

t.test(Age ~ Householder, conf.level = 0.90, alternative = "less")

returns

Likewise, when the hypotheses are

$$H_0: \mu_1 - \mu_2 = 0, \qquad H_A: \mu_1 - \mu_2 > 0$$

the appropriate command is

t.test(Age ~ Householder, conf.level = 0.90, alternative = "greater")

and it returns

There are two differences between these two printouts.

First, the confidence interval that corresponds to the left-tail t-test is open from below,
(-∞, 12.34), while the confidence interval that corresponds to the right-tail t-test is open
from above, (2.46, ∞).

Second, the left-tail t-test fails to reject the null hypothesis (p-value = 0.9707), while the
right-tail t-test rejects it even at the 3% significance level (p-value = 0.02927). This is
because R considers Householder = “N” as population 1 and Householder = “P” as
population 2, and hence the difference between the two sample means is positive (47.2
– 39.8 = 7.4), providing no support for the left-sided alternative hypothesis.
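
Both one-tail p-values and both one-sided confidence bounds can be recovered from the two-tail results. The sketch below uses the test statistic (1.9557) and degrees of freedom (34.94) from the two-tail printout and the standard error (3.784) from part (a); the object names are chosen here for illustration.

```r
t_obs <- 1.9557          # test statistic from the two-tail printout
df    <- 34.94           # degrees of freedom from the two-tail printout
diff  <- 47.2 - 39.8     # difference between the sample means (N minus P)
se    <- 3.784           # standard error of the difference from part (a)

p_less    <- pt(t_obs, df)                        # left-tail p-value
p_greater <- pt(t_obs, df, lower.tail = FALSE)    # right-tail p-value

# One-sided 90% bounds put the whole 10% in a single tail
upper_bound <- diff + qt(0.90, df) * se           # interval (-Inf, upper_bound)
lower_bound <- diff - qt(0.90, df) * se           # interval (lower_bound, Inf)

round(c(p_less = p_less, p_greater = p_greater,
        upper_bound = upper_bound, lower_bound = lower_bound), 4)
```

The two one-tail p-values add up to one, and the right-tail p-value is half of the two-tail p-value because the observed test statistic is positive.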

What if we intend to compare the central locations of two populations which are very non-
normal, or do not have means because they are measured on an ordinal scale, or do have
means but we prefer to use the medians to measure their central locations? In these cases,
we should use some nonparametric test for the difference between the population medians.
The simplest option is the Wilcoxon rank-sum test.

The Wilcoxon rank-sum test is similar to the Wilcoxon signed ranks test, but instead of
classifying the observations based on their relative positions to the hypothesized median, it
classifies the observations according to some characteristic of the experimental units (in the
current example, according to whether or not they are the toothpaste purchaser in the household).

The Wilcoxon rank-sum test is based on the following assumptions:

i. The data consists of two independent random samples of independent observations
(i.e. both the samples and the observations within each sample are independent).
ii. The variable of interest is continuous …
iii. … and the measurement scale is at least ordinal.
iv. The two sampled populations differ at most with respect to their central locations
measured by the medians (i.e. they are identical in shape and spread).

Let sample 1 be the smaller (not bigger) sample, so that n1 ≤ n2. To perform the Wilcoxon
rank-sum test, we need to rank all available observations in the combined (pooled) sample
from the smallest to the largest, averaging the ranks of tied observations, and calculate the
rank sums of the two samples (T1 and T2).8 The test statistic is T = T1.

The exact small sample (n1 ≤ 10, n2 ≤ 10) lower and upper critical values, TL and TU, are in
Table 8, Appendix B (p. 1088) of the Selvanathan book. We can reject H0 if

(i) right-tail test: T ≥ TU,α,
(ii) left-tail test: T ≤ TL,α,
(iii) two-tail test: T ≥ TU,α/2 or T ≤ TL,α/2.

For larger sample sizes the sampling distribution of T has a normal approximation with
parameters

8
To perform the test, we need only T1, the rank sum of the smaller (not bigger) sample, but it is recommended
to check whether T1 + T2 is equal to n(n+1)/2.
$$\mu_T = \frac{n_1 (n_1 + n_2 + 1)}{2}, \qquad \sigma_T^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$$

e) Assume this time that the populations are not normally distributed and perform the
Wilcoxon rank-sum test to see whether there is a difference in the median age of
purchasers and non-purchasers (use α = 0.10). Perform the test first manually and then
with R.

The hypotheses are

$$H_0: \eta_1 - \eta_2 = 0, \qquad H_A: \eta_1 - \eta_2 \neq 0$$

The ranks are shown in the table that follows this solution.

The rank sums are T1 = 338 and T2 = 482. Their sum is 820, equal to

$$\frac{n(n+1)}{2} = \frac{40 \times 41}{2} = 820$$

confirming our rank calculations.

The test statistic is T1 = 338 and the sample sizes are larger than 10, so we rely on
normal approximation. The Z critical values for a two-tail test with 10% significance level
are

$$\pm z_{0.05} = \pm 1.645$$

and we reject H0 if the calculated test statistic is smaller than -1.645 or greater than
1.645.

The expected value and variance of the test statistic under the null hypothesis are

$$\mu_T = \frac{n_1 (n_1 + n_2 + 1)}{2} = \frac{20 \times 41}{2} = 410, \qquad \sigma_T^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} = \frac{20 \times 20 \times 41}{12} = 1366.7$$

and the observed standardised test statistic is

$$z_{obs} = \frac{T - \mu_T}{\sigma_T} = \frac{T_1 - \mu_T}{\sigma_T} = \frac{338 - 410}{\sqrt{1366.7}} = -1.948$$

Since the absolute value of the observed test statistic is larger than 1.645, we can reject
H0 and conclude that at the 10% significance level there is a difference in the median
age of purchasers and non-purchasers.
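
The manual calculation can be cross-checked with a few lines of base R. This is only a sketch: it assumes the ages are typed in from the rank table of this exercise, and the object names (age_P, age_N, T1, mu_T and so on) are illustrative choices, not names from the data file.

```r
# Ages typed in from the Exercise 2 rank table (P = sample 1, N = sample 2)
age_P <- c(34, 35, 23, 44, 52, 46, 28, 48, 28, 34,
           33, 52, 41, 32, 34, 49, 50, 45, 29, 59)
age_N <- c(28, 22, 44, 33, 55, 63, 45, 31, 60, 54,
           53, 58, 52, 52, 66, 35, 25, 48, 59, 61)

# Rank the pooled sample; by default rank() averages the ranks of ties
r  <- rank(c(age_P, age_N))
n1 <- length(age_P)
n2 <- length(age_N)
T1 <- sum(r[seq_len(n1)])     # rank sum of sample 1 (P)
T2 <- sum(r[-seq_len(n1)])    # rank sum of sample 2 (N)

# Check: the two rank sums must add up to n(n + 1) / 2
n <- n1 + n2
stopifnot(T1 + T2 == n * (n + 1) / 2)

# Normal approximation of the distribution of T under H0
mu_T    <- n1 * (n1 + n2 + 1) / 2
sigma_T <- sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
z_obs   <- (T1 - mu_T) / sigma_T
z_crit  <- qnorm(0.95)        # two-tail test with alpha = 0.10

c(T1 = T1, T2 = T2, mu_T = mu_T, var_T = sigma_T^2,
  z_obs = z_obs, z_crit = z_crit)
abs(z_obs) > z_crit           # TRUE means that H0 is rejected
```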

Householder Age Rank T
P 34 13.00
P 35 15.50
P 23 2.00
P 44 18.50
P 52 28.50
P 46 22.00
P 28 5.00
P 48 23.50
P 28 5.00
P 34 13.00
P 33 10.50
P 52 28.50
P 41 17.00
P 32 9.00
P 34 13.00
P 49 25.00
P 50 26.00
P 45 20.50
P 29 7.00
P 59 35.50 338.00
N 28 5.00
N 22 1.00
N 44 18.50
N 33 10.50
N 55 33.00
N 63 39.00
N 45 20.50
N 31 8.00
N 60 37.00
N 54 32.00
N 53 31.00
N 58 34.00
N 52 28.50
N 52 28.50
N 66 40.00
N 35 15.50
N 25 3.00
N 48 23.50
N 59 35.50
N 61 38.00 482.00

Using R, this test can be performed with the wilcox.exact command of the
exactRankTests package. Execute the following commands:9

library(exactRankTests)
wilcox.exact(Age ~ Householder)

9
By default, wilcox.exact assumes that the samples are independent, i.e. paired = FALSE, so unlike earlier
in Exercise 1 (page 7), it is not necessary to use the paired argument.
You should get the following printout:

The reported test statistic is W = 272 and the p-value is 0.05129, so H0 can be rejected
even at the 5.2% significance level.

But, where did this test statistic come from? When we performed the test manually, we
used T = T1 = 338 as the test statistic because the sample sizes are equal and the first
sample in the data file is Householder = “P”. R, however, considers the Householder =
“N” sample the first sample10, and reports an adjusted version of the Wilcoxon test
statistic11,

$$W = T_2 - \frac{n_2 (n_2 + 1)}{2} = 482 - \frac{20 \times 21}{2} = 272$$
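
The link between the rank sums and the reported W can be illustrated with a short sketch. Note the assumptions: it uses wilcox.test from base R instead of wilcox.exact so that it runs without the exactRankTests package, the data are typed in from the rank table, and the object names are illustrative. With exact = FALSE the p-value of wilcox.test comes from a normal approximation, so only the statistic is compared here.

```r
# Ages typed in from the Exercise 2 rank table (illustrative object names)
age_P <- c(34, 35, 23, 44, 52, 46, 28, 48, 28, 34,
           33, 52, 41, 32, 34, 49, 50, 45, 29, 59)
age_N <- c(28, 22, 44, 33, 55, 63, 45, 31, 60, 54,
           53, 58, 52, 52, 66, 35, 25, 48, 59, 61)
d <- data.frame(Age = c(age_P, age_N),
                Householder = factor(rep(c("P", "N"), each = 20)))

# Rank sums of the two groups in the pooled sample
r   <- rank(d$Age)
T_N <- sum(r[d$Householder == "N"])
T_P <- sum(r[d$Householder == "P"])
n_N <- sum(d$Householder == "N")
n_P <- sum(d$Householder == "P")

# Adjusted statistics: rank sum minus its smallest possible value
W_N <- T_N - n_N * (n_N + 1) / 2
W_P <- T_P - n_P * (n_P + 1) / 2

# Base R treats "N" (the first factor level) as the first sample
W_R <- unname(wilcox.test(Age ~ Householder, data = d,
                          exact = FALSE)$statistic)

c(W_N = W_N, W_P = W_P, W_R = W_R)
```

The two adjusted statistics always add up to n1 × n2 (here 400), so either one determines the other.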

f) Like in part (d), perform a left-tail and a right-tail Wilcoxon rank-sum test as well and
compare the printouts to each other.

For a left-tail test, execute

wilcox.exact(Age ~ Householder, alternative = "less")

to obtain

For a right-tail test

wilcox.exact(Age ~ Householder, alternative = "greater")

returns

10
You can see this on the previous t.test printouts.
11
The adjustment term is the smallest possible value of T, i.e. the sum of the first n positive integers where n
is the number of observations in the ‘first’ sample.
Like in the case of the t.test command, the test statistic reported by wilcox.exact does
not depend on the nature of the test (i.e. two-tail, left-tail or right-tail), but the p-value
does. For the left-tail test it is far too big (0.9752) to reject H0, while for the right-tail test
it is small enough (0.02565) to reject H0 even at the 3% significance level.

Quit RStudio and save your RData and R files.

Exercise 3

The owner of a computer store is concerned about the one-year parts and labour warranty
on its top two bestselling brands of laptop computers, brand A and brand B. In particular, he
would like to know whether there is a difference between these two brands in terms of the
time between the sale of a laptop and its return for repair under warranty. In the last month
there were 6 claims for warranty repairs of brand A laptops and 9 claims for warranty repairs
of brand B laptops. The number of days these laptops had been owned prior to coming in
for repair are saved in the t4e3 Excel file. Perform the Wilcoxon rank sum test at the 5%
significance level, both manually and with R, to assist the owner in his quest.12

Since there are 6 returned brand A laptops but 9 brand B laptops, we consider brand A as #1
and brand B as #2. The hypotheses are

$$H_0: \eta_1 - \eta_2 = 0, \qquad H_A: \eta_1 - \eta_2 \neq 0$$

The 5% critical values from Table 8, Appendix B of Selvanathan are TL = 31 and TU = 65,
and H0 is rejected if T ≤ TL or T ≥ TU, where T = T1.

Like in Exercise 2, we need to rank all 15 observations from the smallest to the largest and
calculate the rank sums. The details are shown in the table below.13

      Brand_A             Brand_B
   Days    Rank        Days    Rank
    225    12.5          83     7.0
     79     6.0          52     3.5
    225    12.5         113     9.0
     52     3.5          67     5.0
     29     1.0         165    11.0
     98     8.0         132    10.0
                         48     2.0
                        230    14.0
                        255    15.0
   T1 =    43.5        T2 =    76.5

The rank sum is T1 = 43.5 for Brand A and T2 = 76.5 for Brand B. Their sum is 120, equal to

12
Note that this time the sample sizes are far too small to assess normality in any reasonable way.
13
Try to reproduce this table for the sake of practice.
$$\frac{n(n+1)}{2} = \frac{15 \times 16}{2} = 120$$

confirming our rank calculations.

The observed test statistic is T = T1 = 43.5. It is between the lower and upper critical values,
so at the 5% significance level there is not enough evidence to reject H0.
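
As a quick check of the ranking with ties, the following sketch reproduces the rank sums in base R. It assumes the 15 values are typed in from the table above; the object names are illustrative, and the critical values are simply the ones quoted from Table 8 in the text.

```r
# Days owned before the warranty repair, typed in from the table above
brand_A <- c(225, 79, 225, 52, 29, 98)
brand_B <- c(83, 52, 113, 67, 165, 132, 48, 230, 255)

# Pooled ranks; the tied values (52 and 225) receive averaged ranks
r  <- rank(c(brand_A, brand_B))
n1 <- length(brand_A)
T1 <- sum(r[seq_len(n1)])     # rank sum of brand A (the smaller sample)
T2 <- sum(r[-seq_len(n1)])    # rank sum of brand B

# Two-tail 5% critical values as quoted in the text
TL <- 31
TU <- 65
reject <- (T1 <= TL) || (T1 >= TU)

c(T1 = T1, T2 = T2)
reject                        # FALSE means that H0 is maintained
```
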
To repeat this test with R, launch RStudio, create a new RStudio project and script, import
the data from the Excel file to RStudio and load it into your current project. Then, execute
the following commands

library(exactRankTests)
wilcox.exact(Brand_A, Brand_B)

to obtain

The reported test statistic is

$$W = T_1 - \frac{n_1 (n_1 + 1)}{2} = 43.5 - \frac{6 \times 7}{2} = 22.5$$

and the p-value is far too big to reject H0 at any reasonable significance level. Hence, we
cannot conclude that there is a difference between the two brands of laptops in terms of the
time between sale and return for repair under warranty.

Suppose now that the test is a one-tail test and the significance level is still 5%. The test
statistic does not change, but the critical values and the decision rule do. The new critical
values are TL = 33 and TU = 63.

For a left-tail test

$$H_0: \eta_1 - \eta_2 = 0, \qquad H_A: \eta_1 - \eta_2 < 0$$

and H0 can be rejected if T ≤ TL. In this case T = 43.5 is larger than TL = 33, so H0 cannot be
rejected.

This test can be performed with R by executing the following command:

wilcox.exact(Brand_A, Brand_B, alternative = "less")

which returns

The p-value is 0.3131, so H0 is maintained.

For a right-tail test

$$H_0: \eta_1 - \eta_2 = 0, \qquad H_A: \eta_1 - \eta_2 > 0$$

and H0 can be rejected if T ≥ TU. Since T = 43.5 is smaller than TU = 63, again we fail to
reject H0.

The appropriate R command is

wilcox.exact(Brand_A, Brand_B, alternative = "greater")

which returns

The p-value is 0.7069, again far too big to reject H0.
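
The three versions of the test can also be compared side by side. The sketch below is an approximation under stated assumptions: it uses wilcox.test from base R with exact = FALSE (a normal approximation with continuity correction) in place of wilcox.exact, so its p-values will differ somewhat from the exact ones reported above, and the data and object names are typed in by hand for illustration.

```r
# Days owned before the warranty repair (typed in from the Exercise 3 table)
brand_A <- c(225, 79, 225, 52, 29, 98)
brand_B <- c(83, 52, 113, 67, 165, 132, 48, 230, 255)

# Run the rank-sum test for each form of the alternative hypothesis
alternatives <- c("two.sided", "less", "greater")
tests <- lapply(alternatives, function(a) {
  wilcox.test(brand_A, brand_B, alternative = a, exact = FALSE)
})
names(tests) <- alternatives

# The statistic W does not depend on the alternative, the p-value does
W <- sapply(tests, function(t) unname(t$statistic))
p <- sapply(tests, function(t) t$p.value)

rbind(W = W, p_value = p)
```

In all three cases the p-value is well above 0.05, in line with the decisions based on the critical values.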

Quit RStudio and save your RData and R files.

Exercises for Assessment

Exercise 4 (Selvanathan et al., p. 887, ex. 20.9)

In recent years, insurance companies offering medical coverage have given discounts to
companies that are committed to improving the health of their employees. To help determine
whether this policy is reasonable, the general manager of one large insurance company in
the US organised a study of a random sample of 30 workers who regularly participate in
their company’s lunchtime exercise program and 30 workers who do not. Over a two-year
period, he observed the total dollar amount of medical expenses for each individual. The
data are stored in the t4e4 (column 1: Expenses; column 2: Exercise, 1 for yes, 0 for no)
Excel file. Do all calculations with R.

a) Can the manager conclude at the 5% significance level that companies that provide
exercise programs should be given discounts? Perform an independent-samples t-test
to answer the question. Do not forget to specify the null and alternative hypotheses.

b) What assumptions must hold to ensure the validity of the hypothesis test in part (a)
above? Does it appear that these conditions are satisfied?

c) Assuming that some of the assumption(s) mentioned above is (are) not satisfied, which
nonparametric hypothesis-testing procedure could be used? Conduct this test and give
the appropriate conclusion in the context of the problem.

d) Compare your conclusions in parts (a) and (c).

Exercise 5 (Selvanathan et al., p. 886, ex. 20.5)

In a taste test of a new beer, 25 people rated the new beer and another 25 rated the leading
brand on the market. The possible ratings were Poor, Fair, Good, Very Good, and Excellent.

a) Suppose the responses for the new beer and the leading beer were stored using a 1-2-
3-4-5 coding system (1 = Poor, …, 5 = Excellent). Based on the data saved in the t4e5a
file, can we infer that the new beer is rated less highly than the leading brand?

b) Suppose the responses were recoded so that 3 = Poor, 8 = Fair, 22 = Good, 37 = Very
Good, and 55 = Excellent. Based on the recoded data, saved in the t4e5b file, can we
infer that the new beer is rated less highly than the leading brand?

c) What does this exercise tell you about ordinal data?

Do all calculations with R.
