
Chapter 7 Database Analysis Answers

This document provides answers to questions about analyzing databases from Chapter 7 of the textbook "Business Statistics by Ken Black, 7th edition". It includes explanations and calculations for selecting a random sample from a population using random number tables and Excel, determining the mean and probability of a sample mean for household food spending data, and calculating probabilities for proportions of hospital types in samples compared to population values.

Uploaded by

Ahmed Shehata

Business Statistics by Ken Black, 7th edition

Answers to Analyzing the Databases Questions in Chapter 7

1. The first question in the text is not clear. There should be an additional
sentence (as the 4th sentence) that says, "Explain how you would go
about doing this." The answer is something like:
The 140 SIC codes need to be numbered. Since a table of random
numbers contains random digits in all directions, we could use the SIC
codes as they are listed. All SIC codes here have three digits. If, on the
other hand, the researcher wished to start at 1 and number
consecutively, then each SIC would be numbered from 001 to 140.
Note that each SIC code must have the same number of digits when
we go to select random numbers. Next, we go to a table of random
numbers (like table A.1). We start wherever we want in the table and
pick three digits. If the three digits match one of our SIC code numbers
or one of the recoded numbers (if we renumbered from 001 to 140),
then that SIC code has been selected; if not, we pick another, and so on
until we have selected 6 different SIC codes. In using a table of
random numbers, sometimes we have a predetermined pattern that
we use. For example, we start at some point in table A.1 and select
three digits. Then we have predetermined to skip say five digits and
pick the next three and so on down the row. Once we reach the end of
a row, maybe we have decided to skip down two rows and start again.
Since the idea is to randomly generate numbers between 001 and 140,
the order or pattern we use is not important as long as the selection
process is consistent. Another option for selecting random numbers is
to use Excel. In Excel, the function =RANDBETWEEN(1,140) generates a
random integer between 1 and 140. We can do this 6 times to obtain 6
different random numbers. If Excel happens to generate the same
number twice, we simply repeat the function until we have 6 unique
values. Note that if we choose to use Excel to generate random numbers,
the SIC codes will need to be consecutively numbered from 1 to 140; at
present they are not consecutive and range from 201 to 399.
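The Excel approach above can also be sketched in Python. This is only an illustration, assuming the SIC codes have been renumbered 1 to 140; the seed and the variable names are arbitrary choices, not part of the text. Unlike repeated calls to =RANDBETWEEN, `random.sample` draws without replacement, so no duplicates need to be redrawn.

```python
import random

N_CODES = 140      # SIC codes renumbered 1..140 (assumption from the text)
SAMPLE_SIZE = 6    # number of SIC codes to select

# A fixed seed only makes the illustration repeatable; any seed works.
rng = random.Random(42)

# random.sample draws without replacement, so all 6 positions are unique.
positions = sorted(rng.sample(range(1, N_CODES + 1), SAMPLE_SIZE))
print(positions)
```

Each position would then be mapped back to the SIC code listed at that position in the database.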
Suppose we want to take a systematic sample of size 10 (n = 10)
from a population of 140 SIC code numbers (N = 140). We can use the
systematic sampling formula to determine the sampling cycle, k:

$$k = \frac{N}{n} = \frac{140}{10} = 14$$

Based on these results, we would select every 14th SIC code. To
determine where to start (or what our first value should be), we select
a random number between 01 and 14 and let that be the starting
point. For example, if we randomly select the value 08, then we have
selected the eighth SIC code as our first member of the sample. Our
next value is determined by moving down 14 values. Thus, 08 + 14 = 22,
so the 22nd value is our next selection. The third value would be the 22
+ 14 = 36th value, and so on. Using the manufacturing database, we
find the 8th value is SIC code 208, the 22nd value is SIC code 229, and
the 36th value is SIC code 245.
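The systematic selection described above can be sketched as follows. The function name and parameters are hypothetical, introduced only for illustration; the sketch returns positions in the list (1-based), not the SIC codes themselves.

```python
import random

def systematic_positions(population_size, sample_size, start=None, rng=random):
    """Return 1-based positions for a systematic sample (illustrative helper)."""
    k = population_size // sample_size   # sampling cycle: 140 // 10 = 14
    if start is None:
        start = rng.randint(1, k)        # random starting point between 1 and k
    return [start + i * k for i in range(sample_size)]

# Reproduce the example in the text, where the random start is 8.
positions = systematic_positions(140, 10, start=8)
print(positions)  # [8, 22, 36, 50, 64, 78, 92, 106, 120, 134]
```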
To stratify the dataset we could use the variable Value of Industry
Shipments which divides the companies into four groups according to
the number of dollars of value of their industry shipments. The bigger
the number, the higher the value of the shipments (see p. 14 of
chapter 1). The population could also be stratified by Industry Group.
There are twenty industry groups represented in these data. In a
study, it might be useful to have representation from each industry
group. One could stratify on other variables, but the strata would need
to be identified first. For example, Cost of Materials values span a
large range. A researcher could identify meaningful breaks in the
data or categories and further organize the data into these categories
which could then be used for stratifying.
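As a rough sketch of stratifying by Industry Group, the code below groups records by stratum and draws the same number from each. The records, the helper name, and the seed are hypothetical placeholders, not values taken from the manufacturing database.

```python
import random
from collections import defaultdict

def stratified_sample(records, stratum_of, per_stratum, rng):
    """Draw up to per_stratum records from every stratum (illustrative helper)."""
    strata = defaultdict(list)
    for rec in records:
        strata[stratum_of(rec)].append(rec)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        take = min(per_stratum, len(members))
        sample.extend(rng.sample(members, take))  # random draw within the stratum
    return sample

# Hypothetical (SIC code, industry group) pairs, for illustration only.
records = [(201, 20), (202, 20), (203, 20),
           (221, 22), (222, 22),
           (241, 24), (242, 24), (243, 24)]

rng = random.Random(7)  # arbitrary seed
picked = stratified_sample(records, lambda r: r[1], 1, rng)
print(picked)
```

Every industry group present in the data contributes one record, which is the representation property the paragraph above describes.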
2. The mean annual food spending per household in the Consumer Food
database is $8,966.07 and the standard deviation is $3,125.01. Using
Excel's =RANDBETWEEN(1,200) function, a random sample of 32 numbers
between 1 and 200 was generated. Of course, students will have their
own random numbers, but these can be used as an example. The random
numbers are:
  4    9   11   20   22   39   44   51
 54   57   68   76   80   83   95  103
110  114  118  125  130  135  139  148
152  158  160  169  171  177  192  200
Using these random numbers, we select the annual food spending
associated with these households and they are:

$14,112  $ 6,771  $10,944  $ 3,681  $ 8,185  $11,358  $13,514  $ 2,630
$ 4,264  $ 9,863  $ 7,436  $ 9,908  $ 7,622  $ 9,559  $ 8,865  $ 3,875
$15,397  $ 8,694  $12,630  $ 8,444  $10,650  $12,529  $ 8,678  $12,300
$11,290  $ 8,056  $ 8,868  $ 8,642  $ 4,261  $11,869  $ 5,762  $ 8,456

The sample mean of these 32 values is $9,034.78.
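The sample mean can be checked with a few lines of Python. The values are the 32 spending figures listed above; the variable names are illustrative.

```python
from statistics import mean

# The 32 annual food spending values (in dollars) listed above.
spending = [
    14112, 6771, 10944, 3681, 8185, 11358, 13514, 2630,
    4264, 9863, 7436, 9908, 7622, 9559, 8865, 3875,
    15397, 8694, 12630, 8444, 10650, 12529, 8678, 12300,
    11290, 8056, 8868, 8642, 4261, 11869, 5762, 8456,
]

sample_mean = mean(spending)
print(len(spending), round(sample_mean, 2))  # 32 9034.78
```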

The z score for this sample mean, with n = 32, is:

$$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{9034.78 - 8966.07}{3125.01/\sqrt{32}} = \frac{68.71}{552.43} = 0.12$$

Looking up z = 0.12 in the standard normal distribution table results
in an area of .0478. To solve for the probability of obtaining a mean
this large or larger, subtract the table value from .5000:
.5000 - .0478 = .4522
Thus, there is a 45.22% chance of randomly selecting 32 families for
which the sample mean is greater than or equal to $9,034.78.
However, because the sample size is larger than 5% of the total
population size (n/N = 32/200 = 16%), the finite correction factor
should be used:

$$z = \frac{\bar{x} - \mu}{\frac{\sigma}{\sqrt{n}}\sqrt{\frac{N-n}{N-1}}} = \frac{9034.78 - 8966.07}{\frac{3125.01}{\sqrt{32}}\sqrt{\frac{200-32}{200-1}}} = \frac{68.71}{507.58} = 0.14$$

Looking up z = 0.14 in the standard normal distribution table results in
an area of .0557. To solve for the probability of obtaining a mean this
large or larger, subtract the table value from .5000:
.5000 - .0557 = .4443
Thus, there is a 44.43% chance of randomly selecting 32 families for
which the sample mean is greater than or equal to $9,034.78.

Note, the difference in the probability obtained using the finite
correction factor is .4522 - .4443 = .0079, or 0.79 percentage points.
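Both z scores can be recomputed as a check. This is a sketch using Python's standard library rather than a printed table, so the probabilities differ slightly from the table-based answers because z is not rounded to two decimals before the lookup. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 8966.07, 3125.01   # population mean and standard deviation
n, N = 32, 200                 # sample size and population size
x_bar = 9034.78                # sample mean from the 32 households

se = sigma / sqrt(n)                 # standard error of the mean
fpc = sqrt((N - n) / (N - 1))        # finite correction factor

z_plain = (x_bar - mu) / se          # without the correction
z_fpc = (x_bar - mu) / (se * fpc)    # with the correction

std_normal = NormalDist()            # standard normal: mean 0, sd 1
p_plain = 1 - std_normal.cdf(z_plain)   # upper-tail probability
p_fpc = 1 - std_normal.cdf(z_fpc)

print(round(z_plain, 2), round(z_fpc, 2))  # 0.12 0.14
print(round(p_plain, 4), round(p_fpc, 4))
```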
3. Examining the variable, Control, out of the 200 hospitals in the
database, 86 are category 2, nongovernment not-for-profit. The
proportion of these hospitals in the database is: p = 86/200 = .43.
If n = 500 hospitals are sampled, the probability that $\hat{p} > .45$
is computed as follows, where q = 1 - p = .57:

$$z = \frac{\hat{p} - p}{\sqrt{\frac{p \cdot q}{n}}} = \frac{.45 - .43}{\sqrt{\frac{(.43)(.57)}{500}}} = \frac{.02}{.0221} = 0.90$$

Looking up z = 0.90 in the standard normal distribution table results
in an area of .3159. To solve for the probability of obtaining a
proportion this large or larger, subtract the table value from .5000:
.5000 - .3159 = .1841
Assuming the proportion of nongovernment not-for-profit hospitals is
forty-three percent across the United States, there is an 18.41% chance
that in a sample of 500 hospitals forty-five percent or more would be
nongovernment not-for-profit hospitals.
If n = 100 hospitals are sampled, the probability that $\hat{p} < .40$
is computed as follows, where q = 1 - p = .57:

$$z = \frac{\hat{p} - p}{\sqrt{\frac{p \cdot q}{n}}} = \frac{.40 - .43}{\sqrt{\frac{(.43)(.57)}{100}}} = \frac{-.03}{.0495} = -0.61$$

Looking up z = -0.61 in the standard normal distribution table results
in an area of .2291. To solve for the probability of obtaining a
proportion smaller than .40, subtract the table value from .5000:
.5000 - .2291 = .2709
Assuming the proportion of nongovernment not-for-profit hospitals is
forty-three percent across the United States, there is a 27.09% chance
that in a sample of 100 hospitals less than forty percent would be
nongovernment not-for-profit hospitals.
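The two proportion calculations in this answer can be checked with a short sketch. The helper name is hypothetical, and the probabilities come from Python's standard normal CDF rather than a table, so they differ slightly from the table-based values because z is not rounded first.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z(p_hat, p, n):
    """z score for a sample proportion (illustrative helper)."""
    q = 1 - p
    return (p_hat - p) / sqrt(p * q / n)

p = 86 / 200                 # population proportion, .43
std_normal = NormalDist()    # standard normal: mean 0, sd 1

# n = 500: probability that the sample proportion exceeds .45 (upper tail)
z_500 = proportion_z(0.45, p, 500)
prob_above = 1 - std_normal.cdf(z_500)

# n = 100: probability that the sample proportion is below .40 (lower tail)
z_100 = proportion_z(0.40, p, 100)
prob_below = std_normal.cdf(z_100)

print(round(z_500, 2), round(z_100, 2))  # 0.9 -0.61
print(round(prob_above, 4), round(prob_below, 4))
```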
