3. Data Summarizing and Presentation
3.1 Introduction
$f_i^r = \frac{f_i}{\sum f_i}$, $f_i^r(\%) = \frac{f_i}{\sum f_i} \cdot 100$, with the property:
$\sum f_i^r = 1$, and $\sum f_i^r(\%) = 100$
Current number                    1  2  3  4  5  6  7  8  9 10 11 12 13 14
Number of persons in a household  4  4  4  4  4  4  4  4  4  4  4  4  3  4
Total 14 1 360°
A relative frequency expresses the weight of the absolute frequency in the
total and is computed as:
$f_i^r = \frac{f_i}{\sum f_i}$
where $f_i$ is the absolute frequency, $\sum f_i$ is the total population
size, and $f_i^r$ is the relative frequency.
[Pie chart: households by number of members. 3 members: 7%; 4 members: 93%]
Pie charts are more suggestive and easier to understand than tables.
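The relative frequencies behind the pie chart can be checked with a short Python sketch. It uses the 14 household sizes listed in the table above; the function name `relative_frequencies` is only an illustrative choice, not part of the original text.

```python
from collections import Counter

# Household sizes of the 14 households listed in the table above.
household_sizes = [4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 3, 4]


def relative_frequencies(values):
    """Return {value: (absolute, relative, percent)} for each distinct value."""
    counts = Counter(values)
    total = sum(counts.values())  # total population size, i.e. the sum of f_i
    return {
        value: (f, f / total, 100 * f / total)
        for value, f in sorted(counts.items())
    }


table = relative_frequencies(household_sizes)
for size, (f, f_rel, f_pct) in table.items():
    print(f"{size} members: f = {f}, f_r = {f_rel:.2f}, f_r(%) = {f_pct:.0f}%")
```

The relative frequencies add up to 1 and the percentages to 100, which is the property stated at the start of the chapter.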
Step 3: Compute the class size, k, if all the classes have the same size and
the variation is continuous: k = A/r, where A is the amplitude and r is the
number of classes.
Step 4: Construct the classes, by adding step by step the class size k starting
with the minimum value xmin (or a smaller value) or by subtracting the class
size step by step starting from the maximum value xmax (or from a larger
value).
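Steps 3 and 4 can be sketched in Python as follows. This is a minimal illustration that assumes the amplitude A is the range xmax - xmin and that the classes are built upwards from xmin; the function name `build_classes` and the values 5, 30 and 5 are hypothetical.

```python
def build_classes(x_min, x_max, r):
    """Split the interval from x_min to x_max into r classes of equal size."""
    if r <= 0:
        raise ValueError("r must be a positive number of classes")
    amplitude = x_max - x_min  # A, assumed here to be the range of the data
    k = amplitude / r          # Step 3: class size k = A / r
    # Step 4: add the class size k step by step, starting from x_min.
    return [(x_min + i * k, x_min + (i + 1) * k) for i in range(r)]


# Hypothetical example: values between 5 and 30 grouped into 5 classes.
classes = build_classes(5, 30, 5)
for lower, upper in classes:
    print(f"{lower:g} up to {upper:g}")
```

Starting from a value slightly below xmin, as Step 4 allows, only changes the first argument.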
Example:
The class marks of a distribution of the monthly average number of trips
made by an employee are 10.5, 18.5, 26.5, 34.5, and 42.5.
Find: a) the class boundaries; b) the class limits.
Solution:
a) The boundary between the first two classes is:
(10.5 + 18.5)/2 = 14.5
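The same midpoint rule can be applied to every pair of neighbouring class marks. The Python sketch below does this under two assumptions that the example does not state explicitly: all classes have the same width, and the data are recorded in whole units, so that each class limit lies half a unit inside the corresponding boundary. The function names are illustrative only.

```python
def class_boundaries(marks):
    """Boundaries are the midpoints between neighbouring class marks."""
    half_width = (marks[1] - marks[0]) / 2  # assumes equal class widths
    inner = [(a + b) / 2 for a, b in zip(marks, marks[1:])]
    return [marks[0] - half_width] + inner + [marks[-1] + half_width]


def class_limits(boundaries, unit=1):
    """Limits for data recorded to the nearest `unit` (an assumption)."""
    return [
        (lower + unit / 2, upper - unit / 2)
        for lower, upper in zip(boundaries, boundaries[1:])
    ]


marks = [10.5, 18.5, 26.5, 34.5, 42.5]
boundaries = class_boundaries(marks)
limits = class_limits(boundaries)
print(boundaries)
print(limits)
```

Under these assumptions, the first inner boundary reproduces the value 14.5 computed above.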
Statistics for Business Administration
There is a basic distinction between data that are discrete and data that
are continuous. A data set is discrete if it results from a count, for
example the number of people in a room or the number of cars sold last month.
There are some exceptions: money, for example, can be seen as discrete
since it changes hands in increments, but it is usually treated as continuous
because the increments can be relatively small. Age is a continuous variable,
but when quoted as age at last birthday it becomes discrete.
The counts of cars sold by model and region in Table 3-1 below
represent discrete data.
Numbers of four types of cars sold by region during the last financial year
Table 3-1
Model   North  South  East  West  Total
Sport     675     60    35    20    790
Sedan      30    490    30    20    570
Break     150    180   235    15    580
Van         5     20     0    35     60
Total     860    750   300    90   2000
Percentage unit sales of four types of cars by region during the last
financial year
Table 3-2
Model    North   South    East    West
Sport    78.49    8.00   11.67   22.22
Sedan     3.49   65.33   10.00   22.22
Break    17.44   24.00   78.33   16.67
Van       0.58    2.67    0.00   38.89
Total   100.00  100.00  100.00  100.00
It can be seen that Sport cars account for 78.49% of unit sales in the North
and only 8.00% of unit sales in the South. To calculate these percentages, we
first find the fraction and then multiply by 100. Sport cars, for example,
account for 675 of the 860 sales achieved in the northern sales region:
675 / 860 = 0.7849, or 78.49% when expressed as a percentage. The original
totals are often given as base figures.
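The calculation above can be repeated for every cell of Table 3-1. The following Python sketch divides each count by its regional (column) total and multiplies by 100; the dictionary layout and the function name `column_percentages` are illustrative choices rather than anything prescribed by the text.

```python
# Unit sales from Table 3-1, by model and region.
sales = {
    "Sport": {"North": 675, "South": 60, "East": 35, "West": 20},
    "Sedan": {"North": 30, "South": 490, "East": 30, "West": 20},
    "Break": {"North": 150, "South": 180, "East": 235, "West": 15},
    "Van": {"North": 5, "South": 20, "East": 0, "West": 35},
}


def column_percentages(table):
    """Express each cell as a percentage of its column (region) total."""
    regions = list(next(iter(table.values())))
    totals = {r: sum(row[r] for row in table.values()) for r in regions}
    return {
        model: {r: 100 * row[r] / totals[r] for r in regions}
        for model, row in table.items()
    }


percentages = column_percentages(sales)
for model, row in percentages.items():
    print(model, " ".join(f"{value:6.2f}" for value in row.values()))
```

Rounded to two decimals, the output agrees with the figures in Table 3-2.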
These take a similar form to their numeric counterparts, except that the
groups (or classes) describe qualitative (i.e. non-numeric) characteristics of
the data. For example:
Employee distribution
Table 3-3
The name given to the data describing the values of some variable over
successive time periods is a time series. For example, the data given in
Tables 3-5 and 3-6 are typical.
Production evolution
Table 3-5
The data in table 3-8, although similar in form to that of table 3-7, have
classifications for each month, which cannot be added to form meaningful
totals. They are sometimes called multiple time series, since the data given
for each type of ice-cream is a separate time series.
[Table columns: price of Vanilla ice-cream, Chocolate, Fruits]
The types of diagrams (i.e. charts and graphs) described in this chapter can
be conveniently classified under three headings as follows:
3.4.1 Pictograms
Appropriate pictures replace bars to show the comparison. Whilst this is more
eye-catching, it is considerably less accurate and may be misleading.
(In Figure 1.b, how much does a fraction of a person represent? Half, or 45%?)
Features of pictograms
a) Pictograms are sometimes referred to as ideograms.
b) The symbols are normally duplicated and, for the sake of accuracy, the
numeric values being represented are sometimes shown. A scaled axis
can be included.
c) An alternative to duplicating the symbol is to magnify it. For example,
Figure 2 represents different numbers of trucks by the area of the truck
symbol and Figure 3 represents an increase in sales of detergent (in kg)
by the volume of the two detergent boxes.
d) Advantages:
- Easy to understand for a non-sophisticated audience; extremely suggestive.
e) Disadvantages:
- Can be awkward to construct if complex symbols are used.
- Not accurate enough for serious statistical presentation.
A simple bar chart consists of a set of bars separated by gaps (the bars are
not joined). A separate bar for each class is drawn to a height
proportional to the class frequency. The widths of the bars are always the
same and, if desired, each bar can be shaded or colored differently. Simple
bar charts can represent non-numeric frequency distributions and time
series equally well.
[Figures: bar charts with legend Sedan, Sport, Break, Van]
Figure 7 depicts a percentage bar chart. Percentage bar charts are used
where relative comparisons between components are important. The
disadvantage is that actual figures, including class totals, are not shown
in the graph.
Multiple bar charts
These have sets of bars, each bar representing a single constituent part of
the total. Within each set, the bars are physically joined and always arranged
in the same sequence. Sets of bars should be separated.
Figure 10 The breakdown of an employee's monthly pay, with each sector exploded.
d) Advantages:
- A dramatic and appealing way of presenting data.
- Good for comparing classes in relative terms.
e) Disadvantages:
- Laborious to compile. Circles should not be drawn by hand and sectors
should be drawn using a protractor. However, without a protractor (once
the size of each sector has been determined), their physical size within the
circle can be intelligently guessed at.
- Can be untidy if there are many classes (say 8 or more) and different
shadings or colorings are being used.
Multiple pie-charts
Multiple pie charts can be used as alternatives to percentage bar charts; that
is, a pie chart (360 degrees) replaces a bar (100%) for each class or year.
For example, Table 3-11 represents the skills classifications of the
workforce at two factories. (Note that the degrees figures in Table 3-11 can
be obtained by multiplying each percentage by 3.6.)
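The percentage-to-degrees rule can be illustrated with a short Python sketch. Since the contents of Table 3-11 are not reproduced here, the example borrows the model totals from Table 3-1 instead; the function name `sector_angles` is an illustrative choice.

```python
def sector_angles(frequencies):
    """Convert absolute frequencies to pie-chart sector angles in degrees."""
    total = sum(frequencies.values())
    angles = {}
    for label, f in frequencies.items():
        percent = 100 * f / total
        angles[label] = percent * 3.6  # 100% corresponds to 360 degrees
    return angles


# Totals by model taken from Table 3-1 (used here only as an illustration).
model_totals = {"Sport": 790, "Sedan": 570, "Break": 580, "Van": 60}
angles = sector_angles(model_totals)
for label, angle in angles.items():
    print(f"{label}: {angle:.1f} degrees")
```

The angles always add up to 360 degrees, which is a quick check before drawing the chart.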
Table 3-11
Comments on the situation shown by Fig. 11: At both factories, about 20%
of the workforce is semi-skilled. However, whereas unskilled workers
account for only 20% of the workforce of factory A, they constitute
about 35% of factory B's employed workforce.
The advantage of using multiple pie charts as opposed to a percentage bar
chart is mainly visual impact; they are generally felt to be more attractive.
However, their construction is more involved and this is considered a
major disadvantage. Most people prefer to work out percentages and draw
straight-line bars rather than calculate the degrees of sectors and draw
circles.
We calculate the size of the sectors for both circles using the method
described above.
3.4.4 Histograms
[Figure: histogram of relative frequency (vertical axis, 0 to 0.4) against
defective items (horizontal axis, 10 to 30)]
Table 3-12
3.4.6 Z-charts
a) The first line diagram drawn describes the actual time series values.
Normally, the time series consists of monthly measurements over one
complete year, i.e. Jan, Feb, through to Dec.
b) The second line diagram drawn describes the accumulated time series
values. For monthly measurements, the first plot will coincide with the
first plot of the diagram in a), i.e. the January figure. The second plot
will be January plus February, the third, January plus February plus March,
and so on. This diagram is useful for charting monthly progress towards an
annual total. The further it departs from a straight line, the more
variation there has been in the actual monthly figures.
Production evolution
Table 3-13
The third line diagram is drawn such that each point describes the current
month's figure plus the previous eleven months' figures, to form a
Comment on the situation shown by Figure 15: Production in the 2nd year
was relatively constant, with a slight drop in the summer months. The
long-term trend shows a drop in overall production.
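The three lines of a Z-chart described in this section can be computed as in the Python sketch below. The monthly figures are hypothetical (the data of Table 3-13 are not reproduced here), and the function name `z_chart_series` is an illustrative choice. The third series, the sum of the current month and the previous eleven months, is what is commonly called a moving annual total.

```python
def z_chart_series(previous_year, current_year):
    """Return the actual, cumulative and 12-month moving-total series."""
    if len(previous_year) != 12 or len(current_year) != 12:
        raise ValueError("both years must contain 12 monthly figures")

    # Line 2: accumulated values within the current year.
    cumulative = []
    running = 0
    for value in current_year:
        running += value
        cumulative.append(running)

    # Line 3: current month plus the previous eleven months.
    combined = previous_year + current_year
    moving_total = [sum(combined[i + 1:i + 13]) for i in range(12)]

    # Line 1 is simply the actual monthly figures of the current year.
    return list(current_year), cumulative, moving_total


# Hypothetical monthly production figures for two consecutive years.
previous = [10] * 12
current = [12] * 12
actual, cumulative, moving_total = z_chart_series(previous, current)
print(cumulative[-1], moving_total[-1])
```

In December the cumulative line and the moving-total line meet, which is what closes the "Z" shape.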
Figure 16 GANTT chart drawn up for the data of Table 3-13. The comments shown
on the chart are for information purposes and may or may not be included
on an actual chart.
For example, suppose the yearly demand for a new technological product
was estimated as 2000 in year 1. Table 3-15 shows the difference between
an actual increase of 500 per year and a relative increase of 25% per year,
and Figure 17 shows these values plotted using line diagrams.
Table 3-15
Figure 17 a and b. Demand evolution forecast
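The two growth patterns can be generated with a few lines of Python. This is a sketch only: the starting demand of 2000, the step of 500 and the rate of 25% come from the text, while the five-year horizon and the function names are assumptions made for illustration.

```python
def absolute_growth(start, step, years):
    """Demand that grows by a fixed amount each year."""
    return [start + step * t for t in range(years)]


def relative_growth(start, rate, years):
    """Demand that grows by a fixed percentage each year."""
    return [start * (1 + rate) ** t for t in range(years)]


YEARS = 5  # assumed horizon; the text does not state how many years are shown
by_amount = absolute_growth(2000, 500, YEARS)
by_rate = relative_growth(2000, 0.25, YEARS)
for year, (a, b) in enumerate(zip(by_amount, by_rate), start=1):
    print(f"year {year}: +500 per year -> {a}, +25% per year -> {b:.2f}")
```

Both series agree in years 1 and 2 and then diverge, with the constant-percentage series growing faster each year.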
The above logarithms against year are displayed in Figure 18. Notice that
the points form a perfect straight line (which was expected, since the values
were calculated using a constant 25% increase.)
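Why the points fall on a straight line can be verified numerically. The Python sketch below takes base-10 logarithms of a demand series growing by 25% per year and checks that consecutive differences are constant; the five-year horizon and the choice of base 10 are assumptions, since the text does not specify them.

```python
import math

# Demand growing by a constant 25% per year from 2000 (five years assumed).
demand = [2000 * 1.25 ** t for t in range(5)]
logs = [math.log10(value) for value in demand]

# On a straight line, the step between consecutive points is constant.
steps = [later - earlier for earlier, later in zip(logs, logs[1:])]
for year, (value, log_value) in enumerate(zip(demand, logs), start=1):
    print(f"year {year}: demand = {value:.2f}, log10 = {log_value:.4f}")
print("steps:", [round(step, 4) for step in steps])
```

Each step equals the logarithm of 1.25, so the slope of the line reflects the rate of increase rather than the level of demand.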
b) The larger the rate of increase of the time series data, the steeper the
(semi-logarithmic) line will be. This fact enables the rates of increase of
two or more time series to be compared on the same set of axes. Note
that when comparing time series data in this way, it is only the steepness
(or inclination) of the lines that is of any interest and not their positions.
Figures 19 and 20 show the significance of some standard shapes of
semi-logarithmic graphs.
a) All diagrams should be neat and attractive to look at. Always use graph
paper and a ruler.
b) Diagrams should be easy to read, without excessive detail.
c) Always try to locate the diagram centrally on the paper, using as much of
it as possible.
d) A general title must be given which describes what is being portrayed, but
it should be as brief as possible and to the point.
e) Axes, if used, should be clearly labeled, giving the units of the data and a
note of any break of scale.
f) Shading or coloring, if used, must be lightly done, as heavy shading may
detract from the presentation.
3.5 Exercises
Multiple choice exercises with answers
1. To display the variation of a qualitative variable we can use:
a. the histogram
b. the frequency polygon
c. the ogive of cumulated frequencies
d. the bar chart
e. no graphical presentation is suitable for qualitative data distributions
ANSWER: d.
4. The best type of chart for comparing two sets of categorical data is:
a. a line chart
b. a pie chart
c. a histogram
d. a bar chart
ANSWER: d
19 6 15 20 17 16 17 12 15
29 23 17 7 10 14 14 27 22
8 5 23 19 9 28 5
Total 25 1.00
Class contains observations up to but not including 10. The other classes are
defined similarly. This convention is used throughout the chapter.
b. Note that the numbers that appear along the horizontal axis represent the
upper limits of the class intervals even though they appear in the center of
the classes of the histogram.
[Figure: histogram of relative frequency (vertical axis, 0 to 0.4) against
defective items (horizontal axis, 10 to 30)]
c. The area under the histogram between two values is five times the relative
frequency of observations that fall between those two values (where 5 is the
width of each class). The total area under the histogram will be equal to 5.
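The relative frequency distribution and the area property in part c can be checked with the Python sketch below. It uses the 25 observations listed above and assumes that the first class starts at 5 (the smallest observation), with five classes of width 5, each including its lower limit and excluding its upper limit; the function name is an illustrative choice.

```python
# The 25 observations listed above.
defects = [19, 6, 15, 20, 17, 16, 17, 12, 15,
           29, 23, 17, 7, 10, 14, 14, 27, 22,
           8, 5, 23, 19, 9, 28, 5]


def relative_frequency_distribution(values, start, width, n_classes):
    """Count values in classes [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_classes
    for value in values:
        index = int((value - start) // width)
        if not 0 <= index < n_classes:
            raise ValueError(f"{value} falls outside the classes")
        counts[index] += 1
    n = len(values)
    return counts, [count / n for count in counts]


WIDTH = 5  # class width stated in part c
counts, proportions = relative_frequency_distribution(defects, 5, WIDTH, 5)
area = sum(WIDTH * p for p in proportions)  # total area under the histogram
for i, (count, p) in enumerate(zip(counts, proportions)):
    lower = 5 + i * WIDTH
    print(f"{lower} up to {lower + WIDTH}: f = {count}, f_r = {p:.2f}")
print("total area:", round(area, 2))
```

Because every bar has width 5 and the relative frequencies sum to 1, the total area comes out as 5, as stated in part c.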
a.
Ice-cream category  Frequency  Proportion
Vanilla                     7        0.32
Chocolate                   6        0.24
Fruit                      12        0.44
Fruit ice-cream has a stronger position on the Bucharest market than Vanilla
and Chocolate ice-cream.
b.
[Bar chart: frequency (vertical axis, 0 to 14) by ice-cream category
(Vanilla, Chocolate, Fruit)]
8. Identify the type of data for which each of the following graphs is
appropriate.
a. Histogram
b. Pie chart
c. Bar chart
9. For each of the following examples, identify the data type as either
qualitative, ranked, or quantitative, and specify the appropriate measurement
scale for each as either interval, nominal, or ordinal.
a. the letter grades received by students in a computer science class
b. the number of students in a statistics course
c. the starting salaries of new Ph.D. graduates from a statistics program
d. the size of fries (small, medium, large) ordered by a sample of Burger
King customers
e. the college (Arts and Science, Business, etc.) you are enrolled in
10. In its 2002 report, a company presented the following data regarding its
sales (in millions of dollars), net income (in millions of dollars), return
on equity (%), and net income per share (in dollars).