Unit II Question Bank With Hints and Answers

This document provides information about describing and summarizing data. It discusses frequency distributions including grouped, ungrouped, cumulative, and relative distributions. It also covers measures of central tendency like mean, median, and mode and measures of variability including range, variance, standard deviation, interquartile range, and z-scores. Degrees of freedom and the normal curve are also defined. Examples are provided to explain concepts like grouped frequency distributions, histograms, and calculating z-scores.

Uploaded by

akshaya vijay

Unit II Hints

Types of data

Frequency Distribution (Table)

Describing data with Averages (mean,median & mode)


Describing Variability (Range,variance & Std Deviation)

1) SS = ΣX² − (ΣX)² / N - Sum of squares (SS) for a population

2) SS = ΣX² − (ΣX)² / n - Sum of squares (SS) for a sample

3) σ = √(SS / N) - Standard deviation for a population

4) s = √(SS / (n − 1)) - Standard deviation for a sample

5) σ² = SS / N - Variance for a population

6) s² = SS / (n − 1) - Variance for a sample

7) z = (x – μ) / σ - Z-score
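
The formulas above can be sketched in a few lines of Python. This is only an illustration: the function names and the eight scores are made up for the example and do not come from the question bank.

```python
import math

def sum_of_squares(scores):
    """SS = sum of X squared minus (sum of X) squared divided by the number of scores."""
    n = len(scores)
    return sum(x * x for x in scores) - sum(scores) ** 2 / n

def population_sd(scores):
    """sigma = sqrt(SS / N)."""
    return math.sqrt(sum_of_squares(scores) / len(scores))

def sample_sd(scores):
    """s = sqrt(SS / (n - 1))."""
    return math.sqrt(sum_of_squares(scores) / (len(scores) - 1))

def z_score(x, mean, sd):
    """z = (x - mean) / sd."""
    return (x - mean) / sd

# Hypothetical scores chosen so the arithmetic is easy to follow.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(scores) / len(scores)   # 40 / 8 = 5.0
ss = sum_of_squares(scores)        # 232 - 1600 / 8 = 32.0
sigma = population_sd(scores)      # sqrt(32 / 8) = 2.0

print("SS =", ss)
print("population variance =", ss / len(scores))
print("population sd =", sigma)
print("sample sd =", round(sample_sd(scores), 3))
print("z for a score of 9 =", z_score(9, mean, sigma))
```

Note that the only difference between the population and sample versions is the divisor: N for a population, n − 1 for a sample.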
UNIT II

PART A

1. What is Frequency distribution?


Frequency distribution is a collection of observations produced by sorting observations into classes and showing
their frequency (f) of occurrence in each class. Frequency distributions are visual displays that organise and present
frequency counts so that the information can be interpreted more easily. Frequency distributions can show absolute
frequencies or relative frequencies, such as proportions or percentages.

2. What are the types and uses of Frequency distributions?


Types of Frequency distributions:
 Grouped frequency distribution.
 Ungrouped frequency distribution.
 Cumulative frequency distribution.
 Relative frequency distribution.
 Relative cumulative frequency distribution.

Uses:
A frequency distribution helps us to detect any pattern in the data (assuming a pattern exists) by superimposing
some order on the inevitable variability among observations.

3. What is Grouped frequency distribution?


When observations are sorted into classes of more than one value, the result is referred to as a frequency
distribution for grouped data. The classes in a grouped frequency distribution are called class intervals, and the
smallest and largest values of each interval are its class limits.
For example, the marks obtained by 20 students in a test are as follows: 5, 10, 20, 15, 5, 20, 20, 15, 15, 15, 10, 10, 10, 20,
15, 5, 18, 18, 18, 18. To arrange the data in a grouped table we have to make class intervals. Thus, we will make class
intervals of marks like 0 – 5, 6 – 10, and so on.

Marks obtained in Test (class intervals)    No. of Students (Frequency)

0 – 5                                       3

6 – 10                                      4

11 – 15                                     5

16 – 20                                     8

Total                                       20
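
As a rough check on the grouped table, the counting can be sketched in Python. The variable names are illustrative; the marks and class intervals are the ones given in the example.

```python
# Marks of the 20 students from the example.
marks = [5, 10, 20, 15, 5, 20, 20, 15, 15, 15,
         10, 10, 10, 20, 15, 5, 18, 18, 18, 18]

# Class intervals written as (lower limit, upper limit), both inclusive.
class_intervals = [(0, 5), (6, 10), (11, 15), (16, 20)]

# Count how many marks fall inside each interval.
grouped = {
    (low, high): sum(low <= m <= high for m in marks)
    for low, high in class_intervals
}

for (low, high), f in grouped.items():
    print(f"{low:>2} - {high:<2}  {f}")
print("Total  ", sum(grouped.values()))
```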

4. What is ungrouped frequency distribution?

When observations are sorted into classes of single values, the result is referred to as a frequency distribution for
ungrouped data. In the ungrouped frequency distribution table, we don't make class intervals; we record the exact
frequency of each individual value.

Marks obtained in Test No. of Students

5 3

10 4

15 5

18 4

20 4

Total 20

5. What is cumulative frequency distribution?


Cumulative frequency distributions show the total number of observations in each class and in all lower-ranked
classes. This type of distribution can be used effectively with sets of scores, such as test scores for intellectual or
academic aptitude, when relative standing within the distribution assumes primary importance.

6. What is relative frequency distribution?


Relative frequency distributions show the frequency of each class as a part or fraction of the total frequency for the
entire distribution. To create a relative frequency distribution table, take the count in a row and divide it by the
total count. For example, if a table of students by grade level shows 23 first-grade students out of 88 students in
all, the relative frequency for the first grade is 23/88 = 0.261, or 26.1%.

Define percentile rank.


The percentile rank of a score indicates the percentage of scores in the entire distribution with similar or smaller
values than that score. A percentile rank indicates how well a student performed in comparison to the students in
the specific norm group, for example, in the same grade and subject. A student's percentile rank of P indicates that
the student scored as well as, or better than, P percent of the students in the norm group.
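
Cumulative frequencies and percentile ranks are closely linked, which a short Python sketch can show. It reuses the ungrouped marks table from question 4 and assumes the definition given above (the percentage of scores with the same or a smaller value); the variable names are illustrative.

```python
from itertools import accumulate

# Ungrouped frequency table of the test marks (value -> frequency).
values = [5, 10, 15, 18, 20]
freqs = [3, 4, 5, 4, 4]
total = sum(freqs)

# Cumulative frequency: observations in this class and all lower-ranked classes.
cumulative = list(accumulate(freqs))

# Percentile rank: cumulative frequency expressed as a percentage of the total.
percentile_rank = [100 * c / total for c in cumulative]

print("Marks  f  cum f  percentile rank")
for v, f, c, p in zip(values, freqs, cumulative, percentile_rank):
    print(f"{v:>5} {f:>2} {c:>6} {p:>10.0f}")
```

For instance, a mark of 15 has a cumulative frequency of 12 out of 20, so its percentile rank is 60.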

7. What is a histogram?
A histogram is the most commonly used graph to show frequency distributions. It is a bar-type graph for
quantitative data. The common boundaries between adjacent bars emphasize the continuity of the data, as with
continuous variables.

8. Explain any three Features of histograms?


Equal units along the horizontal axis (the X axis, or abscissa) reflect the various class intervals of the frequency
distribution. Equal units along the vertical axis (the Y axis, or ordinate) reflect increases in frequency. (The units along
the vertical axis do not have to be the same width as those along the horizontal axis.) The body of the histogram
consists of a series of bars whose heights reflect the frequencies for the various classes.
9. What is Frequency Polygon?
Frequency Polygon is a line graph for quantitative data that also emphasizes the continuity of continuous variables.
Frequency polygons may be constructed directly from frequency distributions.

10. What is mean?


The mean is the most widely used measure of central tendency, alongside the median and the mode. It is the
arithmetic average of a collection of numbers. The mean is found by adding all scores and then dividing by the
number of scores.
MEAN = SUM OF ALL SCORES / NUMBER OF SCORES

11. What is median?


The median reflects the middle value when observations are ordered from least to most. The median splits a set of
ordered observations into two equal parts, the upper and lower halves. In other words, the median has a percentile rank
of 50, since observations with equal or smaller values constitute 50 percent of the entire distribution.

12. What is mode?


The mode reflects the value of the most frequently occurring score, that is, the number that occurs the highest
number of times. It is easy to assign a value to the mode if the data are organized. Example:
The mode of {4, 2, 4, 3, 2, 2} is 2 because it occurs three times, which is more than any other number.

13. What if a distribution has more than one mode or no mode at all?
Distributions can have more than one mode (or no mode at all). Distributions with two obvious peaks, even
though they are not exactly the same height, are referred to as bimodal. Distributions with more than two peaks are
referred to as multimodal. The presence of more than one mode might reflect important differences among subsets
of data.

14. Explain Range, variance and standard deviation?


Range: The range is the difference between the largest and smallest scores.
Variance: The variance is the mean of all squared deviations from the mean (σ² = SS/N for a population, s² = SS/(n − 1)
for a sample).
Standard deviation: Standard deviation is a measure of the amount of variation or dispersion of a set of values. A low
standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation
indicates that the values are spread out over a wider range.
standard deviation = √variance

15. What is DEGREES OF FREEDOM (df)?


Degrees of freedom (df) refers to the number of values that are free to vary, given one or more mathematical
restrictions, in a sample being used to estimate a population characteristic. The degrees of freedom (DF) in
statistics indicate the number of independent values that can vary in an analysis without breaking any
constraints.

16. What is INTERQUARTILE RANGE (IQR)?


The interquartile range (IQR) is simply the range for the middle 50 percent of the scores. More specifically, the IQR
equals the distance between the third quartile (or 75th percentile) and the first quartile (or 25th percentile).
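
A minimal Python sketch of the IQR follows, using the standard library's statistics.quantiles. The nine scores are hypothetical, and the "inclusive" method is just one of several quartile conventions, so a textbook that uses a different convention may report slightly different quartiles for the same data.

```python
import statistics

# Hypothetical scores, already sorted for readability.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# n=4 splits the data into quartiles and returns the three cut points Q1, Q2, Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4, method="inclusive")

iqr = q3 - q1  # distance between the third and first quartiles

print("Q1 =", q1, "Q3 =", q3, "IQR =", iqr)
```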

17. Define Normal curve and its properties.


The normal curve is a theoretical curve defined for a continuous variable, and noted for its symmetrical bell-shaped
form.
Properties of Normal curve:
The normal curve is symmetrical: its lower half is the mirror image of its upper half. Being bell shaped, the normal
curve peaks above a point midway along the horizontal spread and then tapers off gradually in either direction from the
peak (without actually touching the horizontal axis, since, in theory, the tails of a normal curve extend infinitely far).

18. What is Z-score?


A z score is a unit-free, standardized score that, regardless of the original units of measurement, indicates how many
standard deviations a score is above or below the mean of its distribution.
z = (x – μ) / σ
where x is the original score and μ and σ are the mean and the standard deviation, respectively.
PART B

1. Explain the different types of frequency distribution with example.

A frequency distribution, also known as a frequency table, is a visual depiction of the frequency of certain events
in a particular set of values.
Here is a list of scores from a 5-point rating scale: 1, 5, 4, 5, 3, 2, 3, 2, 5, 5, 3, 4, 3, 3, 4, 5, 5, 5, 3, 4
To summarize these scores in a frequency distribution table, make two columns. Label the left column X,
representing the scores, and the right column f, representing the frequency.
To get the frequencies, arrange the scores in ascending or descending order on the left, then enter the
frequency of each score on the right.
X f
5 7
4 4
3 6
2 2
1 1
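
The tallying step above can be sketched with Python's collections.Counter, using the same 20 rating-scale scores. The variable names are illustrative.

```python
from collections import Counter

# Scores from the 5-point rating scale in the example.
scores = [1, 5, 4, 5, 3, 2, 3, 2, 5, 5, 3, 4, 3, 3, 4, 5, 5, 5, 3, 4]

# Counter maps each distinct score X to its frequency f.
freq = Counter(scores)

print("X f")
for x in sorted(freq, reverse=True):  # descending order, as in the table
    print(x, freq[x])
```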

Frequency distribution gives a clear picture of the distribution of values. By organizing data in a
distribution table, researchers can identify impossible values and the location of scores in a distribution.
A frequency distribution shows how high or low measurements are.
Types of Frequency Distributions
There are three types of frequency distribution:
 Categorical frequency distribution.
 Grouped frequency distribution.
 Ungrouped frequency distribution
Categorical frequency distribution
Categorical frequency distribution is the distribution frequency of classifiable values such as blood
type or educational level.
Here is an example of a categorical frequency distribution table:
X = Blood type    f    Relative frequency

A                 7    0.35 or 35%

B                 4    0.20 or 20%

AB                6    0.30 or 30%

O                 2    0.10 or 10%

A+                1    0.05 or 5%
In a frequency distribution, researchers can also compute relative frequencies.
Relative frequency: shows how often a score occurs relative to the total frequency in a distribution table. To
get the relative frequency of a score in a frequency distribution, divide the score's frequency by the total
frequency.
To find the relative frequency of the first row, divide 7 by 20 (the total number of outcomes), which is
equal to 0.35 or 35%.
Frequency distributions can also include cumulative relative frequencies.
Cumulative relative frequency: the running sum of relative frequencies in a distribution table. To find
the cumulative relative frequency of a score in a frequency distribution, add its relative frequency
to all the relative frequencies in the rows above it.
X = Blood type    f    Relative frequency    Cumulative relative frequency

A                 7    0.35 or 35%           0.35

B                 4    0.20 or 20%           0.35 + 0.20 = 0.55

AB                6    0.30 or 30%           0.55 + 0.30 = 0.85

O                 2    0.10 or 10%           0.85 + 0.10 = 0.95

A+                1    0.05 or 5%            0.95 + 0.05 = 1.00
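
The relative and cumulative relative frequencies in the table can be reproduced with a short Python sketch. The counts are taken from the table; the variable names are illustrative. The cumulative values are computed from running counts rather than by adding decimals, which is a design choice made here to avoid floating-point rounding drift.

```python
from itertools import accumulate

# Category -> frequency, in the row order used by the table.
blood_types = {"A": 7, "B": 4, "AB": 6, "O": 2, "A+": 1}
total = sum(blood_types.values())  # 20

# Relative frequency: each frequency divided by the total.
relative = {k: f / total for k, f in blood_types.items()}

# Cumulative relative frequency: running count divided by the total.
running_counts = accumulate(blood_types.values())
cumulative_relative = {k: c / total for k, c in zip(blood_types, running_counts)}

for k in blood_types:
    print(f"{k:<3} {blood_types[k]:>2}  {relative[k]:.2f}  {cumulative_relative[k]:.2f}")
```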

Grouped frequency distribution

Grouped frequency distribution is the frequency distribution of grouped data. The groups, called class
intervals, appear as number ranges in the distribution table. Grouped frequency distributions are
ideal for large amounts of data.

Here are a few guidelines for grouped frequency distributions:

 Generally, grouped frequency distributions should have about 10 class intervals.
 Ensure that the class interval width is a simple number.
 The bottom score of each class interval should be a multiple of the width.
 Each score should belong to only one class interval.

A Math teacher listed the grades of her 25 students as follows:


98, 90, 84, 92, 76, 87, 95, 83, 79, 80, 91, 94, 88, 75, 85, 84, 79, 96, 81, 75, 82, 89, 93, 97, 90
Let's organize these grades in a frequency distribution. The highest score (H) is 98, and the lowest
score (L) is 75.
To identify the number of rows an ungrouped frequency distribution would need, use the following formula:
number of rows = (H − L) + 1
(98 − 75) + 1 = 23 + 1 = 24 rows
Twenty-four rows are too many, so we group the scores. With three as the interval width, there will be
a total of 8 intervals in the frequency distribution (24/3 = 8). An interval width of 3 means that each
interval covers three values.
The lowest interval starts at the lowest score, 75, and covers 75, 76, 77.

Class interval: 75–77


X f

96–98 3

93–95 3

90–92 4

87–89 3

84–86 3

81–83 3

78–80 3

75–77 3
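
The grouping procedure above (count the rows, pick a width, build the intervals, tally) can be sketched in Python. The grades and the width of 3 come from the example; the variable names are illustrative.

```python
# Grades of the 25 students from the example.
grades = [98, 90, 84, 92, 76, 87, 95, 83, 79, 80, 91, 94, 88,
          75, 85, 84, 79, 96, 81, 75, 82, 89, 93, 97, 90]

low, high = min(grades), max(grades)   # 75 and 98
rows = (high - low) + 1                # 24 rows if left ungrouped
width = 3                              # chosen interval width
n_intervals = rows // width            # 24 / 3 = 8 intervals

# Build each interval from its bottom score and count the grades inside it.
grouped = {}
for start in range(low, high + 1, width):
    end = start + width - 1
    grouped[(start, end)] = sum(start <= g <= end for g in grades)

print("X      f")
for (start, end), f in sorted(grouped.items(), reverse=True):
    print(f"{start}-{end}  {f}")
```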
Ungrouped frequency distribution

Ungrouped frequency distribution is the distribution frequency of ungrouped data listed as individual
values in a distribution table. This type of frequency distribution is ideal for a small set of values.

X f

7 1

6 2

5 1

4 3

3 2

2 4

1 3
In this frequency distribution, X represents the number of children in a household, and f is the number
of families with that number of children. Here, we can see that four homes have two children, and one
has seven children.

2. Describe Mean, Median, Mode and Averages with example.

Central tendency refers to identifying the central position of the given data set. Central tendency has
3 important measures that are Mean, Median, and Mode.

Mean: The mean, here the arithmetic mean, is the most used measure of central tendency. A mean can be
obtained for both discrete and continuous data sets. To obtain the mean of a set of numbers, sum up all
the numbers and then divide by how many numbers there are. In short, it is the average of all the
numbers, or the ratio of the sum of all observations to the total number of observations.
Median: When you arrange a given set of numbers from smallest to biggest, the middle number is said to
be the median. The data can be arranged in either ascending or descending order; the middle value is
the same either way. (The geometric term "median", a segment joining a vertex of a triangle to the
midpoint of the opposite side, is unrelated.)
Mode: The most repeated number in a given set of observations is the mode, that is, the number or value
which has the highest frequency in a given series of numbers. It is also known as the modal value. It is
one of the three measures of central tendency, along with the median and the mean. It is the highest bar
if the data are presented in a histogram or a chart. If there is no repeated number in a given series,
then no mode exists for that series.

Table of differences between Mean, Median, and Mode


All these measures of central tendencies are co-related. They share an empirical relationship but are
different from each other. Here are the differences:

S.No 1
Mean: The average of the given observations is called the mean.
Median: The middle number in a given set of observations is called the median.
Mode: The most frequently occurring number in a given set of observations is called the mode.

S.No 2
Mean: Add up all the numbers and divide by the total number of terms.
Median: Place all the numbers in ascending or descending order.
Mode: The mode is found by counting how often each number occurs in the series.

S.No 3
Mean: Once the above step is finished, what we get is the mean.
Median: After arranging everything from smallest to biggest, take out the middle number, which is the median.
Mode: There can be one mode or more than one. It is also possible to have no mode at all.

S.No 4
Mean: The mean is the arithmetic mean, which can be a simple average or a weighted average.
Median: When the series has an even number of terms, the median is the simple average of the middle pair of numbers.
Mode: If every value in the data set is unique, there is no mode at all.

S.No 5
Mean: When data are normally distributed, the mean is widely preferred.
Median: When the data distribution is skewed, the median is the best representative.
Mode: When the data are nominal (categorical), the mode is preferred.

S.No 6
Mean: x̄ = ∑x / N
Median: If the total number of observations (n) is odd, then
Median = ((n + 1)/2)th observation
If the total number of observations (n) is even, then
Median = [(n/2)th observation + ((n/2) + 1)th observation] / 2
Mode: The mode is the most frequently occurring observation or value.

For Example
We have a set of numbers: 4, 8, 2, 1, 1, 4, 3, 1. Find the mean, median, and mode.
Solution:
Mean:
4 + 8 + 2 + 1 + 1 + 4 + 3 + 1 = 24 and 24/8 = 3
Median:
(2 + 3)/2 = 2.5 (after arranging the numbers in ascending order as 1, 1, 1, 2, 3, 4, 4, 8; the middle terms
are 2 and 3 because the total number of terms, 8, is even)
Mode:
1, because it occurs 3 times in the sequence, more often than any other value
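
The worked example can be cross-checked with Python's built-in statistics module. This is only a verification sketch; the data are the eight numbers from the example.

```python
import statistics

numbers = [4, 8, 2, 1, 1, 4, 3, 1]

mean = statistics.mean(numbers)        # sum of all values / number of values
median = statistics.median(numbers)    # average of the two middle values when n is even
modes = statistics.multimode(numbers)  # list of every value tied for the highest frequency

print("sorted:", sorted(numbers))
print("mean =", mean)
print("median =", median)
print("mode(s) =", modes)
```

Using multimode rather than mode returns a list, so a bimodal data set would show both modes instead of only the first one encountered.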

3. Construct a frequency distribution for the following Grouped data


Specify the real limits for the lowest class interval in this frequency distribution
Answer on page no 420 and 421 in text book “Statistics”

3. Analyze how graphs can be used to represent qualitative and quantitative data.
There are two types of data that we can collect:
 Qualitative data describes a subject in words or categories, and cannot be expressed as a number.
 Quantitative data describes a subject with a number (it can be quantified) that can be
analyzed. There are two types of quantitative data: continuous and discrete.

Graphs

There are many types, including:


1. Pie charts and bar graphs are used for qualitative data
2. Histograms (similar to bar graphs) are used for quantitative data
3. Line graphs are used for quantitative data
4. Scatter graphs are used for quantitative data
Graphs should contain:
 A descriptive title below the graph or chart
 A caption below the title (optional)
 Axes labelled with the name of variable, units (if applicable) and the variable intervals; intervals must
be spaced according to scale
 A legend to indicate which data points belong to which set of data, if more than one data set is
displayed

Graphs for Qualitative Data

Qualitative data are words describing a characteristic of the individual. There are several different
graphs that are used for qualitative data. These graphs include bar graphs, Pareto charts, and pie charts.

Pie charts and bar graphs are the most common ways of displaying qualitative data. A spreadsheet
program like Excel can make both of them. The first step for either graph is to make a frequency or
relative frequency table. A frequency table is a summary of the data with counts of how often a data
value (or category) occurs.

Example
Suppose you have the following data on the type of car that students at a college drive:
Ford, Chevy, Honda, Toyota, Toyota, Nissan, Kia, Nissan, Chevy, Toyota, Honda, Chevy, Toyota, Nissan,
Ford, Toyota, Nissan, Mercedes, Chevy, Ford, Nissan, Toyota, Nissan, Ford, Chevy, Toyota, Nissan,
Honda, Porsche, Hyundai, Chevy, Chevy, Honda, Toyota, Chevy, Ford, Nissan, Toyota, Chevy, Honda,
Chevy, Saturn, Toyota, Chevy, Chevy, Nissan, Honda, Toyota, Toyota, Nissan
Step 1: Construct a frequency table and calculate the relative frequencies
Table 2.1.2: Relative Frequency Table for Type of Car Data

Category Frequency Relative Frequency


Ford 5 0.10

Chevy 12 0.24

Honda 6 0.12

Toyota 12 0.24

Nissan 10 0.20

Other 5 0.10

Total 50 1.00

The relative frequency of each category is just the frequency divided by the total. As an example, for the Ford
category:

relative frequency = 5/50 = 0.10
There are several different types of graphs that can be used: bar charts, pie charts, and Pareto charts.
Bar graphs or charts consist of the frequencies on one axis and the categories on the other axis. Then you
draw rectangles for each category with a height (if frequency is on the vertical axis) or length (if frequency
is on the horizontal axis) that is equal to the frequency. All of the rectangles should be the same width, and
there should be equal-width gaps between the bars.

A pie chart is a circle divided into pie-shaped slices whose sizes are proportional to the relative
frequencies. There are 360 degrees in a full circle. Relative frequency is just the percentage as a decimal.
To find the angle for a slice, multiply its relative frequency by 360 degrees. Remember that 180 degrees is
half a circle and 90 degrees is a quarter of a circle.
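
The slice angles for the car data can be worked out with a small Python sketch. The frequencies are the ones in the relative frequency table; the variable names are illustrative, and no plotting library is assumed.

```python
# Frequencies from the relative frequency table for type of car.
car_counts = {"Ford": 5, "Chevy": 12, "Honda": 6,
              "Toyota": 12, "Nissan": 10, "Other": 5}
total = sum(car_counts.values())  # 50 students

# Angle of each slice = relative frequency x 360 degrees.
angles = {car: 360 * f / total for car, f in car_counts.items()}

for car, angle in angles.items():
    print(f"{car:<7} {car_counts[car] / total:.2f}  {angle:6.1f} degrees")
print("Sum of angles:", round(sum(angles.values()), 1))
```

For example, Ford's relative frequency of 0.10 corresponds to a 36-degree slice, and all six slices together fill the full 360 degrees.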

Graphs for Quantitative Data

Quantitative graphs are used to present and summarize numerical information coming from
the study of a quantitative variable.
The most frequently used types of quantitative graphs are:
 Bar chart
 Pictogram
 Pie chart
EXAMPLE
A teacher records scores on a 20-point quiz for the 30 students in his class. The scores are:
19 20 18 18 17 18 19 17 20 18 20 16 20 15 17 12 18 19 18 19 17 20 18 16 15 18 20 5 0 0
These scores could be summarized into a frequency table by grouping like values:
Score Frequency

0 2

5 1

12 1

15 2

16 2

17 4

18 8
19 4

20 6
Using the table from the first example, it would be possible to create a standard bar chart from this summary,
like we did for categorical data:

However, since the scores are numerical values, this chart doesn’t really make sense; the first and
second bars are five values apart, while the later bars are only one value apart. It would be more correct to treat
the horizontal axis as a number line. This type of graph is called a histogram. A histogram is like a bar graph,
but where the horizontal axis is a number line.

Other graph types such as pie charts are possible for quantitative data. The usefulness of different graph
types will vary depending upon the number of intervals and the type of data being represented. For example, a
pie chart of our weight data is difficult to read because of the quantity of intervals we used.

FREQUENCY POLYGON
An alternative representation is a frequency polygon. A frequency polygon starts out like a histogram,
but instead of drawing a bar, a point is placed in the midpoint of each interval at height equal to the frequency.
Typically the points are connected with straight lines to emphasize the distribution of the data.
EXAMPLE
This graph makes it easier to see that reaction times were generally shorter for the larger target, and that the
reaction times for the smaller target were more spread out.
4. Generate the ungrouped and grouped frequency table for the following data:
90, 92, 87, 88, 87, 92, 98, 90, 90, 87, 87, 88, 88, 89, 90, 87, 89, 92, 92, 92, 98, 90, 95, 87, 87
(i) How many people scored 98?
(ii) How many people scored 90 or less?
(iii) What proportion scored 87?
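
No worked answer is given for this question, so the following Python sketch is one possible way to build the ungrouped table and read off parts (i) to (iii). The variable names are illustrative, and the grouped table is left out because the question does not fix an interval width.

```python
from collections import Counter

scores = [90, 92, 87, 88, 87, 92, 98, 90, 90, 87, 87, 88, 88,
          89, 90, 87, 89, 92, 92, 92, 98, 90, 95, 87, 87]

# Ungrouped frequency table: each distinct score with its frequency.
freq = Counter(scores)
print("X   f")
for x in sorted(freq, reverse=True):
    print(x, freq[x])

scored_98 = freq[98]                                             # part (i)
scored_90_or_less = sum(f for x, f in freq.items() if x <= 90)   # part (ii)
proportion_87 = freq[87] / len(scores)                           # part (iii)

print("(i)", scored_98, "(ii)", scored_90_or_less, "(iii)", proportion_87)
```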
5. (i) Calculate the sum of squares and the population standard deviation for the given
data values 13, 10, 11, 7, 9, 11, 9
(ii) Calculate the sample standard deviation for the given data 7, 3, 1, 0, 4

Answer
(i) SS = ΣX² − (ΣX)² / N, with N = 7
X        X²
13       169
10       100
11       121
7        49
9        81
11       121
9        81
Total 70 722

SS = 722 − (70)² / 7 = 722 − 700 = 22


σ = √(SS / N) = √(22 / 7) = √3.14 = 1.77 - Standard deviation for population

(ii) SS = ΣX² − (ΣX)² / n, with n = 5

X        X²

7        49
3        9
1        1
0        0
4        16
Total 15 75

SS = ΣX² − (ΣX)² / n = 75 − 225/5 = 75 − 45 = 30

sample standard deviation = √(SS / (n − 1)) = √(30 / 4) = √7.50 = 2.74
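
As an independent cross-check, both parts can be recomputed in Python, once with the computational SS formula and once with the standard library's statistics.pstdev (population) and statistics.stdev (sample). The helper name sum_of_squares is illustrative.

```python
import math
import statistics

def sum_of_squares(data):
    """SS = sum of X squared minus (sum of X) squared divided by the number of values."""
    return sum(x * x for x in data) - sum(data) ** 2 / len(data)

population = [13, 10, 11, 7, 9, 11, 9]   # part (i), N = 7
sample = [7, 3, 1, 0, 4]                 # part (ii), n = 5

ss_population = sum_of_squares(population)                 # 722 - 4900 / 7
sigma = math.sqrt(ss_population / len(population))         # divide by N

ss_sample = sum_of_squares(sample)                         # 75 - 225 / 5
s = math.sqrt(ss_sample / (len(sample) - 1))               # divide by n - 1

print("(i)  SS =", ss_population, " sigma =", round(sigma, 2))
print("(ii) SS =", ss_sample, " s =", round(s, 2))

# The library functions should agree with the hand formulas.
print(round(statistics.pstdev(population), 2), round(statistics.stdev(sample), 2))
```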

6. Suppose IQ scores have a bell-shaped distribution with a mean of 100 and a standard
deviation of 15. Calculate the following:
(i) What percentage of people should have an IQ score between 85 and 115?
(ii) What percentage of people should have an IQ score between 70 and 130?
(iii) What percentage of people should have an IQ score of more than 130?
(iv) A person with an IQ score greater than 145 is considered a genius. Does the
empirical rule support this statement?

Solution
Mean μ= 100
Std.Deviation σ =15
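The empirical rule in statistics, also known as the 68-95-99.7 rule, states that for normal distributions,
about 68% of observed data points lie within one standard deviation of the mean, about 95% fall within
two standard deviations, and about 99.7% fall within three standard deviations.

IQ scores are assumed to be normally distributed with a mean of 100 and a standard deviation of 15.

(i) About 68% of individuals have IQ scores in the interval 100±1(15)=[85,115].
(ii) About 95% of individuals have IQ scores in the interval 100±2(15)=[70,130].
(iii) Since about 95% lie between 70 and 130, the remaining 5% is split equally between the two tails, so
about 2.5% of individuals have IQ scores above 130.
(iv) About 99.7% of individuals have IQ scores in the interval 100±3(15)=[55,145]. Only about 0.3% fall
outside this interval, and half of that, about 0.15%, lies above 145. Scores above 145 are therefore
extremely rare, so the empirical rule supports the statement.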
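
The empirical rule gives rounded percentages. As a rough comparison, the exact normal-curve areas can be computed with Python's statistics.NormalDist, assuming IQ scores are exactly normal with mean 100 and standard deviation 15 as stated in the question. The variable names are illustrative.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)

# Area between two scores = difference of the cumulative probabilities.
within_1sd = iq.cdf(115) - iq.cdf(85)    # empirical rule: about 68%
within_2sd = iq.cdf(130) - iq.cdf(70)    # empirical rule: about 95%
above_130 = 1 - iq.cdf(130)              # upper tail beyond 2 standard deviations
above_145 = 1 - iq.cdf(145)              # upper tail beyond 3 standard deviations

print(f"between 85 and 115: {within_1sd:.2%}")
print(f"between 70 and 130: {within_2sd:.2%}")
print(f"above 130:          {above_130:.2%}")
print(f"above 145:          {above_145:.2%}")
```

The exact tail areas come out slightly smaller than the rule-of-thumb halves (the rule rounds 95.45% down to 95%), but the conclusion is the same: scores above 145 are very rare.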
