CEC217 Lecture3

The lecture covers key concepts in probability and statistics, focusing on sample variance, sample standard deviation, and their roles in describing population parameters. It discusses statistical modeling, graphical plots, and various methods for visualizing data, including scatter plots, stem-and-leaf plots, and histograms. Additionally, the lecture includes exercises for calculating statistical measures and constructing visual representations of data.


Probability and Statistics

CEC217 Lecture 3

Dr. Tarık Adnan


Email: tarikalmohamad@karabuk.edu.tr
Office: 104
Cont.
• We have previously seen the sample variance and the
sample standard deviation.
• Both measures play major roles in statistical methods, and
both reflect the same concept of measuring variability, but
they differ in their units:

Sample standard deviation: variability in linear units
Sample variance: variability in squared units
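The difference in units can be seen in a small numerical sketch. The data below are hypothetical lengths in centimetres (not taken from the lecture), and Python's standard `statistics` module is used only as a convenient calculator:

```python
import statistics

# Hypothetical sample of lengths in centimetres (an assumption, not lecture data).
lengths_cm = [2.0, 4.0, 4.0, 5.0, 7.0]

# Sample variance uses the n - 1 divisor; its units are cm^2.
sample_variance = statistics.variance(lengths_cm)

# Sample standard deviation is the square root of the variance; its units are cm.
sample_std = statistics.stdev(lengths_cm)

print(f"s^2 = {sample_variance:.3f} cm^2")
print(f"s   = {sample_std:.3f} cm")
```

Because the standard deviation is expressed in the same units as the data, it is usually the easier of the two to interpret next to the sample mean.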
Population Characteristics (Parameters)

• We can use the aforementioned concepts to describe the
population parameters. In other words, we now have two
important parameters: the population mean and the
population variance.
• The sample variance plays an explicit role in the statistical
methods used to draw inferences about the population
variance.
• The sample standard deviation has an important role along
with the sample mean in inferences that are made about the
population mean.
Statistical Modeling,
Scientific Inspection, and
Graphical Plots

Statistical Modeling
• A model form is often the foundation of assumptions that are
made by the analyst.
• A statistical model is not deterministic but, rather, must entail
some probabilistic aspects. Often the end result of a
statistical analysis is the estimation of the parameters of a
postulated (assumed) model.
• Suppose one wants to draw some level of distinction between
the nitrogen and no-nitrogen populations through the sample
information. The analysis may require a certain model for the
data, for example that the two samples come from normal
(Gaussian) distributions.
Graphical Plots
• In this part, the role of sampling and the display of data for
enhancement of statistical inference is explored in detail.
• We present some simple but effective displays that
complement the study of statistical populations.

Graphical Illustrations: Scatter Plot, Stem-and-Leaf Plot,
Histogram, Box-and-Whisker Plot (Box Plot)
Graphical Plots: Scatter Plot
• Consider a textile manufacturer who designs an experiment in
which cloth specimens containing various percentages of
cotton are produced.
(Tensile strength in Turkish: çekme direnci, çekme kuvveti,
gerilme direnci.)
• Five cloth specimens are manufactured for each of the four
cotton percentages. Some simple graphics can shed
important light on the clear distinction between the samples.
Scatter Plot (Cont.)
• Let’s look at the following figure; the sample means
and variability are shown nicely in the scatter plot.

Figure 1.5: Scatter plot of tensile strength and cotton percentages

• One possible aim of this study is simply to determine which
cotton percentages are truly distinct from the others.
MATLAB Scatter Plot

• Open MATLAB and (1) create x as 200 equally spaced
values between 0 and 3π, (2) create y as cosine values with
random noise, and then (3) create a scatter plot by using the
function scatter.
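For readers without MATLAB at hand, the same three steps can be sketched in Python. This is an illustrative equivalent, not the lecture's MATLAB code: the function names `noisy_cosine` and `show_scatter`, the noise standard deviation of 0.1, and the fixed seed are all assumptions, and the plotting helper needs the third-party matplotlib package.

```python
import math
import random

def noisy_cosine(n=200, noise_sd=0.1, seed=0):
    """Return x (n equally spaced values on [0, 3*pi]) and y = cos(x) + noise."""
    rng = random.Random(seed)            # fixed seed so the sketch is repeatable
    step = 3 * math.pi / (n - 1)         # spacing that puts the last point at 3*pi
    x = [i * step for i in range(n)]
    # Gaussian noise is an assumption; the slide only says "random noise".
    y = [math.cos(xi) + rng.gauss(0, noise_sd) for xi in x]
    return x, y

def show_scatter(x, y):
    """Draw the scatter plot (requires matplotlib, imported only when called)."""
    import matplotlib.pyplot as plt
    plt.scatter(x, y)
    plt.xlabel("x")
    plt.ylabel("cos(x) + noise")
    plt.show()

x, y = noisy_cosine()
print(len(x), x[0], round(x[-1], 4))
```

Calling `show_scatter(x, y)` then displays the noisy cosine curve as individual points.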

Scatter Plot (Cont.)

• Plots can illustrate information that allows the results of the
formal statistical inference to be better communicated to the
scientist or engineer.
• At times, plots or exploratory data analysis can teach the
analyst something not retrieved from the formal analysis.
• Graphics can nicely highlight violations of assumptions that
would otherwise be unobserved or ignored.
• Next, let’s look at the other types of graphical plots.


Stem-and-Leaf Plot
• The stem-and-leaf plot is a combination of tabular and
graphic display that can be very handy for analysing the
behavior of the distribution of statistical data generated in
large masses.
• For the number 2.6, the digit 2 is designated the stem and the
digit 6 is the leaf.

Table 1.4: Car Battery Life
Table 1.5: Stem-and-Leaf Plot of Battery Life
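The stem/leaf split described above can be sketched in a few lines of Python. The sample values are hypothetical battery lives (they are not the data of Table 1.4), and the helper name `stem_and_leaf` is an assumption made for this illustration:

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group one-decimal values by integer part (stem) and first decimal (leaf)."""
    plot = defaultdict(list)
    for v in sorted(values):
        tenths = round(v * 10)            # 2.6 -> 26; rounding avoids float truncation
        stem, leaf = divmod(tenths, 10)   # 26 -> stem 2, leaf 6
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical battery lives in years (NOT the data of Table 1.4).
sample = [2.6, 1.9, 3.4, 3.1, 2.2, 4.7, 3.3, 3.8, 1.6, 3.0]

for stem, leaves in stem_and_leaf(sample).items():
    leaf_text = "".join(str(leaf) for leaf in leaves)
    print(f"{stem} | {leaf_text}  (frequency {len(leaves)})")
```

Each printed row is one stem followed by its leaves, so the row lengths already give a rough picture of the distribution.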
Stem-and-Leaf Plot (Cont.)

• The stem-and-leaf plot of Table 1.5 contains only four stems
and consequently does not provide an adequate picture of
the distribution. What can we do then?
• The solution is to increase the number of stems in our plot.
• One simple way to accomplish this is to write each stem
value twice and then record the leaves 0, 1, 2, 3, and 4
opposite the appropriate stem value where it appears for the
first time, and the leaves 5, 6, 7, 8, and 9 opposite this
same stem value where it appears for the second time.
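A minimal sketch of this doubling, again on hypothetical values rather than the data of Table 1.4. The helper name `double_stem_and_leaf` is an assumption; the low half of each stem (leaves 0 to 4) is marked with `*` and the high half (leaves 5 to 9) with `·`:

```python
def double_stem_and_leaf(values):
    """Split each stem into a low half (leaves 0-4) and a high half (leaves 5-9)."""
    plot = {}
    for v in sorted(values):
        stem, leaf = divmod(round(v * 10), 10)
        marker = "*" if leaf <= 4 else "·"   # which half of the stem the leaf belongs to
        plot.setdefault(f"{stem}{marker}", []).append(leaf)
    return plot

# Hypothetical battery lives in years (NOT the data of Table 1.4).
sample = [2.6, 1.9, 3.4, 3.1, 2.2, 4.7, 3.3, 3.8, 1.6, 3.0]

for stem, leaves in double_stem_and_leaf(sample).items():
    leaf_text = "".join(str(leaf) for leaf in leaves)
    print(f"{stem} | {leaf_text}")
```

With the same ten values, four stems become six rows, which spreads the observations out and shows the shape in more detail.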

Double-stem-and-leaf

• This modified double-stem-and-leaf plot is illustrated in Table
1.6, where the stems corresponding to leaves 0 through 4
have been coded by the symbol (*) and the stems
corresponding to leaves 5 through 9 by the symbol (·).

Table 1.6 Double-Stem-and-Leaf Plot of Battery Life

Table 1.4: Car Battery Life

Frequency distribution

• Another way to summarize the data is the frequency
distribution, in which the data are grouped into different
classes or intervals. It can be constructed by counting the
leaves belonging to each stem and noting that each stem
defines a class interval.
• In Table 1.5, the stem 1 with 2 leaves defines the interval
1.0–1.9 containing 2 observations; the stem 2 with 5 leaves
defines the interval 2.0–2.9 containing 5 observations, and so
forth.
• Notice that the total number of observations here is 40.
Histogram

• We obtain the proportion of the set of observations in each of
the classes by dividing each class frequency by the total
number of observations.
• A table listing relative frequencies is called a relative
frequency distribution.
• The relative frequency distribution for the data of Table 1.4,
showing the midpoint of each class interval, is given in Table
1.7.
• The information provided by a relative frequency distribution
in tabular form is easier to grasp if presented graphically.
Histogram (Cont.)

• Using the midpoint of each interval and the corresponding relative frequency, a
relative frequency histogram (Figure 1.6) can be constructed.

Table 1.7 Relative Frequency Distribution of Battery Life


Figure 1.6 Relative frequency histogram

$\dfrac{\text{Frequency}}{\text{Total no. of observations}} = \dfrac{2}{40} = 0.05$
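The same division can be applied to every class at once. In the sketch below the classes are the stem intervals of Table 1.5; the counts 2 and 5 and the total of 40 are stated in the lecture, while the counts for stems 3 and 4 are assumed here purely so that the example sums to 40:

```python
# Class frequencies keyed by stem interval. The counts 2 and 5 and the total of
# 40 come from the lecture; the counts for stems 3 and 4 are ASSUMED.
frequencies = {"1.0-1.9": 2, "2.0-2.9": 5, "3.0-3.9": 25, "4.0-4.9": 8}

total = sum(frequencies.values())

# Relative frequency = class frequency / total number of observations.
relative = {cls: count / total for cls, count in frequencies.items()}

for cls, rel in relative.items():
    print(f"{cls}: {rel:.3f}")
```

The relative frequencies always sum to 1, which is a quick check on a hand-built table.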
MATLAB Histogram

• Let’s generate 10,000 random numbers and create a histogram.
The histogram function automatically chooses an appropriate
number of bins to cover the range of values in x and show the
shape of the underlying distribution.
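What a histogram does underneath can be sketched in Python with equal-width bins. This is not MATLAB's automatic binning rule: the bin count of 20, the choice of normally distributed numbers, and the helper name `bin_counts` are all assumptions made for illustration.

```python
import random

def bin_counts(values, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it to the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(n_bins + 1)]
    return edges, counts

rng = random.Random(1)                              # fixed seed for repeatability
data = [rng.gauss(0, 1) for _ in range(10_000)]     # assumed: standard normal numbers
edges, counts = bin_counts(data, n_bins=20)

# Crude text histogram: one '#' per 50 observations in the bin.
for left, count in zip(edges, counts):
    print(f"{left:6.2f} | {'#' * (count // 50)}")
```

Dividing each count by 10,000 would turn this into a relative frequency histogram.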

• Please investigate the function histogram further in the
MATLAB Documentation.
Estimating frequency distribution

• We obtain the proportion of the set of observations in each of
the classes by dividing each class frequency by the total
number of observations. Many continuous frequency
distributions can be represented graphically by the
characteristic bell-shaped curve of Figure 1.7.

Figure 1.7 Estimating frequency distribution


Estimating frequency distribution
• A probability distribution is said to be symmetric if it can be
folded along a vertical axis so that the two sides coincide.
• A distribution that lacks symmetry with respect to a vertical
axis is said to be skewed

• The distribution illustrated in Figure 1.8(a) is said to be
skewed to the right since it has a long right tail and a
much shorter left tail. In Figure 1.8(b) we see that the
distribution is symmetric, while in Figure 1.8(c) it is
skewed to the left.

Figure 1.8 Skewness of data
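The direction of skew can also be checked numerically. The sketch below uses the moment coefficient of skewness, a standard measure that the lecture itself does not introduce, applied to three small hypothetical samples; a positive value suggests a longer right tail, a negative value a longer left tail, and zero a symmetric sample.

```python
import statistics

def sample_skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (not defined in the lecture)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n   # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples chosen to mimic the three panels of Figure 1.8.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 10]
symmetric = [1, 2, 3, 4, 5]
left_skewed = [-v for v in right_skewed]

for name, data in [("right", right_skewed), ("symmetric", symmetric), ("left", left_skewed)]:
    print(f"{name:9s} skewness = {sample_skewness(data):+.3f}")
```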
====Exercise====
===============
• The lengths of power failures, in minutes, are recorded in the
following table.

(a) Find the sample mean and sample median of the power-failure
times.
(b) Find the sample standard deviation of the power failure times.

====Exercise====
===============
• The following scores represent the final examination grades
for an elementary statistics course:

(a) Construct a stem-and-leaf plot for the examination grades in which
the stems are 1, 2, 3, ..., 9.
(b) Construct a relative frequency histogram, draw an estimate of the
graph of the distribution, and discuss the skewness of the distribution.
(c) Compute the sample mean, sample median, and sample standard
deviation.
Thank you

Feel free to ask questions
