Data and Statistics
Applications in Economics
Data
Data Sources
Descriptive Statistics
Statistical Inference
Computers and Statistical Analysis
Applications in Economics
Statistics: a methodology for using data to
learn the “truth,” i.e., to uncover the true
mechanism that generates the data.
Probability: the branch of mathematics that
provides models of the truth.
In economics, we estimate and test economic models
and their predictions.
We use empirical models for prediction,
forecasting, and policy analysis.
Applications in Business
Marketing
Electronic point-of-sale scanners at
retail checkout counters are used to
collect data for a variety of marketing
research applications.
Production
Statistical quality
control charts are used to monitor
the output of a production process.
Applications in Finance
◼ Finance
Financial advisors use statistical models
to guide their investment advice.
Data, Data Sets, Elements, Variables, and Observations
Company (element names)   Annual Sales ($M)   Earnings per Share ($)
Dataram                   73.10               0.86
EnergySouth               74.00               1.67
Keystone                  365.70              0.86
LandCare                  111.40              0.33
Psychemedics              17.60               0.13
Elements: the five companies. Variables: Annual Sales and Earnings per Share. Data set: the full table.
Data and Data Sets
◼ Data are the facts and figures collected,
summarized, analyzed, and interpreted.
◼ The data collected in a particular study are referred
to as the data set.
Elements, Variables, and Observations
◼ The elements are the entities on which data are
collected.
◼ A variable is a characteristic of interest for the elements.
◼ The set of measurements collected for a particular
element is called an observation.
◼ The total number of data values in a data set is the
number of elements multiplied by the number of
variables.
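The counting rule above can be checked against the five-company table. This is a minimal sketch: the dictionary layout and the key names `annual_sales_musd` and `earnings_per_share` are assumptions made for illustration, while the values come from the table.

```python
# Data set from the company table: one observation (row) per element (company).
# The key names are illustrative assumptions, not part of the original slides.
data_set = {
    "Dataram":      {"annual_sales_musd": 73.10,  "earnings_per_share": 0.86},
    "EnergySouth":  {"annual_sales_musd": 74.00,  "earnings_per_share": 1.67},
    "Keystone":     {"annual_sales_musd": 365.70, "earnings_per_share": 0.86},
    "LandCare":     {"annual_sales_musd": 111.40, "earnings_per_share": 0.33},
    "Psychemedics": {"annual_sales_musd": 17.60,  "earnings_per_share": 0.13},
}

variables = ["annual_sales_musd", "earnings_per_share"]

num_elements = len(data_set)      # entities on which data are collected
num_variables = len(variables)    # characteristics recorded for each element
total_values = num_elements * num_variables

# Cross-check by counting every stored measurement directly.
counted_values = sum(len(observation) for observation in data_set.values())

print(num_elements, num_variables, total_values, counted_values)
```

Each inner dictionary plays the role of one observation, so the product of elements and variables matches the direct count of stored values.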
Scales of Measurement
Data
  Qualitative
    Numerical: Nominal, Ordinal
    Nonnumerical: Nominal, Ordinal
  Quantitative
    Numerical: Interval, Ratio
Scales of Measurement
Scales of measurement include:
Nominal, Ordinal, Interval, and Ratio
The scale determines the amount of information
contained in the data.
The scale indicates the data summarization and
statistical analyses that are most appropriate.
Scales of Measurement
◼ Nominal
Data are labels or names used to identify an
attribute of the element.
A nonnumeric label or numeric code may be used.
Scales of Measurement
◼ Nominal
Example:
Students of a university are classified by the
dorm that they live in using a nonnumeric label
such as Lenchwe, Loan, and so on.
A numeric code can be used for
the school variable (e.g. 1: SoCTAS, 2: SB,
3: SSS, and so on).
Scales of Measurement
◼ Ordinal
The data have the properties of nominal data and
the order or rank of the data is meaningful.
A nonnumeric label or numeric code may be used.
Scales of Measurement
◼ Ordinal
Example:
Students of a university are classified by their
class standing using a nonnumeric label such as
Freshman, Sophomore, Junior, or Senior.
A numeric code can be used for
the class standing variable (e.g. 1 denotes
Freshman, 2 denotes Sophomore, and so on).
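As a rough illustration of why order matters for ordinal data, the sketch below codes class standing numerically and ranks students by that code. The student names are hypothetical, and extending the codes to 3 for Junior and 4 for Senior is an assumption that follows the "and so on" pattern.

```python
# Numeric codes for the class standing variable (1 = Freshman, 2 = Sophomore, ...).
# Codes 3 and 4 are assumed by continuing the pattern.
CLASS_STANDING_CODE = {"Freshman": 1, "Sophomore": 2, "Junior": 3, "Senior": 4}

# Hypothetical students, invented purely for illustration.
students = [
    ("Student A", "Junior"),
    ("Student B", "Freshman"),
    ("Student C", "Senior"),
]

# Ordinal data: sorting by code is meaningful because the codes carry rank.
ranked = sorted(students, key=lambda student: CLASS_STANDING_CODE[student[1]])
ranked_names = [name for name, _ in ranked]

print(ranked_names)  # ['Student B', 'Student A', 'Student C']
```

Comparing codes is meaningful here, but subtracting or averaging them is not, because the gap between adjacent standings has no fixed unit of measure.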
Scales of Measurement
◼ Interval
The data have the properties of ordinal data, and
the interval between observations is expressed in
terms of a fixed unit of measure.
Interval data are always numeric.
Scales of Measurement
◼ Interval
Example: Average Starting Salary Offer 2003
Economics/Finance: K40,084
History: K32,108
Psychology: K27,454
Econ & Finance majors earn K7,976 more than
History majors and K12,630 more than
Psychology majors.
Scales of Measurement
◼ Ratio
The data have all the properties of interval data
and the ratio of two values is meaningful.
Variables such as distance, height, weight, and time
use the ratio scale.
This scale must contain a zero value that indicates
that nothing exists for the variable at the zero point.
Scales of Measurement
◼ Ratio
Example:
Economics/Finance majors' salaries are about 1.25 times
History majors' salaries and about 1.46 times
Psychology majors' salaries.
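The salary comparisons in the interval and ratio examples can be reproduced with a few lines of arithmetic. This is a minimal sketch using the 2003 starting salary offers from the slides; the variable names are illustrative, and the ratios are rounded to two decimals.

```python
# Average starting salary offers, 2003 (amounts in K, as given on the slide).
salaries = {"Economics/Finance": 40_084, "History": 32_108, "Psychology": 27_454}

reference = salaries["Economics/Finance"]
others = {major: pay for major, pay in salaries.items() if major != "Economics/Finance"}

# Interval property: differences between values are meaningful.
differences = {major: reference - pay for major, pay in others.items()}

# Ratio property: ratios are meaningful because zero salary means "nothing".
ratios = {major: round(reference / pay, 2) for major, pay in others.items()}

print(differences)  # {'History': 7976, 'Psychology': 12630}
print(ratios)       # {'History': 1.25, 'Psychology': 1.46}
```

The History ratio is 1.248 before rounding, so it rounds to 1.25 at two decimals.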
Qualitative and Quantitative Data
Data can be qualitative or quantitative.
The appropriate statistical analysis depends
on whether the data for the variable are qualitative
or quantitative.
There are more options for statistical
analysis when the data are quantitative.
Qualitative Data
Labels or names used to identify an attribute of each
element. E.g., Black or white, male or female.
Referred to as categorical data
Use either the nominal or ordinal scale of
measurement
Can be either numeric or nonnumeric
Appropriate statistical analyses are rather limited
Quantitative Data
Quantitative data indicate how many or how much:
Discrete, if measuring how many. E.g., number
of 6-packs consumed at tail-gate party
Continuous, if measuring how much. E.g., pounds
of hamburger consumed at tail-gate party
Quantitative data are always numeric.
Ordinary arithmetic operations are meaningful for
quantitative data.
Cross-Sectional Data
Cross-sectional data are observations across individuals
collected at the same point in time.
Example: the growth rate from 1960 to 2004 of
each country in the world (about 182 of them).
Example: wages of heads of household in
Indiana
Time Series Data
Time series data are collected over several time
periods.
Example: the sequence of U.S. GDP growth each
year from 1960 to 2005
Example: the sequence of Professor Mark’s wage
each year from 1983 to 2005.
Data Sources
◼ Existing Sources
Within a firm – almost any department
Business database services – Dow Jones & Co.
Government agencies – U.S. Department of Labor
Industry associations – Travel Industry Association
of America
Special-interest organizations – Graduate Management
Admission Council
Collect your own
Data Sources
◼ Statistical Studies
In experimental studies the variables of interest
are identified first. Then additional factors are
varied to obtain data that tell us how
those factors influence the variables.
In observational (nonexperimental) studies we
cannot control or influence the
variables of interest. A survey is a
good example.
Descriptive Statistics
◼ Descriptive statistics are the tabular, graphical,
and numerical methods used to summarize data.
Example: Hudson Auto Repair
The manager of Hudson Auto
would like to understand the cost
of parts used in the engine
tune-ups performed in the
shop. She examines 50
customer invoices for tune-ups. The costs of parts,
rounded to the nearest dollar, are listed on the next
slide.
Example: Hudson Auto Repair
Sample of Parts Cost for 50 Tune-ups
91 78 93 57 75 52 99 80 97 62
71 69 72 89 66 75 79 75 72 76
104 74 62 68 97 105 77 65 80 109
85 97 88 68 83 68 71 69 67 74
62 82 98 101 79 105 79 69 62 73
Tabular Summary:
Frequency and Percent Frequency
Parts Cost ($)   Frequency   Percent Frequency
50-59            2           4
60-69            13          26
70-79            16          32
80-89            7           14
90-99            7           14
100-109          5           10
Total            50          100
Percent frequency = (frequency/50)100; for example, (2/50)100 = 4 for the 50-59 class.
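The frequency and percent frequency columns can be rebuilt from the 50 parts costs. The sketch below assumes classes of width $10 starting at 50, as in the table; the helper `class_label` is a hypothetical name introduced for illustration.

```python
from collections import Counter

# Sample of parts costs ($) for the 50 Hudson Auto tune-ups.
parts_costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def class_label(cost, width=10):
    """Return the class a cost falls in, e.g. 57 -> '50-59' (illustrative helper)."""
    lower = (cost // width) * width
    return f"{lower}-{lower + width - 1}"


frequency = Counter(class_label(cost) for cost in parts_costs)
n = len(parts_costs)

# Percent frequency = (frequency / n) * 100 for each class.
percent_frequency = {label: count / n * 100 for label, count in frequency.items()}

for label in sorted(frequency, key=lambda text: int(text.split("-")[0])):
    print(f"{label:>8} {frequency[label]:>3} {percent_frequency[label]:>4.0f}")
```

Integer division by the class width assigns each cost to exactly one class, so the frequencies sum to the sample size of 50.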
Graphical Summary: Histogram
[Histogram: Tune-up Parts Cost. Horizontal axis: Parts Cost ($), with classes 50-59, 60-69, 70-79, 80-89, 90-99, 100-109. Vertical axis: Frequency, scaled from 2 to 18.]
Numerical Descriptive Statistics
◼ The most common numerical descriptive statistic
is the average (or sample mean).
◼ Hudson’s average cost of parts, based on the 50
tune-ups studied, is $79 (found by summing the
50 cost values and then dividing by 50).
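The averaging step described above can be sketched directly from the 50 cost values. It uses only the Python standard library; the variable names are illustrative.

```python
from statistics import mean

# Sample of parts costs ($) for the 50 Hudson Auto tune-ups.
parts_costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]

total_cost = sum(parts_costs)     # sum of the 50 cost values
sample_size = len(parts_costs)    # number of tune-ups in the sample
sample_mean = mean(parts_costs)   # total_cost / sample_size

print(total_cost, sample_size, sample_mean)  # 3949 50 78.98
```

The exact sample mean is $78.98, which rounds to $79.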
Statistical Inference
Population - the set of all elements of interest in a
particular study
Sample - a subset of the population
Statistical inference - the process of using data obtained
from a sample to make estimates
and test hypotheses about the
characteristics of a population
Census - collecting data for a population
Sample survey - collecting data for a sample
Process of Statistical Inference
1. Population consists of all tune-ups. Average cost of parts is unknown.
2. A sample of 50 engine tune-ups is examined.
3. The sample data provide a sample average parts cost of $79 per tune-up.
4. The sample average is used to estimate the population average.