Technical Background - Lecture Notes

Economic technical background descriptions

Uploaded by

Shruti Mittal
Copyright
© © All Rights Reserved

Econometrics for Business I

BSE3703
Topic 1
Technical Background
Learning Outcomes

At the end of the lesson, students must be able to:


1. understand the applications of econometric models in business
2. identify the different types of data used in econometric modelling
3. identify categorical vs numerical data and their measurement scales
4. describe the different types of sampling methods
5. apply the notations for summation
6. apply the notations for mean and variance
7. understand the statistical distributions used in econometric modelling
Application of Econometric Models to Problems in Business …
To quantify the impacts of real-world variables in order to make good decisions:

▪ Investment Decision
▪ Trading Decision
▪ Access to Funds
▪ Business Decision
▪ Policy Decision
Types of Data : Panel Data
There are 3 types of data which business analysts (and strategists alike) might use for analysis.

▪ Time Series Data consists of a sequence of data points that occur in successive order over a period of time.
✓ GDP: Quarterly and Annually
✓ Unemployment: Monthly, Quarterly and Annually
✓ Government Budget: Annually
✓ MQL/SQL (Marketing Qualified Lead / Sales Qualified Lead): Daily, Weekly and Monthly
✓ Stock Prices: As transaction occurs

▪ Cross-Sectional Data consists of data points on one or more variables collected at a single point in time.
✓ A poll of usage of payment gateway service providers
✓ Cross-section of stock returns on SGX

▪ Panel Data consists of data points with the dimensions of both time series and cross-sections.
✓ The daily prices of a number of blue-chip stocks over two years

“It is common to denote each observation by the letter t and the total number of observations by T for time series data, and to denote each observation by the letter i and the total number of observations by N for cross sectional data.”

Example table (labelled on the slide with Cross-Sectional Data, Time Series Data and Panel Data):

Year   Economic Growth   Economic Active Population   Ease of Doing Business   Geopolitical Risk
2008    1.86             3740067                      1                        Low
2009    0.13             3892872                      1                        Low
2010   14.52             3987655                      1                        Low
2011    6.21             4070632                      1                        Low
2012    4.44             4161359                      1                        Low
2013    4.82             4226257                      1                        Low
2014    3.94             4286396                      1                        Low
2015    2.98             4337457                      1                        Low
2016    3.60             4373213                      2                        Low
2017    4.54             4340011                      2                        Low
2018    3.58             4315261                      2                        Low
2019    1.33             4311715                      2                        Low
2020   -3.90             4246955                                               Low
2021    8.88             4029143                                               Low
2022    3.65             4117459                                               Low
Types of Data : Categorical vs Numerical
1. Categorical Data have values that can only be placed into categories.
(example: Nationality, Gender etc.)

2. Numerical Data have values that represent quantities.


2.1. Discrete Numerical Data can take only distinct, countable values, with no values in between.
(example: number of students, number of years etc.)

2.2. Continuous Numerical Data can take any value within a range.
(example: Height, Weight, Distance etc.)
Data : Measurement Scales
▪ Two measurement scales for Categorical Data
1. Nominal Scale classifies categorical data into distinct categories with no implied ranking.
(example: Nationality such as Singaporean, Korean, European)

2. Ordinal Scale classifies categorical data into distinct categories with implied ranking.
(example: Customer Reviews such as Excellent, Good, Average, Poor)

▪ Two measurement scales for Numerical Data
1. Interval Scale measures numerical data on an ordered scale with no true zero point.
(example: Temperature in Degree Celsius, IQ Score)
Only differences between values can be discussed.

2. Ratio Scale measures numerical data on an ordered scale with a true zero point.
(example: Income, Age)
Multiples as well as differences can be discussed, because zero is a lower bound.

Note on "true zero": "True" means the value of zero should be meaningful; "Zero" means the lowest value should be zero. Both should hold for a Ratio Scale.
Data : Sources
▪ Primary source of data: When the data collector is the one using the data
for analysis, the source is primary.

▪ Secondary source of data: When the person performing the statistical


analysis is not the data collector, the source is secondary.

▪ Sources of data fall into one of four categories:
1. Data distributed by an organization or an individual (Secondary source)
2. A designed experiment (Primary source)
3. A survey (Primary source)
4. An observational study (Primary source)
Sampling
Why sampling? Census versus Sample
A census is the collection of data from every member of a population.
▪ Expensive
▪ Time consuming
▪ Not practical

A sample is the collection of data from a randomly selected portion of a population.


▪ Less Expensive
▪ Less Time consuming
▪ More practical
Sampling Methods: Simple Random Sample
N = number of members in a population
n = number of subjects in the sample (sample size = n)

1. Simple Random Sample: Choose a sample of items in such a way that every member of the population has an equal chance of being selected.

Step 1. Assign an ID to all members in a population.
Step 2. Randomly select n IDs to obtain the sample.
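
The two steps of a simple random sample can be sketched in Python as follows. This is only an illustration: the population of 100 IDs, the sample size of 10, and the function name `simple_random_sample` are assumptions made for the example, not part of the lecture notes.

```python
import random

def simple_random_sample(population_ids, n, seed=None):
    """Draw n distinct IDs so that every member is equally likely to be chosen."""
    rng = random.Random(seed)  # the seed only makes the example reproducible
    return rng.sample(population_ids, n)  # sampling without replacement

# Step 1: assign IDs 1..N to a hypothetical population of N = 100 members.
ids = list(range(1, 101))

# Step 2: randomly select n = 10 IDs to obtain the sample.
sample = simple_random_sample(ids, 10, seed=42)
print(sorted(sample))
```

Sampling without replacement is used here so that no member appears twice in the sample.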
Sampling Methods: Systematic Sample
N = number of members in a population
n = number of subjects in the sample (sample size = n)

2. Systematic Sample: Choose a sample of items by systematically selecting a member in each respective partition of a population.

Step 1. Partition the N members of a population into n groups of k members each, where k = N/n (k should be rounded to the nearest integer, if needed).

Step 2. Randomly select an item in the first group of k members (for example, the 3rd member is chosen).

Step 3. Select the item in the same position in each of the remaining (n − 1) groups (in the example, the 3rd member is chosen from each group).
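
A minimal Python sketch of the systematic sampling steps, assuming a hypothetical population of 100 IDs and a sample size of 10 (so k = 10). The function name `systematic_sample` is illustrative only.

```python
import random

def systematic_sample(population, n, seed=None):
    """Pick one position in the first group, then the same position in every group."""
    N = len(population)
    k = max(1, round(N / n))      # Step 1: group size k = N / n, rounded
    rng = random.Random(seed)     # the seed only makes the example reproducible
    start = rng.randrange(k)      # Step 2: random 0-based position in the first group
    # Step 3: take the member at that position in each group (every k-th member).
    picks = [start + g * k for g in range(n)]
    return [population[i] for i in picks if i < N]

members = list(range(1, 101))     # hypothetical IDs 1..100, so N = 100
sample = systematic_sample(members, n=10, seed=1)
print(sample)
```

The `if i < N` guard only matters when rounding k makes the last group run past the end of the population.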
Sampling Methods: Stratified Sample
N = number of members in a population
n = number of subjects in the sample (sample size = n)

3. Stratified Sample: Choose a sample of items by combining the subjects of simple random samples drawn from separate sub-populations (strata) of a population. Members of the same stratum share similar characteristics.

Step 1. Subdivide the N members of a population into separate sub-populations, or strata (a stratum is defined by common characteristics).

Step 2. Select a simple random sample from each stratum.

Step 3. Combine the subjects from all the simple random samples to obtain the stratified sample.
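
The three stratified sampling steps might look like this in Python. The population of 30 members, the three faculty labels used as strata, and the choice of 2 subjects per stratum are all hypothetical values chosen for illustration.

```python
import random
from collections import defaultdict

def stratified_sample(members, stratum_of, n_per_stratum, seed=None):
    """Take a simple random sample from each stratum and combine the results."""
    rng = random.Random(seed)     # the seed only makes the example reproducible
    strata = defaultdict(list)
    for m in members:             # Step 1: subdivide the population into strata
        strata[stratum_of(m)].append(m)
    sample = []
    for label in sorted(strata):  # Step 2: simple random sample per stratum
        sample.extend(rng.sample(strata[label], n_per_stratum))
    return sample                 # Step 3: the combined stratified sample

# Hypothetical population: (ID, faculty) pairs, 10 members per faculty.
faculties = ["Arts", "Business", "Science"]
population = [(i, faculties[i % 3]) for i in range(30)]

sample = stratified_sample(population, lambda m: m[1], n_per_stratum=2, seed=3)
print(sample)
```

`rng.sample` raises `ValueError` if a stratum has fewer members than `n_per_stratum`, so very small strata need a smaller per-stratum sample size.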
Sampling Methods: Cluster Sample
N = number of members in a population
n = number of subjects in the sample (sample size = n)

4. Cluster Sample: Choose a sample of items by selecting all members in randomly selected cluster(s).
(annotation: in a stratified design, a cluster is a subset of a stratum)

Step 1. Subdivide the N members of a population into several clusters.
(clusters are naturally occurring designations such as districts, city blocks etc.)

Step 2. Obtain the sample by taking all members in one (or more) randomly selected cluster(s).
(if clusters are large, the members of a single cluster may be all that is needed)
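
A short Python sketch of cluster sampling, assuming four hypothetical districts of 25 members each. The district names and the function name `cluster_sample` are illustrative assumptions.

```python
import random

def cluster_sample(clusters, n_clusters=1, seed=None):
    """Randomly choose whole clusters and keep every member of each chosen cluster."""
    rng = random.Random(seed)  # the seed only makes the example reproducible
    chosen = rng.sample(sorted(clusters), n_clusters)
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Step 1: the population of IDs 1..100 subdivided into hypothetical districts.
districts = {
    "North": list(range(1, 26)),
    "South": list(range(26, 51)),
    "East": list(range(51, 76)),
    "West": list(range(76, 101)),
}

# Step 2: take all members of one randomly selected district.
chosen, sample = cluster_sample(districts, n_clusters=1, seed=7)
print(chosen, len(sample))
```

Unlike the stratified sample, the randomness here is in which clusters are chosen, not in which members are drawn within a cluster.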
Mathematics Recap: Summation Notations
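\[ \sum_{k=1}^{n} a x_k = a x_1 + a x_2 + \cdots + a x_n \]

\[ \sum_{k=1}^{n} a x_k = a \sum_{k=1}^{n} x_k \]

\[ \sum_{k=1}^{n} (x_k + y_k) = \sum_{k=1}^{n} x_k + \sum_{k=1}^{n} y_k \]

\[ \sum_{k=1}^{n} (x_k \pm y_k)^2 = \sum_{k=1}^{n} \left( x_k^2 + y_k^2 \pm 2 x_k y_k \right) = \sum_{k=1}^{n} x_k^2 + \sum_{k=1}^{n} y_k^2 \pm 2 \sum_{k=1}^{n} x_k y_k \]

\[ \sum_{k=1}^{n} a = an \]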
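
The summation rules above can be checked numerically. The sketch below uses small made-up integer lists `x` and `y` and a constant `a` so that every comparison is exact. It is a sanity check, not a proof.

```python
# Hypothetical data chosen only to illustrate the summation rules.
x = [1, 2, 3, 4]
y = [2, 0, -1, 5]
a = 3
n = len(x)

# A constant factor can be taken outside the sum.
rule_constant_factor = sum(a * xk for xk in x) == a * sum(x)

# The sum of (x_k + y_k) splits into two separate sums.
rule_sum_splits = sum(xk + yk for xk, yk in zip(x, y)) == sum(x) + sum(y)

# Expanding the square: sum (x_k + y_k)^2 = sum x_k^2 + sum y_k^2 + 2 sum x_k y_k.
lhs_plus = sum((xk + yk) ** 2 for xk, yk in zip(x, y))
rhs_plus = (sum(xk ** 2 for xk in x) + sum(yk ** 2 for yk in y)
            + 2 * sum(xk * yk for xk, yk in zip(x, y)))

# Same expansion with a minus sign.
lhs_minus = sum((xk - yk) ** 2 for xk, yk in zip(x, y))
rhs_minus = (sum(xk ** 2 for xk in x) + sum(yk ** 2 for yk in y)
             - 2 * sum(xk * yk for xk, yk in zip(x, y)))

# Summing a constant n times gives a * n.
rule_constant_sum = sum(a for _ in range(n)) == a * n

print(rule_constant_factor, rule_sum_splits,
      lhs_plus == rhs_plus, lhs_minus == rhs_minus, rule_constant_sum)
```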
Statistics Recap: Statistical Notations
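Suppose X and Y are random variables and a is a constant.

Mean or Expected Value of X = E(X)

\[ \mathrm{Var}(X) = E\big[(X - E(X))^2\big] \]

\[ E(aX) = aE(X), \qquad \mathrm{Var}(aX) = a^2 \mathrm{Var}(X) \]

\[ E(X \pm Y) = E(X) \pm E(Y), \qquad \mathrm{Var}(X \pm Y) = \mathrm{Var}(X) + \mathrm{Var}(Y) \pm 2\,\mathrm{Cov}(X, Y) \]

\[ E(XY) = E(X)E(Y) \quad \text{if X and Y are independent random variables} \]

\[ E\big[(X + Y)^2\big] = E(X^2 + Y^2 + 2XY) = E(X^2) + E(Y^2) + 2E(XY) \]

When X and Y are independent, the last term becomes 2E(X)E(Y).

\[ E(a) = a, \qquad \mathrm{Var}(a) = 0 \]

\[ \mathrm{Cov}(X, Y) = 0 \quad \text{if X and Y are independent random variables} \]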
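
As a hedged illustration of these rules, the sketch below computes expectations exactly for two independent fair dice (an example chosen here, not taken from the notes). Exact fractions are used so the identities hold without rounding error. The last lines show that E(XY) = E(X)E(Y) fails when the two variables are not independent, using Y = X.

```python
from fractions import Fraction
from itertools import product

# Joint distribution of two independent fair dice: each (x, y) pair has probability 1/36.
faces = range(1, 7)
joint = {(x, y): Fraction(1, 36) for x, y in product(faces, faces)}

def E(f):
    """Expected value of f(X, Y) under the joint distribution."""
    return sum(p * f(x, y) for (x, y), p in joint.items())

EX = E(lambda x, y: x)
EY = E(lambda x, y: y)
var_x = E(lambda x, y: (x - EX) ** 2)
var_y = E(lambda x, y: (y - EY) ** 2)
cov_xy = E(lambda x, y: (x - EX) * (y - EY))
var_sum = E(lambda x, y: (x + y - EX - EY) ** 2)
E_xy = E(lambda x, y: x * y)

print(EX, var_x, cov_xy)                      # mean, variance, covariance
print(var_sum == var_x + var_y + 2 * cov_xy)  # Var(X + Y) rule
print(E_xy == EX * EY)                        # holds because X and Y are independent

# Dependent case: take Y = X. Then E(XY) = E(X^2), which differs from E(X)^2.
E_xx = E(lambda x, y: x * x)
print(E_xx, EX * EX)
```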


Further Moments
▪ There are further statistics that describe the shape of the distribution,
using formulae that are similar to those of the mean and variance.

Mean: 1st Moment (describes central value)

Variance: 2nd Moment (describes dispersion)

Skewness: 3rd Moment (describes asymmetry)

Kurtosis: 4th Moment (describes peakedness)


Further Moments: Sample Skewness
▪ Sample Skewness measures the degree of asymmetry exhibited by the data
(not in exam)

\[ \text{Sample Skewness} = \frac{1}{n s^3} \sum_{i=1}^{n} (x_i - \bar{x})^3 \]

▪ If skewness equals zero, the histogram is symmetric about the mean

▪ Positive Skewness
➢ There are more observations below the mean than above it
➢ When the mean is greater than the median

▪ Negative Skewness
➢ There are a small number of low observations and a large number of high ones
➢ When the median is greater than the mean
Further Moments: Sample Kurtosis
▪ Sample Kurtosis measures the relative peakedness or flatness of a distribution compared to
the normal distribution.
\[ \text{Sample Kurtosis} = -3 + \frac{1}{n s^4} \sum_{i=1}^{n} (x_i - \bar{x})^4 \]

▪ The kurtosis of a normal distribution is 0

▪ Platykurtic: When the kurtosis < 0, the frequencies throughout the curve are closer to being equal
(the curve is flatter and wider), which indicates a relatively flat distribution with short tails.

▪ Leptokurtic: When the kurtosis > 0, there are high frequencies in only a small part of the curve
(the curve is more peaked), which indicates a relatively peaked distribution with relatively long tails.
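
The sample skewness and sample kurtosis formulas from the two slides above can be sketched in Python. The notes do not say how s is computed, so this sketch assumes s is the sample standard deviation with an n − 1 denominator (`statistics.stdev`). The data lists are made up for illustration.

```python
import statistics

def sample_skewness(xs):
    """(1 / (n * s^3)) * sum of cubed deviations from the mean."""
    n = len(xs)
    mean = statistics.fmean(xs)
    s = statistics.stdev(xs)  # assumption: n - 1 denominator
    return sum((x - mean) ** 3 for x in xs) / (n * s ** 3)

def sample_kurtosis(xs):
    """-3 + (1 / (n * s^4)) * sum of fourth-power deviations from the mean."""
    n = len(xs)
    mean = statistics.fmean(xs)
    s = statistics.stdev(xs)  # assumption: n - 1 denominator
    return -3 + sum((x - mean) ** 4 for x in xs) / (n * s ** 4)

symmetric = [1, 2, 3, 4, 5]         # symmetric about its mean of 3
right_skewed = [1, 1, 2, 2, 3, 10]  # one large value pulls the mean above the median

print(sample_skewness(symmetric))     # zero for symmetric data
print(sample_skewness(right_skewed))  # positive: mean is greater than median
print(sample_kurtosis(symmetric))     # negative: evenly spread, flat data
```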
Z Distribution
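\[ X \sim N\big(E(X), \mathrm{Var}(X)\big) \]

[Figure: normal curve of X centred at E(X).]

▪ Every x value has a corresponding z value:

\[ Z = \frac{X - E(X)}{sd(X)} \sim N(0, 1) \]

[Figure: standard normal curve centred at 0, with Area = 0.1292 to the right of Z = 1.13.]

▪ The probability of getting a z value bigger than 1.13 is 0.1292.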
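
The standardisation and the tail area from this slide can be reproduced with Python's standard library. The mean of 50, standard deviation of 10, and x value of 61.3 are hypothetical numbers picked so that the z value comes out at 1.13.

```python
from statistics import NormalDist

def z_score(x, mean, sd):
    """Standardise x: subtract the mean and divide by the standard deviation."""
    return (x - mean) / sd

# Hypothetical X ~ N(50, 10^2); x = 61.3 corresponds to z = 1.13.
z = z_score(61.3, mean=50, sd=10)

# Upper-tail area P(Z > 1.13) under the standard normal distribution.
std_normal = NormalDist(mu=0, sigma=1)
upper_tail = 1 - std_normal.cdf(1.13)

print(round(z, 2), round(upper_tail, 4))
```

The computed upper-tail area agrees with the 0.1292 read from the z table to four decimal places.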
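t Distribution (very important)

\[ \hat{\beta} \approx N\big(E(\hat{\beta}), \mathrm{Var}(\hat{\beta})\big) \]

▪ \hat{\beta} is an estimator; different samples can give different values of the estimator.
▪ The standard deviation of an estimator is called the standard error (se).

\[ \frac{\hat{\beta} - E(\hat{\beta})}{se(\hat{\beta})} \approx t_{n-1-k}, \qquad se(\hat{\beta}) = \sqrt{\mathrm{Var}(\hat{\beta})} \]

▪ n − 1 − k is the Degree of Freedom (df), where k is the number of variables on the right-hand side (RHS).

[Figure: density curves labelled Z, t(122), t(95) and t(77); Area = 0.025 lies to the right of t = 1.9913.]

▪ The shape of the t distribution depends on the degree of freedom: a higher degree of freedom gives a higher peak.
▪ In the z table, the cell entries are areas (probabilities). In the t table, the cell entries are critical values and the column headers are the areas, i.e. the probabilities.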
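F Distribution (very important)

▪ The F distribution is not symmetric.
▪ There are two degrees of freedom.
▪ Choose the correct F table according to the area given in context.

[Figure: density curve of F(3, 23), with Area = 0.05 to the right of the critical value.]

\[ F^{0.05}_{3,23} = 3.03 \]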
Chi-squared (χ²) Distribution

▪ The shape changes entirely with a change in the degree of freedom, unlike the t and F distributions.

[Figure: density curve of χ²(10), with Area = 0.05 to the right of the critical value.]

\[ \chi^2_{0.05}(10) = 18.307 \]
Prepared by

Daniel SOH
