Basic Regression Analysis 4


Modelling cycle

This leads us into a modelling cycle


 Fit
 Examine residuals
 Transform data or change model if necessary

This cycle is repeated until we are “happy” with the fitted model.

Diagrammatically:

H.M.F 15
Modelling cycle
Choose model (plots, theory) → Fit model → Examine residuals
 Good fit → Use model
 Bad fit → Transform/change → Fit model again
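One pass through the cycle might look like the following in R. This is only a sketch: it uses R's built-in cars data as a stand-in (it is not part of these notes), and the square-root transform is an illustrative choice, not a recommendation.

```r
# Stand-in data: R's built-in 'cars' (stopping distance vs. speed).
# Step 1: fit
fit1 <- lm(dist ~ speed, data = cars)

# Step 2: examine residuals (residuals against fitted values)
plot(fitted(fit1), resid(fit1),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Step 3: transform the response (illustrative choice) and refit
fit2 <- lm(sqrt(dist) ~ speed, data = cars)

# Step 2 again: examine the residuals of the new fit
plot(fitted(fit2), resid(fit2),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)
```

The loop stops when the residual plot shows no obvious pattern; otherwise another transformation or a different model is tried.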
Example: U.S. State Public-School Expenditures
 Data (Anscombe) for the 50 states of the USA plus Washington, D.C. (51 observations):
 library(car); data(Anscombe)
 Variables are:
 Per capita expenditure on education (response), variable education
 Per capita income, variable income
 Number of residents per 1000 under 18, variable young
 Number of residents per 1000 in urban areas, variable urban
 Fit model: education ~ income + young + urban
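Before fitting, it may help to check that the data loaded as expected. The sketch below assumes the car package is installed in a version (3.0 or later) that attaches carData, where Anscombe now lives.

```r
library(car)       # assumption: car >= 3.0, which attaches carData
data(Anscombe)

str(Anscombe)      # variable names and types
dim(Anscombe)      # number of observations and variables
summary(Anscombe)  # range of each variable
```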
[Figure: scatterplot matrix of educ (response), percap, under18 and urban, with one point labelled “Outlier!”.]
Basic fit, outlier in
> educ.lm = lm(education ~ income + young + urban, data = Anscombe)
> summary(educ.lm)
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -2.868e+02  6.492e+01  -4.418 5.82e-05 ***
income       8.065e-02  9.299e-03   8.674 2.56e-11 ***
young        8.173e-01  1.598e-01   5.115 5.69e-06 ***
urban       -1.058e-01  3.428e-02  -3.086  0.00339 **
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 26.69 on 47 degrees of freedom
Multiple R-squared: 0.6896, Adjusted R-squared: 0.6698
F-statistic: 34.81 on 3 and 47 DF, p-value: 5.337e-12

R² is 69%
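The "examine residuals" step for this fit could be sketched as follows. The choice of diagnostics (studentized residuals, leverage, Cook's distance) is an assumption about what one might look at, not something the slides prescribe; all functions are from base R's stats package.

```r
library(car)   # assumption: car >= 3.0, which attaches carData
data(Anscombe)
educ.lm <- lm(education ~ income + young + urban, data = Anscombe)

# Residuals vs. fitted values, and a normal Q-Q plot of the residuals
plot(educ.lm, which = 1:2)

# Per-observation diagnostics
infl <- data.frame(
  rstudent = rstudent(educ.lm),       # externally studentized residuals
  leverage = hatvalues(educ.lm),      # diagonal of the hat matrix
  cook     = cooks.distance(educ.lm)  # Cook's distance
)

# The five observations with the largest Cook's distance
head(infl[order(-infl$cook), ], 5)
```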
Basic fit, outlier out
> educ50.lm = lm(education ~ income + young + urban, data = Anscombe, subset = -50)
> summary(educ50.lm)
Note how subset = -50 excludes point 50.
Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -242.18729   82.19788  -2.946 0.005033 **
income         0.07432    0.01173   6.336 9.07e-08 ***
young          0.71232    0.19901   3.579 0.000826 ***
urban         -0.08657    0.04060  -2.132 0.038369 *
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 26.75 on 46 degrees of freedom
Multiple R-squared: 0.5692, Adjusted R-squared: 0.5411
F-statistic: 20.26 on 3 and 46 DF, p-value: 1.636e-08

R² is now 57%, down from 69%, while the residual standard error is essentially unchanged (26.75 against 26.69).
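To see the effect of dropping the point at a glance, the two fits can be placed side by side. This is a minimal sketch; the column labels all and without50 are hypothetical names chosen here for readability.

```r
library(car)   # assumption: car >= 3.0, which attaches carData
data(Anscombe)
educ.lm   <- lm(education ~ income + young + urban, data = Anscombe)
educ50.lm <- lm(education ~ income + young + urban, data = Anscombe,
                subset = -50)

# Coefficient estimates from the two fits, side by side
cbind(all = coef(educ.lm), without50 = coef(educ50.lm))

# R-squared for each fit
c(all       = summary(educ.lm)$r.squared,
  without50 = summary(educ50.lm)$r.squared)
```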
