[go: up one dir, main page]

0% found this document useful (0 votes)
27 views8 pages

LSE Data Analysis M6U3 Notes

This document provides a comprehensive guide on performing regression analysis using Microsoft Excel, including enabling the data analysis add-in and interpreting the results. It covers key components such as regression statistics, the ANOVA table, and regression coefficients, emphasizing the importance of understanding inputs and outputs for reliable results. The conclusion highlights the efficiency of software in executing complex calculations while stressing the need for a solid grasp of the underlying concepts.

Uploaded by

Subrato Nath
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
27 views8 pages

LSE Data Analysis M6U3 Notes

This document provides a comprehensive guide on performing regression analysis using Microsoft Excel, including enabling the data analysis add-in and interpreting the results. It covers key components such as regression statistics, the ANOVA table, and regression coefficients, emphasizing the importance of understanding inputs and outputs for reliable results. The conclusion highlights the efficiency of software in executing complex calculations while stressing the need for a solid grasp of the underlying concepts.

Uploaded by

Subrato Nath
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 8

MODULE 6 UNIT 3

Perform a regression analysis

© 2025 LSE
All Rights Reserved
lse.ac.uk

Table of contents
1. Introduction 3
2. Regression analysis in Excel 3
3. Analysing regression results 6
3.1 Regression statistics 6
3.2 ANOVA table 7
3.3 Regression coefficients 7
4. Conclusion 8
5. Bibliography 8

© 2025 LSE
All Rights Reserved

getsmarter.com | info@getsmarter.com
+44 203 457 5774 (UK) | +1 224 249 3522 (US) | +27 21 447 7565 (SA)

Page 2 of 8
lse.ac.uk

Learning outcomes:

LO4: Execute a regression analysis.

LO5: Analyse the results from a regression analysis to conclude what drives the
variation in variables.

1. Introduction
In Unit 1, you explored the fundamentals of regression analysis and the appropriateness
of linearity when performing a regression analysis. In Unit 2, you considered categorical
variables and the assumptions and challenges associated with regression analysis.

In this unit, you will combine this knowledge to execute a regression analysis in Excel and
interpret the results.

2. Regression analysis in Excel


To perform a regression analysis in Excel, the data analysis add-in needs to be enabled.
This step is explained in Video 1.

When running the regression analysis through the data analysis add-in, an input dialog
box, shown in Figure 1, appears on screen.

© 2025 LSE
All Rights Reserved

getsmarter.com | info@getsmarter.com
+44 203 457 5774 (UK) | +1 224 249 3522 (US) | +27 21 447 7565 (SA)

Page 3 of 8
lse.ac.uk

Figure 1: Regression analysis input dialog box.

In the Input section, the response variable (Input Y Range) and explanatory variable(s)
(Input X Range) are selected first. After this, you can select whether labels (i.e. variable
names) are included in the first row of the Excel file and what the level of confidence should
be. You can also select the box that indicates whether the constant should be 0 (i.e.
whether you wish to suppress the intercept to be 0), should that be required.

Under the Output options section, you can select the output range.

The Residuals and Normal Probability sections allow you to plot these values as part of
the regression results. These plots are useful for checking some of the regression
assumptions, such as homogeneity of variance.

In Videos 1 and 2, Dr James Abdey introduces you to regression analysis in Microsoft


Excel, executes a regression analysis, and explores the outcomes of regression analysis.

© 2025 LSE
All Rights Reserved

getsmarter.com | info@getsmarter.com
+44 203 457 5774 (UK) | +1 224 249 3522 (US) | +27 21 447 7565 (SA)

Page 4 of 8
lse.ac.uk

Video 1: Dr James Abdey discusses regression analysis in Microsoft Excel. (Access this set of
notes on the Online Campus to engage with this video.)

Video 2: Dr James Abdey explores the outcomes of regression analysis. (Access this set of notes
on the Online Campus to engage with this video.)

© 2025 LSE
All Rights Reserved

getsmarter.com | info@getsmarter.com
+44 203 457 5774 (UK) | +1 224 249 3522 (US) | +27 21 447 7565 (SA)

Page 5 of 8
lse.ac.uk

3. Analysing regression results


When performing a regression analysis in Microsoft Excel, the outcome is generated in the
form of a table, as shown in Figure 2. This table consists of three parts: the regression
statistics, the ANOVA table, and the regression coefficients. The most important
components in this table will be explored individually in the following subsections.

Figure 2: Microsoft Excel regression results.

3.1 Regression statistics


The regression statistics are an indication of how well the model fits the data.

• Multiple R: This value is the correlation coefficient and is an indication of the


strength of the linear relationship between the variables. Remember, a value of −1
indicates a strong, negative, linear relationship; a value of 1 indicates a strong,
positive, linear relationship; and a value of 0 indicates that no linear relationship
exists between the variables.

• R Square: This value is the coefficient of determination. It is interpreted as the


proportion (or percentage when multiplied by 100) of the variation in the response
variable that is explained by the explanatory variable. In other words, it is an
indication of how well x explains y.

• Adjusted R Square: For more than one variable, this is the R squared value to be
interpreted. This value adjusts for the number of x-variables used in the model.

• Standard Error: This value is an estimate of the standard deviation of the error
term. This value is not the same as the standard error calculated using descriptive
statistics.

• Observations: This value is the total number of observations used in the model.

© 2025 LSE
All Rights Reserved

getsmarter.com | info@getsmarter.com
+44 203 457 5774 (UK) | +1 224 249 3522 (US) | +27 21 447 7565 (SA)

Page 6 of 8
lse.ac.uk

3.2 ANOVA table


As you learnt in Unit 1, the values obtained in an ANOVA table – for example, the sum of
squares for x-data (Sxx) and the sum of products of x-data and y-data (Sxy) – are used to
calculate the coefficient of determination and other regression statistics. This means that
these values have already been used to calculate other components in this table and are
therefore rarely interpreted in regression analysis, except for the Significance F values.
The values are broken down into the components for Regression and Residual, and then
summed as Total. The components in the ANOVA table are the following:

• df: Degrees of freedom

• SS: Sum of squares. Calculated using the formulae discussed earlier in this
module.

• MS: Mean sum of squares. The sum of squares divided by the degrees of freedom.

• F: The overall F test for the null hypothesis that all non-β0 beta coefficients are 0.

• Significance F: This value is the associated p-value of the F test.

3.3 Regression coefficients


The coefficients in this table contain specific information about the overall regression line
and associated confidence intervals. Each of the columns contains information related to
the variable indicated in the first column.

The standard error, t-stat, and confidence interval columns indicate the confidence that is
held in the beta parameter estimates. The confidence level (for example, 95%) is set when
the initial test is performed.

• First column: These values will differ depending on the labels of the variables
analysed through regression analysis. The first will be the y-intercept of the
regression line, and those under it, the x-variables. If no labels were defined, the
standard labels will read “X Variable 1”, “X Variable 2”, and so on.

• Coefficients: This column contains the least squares estimates of the beta
parameters corresponding to the variables in the first column. The least squares
estimates are what have been defined as the β^ (beta-hat) coefficients thus far. By
combining the coefficients with the variable names in the first column, the
regression line equation is formed. From the example table in Figure 2, the
regression line becomes:

𝑦𝑦� = intercept + coefficient × (X variable 1) + coefficient × (X variable 2)

𝑦𝑦� = −76.48 + 8.57 × (X variable 1) −1,401.09 × (X variable 2)

• Standard error: This column contains the standard errors of the respective least
squares estimates.

© 2025 LSE
All Rights Reserved

getsmarter.com | info@getsmarter.com
+44 203 457 5774 (UK) | +1 224 249 3522 (US) | +27 21 447 7565 (SA)

Page 7 of 8
lse.ac.uk

• t-stat: This column contains the t-statistic. As you learnt in previous modules, this
statistic is used to decide between the null versus alternative hypotheses.

• p-value: This column contains the p-values for the hypothesis (t) tests of individual
beta parameters.

• Lower 95% and Upper 95%: These columns contain the lower and upper
endpoints of 95% confidence intervals of the respective parameter estimates.

4. Conclusion
Software such as Microsoft Excel allows you to quickly and repeatedly execute complex
calculations such as regression analysis.

Although software makes performing these calculations more efficient, it is important that
you fully understand the inputs and outputs involved before you conduct any analysis. By
being aware of what the different components mean, you can easily identify errors, and
thereby ensure that your regression results are reliable.

5. Bibliography
Abdey, J. 2024. Elementary Statistical Theory [ST102 Lecture material]. Department of
Statistics, The London School of Economics and Political Science.

© 2023 LSE
All Rights Reserved

getsmarter.com | info@getsmarter.com
+44 203 457 5774 (UK) | +1 224 249 3522 (US) | +27 21 447 7565 (SA)

Page 8 of 8

You might also like