Data Report

report for statistic

Uploaded by

trangnnt22402ca

Acknowledgments

We would like to express our deepest gratitude to Mr. Nguyen Phuc Son, Lecturer of the
Department of Data Analysis in Economics, for his enthusiastic guidance and support
throughout the process of conducting this data analysis report. As economics students
with limited experience and knowledge in data analysis, we were able to approach and
complete this report to the best of our ability, thanks to his dedicated mentorship and
insightful feedback.

This report has provided us with an opportunity to explore data analysis in a more
multidimensional way, expanding our understanding. Although the results and arguments
presented here may contain subjective interpretations or inaccuracies, we hope you will
recognize the effort and dedication our group has put into working with the data and
compiling this report.

Finally, we sincerely thank you once again for providing us with the essential data and
detailed instructions. These valuable resources have offered us a practical perspective,
allowing us to apply analytical models in meaningful ways. As this is our first time
conducting such a report, we understand there may be mistakes, and we appreciate your
constructive feedback to help us build a solid foundation for future reports.

Abstract
This study analyzes the factors affecting Adidas' Operating Profit in the US market by
applying multiple regression models. The main objective is to identify the key factors that
impact profitability to assist in making strategic decisions. The study uses simple linear
regression to evaluate the individual impact of each variable, then applies a multiple
regression model to examine the combined impact of factors on Operating Profit. Data
will be filtered to focus on the three most profitable retailers, analyzing their performance
in detail to make appropriate strategic recommendations. The results indicate which
factors such as product pricing, volume of units sold, and regions or sales methods have
the greatest impact on Operating Profit. Based on these results, the study will provide
strategic recommendations to optimize pricing, marketing efforts, and regional or
method-specific sales strategies, thereby improving operating profits for Adidas’ retail in
the US.
I. Introduction
1.1. Research objectives
The main objective of this research is to analyze the Adidas sales dataset for the three
retailers with the highest revenue among the six given retailers in the US between
2020 and 2021. The research also investigates the factors that influence the
Operating Profit of those three retailers and identifies which variable has the strongest
impact. Based on the findings, we give recommendations that enable the company to make
strategic decisions such as adjusting prices, focusing on high-profit regions, or optimizing
sales methods to boost overall revenue and profits.
1.2. Research questions
a. Are there any differences in the Operating Profits of the 3 chosen retailers?
b. Do the Sales Method, the Region and the number of units sold by the 3 chosen retailers
have a significant impact on their Operating Profit?
c. Does the effect of selling price on Operating Profit differ across regions? Some
regions may be more price-sensitive, and this difference may be a factor in
adjusting pricing strategies for each specific region.
d. Do different sales methods lead to different profit outcomes for each product
category? Some products may be more profitable when sold online, while
others may be more profitable when sold offline.
e. Which key factors significantly impact profitability?
f. Can a high Operating Profit be predicted from Sales Method and Units
Sold? Calculate the odds ratios.
1.3. Significance of research
This study demonstrates how to use Stata to run different scenarios that simulate
potential impacts on operating profit, and how to interpret the Stata output to make
informed decisions, so that the analysis can be translated into actionable
recommendations for the business.
II. Data analysis and Results
1) Analyze given data
To achieve the research objective, we filtered the available data and selected the
appropriate variables for three Retailers (Foot Locker, Sports Direct and West
Gear): Region, Product, Units Sold, Total Sales, Operating Profit and Sales
Method. We chose these three Retailers after using Excel to calculate the
total Operating Profit for each product across all Retailers, which showed that they have
the highest Operating Profit. We believe that focusing on them makes it easier to identify
potential factors and make recommendations that increase Adidas' profitability.
Product                       Amazon  Foot Locker     Kohl's  Sports Direct    Walmart   West Gear  Grand Total
Men's Apparel              3,331,444    9,942,405  5,945,043      8,723,915  3,166,591  13,653,633   44,763,030
Men's Athletic Footwear    4,518,030   12,409,221  5,725,763     11,935,673  4,029,258  13,228,944   51,846,888
Men's Street Footwear      8,707,658   23,060,809  9,219,820     15,837,750  5,438,666  20,537,557   82,802,261
Women's Apparel            6,280,072   17,192,901  5,596,173     17,832,961  6,348,451  15,400,413   68,650,971
Women's Athletic Footwear  2,701,608    8,477,314  4,570,693      9,688,746  3,239,052  10,298,372   38,975,785
Women's Street Footwear    3,279,692    9,639,474  5,753,761     10,313,910  3,560,034  12,548,955   45,095,827
Grand Total               28,818,503   80,722,125 36,811,253     74,332,955 25,782,053  85,667,873  332,134,761
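
As a quick check of the retailer selection, the following minimal Python sketch ranks the six retailers by the Grand Total row of the table above. The numbers are copied from the table; the variable names are only illustrative.

```python
# Grand totals per retailer, copied from the "Grand Total" row of the table above.
totals = {
    "Amazon": 28_818_503,
    "Foot Locker": 80_722_125,
    "Kohl's": 36_811_253,
    "Sports Direct": 74_332_955,
    "Walmart": 25_782_053,
    "West Gear": 85_667_873,
}

# Sort retailers by total, largest first, and keep the top three.
top3 = sorted(totals, key=totals.get, reverse=True)[:3]
print(top3)
```

The ranking gives West Gear, Foot Locker and Sports Direct, which are the three retailers kept for the rest of the analysis.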

In addition, we created an additional column, High Profit, based on the Operating Profit
column. If a value is greater than the mean of Operating Profit, we consider it
High Profit and assign a value of 1. Conversely, if the value is less than the mean of
Operating Profit, we assign a value of 0.
This is our final data: Adidas data
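
The High Profit rule described above can be sketched in a few lines of Python. This is only an illustration: the sample values are hypothetical, and because the report does not say how a value exactly equal to the mean is coded, the sketch assigns 0 in that case as an assumption.

```python
from statistics import mean

def high_profit_flags(profits):
    """Return 1 for values above the mean of `profits`, else 0."""
    avg = mean(profits)
    # Assumption: a value exactly equal to the mean is coded 0.
    return [1 if p > avg else 0 for p in profits]

sample = [1_000, 5_000, 20_000, 90_000]  # hypothetical values, not from the dataset
flags = high_profit_flags(sample)
print(flags)
```

Because Operating Profit is strongly right-skewed, far fewer than half of the rows end up above the mean, which is consistent with the High Profit mean of about 0.30 in the summary statistics.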

a. Are there any differences in Operating Profits of 3 chosen retailers?


. import excel "C:\Users\DELL\Downloads\Adidas-US-Sales-Datasets-1.xlsx", sheet("Lọc") firstrow

(14 vars, 7,043 obs)

oneway OperatingProfit Retailer, tabulate

| Summary of Operating Profit

Retailer | Mean Std. dev. Freq.

-------------+------------------------------------

Foot Locker | 30611.348 51194.485 2,637

Sports Dir.. | 36581.179 58018.483 2,032


West Gear | 36085.877 56359.735 2,374

-------------+------------------------------------

Total | 34179.036 55044.884 7,043

Analysis of variance

Source SS df MS F Prob > F

------------------------------------------------------------------------

Between groups 5.3922e+10 2 2.6961e+10 8.92 0.0001

Within groups 2.1283e+13 7040 3.0231e+09

------------------------------------------------------------------------

Total 2.1337e+13 7042 3.0299e+09

Bartlett's equal-variances test: chi2(2) = 40.7412 Prob>chi2 = 0.000

Ho: μ1 = μ2 = μ3
Hα: Not all means are equal
where: μ1 = mean of Operating Profit at Foot Locker
μ2 = mean of Operating Profit at Sports Direct
μ3 = mean of Operating Profit at West Gear
p-value = 0.0001 < 0.05
We reject Ho.
→ We can conclude that the mean Operating Profit is not the same across the 3 chosen retailers.
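
To show where the F statistic comes from, this short Python sketch recomputes it from the sums of squares and degrees of freedom printed in the analysis-of-variance table. Only values shown in the Stata output are used, so the result should agree with 8.92 up to rounding.

```python
# Values copied from the "Analysis of variance" table above.
ss_between, df_between = 5.3922e10, 2
ss_within, df_within = 2.1283e13, 7040

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_stat = ms_between / ms_within        # one-way ANOVA F statistic

print(round(f_stat, 2))
```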
Next, we check the assumptions of the Analysis of Variance.
histogram OperatingProfit, discrete

As the histogram shows, the response variable (Operating Profit) is not normally distributed.
In addition, Bartlett's test above (p = 0.000) rejects equal variances across retailers.
Therefore, we use the Kruskal-Wallis test instead, which does not assume normality.
kwallis OperatingProfit, by(Retailer)

Kruskal–Wallis equality-of-populations rank test

+----------------------------------+

| Retailer | Obs | Rank sum |

|---------------+-------+----------|

| Foot Locker | 2,637 | 8.84e+06 |

| Sports Direct | 2,032 | 7.34e+06 |

| West Gear | 2,374 | 8.63e+06 |

+----------------------------------+
chi2(2) = 29.504

Prob = 0.0001

chi2(2) with ties = 29.504

Prob = 0.0001

Ho: All populations are identical


H: Not all populations are identical
p_value= 0,0001 < 0,05
We reject Ho
 We can conclude that the Operating Profits is not the same in 3 chosen retailers
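
For readers unfamiliar with the rank test, here is a minimal Python sketch of the Kruskal-Wallis H statistic without the tie correction. The toy data are hypothetical, and the function assumes all values are distinct; it is not a replacement for `kwallis`, which also handles ties.

```python
def kruskal_wallis_h(groups):
    """H statistic without tie correction; assumes all values are distinct."""
    pooled = sorted(v for g in groups for v in g)
    rank = {v: i + 1 for i, v in enumerate(pooled)}  # rank 1 = smallest value
    n_total = len(pooled)
    # Sum over groups of (rank sum)^2 / group size.
    term = sum(sum(rank[v] for v in g) ** 2 / len(g) for g in groups)
    return 12 / (n_total * (n_total + 1)) * term - 3 * (n_total + 1)

# Hypothetical toy data: three fully separated groups.
h = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(round(h, 2))
```

The statistic is compared with a chi-squared distribution with (number of groups - 1) degrees of freedom, which is the chi2(2) reported in the output above.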
b. Do the Sales Method, the Region and the number of units sold by the 3 chosen retailers
have a significant impact on their Operating Profit?
. describe

Contains data

Observations: 7,043

Variables: 14

-------------------------------------------------------------------------------------------------
-------------------------------------

Variable Storage Display Value

name type format label Variable label

-------------------------------------------------------------------------------------------------
-------------------------------------

Retailer str13 %13s Retailer

RetailerID long %10.0g Retailer ID

InvoiceDate int %td.. Invoice Date

Region str9 %9s Region

State str14 %14s State

City str14 %14s City

Product str25 %25s Product

PriceperUnit double %14.2f Price per Unit

UnitsSold int %10.0gc Units Sold

TotalSales double %10.0g Total Sales

OperatingProfit double %10.0g Operating Profit


OperatingMargin double %4.2f Operating Margin

SalesMethod str8 %9s Sales Method

HighProfit byte %10.0g High Profit

-------------------------------------------------------------------------------------------------
-------------------------------------

Sorted by:

Note: Dataset has changed since last saved.

. sum

Variable | Obs Mean Std. dev. Min Max

-------------+---------------------------------------------------------

Retailer | 0

RetailerID | 7,043 1171788 27355.06 1128299 1197831

InvoiceDate | 7,043 22403.05 175.3224 21915 22645

Region | 0

State | 0

-------------+---------------------------------------------------------

City | 0

Product | 0

PriceperUnit | 7,043 44.65327 15.00465 7 110

UnitsSold | 7,043 253.7656 215.3128 0 1275

TotalSales | 7,043 91655.55 142157.9 0 825000

-------------+---------------------------------------------------------

OperatingP~t | 7,043 34179.04 55044.88 0 382500

OperatingM~n | 7,043 .4256851 .0982308 .1 .8

SalesMethod | 0

HighProfit | 7,043 .302144 .4592199 0 1

We convert the 2 categorical variables, Sales Method and Region, into dummy
variables.
. tabulate SalesMethod, generate (SalesMethod)

Sales |

Method | Freq. Percent Cum.

------------+-----------------------------------

In-store | 1,441 20.46 20.46


Online | 3,485 49.48 69.94

Outlet | 2,117 30.06 100.00

------------+-----------------------------------

Total | 7,043 100.00

. tabulate Region, generate (Region)

Region | Freq. Percent Cum.

------------+-----------------------------------

Midwest | 1,492 21.18 21.18

Northeast | 1,599 22.70 43.89

South | 1,284 18.23 62.12

Southeast | 940 13.35 75.47

West | 1,728 24.53 100.00

------------+-----------------------------------

Total | 7,043 100.00

. rename SalesMethod1 Instore

. rename SalesMethod2 Online

. rename SalesMethod3 Outlet

. rename Region1 Midwest

. rename Region2 Northeast

. rename Region3 South

. rename Region4 Southeast

. rename Region5 West
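
What `tabulate ..., generate()` produces can be illustrated with a small Python sketch: one 0/1 indicator per category. The function name and the sample rows are hypothetical.

```python
def make_dummies(values):
    """Return {category: [0/1, ...]} with one indicator list per distinct category."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}

methods = ["Online", "In-store", "Outlet", "Online"]  # hypothetical rows
dummies = make_dummies(methods)
print(dummies)
```

Exactly one indicator equals 1 in each row, which is why one category is normally left out as the baseline when several dummies of the same variable enter one regression together with a constant.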


. regress OperatingProfit UnitsSold

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 26537.10

Model | 1.6863e+13 1 1.6863e+13 Prob > F = 0.0000

Residual | 4.4741e+12 7,041 635438912 R-squared = 0.7903

-------------+---------------------------------- Adj R-squared = 0.7903

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 25208

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

UnitsSold | 227.272 1.395144 162.90 0.000 224.5371 230.0069

_cons | -23494.77 464.2917 -50.60 0.000 -24404.92 -22584.62

------------------------------------------------------------------------------

Simple linear regression equation: OperatingProfit = 227.272 × UnitsSold - 23494.77

Ho: β1 = 0
Hα: β1 not equal to 0
• Based on the output above, F(1, 7041) = 26537.10 and Prob > F = 0.0000. Because the
p-value < 0.05, we reject Ho and conclude that there is a significant relationship
between UnitsSold and OperatingProfit.
• R-squared = 79.03% and Adj R-squared = 79.03%. These values are close to 1, so the
model provides a good fit to the data.
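
The key figures in this output can be reproduced by hand. The Python sketch below uses only numbers printed in the regression table; `predicted_profit` is an illustrative helper name, and 500 units is a hypothetical input.

```python
# Values copied from the regression output above.
ss_model, ss_total = 1.6863e13, 2.1337e13
coef, std_err, cons = 227.272, 1.395144, -23494.77

r_squared = ss_model / ss_total   # share of variance explained
t_stat = coef / std_err           # t statistic of the slope
f_from_t = t_stat ** 2            # with one regressor, F equals t squared

def predicted_profit(units_sold):
    """Fitted Operating Profit for a given number of units sold."""
    return cons + coef * units_sold

print(round(r_squared, 4), round(t_stat, 2), round(predicted_profit(500), 2))
```

Note that the fitted line gives a negative Operating Profit for small orders (below roughly 103 units), although the observed minimum is 0, so the equation should not be read literally at the low end of the range.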

. predict residual, residuals

. ttest residual ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residual | 7,043 -4.86e-06 300.3497 25206.12 -588.7758 588.7758

------------------------------------------------------------------------------

mean = mean(residual) t = -0.0000

H0: mean = 0 Degrees of freedom = 7042


Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

Ho: Mean = 0
Hα: Mean not equal to 0
• Based on the output above, the mean of the residuals is -4.86e-06, which is nearly 0,
and Pr(|T| > |t|) = 1.0000. We cannot reject Ho: the residuals have a mean of 0, which
is consistent with the linear regression assumption.
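
One caveat is worth illustrating: when a regression includes a constant, least-squares residuals average to zero by construction, so this check passes even for a poorly specified model. The Python sketch below fits a straight line to hypothetical, clearly curved data and still obtains a residual mean of zero up to rounding error.

```python
from statistics import mean

def ols_residuals(x, y):
    """Residuals from a least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]

# Hypothetical data with a clearly non-linear (quadratic) pattern.
x = [1, 2, 3, 4, 5, 6]
y = [1, 4, 9, 16, 25, 36]
res = ols_residuals(x, y)
print(abs(mean(res)) < 1e-9)
```

For that reason the normality and constant-variance checks that follow carry more information about model adequacy than the zero-mean test.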

. sktest residual

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual | 7,043 0.0000 0.0000 . .

Ho: Residuals follow normal distribution


Hα: Residuals do not follow normal distribution
• Based on the output above, Pr(skewness) and Pr(kurtosis) are both 0.0000 < 0.05
(Stata does not report the joint statistic here). We reject Ho. The residuals do not
follow a normal distribution.
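
The two quantities that this test examines can be computed directly. The Python sketch below uses the simple moment-based definitions on hypothetical toy samples; it does not reproduce the adjusted test statistics that `sktest` reports.

```python
from statistics import mean

def skewness_kurtosis(values):
    """Moment-based skewness and kurtosis (a normal distribution has 0 and 3)."""
    m = mean(values)
    n = len(values)
    m2 = sum((v - m) ** 2 for v in values) / n
    m3 = sum((v - m) ** 3 for v in values) / n
    m4 = sum((v - m) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

skew, kurt = skewness_kurtosis([1, 2, 3, 4, 5])        # symmetric toy sample
skew_right, _ = skewness_kurtosis([1, 1, 1, 2, 10])    # long right tail
print(skew, kurt, skew_right > 0)
```

A long right tail, as in the second toy sample, gives positive skewness; profit data with many small values and a few very large ones typically behave this way.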

. swilk residual

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual | 7,043 0.88172 433.948 16.099 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.


Ho: Residuals follow normal distribution
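Hα: Residuals do not follow normal distribution
• Based on the output above, the p-value < 0.05. We reject Ho. The residuals do not
follow a normal distribution. Note that n = 7,043 lies outside the range (4 <= n <= 2000)
given in the note of the output, so this result is only supporting evidence.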

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 7650.05

Prob > chi2 = 0.0000

• Based on the output above, the p-value < 0.05. We reject Ho. The variance is not
constant (heteroskedasticity).
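
The idea behind this test can be sketched in Python. The version below is the classical Breusch-Pagan score statistic with the fitted values as the only explanatory variable for the variance; it is assumed, not confirmed from the source, that this corresponds to the default form shown in the output (normal error terms, fitted values). The data are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares intercept and slope for y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    b = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
        (xi - x_bar) ** 2 for xi in x
    )
    return y_bar - b * x_bar, b

def breusch_pagan(x, y):
    """Score statistic: half the explained sum of squares from regressing
    scaled squared residuals on the fitted values."""
    a, b = fit_line(x, y)
    fitted = [a + b * xi for xi in x]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    sigma2 = sum(e ** 2 for e in resid) / len(resid)
    g = [e ** 2 / sigma2 for e in resid]          # scaled squared residuals
    a2, b2 = fit_line(fitted, g)                  # auxiliary regression
    g_bar = mean(g)
    return sum((a2 + b2 * fi - g_bar) ** 2 for fi in fitted) / 2

x = list(range(1, 9))
steady = [xi + e for xi, e in zip(x, [1, -1, 1, -1, 1, -1, 1, -1])]   # constant spread
growing = [xi + e for xi, e in zip(x, [1, -1, 2, -2, 3, -3, 4, -4])]  # spread grows with x
print(breusch_pagan(x, steady) < breusch_pagan(x, growing))
```

The statistic is compared with a chi-squared distribution with 1 degree of freedom; a large value, as in the output above, indicates that the residual spread changes with the fitted values.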

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1816.11 2 0.0000

Skewness | 376.41 1 0.0000

Kurtosis |-3152147.49 1 1.0000

---------------------+----------------------------

Total |-3149954.97 4 1.0000

--------------------------------------------------
• Based on the output above, the p-value for heteroskedasticity < 0.05. The variance is
not constant.
→ Because the normality and constant-variance assumptions are violated, the standard errors
of this model are unreliable, so we cannot conclude that Units Sold has a significant
impact on Operating Profit.

. regress OperatingProfit Instore

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 971.46

Model | 2.5870e+12 1 2.5870e+12 Prob > F = 0.0000

Residual | 1.8750e+13 7,041 2.6630e+09 R-squared = 0.1212

-------------+---------------------------------- Adj R-squared = 0.1211

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 51604

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

Instore | 47508.42 1524.254 31.17 0.000 44520.42 50496.42

_cons | 24458.8 689.4621 35.48 0.000 23107.25 25810.35

------------------------------------------------------------------------------

Simple linear regression equation: OperatingProfit = 47508.42 × Instore + 24458.8

Ho: β1 = 0
Hα: β1 not equal to 0
• Based on the output above, F(1, 7041) = 971.46 and Prob > F = 0.0000. Because the
p-value < 0.05, we reject Ho: there is a significant relationship between Instore
and OperatingProfit.
• R-squared = 12.12% and Adj R-squared = 12.11%. These values are low, so the model does
not provide a good fit to the data.
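
With a single 0/1 regressor, the constant is the mean of the reference group and the coefficient is the difference between the two group means. The Python sketch below uses only values from the output and the Sales Method frequency table, and checks them against the overall mean of Operating Profit reported earlier (34179.04).

```python
# Values copied from the regression output and the Sales Method frequency table.
cons, coef = 24458.8, 47508.42
n_instore, n_total = 1441, 7043

mean_other = cons             # fitted mean when Instore = 0 (Online and Outlet rows)
mean_instore = cons + coef    # fitted mean when Instore = 1

# The size-weighted average of the two group means should give back the overall mean.
overall = (n_instore * mean_instore + (n_total - n_instore) * mean_other) / n_total
print(round(mean_instore, 2), round(overall, 2))
```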

. predict residual1, residuals

. ttest residual1 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~1 | 7,043 .0001841 614.8543 51600.16 -1205.299 1205.3


------------------------------------------------------------------------------

mean = mean(residual1) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

Ho: Mean = 0
Hα: Mean not equal to 0
• Based on the output above, the mean of the residuals is 0.0001841, which is nearly 0,
and Pr(|T| > |t|) = 1.0000. We cannot reject Ho: the residuals have a mean of 0, which
is consistent with the linear regression assumption.

. sktest residual1

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual1 | 7,043 0.0000 0.0000 . .

Ho: Residuals follow normal distribution


Hα: Residuals do not follow normal distribution
• Based on the output above, Pr(skewness) and Pr(kurtosis) are both 0.0000 < 0.05
(Stata does not report the joint statistic here). We reject Ho. The residuals do not
follow a normal distribution.

. swilk residual1

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual1 | 7,043 0.65223 1275.929 18.958 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.


Ho: Residuals follow normal distribution
Hα: Residuals do not follow normal distribution
• Based on the output above, the p-value < 0.05. We reject Ho. The residuals do not
follow a normal distribution.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 45.42

Prob > chi2 = 0.0000

• Based on the output above, the p-value < 0.05. We reject Ho. The variance is not
constant (heteroskedasticity).

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 9.11 1 0.0025

Skewness | 247.92 1 0.0000

Kurtosis | -4.79e+07 1 1.0000

---------------------+----------------------------

Total | -4.79e+07 3 1.0000

--------------------------------------------------

• Based on the output above, the p-value for heteroskedasticity < 0.05. The variance is
not constant.
→ Because the normality and constant-variance assumptions are violated, we cannot conclude
that In-store sales have a significant impact on Operating Profit.
. regress OperatingProfit Online

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 530.40

Model | 1.4947e+12 1 1.4947e+12 Prob > F = 0.0000

Residual | 1.9842e+13 7,041 2.8181e+09 R-squared = 0.0701

-------------+---------------------------------- Adj R-squared = 0.0699

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 53086

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

Online | -29137.56 1265.177 -23.03 0.000 -31617.68 -26657.43

_cons | 48596.81 889.967 54.61 0.000 46852.21 50341.41

------------------------------------------------------------------------------

Simple linear regression equation: OperatingProfit = -29137.56 × Online + 48596.81

Ho: β1 = 0
Hα: β1 not equal to 0
• Based on the output above, F(1, 7041) = 530.40 and Prob > F = 0.0000. Because the
p-value < 0.05, we reject Ho: there is a significant relationship between Online
and OperatingProfit.
• R-squared = 7.01% and Adj R-squared = 6.99%. These values are low, so the model does
not provide a good fit to the data.
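
The relation between R-squared and adjusted R-squared can be verified from the sums of squares in the output. This Python sketch uses the standard adjustment formula for a model with a constant; the function name is illustrative.

```python
def adjusted_r_squared(r_squared, n_obs, n_regressors):
    """Adjusted R-squared for a model with a constant term."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_regressors - 1)

# Sums of squares copied from the regression output above.
r2 = 1.4947e12 / 2.1337e13
adj_r2 = adjusted_r_squared(r2, n_obs=7043, n_regressors=1)
print(round(r2, 4), round(adj_r2, 4))
```

With 7,043 observations and one regressor the adjustment is tiny, which is why the two figures differ only in the fourth decimal place.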

. predict residual2, residuals

. ttest residual2 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~2 | 7,043 9.51e-06 632.5097 53081.85 -1239.909 1239.909


------------------------------------------------------------------------------

mean = mean(residual2) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

Ho: Mean = 0
Hα: Mean not equal to 0
• Based on the output above, the mean of the residuals is 9.51e-06, which is nearly 0,
and Pr(|T| > |t|) = 1.0000. We cannot reject Ho: the residuals have a mean of 0, which
is consistent with the linear regression assumption.

. sktest residual2

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual2 | 7,043 0.0000 0.0000 . .

Ho: Residuals follow normal distribution


Hα: Residuals do not follow normal distribution
• Based on the output above, Pr(skewness) and Pr(kurtosis) are both 0.0000 < 0.05
(Stata does not report the joint statistic here). We reject Ho. The residuals do not
follow a normal distribution.

. swilk residual2

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual2 | 7,043 0.73269 980.737 18.261 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.


Ho: Residuals follow normal distribution
Hα: Residuals do not follow normal distribution
• Based on the output above, the p-value < 0.05. We reject Ho. The residuals do not
follow a normal distribution.
. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 270.07

Prob > chi2 = 0.0000

• Based on the output above, the p-value < 0.05. We reject Ho. The variance is not
constant (heteroskedasticity).
. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 61.35 1 0.0000

Skewness | 255.98 1 0.0000

Kurtosis | -7.46e+07 1 1.0000

---------------------+----------------------------

Total | -7.46e+07 3 1.0000

--------------------------------------------------

• Based on the output above, the p-value for heteroskedasticity < 0.05. The variance is
not constant.
→ This linear regression model cannot be used for analysis or prediction on the existing
data because it does not satisfy the normality and constant-variance assumptions.
. regress OperatingProfit Outlet

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 2.22

Model | 6.7198e+09 1 6.7198e+09 Prob > F = 0.1364

Residual | 2.1330e+13 7,041 3.0294e+09 R-squared = 0.0003

-------------+---------------------------------- Adj R-squared = 0.0002

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 55040

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

Outlet | -2130.344 1430.377 -1.49 0.136 -4934.314 673.6262

_cons | 34819.38 784.2097 44.40 0.000 33282.09 36356.67

------------------------------------------------------------------------------

Simple linear regression equation: OperatingProfit = -2130.344 × Outlet + 34819.38

Ho: β1 = 0
Hα: β1 not equal to 0
• Based on the output above, F(1, 7041) = 2.22 and Prob > F = 0.1364. Because the
p-value > 0.05, we cannot reject Ho: there is no significant relationship between
Outlet and OperatingProfit.
• R-squared = 0.03% and Adj R-squared = 0.02%. These values are very low, so the model
does not provide a good fit to the data.
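
The non-significant result can also be read from the confidence interval, which contains 0. The Python sketch below rebuilds the interval and the two-sided p-value from the coefficient and standard error in the output. It uses a normal approximation to the t distribution, an assumption that is harmless here because there are 7,041 degrees of freedom, so the figures match the Stata output only approximately.

```python
from statistics import NormalDist

# Values copied from the regression output above.
coef, std_err = -2130.344, 1430.377

z = NormalDist().inv_cdf(0.975)   # normal approximation to the t critical value
lower, upper = coef - z * std_err, coef + z * std_err
p_value = 2 * (1 - NormalDist().cdf(abs(coef / std_err)))

print(round(lower, 1), round(upper, 1), round(p_value, 3))
```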

. predict residual3, residuals

. ttest residual3 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~3 | 7,043 .0000917 655.7974 55036.22 -1285.56 1285.56

------------------------------------------------------------------------------

mean = mean(residual3) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042


Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest residual3

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual3 | 7,043 0.0000 0.0000 . .

. swilk residual3

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual3 | 7,043 0.66544 1227.444 18.856 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 4.30

Prob > chi2 = 0.0381

. imtest
Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1.07 1 0.3019

Skewness | 293.50 1 0.0000

Kurtosis | -2.60e+07 1 1.0000

---------------------+----------------------------

Total | -2.60e+07 3 1.0000

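--------------------------------------------------

→ This linear regression model cannot be used for analysis or prediction on the existing
data: the residuals are not normally distributed and the Breusch-Pagan test (p = 0.0381)
rejects constant variance, although the heteroskedasticity component of the IM-test
(p = 0.3019) does not.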

. regress OperatingProfit Midwest

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 52.06

Model | 1.5661e+11 1 1.5661e+11 Prob > F = 0.0000

Residual | 2.1180e+13 7,041 3.0081e+09 R-squared = 0.0073

-------------+---------------------------------- Adj R-squared = 0.0072

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 54846

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

Midwest | -11540.24 1599.399 -7.22 0.000 -14675.55 -8404.937

_cons | 36623.74 736.1435 49.75 0.000 35180.68 38066.8

------------------------------------------------------------------------------

Ho: β1 = 0
Hα: β1 not equal to 0
• Based on the output above, F(1, 7041) = 52.06 and Prob > F = 0.0000. Because the
p-value < 0.05, we reject Ho: there is a significant relationship between Midwest
and OperatingProfit.
• R-squared = 0.73% and Adj R-squared = 0.72%. These values are very low, so the model
does not provide a good fit to the data.

. predict residual4, residuals

. ttest residual4 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~4 | 7,043 -.0000306 653.4892 54842.5 -1281.036 1281.035

------------------------------------------------------------------------------

mean = mean(residual4) t = -0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

Ho: Mean = 0
Hα: Mean not equal to 0
• Based on the output above, the mean of the residuals is -0.0000306, which is nearly 0,
and Pr(|T| > |t|) = 1.0000. We cannot reject Ho: the residuals have a mean of 0, which
is consistent with the linear regression assumption.

. sktest residual4

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual4 | 7,043 0.0000 0.0000 . .

Ho: Residuals follow normal distribution


Hα: Residuals do not follow normal distribution
• Based on the output above, Pr(skewness) and Pr(kurtosis) are both 0.0000 < 0.05
(Stata does not report the joint statistic here). We reject Ho. The residuals do not
follow a normal distribution.

. swilk residual4

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual4 | 7,043 0.69111 1133.287 18.644 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

Ho: Residuals follow normal distribution


Hα: Residuals do not follow normal distribution
• Based on the output above, the p-value < 0.05. We reject Ho. The residuals do not
follow a normal distribution.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 291.43

Prob > chi2 = 0.0000

• Based on the output above, the p-value < 0.05. We reject Ho. The variance is not
constant (heteroskedasticity).

. imtest

Cameron & Trivedi's decomposition of IM-test


--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 73.82 1 0.0000

Skewness | 322.62 1 0.0000

Kurtosis | -4.11e+07 1 1.0000

---------------------+----------------------------

Total | -4.11e+07 3 1.0000

--------------------------------------------------

• Based on the output above, the p-value for heteroskedasticity < 0.05. The variance is
not constant.
→ Because the normality and constant-variance assumptions are violated, we cannot conclude
that there is a significant relationship between Midwest and Operating Profit.

. regress OperatingProfit Northeast

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 25.84

Model | 7.8028e+10 1 7.8028e+10 Prob > F = 0.0000

Residual | 2.1259e+13 7,041 3.0193e+09 R-squared = 0.0037

-------------+---------------------------------- Adj R-squared = 0.0035

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 54948

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

Northeast | -7945.479 1562.96 -5.08 0.000 -11009.35 -4881.608

_cons | 35982.93 744.7203 48.32 0.000 34523.05 37442.81

------------------------------------------------------------------------------

Simple linear regression equation: OperatingProfit = 35982.93 - 7945.479 × Northeast

H0: β1 = 0
Ha: β1 ≠ 0
→ Based on the information above, F(1, 7041) = 25.84 and Prob > F = 0.0000. Because
the p-value is below 0.05, we reject H0: there is a significant relationship between
Northeast and OperatingProfit.
→ R-squared = 0.37% and adjusted R-squared = 0.35%, so Northeast alone explains
almost none of the variation in OperatingProfit and the model fits the data poorly.
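The fit statistics quoted above can be recomputed from the sums of squares in the ANOVA table. The short sketch below is only an arithmetic check of the Northeast model; the variable names are ours, and the inputs are the rounded figures printed by Stata.

```python
# Figures copied from the ANOVA table of `regress OperatingProfit Northeast`.
ss_model = 7.8028e10   # Model sum of squares
ss_resid = 2.1259e13   # Residual sum of squares
df_model = 1           # one regressor (Northeast)
df_resid = 7041        # n - 2 with n = 7,043

ss_total = ss_model + ss_resid

# R-squared: share of total variation explained by the model.
r_squared = ss_model / ss_total

# Adjusted R-squared: penalises R-squared for the number of regressors.
adj_r_squared = 1 - (1 - r_squared) * (df_model + df_resid) / df_resid

# F statistic: ratio of the model mean square to the residual mean square.
f_stat = (ss_model / df_model) / (ss_resid / df_resid)

print(f"R-squared     = {r_squared:.4f}")
print(f"Adj R-squared = {adj_r_squared:.4f}")
print(f"F(1, 7041)    = {f_stat:.2f}")
```

The printed values should agree with the Stata table up to the rounding of the inputs.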

. predict residual5, residuals

. ttest residual5 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~5 | 7,043 -.0002708 654.7003 54944.14 -1283.41 1283.409

------------------------------------------------------------------------------

mean = mean(residual5) t = -0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

H0: Mean = 0
Ha: Mean ≠ 0
→ Based on the information above, the mean is -0.0002708, which is practically 0, and
Pr(|T| > |t|) = 1.0000, so we cannot reject H0. The residuals have a mean of 0, as the
linear regression model requires. Note that OLS residuals always average to zero when
the model includes a constant, so this check cannot fail and says nothing about model
quality.
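To illustrate why the residual mean is zero for any OLS fit with a constant, here is a minimal sketch on made-up numbers (the data and the helper name `fit_simple_ols` are hypothetical, not taken from the Adidas data set). It also shows that, with a 0/1 regressor such as Northeast, the slope equals the difference between the two group means.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical toy data: x is a region dummy, y is a profit figure.
x = [0, 0, 0, 1, 1, 0, 1, 0]
y = [30.0, 41.5, 28.0, 22.0, 35.5, 50.0, 19.0, 33.0]

intercept, slope = fit_simple_ols(x, y)
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
mean_residual = sum(residuals) / len(residuals)

# Group means, to compare with the fitted coefficients.
mean_y_when_0 = sum(yi for xi, yi in zip(x, y) if xi == 0) / x.count(0)
mean_y_when_1 = sum(yi for xi, yi in zip(x, y) if xi == 1) / x.count(1)

print("intercept:", intercept, "slope:", slope)
print("mean residual:", mean_residual)
```

The intercept matches the mean of the dummy = 0 group and the slope matches the gap between the two group means, while the mean residual is zero up to floating-point error.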

. sktest residual5

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual5 | 7,043 0.0000 0.0000 . .

H0: Residuals follow a normal distribution
Ha: Residuals do not follow a normal distribution
→ Based on the information above, Pr(skewness) and Pr(kurtosis) are both 0.0000, below
0.05 (Stata prints the joint statistic as missing, "."). We reject H0: the residuals do
not follow a normal distribution.
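For readers unfamiliar with what the test looks at: it compares the sample skewness and kurtosis of the residuals with the values of a normal distribution (0 and 3). The sketch below computes both moments on two small made-up samples; it is an illustration of the quantities involved, not a reimplementation of Stata's `sktest`.

```python
def skewness_and_kurtosis(values):
    """Return (skewness, kurtosis) using population central moments."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5   # 0 for a symmetric distribution
    kurtosis = m4 / m2 ** 2     # 3 for a normal distribution
    return skewness, kurtosis


# Hypothetical samples: one symmetric, one with a long right tail.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 1.0, 10.0]

sym_skew, sym_kurt = skewness_and_kurtosis(symmetric)
rs_skew, rs_kurt = skewness_and_kurtosis(right_skewed)

print("symmetric   :", sym_skew, sym_kurt)
print("right-skewed:", rs_skew, rs_kurt)
```

A single large value is enough to push both moments away from the symmetric case, which is the kind of pattern a few very profitable transactions would produce in the residuals.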

. swilk residual5

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual5 | 7,043 0.67976 1174.904 18.740 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

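H0: Residuals follow a normal distribution
Ha: Residuals do not follow a normal distribution
→ Based on the information above, the p-value is below 0.05, so we reject H0: the
residuals do not follow a normal distribution. As the note in the output states, the
approximation is valid for 4 ≤ n ≤ 2000, so with n = 7,043 this p-value should be read
as approximate.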

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 47.57

Prob > chi2 = 0.0000

→ Based on the information above, the p-value is below 0.05, so we reject H0: the
error variance is not constant (heteroskedasticity).
. imtest

Cameron & Trivedi's decomposition of IM-test


--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 11.79 1 0.0006

Skewness | 286.32 1 0.0000

Kurtosis | -4.24e+07 1 1.0000

---------------------+----------------------------

Total | -4.24e+07 3 1.0000

--------------------------------------------------

→ Based on the information above, the heteroskedasticity p-value is below 0.05, so the
error variance is not constant.
→ Because the residuals are non-normal and heteroskedastic, this simple model alone
does not let us firmly conclude that Northeast has a significant impact on Operating
Profit.
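When the Breusch–Pagan test rejects constant variance, one common response is to keep the OLS coefficients but report heteroskedasticity-robust standard errors (in Stata, the `vce(robust)` option of `regress`). The report does not do this. The sketch below shows the idea for a one-regressor model on hypothetical toy data, using the HC1 small-sample correction; the function name is ours.

```python
import math


def ols_slope_with_robust_se(x, y):
    """Return (slope, classical_se, robust_se) for y = a + b*x.

    robust_se is the HC1 heteroskedasticity-robust standard error.
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    dx = [xi - x_bar for xi in x]
    sxx = sum(d * d for d in dx)
    slope = sum(d * (yi - y_bar) for d, yi in zip(dx, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]

    # Classical SE assumes one common error variance for all observations.
    sigma2 = sum(e * e for e in resid) / (n - 2)
    classical_se = math.sqrt(sigma2 / sxx)

    # Robust SE lets each observation carry its own squared residual.
    meat = sum((d * e) ** 2 for d, e in zip(dx, resid))
    robust_se = math.sqrt(n / (n - 2) * meat / sxx ** 2)
    return slope, classical_se, robust_se


# Hypothetical data: the small group (x = 1) is far noisier than the large one.
x = [0, 0, 0, 0, 0, 0, 1, 1]
y = [9.0, 11.0, 9.0, 11.0, 9.0, 11.0, 0.0, 20.0]

slope, classical_se, robust_se = ols_slope_with_robust_se(x, y)
print("slope:", slope)
print("classical SE:", classical_se)
print("robust SE   :", robust_se)
```

In this toy case the robust standard error is larger than the classical one, so a t-test based on the classical value would overstate the evidence. The direction and size of the gap in the Adidas data are not known without rerunning the regressions.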

. regress OperatingProfit South

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 2.34

Model | 7.0836e+09 1 7.0836e+09 Prob > F = 0.1263

Residual | 2.1330e+13 7,041 3.0294e+09 R-squared = 0.0003

-------------+---------------------------------- Adj R-squared = 0.0002

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 55040

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

South | 2597.458 1698.629 1.53 0.126 -732.3659 5927.282

_cons | 33705.5 725.2741 46.47 0.000 32283.74 35127.25

------------------------------------------------------------------------------

Simple linear regression equation: OperatingProfit = 33705.5 + 2597.458 × South

H0: β1 = 0
Ha: β1 ≠ 0
→ Based on the information above, F(1, 7041) = 2.34 and Prob > F = 0.1263. Because
the p-value is above 0.05, we cannot reject H0: there is no statistically significant
relationship between South and OperatingProfit.
→ R-squared = 0.03% and adjusted R-squared = 0.02%, so the model fits the data poorly.

. predict residual6, residuals

. ttest residual6 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~6 | 7,043 -.0001517 655.7918 55035.75 -1285.55 1285.549

------------------------------------------------------------------------------

mean = mean(residual6) t = -0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest residual6

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual6 | 7,043 0.0000 0.0000 . .

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 45.65
Prob > chi2 = 0.0000

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 11.36 1 0.0008

Skewness | 297.45 1 0.0000

Kurtosis | -2.58e+07 1 1.0000

---------------------+----------------------------

Total | -2.58e+07 3 1.0000

--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

South | 1.00 1.000000

-------------+----------------------

Mean VIF | 1.00

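→ The VIF of 1.00 only reflects that the model has a single regressor. Since South is
not significant (Prob > F = 0.1263), this model has no practical value for explaining or
predicting Operating Profit.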
. regress OperatingProfit Southeast

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 129.03

Model | 3.8398e+11 1 3.8398e+11 Prob > F = 0.0000

Residual | 2.0953e+13 7,041 2.9758e+09 R-squared = 0.0180

-------------+---------------------------------- Adj R-squared = 0.0179

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 54551

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------
Southeast | 21712 1911.382 11.36 0.000 17965.11 25458.88

_cons | 31281.23 698.2849 44.80 0.000 29912.38 32650.07

------------------------------------------------------------------------------

Simple linear regression equation: OperatingProfit = 31281.23 + 21712 × Southeast

H0: β1 = 0
Ha: β1 ≠ 0
→ Based on the information above, F(1, 7041) = 129.03 and Prob > F = 0.0000. Because
the p-value is below 0.05, we reject H0: there is a significant relationship between
Southeast and OperatingProfit.
→ R-squared = 1.80% and adjusted R-squared = 1.79%, so the model still fits the data
poorly.
. predict residual7, residuals

. ttest residual7 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~7 | 7,043 .0000305 649.9721 54547.33 -1274.141 1274.141

------------------------------------------------------------------------------

mean = mean(residual7) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

H0: Mean = 0
Ha: Mean ≠ 0
→ Based on the information above, the mean is 0.0000305, which is practically 0, and
Pr(|T| > |t|) = 1.0000, so we cannot reject H0. The residuals have a mean of 0, as the
linear regression model requires.

. sktest residual7
Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual7 | 7,043 0.0000 0.0000 . .

H0: Residuals follow a normal distribution
Ha: Residuals do not follow a normal distribution
→ Based on the information above, Pr(skewness) and Pr(kurtosis) are both 0.0000, below
0.05 (Stata prints the joint statistic as missing, "."). We reject H0: the residuals do
not follow a normal distribution.

. swilk residual7

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual7 | 7,043 0.71740 1036.828 18.408 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

H0: Residuals follow a normal distribution
Ha: Residuals do not follow a normal distribution
→ Based on the information above, the p-value is below 0.05, so we reject H0: the
residuals do not follow a normal distribution.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 422.98
Prob > chi2 = 0.0000

→ Based on the information above, the p-value is below 0.05, so we reject H0: the
error variance is not constant (heteroskedasticity).

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 109.01 1 0.0000

Skewness | 327.17 1 0.0000

Kurtosis | -2.15e+07 1 1.0000

---------------------+----------------------------

Total | -2.15e+07 3 1.0000

--------------------------------------------------

→ Based on the information above, the heteroskedasticity p-value is below 0.05, so the
error variance is not constant.
→ Because the residuals are non-normal and heteroskedastic, this simple model alone
does not let us firmly conclude that Southeast has a significant impact on Operating
Profit.
. regress OperatingProfit West

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(1, 7041) = 2.25

Model | 6.8030e+09 1 6.8030e+09 Prob > F = 0.1340

Residual | 2.1330e+13 7,041 3.0294e+09 R-squared = 0.0003

-------------+---------------------------------- Adj R-squared = 0.0002

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 55040

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

West | 2284.057 1524.172 1.50 0.134 -703.7785 5271.892

_cons | 33618.64 754.9652 44.53 0.000 32138.68 35098.6


------------------------------------------------------------------------------

Simple linear regression equation: OperatingProfit = 33618.64 + 2284.057 × West

H0: β1 = 0
Ha: β1 ≠ 0
→ Based on the information above, F(1, 7041) = 2.25 and Prob > F = 0.1340. Because
the p-value is above 0.05, we cannot reject H0: there is no statistically significant
relationship between West and OperatingProfit.
→ R-squared = 0.03% and adjusted R-squared = 0.02%, so the model fits the data poorly.
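As a rough cross-check of the Prob > F values, note that an F statistic with 1 numerator degree of freedom is the square of a t statistic, and with 7,041 residual degrees of freedom the t distribution is very close to the standard normal. The sketch below uses that normal approximation (an assumption of this example; Stata uses the exact F distribution), so small differences in the last digits are expected.

```python
import math
from statistics import NormalDist


def approx_p_value_from_f(f_stat):
    """Approximate two-sided p-value for F(1, df) when df is large."""
    z = math.sqrt(f_stat)                 # |t| statistic implied by F
    return 2 * (1 - NormalDist().cdf(z))  # two-sided normal tail area


# F statistics copied from the simple regressions in this section.
p_northeast = approx_p_value_from_f(25.84)
p_south = approx_p_value_from_f(2.34)
p_west = approx_p_value_from_f(2.25)

for name, p in [("Northeast", p_northeast), ("South", p_south), ("West", p_west)]:
    verdict = "reject H0" if p < 0.05 else "cannot reject H0"
    print(f"{name:<10} p = {p:.4f} -> {verdict}")
```

With F = 2.25 the p-value is far above 0.05, which is consistent with the Prob > F = 0.1340 printed by Stata for West.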

. predict residual8, residuals

. ttest residual8 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~8 | 7,043 .000035 655.7962 55036.11 -1285.558 1285.558

------------------------------------------------------------------------------

mean = mean(residual8) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest residual8

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual8 | 7,043 0.0000 0.0000 . .


. swilk residual8

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual8 | 7,043 0.66628 1224.385 18.849 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 3.88

Prob > chi2 = 0.0488

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 0.96 1 0.3272

Skewness | 299.39 1 0.0000

Kurtosis | -2.63e+07 1 1.0000

---------------------+----------------------------

Total | -2.63e+07 3 1.0000

--------------------------------------------------
. estat vif

Variable | VIF 1/VIF

-------------+----------------------

West | 1.00 1.000000

-------------+----------------------

Mean VIF | 1.00

CONCLUSION: As can be seen, three variables (Outlet, South and West) do not have a
significant impact on Operating Profit. We will exclude them from the multiple linear
regression to avoid adding noise to the model, increasing its complexity and raising
the risk of overfitting.

b)

. regress OperatingProfit UnitsSold Instore Online Midwest Northeast South Southeast

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(7, 7035) = 3984.57

Model | 1.7039e+13 7 2.4342e+12 Prob > F = 0.0000

Residual | 4.2977e+12 7,035 610898703 R-squared = 0.7986

-------------+---------------------------------- Adj R-squared = 0.7984

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 24716

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

UnitsSold | 228.9376 1.557794 146.96 0.000 225.8839 231.9914

Instore | 8039.491 923.506 8.71 0.000 6229.141 9849.841

Online | 3626.105 710.689 5.10 0.000 2232.941 5019.27

Midwest | 6998.846 897.8194 7.80 0.000 5238.85 8758.843

Northeast | 6305.189 871.6857 7.23 0.000 4596.422 8013.956

South | 142.4665 925.8822 0.15 0.878 -1672.542 1957.475

Southeast | -2396.44 1021.206 -2.35 0.019 -4398.31 -394.569

_cons | -29976.86 852.9794 -35.14 0.000 -31648.96 -28304.77

------------------------------------------------------------------------------
. predict residual9, residuals

. ttest residual9 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

residu~9 | 7,043 4.60e-06 294.3675 24704.08 -577.0488 577.0488

------------------------------------------------------------------------------

mean = mean(residual9) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest residual9

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual9 | 7,043 0.0000 0.0000 . .

. swilk residual9

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual9 | 7,043 0.85947 515.587 16.556 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.


. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 8237.45

Prob > chi2 = 0.0000

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1994.69 22 0.0000

Skewness | 426.65 7 0.0000

Kurtosis |-4100332.57 1 1.0000

---------------------+----------------------------

Total |-4097911.23 30 1.0000

--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

Instore | 1.60 0.624943

Midwest | 1.55 0.644479

Northeast | 1.54 0.650490

South | 1.47 0.678740

Online | 1.46 0.687004

Southeast | 1.39 0.719166

UnitsSold | 1.30 0.771105


-------------+----------------------

Mean VIF | 1.47

→ All VIFs are below 10, so multicollinearity is not a concern, but the t-test p-value
for South (0.878) is higher than 0.05. The South region is therefore not significant in
this regression, and we recommend removing it.
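The t column of the regression table is simply each coefficient divided by its standard error. The sketch below recomputes it from the printed values and flags regressors whose |t| falls below 1.96, the approximate two-sided 5% critical value for a sample this large (the threshold constant and variable names are ours).

```python
# (coefficient, standard error) pairs copied from the seven-regressor model.
coefficients = {
    "UnitsSold": (228.9376, 1.557794),
    "Instore": (8039.491, 923.506),
    "Online": (3626.105, 710.689),
    "Midwest": (6998.846, 897.8194),
    "Northeast": (6305.189, 871.6857),
    "South": (142.4665, 925.8822),
    "Southeast": (-2396.44, 1021.206),
}

CRITICAL_T = 1.96  # approximate 5% two-sided critical value for large samples

# t statistic = coefficient / standard error.
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

# Regressors that fail the 5% significance threshold.
not_significant = [name for name, t in t_stats.items() if abs(t) < CRITICAL_T]

for name, t in t_stats.items():
    print(f"{name:<10} t = {t:8.2f}")
print("Not significant at 5%:", not_significant)
```

Only South falls below the threshold, which matches the P>|t| value of 0.878 in the table.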

. regress OperatingProfit UnitsSold Instore Online Midwest Northeast Southeast

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(6, 7036) = 4649.30

Model | 1.7039e+13 6 2.8399e+12 Prob > F = 0.0000

Residual | 4.2977e+12 7,036 610813934 R-squared = 0.7986

-------------+---------------------------------- Adj R-squared = 0.7984

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 24715

------------------------------------------------------------------------------

OperatingP~t | Coefficient Std. err. t P>|t| [95% conf. interval]

-------------+----------------------------------------------------------------

UnitsSold | 228.9529 1.554529 147.28 0.000 225.9055 232.0002

Instore | 8013.929 908.3785 8.82 0.000 6233.234 9794.625

Online | 3618.228 708.7934 5.10 0.000 2228.779 5007.677

Midwest | 6944.887 826.444 8.40 0.000 5324.808 8564.966

Northeast | 6249.552 793.1002 7.88 0.000 4694.837 7804.267

Southeast | -2455.721 945.6773 -2.60 0.009 -4309.533 -601.9087

_cons | -29913.66 747.509 -40.02 0.000 -31379 -28448.32

------------------------------------------------------------------------------

. predict residual10, residuals

. ttest residual10 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

resid~10 | 7,043 2.07e-06 294.368 24704.12 -577.0498 577.0498


------------------------------------------------------------------------------

mean = mean(residual10) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest residual10

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

residual10 | 7,043 0.0000 0.0000 . .

. swilk residual10

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

residual10 | 7,043 0.85940 515.846 16.558 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 8239.34

Prob > chi2 = 0.0000


. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1904.24 18 0.0000

Skewness | 411.84 6 0.0000

Kurtosis |-4178219.02 1 1.0000

---------------------+----------------------------

Total |-4175902.94 25 1.0000

--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

Instore | 1.55 0.645841

Online | 1.45 0.690587

Midwest | 1.31 0.760501

UnitsSold | 1.29 0.774240

Northeast | 1.27 0.785677

Southeast | 1.19 0.838512

-------------+----------------------

Mean VIF | 1.34

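→ All predictors are significant and every VIF is below 10, so this linear regression
model can be used to analyze the existing data, although the rejected normality and
constant-variance tests mean its standard errors should be treated with caution.
→ The two sales methods (In-store and Online), the regions (Midwest, Northeast,
Southeast) and the number of units sold by the three chosen retailers all have a
significant impact on Operating Profit.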
c. Does the effect of selling price on Operating Profit differ across regions? Some
regions may be more price-sensitive, and such differences could guide region-specific
pricing strategies.
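As a sketch of the specification estimated in the regressions that follow, shown here for Northeast (the symbols β and ε are notation introduced for this illustration, not the report's own):

```latex
\begin{align*}
\text{OperatingProfit}_i &= \beta_0 + \beta_1\,\text{UnitsSold}_i
  + \beta_2\,\text{Instore}_i + \beta_3\,\text{Online}_i \\
&\quad + \beta_4\,\text{Midwest}_i + \beta_5\,\text{Northeast}_i
  + \beta_6\,\text{Southeast}_i \\
&\quad + \beta_7\,(\text{PriceperUnit}_i \times \text{Northeast}_i)
  + \varepsilon_i
\end{align*}
```

Because PriceperUnit does not enter on its own, β7 is the price slope inside Northeast while the price slope in every other region is held at zero. The shift of Northeast relative to the omitted regions at a price p is β5 + β7·p.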
. gen Price_Region_Interaction1 = PriceperUnit*Northeast
. gen Price_Region_Interaction2 = PriceperUnit* Midwest

. gen Price_Region_Interaction3 = PriceperUnit* Southeast

. regress OperatingProfit UnitsSold Instore Online Midwest Northeast Southeast Price_Region_Interaction1

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(7, 7035) = 4025.96

Model | 1.7075e+13 7 2.4392e+12 Prob > F = 0.0000

Residual | 4.2623e+12 7,035 605872362 R-squared = 0.8002

-------------+---------------------------------- Adj R-squared = 0.8000

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 24614

-------------------------------------------------------------------------------------------

OperatingProfit | Coefficient Std. err. t P>|t| [95% conf. interval]

--------------------------+----------------------------------------------------------------

UnitsSold | 227.3143 1.563009 145.43 0.000 224.2503 230.3782

Instore | 7690.098 905.6887 8.49 0.000 5914.676 9465.521

Online | 3126.803 708.8441 4.41 0.000 1737.255 4516.351

Midwest | 6916.19 823.1028 8.40 0.000 5302.661 8529.72

Northeast | -12993.68 2639.35 -4.92 0.000 -18167.6 -7819.755

Southeast | -2218.65 942.3551 -2.35 0.019 -4065.95 -371.3504

Price_Region_Interaction1 | 418.7923 54.80781 7.64 0.000 311.3525 526.2322

_cons | -29194.18 750.41 -38.90 0.000 -30665.21 -27723.15

-------------------------------------------------------------------------------------------

. predict re1, residuals

. ttest re1 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

re1 | 7,043 9.11e-06 293.154 24602.24 -574.67 574.67

------------------------------------------------------------------------------
mean = mean(re1) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest re1

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

re1 | 7,043 0.0000 0.0000 . .

. swilk re1

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

re1 | 7,043 0.86029 512.560 16.541 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 8089.92

Prob > chi2 = 0.0000


. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1913.99 23 0.0000

Skewness | 417.23 7 0.0000

Kurtosis |-4806898.07 1 1.0000

---------------------+----------------------------

Total |-4804566.84 31 1.0000

--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

Northeast | 14.21 0.070369

Price_Regi~1 | 13.85 0.072208

Instore | 1.55 0.644427

Online | 1.46 0.684902

UnitsSold | 1.32 0.759666

Midwest | 1.31 0.760485

Southeast | 1.19 0.837603

-------------+----------------------

Mean VIF | 4.99

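→ This linear regression model cannot be used to analyze or predict from the existing
data: the VIFs of Northeast (14.21) and Price_Region_Interaction1 (13.85) exceed 10,
which indicates strong multicollinearity between the dummy and its interaction term.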
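The high VIF arises because the interaction term is zero wherever the dummy is zero and large wherever it is one, so the two columns move almost together. The sketch below first backs out the auxiliary R-squared implied by the reported VIF, then uses hypothetical toy prices to show that centering the price before multiplying is one common way to weaken this link. Centering is not something the report does, and it changes the meaning of the dummy coefficient; in the two-column toy case, VIF reduces to 1 / (1 - r²).

```python
import math


def pearson_r(a, b):
    """Pearson correlation between two equally long lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# VIF = 1 / (1 - R^2), so the reported VIF of Northeast implies this R^2.
implied_r2 = 1 - 1 / 14.21

# Hypothetical toy data: a region dummy and a price per unit.
region = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
price = [40.0, 50.0, 60.0, 50.0, 45.0, 55.0, 35.0, 65.0, 50.0, 55.0]
mean_price = sum(price) / len(price)

raw_interaction = [d * p for d, p in zip(region, price)]
centered_interaction = [d * (p - mean_price) for d, p in zip(region, price)]

r_raw = pearson_r(region, raw_interaction)
r_centered = pearson_r(region, centered_interaction)
vif_raw = 1 / (1 - r_raw ** 2)
vif_centered = 1 / (1 - r_centered ** 2)

print(f"implied auxiliary R^2 for Northeast: {implied_r2:.4f}")
print(f"raw interaction     : r = {r_raw:.3f}, VIF = {vif_raw:.2f}")
print(f"centered interaction: r = {r_centered:.3f}, VIF = {vif_centered:.2f}")
```

Whether centering would bring the VIFs in the actual models below 10 can only be confirmed by rerunning the regressions on the real data.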
. regress OperatingProfit UnitsSold Instore Online Midwest Northeast Southeast
Price_Region_Interaction2

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(7, 7035) = 4079.43

Model | 1.7119e+13 7 2.4456e+12 Prob > F = 0.0000

Residual | 4.2175e+12 7,035 599500965 R-squared = 0.8023

-------------+---------------------------------- Adj R-squared = 0.8021


Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 24485

-------------------------------------------------------------------------------------------

OperatingProfit | Coefficient Std. err. t P>|t| [95% conf. interval]

--------------------------+----------------------------------------------------------------

UnitsSold | 227.6294 1.544311 147.40 0.000 224.6021 230.6568

Instore | 8549.143 901.1161 9.49 0.000 6782.684 10315.6

Online | 3795.679 702.3665 5.40 0.000 2418.829 5172.529

Midwest | -17923.9 2300.763 -7.79 0.000 -22434.09 -13413.71

Northeast | 6075.868 785.8648 7.73 0.000 4535.336 7616.4

Southeast | -2408.037 936.8879 -2.57 0.010 -4244.619 -571.4542

Price_Region_Interaction2 | 616.767 53.32557 11.57 0.000 512.2328 721.3012

_cons | -29693.35 740.7993 -40.08 0.000 -31145.54 -28241.16

-------------------------------------------------------------------------------------------

. predict re2, residuals

. ttest re2 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

re2 | 7,043 -.0000105 291.6085 24472.54 -571.6404 571.6403

------------------------------------------------------------------------------

mean = mean(re2) t = -0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest re2

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2


-------------+-----------------------------------------------------------------

re2 | 7,043 0.0000 0.0000 . .

. swilk re2

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

re2 | 7,043 0.85854 518.980 16.574 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 8031.42

Prob > chi2 = 0.0000

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1906.77 23 0.0000

Skewness | 428.80 7 0.0000

Kurtosis |-4635946.66 1 1.0000

---------------------+----------------------------
Total |-4633611.08 31 1.0000

--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

Midwest | 10.38 0.096308

Price_Regi~2 | 9.91 0.100924

Instore | 1.55 0.644138

Online | 1.45 0.690258

UnitsSold | 1.30 0.769989

Northeast | 1.27 0.785390

Southeast | 1.19 0.838496

-------------+----------------------

Mean VIF | 3.87

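→ This linear regression model cannot be used to analyze or predict from the existing
data: the VIF of Midwest (10.38) exceeds 10 and that of Price_Region_Interaction2
(9.91) is only just below it, which indicates strong multicollinearity between the dummy
and its interaction term.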

. regress OperatingProfit UnitsSold Instore Online Midwest Northeast Southeast Price_Region_Interaction3

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(7, 7035) = 4244.46

Model | 1.7252e+13 7 2.4646e+12 Prob > F = 0.0000

Residual | 4.0849e+12 7,035 580654037 R-squared = 0.8086

-------------+---------------------------------- Adj R-squared = 0.8084

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 24097

-------------------------------------------------------------------------------------------

OperatingProfit | Coefficient Std. err. t P>|t| [95% conf. interval]

--------------------------+----------------------------------------------------------------

UnitsSold | 226.7268 1.520119 149.15 0.000 223.7469 229.7067

Instore | 6797.627 887.9445 7.66 0.000 5056.988 8538.266

Online | 2697.756 692.7438 3.89 0.000 1339.77 4055.743

Midwest | 7073.14 805.8101 8.78 0.000 5493.509 8652.771

Northeast | 6253.562 773.2721 8.09 0.000 4737.716 7769.408


Southeast | -54031.66 2847.634 -18.97 0.000 -59613.88 -48449.44

Price_Region_Interaction3 | 1065.639 55.66699 19.14 0.000 956.5153 1174.763

_cons | -28726.67 731.4536 -39.27 0.000 -30160.54 -27292.8

-------------------------------------------------------------------------------------------

. predict re3, residuals

. ttest re3 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

re3 | 7,043 -.0000242 286.9881 24084.78 -562.5831 562.5831

------------------------------------------------------------------------------

mean = mean(re3) t = -0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest re3

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

re3 | 7,043 0.0000 0.0000 . .

. swilk re3

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

re3 | 7,043 0.87010 476.592 16.348 0.00000


Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 7332.46

Prob > chi2 = 0.0000

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1996.72 23 0.0000

Skewness | 412.04 7 0.0000

Kurtosis |-3889444.66 1 1.0000

---------------------+----------------------------

Total |-3887035.90 31 1.0000

--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

Southeast | 11.38 0.087909

Price_Regi~3 | 11.36 0.088037

Instore | 1.56 0.642534


Online | 1.46 0.687260

Midwest | 1.32 0.760448

UnitsSold | 1.30 0.769709

Northeast | 1.27 0.785677

-------------+----------------------

Mean VIF | 4.23

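→ This linear regression model cannot be used to analyze or predict from the existing
data: the VIFs of Southeast (11.38) and Price_Region_Interaction3 (11.36) exceed 10,
which indicates strong multicollinearity between the dummy and its interaction term.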
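Taking the point estimates of the three interaction models at face value (a strong assumption, given the multicollinearity noted above), the shift of a region relative to the omitted regions at a given price is the dummy coefficient plus the interaction coefficient times the price. The sketch below computes the price at which that shift changes sign; the function and dictionary names are ours.

```python
# (region dummy coefficient, price interaction coefficient) from each model.
interaction_models = {
    "Northeast": (-12993.68, 418.7923),
    "Midwest": (-17923.9, 616.767),
    "Southeast": (-54031.66, 1065.639),
}


def region_effect(dummy_coef, interaction_coef, price):
    """Estimated profit shift of the region at a given price per unit."""
    return dummy_coef + interaction_coef * price


# Price at which the estimated region shift equals zero.
break_even_price = {
    region: -dummy_coef / interaction_coef
    for region, (dummy_coef, interaction_coef) in interaction_models.items()
}

for region, price in break_even_price.items():
    print(f"{region:<10} break-even price per unit = {price:.2f}")
```

Above the break-even price the estimated shift for the region is positive, below it negative. The much higher break-even value for Southeast reflects its large negative dummy coefficient.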
d. Do different sales methods lead to different profit outcomes for each product
category? Some products may be more profitable when sold online, while others may be
more profitable when sold offline.
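To answer this, each product category is turned into a 0/1 indicator and multiplied by the Online indicator. The sketch below mimics that construction on a few hypothetical rows (the records and the helper name are made up; only the column name Product_SalesMethod_1 follows the report).

```python
# Hypothetical rows: a product label and an Online indicator.
records = [
    {"Product": "Men's Apparel", "Online": 1},
    {"Product": "Women's Apparel", "Online": 0},
    {"Product": "Men's Apparel", "Online": 0},
    {"Product": "Men's Street Footwear", "Online": 1},
]


def add_product_dummies(rows):
    """Add one 0/1 column per product category and return the categories."""
    categories = sorted({row["Product"] for row in rows})
    for row in rows:
        for category in categories:
            row[f"is_{category}"] = int(row["Product"] == category)
    return categories


categories = add_product_dummies(records)

# Interaction term: 1 only for Men's Apparel sold online.
for row in records:
    row["Product_SalesMethod_1"] = row["is_Men's Apparel"] * row["Online"]

interaction_column = [row["Product_SalesMethod_1"] for row in records]
print(categories)
print(interaction_column)
```

The interaction column equals 1 only for rows that are both Men's Apparel and sold online, so its coefficient measures the extra profit of that specific combination.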
. tabulate Product, generate (Product)

Product | Freq. Percent Cum.

--------------------------+-----------------------------------

Men's Apparel | 1,165 16.54 16.54

Men's Athletic Footwear | 1,175 16.68 33.22

Men's Street Footwear | 1,178 16.73 49.95

Women's Apparel | 1,170 16.61 66.56

Women's Athletic Footwear | 1,175 16.68 83.25

Women's Street Footwear | 1,180 16.75 100.00

--------------------------+-----------------------------------

Total | 7,043 100.00

. rename Product1 MenApparel

. rename Product2 MenAthleticFootwear

. rename Product3 MenStreetFootwear

. rename Product4 WomenApparel

. rename Product5 WomenAthleticFootwear

. rename Product6 WomenStreetFootwear


. gen Product_SalesMethod_1 = MenApparel*Online

. regress OperatingProfit UnitsSold Online Instore Midwest Northeast Southeast Product_SalesMethod_1

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(7, 7035) = 4027.22

Model | 1.7076e+13 7 2.4394e+12 Prob > F = 0.0000

Residual | 4.2612e+12 7,035 605720515 R-squared = 0.8003

-------------+---------------------------------- Adj R-squared = 0.8001

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 24611

---------------------------------------------------------------------------------------

OperatingProfit | Coefficient Std. err. t P>|t| [95% conf. interval]

----------------------+----------------------------------------------------------------

UnitsSold | 229.9077 1.552921 148.05 0.000 226.8635 232.9519

Online | 2246.327 727.6548 3.09 0.002 819.9048 3672.75

Instore | 7862.856 904.7929 8.69 0.000 6089.19 9636.523

Midwest | 7043.176 823.0886 8.56 0.000 5429.674 8656.678

Northeast | 6353.395 789.9 8.04 0.000 4804.953 7901.837

Southeast | -2553.287 941.8102 -2.71 0.007 -4399.519 -707.0555

Product_SalesMethod_1 | 8733.656 1125.965 7.76 0.000 6526.424 10940.89

_cons | -30191.86 745.2494 -40.51 0.000 -31652.77 -28730.94

---------------------------------------------------------------------------------------
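
To show how the coefficients above combine into a fitted value, here is a minimal Python sketch. The coefficients are copied from the table; the function `predict_profit` and the example observation (500 units sold online in the Midwest) are hypothetical and serve only as an illustration.

```python
# Coefficients copied from the regression table above
# (Product_SalesMethod_1 = MenApparel * Online).
COEF = {
    "_cons": -30191.86,
    "UnitsSold": 229.9077,
    "Online": 2246.327,
    "Instore": 7862.856,
    "Midwest": 7043.176,
    "Northeast": 6353.395,
    "Southeast": -2553.287,
    "Product_SalesMethod_1": 8733.656,
}

def predict_profit(units_sold, online=0, instore=0, midwest=0,
                   northeast=0, southeast=0, men_apparel=0):
    """Fitted OperatingProfit for one observation (dummies are 0 or 1)."""
    row = {
        "UnitsSold": units_sold,
        "Online": online,
        "Instore": instore,
        "Midwest": midwest,
        "Northeast": northeast,
        "Southeast": southeast,
        # The interaction equals 1 only for Men's Apparel sold online.
        "Product_SalesMethod_1": men_apparel * online,
    }
    return COEF["_cons"] + sum(COEF[name] * value for name, value in row.items())

# Hypothetical observation: 500 units sold online in the Midwest.
with_apparel = predict_profit(500, online=1, midwest=1, men_apparel=1)
other_product = predict_profit(500, online=1, midwest=1, men_apparel=0)

print(f"Men's Apparel, online: {with_apparel:.2f}")
print(f"Other product, online: {other_product:.2f}")
print(f"Difference:            {with_apparel - other_product:.3f}")
```

The difference between the two fitted values equals the coefficient on Product_SalesMethod_1, which is how the interaction term should be read.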

. predict rez1, residuals

. ttest rez1 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

rez1 | 7,043 5.74e-06 293.1172 24599.15 -574.598 574.598

------------------------------------------------------------------------------

mean = mean(rez1) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042


Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000
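
The t statistic above can be reproduced by hand, as in the Python sketch below. The second part uses a small hypothetical data set (not taken from the report) to illustrate why this test always gives t ≈ 0: the residuals of an OLS regression that includes a constant have mean zero by construction.

```python
import math

# Values copied from the one-sample t test on rez1 above.
n = 7043
mean = 5.74e-06
std_dev = 24599.15

std_err = std_dev / math.sqrt(n)  # standard error of the mean
t_stat = mean / std_err           # test statistic for H0: mean = 0
print(f"Std. err. = {std_err:.4f}, t = {t_stat:.4f}")

# Hypothetical toy data: fit y = a + b*x by OLS and average the residuals.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.5]
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)
slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
         / sum((x - x_bar) ** 2 for x in xs))
intercept = y_bar - slope * x_bar
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
resid_mean = sum(residuals) / len(residuals)
print(f"Mean residual on toy data = {resid_mean:.2e}")
```

Because of this property, the test confirms that the regression was computed correctly but carries no information about whether the model is well specified.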

. sktest rez1

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

rez1 | 7,043 0.0000 0.0000 . .

. swilk rez1

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

rez1 | 7,043 0.85612 527.884 16.619 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 8373.46

Prob > chi2 = 0.0000

. imtest
Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1919.92 23 0.0000

Skewness | 410.90 7 0.0000

Kurtosis |-4174239.43 1 1.0000

---------------------+----------------------------

Total |-4171908.60 31 1.0000

--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

Instore | 1.55 0.645542

Online | 1.54 0.649786

Midwest | 1.32 0.760321

UnitsSold | 1.30 0.769375

Northeast | 1.27 0.785451

Southeast | 1.19 0.838363

Product_Sa~1 | 1.11 0.903347

-------------+----------------------

Mean VIF | 1.33

→ All VIFs are below 10 (mean VIF = 1.33), so there is no sign of multicollinearity and
this linear regression model can be used to analyze and predict the existing data.

. gen Product_SalesMethod_2 = MenApparel*Instore

. regress OperatingProfit UnitsSold Online Instore Midwest Northeast Southeast Product_SalesMethod_2

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(7, 7035) = 4042.42

Model | 1.7088e+13 7 2.4412e+12 Prob > F = 0.0000

Residual | 4.2484e+12 7,035 603896562 R-squared = 0.8009

-------------+---------------------------------- Adj R-squared = 0.8007


Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 24574

---------------------------------------------------------------------------------------

OperatingProfit | Coefficient Std. err. t P>|t| [95% conf. interval]

----------------------+----------------------------------------------------------------

UnitsSold | 230.5176 1.555378 148.21 0.000 227.4686 233.5666

Online | 3732.473 704.882 5.30 0.000 2350.691 5114.254

Instore | 5129.654 957.9995 5.35 0.000 3251.686 7007.622

Midwest | 7105.255 821.9428 8.64 0.000 5494 8716.511

Northeast | 6410.761 788.7984 8.13 0.000 4864.478 7957.043

Southeast | -2586.056 940.4179 -2.75 0.006 -4429.558 -742.5534

Product_SalesMethod_2 | 15767.2 1745.519 9.03 0.000 12345.46 19188.94

_cons | -30369.84 744.978 -40.77 0.000 -31830.22 -28909.46

---------------------------------------------------------------------------------------
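
The summary statistics in the header of this output follow directly from the sums of squares. The Python sketch below recomputes them from the rounded values printed above, so small differences in the last digit are expected; the variable names are our own.

```python
import math

# Values copied from the regression header above.
n, k = 7043, 7            # observations, predictors (excluding _cons)
ss_model = 1.7088e13      # Model SS
ss_resid = 4.2484e12      # Residual SS
ms_resid = 603896562      # Residual MS

ss_total = ss_model + ss_resid
r2 = ss_model / ss_total                       # share of variance explained
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalized for k predictors
f_stat = (ss_model / k) / ms_resid             # Model MS / Residual MS
root_mse = math.sqrt(ms_resid)                 # residual standard deviation

print(f"R-squared     = {r2:.4f}")
print(f"Adj R-squared = {adj_r2:.4f}")
print(f"F({k}, {n - k - 1}) = {f_stat:.2f}")
print(f"Root MSE      = {root_mse:.0f}")
```

An R-squared of about 0.80 means the predictors explain roughly 80% of the variation in OperatingProfit, while a Root MSE near 24,574 shows that individual predictions can still miss by a wide margin.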

. predict rez2, residuals

. ttest rez2 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

rez2 | 7,043 2.51e-06 292.6756 24562.09 -573.7322 573.7322

------------------------------------------------------------------------------

mean = mean(rez2) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest rez2

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------
rez2 | 7,043 0.0000 0.0000 . .

. swilk rez2

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

rez2 | 7,043 0.86137 508.601 16.520 0.00000

Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 8378.26

Prob > chi2 = 0.0000

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1996.54 23 0.0000

Skewness | 414.09 7 0.0000

Kurtosis |-4490236.14 1 1.0000

---------------------+----------------------------

Total |-4487825.51 31 1.0000


--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

Instore | 1.74 0.574093

Online | 1.45 0.690365

Midwest | 1.32 0.760146

UnitsSold | 1.31 0.764637

Northeast | 1.27 0.785275

Southeast | 1.19 0.838315

Product_Sa~2 | 1.17 0.851564

-------------+----------------------

Mean VIF | 1.35

→ All VIFs are below 10 (mean VIF = 1.35), so there is no sign of multicollinearity and
this linear regression model can be used to analyze and predict the existing data.

. gen Product_SalesMethod_3 = MenStreetFootwear*Online

. regress OperatingProfit UnitsSold Online Instore Midwest Northeast Southeast Product_SalesMethod_3

Source | SS df MS Number of obs = 7,043

-------------+---------------------------------- F(7, 7035) = 4082.22

Model | 1.7122e+13 7 2.4460e+12 Prob > F = 0.0000

Residual | 4.2152e+12 7,035 599171666 R-squared = 0.8024

-------------+---------------------------------- Adj R-squared = 0.8022

Total | 2.1337e+13 7,042 3.0299e+09 Root MSE = 24478

---------------------------------------------------------------------------------------

OperatingProfit | Coefficient Std. err. t P>|t| [95% conf. interval]

----------------------+----------------------------------------------------------------

UnitsSold | 231.3752 1.553418 148.95 0.000 228.33 234.4203

Online | 5998.613 730.724 8.21 0.000 4566.174 7431.052

Instore | 7632.739 900.2661 8.48 0.000 5867.947 9397.532

Midwest | 7186.737 818.7894 8.78 0.000 5581.663 8791.811

Northeast | 6502.148 785.8003 8.27 0.000 4961.742 8042.553

Southeast | -2677.931 936.8129 -2.86 0.004 -4514.366 -841.4953


Product_SalesMethod_3 | -13153.84 1120.892 -11.74 0.000 -15351.12 -10956.55

_cons | -30618.31 742.7819 -41.22 0.000 -32074.38 -29162.23

---------------------------------------------------------------------------------------
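
To read the negative interaction term above, it helps to add it to the Online coefficient. The short Python sketch below does this with the values from the table. It assumes, as the regress command indicates, that MenStreetFootwear enters the model only through the interaction and has no separate main effect.

```python
# Coefficients copied from the regression table above.
online = 5998.613        # Online
interaction = -13153.84  # Product_SalesMethod_3 = MenStreetFootwear * Online

# Shift in fitted profit for an online sale relative to the omitted
# sales method, holding units sold and region fixed.
online_other_products = online
online_men_street = online + interaction

print(f"Online, other products:        {online_other_products:.3f}")
print(f"Online, Men's Street Footwear: {online_men_street:.3f}")
```

Under this specification the fitted profit for Men's Street Footwear sold online lies below the baseline, whereas the Men's Apparel interactions estimated in the two previous models are positive.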

. predict rez3, residuals

. ttest rez3 ==0

One-sample t test

------------------------------------------------------------------------------

Variable | Obs Mean Std. err. Std. dev. [95% conf. interval]

---------+--------------------------------------------------------------------

rez3 | 7,043 .0000164 291.5284 24465.81 -571.4833 571.4834

------------------------------------------------------------------------------

mean = mean(rez3) t = 0.0000

H0: mean = 0 Degrees of freedom = 7042

Ha: mean < 0 Ha: mean != 0 Ha: mean > 0

Pr(T < t) = 0.5000 Pr(|T| > |t|) = 1.0000 Pr(T > t) = 0.5000

. sktest rez3

Skewness and kurtosis tests for normality

----- Joint test -----

Variable | Obs Pr(skewness) Pr(kurtosis) Adj chi2(2) Prob>chi2

-------------+-----------------------------------------------------------------

rez3 | 7,043 0.0000 0.0000 . .

. swilk rez3

Shapiro–Wilk W test for normal data

Variable | Obs W V z Prob>z

-------------+------------------------------------------------------

rez3 | 7,043 0.85215 542.433 16.691 0.00000


Note: The normal approximation to the sampling distribution of W'

is valid for 4<=n<=2000.

. estat hettest

Breusch–Pagan/Cook–Weisberg test for heteroskedasticity

Assumption: Normal error terms

Variable: Fitted values of OperatingProfit

H0: Constant variance

chi2(1) = 8640.94

Prob > chi2 = 0.0000

. imtest

Cameron & Trivedi's decomposition of IM-test

--------------------------------------------------

Source | chi2 df p

---------------------+----------------------------

Heteroskedasticity | 1971.88 23 0.0000

Skewness | 398.59 7 0.0000

Kurtosis |-4012565.94 1 1.0000

---------------------+----------------------------

Total |-4010195.47 31 1.0000

--------------------------------------------------

. estat vif

Variable | VIF 1/VIF

-------------+----------------------

Online | 1.57 0.637373

Instore | 1.55 0.645000

Midwest | 1.32 0.760019

UnitsSold | 1.31 0.760569


Northeast | 1.27 0.785087

Southeast | 1.19 0.838170

Product_Sa~3 | 1.12 0.891826

-------------+----------------------

Mean VIF | 1.33

→ All VIFs are below 10 (mean VIF = 1.33), so there is no sign of multicollinearity and
this linear regression model can be used to analyze and predict the existing data.
e. Which key factors significantly impact profitability?
f. Predict High Value in Operating Profit based on Sales Method and Units Sold.
Calculate the odds ratio.
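
As a reminder of what an odds ratio measures, the Python sketch below computes one from a 2x2 table. The counts are hypothetical and are NOT taken from the report's data; they only illustrate the calculation.

```python
import math

# Hypothetical counts (NOT from the report's data):
# rows = sales method, columns = high / not-high operating profit.
counts = {
    "Instore": {"high": 60, "low": 40},
    "Online": {"high": 30, "low": 70},
}

def odds(group):
    """Odds of a high-profit observation within one group."""
    return group["high"] / group["low"]

odds_ratio = odds(counts["Instore"]) / odds(counts["Online"])

# In a logistic regression on these two groups with a single Instore
# dummy, the fitted coefficient equals the log of this odds ratio.
log_odds_ratio = math.log(odds_ratio)

print(f"Odds ratio (Instore vs Online) = {odds_ratio:.2f}")
print(f"Log odds ratio                 = {log_odds_ratio:.3f}")
```

An odds ratio above 1 means the first group has higher odds of a high-profit outcome than the second; a value below 1 means the opposite.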
III. Discussion and Recommendations
IV. Conclusion
The three most profitable retailers were Sports Direct, West Gear, and Foot Locker.
Sports Direct led with the highest average profit of 36,581.18, followed by West Gear
with 36,085.88 and Foot Locker with 30,611.35.
The regression results show that Units Sold has a positive and statistically significant
effect on Operating Profit: each additional unit sold increases profit, so raising sales
volume can be an effective way to improve profit. Sales Method also has a clear effect.
In-store sales are associated with higher profit than the other methods, whereas Online
has a negative impact on profit, suggesting that online sales may carry higher costs or
lower profit margins than in-store sales.
Region also significantly affects profit. The Midwest and Northeast show positive and
statistically significant effects, indicating that these regions are more profitable
than the others. The Southeast, by contrast, has a negative effect, which reflects
weaker business performance and suggests that a strategic adjustment may be needed in
this region.
