CHAPTER 2: REGRESSION
1. CHECKING LINEARITY:[pg.no:21-22]
PROGRAM:
from pandas import DataFrame
import matplotlib.pyplot as plt
Stock_Market = {'Year': [2017, 2017, 2017, 2017, 2017, 2017,
2017, 2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016],
'Month': [12, 11, 10, 9, 8, 7, 6, 5, 4, 3,
2, 1, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
'Interest_Rate': [2.75, 2.5, 2.5, 2.5, 2.5,
2.5, 2.5, 2.25, 2.25, 2.25, 2, 2, 2, 1.75, 1.75, 1.75, 1.75,
1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75],
'Unemployment_Rate': [5.3, 5.3, 5.3, 5.3,
5.4, 5.6, None, 5.5, None, 5.6, 5.7, 5.9, 6, 5.9, 5.8, 6.1,
6.2, 6.1, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2],
'Stock_Index_Price': [1464, 1394, 1357,
1293, 1256, 1254, 1234, 1195, 1159, 1167, 1130, 1075, 1047,
965, 943, 958, 971, 949, 884, 866, 876, 822, 704, 719]}
df = DataFrame(Stock_Market, columns=['Year', 'Month',
'Interest_Rate', 'Unemployment_Rate', 'Stock_Index_Price'])
plt.scatter(df['Interest_Rate'], df['Stock_Index_Price'],
color='red')
plt.title('Stock Index Price Vs Interest Rate', fontsize=14)
plt.xlabel('Interest Rate', fontsize=14)
plt.ylabel('Stock Index Price', fontsize=14)
plt.grid(True)
plt.show()
plt.scatter(df['Unemployment_Rate'],
df['Stock_Index_Price'], color='green')
plt.title('Stock Index Price Vs Unemployment Rate',
fontsize=14)
plt.xlabel('Unemployment Rate', fontsize=14)
plt.ylabel('Stock Index Price', fontsize=14)
plt.grid(True)
plt.show()
OUTPUT:
2. SIMPLE LINEAR REGRESSION:[pg.no:22-23]
PROGRAM:
from pandas import DataFrame
from sklearn import linear_model
Stock_Market = {'Year': [2017, 2017, 2017, 2017, 2017, 2017,
2017, 2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016],
'Month': [12, 11, 10, 9, 8, 7, 6, 5, 4, 3,
2, 1, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
'Interest_Rate': [2.75, 2.5, 2.5, 2.5, 2.5,
2.5, 2.5, 2.25, 2.25, 2.25, 2, 2, 2, 1.75, 1.75, 1.75, 1.75,
1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75],
'Unemployment_Rate': [5.3, 5.3, 5.3, 5.3,
5.4, 5.6, None, 5.5, None, 5.6, 5.7, 5.9, 6, 5.9, 5.8, 6.1,
6.2, 6.1, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2],
'Stock_Index_Price': [1464, 1394, 1357,
1293, 1256, 1254, 1234, 1195, 1159, 1167, 1130, 1075, 1047,
965, 943, 958, 971, 949, 884, 866, 876, 822, 704, 719]}
df = DataFrame(Stock_Market, columns=['Year', 'Month',
'Interest_Rate', 'Unemployment_Rate', 'Stock_Index_Price'])
# Here we have 1 variable for linear regression
X = df[['Interest_Rate']]
Y = df['Stock_Index_Price']
# Model fitting with sklearn linear regression
regr = linear_model.LinearRegression()
regr.fit(X, Y)
# Displaying Intercept and coefficients
print('Intercept:\n', regr.intercept_)
print('\nCoefficients:\n', regr.coef_)
# Prediction with sklearn
new_interest_rate = 2.75
print('Predicted Stock Index Price:\n',
regr.predict([[new_interest_rate]]))
OUTPUT:
INTERPRETATION:
Simple linear regression is of the form y = w0 + w1*x. The output shows w0 (intercept)
as -99.46431881371655 and w1 (coefficient) as 564.20389249. According to the
above example, the equation becomes
Stock_Index_Price = w0 + w1 * Interest_Rate
i.e., Stock_Index_Price = -99.46431881371655 + 564.20389249 * Interest_Rate
For new_interest_rate = 2.75, this gives Stock_Index_Price = 1452.09638554, which is exactly the predicted stock index price.
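As a quick cross-check, the same prediction can be reproduced by hand from the fitted parameters (a minimal sketch; regr is the fitted LinearRegression model from the program above, and the commented values are the ones printed in its output):
# Reproducing the prediction manually from intercept and coefficient
w0 = regr.intercept_      # -99.46431881371655
w1 = regr.coef_[0]        # 564.20389249
print(w0 + w1 * 2.75)     # 1452.0963855..., same as regr.predict([[2.75]])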
3. READING FROM A CSV FILE AND PREDICTING A SET OF DEPENDENT VARIABLE VALUES:[pg.no:24-25]
PROGRAM:
import pandas as pd
from pandas import DataFrame
from sklearn import linear_model
# Reading the input data from a csv file
df = pd.read_csv("stock.csv")
# Here we have 1 variable for linear regression
X = df[['Interest_Rate']]
Y = df['Stock_Index_Price']
# Model fitting with sklearn linear regression
regr = linear_model.LinearRegression()
regr.fit(X, Y)
# Displaying Intercept and coefficients
print('Intercept:\n', regr.intercept_)
print('Coefficients:\n', regr.coef_)
# Prediction with sklearn for all the interest rates
new_interest_rate = df[['Interest_Rate']]
df1 = DataFrame(regr.predict(new_interest_rate))
print('Predicted Stock Index Price:\n', df1)
Output:
4. MULTIPLE LINEAR REGRESSION :[pg.no:25-27]
PROGRAM:
from pandas import DataFrame
from sklearn import linear_model
import statsmodels.api as sm
Stock_Market = {'Year': [2017, 2017, 2017, 2017, 2017, 2017,
2017, 2017, 2017, 2017, 2017, 2017, 2016, 2016, 2016, 2016,
2016, 2016, 2016, 2016, 2016, 2016, 2016, 2016],
'Month': [12, 11, 10, 9, 8, 7, 6, 5, 4, 3,
2, 1, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1],
'Interest_Rate': [2.75, 2.5, 2.5, 2.5, 2.5,
2.5, 2.5, 2.25, 2.25, 2.25, 2, 2, 2, 1.75, 1.75, 1.75, 1.75,
1.75, 1.75, 1.75, 1.75, 1.75, 1.75, 1.75],
'Unemployment_Rate': [5.3, 5.3, 5.3, 5.3,
5.4, 5.6, 5.5, 5.5, 5.5, 5.6, 5.7, 5.9, 6, 5.9, 5.8, 6.1,
6.2, 6.1, 6.1, 6.1, 6.1, 5.9, 6.2, 6.2],
'Stock_Index_Price': [1464, 1394, 1357,
1293, 1256, 1254, 1234, 1195, 1159, 1167, 1130, 1075, 1047,
965, 943, 958, 971, 949, 884, 866, 876, 822, 704, 719]}
df = DataFrame(Stock_Market, columns=['Year', 'Month',
'Interest_Rate', 'Unemployment_Rate', 'Stock_Index_Price'])
# Here we have 2 variables for multiple regression.
X = df[['Interest_Rate', 'Unemployment_Rate']]
Y = df['Stock_Index_Price']
# Model fitting with sklearn linear regression
regr = linear_model.LinearRegression()
regr.fit(X, Y)
# Displaying Intercept and coefficients
print('Intercept:\n', regr.intercept_)
print('Coefficients:\n', regr.coef_)
# Prediction with sklearn
new_interest_rate = 2.75
new_unemployment_rate = 5.3
print('Stock Index Price:')
print(regr.predict([[new_interest_rate, new_unemployment_rate]]))
# Prediction with statsmodels
X = sm.add_constant(X) # adding a constant
model = sm.OLS(Y, X).fit()
predictions = model.predict(X)
print(model.summary())
Output:
INTERPRETATION OF RESULT:
This output includes the intercept and coefficients. We can use this information to
build the multiple linear regression equation as follows:
Stock_Index_Price = (Intercept) + (Interest_Rate coef)*X1 + (Unemployment_Rate coef)*X2
Substituting the values of the intercept and coefficients, we get
Stock_Index_Price = (1798.4040) + (345.5401)*X1 + (-250.1466)*X2
Let Interest_Rate = 2.75 (i.e., X1 = 2.75) and Unemployment_Rate = 5.3
(i.e., X2 = 5.3). Substituting these values into the regression equation gives exactly the
predicted result displayed above:
Stock_Index_Price = (1798.4040) + (345.5401)*(2.75) + (-250.1466)*(5.3) = 1422.86
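The same substitution is easy to verify in Python (using the coefficient values as reported in the output above):
# Verifying the multiple regression equation by direct substitution
x1, x2 = 2.75, 5.3
print(1798.4040 + 345.5401 * x1 + (-250.1466) * x2)   # ~1422.86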
The OLS Regression Results table displays comprehensive statistical information
generated by statsmodels. The following are some important entries from the table.
Adj. R-squared reflects the fit of the model.
R-squared values range from 0 to 1, where a higher value generally indicates a better fit,
assuming certain conditions are met.
The const coefficient is our Y-intercept: it is the expected output (i.e., the Y) when both
Interest_Rate and Unemployment_Rate are zero.
The Interest_Rate coefficient represents the change in the output Y due to a change of one unit in
the interest rate (everything else held constant).
The Unemployment_Rate coefficient represents the change in the output Y due to a change of one
unit in the unemployment rate (everything else held constant).
std err reflects the level of accuracy of the coefficients. The lower it is, the higher the level of
accuracy.
P>|t| is the p-value. A p-value of less than 0.05 is considered to be statistically
significant. The confidence interval represents the range in which our coefficients are likely to fall
(with a likelihood of 95%).
Notice that the coefficients captured in this table match the coefficients generated
by sklearn. We got consistent results by applying both sklearn and statsmodels.
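These quantities can also be read programmatically from the fitted results object (a minimal sketch; model is the fitted OLS result from the program above):
# Accessing the key statistics of the OLS results object
print(model.params)        # const, Interest_Rate and Unemployment_Rate coefficients
print(model.pvalues)       # the P>|t| column
print(model.rsquared_adj)  # adjusted R-squared
print(model.conf_int())    # 95% confidence intervals for the coefficients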
5. LINEAR REGRESSION:[pg.no:29-30]
PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('position_salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y,
test_size=0.2, random_state=0)
# Fitting Linear Regression to the dataset
from sklearn.linear_model import LinearRegression
regressor = LinearRegression()
regressor.fit(X_train, y_train)
# Visualizing the Linear Regression results
plt.scatter(X_train, y_train, color='red')
plt.plot(X_train, regressor.predict(X_train), color='blue')
plt.title('Linear Regression')
plt.xlabel('Years of Experience')
plt.ylabel('Salary')
plt.show()
Output:
Explanation:
In this example, we have used 4 libraries, namely numpy, pandas, matplotlib and
sklearn. We first imported the libraries and loaded the dataset. The dataset is a table
which contains all the values in our csv file: X is the 2nd column, which holds the Years of
Experience values, and y is the last column, which holds the Salary values. We then split
our dataset into a training set and a test set (both X and y values for each set).
test_size=0.2: We split our dataset (10 observations) into 2 parts (training
set, test set), and the ratio of the test set to the whole dataset is 0.2 (2 observations
go into the test set). We could equally write 1/5 instead of 0.2; they are the same. We
should not make the test set too big: if it is too big, we will lack data to train on. Normally, we
should pick around 5% to 30%.
train_size: If we specify the test size, the rest of the data will
automatically be assigned to train_size.
random_state: This is the seed for the random number generator. We can pass
an instance of the RandomState class as well. If we leave it blank (None), the global
RandomState instance used by np.random is used instead; passing a fixed integer such as 0
makes the split reproducible. We now have the training set and the test set, and have built
the linear regression model. Next, we will build a polynomial regression model and visualize it.
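For illustration, the same split can be written with all parameters spelled out (a minimal sketch on made-up data; the keyword arguments are the ones discussed above):
import numpy as np
from sklearn.model_selection import train_test_split
X = np.arange(10).reshape(-1, 1)   # 10 observations, 1 feature
y = np.arange(10)
# 20% test / 80% train, reproducible because of the fixed integer seed
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, train_size=0.8, random_state=0)
print(X_train.shape, X_test.shape)   # (8, 1) (2, 1)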
6. POLYNOMIAL REGRESSION:[pg.no:30-31]
PROGRAM:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('position_salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
# Splitting the dataset into the Training set and Test set
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y,
test_size=0.2, random_state=0)
# Fitting polynomial regression to the dataset
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
poly_reg = PolynomialFeatures(degree=4)
X_poly = poly_reg.fit_transform(X)
lin_reg = LinearRegression()
lin_reg.fit(X_poly, y)
# Visualizing the Polynomial Regression results
def viz_polynomial():
    plt.scatter(X, y, color='red')
    plt.plot(X, lin_reg.predict(poly_reg.fit_transform(X)),
             color='blue')
    plt.title('Polynomial Regression')
    plt.xlabel('Years of Experience')
    plt.ylabel('Salary')
    plt.show()

viz_polynomial()
OUTPUT:
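Note that PolynomialFeatures(degree=4) expands the single input column x into the columns [1, x, x^2, x^3, x^4] (the leading 1 is the bias term added by default), and LinearRegression is then fitted on these expanded features. A quick way to see the expansion for one value:
# Inspecting the polynomial feature expansion for a single value
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
print(PolynomialFeatures(degree=4).fit_transform(np.array([[2.0]])))
# [[ 1.  2.  4.  8. 16.]]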
7. LOGISTIC REGRESSION:[pg.no:32-33]
PROGRAM FOR CONFUSION MATRIX:
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt
data = {'y_Predicted': [1, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0],
'y_Actual': [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0]}
df = pd.DataFrame(data, columns=['y_Actual', 'y_Predicted'])
# Creating confusion matrix
confusion_matrix = pd.crosstab(df['y_Actual'], df['y_Predicted'], rownames=['Actual'],
colnames=['Predicted'],margins=True)
# Generating heatmap and displaying it
ax = sn.heatmap(confusion_matrix, annot=True)
plt.show()
# Getting the statistics of the confusion matrix
print(confusion_matrix)
Output:
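Because margins=True was passed to pd.crosstab, the printed table also includes row and column totals ('All'). The four core cells of the confusion matrix can be pulled out programmatically (a minimal sketch; df is the frame built in the program above, with 0/1 labels):
# Extracting TN, FP, FN, TP from a crosstab without margins
cm = pd.crosstab(df['y_Actual'], df['y_Predicted'])
TN, FP = cm.loc[0, 0], cm.loc[0, 1]
FN, TP = cm.loc[1, 0], cm.loc[1, 1]
print('Accuracy:', (TP + TN) / (TP + TN + FP + FN))   # 9/12 = 0.75 for the data above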
8. PROGRAM (LOGISTIC REGRESSION WITH TRAIN/TEST SPLIT AND ACCURACY):[pg.no:33-35]
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
import seaborn as sn
import matplotlib.pyplot as plt
candidates = {
'gmat': [780, 750, 690, 710, 680, 730, 690, 720, 740,
690, 610, 690, 710, 680, 770, 610, 580, 650, 540,
590, 620, 600, 550, 550, 570, 670, 660, 580, 650,
660, 640, 620, 660, 660, 680, 650, 670, 580, 590, 690],
'gpa': [4, 3.9, 3.3, 3.7, 3.9, 3.7, 2.3, 3.3, 3.3,
1.7, 2.7, 3.7, 3.7, 3.3, 3.3, 3, 2.7, 3.7, 2.7, 2.3,
3.3, 2, 2.3, 2.7, 3, 3.3, 3.7, 2.3, 3.7,
3.3, 3, 2.7, 4, 3.3, 3.3, 2.3, 2.7, 3.3, 1.7,
3.7],
'work experience': [3, 4, 3, 5, 4, 6, 1, 4, 5, 1, 3, 5, 6,
4, 3, 1, 4, 6, 2, 3, 2, 1, 4, 1, 2, 6, 4, 2, 6, 5, 1, 2, 4,
6, 5, 1, 2, 1, 4, 5],
'admitted': [1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1,
1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1,
1, 0, 0, 0, 0, 1]}
df = pd.DataFrame(candidates, columns=['gmat', 'gpa',
'work experience', 'admitted'])
X = df[['gmat', 'gpa', 'work experience']]
y = df['admitted']
# Splitting the dataset into training and testing
X_train, X_test, y_train, y_test = train_test_split(X, y,
test_size=0.25, random_state=0)
# Fitting logistic regression to the dataset
logistic_regression = LogisticRegression()
logistic_regression.fit(X_train, y_train)
y_pred = logistic_regression.predict(X_test)
# Creating confusion matrix
confusion_matrix = pd.crosstab(y_test, y_pred,
rownames=['Actual'], colnames=['Predicted'], margins=True)
# Generating heatmap and displaying it
ax = sn.heatmap(confusion_matrix, annot=True)
plt.show()
print(confusion_matrix)
# Displaying accuracy
print('Accuracy:', metrics.accuracy_score(y_test,y_pred))
Output:
9. PROGRAM (PREDICTING ADMISSION FOR NEW CANDIDATES):[pg.no:37-38]
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn import metrics
import seaborn as sn
import matplotlib.pyplot as plt
candidates = {
'gmat': [780, 750, 690, 710, 680, 730, 690, 720, 740,
690, 610, 690, 710, 680, 770, 610, 580, 650, 540, 590, 620,
600, 550, 550, 570, 670, 660, 580, 650, 660, 640, 620, 660,
660, 680, 650, 670, 580, 590, 690],
'gpa': [4, 3.9, 3.3, 3.7, 3.9, 3.7, 2.3, 3.3, 3.3,
1.7, 2.7, 3.7, 3.7, 3.3, 3.3, 3, 2.7, 3.7, 2.7, 2.3,
3.3, 2, 2.3, 2.7, 3, 3.3, 3.7, 2.3, 3.7,
3.3, 3, 2.7, 4, 3.3, 3.3, 2.3, 2.7, 3.3, 1.7,
3.7],
'work experience': [3, 4, 3, 5, 4, 6, 1, 4, 5, 1, 3, 5, 6,
4, 3, 1, 4, 6, 2, 3, 2, 1, 4, 1, 2, 6, 4, 2, 6, 5, 1, 2, 4,
6, 5, 1, 2, 1, 4, 5],
'admitted': [1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1,
0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1]}
df = pd.DataFrame(candidates, columns=['gmat', 'gpa',
'work experience', 'admitted'])
X = df[['gmat', 'gpa', 'work experience']]
y = df['admitted']
X_train, X_test, y_train, y_test = train_test_split(X, y,
test_size=0.25, random_state=0)
logistic_regression = LogisticRegression()
logistic_regression.fit(X_train, y_train)
new_candidates = {
'gmat': [590, 740, 680, 610, 710],
'gpa': [2, 3.7, 3.3, 2.3, 3],
'work experience': [3, 4, 6, 1, 5]}
df2 = pd.DataFrame(new_candidates, columns=['gmat', 'gpa',
'work experience'])
y_pred = logistic_regression.predict(df2)
print(df2)
print(y_pred)
Output: