
Project1 Research Report Week2 FullPages

The report outlines a comprehensive analysis of a marketing dataset, focusing on the relationship between advertising spends and sales. Key findings include strong correlations between TV and Radio spending with sales, the significance of linear regression modeling, and the importance of data cleaning and preprocessing. The report also discusses model evaluation techniques and the trade-offs involved in using polynomial features and interaction effects in modeling.

Uploaded by daabhu62

Chapter-wise Research Report - Project 1

Chapter 1: Understanding the Dataset and Problem Statement

Learned to identify the structure and objective of the dataset. Understood the relationship between marketing spends (TV, Radio, Newspaper) and sales. Recognized the importance of defining dependent and independent variables.

Chapter 2: Data Cleaning and Preprocessing

Verified the dataset for missing values, proper column names, and data types. Learned how early checks prevent issues during modeling. No missing values were found in the given dataset.

Chapter 3: Exploratory Data Analysis (EDA)

Applied scatterplots and pairplots to observe relationships between variables. Found strong correlation of TV and Radio with Sales. Used visualization to form hypotheses about variable behavior.

Chapter 4: Correlation and Statistical Understanding

Calculated correlation coefficients. Learned to interpret strength and direction of relationships. Understood that correlation does not imply causation.

Chapter 5: Linear Regression Modeling

Built simple and multiple linear regression models. Learned about coefficients, intercept, R² value, and adjusted R². Found that TV and Radio are statistically significant predictors.

Chapter 6: Model Evaluation and Interpretation

Evaluated regression models using R² and p-values. Understood implications of high R² and the risk of including insignificant predictors like Newspaper.

Chapter 7: Polynomial Features and Interaction Effects

Learned to include polynomial terms and interaction features to model non-linear effects. Recognized the tradeoff between accuracy improvement and risk of overfitting.


Additional Content for Chapter 1

In this chapter, we explored the business context of the dataset, focusing on the role of advertising
The dataset comprises data from a marketing campaign, with spending figures across TV, Radio, a
We emphasized the significance of clearly defining the dependent and independent variables to bui
Initial exploration helped us hypothesize how different media channels might impact sales differentl
This understanding guided our expectations and analysis in the subsequent chapters.
Additional Content for Chapter 2

Data cleaning is crucial for reliable results in any data science project.
We checked for missing values using functions like isnull().sum() and ensured that the column nam
Data types were examined and found appropriate.
The absence of missing data simplified the preprocessing.
We also considered renaming columns for clarity but retained the original names for consistency.
This chapter underlines the importance of validating data before proceeding to modeling.
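The checks described above can be sketched as follows. This is a minimal illustration rather than the code used in the project: the small DataFrame and all of its values are made up, and the column names TV, Radio, Newspaper, and Sales are assumed from the variables named in the report.

```python
import pandas as pd

# Hypothetical stand-in for the marketing dataset. All values are made up
# for illustration; the column names are assumed from the report text.
df = pd.DataFrame({
    "TV": [100.0, 200.0, 50.0, 150.0],
    "Radio": [20.0, 35.0, 10.0, 25.0],
    "Newspaper": [30.0, 15.0, 40.0, 20.0],
    "Sales": [12.0, 21.0, 7.0, 16.0],
})

# Count missing values in each column.
missing_per_column = df.isnull().sum()

# Confirm that every column holds numeric data.
all_numeric = all(pd.api.types.is_numeric_dtype(dtype) for dtype in df.dtypes)

print(missing_per_column)
print("All columns numeric:", all_numeric)
```

Running such checks before modeling makes it obvious whether imputation or type conversion is needed at all.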
Additional Content for Chapter 3

Using seaborn and matplotlib, we conducted an in-depth exploratory data analysis.
Pairplots revealed that TV and Radio spending showed a strong positive linear relationship with Sa
Boxplots helped identify the distribution and potential outliers in the dataset.
Correlation heatmaps visually supported our hypothesis about the varying impacts of each channel
These visual tools provided intuition and direction for model building.
Additional Content for Chapter 4

We computed Pearson correlation coefficients to quantify the strength of relationships between eac
The strongest correlation was observed between TV and Sales, followed by Radio.
Newspaper had a relatively weak correlation, suggesting it might not be a strong predictor.
We discussed the difference between correlation and causation and how this distinction affects bus
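As a rough illustration of what a Pearson coefficient measures, the sketch below computes it from its definition in plain Python. The function name `pearson` and all of the numbers are assumptions made for this example; they are not values from the dataset, and the project itself may have used a library routine instead.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of products of deviations (numerator) and the two sums of squares.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers: sales is constructed to be exactly linear in tv,
# while newspaper has no such built-in relationship to tv.
tv = [10.0, 20.0, 30.0, 40.0, 50.0]
sales = [5.0, 7.0, 9.0, 11.0, 13.0]
newspaper = [3.0, 1.0, 4.0, 1.0, 5.0]

print("TV vs Sales:", pearson(tv, sales))
print("TV vs Newspaper:", pearson(tv, newspaper))
```

A coefficient near 1 or -1 indicates a strong linear relationship and a value near 0 a weak one; in neither case does the number alone say anything about causation.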
Additional Content for Chapter 5

Linear regression was implemented using sklearn.
We began with simple linear regression for individual predictors, followed by multiple regression inc
Model summaries provided insight into coefficients, intercepts, and R² values.
TV and Radio had statistically significant coefficients, reinforcing their importance as predictors.
Newspaper's coefficient was not significant, which raised considerations about model simplification
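The fitting step can be sketched as below. Note the assumptions: the report used sklearn, whereas this sketch uses NumPy's least-squares solver to keep the example small, and the data are made up and noise-free so that the recovered intercept and coefficients are easy to check. It does not reproduce the report's results or its significance tests.

```python
import numpy as np

# Made-up, noise-free data: sales is built from a known intercept and
# known coefficients so the fit can be checked by eye.
tv = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])
radio = np.array([5.0, 1.0, 4.0, 2.0, 6.0, 3.0])
sales = 3.0 + 0.05 * tv + 0.2 * radio

# Design matrix: a column of ones for the intercept, then the predictors.
X = np.column_stack([np.ones_like(tv), tv, radio])

# Ordinary least squares fit.
coef, _, _, _ = np.linalg.lstsq(X, sales, rcond=None)
intercept, beta_tv, beta_radio = coef

# R² = 1 - (residual sum of squares / total sum of squares).
predictions = X @ coef
ss_res = np.sum((sales - predictions) ** 2)
ss_tot = np.sum((sales - sales.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print("intercept:", intercept)
print("TV coefficient:", beta_tv)
print("Radio coefficient:", beta_radio)
print("R squared:", r_squared)
```

Each coefficient is read as the expected change in sales per unit of spend on that channel, holding the other channel fixed.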
Additional Content for Chapter 6

Model evaluation was performed using R² and adjusted R² to assess fit quality.
We also reviewed p-values for each predictor to determine their statistical significance.
High R² values from models including TV and Radio indicated good fit, whereas including Newspap
This step taught the importance of balancing complexity with interpretability.
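To make the role of adjusted R² concrete, the sketch below applies the standard formula, adjusted R² = 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. The helper name `adjusted_r2`, the sample size, and both R² values are hypothetical and are not results from the report.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R² for a model with an intercept and n_predictors predictors."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof

# Hypothetical numbers: adding a third predictor nudges R² up only slightly.
n = 50
r2_two_predictors = 0.9000
r2_three_predictors = 0.9002

adj_two = adjusted_r2(r2_two_predictors, n, 2)
adj_three = adjusted_r2(r2_three_predictors, n, 3)

print("adjusted R², two predictors:", adj_two)
print("adjusted R², three predictors:", adj_three)
```

With these made-up inputs the plain R² rises while the adjusted value falls, which is the pattern one would expect when an added predictor contributes very little.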
Additional Content for Chapter 7

Polynomial regression and interaction terms were introduced to capture non-linear and combined e
Polynomial terms like TV² and interaction terms like TV*Radio were added to improve prediction.
Model accuracy improved slightly, but the added terms also increased the risk of overfitting.
This experiment highlighted the trade-offs between model complexity and generalization capability
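A minimal sketch of adding a squared term and an interaction term is shown below. Everything in it is an assumption for illustration: the data are synthetic, generated with a built-in TV*Radio effect, the helper `fit_r2` is hypothetical, and NumPy least squares stands in for whatever tooling the project used.

```python
import numpy as np

# Synthetic data with a deliberately built-in interaction effect plus noise.
rng = np.random.default_rng(0)
tv = rng.uniform(0.0, 300.0, size=100)
radio = rng.uniform(0.0, 50.0, size=100)
noise = rng.normal(0.0, 0.5, size=100)
sales = 3.0 + 0.02 * tv + 0.03 * radio + 0.001 * tv * radio + noise

def fit_r2(X, y):
    """Fit ordinary least squares and return R² on the same data."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ coef
    return 1.0 - np.sum(residuals ** 2) / np.sum((y - y.mean()) ** 2)

ones = np.ones_like(tv)

# Baseline: intercept, TV, Radio.
X_linear = np.column_stack([ones, tv, radio])

# Extended: adds the squared term TV² and the interaction TV*Radio.
X_extended = np.column_stack([ones, tv, radio, tv ** 2, tv * radio])

r2_linear = fit_r2(X_linear, sales)
r2_extended = fit_r2(X_extended, sales)

print("R², linear terms only:", r2_linear)
print("R², with TV² and TV*Radio:", r2_extended)
```

R² measured on the training data cannot decrease when columns are added, so an improvement here is not by itself evidence of better generalization; comparing the two models on held-out data would be the more telling check for overfitting.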
