Netflix Data Analysis

This Netflix analysis walks through the thought process behind analyzing a dataset, drawing on reviews and a movie journal. It is intended to help readers understand and analyze forthcoming Netflix movies.

Uploaded by

jayachandraprabha

Netflix Data Analysis

Python Data Analysis Project

Introduction

Netflix, a pioneer in streaming media and entertainment services, has fundamentally transformed the way
people consume content worldwide. Founded in 1997 as a DVD rental company, it transitioned to
streaming in 2007 and rapidly grew to become a global leader, boasting a vast library of movies, TV
shows, and documentaries. With millions of subscribers across more than 190 countries, Netflix offers
content in multiple languages and caters to diverse cultural and demographic preferences. This growth
reflects the industry-wide shift from traditional television to on-demand streaming services, driven by an
increasing reliance on digital media.

Industry Scope

The streaming industry has witnessed exponential growth over the last decade, disrupting the traditional
media landscape. This shift has empowered viewers with greater flexibility and choice, and has encouraged
other entertainment companies to enter the streaming market. As a result, platforms like Hulu, Amazon
Prime Video, and Disney+ are constantly competing for viewer attention. Netflix’s unique advantage lies in
its vast content catalog, which spans various genres, languages, and countries, and its investment in original
productions, which have garnered critical acclaim and a loyal following.

Purpose of the Analysis

This analysis aims to explore the content trends, regional preferences, and demographic targeting strategies
of Netflix over the years. By examining content types, distribution across countries, genre popularity,
release trends, and duration metrics, we can gain insights into how Netflix curates and expands its catalog
to meet evolving viewer demands. The findings could provide valuable information for strategic decisions
regarding content development, localization, and audience targeting.

Dataset Overview

The dataset comprises 7,787 rows and 12 columns, each providing information about individual Netflix
titles, such as their category, country of origin, and year of release. Below is an overview of the columns in
the dataset:

Column        Description
show_id       Unique identifier for each title.
type          Specifies whether the title is a “Movie” or “TV Show”.
title         Name of the title.
director      Director of the title (if applicable).
cast          List of main cast members.
country       Country where the content was produced.
date_added    Date when the title was added to Netflix.
release_year  Year the title was released.
rating        Content rating (e.g., TV-MA, PG-13).
duration      Duration of the title (minutes for Movies, seasons for TV Shows).
listed_in     Genres associated with the title.
description   Brief summary of the title.

This dataset allows us to analyze content distribution by country, trends over time, genre preferences, age
demographics, and other key metrics. By examining these variables, we aim to uncover insights into
Netflix’s content strategy and the broader streaming industry dynamics.
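
Because `duration` mixes two units (minutes for Movies, seasons for TV Shows), it has to be split by `type` before any duration metric is computed. The following is a minimal sketch of one way to do that, not a step taken from the original notebook. The frame `sample` and the column `duration_value` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Tiny made-up sample mirroring the 'type' and 'duration' columns.
sample = pd.DataFrame({
    "type": ["TV Show", "Movie", "Movie", "TV Show"],
    "duration": ["4 Seasons", "93 min", "123 min", "1 Season"],
})

# Pull the leading integer out of strings such as "93 min" or "4 Seasons".
sample["duration_value"] = (
    sample["duration"].str.extract(r"(\d+)", expand=False).astype(int)
)

# Minutes and seasons are different units, so summarize them separately.
movie_minutes = sample.loc[sample["type"] == "Movie", "duration_value"]
show_seasons = sample.loc[sample["type"] == "TV Show", "duration_value"]

print(movie_minutes.mean())  # average movie length in minutes
print(show_seasons.max())    # largest season count among TV shows
```

Keeping the two series apart avoids averaging minutes together with season counts.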

1. Imports and Setup
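
The setup cell itself is not preserved in this copy. Below is a minimal sketch of what it might contain, assuming only what later cells rely on: the aliases `pd`, `plt`, and `sns`, and the two color variables `netflix_red` and `netflix_black` referenced in section 5.1. The hex values are assumptions for illustration and are not taken from the original notebook.

```python
# Hypothetical setup cell; the original is not shown in this document.
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Assumed color values; later cells only require that these names exist.
netflix_red = "#E50914"
netflix_black = "#221F1F"
```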

2. Data Loading
[45]: # Load the Netflix dataset (replace the path below with your own file path)
      df = pd.read_csv('/content/drive/MyDrive/Data Analysis/Python Project/Netflix Data Analysis/netflix_dataset.csv')

      # Display the first and last few rows to understand data structure
      df.head()

[45]:    show_id     type  title           director  \
      0       s1  TV Show     3%                NaN
      1       s2    Movie   7:19  Jorge Michel Grau
      2       s3    Movie  23:59       Gilbert Chan
      3       s4    Movie      9        Shane Acker
      4       s5    Movie     21     Robert Luketic

                                                    cast        country  \
      0  João Miguel, Bianca Comparato, Michel Gomes, R…         Brazil
      1  Demián Bichir, Héctor Bonilla, Oscar Serrano, …         Mexico
      2  Tedd Chan, Stella Chung, Henley Hii, Lawrence …      Singapore
      3  Elijah Wood, John C. Reilly, Jennifer Connelly…  United States
      4  Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar…  United States

         date_added  release_year  rating   duration  \
      0   14-Aug-20          2020   TV-MA  4 Seasons
      1   23-Dec-16          2016   TV-MA     93 min
      2   20-Dec-18          2011       R     78 min
      3   16-Nov-17          2009   PG-13     80 min
      4    1-Jan-20          2008   PG-13    123 min

                                               listed_in  \
      0  International TV Shows, TV Dramas, TV Sci-Fi &…
      1                     Dramas, International Movies
      2              Horror Movies, International Movies
      3  Action & Adventure, Independent Movies, Sci-Fi…
      4                                           Dramas

                                             description
      0  In a future where the elite inhabit an island …
      1  After a devastating earthquake hits Mexico Cit…
      2  When an army recruit is found dead, his fellow…
      3  In a postapocalyptic world, rag-doll robots hi…
      4  A brilliant group of students become card-coun…

[46]:       show_id     type                                    title     director  \
      7782    s7783    Movie                                     Zozo  Josef Fares
      7783    s7784    Movie                                   Zubaan  Mozez Singh
      7784    s7785    Movie                        Zulu Man in Japan          NaN
      7785    s7786  TV Show                    Zumbo's Just Desserts          NaN
      7786    s7787    Movie  ZZ TOP: THAT LITTLE OL' BAND FROM TEXAS     Sam Dunn

                                                       cast  \
      7782  Imad Creidi, Antoinette Turk, Elias Gergi, Car…
      7783  Vicky Kaushal, Sarah-Jane Dias, Raaghav Chanan…
      7784                                          Nasty C
      7785                       Adriano Zumbo, Rachel Khoo
      7786                                              NaN

                                                    country  date_added  \
      7782  Sweden, Czech Republic, United Kingdom, Denmar…   19-Oct-20
      7783                                            India    2-Mar-19
      7784                                              NaN   25-Sep-20
      7785                                        Australia   31-Oct-20
      7786            United Kingdom, Canada, United States    1-Mar-20

            release_year  rating  duration  \
      7782          2005   TV-MA    99 min
      7783          2015   TV-14   111 min
      7784          2019   TV-MA    44 min
      7785          2019   TV-PG  1 Season
      7786          2019   TV-MA    90 min

                                                  listed_in  \
      7782                     Dramas, International Movies
      7783   Dramas, International Movies, Music & Musicals
      7784  Documentaries, International Movies, Music & M…
      7785               International TV Shows, Reality TV
      7786                  Documentaries, Music & Musicals

                                                description
      7782  When Lebanon's Civil War deprives Zozo of his …
      7783  A scrappy but poor boy worms his way into a ty…
      7784  In this documentary, South African rapper Nast…
      7785  Dessert wizard Adriano Zumbo looks for the nex…
      7786  This documentary delves into the mystique behi…

3. Exploratory Data Analysis (EDA)

3.1 Basic Dataset Information
[47]: # Display the number of rows and columns
      df.shape

[47]: (7787, 12)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7787 entries, 0 to 7786
Data columns (total 12 columns):
 #   Column        Non-Null Count  Dtype
---  ------        --------------  -----
 0   show_id       7787 non-null   object
 1   type          7787 non-null   object
 2   title         7787 non-null   object
 3   director      5398 non-null   object
 4   cast          7069 non-null   object
 5   country       7280 non-null   object
 6   date_added    7777 non-null   object
 7   release_year  7787 non-null   int64
 8   rating        7780 non-null   object
 9   duration      7787 non-null   object
 10  listed_in     7787 non-null   object
 11  description   7787 non-null   object
dtypes: int64(1), object(11)
memory usage: 730.2+ KB

[49]: show_id         object
      type            object
      title           object
      director        object
      cast            object
      country         object
      date_added      object
      release_year     int64
      rating          object
      duration        object
      listed_in       object
      description     object
      dtype: object

3.2 Check for Missing Values

[50]: <Axes: >
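
The cell that produced this output is not preserved here. `<Axes: >` indicates that a plot was drawn, but the plotting call itself cannot be recovered. As a plot-free sketch of the same check, missing values per column can be counted with `isnull().sum()`. The frame `sample` below is made-up illustration data, not the real dataset.

```python
import pandas as pd

# Small made-up frame with a few missing entries (None).
sample = pd.DataFrame({
    "title": ["3%", "7:19", "Zulu Man in Japan"],
    "director": [None, "Jorge Michel Grau", None],
    "country": ["Brazil", "Mexico", None],
})

# Number of missing entries in each column.
missing_counts = sample.isnull().sum()

# Share of rows missing in each column, as a percentage.
missing_pct = (sample.isnull().mean() * 100).round(1)

print(missing_counts)
print(missing_pct)
```

On the full dataset, the non-null counts in the `info()` output of section 3.1 imply 2,389 missing directors, 718 missing cast entries, 507 missing countries, 10 missing `date_added` values, and 7 missing ratings out of 7,787 rows.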

3.3 Duplicate Rows
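
The cell for this step and its result are not preserved in this copy. A minimal sketch of a duplicate check, using a made-up frame named `sample`, might look like this:

```python
import pandas as pd

# Made-up frame in which the second and third rows are identical.
sample = pd.DataFrame({
    "show_id": ["s1", "s2", "s2", "s3"],
    "title": ["3%", "7:19", "7:19", "23:59"],
})

# Count rows that exactly repeat an earlier row.
n_duplicates = int(sample.duplicated().sum())

# Keep the first occurrence of each row and drop the repeats.
deduped = sample.drop_duplicates().reset_index(drop=True)

print(n_duplicates, len(deduped))
```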

4. Insights and Queries

4.1 Content Distribution (Movies vs. TV Shows)
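
The cell and chart for this step are not preserved in this copy. As a sketch, the split between Movies and TV Shows can be tabulated with `value_counts()`. The five `type` values below mirror the `df.head()` output shown earlier and are used only for illustration, so the resulting shares say nothing about the full dataset.

```python
import pandas as pd

# The 'type' values of the first five rows shown in the df.head() output.
sample = pd.DataFrame({"type": ["TV Show", "Movie", "Movie", "Movie", "Movie"]})

# Absolute counts per content type.
type_counts = sample["type"].value_counts()

# The same split expressed as percentages.
type_share = (sample["type"].value_counts(normalize=True) * 100).round(1)

print(type_counts)
print(type_share)
```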

4.2 Content by Country
[73]: # Identify the top 10 countries with the most content on Netflix
      top_countries = df['country'].value_counts().nlargest(10)
      top_countries_df = df[df['country'].isin(top_countries.index)]

      # Bar chart for the top 10 countries, with one bar per content type
      # (countplot with hue draws the bars side by side, not stacked)
      plt.figure(figsize=(12, 6))
      sns.countplot(data=top_countries_df, x='country', hue='type', palette='pastel')
      plt.title("Top 10 Countries: Movie and TV Show Split")
      plt.xticks(rotation=45)
      plt.xlabel('Country')
      plt.ylabel('Count')
      plt.legend(title='Content Type')
      plt.show()
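
Note that `value_counts()` on the raw `country` column treats a co-production such as "United Kingdom, Canada, United States" as a single value of its own. If each listed country should be credited separately, one option is to split and explode the column first. This is a sketch of that alternative, not what the original cell does. The frame `sample` is made up.

```python
import pandas as pd

# Made-up rows, including a multi-country entry and a missing country.
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "country": [
        "United States",
        "United Kingdom, Canada, United States",
        "India",
        None,
    ],
})

# One row per (title, country) pair: split on commas, explode, trim spaces.
countries = (
    sample["country"]
    .dropna()
    .str.split(",")
    .explode()
    .str.strip()
)

country_counts = countries.value_counts()
print(country_counts)
```

With this approach a title produced in three countries contributes to all three counts, so the totals no longer sum to the number of titles.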

4.3 Content Added Over Time
[74]: # Extract year from the date_added column
      df['Year'] = pd.to_datetime(df['date_added'], errors='coerce').dt.year
      yearly_content = df['Year'].value_counts().sort_index()

      # Column Chart for Content Added Over Time
      plt.figure(figsize=(12, 6))
      sns.countplot(data=df, x='Year', hue='type', palette='Set2')
      plt.title("Movies & TV Shows Added Over Time")
      plt.xticks(rotation=45)
      plt.xlabel('Year')
      plt.ylabel('Count')
      plt.legend(title='Content Type')
      plt.show()

<ipython-input-74-dadcf18ad24f>:2: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  df['Year'] = pd.to_datetime(df['date_added'], errors='coerce').dt.year
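
The warning asks for an explicit date format. The `date_added` values visible in the sample output (for example `14-Aug-20` and `1-Jan-20`) look like day, abbreviated month name, and two-digit year, which corresponds to `%d-%b-%y`. A minimal sketch, assuming every row follows that pattern and that month names are in English:

```python
import pandas as pd

# Values in the style of the 'date_added' column, plus two unparseable entries.
sample = pd.Series(["14-Aug-20", "1-Jan-20", "23-Dec-16", None, "not a date"])

# '%d-%b-%y' = day, abbreviated month name, two-digit year.
# errors='coerce' turns anything that does not match into NaT.
parsed = pd.to_datetime(sample, format="%d-%b-%y", errors="coerce")

years = parsed.dt.year
print(parsed)
print(years)
```

If some rows in the real file use a different layout, they would become `NaT` under this format, so it is worth comparing the number of missing values before and after parsing.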

<ipython-input-75-9910a64bb36c>:6: FutureWarning:

Passing `palette` without assigning `hue` is deprecated and will be removed in v0.14.0. Assign the `x` variable to `hue` and set `legend=False` for the same effect.

  sns.barplot(x=age_ratings_dist.index, y=age_ratings_dist.values, palette='viridis')

5. Visualizations

5.1 Content Distribution by Category
[77]: # Pie chart for Movie and TV Show distribution
      df['type'].value_counts().plot.pie(autopct='%1.1f%%', startangle=90, colors=[netflix_red, netflix_black])
      plt.title("Netflix Content Distribution: Movies vs TV Shows")
      plt.ylabel('')
      plt.show()

5.2 Content by Country (Top 10)
[78]: # Bar chart for the top 10 countries, with one bar per content type
      plt.figure(figsize=(12, 6))
      sns.countplot(data=top_countries_df, x='country', hue='type', palette='pastel')
      plt.title("Top 10 Countries: Movie and TV Show Split")
      plt.xticks(rotation=45)
      plt.show()

5.3 Content Added Over Time
[79]: # Content additions over the years
      plt.figure(figsize=(12, 6))
      sns.countplot(data=df, x='Year', hue='type', palette='Set2')
      plt.title("Movies & TV Shows Added Over Time")
      plt.xticks(rotation=45)
      plt.show()

5.4 Monthly Content Additions
[83]: import pandas as pd
      import matplotlib.pyplot as plt
      import seaborn as sns
      import calendar

      # Assuming df is already defined and contains the 'date_added' column

      # Extract the month in which each title was added, for the line chart
      df['Month'] = pd.to_datetime(df['date_added'], errors='coerce').dt.month
      monthly_content = df.groupby('Month').size()

      # Line chart for monthly content additions
      plt.figure(figsize=(10, 5))
      sns.lineplot(x=monthly_content.index, y=monthly_content.values, marker="o", color='red')  # 'red' is used in case netflix_red is not defined
      plt.title("Monthly Content Addition on Netflix")
      plt.xlabel("Month")
      plt.ylabel("Content Count")

      # Set the x-ticks to be the month names instead of numbers
      month_names = [calendar.month_name[i] for i in range(1, 13)]  # Generate month names
      plt.xticks(monthly_content.index, month_names, rotation=45)  # Rotate for better visibility

<ipython-input-83-c099d03177b8>:9: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  df['Month'] = pd.to_datetime(df['date_added'], errors='coerce').dt.month

5.5 Age Ratings Distribution by Country

[84]: import pandas as pd
      import matplotlib.pyplot as plt

      # Assuming df is already defined and contains 'country' and 'rating' columns

      # Age ratings distribution by country
      age_ratings_by_country = df.groupby(['country', 'rating']).size().unstack().fillna(0)

      # Check which top countries are in the age ratings data
      valid_top_countries = top_countries.index[top_countries.index.isin(age_ratings_by_country.index)]

      # Plot the data for valid top countries only
      age_ratings_by_country.loc[valid_top_countries].plot(
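
The plotting call is cut off in this copy, so its arguments are not reproduced here. The table-building step before it can be illustrated on its own. The sketch below uses a made-up frame named `sample` and passes `fill_value=0` to `unstack()`, a small variation on the `unstack().fillna(0)` used above that keeps the counts as integers.

```python
import pandas as pd

# Made-up (country, rating) pairs for illustration.
sample = pd.DataFrame({
    "country": ["United States", "United States", "India", "India", "India"],
    "rating": ["TV-MA", "PG-13", "TV-14", "TV-14", "TV-MA"],
})

# Rows are countries, columns are ratings, cells are title counts.
rating_table = (
    sample.groupby(["country", "rating"])
    .size()
    .unstack(fill_value=0)
)

print(rating_table)
```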

6. Advanced Insights and Analysis

6.2 Content Rating Analysis

[97]: # Count of movies in Canada with TV-14 rating
      tv_14_canada = df[(df['rating'] == 'TV-14') & (df['country'] == 'Canada') & (df['type'] == 'Movie')]
      len(tv_14_canada)

[97]: 11
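
The filter above compares `country` with the exact string 'Canada', so the count of 11 covers titles listing Canada as the only country. Co-productions such as "United Kingdom, Canada, United States" are not matched. Whether that is intended is not stated. The sketch below contrasts the exact match with a substring match on a made-up frame named `sample`.

```python
import pandas as pd

# Made-up rows: one Canada-only movie, one co-production, one TV show, one R-rated movie.
sample = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie"],
    "rating": ["TV-14", "TV-14", "TV-14", "R"],
    "country": [
        "Canada",
        "United Kingdom, Canada, United States",
        "Canada",
        "Canada",
    ],
})

is_tv14_movie = (sample["rating"] == "TV-14") & (sample["type"] == "Movie")

# Exact match: Canada must be the only country listed.
exact = sample[is_tv14_movie & (sample["country"] == "Canada")]

# Substring match: Canada may appear anywhere in the list; missing values count as no match.
inclusive = sample[is_tv14_movie & sample["country"].str.contains("Canada", na=False)]

print(len(exact), len(inclusive))
```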

6.3 Country with Most TV Shows
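
The cell and its result for this query are not preserved in this copy. One way such a query could be written is sketched below on a made-up frame named `sample`. The country it reports is a property of the sample only and says nothing about the real dataset.

```python
import pandas as pd

# Made-up rows for illustration.
sample = pd.DataFrame({
    "type": ["TV Show", "TV Show", "Movie", "TV Show", "Movie"],
    "country": ["Brazil", "Australia", "India", "Australia", "Australia"],
})

# Count TV shows per country, then pick the country with the highest count.
tv_counts = sample.loc[sample["type"] == "TV Show", "country"].value_counts()
top_country = tv_counts.idxmax()
top_count = int(tv_counts.max())

print(top_country, top_count)
```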

Conclusion

The Netflix Content Analysis project has revealed significant insights into the streaming platform’s
strategies for catering to diverse audiences. By comparing content trends between the USA and India, we
observed distinct patterns in the growth of content addition, particularly after Netflix’s entry into the
Indian market in 2016.

Key findings include the identification of target age demographics, variations in ratings across regions, and
the presence of notable actors, which highlight Netflix’s tailored approach to meet viewer preferences.
Overall, this analysis underscores the importance of understanding regional differences in content strategy,
providing a foundation for future explorations into streaming trends.

Shaun Mia | LinkedIn
