Assignment: Visualizing the Movie Dataset
Objective:
Learn how to visualize data to gain insights from the movie dataset.
Part 1: Dataset Overview
1. Load the movie dataset into a Pandas DataFrame.
2. Inspect the data:
o Display the first few rows.
o Check for missing values and fill or drop them.
o Get a summary of the data (e.g., column data types, basic
statistics).
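The Part 1 steps can be sketched roughly as follows. This is a minimal illustration, not a required solution: the inline TOY_CSV text and its values are invented stand-ins for the real dataset file, the title column is an assumption, and the fill/drop policy shown is just one reasonable choice.

```python
import io

import pandas as pd

# Hypothetical toy data standing in for the real movie file.
# With the actual dataset, use pd.read_csv on its path instead.
TOY_CSV = """title,release_year,popularity,vote_average
Movie A,2014,32.5,7.1
Movie B,2014,8.2,
Movie C,2015,,6.4
Movie D,2015,15.0,5.9
Movie E,2016,4.7,6.8
"""

# 1. Load the data into a DataFrame.
df = pd.read_csv(io.StringIO(TOY_CSV))

# 2a. Display the first few rows.
print(df.head())

# 2b. Count missing values per column.
missing_before = df.isna().sum()
print(missing_before)

# One possible policy (an assumption, adjust to your data):
# fill missing popularity with the median, drop rows with no rating.
df["popularity"] = df["popularity"].fillna(df["popularity"].median())
df = df.dropna(subset=["vote_average"]).reset_index(drop=True)

# 2c. Summarize column data types and basic statistics.
df.info()
print(df.describe())
```

Whether to fill or drop depends on how many values are missing and how the column will be used, so it is worth stating your choice in the notebook.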
Part 2: Simple Visualizations
1. Univariate Analysis (one variable):
o Create a histogram of the popularity column to see how movie
popularity is distributed.
o Create a bar plot of the mean vote_average for each
release_year.
2. Bivariate Analysis (two variables):
o Create a scatter plot to show the relationship between
popularity and vote_average.
o Create a box plot of vote_average grouped by release_year to
see how the distribution of ratings has changed over time.
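One possible way to lay out the four Part 2 plots with Matplotlib is sketched below. The DataFrame values are invented toy data used only so the example runs on its own; the bin count, figure size, and 2x2 grid are arbitrary choices, not requirements of the assignment.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical toy data; replace with the cleaned movie DataFrame.
df = pd.DataFrame({
    "release_year": [2014, 2014, 2014, 2015, 2015, 2015, 2016, 2016, 2016],
    "popularity": [5.0, 12.0, 30.0, 8.0, 15.0, 22.0, 3.0, 18.0, 40.0],
    "vote_average": [7.0, 6.0, 8.0, 5.5, 6.5, 7.5, 6.0, 7.0, 8.0],
})

fig, axes = plt.subplots(2, 2, figsize=(12, 8))

# Univariate: histogram of popularity.
axes[0, 0].hist(df["popularity"], bins=5, edgecolor="black")
axes[0, 0].set_title("Distribution of popularity")
axes[0, 0].set_xlabel("popularity")
axes[0, 0].set_ylabel("number of movies")

# Univariate: bar plot of mean vote_average per release_year.
yearly_mean = df.groupby("release_year")["vote_average"].mean()
axes[0, 1].bar(yearly_mean.index.astype(str), yearly_mean.values)
axes[0, 1].set_title("Mean vote_average by release_year")
axes[0, 1].set_xlabel("release_year")
axes[0, 1].set_ylabel("mean vote_average")

# Bivariate: scatter plot of popularity against vote_average.
axes[1, 0].scatter(df["popularity"], df["vote_average"], alpha=0.7)
axes[1, 0].set_title("popularity vs. vote_average")
axes[1, 0].set_xlabel("popularity")
axes[1, 0].set_ylabel("vote_average")

# Bivariate: one box of vote_average per release_year.
years = sorted(df["release_year"].unique())
groups = [
    df.loc[df["release_year"] == year, "vote_average"].to_numpy()
    for year in years
]
axes[1, 1].boxplot(groups)
axes[1, 1].set_xticklabels([str(year) for year in years])
axes[1, 1].set_title("vote_average by release_year")
axes[1, 1].set_xlabel("release_year")
axes[1, 1].set_ylabel("vote_average")

fig.tight_layout()
```

In a Jupyter Notebook the figure normally renders at the end of the cell; in a plain script, call plt.show() afterwards. Seaborn's histplot, barplot, scatterplot, and boxplot functions are an alternative if you prefer that library.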
Part 3: Insights
1. Write a short summary of what you learned from the visualizations.
For example:
o Do popularity and vote_average show any noticeable relationship?
o How have average movie ratings changed over time?
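If it helps to back up the written summary with numbers, the two questions above could be checked along these lines. This is an optional sketch on invented toy values; the Pearson correlation only captures linear association, so it complements the scatter plot rather than replacing it.

```python
import pandas as pd

# Hypothetical toy data; replace with the cleaned movie DataFrame.
df = pd.DataFrame({
    "release_year": [2014, 2014, 2015, 2015, 2016, 2016],
    "popularity": [5.0, 20.0, 10.0, 30.0, 15.0, 40.0],
    "vote_average": [6.0, 7.0, 6.5, 7.5, 7.0, 8.0],
})

# Pearson correlation between popularity and vote_average.
corr = df["popularity"].corr(df["vote_average"])
print(f"Correlation between popularity and vote_average: {corr:.2f}")

# Mean rating per year, and the change from the first to the last year.
yearly_mean = df.groupby("release_year")["vote_average"].mean()
change = yearly_mean.iloc[-1] - yearly_mean.iloc[0]
print(yearly_mean)
print(f"Change in mean vote_average from first to last year: {change:+.2f}")
```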
Tools:
Use Matplotlib or Seaborn to create the visualizations.
Submission:
Submit a Jupyter Notebook with the code and a few lines explaining your
findings.