7/4/25, 10:45 PM Untitled11.ipynb - Colab
# 📊 Task 2: Exploratory Data Analysis - Students Performance
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
# Step 1: Load Dataset
df = pd.read_csv("/content/StudentsPerformance.csv") # Upload file in Colab and adjust path if needed
# Step 2: Basic Overview
print("📌 First 5 Rows:")
print(df.head())
print("\n🧾 Dataset Info:")
print(df.info())  # df.info() prints the summary itself and returns None, so this line also prints "None"
print("\n📊 Summary Statistics:")
print(df.describe())
# Step 3: Missing Values
print("\n❓ Missing Values:")
print(df.isnull().sum())
# Step 4: Gender-wise Math Score
plt.figure(figsize=(6,4))
sns.boxplot(x='gender', y='math score', data=df)
plt.title('Math Score by Gender')
plt.show()
# Step 5: Parental Education vs Writing Score
plt.figure(figsize=(10,5))
sns.barplot(x='parental level of education', y='writing score', data=df)
plt.xticks(rotation=45)
plt.title('Writing Score vs Parental Education')
plt.show()
# Step 6: Lunch Type vs Reading Score
plt.figure(figsize=(6,4))
sns.violinplot(x='lunch', y='reading score', data=df)
plt.title('Reading Score by Lunch Type')
plt.show()
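The plots in Steps 4 to 6 compare score distributions across groups visually. A numeric summary per group can be read alongside them. The following is a minimal sketch, not part of the original notebook: `group_summary` is a hypothetical helper, and `sample` holds only the five rows shown in the `df.head()` output (with a subset of columns) so that the snippet runs on its own. In Colab, the full `df` from Step 1 would be passed instead.

```python
import pandas as pd

# Sample rows copied from the df.head() output of this notebook (assumption:
# used here only so the sketch is self-contained; use the full `df` in Colab).
sample = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male"],
    "lunch": ["standard", "standard", "standard", "free/reduced", "standard"],
    "math score": [72, 69, 90, 47, 76],
    "reading score": [72, 90, 95, 57, 78],
    "writing score": [74, 88, 93, 44, 75],
})


def group_summary(frame, by, score):
    """Return count, mean, and median of `score` for each level of `by`.

    Hypothetical helper, not a pandas or seaborn function.
    """
    return frame.groupby(by)[score].agg(["count", "mean", "median"])


# Numeric counterpart to the Step 4 boxplot (math score by gender).
math_by_gender = group_summary(sample, "gender", "math score")
print(math_by_gender)
```

The same helper could be called with `"lunch"` and `"reading score"`, or with `"parental level of education"` and `"writing score"` on the full dataset, to mirror Steps 5 and 6. With only five sample rows the numbers are illustrative and say nothing about the full dataset.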
https://colab.research.google.com/drive/1UPzhFzstDRXlznbufYT6o8Db4GH_i-iE#scrollTo=dMms2YypY41r&printMode=true 1/2
📌 First 5 Rows:
   gender race/ethnicity parental level of education         lunch  \
0  female        group B           bachelor's degree      standard
1  female        group C                some college      standard
2  female        group B             master's degree      standard
3    male        group A          associate's degree  free/reduced
4    male        group C                some college      standard

  test preparation course  math score  reading score  writing score
0                    none          72             72             74
1               completed          69             90             88
2                    none          90             95             93
3                    none          47             57             44
4                    none          76             78             75
🧾 Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
 #   Column                       Non-Null Count  Dtype
---  ------                       --------------  -----
 0   gender                       1000 non-null   object
 1   race/ethnicity               1000 non-null   object
 2   parental level of education  1000 non-null   object
 3   lunch                        1000 non-null   object
 4   test preparation course      1000 non-null   object
 5   math score                   1000 non-null   int64
 6   reading score                1000 non-null   int64
 7   writing score                1000 non-null   int64
dtypes: int64(3), object(5)
memory usage: 62.6+ KB
None
📊 Summary Statistics:
       math score  reading score  writing score
count  1000.00000    1000.000000    1000.000000
mean     66.08900      69.169000      68.054000
std      15.16308      14.600192      15.195657
min       0.00000      17.000000      10.000000
25%      57.00000      59.000000      57.750000
50%      66.00000      70.000000      69.000000
75%      77.00000      79.000000      79.000000
max     100.00000     100.000000     100.000000
❓ Missing Values:
gender                         0
race/ethnicity                 0
parental level of education    0
lunch                          0
test preparation course        0
math score                     0
reading score                  0
writing score                  0
dtype: int64
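The output above shows no missing values, and the summary statistics show every score between 0 and 100. Both observations could be turned into a small repeatable check. The following is a sketch under stated assumptions: `check_scores` is a hypothetical helper, the 0 to 100 valid range is inferred from the `describe()` output rather than documented anywhere in the notebook, and `sample` again holds only the five rows printed by `df.head()`.

```python
import pandas as pd

SCORE_COLUMNS = ["math score", "reading score", "writing score"]


def check_scores(frame, columns=SCORE_COLUMNS, low=0, high=100):
    """Count missing and out-of-range values in the score columns.

    Hypothetical helper; the default 0-100 range is an assumption based on
    the min/max values in the summary statistics.
    """
    scores = frame[columns]
    out_of_range = (scores < low) | (scores > high)  # NaN compares as False
    return {
        "missing": int(scores.isnull().sum().sum()),
        "out_of_range": int(out_of_range.sum().sum()),
    }


# Sample rows copied from the df.head() output; use the full `df` in Colab.
sample = pd.DataFrame({
    "math score": [72, 69, 90, 47, 76],
    "reading score": [72, 90, 95, 57, 78],
    "writing score": [74, 88, 93, 44, 75],
})

report = check_scores(sample)
print(report)
```

Returning counts rather than raising an error keeps the check usable as an exploratory step; an `assert` on the returned values could be added where a hard failure is preferred.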