Codealpha Studentseda

The document outlines an exploratory data analysis (EDA) of student performance on a dataset of 1000 entries and 8 columns: gender, race/ethnicity, parental level of education, lunch type, test preparation course, and math, reading, and writing scores. It loads the dataset, prints an overview and summary statistics, checks for missing values (none are found), and plots math score by gender, writing score by parental education, and reading score by lunch type.

Uploaded by

brajput19007

7/4/25, 10:45 PM Untitled11.ipynb - Colab

# 📊 Task 2: Exploratory Data Analysis - Students Performance

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

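# Step 1: Load Dataset
df = pd.read_csv("/content/StudentsPerformance.csv")  # Upload file in Colab and adjust path if needed

# Step 2: Basic Overview
print("📌 First 5 Rows:")
print(df.head())

print("\n🧾 Dataset Info:")
print(df.info())  # df.info() prints its report itself and returns None, so "None" is printed after it

print("\n📊 Summary Statistics:")
print(df.describe())  # describe() covers only the numeric score columns by default

# Step 3: Missing Values
print("\n❓ Missing Values:")
print(df.isnull().sum())  # number of missing values per column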

# Step 4: Gender-wise Math Score
plt.figure(figsize=(6, 4))
sns.boxplot(x='gender', y='math score', data=df)  # median, quartiles and outliers per gender
plt.title('Math Score by Gender')
plt.show()

# Step 5: Parental Education vs Writing Score
plt.figure(figsize=(10, 5))
sns.barplot(x='parental level of education', y='writing score', data=df)  # bar height is the group mean (seaborn's default estimator)
plt.xticks(rotation=45)  # rotate the long category labels so they do not overlap
plt.title('Writing Score vs Parental Education')
plt.show()

# Step 6: Lunch Type vs Reading Score
plt.figure(figsize=(6, 4))
sns.violinplot(x='lunch', y='reading score', data=df)  # distribution of reading scores per lunch type
plt.title('Reading Score by Lunch Type')
plt.show()
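
The plots in Steps 4 to 6 compare scores across groups visually. A numeric companion could be sketched as below. This is only an illustration, not part of the original notebook: the helper name `group_means` is hypothetical, and the data is just the five rows shown in the `df.head()` output, standing in for the full CSV.

```python
import pandas as pd

# Stand-in data: the five rows printed by df.head() in the notebook output.
# With the real file, pass the full DataFrame loaded in Step 1 instead.
sample = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male"],
    "lunch": ["standard", "standard", "standard", "free/reduced", "standard"],
    "math score": [72, 69, 90, 47, 76],
    "reading score": [72, 90, 95, 57, 78],
    "writing score": [74, 88, 93, 44, 75],
})


def group_means(df, by, score_cols):
    """Hypothetical helper: mean of each score column per category of `by`."""
    return df.groupby(by)[score_cols].mean()


math_by_gender = group_means(sample, "gender", ["math score"])
reading_by_lunch = group_means(sample, "lunch", ["reading score"])
print(math_by_gender)
print(reading_by_lunch)
```

On five rows the numbers say nothing about the real dataset. The point is only that a `groupby` table gives the group averages that the box and violin plots summarise graphically.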

https://colab.research.google.com/drive/1UPzhFzstDRXlznbufYT6o8Db4GH_i-iE#scrollTo=dMms2YypY41r&printMode=true

📌 First 5 Rows:
   gender race/ethnicity parental level of education         lunch  \
0  female        group B           bachelor's degree      standard
1  female        group C                some college      standard
2  female        group B             master's degree      standard
3    male        group A          associate's degree  free/reduced
4    male        group C                some college      standard

  test preparation course  math score  reading score  writing score
0                    none          72             72             74
1               completed          69             90             88
2                    none          90             95             93
3                    none          47             57             44
4                    none          76             78             75

🧾 Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1000 entries, 0 to 999
Data columns (total 8 columns):
 #   Column                       Non-Null Count  Dtype
---  ------                       --------------  -----
 0   gender                       1000 non-null   object
 1   race/ethnicity               1000 non-null   object
 2   parental level of education  1000 non-null   object
 3   lunch                        1000 non-null   object
 4   test preparation course      1000 non-null   object
 5   math score                   1000 non-null   int64
 6   reading score                1000 non-null   int64
 7   writing score                1000 non-null   int64
dtypes: int64(3), object(5)
memory usage: 62.6+ KB
None

📊 Summary Statistics:
       math score  reading score  writing score
count  1000.00000    1000.000000    1000.000000
mean     66.08900      69.169000      68.054000
std      15.16308      14.600192      15.195657
min       0.00000      17.000000      10.000000
25%      57.00000      59.000000      57.750000
50%      66.00000      70.000000      69.000000
75%      77.00000      79.000000      79.000000
max     100.00000     100.000000     100.000000
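
The summary above covers only the three numeric score columns, while the five `object` columns are left out. One possible way to summarise those as well is sketched below. It is an assumption about how the analysis might be extended, not something the notebook does, and it again uses the five `df.head()` rows as stand-in data.

```python
import pandas as pd

# Stand-in data: categorical columns from the five rows printed by df.head().
sample = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male"],
    "lunch": ["standard", "standard", "standard", "free/reduced", "standard"],
    "test preparation course": ["none", "completed", "none", "none", "none"],
    "math score": [72, 69, 90, 47, 76],
})

# Drop the numeric columns first; describe() on a frame with no numeric
# columns reports count, unique, top (most frequent value) and freq.
categorical_summary = sample.select_dtypes(exclude="number").describe()
print(categorical_summary)
```

For each categorical column this reports how many distinct values it has and which value is most common, which is useful context before grouping scores by that column.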

❓ Missing Values:
gender                         0
race/ethnicity                 0
parental level of education    0
lunch                          0
test preparation course        0
math score                     0
reading score                  0
writing score                  0
dtype: int64
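
Every count above is zero, so this dataset needs no cleaning. For data that does have gaps, the same `isnull().sum()` check could be extended to a percentage report, roughly as sketched here. The helper name `missing_report` is hypothetical, and the single `None` in the sample is injected deliberately for illustration; it does not occur in the original data.

```python
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: missing-value count and percentage per column."""
    counts = df.isnull().sum()
    percent = counts * 100 / len(df)
    return pd.DataFrame({"missing": counts, "percent": percent})


# Stand-in data based on the df.head() rows, with one value removed on
# purpose (the None below) so that the report has something to show.
sample = pd.DataFrame({
    "gender": ["female", "female", "female", "male", "male"],
    "math score": [72, 69, 90, None, 76],
})

report = missing_report(sample)
print(report)
```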
