DataFrame Practical Questions
📌 Q1: Student Records
Data Table:
RollNo Name Class Marks
101 Aditi 12 89
102 Rohan 11 76
103 Simran 12 93
104 Neha 11 85
105 Aman 12 78
Tasks:
Create DataFrame using the above data.
Display students with marks above 80.
Add a new column "Result" with value "Pass".
📌 Q2: Product Inventory
Data Table:
ProductID Product Price Quantity
1 Pen 10 100
2 Pencil 5 200
3 Eraser 7 150
4 Notebook 40 50
5 Marker 25 80
Tasks:
Add column 'Total' = Price × Quantity.
Display products with Quantity > 100.
Sort by 'Total' in descending order.
📌 Q3: Employee Records
EmpID Name Department Salary
1 Raj HR 35000
2 Simran IT 50000
3 Aman Finance 40000
4 John IT 60000
5 Meena HR 38000
Tasks:
Filter employees in IT department.
Add a new column 'Bonus' = 10% of salary.
Display highest and lowest salary.
📌 Q4: Movie Ratings
Movie Year Rating
Inception 2010 8.8
Dangal 2016 8.5
Titanic 1997 7.8
Avatar 2009 7.9
3 Idiots 2009 8.4
Tasks:
Show movies with rating above 8.
Rename column Rating to IMDB_Rating.
Sort by Rating.
📌 Q5: COVID-19 Statewise
State Cases Recovered
Delhi 650000 640000
Maharashtra 1250000 1200000
UP 490000 470000
Bihar 300000 290000
Kerala 580000 570000
Tasks:
Add column 'Active' = Cases - Recovered.
Display states with > 5 lakh cases.
Show total cases using sum().
📌 Q6: Library Books
BookID Title Author Price
1 Python Basics R. Kumar 450
2 Data Science A. Sharma 550
3 AI with Python J. Mathew 600
4 Informatics CBSE CBSE Board 500
5 Coding in Java S. Roy 480
Tasks:
Display books priced above ₹500.
Add a new column 'Available' with all values as 'Yes'.
Sort books by price.
📌 Q7: Sales by Region
Region Q1 Q2 Q3 Q4
North 100 150 130 170
Region Q1 Q2 Q3 Q4
South 120 140 160 180
East 90 110 120 100
West 80 100 110 130
Tasks:
Add column 'TotalSales' = Q1 + Q2 + Q3 + Q4.
Display region with highest TotalSales.
Calculate average sales for Q2.
📌 Q8: Cricket Stats
Player Matches Runs Average
Virat 250 12000 54.2
Rohit 220 10000 49.5
Rahul 100 4500 46.0
Dhoni 300 10500 50.1
Gill 80 3500 44.7
Tasks:
Display players with average above 50.
Sort players by Runs.
Add column 'StrikeRate' with dummy values.
📌 Q9: Online Course Registrations
Student Course Fees Completed
Ankit Python 2000 Yes
Meena Java 2500 No
Raghav Data Sci 3000 Yes
Tina AI 4000 No
Raj Python 2000 Yes
Tasks:
Filter students who completed the course.
Count how many registered for Python.
Display total fees collected.
📌 Q10: Electricity Bill Records
ConsumerID Name Units RatePerUnit
101 Anuj 300 6
102 Ria 250 5.5
103 Sumit 150 6
104 Meera 180 5.5
105 Aryan 220 6
Tasks:
Add column 'BillAmount' = Units × RatePerUnit.
Display consumers with BillAmount > 1500.
Calculate average units consumed.
Answers
🔹 Q1: Student Records
import pandas as pd
data1 = {
'RollNo': [101, 102, 103, 104, 105],
'Name': ['Aditi', 'Rohan', 'Simran', 'Neha', 'Aman'],
'Class': [12, 11, 12, 11, 12],
'Marks': [89, 76, 93, 85, 78]
}
df1 = pd.DataFrame(data1)
print(df1[df1['Marks'] > 80])
df1['Result'] = 'Pass'
🔹 Q2: Product Inventory
data2 = {
'ProductID': [1, 2, 3, 4, 5],
'Product': ['Pen', 'Pencil', 'Eraser', 'Notebook', 'Marker'],
'Price': [10, 5, 7, 40, 25],
'Quantity': [100, 200, 150, 50, 80]
}
df2 = pd.DataFrame(data2)
df2['Total'] = df2['Price'] * df2['Quantity']
print(df2[df2['Quantity'] > 100])
df2.sort_values(by='Total', ascending=False)
🔹 Q3: Employee Records
data3 = {
'EmpID': [1, 2, 3, 4, 5],
'Name': ['Raj', 'Simran', 'Aman', 'John', 'Meena'],
'Department': ['HR', 'IT', 'Finance', 'IT', 'HR'],
'Salary': [35000, 50000, 40000, 60000, 38000]
}
df3 = pd.DataFrame(data3)
print(df3[df3['Department'] == 'IT'])
df3['Bonus'] = df3['Salary'] * 0.10
print("Max:", df3['Salary'].max(), "Min:", df3['Salary'].min())
🔹 Q4: Movie Ratings
data4 = {
'Movie': ['Inception', 'Dangal', 'Titanic', 'Avatar', '3 Idiots'],
'Year': [2010, 2016, 1997, 2009, 2009],
'Rating': [8.8, 8.5, 7.8, 7.9, 8.4]
}
df4 = pd.DataFrame(data4)
print(df4[df4['Rating'] > 8])
df4.rename(columns={'Rating': 'IMDB_Rating'}, inplace=True)
df4.sort_values(by='IMDB_Rating', ascending=False)
🔹 Q5: COVID-19 Statewise
data5 = {
'State': ['Delhi', 'Maharashtra', 'UP', 'Bihar', 'Kerala'],
'Cases': [650000, 1250000, 490000, 300000, 580000],
'Recovered': [640000, 1200000, 470000, 290000, 570000]
}
df5 = pd.DataFrame(data5)
df5['Active'] = df5['Cases'] - df5['Recovered']
print(df5[df5['Cases'] > 500000])
print("Total Cases:", df5['Cases'].sum())
🔹 Q6: Library Books
data6 = {
'BookID': [1, 2, 3, 4, 5],
'Title': ['Python Basics', 'Data Science', 'AI with Python',
'Informatics CBSE', 'Coding in Java'],
'Author': ['R. Kumar', 'A. Sharma', 'J. Mathew', 'CBSE Board', 'S.
Roy'],
'Price': [450, 550, 600, 500, 480]
}
df6 = pd.DataFrame(data6)
print(df6[df6['Price'] > 500])
df6['Available'] = 'Yes'
df6.sort_values(by='Price')
🔹 Q7: Sales by Region
data7 = {
'Region': ['North', 'South', 'East', 'West'],
'Q1': [100, 120, 90, 80],
'Q2': [150, 140, 110, 100],
'Q3': [130, 160, 120, 110],
'Q4': [170, 180, 100, 130]
}
df7 = pd.DataFrame(data7)
df7['TotalSales'] = df7[['Q1', 'Q2', 'Q3', 'Q4']].sum(axis=1)
print(df7[df7['TotalSales'] == df7['TotalSales'].max()])
print("Avg Q2:", df7['Q2'].mean())
🔹 Q8: Cricket Stats
data8 = {
'Player': ['Virat', 'Rohit', 'Rahul', 'Dhoni', 'Gill'],
'Matches': [250, 220, 100, 300, 80],
'Runs': [12000, 10000, 4500, 10500, 3500],
'Average': [54.2, 49.5, 46.0, 50.1, 44.7]
}
df8 = pd.DataFrame(data8)
print(df8[df8['Average'] > 50])
df8.sort_values(by='Runs', ascending=False)
df8['StrikeRate'] = [90.1, 88.4, 85.6, 87.3, 84.5]
🔹 Q9: Online Course Registrations
data9 = {
'Student': ['Ankit', 'Meena', 'Raghav', 'Tina', 'Raj'],
'Course': ['Python', 'Java', 'Data Sci', 'AI', 'Python'],
'Fees': [2000, 2500, 3000, 4000, 2000],
'Completed': ['Yes', 'No', 'Yes', 'No', 'Yes']
}
df9 = pd.DataFrame(data9)
print(df9[df9['Completed'] == 'Yes'])
print("Python count:", (df9['Course'] == 'Python').sum())
print("Total Fees:", df9['Fees'].sum())
🔹 Q10: Electricity Bill Records
data10 = {
'ConsumerID': [101, 102, 103, 104, 105],
'Name': ['Anuj', 'Ria', 'Sumit', 'Meera', 'Aryan'],
'Units': [300, 250, 150, 180, 220],
'RatePerUnit': [6, 5.5, 6, 5.5, 6]
}
df10 = pd.DataFrame(data10)
df10['BillAmount'] = df10['Units'] * df10['RatePerUnit']
print(df10[df10['BillAmount'] > 1500])
print("Average Units:", df10['Units'].mean())