Name
DSES DataFrame Operations Quiz
Class
Total questions: 20
Worksheet time: 10 mins
Date
Instructor name: Mrs. Getzi Jeba 495
1. Which function sorts a DataFrame df by the column 'Salary' in ascending order?
a) df.arrange('Salary') b) df.sort_values('Salary')
c) df.sort('Salary') d) df.order_by('Salary')
2. What does df.dropna(subset=['Age']) do?
a) Drops the 'Age' column b) Drops rows where 'Age' is missing
c) Drops all columns with missing values d) Fills missing 'Age' values with zero
3. How do you select rows where 'Department' is 'HR'?
a) df.query('Department = HR') b) df.select('Department == HR')
c) df.filter('HR') d) df[df['Department'] == 'HR']
4. What is the output of df.groupby('City')['Population'].sum()?
a) Total population per city b) Average population per city
c) Sorted population values d) Number of cities
5. How do you add a column 'Tax' to df in place, as 15% of 'Income'?
a) df.assign(Tax = df.Income * 0.15) b) df.new_column('Tax', df['Income'] * 0.15)
c) df.insert('Tax', df['Income'] * 0.15) d) df['Tax'] = df['Income'] * 0.15
6. What does df.describe() return?
a) First 5 rows b) Missing value counts
c) Summary statistics for numeric columns d) Column names
7. How do you rename the column 'OldName' to 'NewName'?
a) df.rename(columns={'OldName': 'NewName'}) b) df.set_column_name('OldName', 'NewName')
c) df.columns.replace('OldName', 'NewName') d) df['NewName'] = df['OldName']
8. What does df['Age'].mean() compute?
a) Minimum age b) Median age
c) Mode age d) Average age
9. How do you drop the column 'Temp' from df?
a) df.drop_column('Temp') b) df.drop(columns=['Temp'])
c) df.delete('Temp') d) df.remove('Temp')
10. What does df.duplicated().sum() return?
a) Sum of all values b) Number of duplicate rows
c) Count of unique rows d) Index of duplicates
11. How do you fill missing values in df with each column's mean?
a) df.dropna() b) df.replace(np.nan, df.mean())
c) df.interpolate() d) df.fillna(df.mean())
12. What does df.isnull().any() return?
a) Count of missing values b) A list of missing indices
c) True for columns with any missing values d) A DataFrame without nulls
13. How do you drop rows where all values are missing?
a) df.dropna(axis=1) b) df.dropna(how='all')
c) df.dropna() d) df.remove_null_rows()
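14. What does df['Age'].fillna(method='ffill') do? (In pandas 2.1 and later the method argument is deprecated; df['Age'].ffill() is the equivalent call.)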
a) Fills with zero b) Backward-fills missing ages
c) Forward-fills missing ages d) Drops missing ages
15. How do you replace NaN with -1 in the 'Score' column?
a) df.fillna({'Score': -1}) b) All of these
c) df['Score'].replace(np.nan, -1) d) df['Score'].fillna(-1)
16. What does df.groupby('Dept')['Salary'].agg(['mean', 'max']) return?
a) Sorted salaries b) Total salary per department
c) Mean and max salary per department d) Count of employees per department
17. How do you compute the median salary by gender?
a) df.agg({'Salary': 'median'}, by='Gender') b) df['Salary'].groupby('Gender').median()
c) df.groupby('Gender')['Salary'].median() d) df.median('Salary').groupby('Gender')
18. What does df.pivot_table(values='Sales', index='Region', columns='Quarter') create?
a) A grouped DataFrame b) A pivot table of sales by region and quarter
c) A transposed DataFrame d) A summary of unique values
19. How do you reset the index after a groupby operation?
a) .reindex() b) .set_index()
c) .flatten() d) .reset_index()
20. What does df.nlargest(3, 'Salary') return?
a) 3 random rows b) Bottom 3 rows by salary
c) Rows with salary > 3 d) Top 3 rows by salary