Pyspark SQL Practice Questions No Window

Uploaded by simpat00000

PySpark SQL Hands-On Practice Questions

1. Display the first 10 orders with all columns from OrderDetails.


2. Select only Order ID, Order Date, Customer ID, Sales from OrderDetails.

3. Get all orders where Profit is less than 0.
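
As a starting point for questions 1 to 3, here is a minimal sketch. It assumes a SparkSession is passed in as `spark` and that the orders data has already been registered as a temp view called OrderDetails (for example with `createOrReplaceTempView`). The column names are taken from the questions; the function names are only illustrative.

```python
# Assumption: `spark` is an existing SparkSession and OrderDetails is a
# registered temp view. Column names containing spaces need backticks.

def first_orders(spark, n=10):
    """Question 1: n rows with every column.

    LIMIT without ORDER BY returns an arbitrary set of n rows.
    """
    return spark.sql(f"SELECT * FROM OrderDetails LIMIT {int(n)}")


def order_summary(spark):
    """Question 2: only the four requested columns."""
    return spark.sql(
        "SELECT `Order ID`, `Order Date`, `Customer ID`, Sales FROM OrderDetails"
    )


def loss_making_orders(spark):
    """Question 3: orders whose Profit is below zero."""
    return spark.sql("SELECT * FROM OrderDetails WHERE Profit < 0")
```

Each function returns a DataFrame, so `first_orders(spark).show()` prints the result.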


4. Find all customers from Customer_East who are from New York.
5. Find all products from Product_Data where the Category is "Furniture".

6. Get all customer names containing the word "Smith".


7. Find all products where Product Name starts with "Bush".
8. Get all customers whose city name ends with "ville".
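
Questions 6 to 8 differ only in where the wildcard sits in the LIKE pattern. The sketch below assumes Customer_East and Product_Data are registered as temp views and that City is a column of Customer_East; the dictionary keys and the helper name are made up for illustration.

```python
# In a LIKE pattern, % matches any run of characters (including none) and
# _ matches exactly one character. Spark SQL LIKE is case-sensitive.
LIKE_QUERIES = {
    # Question 6: "Smith" anywhere in the name
    "contains_smith":
        "SELECT * FROM Customer_East WHERE `Customer Name` LIKE '%Smith%'",
    # Question 7: name begins with "Bush"
    "starts_with_bush":
        "SELECT * FROM Product_Data WHERE `Product Name` LIKE 'Bush%'",
    # Question 8: city ends with "ville"
    "ends_with_ville":
        "SELECT * FROM Customer_East WHERE City LIKE '%ville'",
}


def run_like_queries(spark):
    """Run every query and return a dict of name -> DataFrame."""
    return {name: spark.sql(query) for name, query in LIKE_QUERIES.items()}
```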

9. Find total sales and total profit for each Category.


10. Find average sales per Customer ID.
11. Find the top 5 customers with the highest total profit.

12. Get total sales for each combination of Category and Region.
13. Count the number of orders per Ship Mode and Segment.
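
The aggregation questions all follow the same GROUP BY shape. As a hedged example, questions 10 and 11 might look like this, assuming Sales and Profit are numeric columns of an OrderDetails temp view; the aliases and function names are illustrative.

```python
# Question 11: sum Profit per customer, sort descending, keep five rows.
TOP_CUSTOMERS_BY_PROFIT = """
SELECT `Customer ID`, SUM(Profit) AS total_profit
FROM OrderDetails
GROUP BY `Customer ID`
ORDER BY total_profit DESC
LIMIT 5
"""

# Question 10: average order value per customer.
AVG_SALES_PER_CUSTOMER = """
SELECT `Customer ID`, AVG(Sales) AS avg_sales
FROM OrderDetails
GROUP BY `Customer ID`
"""


def top_customers_by_profit(spark):
    return spark.sql(TOP_CUSTOMERS_BY_PROFIT)


def avg_sales_per_customer(spark):
    return spark.sql(AVG_SALES_PER_CUSTOMER)
```

Grouping by two columns, as in questions 12 and 13, only requires listing both in SELECT and in GROUP BY.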

14. Join OrderDetails and Customer_East on Customer ID and display Customer Name,
City, State, Order ID, Sales.
15. Join all three datasets (OrderDetails, Customer_East, Product_Data) to display
Order ID, Customer Name, City, Product Name, Category, Sales.
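
For question 14, a possible inner join is sketched below. It assumes both datasets are registered as temp views and share a `Customer ID` column, as the question states; the table aliases `o` and `c` and the function name are arbitrary.

```python
# Inner join: orders whose customer is not present in Customer_East are dropped.
ORDERS_WITH_CUSTOMERS = """
SELECT c.`Customer Name`, c.City, c.State, o.`Order ID`, o.Sales
FROM OrderDetails o
JOIN Customer_East c
  ON o.`Customer ID` = c.`Customer ID`
"""


def orders_with_customers(spark):
    """Question 14: one row per order with the matching customer details."""
    return spark.sql(ORDERS_WITH_CUSTOMERS)
```

Question 15 extends this with a second JOIN to Product_Data on whichever product key the two datasets share; the questions do not name that column.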

16. Find the top 10 highest Profit orders (descending order).


17. Find the top 5 most expensive products based on Sales value.

18. Extract the year, month, and day from Order Date.
19. Find total sales for each year.
20. Find orders placed in December 2020.
21. Calculate the number of days between Order Date and Ship Date.
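
The date questions can be approached with Spark SQL's built-in `year`, `month`, `dayofmonth` and `datediff` functions. This sketch assumes `Order Date` and `Ship Date` are already DATE columns of OrderDetails; if they were loaded as strings they would first need converting with `to_date` and the file's actual date format, which the questions do not specify.

```python
DATE_QUERIES = {
    # Question 18: split the order date into its parts
    "date_parts": """
        SELECT `Order ID`,
               year(`Order Date`)       AS order_year,
               month(`Order Date`)      AS order_month,
               dayofmonth(`Order Date`) AS order_day
        FROM OrderDetails
    """,
    # Question 20: orders placed in December 2020
    "december_2020": """
        SELECT * FROM OrderDetails
        WHERE year(`Order Date`) = 2020 AND month(`Order Date`) = 12
    """,
    # Question 21: datediff(end, start) returns end minus start in days
    "days_to_ship": """
        SELECT `Order ID`,
               datediff(`Ship Date`, `Order Date`) AS days_to_ship
        FROM OrderDetails
    """,
}


def run_date_queries(spark):
    """Run every query and return a dict of name -> DataFrame."""
    return {name: spark.sql(query) for name, query in DATE_QUERIES.items()}
```

Question 19 combines the first query with a GROUP BY on `year(`Order Date`)` and a `SUM(Sales)`.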

22. Find the highest sales order in each state using GROUP BY and MAX(Sales).
23. For each customer, find their total sales and the number of distinct products
they bought.
24. For each product, find how many unique customers purchased it and the total
quantity sold.
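
Questions 22 and 23 mix GROUP BY with MAX and COUNT(DISTINCT ...). The sketch below assumes State comes from Customer_East (as in question 14) and that OrderDetails carries a product key; the name `Product ID` is an assumption, since the questions never name it.

```python
# Question 22: highest single-order Sales value per state.
MAX_SALES_BY_STATE = """
SELECT c.State, MAX(o.Sales) AS max_sales
FROM OrderDetails o
JOIN Customer_East c
  ON o.`Customer ID` = c.`Customer ID`
GROUP BY c.State
"""

# Question 23: total sales and number of different products per customer.
# `Product ID` is an assumed column name.
CUSTOMER_TOTALS = """
SELECT `Customer ID`,
       SUM(Sales)                   AS total_sales,
       COUNT(DISTINCT `Product ID`) AS distinct_products
FROM OrderDetails
GROUP BY `Customer ID`
"""


def max_sales_by_state(spark):
    return spark.sql(MAX_SALES_BY_STATE)


def customer_totals(spark):
    return spark.sql(CUSTOMER_TOTALS)
```

MAX(Sales) returns only the value. To see which order produced it without a window function, the grouped result can be joined back to the orders on State and Sales.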

25. Convert all Customer Name values to uppercase.


26. Replace "Inc." with "Incorporated" in Customer Name.
27. Find the length of each Product Name.
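
The string questions map onto the Spark SQL functions `upper`, `replace` and `length`. A short sketch, assuming Customer_East and Product_Data are registered temp views (the result aliases and the helper name are illustrative):

```python
STRING_QUERIES = {
    # Question 25: uppercase every customer name
    "upper_names":
        "SELECT upper(`Customer Name`) AS customer_name_upper FROM Customer_East",
    # Question 26: replace() matches the literal text 'Inc.', so the dot is
    # not treated as a regular-expression wildcard
    "expand_inc":
        "SELECT replace(`Customer Name`, 'Inc.', 'Incorporated') AS customer_name "
        "FROM Customer_East",
    # Question 27: number of characters in each product name
    "name_lengths":
        "SELECT `Product Name`, length(`Product Name`) AS name_length "
        "FROM Product_Data",
}


def run_string_queries(spark):
    """Run every query and return a dict of name -> DataFrame."""
    return {name: spark.sql(query) for name, query in STRING_QUERIES.items()}
```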

28. Create a new column Order_Size:
- "High" if Quantity >= 5
- "Medium" if Quantity between 2 and 4
- "Low" if Quantity = 1
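
The three rules above translate directly into a CASE expression. This sketch assumes Quantity is a numeric column of an OrderDetails temp view, which the question does not state explicitly; the function name is illustrative.

```python
# CASE branches are checked top to bottom and the first match wins.
# There is no ELSE, so a Quantity outside the three rules (0, negative, NULL)
# yields NULL in Order_Size.
ORDER_SIZE_QUERY = """
SELECT *,
       CASE
           WHEN Quantity >= 5 THEN 'High'
           WHEN Quantity BETWEEN 2 AND 4 THEN 'Medium'
           WHEN Quantity = 1 THEN 'Low'
       END AS Order_Size
FROM OrderDetails
"""


def with_order_size(spark):
    """Question 28: all original columns plus the derived Order_Size."""
    return spark.sql(ORDER_SIZE_QUERY)
```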

29. Get all customers whose postal code is NULL or empty.


30. Find all orders where the discount is greater than 0.2 and profit is negative.
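
Questions 29 and 30 are both plain WHERE filters. In the sketch below the column name `Postal Code` and its location in Customer_East are assumptions, as are Discount and Profit being numeric columns of OrderDetails.

```python
# Question 29: NULL must be tested with IS NULL, because `= NULL` never
# evaluates to true. trim() makes whitespace-only values count as empty.
MISSING_POSTAL_CODE = """
SELECT * FROM Customer_East
WHERE `Postal Code` IS NULL OR trim(`Postal Code`) = ''
"""

# Question 30: both conditions must hold, so they are combined with AND.
DISCOUNTED_LOSSES = """
SELECT * FROM OrderDetails
WHERE Discount > 0.2 AND Profit < 0
"""


def missing_postal_code(spark):
    return spark.sql(MISSING_POSTAL_CODE)


def discounted_losses(spark):
    return spark.sql(DISCOUNTED_LOSSES)
```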

31. For each region, find:
- Total sales
- Average profit
- Order ID of the highest-sales order
32. For each category, find the percentage contribution of each sub-category to
total sales.
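
Question 32 is often solved with a window function, but it can also be done by joining two grouped subqueries. The sketch below assumes a hypothetical temp view named OrderProducts that already holds Category, `Sub-Category` and Sales on each row (for instance the result of joining orders to products); that view name and the aliases are not from the questions.

```python
# Inner query s: sales per (Category, Sub-Category).
# Inner query t: sales per Category.
# Joining them on Category puts each sub-category total next to its
# category total, so the share is a simple division.
PCT_OF_CATEGORY = """
SELECT s.Category,
       s.`Sub-Category`,
       ROUND(100 * s.sub_sales / t.cat_sales, 2) AS pct_of_category
FROM (SELECT Category, `Sub-Category`, SUM(Sales) AS sub_sales
      FROM OrderProducts
      GROUP BY Category, `Sub-Category`) s
JOIN (SELECT Category, SUM(Sales) AS cat_sales
      FROM OrderProducts
      GROUP BY Category) t
  ON s.Category = t.Category
"""


def pct_of_category(spark):
    """Question 32: each sub-category's share of its category's sales."""
    return spark.sql(PCT_OF_CATEGORY)
```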
33. Find all pairs of customers from the same city.
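
Question 33 calls for a self-join. One way to sketch it, assuming Customer_East is a registered temp view with `Customer ID`, `Customer Name` and City columns (the aliases `a`, `b`, `customer_a` and `customer_b` are arbitrary):

```python
# Joining the table to itself on City pairs every customer with every other
# customer in that city. The `<` comparison on Customer ID removes
# self-pairs (Ann, Ann) and mirrored duplicates (Bob, Ann vs. Ann, Bob).
SAME_CITY_PAIRS = """
SELECT a.City,
       a.`Customer Name` AS customer_a,
       b.`Customer Name` AS customer_b
FROM Customer_East a
JOIN Customer_East b
  ON a.City = b.City
 AND a.`Customer ID` < b.`Customer ID`
"""


def same_city_pairs(spark):
    """Question 33: each unordered pair of customers sharing a city."""
    return spark.sql(SAME_CITY_PAIRS)
```

If the same city name occurs in more than one state, adding `AND a.State = b.State` to the join condition keeps those apart.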

34. Get all orders in 2020 where the customer name contains "James".
