WINDOW
FUNCTIONS
Gunjan Revar
WINDOW FUNCTIONS
A Window function performs calculation across a
specified set of table rows with a reference to
current row.
Unlike aggregate functions, which return a single
value for a group of rows, window functions allow
you to retain row-level details while applying
aggregate-like calculations.
function_name (expression) OVER (
PARTITION BY column ORDER BY column)
where
function_name: The window function to apply
OVER: executes the window function over a set
of specified rows.
PARTITION BY: Divides the result data into
partitions and applies the function to each
partition separately.
ORDER BY: Specifies the order of rows in
each partition.
WINDOW FUNCTIONS
WHEN DO WE USE WINDOW FUNCTIONS?
To compare rows in a group (e.g., sales rank by
region).
To compute cumulative or rolling metrics (e.g.,
cumulative sales, 3-month moving averages).
To improve query clarity and reduce complexity
when analyzing data.
COMMONLY USED WINDOW FUNCTIONS
Some of the most used window functions are:
RANK()
DENSE_RANK()
ROW_NUMBER()
LAG()
LEAD()
NTILE()
RANK()
The RANK() function assigns a rank to each row
within a partition based on the specified ORDER
BY clause.
Gaps are left in the ranking if there are ties. Thus if
two values are the same, they will be given a same
rank and the next rank will skip a number.
EXAMPLE:
Ranking employees by salary in each department.
SELECT employee_id, salary,
RANK() OVER (
PARTITION BY department_id ORDER BY salary
DESC) AS rank
FROM employees;
DENSE_RANK()
The DENSE_ RANK() function is similar to the
RANK() function but without gaps in the ranking
for ties.
Thus tied values will have the same rank and the
next value will continue sequentially without
skipping a number.
EXAMPLE:
Ranking products by sales with no rank gaps.
SELECT product_id, sales,
DENSE_RANK() OVER (
PARTITION BY category_id ORDER BY sales
DESC) AS dense_rank
FROM products;
ROW_NUMBER()
The ROW_NUMBER() assigns a unique sequential
number to each row within a partition, regardless of
ties.
EXAMPLE:
Numbering rows of the employee table to get the
highest-paid employee in each department.
SELECT employee_id, salary,
ROW_NUMBER() OVER (
PARTITION BY department_id ORDER BY salary
DESC) AS row_number
FROM employees;
LAG()
The LAG() function retrieves the value from the
previous row within a partition.
EXAMPLE:
Retrieves the previous month’s sales for comparison
with the current month’s sales.
SELECT salesperson_id, month, sales,
LAG(sales) OVER (
PARTITION BY salesperson_id ORDER BY
month) AS previous_sales,
FROM monthly_sales;
LEAD()
The LEAD() function is similar to LAG() function
but it retrieves the value from the next row within a
partition.
EXAMPLE:
Retrieve the next month’s sales and calculate the
change from the current value.
SELECT salesperson_id, month,
LEAD(sales) OVER (PARTITION BY
salesperson_id ORDER BY month) - sales AS
change_in_sales
FROM monthly_sales;
NTILE()
The NTILE() function divides rows into ‘n’ groups
and assigns a number to each group.
EXAMPLE:
Splitting sales data into quartiles (4 groups).
SELECT salesperson_id, sales,
NTILE(4) OVER(
ORDER BY sales DESC) AS quartile
FROM sales_data;
AGGREGATE FUNCTIONS
Alternatively, we can also use aggregate functions
with the window functions.
Using aggregate functions with window functions
lets us calculate metrics like sums, averages, counts,
or other aggregations over a specified set of rows,
while retaining the individual row details.
The aggregate functions in SQL are:
SUM()
COUNT()
MIN() & MAX()
AVG()
SUM()
The SUM() function can be used to calculate a
cumulative or running total within the partition.
EXAMPLE:
Cumulative total of sales by region.
SELECT region, salesperson_id, sales_amount,
SUM(sales_amount) OVER (
PARTITION BY region ORDER BY sales_date)
AS cumulative_sales
FROM sales;
COUNT()
The COUNT() function can be used to count the
number of rows in the window.
EXAMPLE:
Row count by region.
SELECT region, salesperson_id,
COUNT(*) OVER(
PARTITION BY region) AS total_people_in_region
FROM sales;
MIN() & MAX()
The MIN() & MAX() functions can be used to find
the minimum or maximum value within a partition.
EXAMPLE:
Find the highest and lowest sales for each product
category.
SELECT category_id, product_id, sales,
MAX(sales) OVER (PARTITION BY category_id)
AS max_sales,
MIN(sales) OVER (PARTITION BY category_id)
AS min_sales
FROM sales_data;
AVG()
The AVG() function can be used to compute a
rolling average.
EXAMPLE:
Calculate the average salary of employees within
each department.
SELECT department_id, employee_id, salary,
AVG(salary) OVER (
PARTITION BY department_id) AS avg_salary
FROM employees;