PostgreSQL Masterclass — Tutor Guide &
Daily Practice Plan
Instructor: Your AI SQL Tutor
Target: Database Management • Data Analysis • Data Science (PostgreSQL + pgAdmin)
How to Use This Guide
Run the provided SQL setup script first: sql_practice_setup.sql (creates schema and
synthetic data).
Use pgAdmin’s Query Tool or psql to execute the queries.
Follow the 28-day plan (approx. 60–90 minutes per day).
Each day includes concepts, worked examples, and exercises.
An exercises scaffold is provided in daily_exercises.sql; solutions appear later in this guide.
Prerequisites & Environment
Installed: PostgreSQL 13+ and pgAdmin.
Optional: psql CLI for faster iteration and EXPLAIN ANALYZE.
Set search path: SET search_path TO academy;
SET search_path TO academy;
Dataset Overview (created by sql_practice_setup.sql)
customers • products • orders • order_items • payments (commerce domain)
clinics • patients • visits (healthcare domain)
rides • events (mobility & behavioral analytics)
Indexes on common query paths (date, foreign keys, category).
28-Day Master Plan
Day 1 — SELECT, WHERE, ORDER BY, LIMIT
Key Concepts:
• Concepts: result sets, projection, filtering, sorting, pagination.
• Tip: Always start with SELECT ... FROM ...; expand iteratively.
Worked Examples:
-- Newest 10 customers
SELECT customer_id, full_name, city, signup_date
FROM customers
ORDER BY signup_date DESC
LIMIT 10;
-- Filter & sort
SELECT product_id, name, category, price
FROM products
WHERE category = 'Electronics' AND price < 100
ORDER BY price ASC;
Exercises:
□ List newest 15 customers from Lagos or Oyo by signup_date.
□ Find the 20 cheapest products in 'Health' or 'Grocery'.
Day 2 — Predicates & Pattern Matching
Key Concepts:
• Concepts: AND/OR/NOT, IN, BETWEEN, LIKE/ILIKE, NULL handling (IS NULL).
• Tip: Prefer ILIKE for case-insensitive search.
Worked Examples:
-- Case-insensitive search
SELECT product_id, name
FROM products
WHERE name ILIKE '%product 1%';
-- Range filter
SELECT *
FROM orders
WHERE order_date::date BETWEEN CURRENT_DATE - INTERVAL '30 days' AND
CURRENT_DATE;
Exercises:
□ Customers with missing email (NULL) — show id + name.
□ Orders in last 7 days with status NOT IN ('cancelled','refunded').
Day 3 — Aggregations & GROUP BY
Key Concepts:
• Concepts: COUNT/SUM/AVG/MIN/MAX, GROUP BY, HAVING.
• Tip: HAVING filters groups after aggregation.
Worked Examples:
-- Orders per status
SELECT status, COUNT(*) AS cnt
FROM orders
GROUP BY status
ORDER BY cnt DESC;
-- Average product price by category (only active products)
SELECT category, AVG(price)::numeric(10,2) AS avg_price
FROM products
WHERE active = TRUE
GROUP BY category
ORDER BY 1;
Exercises:
□ Total revenue (sum of total_amount) per state (from customers->orders).
□ Top 5 categories by total quantity sold (order_items).
Day 4 — Joins (INNER/LEFT/RIGHT/FULL)
Key Concepts:
• Concepts: keys, FK relationships, join types, join selectivity.
• Tip: Start with INNER JOIN; use LEFT JOIN to include non-matching rows.
Worked Examples:
-- Order lines with customer & product
SELECT o.order_id, o.order_date, c.full_name, p.name AS product,
oi.quantity, (oi.quantity*oi.unit_price) AS line_total
FROM orders o
JOIN customers c ON c.customer_id = o.customer_id
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p ON p.product_id = oi.product_id
ORDER BY o.order_date DESC
LIMIT 20;
-- Orders without payments (LEFT JOIN + NULL filter)
SELECT o.order_id, o.status, o.total_amount
FROM orders o
LEFT JOIN payments p ON p.order_id = o.order_id
WHERE p.order_id IS NULL;
Exercises:
□ For each state, total orders and total paid orders.
□ Patients with no clinic visits in the last 90 days.
Day 5 — Subqueries & CTEs
Key Concepts:
• Concepts: scalar subquery, EXISTS/IN, common table expressions (WITH).
• Tip: CTEs improve readability; Postgres may inline for perf.
Worked Examples:
-- Customers above 100k total spend
WITH spend AS (
SELECT o.customer_id, SUM(o.total_amount) AS total_spend
FROM orders o
GROUP BY o.customer_id
)
SELECT c.customer_id, c.full_name, s.total_spend
FROM spend s JOIN customers c ON c.customer_id = s.customer_id
WHERE s.total_spend > 100000
ORDER BY s.total_spend DESC;
-- Latest order per customer
WITH ranked AS (
SELECT o.*, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY
order_date DESC) AS rn
FROM orders o
)
SELECT customer_id, order_id, order_date
FROM ranked
WHERE rn = 1;
Exercises:
□ Find products never ordered.
□ Find top 3 customers by total rides distance.
Day 6 — Window Functions I
Key Concepts:
• Concepts: OVER(), PARTITION BY, ORDER BY; row_number, rank, dense_rank.
• Tip: Use window functions for analytics without collapsing rows.
Worked Examples:
-- Rank customers by spend within each state
WITH spend AS (
SELECT c.state, o.customer_id, SUM(o.total_amount) AS total_spend
FROM orders o JOIN customers c ON c.customer_id = o.customer_id
GROUP BY c.state, o.customer_id
)
SELECT state, customer_id, total_spend,
RANK() OVER (PARTITION BY state ORDER BY total_spend DESC) AS
rnk
FROM spend
ORDER BY state, rnk;
-- 7-day moving sum of orders (by order date::date)
WITH daily AS (
SELECT order_date::date AS d, SUM(total_amount) AS revenue
FROM orders
GROUP BY order_date::date
)
SELECT d, revenue,
SUM(revenue) OVER (ORDER BY d ROWS BETWEEN 6 PRECEDING AND
CURRENT ROW) AS rev_7d
FROM daily
ORDER BY d;
Exercises:
□ Compute rolling 30-day rides revenue per city.
□ Compute cumulative customer count over time (by signup_date).
Day 7 — Date/Time Toolbox
Key Concepts:
• Concepts: date_trunc, intervals, age, extract, time zones.
• Tip: Store timestamps in UTC; display in local time when needed.
Worked Examples:
-- Monthly orders & revenue
SELECT date_trunc('month', order_date) AS month,
COUNT(*) AS orders, SUM(total_amount) AS revenue
FROM orders
GROUP BY 1
ORDER BY 1;
-- Clinic billing by quarter
SELECT date_trunc('quarter', visit_date) AS qtr,
clinic_id, SUM(amount_billed) AS billed
FROM visits
GROUP BY 1,2
ORDER BY 1,2;
Exercises:
□ Average order value by day-of-week.
□ Visits trend (last 6 months) per clinic.
Day 8 — Set Ops (UNION/INTERSECT/EXCEPT)
Key Concepts:
• Concepts: dedup with UNION, overlap via INTERSECT, anti-join via EXCEPT.
Worked Examples:
-- Customers who placed orders and raised support tickets
WITH orderers AS (SELECT DISTINCT customer_id FROM orders),
tickets AS (
SELECT DISTINCT customer_id FROM events WHERE
event_type='support_ticket'
)
SELECT * FROM orderers INTERSECT SELECT * FROM tickets;
-- Customers who took rides but never ordered
WITH riders AS (SELECT DISTINCT customer_id FROM rides),
orderers AS (SELECT DISTINCT customer_id FROM orders)
SELECT * FROM riders EXCEPT SELECT * FROM orderers;
Exercises:
□ Customers who purchased and later refunded (ids).
□ Product categories that were ordered vs never ordered.
Day 9 — Nulls, Coalesce, Conditional Logic
Key Concepts:
• Concepts: NULL behavior, COALESCE, CASE WHEN ... THEN.
Worked Examples:
-- Default to 0 when missing
SELECT p.product_id, p.name,
COALESCE(SUM(oi.quantity),0) AS qty_sold
FROM products p
LEFT JOIN order_items oi ON oi.product_id = p.product_id
GROUP BY p.product_id, p.name
ORDER BY qty_sold DESC;
-- Categorize orders by value
SELECT order_id, total_amount,
CASE
WHEN total_amount >= 50000 THEN 'high'
WHEN total_amount >= 20000 THEN 'medium'
ELSE 'low'
END AS value_band
FROM orders;
Exercises:
□ Compute NPS-like bands for product price (low/med/high).
□ Show city with 0 rides as 0, not NULL, in daily series (hint: generate_series).
Day 10 — Advanced Aggregations
Key Concepts:
• Concepts: DISTINCT aggregates, FILTER clause, GROUPING SETS/ROLLUP/CUBE.
• Tip: FILTER lets you aggregate subsets in one pass.
Worked Examples:
-- Multi-metric in one query
SELECT
COUNT(*) AS orders,
SUM(total_amount) AS revenue,
SUM(total_amount) FILTER (WHERE status='refunded') AS refunded
FROM orders;
-- Rollup by state then city
SELECT c.state, c.city, COUNT(*) AS customers
FROM customers c
GROUP BY ROLLUP (c.state, c.city)
ORDER BY c.state NULLS LAST, c.city NULLS LAST;
Exercises:
□ Compute category revenue plus revenue for ALL categories using ROLLUP.
□ Use CUBE on state x status for order counts.
Day 11 — Window Functions II (frames, percentiles)
Key Concepts:
• Concepts: frame clauses, lag/lead, percentiles, ntile.
Worked Examples:
-- Median order total per state
SELECT state,
percentile_cont(0.5) WITHIN GROUP (ORDER BY total_amount) AS
median_order
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY state
ORDER BY state;
-- Month-over-month change
WITH m AS (
SELECT date_trunc('month', order_date) AS m, SUM(total_amount) AS
revenue
FROM orders GROUP BY 1
)
SELECT m,
revenue,
LAG(revenue) OVER (ORDER BY m) AS prev_rev,
(revenue - LAG(revenue) OVER (ORDER BY m)) AS delta
FROM m;
Exercises:
□ Top 3 products by monthly revenue (use dense_rank partitioned by month).
□ Compute 90th percentile ride fare by city.
Day 12 — Text Search & Trigrams (optional)
Key Concepts:
• Concepts: LIKE vs trigram (pg_trgm), basic full-text search approach.
Worked Examples:
-- Trigram similarity (requires pg_trgm)
SELECT name, similarity(name, 'produckt 12') AS sim
FROM products
WHERE name % 'produckt 12'
ORDER BY sim DESC
LIMIT 10;
Exercises:
□ Find near-duplicates among product names using % operator.
Day 13 — JSONB (semi-structured data)
Key Concepts:
• Concepts: jsonb structure, ->, ->>, ?, @>, indexing with GIN.
Worked Examples:
-- Example: temporary jsonb data
WITH tmp AS (
SELECT jsonb_build_object(
'customer_id', 1,
'prefs', jsonb_build_object('theme','dark','alerts',true)
) AS doc
)
SELECT doc->'prefs'->>'theme' AS theme FROM tmp;
Exercises:
□ Add a jsonb column (e.g., products.meta JSONB) and store extra attributes; query keys and
values.
Day 14 — Data Modeling & Normalization
Key Concepts:
• Concepts: 1NF/2NF/3NF/BCNF, surrogate vs natural keys, denormalization trade-offs.
Worked Examples:
-- Add NOT NULL/UNIQUE constraints
ALTER TABLE customers
ALTER COLUMN full_name SET NOT NULL,
ADD CONSTRAINT unique_email UNIQUE(email);
Exercises:
□ Sketch a normalized schema for a hospital pharmacy (entities: drugs, prescriptions,
prescribers, patients).
Day 15 — DDL Deep Dive & Constraints
Key Concepts:
• Concepts: CHECK, DEFAULT, FK options, ON DELETE/UPDATE, generated columns.
Worked Examples:
-- Add a CHECK constraint for positive quantities
ALTER TABLE order_items
ADD CONSTRAINT chk_quantity_positive CHECK (quantity > 0);
Exercises:
□ Add FK to ensure payments.order_id references orders, delete cascading behavior (already
applied).
□ Add a generated column (e.g., line_total) to order_items (Postgres 12+ use expression or
view).
Day 16 — Transactions & Isolation
Key Concepts:
• Concepts: ACID, BEGIN/COMMIT/ROLLBACK, isolation levels, deadlocks.
Worked Examples:
BEGIN;
UPDATE products SET price = price * 0.9 WHERE category='Health';
-- Check results then COMMIT or ROLLBACK
COMMIT;
-- Read committed vs repeatable read (open two sessions and observe).
Exercises:
□ Simulate a transfer mistake and use ROLLBACK to restore state.
Day 17 — Indexing Strategies
Key Concepts:
• Concepts: B-tree, hash, GIN/GiST; covering indexes; multi-column order.
• Tip: Index predicates you filter/join on frequently; beware over-indexing.
Worked Examples:
-- Create index for common filter
CREATE INDEX idx_orders_status_date ON orders(status, order_date);
-- Inspect plan
EXPLAIN ANALYZE
SELECT * FROM orders WHERE status='paid' AND order_date > NOW() -
INTERVAL '30 days';
Exercises:
□ Create an index that speeds up: top products by category in last 90 days.
□ Verify with EXPLAIN ANALYZE before/after.
Day 18 — Query Tuning
Key Concepts:
• Concepts: EXPLAIN (ANALYZE, BUFFERS), work_mem, join order, CTE inlining.
Worked Examples:
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.state, SUM(o.total_amount)
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
WHERE o.order_date > NOW() - INTERVAL '180 days'
GROUP BY c.state;
Exercises:
□ Compare performance of correlated subquery vs JOIN for: total spend per customer.
Day 19 — Views & Materialized Views
Key Concepts:
• Concepts: logical layering, refresh cadence, dependency management.
Worked Examples:
CREATE MATERIALIZED VIEW mv_customer_spend AS
SELECT o.customer_id, SUM(o.total_amount) AS total_spend
FROM orders o
GROUP BY o.customer_id;
-- Refresh when needed
REFRESH MATERIALIZED VIEW mv_customer_spend;
Exercises:
□ Create a view for monthly category revenue; query top categories.
Day 20 — Partitioning (Time/Range)
Key Concepts:
• Concepts: declarative partitioning, pruning, local indexes.
Worked Examples:
-- Example skeleton (events by month)
-- CREATE TABLE events_p (... ) PARTITION BY RANGE (event_time);
-- CREATE TABLE events_2025_01 PARTITION OF events_p FOR VALUES FROM
('2025-01-01') TO ('2025-02-01');
Exercises:
□ Design partitions for the events table by month for the last 6 months.
Day 21 — Security & Roles
Key Concepts:
• Concepts: roles, grants, row-level security (RLS).
Worked Examples:
-- Read-only role for analysts
CREATE ROLE analyst LOGIN PASSWORD 'changeme';
GRANT USAGE ON SCHEMA academy TO analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA academy TO analyst;
ALTER DEFAULT PRIVILEGES IN SCHEMA academy GRANT SELECT ON TABLES TO
analyst;
Exercises:
□ Create a limited role that can only SELECT from views, not base tables.
Day 22 — Backup & Restore
Key Concepts:
• Concepts: pg_dump, pg_restore, plain vs custom format, point-in-time basics.
Worked Examples:
-- Backup (shell)
-- pg_dump -U postgres -d yourdb -Fc -f backup.dump
-- Restore
-- pg_restore -U postgres -d yourdb backup.dump
Exercises:
□ Practice exporting only the academy schema and restoring to a new database.
Day 23 — ETL with SQL
Key Concepts:
• Concepts: staging tables, upserts, deduping, slowly changing dimensions (SCD).
Worked Examples:
-- Upsert (on conflict)
INSERT INTO products(name, category, price, active)
VALUES ('Product 5','Electronics', 99.99, TRUE)
ON CONFLICT (name) DO UPDATE SET price = EXCLUDED.price;
Exercises:
□ Load a CSV into a staging table (COPY ... FROM) and merge into products.
Day 24 — Reporting Patterns
Key Concepts:
• Concepts: cohort retention, funnels, ABC segmentation, pivoting (crosstab).
Worked Examples:
-- crosstab example (requires tablefunc)
-- SELECT * FROM crosstab(
-- $$ SELECT date_trunc('month', order_date), category,
SUM(total_amount)
-- FROM orders o JOIN order_items oi USING(order_id) JOIN products
p USING(product_id)
-- GROUP BY 1,2 ORDER BY 1,2 $$
-- ) AS ct(month timestamp, electronics numeric, fashion numeric,
grocery numeric);
Exercises:
□ Build a cohort table of signup month vs returning buyers at month+1, +2, +3.
Day 25 — Time-Series Extras
Key Concepts:
• Concepts: generate_series for calendar tables, gaps & islands, seasonality.
Worked Examples:
-- Calendar spine join
WITH cal AS (
SELECT d::date AS d
FROM generate_series(CURRENT_DATE - INTERVAL '90 days', CURRENT_DATE,
INTERVAL '1 day') AS g(d)
)
SELECT cal.d, COALESCE(SUM(total_amount),0) AS revenue
FROM cal
LEFT JOIN orders o ON o.order_date::date = cal.d
GROUP BY cal.d
ORDER BY cal.d;
Exercises:
□ Detect gaps in daily orders; fill with zeros for a chart-ready series.
Day 26 — Data Science Bridge
Key Concepts:
• Concepts: creating feature tables with SQL; percentiles, bucketing, joins.
Worked Examples:
-- Customer features (example)
WITH spend AS (
SELECT customer_id, SUM(total_amount) AS total_spend, COUNT(*) AS
orders_count,
MAX(order_date) AS last_order
FROM orders
GROUP BY customer_id
)
SELECT c.customer_id, c.state, c.signup_date, s.total_spend,
s.orders_count,
EXTRACT(DAY FROM (NOW() - s.last_order)) AS recency_days
FROM customers c LEFT JOIN spend s USING(customer_id);
Exercises:
□ Create a product features table: price band, total sold, last sold date, refund rate.
Day 27 — Capstone Project (Pick One)
Key Concepts:
• Project A (Commerce): Build a full revenue dashboard: daily revenue, AOV, top categories,
repeat rate.
• Project B (Healthcare): Analyze clinic performance: visits, billing, common diagnoses, patient
retention.
• Project C (Mobility): City ride KPIs: rides/day, avg fare, surge periods, heavy user segments.
Worked Examples:
Deliverables:
1) A documented .sql file with all queries and views.
2) A short write-up (1–2 pages) explaining your metrics and insights.
3) (Optional) A materialized view layer to speed up dashboards.
Exercises:
□ Tip: Use views/materialized views for clean layers. Consider indexes on filter/join keys.
Day 28 — Review & Next Steps
Key Concepts:
• Revisit weak areas; practice with EXPLAIN plans and window functions.
• Next: PostGIS for geospatial, TimescaleDB for time-series, dbt for analytics engineering.
Worked Examples:
Checklist:
- Comfortable with joins, GROUP BY, window functions
- Can design schemas with constraints
- Can index and tune common queries
- Can build views/materialized views for analytics
Exercises:
□ Optional: Try migrating your capstone to a different schema optimized for BI.
Selected Exercise Solutions
Day 1 — Newest Lagos/Oyo customers
SELECT customer_id, full_name, state, city, signup_date
FROM customers
WHERE state IN ('Lagos','Oyo')
ORDER BY signup_date DESC
LIMIT 15;
Day 3 — Revenue per state
SELECT c.state, SUM(o.total_amount) AS revenue
FROM customers c
JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.state
ORDER BY revenue DESC;
Day 4 — Paid vs all orders per state
WITH stats AS (
SELECT c.state,
COUNT(*) AS orders,
SUM((o.status IN ('paid','shipped','refunded'))::int) AS
paidish
FROM customers c JOIN orders o ON o.customer_id = c.customer_id
GROUP BY c.state
)
SELECT state, orders, paidish, (paidish::numeric/orders)::numeric(5,2)
AS pay_rate
FROM stats
ORDER BY pay_rate DESC;
Day 6 — Rolling 30-day rides revenue per city
WITH d AS (
SELECT ride_date::date AS d, city, SUM(fare) AS revenue
FROM rides GROUP BY 1,2
)
SELECT d, city, revenue,
SUM(revenue) OVER (PARTITION BY city ORDER BY d
ROWS BETWEEN 29 PRECEDING AND CURRENT ROW) AS
rev_30d
FROM d
ORDER BY city, d;
Day 8 — Purchased and later refunded
WITH purchased AS (
SELECT DISTINCT customer_id FROM events WHERE event_type='purchase'
),
refunded AS (
SELECT DISTINCT customer_id FROM events WHERE event_type='refund'
)
SELECT * FROM purchased INTERSECT SELECT * FROM refunded;
Day 11 — 90th percentile ride fare by city
SELECT city,
percentile_cont(0.9) WITHIN GROUP (ORDER BY fare) AS p90_fare
FROM rides
GROUP BY city
ORDER BY city;
Day 18 — Correlated vs Join (join version)
SELECT o.customer_id, SUM(o.total_amount) AS total_spend
FROM orders o
GROUP BY o.customer_id; -- Compare with correlated subquery via EXPLAIN
Admin & Ops Cheat Sheet
List active connections: SELECT * FROM pg_stat_activity;
Cancel a backend PID: SELECT pg_cancel_backend(pid);
Table sizes: SELECT relname, pg_size_pretty(pg_total_relation_size(relid)) FROM
pg_catalog.pg_statio_user_tables ORDER BY 2 DESC;
Index usage: SELECT relname, idx_scan FROM pg_stat_user_indexes ORDER BY idx_scan;
Vacuum/Analyze: VACUUM (ANALYZE, VERBOSE) table_name;
Next Steps & Learning Path
Build a real mini-warehouse: staging → core models → mart views.
Learn dbt for analytics engineering workflows.
Explore PostGIS for mapping/geo analytics (Lagos traffic, clinic coverage).
Practice with EXPLAIN ANALYZE weekly; maintain a personal query diary.