LEVEL 06

Performance & Optimization

Your Oracle DBA background gives you a head start. This level formalizes it in SQL and extends it to PostgreSQL.

Read the execution plan first

-- Oracle
EXPLAIN PLAN FOR
SELECT * FROM transactions WHERE account_id = 'ACC001';
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);

-- PostgreSQL
EXPLAIN ANALYZE
SELECT * FROM transactions WHERE account_id = 'ACC001';

-- MySQL
EXPLAIN
SELECT * FROM transactions WHERE account_id = 'ACC001';

Plan keyword	Meaning	Good or bad?
TABLE ACCESS FULL / Seq Scan	Full table scan — reads every row	Bad on large tables
INDEX RANGE SCAN	Using an index to find rows	Good
INDEX ONLY SCAN	Reads only the index, not the table	Best
HASH JOIN	Good for large unsorted joins	Usually fine
NESTED LOOP	Good when one side is very small	Good for small sets

Indexes — the most impactful lever

-- B-tree (default — covers =, <, >, BETWEEN, LIKE 'prefix%')
CREATE INDEX idx_txn_date ON transactions(txn_date);

-- Composite — column ORDER matters
CREATE INDEX idx_branch_date ON transactions(branch, txn_date);
-- Helps: WHERE branch=? AND txn_date=?
-- Helps: WHERE branch=?
-- Does NOT help: WHERE txn_date=?  (leading column missing)

-- Bitmap — Oracle only, for low-cardinality columns like status
CREATE BITMAP INDEX idx_status ON transactions(status);

-- Partial index — PostgreSQL (index only matching rows)
CREATE INDEX idx_failed ON transactions(account_id) WHERE status='FAILED';

-- Function-based index
CREATE INDEX idx_upper_city ON accounts(UPPER(city));
-- Now: WHERE UPPER(city)='MUMBAI' uses the index

Anti-patterns that bypass indexes

-- ✗ Function on indexed column — full scan
WHERE ROUND(amount, 0) = 5000
WHERE EXTRACT(YEAR FROM txn_date) = 2024
WHERE TO_CHAR(txn_date, 'YYYY') = '2024'

-- ✓ Rewrite to use index
WHERE amount BETWEEN 4999.5 AND 5000.5
WHERE txn_date >= DATE '2024-01-01' AND txn_date < DATE '2025-01-01'

-- ✗ Leading wildcard — full scan
WHERE name LIKE '%Kumar'
-- ✓ Prefix match uses index
WHERE name LIKE 'Kumar%'

-- ✗ Implicit type conversion
WHERE amount = '5000'   -- amount is NUMBER, passing VARCHAR
-- ✓
WHERE amount = 5000

Filter early — execution order tuning

-- SLOW: 10M rows joined, then filtered
SELECT a.customer, SUM(t.amount)
FROM   transactions t
JOIN   accounts     a ON t.account_id = a.account_id
WHERE  EXTRACT(YEAR FROM t.txn_date) = 2024  -- no index use
GROUP BY a.customer;

-- FAST: aggregate inside CTE first, join small result
WITH txn_2024 AS (
    SELECT account_id, SUM(amount) AS total
    FROM   transactions
    WHERE  txn_date >= DATE '2024-01-01'   -- index works
    AND    txn_date <  DATE '2025-01-01'
    AND    status = 'SUCCESS'
    GROUP BY account_id
)
SELECT a.customer, t.total
FROM   accounts  a
JOIN   txn_2024  t ON a.account_id = t.account_id;

Partitioning

-- Oracle: range partition by date
CREATE TABLE transactions (txn_id NUMBER, txn_date DATE, amount NUMBER)
PARTITION BY RANGE (txn_date) (
    PARTITION p_2024_q1 VALUES LESS THAN (DATE '2024-04-01'),
    PARTITION p_2024_q2 VALUES LESS THAN (DATE '2024-07-01'),
    PARTITION p_2024_q3 VALUES LESS THAN (DATE '2024-10-01'),
    PARTITION p_2024_q4 VALUES LESS THAN (DATE '2025-01-01')
);

-- Partition pruning — only scans Q1 partition
SELECT SUM(amount) FROM transactions
WHERE  txn_date >= DATE '2024-01-01'
AND    txn_date <  DATE '2024-04-01';    -- ✓ pruning works

-- ✗ Pruning FAILS — function wraps the partition column
WHERE  TRUNC(txn_date, 'MM') = DATE '2024-01-01'

Materialized views

-- Oracle: pre-compute daily branch summary
CREATE MATERIALIZED VIEW mv_branch_daily
BUILD IMMEDIATE
REFRESH COMPLETE ON DEMAND
AS
SELECT branch, TRUNC(txn_date) AS txn_day,
       COUNT(*) AS txn_count, SUM(amount) AS total_volume
FROM   transactions
GROUP BY branch, TRUNC(txn_date);

EXEC DBMS_MVIEW.REFRESH('MV_BRANCH_DAILY');  -- manual refresh

-- PostgreSQL
CREATE MATERIALIZED VIEW mv_branch_daily AS
SELECT branch, txn_date::date AS txn_day,
       COUNT(*) AS txn_count, SUM(amount) AS total_volume
FROM   transactions GROUP BY branch, txn_date::date;

CREATE INDEX ON mv_branch_daily(branch, txn_day);  -- index the MV too
REFRESH MATERIALIZED VIEW CONCURRENTLY mv_branch_daily;

6 anti-patterns cheat sheet

Anti-pattern	Problem	Fix
Function on indexed col	Bypasses index	Use range instead
SELECT *	Pulls all columns, slow	Name your columns
HAVING instead of WHERE	Groups all rows first	Filter before GROUP BY
Correlated subquery in SELECT	Runs N times	Use JOIN or CTE
DISTINCT to fix bad join	Hides fan-out bug	Use EXISTS instead
ORDER BY inside subquery	Sort gets discarded	Only ORDER BY in outer query

Practice Problems

Q1 · Easy
Write this query then explain what EXPLAIN PLAN shows and what index fixes it: WHERE TRUNC(txn_date) = DATE '2024-03-15' AND branch = 'Mumbai'. Rewrite it to be index-friendly. What composite index would you create?

Q2 · Medium
50M row transactions table. Three hourly queries: A) WHERE status='FAILED', B) WHERE account_id=? AND txn_date>?, C) WHERE branch=? AND status=? AND txn_date BETWEEN ? AND ?. Design the minimum set of indexes to cover all three. Explain column order and why.

Q3 · Medium
Identify every anti-pattern in this query then rewrite it: correlated subqueries in SELECT, implicit cross join with comma, UPPER() in WHERE, TO_CHAR() on date, DISTINCT to fix a fan-out.

Q4 · Hard
Design a partitioning strategy for 500M row transactions (3 years). What type? What column? How many partitions? Write CREATE TABLE DDL. Write one query that guarantees pruning and one that accidentally disables it.

Q5 · Hard
A report runs 200×/day and takes 18 seconds each time (3600s daily DB load). Create a materialized view for it, add an index on the MV, write the refresh strategy, and show the rewritten query that uses the MV.

← Level 5 Level 7 →