How do I make window functions faster on large analytics tables?

Start by reducing the number of rows and columns that reach the window step. Push filters earlier, pre-aggregate when raw detail is not needed, and make the PARTITION BY and ORDER BY keys as narrow and selective as possible.

Are CTEs always materialized in modern SQL engines?

No. Engine behavior differs, and many optimizers treat WITH clauses as a readability feature rather than a guaranteed cache. If a heavy CTE is referenced multiple times, verify the plan and consider a temp table or an engine-specific materialization hint.

Why does LAST_VALUE return the wrong-looking result?

The issue is usually the window frame, not the function itself. Add an explicit ROWS frame, often ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, when you want the final row from the whole partition.

When should I replace a self-join with LAG or LEAD?

Replace it when you are comparing each row to a prior or next row in the same business key. LAG and LEAD preserve row context without the join fan-out that often makes self-joins expensive.

SQL Optimization [2026] Window Functions & CTE Rewrites

High-volume analytics queries usually fail in predictable ways: repeated scans hidden behind readable CTEs, window functions that sort more data than necessary, and ranking patterns that force extra passes through already large intermediates. This cheat sheet focuses on practical rewrites you can apply fast, then verify with the plan. The goal is simple: reduce bytes read, sort width, spill risk, and duplicate work before you touch heavier physical design changes.

Key Takeaways

Use QUALIFY or a single outer filter to avoid extra ranking subqueries.
Default window frames often surprise LAST_VALUE; set ROWS explicitly.
A reused CTE may be recomputed; materialize once when the engine will not reuse it.
Pre-aggregate before joins to reduce shuffle, sort, and spill pressure.
Benchmark rewrites with the plan, not intuition: scan bytes, sorts, spills, and row counts.

Quick Reference

Bottom Line

Rewrite for less work before you tune for more horsepower. In practice, the biggest wins come from shrinking intermediate result sets and stopping the planner from doing the same expensive step twice.

Live search

Press / to focus search, Esc to clear, j and k to jump sections, and c to copy the next visible code block.

Shortcut	Action	Use It For
`/`	Focus filter	Jump straight to a rewrite pattern
`Esc`	Clear filter	Reset the cheat sheet view
`j`	Next section	Move through the reference without scrolling
`k`	Previous section	Review the prior block quickly
`c`	Copy code	Grab the next visible snippet

Use windows when you need row context without collapsing rows.
Use grouped aggregates when you can reduce cardinality early.
Use temp tables or materialized stages when the same heavy subquery feeds multiple branches.
Before sharing sample data or plans with a team, scrub sensitive fields with the Data Masking Tool.

Commands by Purpose

Filter top-N per partition

Best when you need one row or a small top-N slice per key.
Prefer QUALIFY where supported because it removes one nesting layer.
If QUALIFY is unavailable, keep the ranking in one subquery and filter once.

SELECT
  customer_id,
  order_id,
  event_ts
FROM orders
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY customer_id
  ORDER BY event_ts DESC
) = 1;

Replace row-to-row self-joins

Self-joins can explode row counts and add unnecessary join work.
LAG and LEAD usually preserve intent with less shuffle.

SELECT
  session_id,
  event_ts,
  event_value - LAG(event_value) OVER (
    PARTITION BY session_id
    ORDER BY event_ts
  ) AS delta
FROM events;

Fix misleading default frames

The default frame is often fine for ranking, but not for every value function.
LAST_VALUE is a common trap because the default frame may stop at the current peer group.
Set ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING when you want the partition-wide last row.

SELECT
  account_id,
  event_ts,
  LAST_VALUE(status) OVER (
    PARTITION BY account_id
    ORDER BY event_ts
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
  ) AS final_status
FROM account_events;

Pre-aggregate before the join

Large fact-to-dimension joins get cheaper when you collapse the fact side first.
This is especially useful when downstream logic only needs grouped metrics, not raw rows.

WITH order_totals AS (
  SELECT
    customer_id,
    region_id,
    SUM(amount) AS gross
  FROM fact_orders
  WHERE order_date >= @start_date
  GROUP BY customer_id, region_id
)
SELECT
  ot.customer_id,
  dr.region_name,
  ot.gross
FROM order_totals ot
JOIN dim_region dr
  ON dr.region_id = ot.region_id;

Inline or materialize a CTE deliberately

Readable WITH clauses do not guarantee reuse.
In BigQuery documentation, WITH is described as a readability feature, not a performance guarantee.
In PostgreSQL, eligible WITH queries can be inlined, and NOT MATERIALIZED can force that behavior.

WITH filtered AS NOT MATERIALIZED (
  SELECT
    user_id,
    event_date,
    revenue
  FROM fact_events
  WHERE event_date >= @start_date
)
SELECT
  user_id,
  SUM(revenue) AS total_revenue
FROM filtered
GROUP BY user_id;

CREATE TEMP TABLE heavy_stage AS
SELECT
  customer_id,
  region_id,
  SUM(amount) AS gross
FROM fact_orders
WHERE order_date >= @start_date
GROUP BY customer_id, region_id;

SELECT hs.customer_id, dr.region_name, hs.gross
FROM heavy_stage hs
JOIN dim_region dr
  ON dr.region_id = hs.region_id;

Configuration

Planner-friendly defaults

Align physical partitioning or clustering with the columns used in your largest filters.
Keep window partitions narrow where possible; wide partitions multiply sort pressure.
Prefer stable sort keys with good cardinality for ranking and dedupe patterns.
Persist expensive transformations if the same shaped data feeds many dashboards or jobs.
Keep SQL readable; if you share snippets publicly, clean them with the Code Formatter after the rewrite is stable.

What to inspect in the plan

Signal	What it usually means	Rewrite first
Large sort stage	Window or final ordering is wider than it needs to be	Project fewer columns and reduce partition size
Repeated scan of same branch	CTE or subquery is being reevaluated	Materialize once or inline to enable pushdown
Row explosion after join	Join occurs before dedupe or aggregation	Pre-aggregate, filter earlier, or rewrite self-join
Spill to disk	Sort or hash state exceeds memory budget	Shrink intermediate rows and split the workload

Watch out: A readable CTE chain can hide the real cost center. The plan, not the indentation, tells you whether the branch was pushed down, scanned twice, or sorted too early.

Advanced Usage

Use QUALIFY to keep ranking local

When your engine supports it, QUALIFY filters window results after the window step without forcing an extra wrapper query.
It is especially useful for latest-row, top-N, and dedupe patterns.

Trade exact ranking for approximate bucketing when scale demands it

BigQuery documents that heavy NTILE usage can require a global ORDER BY and may hit resource limits.
For percentile-style grouping at massive scale, approximate quantiles can be the cheaper path if exact bucket boundaries are not mandatory.

SELECT
  APPROX_QUANTILES(latency_ms, 100) AS latency_percentiles
FROM api_requests;

Split logic when the optimizer cannot

If one monster query both transforms and serves analytics, separate the expensive stage from the final read path.
Temp tables, scheduled summary tables, and materialized views are often easier to operate than one heroic statement.

Pro tip: Treat every rewrite as a hypothesis. Measure bytes scanned, rows shuffled, spill, and elapsed time before you decide the query is truly faster.

Plan Checks

Fast verification loop

Capture the baseline plan and runtime metrics.
Rewrite one pattern at a time: ranking, frame, join order, or CTE reuse.
Re-run with the same date range and concurrency conditions.
Compare scan bytes, sort stages, spill, and output row counts.
Keep the simpler query only if the plan is measurably better.

EXPLAIN ANALYZE
SELECT
  customer_id,
  SUM(amount) AS gross
FROM fact_orders
WHERE order_date >= @start_date
GROUP BY customer_id;

For shared benchmarks, keep the same predicates, same warehouse or slot conditions, and same output shape.
If the rewrite changes semantics, stop calling it an optimization and document it as a behavior change.

SQL Optimization [2026] Window Functions & CTE Rewrites

Bottom Line

Key Takeaways

Quick Reference

Bottom Line

Commands by Purpose

Filter top-N per partition

Replace row-to-row self-joins

Fix misleading default frames

Pre-aggregate before the join

Inline or materialize a CTE deliberately

Configuration

Planner-friendly defaults

What to inspect in the plan

Advanced Usage

Use QUALIFY to keep ranking local

Trade exact ranking for approximate bucketing when scale demands it

Split logic when the optimizer cannot

Plan Checks

Fast verification loop

Frequently Asked Questions

Get Engineering Deep-Dives in Your Inbox

Related Deep-Dives

PostgreSQL EXPLAIN ANALYZE Cheat Sheet

BigQuery Performance Tuning Guide

Materialized Views vs Views in Production