What is a BRIN index used for? (Very large, naturally ordered tables.)

How each index type works and what it is for, composite and covering indexes, partial and expression indexes, the cost of indexing, and why the planner sometimes ignores an index.

Cracked Java

BRIN — Block Range INdex — stores a tiny summary per range of pages instead of an entry per row, which makes it microscopic and almost free to maintain. Its whole premise is physical correlation: if a column's values track the table's on-disk order, a few bytes per block range can rule out enormous spans of the table. It's purpose-built for very large, naturally-ordered tables — think append-only time-series.

How it works

The table's heap is divided into block ranges (default 128 pages, pages_per_range). For each range BRIN keeps only a summary — for a numeric/timestamp column, the min and max value seen in that range.

A query then checks each range's summary:

range 0 (pages 0..127):   created_at  2024-01-01 .. 2024-01-03
range 1 (pages 128..255): created_at  2024-01-03 .. 2024-01-06
...
WHERE created_at = '2024-06-01'  → skip every range whose [min,max] excludes it,
                                   scan only the surviving ranges (a Bitmap Heap Scan)

CREATE INDEX ON events USING brin (created_at);
-- optionally tune granularity:
CREATE INDEX ON events USING brin (created_at) WITH (pages_per_range = 32);

Why it's so cheap

Size: a BRIN on a multi-billion-row table can be kilobytes — orders of magnitude smaller than the equivalent B-tree, which means it stays in cache and adds almost nothing to disk.
Write cost: appends typically just widen an existing range's min/max, so maintenance overhead is minimal — perfect for high-ingest logging tables.

The hard requirement: correlation

BRIN only works when the column is physically ordered on disk. A timestamp on an append-only table is ideal because rows land in time order. If the values are scattered across the heap, every range's [min,max] spans the whole domain, no ranges can be skipped, and BRIN degrades to a full scan.

BRIN vs B-tree

It's a deliberate trade: B-tree is precise and points at exact rows; BRIN is lossy and points at page ranges to scan. You pick BRIN when the table is too big to want a giant B-tree and the data is naturally ordered — accepting that it returns candidate pages, not exact rows.