Difference between UNION and UNION ALL. Which is faster and why?

Relations, tuples, attributes and the SQL that operates on them — keys, NULL semantics, join types, and the logical query processing order interviewers test first.

Cracked Java

UNION removes duplicate rows; UNION ALL keeps every row. That deduplication is the only functional difference — and it's also why UNION is slower. If you don't actually need duplicates removed, UNION ALL is both faster and more honest about intent.

What each does

SELECT id FROM active_users
UNION
SELECT id FROM trial_users;       -- distinct ids across both sets

SELECT id FROM active_users
UNION ALL
SELECT id FROM trial_users;       -- every id, duplicates preserved

Both require the two queries to be union-compatible: same number of columns, with compatible types positionally (column names come from the first query). The result is a relation built from the set/bag union of both inputs.

Why UNION ALL is faster

UNION must guarantee no duplicate rows, so PostgreSQL appends both inputs and then eliminates duplicates — typically via a hash aggregate or a sort, comparing every column of every row. That extra step costs CPU, memory (or temp-disk spill for large inputs), and it's a blocking operation: it can't stream results until it has seen enough to dedup.

UNION ALL simply concatenates the inputs (an Append node) and streams rows straight through — no sort, no hash, no buffering. You can see the difference in EXPLAIN:

EXPLAIN SELECT ... UNION ALL SELECT ...;   -- Append
EXPLAIN SELECT ... UNION     SELECT ...;   -- HashAggregate / Unique over Append

Choosing the right one

Use UNION ALL when the inputs are already disjoint, or when duplicates are acceptable/desired (e.g., concatenating partitions, audit feeds, or combining counts you'll SUM later). It's the default you should reach for.
Use UNION only when you genuinely need set semantics — distinct rows across sources.
A common anti-pattern: writing UNION "to be safe" when the sets can't overlap, paying for a dedup that removes nothing. If you know they're disjoint, say so with UNION ALL.