PARTITION BY vs GROUP BY. — Cracked Java
MidTheoryTrick

PARTITION BY vs GROUP BY.

GROUP BY collapses each group into one row; PARTITION BY divides rows into groups for a window function but keeps every row. They sound similar and even produce the same per-group numbers — the difference is the shape of the result.

GROUP BY — one row per group

SELECT department, AVG(salary) AS avg_salary
FROM employees
GROUP BY department;

Output: one row per department. The individual employees are gone. Anything in the SELECT must be grouped or aggregated.

PARTITION BY — every row kept

SELECT
  employee_id,
  department,
  salary,
  AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees;

Output: one row per employee, each annotated with its department's average. You keep employee_id and salary and get the group-level number on the same line — exactly what GROUP BY cannot do.

The "compare row to its group" pattern

This is where PARTITION BY shines and GROUP BY forces a join:

SELECT employee_id, salary, dept_avg, salary - dept_avg AS diff_from_avg
FROM (
  SELECT employee_id, salary,
    AVG(salary) OVER (PARTITION BY department) AS dept_avg
  FROM employees
) t
WHERE salary > dept_avg;   -- everyone paid above their dept average

With GROUP BY you'd compute averages separately and join back; the window does it in one scan.

Key differences

A subtle but important point: PARTITION BY is optional. OVER () with no partition treats the entire result set as one window — SUM(amount) OVER () is the grand total on every row.

Mark your status