ForkJoinPool is a thread pool tuned for divide-and-conquer workloads where tasks recursively spawn subtasks. Its defining feature is work-stealing: each worker has a private deque, and idle workers steal tasks from busy ones to keep all cores saturated.
Per-worker deques
A standard ThreadPoolExecutor has a single shared queue — every thread contends on it. ForkJoinPool gives each worker thread its own double-ended queue. This is the key to its scalability: most task scheduling touches only the local deque, with no cross-thread synchronization.
When a task calls fork(), the subtask is pushed onto the head of the current worker's own deque. The worker also pops from the head — so its own work is LIFO. LIFO is deliberate: the most recently forked subtask is the one most likely still hot in cache, and in recursive splitting it's usually the smaller, deeper piece.
Stealing from the tail
When a worker's deque is empty, it doesn't sit idle. It picks another worker and steals from the tail of that worker's deque — FIFO from the thief's perspective.
// Inside a RecursiveTask's compute():
if (hi - lo <= THRESHOLD) {
return computeDirectly(lo, hi);
}
int mid = (lo + hi) >>> 1;
SumTask left = new SumTask(arr, lo, mid);
SumTask right = new SumTask(arr, mid, hi);
left.fork(); // pushed to THIS worker's deque head
int rightResult = right.compute(); // run inline, no thread handoff
int leftResult = left.join(); // help run / wait for the forked half
return leftResult + rightResult;
Stealing the oldest (tail) task is smart: in recursive decomposition, the oldest unstarted task represents the largest remaining chunk, so one steal hands the thief a big slice of work and avoids frequent re-stealing.
Why two ends matter
Owner and thieves operate on opposite ends, so they rarely collide — contention only occurs near an almost-empty deque. The result is near-linear scaling for balanced CPU-bound work, with automatic load-balancing when subtasks turn out uneven.