It’s when you can reassign a task earmarked for one machine to another after the fact. Keeping every task on a central server and handing them out only as each worker finishes its current one creates a latency and throughput problem.
But if you queue locally, and one worker gets unlucky and draws the longest tasks, then the rest of the cluster sits idle while that one worker grinds through multiple tasks. If you reallocate the tasks that haven’t started or have timed out, the overall time comes down quite a bit (and the timeout case also covers workers that crash).
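A minimal sketch of the difference, under the assumption that tasks are just durations dealt round-robin into per-worker local queues, and that an idle worker may steal any task that hasn’t started yet (names and setup are illustrative, not any particular framework’s API):

```python
import random
from collections import deque

def makespan_static(queues):
    # No reallocation: each worker drains only its own queue,
    # so the unluckiest worker's queue sets the total time.
    return max(sum(q) for q in queues)

def makespan_with_stealing(queues):
    # Workers pull from their own queue first; once empty, they steal an
    # unstarted task from whichever worker has the most queued work left.
    queues = [deque(q) for q in queues]
    busy_until = [0.0] * len(queues)   # time at which each worker frees up
    finish = 0.0
    while any(queues):
        w = min(range(len(queues)), key=lambda i: busy_until[i])
        now = busy_until[w]
        source = queues[w] if queues[w] else max(queues, key=sum)
        task = source.popleft()        # only unstarted tasks get reassigned
        busy_until[w] = now + task
        finish = max(finish, busy_until[w])
    return finish

random.seed(0)
tasks = [random.uniform(1, 20) for _ in range(40)]
workers = 4
queues = [tasks[i::workers] for i in range(workers)]  # round-robin dealing

print("static local queues:", round(makespan_static(queues), 1))
print("with reallocation:  ", round(makespan_with_stealing(queues), 1))
```

With the stealing version, the finish time tracks the average load per worker rather than the worst single queue; a timeout-based variant would additionally put already-started tasks back up for grabs, which is what covers crashed workers.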
For development teams it could mean moving the boundary on a service or API so your team does more of the feature work than originally planned, because the other team keeps getting jammed up on bugs or other operational issues.