AI teams promise speed. This test found the handoffs can ruin the work
- A single autonomous agent solves 100% of test tasks, while entire 'teams' of sequential agents fail to solve any
- Hierarchical agents achieve 64% success, stigmergic 32%, and pipeline agents 0%
- Efficiency collapses proportionally to architectural complexity, suggesting systems inherit dysfunctional patterns from human institutions
Agentic AI isn't the efficiency nirvana many promised. It's become the management consulting of the algorithmic world: full of meetings that should have been emails, decisions delayed by committee, and processes that multiply faster than they deliver. New analysis from Genetic Engineering & Biotechnology News reveals that systems designed to act autonomously still fall prey to the same inefficiencies plaguing human middle managers.
The worst part? Agents get worse at their jobs when they operate one after another. Sequential agent swarms, where one AI passes work to the next, don't just slow things down; they actively degrade decision quality. That's the opposite of the optimization they were built to deliver.
This isn't a bug. It's a feature of how these systems are architected, according to early signals from research papers and developer forums. The more layers of delegation and decision-making you add, the more opportunities for misalignment, redundant approval loops, and cascading errors—all hallmarks of bureaucratic systems humans know too well.
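To see why each added layer matters, consider a back-of-the-envelope sketch. It is not taken from the research; it assumes, purely for illustration, that every handoff independently preserves the work with the same probability. The function name `chain_success_rate` and the 90% figure are hypothetical.

```python
def chain_success_rate(per_step_reliability: float, handoffs: int) -> float:
    """Chance that work survives every handoff intact.

    Illustrative assumption: each handoff succeeds independently with the
    same probability, so the end-to-end rate is that probability multiplied
    by itself once per handoff.
    """
    if not 0.0 <= per_step_reliability <= 1.0:
        raise ValueError("per_step_reliability must be between 0 and 1")
    if handoffs < 0:
        raise ValueError("handoffs must be non-negative")
    return per_step_reliability ** handoffs


# Hypothetical 90% reliability per handoff, for chains of growing length.
for n in (1, 3, 5, 10):
    rate = chain_success_rate(0.9, n)
    print(f"{n:>2} handoffs at 90% each: {rate:.0%} end to end")
```

Under these assumptions, a chain of five handoffs that are each 90% reliable delivers intact work only about 59% of the time, and ten handoffs about 35%. Real agents are unlikely to fail independently, so treat this as intuition for cascading errors rather than a model of the reported results.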
The irony isn't lost on the AI community. Developers working on multi-agent setups report that promised gains in speed and accuracy often vanish once real-world constraints are introduced. Tasks that should take minutes stretch into hours, and outputs drift further from intended goals with each handoff. The research findings are stark: a single autonomous agent solves 100% of test tasks, while entire 'teams' of sequential agents fail to solve any.
Sequential agents solve 0% of tasks that solo AI handles perfectly
The architectural breakdown is revealing. Hierarchical agents achieve 64% success, stigmergic agents (which coordinate indirectly by leaving signals in a shared environment rather than messaging one another) reach 32%, and pipeline agents 0%. Efficiency collapses proportionally to architectural complexity, suggesting these systems inherit dysfunctional patterns from human institutions rather than transcending them.
There's speculation about whether tighter governance frameworks could mitigate these issues. Some practitioners note that adding explicit conflict-resolution protocols or centralized oversight might help, but that reintroduces the same bottlenecks these architectures were meant to eliminate. The core tension remains unresolved: coordination requires overhead, yet overhead undermines the speed that justifies multi-agent systems in the first place.
For practitioners, the implications are immediate. Before assembling a swarm, ask whether the task genuinely requires parallel cognition or whether you're building a digital bureaucracy that will consume more cycles than it saves. The data suggests that for many workflows, a single well-prompted agent outperforms an elaborate chain of specialized models.
The deeper question is whether agentic architectures can be redesigned to avoid these traps, or whether coordination costs are fundamental to any distributed system. Early evidence points toward the latter: the problems aren't implementation details but structural features of delegation itself.

