The Coordination Limit
A Physical Constraint on Distributed Systems
In January 2025, Doug O’Laughlin identified that something fundamental had broken. Marginal costs had returned to technology. The zero-marginal-cost foundation of the internet era no longer held for AI. Every additional unit of output now cost something real to produce.
That observation was correct. But it was the first move, not the last.
Ben Thompson sharpened it in April 2026. The constraint facing hyperscalers is not marginal cost. It is opportunity cost. Microsoft had the compute. It had the demand. It chose which workload to serve. CFO Amy Hood told investors that if the company had allocated the GPUs that came online in Q1 and Q2 entirely to Azure, that growth KPI would have been over 40 percent. The constraint was not whether the company could serve more. It was that serving one workload meant not serving another.
That is a different problem. And it points to a different cause.
The current models describe how compute is produced and how it is allocated. They describe tokens per watt. They describe how available capacity is distributed across competing workloads.
They do not describe what the system consumes while it runs.
Something is consuming compute while models run. Not the inference itself. Not the tokens being produced. Not the transport carrying them.
As the model executes, as the agent acts, as the interaction crosses each service boundary, the system must establish identity at that boundary. It must evaluate policy at that hop. It must synchronize state with the next component. It must reconcile authority across jurisdictions. It must reconstruct context that fragmented at the last crossing.
That work runs concurrently with inference. It produces nothing a user sees. It consumes compute that could have produced output.
And unlike inference costs, it does not scale with model capability. It scales with the structure of the system the model is running inside.
The Physics
Strip away the narrative and what remains is not a theory. It is thermodynamics and computational complexity applied to distributed systems.
Every boundary crossing performs work. That work is not optional. A system that does not perform it ceases to function correctly. A system that performs it must consume energy to do so. There is no configuration in which coordination is free.
Energy consumed for coordination does not produce output, but it is required to preserve system validity. The system must spend it to remain coherent. That expenditure competes directly with the energy available for computation.
At each boundary, the system performs cryptographic validation, policy evaluation, and state reconstruction. These are not novel operations. They are repeated operations. Each boundary re-establishes conditions that were already established at the previous boundary, not once, but continuously, for as long as the interaction remains active.
Each crossing increases the number of possible system states that must be reconciled. That is entropy. The system is not simply doing work. It is increasing the amount of work required to remain internally consistent as it runs.
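The repeated work above can be sketched in a few lines. The per-crossing costs here are hypothetical placeholders, not measured values; what matters is the structure of the sum, which scales with the number of boundaries rather than with model capability.

```python
# Hypothetical per-crossing costs (arbitrary energy units). Real values depend
# on the deployment, but the shape of the accounting does not.
PER_CROSSING = {
    "identity": 3.0,   # cryptographic validation at the boundary
    "policy": 2.0,     # policy evaluation at this hop
    "state": 4.0,      # state reconstruction / synchronization
}

def coordination_energy(boundaries: int) -> float:
    """Total coordination work for one interaction crossing `boundaries` hops.

    Each boundary re-establishes the same conditions; the work is repeated,
    not amortized, so it is linear in the number of crossings.
    """
    return boundaries * sum(PER_CROSSING.values())

# Doubling model capability changes nothing here; doubling the number of
# boundaries doubles the coordination bill.
assert coordination_energy(4) == 2 * coordination_energy(2)
```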
The Multiplier
If the system were linear, this cost would scale proportionally with the number of boundaries. It does not. Agents depend on agents. Services depend on services. Identity, policy, and data are resolved across multiple independent domains simultaneously. The system is not a chain. It is a graph.
When models and agents execute across independently governed identity, policy, modality, authority, and transport boundaries, coordination does not accumulate linearly. It multiplies.
Each participant, each modality, each feature, each authority, each transport transition expands the number of states the system must reconcile during execution. The system is not evaluating a single condition. It is evaluating all possible combinations of those conditions as the interaction proceeds.
That is the difference between a sum and a product.
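The difference can be made concrete. Under the assumption that each independent dimension (identity domains, policy regimes, modalities, authorities, transports) contributes a handful of independently resolvable conditions, a chain adds them while a graph multiplies them:

```python
from math import prod

# Hypothetical counts of independent conditions per dimension.
dimensions = {
    "identity_domains": 3,
    "policy_regimes": 2,
    "modalities": 4,
    "authorities": 2,
    "transports": 3,
}

additive = sum(dimensions.values())         # a chain: one condition after another
multiplicative = prod(dimensions.values())  # a graph: every combination of conditions

print(additive)        # 14
print(multiplicative)  # 144
```

Fourteen conditions versus one hundred forty-four combinations, from the same five dimensions. Add one value to any dimension and the sum grows by one; the product grows by everything else.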
The constant does not matter. Reducing it delays the boundary. It does not remove it. Once multiple independent constraints exist, the system enters a scaling regime where coordination cost grows faster than any linear improvement in compute efficiency can compensate.
Systems are designed to scale linearly. This one does not. Each additional dimension does not merely add cost. It multiplies the number of states that must be resolved during execution.
No improvement in hardware efficiency changes the class of this function. No increase in model capability reduces the number of boundary crossings required to maintain coherence. No abstraction layer eliminates the work. It only redistributes it.
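One way to see why the constant only delays the boundary: assume a per-state reconciliation cost and a fixed capacity, and ask after how many independent dimensions (each contributing, say, two conditions) the coordination product exhausts capacity. The numbers are illustrative assumptions; halving the constant buys roughly one more dimension, not a change in regime.

```python
def dimensions_until_collapse(per_state_cost: float, capacity: float,
                              branching: int = 2) -> int:
    """Number of independent dimensions the system survives before the
    coordination cost (per_state_cost * branching**n) exceeds capacity."""
    n = 0
    while per_state_cost * branching ** (n + 1) <= capacity:
        n += 1
    return n

# Hypothetical cost per reconciled state against a fixed energy budget.
print(dimensions_until_collapse(1.0, 10_000))  # 13
print(dimensions_until_collapse(0.5, 10_000))  # 14 -- a 2x efficiency gain buys one dimension
```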
Partial Participation Changes Nothing
Some interactions will involve fewer modalities. Some will activate fewer features. Some will operate within a single trust domain. That does not change the constraint.
Zero Trust enforcement is not optional. Data residency is not optional. These are not design preferences. They are regulatory and security requirements that attach to the interaction regardless of how the system is configured. If an interaction crosses a boundary, identity must be established. Policy must be evaluated. Authority must be reconciled.
The coordination surface does not need to be maximal. It only needs to include a boundary that cannot be ignored. Once such a boundary exists, the system must perform coordination work during execution. That work does not disappear. It accumulates across every boundary the interaction crosses.
Even under partial participation, the system does not return to linear behavior. It remains superlinear. The boundary still exists.
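Partial participation can be modeled by letting inactive dimensions contribute a single state. Under that assumption, the product over whatever remains is still a product: activating one more dimension multiplies cost rather than adding to it. The dimension names and counts below are illustrative.

```python
from math import prod

def coordination_states(active: dict) -> int:
    """States to reconcile when only some dimensions are active.

    Inactive dimensions contribute a factor of 1, but the active ones
    still combine multiplicatively.
    """
    return prod(active.values()) if active else 1

# A single trust domain, text-only interaction -- but one regulated boundary remains.
minimal = {"authorities": 2, "transports": 3}
print(coordination_states(minimal))  # 6

# Activating one more dimension multiplies; it does not add.
minimal["modalities"] = 4
print(coordination_states(minimal))  # 24
```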
The Missing Term
Jensen Huang formalized the ceiling at GTC in March 2026: output equals efficiency times energy, tokens per watt multiplied by the watts available.
Two levers. Efficiency on one side. Energy on the other. The industry is optimizing both. What this equation does not include is the work required to maintain the system while it runs. That work must be accounted for. And at scale, this term does not function as a correction to Huang's equation. It becomes the limiting factor.
Energy is not consumed solely to produce output. It is consumed maintaining consistency. The system is converting energy into coherence, not computation.
At the rack level, this becomes unavoidable.
Every unit of coordination overhead is not inefficiency. It is capacity displacement. Every watt spent reconciling identity, policy, and state as an interaction crosses a boundary is a watt not producing output. Improving available capacity does not eliminate coordination. Both scale against the same finite energy supply.
If coordination approaches total available capacity, payload collapses.
At that point, no compute remains for useful work. The system does not degrade. It becomes non-executable.
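The extended accounting can be sketched directly: the two levers give output as efficiency times energy, and the missing term subtracts the energy consumed by coordination before any output is produced. The figures below are illustrative assumptions, not measured values.

```python
def useful_output(efficiency: float, energy: float, coordination: float) -> float:
    """Tokens produced once the coordination energy has been paid.

    efficiency:   tokens per watt (the lever the industry optimizes)
    energy:       total watts available (the other lever)
    coordination: watts consumed maintaining coherence while the system runs
    """
    payload = energy - coordination
    if payload <= 0:
        return 0.0  # non-executable: nothing remains for useful work
    return efficiency * payload

print(useful_output(10.0, 1000.0, 100.0))   # 9000.0 -- coordination displaces capacity
print(useful_output(10.0, 1000.0, 1000.0))  # 0.0 -- payload collapses
```

No value of the efficiency lever rescues the second case: a multiplier on zero payload is still zero.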
The Approach from Both Directions
This condition is not approached from one direction. It is approached from both.
During execution, the product grows. AI adds participants. Features multiply. Regulatory mandates introduce additional authorities. Systems span more transports. Every dimension is increasing in every deployed system. Each increase multiplies the coordination cost of every active interaction.
At the same time, capacity is constrained. Energy supply is no longer assumed to expand with demand. Infrastructure is bounded. Compute is allocated, not abundant.
The product increases. The ceiling does not keep pace. The boundary becomes reachable. At scale, the system crosses from being compute-bound to being coordination-bound.
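The crossing can be sketched by holding capacity to linear growth while the coordination product compounds. Both the branching factor and the provisioning rate are assumptions; the point is the class of the two functions, not the particular numbers.

```python
def crossover_dimension(capacity_per_dim: float, branching: int = 2) -> int:
    """First dimension count at which the coordination product
    (branching**n states) exceeds linearly provisioned capacity
    (capacity_per_dim * n)."""
    n = 1
    while branching ** n <= capacity_per_dim * n:
        n += 1
    return n

# Even generous linear provisioning is eventually crossed by the product.
print(crossover_dimension(100))   # 10
print(crossover_dimension(1000))  # 14
```

A tenfold increase in per-dimension capacity moves the crossing by four dimensions. The boundary is delayed, not removed.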
What Is Already Visible
The consequences are already visible. Systems fail not because they lack compute, but because they cannot maintain coherent control-plane state during execution. Configuration propagates without boundaries. Policy fragments across domains. State diverges. Recovery becomes manual because the system cannot reconcile itself.
The pipes work. The system cannot govern what flows through them.
The instinctive response is to add more. More orchestration. More monitoring. More middleware. More policy layers. Each addition increases the number of independent constraints. Each increase multiplies coordination cost during execution. The system consumes additional capacity before it produces additional value.
Additive fixes accelerate the condition they attempt to correct.
The missing element is not another layer. It is the absence of a governing boundary at the level where the work is actually performed. Transport continuity keeps the signal alive. Authority continuity determines whether the system remains coherent as it runs. Without the second, every interaction pays the product cost at every boundary, for as long as it exists.
The Condition
O’Laughlin identified that the era ended. Thompson described the constraint that replaced it. Neither framing reaches the structural source of the behavior now being observed across infrastructure, economics, and failure patterns.
A growing fraction of available energy is consumed maintaining coherence in systems whose coordination cost scales multiplicatively as they execute.
You do not have to accept the architecture. That is not the point.
The system has entered a regime governed by thermodynamics and computational complexity, not design intent.
Coordination grows as a product. Compute does not.
As those constraints increase, the work required to maintain coherence increases with them. That work consumes energy drawn from the same finite pool that must power the computation itself.
At the limit, all available energy is consumed maintaining internal consistency. Nothing remains for useful work.
This is not a design choice.
It is a system boundary condition.
And when that boundary is reached, the system does not degrade.
It stops being executable.