Issue 1 Update - The Thesis Holds — and the Grid Is Starting to Answer
This post follows Issue 1 (April 2026) — The Real AI Energy Trade: Optionality Over Prediction in Infrastructure Investment
What a December 2025 trial at a London AI factory confirms about the constraint — and why efficiency gains make flexible, connected capacity more valuable, not less.
01 · WHAT THE LONDON TRIAL SHOWED
Issue 1 argued that the real AI energy constraint was not generation but delivery: the speed at which new large consumers can be connected to the grid and supplied with power. A trial completed at a London AI factory last December, with results published in early March 2026 and presented by Steve Smith of National Grid and Josh Parker of NVIDIA at Reset Connect London on 23 June, confirms the market is beginning to solve exactly that problem.
In December 2025, Emerald AI, NVIDIA, National Grid, Nebius, and EPRI ran a UK-first trial at Nebius’s AI factory in London. Over five days, more than 200 simulated real-time grid events were sent to the facility, a cluster of 96 NVIDIA Blackwell Ultra GPUs running production-grade AI workloads, to test whether the site could respond dynamically to grid signals without disrupting critical compute.
The results were unambiguous. The platform cut power draw by up to 40% without impacting critical workloads, followed load reduction requests for up to ten hours, and shed 30% of its load in approximately 30 seconds during a simulated system stress event. The system recorded 100% alignment with every power target National Grid instructed the cluster to follow across the entire trial.
The trial’s double-blind design mattered. NVIDIA’s team had no advance knowledge of what they would be asked to do. When the facility tracked National Grid’s target trace precisely, in microseconds, faster than most conventional flexible assets on the system, it confirmed something the grid operator had not yet been able to say with confidence: that AI factories can be among the most responsive resources available to a grid operator managing sudden demand swings.
02 · HOW THE FLEXIBILITY WORKS
NVIDIA’s DSX Flex software library and Emerald AI’s Conductor platform orchestrate three distinct flexibility mechanisms simultaneously.
On-site battery discharge. Batteries charged during periods of grid abundance dispatch during stress events, allowing the facility to maintain compute throughput while reducing the power it draws from the grid.
Workload deferral. Non-time-sensitive compute jobs are identified and temporarily paused — a fine-tuning run that can wait an hour, a training checkpoint that can be deferred — while high-priority inference workloads continue uninterrupted. Critical workloads are protected by design.
Geographic redistribution. In the most structurally novel development, compute can be shifted across facilities — moved to a data center elsewhere on the system when local grid conditions are under pressure. The grid is becoming a compute-routing problem as much as a power-delivery one.
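The three mechanisms above can be pictured as levers that are stacked to meet a requested reduction in grid draw. The following is a minimal sketch only: the function and field names are hypothetical, the megawatt figures are invented for illustration, and the fixed priority order (battery, then deferral, then redistribution) is an assumption rather than a description of how DSX Flex or Conductor actually allocate a request.

```python
from dataclasses import dataclass


@dataclass
class FlexPlan:
    """How a requested grid-draw reduction is split across the three levers (MW)."""
    battery_mw: float
    deferred_mw: float
    shifted_mw: float
    unmet_mw: float


def plan_reduction(requested_mw, battery_mw, deferrable_mw, shiftable_mw):
    """Fill the request lever by lever; the priority order is an assumption."""
    remaining = max(requested_mw, 0.0)
    allocations = []
    for available in (battery_mw, deferrable_mw, shiftable_mw):
        used = min(remaining, max(available, 0.0))
        allocations.append(used)
        remaining -= used
    return FlexPlan(*allocations, unmet_mw=remaining)


# Hypothetical event: the grid asks for 30 MW less draw from the site.
plan = plan_reduction(requested_mw=30.0, battery_mw=10.0,
                      deferrable_mw=15.0, shiftable_mw=20.0)
print(plan)
```

In this toy case the battery covers 10 MW, deferred jobs cover 15 MW, and only 5 MW of compute needs to move to another site, so critical inference workloads are never touched.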
‘AI factories are too valuable to be treated as either passive loads or permanent islands. They produce tremendously valuable AI tokens and knowledge, and with DSX Flex, they can also provide measurable relief back to the grid.’ — Varun Sivaram, Founder and CEO, Emerald AI
03 · WHAT THIS MEANS AT SCALE
The scale numbers presented at Reset Connect give the trial its investment context. Steve Smith noted that National Grid plans to connect approximately 10 gigawatts of new data center capacity over the next five years — against a UK peak system demand of roughly 45 gigawatts. National Grid estimates that flexible operation from the UK’s planned 6 GW of new data center capacity could provide 2 GW of dispatchable availability on demand.
Josh Parker of NVIDIA cited a Duke University study finding that if AI data centers in the US were flexible for just one percent of the year, that flexibility could unlock 100 gigawatts of latent grid capacity, equivalent to avoiding the construction of 50 Hoover Dams of new generation.
Smith put the consumer arithmetic plainly: connecting 10 gigawatts of flexible data centers would reduce domestic energy bills by roughly £25 per household before flexibility benefits are counted at all. For a large hospital, the saving runs to £50,000–£100,000. For a major industrial user, half a million pounds.
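The ratios behind these figures are easy to check. The short sketch below uses only the numbers quoted above; the variable names are illustrative, and the per-dam figure is simply what the 50-dam comparison implies, not an independently sourced capacity.

```python
HOURS_PER_YEAR = 8760  # non-leap year

# New data center connections relative to UK peak demand (figures from the text).
new_dc_gw = 10.0
uk_peak_gw = 45.0
share_of_peak = new_dc_gw / uk_peak_gw

# "One percent of the year" expressed in hours.
flexible_hours = 0.01 * HOURS_PER_YEAR

# Generation per dam implied by "100 GW is equivalent to 50 Hoover Dams".
unlocked_gw = 100.0
dam_equivalents = 50
implied_gw_per_dam = unlocked_gw / dam_equivalents

# Share of the planned 6 GW that National Grid estimates could be dispatchable.
dispatchable_share = 2.0 / 6.0

print(f"New capacity as share of UK peak: {share_of_peak:.0%}")
print(f"Flexible hours per year: {flexible_hours:.1f}")
print(f"Implied GW per dam: {implied_gw_per_dam:.1f}")
print(f"Dispatchable share of planned capacity: {dispatchable_share:.0%}")
```

The new connections are on the order of a fifth of UK peak demand, and one percent of the year is under 90 hours of flexible operation.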
Issue 1 identified the interconnection queue as the binding constraint on where AI capability could physically locate. The flexibility thesis reframes this directly: AI factories that can flex their power draw on demand can qualify for faster, larger grid connections, turning a waiting problem into a negotiating asset. The queue does not disappear. The relationship to it changes fundamentally.
04 · WHY EFFICIENCY GAINS MAKE THE CONSTRAINT MORE IMPORTANT, NOT LESS
The instinctive response to the efficiency story is reassurance: if AI processing per megawatt is improving tenfold year-on-year, the pressure on grid connection capacity must be easing. This reading is wrong, and understanding why it is wrong is the most important thing this update adds to Issue 1’s thesis.
Every time the cost of running an AI inference drops tenfold, the number of inferences run increases by more than tenfold. This is not a paradox — it is the mechanism. Cheaper AI unlocks use cases that were uneconomical at the previous price point: entire workflows that could not justify the cost per query now run continuously, across organizations that were not previously AI users at all. The ceiling rises before anyone hits it, and then demand fills the new space. Fifty years of computing history predict this outcome, and DeepSeek demonstrated it in miniature in January 2025: a model that cost a fraction of its predecessors did not reduce AI energy demand. It expanded the pool of organizations that could afford to use AI at scale.
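The rebound argument reduces to one line of arithmetic: total energy is the number of inferences multiplied by the energy per inference. The sketch below is a toy model, not a forecast. It assumes cost per inference falls in proportion to energy per inference and that demand follows a constant price elasticity; the function names and elasticity values are illustrative and do not come from the trial or from Issue 1.

```python
def energy_ratio(efficiency_gain, demand_growth):
    """Total energy after / before, when energy per inference falls by
    `efficiency_gain` and the number of inferences grows by `demand_growth`."""
    return demand_growth / efficiency_gain


def demand_growth(cost_drop, elasticity):
    """Growth in inferences under an assumed constant price elasticity."""
    return cost_drop ** elasticity


# Tenfold efficiency gain with exactly tenfold demand growth: energy is flat.
print(energy_ratio(10, 10))

# Elasticity above 1 (illustrative): demand outruns efficiency, energy rises.
print(round(energy_ratio(10, demand_growth(10, 1.2)), 2))

# Elasticity below 1 (illustrative): efficiency wins, energy falls.
print(round(energy_ratio(10, demand_growth(10, 0.8)), 2))
```

The claim in the paragraph above amounts to saying that demand for inference behaves like the second case: whenever inferences grow by more than the efficiency gain, total energy demand goes up rather than down.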
The efficiency gains are therefore real, but they do not reduce the pressure on grid connection capacity. What they change is the character of the demand. More inference, less training. More distributed workloads requiring proximity to users, fewer latency-insensitive ones that can sit anywhere power is cheap. More continuous, always-on AI embedded in operational workflows, less batch processing that can be scheduled around grid conditions.
The constraint is not easing. It is evolving. The question Issue 1 asked (which assets retain value across the widest range of AI demand outcomes) now has a sharper answer. It is not simply connected capacity. It is flexible, connected capacity in locations that can serve the inference-heavy, user-proximate demand that efficiency gains are unlocking. That is a more specific asset than the one most allocators currently hold.
This is precisely what makes the flexibility model demonstrated in London significant beyond its engineering detail. A facility that can flex its draw, defer non-critical workloads, and qualify for faster connection is not merely solving today’s queue problem. It is designed for the shape of demand that efficiency gains are creating: higher volume, more distributed, more continuous, and more sensitive to latency than the training-dominated build-out of the last three years.
Parker’s observation at Reset Connect — that the same inference workload now uses one-tenth the energy it did twelve months ago — should therefore be read not as evidence that the energy constraint is shrinking, but as evidence that inference is about to become vastly more pervasive. The constraint does not disappear when the cost of AI falls. It moves to wherever the next bottleneck sits. Right now, that bottleneck remains what Issue 1 identified: the speed at which new demand can be connected to and supplied with power from the grid. Efficiency makes that bottleneck more relevant, not less, because it accelerates the rate at which new demand arrives at the connection queue.
05 · SIGNAL UPDATE
Issue 1 identified five signals to monitor over the following twelve months. Two are directly updated by the London trial and the Reset Connect discussion.
Signal 5 — Behind-the-Meter Private Grid Emergence. The London trial demonstrates a complementary model to the behind-the-meter generation trend Signal 5 was tracking. Rather than bypassing the grid entirely with on-site gas turbines, the Nebius/Emerald approach keeps the AI factory grid-connected but makes it a controllable, responsive asset. Both models are now in parallel development: on-site generation for the largest projects that cannot wait for connection, and demand flexibility for facilities already connected. Watch for whether regulators begin to formalize faster connection pathways for flexible facilities — that would confirm the demand-response model is being institutionalized, not just trialed.
Signal 3 — Transformer and Switchgear Lead Times. The flexibility model addresses the queue problem from the demand side rather than the supply side. If AI factories can demonstrate grid responsiveness in exchange for faster and larger connection agreements, the effective bottleneck on deployment timelines shifts — even while physical equipment lead times remain extended. The two signals interact: flexibility buys time while manufacturing capacity catches up.
The London trial is now the operational blueprint for Emerald AI’s first commercial-scale deployment: NVIDIA’s 96 MW Aurora AI factory in Virginia, targeted for later in 2026.
The constraint thesis was right. Efficiency gains are not a reason to own less of the right infrastructure. They are a reason to be more precise about which infrastructure is right.