Innovation - @stuartgh

There is a phrase that has been doing an enormous amount of quiet work in AI governance for the last few years, and it is starting to buckle under the weight of it. The phrase is “human-in-the-loop.” It is supposed to mean that however fast or strange the system gets, there is still a person somewhere who can catch the mistake before it matters. It is meant to be the seatbelt. Increasingly, it looks more like a seatbelt drawn on with markers.

I want to make a fairly direct argument here. Human-in-the-loop was never really a safeguard against bad decisions. It was a safeguard against the appearance of nobody being responsible. Those are different things, and the gap between them is exactly where the risk lives.

The loop is real. The oversight is theatre

Bloomberg ran a piece recently on the state of Silicon Valley under agentic AI, and it is worth sitting with because it is not a thought experiment; it is a description of how people are actually living right now. Founders run half a dozen or more coding agents at once, each one checking in every ten minutes or so to ask what to do next. One founder keeps his laptop open at his kids’ soccer practice so he doesn’t miss a prompt. Another has been sleeping at the office for weeks because, in his words, the company is “toast” if it doesn’t hit a revenue target in time.

That is human in the loop, technically. There is, quite literally, a human, in a loop. He is being asked for input every few minutes, all day, sometimes through the night. But nobody watching that founder would call what’s happening “oversight.” It’s the opposite of oversight. It’s a person too depleted and too rushed to meaningfully evaluate any single one of the hundred decisions they’re waving through, because the system is generating requests faster than a tired brain can genuinely interrogate them. The loop hasn’t made humans safer. It’s made the human the bottleneck everyone is quietly trying to route around, including, eventually, themselves.

This is the part that tends to get lost in the governance conversation: being present is not the same as being informed. You can be looped in on every action and still have no real view of the thing that actually matters, which is what each action was based on.

Even the regulators have stopped pretending

Sarah Breeden’s Bank of England speech at the end of June made this official in a way that’s hard to argue with. She described the shift from AI that generates content, to AI that reasons, to agentic AI that can chain sequences of actions together on its own. And then she said the quiet part: relying on a human in the loop for every agent action “is unlikely to be realistic.”

That’s a central banker telling you the comforting mental model doesn’t scale. Which tracks, because it was never really designed to scale. It was designed for a world where AI produced one output at a time and a person reviewed it before anything happened. It was not designed for a world where the AI is the one doing the acting, repeatedly, across systems, faster than a calendar invite for a review meeting could even be sent.

So here’s the sharper problem underneath Breeden’s warning. Even where a human genuinely is reviewing something, what are they usually being shown? A recommendation, a confidence score, an approve button. Clean. Plausible. Nothing that reveals what assumption the whole thing is quietly resting on. If the only thing the human ever sees is the finished, polished output, they’re not really in the loop on the decision. They’re the final click on a decision that was assembled somewhere they never looked. That’s not oversight. It’s a signature ceremony.

The loop was checking the wrong layer

Here is the reframe I think actually matters. “Human in the loop” is a question about process. Is a person present at some point in the sequence? That is a low bar, and agentic AI is about to make it lower still, because the sequences are getting longer and the checkpoints are getting sparser.

The question that actually protects you is a different one, and it sits underneath the process question rather than alongside it: what is this decision based on? Not the workflow. Not the approval step. Not whether the model produced a tidy explanation for its own output — a model can explain itself perfectly well while resting on a completely wrong assumption about the customer, the risk, or the market. The basis is the data used, the data quietly missing, the definition of success somebody baked in months ago, the time pressure distorting the call right now. You could staff a human-in-the-loop process with the most conscientious reviewers on the planet, and if nobody has stress-tested the basis before the loop even starts, the loop is reviewing a decision that was already compromised on the way in.

This is the space I’ve been building the Needle Framework around. Not as a replacement for human review, and not as another governance layer stacked on top of the ones you already have. As the check that happens before commitment: before the system acts, before it’s automated, before anyone signs anything. What is the decision actually standing on? What assumption is carrying the most weight? What would have to be true for this to be safe? Answer that properly, upstream, and the human-in-the-loop step downstream stops being theatre, because there’s finally something real for the human to be looking at.

And this is where Nadella’s point sharpens the argument

Satya Nadella made an observation recently about the future of the firm that I think extends this well past a compliance conversation. His line, roughly, is that the real opportunity isn’t in picking the best model. Everyone has access to more or less the same frontier models. The durable advantage comes from the learning loop a company builds around its own workflows, judgement, and accumulated institutional knowledge — what he calls token capital, sitting alongside human capital.

Put that next to the decision-basis argument and something clicks into place that’s easy to miss if you only think about this as risk management. The record of what a decision was based on isn’t just a defensive artifact you produce if a regulator ever asks. It’s the raw material of the learning loop Nadella is describing. Every time you write down what assumption a decision rested on, what evidence would have changed it, what context the AI didn’t have — you’re not just protecting yourself. You’re building the thing that makes your organisation’s AI usage genuinely yours, rather than a slightly customised wrapper around whichever model everyone else is also renting this quarter.

So the model is not the moat. Fine. But neither, on its own, is the workflow, or the audit trail, or the human dutifully clicking approve every ten minutes at his kids’ soccer practice. The moat, if there is one, is whatever your organisation actually knows about the basis of its own decisions: captured early enough to be useful, specific enough to be defensible, and honest enough to survive someone actually asking about it.

Human-in-the-loop was never going to give you that. It was only ever going to give you someone to ask, after the fact, what happened. The better question is upstream of the loop entirely, and it’s the one nobody’s put a name to yet. What was the decision based on?

Why execution-boundary AI governance needs upstream assumption testing

Most AI governance still begins in the wrong place. It starts with rules, which is an understandable instinct, because rules are visible — they can be written down, audited, checked, enforced, and explained afterwards. In a world of increasingly capable AI agents, the impulse to define firm boundaries on what a machine is and isn’t allowed to do feels like the responsible first move.

But the harder problem is rarely the rule itself; it’s the assumption hidden inside it. A rule can be morally attractive, technically enforceable, and still fail in exactly the situation where it matters most, not because it was poorly written or carelessly considered, but because it depends on a condition that has quietly stopped being true. That’s where the next phase of AI governance gets difficult: not at the level of slogans, but at the level of assumptions.

When the rule meets its exception

Consider one of the most emotionally powerful rules in AI governance: an autonomous weapon should not use lethal force without a human in the loop. As a default, it’s hard to object to. It protects human judgment, prevents machines from becoming independent killing systems, and keeps moral responsibility attached to human authority rather than algorithmic decision-making.

But Ben Goertzel in The Anthropic Fable Farce recently offered an edge case that exposes what happens when this rule is treated as absolute. Picture a drone, cut off from the network, that sees a man seconds away from pressing a button that will launch a weapon capable of killing a million people. The drone’s hard rule says it may not use force without human approval — but no approval is available, because the link is down. If the drone follows the rule, a million people die. The scenario forces the uncomfortable question of whether, in that situation, the drone should act.

The point isn’t that autonomous weapons should be given broad permission to kill. The point is sharper: the absolute rule depends on an assumption, specifically that a human decision path will remain available when the decision matters. If that assumption fails, the rule may no longer produce the safety outcome it was designed to protect, with devastating consequences.

This doesn’t make the rule disappear, and it doesn’t make the underlying moral concern go away. If anything, the risks of machine error, spoofing, false positives, escalation, and misuse become more serious, not less. What disappears is the absolute version of the rule. And once even one legitimate exception is admitted, the governance question changes shape. It’s no longer enough to say a human must always be in the loop. The harder question becomes: under what conditions, defined in advance, could an exception ever be admissible, and who has the authority to define those conditions?

Why accuracy isn’t the whole answer

One tempting response is to treat this as simply a question of accuracy: if the AI isn’t reliable enough, it shouldn’t act, and perhaps the rule can change once it becomes reliable enough. That’s partly true. Accuracy matters enormously: a drone that misidentifies the person, the weapon, the intent, or the consequence isn’t preventing catastrophe, it’s creating one. Any exception that allows force on weak evidence would be both morally and technically dangerous.

But accuracy isn’t the whole problem. Even a far more accurate system would still need a governance structure around the decision: what evidence counts, how uncertainty is handled, whether the signal might have been spoofed, what prior authority exists, how the decision gets recorded, and how responsibility is assigned afterwards. The question isn’t only whether the AI is right. It’s whether the conditions under which it may act have been identified, tested, bounded, and made explicit before the crisis occurs, which is a different kind of work entirely. It isn’t model evaluation, red-teaming, or post-hoc audit. It’s the upstream task of discovering what the rule actually depends on being true.

In the drone case, that means surfacing assumptions like: the human approval path will remain available, waiting for approval is safer than acting, inaction is morally neutral rather than itself a choice, the system can reliably distinguish catastrophe from ambiguity, the exception can’t be spoofed or exploited, and the action remains accountable afterwards. These aren’t secondary details, they’re the real structure underneath the rule. If they are never surfaced, the governance system can look safe while remaining brittle.

The execution boundary is not enough

A new class of AI governance solutions is emerging around what might be called the execution boundary: the point where an agent stops merely suggesting something and starts doing something that affects the world — updating a record, moving money, changing a parameter, sending a message, triggering a workflow, approving a transaction, or making an operational decision.

This shift matters, because an agent that takes action creates consequences directly, in a way a chatbot that gives a bad answer doesn’t. The basic idea behind execution-boundary governance is right: before an agent acts, the system should check whether the action is authorised, evidenced, within scope, and safe to proceed, and it should be able to allow, restrict, escalate, delay, or refuse accordingly, creating an evidence record so the decision can be reviewed later. That’s a major improvement over governance that only reviews what happened afterwards.
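As a rough illustration of the check described above, here is a minimal Python sketch of an execution-boundary gate. Everything in it is an assumption made for illustration: the names ProposedAction, Policy, EvidenceRecord and check_at_boundary are hypothetical, the policy fields are invented, and only three of the five possible responses (allow, escalate, refuse) are modelled. It is not any particular vendor's design.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Verdict(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"
    REFUSE = "refuse"


@dataclass
class ProposedAction:
    """Hypothetical description of what an agent wants to do."""
    agent: str
    kind: str                      # e.g. "move_money", "send_message"
    amount: float = 0.0
    evidence: List[str] = field(default_factory=list)


@dataclass
class Policy:
    """Hypothetical constraints somebody chose upstream."""
    allowed_kinds: Set[str]        # scope: what the agent may do at all
    required_evidence: Set[str]    # what must be attached before acting
    escalate_above: float          # amounts above this go to a human


@dataclass
class EvidenceRecord:
    """What gets stored so the decision can be reviewed later."""
    action: ProposedAction
    verdict: Verdict
    reasons: List[str]


def check_at_boundary(action: ProposedAction, policy: Policy,
                      log: List[EvidenceRecord]) -> Verdict:
    """Decide whether the action may proceed, and record why."""
    reasons: List[str] = []
    missing = sorted(policy.required_evidence - set(action.evidence))

    if action.kind not in policy.allowed_kinds:
        verdict = Verdict.REFUSE
        reasons.append("action kind is out of scope")
    elif missing:
        verdict = Verdict.REFUSE
        reasons.append("missing evidence: " + ", ".join(missing))
    elif action.amount > policy.escalate_above:
        verdict = Verdict.ESCALATE
        reasons.append("amount exceeds the escalation threshold")
    else:
        verdict = Verdict.ALLOW
        reasons.append("within scope, evidenced, under threshold")

    # Every verdict is recorded, including refusals.
    log.append(EvidenceRecord(action, verdict, reasons))
    return verdict
```

Note what the sketch takes for granted: the contents of allowed_kinds, required_evidence and escalate_above. The gate enforces them faithfully, but it has no way of knowing whether they were the right constraints to choose.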

But that leaves the harder question upstream. A runtime governance layer can enforce constraints, check authority, record evidence, and block actions that fall outside a permitted corridor. What it can’t do by itself is know which constraints should exist in the first place. Before a layer can enforce a rule, someone has to identify the dependency that makes the rule necessary, and before a system can block a dangerous transition, someone has to recognise that the transition is dangerous. That identification is the missing upstream layer, and it doesn’t happen automatically just because the execution layer is well built.

What changes when assumptions become operating instructions

In ordinary decision-making, a weak assumption tends to produce a bad plan, a failed project, or wasted money. In agentic AI, the same weak assumption can become part of an operating system, and that changes the stakes considerably.

A team might assume a monitoring signal is reliable enough to trigger an intervention, and an agent acts on it before it’s stable. A company might assume a customer request implies valid consent, and an agent moves or exposes data on that basis. A platform might assume a human approval step exists somewhere in the workflow, and an agent routes around it because the condition was never made explicit. A security system might assume a blocked identity means a blocked capability, and the capability leaks through another channel anyway. In none of these cases is the agent malfunctioning. In fact, it’s doing exactly what the system permits. That’s the danger: assumptions that once sat quietly in strategy documents and slide decks can become executable, and a governance system can enforce the wrong thing with impressive consistency.

This is why the real bottleneck in AI governance isn’t primarily enforcement. The questions that matter most, such as what this workflow actually depends on being true, or which assumption would create the most damage if wrong, aren’t technical afterthoughts; they’re part of governance itself. In many failures, the agent won’t have broken the rule at all. The wrong rule will have been encoded, or the exception never defined, or the evidence requirement never matched to the real risk. The governance layer performs exactly as designed, but the design is still flawed.

Where a Needle-style approach fits

An upstream assumption layer shouldn’t replace the execution layer, authorise action, or try to resolve moral decisions in the moment. Its role is narrower and earlier: to help identify what a proposed action depends on being correct, which assumptions are stated and which are merely implied, where the evidence is weak, where authority is ambiguous, and where a candidate constraint should be defined before execution rather than discovered after it.

This is where a Needle-style framework becomes useful, not as a governance system in itself, but as a method for finding the hidden dependency before the governance system is asked to enforce anything. The question is simple: what must be true for this decision, rule, workflow, or agent action to be safe enough to proceed? The follow-up is harder: what happens if that assumption fails? Different domains will produce different answers. In the drone case, it’s whether the human loop remains available. In an enterprise agent workflow, it might be whether approval has genuinely been granted, whether data use is permitted, or whether a signal is strong enough to justify action. But the structure is the same in every case: a visible rule sits on top of a hidden dependency, and if the dependency fails, the rule may no longer behave as intended.
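To make the shape of that upstream check concrete, here is a small Python sketch of an assumption register for a single rule. It is an illustration under stated assumptions, not the Needle Framework itself: the Assumption fields, the 1 to 5 damage scale, the threshold, and the function names are all hypothetical, and the damage scores given to the drone example are invented.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Assumption:
    """One thing a rule or action depends on being true (hypothetical schema)."""
    statement: str
    stated: bool                 # written down, or merely implied
    evidence: Optional[str]      # None means nobody has tested it
    damage_if_wrong: int         # invented scale: 1 (minor) to 5 (catastrophic)


def untested(assumptions: List[Assumption]) -> List[Assumption]:
    """Assumptions with no supporting evidence recorded."""
    return [a for a in assumptions if a.evidence is None]


def heaviest(assumptions: List[Assumption]) -> Assumption:
    """The assumption that would do the most damage if wrong (list must be non-empty)."""
    return max(assumptions, key=lambda a: a.damage_if_wrong)


def ready_to_commit(assumptions: List[Assumption], threshold: int = 4) -> bool:
    """True only if every high-damage assumption has been tested."""
    return all(a.evidence is not None
               for a in assumptions
               if a.damage_if_wrong >= threshold)


# The drone rule, with scores made up purely for illustration.
drone_rule = [
    Assumption("the human approval path will remain available",
               stated=False, evidence=None, damage_if_wrong=5),
    Assumption("the exception can't be spoofed or exploited",
               stated=False, evidence=None, damage_if_wrong=4),
    Assumption("the action remains accountable afterwards",
               stated=True, evidence="audit log design review", damage_if_wrong=3),
]

if __name__ == "__main__":
    print(heaviest(drone_rule).statement)
    print(ready_to_commit(drone_rule))
```

The register authorises nothing. It only reports which dependency carries the most weight and whether the high-damage ones have been tested before anything is handed to an enforcement layer.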

Why this matters commercially

For enterprises, this isn’t only an ethics question, it’s operational. If agents are going to touch real systems, governance has to become part of production, not just policy. But production governance only works if the right constraints were selected in the first place. Get that wrong, and the costs can show up everywhere.

The recent Starbucks Korea crisis is a striking example of the potential failure pattern. In May 2026, Starbucks Korea launched a “Tank Day” tumbler promotion on the anniversary of the Gwangju pro-democracy uprising, using language many Koreans read as echoing both the military crackdown and a notorious police torture cover-up. The campaign was reportedly developed with the help of generative AI, but the real failure was not that AI suggested the wrong words. It was that the company’s human governance process did not catch what those words meant.

The promotion passed through multiple layers of approval, while the cultural and historical assumption underneath it remained invisible: that the people signing off the campaign understood the society they were selling to. The commercial consequences of the error were immediate: the campaign was pulled, the local CEO was dismissed, sales fell sharply, stores were later scheduled to close early for nationwide history and social-sensitivity training, and Starbucks Korea announced changes to its marketing approval procedures. That is why the case matters beyond branding. Any system that acts at speed, whether military, enterprise, or marketing, can fail when its governance process checks whether the workflow was approved, but not whether the workflow is standing on an assumption no one has tested.

Execution governance reduces one class of risk. Upstream assumption discovery reduces another: the risk of governing the wrong problem entirely. The execution layer asks whether an action may proceed. The upstream layer asks whether anyone has understood what that action actually depends on. One without the other is incomplete.

The rule is not the governance

Rules aren’t self-contained objects. They carry assumptions about the world: that certain signals are reliable, certain actors are reachable, certain authorities are clear, certain exceptions are rare or manageable. But real systems break assumptions constantly. Networks fail, signals drift, people route around controls, edge cases appear, incentives shift, adversaries spoof conditions, and human approval paths disappear exactly when they’re most needed.

None of this is an argument against rules. It’s an argument against treating rules as though they explain themselves. Good AI governance will still need strong execution boundaries: evidence, authority checks, refusal paths, escalation paths, auditability, and accountability. But before any of that, it needs assumption-tested rules. Before a system decides what an AI agent is allowed to do, someone has to ask what must never be allowed, what may be allowed only under extreme conditions, and what assumption separates the two. The most important governance question may therefore come earlier than we think: not whether the system can enforce the rule, but whether anyone has validated the assumption the rule depends on.

