Discussion about this post

User's avatar
Michael Klingebiel's avatar

The human review protocol section names the real problem precisely: nominal human presence is not meaningful review. A reviewer who can see the output but not the reasoning trace isn't oversight — they're a liability shield.

What you've mapped is the institutional layer. There's a parallel problem one level down, at the model level, that your three artifacts assume has already been solved: the system has to be capable of explaining itself before any notice, pathway, or review protocol can function. If the model routes a high-consequence query to a cheap output path and never flags it, the human reviewer has nothing to work with. The audit trail is already broken before the institution touches it.

The model-level equivalent of your three artifacts: a routing architecture that treats irreversible-consequence domains as hard triggers (not soft signals), an output constraint that prohibits inferred-safety language absent grounded evidence, and a logged audit record that distinguishes forced escalations from standard processing. Without those, the due process stack you've described is sitting on a foundation that can silently fail.

Good series. The Robodebt anchor is the right one.

Colleen Avarene's avatar

Marcela — the three-artifact framework turns due process from a legal principle into an engineering checklist, and that's exactly what's been missing from this conversation. Most public-sector AI debate stays at the level of "we need accountability" without specifying what accountability looks like as a workflow. You specified it.

The human review protocol section is the sharpest part. "Nominal human presence is not meaningful review" — that distinction is doing enormous work. A person who can see the output but not the reasoning, who can approve but not override without institutional friction, who exists in the loop on paper but not in practice — that person isn't oversight. They're a liability shield wearing a name badge.

I build custom AI agents for private-sector businesses and the same principles apply at a smaller scale. Every agent I build has scoped permissions, an escalation protocol, and a kill switch — because an agent that can act but can't be reviewed, paused, or reversed isn't a tool. It's a liability. The difference between public and private is the stakes, not the architecture. Your three artifacts could be a universal design standard, not just a government one.

The Robodebt anchor is the right one. Not hypothetical harm — documented harm. Decisions at scale with no explanation, no meaningful appeal, no human who could reverse them in time. That's what happens when the system is designed for throughput instead of contestability. Glad someone's writing the blueprint for the alternative.

2 more comments...

No posts

Ready for more?