The architecture of trust

What enterprises actually need before they'll let an AI agent into production. It's not about the model — it's about the infrastructure around it.

I was in a room with a CISO last year, demoing an AI agent that could resolve support tickets on its own. The demo was good. The agent classified incoming tickets, pulled context from the knowledge base, drafted responses, and escalated edge cases to a human. Resolution time dropped sixty per cent in our test environment. The CISO watched quietly, asked a few clarifying questions, then said: "This is impressive. But tell me what happens when it resolves a ticket incorrectly, the customer complains, and we need to explain to our auditors exactly what the system did and why."

That question ended the meeting. We had a partial answer, but the question itself revealed how much work sat between "working demo" and "something we can actually deploy." She wasn't asking about model accuracy. She was asking about accountability and organisational liability. Those are infrastructure problems, not model problems, and they are the actual blockers for enterprise AI adoption.

The trust gap

There's a persistent misunderstanding in the AI tooling world about why enterprises are slow to adopt agent systems. The common explanation: enterprises are bureaucratic, risk-averse, or behind. Occasionally true. Almost always insufficient. The real story is that enterprises have specific requirements for any system that touches production data, and most AI agent projects haven't been designed to meet them.

These requirements aren't arbitrary. Enterprises operate in regulated environments and they're contractually liable for the behaviour of their systems. They've learned, often through painful experience, that systems without audit trails and rollback capabilities create risks that compound over time.

When an enterprise evaluates an AI agent, the first question is rarely "does it work?" It's: "When it doesn't work, what happens next?"

What trust actually requires

After enough of these conversations, across data protection, financial services, and SaaS platforms, I've found that enterprise trust requirements cluster into six categories. None of them are about the model itself.

Immutable audit trails. Every action the agent takes gets logged in a way that can't be retroactively modified. Not just "what happened" but "why." The inputs the agent received, the reasoning it followed, the tools it called, the outputs it produced. When an auditor asks "why did the system do this?", the answer has to be reconstructable from the logs alone. No re-running the model. No interpreting probabilistic outputs.
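One way to make "can't be retroactively modified" concrete is to hash-chain the log entries, so any edit to a past entry breaks every hash after it. A minimal sketch; the class and field names are illustrative, and a production system would write to append-only storage rather than an in-memory list:

```python
import hashlib
import json
import time


class AuditTrail:
    """Append-only audit log with hash chaining: each entry embeds the
    previous entry's hash, so a retroactive edit is detectable on verify()."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []
        self._last_hash = self.GENESIS

    def record(self, action, inputs, reasoning, outputs):
        # Log not just what happened but why: inputs, reasoning, outputs.
        entry = {
            "ts": time.time(),
            "action": action,
            "inputs": inputs,
            "reasoning": reasoning,
            "outputs": outputs,
            "prev_hash": self._last_hash,
        }
        payload = json.dumps(entry, sort_keys=True).encode()
        entry["hash"] = hashlib.sha256(payload).hexdigest()
        self._last_hash = entry["hash"]
        self._entries.append(entry)
        return entry["hash"]

    def verify(self):
        # Recompute every hash from the recorded fields; any mismatch
        # means the log was modified after the fact.
        prev = self.GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev:
                return False
            payload = json.dumps(body, sort_keys=True).encode()
            if hashlib.sha256(payload).hexdigest() != entry["hash"]:
                return False
            prev = entry["hash"]
        return True
```

The point of the chain is that reconstructing "why did the system do this?" never requires re-running the model: everything the auditor needs is in the entries, and tampering with them is provable.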

Human approval gates. Before the agent modifies customer data, sends an external message, executes a financial transaction, or deploys code, it stops and waits for a human. This can't be a global on/off switch. It needs to be a configurable policy mapping action types to risk tiers. Reading data? Autonomous. Sending a refund? Someone signs off. And the configuration itself must be auditable.
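A policy mapping action types to risk tiers can be small. A sketch under two assumptions I'm making, not taken from the source: unknown actions fail closed to "requires approval", and every policy change is appended to a change log so the configuration stays auditable:

```python
from enum import Enum


class Tier(Enum):
    AUTONOMOUS = "autonomous"        # execute without a human in the loop
    APPROVAL = "requires_approval"   # block until a human signs off


class ApprovalPolicy:
    """Maps action types to risk tiers; records every change to itself."""

    def __init__(self, rules, default=Tier.APPROVAL):
        self._rules = dict(rules)
        self._default = default  # fail closed: unmapped actions need approval
        self.change_log = []

    def tier_for(self, action):
        return self._rules.get(action, self._default)

    def set_tier(self, action, tier, changed_by):
        # The configuration itself is auditable: who changed what, to what.
        self.change_log.append((action, tier, changed_by))
        self._rules[action] = tier


policy = ApprovalPolicy({
    "read_ticket": Tier.AUTONOMOUS,
    "send_refund": Tier.APPROVAL,
})
```

The execution loop then checks `tier_for(action)` before each tool call and parks `APPROVAL`-tier actions on an async checkpoint until a human responds.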

Rollback to pre-agent state. Straightforward for code changes (git revert). Significantly harder for database mutations and sent messages. The teams that handle this well think about rollback at design time, not after the first incident.
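"Thinking about rollback at design time" often means pairing every mutation with a compensating action recorded at the moment of execution, rather than trying to reconstruct undo steps after an incident. A sketch of that pattern; the session class and the dict-based store are stand-ins I've invented for illustration:

```python
class AgentSession:
    """Records a compensating action alongside every mutation, so
    rollback is a reverse replay of the undo stack."""

    def __init__(self):
        self._undo_stack = []

    def execute(self, do, undo):
        result = do()
        self._undo_stack.append(undo)  # recorded only if `do` succeeded
        return result

    def rollback(self):
        while self._undo_stack:
            self._undo_stack.pop()()  # undo newest mutation first


# In-memory stand-in for a customer record store.
store = {"ticket_42": {"status": "open"}}

session = AgentSession()
session.execute(
    do=lambda: store["ticket_42"].update(status="resolved"),
    undo=lambda: store["ticket_42"].update(status="open"),
)
```

This covers reversible mutations; genuinely irreversible actions (a sent email, an external API call) can't be undone this way, which is exactly why they belong behind an approval gate instead.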

Rate limiting and circuit breakers. An agent in a runaway loop can do enormous damage in minutes. How many API calls should this agent make per session? Ten? A hundred? If it's making a thousand, something's wrong. Rate limits per tool, per session, per time window catch this. Circuit breakers detect failure patterns and stop the agent before damage compounds. Same pattern you'd apply to any distributed system component.
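Both mechanisms fit in one small guard object. A sketch combining a sliding-window rate limit per tool with a breaker that trips after consecutive failures; the thresholds and class name are illustrative:

```python
import time
from collections import deque


class ToolGuard:
    """Per-tool sliding-window rate limit plus a circuit breaker that
    trips after a run of consecutive failures."""

    def __init__(self, max_calls, window_s, failure_threshold):
        self.max_calls = max_calls
        self.window_s = window_s
        self.failure_threshold = failure_threshold
        self._calls = deque()   # timestamps of recent calls
        self._failures = 0      # consecutive failures seen
        self.tripped = False

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        if self.tripped:
            return False  # breaker open: stop before damage compounds
        while self._calls and now - self._calls[0] > self.window_s:
            self._calls.popleft()  # drop calls outside the window
        if len(self._calls) >= self.max_calls:
            return False  # rate limit hit
        self._calls.append(now)
        return True

    def report(self, success):
        self._failures = 0 if success else self._failures + 1
        if self._failures >= self.failure_threshold:
            self.tripped = True
```

As the text says, this is the same pattern you'd apply to any distributed system component; the only agent-specific part is choosing per-tool budgets that make a runaway loop visible within seconds.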

Compliance-aligned data handling. If the agent processes personal data, it has to comply with whatever regulatory framework applies: GDPR, HIPAA, SOC 2, or the organisation's own data classification policy. Where does data flow through the agent's pipeline? What gets logged, and what must never be logged? Does the LLM provider's data processing agreement actually cover your use case?
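The "what must never be logged" question usually resolves to redaction at the boundary: sensitive fields are scrubbed before a record ever reaches a log sink. A minimal sketch; the field names are placeholders, and a real deployment would derive the sensitive set from the organisation's data classification policy, not a hardcoded constant:

```python
# Illustrative only: in practice this set comes from the data
# classification policy, not source code.
SENSITIVE_FIELDS = {"email", "ssn", "card_number"}


def redact(record, sensitive=SENSITIVE_FIELDS):
    """Return a copy of `record` safe to log: sensitive values are
    replaced before they can reach any log sink."""
    return {
        key: "[REDACTED]" if key in sensitive else value
        for key, value in record.items()
    }
```

Calling `redact` in the logging path, rather than trusting each call site to remember, is what turns a policy document into an enforced invariant.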

Operational observability. Separate from audit trails. The team running the agent needs real-time visibility: event streams, dashboards, alerts on anomalous behaviour. If the agent's confidence drops or it starts calling a tool in an unexpected pattern, someone should know now. Not tomorrow. Not when a customer complains.
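"Someone should know now" implies push-based alerting wired into the event stream, not a report read tomorrow. A sketch of the unexpected-tool-pattern case; the per-session ceilings and class name are assumptions of mine, and a real system would feed an alerting pipeline rather than a callback:

```python
class SessionMonitor:
    """Fires an alert the moment a tool's call count in the current
    session exceeds its expected ceiling; alerting is push, not pull."""

    def __init__(self, expected_max, alert):
        self.expected_max = expected_max  # per-tool call ceiling per session
        self.alert = alert                # callback into the alerting pipeline
        self.counts = {}

    def observe(self, tool):
        self.counts[tool] = self.counts.get(tool, 0) + 1
        # Fire exactly once, on the first call past the ceiling.
        if self.counts[tool] == self.expected_max.get(tool, 0) + 1:
            self.alert(f"anomaly: {tool} exceeded expected call ceiling")
```

The same event stream feeds both this real-time path and the audit trail, but they serve different audiences: the monitor tells the on-call engineer something is wrong now; the audit trail tells the auditor what happened later.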

Trust isn't a feature you add at the end. It's an architectural decision you make at the beginning, and it shapes everything that follows.

The build-versus-bolt problem

The most common mistake I see: treating trust as a layer you add after the agent works. A team builds a capable agent, shows impressive results, and then, when the enterprise asks about audit trails and rollback, scrambles to retrofit logging and approval gates into a system that wasn't designed for them.

This rarely works. Retrofitted audit trails capture what the wrapper logs, not what the agent actually did. Bolted-on approval gates end up all-or-nothing because the system wasn't designed for granular risk tiering. And rollback? It covers the easy cases (file changes) and misses the hard ones (API calls, database mutations).

The alternative is to design for trust from the start. That doesn't mean over-engineering. It means a small number of early architectural choices (structured event logging, a tool handler layer with approval hooks, a session model that tracks state) that make trust features natural extensions rather than painful retrofits.

What I've learned

I've built agent infrastructure that incorporates most of these requirements, and I've been in rooms where they were negotiated. A few observations that aren't obvious from the outside.

The people asking these questions aren't trying to block adoption. That CISO who ended my demo? In later conversations, once we'd addressed her concerns, she became one of the strongest advocates for deploying the system. Security and compliance teams are often the most pragmatic people in the room. They've seen what happens when systems ship without guardrails. Treat them as design partners, not obstacles.

The requirements aren't as onerous as they sound. An immutable audit trail is a structured log written to append-only storage. A human approval gate is an async checkpoint in your execution loop. Rate limiting is a counter and a threshold. None of these are technically hard. What makes them expensive is retrofitting.

And the investment pays off well beyond the enterprise sales cycle. The audit trails that satisfy a CISO also help your own team debug agent behaviour. The observability that meets compliance requirements also helps you understand performance. Trust infrastructure isn't overhead. It's engineering quality.

The teams that earn enterprise trust aren't the ones with the most impressive demos. They're the ones who can answer the CISO's question ("What happens when it goes wrong?") with architecture, not promises.
