Most teams putting LLMs into production still get authorisation wrong in the same way: they bolt the model onto an API key with broad scopes, wrap it in a system prompt that says "don't do bad things", and hope. That works until the day the model is asked to do something it shouldn't, and discovers it can. The fix is not a longer prompt. The fix is moving authorisation out of the model and into a deterministic policy layer that sits between the model and every tool it can call. This article is a working engineer's view of how to build that layer, what belongs in it, and the mistakes I keep seeing in regulated environments.
Why prompt-level guardrails are not authorisation
A system prompt that says "only answer questions about invoices from the user's own company" is a suggestion, not a control. The model will follow it most of the time, fail occasionally under adversarial input, and you will have no audit trail of why it complied or didn't. Authorisation, in any system that has to satisfy an auditor, means three things: a deterministic decision, a written policy that produced the decision, and a log that proves it. None of those are properties an LLM provides.
The other failure mode is scope inflation. A single OAuth token with read access to "all documents" gets handed to the model because it's easier than minting per-request tokens. The moment the model can call a search tool with that token, the model has every permission the token has, regardless of who is in the chat. Tool-layer authorisation closes this gap by re-checking, on every tool call, whether this user is allowed to do this action on this resource, with the model's identity treated as untrusted.
The shape of the layer
Concretely, the tool layer sits between the model runtime and any system the model can affect — databases, file stores, email, CRM, ledger, case management, anything. Every tool the model can call is defined by a JSON schema and a policy binding. When the model emits a tool call, the layer does, in order:
- Identity resolution. Who is the calling user? Not the model — the human or service principal whose session created this turn. The model never holds a token of its own.
- Argument validation. Does the tool call match the schema? Are IDs in the allowed shape? Reject anything malformed before it touches policy.
- Policy evaluation. Run the policy engine with the user, action, resource, and context. Get an allow/deny plus an obligation set (redactions, row filters, rate caps).
- Execution under obligations. If allowed, execute the tool with the obligations applied — for example, a SELECT becomes a SELECT with an added WHERE clause sourced from policy, not from the model.
- Response shaping. Strip fields the user is not entitled to see before the result returns to the model. The model never sees what it is not allowed to act on.
- Audit. Write the decision, the policy version, the inputs, and the obligations to an append-only log.
Each of those steps is boring on its own, which is the point. Boring is auditable.
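To make the sequence concrete, here is a minimal sketch of the dispatch loop in Python. The session, tool registry, policy client, and audit object are stand-ins for whatever your stack provides, and the field names are assumptions rather than a prescription; the point is the order of operations and the fact that nothing in it consults the model.

```python
from dataclasses import dataclass
from typing import Any

import jsonschema  # validates tool arguments against the tool's JSON schema


@dataclass
class Decision:
    allow: bool
    obligations: dict[str, Any]
    policy_version: str
    rule_path: str


def handle_tool_call(session, tool_name: str, args: dict, registry: dict, policy, audit):
    # 1. Identity resolution: the principal comes from the session that opened
    #    this turn, never from the model or from the tool-call arguments.
    principal = session.principal

    # 2. Argument validation: reject malformed calls before policy ever runs.
    tool = registry[tool_name]
    jsonschema.validate(instance=args, schema=tool["schema"])

    # 3. Policy evaluation: principal, action, resource, context. Facts only.
    decision: Decision = policy.evaluate(
        principal=principal,
        action=tool["action"],
        resource=args.get("resource_id"),
        context={"tenant": session.tenant_id, "turn": session.turn_id},
    )
    if not decision.allow:
        audit.write(session, tool_name, args, decision, result_size=0)
        raise PermissionError(f"denied by rule {decision.rule_path}")

    # 4. Execution under obligations: row filters, caps, etc. applied by the tool.
    rows = tool["execute"](args, decision.obligations)

    # 5. Response shaping: strip fields the user is not entitled to see before
    #    the result re-enters the model's context (assumes rows are dicts).
    masked = set(decision.obligations.get("field_mask", []))
    rows = [{k: v for k, v in r.items() if k not in masked} for r in rows]

    # 6. Audit: decision, policy version, inputs, obligations, result size.
    audit.write(session, tool_name, args, decision, result_size=len(rows))
    return rows
```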
Policy as code, not policy as prose
The policy itself should be code, not a paragraph in a config file. I use Rego (OPA) for most of this, but Cedar works, and a small homegrown DSL is fine if the rule set is narrow. What matters is that the policy is:
- Versioned in the same repo as the application, with a commit history an auditor can follow.
- Testable — every rule has a unit test with allow and deny fixtures, and the test suite runs in CI.
- Inspectable at runtime — given a decision, you can produce the exact rule path that produced it.
- Independent of the model — the policy engine has no knowledge that an LLM is on the other side. It sees a principal, an action, a resource, and a context, the same way it would for a REST endpoint.
This last point is the one teams skip. The moment you write rules that reason about "what the model intended", you have built a system you cannot audit. Policy decides on facts: user X is asking to read document Y, here is X's role, here is Y's classification, here are the tenant boundaries. The model's intent is not a fact.
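For concreteness, here is roughly what the decision request can look like from the application side when the engine is an OPA sidecar queried over its data API. The package path, port, and field names are illustrative assumptions; the thing to notice is that nothing in the input document mentions a model or a prompt.

```python
import requests

OPA_URL = "http://localhost:8181/v1/data/tools/authz"  # package path is an assumption


def evaluate(principal: dict, action: str, resource: dict, context: dict) -> dict:
    # The same input shape a REST endpoint would send: facts only, no "intent".
    input_doc = {
        "principal": principal,  # e.g. {"id": "u-981", "roles": ["solicitor"], "tenant": "acme"}
        "action": action,        # e.g. "matters.read"
        "resource": resource,    # e.g. {"type": "matter", "id": "4471", "tenant": "acme"}
        "context": context,      # e.g. {"channel": "chat", "turn_id": "t-17"}
    }
    resp = requests.post(OPA_URL, json={"input": input_doc}, timeout=2)
    resp.raise_for_status()
    # The policy is assumed to return {"allow": bool, "obligations": {...}};
    # an undefined result is treated as a deny.
    return resp.json().get("result", {"allow": False, "obligations": {}})
```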
Obligations: the part most teams miss
Allow/deny is the easy half. The harder half is obligations — the constraints that travel with an allow decision and modify execution. Three obligation types come up constantly in regulated work:
Row filters. A solicitor is allowed to query the matter database, but only for matters they are assigned to. The policy returns allow with an obligation: append WHERE assigned_solicitor_id = $user.id. The tool layer applies that to the SQL before it executes. The model never constructs the filter; the model never sees rows outside the filter.
Field redactions. A bookkeeper can read invoices but not the bank details on them. The obligation is a field mask applied to the result before it returns. If the model tries to summarise bank details, it can't, because they are not in its context.
Rate and volume caps. Bulk export is the classic exfiltration path. An obligation can cap a single tool call to N rows, or cap a session to a cumulative volume. The model can't argue its way past a counter.
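As a sketch of how those three obligation types might be applied at execution time, assuming obligations shaped as shown in the docstring and a database driver that takes %s placeholders; your policy output and query builder will differ.

```python
def apply_obligations(base_query: str, obligations: dict, rows_already_returned: int):
    """Turn policy obligations into concrete query constraints.

    Assumes a base query with no WHERE clause and obligations shaped like:
      {"row_filter": {"column": "assigned_solicitor_id", "value": "u-981"},
       "field_mask": ["bank_account", "sort_code"],
       "session_row_cap": 500}
    """
    params: list = []
    query = base_query

    # Row filter: the WHERE clause is built from policy output, never from model text.
    rf = obligations.get("row_filter")
    if rf:
        query += f" WHERE {rf['column']} = %s"
        params.append(rf["value"])

    # Volume cap: a cumulative session counter the model cannot argue its way past.
    cap = obligations.get("session_row_cap")
    if cap is not None:
        remaining = cap - rows_already_returned
        if remaining <= 0:
            raise PermissionError("session volume cap reached")
        query += " LIMIT %s"
        params.append(remaining)

    return query, params


def shape_result(rows: list[dict], obligations: dict) -> list[dict]:
    # Field redaction: masked fields never enter the model's context at all.
    masked = set(obligations.get("field_mask", []))
    return [{k: v for k, v in r.items() if k not in masked} for r in rows]
```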
Obligations matter because real-world authorisation is rarely binary. It is "yes, but only these rows, with these fields hidden, at this rate". A tool layer that only does allow/deny pushes the rest of the logic back into the application, where it gets implemented inconsistently. Putting obligations in policy means the rule lives in one place and is enforced everywhere the tool is called.
Identity, delegation, and the model-as-confused-deputy
The classic confused deputy problem applies directly here. The model is a deputy: it acts on behalf of a user, with permissions sourced from that user. If you give the model its own credentials, you have created a deputy with its own authority, and any prompt injection becomes a privilege escalation.
The pattern that works: the user's session produces a short-lived, narrowly scoped token at the start of the turn. That token is held by the tool layer, never passed to the model, and never used outside the turn it was minted for. Tool calls execute under that token. When the turn ends, the token is destroyed. If the model is somehow tricked into emitting a tool call with a different user's ID in the arguments, the policy engine catches it because the principal in the decision context is sourced from the token, not from the arguments.
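A sketch of that per-turn token, assuming a simple in-process object rather than any particular token format; the essential properties are the expiry, the binding to one turn, and the fact that the decision principal is always read from the token rather than from the arguments.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class TurnToken:
    """Short-lived credential minted when the user's turn begins.

    Held by the tool layer only; the model never sees it and cannot present one.
    """
    principal_id: str
    tenant_id: str
    turn_id: str
    expires_at: float
    token_id: str = field(default_factory=lambda: secrets.token_hex(16))

    def assert_valid(self, turn_id: str) -> None:
        if time.time() > self.expires_at:
            raise PermissionError("turn token expired")
        if turn_id != self.turn_id:
            raise PermissionError("token used outside the turn it was minted for")


def mint_turn_token(session, turn_id: str, ttl_seconds: int = 120) -> TurnToken:
    # `session` is hypothetical: whatever object carries the authenticated user.
    return TurnToken(
        principal_id=session.principal_id,
        tenant_id=session.tenant_id,
        turn_id=turn_id,
        expires_at=time.time() + ttl_seconds,
    )


def decision_principal(token: TurnToken, args: dict) -> str:
    # The principal always comes from the token. A user ID the model put in the
    # arguments is never trusted; if present and different, fail loudly.
    if "user_id" in args and args["user_id"] != token.principal_id:
        raise PermissionError("argument principal does not match turn token")
    return token.principal_id
```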
For multi-step agents that run across minutes or hours, you need delegation rather than re-use. The user authorises a delegated capability ("you may file documents in matter 4471 on my behalf for the next two hours"), the capability is recorded as a first-class object with its own ID and audit trail, and the agent's tool calls are evaluated against the capability rather than the user's full role. This is more work, but it is the only way to give an auditor a clean answer to "who authorised this action and when".
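A delegated capability might be recorded as something like the following, a first-class object with its own ID against which the agent's tool calls are evaluated. The field names and the string format of the resource scope are illustrative.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DelegatedCapability:
    """'You may file documents in matter 4471 on my behalf for the next two hours.'"""
    granted_by: str               # the user who authorised the delegation
    agent_id: str                 # the agent run acting under it
    allowed_actions: frozenset    # e.g. frozenset({"documents.file"})
    resource_scope: str           # e.g. "matter:4471"
    expires_at: float
    capability_id: str = field(default_factory=lambda: "cap-" + secrets.token_hex(8))


def check_against_capability(cap: DelegatedCapability, action: str, resource: str) -> None:
    # The agent is evaluated against the capability, not against the user's full role.
    if time.time() > cap.expires_at:
        raise PermissionError(f"{cap.capability_id} has expired")
    if action not in cap.allowed_actions or resource != cap.resource_scope:
        raise PermissionError(f"{cap.capability_id} does not cover {action} on {resource}")


cap = DelegatedCapability(
    granted_by="u-981",
    agent_id="agent-run-204",
    allowed_actions=frozenset({"documents.file"}),
    resource_scope="matter:4471",
    expires_at=time.time() + 2 * 3600,
)
# Record `cap` in the audit trail at grant time, then evaluate every tool call
# the agent makes against it.
check_against_capability(cap, "documents.file", "matter:4471")
```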
Logging that an auditor will accept
The log is not a debug aid. It is the artefact a regulator will read. For each tool call, capture: the user principal, the model and version that emitted the call, the tool name, the full arguments after validation, the policy version hash, the decision, the obligations applied, the resource IDs touched, and the result size. Hash-chain the entries so tampering is detectable. Store them somewhere the application cannot retroactively edit.
The reason for the policy version hash is specific: when an auditor asks "why was this allowed in March", you need to be able to reproduce the exact rule set that produced the decision. If your policy lives in a file that gets edited in place, you cannot answer that question. If it lives in git and the commit hash is in every log line, you can.
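A minimal version of the hash chain, assuming JSON entries and SHA-256; the append-only storage and any signing sit outside this sketch and are whatever your platform provides.

```python
import hashlib
import json
import time

GENESIS = "0" * 64


def append_entry(log: list, prev_hash: str, entry: dict) -> str:
    """Append one audit record and return its hash for chaining the next one."""
    record = {
        **entry,            # principal, model version, tool, args, decision,
        "ts": time.time(),  # obligations, resource IDs, result size, policy hash
        "prev": prev_hash,
    }
    # Canonical serialisation so the hash is reproducible at verification time.
    digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    record["hash"] = digest
    log.append(record)
    return digest


def verify_chain(log: list) -> bool:
    prev = GENESIS
    for record in log:
        body = {k: v for k, v in record.items() if k != "hash"}
        if body.get("prev") != prev:
            return False
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != record["hash"]:
            return False
        prev = record["hash"]
    return True


log: list = []
append_entry(log, GENESIS, {
    "principal": "u-981",
    "model": "model-x-2025-01",   # illustrative model identifier
    "tool": "matters.search",
    "args": {"query": "overdue invoices"},
    "policy_version": "3f2c9ab",  # git commit hash of the rule set
    "decision": "allow",
    "obligations": {"row_filter": {"column": "assigned_solicitor_id", "value": "u-981"}},
    "result_size": 12,
})
assert verify_chain(log)
```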
This is the piece that makes the difference between an AI system you can defend and one you can't. The methodology I use for the Intelligence Brain treats the audit log as the product, not a side effect — the point of running on-premise with deterministic authorisation is that you can hand a regulator a complete, signed record of every action the model influenced.
Where this fits in a wider architecture
Tool-layer authorisation is one of three load-bearing pieces in a defensible AI architecture. The other two are data minimisation at retrieval (the model never sees what it doesn't need to see, even within an allowed scope) and execution boundaries (tool calls run in environments that cannot reach beyond their stated effect). The Intelligence Brain is built around all three, because in regulated work no one of them is sufficient on its own. A tool layer with no retrieval discipline still leaks. Retrieval discipline with no execution boundaries still allows lateral movement. Execution boundaries with no authorisation are just sandboxes around the wrong action.
Where to start this week
Pick the single highest-risk tool your model can call — usually the one that reads customer or matter data — and rebuild it under this pattern: short-lived user token, JSON schema validation, OPA or Cedar policy with at least one row-filter obligation, and a hash-chained audit log. Do it for one tool, end-to-end, with tests, before you do it for ten. You will find the gaps in your identity model, your data classification, and your existing RBAC within the first day, and those gaps were already there — the model just made them visible.
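If it helps to have a target for the tests, the first pair of fixtures can be as small as this, assuming the evaluate() wrapper sketched earlier, an OPA instance loaded with the policy under test, and a policy that allows assigned solicitors with a row-filter obligation; the module name and the exact fields asserted are assumptions about your own setup.

```python
# test_matter_read_policy.py: first allow/deny fixtures for the rebuilt tool,
# run in CI alongside the policy itself.
from authz_client import evaluate  # hypothetical module housing the wrapper

MATTER = {"type": "matter", "id": "4471", "tenant": "acme"}


def test_assigned_solicitor_can_read_with_row_filter():
    principal = {"id": "u-981", "roles": ["solicitor"], "tenant": "acme"}
    decision = evaluate(principal, "matters.read", MATTER, {"channel": "chat"})
    assert decision["allow"] is True
    assert "row_filter" in decision["obligations"]


def test_other_tenant_is_denied():
    principal = {"id": "u-333", "roles": ["solicitor"], "tenant": "globex"}
    decision = evaluate(principal, "matters.read", MATTER, {"channel": "chat"})
    assert decision["allow"] is False
```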