Signing-key custody for autonomous agents.
Assume the model gets injected — then ask where the signing key lives. MPC, HSMs, multisig, and session keys, judged on one question: can a fully compromised agent reach the key?

Assume the model gets injected — then ask where the signing key lives. MPC, HSMs, multisig, and session keys, judged on one question: can a fully compromised agent reach the key?

Here is the architecture most agents that sign transactions actually ship with: a private key sits in an environment variable, the agent process reads it, and somewhere in the code is a function the model can cause to run — sign(payload) — that will sign whatever bytes it is handed. It works. It demos well. It is also the single worst place a key can live, because the prompt-injection class already told you the model will, eventually, be made to propose something hostile. When it does, this architecture signs it. One injection, one drained wallet, no exploit required.
The prompt-injection essay ended on a principle: assume the model is compromised, and build so that the compromise does not matter. This post applies that principle to the one operation where it matters most. Where does the signing key live, and can a fully compromised agent reach it? Everything else in agent key management is a footnote to that question.
Name the default precisely so it can be killed. A “hot key” is raw key material — a plain externally-owned-account private key — present in the agent’s runtime, reachable by the same process that runs the model loop. The model does not literally read the key; it does not need to. It needs only to reach the code path that uses it, and in a hot-key design that path is one tool call away.
Every defense in this post is a way of moving the key, the signing decision, or both out of that process — somewhere an injected model cannot follow. The key must live where the model’s process cannot directly reach it. That is the bar. Architectures clear it or they do not.
“Use a better wallet” is not a plan. The real question has two independent halves — where does the key material live, and who decides what gets signed — and the common custody models answer the first half differently:
| Model | Key material | Compromised agent can extract key? | Note |
|---|---|---|---|
| Hot EOA key | In the agent process | Yes | The default. Indefensible for anything funded. |
| HSM | In tamper-resistant hardware | No | Strong isolation; signs what it is asked. |
| Multisig | N separate keys, M required on-chain | Only its own share | A co-signer can veto; finality is on-chain. |
| MPC / TSS | Secret-shared; full key never assembled | No | Self-hosted keeps shares in your infra. |
| Session key | Scoped sub-key; root key stays offline | Only the session key’s authority | Least-privilege for signing; expires. |
HSMs put the key in tamper-resistant hardware: the agent calls a signing API and never sees key bytes. Multisig — a Safe is the reference — spreads signing across N keys and requires M, so a second signer (a human, a policy service) can refuse. MPC and threshold signatures secret-share the key so the complete key is never assembled anywhere; a compromised agent runtime has nothing to steal, and self-hosted MPC keeps every share inside infrastructure you control. One caveat worth knowing: many MPC products retain one share on their own servers for liveness and policy, which means their infrastructure co-signs every transaction — convenient, and a third-party dependency you should name rather than discover.
All four clear the bar from the last section: a compromised agent cannot walk off with the key. Which is necessary, and — read the table’s third column against its fourth — not the end of the story.
Here is the trap that catches teams who buy a custody product and call it done. Look at what the table answers: can the agent extract the key. Look at what it does not answer: can the agent get a bad transaction signed. An HSM signs what it is asked. An MPC service computes a signature over whatever payload the protocol receives. Move from a hot key to an HSM and the model can no longer steal the key — but if the model can still submit any payload and the signer signs it, the injected agent drains the wallet anyway. It just does so without ever touching key bytes.
Key isolation and signing control are two different controls. Custody — HSM, MPC, multisig — answers the first. The second is the privilege boundary from the prompt-injection essay: a deterministic policy engine between the model’s proposal and the signer, validating every intent against rules the model cannot see or edit. The model proposes a structured intent; policy checks it; only a passing intent reaches the signer. We drew that full model-to-signer path in the wallet-audit essay; custody is the lower half of that diagram done properly, and it is half. Ship both halves or you have moved the vulnerability, not closed it.
The most agent-shaped idea in this space comes from account abstraction. Under ERC-4337, an account is a smart contract, and a smart-contract account can issue session keys — subordinate keys whose authority is constrained on-chain. A session key can be scoped to specific contracts, given a spending cap, and set to expire: the canonical example is a game session key bounded to one game’s contracts with a 0.05 ETH ceiling and a four-hour life.
Turn that on an autonomous agent and it becomes least privilege for signing. The root key is generated, used once to authorize a session key, and goes back to cold storage — it never goes near the model. The agent holds only a session key scoped to exactly the contracts its task touches, capped at the value its task needs, and expiring on a horizon you choose. Now reason about a successful injection: the blast radius is not “the wallet,” it is “whatever this session key was scoped to” — and that scope lapses on its own. Session keys do not replace the policy engine; they shrink the authority the policy engine is defending, and a smaller authority is a smaller thing to get wrong. Pair them with the tiered ceilings from the spend-rails essay and the agent’s worst possible day is bounded before policy even runs.
Keys are not set-and-forget, and an agent makes that sharper, because the agent is the thing most likely to be compromised. Three operations have to exist on day one:
A custody design that cannot rotate, revoke, and recover is not a custody design. It is a key with good intentions.
Assemble the parts. The model proposes a structured intent and holds no key material. A deterministic policy engine validates the intent — destination, value, contract, rate — against rules outside the model’s reach. Only a passing intent reaches the signer, and the signer holds, at most, a session key scoped to the task; the root key is in MPC or an HSM or cold storage, never in the agent’s process. Revocation can cut the agent’s authority in seconds. A hostile MCP server (red-team it before you trust it) or a poisoned document can still inject the model — and the worst it achieves is a proposal that policy rejects, signed by nothing.
Before an autonomous agent signs anything funded:
The prompt-injection essay said to assume the model is compromised. Custody is where you find out whether you meant it. Put the key where a compromised model cannot follow — and put a policy engine between the model and the signer, because isolating the key was only ever half the job.

A one-word change to a system prompt can move accuracy by dozens of points, and a provider's model update can regress your app overnight. A prompt or model swap is a deploy. Give it a staged rollout and a one-action rollback path.
11 min →
The monthly inference bill arrives as one number, and nobody can say which agent, which customer, or which tool spent it. Agent cost is too variable to estimate and has to be attributed after the fact — per run, per tool, per tenant. The layer most stacks skip.
11 min →
An agent that asks permission for everything trains its reviewers to rubber-stamp, and the one dangerous action slips through in the noise. Approval gates belong on consequence and on uncertainty — not on every step. Where to put them.
12 min →