On-chain agents are not the same as agents that touch chains.
Four levels of agent-chain integration, three of which are conflated in every pitch deck. A short field guide to what people actually mean when they say "on-chain agent."

A founder pitched me an “on-chain AI agent” last month. I asked what made it on-chain. He said it had a wallet. I asked if the agent ran on-chain. He said it ran on a server he controlled. I asked if its prompts and tools were on-chain. He said no, those were stored in his Postgres.
That agent is not on-chain. That agent has a wallet. There is a difference, and the difference matters when you’re underwriting risk, valuing assets, or trying to figure out whether a system actually does what it claims.
This is a short field guide. Four levels of chain integration, in order of increasing on-chain-ness. The names are mine but the distinctions are real and load-bearing for anyone evaluating these systems.
At Level 1, the agent runs entirely off-chain. The operator holds the agent’s private keys. When the agent decides to spend money, it tells the operator’s backend, which signs a transaction on the agent’s behalf and broadcasts it.
Examples: most “AI trading agents” you’ve heard of from non-crypto-native operators. The Telegram-bot trading agents that exploded in 2024. The first generation of Virtuals Protocol agents before their custody migration.
What’s true about this level:
Level 1 agents are most of the agents people call “on-chain AI” today. They are agents that touch chains. They are not on-chain in any meaningful technical sense.
At Level 2, the agent’s keys are held by the agent process itself (or by a TEE that the agent process runs in). The operator does not have the private key. The agent signs its own transactions and broadcasts them directly to a chain RPC.
This is a real upgrade over Level 1. The operator no longer has unilateral authority to sign for the agent. If the operator’s server is compromised, the attacker has to also extract the agent’s signing key — which lives in a TEE, an HSM, or at minimum a separate process with restricted access.
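The custody difference between Level 1 and Level 2 can be sketched in a few lines. This is a toy model under loud assumptions: the class names are hypothetical, HMAC-SHA256 stands in for a real chain signature (real wallets use asymmetric schemes such as ECDSA), and a Python attribute is not real isolation. In practice the Level 2 boundary comes from a TEE, an HSM, or a separate process.

```python
import hashlib
import hmac
import secrets


def sign(key: bytes, tx: str) -> str:
    """Stand-in signature: HMAC-SHA256 over the transaction text (assumption)."""
    return hmac.new(key, tx.encode(), hashlib.sha256).hexdigest()


class Level1Operator:
    """Level 1: the operator's backend holds the agent's key."""

    def __init__(self) -> None:
        # The key lives in the operator's backend, not with the agent.
        self.agent_key = secrets.token_bytes(32)

    def sign_for_agent(self, tx: str) -> str:
        # The operator can sign anything, whether or not the agent asked for it.
        return sign(self.agent_key, tx)


class Level2Agent:
    """Level 2: the key is created and kept inside the agent process (or TEE)."""

    def __init__(self) -> None:
        # Name-mangled attribute as a stand-in for an enclave boundary.
        self.__key = secrets.token_bytes(32)

    def decide_and_sign(self, tx: str) -> str:
        # Only the agent's own code path reaches the key.
        return sign(self.__key, tx)

    def verify(self, tx: str, signature: str) -> bool:
        # Symmetric check for the toy model; a chain would verify a public-key signature.
        return hmac.compare_digest(sign(self.__key, tx), signature)
```

In the Level 1 model, anyone who compromises the operator's backend gets `agent_key` for free. In the Level 2 model, the operator's code has no key to leak; an attacker has to break whatever boundary the signing key actually lives behind.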
Examples: ElizaOS agents running in their default configuration. Agents using Coinbase AgentKit’s smart wallet integration. Most Phala-hosted agents.
What’s true about Level 2:
Level 2 is what most people think they’re getting when they hire an “on-chain agent.” It’s a legitimate architecture for many use cases. It is not, in the strict sense, on-chain.
At Level 3, the agent’s keys are held by a TEE that has been attested by an on-chain registry. The agent’s code is committed on-chain (or in a content-addressable storage layer like IPFS, with the hash on-chain). The agent’s triggers (the conditions under which it acts) are also published on-chain. The agent runs on a network of TEE-equipped workers, any of which can take over if the current one fails.
This is the architecture that systems like Olas (Autonolas) ship at production scale. The agent is not running inside a smart contract — TEE compute is still off-chain — but every meaningful property of the agent (its identity, code, behavior policy, accountability) is anchored to the chain. Replace one TEE worker with another and the agent continues operating uninterrupted.
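A minimal sketch of the two checks that make worker replacement safe at this level: the worker must be attested in the registry, and the code it runs must hash to the committed value. Everything here is hypothetical. `Registry` is a plain Python object standing in for an on-chain registry, the worker IDs are made up, and this is not the Olas or Phala API.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Registry:
    """Stand-in for an on-chain registry (hypothetical; not a real contract)."""
    code_hash: str
    attested_workers: set = field(default_factory=set)


@dataclass
class Worker:
    worker_id: str
    code: bytes
    healthy: bool = True


def code_digest(code: bytes) -> str:
    """SHA-256 of the agent's code, as it would be committed on-chain."""
    return hashlib.sha256(code).hexdigest()


def eligible(worker: Worker, registry: Registry) -> bool:
    """A worker may run the agent only if it is healthy, attested, and runs the committed code."""
    return (
        worker.healthy
        and worker.worker_id in registry.attested_workers
        and code_digest(worker.code) == registry.code_hash
    )


def pick_worker(workers: list, registry: Registry) -> Optional[Worker]:
    """Failover: return the first eligible worker, or None if there is none."""
    for worker in workers:
        if eligible(worker, registry):
            return worker
    return None


AGENT_CODE = b"def act(): ..."
registry = Registry(code_hash=code_digest(AGENT_CODE), attested_workers={"w1", "w2"})
workers = [
    Worker("w1", AGENT_CODE, healthy=False),  # attested, but down
    Worker("w3", AGENT_CODE),                 # healthy, but never attested
    Worker("w2", AGENT_CODE),                 # healthy and attested
]
chosen = pick_worker(workers, registry)
```

With these inputs `chosen` is worker `w2`: `w1` is down and `w3` is not in the registry. A worker running modified code fails the hash comparison even if its ID is attested.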
Examples: Olas agents (their entire framework is built around this). Advanced ElizaOS deployments with Phala TEE + on-chain code hashes. Market-making agents on Polymarket built to this spec sit at this level.
What’s true about Level 3:
Level 3 is where “on-chain agent” starts to be a meaningful claim. It’s also where the engineering complexity sharply increases.
At Level 4, the agent’s compute runs as a smart contract. There is no off-chain server. The agent’s “thinking” is contract bytecode; its “memory” is contract storage; its triggers are events on the chain; its actions are calls from its own contract address.
Real, useful, fully on-chain agents are rare in 2026. Why? Because LLM inference is too expensive to evaluate on-chain. You cannot run a Claude call on-chain directly, and even if you could, verifying a single call would cost tens of millions of gas.
What does exist at Level 4: agents with rule-based decision logic (no LLM). These are common — most DeFi “agents” (Yearn vaults, Curve gauges, Convex strategies) are fully on-chain rule executors that nobody calls AI agents but that have all the architectural properties of one. They live on-chain, take inputs from on-chain oracles, and act on-chain.
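To make "rule-based decision logic" concrete, here is a toy model of such an agent as a deterministic state transition. It is written in Python for readability; a real one would be contract bytecode. The vault, the 300 bps threshold, and the action names are all invented for illustration and do not describe how Yearn, Curve, or Convex actually work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VaultState:
    """The agent's 'memory': what contract storage would hold."""
    in_strategy: int  # funds deployed, in smallest token units
    idle: int         # funds sitting in the vault


def step(state: VaultState, oracle_apy_bps: int, min_apy_bps: int = 300):
    """One 'decision': a pure function of stored state and an oracle reading.

    Integer-only arithmetic, as a contract would use. Returns (new_state, action).
    """
    if oracle_apy_bps >= min_apy_bps and state.idle > 0:
        # Yield is good enough: deploy everything that is idle.
        return VaultState(state.in_strategy + state.idle, 0), "deposit"
    if oracle_apy_bps < min_apy_bps and state.in_strategy > 0:
        # Yield dropped below the floor: pull everything back.
        return VaultState(0, state.idle + state.in_strategy), "withdraw"
    return state, "hold"
```

The same state and the same oracle reading always produce the same action, which is what lets every node re-execute and agree on it. That determinism is the property an LLM call does not give you for free.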
The gap between “fully on-chain rule-based agent” and “fully on-chain LLM-driven agent” is closing. Two trends:
Both are real, both are shipping, neither is widely adopted yet. They turn off-chain compute into a credibly-on-chain-verified output. Combined with a Level 3 architecture, you get something approaching fully on-chain in the senses that matter for trust, while still allowing the actual model inference to happen at reasonable cost.
When teams ask us for “an on-chain agent,” they almost never want Level 4. What they actually want is some combination of:
The first one is cheap. The fourth one is expensive. Most clients can live somewhere in the middle, and the right architecture depends on the use case.
A high-stakes trading agent that holds millions of dollars: Level 3, with zkML proofs on the decision boundary. The operational expense is justified because the trust requirement is high.
A research assistant or customer-support agent for a startup: Level 1 or 2. Nobody is going to audit the decision-making, and the trust requirement is “the company stands behind the agent.”
A DAO that automates treasury management: Level 3, full stop. The agent’s code and policy have to be public, because that’s the point of the DAO. The compute can be off-chain (TEE worker) but the policy must be on-chain.
The pitch deck says “on-chain AI agent.” The marketing site says “fully autonomous.” The investor memo describes a “trust-minimized” system.
What you should ask, in order:
Most “on-chain AI” pitches answer (1) acceptably and degrade rapidly from (2) through (5). That’s not a fatal flaw: Level 2 and Level 3 systems are useful and valuable. But you should know what level you’re getting.
The phrase “on-chain agent” got popular because it sounds technically substantive. Treat it the same way you’d treat “AI-powered.” If someone says it, ask which level. If they can’t answer cleanly, they probably haven’t thought through the architecture; that’s information.
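If it helps to have the four levels in one place, the defining properties reduce to a small classifier. This is a sketch of the taxonomy as described in this guide, not a standard. The field names and the `"operator"` / `"agent"` / `"attested_tee"` labels are my shorthand, and real systems will have edge cases this ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentFacts:
    key_holder: str           # "operator", "agent", or "attested_tee"
    code_hash_on_chain: bool  # code (or its hash) committed on-chain
    triggers_on_chain: bool   # trigger conditions published on-chain
    compute_on_chain: bool    # decision logic runs as a smart contract


def classify(facts: AgentFacts) -> int:
    """Map an agent's properties to Level 1 through 4 as defined in this guide."""
    if facts.compute_on_chain:
        return 4
    if (
        facts.key_holder == "attested_tee"
        and facts.code_hash_on_chain
        and facts.triggers_on_chain
    ):
        return 3
    if facts.key_holder in ("agent", "attested_tee"):
        return 2
    # Operator-held keys: an agent that touches chains.
    return 1


# The founder's pitch from the opening, assuming his backend held the wallet key.
founder_pitch = AgentFacts(
    key_holder="operator",
    code_hash_on_chain=False,
    triggers_on_chain=False,
    compute_on_chain=False,
)
```

Here `classify(founder_pitch)` returns 1. An attested TEE with no on-chain code hash still lands at Level 2, which matches the point that Level 3 requires the code and triggers to be anchored, not only the keys.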
The agents I’d put real money behind, today, are Level 3 with at least optional zkML on the high-value decision boundaries. That’s a small set. It’s a real set. Build for it.
