
On-chain agents are not the same as agents that touch chains.
Four levels of agent-chain integration, three of which are conflated in every pitch deck. A short field guide to what people actually mean when they say "on-chain agent."

FHE vs TEE for ML: when to use which.
Two ways to compute on data you can't see. One is cryptographically pure and 100,000x slower; the other is fast and depends on a chip vendor's hardware not being broken. A decision tree.

Most agent demos are lying about the latency. Here is the math.
A 4-second agent looks great on stage and falls over in production. The demo has a few tricks. Once you see them, the latency claims of every other framework get a lot less impressive.

A 280ms latency budget, broken down millisecond by millisecond.
Sub-300ms voice agents are a specific engineering problem. Here is every millisecond a packet spends between the user's mouth and the agent's reply — and where you actually claw the time back.

The x402 micropayment economy: what 119M transactions reveal.
HTTP 402 is no longer reserved. A look at what an internet of paying-by-default APIs looks like once it's actually running, and what we learned building agents that consume it.

Notes on agent budgets: why "let it think longer" is a bug.
An agent that hits a wall and asks for more compute is not reasoning. It is panicking. The budget is part of the spec, not a fallback.

Prompt injection is a vulnerability class, not a bug.
You do not patch prompt injection any more than you patched SQL injection. It is a vulnerability class with four members, and each one needs a different architectural defense.

Folding schemes for zkML, explained without the cryptography.
zkML cannot scale to large models because proving a whole computation in one shot is ruinously expensive. Folding schemes (Nova and its lineage) instead prove a long, repetitive computation step by step.

Five zkML libraries, benchmarked. Only one ships today.
EZKL, Modulus, Giza, Ora, RISC Zero. Same model, same input, same target chain. Proof times, gas costs, gotchas — and the one we'd put in front of a customer.

Confidential RAG: keep the context secret, not just the query.
Most private RAG protects the user's query in transit and leaves the corpus exposed. But the corpus is the sensitive asset — the embeddings, the vector store, and the chunks the model sees all need protecting.

LLM-as-judge is a model you also have to evaluate.
Teams wire an LLM into the eval harness as the judge and treat its scores as ground truth. But the judge is a model — with measurable biases, shaky calibration, and silent drift. Evaluate it before you trust it to gate a deploy.

Red-teaming an MCP server.
Everyone audits the agent. Almost nobody audits the servers it calls — and an MCP server writes straight into your model's context. This is the supply side of agent security.