◇ ARCHIVE · PAGE 2 / 4 · OLDEST → NEWEST

The field notes archive.

2026.04.07 VOICE

Turn-taking is the hard part of voice agents.

Transcription is largely solved. Turn-taking is not: knowing when the caller has finished, when to stop for an interruption, and when an 'mm-hm' is not a turn. Endpointing, barge-in, and backchannels, measured.

2026.04.10 TRAINING

Decentralized training in 2026: what works, what's still vapor.

A grounded look at distributed pretraining across untrusted GPUs. DiLoCo, DisTrO, INTELLECT-2, Bittensor's Templar, 0G's DiLoCoX — what each actually shipped, and what hasn't.

2026.04.14 AGENTS

Multi-agent systems are usually one agent too many.

Splitting a task across coordinating agents adds context-handoff loss, compounding latency and cost, and a wider failure surface — overhead that usually exceeds the benefit. Start with one agent.

2026.04.17 PRIVACY

PII redaction that does not wreck retrieval.

Stripping PII before documents reach the embedding model is often necessary, but naive redaction destroys the semantic signal retrieval depends on. How to redact and keep that signal.

2026.04.21 STANDARDS

A2A and MCP: two protocols, two jobs.

A2A and MCP get framed as rivals. They are not. MCP connects an agent to its tools; A2A connects agents to each other — different jobs at different layers, and a serious multi-agent system needs both.

2026.04.22 EVAL

You don't have a RAG problem. You have a chunking problem.

Most teams blame the retriever. The retriever is fine. Your chunks don't carry their context — and no amount of reranking saves them.

2026.04.25 SECURITY

Signing-key custody for autonomous agents.

Assume the model gets injected — then ask where the signing key lives. MPC, HSMs, multisig, and session keys, judged on one question: can a fully compromised agent reach the key?

2026.04.29 EVAL

Your golden set is rotting.

A golden evaluation set is not a fixed asset — it decays. The world changes, the product shifts, the team overfits, and the pass rate quietly stops meaning anything. Eval data needs a maintenance protocol.

2026.05.02 VOICE

Graceful failure for voice agents.

A voice call is real-time and unforgiving — there is no spinner to show, and dead air reads as a broken product. When STT, the LLM, or TTS fails mid-call, the system has to degrade, not drop.

2026.05.05 STANDARDS

What ERC-8004 actually means for agent identity.

Agents need to prove who they are to each other without going through a central directory. ERC-8004 is the first standard that ships the three registries needed for that. Here is what it does and what it does not.

2026.05.07 AGENTS

Context engineering beats prompt engineering for long-running agents.

For a long-running agent, the system prompt is a small part of the problem. The real discipline is managing the context window across the whole run as a budget — keep, drop, compact, retrieve.

2026.05.09 TRAINING

RL environments are the new dataset.

Post-training has shifted from supervised fine-tuning on static labeled data toward reinforcement learning, and that moves the unit of data work from a labeled file to an executable environment. Building good environments is the new data engineering — and the scarce input.
