01 / SERVICE
RAG that holds up under eval
Retrieval-augmented generation with the eval harness built in. We pick the chunker, the embedding model, and the retriever, using contextual chunking and late-interaction retrieval where they earn their place, and benchmark every change against a golden set you'll keep using after we leave. The same harness also ships on its own, for teams whose model is already in production with no way to tell when it degrades.
- ── Hybrid + late-interaction retrieval, agentic re-query
- ── Contextual & late chunking per document class
- ── ColPali visual retrieval for layout-heavy docs
- ── Faithfulness + groundedness evals, CI-gated
pgvector · bge-m3 · cohere-rerank
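As an illustration of what "CI-gated" can mean in practice, here is a minimal sketch that compares a candidate run's scores on the golden set against a stored baseline and fails the build on regression. The function name `ci_gate`, the metric names, and the 0.02 tolerance are assumptions made for this example, not a description of any particular harness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateResult:
    passed: bool
    failures: list


def ci_gate(baseline: dict, candidate: dict, max_drop: float = 0.02) -> GateResult:
    """Fail when any baseline metric is missing or drops by more than max_drop."""
    failures = []
    for metric, base in sorted(baseline.items()):
        score = candidate.get(metric)
        if score is None:
            failures.append(f"{metric}: missing from candidate run")
        elif base - score > max_drop:
            failures.append(f"{metric}: {base:.3f} -> {score:.3f}")
    return GateResult(passed=not failures, failures=failures)


# Hypothetical scores; in CI a failed gate would exit non-zero.
result = ci_gate(
    baseline={"faithfulness": 0.90, "groundedness": 0.88},
    candidate={"faithfulness": 0.91, "groundedness": 0.80},
)
```

A gate like this only means something if the golden set stays fixed between the baseline and candidate runs, which is why the set itself is part of the handoff.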
02 / SERVICE
Agentic harnesses
Multi-step agents with tool use, tiered memory, and budgets that don't blow up production costs. Built on MCP, gated by evals, traced end-to-end — and that observability and eval layer retrofits onto an agent you already run. Optional x402 / ERC-8004 rails when the agent has to spend money or prove who it is.
- ── Tool orchestration over MCP servers
- ── Budgets, replanning, structured failure contracts
- ── Tiered memory — working, episodic, semantic
- ── Trace-level observability + per-step evals
LangGraph · MCP · Temporal · Langfuse
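One way to picture a budget paired with a structured failure contract is the sketch below: each agent step is charged against step and cost limits, and an overrun raises an exception that says which limit tripped and by how much. `RunBudget`, `BudgetExceeded`, and the limits shown are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass


class BudgetExceeded(Exception):
    """Structured failure: names the limit that tripped and the attempted usage."""

    def __init__(self, limit: str, allowed: float, used: float) -> None:
        super().__init__(f"{limit} budget exceeded: {used} > {allowed}")
        self.limit = limit
        self.allowed = allowed
        self.used = used


@dataclass
class RunBudget:
    max_steps: int
    max_cost_usd: float
    steps: int = 0
    cost_usd: float = 0.0

    def charge(self, cost_usd: float) -> None:
        """Record one step; raise before the run would overshoot either limit."""
        if self.steps + 1 > self.max_steps:
            raise BudgetExceeded("steps", self.max_steps, self.steps + 1)
        if self.cost_usd + cost_usd > self.max_cost_usd:
            raise BudgetExceeded("cost_usd", self.max_cost_usd, self.cost_usd + cost_usd)
        self.steps += 1
        self.cost_usd += cost_usd


budget = RunBudget(max_steps=2, max_cost_usd=1.0)
budget.charge(0.25)
budget.charge(0.25)
```

Because the failure carries fields rather than only a message, a caller can decide whether to replan, escalate, or stop without parsing strings.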
03 / SERVICE
Voice agents
Streaming voice systems for phone trees, kiosks, and apps, architected to a sub-300ms p95 budget. STT → LLM → TTS with model-based barge-in, drift detection, and PII-safe transcripts. Every component gets its own millisecond budget.
- ── Sub-300ms p95 latency budget
- ── Model-based barge-in + back-channeling
- ── Call recording + drift detection
- ── PII-safe transcripts
LiveKit · Deepgram · ElevenLabs
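To make the p95 budget concrete, the sketch below sums per-stage latencies for each conversational turn and checks the nearest-rank 95th percentile of those totals against 300 ms. The stage names and sample timings are invented for the example, and nearest-rank is only one of several percentile conventions.

```python
def p95(samples_ms: list) -> float:
    """Nearest-rank 95th percentile (integer arithmetic avoids float rounding)."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    rank = (95 * len(ordered) + 99) // 100  # ceil(0.95 * n)
    return ordered[rank - 1]


def within_budget(turns: list, budget_ms: float = 300.0) -> bool:
    """Each turn maps stage name -> latency in ms; p95 of turn totals must be under budget."""
    totals = [sum(turn.values()) for turn in turns]
    return p95(totals) < budget_ms


# Hypothetical measurements: 19 normal turns and one slow outlier.
normal = {"stt": 80.0, "llm": 120.0, "tts": 60.0}
slow = {"stt": 90.0, "llm": 250.0, "tts": 60.0}
turns = [normal] * 19 + [slow]
```

With one slow turn in twenty the p95 still lands on a normal turn; a second slow turn would push it over, which is the behavior a tail-latency budget is meant to catch.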
04 / SERVICE
Post-training & grounding
Fine-tuning and post-training on your data and your task — for when prompting and retrieval have hit their ceiling. SFT, preference tuning, and eval-gated checkpoint selection. The training recipe is handed off, so you can reproduce every result after we leave.
- ── SFT and preference tuning on your data
- ── Eval-gated checkpoint selection
- ── Distillation for latency and cost
- ── A reproducible training recipe at handoff
TRL · vLLM · Modal
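Eval-gated checkpoint selection can be sketched as a two-stage filter: discard any checkpoint that misses a floor on a guard metric, then rank the survivors by the task metric. The checkpoint names, metric names, and thresholds below are assumptions for illustration only.

```python
from typing import Optional


def select_checkpoint(evals: dict, floors: dict, rank_by: str) -> Optional[str]:
    """Return the best checkpoint by rank_by among those meeting every floor, else None."""
    lowest = float("-inf")
    eligible = {
        name: scores
        for name, scores in evals.items()
        if all(scores.get(metric, lowest) >= floor for metric, floor in floors.items())
    }
    if not eligible:
        return None
    return max(eligible, key=lambda name: eligible[name].get(rank_by, lowest))


# Hypothetical eval results per checkpoint.
evals = {
    "step-1000": {"task_acc": 0.81, "safety": 0.99},
    "step-2000": {"task_acc": 0.86, "safety": 0.93},
    "step-3000": {"task_acc": 0.84, "safety": 0.98},
}
chosen = select_checkpoint(evals, floors={"safety": 0.97}, rank_by="task_acc")
```

Here the checkpoint with the highest task score is rejected because it falls below the guard floor, so the selection goes to the next best one that passes.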
05 / SERVICE
Verifiable inference (zkML / opML)
Cryptographic proof that a model produced a specific output from a specific input — without revealing the weights or the data. zkML (EZKL) for small, fixed models; optimistic-ML and TEE attestation when the model is too big to prove outright. Built for auditable risk models, oracle feeds, and prediction-market settlement.
- ── EZKL Halo2 proofs for small, fixed models
- ── opML + hardware-attested TEE (H100/H200) for production-scale models
- ── RISC Zero zkVM for general execution proofs
- ── On-chain verifier contracts + governance hooks
EZKL · RISC Zero · Ora · Phala
06 / SERVICE
Decentralized training & sovereign compute
DiLoCo-style distributed pretraining — a technique that now genuinely reaches 40–72B parameters over the open internet — plus Bittensor subnet design and GPU-market cost modeling. We build the validator policy, the emissions curve, and the verification layer so untrusted workers can still produce trusted gradients.
- ── Bittensor subnet design + validator policy
- ── DiLoCo / DisTrO communication-efficient SGD
- ── TOPLOC-style verification of rollouts
- ── GPU market routing (Akash, io.net, Crusoe)
Bittensor · PRIME-RL · Psyche · Akash
07 / SERVICE
On-chain agents & autonomous economics
Agents that hold wallets, sign transactions, settle services with stablecoins, and prove who they are. We build agents for Polymarket, Base, and custom appchains — with hard refusal-on-edge and PnL ceilings.
- ── ERC-8004 identity + reputation registries
- ── x402 micropayments + AP2 mandates for tool calls
- ── ElizaOS / Olas / Virtuals composition
- ── Tiered spend ceilings, treasury isolation, circuit breakers
ElizaOS · ERC-8004 · x402 · Olas
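Tiered spend ceilings with a circuit breaker can be illustrated off-chain with the sketch below: a per-transaction ceiling, a daily ceiling, and a breaker that trips after repeated denials and then blocks everything until reset. `SpendGuard`, its fields, and the amounts are hypothetical; a real deployment would enforce the same limits in the signing path or on-chain as well.

```python
from dataclasses import dataclass


@dataclass
class SpendGuard:
    per_tx_ceiling: float              # tier 1: largest single transaction
    daily_ceiling: float               # tier 2: total spend per day
    max_consecutive_denials: int = 3   # denials in a row before the breaker trips
    spent_today: float = 0.0
    denials: int = 0
    tripped: bool = False

    def authorize(self, amount: float) -> bool:
        """Approve a spend only if the breaker has not tripped and both ceilings hold."""
        if self.tripped:
            return False
        fits_tx = 0 < amount <= self.per_tx_ceiling
        fits_day = self.spent_today + amount <= self.daily_ceiling
        if fits_tx and fits_day:
            self.spent_today += amount
            self.denials = 0
            return True
        self.denials += 1
        if self.denials >= self.max_consecutive_denials:
            self.tripped = True  # blocks all spending until an operator resets it
        return False


guard = SpendGuard(per_tx_ceiling=5.0, daily_ceiling=8.0, max_consecutive_denials=2)
decisions = [guard.authorize(a) for a in (4.0, 6.0, 4.5, 1.0)]
```

In this run the first spend is approved, the next two are denied (one per ceiling), and the breaker then rejects a small spend that would otherwise have fit.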
08 / SERVICE
AI-agent security & audits
Security audits for agents that hold wallets and sign transactions. We red-team the prompt-injection-to-transaction attack surface that smart-contract auditors don't cover — because the contract is fine; the agent is the hole.
- ── Prompt-injection → transaction red-teaming
- ── Spend-limit and refusal-boundary review
- ── Signing-key isolation + MCP allowlist audit
- ── ERC-8004 identity hygiene
Foundry · ERC-8004 · custom injection suites