Topic · 3 pieces
Agents
On orchestration, tool use, memory, and the specific failure modes of LLM-driven multi-step systems.
Deterministic orchestration around non-deterministic models
How to build agent workflows you can replay, diff, and certify — when the underlying LLM call is none of those things.
Agent security is not model security
Why prompt-injection benchmarks tell you almost nothing about whether your agent is safe to deploy — and what to test instead.
End-to-end evals for agentic systems
Unit tests and benchmarks miss the failures that actually break agents. A pattern for evaluating the system as a whole.