1 series · 2 pieces
Series
Multi-part essays — read in order, or jump to any part. Each series has a thread holding it together; the parts compound.
Reliability series
A multi-part argument. Reads best in order; each part references the last.
2 of 3 published
Part 1/3
Engineering
Deterministic orchestration around non-deterministic models
How to build agent workflows you can replay, diff, and certify — when the underlying LLM call is none of those things.
Part 2/3
Engineering
End-to-end evals for agentic systems
Unit tests and benchmarks miss the failures that actually break agents. A pattern for evaluating the system as a whole.
Part 3 · drafting
Coming up
Title under embargo. Subscribe to get it the day it ships.