Trust, but Trace

A journal on what it takes to make ML systems trustworthy in production — traces, fixtures, contracts, replays. Written from the engineering side.

“Most of what an LLM agent does in production has never appeared in any evaluation set.”

Topics

Recurring threads

Six threads that keep coming back.

On the desk

What’s on the desk this season.

01 · AI Engineering: Building Applications with Foundation Models · Chip Huyen · 2025 · Book
02 · Latent Space — Artificial Analysis on independent LLM evals · swyx & Micah Hill-Smith · Jan 2026 · Podcast
03 · Multimodal, Real-Time AI Agent Systems · Heiko Hotz & Sokratis Kartakis · 2027 · Book
04 · EU AI Act — Article 15: Accuracy, Robustness, Cybersecurity · European Commission · 2024 · Reference
05 · Hands-On Large Language Models · Jay Alammar & Maarten Grootendorst · 2024 · Book
06 · Constitutional Classifiers · Anthropic · 2025 · Paper
07 · Pragmatic Engineer — Inside engineering at frontier labs · Gergely Orosz · 2026 · Podcast