Trust, but Trace

A journal on what it takes to make ML systems trustworthy in production — traces, fixtures, contracts, replays. Written from the engineering side.

“Most of what an LLM agent does in production has never appeared in any evaluation set.”

Topics

Recurring threads

Six threads that keep coming back.

On the desk

Reading list

What’s on the desk this season.

  • 01
    AI Engineering: Building Applications with Foundation Models
    Chip Huyen · 2025
    Book
  • 02
    Latent Space — Artificial Analysis on independent LLM evals
    swyx & Micah Hill-Smith · Jan 2026
    Podcast
  • 03
    Multimodal, Real-Time AI Agent Systems
    Heiko Hotz & Sokratis Kartakis · 2027
    Book
  • 04
    EU AI Act — Article 15: Accuracy, Robustness, Cybersecurity
    European Commission · 2024
    Reference
  • 05
    Hands-On Large Language Models
    Jay Alammar & Maarten Grootendorst · 2024
    Book
  • 06
    Constitutional Classifiers
    Anthropic · 2025
    Paper
  • 07
    Pragmatic Engineer — Inside engineering at frontier labs
    Gergely Orosz · 2026
    Podcast