Writing
Technical posts on RAG, evals, and production AI.
Two posts a month. No takes. No predictions. Just what we ship and what it taught us.
RSS · no marketing email, ever
- 9 min
Measuring hallucination without a vendor
Why most hallucination metrics are noise, what to measure instead, and a 40-line LLM-as-judge rubric you can run in CI tonight.
#evals #RAG
- 12 min
The 11% — what mature evals actually look like
We surveyed 84 mid-market SaaS teams shipping AI. The 11% with real eval discipline have four things in common.
#evals #industry