Understand AI from the inside out.
A path from black box to clear mental model. The interpretability research that explains the model, and the system knowledge that lets you build with it safely. No PhD required.
What Is Mechanistic Interpretability?
Features, circuits, superposition, and sparse autoencoders (SAEs) - the field explained from the inside out.
Six levels, in order. Take them one at a time.
- 01
Understand AI systems and black boxes
What a model is, what it isn't, and why “it usually works” is a dangerous standard once systems can act.
- 02
Learn core interpretability concepts
Features, circuits, superposition, and SAEs - the vocabulary for what's inside a model.
- 03
Understand agents, tools, context, and memory
How a model becomes an agent: the action loop, tool calling, retrieval, and the context window.
- 04
Build practical AI workflows
Wire models into real systems - Model Context Protocol (MCP), retrieval-augmented generation (RAG), and dev workflows you can explain and debug.
- 05
Add evals, logging, permissions, and safety
Turn “it seemed to work” into evidence, and “we trust it” into enforced limits.
- 06
Follow the frontier
Keep up through the Signal - what's real, what's hype, what builders can actually use.
Get new explainers as they ship.
Plus the weekly Signal. Free, no noise.