I work on AI safety — trying to make sure powerful AI systems don't go badly wrong. I'm interested in alignment, interpretability, and what it actually takes to build systems we can trust.
This is where I keep things: work I've done, ideas I'm chewing on, writing I've found worth holding onto, and a record of how I got here.