
OpenInterpretability
Developer Tools
Open-source toolkit to audit what your LLM knows
About
The first mechanistic interpretability toolkit to run directly inside Claude Code, Cursor, and Cline via MCP. Its production probes, such as FabricationGuard and agent-probe-guard, detect hallucinations and agent failures. The suite also includes the ProbeBench leaderboard and SAE (sparse autoencoder) training that scales from a free 30-minute Colab notebook to research-grade quality. Released under the Apache-2.0 license.
Launched
May 14, 2026
Week 10
Builder
Builder