
Mar 9, 2024

Paper page — AtP*: An efficient and scalable method for localizing LLM behaviour to components

Google presents AtP*

An efficient and scalable method for localizing LLM behaviour to components.

Activation Patching is a method of directly computing causal attributions of behavior to model components.
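
To make the idea concrete, here is a minimal sketch of activation patching on a toy two-layer ReLU network. It is not the paper's implementation. The network, its weights, the choice of hidden units as "components", and the helper names `forward` and `activation_patching` are all assumptions made for illustration. Each hidden unit's effect is measured by running the clean input, overwriting that one unit with its value from a corrupted input, and recording how the output changes.

```python
def forward(x, W1, w2, patch=None):
    """Run a toy 2-layer ReLU network (hypothetical, for illustration only).

    patch: optional dict {hidden_unit_index: value} forcing those activations.
    Returns (scalar output, list of hidden activations after patching).
    """
    h = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    for idx, val in (patch or {}).items():
        h[idx] = val  # overwrite this component's activation
    y = sum(w * hi for w, hi in zip(w2, h))
    return y, h


def activation_patching(x_clean, x_corrupt, W1, w2):
    """Per-unit causal effect: patch one corrupted activation into the clean run."""
    y_clean, _ = forward(x_clean, W1, w2)
    _, h_corrupt = forward(x_corrupt, W1, w2)
    effects = []
    for i, h_i in enumerate(h_corrupt):
        # One extra forward pass per component: this is the exhaustive sweep.
        y_patched, _ = forward(x_clean, W1, w2, patch={i: h_i})
        effects.append(y_patched - y_clean)
    return effects


# Assumed toy weights and inputs (not taken from the paper).
W1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 1.0]]
w2 = [2.0, -1.0, 0.5]
x_clean = [1.0, 0.0]
x_corrupt = [0.0, 1.0]

effects = activation_patching(x_clean, x_corrupt, W1, w2)
print(effects)  # [-2.0, -1.5, 0.5]
```

Note that the sweep needs one forward pass per component, so its cost grows linearly with the number of components patched.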

