Transformer Circuits - Decomposing Small Language Models

Can we understand what’s going on in large language models by dissecting small ones?

Robots in disguise

This post is available here.

Shea Cardozo
PhD Student in Computer Science

Hi! I’m a PhD student in the Waterloo Intelligent Systems Lab (WISELab) at the University of Waterloo. I am broadly interested in generalization in computer vision and its applications to autonomous systems.