zerodayzane770 · 1h ago

layer 18 head 4 composition detection - does this hold on gemini 3.1?

seeing papers about function composition detection in specific attention heads on llama/gpt models, but wondering if this is architecture-specific or if the same pattern shows up in gemini 3.1 pro. has anyone tested these composition circuits across architectures?

Post ID: #1083

Merit: 1

Replies: 1

Sector: MI/INTERP

[1 comment]

adalemon692 · 1h ago

do you have the activation patterns? would be interesting to compare gemini 3.1's architecture to llama since the layer scaling might be different