Activation Hooks¶
Extract activations from model layers during inference.
ActivationHook¶
Hook for capturing activations from specific model layers.
Works with HuggingFace transformers models (GPT-2, Mistral, Llama, etc).
Example
```python
hook = ActivationHook(model, layer_indices=[10, 15, 20])
with hook:
    outputs = model(**inputs)
act = hook.cache.get("layer_15")  # (batch, seq, hidden)
```
Source code in src/rotalabs_probe/probing/hooks.py
__init__(model: nn.Module, layer_indices: List[int], component: str = 'residual', token_position: str = 'all')¶
Initialize activation hook.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `model` | `Module` | HuggingFace model to hook | *required* |
| `layer_indices` | `List[int]` | Which layers to capture | *required* |
| `component` | `str` | What to capture: `"residual"`, `"attn"`, or `"mlp"` | `'residual'` |
| `token_position` | `str` | `"all"`, `"last"`, or `"first"` | `'all'` |
Source code in src/rotalabs_probe/probing/hooks.py
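As a brief illustration of the non-default options above (a sketch based only on the documented parameters, not taken verbatim from the library), capturing MLP outputs at the last token position might look like this:

```python
# Hedged example: the model and inputs are placeholders, not from the source docs.
hook = ActivationHook(
    model,
    layer_indices=[10, 15, 20],
    component="mlp",        # one of "residual", "attn", "mlp"
    token_position="last",  # keep only the final token's activation per sequence
)
with hook:
    model(**inputs)
act = hook.cache.get("layer_10")
```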
__enter__()¶
Register hooks on specified layers.
Source code in src/rotalabs_probe/probing/hooks.py
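Registration here corresponds to the standard PyTorch forward-hook pattern. A minimal sketch of that pattern (assuming a GPT-2-style model whose blocks live in `model.transformer.h`; the real implementation may locate layers and components differently):

```python
import torch.nn as nn

def register_capture_hooks(model: nn.Module, layer_indices, cache):
    """Attach forward hooks that store each selected layer's output in `cache`."""
    handles = []
    layers = model.transformer.h  # assumption: GPT-2-style layer list
    for idx in layer_indices:
        def make_hook(i):
            def hook_fn(module, inputs, output):
                # HuggingFace blocks often return tuples; keep the hidden states.
                hidden = output[0] if isinstance(output, tuple) else output
                cache[f"layer_{i}"] = hidden.detach()
            return hook_fn
        handles.append(layers[idx].register_forward_hook(make_hook(idx)))
    return handles  # each handle.remove() detaches the hook, as __exit__ would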
extract_activations¶
Extract activations for multiple texts at specified layers.
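The exact signature is not reproduced here, but a helper like this typically tokenizes each text, runs the model under an `ActivationHook`, and collects the cached tensors. A rough sketch of that pattern (the argument names, the dict-like iteration over `hook.cache`, and the return structure are assumptions for illustration):

```python
import torch

def extract_activations_sketch(model, tokenizer, texts, layer_indices):
    """Illustrative only: run each text through the model and gather cached activations."""
    results = []
    for text in texts:
        inputs = tokenizer(text, return_tensors="pt")
        hook = ActivationHook(model, layer_indices=layer_indices)
        with hook:
            with torch.no_grad():
                model(**inputs)
        results.append(dict(hook.cache))  # assumes cache behaves like a dict
    return results  # one {"layer_i": tensor} mapping per input text
```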