Odd behavior when trying to get activation

I created a transformer model with one encoder layer:

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder_layer = nn.TransformerEncoderLayer(d_model=2, nhead=1, dim_feedforward=2)
        self.trans = nn.TransformerEncoder(self.encoder_layer, num_layers=1)

    def forward(self, x):
        y = self.trans(x)
        return y

model = MyModel()

x = torch.tensor([[1., 0]])
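The snippets that register the hooks are missing from the post, but the described behavior matches a forward hook that calls .detach() on the module output. A plausible sketch of that setup (the hook and dict names here are assumptions, not from the original) is:

```python
import torch
import torch.nn as nn

# Illustrative reconstruction; `activation` and `get_activation` are assumed names
activation = {}

def get_activation(name):
    def hook(module, inp, out):
        activation[name] = out.detach()  # fails if `out` is a tuple
    return hook

encoder_layer = nn.TransformerEncoderLayer(d_model=2, nhead=1, dim_feedforward=2)
model = nn.TransformerEncoder(encoder_layer, num_layers=1)

# Hook on the first layer's norm1 submodule; its output is a plain tensor
model.layers[0].norm1.register_forward_hook(get_activation("norm1"))

x = torch.tensor([[1., 0.]])
model(x)
print(activation["norm1"].shape)  # torch.Size([1, 2])
```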

Getting the activation of the norm succeeds. However, when I follow up with this line:


I encounter this error: AttributeError: 'tuple' object has no attribute 'detach'

And then when I follow up with this line again:


I get the same error. So on the first iteration, when I get the norm1 activations, all is fine. Then, when I try to get the multihead attention activation, it fails. And when I go back and get the norm1 activations again, that fails too, even though it worked at first. Why do the multihead attention hook and the second norm1 call result in this error?

AttributeError: 'tuple' object has no attribute 'detach'

Once a hook is registered, it stays registered on the module, and you would need to call handle.remove() to remove it.
I guess your latest run is calling into three hooks, one of which fails.
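To illustrate the hook lifetime: register_forward_hook returns a handle, and the hook keeps firing on every forward pass until handle.remove() is called. A minimal sketch (the layer and list names are illustrative):

```python
import torch
import torch.nn as nn

layer = nn.Linear(2, 2)
outputs = []

# The handle returned here is the only way to unregister the hook later
handle = layer.register_forward_hook(lambda mod, inp, out: outputs.append(out.detach()))

layer(torch.randn(1, 2))   # hook fires
handle.remove()            # unregister the hook
layer(torch.randn(1, 2))   # hook no longer fires

print(len(outputs))  # 1, only the first call was recorded
```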

Oh got it, thanks. Do you have any idea why I am seeing this error, though? I want to look at the activations of the self-attention.

Some modules return tuples, and your hook is most likely calling .detach() on the output directly, as it expects a tensor. You could add some logic that checks the type of the output and either calls .detach() on a tensor or unpacks the tuple first, calling .detach() on each tensor inside it (or only some of them, depending on what you want).
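A sketch of such a type-checking hook (the `activations` dict and function names are illustrative; nn.MultiheadAttention returns a tuple of (attn_output, attn_weights), and the weights entry can be None inside nn.TransformerEncoderLayer):

```python
import torch
import torch.nn as nn

activations = {}

def save_activation(name):
    def hook(module, inp, out):
        if isinstance(out, torch.Tensor):
            # e.g. norm1 returns a plain tensor
            activations[name] = out.detach()
        else:
            # e.g. self_attn returns a tuple; keep only the tensor elements
            activations[name] = tuple(o.detach() for o in out if isinstance(o, torch.Tensor))
    return hook

encoder_layer = nn.TransformerEncoderLayer(d_model=2, nhead=1, dim_feedforward=2)
model = nn.TransformerEncoder(encoder_layer, num_layers=1)

h1 = model.layers[0].norm1.register_forward_hook(save_activation("norm1"))
h2 = model.layers[0].self_attn.register_forward_hook(save_activation("self_attn"))

x = torch.tensor([[1., 0.]])
model(x)

# Both hooks now succeed: norm1 stored as a tensor, self_attn as a tuple of tensors
h1.remove()
h2.remove()
```

With this hook, repeatedly re-running the forward pass no longer fails, since the tuple case is handled before .detach() is called.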