For a TransformerEncoder network, a forward hook I registered does not get called when both model.eval() and with torch.no_grad(): are used. Here is code to reproduce:
import torch

model = torch.nn.TransformerEncoder(
    encoder_layer=torch.nn.TransformerEncoderLayer(100, 4, 200, batch_first=True),
    num_layers=3,
).to('cuda')
input = torch.randn(2, 10, 100).to('cuda')

def createHook(name):
    print(f"Hook for {name} is set")
    def hook(model, input, output):
        print("Hook working")
    return hook

for i in range(len(model.layers)):
    model.layers[i].self_attn.register_forward_hook(createHook(f"t_layer_{i}"))

model.eval()
with torch.no_grad():
    pred = model(input)
pred = pred.cpu().detach().numpy()
It prints “Hook for * is set”, but it never prints “Hook working” (the hook is not actually called).
I observed that with only model.eval() the hook works, and with only with torch.no_grad(): it also works. Somehow using both of them together makes the hook stop firing. I have also only observed this with TransformerEncoder.
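For comparison, here is a minimal sketch (my own toy example on CPU, with made-up layer sizes, not my actual model) showing that the same style of forward hook on a plain nn.Linear does fire even with eval() and no_grad() combined:

```python
import torch

# Record which modules the hook fires for.
calls = []

def hook(module, inputs, output):
    calls.append(module.__class__.__name__)

# A plain stack of modules, with the hook on the first Linear layer.
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU())
model[0].register_forward_hook(hook)

model.eval()
with torch.no_grad():
    model(torch.randn(2, 8))

# Here the hook fires even under eval() + no_grad().
print(calls)  # → ['Linear']
```

So the non-firing hook really does seem specific to the TransformerEncoder path.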
Is this some kind of bug, or am I missing something?