Hi, I recently noticed that the gradients caught in a module hook (through the grad_out parameter) when the hook is triggered by a backward() call do not have requires_grad = True, nor do they have a grad_fn attribute. However, when I trigger the same hook through torch.autograd.grad, the values of grad_out caught by the hook are the same, yet they do have requires_grad = True and a grad_fn attribute. I'm curious why this is happening. Thank you so much.
Here is my toy example for reference:
import torch

torch.manual_seed(0)

# define embedding and linear layers
embedding_layer = torch.nn.Embedding(10, 5, padding_idx=0)
fc = torch.nn.Linear(5, 6)
random_ix = torch.randint(high=10, size=(5,))
embedding_list = []

def hook(module, grad_in, grad_out):
    # if triggered through autograd, grad_out will have a grad_fn and
    # requires_grad equal to True, otherwise not
    print("grad out", grad_out)
    embedding_list.append(grad_out)

# register hook on embedding layer
embedding_layer.register_backward_hook(hook)

# do forward pass
embeds = embedding_layer(random_ix)
output = fc(embeds)
print("output", output)
merged = torch.sum(output, dim=1)
summed = merged.sum()
print(summed)

# trigger hook through autograd
grad_auto = torch.autograd.grad(summed, embedding_layer.weight, create_graph=True)
print("grad auto", grad_auto[0][random_ix])

# trigger hook through backward() call
summed.backward()
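For reference, here is a minimal follow-up sketch (assuming the snippet above has just been run) that prints the requires_grad flag and grad_fn of each gradient captured by the hook; the first capture comes from the torch.autograd.grad(..., create_graph=True) call and the second from the plain summed.backward() call:

# Inspect the gradients captured by the hook.
# embedding_list[0] was recorded during torch.autograd.grad(..., create_graph=True);
# embedding_list[1] was recorded during the plain summed.backward() call.
for i, grads in enumerate(embedding_list):
    for g in grads:
        print(f"capture {i}: requires_grad={g.requires_grad}, grad_fn={g.grad_fn}")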