I have a custom data-generation pipeline that randomly samples two torch tensors (using torch.rand()), multiplies them, and feeds the product X into a PyTorch model. I set X.requires_grad_(False) before passing X to the model, to avoid any unnecessary gradient accumulation or backpropagation through the data-sampling process.
Question: it seems like X.detach() would be a better approach here, since I simply want to detach X from the prior computation graph. Would calling X.requires_grad_(False) instead of detach() be correct for my purpose? I only modify the model parameters during training; there are no updates to X.
According to [this answer] it looks like it would not make a difference, but I am not sure whether, in that context, the variables used to generate X are also frozen by X.requires_grad_(False).
torch.rand generates a plain leaf tensor and does not create a computation graph, which you can verify by checking the output's .grad_fn, which is None:
import torch

x = torch.rand(10)
print(x.grad_fn)
# None
print(x.requires_grad)
# False
Calling .requires_grad_(False) or .detach() on this tensor should not change anything.
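A minimal sketch to confirm this (the variable names here are just for illustration): both calls are effectively no-ops on a plain torch.rand tensor, since there is no graph to detach from and requires_grad is already False:

```python
import torch

x = torch.rand(10)        # plain leaf tensor, no autograd graph
y = x.detach()            # returns a new tensor sharing the same storage
x.requires_grad_(False)   # in-place; already False, so nothing changes

print(y.grad_fn)          # None
print(y.requires_grad)    # False
print(x.requires_grad)    # False
```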
Thanks for the reply! However, the setting in question is slightly different: X is not a torch.rand() sample itself but the product of two such random samples (i.e. X = U @ V with U = torch.rand((3, 3)) and likewise for V). Does that change the answer?
No, this does not change the answer as long as U and V are created via torch.rand without setting requires_grad=True:
U = torch.rand((3, 3))
V = torch.rand((3, 3))
X = U @ V
print(X.grad_fn)
# None
print(X.requires_grad)
# False
This would change if U or V were created by a differentiable operation (rather than by a factory function such as torch.rand) or if you explicitly set requires_grad=True when creating them.
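To illustrate that case, here is a sketch where U is explicitly tracked by autograd. Note that in this situation the two calls are no longer interchangeable: as far as I know, requires_grad_() can only be called on leaf tensors, so on the non-leaf product X it raises a RuntimeError, while detach() works as expected:

```python
import torch

U = torch.rand((3, 3), requires_grad=True)  # now tracked by autograd
V = torch.rand((3, 3))
X = U @ V                  # X is a non-leaf tensor with a grad_fn

print(X.grad_fn)           # <MmBackward0 ...>
print(X.requires_grad)     # True

X_det = X.detach()         # new tensor cut from the graph
print(X_det.grad_fn)       # None
print(X_det.requires_grad) # False

# requires_grad_() is only allowed on leaf tensors, so this raises:
raised = False
try:
    X.requires_grad_(False)
except RuntimeError:
    raised = True
print(raised)              # True
```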