I have a custom data-generation pipeline that randomly samples two torch tensors (using torch.rand()), multiplies them, and feeds the product X into a PyTorch model. I set X.requires_grad_(False) before passing X to the model, to avoid any unnecessary gradient accumulation or backpropagation through the data-sampling process.
Question: it seems like X.detach() would be a better approach here, since I simply want to detach X from the prior computation graph. Would calling X.requires_grad_(False) instead of detach() be correct for my purpose? I only modify the model parameters during training; there are no updates to X.
According to [this answer] it looks like it would not make a difference, but I am not sure whether, in that context, the variables used to generate X are also frozen by X.requires_grad_(False).
torch.rand generates a plain leaf tensor and does not create a computation graph, which you can verify by checking the output's .grad_fn, which is None:
import torch

x = torch.rand(10)
print(x.grad_fn)
# None
print(x.requires_grad)
# False
Calling .requires_grad_(False) or .detach() on this tensor should not change anything.
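A minimal sketch to confirm this (the variable names here are just for illustration): both calls are effectively no-ops on a plain torch.rand tensor, since there is no graph to detach from and requires_grad is already False:

```python
import torch

x = torch.rand(10)        # plain leaf tensor, no autograd graph
y = x.detach()            # returns a new tensor sharing the same storage
x.requires_grad_(False)   # in-place; already False, so nothing changes

print(y.grad_fn)          # None
print(y.requires_grad)    # False
print(x.requires_grad)    # False
```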
Thanks for the reply! However, the setting in question is slightly different: X is not a torch.rand() sample itself but the product of two such random samples (i.e. X = U @ V with U = torch.rand((3, 3)) and likewise for V). Does that change the answer?
No, this does not change the answer as long as U and V are created via torch.rand without setting requires_grad=True:
U = torch.rand((3, 3))
V = torch.rand((3, 3))
X = U @ V
print(X.grad_fn)
# None
print(X.requires_grad)
# False
This would change if U or V were created by a differentiable operation (rather than by a factory function such as torch.rand) or if you explicitly set requires_grad=True when creating them.
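To illustrate that case, here is a sketch where U is explicitly tracked by autograd. Note that in this situation the two calls are no longer interchangeable: as far as I know, requires_grad_() can only be called on leaf tensors, so on the non-leaf product X it raises a RuntimeError, while detach() works as expected:

```python
import torch

U = torch.rand((3, 3), requires_grad=True)  # now tracked by autograd
V = torch.rand((3, 3))
X = U @ V                  # X is a non-leaf tensor with a grad_fn

print(X.grad_fn)           # <MmBackward0 ...>
print(X.requires_grad)     # True

X_det = X.detach()         # new tensor cut from the graph
print(X_det.grad_fn)       # None
print(X_det.requires_grad) # False

# requires_grad_() is only allowed on leaf tensors, so this raises:
raised = False
try:
    X.requires_grad_(False)
except RuntimeError:
    raised = True
print(raised)              # True
```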