Detach but still make autograd trace the gradients

In the middle of my model there are some for loops that I must do; there is no way to replace them, so to make them faster I want to use numba.njit. But it only works on NumPy arrays, and if I detach the tensor the gradients won't be recorded. Is there a solution for that? Also, I have heard of torch.jit.trace: will it do the same as numba.njit and still record the gradients?

Hi,

If you have to break out of autograd, you can use a custom Function (see Extending PyTorch — PyTorch 1.8.1 documentation).
You will have to specify what the backward pass of this op is, though.
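
For example, here is a minimal sketch of a custom torch.autograd.Function whose forward calls a numba.njit kernel. The heavy_loop / heavy_loop_grad kernels below are hypothetical placeholders standing in for your actual loops; for the sake of the example they compute an element-wise square, so the hand-written backward is 2 * x * grad_output:

import numpy as np
import torch
from numba import njit

# Hypothetical loop-heavy kernel: squares every element.
@njit
def heavy_loop(x):
    out = np.empty_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = x[i, j] * x[i, j]
    return out

# Hand-written gradient of the kernel above: d(x^2)/dx = 2x.
@njit
def heavy_loop_grad(x, grad_out):
    grad_in = np.empty_like(x)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            grad_in[i, j] = 2.0 * x[i, j] * grad_out[i, j]
    return grad_in

class NumbaOp(torch.autograd.Function):
    @staticmethod
    def forward(ctx, inp):
        ctx.save_for_backward(inp)
        out = heavy_loop(inp.detach().cpu().numpy())
        return torch.from_numpy(out).to(inp.device, inp.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        inp, = ctx.saved_tensors
        grad = heavy_loop_grad(inp.detach().cpu().numpy(),
                               grad_output.detach().cpu().numpy())
        return torch.from_numpy(grad).to(inp.device, inp.dtype)

x = torch.randn(100, 3, requires_grad=True)
y = NumbaOp.apply(x)
y.sum().backward()
print(x.grad.shape)  # torch.Size([100, 3]); gradients flow through the numba code

The detach/cpu/numpy round trip is what lets numba see a plain NumPy array, while the explicit backward tells autograd how to propagate gradients through the op.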

Oh, I get it. Another thing, please: in this code

import torch

depth = torch.randn((22000, 3), requires_grad=True)
depth[depth < 0].zero_()

rot_sin = torch.sin(torch.FloatTensor([0.4]))
rot_cos = torch.cos(torch.FloatTensor([0.4]))
rot_mat_T = torch.FloatTensor(
    [[rot_cos, 0, -rot_sin], [0, 1, 0], [rot_sin, 0, rot_cos]],
)
rot_mat_T.requires_grad = True
depth = depth @ rot_mat_T

linear = torch.nn.Linear(3, 4)
out = linear(depth)
out.sum().backward(retain_graph=True)
print(depth.grad)

It gives me this warning

UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
  print(depth.grad)

But if I replace the line depth = depth @ rot_mat_T with new_depth = depth @ rot_mat_T,
and out = linear(depth) with out = linear(new_depth), it works and prints the gradient. Why is that?

When you do that, you make the "depth" Python variable point to the result of depth @ rot_mat_T instead of the Tensor it was pointing to before (the result of torch.randn((22000, 3), requires_grad=True)).

But the .grad field is only populated for leaf Tensors that require gradients.
You can check that "depth", just after it is created at the top, has .is_leaf == True.
After depth = depth @ rot_mat_T, however, the new Tensor that "depth" points to has .is_leaf == False.

So the .grad field won’t be populated, hence the warning.
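
Here is a small illustration of the leaf / non-leaf distinction (with a smaller tensor and an identity matrix for brevity), including the .retain_grad() opt-in that the warning mentions:

import torch

depth = torch.randn(5, 3, requires_grad=True)
print(depth.is_leaf)       # True: created directly by the user

rot_mat_T = torch.eye(3, requires_grad=True)
depth = depth @ rot_mat_T  # rebinds the name to a new Tensor
print(depth.is_leaf)       # False: produced by an autograd op

depth.retain_grad()        # opt in to .grad on a non-leaf Tensor
depth.sum().backward()
print(depth.grad)          # populated, no warning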

So if I want the grads to flow to the first depth variable, I have to add the new_depth step, right?
And if I do so, won't it take more memory!?

Not really: the computational graph keeps a reference to the original "depth" Tensor, so that Tensor stays alive either way. You won't see any memory difference.
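
For reference, here is a reworked sketch of the snippet above (dropping the masking line and building rot_mat_T from plain floats for simplicity) where the leaf keeps the name depth, so depth.grad is populated without any warning:

import math
import torch

depth = torch.randn((22000, 3), requires_grad=True)   # leaf Tensor

rot_sin, rot_cos = math.sin(0.4), math.cos(0.4)
rot_mat_T = torch.tensor(
    [[rot_cos, 0., -rot_sin], [0., 1., 0.], [rot_sin, 0., rot_cos]],
    requires_grad=True,
)

new_depth = depth @ rot_mat_T   # non-leaf; the graph keeps a reference to depth

linear = torch.nn.Linear(3, 4)
out = linear(new_depth)
out.sum().backward()

print(depth.grad.shape)      # torch.Size([22000, 3]): populated on the leaf
print(rot_mat_T.grad.shape)  # torch.Size([3, 3])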