NumPy operations in PyTorch autograd

Hello All,

I have come across a situation where I have a function written in MATLAB, and I call it from my PyTorch model to perform the final computation on the model output. During backpropagation, however, my model parameters' gradients are not computed.

I reproduced the issue with a toy example to make it easier to follow:

import torch
import numpy as np

def b_func(x):
    # Pure PyTorch op: stays inside the autograd graph.
    abc = torch.tensor([[1]], dtype=torch.float)
    x = abc + x
    return x

X = torch.tensor([[2]], dtype=torch.float, requires_grad=True)
W = X ** 2
Y = W ** 3
print(Y)
Z = b_func(Y)
print(Z)
Z.backward(retain_graph=True)
print(X.grad)  # tensor([[192.]]): Z = X**6 + 1, so dZ/dX = 6 * X**5 = 192 at X = 2

This is the expected behavior of PyTorch, and there is no issue here. However, when I run the code below:

def a_func(x):
    # numpy op: detach() leaves the autograd graph, and the tensor
    # created afterwards carries no history connecting it to XX.
    abc = np.array([[1]])
    x = x.detach().numpy()
    x = abc + x
    x = torch.tensor(x, requires_grad=True)
    return x

XX = torch.tensor([[2]], dtype=torch.float, requires_grad=True)
WW = XX ** 2
YY = WW ** 3
print(YY)
ZZ = a_func(YY)
print(ZZ)
ZZ.backward(retain_graph=True)
print(XX.grad)  # None: the graph was broken inside a_func

What changes should I make for gradients to flow in this scenario, apart from rewriting everything with PyTorch tensors?

Hi,

When you use .detach(), you break the graph, so gradients cannot be computed anymore.
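You can see the break directly (a minimal check):

import torch

x = torch.tensor([[2.0]], requires_grad=True)
y = x ** 2
print(y.grad_fn)           # <PowBackward0 ...>: part of the graph
print(y.detach().grad_fn)  # None: cut off from the graph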

But the root issue is that, for autograd to work, the autograd engine needs to know how to compute the gradient of every operation that is performed.
Unfortunately, if you don't use PyTorch ops, it cannot know how to compute the gradients, so you won't be able to get any.

You will either have to re-implement the function using PyTorch operations,
or write a custom autograd Function where you tell the autograd engine how to compute the backward pass for the part it does not know about. See the "Extending PyTorch" notes in the docs for how to do this.
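
For the toy a_func above, such a custom Function could look like this. This is a minimal sketch (the class name NumpyAdd is made up); the same pattern works for any numpy or MATLAB computation as long as you can write down its gradient:

import torch
import numpy as np

class NumpyAdd(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Do the computation outside of autograd, in numpy.
        out = np.array([[1.0]]) + x.detach().numpy()
        return torch.from_numpy(out).to(x.dtype)

    @staticmethod
    def backward(ctx, grad_output):
        # d(x + 1)/dx = 1, so the incoming gradient passes through unchanged.
        return grad_output

XX = torch.tensor([[2.0]], requires_grad=True)
ZZ = NumpyAdd.apply((XX ** 2) ** 3)
ZZ.backward()
print(XX.grad)  # tensor([[192.]])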

Thanks @albanD for the quick response. I had those solutions in mind; I was hoping there was a workaround.
Thank you for the response, it makes things clearer now.