Usage of Numpy and gradients


I was wondering whether gradients are still computed automatically if NumPy is used inside the forward() function of an nn.Module, i.e., torch tensors are converted to numpy arrays, a numpy op is applied, and the result is converted back to a torch tensor.

Is there any implication of doing so?



No, if you use numpy operations inside the forward of your module, they won’t create nodes in the computation graph of the network, and thus won’t be differentiated.
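A minimal sketch of the problem: converting a tensor to a numpy array requires detaching it from the graph first, so any numpy op that follows is invisible to autograd.

```python
import numpy as np
import torch

x = torch.ones(3, requires_grad=True)

# .numpy() only works on detached tensors, so the graph is cut here.
y = torch.from_numpy(np.sin(x.detach().numpy()))

# The result no longer tracks gradients back to x.
print(y.requires_grad)  # False
```

Calling y.backward() here would fail, because from autograd's point of view y is a fresh leaf tensor with no history.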

You can, however, write a custom Function that uses numpy internally, but then you need to provide the backward computation for it yourself. Here is a nice introductory tutorial explaining how to use numpy to create new Functions.
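As an illustration (a toy sketch, not the tutorial's exact code), a custom torch.autograd.Function can call numpy in forward as long as backward supplies the matching gradient by hand:

```python
import numpy as np
import torch

class NumpySin(torch.autograd.Function):
    """Computes sin(x) via numpy in forward; backward is written manually."""

    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        # numpy does the actual work, outside the autograd graph
        result = np.sin(input.detach().numpy())
        return torch.from_numpy(result)

    @staticmethod
    def backward(ctx, grad_output):
        (input,) = ctx.saved_tensors
        # d/dx sin(x) = cos(x), applied via the chain rule
        return grad_output * torch.cos(input)

x = torch.ones(3, requires_grad=True)
y = NumpySin.apply(x).sum()
y.backward()
print(x.grad)  # cos(1.0) for each element
```

Because backward is defined, gradients now flow through the numpy-based op as if it were a native torch operation.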