Autograd not working properly for LongTensor?


It seems like the results are wrong when gradients are calculated for a LongTensor.

import torch

x = torch.tensor([1, 2, 2], requires_grad=True)  # integer entries -> LongTensor
l = torch.norm(x.float())
g = torch.autograd.grad(l, x)
print(g)

This prints (tensor([0, 0, 0]),), which is wrong.

However, if we change the code slightly to

x = torch.tensor([1.0, 2, 2], requires_grad=True)  # note 1.0 instead of 1
l = torch.norm(x.float())
g = torch.autograd.grad(l, x)
print(g)

it prints (tensor([0.3333, 0.6667, 0.6667]),), which is correct.

Any idea what might be happening?

Tensors of integral types shouldn't require grad. This has been implemented as a hard constraint on master.

What is the reason for this restriction? Also, I think it would be better to throw an exception in this case rather than failing silently.

Yes, as I said, it throws a hard error on master now.
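For reference, on recent PyTorch releases (where this constraint has landed), setting requires_grad=True on an integer tensor fails loudly rather than silently returning zero gradients; a minimal sketch, assuming a version with the check:

```python
import torch

# On recent PyTorch builds, requires_grad=True on an integer-dtype
# tensor raises a RuntimeError at construction time.
try:
    x = torch.tensor([1, 2, 2], requires_grad=True)
    raised = False
except RuntimeError as e:
    raised = True
    print(f"RuntimeError: {e}")

print("raised:", raised)
```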

Assuming you are doing gradient-descent-style optimization: since integral types are discrete, it's natural not to allow this.
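To make the fix concrete, a minimal sketch of the working pattern: keep the tensor in a floating-point dtype from the start, so autograd tracks it properly. The analytic gradient of the Euclidean norm ||x|| is x / ||x||, which matches the correct output above:

```python
import torch

# Keep x as a float tensor so autograd can compute gradients for it.
x = torch.tensor([1.0, 2.0, 2.0], requires_grad=True)
l = torch.norm(x)

(g,) = torch.autograd.grad(l, x)

# The gradient of the Euclidean norm is x / ||x||.
expected = x.detach() / torch.norm(x.detach())
print(g)  # tensor([0.3333, 0.6667, 0.6667])
```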