DasPantom
(Pan Kessel)
March 12, 2017, 11:36pm
1
Hello,
I am trying to write an autograd Function for torch.trace (it does not seem to exist yet). Here is my code:
class CustomTrace(torch.autograd.Function):
    def forward(self, mat):
        self.save_for_backward(mat)
        return torch.Tensor([torch.trace(mat)])

    def backward(self, g):
        mat, = self.saved_tensors
        return torch.mul(torch.eye(int(mat.size()[0])), g[0])
When I check my result with gradcheck, the precision seems suboptimal, i.e.
input = (torch.autograd.Variable(torch.randn(2, 2).double(), requires_grad=True),)
torch.autograd.gradcheck(CustomTrace(), input, eps=1e-4, atol=1e-3)
returns True. However,
input = (torch.autograd.Variable(torch.randn(2, 2).double(), requires_grad=True),)
torch.autograd.gradcheck(CustomTrace(), input, eps=1e-6, atol=1e-4)
returns False. I believe my math is right. Is there a problem with my code, or is this to be expected for some reason?
Thanks a lot!
smth
March 13, 2017, 11:26pm
2
Your function looks correct.
Some suggestions:
- You don't need to save the entire mat for backward, only mat.size().
- mat.size()[0] -> mat.size(0)
- You don't handle trace for non-square inputs.
- The output type is not the same as the input type, because you use torch.Tensor. Instead, use mat.new().
Overall, here’s a modified version that handles non-square inputs, and implements the other suggestions.
class CustomTrace(torch.autograd.Function):
    def forward(self, input):
        # only the shape is needed in backward, not the matrix itself
        self.isize = input.size()
        return input.new([torch.trace(input)])

    def backward(self, grad_output):
        isize = self.isize
        # the gradient of the trace is the (possibly rectangular) identity,
        # scaled by the incoming scalar gradient
        grad_input = grad_output.new(isize).copy_(torch.eye(*isize))
        grad_input.mul_(grad_output[0])
        return grad_input
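For example, a quick sanity check of this version in double precision (a sketch; the non-square shape and the tolerances are just illustrative values):

import torch
from torch.autograd import Variable, gradcheck

# double-precision, non-square input; gradcheck compares the analytical
# gradient from backward() against a finite-difference estimate
inp = (Variable(torch.randn(3, 2).double(), requires_grad=True),)
print(gradcheck(CustomTrace(), inp, eps=1e-6, atol=1e-4))  # should print True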
DasPantom
(Pan Kessel)
March 14, 2017, 9:45am
3
Thanks a lot! Amazing to see how much these few lines of code can be improved.
Thanks for pointing out that trace also works for non-square matrices (the mathematician in me assumed that trace only works for square matrices). I will try to implement this.
Edit: I see. It already works for non-square matrices. Thanks a lot.
tachim
(Tudor Achim)
April 19, 2017, 5:09pm
4
FYI, I think there's something a little strange going on with gradcheck. For example, check out what happens with this simple function:
correlation_product = lambda f, d: torch.mul(d, torch.dot(f, d).expand_as(d))
print(gradcheck(correlation_product, (f, d), eps=1e-6, atol=1e-3))
print(gradcheck(correlation_product, (f, d), eps=1e-6, atol=1e-2))
print(gradcheck(correlation_product, (f, d), eps=1e-6, atol=1e-1))
prints False, False, True (f and d are declared as V(torch.FloatTensor(np.random.randn(5)), requires_grad=True)). My understanding is that this uses PyTorch's built-in autograd, so it seems strange to see precision issues of this magnitude. This is on pytorch 0.1.11+8aa1cef.
smth
April 19, 2017, 5:36pm
5
You should ideally do gradchecks in double precision. Float precision might not be enough for the finite-difference approximation to agree with the analytical gradient.
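For instance, recreating the snippet above in double precision (a sketch; V is the Variable alias from the post above, and from_numpy keeps numpy's float64):

import numpy as np
import torch
from torch.autograd import Variable as V
from torch.autograd import gradcheck

# np.random.randn returns float64, so from_numpy yields a DoubleTensor
f = V(torch.from_numpy(np.random.randn(5)), requires_grad=True)
d = V(torch.from_numpy(np.random.randn(5)), requires_grad=True)
correlation_product = lambda f, d: torch.mul(d, torch.dot(f, d).expand_as(d))
print(gradcheck(correlation_product, (f, d), eps=1e-6, atol=1e-3))  # should print True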
tachim
(Tudor Achim)
April 19, 2017, 7:10pm
6
That was exactly the problem – thanks!
I am not sure if this was different in March 2017, but there is now a ready-to-use trace function with autograd support: torch.trace. Just mentioning this for everyone who gets here via a search engine.
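For example, a minimal sketch (assuming a PyTorch version >= 0.4, where tensors take requires_grad directly):

import torch

x = torch.randn(3, 3, dtype=torch.float64, requires_grad=True)
y = torch.trace(x)  # differentiable out of the box
y.backward()
print(x.grad)  # the 3x3 identity matrix
print(torch.autograd.gradcheck(torch.trace, (x,)))  # should print True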