DasPantom
(Pan Kessel)
March 12, 2017, 11:36pm
#1
Hello,

I am trying to write a Function for torch.trace (seems not to exist so far). Here is my code:

```
class CustomTrace(torch.autograd.Function):
    def forward(self, mat):
        self.save_for_backward(mat)
        return torch.Tensor([torch.trace(mat)])

    def backward(self, g):
        mat, = self.saved_tensors
        return torch.mul(torch.eye(int(mat.size()[0])), g[0])
```

When I check my result with `gradcheck`, the precision is not optimal. For example,

```
input = (torch.autograd.Variable(torch.randn(2,2).double(), requires_grad = True),)
torch.autograd.gradcheck(CustomTrace(), input, eps = 1e-4, atol = 1e-3)
```

returns `True`. However,

```
input = (torch.autograd.Variable(torch.randn(2,2).double(), requires_grad = True),)
torch.autograd.gradcheck(CustomTrace(), input, eps = 1e-6, atol = 1e-4)
```

returns `False`. I would hope that my math is right. Is there a problem with my code, or is this to be expected for some reason?

Thanks a lot!

smth
March 13, 2017, 11:26pm
#2
Your function looks correct.
Some suggestions:

- You don't need to save the entire `mat` for backward, only `mat.size()`.
- `mat.size()[0]` -> `mat.size(0)`
- You don't handle `trace` for non-square inputs.
- The output tensor type is not the same as the input tensor type, because you use `torch.Tensor` (which creates a `FloatTensor`). Instead use `mat.new()`.

Overall, here’s a modified version that handles non-square inputs, and implements the other suggestions.

```
class CustomTrace(torch.autograd.Function):
    def forward(self, input):
        self.isize = input.size()
        return input.new([torch.trace(input)])

    def backward(self, grad_output):
        isize = self.isize
        grad_input = grad_output.new(isize).copy_(torch.eye(*isize))
        grad_input.mul_(grad_output[0])
        return grad_input
```
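For readers on a recent PyTorch: the old-style Function API above (instance methods, `self.save_for_backward`) has since been replaced by static methods on a `ctx` object. Here is a sketch of the same idea in the current API; the class name `CustomTraceModern` is just for illustration.

```python
import torch

class CustomTraceModern(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        # save only the shape for backward, not the whole tensor
        ctx.isize = input.size()
        return torch.trace(input)

    @staticmethod
    def backward(ctx, grad_output):
        n, m = ctx.isize
        # d(trace)/d(input) is a (possibly rectangular) identity matrix
        return torch.eye(n, m, dtype=grad_output.dtype) * grad_output
```

With double-precision inputs this version passes `gradcheck` at the tighter tolerances from the original question.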

1 Like

DasPantom
(Pan Kessel)
March 14, 2017, 9:45am
#3
Thanks a lot! Amazing to see how much these few lines of code can be improved.

Thanks for pointing out that trace also works for non-square matrices (the mathematician in me assumed that trace only works for square matrices). I will try to implement this.

Edit: I see. It already works for non-square matrices. Thanks a lot.

tachim
(Tudor Achim)
April 19, 2017, 5:09pm
#4
FYI I think there’s something a little strange going on with `gradcheck`. For example, check out what happens with this simple function:

`correlation_product = lambda f, d: torch.mul(d, torch.dot(f, d).expand_as(d))`

```
print(gradcheck(correlation_product, (f, d), eps=1e-6, atol=1e-3))
print(gradcheck(correlation_product, (f, d), eps=1e-6, atol=1e-2))
print(gradcheck(correlation_product, (f, d), eps=1e-6, atol=1e-1))
```

prints `False, False, True` (`f` and `d` are declared as `V(torch.FloatTensor(np.random.randn(5)), requires_grad=True)`). My understanding is that this is using PyTorch’s built-in autograd, so it seems strange to have precision issues of this magnitude. This is on pytorch `0.1.11+8aa1cef`.

smth
April 19, 2017, 5:36pm
#5
You should ideally do gradchecks in double precision. Float precision might not be enough for the finite-difference estimate to agree with the analytical gradient.
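To illustrate with the function from the post above (a sketch with the `V`/`np` wrappers replaced by plain torch calls): the same check passes once the inputs are double precision.

```python
import torch
from torch.autograd import gradcheck

# the function from the earlier post
correlation_product = lambda f, d: torch.mul(d, torch.dot(f, d).expand_as(d))

# gradcheck compares analytical gradients against finite differences;
# with float32 inputs, rounding noise can swamp an eps of 1e-6, so
# declare the inputs in double precision instead.
f = torch.randn(5, dtype=torch.double, requires_grad=True)
d = torch.randn(5, dtype=torch.double, requires_grad=True)
print(gradcheck(correlation_product, (f, d), eps=1e-6, atol=1e-4))  # True
```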

2 Likes

tachim
(Tudor Achim)
April 19, 2017, 7:10pm
#6
That was exactly the problem – thanks!

I am not sure if this was different in March 2017, but there is a ready-to-use trace function with autograd capabilities now: `torch.trace`.

Just mentioning for everyone that gets here using a search engine.
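A minimal sketch on a recent PyTorch:

```python
import torch

x = torch.randn(3, 3, dtype=torch.double, requires_grad=True)
y = torch.trace(x)  # differentiable out of the box
y.backward()
# the gradient of the trace is the identity matrix
print(torch.allclose(x.grad, torch.eye(3, dtype=torch.double)))  # True
```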

3 Likes