While running it after adding it into my model, I get the following error:
File "/usr/local/lib/python3.5/dist-packages/torch/autograd/variable.py", line 146, in backward
    self._execution_engine.run_backward((self,), (gradient,), retain_variables)
RuntimeError: could not compute gradients for some functions
I believe the backward pass for torch.sign() returns all gradients simply as zeros and not the output expected from a straight through estimator.
Here is the code and outputs to reproduce this:
import torch
import torch.nn as nn
from torch.autograd import Variable, Function

class BinaryLayer(nn.Module):
    def forward(self, input):
        return torch.sign(input)

input = torch.randn(4, 4)
input = Variable(input, requires_grad=True)
model = BinaryLayer()
output = model(input)
loss = output.mean()
>>> loss.backward()
>>> input
Variable containing:
-1.4272 1.5698 2.6661 0.4438
0.4978 0.8987 1.6969 0.2067
0.3880 -2.1434 -1.1588 -0.5567
-1.2435 -0.1010 0.7215 -0.9209
[torch.FloatTensor of size 4x4]
>>> input.grad
Variable containing:
0 0 0 0
0 0 0 0
0 0 0 0
0 0 0 0
[torch.FloatTensor of size 4x4]
Hence, I implemented this layer myself as a custom autograd Function. It works when used in isolation; here is a snippet to verify it:
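A minimal sketch of what such a layer could look like, assuming the current torch.autograd.Function API (static forward/backward with ctx) and the clamp-based backward that the reply below corrects; the class name is illustrative:

import torch
from torch.autograd import Function

class BinarizeClamp(Function):
    # Forward binarizes with sign(); backward clamps grad_output to [-1, 1],
    # which is the estimator the reply below points out is not correct.
    @staticmethod
    def forward(ctx, input):
        return torch.sign(input)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output.clamp(-1, 1)

# In isolation, gradients do reach the input (unlike with plain torch.sign()):
x = torch.randn(4, 4, requires_grad=True)
BinarizeClamp.apply(x).mean().backward()
print(x.grad)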
I believe the backward pass for torch.sign() returns all gradients simply as zeros and not the output expected from a straight through estimator.
Ok, I see. But first, the correct straight-through estimator for the derivative of sign is not obtained by clamping grad_output between -1 and 1. You want grad_output to be 0 where the input is smaller than -1 or bigger than 1. So, something like this:
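A minimal sketch of that correction, again assuming the current Function API (the class name is illustrative): save the input in forward, then in backward copy grad_output and zero it wherever the saved input falls outside [-1, 1].

import torch
from torch.autograd import Function

class BinarizeSTE(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return torch.sign(input)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        # Straight-through estimator: pass the gradient only where |input| <= 1.
        grad_input[input.abs() > 1] = 0
        return grad_input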
Really sorry about making a mistake in the straight-through estimator; I was probably half asleep. Thank you very much for the help. My layer is working now after calling the class inherited from Function inside my layer inherited from nn.Module.
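For reference, a sketch of that wiring, i.e. an nn.Module whose forward calls the custom Function via .apply (all names here are illustrative; the straight-through Function is the same as in the previous sketch):

import torch
import torch.nn as nn
from torch.autograd import Function

class BinarizeSTE(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return torch.sign(input)

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        grad_input = grad_output.clone()
        grad_input[input.abs() > 1] = 0
        return grad_input

class BinaryLayer(nn.Module):
    # Module wrapper so the binarization can be dropped into a larger model.
    def forward(self, input):
        # Call the custom Function so its straight-through backward is used
        # instead of torch.sign()'s zero gradient.
        return BinarizeSTE.apply(input)

# Quick check that gradients now reach the input:
x = torch.randn(4, 4, requires_grad=True)
loss = BinaryLayer()(x).mean()
loss.backward()
print(x.grad)  # 1/16 where |x| <= 1, 0 elsewhere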