Error: save_for_backward can only save input or output tensors
It does not let me save any intermediate tensors.
Also, in the backward definition, if I use .data, will it still work?
Thank you,
You can use the ctx object to save anything you want, except for inputs or outputs, which should only be saved with save_for_backward (to avoid bugs).
The backward definition works with Variables by default. If your backward implementation is not differentiable (meaning you can't do backward of backward), you can use the @once_differentiable decorator (importable from torch.autograd.function) on top of the @staticmethod for your backward. It will then receive Tensors directly and will raise a proper error if you try to differentiate it.
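As a minimal sketch (Square and its math are made up for illustration, not the function from this thread), a custom Function that saves its input with save_for_backward, stashes a non-tensor intermediate on ctx, and marks its backward as once-differentiable could look like:

```python
import torch
from torch.autograd import Function
from torch.autograd.function import once_differentiable

class Square(Function):
    @staticmethod
    def forward(ctx, input):
        # inputs/outputs go through save_for_backward (avoids bugs)
        ctx.save_for_backward(input)
        # anything else can live as a plain attribute on ctx
        ctx.two = 2.0
        return input * input

    @staticmethod
    @once_differentiable  # backward gets plain Tensors; double backward raises
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        # d(x^2)/dx = 2x
        return grad_output * ctx.two * input

x = torch.randn(3, requires_grad=True)
Square.apply(x).sum().backward()  # x.grad == 2 * x
```

Trying to differentiate this backward again (e.g. in a double-backward call) would then raise an error instead of silently producing wrong gradients.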
Thank you for the quick response.
This is really helpful. However, my gradcheck passes; what could be the reason for that?
Also, to clarify, as I am new to Python: is saving on ctx similar to saving on self?
Example:
ctx.intermediate_result = torch.Tensor(2, 3)
Thanks!
Yes, it means the same thing.
The gradcheck is expected to pass because it only checks the first derivative.
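To illustrate (with a hypothetical toy function, not the one from this thread), gradcheck only exercises the first derivative:

```python
import torch
from torch.autograd import gradcheck

# gradcheck compares analytical first derivatives against finite
# differences; it never probes backward-of-backward, so a backward
# that is itself non-differentiable can still pass this check.
def f(x):
    return (x * x).sum()

# double precision is recommended for finite-difference checks
x = torch.randn(4, dtype=torch.double, requires_grad=True)
ok = gradcheck(f, (x,))
```

So a passing gradcheck says nothing about whether your backward supports higher-order gradients.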
Also, does this imply that whatever intermediate results we save in forward should be wrapped in Variables inside the forward definition itself?
Thanks!
It depends on which version of PyTorch you're using.
If the input is a Variable, then yes, wrap everything in Variables.
If the input is a Tensor, then just compute your forward and return a Tensor.
I have summarized what I am doing with custom operations:

class Name(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        ctx.intermediate_results = tensor
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        if torch.is_tensor(grad_output):
            tensor = ctx.intermediate_results
            inputs = ctx.saved_tensors
        else:
            tensor = Variable(ctx.intermediate_results)
            inputs = ctx.saved_variables
        # do gradient computation
        return grad_input
# OR Case 2
class Name(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        ctx.intermediate_results = Variable(tensor)
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        tensor = ctx.intermediate_results
        inputs = ctx.saved_variables
        # do gradient computation
        return grad_input
The second one works fine when I feed Variables.
Which one is correct?
I don't know how to test the backward of the first one using just Tensors.
What is the correct methodology?
Sorry, maybe my statement was unclear.
When you call Name.apply(input), depending on which version of PyTorch you're currently running (this behaviour changed in the latest releases), you'll get either a Tensor or a Variable. That depends on the PyTorch version, not on what you do.