Custom Loss Function cannot save intermediate results

Error: save_for_backward can only save input or output tensors

It does not allow me to save any intermediate tensors.

Also, in the backward definition, if I call .data, will it still work?

Thank you,

You can use ctx to save anything you want, except for inputs or outputs, which should only be saved with save_for_backward (to avoid bugs).
The backward definition works with Variable by default. If your backward implementation is not differentiable (meaning you can’t do backward of backward), then you can use the @once_differentiable decorator (it can be imported from torch.autograd.function) on top of the @staticmethod for your backward; it will then receive Tensors directly and will raise a proper error if you try to differentiate it.
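For illustration, here is a minimal sketch of that pattern on a recent PyTorch (where Variables and Tensors are merged), using a made-up SquaredError loss; the names and the exact formula are just assumptions for the example:

import torch
from torch.autograd import Function
from torch.autograd.function import once_differentiable

class SquaredError(Function):
    @staticmethod
    def forward(ctx, input, target):
        diff = input - target                   # intermediate result
        ctx.save_for_backward(input, target)    # inputs: only via save_for_backward
        ctx.diff = diff                         # anything else can be stashed on ctx
        return diff.pow(2).mean()

    @staticmethod
    @once_differentiable                        # backward itself is not differentiable
    def backward(ctx, grad_output):
        diff = ctx.diff                         # received as plain Tensors here
        grad_input = grad_output * 2.0 * diff / diff.numel()
        return grad_input, None                 # one gradient per forward input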


Thank you for the quick response.

This is really helpful. However, my gradcheck passes; what could be the reason for that?
Also, to clarify, as I am new to Python: is saving things on ctx similar to saving them on self?

Example:

ctx.intermediate_result = torch.Tensor(2, 3)

Thanks!

Yes it does mean the same thing.
The gradcheck is expected to pass because it only checks the first derivative.
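As a rough illustration (reusing the hypothetical SquaredError sketch from above), gradcheck is typically run on double-precision inputs and only compares the analytical backward against finite differences of the first derivative:

import torch
from torch.autograd import gradcheck

x = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
t = torch.randn(4, 3, dtype=torch.double)
# passes even though backward is @once_differentiable, since only the
# first derivative is checked; gradgradcheck would be needed for the second
print(gradcheck(SquaredError.apply, (x, t), eps=1e-6, atol=1e-4))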


Also, does this imply that whatever intermediate results we save in forward should be wrapped in Variable inside the forward definition itself?

Thanks!

That depends on which version of PyTorch you’re using :smiley:
If the input is a Variable, then yes, wrap everything in Variables.
If the input is a Tensor, then just compute your forward and return a Tensor.

I have summarized below what I am doing with custom operations:

# Case 1: save raw tensors, wrap them in backward only if needed
class Name(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        ctx.intermediate_results = tensor   # `tensor`, `loss`, `grad_input` are placeholders
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        if torch.is_tensor(grad_output):
            tensor = ctx.intermediate_results
            inputs = ctx.saved_tensors
        else:
            tensor = Variable(ctx.intermediate_results)
            inputs = ctx.saved_variables
        # do gradient computation
        return grad_input
  
## OR Case 2

class Name(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        ctx.intermediate_results = Variable(tensor)   # wrap the intermediate up front
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        tensor = ctx.intermediate_results
        inputs = ctx.saved_variables
        # do gradient computation
        return grad_input

The second one works fine when I feed Variables.
Which one is correct?
I don’t know how to test the backward of the first one using just tensors.
What is the correct methodology?

Sorry, maybe my statement was unclear.
When you call Name.apply(input), depending on which version of PyTorch you’re currently running (this behaviour changed in the latest releases), you’ll get either a Tensor or a Variable inside forward and backward (that depends on the PyTorch version, not on what you do).
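If it helps, here is a small made-up probe (the class name Probe is just for illustration) that prints what your install actually hands to forward and backward; it assumes a recent PyTorch where requires_grad is set directly on tensors:

import torch
from torch.autograd import Function

class Probe(Function):
    @staticmethod
    def forward(ctx, input):
        print(type(input))          # what forward receives on this install
        return input.clone()

    @staticmethod
    def backward(ctx, grad_output):
        print(type(grad_output))    # what backward receives on this install
        return grad_output

Probe.apply(torch.ones(3, requires_grad=True)).sum().backward()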
