Custom Loss Function cannot save intermediate results

Error: save_for_backward can only save input or output tensors

It does not allow me to save any intermediate tensors.

Also, in the backward definition, if I call .data, will it still work?

Thank you,

You can use ctx to save anything you want, except for inputs or outputs, which should only be saved with save_for_backward (to avoid bugs).
The backward definition works with Variable by default. If your backward implementation is not differentiable (meaning you can’t do backward of backward), then you can use the @once_differentiable decorator (it can be imported from torch.autograd.function) on top of the @staticmethod for your backward; it will then receive Tensors directly and will raise a proper error if you try to differentiate it.
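For illustration, here is a minimal sketch of that pattern on a recent PyTorch (where Variables and Tensors are merged), using a made-up SquaredError loss; the names and the exact formula are just assumptions for the example:

import torch
from torch.autograd import Function
from torch.autograd.function import once_differentiable

class SquaredError(Function):
    @staticmethod
    def forward(ctx, input, target):
        diff = input - target                   # intermediate result
        ctx.save_for_backward(input, target)    # inputs: only via save_for_backward
        ctx.diff = diff                         # anything else can be stashed on ctx
        return diff.pow(2).mean()

    @staticmethod
    @once_differentiable                        # backward itself is not differentiable
    def backward(ctx, grad_output):
        diff = ctx.diff                         # received as plain Tensors here
        grad_input = grad_output * 2.0 * diff / diff.numel()
        return grad_input, None                 # one gradient per forward input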


Thank you for the quick response.

This is really helpful. However, my gradcheck passes; what could be the reason for that?
Also, to clarify, as I am new to Python: is saving things on ctx similar to saving them on self?

Example:

ctx.intermediate_result = torch.Tensor(2, 3)

Thanks!

Yes it does mean the same thing.
The gradcheck is expected to pass because it only checks the first derivative.
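As a rough illustration (reusing the hypothetical SquaredError sketch from above), gradcheck is typically run on double-precision inputs and only compares the analytical backward against finite differences of the first derivative:

import torch
from torch.autograd import gradcheck

x = torch.randn(4, 3, dtype=torch.double, requires_grad=True)
t = torch.randn(4, 3, dtype=torch.double)
# passes even though backward is @once_differentiable, since only the
# first derivative is checked; gradgradcheck would be needed for the second
print(gradcheck(SquaredError.apply, (x, t), eps=1e-6, atol=1e-4))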


Also, does this imply that whatever intermediate results we save in forward should be wrapped in Variable inside the forward definition itself?

Thanks!

That depends on which version of PyTorch you’re using :smiley:
If the input is a Variable, then yes, wrap everything in Variables.
If the input is a Tensor, then just compute your forward and return a Tensor.

I have summarized below what I am doing with custom operations:

# Case 1: save raw tensors, wrap them in backward only if needed
class Name(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        ctx.intermediate_results = tensor   # `tensor`, `loss`, `grad_input` are placeholders
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        if torch.is_tensor(grad_output):
            tensor = ctx.intermediate_results
            inputs = ctx.saved_tensors
        else:
            tensor = Variable(ctx.intermediate_results)
            inputs = ctx.saved_variables
        # do gradient computation
        return grad_input
  
## OR Case 2

class Name(Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        ctx.intermediate_results = Variable(tensor)   # wrap the intermediate up front
        return loss

    @staticmethod
    def backward(ctx, grad_output):
        tensor = ctx.intermediate_results
        inputs = ctx.saved_variables
        # do gradient computation
        return grad_input

The second one works fine when I feed Variables.
Which one is correct?
I don’t know how to test the backward of the first one using just tensors.
What is the correct methodology?

Sorry, maybe my statement was unclear.
When you call Name.apply(input), depending on which version of PyTorch you’re currently running (this behaviour changed in the latest releases), you’ll get either a Tensor or a Variable inside forward and backward (that depends on the PyTorch version, not on what you do).
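If it helps, here is a small made-up probe (the class name Probe is just for illustration) that prints what your install actually hands to forward and backward; it assumes a recent PyTorch where requires_grad is set directly on tensors:

import torch
from torch.autograd import Function

class Probe(Function):
    @staticmethod
    def forward(ctx, input):
        print(type(input))          # what forward receives on this install
        return input.clone()

    @staticmethod
    def backward(ctx, grad_output):
        print(type(grad_output))    # what backward receives on this install
        return grad_output

Probe.apply(torch.ones(3, requires_grad=True)).sum().backward()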
