Why is a gradient expected for a non-learnable parameter?

When writing a custom function with custom forward and backward:

Assume the following autograd Function:

import torch

class MyFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, a, b):
        # a is a learnable parameter; b is a constant tensor
        # (b.requires_grad is False)
        ctx.save_for_backward(x, a, b)
        # ... compute y from x, a and b ...
        return y

    @staticmethod
    def backward(ctx, grad_y):
        x, a, b = ctx.saved_tensors
        # ... compute d_x and d_a, the gradients w.r.t. x and the
        # learnable parameter a; no gradient is computed for b ...
        return d_x, d_a

where a is a trainable parameter and b a constant tensor.
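A minimal call that reproduces the traceback below could look like this (a sketch only: the tensor shapes are assumptions, the y_grad name is taken from the traceback, and it presumes forward actually computes some y):

x = torch.randn(4, requires_grad=True)
a = torch.nn.Parameter(torch.randn(4))   # learnable
b = torch.randn(4)                       # constant, requires_grad is False

y = MyFunction.apply(x, a, b)
y_grad = torch.ones_like(y)              # stand-in for the real upstream gradient
y.backward(y_grad)                       # raises the RuntimeError below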

Using this function results in the following error message:

Traceback (most recent call last):
  File "./test.py", line 217, in test_both
    y.backward(y_grad)
  File "/home/matthias/.local/lib/python3.7/site-packages/torch/tensor.py", line 166, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/matthias/.local/lib/python3.7/site-packages/torch/autograd/__init__.py", line 103, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: function MyFunctionBackward returned an incorrect number of gradients (expected 3, got 2)

My question: why is a backward function required to return a gradient for non-learnable parameters?
How do I mark an input as constant, so that no gradient is computed for it?

Or is this a bug in
https://github.com/pytorch/pytorch/blob/master/torch/csrc/autograd/custom_function.h ?

No, this is expected: you should return as many values as there were inputs to the forward function. If an input is non-differentiable (such as your constant b), you should return None for it.
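For example, assuming purely for illustration that forward computes y = a * x + b elementwise, the corrected Function could look like the sketch below. The ctx.needs_input_grad checks are optional, but they let you skip gradients that are not needed:

import torch

class MyFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, a, b):
        ctx.save_for_backward(x, a)   # b is not needed for the backward pass
        return a * x + b

    @staticmethod
    def backward(ctx, grad_y):
        x, a = ctx.saved_tensors
        d_x = d_a = None
        # ctx.needs_input_grad holds one boolean per forward input,
        # so unneeded gradients can be skipped entirely.
        if ctx.needs_input_grad[0]:
            d_x = grad_y * a          # dy/dx = a
        if ctx.needs_input_grad[1]:
            d_a = grad_y * x          # dy/da = x
        # One return value per forward input: None for the constant b.
        return d_x, d_a, None

With three return values, autograd no longer complains, and no gradient is accumulated for b.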