3 Positional arguments in backwards

mordith · April 5, 2018, 11:49am

Hi,

I have two loss variables, A and B.
When I do A.backward() everything works fine, but when I do B.backward() I get the following error:
TypeError: backward() takes 2 positional arguments but 3 were given

I expect that the difference is in the history of the variables but I cannot find it.
Any ideas or directions will be appreciated.

albanD · April 5, 2018, 12:11pm

Hi,

Could you give more details on the model you’re using? Does it contain any custom Module or Function?

mordith · April 6, 2018, 8:16am

I have an architecture that consists largely of regular Modules and Functions, but also uses two custom Functions, each one associated with a different loss (I add their code below).

The loss functions themselves are also non-trivial, but they are just a series of torch functions and are probably okay, so for brevity I will only add the code for the custom Functions for now.

The custom Function for loss A is simply adding some noise to the input:

    @staticmethod
    # stdev sets the standard deviation of the added noise
    def forward(ctx, input, stdev):
        normal_sample = torch.normal(torch.zeros(input.size()), torch.zeros(input.size()) + stdev).cuda()

        # re-center the means to the input
        output = input + normal_sample

        ctx.stdev = stdev
        ctx.mark_dirty(input)
        ctx.save_for_backward(input, output)

        return output

    @staticmethod
    def backward(ctx, grad_output):
        input, output = ctx.saved_variables
        stdev = ctx.stdev

        tensor_output = output.data
        tensor_input = input.data
        tensor_output.normal_(0, stdev[0][0])

        # re-center the means to the input
        tensor_output.add(tensor_input)

        del ctx.stdev
        return Variable(tensor_output), None

The custom Function for loss B uses its input to weight multinomial sampling and outputs the resulting indices:

    @staticmethod
    def forward(ctx, input):
        one = torch.ones(1).cuda()

        if len(input.size()) == 1:
            input = torch.unsqueeze(input, 0)

        output = torch.zeros(input.size())
        if input.is_cuda:
            output = output.cuda()

        # sample from categorical with p = input
        _index = torch.multinomial(input + constants.epsilon, 1, False)

        output.scatter_(1, _index, torch.unsqueeze(one.repeat(_index.size()[0]),1))

        ctx.mark_dirty(input)

        ctx.save_for_backward(input, output)

        return _index.float(), output

    @staticmethod
    def backward(ctx, grad_output):
        input, output = ctx.saved_variables
        gradInput = torch.zeros(input.size()).cuda()
        gradInput.copy_(output)
        gradInput.div_(input)
        gradInput.mul_(grad_output)
        return gradInput

albanD · April 6, 2018, 12:51pm

Hi,

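The error comes from the second function: since its forward returns two outputs, its backward method is called with two gradient arguments, one per output (ctx plus the two gradients are the three positional arguments in the message). Your backward accepts only one, hence the error.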
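To make this concrete, here is a minimal sketch of a Function whose forward returns two outputs. The class name `TwoOutputs` and the toy arithmetic are assumptions made up for illustration, not mordith's actual code; the only point is that backward takes one gradient argument per forward output.

```python
import torch
from torch.autograd import Function


class TwoOutputs(Function):
    """Hypothetical Function with two forward outputs (illustration only)."""

    @staticmethod
    def forward(ctx, x):
        # Two outputs, so autograd will pass two gradients to backward.
        return x * 2, x * 3

    @staticmethod
    def backward(ctx, grad_double, grad_triple):
        # One gradient argument per forward output, in the same order.
        # Return one gradient per forward input (here only x).
        return grad_double * 2 + grad_triple * 3


x = torch.ones(3, requires_grad=True)
double, triple = TwoOutputs.apply(x)
(double.sum() + triple.sum()).backward()
print(x.grad)  # tensor([5., 5., 5.])
```

Declaring backward as `backward(ctx, grad_output)` here would reproduce the TypeError from the original post, because autograd would still call it with two gradients.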

sdh517 · August 23, 2019, 8:08pm

This might be a dumb question, but I have a similar scenario and want to return multiple outputs from my custom forward function, similar to what mordith was doing. One of these needs to be differentiated but the others do not. For example,

    def forward(ctx, input):
        …
        return differentiated_output, non_differentiated_output_for_diagnostics, another_one

albanD · August 26, 2019, 2:57pm

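Your forward should mark the non-differentiable outputs with ctx.mark_non_differentiable(). The backward still receives one gradient argument per output and can simply ignore the ones for the marked outputs; it returns None only for inputs that need no gradient.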
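A minimal sketch of that pattern follows, assuming a single differentiated output and one diagnostic value. The class name `SquareWithDiagnostics` and the choice of diagnostic (the largest absolute input) are hypothetical and only stand in for sdh517's real outputs.

```python
import torch
from torch.autograd import Function


class SquareWithDiagnostics(Function):
    """Hypothetical Function: one differentiable output, one diagnostic."""

    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        out = x * x
        # Diagnostic value that should never receive a gradient.
        max_abs = x.abs().max()
        ctx.mark_non_differentiable(max_abs)
        return out, max_abs

    @staticmethod
    def backward(ctx, grad_out, grad_max_abs):
        # backward must still accept an argument for the diagnostic
        # output; it is ignored because that output is non-differentiable.
        (x,) = ctx.saved_tensors
        return grad_out * 2 * x


x = torch.tensor([1.0, -3.0, 2.0], requires_grad=True)
out, max_abs = SquareWithDiagnostics.apply(x)
out.sum().backward()
print(x.grad)                 # tensor([ 2., -6.,  4.])
print(max_abs.requires_grad)  # False
```

With more diagnostic outputs, each one would be passed to `ctx.mark_non_differentiable` and get its own ignored argument in backward.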