Gradient of a custom layer that needs the target output

I want to define a custom layer as the last layer of a neural network, and use a custom loss function. The problem is that the custom layer needs the target label in order to compute its forward and backward functions, but only the loss function has access to the target label. I was wondering what the standard way of implementing this is.

There are two things I think you could do:

  1. Subclass torch.nn.Module and write a constructor for it that takes the target tensor; you can then use the target in the forward and backward functions. Something like https://github.com/jcjohnson/pytorch-examples#pytorch-custom-nn-modules, but with an __init__ constructor that accepts the target tensor (see the first sketch after this list).

  2. Write your custom layer like a loss function. You could subclass _Loss as in https://github.com/pytorch/pytorch/blob/master/torch/nn/modules/loss.py, or simply write a forward function that takes target as a parameter (see the second sketch after this list).
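For option 1, here is a minimal sketch assuming the target is available when the layer is constructed. The class name `TargetAwareLayer` and the forward computation are hypothetical placeholders; the point is that the stored target is visible to both forward and (via autograd) backward:

```python
import torch
import torch.nn as nn

class TargetAwareLayer(nn.Module):
    # Hypothetical layer: takes the target tensor in the constructor.
    def __init__(self, target):
        super().__init__()
        # Register as a buffer so it moves with .to(device) / .cuda()
        # but is not treated as a learnable parameter.
        self.register_buffer("target", target)

    def forward(self, x):
        # Any differentiable expression of x and self.target works;
        # autograd derives the backward pass automatically.
        return (x - self.target) ** 2

# Usage: the stored target participates in the backward pass.
target = torch.randn(4, 10)
layer = TargetAwareLayer(target)
x = torch.randn(4, 10, requires_grad=True)
layer(x).sum().backward()
```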
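For option 2, a sketch of the loss-function style, where forward takes target as an argument just like nn.MSELoss does. Subclassing nn.Module directly (rather than _Loss) is shown here for simplicity; the name `TargetAwareLoss` and the body are again hypothetical:

```python
import torch
import torch.nn as nn

class TargetAwareLoss(nn.Module):
    # Hypothetical layer written in the style of a loss function:
    # the target arrives per call, not at construction time.
    def forward(self, input, target):
        # Autograd handles backward for any differentiable expression.
        return ((input - target) ** 2).mean()

# Usage: call it the way you would call a built-in loss.
pred = torch.randn(4, 10, requires_grad=True)
target = torch.randn(4, 10)
loss = TargetAwareLoss()(pred, target)
loss.backward()
```

The second option is usually the more flexible one, since the target can change on every batch without rebuilding the module.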
