Only update weights with respect to one of the outputs

Hey everyone!

I’m playing around with two networks set up similarly to a GAN, where one network is updated through the other, just like the generator of a GAN is updated through the discriminator.
I want the discriminator-ish network to learn to discriminate between two classes, so it has two outputs. The ‘generator’ network should only be updated through the loss for one of the discriminator’s outputs though, so I thought of passing a [1, 0] vector as the gradient argument to loss.backward().
I’m using a softmax loss, which gives me a scalar, so I can’t use the gradient mask like this:

    g_err = loss(prediction_g, label.long())
    gradient_mask = torch.Tensor([1, 0])
    # g_err is a scalar, so backward() expects no (or a scalar) gradient argument here
    g_err.backward(gradient=gradient_mask)

Basically, the ‘generator’ network should not know about the second output of the ‘discriminator’, but should only get updated based on what the ‘discriminator’ learned for its output[0].

Does anyone have an idea how to do that?

Thanks!

So, would it work if you create a new tensor to hold output[0] and then use that for computing the loss? When you then call loss.backward(), the gradients will be computed based only on that element.
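
Something like this minimal sketch, where the tiny linear layers and the mean() loss are just placeholders for your actual networks and softmax loss:

    import torch
    import torch.nn as nn

    # Placeholder stand-ins for the 'generator' and the two-output 'discriminator'
    g_net = nn.Linear(8, 4)
    d_net = nn.Linear(4, 2)

    x = torch.randn(16, 8)
    d_out = d_net(g_net(x))      # shape [16, 2]
    d_out.retain_grad()          # keep the intermediate grad so we can inspect it

    # Only output 0 enters the loss, so no gradient flows through output 1
    g_err = d_out[:, 0].mean()   # dummy loss on the first output only
    g_err.backward()

    print(d_out.grad[:, 1].abs().sum())  # tensor(0.) -- no gradient for output 1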

That’s what I was thinking as well, but can I still use Softmax then or do I have to use a sigmoid?

You can still use the softmax. Just make sure the dim for the softmax is specified correctly: indexing drops the dimension associated with the batch, so the remaining dimensions shift down.
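
For example, assuming the output has shape [batch, 2] (just a hypothetical layout), the dim you pass to the softmax changes after indexing:

    import torch
    import torch.nn.functional as F

    d_out = torch.randn(1, 2)                # hypothetical [batch, classes] output

    probs_full = F.softmax(d_out, dim=1)     # class dimension is dim=1 on the full output

    single = d_out[0]                        # indexing drops the batch dimension -> shape [2]
    probs_single = F.softmax(single, dim=0)  # class dimension is now dim=0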

That makes sense, perfect! Thank you very much!