How to prevent part of the loss from creating gradients


I am developing a sort of architecture similar to an autoencoder or a VAE.

I want the encoder to have a loss of its own (f), and then take its outputs (Y) and feed them to the decoder, which has its own loss (g).

I want g to have derivatives only up to Y (namely, I don't want the gradient of g to reach the encoder).

However, when I run the code I can see that even if the loss is only g and the optimizer is allowed to change only the encoder, learning takes place, i.e. the decoder's loss changes the weights of the encoder.

What can I do?

How did you make sure that the encoder doesn't get gradients? Did you detach Y before feeding it to the decoder?
Also, why is the optimizer allowed to update the encoder, if it shouldn’t be trained?
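To illustrate the detach question: here is a minimal sketch of what cutting the graph at Y looks like. The encoder/decoder modules, shapes, and losses below are illustrative stand-ins, not the poster's actual code.

```python
import torch
import torch.nn as nn

# Hypothetical tiny encoder and decoder, just to demonstrate detach()
encoder = nn.Linear(4, 2)
decoder = nn.Linear(2, 4)

x = torch.randn(8, 4)
Y = encoder(x)

# Detaching Y cuts the autograd graph: the decoder loss g cannot
# propagate back into the encoder.
g = decoder(Y.detach()).pow(2).mean()
g.backward()

print(encoder.weight.grad)              # None: no gradient reached the encoder
print(decoder.weight.grad is not None)  # True: the decoder did get gradients
```

If Y were passed without `.detach()`, `encoder.weight.grad` would be populated by `g.backward()`.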

Thanks for the quick reply.
The optimizer is allowed to change the encoder because it is configured to. What I did was the following:
loss = encoder_f
loss.backward(retain_graph=True)
Y = autograd.Variable(z, requires_grad=False)
g, _ = decoderl(Y, x)
loss = g

It works; the question is whether it is "torch enough"?
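For comparison, the two-loss pattern above can be written without `Variable` at all, using `.detach()` at Y. This is a runnable sketch under assumed stand-in modules and losses; the real `encoder_f` and `decoderl` from the post are not reproduced here.

```python
import torch
import torch.nn as nn

# Stand-in modules; the poster's encoder/decoder and losses differ.
encoder = nn.Linear(4, 2)
decoder = nn.Linear(2, 4)
opt = torch.optim.SGD(
    list(encoder.parameters()) + list(decoder.parameters()), lr=0.1
)

x = torch.randn(8, 4)

Y = encoder(x)
f = Y.pow(2).mean()        # encoder's own loss (illustrative)
f.backward()               # gradients flow into the encoder only

# decoder loss, cut off at Y so it cannot reach the encoder
g = (decoder(Y.detach()) - x).pow(2).mean()
g.backward()               # gradients flow into the decoder only

opt.step()
```

Because g's graph starts at the detached Y, no `retain_graph=True` is needed: f and g own separate graphs.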


Based on your code it doesn’t seem that the output of the encoder is fed to the decoder, so the calculated loss of the decoder won’t create any gradients in the encoder.

If you are worried about best practices in PyTorch:

  • check if you really need to use retain_graph, since based on your code snippet it doesn't seem to be needed
  • Variables are deprecated since PyTorch 0.4, so you can use tensors directly
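On the second point: since the Variable/Tensor merge in 0.4, tensors carry their own autograd state, and `autograd.Variable(...)` is just a thin wrapper that returns a tensor. A short sketch of the tensor-only equivalent:

```python
import torch

# A tensor that participates in autograd
z = torch.randn(3, requires_grad=True)

# Instead of autograd.Variable(z, requires_grad=False),
# detach() gives a view of z with no gradient history.
Y = z.detach()

print(z.requires_grad)  # True
print(Y.requires_grad)  # False
```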

Sorry, this line
loss = encoder_f
should be written as
loss, Y = encoder_f

"Variables are deprecated so you can use tensors after PyTorch 0.4"
Not sure I understood that.

Indeed retain_graph is redundant, and I replaced the autograd.Variable call with this:
Y = torch.tensor(Y, requires_grad=False)
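One caveat, offered as a suggestion rather than a correction: calling `torch.tensor(...)` on an existing tensor copies the data and detaches it, but recent PyTorch versions emit a warning recommending the explicit form instead. A sketch of the two idiomatic alternatives:

```python
import torch

Y = torch.randn(5, requires_grad=True)

# Explicit copy with no gradient history (the form PyTorch recommends
# over torch.tensor(Y, requires_grad=False))
Y_stopped = Y.clone().detach()

# If no copy is needed, detach() alone shares storage with Y
Y_view = Y.detach()

print(Y_stopped.requires_grad)  # False
print(Y_view.requires_grad)     # False
```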