How to prevent part of the loss from creating gradients

natank1 · September 26, 2020, 9:44pm

Hi

I am developing a kind of model that is similar to an autoencoder or a VAE.

I want the encoder to have its own loss (f), and then to take its outputs (Y) and feed them to the decoder, which has its own loss (g).

I want g to have derivatives only up to Y (that is, I don't want the gradient of g to reach the encoder).

However, when I run the code I can see that even if the loss is only g and the optimizer is allowed to change only the encoder, learning takes place, i.e. the decoder’s loss changes the weights of the encoder.

What can I do?

ptrblck · September 27, 2020, 2:05am

How did you make sure that the encoder doesn’t get gradients? Did you detach Y before feeding it into the decoder?
Also, why is the optimizer allowed to update the encoder, if it shouldn’t be trained?
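
To illustrate the detach question above, here is a minimal sketch of what detaching Y does. The two nn.Linear layers and the squared-error loss are hypothetical stand-ins, not the models from the original post; the only point is that a loss computed from Y.detach() should leave the encoder parameters without gradients.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the encoder and decoder in the question.
encoder = nn.Linear(4, 3)
decoder = nn.Linear(3, 4)

x = torch.randn(8, 4)
Y = encoder(x)  # Y is still attached to the encoder's graph

# detach() returns a tensor with the same values as Y but no graph history,
# so backpropagation from g stops at this point.
x_hat = decoder(Y.detach())
g = ((x_hat - x) ** 2).mean()  # placeholder decoder loss
g.backward()

# Check which parameters actually received a gradient from g.
encoder_got_grad = any(p.grad is not None for p in encoder.parameters())
decoder_got_grad = all(p.grad is not None for p in decoder.parameters())
print(encoder_got_grad, decoder_got_grad)
```

If the encoder parameters do show gradients in a check like this, Y was most likely passed to the decoder without being detached.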

natank1 · September 29, 2020, 4:37am

Thanks for the quick reply.
The optimizer is allowed to change the encoder because it is configured to. What I did was the following:
loss = encoder_f
loss.backward( retain_graph=True )
optimizer_enc.step()
Y= autograd.Variable(z,requires_grad=False)
optimizer_dec.zero_grad()
g, _ = decoderl(Y, x)
loss = g
loss.backward()
optimizer_dec.step()

It works; the question is whether it is “torch enough”?

Thanks

ptrblck · September 29, 2020, 9:43am

Based on your code it doesn’t seem that the output of the encoder is fed to the decoder, so the calculated loss of the decoder won’t create any gradients in the encoder.

If you are worried about best practices in PyTorch:

Check whether you really need to use retain_graph, since based on your code snippet it doesn’t seem to be needed.
Variables are deprecated, so you can use tensors directly as of PyTorch 0.4.
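
As a rough sketch of how both suggestions could look together: the two-step update below uses no retain_graph and no Variable. The nn.Linear modules, the SGD optimizers, and both loss expressions are assumptions made for illustration; only the names optimizer_enc, optimizer_dec, x, and Y come from the snippet in the thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the real encoder and decoder.
encoder = nn.Linear(4, 3)
decoder = nn.Linear(3, 4)
optimizer_enc = torch.optim.SGD(encoder.parameters(), lr=0.1)
optimizer_dec = torch.optim.SGD(decoder.parameters(), lr=0.1)

x = torch.randn(8, 4)

# Encoder step: f depends only on the encoder.
optimizer_enc.zero_grad()
Y = encoder(x)
f = Y.pow(2).mean()  # placeholder encoder loss
f.backward()         # no retain_graph: this graph is not reused later
optimizer_enc.step()

# Snapshot the encoder weights so the decoder step can be checked.
enc_before = [p.detach().clone() for p in encoder.parameters()]

# Decoder step: a plain detached tensor replaces autograd.Variable.
optimizer_dec.zero_grad()
g = ((decoder(Y.detach()) - x) ** 2).mean()  # placeholder decoder loss
g.backward()
optimizer_dec.step()

# The decoder update should leave every encoder parameter untouched.
encoder_unchanged = all(
    torch.equal(a, b) for a, b in zip(enc_before, encoder.parameters())
)
print(encoder_unchanged)
```

Because the decoder only ever sees Y.detach(), the encoder graph can be freed right after f.backward(), which is why retain_graph is not needed in this arrangement.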

natank1 · September 29, 2020, 10:42am

Sorry, this line
loss = encoder_f
should be written
loss, Y = encoder_f
"Variables are deprecated so you can use tensors after PyTorch 0.4"
I'm not sure I understood this.

natank1 · September 29, 2020, 11:52am

OK.
Indeed, retain_graph is redundant, and I replaced the autograd.Variable call with this:
Y = torch.tensor(Y, requires_grad=False)
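
A note on the line above: calling torch.tensor on an existing tensor makes a copy, and recent PyTorch versions emit a warning that recommends clone().detach() instead. A possible alternative, sketched here with a made-up tensor w standing in for the encoder output, is to call detach() directly:

```python
import torch

torch.manual_seed(0)

# Hypothetical tensor with gradient history, standing in for the encoder output Y.
w = torch.randn(3, requires_grad=True)
Y = w * 2.0

# detach() drops the graph history without copying the data.
Y_view = Y.detach()

# detach().clone() gives an independent copy when one is needed.
Y_copy = Y.detach().clone()

view_shares_storage = Y_view.data_ptr() == Y.data_ptr()
copy_shares_storage = Y_copy.data_ptr() == Y.data_ptr()
print(Y_view.requires_grad, Y_view.grad_fn, view_shares_storage, copy_shares_storage)
```

Either form blocks gradients from reaching the encoder; the copy is only worth its cost if the detached tensor will be modified in place.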