I created a model with 3 parts (an encoder, a decoder and a discriminator) and created a separate optimizer for each part, like this:
```python
optimizer_encoder = optim.Adam(model.encode.parameters(), lr=args.lr)
optimizer_decoder = optim.Adam(model.decode.parameters(), lr=args.lr)
optimizer_discriminator = optim.Adam(model.dis.parameters(), lr=args.lr)
```
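For context, the model is just an `nn.Module` with three sub-networks, roughly like this (a simplified sketch; the class name `VaeGan`, the layer sizes and `latent_dim` are placeholders, the real sub-networks are deeper):

```python
import torch.nn as nn

class VaeGan(nn.Module):  # placeholder name for my model
    def __init__(self, latent_dim=256):
        super().__init__()
        # stand-in layers: the real encoder/decoder/discriminator are larger networks
        self.encode = nn.Sequential(nn.Linear(784, latent_dim))
        self.decode = nn.Sequential(nn.Linear(latent_dim, 784), nn.Sigmoid())
        self.dis = nn.Sequential(nn.Linear(784, 1), nn.Sigmoid())
```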
In the training loop I try to backpropagate the losses this way:
```python
# encoder
model.zero_grad()
loss_encoder.backward(retain_graph=True)
# someone likes to clamp the grad here
# [p.grad.data.clamp_(-1, 1) for p in model.encode.parameters()]
# update parameters
optimizer_encoder.step()

# clean others, so they are not afflicted by the encoder loss
model.zero_grad()

# decoder
if train_dec:
    loss_decoder.backward(retain_graph=True)
    # [p.grad.data.clamp_(-1, 1) for p in model.decode.parameters()]
    optimizer_decoder.step()
    # clean the discriminator
    model.dis.zero_grad()

# discriminator
if train_dis:
    loss_discriminator.backward()
    # [p.grad.data.clamp_(-1, 1) for p in model.dis.parameters()]
    optimizer_discriminator.step()
```
Here is a simple schematic of the model:
I need the parameters of each part to be updated only by backpropagating its own loss; that's why the code above was implemented this way.
But I get this error:
```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [256, 256]], which is output 0 of TBackward, is at version 3; expected version 2 instead. Hint: the backtrace further above shows the operation that failed to compute its gradient. The variable in question was changed in there or anywhere later. Good luck!
```
Any idea how I can solve this? In my case, would it be OK to switch back to PyTorch 1.4?
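For reference, one workaround I've been considering (an untested sketch, assuming every listed parameter actually receives gradient from its loss) is to compute all the gradients with `torch.autograd.grad()` before calling any optimizer step, so that no parameters are modified in place while another backward pass still needs them:

```python
import torch

# compute every gradient before any optimizer.step(), so no parameter is
# modified in place while a graph that still needs it is traversed
grads_encoder = torch.autograd.grad(
    loss_encoder, list(model.encode.parameters()), retain_graph=True)
grads_decoder = torch.autograd.grad(
    loss_decoder, list(model.decode.parameters()), retain_graph=True)
grads_discriminator = torch.autograd.grad(
    loss_discriminator, list(model.dis.parameters()))

# only now write each part's gradients back and step its optimizer,
# so every part is still updated only by its own loss
for params, grads, opt in [
    (model.encode.parameters(), grads_encoder, optimizer_encoder),
    (model.decode.parameters(), grads_decoder, optimizer_decoder),
    (model.dis.parameters(), grads_discriminator, optimizer_discriminator),
]:
    for p, g in zip(params, grads):
        p.grad = g
    opt.step()
```

I'm not sure whether this is the right approach, though, or whether it changes the semantics of the original training loop.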