Hi. I have a system with encoders, decoders, a discriminator and a classifier, where:
- the encoders and the discriminator approximate two distributions in an adversarial process;
- the encoders and decoders preserve the original features of each distribution (reconstruction);
- the classifier keeps the learned representations discriminative for the classification task.
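For context, here is a minimal sketch of how the pieces are wired together. The layer sizes and architectures below are simplified placeholders, not my real models; only the connectivity matters:
import torch.nn as nn

# Placeholder dimensions, just for illustration.
a_dim, v_dim, l_dim = 64, 32, 300   # per-modality input sizes (placeholders)
latent_dim = 50                     # shared latent size (placeholder)
output_dim = 1                      # classifier output size (placeholder)

# One encoder/decoder pair per modality (a = acoustic, v = visual, l = language).
encoder_a = nn.Sequential(nn.Linear(a_dim, latent_dim), nn.ReLU())
encoder_v = nn.Sequential(nn.Linear(v_dim, latent_dim), nn.ReLU())
encoder_l = nn.Sequential(nn.Linear(l_dim, latent_dim), nn.ReLU())
decoder_a = nn.Linear(latent_dim, a_dim)
decoder_v = nn.Linear(latent_dim, v_dim)
decoder_l = nn.Linear(latent_dim, l_dim)

# Shared discriminator over the latent space, and a shared classifier.
discriminator = nn.Sequential(nn.Linear(latent_dim, 1), nn.Sigmoid())
classifier = nn.Linear(latent_dim, output_dim)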
My problem is that I want to update the encoders’ weights using two different optimizers:
optimizer_G = torch.optim.Adam(
    itertools.chain(encoder_a.parameters(), encoder_v.parameters(), encoder_l.parameters(),
                    decoder_a.parameters(), decoder_l.parameters(), decoder_v.parameters()),
    lr=lr, betas=(b1, b2), weight_decay=decay)
optimizer_E = torch.optim.Adam(
    itertools.chain(encoder_a.parameters(), encoder_v.parameters(), encoder_l.parameters(),
                    classifier.parameters()),
    lr=lr, betas=(b1, b2), weight_decay=decay)
The main training process I am using is the following:
x = batch[:-1]
x_a = Variable(x[0], requires_grad=False).squeeze()
x_v = Variable(x[1], requires_grad=False).squeeze()
x_t = Variable(x[2], requires_grad=False)
y = Variable(batch[-1].view(-1, output_dim), requires_grad=False)
# encoder-decoder
optimizer_G.zero_grad()
a_en = encoder_a(x_a)
v_en = encoder_v(x_v)
l_en = encoder_l(x_t)
a_de = decoder_a(a_en)
v_de = decoder_v(v_en)
l_de = decoder_l(l_en)
rl1 = pixelwise_loss(a_de, x_a) + pixelwise_loss(v_de, x_v) + \
      pixelwise_loss(l_de, x_t)  # reconstruction loss
g_loss = alpha * (adversarial_loss(discriminator(l_en), valid) +
                  adversarial_loss(discriminator(v_en), valid)) + (1 - alpha) * rl1
g_loss.backward(retain_graph=True)
optimizer_G.step()
# classifier
optimizer_E.zero_grad()
a = classifier(a_en)
v = classifier(v_en)
l = classifier(l_en)
c_loss = criterion(a, y) + criterion(l, y) + \
         criterion(v, y)  # classification loss
c_loss.backward(retain_graph=True)
optimizer_E.step()
The error I get is:
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [50, 50]], which is output 0 of TBackward, is at version 2; expected version 1 instead.
I believe the problem is that optimizer_G.step() modifies the encoder weights in place, and those pre-step weights are still needed by the saved graph to compute c_loss.backward(), as noted by @albanD in this issue, but I don’t know how to work around it. Do you have any idea?
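The only workaround I can think of is to re-run the encoder forward passes after optimizer_G.step(), so that c_loss is built on a fresh graph that is consistent with the updated weights (at the cost of an extra forward pass; g_loss.backward() should then no longer need retain_graph=True either). A rough sketch of what I mean, not tested:
# classifier (candidate restructure): recompute the encodings after optimizer_G.step()
optimizer_E.zero_grad()
a_en = encoder_a(x_a)          # fresh forward pass with the post-step encoder weights
v_en = encoder_v(x_v)
l_en = encoder_l(x_t)
c_loss = criterion(classifier(a_en), y) + \
         criterion(classifier(v_en), y) + \
         criterion(classifier(l_en), y)
c_loss.backward()              # retain_graph is no longer needed here
optimizer_E.step()
Is this the intended way to handle parameters shared between two optimizers, or is there a cleaner approach (e.g. calling backward on both losses before either optimizer steps)?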