I’ve seen similar posts, but their problem is different from mine and their solutions don’t seem to apply.

I have three models: source_generator, target_generator, and a classifier (discriminator). The generators are feature extractors that take images and output an embedding. Each generator is intended to process only images from its own domain, source or target. The discriminator classifies the embeddings as source or target. The goal of this training is to implement Adversarial Discriminative Domain Adaptation (ADDA) for the target_generator model.
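Concretely, the shapes involved look like this (a toy sketch only; the real generators are image feature extractors, and all layer sizes here are placeholders):

```python
import torch
import torch.nn as nn

# Toy stand-ins: each generator maps an image to an embedding,
# the discriminator maps an embedding to a single domain logit.
source_generator = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))
target_generator = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 64))
discriminator = nn.Linear(64, 1)

images = torch.randn(16, 3, 32, 32)   # a batch of 16 RGB 32x32 images
embedding = target_generator(images)  # shape (16, 64)
logit = discriminator(embedding)      # shape (16, 1), source/target score
```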

The problem comes with the training loop:

```
discriminator_optimizer = optim.Adam(discriminator.parameters(), lr=args.learning_rate)
generator_optimizer = optim.Adam(target_generator.parameters(), lr=args.learning_rate)
loss = torch.nn.BCEWithLogitsLoss()
# Train loop
print('Training...')
for epoch in range(args.epochs):
    # Train step
    for i, batch in enumerate(train_dl):
        inputs = batch['image']
        labels = batch['fake']
        # Separate the inputs by label
        source_index = labels == 1
        target_index = labels == 0
        source_inputs = inputs[source_index].float().to(device)
        target_inputs = inputs[target_index].float().to(device)
        # Zero grads
        discriminator_optimizer.zero_grad()
        # Process each input with its corresponding generator
        source_outputs = source_generator(source_inputs)[0]
        target_outputs = target_generator(target_inputs)[0]
        # Discriminator labels: 1 for source, 0 for target
        source_labels = torch.ones(len(source_inputs), 1, device=device)
        target_labels = torch.zeros(len(target_inputs), 1, device=device)
        source_loss = loss(discriminator(source_outputs), source_labels)
        target_loss = loss(discriminator(target_outputs), target_labels)
        discriminator_loss = source_loss + target_loss
        # Do backpropagation for the discriminator
        discriminator_loss.backward(retain_graph=True)
        discriminator_optimizer.step()
        # Do backpropagation for the generator
        target_loss.backward()
        generator_optimizer.step()
```

I get a RuntimeError about a variable that has been modified by an in-place operation:

```
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.FloatTensor [16, 1]], which is output 0 of AsStridedBackward0, is at version 2; expected version 1 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).
```

So I gather the problem has to do with the discriminator and target_generator models being connected, since the output of the latter is the input of the former.

Is there another approach to this? Can I somehow backpropagate through only one model at a time?
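For reference, one pattern I've seen for isolating the two updates is to detach the embeddings for the discriminator step (so no graph connects it to the generators) and then run a fresh forward pass for the generator step. A minimal self-contained sketch with dummy linear modules standing in for the real models (all names and sizes are placeholders, not my actual setup):

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
device = torch.device('cpu')

# Dummy stand-ins for the real generators and discriminator.
source_generator = nn.Linear(8, 4).to(device)
target_generator = nn.Linear(8, 4).to(device)
discriminator = nn.Linear(4, 1).to(device)

discriminator_optimizer = optim.Adam(discriminator.parameters(), lr=1e-3)
generator_optimizer = optim.Adam(target_generator.parameters(), lr=1e-3)
loss = nn.BCEWithLogitsLoss()

source_inputs = torch.randn(16, 8, device=device)
target_inputs = torch.randn(16, 8, device=device)

# --- Discriminator step: detach so no gradients reach the generators ---
discriminator_optimizer.zero_grad()
source_outputs = source_generator(source_inputs).detach()
target_outputs = target_generator(target_inputs).detach()
source_labels = torch.ones(len(source_inputs), 1, device=device)
target_labels = torch.zeros(len(target_inputs), 1, device=device)
discriminator_loss = (loss(discriminator(source_outputs), source_labels)
                      + loss(discriminator(target_outputs), target_labels))
discriminator_loss.backward()
discriminator_optimizer.step()

# --- Generator step: fresh forward pass through the updated discriminator ---
generator_optimizer.zero_grad()
target_outputs = target_generator(target_inputs)  # no detach here
# ADDA trains the target generator to fool the discriminator,
# so the target embeddings are labeled as "source" (1).
fooling_labels = torch.ones(len(target_inputs), 1, device=device)
generator_loss = loss(discriminator(target_outputs), fooling_labels)
generator_loss.backward()  # discriminator grads accumulate too, but are
generator_optimizer.step() # zeroed at the start of its next step
```

Because each backward() uses its own freshly built graph, neither retain_graph=True nor the version-counter error should come up.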