Ah OK, this makes sense.
fake_g_loss.backward()
will use its own loss value to calculate the gradients. You could seed the code to make it reproducible and check the gradients after the backward call e.g. via:
print(generator.weight.grad.abs().sum())
Then you could play around with scaling fake_g_loss_detached
and would see that no scale factor changes the computed gradients, since a detached tensor is treated as a constant by autograd and does not contribute to the graph.
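A minimal sketch of that experiment, assuming a toy nn.Linear stands in for your generator (the names fake_g_loss and fake_g_loss_detached mirror your code, the rest is made up for illustration):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-in for the generator; your real model will differ
generator = nn.Linear(4, 1, bias=False)
x = torch.randn(8, 4)

def grad_after_backward(scale):
    generator.zero_grad()
    fake_g_loss = generator(x).mean()
    # The detached copy is a constant w.r.t. autograd, so scaling it
    # changes the loss *value* but cannot change the gradients
    fake_g_loss_detached = fake_g_loss.detach()
    total = fake_g_loss + scale * fake_g_loss_detached
    total.backward()
    return generator.weight.grad.abs().sum().item()

ref = grad_after_backward(1.0)
for scale in [0.0, 10.0, -5.0]:
    # identical gradients regardless of the scale factor
    assert abs(grad_after_backward(scale) - ref) < 1e-6
```

Since the model and inputs are recreated deterministically after seeding, the gradient sum is bitwise comparable across runs of the scale loop.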
I’m not sure if this would fit your use case, but if you want to use a different code path in the backward pass, you could check e.g. this approach described by @tom.