I want to calculate the gradient once and use that same gradient to minimize the loss for one part of a network and maximize it for another part of the same network (a kind of adversarial setup). For me, the ideal case would be two optimizers, each responsible for one part of the network/model, with one of them using a negative learning rate. But it seems that PyTorch does not allow a negative learning rate.
In this case what I am doing is:
loss.backward()
optimizer_for_one_part_of_the_model.step()
The problem is that when I calculate the gradient again for the other part, it will not be the same (the sign is flipped, of course, but the values also differ), because some weights of the same network (same computation graph) have already been changed. Ideally, I want to use the sign-flipped version of the previous gradient.
How can I achieve this?
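One possible approach, sketched below under stated assumptions: run a single backward pass, negate the gradients of the part that should be maximized in place, and only then step both optimizers. The two-part model (`part_a`, `part_b`), the optimizer names (`opt_min`, `opt_max`), the random data, and the choice of SGD with MSE loss are all hypothetical and only serve as illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical two-part model: part_a should minimize the loss,
# part_b should maximize it.
part_a = nn.Linear(4, 8)
part_b = nn.Linear(8, 1)

# One optimizer per part, both with ordinary positive learning rates.
lr = 0.1
opt_min = torch.optim.SGD(part_a.parameters(), lr=lr)
opt_max = torch.optim.SGD(part_b.parameters(), lr=lr)

# Random stand-in data (assumption, for illustration only).
x = torch.randn(16, 4)
y = torch.randn(16, 1)

opt_min.zero_grad()
opt_max.zero_grad()

loss = nn.functional.mse_loss(part_b(torch.relu(part_a(x))), y)
loss.backward()  # single backward pass: every gradient uses the same weights

# Snapshots kept only so the effect can be checked afterwards.
grads_before = [p.grad.clone() for p in part_b.parameters()]
weights_before = [p.detach().clone() for p in part_b.parameters()]

# Flip the sign of the gradients for the part that should ascend.
for p in part_b.parameters():
    if p.grad is not None:
        p.grad.neg_()

# Step both optimizers only after all gradients are in their final form.
opt_min.step()  # gradient descent on part_a
opt_max.step()  # effectively gradient ascent on part_b
```

Because both `step()` calls happen after the one `backward()` call, neither part sees gradients computed from already-updated weights. As an alternative to negating gradients by hand, recent PyTorch versions expose a `maximize` flag on some optimizers such as `torch.optim.SGD`, which may have the same effect for the ascending part.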