Unrolling adversarial networks

Hi,

Has anyone implemented the GAN unrolling procedure from "Unrolled Generative Adversarial Networks" (https://arxiv.org/pdf/1611.02163v3.pdf)?

Is the point that we simulate optimisation of the discriminator for K steps, then use the loss at the unrolled D to optimise the parameters of G, backpropagating through the K simulated updates?

Does this mean we have to store a copy of D’s parameters before we start the unrolling procedure? And does it mean we end up discarding all the optimisation changes that occurred during unrolling (apart from their implicit influence on the gradient computed through the unrolling)?

Also, during the unrolled updates, do we feed D both real and fake data, or just fake data?

Note that one of the authors has a TensorFlow implementation here (https://github.com/poolio/unrolled_gan/blob/master/Unrolled%20GAN%20demo.ipynb), but it hasn’t helped my understanding very much.
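To make the question concrete, here is a minimal sketch of what I *think* the procedure looks like. Everything in it is made up for illustration (the toy linear D and G, the plain SGD inner updates, K and the learning rate), and I’m keeping D’s weights as raw tensors rather than an nn.Module so the K simulated updates stay inside the autograd graph:

```python
import torch

torch.manual_seed(0)

# Toy linear discriminator and generator as raw tensors, so the
# simulated D updates below can be written out-of-place and remain
# differentiable.
d_w = torch.randn(2, 1, requires_grad=True)   # D: R^2 -> logit
g_w = torch.randn(1, 2, requires_grad=True)   # G: R^1 -> R^2

def D(x, w):
    return x @ w

def G(z, w):
    return z @ w

bce = torch.nn.functional.binary_cross_entropy_with_logits
K, eta = 5, 0.1                                # unrolling depth, inner LR

real = torch.randn(64, 2) + 2.0                # toy "real" data
z = torch.randn(64, 1)

# 1) Snapshot D, then simulate K SGD steps on D. create_graph=True
#    keeps each update differentiable so we can backprop through it.
d_snapshot = d_w.detach().clone()
w = d_w
for _ in range(K):
    d_loss = (bce(D(real, w), torch.ones(64, 1))
              + bce(D(G(z, g_w), w), torch.zeros(64, 1)))
    (grad_w,) = torch.autograd.grad(d_loss, w, create_graph=True)
    w = w - eta * grad_w                       # out-of-place, differentiable

# 2) Update G using its loss under the *unrolled* D, backpropagating
#    through all K simulated D updates.
g_loss = bce(D(G(z, g_w), w), torch.ones(64, 1))
g_loss.backward()
with torch.no_grad():
    g_w -= eta * g_w.grad
    g_w.grad = None

# 3) Discard the simulated steps by restoring the snapshot; D would
#    then take its usual single training step separately.
with torch.no_grad():
    d_w.copy_(d_snapshot)
```

Is this roughly right? In particular, is restoring the snapshot at the end the correct way to discard the unrolled steps?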

Any help with this would be very much appreciated, cheers!

I don’t really know how unrolled GANs work, but as far as I remember they require taking gradients of functions of other gradients (i.e. double backward), and we don’t support that yet. It’s on our roadmap and we’re actively working on it, but it might take us some time.

Cool, cheers @apaszke!

This is supported now, right? How can it be done?

It is supported for most things; one notable exception is the cuDNN recurrent modules (like nn.LSTM). You can use torch.autograd.grad (with create_graph=True) in your forward pass and then call .backward just as you usually do in PyTorch.
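For example, a minimal double-backward sketch (assuming a PyTorch version with higher-order gradient support; the toy function here is made up):

```python
import torch

x = torch.randn(5, requires_grad=True)
y = (x ** 3).sum()

# First-order gradient; create_graph=True keeps the graph alive
# so we can differentiate through this gradient again.
(g,) = torch.autograd.grad(y, x, create_graph=True)   # g = 3 * x**2

# Any loss built from g can now be backpropagated as usual.
loss = g.pow(2).sum()                                  # sum of 9 * x**4
loss.backward()                                        # x.grad = 36 * x**3
print(torch.allclose(x.grad, 36 * x ** 3))             # True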

Best regards

Thomas

Does PyTorch 1.0 support double backward through recurrent modules now?