Help with conditional GAN



I am new to PyTorch. I have been hacking the GAN example to adapt it for image inpainting on MS COCO (downsampled to 64x64). I first just wanted to get the code working on the version of MS COCO I have. Running the code as-is produces sharp, albeit structureless, images.

First, I wanted to condition the generator on the unmasked part of the image (the central 32x32 pixels are masked). To do this I generated an embedding of the masked image using a pretrained VGG network.

Q. I load the model from torchvision with ‘features’, as I want to use conv7.
Pseudo code:
vggnet = models.vgg19(pretrained=True).features
conditioning = vggnet(masked_image)
conditioning = conditioning.view(conditioning.size(0), -1)

Then I concatenate the noise vector and the flattened conditioning along the feature dimension and feed the result to the generator. Is this the correct/best way to do this?
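Concretely, the concatenation step I have in mind looks like this (shapes are illustrative: nz=100 is the noise dimension I picked, and 512*2*2 is the flattened size that VGG19's ‘features’ module produces for a 64x64 input, since five max-pools take 64 down to 2):

```python
import torch

batch_size, nz = 8, 100              # nz: noise dimension (my choice)
cond_dim = 512 * 2 * 2               # VGG19 'features' on a 64x64 input:
                                     # 512 channels on a 2x2 map, flattened
noise = torch.randn(batch_size, nz)
conditioning = torch.randn(batch_size, cond_dim)  # stand-in for the VGG embedding

# Concatenate noise and conditioning along the feature dimension (dim=1)
gen_input = torch.cat([noise, conditioning], dim=1)
# gen_input now has shape (batch_size, nz + cond_dim)
```

(In my actual code `conditioning` is the flattened VGG output from above, not random values.)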

Q. I also want to augment the generator's loss. The approach I am following adds an L1 loss to the generator objective between the masked generator output and the masked image (referred to as the context loss in the paper).
To do this I produced a 3x64x64 mask in numpy, moved it to a torch Variable, and simply multiplied it element-wise by the generator's output (using * for element-wise multiplication). I then use the L1Loss criterion to compute the loss.
Again, is this the correct way to achieve this?
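A minimal sketch of what I mean (the tensors are stand-ins for my real batches and generator output; the mask is 1 on the known border and 0 on the central 32x32 hole):

```python
import torch
import torch.nn as nn

# 3x64x64 binary mask: 1 on the known region, 0 on the central 32x32 hole
mask = torch.ones(3, 64, 64)
mask[:, 16:48, 16:48] = 0

real = torch.rand(8, 3, 64, 64)   # stand-in for a batch of real images
fake = torch.rand(8, 3, 64, 64)   # stand-in for the generator output

# L1 ("context") loss computed only over the unmasked (known) region;
# the mask zeroes out the hole in both tensors before the comparison
criterion = nn.L1Loss()
context_loss = criterion(fake * mask, real * mask)
```

Note that the masked-out pixels contribute zero to the sum but still count toward L1Loss's mean, so the loss is effectively scaled down by the mask's fill ratio; I am not sure whether that matters in practice.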

All the code seems to run, but since the images still all look like modern art, it is hard for me to say whether any of this is correct.

P.S. I noticed that I can’t use a 64x64 mask, as it will not automatically broadcast the way it would in NumPy. Is that expected behaviour, or did I make a mistake somewhere?
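For reference, the workaround I am using in the meantime is to expand the 2D mask to the full NxCxHxW shape explicitly instead of relying on implicit broadcasting (tensor names here are stand-ins):

```python
import torch

# 64x64 binary mask with the central 32x32 hole zeroed out
mask2d = torch.ones(64, 64)
mask2d[16:48, 16:48] = 0

fake = torch.rand(8, 3, 64, 64)   # stand-in for the generator output

# Add singleton batch and channel dims, then expand to the target shape.
# expand() creates a view, so no data is copied.
mask4d = mask2d.unsqueeze(0).unsqueeze(0).expand_as(fake)
masked = fake * mask4d
```

This works regardless of whether the installed PyTorch version supports NumPy-style broadcasting.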