GAN training fails for different image normalization constants

I’m training a GAN on a single ImageNet class (~1300 images), and I’ve found that the normalization constants I pass to torchvision.transforms.Normalize have a surprisingly large effect on training.

For example, the following two images were produced by the generator after approximately the same number of epochs on the “panda” class, using mean = (0.485, 0.456, 0.406) and std = (0.229, 0.224, 0.225) for the left image, and mean = std = (0.5, 0.5, 0.5) for the right image.

[images: g_output_bad (left), g_output_good (right)]

I haven’t run the training long enough to rule out that the “left” model will eventually improve, but this is still a significant difference for what I assumed was a relatively unimportant parameter.

Here is a script which (hopefully) reproduces the error: https://gist.github.com/simopal6/c6484df00d5747dfe33f7ed67383c6fd

By default it uses the “good” normalization constants, unless you pass the --bad flag. It also needs the dataset, in a directory structure suitable for ImageFolder. You can use any ImageNet folder, though my results were obtained on the “panda” (n02510455) class. For convenience, I uploaded that class to Dropbox at this link: https://www.dropbox.com/s/7kdjz99zqlkp2jb/dataset.zip?dl=0 (I’m not sure I’m allowed to do that, but whatever).

P.S. the model is actually a Wasserstein GAN

WGAN is pretty sensitive to hyperparameters. Have you computed the mean and std over just the panda images? Those might serve better as normalization constants.

No, although I experienced the same problem when training on 40 classes. Anyway, I’ll try computing the statistics on the panda class alone and see what happens.
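For reference, this is roughly what I have in mind for the per-channel statistics (a sketch with my own helper name, not the exact code I’ll run):

```python
import torch

def channel_stats(batches):
    """Running per-channel mean/std over an iterable of NCHW image batches."""
    n, s, s2 = 0, torch.zeros(3), torch.zeros(3)
    for x in batches:
        n += x.numel() // x.size(1)      # number of pixels per channel so far
        s += x.sum(dim=(0, 2, 3))        # per-channel sum
        s2 += (x ** 2).sum(dim=(0, 2, 3))  # per-channel sum of squares
    mean = s / n
    std = (s2 / n - mean ** 2).sqrt()    # var = E[x^2] - E[x]^2
    return mean, std
```

Feeding it something like `channel_stats(x for x, _ in loader)`, with `loader` a DataLoader over the ImageFolder dataset with only ToTensor() applied, should give class-specific constants.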

That’s interesting. I’d guess the std difference is playing a more important role than the mean.

Also, in G, did you try using an FC layer at the beginning to increase the size of z? Putting a ConvTranspose first might be too “local”.
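Something along these lines (just a sketch; `GHead` and the sizes are made up for illustration, not taken from your script):

```python
import torch
import torch.nn as nn

class GHead(nn.Module):
    """Hypothetical generator front-end: a fully connected layer mixes the
    latent vector globally before the transposed-conv stack takes over."""
    def __init__(self, nz=100, ngf=64):
        super().__init__()
        self.ngf = ngf
        self.fc = nn.Linear(nz, ngf * 8 * 4 * 4)   # global mixing of z

    def forward(self, z):                           # z: (N, nz)
        # Reshape the FC output into a 4x4 feature map for ConvTranspose2d layers.
        return self.fc(z).view(-1, self.ngf * 8, 4, 4)

z = torch.randn(16, 100)
print(GHead()(z).shape)  # torch.Size([16, 512, 4, 4])
```

The idea is that every spatial position of the first feature map then depends on all of z, instead of only on a local slice of a ConvTranspose kernel.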

I used the models from the original WGAN code: https://github.com/martinarjovsky/WassersteinGAN/blob/master/models/dcgan.py.

Here are two more example outputs after 7500 training epochs:

[images: g_output_bad (left), g_output_good (right)]

So after changing a few things (adding a SLOGAN single-sided Lipschitz objective penalty, swapping BatchNorm for InstanceNorm, and so on), it would seem that the normalization matters a lot less. This is with bad=True after ~550 iterations.

Best regards

Thomas

[image: g_output_bad]


Wow, thanks Thomas! Do you have any particular explanation of why those changes would be less affected by normalization?

By the way, could you link the SLOGAN paper for me? I’ve been searching for it, but I only get results for the word “slogan”…

Hello Simone,

Sure, SLOGAN is introduced here.

Here is your code with small adaptations (probably not all of them are needed; I just fiddled around a bit, and e.g. I don’t use argparse as I’m on Jupyter, sorry):

Best regards

Thomas

Thanks a lot, that was very helpful!