GAN training fails for different image normalization constants

I’m training a GAN on a single ImageNet class (~1300 images), and I’ve found that the normalization constants I pass to torchvision.transforms.Normalize have a surprisingly large effect on training.

For example, the following two images were produced by the generator after approximately the same number of epochs on the “panda” class, using mean = (0.485, 0.456, 0.406) and std = (0.229, 0.224, 0.225) for the left image, and mean = std = (0.5, 0.5, 0.5) for the right image.

[images: g_output_bad (left), g_output_good (right)]

I haven’t run the training long enough to rule out that the “left” model will eventually improve, but this is still a significant difference for what I assumed was a relatively unimportant parameter.

Here is a script which (hopefully) reproduces the error: https://gist.github.com/simopal6/c6484df00d5747dfe33f7ed67383c6fd

By default it uses the “good” normalization constants, unless you pass the --bad flag. It also needs the dataset, in a directory structure suitable for ImageFolder. You can use any ImageNet folder, though my results were obtained on the “panda” (n02510455) class. For convenience, I uploaded that class to Dropbox at this link: https://www.dropbox.com/s/7kdjz99zqlkp2jb/dataset.zip?dl=0 (I’m not sure I’m allowed to do that, but whatever).

P.S. the model is actually a Wasserstein GAN

WGAN is pretty sensitive to hyperparameters. Have you computed the mean and std over just the panda images? Those might serve better as normalization constants.

No, although I experienced the same problem when training on 40 classes. Anyway, I’ll try computing the statistics on the panda class alone and see what happens.
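For reference, this is roughly what I have in mind for the per-channel statistics (a sketch with my own helper name, not the exact code I’ll run):

```python
import torch

def channel_stats(batches):
    """Running per-channel mean/std over an iterable of NCHW image batches."""
    n, s, s2 = 0, torch.zeros(3), torch.zeros(3)
    for x in batches:
        n += x.numel() // x.size(1)      # number of pixels per channel so far
        s += x.sum(dim=(0, 2, 3))        # per-channel sum
        s2 += (x ** 2).sum(dim=(0, 2, 3))  # per-channel sum of squares
    mean = s / n
    std = (s2 / n - mean ** 2).sqrt()    # var = E[x^2] - E[x]^2
    return mean, std
```

Feeding it something like `channel_stats(x for x, _ in loader)`, with `loader` a DataLoader over the ImageFolder dataset with only ToTensor() applied, should give class-specific constants.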

That’s interesting. I’d guess the std difference is playing a more important role than the mean.

Also, in G, did you try using an FC layer at the beginning to increase the size of z? Putting a ConvTranspose first might be too “local”.
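Something along these lines (just a sketch; `GHead` and the sizes are made up for illustration, not taken from your script):

```python
import torch
import torch.nn as nn

class GHead(nn.Module):
    """Hypothetical generator front-end: a fully connected layer mixes the
    latent vector globally before the transposed-conv stack takes over."""
    def __init__(self, nz=100, ngf=64):
        super().__init__()
        self.ngf = ngf
        self.fc = nn.Linear(nz, ngf * 8 * 4 * 4)   # global mixing of z

    def forward(self, z):                           # z: (N, nz)
        # Reshape the FC output into a 4x4 feature map for ConvTranspose2d layers.
        return self.fc(z).view(-1, self.ngf * 8, 4, 4)

z = torch.randn(16, 100)
print(GHead()(z).shape)  # torch.Size([16, 512, 4, 4])
```

The idea is that every spatial position of the first feature map then depends on all of z, instead of only on a local slice of a ConvTranspose kernel.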

I used the models from the original WGAN code: https://github.com/martinarjovsky/WassersteinGAN/blob/master/models/dcgan.py.

Here are two more example outputs after 7500 training epochs:

[images: g_output_bad (left), g_output_good (right)]

So after changing a few things (adding a SLOGAN single-sided Lipschitz objective penalty, swapping BatchNorm for InstanceNorm, and so on), it would seem that the normalization matters a lot less. This is with bad=True after ~550 iterations.

Best regards

Thomas

[image: g_output_bad]


Wow, thanks Thomas! Do you have any particular explanation of why those changes would be less affected by normalization?

By the way, could you link the SLOGAN paper for me? I’ve been searching for it, but I only get results for the word “slogan”…

Hello Simone,

Sure, SLOGAN is introduced here.

Here is your code with small adaptations (probably not all of them are needed; I just fiddled around a bit, and e.g. I don’t use argparse as I’m on Jupyter, sorry):

Best regards

Thomas

Thanks a lot, that was very helpful!