(Python) DCGAN only outputs white pixels

Hey! I’m learning how to use PyTorch and GANs, so I followed PyTorch’s tutorial to create a DCGAN.

However, I wanted to use my own dataset, which I created based on Keras’ mnist.load_data() function, which returns X_train with shape (n_samples, height, width) and y_train with shape (n_samples,). So I used neither torchvision.datasets.ImageFolder nor torch.utils.data.DataLoader.

My dataset has shape (2092, 3, 28, 28), and to normalize it I did dataset = dataset / 127.5 - 1 so all pixel values would be between -1 and 1.
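For reference, the preprocessing looks roughly like this (a sketch: load_my_data is a hypothetical stand-in for my custom loader, which mimics Keras’ mnist.load_data()):

import numpy as np
import torch

# X_train: uint8 array of shape (2092, 3, 28, 28), pixel values in [0, 255]
X_train = load_my_data()                              # hypothetical loader, not from the tutorial
dataset = X_train.astype(np.float32) / 127.5 - 1.0    # scale pixels to [-1, 1]
dataset = torch.from_numpy(dataset)                   # float tensor, ready for .to(device)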

I also made some modifications to the networks so that, instead of working with 64x64 images, they work with 28x28:

# Size of feature maps in generator
ngf = 28
# Size of feature maps in discriminator
ndf = 28

class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 8, 7, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 7 x 7
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 14 x 14
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 28 x 28
            nn.ConvTranspose2d(ngf * 2, ngf, 1, 1, 0, bias=False),
            nn.BatchNorm2d(ngf),
            nn.ReLU(True),
            # state size. (ngf) x 28 x 28
            nn.ConvTranspose2d(ngf, nc, 1, 1, 0, bias=False),
            nn.Tanh()
            # state size. (nc) x 28 x 28
        )

    def forward(self, input):
        return self.main(input)

class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 28 x 28
            nn.Conv2d(nc, ndf, 10, 1, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Dropout(0.4, inplace=False),
            # state size. (ndf) x 21 x 21
            nn.Conv2d(ndf, ndf * 2, 10, 1, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Dropout(0.4, inplace=False),
            # state size. (ndf*2) x 14 x 14
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Dropout(0.4, inplace=False),
            # state size. (ndf*4) x 7 x 7
            nn.Conv2d(ndf * 4, ndf * 8, 3, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Dropout(0.4, inplace=False),
            # state size. (ndf*8) x 4 x 4
            nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False), # This returns (1,1,1)
            nn.Sigmoid()
        )

    def forward(self, input):
        return self.main(input)
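To sanity-check the shape comments above, here is a quick dummy forward pass (assuming nz = 100, the tutorial default, and nc = 3 since my images have 3 channels):

import torch

nz = 100   # latent vector size (tutorial default)
nc = 3     # number of channels in my images

netG = Generator(ngpu=1)
netD = Discriminator(ngpu=1)

z = torch.randn(4, nz, 1, 1)   # batch of 4 latent vectors
fake = netG(z)
print(fake.shape)              # expected: torch.Size([4, 3, 28, 28])
print(netD(fake).shape)        # expected: torch.Size([4, 1, 1, 1])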

I also made some edits to the training loop:

for epoch in range(num_epochs):
    netD.zero_grad()
    # Format batch: sample a random batch straight from the (already normalized) dataset tensor
    b_size = batch_size
    real_cpu = dataset[np.random.randint(0, dataset.shape[0], size=batch_size), :, :, :].to(device)
    # Forward pass real batch through D
    output = netD(real_cpu).view(-1)
    label = torch.full((real_cpu.shape[0],), real_label, dtype=torch.float, device=device)
    # Calculate loss on all-real batch
    errD_real = criterion(output, label)
    # Calculate gradients for D in backward pass
    errD_real.backward()
    D_x = output.mean().item()

    ## Train with all-fake batch
    # Generate batch of latent vectors
    noise = torch.randn(b_size, nz, 1, 1, device=device)
    # Generate fake image batch with G
    [...]

The rest of the code remains the same as in the tutorial (including the rest of the training loop after the […]).

However, whenever I plot the generated images, I get only some white squares. I printed the Discriminator output, labels_D, and the Generator output: the Generator was producing only tensors with pixel values hovering around zero (on the order of ±0.03):

tensor([[[-2.9896e-02, -4.0004e-02, -1.5176e-02,  ..., -6.8201e-02,  5.9256e-03, -3.3405e-02],
         [-4.2447e-02,  1.9592e-02, -6.5596e-02,  ...,  7.0819e-02, -8.5874e-04, -2.7269e-02],
         [-2.5704e-02,  1.2020e-02,  5.5241e-02,  ..., -4.2040e-02, -1.6354e-02, -5.8591e-05],
         ...

The Discriminator and labels_D, however, seem to be fine: D outputs a tensor of shape (b_size,) where each value is correctly within the range [0, 1], while labels_D is a tensor of shape (b_size,) full of 1s.
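The checks I ran look roughly like this (labels_D is just my name for the label tensor from the loop above; fake comes from the elided part of the tutorial loop):

print(fake[0])                                                  # Generator output: values near zero
print(output.shape, output.min().item(), output.max().item())   # (b_size,), values within [0, 1]
print(label)                                                    # labels_D: shape (b_size,), all 1s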

Can someone help me fix this generator problem? I can’t see any problem in my code…

Update: I’ve modified the code a bit, but I’m still having some problems.

I took the pixels in generated_images, multiplied them by 0.5, and then added 0.5, so they would all be in the range [0, 1]. Now I’m able to get colored images… but they’re all interference-like noise:
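Concretely, the rescaling I do before plotting looks like this (fixed_noise is the fixed latent batch from the tutorial):

import matplotlib.pyplot as plt
import numpy as np

generated_images = netG(fixed_noise).detach().cpu()
# map the Tanh output from [-1, 1] back to [0, 1]
generated_images = generated_images * 0.5 + 0.5
plt.imshow(np.transpose(generated_images[0].numpy(), (1, 2, 0)))   # CHW -> HWC
plt.show()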

This is the result after 100 epochs. That isn’t many epochs, but I think the result should be at least slightly better, right?

I also modified my dataset so my images would be 64x64, then copy-pasted the Generator and Discriminator from the tutorial. I kept my training loop AND the Dropout layers in the Discriminator, though. And the result is the same.

I’ve searched a bit and learned that this could mean the model failed to converge, but how could this happen if I’m now using literally the same parameters as the tutorial? Same parameters for the structure, same for the optimizers…
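For reference, the loss and optimizer settings I copied from the tutorial (unchanged, as far as I can tell):

import torch.nn as nn
import torch.optim as optim

criterion = nn.BCELoss()
lr = 0.0002
beta1 = 0.5
optimizerD = optim.Adam(netD.parameters(), lr=lr, betas=(beta1, 0.999))
optimizerG = optim.Adam(netG.parameters(), lr=lr, betas=(beta1, 0.999))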