Increasing DCGAN convolutional layer filter size problem

I’m trying to implement a DCGAN, heavily inspired by the DCGAN example from the PyTorch GitHub repository, on a dataset of 64x64 images, but I’ve been unable to increase the number of filters in the convolutional layers of the discriminator and generator. The implementation works with a maximum of 512 filters (e.g. Conv2d(256, 512, kernel_size, stride)), but I have had no luck increasing the maximum to 1024.

The example’s discriminator, which I use in my implementation, is as follows:

nc = 3
ndf = 64
self.main = nn.Sequential(
    # input is (nc) x 64 x 64
    nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
    nn.LeakyReLU(0.2, inplace=True),
    # state size. (ndf) x 32 x 32
    nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 2),
    nn.LeakyReLU(0.2, inplace=True),
    # state size. (ndf*2) x 16 x 16
    nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 4),
    nn.LeakyReLU(0.2, inplace=True),
    # state size. (ndf*4) x 8 x 8
    nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),
    nn.BatchNorm2d(ndf * 8),
    nn.LeakyReLU(0.2, inplace=True),
    # state size. (ndf*8) x 4 x 4
    nn.Conv2d(ndf * 8, 1, 4, 1, 0, bias=False),
    nn.Sigmoid()
)

I want to change the model such that the last layer is:
nn.Conv2d(ndf*16,1,4,1,0,bias=False)

so that the final layer has 1024 filters, as opposed to 512 (and similarly for the generator). However, when I implement this, changing only that and keeping the kernel size, channels (nc) etc. constant, I get

RuntimeError: Calculated input size: (2 x 2). Kernel size: (4 x 4). Kernel size can't greater than actual input size

How do I construct a discriminator with a final convolutional layer that has 1024 output filters (and the equivalent convolutional transpose layer in the generator with 1024 input filters)? I imagine I should be able to do this without changing the inputs (batch of 64 images that are 64px by 64px).

Any help would be greatly appreciated.

The first argument to Conv2d is in_channels.
From the docs:

Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True)

The last layer in the discriminator has therefore one output channel and only one “pixel” (the shape before the last layer is [batch_size, ndf*8, 4, 4], so that a 4x4 kernel returns only one value).

If you want to change the number of filters/channels try:

... 
nn.Conv2d(ndf * 4, ndf * 16, 4, 2, 1, bias=False),
nn.BatchNorm2d(ndf * 16),
nn.LeakyReLU(0.2, inplace=True),
# state size. (ndf*16) x 4 x 4
nn.Conv2d(ndf * 16, 1, 4, 1, 0, bias=False),
nn.Sigmoid()
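
The shape arithmetic behind this can be checked without building the model. The snippet below is a minimal sketch in plain Python: `conv2d_out` is a hypothetical helper (not a PyTorch function) that applies the output-size formula from the Conv2d docs for dilation 1. The "fifth layer" case is an assumption about how the 2 x 2 input in the error message could arise; the failing code itself is not shown in the thread.

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a Conv2d layer, assuming dilation=1."""
    return (size + 2 * padding - kernel) // stride + 1


# Suggested discriminator: four stride-2 convolutions, then the 4x4 head.
size = 64
sizes = [size]
for _ in range(4):
    size = conv2d_out(size, kernel=4, stride=2, padding=1)
    sizes.append(size)

# The final Conv2d(ndf * 16, 1, 4, 1, 0) reduces the 4x4 map to one value.
head = conv2d_out(size, kernel=4, stride=1, padding=0)
print(sizes, head)  # [64, 32, 16, 8, 4] 1

# Assumed failure case: a fifth stride-2 convolution leaves a 2x2 map,
# which is smaller than the 4x4 kernel of the final layer.
too_small = conv2d_out(4, kernel=4, stride=2, padding=1)
print(too_small)  # 2
```

Widening an existing layer (ndf * 4 to ndf * 16) changes only the channel count, so the spatial sizes stay at 64, 32, 16, 8, 4 and the final 4x4 kernel still fits.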

Thanks @ptrblck so much! I thought that out_channels ought to be 2 * in_channels (and couldn’t be 4 * in_channels), so I hadn’t considered that. Does this mean that in general out_channels need only be some scalar multiple of in_channels?

Changed my discriminator and generator as per your suggestions and it works!
Really appreciate the quick, clear and super helpful reply.

Thanks again!

You are welcome!
out_channels defines the number of kernels and as long as you use “vanilla” convolutions, i.e. without grouping, you can use any number you want.

Remember that each filter uses all input channels. Grouping changes this behavior.
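
To make the point about grouping concrete, here is a small sketch in plain Python. `conv2d_weight_shape` is a hypothetical helper, not part of PyTorch; it mirrors the documented Conv2d weight layout of (out_channels, in_channels / groups, kernel, kernel) and the requirement that both channel counts be divisible by groups.

```python
def conv2d_weight_shape(in_channels, out_channels, kernel, groups=1):
    """Weight tensor shape of a Conv2d layer with a square kernel."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    # Each of the out_channels filters spans in_channels // groups input channels.
    return (out_channels, in_channels // groups, kernel, kernel)


# Without grouping, out_channels is free: 4x the input, or any other number.
print(conv2d_weight_shape(256, 1024, 4))  # (1024, 256, 4, 4)
print(conv2d_weight_shape(256, 1000, 4))  # (1000, 256, 4, 4)

# With groups=2, each filter only sees half of the input channels.
print(conv2d_weight_shape(256, 1024, 4, groups=2))  # (1024, 128, 4, 4)
```

So out_channels does not have to be a multiple of in_channels at all for a plain convolution; the divisibility constraint only appears once groups is larger than 1.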


I tried to follow your advice.

I changed the Discriminator without any problem :slight_smile:

class Discriminator(nn.Module):
    def __init__(self, ngpu):
        super(Discriminator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is (nc) x 64 x 64
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf) x 32 x 32
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*2) x 16 x 16
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*4) x 8 x 8
            nn.Conv2d(ndf * 4, ndf * 16, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ndf * 16),
            nn.LeakyReLU(0.2, inplace=True),
            # state size. (ndf*16) x 4 x 4
            nn.Conv2d(ndf * 16, 1, 4, 1, 0, bias=False),
            nn.Sigmoid()
        )

and I tried to change the Generator like that :

class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d( nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d( ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d( ngf * 4, ngf * 16, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 16),
            nn.ReLU(True),
            # state size. (ngf) x 32 x 32
            nn.ConvTranspose2d( ngf * 16, 1, 4, 1, 0, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

But when I run the training code I get an error.

Also, should I change the image_size from 64 to 1024?

I’d love to have some help :slight_smile:

It looks like the input channels of an nn.ConvTranspose2d layer are set to a wrong value.

class Generator(nn.Module):
    def __init__(self, ngpu):
        super(Generator, self).__init__()
        self.ngpu = ngpu
        self.main = nn.Sequential(
            # input is Z, going into a convolution
            nn.ConvTranspose2d(nz, ngf * 8, 4, 1, 0, bias=False),
            nn.BatchNorm2d(ngf * 8),
            nn.ReLU(True),
            # state size. (ngf*8) x 4 x 4
            nn.ConvTranspose2d(ngf * 8, ngf * 4, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 4),
            nn.ReLU(True),
            # state size. (ngf*4) x 8 x 8
            nn.ConvTranspose2d(ngf * 4, ngf * 2, 4, 2, 1, bias=False),
            nn.BatchNorm2d(ngf * 2),
            nn.ReLU(True),
            # state size. (ngf*2) x 16 x 16
            nn.ConvTranspose2d(ngf * 2, ngf * 16, 4, 2, 1, bias=False),  # !!! Change in_channels to ngf*2
            nn.BatchNorm2d(ngf * 16),
            nn.ReLU(True),
            # state size. (ngf*16) x 32 x 32
            nn.ConvTranspose2d(ngf * 16, 1, 4, 1, 0, bias=False),
            nn.Tanh()
            # state size. (nc) x 64 x 64
        )

    def forward(self, x):
        x = self.main(x)
        return x

See the comment where to change it.
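
It may also be worth checking the spatial sizes of the generator, since the last layer in the posted code uses stride 1 and padding 0 rather than the stride 2 and padding 1 used by the DCGAN example. The sketch below is plain Python; `conv_transpose2d_out` is a hypothetical helper (not a PyTorch function) applying the ConvTranspose2d output-size formula from the docs, assuming dilation 1.

```python
def conv_transpose2d_out(size, kernel, stride=1, padding=0, output_padding=0):
    """Spatial output size of a ConvTranspose2d layer, assuming dilation=1."""
    return (size - 1) * stride - 2 * padding + kernel + output_padding


# The noise input is nz x 1 x 1; the first layer uses kernel 4, stride 1, padding 0.
size = conv_transpose2d_out(1, kernel=4, stride=1, padding=0)
sizes = [1, size]

# Three upsampling layers with kernel 4, stride 2, padding 1 each double the size.
for _ in range(3):
    size = conv_transpose2d_out(size, kernel=4, stride=2, padding=1)
    sizes.append(size)
print(sizes)  # [1, 4, 8, 16, 32]

# Final layer as posted: kernel 4, stride 1, padding 0.
as_posted = conv_transpose2d_out(size, kernel=4, stride=1, padding=0)
# Final layer with the DCGAN example's arguments: kernel 4, stride 2, padding 1.
example_style = conv_transpose2d_out(size, kernel=4, stride=2, padding=1)
print(as_posted, example_style)  # 35 64
```

By this arithmetic, the last layer as written would produce a 35 x 35 map with a single channel, so if the discriminator expects (nc) x 64 x 64 images, the final layer probably needs nc output channels with stride 2 and padding 1.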

Thanks a lot for your answer!

I’ve done what you said, but now I get this error when I start the training loop:

Starting Training Loop…
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
in ()
     31 noise = torch.randn(b_size, nz, 1, 1, device=device)
     32 # Generate fake image batch with G
---> 33 fake = netG(noise)
     34 label.fill_(fake_label)
     35 # Classify all fake batch with D

6 frames
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py in batch_norm(input, running_mean, running_var, weight, bias, training, momentum, eps)
   1695 return torch.batch_norm(
   1696 input, weight, bias, running_mean, running_var,
-> 1697 training, momentum, eps, torch.backends.cudnn.enabled
   1698 )
   1699

RuntimeError: running_mean should contain 128 elements not 1024

Would you advise changing any of these values?

# Number of workers for dataloader
workers = 2

# Batch size during training
batch_size = 128

# Spatial size of training images. All images will be resized to this
# size using a transformer.
image_size = 64

# Number of channels in the training images. For color images this is 3
nc = 3

# Size of z latent vector (i.e. size of generator input)
nz = 100

# Size of feature maps in generator
ngf = 64

# Size of feature maps in discriminator
ndf = 64

# Number of training epochs
num_epochs = 5

# Learning rate for optimizers
lr = 0.0002

# Beta1 hyperparam for Adam optimizers
beta1 = 0.5

# Number of GPUs available. Use 0 for CPU mode.
ngpu = 1

I tried changing “image_size” from 64 to 554 and to 1024, but this didn’t work. I also tried changing batch_size from 128 to 1024.

Thanks again for your help :slight_smile:

It seems some batchnorm layer in your generator does not use the same number of input channels as the output channels of the preceding conv layer.
E.g.:

...
nn.ConvTranspose2d( ngf * 2, ngf * 16, 4, 2, 1, bias=False),
nn.BatchNorm2d(ngf * 16),
...

Here you can see that nn.BatchNorm2d takes ngf*16 input channels, which matches the number of output channels of the preceding nn.ConvTranspose2d.
Could you check these numbers?
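
One way to check these numbers systematically is to walk through the layer list and compare each layer's expected channel count with what the previous layer produces. This is only a sketch in plain Python: `find_channel_mismatches` and the tuple format are hypothetical, and the "bad" list below is an assumed example of a mismatch, not the actual code that raised the error.

```python
def find_channel_mismatches(layers):
    """Report channel mismatches in a sequence of layer specs.

    Each spec is ("conv", in_channels, out_channels) or ("bn", num_features).
    Returns a list of (index, message) tuples; empty if everything lines up.
    """
    problems = []
    current = None  # channel count produced by the most recent conv layer
    for index, layer in enumerate(layers):
        if layer[0] == "conv":
            _, in_ch, out_ch = layer
            if current is not None and in_ch != current:
                problems.append((index, f"conv expects {in_ch} channels, got {current}"))
            current = out_ch
        elif layer[0] == "bn":
            _, features = layer
            if current is not None and features != current:
                problems.append((index, f"batchnorm has {features} features, got {current} channels"))
    return problems


nz, ngf = 100, 64
good = [
    ("conv", nz, ngf * 8), ("bn", ngf * 8),
    ("conv", ngf * 8, ngf * 4), ("bn", ngf * 4),
    ("conv", ngf * 4, ngf * 2), ("bn", ngf * 2),
    ("conv", ngf * 2, ngf * 16), ("bn", ngf * 16),
]
# Assumed mistake: one batchnorm layer sized for 1024 channels after a 128-channel conv.
bad = list(good)
bad[5] = ("bn", ngf * 16)

print(find_channel_mismatches(good))  # []
print(find_channel_mismatches(bad))
```

Each nn.BatchNorm2d should report the same number as the out_channels of the convolution right before it, and each convolution's in_channels should equal the previous convolution's out_channels.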

Thank you so much for your help, I changed that and now it works :slight_smile: