Getting Error in WGAN-GP

Sumera_Rounaq · November 18, 2021, 10:40am

Below is the code for the Generator part of my WGAN. It produces a tensor of shape [50, 1, 28, 28], but I want [50, 1, 100, 100]. I cannot work out why this happens or how to fix it.

I would also like to debug the network from the inside to understand how the values of “hidden_dim” flow through it. Any help would be highly appreciated.

Note: I have already resized the images to 100 × 100, and now I want the generator to produce 100 × 100 images.

class Generator(nn.Module):
    '''
    Generator Class
    Values:
        z_dim: the dimension of the noise vector, a scalar
        im_chan: the number of channels in the images, fitted for the dataset used, a scalar
              (MNIST is black-and-white, so 1 channel is your default)
        hidden_dim: the inner dimension, a scalar
    '''
    def __init__(self, z_dim=64, im_chan=1, hidden_dim=64):
        super(Generator, self).__init__()
        self.z_dim = z_dim
        # Build the neural network
        self.gen = nn.Sequential(
            self.make_gen_block(z_dim, hidden_dim * 4), 
            self.make_gen_block(hidden_dim * 4, hidden_dim * 2, kernel_size=4, stride=1),
            self.make_gen_block(hidden_dim * 2, hidden_dim),
            self.make_gen_block(hidden_dim, im_chan, kernel_size=4, final_layer=True),
        )

    def make_gen_block(self, input_channels, output_channels, kernel_size=3, stride=2, final_layer=False):
        '''
        Function to return a sequence of operations corresponding to a generator block of DCGAN;
        a transposed convolution, a batchnorm (except in the final layer), and an activation.
        Parameters:
            input_channels: how many channels the input feature representation has
            output_channels: how many channels the output feature representation should have
            kernel_size: the size of each convolutional filter, equivalent to (kernel_size, kernel_size)
            stride: the stride of the convolution
            final_layer: a boolean, true if it is the final layer and false otherwise 
                      (affects activation and batchnorm)
        '''
        if not final_layer:
            return nn.Sequential(
                nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride),
                nn.BatchNorm2d(output_channels),
                nn.ReLU(inplace=True),
            )
        else:
            return nn.Sequential(
                nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride),
                nn.Tanh(),
            )

    def forward(self, noise):
        '''
        Function for completing a forward pass of the generator: Given a noise tensor,
        returns generated images.
        Parameters:
            noise: a noise tensor with dimensions (n_samples, z_dim)
        '''
        x = noise.view(len(noise), self.z_dim, 1, 1)
        return self.gen(x)

ZimoNitrome · November 18, 2021, 1:39pm

I am not exactly sure what you mean, but if you want to debug the shape (and hidden dim) of the tensor between each block, I like to use a simple “PrintBlock”:

class PrintBlock(nn.Module):
    def forward(self, x):
        print(x.shape)
        return x

Putting it into your code:

class Generator(nn.Module):
    '''
    Generator Class
    Values:
        z_dim: the dimension of the noise vector, a scalar
        im_chan: the number of channels in the images, fitted for the dataset used, a scalar
              (MNIST is black-and-white, so 1 channel is your default)
        hidden_dim: the inner dimension, a scalar
    '''
    def __init__(self, z_dim=64, im_chan=1, hidden_dim=64):
        super(Generator, self).__init__()
        self.z_dim = z_dim
        # Build the neural network
        self.gen = nn.Sequential(
            PrintBlock(),
            self.make_gen_block(z_dim, hidden_dim * 4), 
            PrintBlock(),
            self.make_gen_block(hidden_dim * 4, hidden_dim * 2, kernel_size=4, stride=1),
            PrintBlock(),
            self.make_gen_block(hidden_dim * 2, hidden_dim),
            PrintBlock(),
            self.make_gen_block(hidden_dim, im_chan, kernel_size=4, final_layer=True),
            PrintBlock(),
        )

    ...

Assuming you are using a noise vector of shape [50, 64]:

net = Generator()
out = net(torch.rand([50, 64]))  # call the module itself rather than .forward()

prints:

torch.Size([50, 64, 1, 1])
torch.Size([50, 256, 3, 3])
torch.Size([50, 128, 6, 6])
torch.Size([50, 64, 13, 13])
torch.Size([50, 1, 28, 28])
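These spatial sizes follow from the output-size formula documented for nn.ConvTranspose2d: out = (in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1. As a rough check, the plain-Python sketch below applies that formula to the kernel sizes and strides used in the generator above (padding 0, dilation 1, output_padding 0, which are the layer defaults). The helper name conv_transpose_out is made up for this illustration and is not part of PyTorch.

```python
def conv_transpose_out(size, kernel_size, stride=1, padding=0,
                       output_padding=0, dilation=1):
    # Output-size formula from the nn.ConvTranspose2d documentation,
    # applied to one spatial dimension. Hypothetical helper, not a PyTorch API.
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# (kernel_size, stride) of the four blocks in the generator above
layers = [(3, 2), (4, 1), (3, 2), (4, 2)]

size = 1  # the noise is reshaped to a 1 x 1 feature map
sizes = []
for kernel_size, stride in layers:
    size = conv_transpose_out(size, kernel_size, stride)
    sizes.append(size)

print(sizes)  # [3, 6, 13, 28]
```

Changing the entries of layers and rerunning is a quick way to try out kernel and stride combinations before touching the model.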

Maybe from here you can understand how to change the convolution kernels, strides, and padding to get the desired output shape.
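As one possible way to reach 100 × 100, here is a minimal sketch that uses five blocks instead of four, with kernel sizes and strides chosen so the spatial size goes 1 → 5 → 11 → 24 → 49 → 100. It is only one of many combinations that work, and it has not been tuned for training quality. The names Generator100 and gen_block, the extra hidden_dim * 8 layer, and the specific kernel sizes are assumptions made for this illustration; they do not come from the original course code.

```python
import torch
from torch import nn


def gen_block(in_ch, out_ch, kernel_size, stride, final_layer=False):
    # Same layout as make_gen_block in the question: a transposed convolution,
    # followed by BatchNorm + ReLU, or by Tanh alone in the final layer.
    layers = [nn.ConvTranspose2d(in_ch, out_ch, kernel_size, stride)]
    if final_layer:
        layers.append(nn.Tanh())
    else:
        layers += [nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True)]
    return nn.Sequential(*layers)


class Generator100(nn.Module):
    # Hypothetical variant of the Generator above that outputs 100 x 100 images.
    def __init__(self, z_dim=64, im_chan=1, hidden_dim=64):
        super().__init__()
        self.z_dim = z_dim
        self.gen = nn.Sequential(
            gen_block(z_dim, hidden_dim * 8, kernel_size=5, stride=1),           # 1 -> 5
            gen_block(hidden_dim * 8, hidden_dim * 4, kernel_size=3, stride=2),  # 5 -> 11
            gen_block(hidden_dim * 4, hidden_dim * 2, kernel_size=4, stride=2),  # 11 -> 24
            gen_block(hidden_dim * 2, hidden_dim, kernel_size=3, stride=2),      # 24 -> 49
            gen_block(hidden_dim, im_chan, kernel_size=4, stride=2,
                      final_layer=True),                                         # 49 -> 100
        )

    def forward(self, noise):
        # Reshape (n_samples, z_dim) noise into a 1 x 1 feature map
        x = noise.view(len(noise), self.z_dim, 1, 1)
        return self.gen(x)


out = Generator100()(torch.randn(50, 64))
print(out.shape)  # torch.Size([50, 1, 100, 100])
```

The critic is not shown in this thread, so whether it also needs changes to accept 100 × 100 inputs is something to check separately.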

Sumera_Rounaq · November 19, 2021, 9:13am

Thank you so much. This is 100% what I wanted.