How to modify GAN dimensions?

I’m training a Wasserstein GAN with Gradient Penalty (WGAN-GP), but I’m running into a shape mismatch between my real and fake tensors.

My real data has shape torch.Size([50, 1, 50, 300]). The issue is the conversion from fake_noise (torch.Size([50, 300])) to fake (torch.Size([50, 1, 28, 28])) via fake = gen(fake_noise): the generated output does not match the shape of the real data.

cur_step = 0
generator_losses = []
critic_losses = []
for epoch in range(n_epochs):
    for real in tqdm(dataloader):
        cur_batch_size = len(real) # len = 50
        real = real.to(device)

        mean_iteration_critic_loss = 0
        for _ in range(crit_repeats):
            crit_opt.zero_grad()
            fake_noise = get_noise(cur_batch_size, z_dim, device=device)
            fake = gen(fake_noise) # len = 50
            crit_fake_pred = crit(fake.detach())
            crit_real_pred = crit(real)
            epsilon = torch.rand(len(real), 1, 1, 1, device=device, requires_grad=True)
            gradient = get_gradient(crit, real, fake.detach(), epsilon)
            gp = gradient_penalty(gradient)
            crit_loss = get_crit_loss(crit_fake_pred, crit_real_pred, gp, c_lambda)

Here is the stack trace:

     25             epsilon = torch.rand(len(real), 1, 1, 1, device=device, requires_grad=True)
---> 26             gradient = get_gradient(crit, real, fake.detach(), epsilon)
     27             gp = gradient_penalty(gradient)
     28             crit_loss = get_crit_loss(crit_fake_pred, crit_real_pred, gp, c_lambda)

<ipython-input-28-35150bdb454c> in get_gradient(crit, real, fake, epsilon)
     13     '''
     14     # Mix the images together
---> 15     mixed_images = real * epsilon + fake * (1 - epsilon)
     16 
     17     # Calculate the critic's scores on the mixed images

RuntimeError: The size of tensor a (300) must match the size of tensor b (28) at non-singleton dimension 3

This is my generator class (gen):

class Generator(nn.Module):
    def __init__(self, z_dim=10, im_chan=1, hidden_dim=300 ):
        super(Generator, self).__init__()
        self.z_dim = z_dim
        # Build the neural network
        self.gen = nn.Sequential(
            self.make_gen_block(z_dim, hidden_dim * 4),
            self.make_gen_block(hidden_dim * 4, hidden_dim * 2, kernel_size=4, stride=1),
            self.make_gen_block(hidden_dim * 2, hidden_dim),
            self.make_gen_block(hidden_dim, im_chan, kernel_size=4, final_layer=True),
        )

    def make_gen_block(self, input_channels, output_channels, kernel_size=3, stride=2, final_layer=False):
        if not final_layer:
            return nn.Sequential(
                nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride),
                nn.BatchNorm2d(output_channels),
                nn.ReLU(inplace=True),
            )
        else:
            return nn.Sequential(
                nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride),
                nn.Tanh(),
            )

    def forward(self, noise):
        x = noise.view(len(noise), self.z_dim, 1, 1)
        return self.gen(x)
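
For reference, here is my own back-of-the-envelope trace (assuming the generator was built with z_dim=300 and hidden_dim=300). With padding=0, each nn.ConvTranspose2d grows the spatial size as out = (in - 1) * stride + kernel_size, so the four blocks map the reshaped noise like this:

# [50, 300, 1, 1]    noise.view(len(noise), z_dim, 1, 1)
# [50, 1200, 3, 3]   block 1: kernel_size=3, stride=2 -> (1 - 1) * 2 + 3 = 3
# [50, 600, 6, 6]    block 2: kernel_size=4, stride=1 -> (3 - 1) * 1 + 4 = 6
# [50, 300, 13, 13]  block 3: kernel_size=3, stride=2 -> (6 - 1) * 2 + 3 = 13
# [50, 1, 28, 28]    block 4: kernel_size=4, stride=2 -> (13 - 1) * 2 + 4 = 28

which matches the [50, 1, 28, 28] output I am seeing, not the [50, 1, 50, 300] shape of my real data.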

Could you post the complete stack trace, which should show which operation raises this error?
I assume you are passing a noise input in the shape [50, 300] and expect the model to output a tensor in the shape [50, 1, 28, 28]?

I’m passing a noise input in the shape [50, 300] and expect the model to output a tensor of shape [50, 1, 50, 300] (the same shape as my real data). Currently, the model returns [50, 1, 28, 28], which is why I get the error. (Updated the stack trace!)
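
For completeness, I can reproduce the mismatch outside the training loop like this (assuming gen, get_noise, z_dim, device, and dataloader are the objects defined above):

real_batch = next(iter(dataloader)).to(device)                 # torch.Size([50, 1, 50, 300])
fake_noise = get_noise(len(real_batch), z_dim, device=device)  # torch.Size([50, 300])
fake = gen(fake_noise)
print(fake.shape)        # torch.Size([50, 1, 28, 28])
print(real_batch.shape)  # torch.Size([50, 1, 50, 300]); these must match for the interpolation in get_gradient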

If the spatial shapes of your output are wrong, I would assume that your model architecture is also wrong. Or was it working at some point before?
Given your output is smaller than expected, I guess you might need to add more transposed convolutions to the model.
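
For what it’s worth, here is a rough sketch of one block configuration that would land exactly on [50, 1, 50, 300] (assuming padding=0 and z_dim=300, hidden_dim=300 as above; with no padding each nn.ConvTranspose2d grows the spatial size as (in - 1) * stride + kernel_size, and the per-dimension kernel sizes and strides below are just one combination that satisfies this in both dimensions, not a tuned architecture):

self.gen = nn.Sequential(
    # noise reshaped to [batch, z_dim, 1, 1]
    self.make_gen_block(z_dim, hidden_dim * 4, kernel_size=(3, 5), stride=(1, 1)),                  # ->  3 x   5
    self.make_gen_block(hidden_dim * 4, hidden_dim * 2, kernel_size=(3, 6), stride=(2, 3)),         # ->  7 x  18
    self.make_gen_block(hidden_dim * 2, hidden_dim, kernel_size=(4, 6), stride=(2, 4)),             # -> 16 x  74
    self.make_gen_block(hidden_dim, im_chan, kernel_size=(5, 8), stride=(3, 4), final_layer=True),  # -> 50 x 300
)

Since make_gen_block forwards kernel_size and stride straight to nn.ConvTranspose2d, tuples like (3, 5) work without changing the block itself; a quick print(gen(get_noise(50, z_dim, device=device)).shape) should then show torch.Size([50, 1, 50, 300]).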