First of all, thank you for taking the time to help a PyTorch newbie by reading this question.
As for the question itself: I am implementing a DCGAN discriminator. Here is the code:
    import torch.nn as nn
    import torch.nn.functional as F

    class Discriminator(nn.Module):
        def __init__(self, filter_sizes, leaky_relu_alpha):
            super(Discriminator, self).__init__()
            # Network architecture
            self.conv_1 = nn.Conv2d(in_channels=img_channels, out_channels=filter_sizes,
                                    kernel_size=4, stride=2, padding=1)
            self.conv_2 = nn.Conv2d(in_channels=filter_sizes, out_channels=filter_sizes,
                                    kernel_size=4, stride=2, padding=1)
            self.conv_2_bn = nn.BatchNorm2d(filter_sizes)
            self.conv_3 = nn.Conv2d(in_channels=filter_sizes, out_channels=filter_sizes,
                                    kernel_size=4, stride=2, padding=1)
            self.conv_3_bn = nn.BatchNorm2d(filter_sizes)
            self.conv_4 = nn.Conv2d(in_channels=filter_sizes, out_channels=filter_sizes,
                                    kernel_size=4, stride=2, padding=1)
            self.conv_4_bn = nn.BatchNorm2d(filter_sizes)
            self.dense = nn.Linear(in_features=filter_sizes * (img_size // 16) * (img_size // 16),
                                   out_features=1)
            # Hyperparameters
            self.filter_sizes = filter_sizes
            self.leaky_relu_alpha = leaky_relu_alpha

        def forward(self, x):
            # Conv 1 | out: [16 x 16 x 128]
            x = self.conv_1(x)
            x = F.leaky_relu(x, self.leaky_relu_alpha)
            # Conv 2 | out: [8 x 8 x 256]
            x = self.conv_2(x)
            x = self.conv_2_bn(x)
            x = F.leaky_relu(x, self.leaky_relu_alpha)
            # Conv 3 | out: [4 x 4 x 512]
            x = self.conv_3(x)
            x = self.conv_3_bn(x)
            x = F.leaky_relu(x, self.leaky_relu_alpha)
            # Conv 4 | out: [2 x 2 x 1024]
            x = self.conv_4(x)
            x = self.conv_4_bn(x)
            x = F.leaky_relu(x, self.leaky_relu_alpha)
            # Classification layer
            x = x.view(-1, self.filter_sizes * (img_size // 16) * (img_size // 16))
            x = self.dense(x)
            x = F.sigmoid(x)
            return x
When I run it, I get the following error while evaluating the model output against the binary cross-entropy criterion (BCELoss):
ValueError: Target and input must have the same number of elements. target nelement (128) != input nelement (288)
So, to my understanding, the problem here is that I am not handling dimensionality correctly: the strided convolutions change the width and height of the tensor to unexpected values, so the flattening layer wraps the convolution output into more "samples" than the input batch_size.
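That wrapping effect can be reproduced with plain arithmetic. The channel count and the 3 x 3 actual spatial size below are hypothetical, chosen only to illustrate how `view(-1, N)` inflates the first dimension when `N` does not match the real per-sample size:

```python
# Hypothetical shapes: a batch of 128 samples whose last conv layer
# actually produces `channels` feature maps of 3 x 3, while the
# flattening step assumes a 2 x 2 spatial grid.
batch, channels = 128, 1024
actual_hw, assumed_hw = 3, 2

total_elements = batch * channels * actual_hw * actual_hw
row_width = channels * assumed_hw * assumed_hw  # the second argument of view(-1, N)

# view(-1, row_width) keeps all elements and infers the first dimension:
inferred_batch = total_elements // row_width
print(inferred_batch)  # 288 rows instead of 128 samples
```

With these (assumed) numbers the model would emit 288 predictions for 128 targets, which matches the mismatch in the error message.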
Now, I have checked the documentation on Conv2d layers and, following the output size formula given there, I believe I am reducing the width and height dimensions as expected.
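For reference, the output-size formula from the Conv2d documentation, `H_out = floor((H_in + 2*padding - kernel_size) / stride + 1)`, can be traced through the four stride-2 layers. The 32 x 32 input size below is an assumption, taken from the 16 -> 8 -> 4 -> 2 progression in the `forward` comments:

```python
def conv2d_out(size, kernel_size=4, stride=2, padding=1):
    """Output height/width of a square Conv2d, per the formula in the PyTorch docs."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 32  # assumed img_size, matching the comments in forward
for layer in range(1, 5):
    size = conv2d_out(size)
    print(f"after conv_{layer}: {size} x {size}")
# after conv_1: 16 x 16
# after conv_2: 8 x 8
# after conv_3: 4 x 4
# after conv_4: 2 x 2
```

If the real images were a different size than this assumed 32, the final spatial grid would no longer be `img_size // 16` squared.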
Also, when I check the input dimensions, the tensor's 0th dimension is correctly 128. The error happens at the very first batch and iteration of training, so this does not seem to be the problem described in this post.
Any direction in this matter would be highly appreciated.
Have a nice start of the week, my good folk!