Hello!
I am new to PyTorch and am currently building my first ever GAN network.
While choosing the proper layer architecture I noticed some behavior that I cannot understand, and I am really interested in the inner workings of this function.
Let’s say, that I have got a batch of data looking like this:
import torch
import torch.nn as nn

input_data = torch.randn(64, 100, 1, 1)
As I understand it, this can be interpreted as a set of 64 grayscale pictures, each 100 pixels high and 1 pixel wide.
I want to use this random data to generate some images, so I feed it into a network whose first layer is a ConvTranspose2d. This layer requires both the input size and the output size to be specified. That seems strange to me, because I thought the size of an output image was determined by the size of the input image, the stride, and the padding. Yet the layer can fit the output into whatever size is given.
Example 1:
conv = nn.ConvTranspose2d(100, 512, 4, 1, 0)
output = conv(input_data)
print(output.size())
torch.Size([64, 512, 4, 4])
Example 2:
conv = nn.ConvTranspose2d(100, 3157, 4, 1, 0)
output = conv(input_data)
print(output.size())
torch.Size([64, 3157, 4, 4])
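For reference, here is a small sketch checking the spatial-size formula from the ConvTranspose2d documentation against the outputs above (the helper function name is my own):

```python
import torch
import torch.nn as nn

def transposed_out_size(size_in, kernel, stride=1, padding=0,
                        dilation=1, output_padding=0):
    # Spatial-size formula from the ConvTranspose2d documentation
    return ((size_in - 1) * stride - 2 * padding
            + dilation * (kernel - 1) + output_padding + 1)

# With a 1x1 input, kernel=4, stride=1, padding=0 the spatial size becomes 4
print(transposed_out_size(1, 4, stride=1, padding=0))  # 4

conv = nn.ConvTranspose2d(100, 512, 4, 1, 0)
out = conv(torch.randn(64, 100, 1, 1))
print(out.size())  # torch.Size([64, 512, 4, 4])
```

So the 4x4 spatial size in both examples follows from the formula; only the second dimension (512 vs. 3157) changes with the layer's arguments.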
I probably do not understand some of the transposed-convolution layer's mechanics, but I find it very interesting, and I wonder whether someone knows a simple answer to the question:
How does ConvTranspose2d fit the output into an arbitrary number of output channels?
I would be grateful for your help!