I have a decoder that maps a latent space to an image:
latent_variable.shape = (batch_size=64, latent_dim=10)
I want to decode it into an image of shape (batch_size=64, channels=3, height=32, width=64)
using 1 linear layer and 7 nn.ConvTranspose2d layers.
My current code is
layer1 = nn.Linear(latent_dim, 256) # gives (batch_size, 256)
My goal is to add 7 nn.ConvTranspose2d layers after this,
but I'm confused about how to stack them so that the output ends up as (batch_size, channels=3, height=32, width=64) — especially since the height and width are different.
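For concreteness, here is one stack I sketched that seems to reach the target shape (the layer widths 256 → 128 → 64 → 32 → 16 → 8 are just my guesses, and the shape comments come from the PyTorch output-size formula). The trick I'm trying is to reshape the linear output to (256, 1, 1), use one layer with an asymmetric stride to make the width twice the height, then double both dimensions five times:

```python
import torch
import torch.nn as nn

latent_dim = 10

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 256)  # (B, 10) -> (B, 256)
        self.deconv = nn.Sequential(
            # input (after reshape in forward): (256, 1, 1)
            # layer 1: grow width only -> (256, 1, 2)
            nn.ConvTranspose2d(256, 256, kernel_size=(3, 4), stride=(1, 2), padding=1),
            nn.ReLU(),
            # layers 2-6: kernel 4, stride 2, padding 1 doubles H and W each time
            nn.ConvTranspose2d(256, 128, kernel_size=4, stride=2, padding=1),  # (128, 2, 4)
            nn.ReLU(),
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=2, padding=1),   # (64, 4, 8)
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1),    # (32, 8, 16)
            nn.ReLU(),
            nn.ConvTranspose2d(32, 16, kernel_size=4, stride=2, padding=1),    # (16, 16, 32)
            nn.ReLU(),
            nn.ConvTranspose2d(16, 8, kernel_size=4, stride=2, padding=1),     # (8, 32, 64)
            nn.ReLU(),
            # layer 7: size-preserving, maps to 3 channels -> (3, 32, 64)
            nn.ConvTranspose2d(8, 3, kernel_size=3, stride=1, padding=1),
        )

    def forward(self, z):
        x = self.fc(z).view(-1, 256, 1, 1)  # (B, 256) -> (B, 256, 1, 1)
        return self.deconv(x)

z = torch.randn(64, latent_dim)
print(Decoder()(z).shape)  # torch.Size([64, 3, 32, 64])
```

Is this a reasonable way to structure it, or is there a more standard approach?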
Can someone help me clarify how nn.ConvTranspose2d works?
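As far as I understand from the PyTorch docs, the output size of nn.ConvTranspose2d per spatial dimension is `(H_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1`. A tiny helper to check my layer choices (the function name is my own):

```python
def convtranspose2d_out(size, kernel, stride=1, padding=0, output_padding=0, dilation=1):
    # output-size formula from the PyTorch nn.ConvTranspose2d docs
    return (size - 1) * stride - 2 * padding + dilation * (kernel - 1) + output_padding + 1

# kernel 4, stride 2, padding 1 exactly doubles the size
print(convtranspose2d_out(16, kernel=4, stride=2, padding=1))  # 32
# kernel 3, stride 1, padding 1 keeps the size
print(convtranspose2d_out(32, kernel=3, stride=1, padding=1))  # 32
```

Does that match how people usually reason about these layers?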