nn.ConvTranspose2d: how to get an output image with different height and width

Suppose I have a decoder that maps a latent vector to an image.

latent_variable.shape = (batch_size=64, latent_dim=10)

I want to decode it into an image of shape (batch_size=64, channels=3, height=32, width=64) using 1 linear layer followed by 7 nn.ConvTranspose2d layers.

My current code is

layer1 = nn.Linear(latent_dim, 256) # gives (batch_size, 256)

My goal is to add 7 more layers, all nn.ConvTranspose2d, after this.

But I'm confused about how to configure these layers so that the output shape comes out as (batch_size, channels=3, height=32, width=64), especially since the height and width are different.
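For reference, the PyTorch documentation gives the output size of nn.ConvTranspose2d along each spatial dimension as H_out = (H_in - 1) * stride - 2 * padding + dilation * (kernel_size - 1) + output_padding + 1, and the same for width. Below is a rough sketch in plain Python that traces this formula through a stack of 7 layers. The layer settings and the 1 x 2 starting grid are assumptions chosen for illustration, not the only way to reach 32 x 64.

```python
def conv_transpose_out(size, kernel_size, stride=1, padding=0,
                       output_padding=0, dilation=1):
    """Output length of one spatial dimension, per the nn.ConvTranspose2d docs."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Hypothetical plan: (kernel_size, stride, padding) for each of the 7 layers.
# kernel=4, stride=2, padding=1 doubles a dimension;
# kernel=3, stride=1, padding=1 leaves it unchanged.
LAYER_PLAN = [(4, 2, 1)] * 5 + [(3, 1, 1)] * 2


def trace_shapes(height, width, plan=LAYER_PLAN):
    """Return the (height, width) before the first layer and after every layer."""
    shapes = [(height, width)]
    for kernel_size, stride, padding in plan:
        height = conv_transpose_out(height, kernel_size, stride, padding)
        width = conv_transpose_out(width, kernel_size, stride, padding)
        shapes.append((height, width))
    return shapes


if __name__ == "__main__":
    # Assumed start: the 256 linear features reshaped to 128 channels on a 1 x 2 grid.
    for layer_index, shape in enumerate(trace_shapes(1, 2)):
        print(layer_index, shape)
```

In this plan the height-to-width ratio is fixed by the starting grid (1 x 2), and every layer treats both dimensions the same way, so the ratio stays 1:2 all the way to 32 x 64.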

Can someone help clarify how nn.ConvTranspose2d works?
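To make the question concrete, here is a minimal sketch of one decoder that would produce the target shape. It assumes the 256 linear features are reshaped to (128, 1, 2), since 128 * 1 * 2 = 256. The class name Decoder, the channel counts, and the ReLU activations are all illustrative choices, not a recommended architecture. Five layers with kernel_size=4, stride=2, padding=1 each double the height and width, and two layers with kernel_size=3, stride=1, padding=1 keep the size.

```python
import torch
import torch.nn as nn


class Decoder(nn.Module):
    """Hypothetical decoder: 1 linear layer followed by 7 ConvTranspose2d layers."""

    def __init__(self, latent_dim=10):
        super().__init__()
        self.fc = nn.Linear(latent_dim, 256)  # (batch, 256)

        def up(in_channels, out_channels):
            # kernel_size=4, stride=2, padding=1 doubles height and width.
            return nn.ConvTranspose2d(in_channels, out_channels,
                                      kernel_size=4, stride=2, padding=1)

        self.deconv = nn.Sequential(
            up(128, 64), nn.ReLU(),  # (batch, 64, 2, 4)
            up(64, 32), nn.ReLU(),   # (batch, 32, 4, 8)
            up(32, 32), nn.ReLU(),   # (batch, 32, 8, 16)
            up(32, 16), nn.ReLU(),   # (batch, 16, 16, 32)
            up(16, 16), nn.ReLU(),   # (batch, 16, 32, 64)
            # kernel_size=3, stride=1, padding=1 keeps height and width.
            nn.ConvTranspose2d(16, 8, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),               # (batch, 8, 32, 64)
            nn.ConvTranspose2d(8, 3, kernel_size=3, stride=1, padding=1),
        )                            # (batch, 3, 32, 64)

    def forward(self, z):
        x = self.fc(z)
        x = x.view(-1, 128, 1, 2)  # reshape to a 1 x 2 grid; sets the 1:2 aspect ratio
        return self.deconv(x)


if __name__ == "__main__":
    latent_variable = torch.randn(64, 10)
    image = Decoder(latent_dim=10)(latent_variable)
    print(image.shape)
```

An alternative would be to start from a square grid and pass tuples such as stride=(1, 2) to some layers, since nn.ConvTranspose2d accepts a separate value for height and width in kernel_size, stride, and padding.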