I have an input image of size 128 x 96 which is fed to the net, and after a set of layers I want to output an image of the same size.
My last layer is currently ConvAct, and its output tensor has shape (10, 1, 96, 128). To get a linear layer that returns a 128*96 image, what input size should I use?
Would you like to return the output image shape directly from the linear layer, or would you like to reshape the output to
[batch_size, 1, 128, 96]?
You could use conv layers only, which can return the desired shape.
If you really want to use a linear layer, you could use:
import torch
import torch.nn as nn

lin = nn.Linear(1*96*128, 1*128*96)
act = torch.randn(10, 1, 96, 128) # your activation in the forward pass
out = lin(act.view(act.size(0), -1)) # flatten to [batch_size, features] before the linear layer
out = out.view(out.size(0), 1, 128, 96)
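As a sketch of the conv-only alternative mentioned above: a 3x3 conv with padding=1 keeps the spatial size unchanged, so no flattening or reshaping is needed. The layer sizes here are illustrative assumptions, not your actual architecture.

```python
import torch
import torch.nn as nn

# Hypothetical example: a 3x3 conv with padding=1 preserves the 96x128
# spatial dimensions, so the output already has the desired image shape.
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, padding=1)
act = torch.randn(10, 1, 96, 128)  # same activation shape as in your forward pass
out = conv(act)
print(out.shape)  # torch.Size([10, 1, 96, 128])
```

This avoids the large 12288x12288 weight matrix a linear layer would need for these shapes.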
Could you explain why the spatial dimensions are transposed? Is this on purpose or a typo/error?