I am trying to figure out why my convolutional decoder is much slower than the encoder.
Here are the architectures:
ConvEncoder(
  (model): Sequential(
    (0): Conv2d(3, 32, kernel_size=(4, 4), stride=(2, 2))
    (1): ReLU()
    (2): Conv2d(32, 64, kernel_size=(4, 4), stride=(2, 2))
    (3): ReLU()
    (4): Conv2d(64, 128, kernel_size=(4, 4), stride=(2, 2))
    (5): ReLU()
    (6): Conv2d(128, 256, kernel_size=(4, 4), stride=(2, 2))
    (7): ReLU()
  )
)
ConvDecoder(
  (model): Sequential(
    (0): Linear(in_features=230, out_features=1024, bias=True)
    (1): Reshape()
    (2): ConvTranspose2d(1024, 128, kernel_size=(5, 5), stride=(2, 2))
    (3): ReLU()
    (4): ConvTranspose2d(128, 64, kernel_size=(5, 5), stride=(2, 2))
    (5): ReLU()
    (6): ConvTranspose2d(64, 32, kernel_size=(6, 6), stride=(2, 2))
    (7): ReLU()
    (8): ConvTranspose2d(32, 3, kernel_size=(6, 6), stride=(2, 2))
  )
)
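For reproducibility, the two modules can be reconstructed from the printouts roughly as follows. This is a sketch, not my exact code: `Reshape()` is a custom module in my codebase, and I am assuming here that it views the 1024-dim linear output as a (N, 1024, 1, 1) tensor so the first `ConvTranspose2d` starts from a 1x1 spatial map (with that assumption the decoder output comes out at 64x64, matching the input resolution).

```python
import torch
import torch.nn as nn

class Reshape(nn.Module):
    """Assumed behavior of the custom Reshape() module:
    (N, 1024) -> (N, 1024, 1, 1) so the decoder starts from a 1x1 map."""
    def forward(self, x):
        return x.view(x.size(0), 1024, 1, 1)

encoder = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2), nn.ReLU(),
    nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
    nn.Conv2d(64, 128, 4, stride=2), nn.ReLU(),
    nn.Conv2d(128, 256, 4, stride=2), nn.ReLU(),
)

decoder = nn.Sequential(
    nn.Linear(230, 1024),
    Reshape(),
    nn.ConvTranspose2d(1024, 128, 5, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(128, 64, 5, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(64, 32, 6, stride=2), nn.ReLU(),
    nn.ConvTranspose2d(32, 3, 6, stride=2),
)

# 64x64 input shrinks to 2x2 through the encoder; the 230-dim latent
# grows back to 64x64 through the decoder: 1 -> 5 -> 13 -> 30 -> 64.
x = torch.randn(2, 3, 64, 64)
z = torch.randn(2, 230)
print(encoder(x).shape)  # torch.Size([2, 256, 2, 2])
print(decoder(z).shape)  # torch.Size([2, 3, 64, 64])
```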
The batch shape is (50, 50, 3, 64, 64).
The encoder takes about 16 ms and the decoder about 130 ms on a GTX 1080 Ti.
I am calling torch.cuda.synchronize() before reading the timings, so asynchronous kernel launches should be accounted for.
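The timing loop looks roughly like this (a sketch, not my exact code; the helper name, warm-up count, and iteration count are illustrative):

```python
import time
import torch
import torch.nn as nn

def time_module(module, inp, n_iters=100, device="cuda"):
    """Average forward-pass time in ms, synchronizing around the timed
    region so pending GPU kernels are included in the measurement."""
    module = module.to(device)
    inp = inp.to(device)
    # warm-up iterations so one-time costs (cuDNN algorithm selection,
    # memory allocation) are not part of the measurement
    for _ in range(10):
        module(inp)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(n_iters):
        module(inp)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / n_iters * 1000.0
```

Both modules are timed with this same helper, so the measurement method should not explain the gap.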
What is the reason for this roughly 8x difference between the two passes?