How to convert to linear

The output size of conv layers is sometimes a mystery to me. There is a formula (How to calculate the output size after Conv2d in pytorch?), but my general rule of thumb, assuming stride 1, is: if the kernel size is 3, use padding of 1 to keep the size the same; if the kernel is 5, use padding of 2, and so on (padding = (kernel size - 1) / 2). And don't use even-sized kernels, since no symmetric padding keeps the size the same for them.
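
To make the rule of thumb concrete, here is a small sketch of the output-size formula from the PyTorch Conv2d documentation, applied to one spatial dimension. The helper name `conv2d_out_size` and the input size of 32 are my own choices for illustration, not part of any library.

```python
def conv2d_out_size(size, kernel, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a Conv2d layer.

    Hypothetical helper; it follows the formula in the PyTorch Conv2d docs:
    floor((size + 2*padding - dilation*(kernel - 1) - 1) / stride) + 1
    """
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# Odd kernels with padding = (kernel - 1) / 2 keep the size at stride 1.
same_k3 = conv2d_out_size(32, kernel=3, padding=1)  # 32
same_k5 = conv2d_out_size(32, kernel=5, padding=2)  # 32

# An even kernel cannot keep the size with symmetric padding:
# it ends up one short or one over.
even_p1 = conv2d_out_size(32, kernel=4, padding=1)  # 31
even_p2 = conv2d_out_size(32, kernel=4, padding=2)  # 33

# Stride 2 roughly halves the size.
strided = conv2d_out_size(32, kernel=3, stride=2, padding=1)  # 16
```

Apply the function once for height and once for width. The number of input features for the first linear layer is then channels * height * width of the last conv output.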

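You can flatten tensors using nn.Flatten. There are other ways, such as reshaping, but they might corrupt your data on the way through your network and cause unexpected problems: .reshape only reinterprets the existing element order, so using it to reorder dimensions scrambles the values instead of moving them between axes. For me it was a network (a GAN) generating checkerboard garbage at its output; replacing .reshape with .permute solved the issue.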
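
A minimal sketch of both points, with shapes I made up for illustration (batch of 4, 8 channels, 5x5 feature maps, 10 output features). The first part flattens a conv output with nn.Flatten and feeds it to nn.Linear. The second part shows how .reshape and .permute differ when the intent is to swap axes, which is one way reshaping can quietly scramble data.

```python
import torch
import torch.nn as nn

# Hypothetical conv output: (batch, channels, height, width).
batch, channels, height, width = 4, 8, 5, 5
x = torch.randn(batch, channels, height, width)

# nn.Flatten keeps dim 0 (the batch) and flattens everything after it.
flatten = nn.Flatten()
flat = flatten(x)  # shape (4, 200)

# in_features must equal channels * height * width.
linear = nn.Linear(channels * height * width, 10)
out = linear(flat)  # shape (4, 10)

# reshape vs permute when trying to swap the two axes of a small tensor.
t = torch.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
swapped = t.permute(1, 0)          # true transpose: [[0, 3], [1, 4], [2, 5]]
reshaped = t.reshape(3, 2)         # same memory order: [[0, 1], [2, 3], [4, 5]]
same = torch.equal(swapped, reshaped)  # False: shapes match, values do not
```

Both results have shape (3, 2), so nothing crashes downstream. That is why this kind of bug tends to show up only as garbage output rather than as an error.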