PyTorch convolutions

Hi,

We know that sequentially applying a 1x3 and a 3x1 conv is ~equivalent to a 3x3 conv, but much faster (fewer computations). So I wonder: does PyTorch optimize a 3x3 conv into 1x3 + 3x1 under the hood? Or how can I use this factorization in the most PyTorch-like way?
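For reference, this is the factorized pattern I have in mind, written as two stacked `Conv2d` layers (channel counts here are made up):

```python
import torch
import torch.nn as nn

# Factorized "3x3" block: a 1x3 conv followed by a 3x1 conv.
# Padding is split per axis so the spatial size is preserved.
sep = nn.Sequential(
    nn.Conv2d(16, 16, kernel_size=(1, 3), padding=(0, 1)),
    nn.Conv2d(16, 16, kernel_size=(3, 1), padding=(1, 0)),
)

x = torch.randn(2, 16, 8, 8)
print(sep(x).shape)  # torch.Size([2, 16, 8, 8])
```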

Thanks!

that sequential applying 1x3 and 3x1 conv is ~equivalent to 3x3 conv,

This is only true if the 3x3 kernel has rank 1, i.e. it can be written as the outer product of a 3x1 column vector and a 1x3 row vector. Empirically, most learned conv kernels have full rank, so the equivalence does not hold in practice, and PyTorch cannot apply this rewrite under the hood without changing the network's output. If you want the factorized form, you have to build it explicitly as two stacked convolutions (and train it that way).
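To see the rank-1 condition concretely, here is a sketch (with a made-up input and kernel) that constructs a rank-1 3x3 kernel as an outer product and checks that the single 3x3 conv matches the 3x1 + 1x3 factorization exactly:

```python
import torch
import torch.nn.functional as F

# Rank-1 3x3 kernel: outer product of a 3x1 column and a 1x3 row.
col = torch.randn(3, 1)
row = torch.randn(1, 3)
k3x3 = (col @ row).view(1, 1, 3, 3)

x = torch.randn(1, 1, 8, 8)

# Single 3x3 convolution with the rank-1 kernel.
y_full = F.conv2d(x, k3x3, padding=1)

# Factorized version: 3x1 conv, then 1x3 conv, padding split per axis.
y_sep = F.conv2d(x, col.view(1, 1, 3, 1), padding=(1, 0))
y_sep = F.conv2d(y_sep, row.view(1, 1, 1, 3), padding=(0, 1))

print(torch.allclose(y_full, y_sep, atol=1e-6))  # True
```

For a full-rank kernel the two results diverge, which is exactly why the rewrite is not safe in general.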