Why is 2D conv slower than equivalent 1D conv?

I have two networks: one uses 2D conv layers where one of the kernel dimensions has size 1, and the other uses the equivalent 1D conv layers.

So why does the latter run faster if they are mathematically the same?

Could you share a minimal, executable code snippet that reproduces the slowdown, as well as the output of python -m torch.utils.collect_env, please?

If you are using the GPU (and thus cuDNN for the convolutions), both calls should be dispatched to the same internal cuDNN kernel, so I'm unsure what might be causing the difference (unless some use cases were filtered out due to known issues).
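As a side note, a quick way to check which convolution backends your build can dispatch to (this is a generic sketch, not specific to the setup in this thread) is:

```python
import torch

# Which backends are available determines where the convolution is dispatched.
print(torch.cuda.is_available())             # True only with a CUDA build + visible GPU
print(torch.backends.cudnn.is_available())   # cuDNN (GPU convolutions)
print(torch.backends.mkldnn.is_available())  # oneDNN / MKL-DNN (CPU convolutions)
```

If CUDA is unavailable, the convolutions are running on a CPU code path instead of cuDNN.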

Oh, I'm running it on my MacBook's CPU.

I haven't tested this exact setup, but something like the following:

import torch.nn as nn

model = nn.Sequential(
    nn.Conv1d(1, 8, kernel_size=5, stride=1),
    nn.BatchNorm1d(8),
    nn.Dropout(0.2),
    nn.ReLU(),
    nn.MaxPool1d(kernel_size=2, stride=2),
)

vs.

model = nn.Sequential(
    nn.Conv2d(1, 8, kernel_size=(5, 1), stride=1),
    nn.BatchNorm2d(8),
    nn.Dropout(0.2),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=(2, 1), stride=2),
)

Output of python -m torch.utils.collect_env:

Collecting environment information...
PyTorch version: 1.8.0
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: macOS 12.4 (arm64)
GCC version: Could not collect
Clang version: 13.1.6 (clang-1316.0.21.2.5)
CMake version: version 3.21.3

Python version: 3.9 (64-bit runtime)
Is CUDA available: False
CUDA runtime version: No CUDA
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A

Versions of relevant libraries:
[pip3] numpy==1.21.2
[pip3] pytorch-lightning==1.4.8
[pip3] torch==1.10.2
[pip3] torchaudio==0.9.1
[pip3] torchinfo==1.7.0
[pip3] torchmetrics==0.5.1
[pip3] torchsampler==0.1.1
[pip3] torchsummary==1.5.1
[pip3] torchvision==0.11.3
[conda] numpy                     1.21.2           py39h1f3b974_0    conda-forge
[conda] pytorch                   1.8.0           cpu_py39hc766e51_1    conda-forge
[conda] pytorch-lightning         1.4.8                    pypi_0    pypi
[conda] torch                     1.10.2                   pypi_0    pypi
[conda] torchaudio                0.9.1                    pypi_0    pypi
[conda] torchinfo                 1.7.0              pyhd8ed1ab_0    conda-forge
[conda] torchmetrics              0.5.1                    pypi_0    pypi
[conda] torchsampler              0.1.1                    pypi_0    pypi
[conda] torchsummary              1.5.1                    pypi_0    pypi
[conda] torchvision               0.9.0a0+83171d6          pypi_0    pypi
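To make the comparison concrete, here is a minimal timing sketch of just the two conv layers (the input shape here is an assumption chosen for illustration); it first verifies the two ops are numerically equivalent by copying the 1D weights into the 2D layer, then times each:

```python
import torch
import torch.nn as nn
from torch.utils.benchmark import Timer

torch.manual_seed(0)
x1 = torch.randn(32, 1, 1000)  # (N, C, L) for Conv1d
x2 = x1.unsqueeze(-1)          # (N, C, L, 1) for Conv2d

conv1d = nn.Conv1d(1, 8, kernel_size=5)
conv2d = nn.Conv2d(1, 8, kernel_size=(5, 1))

# Copy the 1D weights into the 2D layer so both compute the same result.
with torch.no_grad():
    conv2d.weight.copy_(conv1d.weight.unsqueeze(-1))  # (8,1,5) -> (8,1,5,1)
    conv2d.bias.copy_(conv1d.bias)

out1 = conv1d(x1)             # (32, 8, 996)
out2 = conv2d(x2).squeeze(-1) # (32, 8, 996)
print(torch.allclose(out1, out2, atol=1e-5))

for label, model, x in [("Conv1d", conv1d, x1), ("Conv2d", conv2d, x2)]:
    t = Timer(stmt="model(x)", globals={"model": model, "x": x})
    print(label, t.timeit(100).mean)
```

torch.utils.benchmark.Timer handles warm-up and averaging, so the two mean times should be directly comparable.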

Thanks for the update. Unfortunately, I'm not familiar enough with the backend for Mac CPU workloads, so we would need to wait for experts on this architecture.

Oh, if it's Mac CPU specific, then I would consider it a non-issue. This slowdown wouldn't occur on CUDA GPUs?

It should not, but you would need to profile it in your setup (GPU, PyTorch version, CUDA + cuDNN + cuBLAS versions, etc.), as the performance depends on the environment used.
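A minimal profiling sketch for that comparison, assuming the layer sizes above and falling back to CPU if no GPU is present:

```python
import torch
import torch.nn as nn
from torch.profiler import profile, ProfilerActivity

device = "cuda" if torch.cuda.is_available() else "cpu"
activities = [ProfilerActivity.CPU]
if device == "cuda":
    activities.append(ProfilerActivity.CUDA)

conv1d = nn.Conv1d(1, 8, kernel_size=5).to(device)
conv2d = nn.Conv2d(1, 8, kernel_size=(5, 1)).to(device)
x1 = torch.randn(32, 1, 1000, device=device)
x2 = x1.unsqueeze(-1)  # trailing width-1 dim for the 2D conv

# Warm up so one-time setup cost is not attributed to either op.
for _ in range(10):
    conv1d(x1)
    conv2d(x2)

with profile(activities=activities) as prof:
    for _ in range(100):
        conv1d(x1)
        conv2d(x2)

table = prof.key_averages().table(sort_by="self_cpu_time_total", row_limit=10)
print(table)
```

The table lists the aten-level conv ops separately, so the per-op time of the 1D and 2D variants can be compared directly; on a CUDA setup, sorting by cuda_time_total instead would show the kernel times.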

Thanks! I'll get back to you if I still see the slowdown.