Conv3D runs very slow in fp16 and bf16

There seems to be an issue for this: "Significant Memory Regression in `F.conv3d` with `bfloat16` Inputs in `PyTorch 2.9.0`" (pytorch/pytorch issue #166643).

This comment seems to have a fix (installing cuDNN 9.15+ via pip).
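To check whether the upgrade actually took effect, something like the sketch below might help. It decodes the integer that `torch.backends.cudnn.version()` reports and compares it against 9.15. The helper names `decode_cudnn_version` and `cudnn_at_least` are made up for illustration, and the pip package name in the comment is an assumption that depends on your CUDA build; neither comes from the linked issue.

```python
# Assumed install step (not quoted from the issue; the wheel name depends on
# your CUDA build, e.g. a "cu12" suffix for CUDA 12.x):
#   pip install "nvidia-cudnn-cu12>=9.15"


def decode_cudnn_version(v):
    """Turn the integer cuDNN version into a (major, minor, patch) tuple.

    Hypothetical helper. cuDNN 9+ encodes major*10000 + minor*100 + patch
    (e.g. 91500 for 9.15.0); cuDNN 8 and earlier used major*1000 + minor*100
    + patch (e.g. 8902 for 8.9.2).
    """
    if v >= 90000:
        return v // 10000, (v % 10000) // 100, v % 100
    return v // 1000, (v % 1000) // 100, v % 100


def cudnn_at_least(v, major=9, minor=15):
    """Return True if the encoded version v is at least major.minor."""
    return decode_cudnn_version(v)[:2] >= (major, minor)


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        v = torch.backends.cudnn.version()  # None when cuDNN is unavailable
        if v is None:
            print("cuDNN is not available to this PyTorch build")
        else:
            print("cuDNN", decode_cudnn_version(v), "new enough:", cudnn_at_least(v))
```

If this still reports a version below 9.15 after the install, PyTorch is probably loading a different cuDNN than the one pip just installed.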
