I have an NVIDIA RTX A5000 and an NVIDIA Titan RTX.
When using PyTorch’s native AMP, an epoch takes around 40 minutes on the Titan RTX but about 2 hours on the RTX A5000. Nothing else is changed: I run the same scripts and only switch CUDA_VISIBLE_DEVICES between 0 and 1 to select the other GPU.
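The setup described above can be sketched roughly as follows. This is not the actual training script: the toy `nn.Linear` model, the tensor shapes, and the `train_steps` helper are placeholders assumed for illustration, and the sketch falls back to the CPU with AMP disabled when no GPU is visible.

```python
import torch
from torch import nn


def train_steps(steps=3, seed=0):
    """Run a few native-AMP training steps on a toy model (hypothetical stand-in)."""
    torch.manual_seed(seed)
    use_cuda = torch.cuda.is_available()
    # With CUDA_VISIBLE_DEVICES set to a single index, the selected card
    # is always exposed to the process as "cuda:0".
    device = torch.device("cuda:0" if use_cuda else "cpu")

    model = nn.Linear(64, 8).to(device)  # placeholder model, not the real one
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    # Gradient scaling is only needed for float16 autocast on the GPU.
    scaler = torch.cuda.amp.GradScaler(enabled=use_cuda)

    x = torch.randn(32, 64, device=device)  # placeholder batch
    y = torch.randn(32, 8, device=device)

    for _ in range(steps):
        optimizer.zero_grad()
        # Forward pass and loss run under autocast (mixed precision on GPU).
        with torch.autocast(device_type=device.type, enabled=use_cuda):
            loss = nn.functional.mse_loss(model(x), y)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()

    name = torch.cuda.get_device_name(0) if use_cuda else "cpu"
    return name, loss.item()


if __name__ == "__main__":
    device_name, last_loss = train_steps()
    print(f"ran on {device_name}, last loss {last_loss:.4f}")
```

Saved under a hypothetical name such as `amp_sketch.py`, it would be run once per card, e.g. `CUDA_VISIBLE_DEVICES=0 python amp_sketch.py` and then `CUDA_VISIBLE_DEVICES=1 python amp_sketch.py`. The printed device name confirms which GPU each run actually used.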
This seems very strange to me. What am I missing?