Backwards pass runs ~50 times slower on 0.5.0 + GPU versus 0.4.0 + CPU

It could be that your GPU really isn’t supported anymore. In any case, I would also try updating the NVIDIA drivers. Could you check the GPU usage, e.g., by running nvidia-smi in the terminal?
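If it is easier to capture this from Python (for example, to paste the output here), something like the sketch below should work. It assumes nvidia-smi is on your PATH; the helper name query_nvidia_smi is just made up for this example, and the selected query fields are only one reasonable choice.

```python
import shutil
import subprocess


def query_nvidia_smi():
    """Return nvidia-smi's CSV report as text, or None if the tool is unavailable."""
    # nvidia-smi ships with the NVIDIA driver; if it is missing, the driver
    # is probably not installed (or not on PATH).
    if shutil.which("nvidia-smi") is None:
        return None
    cmd = [
        "nvidia-smi",
        "--query-gpu=name,driver_version,utilization.gpu,memory.used",
        "--format=csv",
    ]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, timeout=30)
    except (OSError, subprocess.TimeoutExpired):
        return None
    # On failure nvidia-smi usually explains itself on stderr, so fall back to it.
    return result.stdout or result.stderr


report = query_nvidia_smi()
print(report if report is not None else "nvidia-smi not found on PATH")
```

Running this while your training loop is active in another process should show whether the GPU is actually busy and which driver version is installed.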

Your problem may also be related to this thread: "GPU utilized 99% but Cudnn not used, extremely slow".

In addition to torch.cuda.is_available() returning True, have you checked that torch.backends.cudnn.version() returns something other than None?
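To collect these checks in one place, here is a minimal sketch. The function name cudnn_report and the dictionary keys are hypothetical, chosen only for this example; the torch calls themselves are the standard ones. It also prints the device's compute capability, which may help tell whether the GPU is too old for the installed build.

```python
def cudnn_report():
    """Collect basic CUDA/cuDNN diagnostics from the installed PyTorch build."""
    try:
        import torch
    except ImportError:
        # Lets the script run (and say so) even where PyTorch is missing.
        return {"torch_installed": False}

    info = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        "cuda_build": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": torch.cuda.is_available(),
        "cudnn_enabled": torch.backends.cudnn.enabled,
        "cudnn_version": torch.backends.cudnn.version(),  # None if cuDNN is unusable
    }
    if info["cuda_available"]:
        # Device 0 is assumed here; change the index if you train on another GPU.
        info["device_name"] = torch.cuda.get_device_name(0)
        info["compute_capability"] = torch.cuda.get_device_capability(0)
    return info


diagnostics = cudnn_report()
for key, value in diagnostics.items():
    print(f"{key}: {value}")
```

If cuda_available is True but cudnn_version is None, that would match the situation described in the linked thread, where the GPU is in use but cuDNN is not.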