PyTorch FFT on multiple GPUs

I am using the PyTorch functions torch.rfft() and torch.irfft() inside the forward pass of a model. It runs fine on a single GPU. However, when I train the model on multiple GPUs, it fails with the following error:

RuntimeError: cuFFT error: CUFFT_INTERNAL_ERROR
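A minimal sketch of this kind of setup (illustrative only, not the original poster's actual model; the module, tensor shapes, and DataParallel wrapping are assumptions, and it uses the legacy torch.rfft/torch.irfft API from PyTorch versions before 1.8):

```python
import torch
import torch.nn as nn

class FFTModel(nn.Module):
    """Toy module that does an FFT round-trip in forward()."""
    def forward(self, x):
        # Legacy API (PyTorch < 1.8): 2-D real-to-complex FFT and its inverse
        freq = torch.rfft(x, signal_ndim=2, onesided=True)
        return torch.irfft(freq, signal_ndim=2, onesided=True,
                           signal_sizes=x.shape[-2:])

model = nn.DataParallel(FFTModel()).cuda()     # replicate across all visible GPUs
x = torch.randn(8, 3, 64, 64, device="cuda")   # batch gets split across devices
out = model(x)                                 # cuFFT error reported here on multi-GPU setups
print(out.shape)
```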

Does anybody have an intuition as to why this is the case? Thanks!

Hi @Tim_Zhang – are you using torch.nn.DataParallel for training on multiple GPUs? If so, this could be some sort of initialization bug where cuFFT is initialized only on the first device and not on the others.
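If that is the case, one way to test the hypothesis (a hedged diagnostic sketch, not a confirmed fix) is to run a tiny FFT on every visible device before wrapping the model in DataParallel, so that cuFFT gets a chance to initialize on each GPU rather than only on cuda:0:

```python
import torch

# Diagnostic only (assumption): warm up cuFFT on every visible device
# before constructing nn.DataParallel. Uses the legacy torch.rfft API.
for idx in range(torch.cuda.device_count()):
    with torch.cuda.device(idx):
        dummy = torch.randn(8, 8, device=f"cuda:{idx}")
        torch.rfft(dummy, signal_ndim=2)   # forces plan creation on cuda:idx
        torch.cuda.synchronize()           # surface any cuFFT error immediately
```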

I have the same problem here. When using DataParallel with torch.fft() or torch.rfft(), it crashes without a Python error message; instead, Visual Studio pops up saying "An unhandled win32 exception occurred in python.exe".

I managed to reproduce this issue and reported it at https://github.com/pytorch/pytorch/issues/24176.