I’m getting a RuntimeError: cuFFT error: CUFFT_EXEC_FAILED when calling torch.irfft on arrays of particular sizes and when trying to use multiple GPUs (I’m on an AWS p3.8xlarge). The test case below was distilled from my larger application, thus the particular way of determining the slices and array sizes.
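import os
# Make CUDA calls synchronous so the error is reported at the failing call
os.environ['CUDA_LAUNCH_BLOCKING'] = "1"
import torch

sz = 65
n_gpus = 2

for i in range(n_gpus):
    # Half-spectrum input: the last dimension holds (real, imag) pairs
    arr = torch.rand((sz, sz//2 + 1, 2), dtype=torch.float32, device=i)
    # Get the crash
    torch.irfft(arr, 2, signal_sizes=(sz, sz))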
Setting n_gpus to 1, or setting sz (array size) to most even numbers, seems to avoid this error. But I would like to understand how to avoid it for these particular settings, because I don’t want someone to run my code with an array size that happens to trigger it.
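
To map out which array sizes are affected, a sweep along these lines might help. This is only a minimal sketch: find_failing_sizes and irfft_on_all_gpus are hypothetical helper names introduced here, not part of PyTorch, and the second one simply wraps the test case above. It assumes the failure surfaces as a RuntimeError, as in the report.

```python
from typing import Callable, Iterable, List, Tuple


def find_failing_sizes(
    run: Callable[[int], object], sizes: Iterable[int]
) -> List[Tuple[int, str]]:
    """Call run(sz) for each size and collect (sz, message) for every
    size that raises RuntimeError. Hypothetical helper for illustration."""
    failures = []
    for sz in sizes:
        try:
            run(sz)
        except RuntimeError as exc:
            failures.append((sz, str(exc)))
    return failures


def irfft_on_all_gpus(sz: int, n_gpus: int = 2) -> None:
    """Hypothetical wrapper around the test case above."""
    import torch  # imported lazily so find_failing_sizes has no torch dependency

    for i in range(n_gpus):
        arr = torch.rand((sz, sz // 2 + 1, 2), dtype=torch.float32, device=i)
        torch.irfft(arr, 2, signal_sizes=(sz, sz))


# Example (needs the same multi-GPU setup as the test case):
# failures = find_failing_sizes(irfft_on_all_gpus, range(60, 70))
# for sz, msg in failures:
#     print(sz, msg)
```

One caveat: a CUDA-side failure may leave the process in a bad state, so sizes tested after the first failure could fail as a side effect. Running each size in a fresh process would give a more trustworthy list.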