Cuda error in torchaudio spectrogram

giampierosalvi · October 16, 2021, 9:44am

I am running the same code on two machines, both with Ubuntu 20.04. On one machine the code runs fine; on the other I get the error reported below. The only difference I can see is the graphics card (NVIDIA Titan Xp in the machine that works and NVIDIA GeForce RTX 3090 in the one that doesn't). The versions of torch and torchaudio are 1.5.0 and 0.5.0 (these are required for the experiment to be reproducible). The machine that works runs Python 3.7.9, the one that doesn't runs Python 3.8.10. Both machines have CUDA V10.1.243.

Let me know if there is other information that I missed and that is relevant to this question.

Do you have any suggestions on what to look for to solve this problem?
Thank you!
Giampiero

Using device 'cuda:0'
Creating Datasets
Calculating mean and std from dataset
Traceback (most recent call last):
  File "main.py", line 665, in <module>
    main(args)
  File "main.py", line 621, in main
    task_and_more, dataloaders, model_and_more = prepare_for_task(args)
  File "main.py", line 555, in prepare_for_task
    datasets = create_datasets(args, paths['data_path'], flags['load_dataset_extra_stats'])
  File "main.py", line 252, in create_datasets
    datasets = set_transforms_on_datasets(args, datasets, transforms_device)
  File "main.py", line 303, in set_transforms_on_datasets
    stats = get_dataset_stats_and_write(datasets['train'], args['device'],
  File "…/preprocessing.py", line 79, in get_dataset_stats_and_write
    mean, std, min_val, max_val = calc_dataset_stats(dataloader, device=device)
  File "…/processing.py", line 72, in calc_dataset_stats
    input_channel = dataloader.dataset[0]['image'].shape[0]
  File "…/CLEAR_dataset.py", line 433, in __getitem__
    game_with_image = self.transforms(game_with_image)
  File "…/venv/lib/python3.8/site-packages/torchvision/transforms/transforms.py", line 61, in __call__
    img = t(img)
  File "…/data_interfaces/transforms.py", line 36, in __call__
    specgram = self.spectrogram_transform(sample['image'])[0, :, :]
  File "…/venv/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "…/venv/lib/python3.8/site-packages/torchaudio/transforms.py", line 81, in forward
    return F.spectrogram(waveform, self.pad, self.window, self.n_fft, self.hop_length,
  File "…/venv/lib/python3.8/site-packages/torchaudio/functional.py", line 276, in spectrogram
    spec_f /= window.pow(2.).sum().sqrt()
RuntimeError: CUDA error: invalid device function

ptrblck · October 16, 2021, 9:52am

Your Ampere GPU (RTX 3090) needs CUDA>=11.0, and I don't know which exact PyTorch binary (or source build) you've installed.
In any case, CUDA 11 support arrived around PyTorch 1.7, so your 1.5.0 installation is most likely using CUDA 10, which would explain the error.

The pip wheels and conda binaries ship with their own CUDA runtime, so the CUDA toolkit installed on your machine does not affect them. Your local toolkit is only used for a source build or to build custom CUDA extensions.
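
A rough way to check this on a given machine is sketched below. It prints the CUDA version the installed PyTorch binary was built with, the GPU's compute capability, and (where available) the architectures the binary was compiled for. The helper names `binary_supports_device` and `report` are made up for this sketch, and the matching rule is a simplification of the CUDA compatibility rules: compiled `sm_XY` kernels run on devices with the same major version and an equal or higher minor version, while embedded `compute_XY` PTX can be JIT-compiled for newer devices. `torch.cuda.get_arch_list` does not exist in older releases, so the call is guarded.

```python
def binary_supports_device(arch_list, capability):
    """Return True if a binary built for arch_list can run on a GPU with this capability."""
    dev_major, dev_minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if len(digits) < 2 or not digits.isdigit():
            continue  # skip entries this sketch cannot parse
        major, minor = int(digits[:-1]), int(digits[-1])
        # Compiled kernels (sm_XY): same major version, device minor >= Y
        if kind == "sm" and major == dev_major and minor <= dev_minor:
            return True
        # Embedded PTX (compute_XY): can be JIT-compiled for any newer device
        if kind == "compute" and (major, minor) <= (dev_major, dev_minor):
            return True
    return False


def report():
    """Print what the installed PyTorch binary and the first GPU report about themselves."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    print("torch:", torch.__version__, "| built with CUDA:", torch.version.cuda)
    if not torch.cuda.is_available():
        print("no usable CUDA device")
        return
    capability = torch.cuda.get_device_capability(0)
    print("device:", torch.cuda.get_device_name(0), "| compute capability:", capability)
    # get_arch_list is missing from older releases, hence the guard
    if hasattr(torch.cuda, "get_arch_list"):
        archs = torch.cuda.get_arch_list()
        print("binary architectures:", archs)
        print("device supported:", binary_supports_device(archs, capability))


if __name__ == "__main__":
    report()
```

For reference, the RTX 3090 has compute capability 8.6 and the Titan Xp 6.1, so a binary whose architecture list stops below `sm_80` has no kernels the 3090 can run.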

giampierosalvi · October 16, 2021, 11:10am

Thank you for the quick reply. You are right, I got it to work with
pip uninstall torch torchvision torchaudio
pip install torch==1.7.1+cu110 torchvision==0.8.2+cu110 torchaudio==0.7.2 -f https://download.pytorch.org/whl/torch_stable.html

Best
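
For anyone verifying a similar reinstall, here is a minimal smoke test, assuming the first GPU is the one of interest. It runs the same kind of reduction that failed in the traceback (`window.pow(2.).sum().sqrt()`) on a small CUDA tensor. The function name `smoke_test` and the window length of 400 are arbitrary choices for this sketch, not taken from the original code.

```python
def smoke_test():
    """Run a small GPU computation; return its value, or None if it cannot be attempted."""
    try:
        import torch
    except ImportError:
        return None  # torch is not installed
    if not torch.cuda.is_available():
        return None  # no usable CUDA device
    # Arbitrary window length, chosen only for illustration
    window = torch.hann_window(400, device="cuda:0")
    # Same reduction as in the traceback; an incompatible binary raises
    # "RuntimeError: CUDA error: invalid device function" here
    norm = window.pow(2.).sum().sqrt()
    return norm.item()


if __name__ == "__main__":
    print("window norm:", smoke_test())
```

If this prints a number instead of raising, the installed binary has kernels the GPU can run.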