Accelerate Spectrograms with GPU and PyTorch?

I spend a lot of time converting 1D signals to 2D spectrograms, which are then fed to a CNN in PyTorch. I have a second GPU and wanted to know whether it is reasonable to accelerate my FFT/spectrogram workflow through PyTorch on that GPU, instead of what I am currently doing in my dataloader.

What I am doing now:
- In my custom dataset (a subclass of the Dataset class) I initialize and build my dataset of time-series audio data before training begins (fast). Inside my __getitem__ method I apply a few operations to the time-series data, then pass the sample to a custom FFT function that builds the spectrogram the way I need it and returns that spectrogram as the data sample.
- My spectrogram function is custom built for various things I need, but the main workhorse is np.fft.fft(), and that is the part I need to accelerate.
- I am using the “num_workers” kwarg in the PyTorch dataloader to make better use of my cores.
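
A minimal sketch of the kind of FFT-based spectrogram step described above. The actual custom function is not shown in this thread, so `numpy_spectrogram`, the Hann window, and the `n_fft`/`hop` values are assumptions for illustration only.

```python
import numpy as np

def numpy_spectrogram(signal, n_fft=512, hop=256):
    """Hypothetical stand-in for the custom spectrogram function."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Slice the signal into overlapping frames: shape (n_frames, n_fft)
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] for i in range(n_frames)]
    )
    # One FFT per windowed frame
    spectrum = np.fft.fft(frames * window, axis=1)
    # The input is real, so keep only the non-redundant half of the bins
    magnitude = np.abs(spectrum[:, : n_fft // 2 + 1])
    return magnitude.T  # shape (freq_bins, n_frames)

signal = np.random.randn(16000).astype(np.float32)
spec = numpy_spectrogram(signal)
print(spec.shape)
```

In a setup like this, each dataloader worker would call such a function once per sample inside __getitem__.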

What I would like to do:
Either in my __getitem__ or elsewhere, I’d like to send the data to my second GPU (GTX 980 Ti) in the hope of generating the spectrograms faster, then send them to my main GPU to pass them through the model.

Is this reasonable? Can I expect a speedup compared with a Ryzen 3950X using the “num_workers” kwarg in the dataloader? How would I do this? I’m open to ideas for other libraries and tools outside of PyTorch, but I’m not sure where to start and am seeking insight from this community.


I think the best way to speed this up would be to move it into a preprocessing step.
Have a separate script that converts your audio data to spectrograms and saves them to disk.
Then the dataloader in your training script will just load the spectrograms directly.
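
A rough sketch of what such a preprocessing script could look like. The `precompute` helper, the `spec_XXXXX.npy` file naming, and the non-overlapping framing are all hypothetical choices for illustration, not something taken from the original pipeline.

```python
import tempfile
from pathlib import Path

import numpy as np

def precompute(signals, out_dir, n_fft=512):
    """Compute one magnitude spectrogram per signal and save it as .npy."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, sig in enumerate(signals):
        # Non-overlapping frames, dropping the incomplete tail
        n_frames = len(sig) // n_fft
        frames = sig[: n_frames * n_fft].reshape(n_frames, n_fft)
        # rfft returns only the non-redundant bins of a real signal
        spec = np.abs(np.fft.rfft(frames, axis=1)).astype(np.float32)
        path = out_dir / f"spec_{i:05d}.npy"
        np.save(path, spec)
        paths.append(path)
    return paths

# Random data stands in for real audio; a temp dir stands in for the dataset folder
signals = [np.random.randn(16000) for _ in range(3)]
out_dir = tempfile.mkdtemp()
paths = precompute(signals, out_dir)

# In the training script, __getitem__ would then reduce to a load like this
loaded = np.load(paths[0])
print(loaded.shape)
```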

Unfortunately this won’t work for me, because certain transformations are applied to my time-series data before I generate a spectrogram. It would also eliminate the data augmentation I apply to the time-series data.

You can indeed do this in the main thread and run the processing on a second GPU.
Although I haven’t done proper profiling, in my experience it may be worse to preprocess the spectrograms ahead of time, as their dimensionality is usually much higher. For example, in my case a waveform of 16k elements becomes a 512x256x2 tensor, roughly 16 times as many values to store and load.
Anyway, I only recommend this if the main workload is the STFT. If you have additional heavy preprocessing, multiprocessing may be better.
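
A minimal sketch of the idea, assuming the dataloader yields batches of raw waveforms and the STFT is computed on the whole batch in the main loop. `pick_devices` and `gpu_spectrogram` are hypothetical helper names, the `n_fft`/`hop` values are placeholders, and the code falls back to a single GPU or the CPU when two GPUs are not present.

```python
import torch

def pick_devices():
    """Return (fft_device, model_device), falling back gracefully."""
    if torch.cuda.device_count() >= 2:
        return torch.device("cuda:1"), torch.device("cuda:0")
    if torch.cuda.is_available():
        dev = torch.device("cuda:0")
        return dev, dev
    cpu = torch.device("cpu")
    return cpu, cpu

def gpu_spectrogram(waveforms, device, n_fft=512, hop=256):
    """Batched magnitude STFT on `device`; input shape (batch, samples)."""
    waveforms = waveforms.to(device, non_blocking=True)
    window = torch.hann_window(n_fft, device=device)
    stft = torch.stft(
        waveforms,
        n_fft=n_fft,
        hop_length=hop,
        window=window,
        return_complex=True,
    )
    return stft.abs()  # shape (batch, n_fft // 2 + 1, n_frames)

fft_device, model_device = pick_devices()

# Random data stands in for one batch of waveforms from the dataloader
batch = torch.randn(8, 16000)
specs = gpu_spectrogram(batch, fft_device)

# Move the finished spectrograms to the GPU that holds the model
specs = specs.to(model_device)
print(specs.shape)
```

Doing the STFT on the batch in the main process, rather than inside __getitem__, sidesteps the fact that CUDA cannot be initialized in forked dataloader workers without switching to the spawn start method. Whether it beats CPU workers on a given machine is something only profiling will tell.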
