How to compute the MelSpectrogram on batches of data?

Hey there,

I can’t figure out how to run the MelSpectrogram (from torchaudio) on batches of data.

I checked the torchaudio docs:

https://pytorch.org/audio/_modules/torchaudio/transforms.html#MelSpectrogram

and the waveform needs to be in this format: (channel, time)

which makes sense for a single waveform, but what if we have batches of data?

Thanks!

This transformation might only work on a single sample (like torchvision.transforms), as its usual use case would probably be to apply it in the __getitem__ method on each data sample.