Hi
I recently noticed that it is not possible to compute the MFCC of a tensor that is already on a GPU. This is because two tensors are created during initialization, and they cannot be placed on another device through the arguments of the MFCC class.
The tensors are the filterbank used to transform the spectrogram to the mel scale and the DCT matrix used by the MFCC.
Example code that crashes:
import torch
from torchaudio.transforms import MFCC
device = torch.device('cuda')
mfcc = MFCC(melkwargs={ "wkwargs": { "device": device } })
data = torch.rand([400, 400]).to(device)
print(mfcc(data))
To compute the MFCC of a tensor that is on a GPU, one has to add these two lines after creating the mfcc object, which is not ideal, as it requires accessing torchaudio internals:
mfcc.MelSpectrogram.mel_scale.fb = mfcc.MelSpectrogram.mel_scale.fb.to(device)
mfcc.dct_mat = mfcc.dct_mat.to(device)
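As a stopgap, the two manual assignments above could be generalized into a small helper. This is only a sketch: the name move_plain_tensors is hypothetical (it is not part of torch or torchaudio), and it assumes the affected tensors are stored as plain attributes on the modules rather than as registered buffers, which is what the workaround above suggests.

```python
import torch
from torch import nn


def move_plain_tensors(module: nn.Module, device) -> nn.Module:
    """Hypothetical helper: move a module and its plain tensor attributes."""
    # Parameters and registered buffers are handled by nn.Module.to().
    module.to(device)
    # Plain tensor attributes live directly in each submodule's __dict__,
    # so nn.Module.to() does not see them; move them by hand.
    for sub in module.modules():
        for name, value in list(vars(sub).items()):
            if isinstance(value, torch.Tensor):
                setattr(sub, name, value.to(device))
    return module


class Toy(nn.Module):
    """Stand-in module with one plain tensor attribute (not torchaudio code)."""

    def __init__(self):
        super().__init__()
        self.fb = torch.ones(3, 2)  # plain attribute, not a buffer


target = torch.device("cuda" if torch.cuda.is_available() else "cpu")
toy = move_plain_tensors(Toy(), target)
```

With the real transform, the call would presumably look like move_plain_tensors(mfcc, device), but it still depends on how torchaudio stores these tensors internally, so it does not remove the need for a proper fix.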
It would be great if this was possible without accessing the internals. I’d be happy to open a PR on the torchaudio repo, but I read that additions to the code should be discussed before creating a PR.
The way I would solve this issue in my PR would be to add a “device” parameter to the MFCC class, a “mel_scale_kwargs” parameter to the MelSpectrogram class, and a “device” parameter to the MelScale class. Passing a device argument to MFCC would transfer the dct_mat to that device and forward the device to the spectrogram window function (“melkwargs” > “wkwargs” > “device”) and to the mel scale (“melkwargs” > “mel_scale_kwargs” > “device”), each of which would then transfer its own tensor to the device.
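To make the idea concrete, here is a minimal sketch of how the device argument could be threaded through. The classes ToyMelScale and ToyMFCC are hypothetical stand-ins written for illustration, not torchaudio's implementation: the filterbank is a random placeholder, and the spectrogram step is skipped entirely.

```python
import math

import torch
from torch import nn


class ToyMelScale(nn.Module):
    """Hypothetical stand-in for MelScale that accepts a device."""

    def __init__(self, n_mels=8, n_freqs=16, device=None):
        super().__init__()
        # Placeholder filterbank created directly on the requested device.
        self.fb = torch.rand(n_freqs, n_mels, device=device)

    def forward(self, spec):
        # (..., n_freqs, time) -> (..., n_mels, time)
        return torch.matmul(spec.transpose(-1, -2), self.fb).transpose(-1, -2)


class ToyMFCC(nn.Module):
    """Hypothetical stand-in for MFCC that forwards its device downward."""

    def __init__(self, n_mfcc=4, n_mels=8, n_freqs=16, device=None):
        super().__init__()
        # The same device is passed on to the mel scale.
        self.mel_scale = ToyMelScale(n_mels, n_freqs, device=device)
        # Unnormalized DCT-II basis, shape (n_mels, n_mfcc), on the same device.
        n = torch.arange(n_mels, dtype=torch.float32, device=device)
        k = torch.arange(n_mfcc, dtype=torch.float32, device=device).unsqueeze(1)
        self.dct_mat = torch.cos(math.pi / n_mels * (n + 0.5) * k).t()

    def forward(self, spec):
        log_mel = torch.log(self.mel_scale(spec) + 1e-6)
        # (..., n_mels, time) -> (..., n_mfcc, time)
        return torch.matmul(log_mel.transpose(-1, -2), self.dct_mat).transpose(-1, -2)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
mfcc = ToyMFCC(device=device)
spec = torch.rand(16, 5, device=device)  # fake spectrogram: 16 freq bins, 5 frames
out = mfcc(spec)
```

With this layout the caller names the device once, and every internally created tensor ends up next to the input, so no attribute has to be patched after construction.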
I’m happy about any thoughts or feedback.