Error while obtaining an Inverse Spectrogram

I am working on a project in which I convert a 60 s audio clip to a spectrogram and then need to convert that spectrogram back to audio. At this stage I get the error “Tensor must have a last dimension of size 2”.

The code I used is as follows:

wave, samp_rate = torchaudio.load(filpath)
transform = torchaudio.transforms.Spectrogram(n_fft=1000)

spec = transform(wave)

rev_trans = torchaudio.transforms.InverseSpectrogram(n_fft=1152)

waveform = rev_trans(spec)  # Here is where the error arises

When I check the size of “spec”, it is a tensor of size [2, 501, 5295].

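Hi @Amit_Kenkre, are you using torchaudio 0.10? If so, you need to set power=None in Spectrogram so that it returns a complex-valued Tensor. Otherwise it returns a real-valued spectrogram (the power spectrogram with the default power=2), which InverseSpectrogram cannot invert because the phase information is lost.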

Second, n_fft must be the same in Spectrogram and InverseSpectrogram (your code uses 1000 and 1152). Pick a single n_fft value that suits your needs and use it in both.