I am working on a project in which I convert a 60-second audio clip to a spectrogram and then need to convert that spectrogram back to audio. At the second step I get the error “Tensor must have a last dimension of size 2”.
The code I used is as follows:
import torchaudio

wave, samp_rate = torchaudio.load(filepath)
transform = torchaudio.transforms.Spectrogram(n_fft=1000)
spec = transform(wave)
rev_trans = torchaudio.transforms.InverseSpectrogram(n_fft=1152)
waveform = rev_trans(spec) # Here is where the error arises
When I check the size of “spec”, it is a tensor of shape [2, 501, 5295].