I’m currently trying to use torchaudio.transforms.Spectrogram with power = None to return the complex spectrogram. The forward method of this class returns a tensor with a shape of (...,2). I am aware that the last dimension, of size 2, holds the real and imaginary parts, but it is not clear from the documentation which one is which and how they represent a complex tensor.
For example, if torchaudio.transforms.Spectrogram returns x with shape (256,256,2), is x[...,0] the real part or the imaginary part? Also, if x were a complex tensor, would it be expressed as x[...,0] + x[...,1]j or x[...,1] + x[...,0]j?
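
One way to check the layout empirically is sketched below. It assumes the (...,2) output follows PyTorch's usual real-view convention, the one used by torch.view_as_real and torch.view_as_complex, where index 0 of the last dimension is the real part and index 1 is the imaginary part. The tensor x here is random stand-in data with the shape from the example, not actual Spectrogram output, so whether a given torchaudio version follows this convention (or returns a complex dtype directly) still needs to be confirmed against that version's documentation.

```python
import torch

# Stand-in for the transform's output: a real tensor whose last
# dimension has size 2. Random data, not an actual spectrogram.
x = torch.randn(256, 256, 2)

# view_as_complex reads x[..., 0] as the real part and
# x[..., 1] as the imaginary part.
z = torch.view_as_complex(x)  # shape (256, 256), dtype complex64

# The same tensor built by hand, i.e. x[..., 0] + 1j * x[..., 1].
z_manual = torch.complex(x[..., 0], x[..., 1])

print(z.shape)
print(torch.equal(z.real, x[..., 0]))  # does the real part match index 0?
print(torch.equal(z.imag, x[..., 1]))  # does the imaginary part match index 1?
print(torch.equal(z, z_manual))
```

Under that assumption, the complex form would be x[...,0] + x[...,1]j. Running the same comparisons on a real Spectrogram output (for instance against torch.stft on the same waveform) would be the direct way to confirm it.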