Hi,
I am confused about the output shape from STFT. Given
print(y.shape)
s = torch.stft(y, frame_length=128, hop=32)
print(s.shape)
we have
torch.Size([3, 7936])
torch.Size([3, 245, 65, 2])
According to the doc, “Returns the real and the imaginary parts together as one tensor of size (∗×N×2), where ∗ is the shape of input signal, N is the number of ωs considered depending on fft_size and return_onesided, and each pair in the last dimension represents a complex number as real part and imaginary part.” Since ∗ is the shape of the input signal, I would expect the returned tensor to have shape [3, 7936, N, 2]. So how is “245” computed from the input length “7936”?
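For what it's worth, here is a guess that happens to reproduce both numbers. It assumes the signal is cut into full frames with no padding, that fft_size defaults to frame_length, and that the output is one-sided. The helper name stft_shape_guess is made up for illustration and is not part of torch; I have not confirmed this is what torch.stft does internally.

```python
def stft_shape_guess(signal_len, frame_length, hop):
    """Guess (n_frames, n_freqs) for an STFT; a hypothetical helper, not a torch API."""
    # Assumption: no padding, so only frames that fit entirely in the signal count.
    n_frames = (signal_len - frame_length) // hop + 1
    # Assumption: fft_size == frame_length and only non-negative frequencies are kept.
    n_freqs = frame_length // 2 + 1
    return n_frames, n_freqs


n_frames, n_freqs = stft_shape_guess(7936, 128, 32)
print(n_frames, n_freqs)  # 245 65
```

If this is right, the time dimension of the input (7936) would be replaced by the number of frames (245) rather than kept as-is, which would explain [3, 245, 65, 2].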
Thanks