Hello everyone,

I have several questions regarding the torchaudio modules:

- There is a hyperparameter named n_fft in the transform classes (for example stft, spectrogram, …). What exactly is it? It is usually described in terms of frequency bins. Is it the number of data points (samples of the sound wave) that go into each transform?
- If win_length < n_fft, the window is zero-padded up to n_fft. But what happens in the opposite case, win_length > n_fft?
- When I apply `T.Spectrogram` to a vector (sound wave), it returns a matrix. The last dimension of this matrix is n_frames. What is this, and how can we compute this number?
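
To make the last question concrete, here is a minimal sketch of the shape arithmetic as I understand it from the docs. It assumes the documented defaults (win_length defaults to n_fft, hop_length defaults to win_length // 2, and center=True pads n_fft // 2 samples on each side). The helper name `spectrogram_shape` is hypothetical and not part of torchaudio; it only does the integer arithmetic and does not compute a spectrogram.

```python
def spectrogram_shape(num_samples, n_fft=400, win_length=None,
                      hop_length=None, center=True):
    """Estimate (n_freqs, n_frames) for a 1-D waveform of num_samples samples.

    Hypothetical helper for illustration; assumes torchaudio-style defaults.
    """
    # Assumed defaults: win_length falls back to n_fft, hop_length to win_length // 2.
    win_length = n_fft if win_length is None else win_length
    hop_length = win_length // 2 if hop_length is None else hop_length

    # A one-sided FFT of size n_fft yields n_fft // 2 + 1 frequency bins.
    n_freqs = n_fft // 2 + 1

    # With center=True the signal is padded by n_fft // 2 on both sides,
    # so that frame t is centered on sample t * hop_length.
    padded = num_samples + 2 * (n_fft // 2) if center else num_samples

    # Each frame spans n_fft samples and consecutive frames start hop_length apart.
    n_frames = 1 + (padded - n_fft) // hop_length
    return n_freqs, n_frames


# One second of audio at 16 kHz with n_fft=400 (hop_length defaults to 200).
print(spectrogram_shape(16000, n_fft=400))                # (201, 81)
print(spectrogram_shape(16000, n_fft=400, center=False))  # (201, 79)
```

If this is right, n_frames is just the number of window positions that fit along the (padded) signal, but I would appreciate confirmation.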

Many thanks in advance.