Hello!
I am using the LibriSpeech dataset and the MFCC transform from torchaudio to compute the MFCC coefficients for each waveform.
The MFCC transform returns a tensor with the following shape:
1 x 24 x N
1 = the waveform
24 = the number of MFCC coefficients
N = the number of time frames (I think?)
I then squeeze and transpose this so that it becomes:
N x 24
During training I have tensors of shape:
5 x N x 24
where 5 is the batch size I am using.
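To make the shapes above concrete, here is a minimal sketch of the squeeze and transpose step. It assumes a random tensor as a stand-in for the real MFCC output, and N = 100 is an arbitrary value chosen only for illustration.

```python
import torch

# Stand-in for the MFCC output of one waveform: (1, 24, N).
# N = 100 is an arbitrary, assumed number of frames.
mfcc = torch.randn(1, 24, 100)

# Drop the leading dimension, then swap the remaining two:
# (1, 24, N) -> (24, N) -> (N, 24)
features = mfcc.squeeze(0).transpose(0, 1)

print(features.shape)
```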
How do I reshape this tensor during training or collation so that I can feed it into my neural net, which accepts an input_size of 24 and outputs 28 values (my number of classes)?
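One possible approach, sketched under several assumptions: N differs between utterances, so the collate function pads them to a common length, and a single `nn.Linear(24, 28)` layer stands in for the real network. The name `collate_features` and the frame counts are hypothetical. Since `nn.Linear` acts on the last dimension, the batch can be passed as 5 x N x 24 directly, or flattened to (5 * N) x 24 so that each frame is treated as one sample.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence


def collate_features(batch):
    """Zero-pad a list of (N_i, 24) tensors into one (B, N_max, 24) tensor."""
    lengths = torch.tensor([f.shape[0] for f in batch])
    padded = pad_sequence(batch, batch_first=True)
    return padded, lengths


# Five fake utterances with different (assumed) frame counts.
batch = [torch.randn(n, 24) for n in (80, 100, 95, 60, 100)]
padded, lengths = collate_features(batch)  # padded: (5, 100, 24)

# Hypothetical stand-in for the network: 24 inputs, 28 classes.
model = nn.Linear(24, 28)

# Option 1: keep the batch and time dimensions; Linear maps the last one.
logits = model(padded)  # (5, 100, 28)

# Option 2: merge batch and time so every frame is one sample.
flat = padded.reshape(-1, 24)  # (500, 24)
flat_logits = model(flat)  # (500, 28)
```

The `lengths` tensor is returned so that padded frames can be masked out of the loss later; whether that is needed depends on how the labels are defined per utterance or per frame.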
Many thanks!