Hello,
I used torchaudio to extract MFCCs from an audio clip so that the result has the same number of frames as the corresponding video I extracted (525). However, the MFCC tensor has shape (1, 20, 525).
How can I rearrange the data to get a shape of (525, 20), so that I can manipulate it alongside the video frames?
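For concreteness, here is a minimal sketch of the conversion I have in mind. It assumes the leading dimension of size 1 is the audio channel and that the layout is (channel, n_mfcc, frames); the random tensor is only a stand-in for the real torchaudio output, and the variable names are my own.

```python
import torch

# Stand-in for the MFCC output, assumed layout: (channel, n_mfcc, frames).
mfcc = torch.randn(1, 20, 525)

# Drop the size-1 channel dimension -> (20, 525),
# then swap the two remaining axes -> (525, 20).
mfcc_frames_first = mfcc.squeeze(0).transpose(0, 1)

print(mfcc_frames_first.shape)
```

As far as I understand, calling `mfcc.reshape(525, 20)` directly would also give the right shape, but it would reinterpret the memory order instead of swapping axes, so each row would no longer hold the 20 coefficients of a single frame.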
Thank you for your help