There is no sampling occurring when I apply offset and n_frames in the load function

ThiruRJST · October 30, 2021, 10:43am

Is n_frames equivalent to librosa’s duration?

pytorch_version: 1.9.0+cu111
torchaudio_version: 0.9.0

I’m currently working on an audio classification task, where I found that if the entire audio file is passed to the load_audio() function, it returns the stereo output of the sampled version at the default sampling_rate.

But if I pass offset and n_frames to the function, it doesn’t resample the audio.

My actual question is whether n_frames is equivalent to librosa’s duration.

In my case, the offset is 30 and the duration is 10 seconds.

reading the audio file with offset and n_frames

nateanl · November 3, 2021, 10:32am

Hi @ThiruRJST, thanks for sharing it. In torchaudio, frame_offset and num_frames are measured in frames (samples), not seconds, so they are not equivalent to librosa’s offset and duration arguments, which are given in seconds.

If you want to extract 10 seconds of audio from the 30th second, you can call the method torchaudio.load(file_path, frame_offset=30*sample_rate, num_frames=10*sample_rate).
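
As a rough sketch of the conversion above, the snippet below turns a librosa-style offset and duration in seconds into the frame counts that torchaudio.load expects. The helper names seconds_to_frames and load_segment are hypothetical and not part of torchaudio. It assumes the sample rate is read from the file with torchaudio.info rather than hard-coded.

```python
def seconds_to_frames(offset_s, duration_s, sample_rate):
    """Convert an offset and duration in seconds to frame counts.

    Hypothetical helper: returns keyword arguments for torchaudio.load.
    """
    return {
        "frame_offset": int(round(offset_s * sample_rate)),
        "num_frames": int(round(duration_s * sample_rate)),
    }


def load_segment(path, offset_s, duration_s):
    """Load `duration_s` seconds of audio starting at `offset_s` seconds.

    Hypothetical wrapper; torchaudio is imported here so the conversion
    helper above can be used without it.
    """
    import torchaudio

    # Read the file's native sample rate from its metadata.
    sample_rate = torchaudio.info(path).sample_rate
    kwargs = seconds_to_frames(offset_s, duration_s, sample_rate)
    # Returns (waveform, sample_rate); waveform is channels-first by default.
    return torchaudio.load(path, **kwargs)


# Example: 10 seconds starting at the 30th second of a 44.1 kHz file.
print(seconds_to_frames(30, 10, 44100))
```

Note that torchaudio.load returns the audio at the file's native sample rate and does not resample it. If a different rate is needed, resampling is a separate step, for example with torchaudio.transforms.Resample.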