Hi, I’m trying to use torchvision.io.video.read_video but I’m noticing some weird behavior and I wonder if there’s an issue with this method or if I just don’t understand something. Torchvision version is up to date (0.9.1).
Situation: I want to read a 30-second clip (audio + video) at 25 fps, with audio at 44100 Hz.
I call read_video(pth): everything works fine, the output shapes are [750, …] and [2, 1323008].
– Math: 25 frames/s * 30 s = 750 frames; 44100 samples/s * 30 s = 1323000 samples.
I call read_video(pth, pts_unit='sec'): same result.
I call read_video(pth, 0, 10, pts_unit='sec'): the shapes are [251, …] and [2, 12].
I call read_video(pth, 0, 12500, pts_unit='pts'): the shapes are [25, …] and [2, 13312].
I don’t really understand the last two outputs. I would expect something like [251, …], [2, 441000] for the first and [25, …], [2, 44100] for the second, since 25 frames at 25 fps is about one second of audio.
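For reference, here is a small sketch of the arithmetic behind the first expectation. The helper `expected_counts` is my own name, not part of torchvision, and it assumes a constant frame rate, frames stamped at k / fps, and a window that includes both endpoints (which is what 251 frames for 0 to 10 s suggests).

```python
import math

def expected_counts(start_sec, end_sec, fps=25, sample_rate=44100):
    """Rough expected sizes for the window [start_sec, end_sec].

    Assumption: frame k has timestamp k / fps and both endpoints are
    included on the video side; audio is duration * sample_rate.
    """
    first = math.ceil(start_sec * fps)    # index of first frame in the window
    last = math.floor(end_sec * fps)      # index of last frame in the window
    n_frames = max(0, last - first + 1)
    n_samples = round((end_sec - start_sec) * sample_rate)
    return n_frames, n_samples

print(expected_counts(0, 10))  # (251, 441000)
```

So the video side of the 0 to 10 s call matches this estimate, and only the audio side ([2, 12]) is far off.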
I would like to be able to only read parts of the video.
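As a possible stopgap, since reading the whole clip gives the expected shapes, one could read everything and slice afterwards. This is only a sketch of the index computation: `window_slices` is a hypothetical helper of mine, and it assumes a constant frame rate and that both streams start at time 0.

```python
def window_slices(start_sec, end_sec, video_fps, audio_fps):
    """Return (video_slice, audio_slice) covering [start_sec, end_sec).

    Assumption: constant rates, and both streams start at t = 0.
    """
    v = slice(int(round(start_sec * video_fps)), int(round(end_sec * video_fps)))
    a = slice(int(round(start_sec * audio_fps)), int(round(end_sec * audio_fps)))
    return v, a

v, a = window_slices(0, 10, video_fps=25, audio_fps=44100)
print(v, a)  # slice(0, 250, None) slice(0, 441000, None)
```

With `vframes, aframes, info = read_video(pth)`, the rates can be taken from `info["video_fps"]` and `info["audio_fps"]`, and the window is then `vframes[v]` and `aframes[:, a]`. The obvious downside is that the whole clip is decoded into memory first, so this does not really replace reading only part of the file.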
Thanks for any help