Hi,
I’ve realized that torchvision, as well as other libraries such as skvideo and OpenCV, retrieves fewer frames than ffmpeg.
Video example: https://drive.google.com/file/d/1DIRsDf1SrLOTGbVejoL-PEIlxDPP0LMC/view?usp=sharing
Context:
I’m re-encoding a dataset of YouTube videos to 25.0 FPS via ffmpeg.
Each recording (.mkv) contains an audio stream and a video stream.
Both streams have the same duration (according to the metadata reported by ffprobe).
The audio stream’s actual duration matches the one stated in the metadata.
Extracting frames from the Unix command line with ffmpeg yields the expected number of frames (3688 for the example video above):
ffmpeg -i /media/jfm/Slave/SkDataset/videos/cello/1u3yHICR_BU.mkv %05d.bmp
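As a cross-check that avoids writing thousands of .bmp files, the reference count could also be obtained by asking ffprobe to decode the video stream and count its frames. The following is only a minimal sketch, assuming ffprobe is on the PATH; the helper names (build_ffprobe_cmd, parse_frame_count, count_frames_ffprobe) are made up for illustration.

```python
import subprocess


def build_ffprobe_cmd(path):
    """Build an ffprobe command that decodes the first video stream and counts frames."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",              # first video stream only
        "-count_frames",                       # actually decode, do not trust metadata
        "-show_entries", "stream=nb_read_frames",
        "-of", "csv=p=0",                      # print the bare value
        path,
    ]


def parse_frame_count(stdout):
    """Parse the integer printed by ffprobe (first line, first csv field)."""
    first_line = stdout.strip().splitlines()[0]
    return int(first_line.split(",")[0])


def count_frames_ffprobe(path):
    """Run ffprobe on `path` and return the number of decoded video frames."""
    result = subprocess.run(
        build_ffprobe_cmd(path), capture_output=True, text=True, check=True
    )
    return parse_frame_count(result.stdout)
```

Calling count_frames_ffprobe(PATH) decodes the whole stream, so it is slow on long videos, but the result would be expected to agree with the number of .bmp files written by the command above.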
Extracting frames via imageio gives the same count as command-line ffmpeg:
from imageio import mimread
w = mimread(PATH, memtest=False)
Other libraries, such as the torchvision.io video reader, skvideo, or even OpenCV’s VideoCapture, return fewer frames than expected. I’ve tried debugging torchvision’s video reader to see whether it skips frames with negative timestamps, but that doesn’t seem to be the case, although the default seek point is 0. In any case, playing the video doesn’t show any black frames, which indicates (I think) that the video stream contains only positive timestamps (which also makes sense, since the whole video has been re-encoded).
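The timestamp hypothesis can be checked directly instead of by watching for black frames. Below is a rough sketch, assuming a torchvision version that still provides torchvision.io.read_video_timestamps; summarize_pts and read_pts_torchvision are hypothetical helper names, and the 1.5-frame-period threshold for a "gap" is an arbitrary choice.

```python
def summarize_pts(pts, fps):
    """Summarize presentation timestamps given in seconds.

    Counts negative timestamps and gaps between consecutive frames that are
    longer than 1.5 frame periods (a hint that frames are missing there).
    """
    pts = sorted(pts)
    period = 1.0 / fps
    negative = sum(1 for t in pts if t < 0)
    gaps = sum(1 for a, b in zip(pts, pts[1:]) if (b - a) > 1.5 * period)
    return {"frames": len(pts), "negative": negative, "gaps": gaps}


def read_pts_torchvision(path):
    """Return (timestamps in seconds, fps) as reported by torchvision."""
    # Imported lazily so summarize_pts can be used without torchvision installed.
    from torchvision.io import read_video_timestamps

    pts, fps = read_video_timestamps(path, pts_unit="sec")
    return [float(t) for t in pts], fps


# Usage (PATH is the video file):
# pts, fps = read_pts_torchvision(PATH)
# print(summarize_pts(pts, fps))
```

If "negative" and "gaps" both come out as 0 while "frames" is still below the ffmpeg count, the missing frames are probably at the start or end of the stream rather than in the middle.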
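import skvideo.io
# vread returns an array of shape (num_frames, height, width, channels)
videodata = skvideo.io.vread(PATH)
print(videodata.shape[0])  # number of frames actually decoded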
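To put numbers on the mismatch, the same video can be counted with each reader and compared against the ffmpeg count. This is only a sketch: count_frames_opencv and missing_frames are hypothetical helpers, and the OpenCV count is taken by decoding frame by frame rather than by trusting CAP_PROP_FRAME_COUNT, which is read from container metadata.

```python
def count_frames_opencv(path):
    """Count frames by decoding until OpenCV reports that no frame is left."""
    import cv2  # imported lazily so missing_frames works without OpenCV

    cap = cv2.VideoCapture(path)
    if not cap.isOpened():
        raise IOError(f"could not open {path}")
    n = 0
    while True:
        ok, _ = cap.read()
        if not ok:
            break
        n += 1
    cap.release()
    return n


def missing_frames(counts, reference):
    """Return, per reader, how many frames it is missing relative to `reference`."""
    return {name: reference - n for name, n in counts.items()}


# Usage (PATH is the video file, 3688 is the ffmpeg count for the example video):
# counts = {
#     "imageio": len(w),
#     "skvideo": videodata.shape[0],
#     "opencv": count_frames_opencv(PATH),
# }
# print(missing_frames(counts, reference=3688))
```

A reader that agrees with ffmpeg shows 0; a positive value is the number of frames that reader dropped.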
Any ideas? An example video where this happens is linked above.
btw I think @fmassa is developing the video reader.