I am working with some fairly large video datasets. Until now I have been extracting every frame to disk and building each video sequence by loading the individual frame files. Because this is inefficient in terms of disk space, I would like to know how this approach compares, in terms of loading speed, with using a video reader and seeking to the required frames (or using OpenCV's seek).
I would assume the actual performance depends on the libraries you are using, your system, and your use case, so it is best to profile both approaches on your own data. For example, you could try DALI's video reader, which would allow you to load and process the frames on the GPU, or the `VideoReader` class from `torchvision`, which would also allow GPU decoding of the frames.
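Since the answer depends on your setup, a small timing harness is probably the quickest way to compare the two approaches. The following is only a minimal sketch, not a definitive benchmark: `time_loader`, `make_opencv_seek_loader`, and `make_image_file_loader` are hypothetical helper names, the `{:06d}.jpg` file naming is an assumption about how the frames were extracted, and OpenCV (`cv2`) is used here only because the question mentions it.

```python
import os
import time
from typing import Callable, Sequence


def time_loader(load_frame: Callable[[int], object],
                indices: Sequence[int],
                repeats: int = 3) -> float:
    """Return the best-of-`repeats` mean time (seconds) per loaded frame."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for i in indices:
            load_frame(i)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed / len(indices))
    return best


def make_opencv_seek_loader(video_path: str) -> Callable[[int], object]:
    """Loader that seeks to a frame index inside a video file (assumes OpenCV)."""
    import cv2  # imported lazily so time_loader works without OpenCV installed

    cap = cv2.VideoCapture(video_path)
    if not cap.isOpened():
        raise IOError(f"could not open {video_path}")

    def load(index: int):
        # Seek to the requested frame, then decode it.
        cap.set(cv2.CAP_PROP_POS_FRAMES, index)
        ok, frame = cap.read()
        if not ok:
            raise IndexError(f"could not read frame {index}")
        return frame

    return load


def make_image_file_loader(frame_dir: str,
                           pattern: str = "{:06d}.jpg") -> Callable[[int], object]:
    """Loader that reads pre-extracted frames (file naming is an assumption)."""
    import cv2

    def load(index: int):
        path = os.path.join(frame_dir, pattern.format(index))
        frame = cv2.imread(path)
        if frame is None:
            raise FileNotFoundError(path)
        return frame

    return load
```

To compare, call `time_loader` once with each loader on the same list of indices, ideally using the access pattern you actually train with (random clips versus sequential reads), since seek cost in a compressed video typically depends on how far the requested frame is from the previous keyframe.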