I’ve written a standalone, thoroughly documented implementation of a generic video dataset class for PyTorch data loading. It is easy to use and fast, and torchvision’s transforms pair well with it for video preprocessing and augmentation.
When I recently looked for a way to load video datasets in PyTorch, everything I found was either slow or tangled into a messy codebase. So, to anyone who needs to load video data, I strongly recommend checking out this GitHub repository.
On my machine with two 3090s, it reduced excess data loading time from 800 ms to 20 ms, which made my arbitrary model train 3x faster than with torchvision’s native video loaders. It also gave me much better frame selection from videos.
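To illustrate what frame selection can look like, here is a minimal sketch of one common approach, segment-based sparse sampling: the video is split into equal segments and a short run of consecutive frames is taken from each. This is only an illustration under stated assumptions, not the repository's actual code. The function name `sample_frame_indices` and the parameters `num_segments` and `frames_per_segment` are hypothetical names chosen for this example.

```python
import random
from typing import List, Optional


def sample_frame_indices(
    num_frames: int,
    num_segments: int,
    frames_per_segment: int,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Pick frame indices spread evenly across a video (hypothetical helper).

    The video is split into `num_segments` roughly equal segments, and
    `frames_per_segment` consecutive frames are taken from each one.
    With `rng` given, the start inside each segment is random (useful for
    training); without it, the run is centered in the segment (evaluation).
    """
    if num_segments <= 0 or frames_per_segment <= 0:
        raise ValueError("num_segments and frames_per_segment must be positive")
    if num_frames < num_segments * frames_per_segment:
        raise ValueError("video is too short for the requested sampling")

    indices: List[int] = []
    for seg in range(num_segments):
        # Integer arithmetic keeps the segment boundaries exact.
        seg_start = seg * num_frames // num_segments
        seg_end = (seg + 1) * num_frames // num_segments  # exclusive
        # Latest start that still leaves room for a full run of frames.
        max_start = seg_end - frames_per_segment
        if rng is not None:
            start = rng.randint(seg_start, max_start)
        else:
            start = (seg_start + max_start) // 2
        indices.extend(range(start, start + frames_per_segment))
    return indices


if __name__ == "__main__":
    # Deterministic (evaluation-style) sampling from a 100-frame video.
    print(sample_frame_indices(100, num_segments=4, frames_per_segment=2))
    # Random (training-style) sampling with a seeded generator.
    print(sample_frame_indices(100, 4, 2, rng=random.Random(0)))
```

In a PyTorch `Dataset`, indices like these would typically be computed inside `__getitem__`, so that only the selected frames are loaded, and torchvision transforms would then be applied to them.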