Torchvision transforms for videos

braindotai · July 12, 2020, 11:53am

How can I apply transformations such as Resize, CenterCrop, RandomCrop, RandomHorizontalFlip, etc. to a video that has been read into a torch.Tensor with four dimensions (channels, frames, height, width)? (It's okay if I have to reshape it.)

ptrblck · July 13, 2020, 1:06am

You could create custom transformations that apply the torchvision.transforms in a loop to each sample (or rewrite the transformations so that they work on batched inputs).
torchvision has some internal video transforms. Since the API isn't finalized, this code might break and shouldn't be used if you rely on backwards compatibility.

lab176344 · July 27, 2021, 11:40am

I don't know if you are still looking for a solution, but this might help: GitHub - facebookresearch/pytorchvideo: A deep learning library for video understanding research.

braindotai · July 30, 2021, 6:23am

Yup, I've also recently discovered this, and it's awesome.
Thanks for your time.

Sara176 · January 20, 2023, 1:19pm

Hi,
I am a beginner. My inputs are videos that have been processed and stored in tensors of shape
[batch size, number of channels, number of frames, height, width].
I need to apply transformations so that I can use a pretrained ResNet. Is there anything else I can try?

ptrblck · January 20, 2023, 6:57pm

What did you already try and what kind of issues were you seeing?