Torchvision transforms for videos

How can I apply transformations such as Resize, CenterCrop, RandomCrop, RandomHorizontalFlip, etc. to a loaded video stored as a torch.Tensor with four dimensions (channels, frames, height, width)? (It's okay if I have to reshape it.)

You could create custom transformations that apply the torchvision.transforms in a loop to each sample (or rewrite the transformations so that they work on batched inputs).
torchvision also has some internal video transforms. Since their API isn't finalized, this code might break, so you shouldn't use it if you rely on backwards compatibility.

I don't know if you are still looking for a solution, but maybe this will help: GitHub - facebookresearch/pytorchvideo: A deep learning library for video understanding research.

Yup, I've also recently discovered this, and it's awesome :)
Thanks for your time.

Hi,
I am a beginner. My inputs are videos that have been processed and stored in tensors of shape
[batch size, number of channels, number of frames, height, width].
I need to apply transformations in order to use a pretrained ResNet. Is there anything else I can try?

What did you already try and what kind of issues were you seeing?