Import pytorchvideo transformer model

Hi

I am trying to import the last MViT model from the model zoo with pretrained weights.

link: Model Zoo and Benchmarks — PyTorchVideo documentation

There are many examples for slow_r50 / slowfast_r50, but I could not find any for MViT.

For example, the x3d_s model can be loaded using the following code:

import torch

model_name = 'x3d_s'
model = torch.hub.load('facebookresearch/pytorchvideo', model_name, pretrained=True)

I found this example here: X3D | PyTorch

But how can I load MViT? I tried combining the arch and depth in the model name, which is how many (though not all) of the models are named, but it did not work.

Also, what input shape does the model expect?

Could you please help? Thanks

You can use the GitHub repository as a guide: see how it is done for x3d_s, then look at how it should be done for the transformer.

This is the file for x3d_s. As you can see, you can load any model from this file by using the function name from its def.

  • x3d_s: line 68
  • x3d_m: line 100
  • x3d_l: line 132
  • etc.

If you now go to the transformer file, you can look for the definitions:

  • mvit_base_16x4: line 57
  • mvit_base_32x3: line 92
  • mvit_base_16: line 127
  • etc.
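As an alternative to reading the files by hand, the names can probably be discovered programmatically. The sketch below assumes the repository's hubconf.py exposes the same function names listed above; `filter_entrypoints` and `list_hub_models` are hypothetical helpers written for this example, while `torch.hub.list` is the standard PyTorch call that returns a repo's entrypoints.

```python
def filter_entrypoints(names, prefix):
    """Return the entrypoint names that start with `prefix`, sorted."""
    return sorted(name for name in names if name.startswith(prefix))


def list_hub_models(prefix, repo="facebookresearch/pytorchvideo"):
    """List hub entrypoints of `repo` whose names start with `prefix`."""
    # Imported lazily so the filtering helper works without torch installed.
    import torch

    # torch.hub.list fetches the repo's hubconf.py and returns its entrypoint names.
    return filter_entrypoints(torch.hub.list(repo), prefix)


# Usage (needs network access on the first call):
# print(list_hub_models("mvit"))
```

Any name returned this way should be usable as the second argument to torch.hub.load.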

So now you only need to choose the one you want and do the same thing in your example code.

# For example
model_name = "mvit_base_32x3"
model = torch.hub.load('facebookresearch/pytorchvideo', model_name, pretrained=True)
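On the input shape question: PyTorchVideo video models generally take clips shaped (batch, channels, frames, height, width). The sketch below is a guess rather than a confirmed spec. It assumes the trailing "32x3" in the name means 32 frames sampled every 3 frames, and that the model expects 224x224 crops; please check both against the model zoo table. `clip_shape_from_name` is a hypothetical helper, not part of PyTorchVideo.

```python
def clip_shape_from_name(model_name, batch_size=1, channels=3, crop_size=224):
    """Guess a (B, C, T, H, W) clip shape from a name such as 'mvit_base_32x3'.

    Assumptions: the trailing 'TxS' means T frames at sampling rate S,
    and the spatial crop is crop_size x crop_size (224 is assumed here).
    """
    frames_and_rate = model_name.rsplit("_", 1)[-1]  # e.g. "32x3"
    # Raises ValueError for names without a 'TxS' suffix, e.g. 'mvit_base_16'.
    num_frames, _sample_rate = (int(v) for v in frames_and_rate.split("x"))
    return (batch_size, channels, num_frames, crop_size, crop_size)


# Usage with the model loaded above (requires torch):
# clip = torch.randn(*clip_shape_from_name("mvit_base_32x3"))
# model.eval()
# with torch.no_grad():
#     preds = model(clip)
```

If the forward pass fails with a shape error, the frame count or crop size assumed here is the first thing to check.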

Thank you so much for your detailed reply. It's not just going to solve this problem, but will also help me in the future.
