Feature extraction in torchvision.models.vit_b_16

Please, how can I fine-tune ViT for a regression task, e.g. with nn.Linear(768, 1) as the head (num_classes = 1)?

You can also replace the heads with nn.Identity() to extract the class token feature from the encoder:

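import torch
import torch.nn as nn
import torchvision.models as models

# Inside an nn.Module's __init__ you would assign to self.f instead.
f = models.vit_b_32()
# With the head replaced, the forward pass returns the 768-dim class token feature.
f.heads = nn.Identity()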

Hi @ptrblck and @diegoaichele, could you please help me with this: Using nn.Sequential to use models.video.mvit_v2_s for feature extraction