Please, how can I fine-tune a ViT for a regression task? Would a head like nn.Linear(768, 1) (i.e. num_classes = 1) work?
You can also replace the heads with nn.Identity() to extract the class-token feature of the encoder:
import torch
import torch.nn as nn
import torchvision.models as models
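# Inside your nn.Module's __init__ (torchvision.models is imported as `models` above):
self.f = models.vit_b_32()
self.f.heads = nn.Identity()  # self.f(x) now returns the 768-dim class-token feature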
Hi @ptrblck and @diegoaichele, could you please help me with this: Using nn.Sequential to use models.video.mvit_v2_s for feature extraction