I need to extract features from images before classifying them, i.e. remove the last (classification) layer of the model, using vit-pytorch.
I tried to bypass the classification layer by setting
self.mlp_head = nn.Identity()
and then ran this code:
from max_vit import MaxViT
from extractor import Extractor

model = MaxViT(
    num_classes = 0,
    dim = 192,
    depth = (2, 6, 14, 2)
)

# img is a single preprocessed image tensor
feature = model(img)
But the feature I get for each image has shape torch.Size([1, 1536, 7, 7]),
and the saved file is 300 MB for 1000 images! Is there anything wrong with the code?
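For reference, here is a quick back-of-the-envelope check of the file size. It is only a sketch: it assumes the features are stored as raw float32 values (4 bytes each) with no compression, and the helper name feature_bytes is made up for illustration. It also shows the size if the 7x7 spatial grid were averaged away, which is one common way to get a single vector per image, not something the code above does.

```python
# Rough storage estimate for the saved features.
# Assumption: values are stored as float32 (4 bytes each), uncompressed.
BYTES_PER_FLOAT32 = 4


def feature_bytes(shape, n_images, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw bytes needed to store n_images feature tensors of the given shape."""
    values = 1
    for d in shape:
        values *= d
    return values * bytes_per_value * n_images


# Shape reported above: [1, 1536, 7, 7] per image.
full_map = feature_bytes((1, 1536, 7, 7), n_images=1000)

# Hypothetical alternative: average over the 7x7 grid, leaving [1, 1536].
pooled = feature_bytes((1, 1536), n_images=1000)

print(f"full 7x7 feature map: {full_map / 1e6:.1f} MB")
print(f"pooled 1536-d vector: {pooled / 1e6:.1f} MB")
```

Under these assumptions the 300 MB figure matches the [1, 1536, 7, 7] shape, so the size seems to come from keeping the full spatial map and not from a bug in saving. If one vector per image is enough, averaging over the last two dimensions (for example feature.mean(dim=(2, 3)) in PyTorch) would give shape [1, 1536].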