Getting a prediction from each block of a vision transformer

binbbaz · March 16, 2022, 1:01pm

Is it possible to get a prediction from each block of a pretrained vision transformer? This is how I load the model:

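import torch
import timm  # not called directly; the DeiT hub entry point requires timm to be installed

model = torch.hub.load('facebookresearch/deit:main', 'deit_small_patch16_224', pretrained=True)
model.eval()
# load data (ImageNet); `data` is a batch of preprocessed images
prediction = model(data)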

The loaded model has 12 blocks, each with 384 input and output features.
Is it possible to get a prediction from each of the 12 blocks?
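
One possible approach, sketched under some assumptions: register a forward hook on every block to capture its output tokens, then pass each captured output through the model's final norm and classification head. The helper name per_block_predictions is hypothetical. The sketch assumes the model exposes blocks, norm, and head attributes (as timm-style vision transformers usually do) and that the class token sits at index 0 of the token sequence. Only the last block was trained to feed the head, so predictions read from earlier blocks are likely to be less accurate.

```python
import torch
import torch.nn as nn


def per_block_predictions(model: nn.Module, x: torch.Tensor) -> list:
    """Return a list with one logits tensor per transformer block.

    Assumes `model.blocks` is iterable and that `model.norm` and
    `model.head` exist (an assumption about the model's layout).
    """
    block_outputs = []

    def save_output(module, inputs, output):
        # Each block returns a (batch, tokens, dim) tensor; keep it.
        block_outputs.append(output)

    # Attach one hook per block so a single forward pass records every output.
    handles = [blk.register_forward_hook(save_output) for blk in model.blocks]
    try:
        with torch.no_grad():
            model(x)
    finally:
        # Always remove the hooks so later forward passes are unaffected.
        for handle in handles:
            handle.remove()

    with torch.no_grad():
        # Reuse the final norm and head on the class token (assumed index 0).
        return [model.head(model.norm(tokens)[:, 0]) for tokens in block_outputs]
```

With the model loaded above, per_block_predictions(model, data) should return 12 tensors of logits, and the last one should match model(data) if the assumptions about the model's layout hold.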