Can I fine-tune the pre-trained VisionTransformer model (in my case, use it for feature extraction) the same way as described in the PyTorch tutorial (Finetuning Torchvision Models — PyTorch Tutorials 1.2.0 documentation)?
The tutorial only uses convolutional neural networks, so do I have to modify anything for the VisionTransformer model (VisionTransformer — Torchvision 0.13 documentation), or does it work the same way?