SwinTransformer issue with PyTorch pretrained models

februarysea · November 12, 2023, 1:13am

I am currently experimenting with the pretrained SwinTransformer models from torchvision and checking the model structure with torchinfo. The code is as follows:

from torchvision.models import swin_t, Swin_T_Weights
import torch.nn as nn
from torchinfo import summary

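# Load Swin-T with the default pretrained weights.
model = swin_t(weights=Swin_T_Weights.DEFAULT)
# Drop the last child (the classification head) to get a feature extractor.
features = nn.Sequential(*list(model.children())[:-1])
print(summary(model, input_size=(4, 3, 224, 224)))
print(summary(features, input_size=(4, 3, 224, 224)))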

The output from torchinfo is a bit strange. These are the results for the last few layers of each model. I expected the output shape of AdaptiveAvgPool2d in the features module to also be [4, 768, 1, 1], but instead it is [4, 7, 1, 1]. Can you tell me what’s wrong with my code?

# model
├─LayerNorm: 1-2                                   [4, 7, 7, 768]            1,536
├─AdaptiveAvgPool2d: 1-3                           [4, 768, 1, 1]            --
├─Linear: 1-4                                      [4, 1000]                 769,000

# features
├─LayerNorm: 1-2                                   [4, 7, 7, 768]            1,536
├─AdaptiveAvgPool2d: 1-3                           [4, 7, 1, 1]              --
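
One possible explanation, sketched below with plain tensors rather than the real model: the LayerNorm output is channel-last ([4, 7, 7, 768]), and if the installed torchvision version does the permute to channel-first inside forward() instead of as a child module (an assumption, since the summary above lists no Permute layer), then nn.Sequential(*model.children()) silently drops that step. AdaptiveAvgPool2d then reads dimension 1 (size 7) as the channel dimension. The `Permute` class here is a hypothetical helper written for this illustration, not something taken from the original code.

```python
import torch
import torch.nn as nn


class Permute(nn.Module):
    """Hypothetical helper: reorders tensor dimensions inside nn.Sequential."""

    def __init__(self, dims):
        super().__init__()
        self.dims = dims

    def forward(self, x):
        return x.permute(*self.dims)


# Stand-in for the LayerNorm output: (batch, height, width, channels).
x = torch.randn(4, 7, 7, 768)
pool = nn.AdaptiveAvgPool2d(1)

# Without a permute, the pool treats the input as (N=4, C=7, H=7, W=768).
wrong = pool(x)

# With the permute restored, the pool sees (N=4, C=768, H=7, W=7).
fixed = nn.Sequential(Permute((0, 3, 1, 2)), pool)
right = fixed(x)

print(tuple(wrong.shape))  # (4, 7, 1, 1)
print(tuple(right.shape))  # (4, 768, 1, 1)
```

If this is the cause, any operation that lives only in forward() (such as a permute or flatten) has to be re-added by hand when rebuilding the model from its children.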