With a ResNet-50 or larger, I am trying to replace the fc layer with a 1x1 conv layer followed by an fc layer. I need to do this because of a hardware constraint (a 2048-dimensional feature vector is too large), and ResNet-34 may be too small to learn the task.
from torchvision.models import resnet50
model = resnet50(pretrained=True)
....
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
(bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
)
)
(avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
(fc): Linear(in_features=2048, out_features=1000, bias=True)
That last fc layer should be replaced with this:
(heads): EmbeddingHead(
(pool_layer): GlobalAvgPool(output_size=1)
(bottleneck): Sequential(
(0): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
(1): BatchNorm(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
(fc): Linear(in_features=512, out_features=125, bias=True)
)
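For reference, the head printed above could be sketched as a standalone module that consumes the 4-D feature map before pooling (the class, argument names, and defaults here are my assumptions reconstructed from the printout, not the original implementation):

```python
import torch
import torch.nn as nn

class EmbeddingHead(nn.Module):
    def __init__(self, in_channels=2048, embed_dim=512, num_classes=125):
        super().__init__()
        self.pool_layer = nn.AdaptiveAvgPool2d(1)
        self.bottleneck = nn.Sequential(
            nn.Conv2d(in_channels, embed_dim, kernel_size=1, bias=False),
            nn.BatchNorm2d(embed_dim),
        )
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, x):            # x: [N, 2048, H, W] feature map
        x = self.pool_layer(x)       # [N, 2048, 1, 1]
        x = self.bottleneck(x)       # [N, 512, 1, 1]
        x = torch.flatten(x, 1)      # [N, 512] embedding
        return self.fc(x)            # [N, 125] logits

head = EmbeddingHead()
feats = torch.randn(8, 2048, 7, 7)  # shape of ResNet-50's layer4 output
print(head(feats).shape)            # torch.Size([8, 125])
```

The key point is that the 1x1 conv is applied while the tensor is still 4-D; flattening happens only afterwards.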
From what I read, this should work:
self.bottleneck = nn.Sequential(
nn.Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False),
nn.BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True),
)
self.logits = nn.Linear(512, 125)
self.heads = nn.Sequential(self.bottleneck, self.logits)
model.fc = self.heads
but I keep getting:
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [512, 2048, 1, 1], but got 2-dimensional input of size [8, 2048] instead
The goal is that after training I can remove the fc layer at the end and keep the 512-dimensional feature vectors.