If the original fc layer is defined as follow:
fc = nn.Linear(2048,751)
Does it equal to the following two definition?
fc = nn.Conv2d(in_channels=2048, out_channels=751, kernel_size=[1, 1], stride=(1, 1))
or fc = nn.Conv2d(in_channels=128, out_channels=751, 4, 4 )
Assume that the input feature map is 128@55, that is the size of each feature map is 55, the number of classes is 751. I think the your equivalent is incorrect.