Grad-CAM on CIFAR-10 yields 1x1 feature maps

Hi,

I’m implementing Grad-CAM on several architectures, mainly ResNets. The main issue is that the feature maps in the last block are very small: exactly 1x1.

In particular, when I feed a batch of shape 64x3x32x32 (CIFAR-10) to a ResNet-18,
the feature maps after layer4.conv2 have shape [1, 512, 1, 1].

As a result, the CAMs have shape [64, 1, 1, 1]: a single value per image, which is not informative at all!

Is there a simple way to improve this? I’m considering upsampling the CIFAR-10 images to 128x128 before training, so that the CAMs are at least 4x4 and somewhat more informative.
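
For reference, here is a rough sketch of how I understand the sizes above to arise. It assumes the standard ImageNet-style ResNet-18 layout as in torchvision (7x7 stride-2 stem conv with padding 3, 3x3 stride-2 max-pool with padding 1, and a 3x3 stride-2 conv with padding 1 at the start of layer2, layer3 and layer4). The helper names conv_out and resnet18_spatial_sizes are made up for illustration; this is plain arithmetic, not an actual model.

```python
def conv_out(size, kernel, stride, padding):
    """Output side length of a conv/pool layer (floor mode, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


def resnet18_spatial_sizes(input_size):
    """Trace the feature-map side length through an ImageNet-style ResNet-18.

    Assumption: torchvision-like defaults, i.e. five stride-2 stages in total
    (stem conv, max-pool, and the first conv of layer2, layer3, layer4).
    layer1 keeps the resolution it receives from the max-pool.
    """
    sizes = {}
    s = conv_out(input_size, kernel=7, stride=2, padding=3)  # stem conv
    sizes["conv1"] = s
    s = conv_out(s, kernel=3, stride=2, padding=1)           # max-pool
    sizes["maxpool"] = s
    sizes["layer1"] = s                                      # stride 1, unchanged
    for name in ("layer2", "layer3", "layer4"):
        s = conv_out(s, kernel=3, stride=2, padding=1)       # downsampling conv
        sizes[name] = s
    return sizes


if __name__ == "__main__":
    for side in (32, 128):
        print(side, resnet18_spatial_sizes(side))
```

Under these assumptions the side length is halved five times (a factor of 32 overall), so a 32x32 input ends at 1x1 in layer4 and a 128x128 input ends at 4x4, which matches the numbers I mentioned.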

Do you have any advice?

I really appreciate any help you can provide.
G