Feature visualization not repeatable using torchcam for a CNN classifier

Hello PyTorch forums, and thanks for all the help you have provided me so far!

I’m trying to visualize the features (filters) of my ResNet CNN when applied to a binary classification problem. The goal is to see how my model interprets images of sawn timber when classifying them as either A or B. Ideally I would like to see a feature map highlighting (with big weights) things like knots as the deciding factor for the classification.

My current attempt is using the torch-cam package from github, which to my understanding is the go-to method, yes? My code with some comments/thoughts is:

!pip install torchcam
from PIL import Image
import numpy as np
import matplotlib.pyplot as plt
import torch
from torchcam.cams import SmoothGradCAMpp

# model.eval() is executed prior to the code in this post.

# Using a fixed sample image
img = Image.open('/content/drive/My Drive/data/Lundgrens collection/A/63546298.png')

# Transformations are resize, to tensor, and normalize.
# Shape becomes [1, 3, 50, 783] which is very wide!
img_tensor = torch.unsqueeze(transformations(img),0)

# The image colors look very strange; I think this is due to the normalization, which is OK I guess.
fig, ax = plt.subplots(figsize=(36, 6))
ax.imshow(np.transpose(img_tensor[0], (1, 2, 0)), interpolation='nearest')

### Layer 1
# Hook your model before the forward pass
cam_extractor = SmoothGradCAMpp(model, 'layer1') # What is "layer1" here? See below.
# By default the last conv layer will be selected
out = model(img_tensor)
# Retrieve the CAM
activation_map = cam_extractor(out.squeeze(0).argmax().item(), out)

fig, ax = plt.subplots(figsize=(36, 6))
ax.imshow(activation_map, interpolation='nearest')
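As a side note on the strange colors mentioned in the comments above: `imshow` is being given the normalized tensor directly. For display you can undo the normalization first. A minimal sketch, assuming the standard ImageNet mean/std were used in `transformations` (substitute whatever values you actually normalize with):

```python
import numpy as np

# Assumed normalization constants -- replace with the ones used in `transformations`.
mean = np.array([0.485, 0.456, 0.406])
std = np.array([0.229, 0.224, 0.225])

def denormalize(chw):
    """Undo channel-wise (x - mean) / std and return an HWC array in [0, 1] for imshow."""
    hwc = np.transpose(chw, (1, 2, 0))  # CHW -> HWC
    return np.clip(hwc * std + mean, 0.0, 1.0)
```

With this, something like `ax.imshow(denormalize(img_tensor[0].numpy()))` would show the image in its original colors.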

The problem is that if I run this code repeatedly the feature visualization is never the same. It is completely different, seemingly as if a random image were being sent to the model. Furthermore, I don’t think I understand the hooking of “layer1”, since as shown below layer1 consists of several sub-layers. So maybe I just don’t understand how to use the torch-cam package? Reading the source code is a bit too complicated for me, and I don’t understand it well enough to tell what is wrong here.


Thanks for any help or feedback you could provide! I’m no expert on PyTorch, so don’t hold back if I’m doing something silly.

I’m a little bit confused by this comment, since that line of code doesn’t set the model in evaluation mode; it only moves the model from its current device to the "cpu" device.

But maybe your comment means that somewhere else (in code you’ve not shared) you’ve called model.eval()? Just want to make sure, because judging from your description it definitely seems like the model is not in evaluation mode.

You are correct, that code is run earlier. I’ll make an edit to try to make it clearer, thanks!

OK, so I looked at the source code for SmoothGradCAMpp, and they’re adding random noise to the input, which explains what you’re describing. If you’re interested, you can find the relevant line of code here: torch-cam/gradcam.py at b1c61003f036b754cc3bf21fc24208da96f25019 · frgfm/torch-cam · GitHub

If you want to get the same output for every subsequent run of your code, the simplest solution is to use a fixed seed for PyTorch’s random number generator:

torch.manual_seed(42)  # Change '42' to whatever you want

Hope this helps.
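To illustrate: re-seeding PyTorch’s RNG before each run makes the random draws (and hence the noise SmoothGradCAMpp adds) identical every time. A quick sketch:

```python
import torch

torch.manual_seed(42)
a = torch.randn(3)

torch.manual_seed(42)  # same seed -> same draws on the next run
b = torch.randn(3)

# a and b are identical tensors
```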

Thanks that really explains things!

Why though?? Isn’t the point of feature visualization to understand how the weights behave for an image? In my case, the noise seems to dominate, and more or less the only thing I see is something related to this noise. The feature maps are always completely unrelated to the previous iteration.

Am I misunderstanding something here, why would you want this? Can I turn the noise completely off?

Edit: I remembered something and found out that I already do this!


The reasoning can be found in the Smooth Grad-CAM++ paper. They compute an average (i.e. smoothed) sensitivity map over a small neighborhood of the input image, and they create this “neighborhood” by making multiple noisy versions of the input.
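The averaging idea can be sketched in a few lines, independent of torch-cam. A toy NumPy version (the names here are illustrative, not torch-cam’s actual API):

```python
import numpy as np

def smoothed_map(sensitivity_fn, x, n_samples=8, sigma=0.1, seed=0):
    """SmoothGrad-style averaging: evaluate the sensitivity map on several
    noisy copies of the input and return their mean."""
    rng = np.random.default_rng(seed)
    maps = [sensitivity_fn(x + rng.normal(0.0, sigma, size=x.shape))
            for _ in range(n_samples)]
    return np.mean(maps, axis=0)

# Toy sensitivity: the gradient of sum(x**2) w.r.t. x is 2*x.
toy_fn = lambda x: 2 * x
x = np.ones((2, 2))
m = smoothed_map(toy_fn, x)
```

With `sigma=0` this reduces to the plain (unsmoothed) map, which is why the noise scale matters so much for what you see.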

If the noise is dominating, then there might be something wrong with your transformations. I would make sure whatever transformations you’re using are appropriate for the CNN model you’re using (looks like ResNet). Also, make sure you scale any images to [0, 1] (or [0, 255]) before visualizing them with matplotlib.

But honestly, there are so many things that could cause your problem, and it’s just a guessing game (at least it is for me) unless we have a reproducible example. Maybe you could make and share a minimal example on Google Colab? Looks like you’re already using Colab, so hopefully not too much of a problem to create a simple example that doesn’t require access to files in your Google Drive, etc.

If you want a method without random additive noise, use GradCAMpp instead of SmoothGradCAMpp.


Your guess was much more accurate than you think: the solution was to use GradCAM! This makes the feature maps repeatable, and I can now investigate them!

Big thanks to you for your fast and very helpful insights!
