How to visualize the actual convolution filters in CNN

Sohrab_Salimian · February 20, 2018, 3:37pm

Hey all, just wondering how I can visualize the actual convolution filters in a CNN. I can already display the output of a convolution when an input is given to it; I just want to know how to display the actual convolution filter.

ptrblck · February 20, 2018, 3:58pm

You could get the weights and use matplotlib for it:

import matplotlib.pyplot as plt
import torch.nn as nn

conv1 = nn.Conv2d(3, 1, 3)
weight = conv1.weight.data.numpy()
plt.imshow(weight[0, ...])

Sohrab_Salimian · February 20, 2018, 4:35pm

I see, that makes a lot of sense. Thank you very much!

PGG-DeepAI · September 21, 2020, 10:10pm

I'm trying to reach the different filters of the Conv2d layers in a ResNet. Can you help me get there, please?
If I understand it correctly, this only gets the filters of the first Conv2d layer?

weight = conv1.weight.data.numpy()

Or does the operation unfold all the Conv2d weights over conv1? Thanks in advance!

ptrblck · September 22, 2020, 4:43am

This operation returns the weight tensor, which contains all filters of that layer.
Note that the code snippet is a bit old by now and you shouldn’t use the .data attribute anymore.
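
As a rough sketch of what the same lookup could look like without .data: the toy layer mirrors the earlier snippet, and the variable names are only illustrative.

```python
import torch.nn as nn

conv1 = nn.Conv2d(3, 1, 3)  # same toy layer as in the earlier snippet

# detach() returns a tensor that shares storage with the parameter but is
# excluded from autograd, so converting it to NumPy is allowed.
weight = conv1.weight.detach().cpu().numpy()

# Layout: (out_channels, in_channels, kernel_height, kernel_width)
print(weight.shape)
```

The .cpu() call is a no-op here and only matters if the layer has been moved to a GPU.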

PGG-DeepAI · September 22, 2020, 11:51pm

Thank you for your guidance!
So when I run that operation on a resnet18, I get an 8x8 grid of filters (so 64 filters of variable sizes). Visualizing the resnet18 feature maps, I see the model is composed of 72 layers (ResNet blocks included). If all weights correspond to filters (avgpool and cn), I imagine it is also showing “filters” of the ReLU layers, and only 8 untrainable layers are excluded, those corresponding to the ResNet skip blocks?
You are everywhere here, and I read you a lot. Thank you for your effort and teachings!

ptrblck · September 23, 2020, 6:25am

Be a bit careful about the shape of the weight parameter.
The filters in nn.Conv2d are stored as [output_channels=nb_filters, input_channels, kernel_height, kernel_width].
In the default setup, each filter (number of filters is defined by out_channels) will use all input channels to calculate its activation map.
Have a look at CS231n - Convolutional Layer for more information on the shape of conv layers.

No, nn.ReLU() doesn’t have any trainable parameters and thus no filters.
Note that the conv.weight parameter gives you the convolution filters, not the activations in case you are mixing these up.
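
To make the distinction concrete, here is a small sketch under assumed sizes (3 input channels, 8 filters, 5x5 kernels, a random 32x32 input). None of these numbers come from the thread; they are only there to show the shapes.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=5)
relu = nn.ReLU()

# Filters: [out_channels, in_channels, kernel_height, kernel_width]
print(conv.weight.shape)

# A single filter spans all input channels of the layer.
first_filter = conv.weight[0]
print(first_filter.shape)

# ReLU has no trainable parameters, hence nothing to plot as a filter.
relu_params = list(relu.parameters())
print(len(relu_params))

# Activations are what you get when the filters are applied to an input.
x = torch.randn(1, 3, 32, 32)  # random stand-in for an image batch
activation = relu(conv(x))
print(activation.shape)
```

The weight shape depends only on the layer definition, while the activation shape also depends on the input size.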

PGG-DeepAI · September 23, 2020, 4:26pm

Thank you again!
I reviewed the output again: I’m getting 64 filters of 7x7, and that is bothering me because it matches neither the model layers nor the filter sizes!

If I print the model I see 72 layers. Of those, 20 are Conv2d layers (which matches the layers displayed in the mapping showing Conv2d), and of those only the first has a 7x7 filter (…kernel (7,7)); 16 others are 3x3, and 3 are 1x1.
MaxPool and AvgPool don’t have trainable parameters either…
So what am I seeing in the filter output?
Is it the first filter, 64 times, running through the Res blocks and being modified? That’s my best guess, but I’m quite puzzled.
If so, it means the conv1 parameter does NOT in fact store the full tensor of weights, and to access the other filters I must do something like: filter = model_conv.layer1.0.conv1.weight.clone() BUT I’m not able to access the 0 and 1 blocks of layer1-4 (which contain the other conv1 tensors) that way.

My code for model:

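import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
from torch.optim import lr_scheduler

# Assumed device setup (not shown in the original post)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_conv = torchvision.models.resnet18(pretrained=True)
for param in model_conv.parameters():
    param.requires_grad = False  # freeze the pretrained backbone
num_ftrs = model_conv.fc.in_features
model_conv.fc = nn.Linear(num_ftrs, 2)  # new head, trainable by default
model_conv = model_conv.to(device)
criterion = nn.CrossEntropyLoss()
optimizer_conv = optim.SGD(model_conv.fc.parameters(), lr=0.001, momentum=0.9)
exp_lr_scheduler = lr_scheduler.StepLR(optimizer_conv, step_size=7, gamma=0.1)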

Print of model:


ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
    (1): BasicBlock(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer2): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer3): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (layer4): Sequential(
    (0): BasicBlock(
      (conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (downsample): Sequential(
        (0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
        (1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): BasicBlock(
      (conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=512, out_features=2, bias=True)
)

Code for visualizing filters:

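    import matplotlib.pyplot as plt
    from torchvision import utils

    def visTensor(tensor, ch=0, allkernels=False, nrow=8, padding=1):
        # Conv2d weights are laid out as [out_channels, in_channels, height, width]
        n, c, h, w = tensor.shape

        if allkernels:
            # One tile per (filter, input channel) pair
            tensor = tensor.view(n * c, -1, h, w)
        elif c != 3:
            # Not RGB-compatible: show only input channel `ch` of each filter
            tensor = tensor[:, ch, :, :].unsqueeze(dim=1)

        rows = min(tensor.shape[0] // nrow + 1, 64)
        grid = utils.make_grid(tensor, nrow=nrow, normalize=True, padding=padding)
        plt.figure(figsize=(nrow, rows))
        plt.imshow(grid.numpy().transpose((1, 2, 0)))


    if __name__ == "__main__":
        filter = model_conv.layer1[0].conv1.weight.clone()
        print(filter.shape)
        # detach() so that .numpy() also works if the weights require grad
        visTensor(filter.detach().cpu(), ch=0, allkernels=False)

        plt.axis('off')
        plt.ioff()
        plt.show()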

Hope you can give more insights. Tremendously thankful for your help!

EDIT: OK, so we can access it like this: model_conv.layer1[0].conv1.weight.clone(), which gives 64 filters of 64 channels of 3x3 size!

ptrblck · September 24, 2020, 12:20am

I’ve edited your post and ask you to not use any expletives in your posts.

You can access different layers by directly calling the attribute, e.g.:

model_conv.layer1[0].conv1.weight
model_conv.layer2[1].conv2.weight
...

PGG-DeepAI · September 24, 2020, 4:23pm

Haha, thanks for your help. Sure, I will be more careful about expletives!

So to answer myself:

ResNet18's first layer (conv1, a Conv2d) takes RGB images (3 channels) as input and outputs 64 channels. That's why the conv1.weight tensor holds 64 filters of 3x7x7 (or 64 RGB 7x7 filters): a convolution has one filter per output channel, and each filter has one 2D kernel per input channel.
In other layers, the weight tensor will have, for example, a shape like 64,64,3,3 (in layer1[0], for both the conv1 and conv2 attributes). That means 64 filters with 64 channels of size 3x3, i.e. 4096 3x3 kernels that get plotted as normalized black-and-white tiles.

So to summarize: the number of 2D kernels in a single Conv2d = number of output channels * number of input channels, while the number of filters equals the number of output channels. Each filter has as many channels as the Conv2d has input channels (if input channels = 3, you can render each filter as one RGB image by joining its 3 channels from the .weight attribute, which divides the number of plotted tiles by 3).
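
The counting above can be checked with a few lines. This is only a sketch on a freshly constructed layer with the same shape as layer1[0].conv1; the names num_filters, num_slices and slices are made up for the example.

```python
import torch.nn as nn

# Same configuration as layer1[0].conv1 in the printed model
conv = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)

out_ch, in_ch, kh, kw = conv.weight.shape

num_filters = out_ch          # one multi-channel filter per output channel
num_slices = out_ch * in_ch   # one 2D kernel per (output, input) channel pair

# Flatten into a batch of single-channel 3x3 tiles, ready for a grid plot.
slices = conv.weight.detach().reshape(out_ch * in_ch, 1, kh, kw)

print(num_filters, num_slices, tuple(slices.shape))
```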

Thanks a lot!

PS: To access even deeper blocks, the .convX attribute is “skipped”, for example in ResNet18:

filter = model_conv.layer2[0].downsample[0].weight.clone()

This gives you the Conv2d weight tensor containing 128 filters with 64 channels of size 1x1.

EdenWilly9999 · September 30, 2020, 12:35pm

Hello PGG-DeepAI,
I followed your code and can successfully display the kernels. However, it only displays greyscale images, and setting cmap in plt.imshow has no effect. Could you give me some suggestions on that?
In addition, using “filter = model_conv.layer1[0].conv1.weight.clone()” means that we extract the filters of a specific layer and visualize them. However, what I need is to feed an image to my network, run the training, and then visualize the kernels. What steps should I implement?
Thanks for your help

PGG-DeepAI · October 7, 2020, 3:59pm

Hey!
So I explained it as briefly as I could, so as not to be punished by @ptrblck for my expletives.
Every filter with that code is plotted in black and white (as you are normalizing only one channel of one filter at a time). Only a convolution layer with 3 input channels (normally the first layer, which takes RGB images) produces a filter tensor with the right shape for an RGB filter image (3xFxF, F being the filter size). Otherwise your filters will make no sense as RGB (for example, trying to split a 64-channel filter by joining 3 channels at a time), but you can try and see what kind of RGB image you get!

For your second question: filters = weights of a convolution layer, and you will only get meaningful ones from a trained model. I imagine you already did this if you are visualizing them! Maybe you are looking to visualize the activation map?

Cheers!