Visualizing 3D filters from r3d_18

So I have trained an r3d_18 model on a video dataset. I want to visualize the filters of the first convolution layer.
I tried this:
model.state_dict().keys()
This gives:

odict_keys(['stem.0.weight', 'stem.1.weight', 'stem.1.bias', 'stem.1.running_mean', 'stem.1.running_var', 'stem.1.num_batches_tracked', 'layer1.0.conv1.0.weight', 'layer1.0.conv1.1.weight', 'layer1.0.conv1.1.bias', 'layer1.0.conv1.1.running_mean', 'layer1.0.conv1.1.running_var', 'layer1.0.conv1.1.num_batches_tracked', 'layer1.0.conv2.0.weight', 'layer1.0.conv2.1.weight', 'layer1.0.conv2.1.bias', 'layer1.0.conv2.1.running_mean', 'layer1.0.conv2.1.running_var', 'layer1.0.conv2.1.num_batches_tracked', 'layer1.1.conv1.0.weight', 'layer1.1.conv1.1.weight', 'layer1.1.conv1.1.bias', 'layer1.1.conv1.1.running_mean', 'layer1.1.conv1.1.running_var', 'layer1.1.conv1.1.num_batches_tracked', 'layer1.1.conv2.0.weight', 'layer1.1.conv2.1.weight', 'layer1.1.conv2.1.bias', 'layer1.1.conv2.1.running_mean', 'layer1.1.conv2.1.running_var', 'layer1.1.conv2.1.num_batches_tracked', 'layer2.0.conv1.0.weight', 'layer2.0.conv1.1.weight', 'layer2.0.conv1.1.bias', 'layer2.0.conv1.1.running_mean', 'layer2.0.conv1.1.running_var', 'layer2.0.conv1.1.num_batches_tracked', 'layer2.0.conv2.0.weight', 'layer2.0.conv2.1.weight', 'layer2.0.conv2.1.bias', 'layer2.0.conv2.1.running_mean', 'layer2.0.conv2.1.running_var', 'layer2.0.conv2.1.num_batches_tracked', 'layer2.0.downsample.0.weight', 'layer2.0.downsample.1.weight', 'layer2.0.downsample.1.bias', 'layer2.0.downsample.1.running_mean', 'layer2.0.downsample.1.running_var', 'layer2.0.downsample.1.num_batches_tracked', 'layer2.1.conv1.0.weight', 'layer2.1.conv1.1.weight', 'layer2.1.conv1.1.bias', 'layer2.1.conv1.1.running_mean', 'layer2.1.conv1.1.running_var', 'layer2.1.conv1.1.num_batches_tracked', 'layer2.1.conv2.0.weight', 'layer2.1.conv2.1.weight', 'layer2.1.conv2.1.bias', 'layer2.1.conv2.1.running_mean', 'layer2.1.conv2.1.running_var', 'layer2.1.conv2.1.num_batches_tracked', 'layer3.0.conv1.0.weight', 'layer3.0.conv1.1.weight', 'layer3.0.conv1.1.bias', 'layer3.0.conv1.1.running_mean', 'layer3.0.conv1.1.running_var', 'layer3.0.conv1.1.num_batches_tracked', 'layer3.0.conv2.0.weight', 'layer3.0.conv2.1.weight', 'layer3.0.conv2.1.bias', 'layer3.0.conv2.1.running_mean', 'layer3.0.conv2.1.running_var', 'layer3.0.conv2.1.num_batches_tracked', 'layer3.0.downsample.0.weight', 'layer3.0.downsample.1.weight', 'layer3.0.downsample.1.bias', 'layer3.0.downsample.1.running_mean', 'layer3.0.downsample.1.running_var', 'layer3.0.downsample.1.num_batches_tracked', 'layer3.1.conv1.0.weight', 'layer3.1.conv1.1.weight', 'layer3.1.conv1.1.bias', 'layer3.1.conv1.1.running_mean', 'layer3.1.conv1.1.running_var', 'layer3.1.conv1.1.num_batches_tracked', 'layer3.1.conv2.0.weight', 'layer3.1.conv2.1.weight', 'layer3.1.conv2.1.bias', 'layer3.1.conv2.1.running_mean', 'layer3.1.conv2.1.running_var', 'layer3.1.conv2.1.num_batches_tracked', 'layer4.0.conv1.0.weight', 'layer4.0.conv1.1.weight', 'layer4.0.conv1.1.bias', 'layer4.0.conv1.1.running_mean', 'layer4.0.conv1.1.running_var', 'layer4.0.conv1.1.num_batches_tracked', 'layer4.0.conv2.0.weight', 'layer4.0.conv2.1.weight', 'layer4.0.conv2.1.bias', 'layer4.0.conv2.1.running_mean', 'layer4.0.conv2.1.running_var', 'layer4.0.conv2.1.num_batches_tracked', 'layer4.0.downsample.0.weight', 'layer4.0.downsample.1.weight', 'layer4.0.downsample.1.bias', 'layer4.0.downsample.1.running_mean', 'layer4.0.downsample.1.running_var', 'layer4.0.downsample.1.num_batches_tracked', 'layer4.1.conv1.0.weight', 'layer4.1.conv1.1.weight', 'layer4.1.conv1.1.bias', 'layer4.1.conv1.1.running_mean', 'layer4.1.conv1.1.running_var', 'layer4.1.conv1.1.num_batches_tracked', 'layer4.1.conv2.0.weight', 'layer4.1.conv2.1.weight', 'layer4.1.conv2.1.bias', 'layer4.1.conv2.1.running_mean', 'layer4.1.conv2.1.running_var', 'layer4.1.conv2.1.num_batches_tracked', 'fc.weight', 'fc.bias'])

I think the first layer's weights would be 'layer1.0.conv1.0.weight'.

I extracted them using:
cnn_weights = model.state_dict()['layer1.0.conv1.0.weight'].cpu()

The shape:
cnn_weights.shape

Result:

torch.Size([64, 64, 3, 3, 3])

Now, how do I plot these filters, both as XY (spatial) slices and XT (space-time) slices?

You could iterate over the filters, the input channels, and the temporal depth of each filter, and visualize each 3x3 slice via matplotlib.pyplot.imshow; see the sketch below. I'm not aware of any good technique for visualizing the full 4D kernels directly.
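Here is a minimal sketch, assuming `model` is the trained r3d_18 from your post and the usual Conv3d weight layout of [out_channels, in_channels, T, H, W]. The number of filters shown, the input channel, and the slice positions (middle frame for XY, middle row for XT) are arbitrary choices for illustration:

```python
import matplotlib.pyplot as plt

# Assumes `model` is the trained r3d_18 from the question.
# Conv3d weights are laid out as [out_channels, in_channels, T, H, W].
w = model.state_dict()['layer1.0.conv1.0.weight'].cpu().numpy()
w = (w - w.min()) / (w.max() - w.min())  # rescale to [0, 1] for imshow

out_ch, in_ch, T, H, W = w.shape  # (64, 64, 3, 3, 3) here
n = 8  # number of filters to show (arbitrary)
c = 0  # input channel to slice (arbitrary)

fig, axes = plt.subplots(2, n, figsize=(2 * n, 4))
for i in range(n):
    # XY view: fix the temporal index (middle frame) -> an H x W image
    axes[0, i].imshow(w[i, c, T // 2], cmap='gray')
    axes[0, i].set_title(f'f{i} XY')
    axes[0, i].axis('off')
    # XT view: fix the spatial row (middle y) -> a T x W image
    axes[1, i].imshow(w[i, c, :, H // 2, :], cmap='gray')
    axes[1, i].set_title(f'f{i} XT')
    axes[1, i].axis('off')
plt.tight_layout()
plt.show()
```

Note that each slice here is only 3x3, so the images will be very coarse. Also, for torchvision's r3d_18 the first convolution of the network is actually the stem ('stem.0.weight', with 3x7x7 kernels over the 3 RGB input channels), which may be closer to what you mean by "first layer" and whose larger spatial kernels are usually more interesting to look at.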