I have a 3D CNN whose initial layers look like:
(conv1): Conv3d(3, 64, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
(pool1): MaxPool3d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv2): Conv3d(64, 192, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
(pool2): MaxPool3d(kernel_size=3, stride=2, padding=0, dilation=1, ceil_mode=False)
(conv3): Conv3d(192, 384, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
(conv4): Conv3d(384, 256, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
(conv5): Conv3d(256, 256, kernel_size=(3, 3, 3), stride=(1, 1, 1), padding=(1, 1, 1))
(pool5): MaxPool3d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
I extracted the weights of the first layer (conv1) with:
cnn_weights = model.state_dict()['module.conv1.weight'].cpu()
The shape:
cnn_weights.shape
gives:
torch.Size([64, 3, 3, 3, 3])
Could you please help me understand what each of those 5 dimensions represents, i.e. which one is height, depth, width, …
Also, in
kernel_size=(3, 3, 3)
what does each of the 3 values represent?
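For reference, here is a small probe that might make the ordering visible. It is only a sketch: `probe` and `x` are made-up names, and the layer is not part of my model. It reuses conv1's channel counts but gives each kernel axis a different extent, relying on the PyTorch docs' statement that Conv3d takes input as (N, C, D, H, W) and stores its weight as (out_channels, in_channels, kD, kH, kW) when groups=1.

```python
import torch
import torch.nn as nn

# Hypothetical probe layer (not from the model above): same channel counts as
# conv1, but three different kernel extents so the axes can be told apart.
probe = nn.Conv3d(in_channels=3, out_channels=64, kernel_size=(2, 3, 4))

# Weight layout per the docs: (out_channels, in_channels, kD, kH, kW)
weight_shape = tuple(probe.weight.shape)
print(weight_shape)

# Input layout per the docs: (N, C, D, H, W). With no padding and stride 1,
# each axis shrinks by (kernel extent - 1), which shows which extent hits which axis.
x = torch.zeros(1, 3, 8, 16, 32)
out_shape = tuple(probe(x).shape)
print(out_shape)
```

If the three kernel extents appear in the last three positions of the weight shape in the same order they were passed, and the D, H and W axes of the output shrink by 1, 2 and 3 respectively, that would suggest kernel_size is ordered (depth, height, width).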
Thanks!