# What's the right way to take a 4D tensor (F, W, H, C) and convert it to (F*C, W, H)?

Consider an output of a convolution which returns a tensor with `F` filters where each filter is `(W, H, C)` tensor (width, height, channels).

Is there a simple way to “unpack” the channels so that there are `F * C` grayscale filters? In other words, converting a 4D tensor of shape `(F, W, H, C)` to `(F*C, W, H, 1)` or `(F*C, W, H)`, such that the tensor is sliced along the last dimension and the slices stacked along the first?

The output of a convolution will have the following dimensions: `[batch_size, number_of_kernels, w, h]`.
I think you would like to see the kernels, which have the dimensions: `[number_of_kernels, input_channels, kernel_width, kernel_height]`.

Here is a small example:

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

conv = nn.Conv2d(in_channels=3,
                 out_channels=6,
                 kernel_size=5)
x = Variable(torch.randn(1, 3, 24, 24))

output = conv(x)
print(output.shape)            # torch.Size([1, 6, 20, 20])
print(conv.weight.data.shape)  # torch.Size([6, 3, 5, 5])

# Flatten (out_channels, in_channels) into one axis of grayscale kernels
conv_ = conv.weight.data.view(-1, 5, 5)  # (18, 5, 5)

import matplotlib.pyplot as plt
plt.imshow(conv_[0].numpy())
```
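To see the channel ordering that `view(-1, 5, 5)` produces, here is a small sanity check on a stand-in weight tensor (random values, but with the same `(out_channels, in_channels, kH, kW)` layout as the `Conv2d` weight above):

```python
import torch

# Stand-in for conv.weight.data: (out_channels, in_channels, kH, kW)
w = torch.randn(6, 3, 5, 5)

flat = w.view(-1, 5, 5)  # (18, 5, 5)

# view preserves row-major order, so the slices come out
# channel-major within each filter: flat[f * 3 + c] == w[f, c]
assert torch.equal(flat[0], w[0, 0])
assert torch.equal(flat[1], w[0, 1])
assert torch.equal(flat[4], w[1, 1])
```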

How did you get the tensor with `[F, W, H, C]`?


Oops, you’re right: my first sentence isn’t actually what I’ve been doing. I’m not dealing with outputs of a convolution, but rather visualizing convolution filters from pre-trained networks.

The filters are actually stored as `[num_kernels, num_channels, width, height]`:

```python
> torchvision.models.alexnet(pretrained=True).features[0].weight.shape
torch.Size([64, 3, 11, 11])
```

(Note that `features` is an `nn.Sequential`, so you need to index into it, e.g. `features[0]`, to get at a layer’s `weight`.)

but through my own process of visualizing this in matplotlib I ended up basically doing

```python
> weights = torchvision.models.alexnet(pretrained=True).features[0].weight
> weights.transpose(1, 2).transpose(2, 3).shape
torch.Size([64, 11, 11, 3])
```

and then tried to reshape the `[64, 11, 11, 3]` into `[64 * 3, 11, 11]` without transposing back. Now that I’ve written this out, it makes no sense to do that: the filters are already stored as `[64, 3, 11, 11]`, so I can just do `weights.view(-1, 11, 11)` as you showed.
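For anyone curious why reshaping the channels-last version directly goes wrong, here is a quick demonstration on a random stand-in for the AlexNet weights:

```python
import torch

# Stand-in for the AlexNet conv1 weights, stored channel-first
w = torch.randn(64, 3, 11, 11)

# Correct: flatten filters and channels directly
good = w.view(-1, 11, 11)  # (192, 11, 11)

# Wrong: reshaping the channels-last version without permuting back
# carves out 11x11 slices that mix spatial positions and channels
bad = w.permute(0, 2, 3, 1).reshape(-1, 11, 11)  # also (192, 11, 11)

assert good.shape == bad.shape == (192, 11, 11)
assert torch.equal(good[1], w[0, 1])        # a real channel slice
assert not torch.equal(bad[1], w[0, 1])     # scrambled data
```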

To answer the original question in case anyone is interested… use `permute` to change the order of the dimensions, then merge the first two. Note that `permute` returns a non-contiguous tensor, so use `reshape` (or `.contiguous().view()`) rather than plain `view`:

``tensor.permute(0, 3, 1, 2).reshape(F*C, W, H)``
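A quick check of that recipe on a random channels-last tensor (the sizes here are made up for illustration):

```python
import torch

F, W, H, C = 4, 11, 11, 3
t = torch.randn(F, W, H, C)  # channels-last, as in the original question

# Move channels next to the filter axis, then merge the two.
# reshape handles the non-contiguous result of permute (view would error).
out = t.permute(0, 3, 1, 2).reshape(F * C, W, H)

# out[f * C + c] is channel c of filter f
assert out.shape == (F * C, W, H)
assert torch.equal(out[0], t[0, :, :, 0])
assert torch.equal(out[5], t[1, :, :, 2])
```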

Sweet, didn’t know about `.permute`!

As a somewhat related question, any idea if there is a way of doing `data.view(F*C, H, W)` without having to specify the remaining dimensions? Something like `data.flatten(0, 1)` which would be equivalent to `data.reshape(-1, *data.shape[2:])`?

I suppose you could do

```python
tensor.view(-1, *tensor.size()[2:])
```

I don’t know of any other possibilities.
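For future readers: later PyTorch versions added exactly this as `Tensor.flatten(start_dim, end_dim)`, which merges the given range of dimensions (assuming a PyTorch recent enough to have it):

```python
import torch

t = torch.randn(4, 3, 11, 11)

a = t.view(-1, *t.size()[2:])
b = t.flatten(0, 1)  # merge dims 0 and 1, keep the rest

assert a.shape == b.shape == (12, 11, 11)
assert torch.equal(a, b)
```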