Weights & Biases

Morning,

I am confused. When I store the weights and bias from a convolutional layer using:

# copy the layer's parameters off the device so they can be inspected/plotted
kernels_int_in = conv2d.weight.detach().cpu()
bias_int_in = conv2d.bias.detach().cpu()

the output shape for the weight is [out_channels, in_channels, kernel_height, kernel_width] and the bias is a 1D vector of size [out_channels].
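
For reference, a minimal sketch showing where those shapes come from (the layer sizes here are just an illustration, not my actual network):

    import torch.nn as nn

    # hypothetical layer: 3 input channels, 128 output channels, 3x3 kernels
    conv2d = nn.Conv2d(in_channels=3, out_channels=128, kernel_size=3)

    print(conv2d.weight.shape)  # torch.Size([128, 3, 3, 3]) -> [out_channels, in_channels, kernel_height, kernel_width]
    print(conv2d.bias.shape)    # torch.Size([128])          -> [out_channels]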

When the weights are plotted using:

    import matplotlib.pyplot as plt

    # scale the kernel values into [0, 1] so imshow renders them on a common scale
    kernels_t = kernels_int_in - kernels_int_in.min()
    kernels_t = kernels_t / kernels_t.max()

    fig, axarr = plt.subplots(4, 32)
    plt.subplots_adjust(wspace=.02, hspace=.02)
    ind = 0
    for idx in range(4):
        for idy in range(32):
            # note: only the first input channel (index 0) of each kernel is shown
            axarr[idx, idy].imshow(kernels_t[ind, 0, :, :].squeeze())
            # axarr[idx, idy].imshow(act[ind])
            ind = ind + 1
    plt.show()

the question is: why are the in_channels stored, and what are they telling me? Also, over 30 epochs the values across the out_channels do not change significantly in my case. Does this suggest that the kernels are not appropriate, since the overall error stays quite high?

Can you expand on this? I’m not sure what you mean by the in_channels being stored. Are you asking about that dimension in general? I see in your plotting code that you’re always indexing into in_channels with 0, so you might not be seeing all the data in your plots.

Convolution kernel weights are typically pretty small in the first place. For example, in resnet18, some of the kernel weights are, on average, around 0.002 or even 0.0002 (depending on where you sample). I think it would be more appropriate to look at their gradients over time; that will be a better indicator of how those layers are doing.
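
If it helps, here is a rough sketch of how you could track that (this assumes you already have model, train_loader, loss_fn, optimizer, and num_epochs defined, and that the conv layer in question is reachable as model.conv2d):

    import matplotlib.pyplot as plt

    grad_history = []

    for epoch in range(num_epochs):
        for inputs, targets in train_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            # record the mean absolute gradient of this layer's weights for this batch
            grad_history.append(model.conv2d.weight.grad.abs().mean().item())
            optimizer.step()

    # a curve that collapses towards zero early on can indicate the layer has stopped learning
    plt.plot(grad_history)
    plt.xlabel('update step')
    plt.ylabel('mean |grad| of conv weights')
    plt.show()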

What I am trying to plot is the actual kernels that are used in Conv2d.

You need to iterate over your in_channels to get all the information about your kernels. Conv2d kernels are 4-dimensional tensors that act on every single input channel.
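
For example, something like this sketch, which reuses your normalised kernels_t tensor and loops over both the out_channel and in_channel dimensions (with 128 output channels the full grid gets large, so you will probably want to plot only a slice):

    import matplotlib.pyplot as plt

    out_channels, in_channels = kernels_t.shape[0], kernels_t.shape[1]

    # one row per output channel, one column per input channel
    fig, axarr = plt.subplots(out_channels, in_channels, squeeze=False)
    plt.subplots_adjust(wspace=.02, hspace=.02)
    for out_c in range(out_channels):
        for in_c in range(in_channels):
            axarr[out_c, in_c].imshow(kernels_t[out_c, in_c, :, :])
            axarr[out_c, in_c].axis('off')
    plt.show()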