How to separate each neuron's weights and bias values for convolution and fc layers?

Ajinkya.Bankar · November 13, 2021, 2:20pm

My network has convolution and fully connected layers, and I want to access each neuron’s weights and bias values. If I use

for name, param in network.named_parameters():
    print(name, param.shape)

I get each layer's name, whether the tensor is .weight or .bias, and its dimensions. How can I get each neuron’s dimensions along with its weights and bias term?

ptrblck · November 14, 2021, 1:25am

Could you explain the terminology a bit, please?
The weights in a layer are often called “neurons”, so I’m unsure what exactly you are looking for.

Ajinkya.Bankar · November 14, 2021, 4:31pm

Thank you for the reply. Consider a fully connected network like the one in the following figure:

[Figure: Network]

Then each neuron can be described as:

[Figure: Neuron]

I want to access the weights and biases for each neuron. And my network has convolutional layers also. Kindly let me know if you need more details.

ptrblck · November 14, 2021, 10:45pm

Each output of the linear layer will be created by the corresponding row of the weight matrix.
I.e.:

import torch
import torch.nn as nn

lin = nn.Linear(10, 20)   # weight: [20, 10], bias: [20]
x = torch.randn(2, 10)

out = lin(x)

would yield an output of [batch_size=2, 20] and each of the 20 output values is calculated via:

i = 10
# x: [2, 10], transposed weight row: [10, 1] -> out_i: [2, 1], equal to out[:, i:i+1]
out_i = torch.matmul(x, lin.weight[i:i+1, :].T) + lin.bias[i]

for the ith output.
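
As a rough check of the row-per-output picture above, here is a minimal sketch that recomputes every output column from a single weight row and compares it with the layer's own result. The seed and tolerance are arbitrary choices made for this illustration, not part of the original answer.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

lin = nn.Linear(10, 20)
x = torch.randn(2, 10)
out = lin(x)  # shape [2, 20]

rows_match = True
for i in range(lin.out_features):
    w_i = lin.weight[i]    # shape [10]: weights feeding output i
    b_i = lin.bias[i]      # scalar: bias of output i
    out_i = x @ w_i + b_i  # shape [2]: output i for both samples
    rows_match = rows_match and torch.allclose(out_i, out[:, i], atol=1e-5)

print(rows_match)
```

So for a linear layer, "neuron i" corresponds to the pair (lin.weight[i], lin.bias[i]).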

InnovArul · November 15, 2021, 12:22am

Just adding to @ptrblck’s answer, layer.weight[i] will give the weights of i'th neuron in a layer (both conv and linear) and layer.bias[i] will give the bias of i'th neuron.
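
A minimal sketch of how this indexing could be used to walk over a whole model. The toy `network` and the `per_unit` dictionary below are hypothetical names made up for illustration. For conv layers, index i here selects one filter.

```python
import torch.nn as nn

# Hypothetical toy network, used only for illustration.
network = nn.Sequential(
    nn.Conv2d(6, 16, kernel_size=5),
    nn.Flatten(),
    nn.Linear(16 * 10 * 10, 20),
)

per_unit = {}  # (layer name, index) -> (weight shape, bias value)
for name, module in network.named_modules():
    if isinstance(module, (nn.Conv2d, nn.Linear)) and module.bias is not None:
        for i in range(module.weight.shape[0]):
            w_i = module.weight[i]  # conv: [in_ch, kH, kW], linear: [in_features]
            b_i = module.bias[i]    # scalar bias for output unit i
            per_unit[(name, i)] = (tuple(w_i.shape), b_i.item())

print(per_unit[("0", 0)][0])  # weight shape of the first conv filter
print(per_unit[("2", 0)][0])  # weight shape of the first linear unit
```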

Ajinkya.Bankar · November 15, 2021, 1:46pm

Thanks, @ptrblck and @InnovArul. I want to know from @InnovArul: if the convolution layer’s .weight and .bias tensors have dimensions [16,6,5,5] and [16] respectively, are the neuron weights [i,:,:,:] and the bias [i], where i = 0 to 15 (with zero-based indexing)? Does it mean there are 16 neurons in the convolution layer?

InnovArul · November 15, 2021, 2:37pm

Yes you are right. There are 16 neurons in the convolution layer with weight dimension [16,6,5,5].

Ajinkya.Bankar · November 15, 2021, 2:47pm

@InnovArul, thanks for your help. However, I am confused by the neuron terminology used in the answer at https://stackoverflow.com/a/52273707/15009452
The answer says that the number of neurons in a convolution layer depends on the size of the image. But if the weight dimensions are [16,6,5,5], the layer has 6 input channels and 16 filters of size 5x5, producing 16 output channels. I don’t see any relation to the input size here. Can you please explain? It would be a great help.

InnovArul · November 15, 2021, 3:56pm

OK, I understand the confusion. Maybe I contributed to it too.
Basically, weight[i] is the way to get the parameters of the neuron. But the way the number of neurons is calculated differs between the linear layer and the conv layer.

Linear layer:
With respect to linear layers, it is clear how many neurons are present in the layer. (layer.weight.shape[0])

Conv layer:
With conv layers, counting the neurons is a bit trickier, as explained in that StackOverflow post. I will rephrase it here as I understand it.
In 2D conv layers, a neuron is an entity that looks at only a small slice of the input. Let's take the same example from StackOverflow, where input size = 1x27x27 (channels=1, height=27, width=27) and slice size = 1x3x3. Assuming non-overlapping slices (i.e., a stride of 3), there are 9x9=81 slices of size 1x3x3.

Since a conv layer applies the same filter at every spatial position (weight sharing, which is where its translation-invariance property comes from), a set of 81 neurons shares the same weights (i.e., layer.weight[i]). There can be N sets of 81 neurons in a conv layer (N = layer.weight.shape[0]). So the total number of neurons is N x 81.

Hope this makes sense.
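
The 27x27 example can be sketched as follows. The number of filters N = 4 is an arbitrary value picked for illustration, and stride=3 is what makes the 3x3 slices non-overlapping.

```python
import torch
import torch.nn as nn

N = 4  # number of filters; arbitrary value chosen for this illustration
conv = nn.Conv2d(in_channels=1, out_channels=N, kernel_size=3, stride=3)

x = torch.randn(1, 1, 27, 27)  # [batch, channels, height, width]
out = conv(x)                  # one output value per (filter, slice) pair

n_filters, h_out, w_out = out.shape[1:]
slices_per_filter = h_out * w_out               # neurons sharing one filter
total_neurons = n_filters * slices_per_filter   # N sets of neurons

print(out.shape)          # torch.Size([1, 4, 9, 9])
print(slices_per_filter)  # 81
print(total_neurons)      # 324
```

Each of the 81 positions in an output channel plays the role of one neuron, and all 81 of them reuse conv.weight[i] and conv.bias[i].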

Ajinkya.Bankar · November 15, 2021, 4:36pm

@InnovArul, thanks again for a detailed response. So, as per my example of layer.weight.shape=[16,6,5,5], there must be N=16 sets of neurons. Kindly correct me if I am wrong. May I know how we can count the number of neurons in each set if we know only layer.weight.shape=[16,6,5,5]? Or do we need to consider the original input size and the number of input channels to the layer (i.e., 6 in this case)?

InnovArul · November 15, 2021, 7:00pm

To calculate the actual number of neurons within a set, we need to know the number of slices, which can only be calculated if we know the spatial size of the input to the layer (along with the layer's stride and padding).
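
A minimal sketch for the [16,6,5,5] case. The 14x14 input size is an assumption made purely for illustration (the thread never states it), and `conv_out_size` is a hypothetical helper that follows the output-size formula given in the nn.Conv2d documentation.

```python
import torch
import torch.nn as nn

def conv_out_size(size, kernel, stride=1, padding=0, dilation=1):
    # Output length along one spatial dimension (hypothetical helper).
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

conv = nn.Conv2d(6, 16, 5)  # weight shape [16, 6, 5, 5], stride 1, no padding
h_in, w_in = 14, 14         # ASSUMED input size, not given in the thread

h_out = conv_out_size(h_in, 5)
w_out = conv_out_size(w_in, 5)
neurons_per_set = h_out * w_out                         # slices per filter
total_neurons = conv.weight.shape[0] * neurons_per_set  # 16 sets

# Cross-check the formula against an actual forward pass.
out = conv(torch.randn(1, 6, h_in, w_in))

print(neurons_per_set)  # 100
print(total_neurons)    # 1600
```

With a different input size, the weight shape stays [16,6,5,5] but the neuron count changes, which is why the weight shape alone is not enough.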

Ajinkya.Bankar · November 15, 2021, 7:49pm

@InnovArul, thanks for your help.

fabiola · April 4, 2022, 7:37am

@InnovArul @ptrblck I have a question linked to @Ajinkya.Bankar's post. In a convolution layer where, as in @Ajinkya.Bankar's case, the weights are of shape [16,6,5,5] and the bias is of shape [16], is the same bias applied to every neuron of the same filter (of shape [6,5,5]) before the output of this layer is given to an activation function?
If that’s the case, is it possible (and does it make sense) to modify the bias applied to each filter so that the neurons of the same filter do not all get the same bias?

ptrblck · April 5, 2022, 7:04am

The bias is added after the weight (kernel) was applied to the input, not to the weights directly.
Here is a small example:

import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, 3)
weight = conv.weight  # shape [16, 3, 3, 3]
bias = conv.bias      # shape [16]

x = torch.randn(2, 3, 24, 24)

# standard approach 
out = conv(x)

# manual approach
out_manual = nn.functional.conv2d(x, weight)
out_manual = out_manual + bias[None, :, None, None]

print((out - out_manual).abs().max())
# tensor(4.1723e-07, grad_fn=<MaxBackward1>)

I don’t think adding the bias to the weight would make sense since the model could just train the weights to account for this offset.

fabiola · April 5, 2022, 12:47pm

Thanks for your reply.
I didn’t express myself well, so I will rephrase my question.
out_manual is of shape (2, 16, 22, 22) and bias is of shape (16). If I’m not mistaken, the same value bias[i] is added to all values of out_manual[:, i, :, :]. Would it be possible instead (and does it make sense) to add a bias tensor of shape (1, 1, 22, 22) (with different values inside it) to out_manual[:, i, :, :]?
The full bias tensor would thus be of shape (1, 16, 22, 22) instead of (16).

Thanks !

InnovArul · April 5, 2022, 12:49pm

In my understanding, having a bias of shape (1, 1, 22, 22) would make the conv layer spatially dependent, i.e., the very property of translation invariance that the conv layer is known for would be lost.
Does that make sense?
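
A minimal sketch of what that loss looks like, reusing the shapes from the example above. `pos_bias` is a hypothetical per-position bias made up for this illustration (it is not a built-in option of nn.Conv2d). Feeding a constant input means every 3x3 patch is identical, so any spatial variation in the output can only come from the bias.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

conv = nn.Conv2d(3, 16, 3)
# Hypothetical per-position bias: one value per output channel AND location.
pos_bias = nn.Parameter(torch.randn(1, 16, 22, 22))

x = torch.ones(2, 3, 24, 24)  # constant input: all 3x3 patches are identical

out_shared = conv(x)                            # standard per-channel bias
out_pos = F.conv2d(x, conv.weight) + pos_bias   # per-position bias instead

def is_spatially_uniform(t):
    # True if every spatial position equals the top-left one, per channel.
    return torch.allclose(t, t[:, :, :1, :1].expand_as(t), atol=1e-5)

print(is_spatially_uniform(out_shared))  # True
print(is_spatially_uniform(out_pos))     # False
```

With the shared bias, identical patches give identical outputs wherever they appear. With the per-position bias, the same patch produces different values depending on its location.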