Unexpected items returned by model.parameters()

I have a simple CNN image classifier inspired by the CIFAR-10 example from the tutorial. I want to check the norm of the gradient with respect to all weights of the network. I read here on this forum that one way to do this is to iterate over Module.parameters() after the .backward() pass.

However, when trying this:

for p in net.parameters():
    print(type(p), p.grad.size())

I get the following output:

<class 'torch.nn.parameter.Parameter'> torch.Size([6, 3, 5, 5])
<class 'torch.nn.parameter.Parameter'> torch.Size([6])
<class 'torch.nn.parameter.Parameter'> torch.Size([16, 6, 5, 5])
<class 'torch.nn.parameter.Parameter'> torch.Size([16])
<class 'torch.nn.parameter.Parameter'> torch.Size([120, 400])
<class 'torch.nn.parameter.Parameter'> torch.Size([120])
<class 'torch.nn.parameter.Parameter'> torch.Size([84, 120])
<class 'torch.nn.parameter.Parameter'> torch.Size([84])
<class 'torch.nn.parameter.Parameter'> torch.Size([10, 84])
<class 'torch.nn.parameter.Parameter'> torch.Size([10])

First line: I recognize the dimensions of the weight kernel of the first conv layer (6 output channels, 3 input channels, 5×5 kernel).
Second line: ???

Every second line is unexpected to me. Why does it appear in Module.parameters()? Thanks for helping me understand this.

Here is my network class:

class Net2(nn.Module):

    def __init__(self):
        super(Net2, self).__init__()

        self.conv = nn.Sequential(nn.Conv2d(3, 6, 5),
                                  nn.ReLU(),
                                  nn.MaxPool2d(2, 2),
                                  nn.Conv2d(6, 16, 5),
                                  nn.ReLU(),
                                  nn.MaxPool2d(2, 2))

        self.fc = nn.Sequential(nn.Linear(16 * 5 * 5, 120),
                                nn.ReLU(),
                                nn.Linear(120, 84),
                                nn.ReLU(),
                                nn.Linear(84, 10))

    def forward(self, x):
        x = self.conv(x)              # two conv/ReLU/pool stages
        x = x.view(-1, 16 * 5 * 5)    # flatten the 16 feature maps of 5x5
        x = self.fc(x)
        return x
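In case it is relevant, here is roughly how I populate the gradients before the loop above (a minimal sketch with dummy data, not my actual training code):

import torch
import torch.nn as nn

net = Net2()
x = torch.randn(4, 3, 32, 32)        # dummy batch of CIFAR-10-sized images
target = torch.randint(0, 10, (4,))  # dummy labels for the 10 classes
loss = nn.CrossEntropyLoss()(net(x), target)
loss.backward()                      # populates p.grad for every parameter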

Try

for n, p in net.named_parameters():
    print(n, type(p), p.grad.size())

for an additional clue. :slight_smile:
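If you want to poke at an individual layer directly, its parameters are also reachable as attributes through the Sequential container. A quick sketch against your net above:

print(net.conv[0].weight.size())  # torch.Size([6, 3, 5, 5])
print(net.conv[0].bias.size())    # torch.Size([6])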

Best regards

Thomas

Hi Thomas,

Thanks a lot for your quick reply. I was not aware of Module.named_parameters(), and it indeed helped me understand what was going on. The new output is:

conv.0.weight <class 'torch.nn.parameter.Parameter'> torch.Size([6, 3, 5, 5])
conv.0.bias <class 'torch.nn.parameter.Parameter'> torch.Size([6])
conv.3.weight <class 'torch.nn.parameter.Parameter'> torch.Size([16, 6, 5, 5])
conv.3.bias <class 'torch.nn.parameter.Parameter'> torch.Size([16])
fc.0.weight <class 'torch.nn.parameter.Parameter'> torch.Size([120, 400])
fc.0.bias <class 'torch.nn.parameter.Parameter'> torch.Size([120])
fc.2.weight <class 'torch.nn.parameter.Parameter'> torch.Size([84, 120])
fc.2.bias <class 'torch.nn.parameter.Parameter'> torch.Size([84])
fc.4.weight <class 'torch.nn.parameter.Parameter'> torch.Size([10, 84])
fc.4.bias <class 'torch.nn.parameter.Parameter'> torch.Size([10])

So the additional lines are the bias parameters. By the way, I didn’t expect the bias to be a single value per output channel (constant across all spatial positions), so I learnt something else…
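For completeness, the gradient-norm check I was originally after now looks something like this (a minimal sketch, assuming .backward() has already been called):

import torch

# Global L2 norm over the gradients of all parameters, weights and biases alike
total_norm = torch.sqrt(sum(p.grad.norm() ** 2 for p in net.parameters()))
print(total_norm)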

Best regards,

Adrien