I have a simple CNN image classifier inspired by the CIFAR-10 example from the tutorial. I want to check the norm of the gradient with respect to all weights of the network. I read here on this forum that one way to do this is to access Module.parameters() after the .backward() pass.
However, when I try this:
for p in net.parameters():
    print(type(p), p.grad.size())
I get the following output:
<class 'torch.nn.parameter.Parameter'> torch.Size([6, 3, 5, 5])
<class 'torch.nn.parameter.Parameter'> torch.Size([6])
<class 'torch.nn.parameter.Parameter'> torch.Size([16, 6, 5, 5])
<class 'torch.nn.parameter.Parameter'> torch.Size([16])
<class 'torch.nn.parameter.Parameter'> torch.Size([120, 400])
<class 'torch.nn.parameter.Parameter'> torch.Size([120])
<class 'torch.nn.parameter.Parameter'> torch.Size([84, 120])
<class 'torch.nn.parameter.Parameter'> torch.Size([84])
<class 'torch.nn.parameter.Parameter'> torch.Size([10, 84])
<class 'torch.nn.parameter.Parameter'> torch.Size([10])
First line: I recognize the dimensions of the weight kernel of the first conv layer.
Second line: ???
Every other line is an item I don't expect. Why does it appear in Module.parameters()? Thanks for helping me understand this.
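For context, here is the check I'd like to run over all parameters once I understand what each entry is (a minimal sketch, assuming loss.backward() has already been called; grad_norm is just a name I picked):

total_sq = 0.0
for p in net.parameters():
    if p.grad is not None:  # skip parameters that received no gradient
        total_sq += p.grad.norm().item() ** 2
grad_norm = total_sq ** 0.5  # global L2 norm over all parameters
print(grad_norm)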
Here is my network class:
import torch.nn as nn

class Net2(nn.Module):
    def __init__(self):
        super(Net2, self).__init__()
        # two conv blocks: conv -> ReLU -> max pool
        self.conv = nn.Sequential(nn.Conv2d(3, 6, 5),
                                  nn.ReLU(),
                                  nn.MaxPool2d(2, 2),
                                  nn.Conv2d(6, 16, 5),
                                  nn.ReLU(),
                                  nn.MaxPool2d(2, 2))
        # classifier head
        self.fc = nn.Sequential(nn.Linear(16 * 5 * 5, 120),
                                nn.ReLU(),
                                nn.Linear(120, 84),
                                nn.ReLU(),
                                nn.Linear(84, 10))

    def forward(self, x):
        x = self.conv(x)
        x = x.view(-1, 16 * 5 * 5)  # flatten before the fully connected layers
        x = self.fc(x)
        return x
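In case it matters, this is the minimal driver I used to produce the output above (a sketch with random stand-in data instead of the actual CIFAR-10 loader; the input is assumed to be 3x32x32):

import torch
import torch.nn as nn

net = Net2()
x = torch.randn(4, 3, 32, 32)        # fake batch of CIFAR-10-sized images
target = torch.randint(0, 10, (4,))  # fake class labels
loss = nn.CrossEntropyLoss()(net(x), target)
loss.backward()
for p in net.parameters():
    print(type(p), p.grad.size())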