Model.parameters() is None while training

Hello,

While training my model the loss is NaN, so I printed the gradient of every parameter in model.parameters() and found that it is NoneType. Could someone tell me what would result in that?
Some of the code is below:

...
model.zero_grad()
loss = model((cur_node, pos_node, neg_nodes, edge_type))
loss.backward()

# nn.utils.clip_grad_norm(model.parameters(), 2)

for p in model.parameters():
    print p.grad.data
    break

optimizer.step()
# print loss
total_loss += loss.data
...

The output is:

Variable containing:

Columns 0 to 9 
-1.3588  1.2406 -0.1455 -0.1899  1.2147 -0.7475  0.5859  0.4702 -0.5938  0.6821

Columns 10 to 19 
 0.7803  0.2557 -0.2681 -0.2382  0.6268 -0.2070  0.3533 -0.7263  0.6950 -0.9427

Columns 20 to 29 
-0.3123  0.6115  0.0838  0.1744  1.4121  0.0689  0.9217 -0.2162  0.0674 -0.8346

Columns 30 to 39 
-0.7390  0.8208  1.3796 -0.2581 -1.1375 -0.0216 -0.3374 -0.4057  0.1571  0.7450
[torch.FloatTensor of size 1x40]

Variable containing:
-3.7563
[torch.FloatTensor of size 1]

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-89-af6595e1045c> in <module>()
     24 
     25         for p in model.parameters():
---> 26             print p.grad.data
     27             break

AttributeError: 'NoneType' object has no attribute 'data'

Hi,
The .grad field of a Variable is only filled in when a gradient is actually computed for that Variable.
You are getting this error because at least one parameter of your model does not have a .grad field, which means it was not used when computing loss. This can happen, for example, if you have an extra nn.Parameter in one of your modules that is never actually used in the forward pass.
You can change your print to the following to avoid the error:

for p in model.parameters():
    if p.grad is not None:
        print(p.grad.data)
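To see why this happens, here is a minimal sketch (using a hypothetical toy module, not the original model) where one registered parameter is never touched in forward, so after backward() its .grad stays None:

```python
import torch
import torch.nn as nn

class Toy(nn.Module):
    """Hypothetical module: `unused` is registered as a parameter
    but never referenced in forward(), so it never gets a gradient."""
    def __init__(self):
        super(Toy, self).__init__()
        self.weight = nn.Parameter(torch.randn(3))
        self.unused = nn.Parameter(torch.randn(3))

    def forward(self, x):
        return (self.weight * x).sum()

model = Toy()
loss = model(torch.ones(3))
loss.backward()

for name, p in model.named_parameters():
    # `weight` has a gradient; `unused` prints None
    print(name, p.grad)
```

Guarding the loop with `if p.grad is not None` skips such parameters instead of crashing.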

OK, got it. That works; some parameters are not used because I train the model with SGD. Thanks for your perfect answer.