I understand that when print list(model.parameters())[0].grad gives None, it means that the graph is breaking somewhere. According to a previous answer, I shouldn’t be taking .data out of a Variable and repacking it as a Variable. I am definitely avoiding that.
When I print this at the beginning of the first epoch, it gives None, but in every epoch after that it gives a weight matrix. I hope this behavior is correct. The problem is that each of those weight matrices is the same: they remain unchanged after each epoch and are not updating.
I am unable to understand what could be the issue here.
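To make the check concrete, here is a minimal sketch of how one might confirm, per parameter, that .grad is None before the first backward pass and that the weights actually change after an optimizer step. The toy model, the layer sizes, the SGD optimizer, and the input data are all assumptions made for illustration; they are not taken from my real model.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


# Hypothetical toy model: an embedding layer followed by a linear layer.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(10, 4)
        self.fc = nn.Linear(4, 2)

    def forward(self, idx):
        # Average the embeddings of each sequence, then classify.
        return self.fc(self.embedding(idx).mean(dim=1))


model = ToyModel()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

# Made-up batch: two sequences of three token indices each.
inputs = torch.tensor([[1, 2, 3], [4, 5, 6]])
targets = torch.tensor([0, 1])

# Before any backward pass, .grad is None for every parameter.
grads_before = {name: p.grad for name, p in model.named_parameters()}

# Snapshot the weights so they can be compared after the update.
weights_before = {
    name: p.detach().clone() for name, p in model.named_parameters()
}

optimizer.zero_grad()
loss = loss_fn(model(inputs), targets)
loss.backward()
optimizer.step()

# A parameter counts as "changed" if any element differs from the snapshot.
changed = {
    name: not torch.equal(weights_before[name], p.detach())
    for name, p in model.named_parameters()
}

for name, p in model.named_parameters():
    print(name, "| grad is None:", p.grad is None, "| changed:", changed[name])
```

Comparing against a cloned snapshot matters here: keeping only a reference to the parameter would always compare the tensor with itself, so it would look unchanged even when it is being updated in place.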
[Update]
So I did more debugging and found that list(model.parameters())[1].grad, list(model.parameters())[2].grad, and the other parameters in the model were giving updated matrices after each epoch. So I think the rest of the model is working correctly.
My 0th parameter is the embedding layer’s weights, which aren’t updating. Should it be set to requires_grad=True?
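For reference, a minimal sketch of how the flag could be inspected. It assumes a plain nn.Embedding; the sizes and the random "pretrained" tensor are hypothetical. A layer built with nn.Embedding(...) is trainable by default, while nn.Embedding.from_pretrained(...) freezes its weights unless freeze=False is passed. It may also be worth checking that the embedding weight was actually handed to the optimizer.

```python
import torch
import torch.nn as nn

# A freshly constructed embedding layer is trainable by default.
emb = nn.Embedding(10, 4)
print("default:", emb.weight.requires_grad)

# Hypothetical pretrained vectors, used only for illustration.
pretrained = torch.randn(10, 4)

# from_pretrained freezes the weights unless freeze=False is given.
frozen = nn.Embedding.from_pretrained(pretrained)
trainable = nn.Embedding.from_pretrained(pretrained, freeze=False)
frozen_flag = frozen.weight.requires_grad
print("from_pretrained:", frozen_flag)
print("from_pretrained(freeze=False):", trainable.weight.requires_grad)

# Gradients can be re-enabled in place on a frozen layer.
frozen.weight.requires_grad_(True)

# Check that the embedding weight is among the optimizer's parameters.
optimizer = torch.optim.SGD(trainable.parameters(), lr=0.1)
in_optimizer = any(
    p is trainable.weight
    for group in optimizer.param_groups
    for p in group["params"]
)
print("in optimizer:", in_optimizer)
```

If requires_grad is already True and the weight is in the optimizer, the flag is probably not the cause, and the comparison of weights before and after a step is the next thing to look at.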