I have a simple neural network, consisting of a convolution layer followed by a Linear layer, that I use to train an RL agent with the REINFORCE algorithm. I use
torch.optim.AdamW as the optimizer
nn.init.kaiming_normal_(m.weight, nonlinearity='relu') for initialization
Although I set torch.manual_seed(seed), the NumPy seed, and the Python random seed, I sometimes get the following error:
Function 'AddmmBackward0' returned nan values in its 1th output.
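For context, here is a minimal sketch of the kind of setup described above. It is illustrative only, not the original code: the layer sizes, the 8x8 input shape, the learning rate, the dummy state and return, and the names `PolicyNet` and `seed_everything` are all assumptions. Anomaly detection is enabled because, as far as I understand, that is the mode in which PyTorch reports messages of the form "returned nan values in its Nth output".

```python
import random

import numpy as np
import torch
import torch.nn as nn


def seed_everything(seed: int) -> None:
    """Seed the Python, NumPy, and PyTorch RNGs (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


class PolicyNet(nn.Module):
    """Convolution layer followed by a Linear layer; all sizes are assumed."""

    def __init__(self, in_channels: int = 1, n_actions: int = 4):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * 8 * 8, n_actions)  # assumes 8x8 inputs
        for m in self.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                nn.init.kaiming_normal_(m.weight, nonlinearity='relu')

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = torch.relu(self.conv(x))
        return self.fc(x.flatten(start_dim=1))  # logits over actions


seed_everything(0)
policy = PolicyNet()
optimizer = torch.optim.AdamW(policy.parameters(), lr=1e-3)  # lr is assumed

# Make autograd raise an error when a backward function returns nan.
torch.autograd.set_detect_anomaly(True)

# One REINFORCE-style update on dummy data (state and return are made up).
state = torch.randn(1, 1, 8, 8)
dist = torch.distributions.Categorical(logits=policy(state))
action = dist.sample()
episode_return = 1.0  # placeholder for the discounted return
loss = -(dist.log_prob(action) * episode_return).sum()
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

With these dummy values the step runs without error; the nan only shows up in some of my real runs.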
I have a couple of questions:
(1) Does "nan values in its 1th output" mean that the problem is in the weights or in the input?
(2) Even though I set the seeds, why do I sometimes get the error, while at other times my program finishes successfully with reproducible output?
(3) How can I fix it?
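To help narrow down question (1), a check along the following lines could be run before each backward pass. This is only a sketch: `report_non_finite` is a hypothetical helper written for this post, not a PyTorch function, and the small Linear layer and tensors below are made-up test data.

```python
import torch
import torch.nn as nn


def report_non_finite(model: nn.Module, **tensors: torch.Tensor) -> list:
    """Return names of inputs, parameters, and gradients holding nan or inf."""
    bad = []
    # Extra tensors passed by keyword, e.g. state=..., returns=...
    for name, t in tensors.items():
        if not torch.isfinite(t).all():
            bad.append(name)
    # Model weights and any gradients computed so far
    for name, p in model.named_parameters():
        if not torch.isfinite(p).all():
            bad.append(f"param:{name}")
        if p.grad is not None and not torch.isfinite(p.grad).all():
            bad.append(f"grad:{name}")
    return bad


layer = nn.Linear(3, 2)
clean = torch.ones(1, 3)
dirty = torch.tensor([[1.0, float("nan"), 2.0]])
print(report_non_finite(layer, state=clean))  # prints []
print(report_non_finite(layer, state=dirty))  # prints ['state']
```

An empty list for both the inputs and the parameters right before the failing step would suggest the nan is produced inside the forward or backward computation rather than being present beforehand.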
I appreciate your help.