I am getting ‘nan’ loss after 1st epoch on a large dataset, please tell me all possible reasons for ‘nan’ loss value.
check this :
dict_values([tensor(5.5172, device='cuda:0', grad_fn=<NllLossBackward>), tensor(nan, device='cuda:0', grad_fn=<DivBackward0>), tensor(3.7665, device='cuda:0', grad_fn=<BinaryCrossEntropyWithLogitsBackward>), tensor(inf, device='cuda:0', grad_fn=<DivBackward0>)])