Sure, I printed the gradient after calling backward(), and it shows this:
tensor([[nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        ...,
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan],
        [nan, nan, nan, ..., nan, nan, nan]], device='cuda:0')
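
In case it helps narrow this down, here is a minimal sketch of how one might list which parameters end up with non-finite gradients after backward(). The toy `nn.Linear` model, the input `x`, and the helper name `report_nonfinite_grads` are all hypothetical stand-ins, not taken from the actual training code. The `sqrt` of an exact zero is only one deliberately constructed way to get a finite loss with NaN gradients, and it may not be the cause here.

```python
import torch
import torch.nn as nn


def report_nonfinite_grads(model: nn.Module) -> list:
    """Return the names of parameters whose .grad contains NaN or Inf."""
    bad = []
    for name, param in model.named_parameters():
        if param.grad is not None and not torch.isfinite(param.grad).all():
            bad.append(name)
    return bad


# Hypothetical toy model and input, standing in for the real ones.
model = nn.Linear(4, 2)
x = torch.randn(3, 4)

# Deliberate reproduction: sqrt(0) is finite in the forward pass, but its
# derivative is inf, and inf * 0.0 in the chain rule becomes nan.
loss = torch.sqrt(model(x).sum() * 0.0)
loss.backward()

bad = report_nonfinite_grads(model)
print("loss:", loss.item())
print("parameters with non-finite gradients:", bad)
```

If the list is non-empty even though the loss itself is finite, the NaN is probably produced somewhere in the backward pass. Wrapping the forward and backward calls in `torch.autograd.set_detect_anomaly(True)` should then make backward() raise an error pointing at the operation that produced it, at the cost of slower execution.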