I wrote my own loss function to compute the negative log-likelihood for a Mixture Density Network whose inputs come from an LSTM. However, there seems to be a problem somewhere: the loss goes to infinity and training collapses.
While debugging, I found that after I compute the loss and call backward(), the loss tensor itself has no gradient:

loss.grad = None

Is this normal?
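To illustrate, here is a minimal reproduction (not my actual MDN/LSTM code, just a tiny stand-in model) showing the behavior I mean:

```python
import torch

# Hypothetical minimal example: a leaf parameter w and a scalar loss.
w = torch.randn(3, requires_grad=True)   # leaf tensor (a "parameter")
x = torch.randn(3)
target = torch.tensor(1.0)

loss = ((w * x).sum() - target) ** 2     # non-leaf tensor built by autograd
loss.backward()

print(w.grad)     # populated: gradients reach the leaf tensor
print(loss.grad)  # None, even though backward() ran fine
```

So backward() does fill in w.grad, but loss.grad stays None.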