We are implementing a custom LSTM using LSTMCell for binary classification; the loss is BCEWithLogitsLoss.
We traced the problem back to loss.backward(): the computed loss itself is not NaN, but the gradients it produces are NaNs.
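For reference, this is roughly how we check which parameters end up with NaN gradients after the backward pass (a minimal sketch; `find_nan_grads` is our own helper name, not a library function):

```python
import torch


def find_nan_grads(model):
    """Return the names of parameters whose gradients contain any NaN."""
    return [
        name
        for name, p in model.named_parameters()
        if p.grad is not None and torch.isnan(p.grad).any()
    ]
```

On recent PyTorch versions, `torch.autograd.set_detect_anomaly(True)` can also pinpoint the op that first produced the NaN.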
Things we’ve tried that did not help:
- PyTorch 3.1, 4.0, and 5.0 all show the same problem
- changing softmax to log_softmax in the forward pass
- changing the loss to log_softmax + NLLLoss
- initializing the hidden and cell states to non-zero values
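For context, a stripped-down version of the setup looks like this (a hypothetical minimal sketch of the architecture described above, not our actual model; class and variable names are made up). In this toy version the gradients come out finite, which is what makes the NaNs in the real model puzzling:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class CellLSTM(nn.Module):
    """Unrolls an LSTMCell over time and classifies from the final hidden state."""

    def __init__(self, input_size=8, hidden_size=16):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.fc = nn.Linear(hidden_size, 1)

    def forward(self, x):  # x: (seq_len, batch, input_size)
        h = x.new_zeros(x.size(1), self.cell.hidden_size)
        c = x.new_zeros(x.size(1), self.cell.hidden_size)
        for t in range(x.size(0)):
            h, c = self.cell(x[t], (h, c))
        # raw logits; BCEWithLogitsLoss applies the sigmoid internally
        return self.fc(h).squeeze(-1)


model = CellLSTM()
x = torch.randn(5, 4, 8)                     # seq_len=5, batch=4
y = torch.randint(0, 2, (4,)).float()        # binary targets
loss = nn.BCEWithLogitsLoss()(model(x), y)
loss.backward()
nan_params = [
    n for n, p in model.named_parameters()
    if p.grad is not None and torch.isnan(p.grad).any()
]
```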
Any ideas?! Much appreciated!