Model parameters are becoming NaN after a few batches of backpropagation

My model contains an LSTM and trains with batches. The model is quite simple, but after running a few batches, before even completing one epoch, the model parameters become NaN. I tried setting the learning rate to 0.0 and even then they become NaN. May I know why? My code is at https://bitbucket.org/Vsbiradar/siamese-network/src/b5d6428c045cdcc1900fa216bece08d0012dab9b/siamese-network/siamese.py?at=master&fileviewer=file-view-default

One obvious possible reason is that your learning rate is too large and the gradients blow up.

Look at the gradients of your model's layers. Also, try gradient clipping; see the sketch below.
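A minimal sketch of both ideas, assuming a standard PyTorch training loop (`model`, `loss`, and `optimizer` are placeholders, not names from the linked code):

```python
import torch

# ... inside the training loop, after loss.backward() and before optimizer.step():

# Inspect per-layer gradients to find where values first blow up or become NaN.
for name, param in model.named_parameters():
    if param.grad is not None:
        if torch.isnan(param.grad).any():
            print(f"NaN gradient in {name}")
        else:
            print(f"{name}: grad norm = {param.grad.norm().item():.4f}")

# Clip the global gradient norm so one bad batch cannot blow up the weights.
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
```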

But my learning rate is 0, and the algorithm is SGD. I don't know why the model parameters are even changing with a learning rate of 0.
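One likely explanation (an inference from floating-point arithmetic, not from the linked code): if a gradient is already NaN, the SGD update `p -= lr * grad` computes `0 * nan`, which is `nan`, so the parameters are corrupted even with a zero learning rate. A minimal sketch demonstrating this:

```python
import torch

p = torch.nn.Parameter(torch.ones(3))
p.grad = torch.tensor([0.1, float('nan'), 0.2])  # simulate a NaN gradient

opt = torch.optim.SGD([p], lr=0.0)  # zero learning rate
opt.step()
print(p)  # the entry with the NaN gradient is now NaN, despite lr=0
```

In other words, a zero learning rate does not protect the weights; the NaN originates in the gradients (or the loss) and needs to be traced upstream.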


Did you try using a different optimizer? Adam, for instance?

Yes, I tried Adam and Adadelta. Neither of them works.

Did you solve it? And which version are you using?

Did you solve it? I have the same problem. Can you give me some advice?