My model contains an LSTM and is trained in batches. The model is very simple, but after a few batches, before even one epoch has completed, the model parameters become NaN. I also tried a learning rate of 0.0, and even then they become zero. May I know why? My code is at https://bitbucket.org/Vsbiradar/siamese-network/src/b5d6428c045cdcc1900fa216bece08d0012dab9b/siamese-network/siamese.py?at=master&fileviewer=file-view-default
One obvious possible reason is that your learning rate is too large and the gradients blow up.
Look at the gradients of your model layers. Also, try gradient clipping.
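To make the suggestion above concrete, here is a minimal, framework-independent sketch of checking gradients for NaN/Inf and clipping them by their global L2 norm. It assumes the gradients are available as plain lists of floats, one list per layer; the helper names (`global_norm`, `has_non_finite`, `clip_by_global_norm`) are hypothetical and are not taken from the linked code or from any particular library.

```python
import math

def global_norm(grads):
    """L2 norm over all gradient values, treated as one flat vector."""
    return math.sqrt(sum(g * g for layer in grads for g in layer))

def has_non_finite(grads):
    """True if any gradient value is NaN or +/-Inf."""
    return any(not math.isfinite(g) for layer in grads for g in layer)

def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients so their global norm is at most max_norm."""
    norm = global_norm(grads)
    # Clipping cannot repair NaN/Inf values, so leave those untouched;
    # small-enough gradients are also returned unchanged.
    if not math.isfinite(norm) or norm <= max_norm:
        return grads
    scale = max_norm / norm
    return [[g * scale for g in layer] for layer in grads]

# Hypothetical gradients for two layers (global norm is 13.0).
grads = [[3.0, 4.0], [0.0, 12.0]]
clipped = clip_by_global_norm(grads, max_norm=1.0)
print(global_norm(grads), global_norm(clipped))
print(has_non_finite([[1.0, float("nan")]]))
```

Running `has_non_finite` after every batch should show at which step the gradients first go bad. Note that clipping only helps with large but finite gradients; it does nothing once a NaN has already appeared.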
But my learning rate is 0, and the algorithm is SGD. I don't know why the model parameters change at all with a learning rate of 0.
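One possible explanation, which nobody in this thread has confirmed: in floating-point arithmetic, 0 times NaN (or 0 times Inf) is NaN, so a plain SGD update can still corrupt a parameter at learning rate 0 if the gradient itself is already NaN or Inf. The sketch below uses a hypothetical scalar `sgd_step` helper to show this; it is not the update code from the linked repository.

```python
import math

def sgd_step(w, grad, lr):
    """Plain SGD update for one scalar parameter: w <- w - lr * grad."""
    return w - lr * grad

w = 0.5

# A finite gradient leaves the parameter unchanged when lr is 0.
unchanged = sgd_step(w, 2.0, lr=0.0)

# A NaN or Inf gradient still produces NaN, because 0.0 * nan and
# 0.0 * inf are both nan under IEEE 754 arithmetic.
from_nan = sgd_step(w, float("nan"), lr=0.0)
from_inf = sgd_step(w, float("inf"), lr=0.0)

print(unchanged, from_nan, from_inf)
```

If this is what is happening, the NaN would be coming from the forward pass or the loss rather than from the step size, so lowering the learning rate alone would not be expected to help.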
Did you try using a different optimizer? Adam for instance?
Yes, I tried Adam and Adadelta. Neither of them works.
Did you solve it? And which version are you using?