Gradients are becoming nan

Vaijenath_Biradar · September 19, 2017, 7:24pm

I am trying to train a Siamese network for a sentence similarity task. I apply the same LSTM, with pack_padded_sequence, to both sentences, take the norm of the difference between the two final outputs as the similarity, compute the error against the actual similarity score, and backpropagate. After some time (still within the first epoch), the gradients become very small and then become NaN.
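For readers following along, here is a minimal sketch of the kind of setup described above. It is not the poster's actual code: the class name `SiameseLSTM`, the layer sizes, the toy token ids, and the use of `enforce_sorted=False` (available in more recent PyTorch releases) are all assumptions made for illustration.

```python
import torch
import torch.nn as nn
from torch.nn.utils.rnn import pack_padded_sequence


class SiameseLSTM(nn.Module):
    """Hypothetical shared-encoder model; all sizes are arbitrary."""

    def __init__(self, vocab_size=100, embed_dim=16, hidden_dim=32):
        super().__init__()
        # Index 0 is assumed to be the padding token.
        self.embed = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)

    def encode(self, tokens, lengths):
        # Pack so the LSTM stops at each sentence's true last token.
        packed = pack_padded_sequence(
            self.embed(tokens), lengths, batch_first=True, enforce_sorted=False
        )
        _, (h_n, _) = self.lstm(packed)
        return h_n[-1]  # final hidden state, shape (batch, hidden_dim)

    def forward(self, a, len_a, b, len_b):
        # The same LSTM encodes both sentences; the L2 norm of the
        # difference is used as the (dis)similarity score.
        diff = self.encode(a, len_a) - self.encode(b, len_b)
        return diff.norm(p=2, dim=1)


model = SiameseLSTM()
a = torch.tensor([[4, 7, 9, 2, 5], [3, 8, 6, 0, 0]])
b = torch.tensor([[4, 7, 9, 0], [1, 2, 3, 6]])
distance = model(a, torch.tensor([5, 3]), b, torch.tensor([3, 4]))
```

Whenever the two encodings coincide, `diff` is exactly zero, which is the point at which the norm is evaluated at 0.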

smth · September 20, 2017, 4:32am

The gradient of norm at 0 is inf, so the backward pass blows up when the two outputs coincide.
We fixed this instability 2 days ago in the master branch, so that the subgradient is now used for norm instead.

It’ll be part of the next release, or if you are interested you can install the master branch from source by following the instructions at https://github.com/pytorch/pytorch#from-source
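For anyone who cannot upgrade, here is a small sketch of the failure and one common workaround. It writes the L2 norm out by hand as `sqrt(sum(x^2))` to reproduce the problem, then adds a small constant inside the square root. The value `eps = 1e-8` is an assumption chosen for illustration, and this workaround is not the subgradient fix made in master.

```python
import torch

# Hand-written L2 norm evaluated at exactly zero.
x = torch.zeros(3, requires_grad=True)
naive = torch.sqrt((x ** 2).sum())
naive.backward()
# sqrt'(0) is inf and d(x^2)/dx is 0 at x = 0, so the chain rule gives inf * 0.
naive_grad = x.grad.clone()

# Workaround: keep the argument of sqrt strictly positive.
eps = 1e-8  # assumed value; tune for your data and dtype
y = torch.zeros(3, requires_grad=True)
safe = torch.sqrt((y ** 2).sum() + eps)
safe.backward()
safe_grad = y.grad.clone()

print(naive_grad)  # every entry is nan
print(safe_grad)   # every entry is 0
```

The `eps` term slightly biases the distance (it is about `sqrt(eps)` instead of 0 for identical inputs), which is usually acceptable for a similarity loss.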

Vaijenath_Biradar · September 20, 2017, 5:03am

Okay. Thanks a lot for the reply.