Hi,

Doing predictions of 3D trajectories, I am using an LSTM to output the parameters of a distribution (a 3D Gaussian). I then compute the loss as `-dist.log_prob(ground_truth).mean()` to maximize the likelihood of the ground-truth values.

Instead of `log(prob)`, I would like the method to compute `log(prob + epsilon)`. Is it possible?
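One way to get this behaviour with the `torch.distributions` API is to exponentiate the log-probability, add the epsilon, and take the log again. A minimal sketch, with toy stand-ins for the network's predicted parameters and an assumed `eps = 1e-6`:

```python
import torch
from torch.distributions import MultivariateNormal

eps = 1e-6  # assumed value; tune for your data scale

# toy stand-ins for the LSTM's predicted distribution parameters
mean = torch.zeros(4, 3, requires_grad=True)   # batch of 3-D means
cov = torch.eye(3).repeat(4, 1, 1)             # identity covariances
dist = MultivariateNormal(mean, covariance_matrix=cov)

ground_truth = torch.full((4, 3), 10.0)        # far-away outlier points

# log(prob + eps) instead of log(prob): exponentiate, add eps, re-log
loss = -(dist.log_prob(ground_truth).exp() + eps).log().mean()
loss.backward()
print(loss.item())  # bounded above by -log(eps), even for tiny probabilities
```

Note that when `prob` underflows to zero, the gradient through the `exp()` also vanishes, so outliers stop contributing to the update instead of blowing it up.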

The issue is: `log_prob` worked on a *test* dataset, but on the real, noisy dataset the loss explodes after a while. In some trajectories I can see random points that are suddenly really far from the previous and next points. This likely causes some `prob` values to be infinitesimal, which makes some `-log_prob()` values really big, which causes the grads to explode.

I think `log(prob + epsilon)` would solve my problem, but the method `.log_prob()` does not offer this option.

PS: Another solution would be to clip the gradients, but as discussed in this GitHub issue, it is not (yet) doable when using `nn.LSTM`, only when using `nn.LSTMCell`.
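If the issue in question is about clipping gradients of the *intermediate hidden states*, note that clipping the *parameter* gradients after `backward()` does work with `nn.LSTM` via `torch.nn.utils.clip_grad_norm_`. A sketch, assuming parameter-gradient clipping is acceptable for your case (sizes and `max_norm=1.0` are placeholder choices):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=3, hidden_size=16, batch_first=True)
head = nn.Linear(16, 3)  # hypothetical output head for the 3-D targets

x = torch.randn(2, 5, 3)        # (batch, time, features)
target = torch.randn(2, 5, 3)

out, _ = lstm(x)
loss = (head(out) - target).pow(2).mean()
loss.backward()

# clip the global norm of all parameter gradients before optimizer.step()
params = list(lstm.parameters()) + list(head.parameters())
total_norm = nn.utils.clip_grad_norm_(params, max_norm=1.0)
```

This bounds the size of each update step even when the loss itself is huge, which is a common complement to the epsilon trick.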

=> `log(prob + eps)` would kind of “limit” the log to small values, and most importantly limit the grads to small values too.
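A related trick that achieves the same bound without exponentiating (so `prob` is never materialized and cannot underflow) is to clamp the log-probability from below at `log(eps)`. A sketch, again assuming `eps = 1e-6`:

```python
import math
import torch
from torch.distributions import Normal

eps = 1e-6  # assumed floor, playing the role of eps in log(prob + eps)

dist = Normal(torch.zeros(3), torch.ones(3))
outlier = torch.full((3,), 50.0)        # a point absurdly far from the mean

raw = dist.log_prob(outlier)            # very large negative values
clamped = raw.clamp(min=math.log(eps))  # floored at log(eps)
loss = -clamped.mean()
print(loss.item())  # at most -log(eps) ≈ 13.82
```

As with `log(prob + eps)`, clamped outliers contribute zero gradient, so the loss stays finite and the grads stay bounded.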