Hi,

Doing predictions of 3D trajectories, I am using an LSTM to output the parameters of a distribution (a 3D Gaussian). I then compute the loss as `-dist.log_prob(ground_truth).mean()` to maximize the likelihood of the ground-truth values.

Instead of `log(prob)`, I would like the method to compute `log(prob + epsilon)`. Is it possible?
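One way to get this behaviour with the `torch.distributions` API is to exponentiate the log-probability, add the epsilon, and take the log again. A minimal sketch, with toy stand-ins for the network's predicted parameters and an assumed `eps = 1e-6`:

```python
import torch
from torch.distributions import MultivariateNormal

eps = 1e-6  # assumed value; tune for your data scale

# toy stand-ins for the LSTM's predicted distribution parameters
mean = torch.zeros(4, 3, requires_grad=True)   # batch of 3-D means
cov = torch.eye(3).repeat(4, 1, 1)             # identity covariances
dist = MultivariateNormal(mean, covariance_matrix=cov)

ground_truth = torch.full((4, 3), 10.0)        # far-away outlier points

# log(prob + eps) instead of log(prob): exponentiate, add eps, re-log
loss = -(dist.log_prob(ground_truth).exp() + eps).log().mean()
loss.backward()
print(loss.item())  # bounded above by -log(eps), even for tiny probabilities
```

Note that when `prob` underflows to zero, the gradient through the `exp()` also vanishes, so outliers stop contributing to the update instead of blowing it up.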

The issue is: `log_prob` worked on a *test* dataset, but on the real, noisy dataset the loss explodes after a while. In some trajectories I can see random points that are suddenly really far from the previous and next points. This likely causes some `prob` values to be infinitesimal, which makes some `-log_prob()` values really big, which causes the grads to explode.

I think `log(prob + epsilon)` would solve my problem, but the method `.log_prob()` does not offer this option.

PS: Another solution would be to clip the gradients, but as discussed in this GitHub issue, it is not (yet) doable when using `nn.LSTM`, only when using `nn.LSTMCell`.
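If the issue in question is about clipping gradients of the *intermediate hidden states*, note that clipping the *parameter* gradients after `backward()` does work with `nn.LSTM` via `torch.nn.utils.clip_grad_norm_`. A sketch, assuming parameter-gradient clipping is acceptable for your case (sizes and `max_norm=1.0` are placeholder choices):

```python
import torch
import torch.nn as nn

lstm = nn.LSTM(input_size=3, hidden_size=16, batch_first=True)
head = nn.Linear(16, 3)  # hypothetical output head for the 3-D targets

x = torch.randn(2, 5, 3)        # (batch, time, features)
target = torch.randn(2, 5, 3)

out, _ = lstm(x)
loss = (head(out) - target).pow(2).mean()
loss.backward()

# clip the global norm of all parameter gradients before optimizer.step()
params = list(lstm.parameters()) + list(head.parameters())
total_norm = nn.utils.clip_grad_norm_(params, max_norm=1.0)
```

This bounds the size of each update step even when the loss itself is huge, which is a common complement to the epsilon trick.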

=> `log(prob + eps)` would kind of “limit” the log to small values, and most importantly limit the grads to small values too.
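A related trick that achieves the same bound without exponentiating (so `prob` is never materialized and cannot underflow) is to clamp the log-probability from below at `log(eps)`. A sketch, again assuming `eps = 1e-6`:

```python
import math
import torch
from torch.distributions import Normal

eps = 1e-6  # assumed floor, playing the role of eps in log(prob + eps)

dist = Normal(torch.zeros(3), torch.ones(3))
outlier = torch.full((3,), 50.0)        # a point absurdly far from the mean

raw = dist.log_prob(outlier)            # very large negative values
clamped = raw.clamp(min=math.log(eps))  # floored at log(eps)
loss = -clamped.mean()
print(loss.item())  # at most -log(eps) ≈ 13.82
```

As with `log(prob + eps)`, clamped outliers contribute zero gradient, so the loss stays finite and the grads stay bounded.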