I wrote a seq2seq model and tried to implement minimum risk training (Eq. (13) in the paper "Minimum Risk Training for Neural Machine Translation").

I added

`torch.autograd.set_detect_anomaly(True)`

at the beginning of the model.

This raised the error

`RuntimeError: Function 'ExpBackward' returned nan values in its 0th output.`

According to the traceback, it has something to do with the second line of the code below:

```
seq_nll = seq_nll - torch.max(seq_nll, dim=-1)[0].unsqueeze(1)
seq_probs = torch.pow(torch.exp(seq_nll), 0.005)
normalizer = torch.sum(seq_probs, dim=-1).view(-1, 1)
```

`seq_nll` is a tensor of shape (64, 3) containing very negative numbers like `[-94.5122, -50.0515, -76.2685]`. These numbers are the log-likelihoods of different sequences.

The `exp` operation is meant to obtain the probability of each sequence. The `pow` operation rescales those probabilities, and `normalizer` is the sum of the rescaled probabilities.
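For what it's worth, since `exp(x) ** 0.005 == exp(0.005 * x)`, I believe the whole pipeline (exp, pow, then dividing by the row sum) reduces to a softmax over `0.005 * seq_nll`, so a log-space version might look like the sketch below (the `0.005` coefficient and the sample values come from my snippet above; I haven't verified that the gradients match my original code):

```python
import torch

# One row of my seq_nll for illustration; real shape is (64, 3).
seq_nll = torch.tensor([[-94.5122, -50.0515, -76.2685]], requires_grad=True)

alpha = 0.005  # the smoothing coefficient from the snippet above

# Unstable version: torch.exp underflows to 0 for very negative inputs,
# and the subsequent pow/division produce nan in the backward pass.
# seq_probs = torch.exp(seq_nll) ** alpha
# weights = seq_probs / seq_probs.sum(dim=-1, keepdim=True)

# Stable version: exp(x) ** alpha == exp(alpha * x), and normalizing by
# the row sum is exactly a softmax, which PyTorch computes in log space.
weights = torch.softmax(alpha * seq_nll, dim=-1)

print(weights)              # rows sum to 1, no nan
weights.sum().backward()
print(seq_nll.grad)         # finite gradients
```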

I guess the problem here is related to those very negative numbers.

Is there a numerically stable way to implement the above code?