Try adding clip_grad_norm_() so the gradients are clipped after the backward pass and before the optimizer step. Update your code as follows:
loss.backward()
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=10)
optimizer.step()
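To make it clearer what max_norm=10 does, here is a small pure-Python sketch of the idea behind norm clipping: compute one L2 norm over all gradients together and, if it exceeds max_norm, scale every gradient down by the same factor. The helper name clip_by_global_norm, the eps value, and the toy gradient values are assumptions for illustration only; this is not PyTorch's actual implementation.

```python
import math

def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale gradient vectors so their combined L2 norm is at most max_norm.

    Hypothetical helper for illustration; it mirrors the idea behind
    torch.nn.utils.clip_grad_norm_, not its real code.
    """
    # One norm over all gradients, as if they were a single flat vector.
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # Never scale up: the factor is capped at 1.0.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm

# Toy gradients for two "parameters" (made-up values).
grads = [[30.0, 40.0], [0.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=10)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
print(norm_before, norm_after)
```

Gradients whose total norm is already below max_norm pass through unchanged, so clipping only kicks in on the occasional exploding step.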
I would also suggest reading my post "Efficient train/dev sets evaluation"; I think it might help you.