Hidden units saturate in a seq2seq model in PyTorch

dhpollack (David Pollack) October 25, 2017, 5:15pm 3

Proper way to do gradient clipping?

for people trying to just get an answer quickly: torch.nn.utils.clip_grad_norm(mdl_sgd.parameters(),clip) or with in-place clamp: W.grad.data.clamp_(-clip,clip) also similar Q:

Perhaps try this

show post in topic

Home
Categories
Guidelines
Terms of Service
Privacy Policy

Powered by Discourse, best viewed with JavaScript enabled