So somewhere in my implementation of Graves handwriting generation (I should just put it on GitHub, I know…), I have:
```python
if hidden.requires_grad:
    hidden.register_hook(lambda x: x.clamp(min=-GRAD_CLIPPING, max=GRAD_CLIPPING))
```
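For context: the function passed to register_hook receives the gradient with respect to hidden during the backward pass, and the tensor it returns is used in place of that gradient, so the clamp above bounds the gradient before it flows back into earlier timesteps.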
Note that it looks mostly like yours, but:

- I check whether the tensor wants a gradient (though I have not recently checked whether that is necessary; your error seems to suggest it may be).
- I don't overwrite hidden with the return value of register_hook; the documentation says it returns a handle. You can and should just keep using the variable you called register_hook on. [Edit: I see you are returning v, not the return value of register_hook, so you are good on that, sorry for the confusion.]
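To make the handle point concrete, here is a minimal sketch (everything apart from register_hook itself is made up for illustration):

```python
# register_hook returns a torch.utils.hooks.RemovableHandle,
# not a new tensor, so don't assign it back to hidden
handle = hidden.register_hook(lambda g: g.clamp(min=-GRAD_CLIPPING, max=GRAD_CLIPPING))

# ... run forward/backward as usual; the hook fires during backward ...

handle.remove()  # optional: detach the hook if you no longer want clipping
```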
It seems that we cannot register a hook inside nn.RNN / nn.LSTM because cuDNN runs the whole operation in a single fused CUDA kernel, so the intermediate hidden states are never exposed. However, we can register a hook on the hidden state at each step when using nn.RNNCell / nn.LSTMCell; see the sketch below.
(Ref: https://github.com/pytorch/pytorch/issues/1407)
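For example, a minimal sketch of hook-based clipping with nn.LSTMCell (the sizes, the dummy loss, and the GRAD_CLIPPING value are made up for illustration):

```python
import torch
import torch.nn as nn

GRAD_CLIPPING = 10.0  # assumed clipping threshold

seq_len, batch, input_size, hidden_size = 5, 4, 3, 20
cell = nn.LSTMCell(input_size, hidden_size)

x = torch.randn(seq_len, batch, input_size)
h = torch.zeros(batch, hidden_size)
c = torch.zeros(batch, hidden_size)

outputs = []
for t in range(seq_len):
    # one manual timestep; h and c are ordinary tensors here,
    # so we can hook them (unlike inside the fused nn.LSTM kernel)
    h, c = cell(x[t], (h, c))
    if h.requires_grad:
        h.register_hook(lambda g: g.clamp(min=-GRAD_CLIPPING, max=GRAD_CLIPPING))
    outputs.append(h)

# dummy loss; on backward, each hook clamps the gradient
# flowing back through its timestep's hidden state
loss = torch.stack(outputs).sum()
loss.backward()
```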