I took the code from https://github.com/spro/practical-pytorch/blob/master/seq2seq-translation/seq2seq-translation.ipynb
and tried to make it run on the GPU by converting the models and variable initializations with .cuda().
Unfortunately, I get the following error:
File "/home/jd/pytorch/examples/translation/translate_gpu.py", line 296, in trainEpochs
loss = train(input_variable, target_variable, encoder, decoder, encoder_optimizer, decoder_optimizer, criterion)
File "/home/jd/pytorch/examples/translation/translate_gpu.py", line 247, in train
File "/home/jd/anaconda/lib/python2.7/site-packages/torch/autograd/variable.py", line 158, in backward
self._execution_engine.run_backward((self,), (gradient,), retain_variables)
RuntimeError: could not compute gradients for some functions (CudnnRNN)
It works on the CPU. Any hints as to what might be causing this behavior? The RuntimeError sadly doesn’t mention which operations couldn’t get the gradient computed. Is there a list of supported/unsupported ops I could reference somewhere?
Code + data can be found here: https://github.com/ponythewhite/pytorch_fun
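For readers following along, here is a minimal sketch of the kind of GPU conversion described above. It assumes a recent PyTorch release (plain tensors rather than Variable, and .to(device) rather than .cuda()), and TinyEncoder is a hypothetical stand-in, not the tutorial's actual encoder. The point it illustrates is that the model parameters, the inputs, and the initial hidden state all have to live on the same device.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")


class TinyEncoder(nn.Module):
    """Hypothetical stand-in for a seq2seq encoder: an embedding plus a GRU."""

    def __init__(self, vocab_size, hidden_size):
        super().__init__()
        self.hidden_size = hidden_size
        self.embedding = nn.Embedding(vocab_size, hidden_size)
        self.gru = nn.GRU(hidden_size, hidden_size)

    def forward(self, tokens, hidden):
        # Shape (seq_len, batch=1, hidden_size), as nn.GRU expects by default.
        embedded = self.embedding(tokens).view(len(tokens), 1, -1)
        return self.gru(embedded, hidden)

    def init_hidden(self):
        # Create the hidden state on the same device as the parameters.
        return torch.zeros(1, 1, self.hidden_size,
                           device=self.embedding.weight.device)


encoder = TinyEncoder(vocab_size=10, hidden_size=8).to(device)
tokens = torch.tensor([1, 2, 3], device=device)

output, hidden = encoder(tokens, encoder.init_hidden())
loss = output.sum()
loss.backward()  # gradients flow through the GRU on whichever device is used
```

Forgetting to move just one of these pieces (often the initial hidden state) is a common source of device mismatch errors, although that is not what caused the CudnnRNN error reported here.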
I don’t think it’s your fault. We’ll have to look into that. Thanks for the report.
Should I create an issue on Github, or will you? Would be great if I could track the status.
I think it’s similar to this issue. I’ll need to take a look first, and if it’s not resolved I’ll open an issue myself. Thanks!
Awesome, thanks! Would love to help with this, but can’t do much in C++.
I submitted a PR with a fix for that just a few seconds ago. It will be included in the next release, which we’ll upload tomorrow.
Thank you! I also ran into this but forgot to report it.
Got 0.1.9 hot off the press and I can confirm it works, approx 3.5x speedup on my GPU. Any ideas for making it faster? Is it a bad idea to make new Variables in the training loop?
Oh no, you’ve found the new version before it has been taken down! There’s been one new bug introduced by my fix for this issue, and we’ll reupload the packages today, so please update it again.
What do you want to make faster? I can’t tell without seeing the code.
And no, creating Variables is very very cheap, so you can do it at every step without any problem.
Still working well with the newer 0.1.9. I was referring to the same code as @ponythewhite if you happened to look at that already.
Yup, that’s been fixed! The fix initially introduced another bug, but that’s fixed as well.
I also ran into this error while trying to implement Layer-wise Relevance Propagation.
Concretely, I rewrote the backward function for each layer in the net, for example:
def forward(self, input, weight, bias):
    self.eps = 1e-16
    self.Z = weight.t()[None, :, :] * input[:, :, None]
    self.Zs = self.Z.sum(dim=1, keepdim=True)
    if bias is not None:
        self.Zs += bias[None, None, :]
    return (R[:, None, :] * self.Z / (self.Zs + (2 * (self.Zs >= 0) - 1) * self.eps)).sum(dim=2)
Unfortunately, it fails with the following error:
Traceback (most recent call last):
File "G:\Explainer\_Geo_Exp_inter\infectionLRP.py", line 117, in <module>
File "D:\Anaconda3\lib\site-packages\torch\tensor.py", line 195, in backward
torch.autograd.backward(self, gradient, retain_graph, create_graph)
File "D:\Anaconda3\lib\site-packages\torch\autograd\__init__.py", line 98, in backward
allow_unreachable=True) # allow_unreachable flag
RuntimeError: could not compute gradients for some functions
It has confused me for a few days, and I don’t really know what causes it or how to fix it.
Please see the doc here on how to write a proper autograd.Function.
Old-style custom Functions are no longer supported.
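As a rough illustration of the new style, here is a minimal sketch of how the fragment above might be rewritten. It makes several assumptions: that R in the posted code is the relevance arriving in the backward pass, that the forward pass is an ordinary linear layer, and that a recent PyTorch release is used. The class name LRPLinear is made up for this example. The key differences from the old style are that forward and backward are static methods, state is stored on ctx rather than on self, and the function is called through .apply().

```python
import torch
from torch.autograd import Function


class LRPLinear(Function):
    """Linear layer whose backward pass redistributes relevance (epsilon rule)."""

    @staticmethod
    def forward(ctx, input, weight, bias=None):
        # New-style Functions keep state on ctx, not on self.
        ctx.save_for_backward(input, weight, bias)
        output = input @ weight.t()
        if bias is not None:
            output = output + bias
        return output

    @staticmethod
    def backward(ctx, R):
        input, weight, bias = ctx.saved_tensors
        eps = 1e-16
        # Z[b, i, j]: contribution of input unit i to output unit j.
        Z = weight.t()[None, :, :] * input[:, :, None]
        Zs = Z.sum(dim=1, keepdim=True)
        if bias is not None:
            Zs = Zs + bias[None, None, :]
        # +1 where Zs >= 0, -1 elsewhere, to keep the denominator away from zero.
        sign = 2 * (Zs >= 0).to(Zs.dtype) - 1
        R_in = (R[:, None, :] * Z / (Zs + sign * eps)).sum(dim=2)
        # One return value per forward argument; weight and bias get None.
        return R_in, None, None


# Usage: call through .apply(), never by instantiating the class.
x = torch.tensor([[1.0, 2.0, 3.0]], requires_grad=True)
w = torch.tensor([[1.0, 1.0, 1.0], [0.5, 0.5, 0.5]])
b = torch.zeros(2)

out = LRPLinear.apply(x, w, b)
out.backward(torch.ones_like(out))  # x.grad now holds the propagated relevance
```

Note that with this approach x.grad holds relevance scores rather than true gradients, so such a layer should not be mixed with ordinary optimizer-based training.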