loss.backward() raises "RuntimeError: matrix and matrix expected" only on GPU, not on CPU

Hi, all,

I implemented a variant of the Stack-LSTM parser (Dyer+, 2015) and verified that my model works well on CPU.
To speed up training, I then migrated my code to run on the GPU.
But during the backward pass, the following error is raised, only on the GPU:

   loss.backward() # loss == Variable containing: 3.6408 [torch.cuda.FloatTensor of size 1 (GPU 0)]
  File "/home/xxx/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/xxx/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: matrix and matrix expected at /opt/conda/conda-bld/pytorch_1512387374934/work/torch/lib/THC/generic/THCTensorMathBlas.cu:237

Could anyone help me resolve this issue?
I'm sorry if this is a duplicate.


[Dyer+, 2015] Transition-Based Dependency Parsing with Stack Long Short-Term Memory

Same issue here: it works fine on CPU but fails in backward() on GPU with the same RuntimeError.

Definitely sounds like a bug or a missing error check if this works on the CPU but not on the GPU. Could one of you please provide a minimal script that reproduces the error, so that I can investigate this?

Sorry for not answering earlier; the problem "randomly" went away after uninstalling and reinstalling PyTorch.

Now I am hitting the same issue again, after a few non-obvious changes to my code.
I have tried to build a minimal reproduction, with no success so far: the stripped-down version works fine on the GPU.
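For anyone hitting this thread later: the traceback points at the GPU BLAS wrapper (THCTensorMathBlas.cu), which rejects any matmul operand that is not a 2-D matrix. A common culprit is a tensor that silently became 1-D (e.g. after indexing or squeeze()) before reaching an mm/addmm call, and the CPU and GPU paths do not always validate shapes identically. Below is a minimal sketch of the kind of check being tripped; check_mm_shapes is a hypothetical helper for illustration, not part of PyTorch. Logging operand shapes like this before each mm/linear call in the model is one way to find the offending tensor.

```python
def check_mm_shapes(a_shape, b_shape):
    """Hypothetical helper: mimic the 2-D operand check that a BLAS
    matmul wrapper enforces, raising the same style of error."""
    if len(a_shape) != 2 or len(b_shape) != 2:
        # A 1-D (or 3-D) operand triggers exactly this message on the GPU.
        raise RuntimeError("matrix and matrix expected")
    if a_shape[1] != b_shape[0]:
        raise RuntimeError(
            "size mismatch: %r cannot be multiplied by %r" % (a_shape, b_shape)
        )

check_mm_shapes((2, 3), (3, 4))      # OK: 2-D x 2-D with matching inner dim
try:
    check_mm_shapes((3,), (3, 4))    # 1-D operand -> same error as the thread
except RuntimeError as e:
    print(e)                         # prints "matrix and matrix expected"
```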