I implemented a variant of the Stack LSTM parser (Dyer+, 2015) and confirmed that the model trains correctly on the CPU.
To speed up training, I then migrated the code to run on the GPU.
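For reference, the migration was just the standard set of `.cuda()` calls on the model and on every input tensor. The snippet below is a minimal, hypothetical sketch of that pattern (`TinyModel` is a stand-in, not my actual parser):

```python
import torch
import torch.nn as nn
from torch.autograd import Variable

# Hypothetical stand-in for the parser: one LSTMCell plus a linear scorer.
class TinyModel(nn.Module):
    def __init__(self, dim=8):
        super(TinyModel, self).__init__()
        self.cell = nn.LSTMCell(dim, dim)
        self.out = nn.Linear(dim, 1)

    def forward(self, x, state):
        h, c = self.cell(x, state)   # one transition step
        return self.out(h).sum(), (h, c)

model = TinyModel().cuda()           # move all parameters to GPU 0
x = Variable(torch.randn(1, 8).cuda())
state = (Variable(torch.zeros(1, 8).cuda()),
         Variable(torch.zeros(1, 8).cuda()))

loss, _ = model(x, state)
loss.backward()                      # fine when every tensor is on the GPU
```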
However, in the actual parser, the backward pass raises the following error, and only on the GPU:
loss.backward()  # loss == Variable containing: 3.6408 [torch.cuda.FloatTensor of size 1 (GPU 0)]
  File "/home/xxx/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/variable.py", line 167, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, retain_variables)
  File "/home/xxx/anaconda3/envs/pytorch_p36/lib/python3.6/site-packages/torch/autograd/__init__.py", line 99, in backward
    variables, grad_variables, retain_graph)
RuntimeError: matrix and matrix expected at /opt/conda/conda-bld/pytorch_1512387374934/work/torch/lib/THC/generic/THCTensorMathBlas.cu:237
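From the error location (THCTensorMathBlas.cu), my understanding is that this message comes from a cuBLAS-backed matrix multiply (torch.mm / addmm) receiving an argument that is not 2-D. As a hypothetical illustration (not my parser code), a call like the following produces the same message on PyTorch 0.3:

```python
import torch

a = torch.randn(3, 4).cuda()
b = torch.randn(4).cuda()   # 1-D vector where a 2-D matrix is expected

torch.mm(a, b)              # RuntimeError: matrix and matrix expected
```

So my guess is that somewhere in my graph a tensor that should be 2-D ends up 1-D on the GPU path, and the mismatch only surfaces when autograd replays the multiplication during backward, but I have not been able to pin down where.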
Could anyone help me resolve this issue?
Apologies if this is a duplicate.
[Dyer+, 2015] Chris Dyer, Miguel Ballesteros, Wang Ling, Austin Matthews, and Noah A. Smith. Transition-Based Dependency Parsing with Stack Long Short-Term Memory. ACL 2015.