(Problem is fixed now! I just needed to change my labels from 1-5 to 0-4, and then use n_classes = 5.)
Hi there. I am working on reproducing the paper ‘Very Deep Convolutional Networks for Text Classification’ as a final project. The code has been running with no problem on my own GPU.
But when I run it on a server, the following error pops up:
Traceback (most recent call last):
  File "ypf_1_9.py", line 313, in <module>
    loss.backward()
  File "/mnt/bwpy/single/usr/lib/python3.5/site-packages/torch/tensor.py", line 93, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/mnt/bwpy/single/usr/lib/python3.5/site-packages/torch/autograd/__init__.py", line 90, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: cublas runtime error : an internal operation failed at /dev/shm/cmaclean/python-single/portage/sci-libs/caffe2-0.4.1/work/pytorch-0.4.1/aten/src/THC/THCBlas.cu:249
I am new to deep learning and my knowledge is limited. My code could be wrong somewhere, but I cannot tell right now. I have to get it working on the server because it needs to run for hundreds of hours. Any help is appreciated!
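For anyone hitting the same error, here is a minimal sketch of the fix noted at the top of this post. It assumes the loss is a classification loss such as nn.CrossEntropyLoss, which expects integer targets in the range 0 to n_classes - 1; with labels 1-5 and 5 classes, the label 5 is out of range, and on a GPU that can show up as an unhelpful CUDA/cuBLAS error instead of a clear message. The helper name to_zero_based and the sample labels are made up for illustration and are not from my actual script.

```python
def to_zero_based(labels, n_classes):
    """Shift labels from 1..n_classes to 0..n_classes-1 and check the range."""
    shifted = [label - 1 for label in labels]
    # Collect anything that would still be an invalid class index.
    bad = [label for label in shifted if not 0 <= label < n_classes]
    if bad:
        raise ValueError(
            "labels outside [0, %d]: %r" % (n_classes - 1, bad)
        )
    return shifted


n_classes = 5
raw_labels = [1, 5, 3, 2, 4]  # hypothetical sample of the original 1-5 labels
targets = to_zero_based(raw_labels, n_classes)
print(targets)  # [0, 4, 2, 1, 3]
```

The resulting list can then be turned into a long tensor (for example with torch.tensor(targets, dtype=torch.long)) before it goes into the loss. Running the range check on the CPU before training starts is a cheap way to catch this kind of mistake early.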