The loss is created and printed, but I get an error on loss.backward():
tensor(0.6947, grad_fn=<AddBackward0>)
Traceback (most recent call last):
  File "main.py", line 140, in <module>
    main(config)
  File "main.py", line 80, in main
    solver.train()
  File "/project/6027897/meyta/Codes/Joint/4-11/Compile_nets.py", line 346, in train
    loss.backward()
  File "/home/meyta/.local/lib/python3.8/site-packages/torch/tensor.py", line 245, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "/home/meyta/.local/lib/python3.8/site-packages/torch/autograd/__init__.py", line 145, in backward
    Variable._execution_engine.run_backward(
RuntimeError: could not create a primitive
This error was recently reported in this thread. I couldn’t reproduce it and asked the user to create an issue on GitHub, but I cannot find any issue with this error message. Could you create one instead, with an executable code snippet if possible?
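For anyone preparing such a snippet, here is a minimal sketch of what a self-contained reproduction could look like. The tiny model, the tensor shapes, and the function name `run_minimal_backward` are placeholders and not taken from the original code. The `use_mkldnn` switch rests on the assumption that the message comes from the oneDNN (MKL-DNN) CPU backend, which the thread has not confirmed.

```python
import platform

import torch
import torch.nn as nn
import torch.nn.functional as F


def run_minimal_backward(use_mkldnn=True, seed=0):
    """Run one forward/backward pass of a tiny CPU model and return the loss."""
    torch.manual_seed(seed)
    # Hypothetical stand-in model: replace it with the smallest piece of the
    # real network that still triggers the error.
    model = nn.Sequential(
        nn.Conv2d(1, 4, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(4 * 8 * 8, 1),
    )
    x = torch.randn(2, 1, 8, 8)
    target = torch.rand(2, 1)
    # Assumption: toggling the oneDNN backend helps narrow down the failure.
    with torch.backends.mkldnn.flags(enabled=use_mkldnn):
        loss = F.binary_cross_entropy_with_logits(model(x), target)
        loss.backward()
    return loss.item()


if __name__ == "__main__":
    print("python:", platform.python_version())
    print("torch:", torch.__version__)
    print("mkldnn available:", torch.backends.mkldnn.is_available())
    print("loss with mkldnn:", run_minimal_backward(use_mkldnn=True))
    print("loss without mkldnn:", run_minimal_backward(use_mkldnn=False))
```

If the failure only shows up with `use_mkldnn=True`, that would be worth stating in the issue. The output of `python -m torch.utils.collect_env` is also useful to attach.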
Thank you for your reply. The code works properly in some versions, and in some others the error changes to:
  File "main.py", line 140, in <module>
    main(config)
  File "main.py", line 80, in main
    solver.train()
  File "/project/6027897/meyta/Joint/4-11/Compile_nets.py", line 344, in train
    loss.backward()
  File "/home/meyta/ENV/lib/python3.6/site-packages/torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/meyta/ENV/lib/python3.6/site-packages/torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: label is too far
I tried a few systems and versions but couldn’t find the exact source of the error, so reproducing it is not straightforward for me. The code now works on my system, but I still get the error on the server (Compute Canada).
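When the same code works on one machine and fails on another, comparing the two environments side by side can help narrow things down. Below is a rough sketch using only the standard library. The function names are made up for illustration, the version strings in the usage note are example values, and `importlib.metadata` assumes Python 3.8 or newer (so it would not run in the Python 3.6 environment shown above).

```python
import platform
from importlib import metadata


def environment_report():
    """Collect a few facts about the interpreter, the machine, and torch."""
    try:
        torch_version = metadata.version("torch")
    except metadata.PackageNotFoundError:
        torch_version = None  # torch is not installed in this environment
    return {
        "python": platform.python_version(),
        "system": platform.system(),
        "machine": platform.machine(),
        "processor": platform.processor(),
        "torch": torch_version,
    }


def diff_reports(local, remote):
    """Return only the keys whose values differ, as (local, remote) pairs."""
    keys = sorted(set(local) | set(remote))
    return {
        key: (local.get(key), remote.get(key))
        for key in keys
        if local.get(key) != remote.get(key)
    }
```

Running `environment_report()` on both machines and passing the two dictionaries to `diff_reports` gives something like `{"torch": ("1.7.1", "1.8.1")}` when only the torch version differs. For a fuller report, `python -m torch.utils.collect_env` prints considerably more detail.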
  File "run.py", line 83, in <module>
    learn.fit_one_cycle(epochs=30,max_lr=1e-3)
  File "C:\Users\v.huseynov\PycharmProjects\dim\learn.py", line 85, in fit_one_cycle
    self._fit(epochs=epochs, cyclic=True,)
  File "C:\Users\v.huseynov\PycharmProjects\dim\learn.py", line 137, in _fit
    loss.backward()
  File "C:\ProgramData\Anaconda3\envs\dim\lib\site-packages\torch\_tensor.py", line 307, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph, inputs=inputs)
  File "C:\ProgramData\Anaconda3\envs\dim\lib\site-packages\torch\autograd\__init__.py", line 156, in backward
    allow_unreachable=True, accumulate_grad=True)  # allow_unreachable flag
RuntimeError: could not create a primitive
Hi, I got this error when running the model on a CPU device. When I tracked the run in Colab, I saw that RAM usage increases dramatically at the start of the second epoch. Maybe the problem is RAM?
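To check the RAM suspicion with numbers, one option is to log the process's peak memory after every epoch. This is only a sketch: `train_one_epoch` is a hypothetical callable standing in for the real training step, and the `resource` module is Unix-only, so it works on Colab but not on the Windows machine in the traceback above.

```python
import resource
import sys


def peak_rss_mib():
    """Return the peak resident set size of this process in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def log_memory_per_epoch(train_one_epoch, epochs):
    """Call train_one_epoch(epoch) for each epoch and record peak memory."""
    history = []
    for epoch in range(epochs):
        train_one_epoch(epoch)
        history.append(peak_rss_mib())
        print(f"epoch {epoch}: peak RSS {history[-1]:.1f} MiB")
    return history
```

A large jump between the first and second entries would support the memory theory. One common cause of that kind of growth in PyTorch loops is keeping the `loss` tensor itself (with its graph) in a list instead of `loss.item()`, though nothing in the thread confirms this is what happens here.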
I don’t know, as I wasn’t able to reproduce or isolate another error with the same error message.
Based on this post I could find some references to potentially missing CPU instructions, but don’t know if that was indeed the root cause.
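To check the missing-instructions theory on a Linux machine, one could compare the CPU flags the kernel reports against a list of instruction sets. The sketch below is an assumption-heavy illustration: the helper names are made up, the default `wanted` list is only an example and not a list of what PyTorch requires, and it relies on the x86 `flags` line in `/proc/cpuinfo` (other architectures use different field names).

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU flags from the first 'flags' line, if any."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_instruction_sets(cpuinfo_text, wanted=("sse4_1", "avx", "avx2", "avx512f")):
    """List the entries of `wanted` that the CPU does not advertise."""
    flags = parse_cpu_flags(cpuinfo_text)
    return [name for name in wanted if name not in flags]
```

On the affected server this could be called as `missing_instruction_sets(open("/proc/cpuinfo").read())`. Comparing the result with a machine where the code runs would show whether the two CPUs differ in the instruction sets they support.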