My model runs (slowly) on the CPU, but it cannot run on the GPU.
When I run it with CUDA (10.0.130), I get: Segmentation fault (core dumped).
So I ran it under gdb python, and I got:
Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007f231cdd9cc0 in _IO_vfprintf_internal (s=s@entry=0x7ffd3aee5f00, format=<optimized out>, format@entry=0x7f2319b6e4f0 "expected %s (got %s)", ap=ap@entry=0x7ffd3aee64a8)
at vfprintf.c:1632
1632 vfprintf.c: No such file or directory.
(gdb) where
#0 0x00007f231cdd9cc0 in _IO_vfprintf_internal (s=s@entry=0x7ffd3aee5f00, format=<optimized out>, format@entry=0x7f2319b6e4f0 "expected %s (got %s)", ap=ap@entry=0x7ffd3aee64a8)
at vfprintf.c:1632
#1 0x00007f231ce01a49 in _IO_vsnprintf (string=0x7ffd3aee6070 "expected \235+\373\263U", maxlen=<optimized out>, format=0x7f2319b6e4f0 "expected %s (got %s)", args=0x7ffd3aee64a8)
at vsnprintf.c:114
#2 0x00007f231963d54d in torch::formatMessage(char const*, __va_list_tag*) () from /root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#3 0x00007f231963db11 in torch::TypeError::TypeError(char const*, ...) () from /root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#4 0x00007f23198c3a71 in torch::utils::(anonymous namespace)::new_with_tensor(c10::TensorTypeId, c10::ScalarType, at::Tensor const&) ()
from /root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#5 0x00007f23198c5e20 in torch::utils::legacy_tensor_ctor(c10::TensorTypeId, c10::ScalarType, _object*, _object*) ()
from /root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#6 0x00007f2319897a40 in torch::tensors::Tensor_new(_typeobject*, _object*, _object*) () from /root/anaconda3/envs/pytorch/lib/python3.7/site-packages/torch/lib/libtorch_python.so
#7 0x000055b361d92239 in _PyObject_FastCallKeywords ()
#8 0x000055b361dee6b2 in _PyEval_EvalFrameDefault ()
#9 0x000055b361d2f059 in _PyEval_EvalCodeWithName ()
#10 0x000055b361d3033c in _PyFunction_FastCallDict ()
#11 0x000055b361d46a03 in _PyObject_Call_Prepend ()
#12 0x000055b361d3b8d2 in PyObject_Call ()
#13 0x000055b361deb1ab in _PyEval_EvalFrameDefault ()
#14 0x000055b361d2f059 in _PyEval_EvalCodeWithName ()
#15 0x000055b361d3033c in _PyFunction_FastCallDict ()
#16 0x000055b361d46a03 in _PyObject_Call_Prepend ()
#17 0x000055b361d89baa in slot_tp_call ()
#18 0x000055b361d9261b in _PyObject_FastCallKeywords ()
#19 0x000055b361deea79 in _PyEval_EvalFrameDefault ()
#20 0x000055b361d2f059 in _PyEval_EvalCodeWithName ()
#21 0x000055b361d2ff24 in PyEval_EvalCodeEx ()
#22 0x000055b361d2ff4c in PyEval_EvalCode ()
#23 0x000055b361e48a14 in run_mod ()
#24 0x000055b361e51f11 in PyRun_FileExFlags ()
#25 0x000055b361e52104 in PyRun_SimpleFileExFlags ()
#26 0x000055b361e53bbd in pymain_main.constprop ()
#27 0x000055b361e53e30 in _Py_UnixMain ()
#28 0x00007f231cdab830 in __libc_start_main (main=0x55b361d0fd20 <main>, argc=2, argv=0x7ffd3aee7828, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7ffd3aee7818) at ../csu/libc-start.c:291
#29 0x000055b361df9052 in _start () at ../sysdeps/x86_64/elf/start.S:103
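
The backtrace above only shows the interpreter's C frames (_PyEval_EvalFrameDefault and so on), so it does not say which line of the Python script made the call that ends in torch::utils::legacy_tensor_ctor. One way to recover that line, as a minimal sketch, is Python's standard faulthandler module, which dumps the Python-level traceback when the process receives SIGSEGV. The function name run_model below is a hypothetical placeholder for the real entry point, not something taken from my script.

```python
import faulthandler
import sys

# Dump the Python traceback of all threads if the process receives a fatal
# signal such as SIGSEGV. sys.__stderr__ is the interpreter's original stderr,
# which still has a real file descriptor even if sys.stderr was redirected.
faulthandler.enable(file=sys.__stderr__, all_threads=True)


def run_model():
    # Hypothetical placeholder: put the code that moves the model to the GPU
    # and runs the forward pass here.
    pass


if __name__ == "__main__":
    run_model()
```

The same handler can be enabled without editing the script by starting the interpreter with python -X faulthandler. The Python frame printed at the top of the dump should be the line that calls the tensor constructor seen in frames #4 to #6.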
So what should I change in my code, or is this a PyTorch bug?